OCR-PT-CT

Semi-automatic transcription of ancient Egyptian hieroglyphic documents

This project has conceived, designed, and developed a digital toolset that performs optical character recognition (OCR) of ancient Egyptian hieroglyphs in the Pyramid Texts (PT) and the Coffin Texts (CT), providing a semi-automatic transcription of the original text into the standard encoding known as Manuel de Codage (MdC). Beyond the technical challenge, the OCR-PT-CT project may enable researchers to search for textual and sign parallels much more efficiently.

A proof of concept

From March to December 2022, the OCR-PT-CT project (PIUAH21/AH-036) was funded by Universidad de Alcalá and ran in synergy with the MORTEXVAR project (Comunidad de Madrid) and the GEINTRA and CIARQ research groups. To ensure the quality of its input data, the project used the text editions that serve as the current reference in Egyptological research. Access to these editions was granted by the Oriental Institute (University of Chicago) and James P. Allen (Brown University, Providence).

Thanks to its interdisciplinary team (Egyptology and Engineering), the OCR-PT-CT project has implemented a task sequence designed to keep the data set flexible and broad enough to be reused with other complex writing systems.

The OCR-PT-CT project has proposed an OCR system adapted to the chosen corpus in order to keep manual encoding to a minimum. The project has tested techniques for segmenting the hieroglyphic script in these texts, together with classification systems based on deep neural networks. This will allow researchers to check the chosen corpus interactively without manually encoding much of the text at the sentence level.
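
As a rough illustration of the segment, classify, and encode sequence described above, the following Python sketch runs on a tiny binary image. It is not the project's implementation. Connected-component labelling stands in for the segmentation techniques, a lookup table (the hypothetical TOY_SIGNS) stands in for a trained deep neural network, and row-major scan order stands in for real reading order. Every function and variable name is an assumption made for this example. The only MdC convention it relies on is that consecutive signs are separated by "-".

```python
from collections import deque

# Hypothetical toy "model": maps an exact pixel crop to a Gardiner sign code.
# A real system would use a trained neural network instead of this table.
TOY_SIGNS = {
    ((1, 1, 1),): "N35",        # horizontal bar, standing in for the water ripple
    ((1,), (1,), (1,)): "Z1",   # vertical bar, standing in for the single stroke
    ((1, 1), (1, 1)): "Q3",     # square, standing in for the stool
}


def segment(image):
    """Return bounding boxes (top, left, bottom, right) of 4-connected ink regions."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first flood fill from the first unseen ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and image[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def classify(image, box):
    """Stand-in classifier: look the cropped region up in TOY_SIGNS."""
    top, left, bottom, right = box
    crop = tuple(tuple(row[left:right + 1]) for row in image[top:bottom + 1])
    # "?" marks a sign that a human would need to review.
    return TOY_SIGNS.get(crop, "?")


def transcribe(image):
    """Segment the image, classify each region, and join the codes MdC-style."""
    codes = [classify(image, box) for box in segment(image)]
    return "-".join(codes)


PAGE = [
    [1, 1, 1, 0, 0],
    [0, 0, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
]

print(transcribe(PAGE))
```

Unrecognised regions come back as "?" rather than being dropped, which mirrors the semi-automatic idea: the system proposes an encoding and a researcher corrects only the uncertain signs.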

The OCR-PT-CT team

Daniel Pizarro Pérez

Engineer

Sira Palazuelos Cagigas

Engineer

Laura de Diego Otón

Engineering student

Adin Bartoli

Engineer

César Guerra Méndez

Egyptologist & IT

Álvaro Hernández Alonso

Engineer

Rubén Nieto Capuchino

Engineer

Patricia Cuesta Ruiz

Engineering student

Carlos Gracia Zamacona

Egyptologist (PI)

Former members

Beatriz Noria

Jónatan Ortiz

Sika Perdersen

The sources

Allen, J.P. 2006a. The Egyptian Coffin Texts, VIII: Middle Kingdom copies of Pyramid Texts (Oriental Institute Publications 132). Chicago: University of Chicago.

Allen, J.P. 2006b. A new concordance of the Pyramid Texts I-VI. Providence: Brown University.

De Buck, A. 1935-1961. The Egyptian Coffin Texts I-VII (Oriental Institute Publications 24, 49, 64, 67, 73, 81 & 87). Chicago: University of Chicago.

References

Barucci, A., Cucci, C., Franci, M., Loschiavo, M. & Argenti, F., A Deep Learning Approach to Ancient Egyptian Hieroglyphs Classification, IEEE Access 9 (2021), 1-10. (doi 10.1109/ACCESS.2021.3110082)

Barucci, A., Amendola, M., Argenti, F., Canfailla, Ch., Cucci, C., Guidi, T., Python, L., Franci, M. Discovering the ancient Egyptian hieroglyphs with Deep Learning. Rome: Consiglio Nazionale delle Ricerche (CNR), 2023.

Van den Berg, H. 1997. “Manuel de Codage”: A standard system for the computer-encoding of Egyptian transliteration and hieroglyphic texts

Cruz Cavalieri D., Bastos-Filho T., Palazuelos-Cagigas S., Sarcinelli-Filho, M. 2015. On Combining Language Models to Improve a Text-based Human-machine Interface. International Journal of Advanced Robotic Systems 12/170: 1-14. (doi 10.5772/61753)

Cruz Cavalieri D., Palazuelos-Cagigas S., Bastos-Filho T., Sarcinelli-Filho, M. 2016. Combination of Language Models for Word Prediction: An Exponential Approach. IEEE/ACM Transactions on Audio, Speech, and Language Processing 99 (doi 10.1109/TASLP.2016.2547743).

Gardiner, A.H. 1957. Egyptian grammar. Being an introduction to the study of hieroglyphs. Oxford / London: Griffith Institute / Oxford University Press.

Gracia Zamacona, C. 2013. A database for the Coffin Texts. In S. Polis & J. Winand (eds.), Texts, languages and information technology in Egyptology (Aegyptiaca Leodiensia 9). Liège: Presses Universitaires de Liège, 139-155.

Gracia Zamacona, C. & J. Ortiz-García. 2021. Handbook of digital Egyptology: Texts (Monografías de Oriente Antiguo 1). Alcalá de Henares: Universidad de Alcalá.

Hu, R., Gayol, C. P., Odobez, J. M. & Gatica-Perez, D. 2017. Analyzing and visualizing ancient Maya hieroglyphics using shape: From computer vision to Digital Humanities. Digital Scholarship in the Humanities 32 (suppl. 2): 179-194.

Nederhof, M.J. & F. Rahman. 2017. A probabilistic model of ancient Egyptian writing. Journal of Language Modelling 5/1: 131-163.

The Hieroglyphic initiative.

Chung, J. & Delteil, T. (2019). A computationally efficient pipeline approach to full page offline handwritten text recognition. IEEE (ed.), 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW) 5: 35-40.

Wigington, C., Tensmeyer, C., Davis, B., Barrett, W., Price, B. & Cohen, S. 2018. Start, follow, read: End-to-end full-page handwriting recognition. Proceedings of the European Conference on Computer Vision (ECCV), 367-383.

Yang, L., Wang, P., Li, H., Li, Z. & Zhang, Y. 2020. A holistic representation guided attention network for scene text recognition. Neurocomputing 414: 67-75.

News

Collaboration

On the 7th of October 2022, the OCR-PT-CT project (Universidad de Alcalá) started collaborating with the Museo Arqueológico Nacional (MAN) in Madrid to produce 3D digital models of Ancient Egyptian materials. A big thank you to Esther Pons and Isabel Olbés, curators of the Egyptian collection, and Andrés Carretero, director of the MAN, for their interest in technology-based research and dissemination approaches to ancient Egypt.

https://www.mortexvar.com/ocr-pt-ct

Left to right: Carlos Gracia, Daniel Pizarro, Esther Pons, Isabel Olbés, Sira Palazuelos and Álvaro Hernández.

ICAENT 2

Presentation of the OCR-PT-CT project at the 2nd edition of the International Conference Ancient Egypt New Technology held at the University of Naples "L'Orientale" (5-7 July 2023).

ICAENT 2 Programme

With the support of

Universidad de Alcalá
Oriental Institute (University of Chicago)
Comunidad de Madrid

Hieroglyphs by JSesh
