The manuscript heritage left by Islamic civilization is considered the largest scientific and intellectual heritage known in the history of human civilization, even when compared with the legacy of the ancient civilizations, especially the Greek and Latin ones [1]. During the periods of the Islamic caliphate, manuscripts were kept in the collections of Islamic libraries such as the House of Wisdom in Baghdad and the libraries of the Abbasid and Umayyad caliphs [2]. However, wars, strife, and raids on the Islamic world exposed this great heritage to destruction, loss, and burning. During the Tatar march on Baghdad, manuscripts were burned and thrown into the Tigris River until its waters turned black from the ink of the books and red from the blood of the dead [3]. Natural factors such as humidity, light, and heat, along with chemical and biological agents, also threaten the preservation of manuscripts.
Out of a sense of responsibility for preserving this heritage and protecting it from tampering, damage, and loss, countries, institutions, and centers in the Western and Islamic worlds are showing increasing interest in manuscripts today. Special departments have been established for restoration, preservation, maintenance, treatment, and digitization, and heritage preservation now has established rules, methods, and measures. Information technology has also been put to work in the service of the manuscript heritage, and several technical projects have emerged to serve the Arabic manuscript. We present the most important of them in this article.
The most important digital projects in the service of manuscripts were presented by Dr. Muhammad Hosni Yahya in his lecture at the scientific symposium “Technical Projects in the Service of the Arabic Language,” organized by the Academy of the Arabic Language in Makkah Al-Mukarramah. Dr. Yahya summarized the most important areas of manuscript digitization as follows [4]:
1- Scanning and digital imaging
Scanning and digital imaging transfer manuscript holdings to an electronic medium, so the user can view the digital copy without handling the original. This protects the original from damage, burning, and similar risks. It also lets researchers access manuscripts remotely, reducing both the effort of searching and the cost of obtaining a copy [2]. The competent authorities have set precise standards for scanning manuscripts (400 dots per inch), along with appropriate scanner settings for resolution, color, and output file type [5].
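To make the 400 dots-per-inch figure concrete, the short Python sketch below estimates the pixel dimensions and uncompressed size of a single scanned page. The page size (21 by 30 cm) and the 24-bit color depth are illustrative assumptions, not values taken from the cited standard.

```python
# Rough arithmetic for a 400 dpi scan. Page size and bit depth are
# illustrative assumptions, not part of the cited standard.

CM_PER_INCH = 2.54


def scan_dimensions(width_cm, height_cm, dpi=400):
    """Return (width_px, height_px) for a page scanned at the given dpi."""
    width_px = round(width_cm / CM_PER_INCH * dpi)
    height_px = round(height_cm / CM_PER_INCH * dpi)
    return width_px, height_px


def uncompressed_size_mb(width_px, height_px, bits_per_pixel=24):
    """Return the raw image size in megabytes (1 MB = 10**6 bytes)."""
    return width_px * height_px * bits_per_pixel / 8 / 1_000_000


if __name__ == "__main__":
    w, h = scan_dimensions(21.0, 30.0)  # hypothetical folio size
    size = uncompressed_size_mb(w, h)
    print(f"{w} x {h} pixels, about {size:.1f} MB uncompressed")
```

Estimates of this kind help when planning storage for a collection, since the raw size grows with the square of the resolution.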
2- Digitization of manuscript catalogs
A catalog (index) records the metadata of a manuscript: its title, the author’s name, the date of copying, the type of script, the number of folios, the number of lines on each side, and other data. Several such catalogs have already been digitized.
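As an illustration only, the metadata fields named in this section could be modeled as a small record type. The class name, field names, and sample values below are hypothetical and do not reflect the schema of any particular catalog.

```python
from dataclasses import dataclass, asdict
import json
from typing import Optional


@dataclass
class ManuscriptRecord:
    """One catalog entry; fields mirror the metadata listed in the text."""
    title: str
    author: str
    copy_date: Optional[str]  # date of copying, kept as free text
    script: str               # type of script, e.g. "Naskh"
    folios: int               # number of folios
    lines_per_side: int       # lines on each side of a folio


def to_json(record):
    """Serialize a record to JSON, keeping non-ASCII text readable."""
    return json.dumps(asdict(record), ensure_ascii=False, sort_keys=True)


if __name__ == "__main__":
    sample = ManuscriptRecord(
        title="(sample title)",
        author="(sample author)",
        copy_date=None,  # unknown copying date
        script="Naskh",
        folios=120,
        lines_per_side=17,
    )
    print(to_json(sample))
```

Keeping the copying date as optional free text is a deliberate choice in this sketch, since many manuscripts are undated or dated only approximately.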
3- Sites and databases for browsing and downloading manuscripts
Several websites and digital libraries offer complete manuscripts for browsing and downloading, alongside groups and forums. Examples include:
Wadood Manuscript Center: a site specialized in manuscripts, categorized by topic
Manuscript catalog group on Telegram
4- Automatic text recognition techniques
This technology automatically recognizes the text of a manuscript and converts it into encoded, editable text by means of Optical Character Recognition (OCR). Dr. Muhammad Hosni Yahya summarizes its most important stages in the attached picture. Recognition of handwriting and historical manuscripts has made great progress for the Latin and Chinese scripts, while recognition of Arabic script has lagged behind for several reasons, which are detailed in our article. The Zangi application was developed to solve this problem. It now offers automatic recognition of manuscripts written in several scripts, such as Naskh, Ta'liq, Maghrebi, and others. The application also offers automatic recognition of text from both modern and old lithographic publications, among others.
5- Image search based on content
This approach searches for words in a manuscript by using images of words. It suits those who look for key words in manuscripts, such as hadith texts, or who want to locate the beginning and end of chapters, and so on. One useful program for this is the SIAT system, which allows browsing the manuscript, manipulating page images, and searching with an image of a word. We also note that the Zangi application makes this search process much easier: after automatically recognizing the manuscript text, it allows searching within the recognized text, and it will soon, God willing, allow searching within the manuscript itself by means of text rather than images.
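Searching by word image can be sketched as template matching: slide the query image over the page and score each position by how many pixels agree. The toy Python example below works on tiny binarized images (1 = ink, 0 = background). It is a simplified illustration under that assumption and does not describe how SIAT or Zangi implement their search; the function names and the 0.9 threshold are hypothetical.

```python
def match_score(page, query, top, left):
    """Fraction of query pixels that equal the page pixels at (top, left)."""
    rows, cols = len(query), len(query[0])
    agree = sum(
        page[top + r][left + c] == query[r][c]
        for r in range(rows)
        for c in range(cols)
    )
    return agree / (rows * cols)


def find_word(page, query, threshold=0.9):
    """Return (top, left, score) for every position scoring >= threshold."""
    hits = []
    for top in range(len(page) - len(query) + 1):
        for left in range(len(page[0]) - len(query[0]) + 1):
            score = match_score(page, query, top, left)
            if score >= threshold:
                hits.append((top, left, score))
    return hits


if __name__ == "__main__":
    # Toy 4x8 "page" that contains the 2x3 query pattern twice.
    page = [
        [0, 1, 1, 0, 0, 0, 0, 0],
        [0, 1, 0, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 1, 1, 0, 0],
        [0, 0, 0, 0, 1, 0, 1, 0],
    ]
    query = [
        [1, 1, 0],
        [1, 0, 1],
    ]
    for top, left, score in find_word(page, query):
        print(f"match at row {top}, column {left} (score {score:.2f})")
```

Lowering the threshold tolerates small differences in handwriting or scan noise, at the cost of more false matches.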
Conclusion
Efforts to digitize manuscripts have multiplied in recent years, and the Zangi application is considered a building block of the digital edifice that serves the Arab and Islamic heritage, as it complements digital imaging, scanning, digital cataloging, databases, and other services. Zangi offers an effective way to automatically recognize the text of imaged manuscripts and convert it, very quickly, into editable and convertible text.
References
[1] Ahmed Shawqi Binbin, “In the Arabic manuscript book”, p. 21
[2] Omar Bin Araj, “Modern Mechanisms of Preserving the Arabic Manuscript Between Reality and Prospects”, Rafoof Magazine, No. IV
[3] Abd al-Aziz bin Muhammad al-Misfir, “The Arabic manuscript and some of its cases,” citing the previous source, p. 83
[4] The Scientific Symposium: “Technical Projects in the Service of the Arabic Language”, Digitizing Arabic Manuscripts, The Arabic Language Academy, Makkah Al-Mukarramah, 07/2021
[5] “Technical Standards for Digital Conversion of Text and Graphic Materials”, The Library of Congress
[6] Dr. Mahmoud Zaki, “Electronic Sources for Text Verification”, Alukah Network