[Apologies for multiple postings]
We are happy to announce that one new phonetic database and one new speech
corpus are now available in our catalogue.
*Comprehensive Arabic Phonetic Database
<https://catalog.elra.info/en-us/repository/browse/ELRA-S0493/>*
ISLRN: 511-751-240-544-8 <https://islrn.org/resources/511-751-240-544-8/>
The Comprehensive Arabic Phonetic Database is a robust and detailed
linguistic resource offering both phonemic and phonetic transcriptions,
precisely reflecting how Modern Standard Arabic words are realized in
actual speech. It is a highly comprehensive and accurate Arabic
phonetic/phonemic database, covering over 329,000 entries, including
over 61,000 general vocabulary entries, 101,000 Arab personal names,
143,000 foreign personal names in Arabic and 21,000 worldwide place
names, both Arab and non-Arab. Each entry consists of canonical forms,
both vocalized and unvocalized (as in natural language), accompanied by
phonetic transcriptions in IPA and X-SAMPA and by a phonemic transcription
in the user-friendly CARS system. Additional unique features include
explicit indication of vowel neutralization, accurate word stress,
gender and number codes (singular or plural), and POS (part-of-speech)
codes. The database is provided in a flat TSV text file.
See also the *DiaLEX
<https://catalog.elra.info/en-us/repository/search/?q=dialex>* and
*ArabLEX <https://catalog.elra.info/en-us/repository/search/?q=arablex>*
collections for Arabic from the same provider…
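As a rough illustration of how a flat TSV file like the one described above might be loaded, here is a minimal Python sketch. It assumes the file has a header row; the column names (unvocalized, vocalized, ipa, xsampa, pos) and the sample rows are hypothetical placeholders, not the actual layout of the distributed file, so please check the documentation that ships with the database.

```python
import csv
import io

# Placeholder data standing in for the real file. The column names below are
# hypothetical; the header of the distributed TSV may differ.
SAMPLE_TSV = (
    "unvocalized\tvocalized\tipa\txsampa\tpos\n"
    "FORM_1\tVOCALIZED_1\tIPA_1\tXSAMPA_1\tN\n"
    "FORM_2\tVOCALIZED_2\tIPA_2\tXSAMPA_2\tV\n"
)


def load_entries(handle):
    """Read tab-separated rows into a list of dicts keyed by the header row."""
    # QUOTE_NONE keeps quote characters inside transcriptions untouched.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


def index_by(entries, column):
    """Group entries by one column; several entries may share the same key."""
    index = {}
    for entry in entries:
        index.setdefault(entry[column], []).append(entry)
    return index


entries = load_entries(io.StringIO(SAMPLE_TSV))
by_form = index_by(entries, "unvocalized")
print(len(entries), by_form["FORM_1"][0]["ipa"])
```

For a real file, the io.StringIO object would be replaced by open(path, encoding="utf-8", newline=""). Grouping into lists rather than single values is a deliberate choice here, since one unvocalized form can correspond to several vocalized entries.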
*EthioSpeech
<https://catalog.elra.info/en-us/repository/browse/ELRA-S0494/>*
ISLRN: 886-456-351-764-8 <https://islrn.org/resources/886-456-351-764-8/>
EthioSpeech comprises over 391 hours of read speech in six Ethiopian
languages, recorded by ca. 200 speakers per
language: Amharic (68 hours), Tigrigna (62 hours), Oromo (70 hours),
Somali (56 hours), Afar (68 hours), and Sidama (68 hours). The
dominant domain is media (mainly newspapers), but for some of the
languages texts from other domains were also used, including spiritual
content. The recordings were made on mobile devices with the LIG-Aikuma
speech recording tool installed on them. The gender and age balance of
readers is nearly even for Amharic, Tigrigna and Oromo, whereas readers
for the other three languages are mainly male. Readers are between 18
and 40 years old.
For more information on the catalogue or if you would like to enquire
about having your resources distributed by ELRA, please *contact us*
<mailto:cont...@elda.org>.
_________________________________________
Visit the *ELRA Catalogue of Language Resources* <http://catalog.elra.info>
*Archives*
<https://www.elra.info/catalogues/language-resources-announcements/> of
ELRA Language Resources Catalogue Updates