[Our apologies if you have received multiple copies of this announcement.]
We are happy to announce that a new set of 15 Written Corpora is now
available in our catalogue.
*Arabic-English, Arabic-French, Chinese-English and Chinese-French
Written Parallel Corpora:*
This set of 15 written corpora was produced by ELDA within PEA TRAD, a
project supported by the French Ministry of Defence (DGA). Available
resources are listed below (click on the links for further details).
*ELRA-W0098 TRAD Arabic-French Newspaper Parallel corpus - Test set 1*
*ISLRN: 922-732-502-473-8* <http://islrn.org/resources/922-732-502-473-8/>
This is a parallel corpus of 10,000 words in Arabic and 4 reference
translations in French. The source texts are articles collected in 2012
from the Arabic version of Le Monde Diplomatique.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1278
*ELRA-W0099 TRAD Arabic-English Newspaper Parallel corpus - Test set 1*
*ISLRN: 764-187-795-074-0* <http://islrn.org/resources/764-187-795-074-0/>
This is a parallel corpus of 10,000 words in Arabic and 2 reference
translations in English. The source texts are articles collected in 2012
from the Arabic version of Le Monde Diplomatique.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1279
*ELRA-W0100 TRAD Arabic-French Newspaper Parallel corpus - Test set 2*
*ISLRN: 722-323-886-920-3* <http://islrn.org/resources/722-323-886-920-3/>
This is a parallel corpus of 10,000 words in Arabic and 2 reference
translations in French. The source texts are articles collected in May
2013 from the Arabic version of Le Monde Diplomatique.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1280
*ELRA-W0101 TRAD Arabic-French Parallel corpus of transcribed Broadcast
News Speech*
*ISLRN: 862-201-329-808-4* <http://islrn.org/resources/862-201-329-808-4/>
This is a parallel corpus of 10,000 words in Arabic and 4 reference
translations in French. The source texts are transcriptions of broadcast
news in Arabic recorded on France 24.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1281
*ELRA-W0102 TRAD Arabic-English Parallel corpus of transcribed Broadcast
News Speech*
*ISLRN: 812-050-111-234-9* <http://islrn.org/resources/812-050-111-234-9/>
This is a parallel corpus of 10,000 words in Arabic and 2 reference
translations in English. The source texts are transcriptions of
broadcast news in Arabic recorded on France 24.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1282
*ELRA-W0103 TRAD Arabic-French Web domain (blogs) Parallel corpus*
*ISLRN: 138-395-895-757-7* <http://islrn.org/resources/138-395-895-757-7/>
This is a parallel corpus of 10,000 words in Arabic and 4 reference
translations in French. The source texts are blog articles from 2008 to
2013.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1283
*ELRA-W0104 TRAD Arabic-English Web domain (blogs) Parallel corpus*
*ISLRN: 762-161-069-435-5* <http://islrn.org/resources/762-161-069-435-5/>
This is a parallel corpus of 10,000 words in Arabic and 2 reference
translations in English. The source texts are blog articles from 2008 to
2013.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1284
*ELRA-W0105 TRAD Arabic-French Mailing lists Parallel corpus - Test set*
*ISLRN: 895-850-015-188-4* <http://islrn.org/resources/895-850-015-188-4/>
This is a parallel corpus of 10,000 words in Arabic and 4 reference
translations in French. The source texts are emails collected from
Wikiar-I, a mailing list for discussions about the Arabic Wikipedia.
Emails are dated from 2010 to 2012.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1285
*ELRA-W0106 TRAD Arabic-English Mailing lists Parallel corpus - Test set*
*ISLRN: 858-529-510-480-2* <http://islrn.org/resources/858-529-510-480-2/>
This is a parallel corpus of 10,000 words in Arabic and 2 reference
translations in English. The source texts are emails collected from
Wikiar-I, a mailing list for discussions about the Arabic Wikipedia.
Emails are dated from 2010 to 2012.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1286
*ELRA-W0107 TRAD Arabic-French Mailing lists Parallel corpus -
Development set*
*ISLRN: 333-026-450-858-0* <http://islrn.org/resources/333-026-450-858-0/>
This is a parallel corpus of 10,000 words in Arabic and a reference
translation in French. The source texts are emails collected from
Wikiar-I, a mailing list for discussions about the Arabic Wikipedia. The
collected emails are dated from 2004 to 2007.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1287
*ELRA-W0108 TRAD Arabic-English Mailing lists Parallel corpus -
Development set*
*ISLRN: 213-044-240-074-6* <http://islrn.org/resources/213-044-240-074-6/>
This is a parallel corpus of 10,000 words in Arabic and a reference
translation in English. The source texts are emails collected from
Wikiar-I, a mailing list for discussions about the Arabic Wikipedia. The
collected emails are dated from 2004 to 2007.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1288
*ELRA-W0109 TRAD Chinese-French Web domain (blogs) Parallel corpus*
*ISLRN: 464-017-697-777-3* <http://islrn.org/resources/464-017-697-777-3/>
This is a parallel corpus of 15,000 characters in Chinese (equivalent to
10,000 words) and 2 reference translations in French. The source texts
are blog articles on various subjects, such as the economy, the
environment, society and technology. Articles are dated from June 2013.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1289
*ELRA-W0110 TRAD Chinese-English Web domain (blogs) Parallel corpus*
*ISLRN: 982-341-079-331-4* <http://islrn.org/resources/982-341-079-331-4/>
This is a parallel corpus of 15,000 characters in Chinese (equivalent to
10,000 words) and 2 reference translations in English. The source texts
are blog articles on various subjects, such as the economy, the
environment, society and technology. Articles are dated from June 2013.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1290
*ELRA-W0111 TRAD Chinese-French News Articles Parallel corpus*
*ISLRN: 153-566-144-442-2* <http://islrn.org/resources/153-566-144-442-2/>
This is a parallel corpus of 15,000 characters in Chinese (equivalent to
10,000 words) and 2 reference translations in French. The source texts
are newspaper articles from the Chinese version of Voice of America.
Articles are dated from 2011 and 2012.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1291
*ELRA-W0112 TRAD Chinese-English News Articles Parallel corpus*
*ISLRN: 626-096-751-907-7* <http://islrn.org/resources/626-096-751-907-7/>
This is a parallel corpus of 15,000 characters in Chinese (equivalent to
10,000 words) and 2 reference translations in English. The source texts
are newspaper articles from the Chinese version of Voice of America.
Articles are dated from 2011 and 2012.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1292
___________________________________________
Previous releases from the same project covered the *Pashto* language
and are listed below:
*ELRA-S0381 TRAD Pashto Broadcast News Speech Corpus*
*ISLRN: 918-508-885-913-7* <http://islrn.org/resources/918-508-885-913-7/>
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1265
*ELRA-W0092 TRAD Pashto Monolingual text Corpus*
*ISLRN: 394-903-293-388-0* <http://islrn.org/resources/394-903-293-388-0/>
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1266
*ELRA-W0093 TRAD Pashto-French Parallel corpus of transcribed Broadcast
News Speech - Training data*
*ISLRN: 802-643-297-429-4* <http://islrn.org/resources/802-643-297-429-4/>
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1267
*ELRA-W0094 TRAD Pashto-French Parallel corpus of transcribed Broadcast
News Speech - Test data*
*ISLRN: 547-897-479-723-3* <http://islrn.org/resources/547-897-479-723-3/>
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1268
*ELRA-W0095 TRAD Pashto-English Parallel corpus of transcribed Broadcast
News Speech - Test data*
*ISLRN: 006-102-605-738-4* <http://islrn.org/resources/006-102-605-738-4/>
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1269
*ELRA-W0096 TRAD Pashto-French News Articles Parallel corpus*
*ISLRN: 649-628-149-051-7* <http://islrn.org/resources/649-628-149-051-7/>
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1270
*ELRA-W0097 TRAD Pashto-English News Articles Parallel corpus*
*ISLRN: 612-936-517-010-2* <http://islrn.org/resources/612-936-517-010-2/>
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1271
For more information on the catalogue, please contact Valérie Mapelli
mailto:[email protected]
If you would like to enquire about having your resources distributed by
ELRA, please do not hesitate to contact us.
Visit our On-line Catalogue: http://catalog.elra.info
Visit the Universal Catalogue: http://universal.elra.info
Archives of ELRA Language Resources Catalogue Updates:
http://www.elra.info/en/catalogues/language-resources-announcements/