[Apologies for multiple postings]
We are happy to announce that 1 new written corpus, 1 new monolingual
lexicon and 2 new speech resources are now available in our catalogue.
Corpus for fine-grained analysis and automatic detection of irony on
Twitter <https://catalogue.elra.info/en-us/repository/browse/ELRA-W0337/>
ISLRN: 478-366-550-085-8 <http://www.islrn.org/resources/478-366-550-085-8>
This corpus was annotated by trained annotators (Master’s students in
Linguistics) using a detailed annotation scheme for irony
categorization, which describes four labels: ‘ironic by means of a
polarity contrast’, ‘situational irony’, ‘other verbal irony’ and ‘not
ironic’. It consists of 4791 instances, each with an irony label and a
tweet ID.
Bitext Synonym Data - General Language
<https://catalogue.elra.info/en-us/repository/browse/ELRA-L0202/>
ISLRN: 470-885-612-363-1 <http://www.islrn.org/resources/470-885-612-363-1>
The Bitext Synonym Data - General Language includes 31,723 entries and
more than 100,000 synonyms for the English language. This dataset is a
set of synonyms developed to augment the English version of WordNet, a
powerful open-source lexical database, released in 2005. All synonyms
can be linked to Bitext Lexical Data - English (see ELRA-L0140) for
lemmatization, POS and morphological information.
Corpus of Spontaneous Japanese (CSJ)
<https://catalog.elra.info/en-us/repository/browse/ELRA-S0488/>
ISLRN: 280-594-494-328-0 <https://islrn.org/resources/280-594-494-328-0/>
The "Corpus of Spontaneous Japanese" (or CSJ) contains about 650 hours
of spontaneous speech, corresponding to about 7 million words. All
speech materials were recorded using head-worn close-talking microphones
and DAT, and down-sampled to 16 kHz with 16-bit accuracy. The speech
material is transcribed at both the orthographic and phonetic levels. In
addition, segment labels, intonation labels, and other miscellaneous
annotations are provided for a subset of CSJ, called the Core, which
contains about 500,000 words or 45 hours of speech.
EWA-DB – Early Warning of Alzheimer speech database
<https://catalogue.elra.info/en-us/repository/browse/ELRA-S0489/>
ISLRN: 730-022-142-264-9 <http://www.islrn.org/resources/730-022-142-264-9>
EWA-DB is a speech database that contains data from 3 clinical groups
(Alzheimer's disease, Parkinson's disease, and mild cognitive
impairment) and a control group of healthy subjects. Speech samples of
each group were obtained using the EWA smartphone application, which
contains 4 different language tasks: sustained vowel phonation,
diadochokinesis, object and action naming (30 objects and 30 actions),
and picture description (two single pictures and three complex
pictures). The total number of speakers in the database is 1649: 87
people with Alzheimer's disease, 175 people with Parkinson's disease, 62
people with mild cognitive impairment, 2 people with a mixed diagnosis
of Alzheimer's and Parkinson's disease, and 1323 healthy controls.
For more information on the catalogue or if you would like to enquire
about having your resources distributed by ELRA, please contact us
<mailto:cont...@elda.org>.
_________________________________________
Visit the ELRA Catalogue of Language Resources <http://catalog.elra.info>
Visit the Universal Catalogue <http://universal.elra.info>
Archives of ELRA Language Resources Catalogue Updates
<http://www.elra.info/en/catalogues/language-resources-announcements>