We are very happy to announce the twenty-third release of annotated
treebanks in Universal Dependencies, v2.17, available at
https://universaldependencies.org/.

Universal Dependencies is a project that seeks to develop
cross-linguistically consistent treebank annotation for many languages
with the goal of facilitating multilingual parser development,
cross-lingual learning, and parsing research from a language typology
perspective (de Marneffe et al., 2021; Nivre et al., 2020). The
annotation scheme is based on (universal) Stanford dependencies (de
Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags
(Petrov et al., 2012), and the Interset interlingua for morphosyntactic
tagsets (Zeman, 2008). The general philosophy is to provide a universal
inventory of categories and guidelines to facilitate consistent
annotation of similar constructions across languages, while allowing
language-specific extensions when necessary.

The *339* treebanks in v2.17 are annotated according to version 2 of the
UD guidelines and represent the following *186* languages: Abaza,
Abkhaz, Afrikaans, Akkadian, Akuntsu, Albanian, Alemannic, Amharic,
Ancient Greek, Ancient Hebrew, Apurina, Arabic, Armenian, Assyrian,
Azerbaijani, Bambara, Basque, Bavarian, Beja, Belarusian, Bengali,
Bhojpuri, Bokota, Bororo, Breton, Bulgarian, Buryat, Cantonese,
Cappadocian, Catalan, Cebuano, Central Kurdish, Chinese, Chintang,
Chukchi, Classical Armenian, Classical Chinese, Coptic, Croatian, Czech,
Danish, Dutch, Egyptian, English, Erzya, Esperanto, Estonian, Faroese,
Finnish, French, Frisian Dutch, Galician, Georgian, German, Gheg,
Gothic, Greek, Guajajara, Guarani, Gujarati, Gwichin, Haitian Creole,
Hausa, Hebrew, Highland Puebla Nahuatl, Hindi, Hittite, Hungarian,
Icelandic, Ika, Indonesian, Irish, Italian, Japanese, Javanese, Kaapor,
Kangri, Karelian, Karo, Kazakh, Khoekhoe, Khunsari, Kiche, Komi Permyak,
Komi Zyrian, Korean, Kyrgyz, Latgalian, Latin, Latvian, Ligurian,
Lithuanian, Livvi, Low Saxon, Luxembourgish, Macedonian, Madi, Maghrebi
Arabic French, Makurap, Malayalam, Maltese, Manx, Marathi, Mbya Guarani,
Middle French, Moksha, Munduruku, Naga, Naija, Nayini, Neapolitan,
Nenets, Nheengatu, North Sami, Northern Kurdish, Northwest Gbaya,
Norwegian, Occitan, Odia, Old Church Slavonic, Old East Slavic, Old
English, Old French, Old Irish, Old Occitan, Old Turkish, Ottoman
Turkish, Pashto, Paumari, Persian, Pesh, Phrygian, Polish, Pomak,
Portuguese, Romanian, Russian, Sanskrit, Scottish Gaelic, Serbian,
Shanghainese, Sicilian, Sindhi, Sinhala, Skolt Sami, Slovak, Slovenian,
Soi, South Levantine Arabic, Southern Kurdish, Spanish, Spanish Sign
Language, Swedish, Swedish Sign Language, Tagalog, Tamil, Tatar, Teko,
Telugu, Telugu English, Thai, Tswana, Tupinamba, Turkish, Turkish
English, Turkish German, Ukrainian, Umbrian, Upper Sorbian, Urdu,
Uyghur, Uzbek, Veps, Vietnamese, Warlpiri, Welsh, Western Armenian,
Western Sierra Puebla Nahuatl, Wolof, Xavante, Xibe, Yakut, Yiddish,
Yoruba, Yupik and Zaar. The 186 languages belong to *35* families:
Afro-Asiatic, Arawakan, Arawan, Austro-Asiatic, Austronesian, Basque,
Bororoan, Chibchan, Chukotko-Kamchatkan, Code switching, Constructed,
Creole, Dravidian, Eskimo-Aleut, Indo-European, Japanese, Kartvelian,
Khoe-Kwadi, Korean, Macro-Je, Mande, Mayan, Mongolic, Na-Dene,
Niger-Congo, Northwest Caucasian, Pama-Nyungan, Sign Language,
Sino-Tibetan, Tai-Kadai, Tungusic, Tupian, Turkic, Uralic and
Uto-Aztecan. Depending on the language, the treebanks range in size from
fewer than 1,000 tokens to over 3 million tokens. We expect the next
release to be available in May 2026.

The size of the following 44 treebanks has changed significantly since
the last release (note that UD_Occitan-CorAG has been recategorized
under another language code as UD_Old_Occitan-CorAG; in addition, the
Kurmanji language has been renamed to Northern Kurdish, so
UD_Northern_Kurdish-Kurmanji is just a new name for UD_Kurmanji-MG):
Alemannic DIVITAL         :      0 →  19743
Ancient Hebrew PTNK       :  90770 → 145866
Armenian ArmTDP           :  52585 → 104293
Armenian BSUT             :  41805 →  46168
Central Kurdish Mukri     :      0 →    765
Chintang CTNTB            :      0 →  14631
Egyptian UJaen            :  21927 →  24375
English CHILDES           : 226470 → 302740
English LittlePrince      :      0 →   6852
Esperanto Prago           :    839 →   3165
French ALTS               :  43832 →  68088
French PoitevinDIVITAL    :      0 →   5484
Georgian GNC              :  18747 →  22547
Greek Lesbian             :   3333 →   5926
Greek Messinian           :      0 →    899
Hausa WesternAutogramm    :      0 →  13888
Ika ChibErgIS             :   3706 →   5307
Italian KIParlaForest     :      0 →   9348
Korean KSL                : 108072 → 137122
Middle French ALTM        :      0 →   7084
Middle French PROFITEROLE :  68454 →  89809
Nenets Tundra             :    651 →   1272
Nheengatu CompLin         :  21813 →  26033
Northern Kurdish Kurmanji :      0 →  10259
Occitan CorAG             :  37585 →      0
Old East Slavic Ruthenian : 111502 → 137333
Old French ALTM           :      0 →  15285
Old Occitan CorAG         :      0 →  45389
Ottoman Turkish DUDU      :  10287 →  17125
Pashto Sikaram            :   2515 →   4067
Polish MPDT               :      0 →  47273
Romanian MolDoRo          :      0 →    241
Shanghainese ShUD         :      0 →   8584
Sicilian STB              :      0 →  11204
Sindhi Isra               :  15741 →  95227
Southern Kurdish Garrusi  :      0 →   1685
Swedish Old               :      0 →    507
Swedish SweLL             :      0 →   8644
Thai TUD                  :      0 →  77215
Ukrainian ParlaMint       :  84189 → 109166
Umbrian IKUVINA           :    786 →   1162
Uzbek UzUDT               :      0 →   7542
Yiddish YiTB              :      0 →  27879
Zaar Autogramm            :  17682 →  20758

In total, the new release contains 2313529 sentences, 37133071 surface
tokens and 37877213 syntactic words.

Daniel Zeman, Joakim Nivre, Mitchell Abrams, Elia Ackermann, Jephtey
Adolphe, Noëmi Aepli, Hamid Aghaei, Željko Agić, Amir Ahmadi, Lars
Ahrenberg, Chika Kennedy Ajede, Arofat Akhundjanova, Furkan Akkurt,
Gabrielė Aleksandravičiūtė, Ika Alfina, Avner Algom, Khalid Alnajjar,
Chiara Alzetta, Antonios Anastasopoulos, Erik Andersen, Kirk Andrews,
Matthew Andrews, Lene Antonsen, Tatsuya Aoyama, Katya Aplonova, Angelina
Aquino, Carolina Aragon, Glyd Aranes, Maria Jesus Aranzabe, Bilge Nas
Arıcan, Þórunn Arnardóttir, Wirote Aroonmanakun, Gashaw Arutie, Jessica
Naraiswari Arwidarasti, Hiwa Asadpour, Masayuki Asahara, Katla
Ásgeirsdóttir, Deniz Baran Aslan, Cengiz Asmazoğlu, Luma Ateyah, Furkan
Atmaca, Mohammed Attia, Aitziber Atutxa, Liesbeth Augustinus, Mariana
Avelãs, Elena Badmaeva, Jana Bajorat, Keerthana Balasubramani, Miguel
Ballesteros, Esha Banerjee, Sebastian Bank, Bryan Khelven da Silva
Barbosa, Verginica Barbu Mititelu, Starkaður Barkarson, Rodolfo Basile,
Victoria Basmov, Colin Batchelor, John Bauer, Seyyit Talha Bedir,
Shabnam Behzad, Nathanaël Beiner, Juan Belieni, Alevtina Bémová, Kepa
Bengoetxea, İbrahim Benli, Yifat Ben Moshe, Marie Benzerrak, Aleksandrs
Berdicevskis, Ansu Berg, Gözde Berk, Delphine Bernhard, Astrid Berntsson
Ingelstam, Riyaz Ahmad Bhat, Erica Biagetti, Eckhard Bick, Agnė
Bielinskienė, Esma Fatıma Bilgin Taşdemir, Helin Binici, Kristín
Bjarnadóttir, Verena Blaschke, Rogier Blokland, Nina Böbel, Victoria
Bobicev, Loïc Boizou, Stavros Bompolas, Johnatan Bonilla, Emanuel Borges
Völker, Lars Borin, Carl Börstell, Cristina Bosco, Gosse Bouma, Sam
Bowman, Adriane Boyd, Anouck Braggaar, António Branco, Myriam Bras, Théo
Brillet, Kristina Brokaitė, Lanni Bu, Eva Buráňová, Aljoscha Burchardt,
Carmen Cabeza, Natalia Cáceres Arandia, Olesea Caftanatov, Marisa
Campos, Marie Candito, Caterina Maria Cappello, Bernard Caron, Gauthier
Caron, Catarina Carvalheiro, Rita Carvalho, Lauren Cassidy, Maria Clara
Castro, Sérgio Castro, Tatiana Cavalcanti, Gülşen Cebiroğlu Eryiğit,
Flavio Massimiliano Cecchini, Giuseppe G. A. Celano, Anila Çepani,
Slavomír Čéplö, Neslihan Cesur, Savas Cetin, Özlem Çetinoğlu, Fabricio
Chalub, Liyanage Chamila, Claudine Chamoreau, Shweta Chauhan, Yifei
Chen, Ethan Chi, Taishi Chika, Yongseok Cho, Jinho Choi, Bermet
Chontaeva, Jayeol Chun, Juyeon Chung, Alessandra T. Cignarella, Silvie
Cinková, Esther Cocco, Aurélie Collomb, Çağrı Çöltekin, Miriam Connor,
Claudia Corbetta, Daniela Corbetta, Francisco Costa, Marine Courtin,
Benoît Crabbé, Mihaela Cristescu, Vladimir Cvetkoski, Netanel Dahan,
Ingerid Løyning Dale, Sabrina D'Alì, Philemon Daniel, Khensa Daoudi,
Bijayalaxmi Dash, Satya Ranjan Dash, Elizabeth Davidson, Leonel
Figueiredo de Alencar, Mathieu Dehouck, Martina de Laurentiis,
Marie-Catherine de Marneffe, Ahmet Demir, Valeria de Paiva, Mehmet Oguz
Derin, Elvis de Souza, Arantza Diaz de Ilarraza, Roberto Antonio Díaz
Hernández, Carly Dickerson, Ariani Di Felippo, Arawinda Dinakaramani,
Elisa Di Nuovo, Bamba Dione, Peter Dirix, Hoa Do, Kaja Dobrovoljc,
Caroline Döhmer, Adrian Doyle, Timothy Dozat, Kira Droganova, Magali
Sanches Duran, Puneet Dwivedi, Christian Ebert, Hanne Eckhoff, Masaki
Eguchi, Sandra Eiche, Roald Eiselen, Marhaba Eli, Ali Elkahky, Binyam
Ephrem, Olga Erina, Tomaž Erjavec, Louise Esher, Soudabeh Eslami, Farah
Essaidi, Aline Etienne, Wograine Evelyn, Sidney Facundes, Richárd
Farkas, Ján Faryad, Federica Favero, Jannatul Ferdaousi, Marília
Fernanda, Hector Fernandez Alcalde, Amal Fethi, Jennifer Foster, Barbara
Francioni, Theodorus Fransen, Cláudia Freitas, Kazunori Fujita, Katarína
Gajdošová, Daniel Galbraith, Edith Galy, Federica Gamba, Marcos Garcia,
José María García-Miguel, Moa Gärdenfors, Tanja Gaustad, Efe Eren Genç,
Fabrício Ferraz Gerardi, Kim Gerdes, Luke Gessler, Filip Ginter, Gustavo
Godoy, Iakes Goenaga, Koldo Gojenola, Memduh Gökırmak, Yoav Goldberg,
Gili Goldin, Xavier Gómez Guinovart, Berta González Saavedra, Mathieu
Goux, Bernadeta Griciūtė, Matias Grioni, Loïc Grobol, Normunds Grūzītis,
Mario Guglielmetti, Bruno Guillaume, Kirian Guiller, Céline
Guillot-Barbance, Tunga Güngör, Vladimir Gurevich, Nizar Habash, Hinrik
Hafsteinsson, Michael Hahn, Jan Hajič, Jan Hajič jr., Eva Hajičová, Mika
Hämäläinen, Linh Hà Mỹ, Na-Rae Han, Muhammad Yudistira Hanifmuti,
Takahiro Harada, Sam Hardwick, Kim Harris, Naïma Hassert, Dag Haug, Jiří
Havelka, Johannes Heinecke, Oliver Hellwig, Felix Hennig, Barbora
Hladká, Jaroslava Hlaváčová, Florinel Hociung, Diana Hoefels, Barbara
Hoff, Petter Hohle, Nick Howell, Yidi Huang, Marivel Huerta Mendez, Jena
Hwang, Takumi Ikeda, Inessa Iliadou, Anton Karl Ingason, Radu Ion, Elena
Irimia, Ọlájídé Ishola, Artan Islamaj, Kaoru Ito, Federica Iurescia,
Jessica K. Ivani, Sandra Jagodzińska, Siratun Jannat, Tomáš Jelínek,
Apoorva Jha, Ratanon Jiamsundutsadee, Katharine Jiang, Sylvanus Job,
Mayank Jobanputra, Anders Johannsen, Hildur Jónsdóttir, Fredrik
Jørgensen, Zhuoxuan Ju, Markus Juutinen, Hüner Kaşıkara, Nadezhda
Kabaeva, Sylvain Kahane, Hiroshi Kanayama, Jenna Kanerva, Neslihan Kara,
Ritván Karahóǧa, Jiří Kárník, Andre Kåsen, Tolga Kayadelen, Sarveswaran
Kengatharaiyer, Václava Kettnerová, Lilit Kharatyan, Jesse Kirchner,
Elena Klementieva, Elena Klyachko, Petr Kocharov, Arne Köhn, Abdullatif
Köksal, Veronika Kolářová, Kamil Kopacewicz, Timo Korkiakangas, Mehmet
Köse, Alexey Koshevoy, Nelda Kote, Natalia Kotsyba, Barbara Kovačić,
Jolanta Kovalevskaitė, Emmanuelle Kowner, Simon Krek, Parameswari
Krishnamurthy, Sandra Kübler, Lucie Kučová, Adrian Kuqi, Elmurod
Kuriyozov, Oğuzhan Kuyrukçu, Aslı Kuzgun, Sookyoung Kwak, Kris Kyle,
Käbi Laan, Veronika Laippala, Lorenzo Lambertino, Israel Landau, Tatiana
Lando, Septina Dian Larasati, Pierre Larrivée, Kusum Lata, Alexei
Lavrentiev, John Lee, Phương Lê Hồng, Alessandro Lenci, Wei Qi Leong,
Saran Lertpradit, Herman Leung, Lori Levin, Maria Levina, Lauren Levine,
Cheuk Ying Li, Josie Li, Keying Li, Yixuan Li, Yuan Li, KyungTae Lim,
Bruna Lima Padovani, Yi-Ju Jessica Lin, Krister Lindén, Yang Janet Liu,
Zoey Liu, Nikola Ljubešić, Irina Lobzhanidze, Olga Loginova, Markéta
Lopatková, Lucelene Lopes, Edita Luftiu, Arsenii Lukashevskyi, Stefano
Lusito, Anne-Marie Lutgen, Andry Luthfi, Mikko Luukko, Olga
Lyashevskaya, Teresa Lynn, Vivien Macketanz, Menel Mahamdi, Jean
Maillard, Punyanuch Maitreenukul, Ilya Makarchuk, Aibek Makazhanov,
Francesco Mambrini, Michael Mandl, Christopher Manning, Ruli Manurung,
Büşra Marşan, Cătălina Mărănduc, David Mareček, Katrin Marheinecke,
Stella Markantonatou, Héctor Martínez Alonso, Lorena Martín Rodríguez,
André Martins, Cláudia Martins, Arianna Masciolini, Jan Mašek, Sanatbek
Matlatipov, Hiroshi Matsuda, Yuji Matsumoto, Caterina Mauri, Alessandro
Mazzei, Ryan McDonald, Sarah McGuinness, Maitrey Mehta, Pierre André
Ménard, Gustavo Mendonça, Hilla Merhav, Tatiana Merzhevich, Paul Meurer,
Niko Miekka, Marie Mikulová, Emilia Milano, Aleksandra Miletić, Aaron
Miller, Junghyun Min, Yael Minerbi, Jiří Mírovský, Karina Mischenkova,
Anna Missilä, Cătălin Mititelu, Maria Mitrofan, Yusuke Miyao,
Biswakalpita Mohapatra, AmirHossein Mojiri Foroushani, Judit Molnár,
Amirsaeid Moloodi, Simonetta Montemagni, Amir More, Laura Moreno Romero,
Giovanni Moretti, Shinsuke Mori, Tomohiko Morioka, Shigeki Moro, Bjartur
Mortensen, Bohdan Moskalevskyi, Katerina Mouzou, Kadri Muischnek, Robert
Munro, Yugo Murawaki, Nikolett Mus, Kaili Müürisep, Pinkey Nainwani,
Mariam Nakhlé, Minoo Nassajian, Juan Ignacio Navarro Horñiacek, Anna
Nedoluzhko, Gunta Nešpore-Bērzkalne, Manuela Nevaci, Lương Nguyễn Thị,
Huyền Nguyễn Thị Minh, Yoshihiro Nikaido, Vitaly Nikolaev, Rattima
Nitisaroj, Victor Norrman, Alireza Nourian, Michal Novák, Maria das
Graças Volpe Nunes, Hanna Nurmi, Stina Ojala, Atul Kr. Ojha, Hulda
Óladóttir, Adédayọ̀ Olúòkun, Mai Omura, Emeka Onwuegbuzia, Noam Ordan,
Petya Osenova, Robert Östling, Annika Ott, Lilja Øvrelid, Masanori Oya,
Şaziye Betül Özateş, Merve Özçelik, Arzucan Özgür, Balkız Öztürk
Başaran, Teresa Paccosi, Petr Pajas, Thomas Palakapilly, Alessio Palmero
Aprosio, Jarmila Panevová, Ludovica Pannitto, Anastasia Panova, Thiago
Alexandre Salgueiro Pardo, Shantipriya Parida, Hyunji Hayley Park, Niko
Partanen, Elena Pascual, Marco Passarotti, Agnieszka Patejuk, Guilherme
Paulino-Passos, Giulia Pedonese, Oggi Peeters, Angelika Peljak-Łapińska,
Siyao Peng, Siyao Logan Peng, Rita Pereira, Sílvia Pereira,
Cenel-Augusto Perez, Natalia Perkova, Guy Perrier, Slav Petrov, Daria
Petrova, Eva Pettersson, Andrea Peverelli, Jason Phelan, Claudel
Pierre-Louis, Jussi Piitulainen, Yuval Pinter, Clara Pinto, Rodrigo
Pintucci, Tommi A Pirinen, Emily Pitler, Magdalena Plamada, Barbara
Plank, Alistair Plum, Thierry Poibeau, Charin Polpanumas, Larisa
Ponomareva, Martin Popel, Clamença Poujade, Rangga Prangwedana
Prangwedana, Lauma Pretkalniņa, Rigardt Pretorius, Sophie Prévost,
Prokopis Prokopidis, Adam Przepiórkowski, Robert Pugh, Tiina
Puolakainen, Christoph Purschke, Sampo Pyysalo, Peng Qi, Andreia
Querido, Andriela Rääbis, Ella Rabinovich, Alexandre Rademaker, Mutee-u
Rahman, Mizanur Rahoman, Taraka Rama, Loganathan Ramasamy, Carlos
Ramisch, Joana Ramos, Fam Rashel, Mohammad Sadegh Rasooli, Vinit
Ravishankar, Livy Real, Petru Rebeja, Siva Reddy, Mathilde Regnault,
Georg Rehm, Arij Riabi, Ivan Riabov, Michael Rießler, Erika Rimkutė,
Larissa Rinaldi, Laura Rituma, Putri Rizqiyah, Luisa Rocha, Eiríkur
Rögnvaldsson, Ivan Roksandic, Norton Trevisan Roman, Mykhailo Romanenko,
Natalia Romanova, Rudolf Rosa, Valentin Roșca, Paulette Roulon, Davide
Rovati, Ben Rozonoyer, Olga Rudina, Jack Rueter, Paolo Ruffolo, Kristján
Rúnarsson, Rozana Rushiti, Attapol T. Rutherford, Shoval Sadde, Pegah
Safari, Aleksi Sahala, Kalyanamalini Sahoo, Saraswati Sahoo, Shadi
Saleh, Alessio Salomoni, Tanja Samardžić, Konstantinos Sampanis,
Stephanie Samson, Xulia Sánchez-Rodríguez, Manuela Sanguinetti, Ezgi
Sanıyar, Dage Särg, Marta Sartor, Albina Sarymsakova, Mitsuya Sasaki,
Baiba Saulīte, Agata Savary, Yanin Sawanakunanon, Shefali Saxena, Kevin
Scannell, Salvatore Scarlata, Emmanuel Schang, Robert Schikowski, Nathan
Schneider, Sebastian Schuster, Lane Schwartz, Djamé Seddah, Wolfgang
Seeker, Sven Sellmer, Mojgan Seraji, Magda Ševčíková, Petr Sgall, Syeda
Shahzadi, Mo Shen, Atsuko Shimada, Gyu-Ho Shin, Hiroyuki Shirasu, Yana
Shishkina, Muh Shohibussirri, Maria Shvedova, Jean Sibille, Janine
Siewert, Einar Freyr Sigurðsson, João Silva, Aline Silveira, Natalia
Silveira, Sara Silveira, Maria Simi, Radu Simionescu, Katalin Simkó,
Mária Šimková, Haukur Barri Símonarson, Kiril Simov, Dmitri Sitchinava,
Ted Sither, Aaron Smith, Isabela Soares-Bastos, Per Erik Solberg,
Dolores Sollberger, Barbara Sonnenhauser, Shafi Sourov, Nina Speransky,
Rachele Sprugnoli, Panyut Sriwirote, Vivian Stamou, Steinþór
Steingrímsson, Antonio Stella, Jan Štěpánek, Barbora Štěpánková, Abishek
Stephen, Milan Straka, Omer Strass, Emmett Strickland, Jana Strnadová,
Sara Stymne, Alane Suhr, Yogi Lesmana Sulestio, Umut Sulubacak, Jack
Sun, Hakyung Sung, Shingo Suzuki, Daniel Swanson, Zsolt Szántó, Maria
Irena Szawerna, Chihiro Taguchi, Dima Taji, Luigi Talamo, Fabio
Tamburini, Mary Ann C. Tan, Takaaki Tanaka, Dipta Tanaya, Mirko Tavoni,
Nursena Teker, Samson Tella, Isabelle Tellier, Marinella Testori,
Santhawat Thanyawong, Guillaume Thomas, Tarık Emre Tıraş, William
Chandra Tjhi, Thea Tollersrud, Kamil Tomaszek, Sara Tonelli, Liisi
Torga, Lucas Toribio, Marsida Toska, Trond Trosterud, Anna Trukhina,
Reut Tsarfaty, Kira Tulchynska, Utku Türk, Francis Tyers, Sveinbjörn
Þórðarson, Vilhjálmur Þorsteinsson, Sumire Uematsu, Roman Untilov,
Zdeňka Urešová, Larraitz Uria, Hans Uszkoreit, Andrius Utka, Elena
Vagnoni, Sowmya Vajjala, Socrates Vak, Socrates Vakirtzian, Rob van der
Goot, Martine Vanhove, Daniel van Niekerk, Gertjan van Noord, Viktor
Varga, Helena Vaz, Uliana Vedenina, Giulia Venturi, Marianne
Vergez-Couret, Annemarie Verkerk, Barbora Vidová Hladká, Eric Villemonte
de la Clergerie, Veronika Vincze, Anishka Vissamsetty, Natalia Vlasova,
Eleni Vligouridou, Aya Wakasa, Joel C. Wallenberg, Lars Wallin, Abigail
Walsh, John Wang, Jonathan North Washington, Leonie Weissweiler,
Maximilan Wendt, Paul Widmer, Aleksandra Wieczorek, Shira Wigderson, Sri
Hartati Wijono, Vanessa Berwanger Wille, Seyi Williams, Miriam Winkler,
Shuly Wintner, Mats Wirén, Christian Wittern, Alena Witzlack-Makarevich,
Tsegay Woldemariam, Tak-sum Wong, Alina Wróblewska, Qishen Wu, Mary
Yako, Kayo Yamashita, Naoki Yamazaki, Chunxiao Yan, Qizhen Yang, Xiulin
Yang, Koichi Yasuoka, Marat M. Yavrumyan, Arife Betül Yenice, Enes
Yılandiloğlu, Olcay Taner Yıldız, Zhuoran Yu, Arlisa Yuliawati, Zdeněk
Žabokrtský, Shorouq Zahra, Amir Zeldes, Annie Zhang, Larry Zhang, He
Zhou, Hanzhi Zhu, Yilun Zhu, Anna Zhuravleva, Rayan Ziane, Artūrs
Znotiņš, Eleonora Zucchini

References

Marie-Catherine de Marneffe, Christopher Manning, Joakim Nivre, and
Daniel Zeman. 2021. Universal Dependencies. In Computational Linguistics
47:2, pp. 255–308.

Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Jan Hajič,
Christopher D. Manning, Sampo Pyysalo, Sebastian Schuster, Francis
Tyers, and Daniel Zeman. 2020. Universal Dependencies v2: An Evergrowing
Multilingual Treebank Collection. In Proceedings of LREC.

Marie-Catherine de Marneffe, Bill MacCartney, and Christopher D.
Manning. 2006. Generating typed dependency parses from phrase structure
parses. In Proceedings of LREC.

Marie-Catherine de Marneffe and Christopher D. Manning. 2008. The
Stanford typed dependencies representation. In COLING Workshop on
Cross-framework and Cross-domain Parser Evaluation.

Marie-Catherine de Marneffe, Timothy Dozat, Natalia Silveira, Katri
Haverinen, Filip Ginter, Joakim Nivre, and Christopher Manning. 2014.
Universal Stanford Dependencies: A cross-linguistic typology. In
Proceedings of LREC.

Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Yoav Goldberg,
Jan Hajič, Christopher D. Manning, Ryan McDonald, Slav Petrov, Sampo
Pyysalo, Natalia Silveira, Reut Tsarfaty, and Daniel Zeman. 2016.
Universal Dependencies v1: A Multilingual Treebank Collection. In
Proceedings of LREC.

Slav Petrov, Dipanjan Das, and Ryan McDonald. 2012. A universal
part-of-speech tagset. In Proceedings of LREC.

Daniel Zeman. 2008. Reusable Tagset Conversion Using Tagset Drivers. In
Proceedings of LREC.