[Apologies for multiple postings]

See you at the Hilton Austin hotel on November 1st.

EMNLP 2016 Second Workshop on Computational Approaches to Linguistic Code Switching
Hilton Austin hotel, Austin, Texas, USA
November 1st, 2016
http://care4lang1.seas.gwu.edu/cs2/call.html

Background
Code-switching (CS) is the phenomenon by which multilingual speakers switch back and forth between their common languages in written or spoken communication. CS is typically present on the inter-sentential, intra-sentential (mixing of words from multiple languages in the same utterance) and even morphological (mixing of morphemes) levels. CS presents serious challenges for language technologies, including parsing, Machine Translation (MT), automatic speech recognition (ASR), information retrieval (IR) and extraction (IE), and semantic processing. Traditional techniques trained for one language quickly break down when there is input mixed in from another. Even for problems that are considered solved, such as language identification or part-of-speech tagging, performance degrades at a rate proportional to the amount and level of mixed-language content present.

CS is pervasive in informal text communications such as news groups, tweets, blogs, and other social media of multilingual communities. Such genres are increasingly being studied as rich sources of social, commercial and political information. Apart from the informal genre challenge associated with such data within a single language processing scenario, the CS phenomenon adds another significant layer of complexity to the processing of the data. Efficiently and robustly processing CS data presents a new frontier for our NLP algorithms on all levels.

This workshop aims to bring together researchers interested in solving the problem and to raise awareness in the community at large of viable solutions that reduce the complexity of the phenomenon. The workshop invites contributions from researchers working on NLP approaches for the analysis and/or processing of mixed-language data, especially with a focus on intra-sentential code switching.

Invited Speakers

*Monojit Choudhury* <http://research.microsoft.com/en-us/people/monojitc/>
*Bio:* Monojit Choudhury is a Researcher at Microsoft Research Lab India.
Prior to this, he received his PhD (2007) and B.Tech (2002), both in Computer Science and Engineering, from the Indian Institute of Technology Kharagpur. His research interests include NLP for low-resource languages, technologies for multilingual communities, and computational approaches to linguistics, sociolinguistics, evolutionary linguistics and cognition. Monojit is actively involved in organizing the International Linguistics Olympiad <http://www.ioling.org> and its Indian national counterpart, the Panini Linguistics Olympiad <http://plo-in.org>, programs that try to attract the brightest high school kids to linguistics and NLP through challenging yet interesting and thought-provoking puzzles.

*Kalika Bali* <http://research.microsoft.com/en-us/people/kalikab/>
*Bio:* Kalika Bali is a Researcher at Microsoft Research Lab India. A linguist and an acoustic phonetician by training, she has worked for the last 15 years in the area of Speech and Language Technology, especially for resource-poor languages. Her brief stint as a lecturer at the University of the South Pacific, Fiji, has left her with a lasting interest in how technology can be used to enhance and further education, and some of her current research lies at the intersection of ICT and education, from primary school students to adults learning new skills. The primary focus of her research is on how Natural Language systems can help Human-Computer Interaction, including computer-mediated interaction, in the domain of education and social media.

Workshop Program

*Tuesday, November 1, 2016*

*Session 1: Opening Session*
08:45 - 09:00 *Remarks*
09:00 - 10:00 *Speakers*
Monojit Choudhury, Kalika Bali
10:00 - 10:30 *Challenges of Computational Processing of Code-Switching*
Özlem Çetinoğlu, Sarah Schulz and Ngoc Thang Vu
10:30 - 11:00 *Coffee Break*

*Session 2: Workshop Talks*
11:00 - 11:30 *Simple Tools for Exploring Variation in Code-switching for Linguists*
Gualberto A.
Guzman, Jacqueline Serigos, Barbara E. Bullock and Almeida Jacqueline Toribio
11:30 - 12:00 *Word-Level Language Identification and Predicting Codeswitching Points in Swahili-English Language Data*
Mario Piergallini, Rouzbeh Shirvani, Gauri S. Gautam and Mohamed Chouikha
12:00 - 12:30 *Part of Speech Tagging for Code Switched Data*
Fahad AlGhamdi, Giovanni Molina, Mona Diab, Thamar Solorio, Abdelati Hawwari, Victor Soto and Julia Hirschberg
12:30 - 14:00 *Lunch*

*Session 3: Shared Task*
14:00 - 14:30 *Task Overview Results / 1-min Poster Boosters. Overview for the Second Shared Task on Language Identification in Code-Switched Data.*
Giovanni Molina, Fahad AlGhamdi, Mahmoud Ghoneim, Abdelati Hawwari, Nicolas Rey-Villamizar, Mona Diab and Thamar Solorio
14:30 - 15:00 *Multilingual Code-switching Identification via LSTM Recurrent Neural Networks*
Younes Samih, Suraj Maharjan, Mohammed Attia, Laura Kallmeyer and Thamar Solorio
15:00 - 15:30 *A Neural Model for Language Identification in Code-Switched Tweets*
Aaron Jaech, George Mulcaire, Mari Ostendorf and Noah A.
Smith
15:30 - 16:00 *Coffee Break*

*Session 4: Panel Discussion and Poster Session*
16:00 - 16:45 *Panel Discussion*
16:45 - 18:00 *Poster Session*
*SAWT: Sequence Annotation Web Tool*
Younes Samih, Wolfgang Maier and Laura Kallmeyer
*Accurate Pinyin-English Codeswitched Language Identification*
Meng Xuan Xia and Jackie Chi Kit Cheung
*Unraveling the English-Bengali Code-Mixing Phenomenon*
Arunavha Chanda, Dipankar Das and Chandan Mazumdar
*Part-of-speech Tagging of Code-Mixed Social Media Text*
Souvick Ghosh, Satanu Ghosh and Dipankar Das
*Part-of-speech Tagging of Code-mixed Social Media Content: Pipeline, Stacking and Joint Modelling*
Utsab Barman, Joachim Wagner and Jennifer Foster
*The George Washington University System for the Code-Switching Workshop Shared Task 2016*
Mohamed Al-Badrashiny and Mona Diab
*Columbia-Jadavpur submission for EMNLP 2016 Code-Switching Workshop Shared Task: System description*
Arunavha Chanda, Dipankar Das and Chandan Mazumdar
*The Howard University System Submission for the Shared Task in Language Identification in Spanish-English Codeswitching*
Rouzbeh Shirvani, Mario Piergallini, Gauri Shankar Gautam and Mohamed Chouikha
*Codeswitching Detection via Lexical Features in Conditional Random Fields*
Prajwol Shrestha
*Language Identification in Code-Switched Text Using Conditional Random Fields and Babelnet*
Utpal Kumar Sikdar and Björn Gambäck
*Codeswitching language identification using Subword Information Enriched Word*
Meng Xuan Xia

Organizing Committee

- Mona Diab <https://www.seas.gwu.edu/~mtdiab/>
  - Associate Professor
  - Department of Computer Science
  - George Washington University
  - mtd...@email.gwu.edu
- Pascale Fung <http://www.ece.ust.hk/~pascale/>
  - Professor
  - Department of Electronic and Computer Engineering
  - Hong Kong University of Science and Technology
  - pasc...@ece.ust.hk
- Mahmoud Ghoneim <http://www.ghoneim.net/>
  - Research Scientist
  - Department of Computer Science
  - George Washington University
  - mghon...@email.gwu.edu
- Julia Hirschberg <http://www.cs.columbia.edu/~julia/>
  - Professor and Chair
  - Department of Computer Science
  - Columbia University
  - ju...@cs.columbia.edu
- Thamar Solorio <http://solorio.uh.edu/>
  - Associate Professor
  - Department of Computer Science
  - University of Houston
  - solo...@cs.uh.edu

Program Committee

- Constantine Lignos <http://lignos.org/>, University of Pennsylvania
- Elabbas Benmamoun <http://www.linguistics.illinois.edu/people/benmamou>, University of Illinois at Urbana-Champaign
- Agnes Bolonyai <http://www.ncsu.edu/linguistics/bolonyai.php/>, NC State University
- Cecilia Montes-Alcala <http://www.modlangs.gatech.edu/people/faculty/alcala>, Georgia Institute of Technology
- Yves Scherrer <http://www.latl.unige.ch/personal/yvesscherrer/#default.fr>, Université de Genève
- Björn Gambäck <http://care4lang1.seas.gwu.edu/cs2/call.html>, Norwegian University of Science and Technology
- Amitava Das <http://www.amitavadas.com/>, University of North Texas
- Younes Samih <http://care4lang1.seas.gwu.edu/cs2/call.html>, Dusseldorf University
- David Vilares <http://care4lang1.seas.gwu.edu/cs2/call.html>, Universidade da Coruña
- Sunayana Sitaram <http://care4lang1.seas.gwu.edu/cs2/call.html>, Microsoft Research India
- Almeida Jacqueline Toribio <https://www.utexas.edu/cola/insts/llilas/faculty/ajt95>, University of Texas at Austin
- Fahad AlGhamdi <http://student.seas.gwu.edu/~fghamdi/>, The George Washington University
- Giovanni Molina Ramos <http://care4lang1.seas.gwu.edu/cs2/call.html>, University of Houston
- Nicolas Rey Villamizar <http://care4lang1.seas.gwu.edu/cs2/call.html>, University of Houston
- Victor Soto <http://care4lang1.seas.gwu.edu/cs2/call.html>, Columbia University
- Borja Navarro Colorado <http://www.dlsi.ua.es/~borja/>, Universidad de Alicante
- Rabih Zbib <http://www.linguistics.illinois.edu/people/rbhatt>, BBN Technologies
- Barbara Bullock <https://www.utexas.edu/cola/depts/frenchitalian/faculty/bb25848>, University of Texas at Austin

Publications & Shared Task Chairs

- Fahad AlGhamdi <http://student.seas.gwu.edu/~fghamdi/>, The George Washington University
- Mahmoud Ghoneim <http://www.ghoneim.net/>, The George Washington University
- Giovanni Molina Ramos <http://care4lang1.seas.gwu.edu/cs2/call.html>, University of Houston

-----------------
Thanks,
Mahmoud Ghoneim, PhD
Research Scientist
Computer Science Department
School of Engineering and Applied Science
The George Washington University