The ATILF research unit (Computer Processing and Analysis of the French
Language) is offering a postdoctoral position in natural language processing
(NLP).

Topic: Discovery of multiword expressions, their meaning and their linguistic 
properties in texts using large language models
Location: ATILF, Nancy, France
Starting date: from February 2024
Duration: 12 months (with a possible extension of one more year)
Supervisors: Mathieu Constant (Univ. Lorraine, France) and Agata Savary (Univ. 
Paris-Saclay, France)
Salary: depends on post-PhD experience and salary grids, from 3070 euros (less
than 2 years of experience) to 4465 euros (more than 7 years of experience)
before tax
Application deadline: 5th December 2023


Subject. The term "multiword expression" refers to a combination of multiple
lexical items that displays irregular composition, possibly on different
linguistic levels (morphology, syntax, semantics, ...). Multiword expressions
cover a wide variety of phenomena, such as idioms (run around in circles),
support verb constructions (take a walk), nominal compounds (dry run), and
complex function units (in spite of). They have been the subject of extensive
research in the NLP community over the last 50 years.

The goal of this postdoctoral position is to investigate new methods for
discovering multiword expressions, their meaning and their linguistic
properties in texts, in order to enrich an induced semantic lexicon with new
multiword entries, definitions, argument structure, and other properties. The
emergence of large language models (LLMs) opens promising new perspectives for
multiword expressions, not only regarding their semantic compositionality but
also their linguistic characterization. The methods will be evaluated primarily
on French, but other languages may also be considered.


Context. The position is part of the SELEXINI project
(https://selexini.lis-lab.fr, 2022-2026), funded by the French National
Research Agency (ANR). The goal of the SELEXINI project is to develop
next-generation lexicon induction methods for natural language processing. The
induced lexicons will not only cluster word usages according to their senses,
but also contain multiword expressions, argument structure, generated
definitions, etc., combining the power of large pre-trained language models and
existing lexical resources to address the lack of interpretability and
diversity in current language technology. The hired researcher will be fully
integrated into the project team.


Requirements. Applicants should hold a PhD in computer science, applied
mathematics, natural language processing, or computational linguistics.
Applications from PhD students planning to defend their thesis by December
31st, 2023 are also welcome.
The hired postdoctoral researcher should have the following skills:
- expertise in deep learning for NLP, notably large language models
- excellent programming skills
- good linguistic skills
- team spirit
A good knowledge of French would be a plus.

Application. Applicants should submit a cover letter, a CV including their
publications, a list of referees who can provide recommendations, and a
transcript of their Master's grades via the following official website:
https://emploi.cnrs.fr/Offres/CDD/UMR7118-SABMAR-017/Default.aspx?Lang=EN
Applications should be submitted no later than December 5, 2023.