LIFL (Lille) and LIG (Grenoble)

Laurent Besacier Fri, 25 Jul 2014 14:54:13 -0700

> 
> 
> PhD thesis offer in France/ Learning from Post-Edition in Machine
> Translation / LIFL (Lille) and LIG (Grenoble)
> 
> 
> Contacts:
> Olivier Pietquin : olivier.pietq...@univ-lille1.fr
> Laurent Besacier : laurent.besac...@imag.fr
> 
> 
> Problem
> 
> Statistical Machine Translation (SMT) is the process by which texts are
> automatically translated from a source language to a target language by
> a machine that has been trained on corpora in both languages. Thanks to
> progress in the training of SMT engines, machine translation has become
> good enough that it is now advantageous for translators to post-edit
> machine outputs rather than translate from scratch. However, current
> methods for improving SMT systems from human post-edition (PE) are
> rather basic: the post-edited output is added to the training corpus and
> the translation model and language model are re-trained, with no clear
> view of how much has been improved and how much remains to be
> improved. Moreover, the final PE result is the only feedback used:
> available technologies do not take advantage of logged sequences of
> post-edition actions, which shed light on the cognitive processes of
> the post-editor.
> 
> The proposed thesis aims at using the post-edition process as a
> demonstration of how an expert translator modifies the SMT result to
> produce a perfect translation. Learning from demonstration is an
> emerging field in machine learning, mostly applied to robotics [1],
> which will thus be explored further in the particular framework of SMT.
> 
> Topic of research
> 
> A novel approach to SMT training will be adopted in this thesis:
> considering the post-edition process as a sequential decision-making
> process performed by human experts who should be imitated. The first
> fundamental contribution of this thesis to SMT will be to reformulate
> the problem of post-edition in SMT as a sequential decision-making
> problem [4]. Indeed, the hypothesis selection and ranking process
> occurring in an SMT system can be seen as an action selection strategy,
> choosing after each post-edition step among a large number of actions
> (all possible hypotheses and rankings). This strategy has to be adapted
> according to post-edition results, which arrive sequentially and are
> influenced by the system's previous actions (hypothesis selections).
> 
> From this, SMT will be cast as an imitation learning problem, that
> is, learning from demonstrations made by an expert: post-edition results
> can be seen as examples of what the system should do, again in a
> sequential decision-making process and not in a static one such as
> supervised learning. Indeed, SMT decoding, whether it is based on
> phrases or chunks, can be seen as a sequential decision-making
> process. The sequences of decisions taken by an expert during the
> post-edition process can be seen as a target for the system, which will
> try to imitate them in similar situations. To do so, we will extend the
> work described in [2], which modelled semantic parsing as an Inverse
> Reinforcement Learning (IRL) problem [3].
> 
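To make the idea of "post-edition as demonstration" concrete, here is a minimal sketch, not the method proposed in the thesis. It assumes the expert's decisions can be approximated by aligning the MT output with its post-edited version (real PE logs would be richer), and it uses tabular behavioural cloning, which is deliberately simpler than the IRL approach mentioned above. The names `edit_trajectory` and `clone_policy` and the toy sentences are hypothetical.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher


def edit_trajectory(mt_tokens, pe_tokens):
    """Turn one (MT output, post-edited output) pair into (state, action) steps.

    Toy assumption: the state is just the MT span being looked at.
    """
    steps = []
    matcher = SequenceMatcher(a=mt_tokens, b=pe_tokens, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        source = " ".join(mt_tokens[i1:i2])
        target = " ".join(pe_tokens[j1:j2])
        if tag == "equal":
            # One "keep" decision per untouched token.
            steps.extend((tok, ("keep", tok)) for tok in mt_tokens[i1:i2])
        elif tag == "replace":
            steps.append((source, ("replace", target)))
        elif tag == "delete":
            steps.append((source, ("delete", "")))
        # Pure insertions have no MT-side state in this toy model; skipped.
    return steps


def clone_policy(demonstrations):
    """Tabular behavioural cloning: most frequent expert action per state."""
    counts = defaultdict(Counter)
    for mt, pe in demonstrations:
        for state, action in edit_trajectory(mt.split(), pe.split()):
            counts[state][action] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


# Hypothetical demonstrations: (MT output, post-edited output).
demos = [
    ("the cat eat fish", "the cat eats fish"),
    ("he eat rice", "he eats rice"),
    ("dogs eat meat", "dogs eat meat"),
]
policy = clone_policy(demos)
print(policy["eat"])
```

The learned table maps each observed MT span to the action the post-editors chose most often. A realistic system would condition on much richer state (source sentence, decoder hypotheses, previous edits) and, as the announcement suggests, would recover a reward function rather than copy actions directly.
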
> In addition, the question of automatically selecting the sentences that
> should be used for post-edition and further learning will be addressed.
> In particular, this will be studied under the active learning
> paradigm. Large and diversified amounts of post-edited data, collected
> in an industrial setting, will be made available for the research
> project.
> 
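As an illustration of sentence selection under active learning, the sketch below uses uncertainty sampling: send to the post-editor the sentences the system is least confident about. The announcement does not say which selection criterion will be studied, so this choice is an assumption, and `select_for_post_editing`, `vocab_coverage` and `KNOWN_VOCAB` are hypothetical names with a toy confidence proxy.

```python
def select_for_post_editing(sentences, confidence, budget):
    """Return the `budget` sentences with the lowest confidence score."""
    return sorted(sentences, key=confidence)[:budget]


# Hypothetical confidence proxy: share of tokens seen in the training data.
KNOWN_VOCAB = {"the", "cat", "sat"}


def vocab_coverage(sentence):
    tokens = sentence.split()
    return sum(tok in KNOWN_VOCAB for tok in tokens) / len(tokens)


pool = ["the cat sat", "the quantum cat", "zygote flux"]
to_post_edit = select_for_post_editing(pool, vocab_coverage, budget=2)
print(to_post_edit)
```

In practice the confidence function would come from the SMT system itself (for example a model score or a quality-estimation component) rather than from vocabulary coverage.
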
> 
> Profile
> 
> Applicants must hold an Engineering or Master's degree in
> computational linguistics or computer science, preferably with
> experience in the fields of statistical machine learning and/or natural
> language processing. A good background in programming is also
> required. The successful candidate will also be involved in a research
> project, funded by the French National Agency for Research, involving
> 2 research labs (LIFL in Lille and LIG in Grenoble) and a company
> (Lingua & Machina). For this reason, a good level of English is
> required (a good command of French is a plus). Finally, effective
> communication skills in English, both written and verbal, are
> mandatory.
> 
> Context
> 
> The candidate will be hired by University Lille 1 in the framework of a
> national research project. S/he will mainly be hosted in the SequeL
> (Sequential Learning) team of the Laboratoire d’Informatique
> Fondamentale de Lille (LIFL). SequeL is also a joint project-team with
> INRIA (national institute for research in computer science and
> mathematics) and especially the INRIA Lille - Nord Europe Center. The
> group involves around 25 researchers working on sequential learning and
> is internationally recognized. Lille is the largest city in the north of
> France, a metropolis with 1 million inhabitants, with excellent train
> connections to Brussels (30 min), Paris (1h) and London (1h30).
> 
> This thesis will be supervised in strong collaboration with the GETALP
> team of Laboratoire d’Informatique de Grenoble (LIG), widely renowned
> for its research on natural language and speech processing. Grenoble is
> a high-tech city with 4 universities. It is located at the heart of the
> Alps, in outstanding scientific and natural surroundings. It is 3h by
> train from Paris, 2h from Geneva, 1h from Lyon and 2h from Torino, and
> is less than 1h from Lyon international airport.
> 
> The PhD thesis will be co-supervised by Olivier Pietquin in Lille and
> Laurent Besacier in Grenoble.
> 
> Contacts
> 
> Interviews will be held in September 2014. Meetings during Interspeech
> 2014 in Singapore can also be arranged. For further information, please
> contact:
> 
> Olivier Pietquin : olivier.pietq...@univ-lille1.fr 
> Laurent Besacier : laurent.besac...@imag.fr
> 
> References
> 
> [1] Brenna D. Argall, Sonia Chernova, Manuela Veloso, and Brett
> Browning. A survey of robot learning from demonstration. Robotics and
> Autonomous Systems, 57(5):469–483, May 2009.
> 
> [2] Gergely Neu and Csaba Szepesvári. Training parsers by inverse
> reinforcement learning. Machine Learning, 77(2-3):303–337, 2009.
> 
> [3] Andrew Y. Ng and Stuart J. Russell. Algorithms for inverse
> reinforcement learning. In Proceedings of the Seventeenth International
> Conference on Machine Learning, ICML ’00, pages 663–670, San Francisco,
> CA, USA, 2000. Morgan Kaufmann Publishers Inc.
> 
> [4] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An
> Introduction. The MIT Press, 3rd edition, March 1998.
> 
> 
> 
> 
>


------------------------
Laurent Besacier
Professeur à l'Université Joseph Fourier (Grenoble 1)
Laboratoire d'Informatique de Grenoble (LIG)
Membre Junior de l'Institut Universitaire de France (IUF 2012-2017)
laurent.besac...@imag.fr
-------------------------


_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
