date:20140725

[Moses-support] PhD thesis offer in France/Learning from Post-Edition in Machine Translation / LIFL (Lille) and LIG (Grenoble)

2014-07-25 Thread Laurent Besacier

> 
> 
> PhD thesis offer in France/ Learning from Post-Edition in Machine
> Translation / LIFL (Lille) and LIG (Grenoble)
> 
> 
> Contacts:
> Olivier Pietquin: olivier.pietq...@univ-lille1.fr
> Laurent Besacier: laurent.besac...@imag.fr
> 
> 
> Problem
> 
> Statistical Machine Translation (SMT) is the process by which texts are
> automatically translated from a source language to a target language by
> a machine that has been trained on corpora in both languages. Thanks to
> progress in the training of SMT engines, machine translation has become
> good enough that it is now advantageous for translators to post-edit
> machine output rather than translate from scratch. However, current
> methods for improving SMT systems from human post-edition (PE) are
> rather basic: the post-edited output is added to the training corpus and
> the translation model and language model are re-trained, with no clear
> view of how much has been improved and how much remains to be
> improved. Moreover, the final PE result is the only feedback used:
> available technologies do not take advantage of logged sequences of
> post-edition actions, which give information about the cognitive
> processes of the post-editor.
> 
> The proposed thesis aims at using the post-edition process as a
> demonstration of how an expert translator modifies the SMT result to
> produce a perfect translation. Learning from demonstration is an
> emerging field in machine learning, mostly applied to robotics [1], which
> will thus be explored further in the particular framework of SMT.
> 
> Topic of research
> 
> A novel approach to SMT training will be adopted in this thesis, i.e.
> considering the post-edition process as a sequential decision making
> process performed by human experts who should be imitated. This thesis’
> first fundamental contribution to SMT will be to reformulate the problem
> of post-edition in SMT as a sequential decision making problem
> [4]. Indeed, the hypothesis selection and ranking process occurring in
> an SMT system can be seen as an action selection strategy, choosing
> after each post-edition step amongst a large number of actions (all
> possible hypotheses and rankings). This strategy has to be modified
> according to post-edition results arising sequentially and being
> influenced by previous actions (hypothesis selection) of the system.
> 
> From this, SMT will be cast as an imitation learning problem, that
> is, learning from demonstrations made by an expert: post-edition results
> can be seen as examples of what the system should do, again in a
> sequential decision making process and not in a static one such as
> supervised learning. Indeed, SMT decoding, whether it is based on
> phrases or chunks, can be seen as a sequential decision making
> process. The sequences of decisions taken by an expert during the
> post-edition process can be seen as a target for the system, which will
> try to imitate them in similar situations. To do so, we will extend the
> work described in [2], which modelled semantic parsing as an Inverse
> Reinforcement Learning (IRL) problem [3].
> 
> In addition, the question of automatically selecting the sentences that
> should be used for post-edition and further learning will be addressed.
> In particular, this will be studied under the active learning
> paradigm. Large and diversified amounts of post-edited data, collected
> in an industrial setting, will be made available for the research
> project.
> 
> 
> Profile
> 
> The applicants must hold an Engineering or Master's degree in
> Computational Linguistics or Computer Science, preferably with
> experience in the fields of statistical machine learning and/or natural
> language processing. A good background in programming will also be
> required. The successful applicant will also be involved in a research
> project, funded by the French National Agency for Research, involving
> 2 research labs (LIFL in Lille and LIG in Grenoble) and a company
> (Lingua & Machina). For this reason a good level of English is required
> (a good command of French being a plus). Finally, effective
> communication skills in English, both written and verbal, are mandatory.
> 
> Context
> 
> The candidate will be hired by University Lille 1 in the framework of a
> national research project. S/he will mainly be hosted in the SequeL
> (Sequential Learning) team of the Laboratoire d’Informatique Fondamentale
> de Lille (LIFL). SequeL is also a joint project team with INRIA
> (national institute for research in computer science and mathematics),
> and especially the INRIA Lille - Nord Europe Center. The group
> involves around 25 researchers working on sequential learning and is
> internationally recognized. Lille is the largest city in the north of
> France, a metropolis with 1 million inhabitants, with excellent train
> connections to Brussels (30 min), Paris (1h) and London (1h30).
> 
> This thesis will be supervised in strong collaboration with the GETALP
> team of Laboratoire d’Informatique de Grenoble (L

[Moses-support] Issues with Incremental retraining using Moses

2014-07-25 Thread Sandipan Dandapat

Hi,
I am trying to use Moses Incremental Retraining as described in
http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc33

I have two questions:

1. I am able to generate the new-alignment-file using additional data on
top of the previously used data. Once the new alignment file is generated,
the page says to update the model. I do not understand how to use it
during decoding. Can you please help me understand how I can
proceed once my new-alignment file is generated?

2. What happens when we update the moses.ini file using

 PhraseDictionaryDynSuffixArray source= target= alignment=

I cannot find any reference to this updated moses.ini file in the
rest of the section.

Thanks and regards,
sandipan

Sandipan Dandapat
Postdoctoral Researcher
CNGL, School of Computing
Dublin City University
Google Scholar Profile:
http://scholar.google.co.in/citations?user=DWD_FiQJ&hl=en
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
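
The update step asked about in question 1 could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a confirmed recipe. It assumes a mosesserver instance is running with a moses.ini that contains the PhraseDictionaryDynSuffixArray line, and that the server exposes an XML-RPC method named "updater" taking "source", "target" and "alignment" strings. The URL, the port and the helper names build_update_params and send_update are hypothetical; the example sentences are made up.

```python
import xmlrpc.client


def build_update_params(source, target, alignment):
    """Package one new sentence pair for an incremental model update.

    source, target: tokenized sentences (whitespace-separated tokens).
    alignment: space-separated "i-j" pairs of 0-based source-target
    token indices, one line taken from the new alignment file.
    """
    n_src = len(source.split())
    n_tgt = len(target.split())
    # Reject alignment points that refer to tokens that do not exist.
    for point in alignment.split():
        i, j = (int(x) for x in point.split("-"))
        if not (0 <= i < n_src and 0 <= j < n_tgt):
            raise ValueError(f"alignment point {point} is out of range")
    return {"source": source, "target": target, "alignment": alignment}


def send_update(params, url="http://localhost:8080/RPC2"):
    """Send one update to a running server.

    Assumption: the server at `url` accepts an XML-RPC call named
    "updater" with a single struct argument. Not called in this sketch.
    """
    proxy = xmlrpc.client.ServerProxy(url)
    return proxy.updater(params)


# Hypothetical sentence pair with a one-to-one alignment.
params = build_update_params(
    "das ist ein haus", "this is a house", "0-0 1-1 2-2 3-3"
)
```

Under these assumptions, send_update(params) would be called once per new sentence pair while the server is running. The update only lasts for the lifetime of that server process unless the new sentences and alignments are also appended to the files named in moses.ini.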
