Abu-MaTran: automatic building of machine translation


 *Marie Curie IAPP project FP7-PEOPLE-2012-IAPP*


 24-month recruitment of a postdoctoral researcher


 Overview


Prompsit^1 is a research-oriented company created in 2006 within the Transducens^2 research group at the Department of Software and Computing Systems^3 (Universitat d'Alacant, Spain). It is a leading company in the development of machine translation, especially linguistically motivated systems such as Apertium^4 rule-based systems or linguistically augmented Moses^5 statistical systems. The company's R&D activity is intense, both in industry-driven work and through participation in public national and international R&D programs.


Abu-MaTran^6 (/Automatic Building of Machine Translation/) is an IAPP-FP7^7 project in which the company is currently involved. The project aims to increase the hitherto low industrial adoption of machine translation by identifying crucial cutting-edge research techniques (automatic acquisition of corpora and linguistic resources, pivot techniques, linguistically augmented statistical translation and diagnostic evaluation) and making them suitable for commercial exploitation.


Besides Prompsit, which acts as the central node of interaction, the project involves four top research institutions: Dublin City University^8 (project coordinator), Universitat d'Alacant^9, University of Zagreb^10 and the Institute for Language and Speech Processing^11. At Prompsit, the project will be led by researcher Sergio Ortiz Rojas, who is responsible for most of the code of the Apertium MT platform, Prompsit's linguistically augmented Moses system, a modular version of the Bitextor^12 parallel text collector, and other natural language processing tools for information extraction and opinion analysis.


The position involves research, development and participation in outreach activities to achieve the goals of the Abu-MaTran project as well as collaboration with all researchers in the project.


 Job Description


   Main Duties and Responsibilities

 * Investigate, in collaboration with the partners, better techniques
   to automate:

     o acquisition of monolingual and bilingual corpora, both general
       and domain-focused

     o monolingual and bilingual terminology extraction

     o induction of transfer rules

     o building of pivot or linguistically augmented machine
       translation systems

     o evaluation of machine translation

 * Implement the techniques for each of the previous points.

 * Carry out experiments to evaluate their performance.

 * Release the output as free/open-source tools with appropriate
   interfaces to use them.

 * Write the appropriate documentation for each line of work:
   technical documentation for developers, academic publications
   (papers, posters, etc.), and tutorials or manuals for developers.

 * Attend project-related conferences and meetings.

 * Present the results at relevant conferences and scientific meetings.

 * Review the work plan with the collaborators according to the
   project's intermediate milestones and results.

 * Take part in and support outreach activities (linguistic
   olympiads^13, FreeRBMT workshop^14).


   Person Specification

Applicants should provide evidence in their applications that they meet the following criteria.

The staff in charge of this recruitment process will use a range of selection methods to assess candidates' abilities in these areas, including reviewing applications, seeking references, inviting shortlisted candidates to an interview, and other forms of assessment relevant to the post.


     Criteria

*Qualifications (compulsory):* PhD in Computer Science (or at least 4 years of full-time research experience) and less than 10 years of full-time research experience.

*International procedure (compulsory):* the candidate must not have lived or worked in Spain for more than 12 months within the last 3 years.

*Experience in:*

 * Natural language processing, particularly in machine translation
   (compulsory)

 * User-level or developer-level experience in Apertium, Moses,
   OpenMaTrEx^15, Bitextor, FBC and FMC^16, ccLexExtractor^17 and
   DELiC4MT^18 (desirable)

 * Data acquisition (desirable)

 * Terminology extraction (desirable)

 * Machine learning (desirable)

 * MT evaluation (desirable)

 * Creation of user interfaces and software releasing/sharing (desirable)

*Programming languages:* C++, Python, PHP (compulsory). Java (desirable).

*Multilingual skills:* Good level of English (compulsory). Basic knowledge of Spanish or Catalan (desirable). Knowledge of the South Slavic languages targeted in the project use case: Croatian, Bosnian, Serbian or Montenegrin, and Slovenian (desirable).

*Good writing and communication skills:* ability to communicate with people and to convey results, ideas, etc. (compulsory).

*Collaborative working skills:* ability to take on and delegate responsibilities (compulsory).

*Experience in free/open-source software development:* participation in free/open-source software development projects as a user or, better, as a contributor (desirable).

*Experience in transfer of knowledge between industry and academia:* interaction between industry and academia in previous positions is highly valued (desirable).

*Creativity and flexibility skills:* ability to be open to different ideas or opinions, to analyse and solve problems, and to make decisions (desirable).

*Active research skills:* ability to follow state-of-the-art research lines associated with the project, to learn and acquire new skills relevant to the project, to write scientific works, and to meet deadlines (desirable).


 Further Information

This post is fixed-term and full-time at Prompsit (Elx/Elche, Spain). The starting date is January 2014 and the duration is 24 months. Splits are not possible.


       Terms and conditions of employment:

Terms and conditions will follow the stipulations of the IAPP program. The recruited candidate will have a full-time contract with full social security coverage, subject to Spanish law and taxes.


       Salary:

For an Experienced Researcher with less than 10 years of experience, the stipulated gross salary is EUR 57,154 per year (living allowance), plus EUR 683 or EUR 977 per month as mobility allowance (depending on family charges).


       Closing date:

1st June 2013.


       Informal enquiries:

For informal enquiries about this job, contact us at i...@prompsit.com.

1 http://www.prompsit.com

2 http://transducens.dlsi.ua.es/

3 http://www.dlsi.ua.es/

4 http://apertium.org

5 http://www.statmt.org/moses/

6 http://www.abumatran.eu/, more info at http://cordis.europa.eu/projects/rcn/106334_en.html

7 http://ec.europa.eu/research/mariecurieactions/about-mca/actions/iapp/index_en.htm

8 http://www.dcu.ie

9 http://www.ua.es

10 http://www.unizg.hr/homepage/

11 http://www.ilsp.gr/

12 http://bitextor.sourceforge.net/ (see /prompsit/ branch in source code)

13 http://en.wikipedia.org/wiki/International_Linguistics_Olympiad

14 http://www.chalmers.se/hosted/freerbmt12-en

15 http://www.openmatrex.org/

16 http://nlp.ilsp.gr/soaplab2-axis/

17 http://www.nljubesic.net/resources/tools/cclexex/

18 http://www.computing.dcu.ie/~atoral/delic4mt/

--
Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03071 Alacant, Spain
Phone: +34 96 590 9776
Fax: +34 96 590 9326
