Re: [Moses-support] gappy phrases (Nadir Durrani)

Nadir Durrani Mon, 04 Nov 2013 08:06:50 -0800

The recent version of the OSM-decoder from LMU-Munich uses discontinuous
source-side phrases. We used it in this year's WMT campaign. Details
on phrase extraction can be found in


http://www.statmt.org/wmt13/pdf/WMT13.pdf

It gives improvements, although not consistently, which I suppose is
also true for discontinuous phrases in Phrasal.





On Mon, Nov 4, 2013 at 2:50 PM,  <moses-support-requ...@mit.edu> wrote:
>
> Today's Topics:
>
>    1. Release 1.0 details (Tom Hoar)
>    2. Re: gappy phrases (Matthias Huck)
>    3. Re: -lm training parameter (John D. Burger)
>    4. Re: Release 1.0 details (Hieu Hoang)
>    5. Re: Syntax model in source side (burak aydın)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 04 Nov 2013 21:12:37 +0700
> From: Tom Hoar <tah...@precisiontranslationtools.com>
> Subject: [Moses-support] Release 1.0 details
> To: Moses-Support <moses-support@mit.edu>
> Message-ID: <5277ab55.60...@precisiontranslationtools.com>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
> Where can I find the options that were used to compile the release 1.0
> binaries and training tools? A complete list would be nice, but
> specifically, I'm looking into whether the distributed Moses binary
> includes --with-xmlrpc-c. I suspect not, because the mosesserver binary
> is missing from the bin folder.
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 04 Nov 2013 14:39:35 +0000
> From: Matthias Huck <mh...@inf.ed.ac.uk>
> Subject: Re: [Moses-support] gappy phrases
> To: moses-support@mit.edu
> Message-ID: <1383575975.20373.84.camel@portedgar>
> Content-Type: text/plain; charset="UTF-8"
>
> Hi,
>
> RWTH Aachen University implemented extraction of discontinuous phrases
> and decoding with source-side gaps in the Jane toolkit
> [www.hltpr.rwth-aachen.de/jane/].
> We did not see any clear improvements over standard phrase-based setups
> in our experiments, though.
>
> Some results were published in PBML:
>
> M. Huck, E. Scharwächter, and H. Ney. Source-Side Discontinuous Phrases
> for Machine Translation: A Comparative Study on Phrase Extraction and
> Search. The Prague Bulletin of Mathematical Linguistics, number 99,
> pages 17-38, Prague, Czech Republic, April 2013.
> http://www.hltpr.rwth-aachen.de/publications/download/848/Huck-PBML-2013.pdf
>
> The Jane Hiero implementation yields better translation quality on
> Chinese-English. But note that RWTH did not modify Jane's phrase-based
> decoder to support target-side gaps.
>
> I would be very much interested in seeing whether groups other than
> Stanford achieve encouraging results with discontinuous phrases in their
> toolkits.
>
> Erik Scharwächter wrote most of the code related to discontinuous
> phrases in the Jane toolkit as part of his Bachelor's thesis. I don't
> know how you define a "massive undertaking", but an excellent
> undergraduate student can obviously implement it, run some experiments
> and write a thesis about it within a limited amount of time.
>
> Cheers,
> Matthias
>
>
>
> On Sun, 2013-11-03 at 20:34 -0800, Kenneth Heafield wrote:
>> Hi,
>>
>>       I'll throw in the anecdote that gappy phrases are currently not in use
>> at Stanford.  My predecessor told me that it took a lot longer and only
>> improved BLEU slightly on Chinese-English.  But it's also possible that
>> something didn't get passed down correctly from Michel to my predecessor
>> to me. . .
>>
>> Kenneth
>>
>> On 11/03/13 14:18, Read, James C wrote:
>> > My understanding is that they used an approach similar to grammar
>> > extraction to extract the gappy phrases. Would it be a massive undertaking
>> > to get Moses to support this?
>> >
>> > James
>> > ________________________________________
>> > From: Barry Haddow [bhad...@staffmail.ed.ac.uk]
>> > Sent: 30 October 2013 09:26
>> > To: Read, James C
>> > Cc: moses-support@mit.edu
>> > Subject: Re: [Moses-support] gappy phrases
>> >
>> > No, but it does support hiero and syntax models.
>> >
>> > On 29/10/13 22:23, Read, James C wrote:
>> >> Hi,
>> >>
>> >> does anybody know if Moses supports gappy phrases?
>> >> http://www-nlp.stanford.edu/pubs/naacl10-discontinuous_phrases.pdf
>> >>
>> >> James
>> >>
>
>
>
>
>
> ------------------------------
>
> Message: 3
> Date: Mon, 4 Nov 2013 09:41:46 -0500
> From: "John D. Burger" <j...@mitre.org>
> Subject: Re: [Moses-support] -lm training parameter
> To: Moses-support <moses-support@mit.edu>
> Message-ID: <c450eaa6-8afc-4f0a-986e-d033dd02d...@mitre.org>
> Content-Type: text/plain; charset=us-ascii
>
> We've done something like this in the past. The fact that the check for a 
> non-empty LM happens at the very beginning is somewhat annoying if you have a 
> setup that builds the phrase models and language models in parallel, for 
> instance on a cluster.
>
> - JB
>
> On Nov 4, 2013, at 07:48 , Tom Hoar wrote:
>
>> Yes, on both counts. You can edit the moses.ini file to change to a
>> different LM. Editing the train-model.perl script should work. We take a
>> different approach. We create a temporary /tmp/placeholder.lm before
>> running the script and then remove it afterwards. We then regex the
>> pattern and change the moses.ini file to any LM we want.
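A minimal Python sketch of the placeholder workflow described above. The helper names are hypothetical and not part of Moses, and because the moses.ini layout differs between Moses versions, the sketch only rewrites the path string rather than parsing the file:

```python
import re

# Assumption: both helpers are illustrative names, not Moses tooling.


def write_placeholder_lm(path):
    """Create a small non-empty file so a non-zero-length check can pass."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("placeholder\n")
    return path


def swap_lm_path(ini_text, placeholder_path, real_lm_path):
    """Return ini_text with every occurrence of placeholder_path replaced."""
    # Escape the path so characters such as "." are matched literally.
    pattern = re.compile(re.escape(placeholder_path))
    # A function replacement keeps any backslashes in real_lm_path literal.
    new_text, count = pattern.subn(lambda _match: real_lm_path, ini_text)
    if count == 0:
        raise ValueError("placeholder path not found in moses.ini text")
    return new_text
```

Under these assumptions, write_placeholder_lm("/tmp/placeholder.lm") would run before train-model.perl, and swap_lm_path would be applied to the contents of the generated moses.ini afterwards, with the result written back to the file.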
>>
>>
>> On 11/04/2013 04:57 AM, Read, James C wrote:
>>> Thanks.
>>>
>>> So if you wanted to train and, at a later date, use a different LM with the
>>> already trained TM, would it just be a simple case of manually editing
>>> moses.ini?
>>>
>>> If I were to edit the training script to skip the check that the LM file
>>> exists (it doesn't), it wouldn't break anything, would it?
>>>
>>> James
>>>
>>> ________________________________________
>>> From: moses-support-boun...@mit.edu [moses-support-boun...@mit.edu] on
>>> behalf of Tom Hoar [tah...@precisiontranslationtools.com]
>>> Sent: 03 November 2013 13:03
>>> To: moses-support@mit.edu
>>> Subject: Re: [Moses-support] -lm training parameter
>>>
>>> You are correct that the train-model.perl script does not use the -lm
>>> parameter in any of the word alignment or phrase scoring steps. The
>>> script's step 9 builds a template moses.ini configuration file and
>>> includes the values from the -lm parameter. At the beginning, the script
>>> checks that the -lm value points to a non-zero length file. If the file
>>> is missing or is zero length, the script halts.
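The up-front check described above can be sketched in Python as follows. The real script is written in Perl and its exact logic is not shown in this thread, so the function name and behaviour here are assumptions:

```python
import os

# Assumption: lm_file_is_usable is a hypothetical name, not taken from
# train-model.perl.


def lm_file_is_usable(path):
    """Return True only if path is an existing file with non-zero size."""
    return os.path.isfile(path) and os.path.getsize(path) > 0
```

A pipeline that builds the language model in parallel with the phrase model could call a check like this just before moses.ini is written rather than at start-up; that is a design choice, not something the script does according to this thread.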
>>>
>>>
>>>
>>> On 11/03/2013 06:03 PM, Read, James C wrote:
>>>> Hi,
>>>>
>>>> does anybody know what the effect of the -lm training parameter in the 
>>>> training script is? Surely the LM used has no effect on typical training 
>>>> tasks like word alignment and phrase scoring?
>>>>
>>>> thanks,
>>>> James
>>>>
>>>> _______________________________________________
>>>> Moses-support mailing list
>>>> Moses-support@mit.edu
>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>
>
> ------------------------------
>
> Message: 4
> Date: Mon, 4 Nov 2013 14:46:02 +0000
> From: Hieu Hoang <hieu.ho...@ed.ac.uk>
> Subject: Re: [Moses-support] Release 1.0 details
> To: Tom Hoar <tah...@precisiontranslationtools.com>
> Cc: Moses-Support <moses-support@mit.edu>
> Message-ID:
>         <CAEKMkbhSQWBOhfDRS_B3zOTY5PofDM5Z=ehsco8hkkyjywo...@mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Sorry, I didn't write it down. They were compiled with IRSTLM (and KenLM),
> but not SRILM. I don't usually compile mosesserver, so the command would be
> something like:
>    nohup ./bjam  --with-irstlm=/home/hieu/workspace/irstlm/trunk/
>
> I'll try to remember to document it more thoroughly in the next round.
>
>
> On 4 November 2013 14:12, Tom Hoar
> <tah...@precisiontranslationtools.com> wrote:
>
>> Where can I find the options that were used to compile the release 1.0
>> binaries and training tools? A complete list would be nice, but
>> specifically, I'm looking into whether the distributed Moses binary
>> includes --with-xmlrpc-c. I suspect not, because the mosesserver binary
>> is missing from the bin folder.
>
>
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
>
> ------------------------------
>
> Message: 5
> Date: Mon, 4 Nov 2013 16:50:24 +0200
> From: burak aydın <bayd...@gmail.com>
> Subject: Re: [Moses-support] Syntax model in source side
> To: moses-support@mit.edu
> Message-ID:
>         <cah+r-slrhr5tyog3p3qaaiuw4b4+whpdmkqpf1zsyunzbbh...@mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-9"
>
> Hi everyone,
>
> I want to use the Collins parser while translating from English. I checked
> the sample EMS configs and applied them. The experiment did not crash or
> report any error, but BLEU scores were dramatically low, implying that
> there must be something wrong. Here are the additional parameters for
> syntax with the Collins parser:
>
> #syntactic parsers
> input-parser = "$moses-script-dir/training/wrappers/parse-en-collins.perl
> -collins /usr/local/smt/COLLINS-PARSER -mxpost /usr/local/smt/MXPOST/ "
>
> #training options
> training-options = "-mgiza -mgiza-cpus 4 -sort-buffer-size 8G
> -sort-compress gzip -sort-parallel 4 -cores 4 -source-syntax"
>
> Do I need any parameters besides the ones above? I would appreciate
> any help.
>
> Thanks
>
>
> ------------------------------
>
>
>
> End of Moses-support Digest, Vol 85, Issue 6
> ********************************************
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
