The recent version of the OSM decoder from LMU Munich uses discontinuous source-side phrases. We used it in this year's WMT campaign. Details on phrase extraction can be found in
http://www.statmt.org/wmt13/pdf/WMT13.pdf

It gives improvements, although not consistently, which I suppose is also true for Phrasal with discontinuous phrases.

On Mon, Nov 4, 2013 at 2:50 PM, <moses-support-requ...@mit.edu> wrote:
> Send Moses-support mailing list submissions to
>         moses-support@mit.edu
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         http://mailman.mit.edu/mailman/listinfo/moses-support
> or, via email, send a message with subject or body 'help' to
>         moses-support-requ...@mit.edu
>
> You can reach the person managing the list at
>         moses-support-ow...@mit.edu
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Moses-support digest..."
>
>
> Today's Topics:
>
>    1. Release 1.0 details (Tom Hoar)
>    2. Re: gappy phrases (Matthias Huck)
>    3. Re: -lm training parameter (John D. Burger)
>    4. Re: Release 1.0 details (Hieu Hoang)
>    5. Re: Syntax model in source side (burak aydın)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 04 Nov 2013 21:12:37 +0700
> From: Tom Hoar <tah...@precisiontranslationtools.com>
> Subject: [Moses-support] Release 1.0 details
> To: Moses-Support <moses-support@mit.edu>
> Message-ID: <5277ab55.60...@precisiontranslationtools.com>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
> Where can I find the options that were used to compile the release 1.0
> binaries and training tools? A complete list would be nice, but
> specifically, I'm looking into whether the distributed Moses binary
> includes --with-xmlrpc-c. I suspect not, because the mosesserver binary
> is missing from the bin folder.
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 04 Nov 2013 14:39:35 +0000
> From: Matthias Huck <mh...@inf.ed.ac.uk>
> Subject: Re: [Moses-support] gappy phrases
> To: moses-support@mit.edu
> Message-ID: <1383575975.20373.84.camel@portedgar>
> Content-Type: text/plain; charset="UTF-8"
>
> Hi,
>
> RWTH Aachen University implemented extraction of discontinuous phrases
> and decoding with source-side gaps in the Jane toolkit
> [www.hltpr.rwth-aachen.de/jane/].
> We did not see any clear improvements over standard phrase-based setups
> in our experiments, though.
>
> Some results were published in PBML:
>
> M. Huck, E. Scharwächter, and H. Ney. Source-Side Discontinuous Phrases
> for Machine Translation: A Comparative Study on Phrase Extraction and
> Search. The Prague Bulletin of Mathematical Linguistics, number 99,
> pages 17-38, Prague, Czech Republic, April 2013.
> http://www.hltpr.rwth-aachen.de/publications/download/848/Huck-PBML-2013.pdf
>
> The Jane Hiero implementation yields better translation quality on
> Chinese-English. But note that RWTH did not modify Jane's phrase-based
> decoder to support target-side gaps.
>
> I would be very much interested in seeing whether other groups than
> Stanford achieve encouraging results with discontinuous phrases in their
> toolkits.
>
> Erik Scharwächter wrote most of the code related to discontinuous
> phrases in the Jane toolkit as part of his Bachelor's thesis. I don't
> know how you define a "massive undertaking", but an excellent
> undergraduate student can obviously implement it, run some experiments
> and write a thesis about it within a limited amount of time.
>
> Cheers,
> Matthias
>
>
>
> On Sun, 2013-11-03 at 20:34 -0800, Kenneth Heafield wrote:
>> Hi,
>>
>> I'll throw in the anecdote that gappy phrases are currently not in use
>> at Stanford. My predecessor told me that it took a lot longer and only
>> improved BLEU slightly on Chinese-English.
But it's also possible that
>> something didn't get passed down correctly from Michel to my predecessor
>> to me. . .
>>
>> Kenneth
>>
>> On 11/03/13 14:18, Read, James C wrote:
>> > My understanding is that they used a similar approach as the grammar
>> > extraction to extract the gappy phrases. Would it be a massive undertaking
>> > to get Moses to support this?
>> >
>> > James
>> > ________________________________________
>> > From: Barry Haddow [bhad...@staffmail.ed.ac.uk]
>> > Sent: 30 October 2013 09:26
>> > To: Read, James C
>> > Cc: moses-support@mit.edu
>> > Subject: Re: [Moses-support] gappy phrases
>> >
>> > No, but it does support hiero and syntax models.
>> >
>> > On 29/10/13 22:23, Read, James C wrote:
>> >> Hi,
>> >>
>> >> does anybody know if Moses supports gappy phrases
>> >> http://www-nlp.stanford.edu/pubs/naacl10-discontinuous_phrases.pdf
>> >>
>> >> James
>> >>
>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
>
>
> ------------------------------
>
> Message: 3
> Date: Mon, 4 Nov 2013 09:41:46 -0500
> From: "John D. Burger" <j...@mitre.org>
> Subject: Re: [Moses-support] -lm training parameter
> To: Moses-support <moses-support@mit.edu>
> Message-ID: <c450eaa6-8afc-4f0a-986e-d033dd02d...@mitre.org>
> Content-Type: text/plain; charset=us-ascii
>
> We've done something like this in the past. The fact that the check for a
> non-empty LM happens at the very beginning is somewhat annoying if you have a
> setup that builds the phrase models and language models in parallel, for
> instance on a cluster.
>
> - JB
>
> On Nov 4, 2013, at 07:48 , Tom Hoar wrote:
>
>> Yes, on both counts. You can edit the moses.ini file to change to a
>> different LM. Editing the train-model.perl script should work. We take a
>> different approach. We create a temporary /tmp/placeholder.lm before
>> running the script and then remove it afterwards.
We then regex the
>> pattern and change the moses.ini file to any LM we want.
>>
>>
>> On 11/04/2013 04:57 AM, Read, James C wrote:
>>> Thanks.
>>>
>>> So if you wanted to train and at a later date use a different LM with the
>>> already trained TM would it just be a simple case of manually editing
>>> moses.ini?
>>>
>>> If I were to edit the training script to skip the check that LM file exists
>>> (it doesn't) it wouldn't break anything would it?
>>>
>>> James
>>>
>>> ________________________________________
>>> From: moses-support-boun...@mit.edu [moses-support-boun...@mit.edu] on
>>> behalf of Tom Hoar [tah...@precisiontranslationtools.com]
>>> Sent: 03 November 2013 13:03
>>> To: moses-support@mit.edu
>>> Subject: Re: [Moses-support] -lm training parameter
>>>
>>> You are correct that train-model.perl script does not use the -lm
>>> parameter through any of the word alignment or phrase scoring steps. The
>>> script's step 9 builds a template moses.ini configuration file and
>>> includes the values from the -lm parameter. At the beginning, the script
>>> checks that the -lm value points to a non-zero length file. If the file
>>> is missing or is zero length, the script halts.
>>>
>>>
>>>
>>> On 11/03/2013 06:03 PM, Read, James C wrote:
>>>> Hi,
>>>>
>>>> does anybody know what the effect of the -lm training parameter in the
>>>> training script is? Surely the LM used has no effect on typical training
>>>> tasks like word alignment and phrase scoring?
>>>>
>>>> thanks,
>>>> James
>>>>
>>>> _______________________________________________
>>>> Moses-support mailing list
>>>> Moses-support@mit.edu
>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>> _______________________________________________
>>> Moses-support mailing list
>>> Moses-support@mit.edu
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> Moses-support@mit.edu
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>
>
> ------------------------------
>
> Message: 4
> Date: Mon, 4 Nov 2013 14:46:02 +0000
> From: Hieu Hoang <hieu.ho...@ed.ac.uk>
> Subject: Re: [Moses-support] Release 1.0 details
> To: Tom Hoar <tah...@precisiontranslationtools.com>
> Cc: Moses-Support <moses-support@mit.edu>
> Message-ID:
>         <CAEKMkbhSQWBOhfDRS_B3zOTY5PofDM5Z=ehsco8hkkyjywo...@mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Sorry, I didn't write it down. They were compiled with IRSTLM (and KenLM),
> but not SRILM. I don't usually compile mosesserver, so the command would be
> something like:
>    nohup ./bjam --with-irstlm=/home/hieu/workspace/irstlm/trunk/
>
> I'll try & remember to document it more thoroughly in the next round
>
>
> On 4 November 2013 14:12, Tom Hoar
> <tah...@precisiontranslationtools.com> wrote:
>
>> Where can I find the options that were used to compile the release 1.0
>> binaries and training tools? A complete list would be nice, but
>> specifically, I'm looking into whether the distributed Moses binary
>> includes --with-xmlrpc-c. I suspect not, because the mosesserver binary
>> is missing from the bin folder.
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>
>
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
>
> ------------------------------
>
> Message: 5
> Date: Mon, 4 Nov 2013 16:50:24 +0200
> From: burak aydın <bayd...@gmail.com>
> Subject: Re: [Moses-support] Syntax model in source side
> To: moses-support@mit.edu
> Message-ID:
>         <cah+r-slrhr5tyog3p3qaaiuw4b4+whpdmkqpf1zsyunzbbh...@mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-9"
>
> Hi everyone,
>
> I want to use Collins parser while translating from En. I checked the
> sample ems configs and applied it. The experiment did not crash or get any
> error, but bleu scores were dramatically low, implying that there must be
> something wrong. Here are the additional parameters for syntax with Collins:
>
> #syntactic parsers
> input-parser = "$moses-script-dir/training/wrappers/parse-en-collins.perl
> -collins /usr/local/smt/COLLINS-PARSER -mxpost /usr/local/smt/MXPOST/ "
>
> #training options
> training-options = "-mgiza -mgiza-cpus 4 -sort-buffer-size 8G
> -sort-compress gzip -sort-parallel 4 -cores 4 -source-syntax"
>
> Do I need additional parameters except the ones above? I would appreciate
> any help.
>
> Thanks
>
>
> 2013/11/4 burak aydın <bayd...@gmail.com>
>
>> Hi everyone,
>>
>> I want to use Collins parser while translating from En. I checked the
>> sample ems configs and applied it. The experiment did not crash or get any
>> error, but bleu scores were dramatically low, implying that there must be
>> something wrong.
Here are the additional parameters for syntax with Collins:
>>
>> #syntactic parsers
>> input-parser = "$moses-script-dir/training/wrappers/parse-en-collins.perl
>> -collins /usr/local/smt/COLLINS-PARSER -mxpost /usr/local/smt/MXPOST/ "
>>
>> #training options
>> training-options = "-mgiza -mgiza-cpus 4 -sort-buffer-size 8G
>> -sort-compress gzip -sort-parallel 4 -cores 4 -source-syntax"
>>
>> Do I need additional parameters except the ones above? I would appreciate
>> any help.
>>
>> Thanks
>> Burak
>>
>>
>>
>
> ------------------------------
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
> End of Moses-support Digest, Vol 85, Issue 6
> ********************************************
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support