[Moses-support] tuning (weights, normalization)

2014-02-25 Thread Jani Dugonik
Hi, I have a few questions about tuning weights. a) On the statmt website it says: Good values for the weights for phrase translation table (|weight-t|, short |tm|), language model (|weight-l|, short |lm|), and reordering model (|weight-d|, short |d|) are 0.1-1, good values for the

[Moses-support] Summer work on MT in the Google Summer of Code

2014-02-25 Thread Francis Tyers
Dear Moses people! Apertium[1] was accepted in the Google Summer of Code[2] this year. We are looking for students who would be interested in working on different aspects of rule-based MT for three months during the summer. Apertium is primarily a rule-based project, but we also apply machine

[Moses-support] Second CFP: COLING 2014

2014-02-25 Thread John Judge
** Apologies for cross-posting ** Second (Main) Call for Papers - Coling 2014 COLING 2014 Dublin, Ireland, 23-29 August, 2014 The International Committee on Computational Linguistics (ICCL) is pleased to announce the 25th International Conference on Computational Linguistics

Re: [Moses-support] Constraint decoding

2014-02-25 Thread Hieu Hoang
I think you asked this question 2 weeks ago http://article.gmane.org/gmane.comp.nlp.moses.user/10406/match=saeed On 25 February 2014 06:50, Saeed Farzi saeedfa...@gmail.com wrote: Dear all, I am trying to use constraint decoding with Moses. Does anybody know how to do this? By the way, are

Re: [Moses-support] ERROR: Lexical reordering scoring failed at moses/scripts/training/train-model.perl line 1738.

2014-02-25 Thread 李惠惠
Hi Barry, Thanks for the instructions. I checked the tmp directory and it seems that the align. file is broken. I also reran the extract-parallel Perl script separately, but it doesn't work either. Here is the link to my small corpus. https://www.dropbox.com/sh/beffab8g215awz6/2dCp0G_pdj Thank you, Hui

Re: [Moses-support] ERROR: Lexical reordering scoring failed at moses/scripts/training/train-model.perl line 1738.

2014-02-25 Thread Barry Haddow
Hi Hui What do you mean by broken? I meant for you to upload the contents of the tmp directory to dropbox, cheers - Barry On 25/02/14 15:54, 李惠惠 wrote: Hi Barry, Thanks for the instructions. I check the tmp directory and it seems that the align. file is broken. And I rerun the

Re: [Moses-support] Moses Compilation error

2014-02-25 Thread Hieu Hoang
There's a problem with your SRILM library. My advice: don't bother linking to SRILM. KenLM is much better than SRILM for most purposes these days. Your compilation command should be as simple as bjam -a -j2 On 25 February 2014 07:36, priyanka priya...@cse.iitb.ac.in wrote: Hello

[Moses-support] Moses training performance

2014-02-25 Thread Andrzej Zydron
Dear Support, I realize that there may not be a simple answer, but I would like to understand why running training on a 9300-segment corpus takes nearly three times as long on a 12-core Xeon E5-1650v2 with 128GB RAM running CentOS 6.5 than on my MacBook Pro with a 4-core i7 3720QM and 8GB RAM running

Re: [Moses-support] Moses training performance

2014-02-25 Thread Hieu Hoang
Strange and interesting. I can think of 2 issues: 1. The number of cores isn't relevant unless you explicitly ask mgiza and the various extraction steps to use multiple cores. 2. It looks like mgiza is the issue. 3. I'm not sure how io-bound mgiza is. However, in my test with virtual machines,

[Moses-support] Restrict WordTranslationFeature to certain pairs

2014-02-25 Thread Marcin Junczys-Dowmunt
Hi, Is there a non-programming way to restrict WordTranslationFeature to specific pairs rather than the complete product of two separate source and target word lists? Best, Marcin

Re: [Moses-support] Moses training performance

2014-02-25 Thread Andrzej Zydron
Many thanks Hieu, I did specify "-mgiza-cpus 4" for the Mac and "-mgiza-cpus 12" for the Xeon server. Interestingly "-mgiza-cpus 10" gave slightly better performance (5 mins). Looking at the io stats mgiza did not appear to

Re: [Moses-support] Moses training performance

2014-02-25 Thread Barry Haddow
Hi Andrzej, What if you give the same number of threads to each run? It may be that the small size of your data set means that the threading overhead outweighs the benefits of using lots of threads. Cheers - Barry

Re: [Moses-support] Moses training performance

2014-02-25 Thread Marcin Junczys-Dowmunt
I guess the mkcls time is a good hint here. Could it be that the Xeon system is much slower on a per-core basis, like low CPU frequency compared to your Mac? Mkcls is single process, so this is not a multi-threading issue. Maybe there is heavy load on that Xeon from other sources? W dniu

[Moses-support] Weight normalization in mert-moses.pl

2014-02-25 Thread Marcin Junczys-Dowmunt
Hi, I am wondering about this piece of code in the get_weights_from_mert function of mert-moses.pl: my $sum = 0.0; while (<$fh>) { if (/^F(\d+) ([\-\.\de]+)/) { # regular features $WEIGHT[$1] = $2; $sum += abs($2); } elsif (/^M(\d+_\d+) ([\-\.\de]+)/)

Re: [Moses-support] Weight normalization in mert-moses.pl

2014-02-25 Thread Marcin Junczys-Dowmunt
OK, second part of question is resolved. The multiple weight sets only occur if kbmira outputs to stdout. If a file name is given it overwrites the file and thus the previous weight set. I am however still wondering why the sparse weights are not being summed into the normalization factor. W
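To make the open question concrete, here is a minimal Python sketch of the normalization being discussed. It is not the code from mert-moses.pl. It reuses the two regular expressions quoted in the original post for the "F" (regular) and "M" lines, and it assumes, purely for illustration, that every other two-column line is a sparse weight of the form "name value". The function names and the include_sparse switch are hypothetical. The sketch only shows how the L1 normalization factor differs depending on whether sparse weights are summed into it.

```python
import re

# Patterns taken from the snippet quoted in the thread.
DENSE = re.compile(r"^F(\d+) ([\-\.\de]+)")
MIX = re.compile(r"^M(\d+_\d+) ([\-\.\de]+)")


def read_weights(lines):
    """Split optimizer output lines into dense and sparse weights."""
    dense, sparse = {}, {}
    for line in lines:
        match = DENSE.match(line)
        if match:
            dense[int(match.group(1))] = float(match.group(2))
            continue
        if MIX.match(line):
            continue  # mix weights are ignored in this sketch
        parts = line.split()
        if len(parts) == 2:
            # Assumption: any other "name value" line is a sparse weight.
            sparse[parts[0]] = float(parts[1])
    return dense, sparse


def l1_factor(dense, sparse, include_sparse=False):
    """Sum of absolute weights; sparse weights are optional (hypothetical switch)."""
    total = sum(abs(v) for v in dense.values())
    if include_sparse:
        total += sum(abs(v) for v in sparse.values())
    return total


def normalize(dense, factor):
    """Divide every dense weight by the normalization factor."""
    if factor == 0:
        raise ValueError("normalization factor is zero")
    return {index: value / factor for index, value in dense.items()}


# Made-up example input, not real kbmira output.
example = ["F0 0.5", "F1 -1.5", "wt_the~der 2.0"]
dense, sparse = read_weights(example)
dense_only = normalize(dense, l1_factor(dense, sparse))
with_sparse = normalize(dense, l1_factor(dense, sparse, include_sparse=True))
```

With the made-up input above, the dense-only factor is 2.0 and the factor including the sparse weight is 4.0, so the normalized dense weights differ by a factor of two between the two variants.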

Re: [Moses-support] --activate-features in mert-moses.perl not working?

2014-02-25 Thread Marcin Junczys-Dowmunt
Hi Hieu, Rico, this does not seem to be an issue with the ini-file. It actually works as well with stand-alone moses. The issue seems to be the mert-moses.pl script which switches off features that are not returned by the decoder because they are set to tuneable=false. In the function

[Moses-support] CFP Seven SIGIR’14 Workshops on emerging areas in IR

2014-02-25 Thread Richi Nayak
[Apologies if you receive this more than once] The workshop program of the SIGIR’14: 37th Annual ACM SIGIR Conference, Gold Coast, Australia, 6-11 July, 2014 http://sigir.org/sigir2014/ will host seven attractive workshops covering novel ideas and emerging areas in IR: