[Moses-support] Alignment information in Phrase table

2008-08-05 Thread Qin Gao
Hi, I wonder whether the word alignment information in the phrase table is used in decoding, such as ||| (0) (1,2,3) () ||| (2) (1,2) |||. I think Moses does not use the lexicon, so how is this information used? Thanks. Qin
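For readers unfamiliar with those columns: below is a minimal sketch of how such alignment fields can be parsed, assuming each parenthesised group lists, for the word at that position, the positions it aligns to on the other side. The example line and field layout are illustrative only; real phrase-table layouts vary between Moses versions.

    # Minimal sketch: split a phrase-table line on "|||" and turn an
    # alignment column like "(0) (1,2) ()" into lists of positions.
    def parse_alignment_field(field):
        groups = []
        for token in field.split():
            inner = token.strip("()")
            groups.append([int(i) for i in inner.split(",")] if inner else [])
        return groups

    # Illustrative line, not taken from an actual Moses phrase table.
    line = "der alte Mann ||| the old man ||| (0) (1) (2) ||| (0) (1) (2) ||| 0.5 0.3"
    fields = [f.strip() for f in line.split("|||")]
    src_to_tgt = parse_alignment_field(fields[2])  # per source word: aligned target positions
    tgt_to_src = parse_alignment_field(fields[3])  # per target word: aligned source positions
    print(src_to_tgt, tgt_to_src)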

Re: [Moses-support] Alignment information in Phrase table

2008-08-05 Thread Hieu Hoang
Nope, the alignment information is not used by the decoder in the main trunk.

[Moses-support] Fwd: decoding: reordering only

2008-08-05 Thread John D. Burger
Oops, forgot to CC the list. > From: "John D. Burger" <[EMAIL PROTECTED]> > Date: August 4, 2008 13:30:30 EDT > To: [EMAIL PROTECTED] > Subject: Re: [Moses-support] decoding: reordering only > > Sanne Korzec wrote: > >> Is there a way to force the Moses or Pharaoh decoder to use a >> certain se

Re: [Moses-support] Trying to debug reduced performance with new Moses

2008-08-05 Thread John D. Burger
Hi - I'm still trying to debug my differences between old and new versions of Moses, which (for us) use SRILM and IRSTLM respectively. My current puzzle is over the very different sizes of the language models resulting from SRILM and IRSTLM - the latter has 5 times as many 5-grams, for instance
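One quick way to compare the two models directly is to read the \data\ header of each ARPA file, which lists the number of n-grams per order. A small sketch, assuming both toolkits have written standard (i)ARPA text files; the filenames are placeholders:

    import re

    def arpa_ngram_counts(path):
        # Read the "ngram N=count" lines from the \data\ header of an ARPA LM.
        counts = {}
        with open(path, encoding="utf-8", errors="replace") as f:
            for line in f:
                m = re.match(r"ngram (\d+)\s*=\s*(\d+)", line.strip())
                if m:
                    counts[int(m.group(1))] = int(m.group(2))
                elif counts:
                    break  # past the header block
        return counts

    for name in ("srilm.5gram.arpa", "irstlm.5gram.arpa"):  # placeholder names
        print(name, arpa_ngram_counts(name))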

Re: [Moses-support] Trying to debug reduced performance with new Moses

2008-08-05 Thread Miles Osborne
By default, SRILM prunes singletons. Miles
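Not from the thread, but a rough way to see how much singleton pruning alone could explain is to count how many distinct n-grams in the training text occur exactly once. A minimal sketch; the corpus filename is a placeholder:

    from collections import Counter

    def ngram_counts(path, n):
        # Count n-grams sentence by sentence (no counting across sentence boundaries).
        counts = Counter()
        with open(path, encoding="utf-8") as f:
            for line in f:
                toks = ["<s>"] * (n - 1) + line.split() + ["</s>"]
                counts.update(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
        return counts

    for n in (3, 4, 5):
        counts = ngram_counts("corpus.txt", n)  # placeholder training corpus
        singletons = sum(1 for c in counts.values() if c == 1)
        print(n, "grams:", len(counts), "distinct,", singletons, "singletons")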

Re: [Moses-support] Trying to debug reduced performance with new Moses

2008-08-05 Thread John D. Burger
Miles Osborne wrote: > by default the srilm prunes singletons OK, that's good to know. But when I prune the IRST LM, I still get lots =more= 4-grams than the SRI LM, but lots =fewer= 5-grams (although less than a factor of two in either case). But perhaps I'm a bit in the weeds here ... :)

Re: [Moses-support] Trying to debug reduced performance with new Moses

2008-08-05 Thread Miles Osborne
You also want to check that n-grams are not getting pruned by probability (in addition to counts). This whole business is a bit on the murky side, and the only reason I know about it is that I was writing a disk-based version of ngram-count a year or so back. Miles

Re: [Moses-support] Trying to debug reduced performance with new Moses

2008-08-05 Thread John D. Burger
Miles Osborne wrote: > you want to also check that ngrams are not getting pruned by > probability (in addition to counts) Yes, in fact, this: http://www.speech.sri.com/projects/srilm/manpages/ngram-count.1.html seems to suggest that pruning is done based on not changing perplexity very much

Re: [Moses-support] Trying to debug reduced performance with new Moses

2008-08-05 Thread Miles Osborne
It has been a while since I looked at this, but look at this (Good-Turing): *not pruning*: [rydell]miles: ./ngram-count -lm /tmp/test2.lm -order 3 -gt1min 0 -gt2min 0 -gt3min 0 -text ../../../mt/diskbased-lm-training/temp.txt warning: discount coeff 1 is out of range: 5.55654e-17 warning: disc

Re: [Moses-support] Trying to debug reduced performance with new Moses

2008-08-05 Thread Marcello Federico
Hi all, IRSTLM uses simpler smoothing methods than SRILM. In particular, improved Kneser-Ney smoothing in SRILM uses corrected frequencies for lower-order n-grams, while IRSTLM does not. (This indeed results in fewer n-grams.) The reason is that introducing corrected frequencies makes it hard to
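For context (an aside, not from the thread): the "corrected frequencies" in Kneser-Ney smoothing are continuation counts: for the lower-order distributions a word is counted once per distinct context it follows, rather than once per occurrence. A toy sketch of that idea:

    from collections import defaultdict

    def continuation_counts(bigrams):
        # Kneser-Ney style corrected unigram counts: for each word,
        # count the number of distinct left contexts it was seen with.
        left_contexts = defaultdict(set)
        for prev, word in bigrams:
            left_contexts[word].add(prev)
        return {word: len(ctx) for word, ctx in left_contexts.items()}

    # Toy data: "francisco" is frequent but only ever follows "san",
    # so its corrected (continuation) count stays low.
    bigrams = [("san", "francisco"), ("san", "francisco"),
               ("the", "cat"), ("a", "cat"), ("the", "dog")]
    print(continuation_counts(bigrams))  # {'francisco': 1, 'cat': 2, 'dog': 1}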

Re: [Moses-support] Trying to debug reduced performance with new Moses

2008-08-05 Thread amittai axelrod
2008/8/5 John D. Burger <[EMAIL PROTECTED]>: > I'm starting to think it's a lost cause to try to get one LM > implementation to act very much like the other. Thanks for the > insights, though! I also spent some time unsuccessfully trying to exactly match the SRILM toolkit's output. Aside from the

Re: [Moses-support] Trying to debug reduced performance with new Moses

2008-08-05 Thread Miles Osborne
Actually there are two parts here -- building large LMs and deploying them. I currently have a summer MSc project looking at using Hadoop and HBase to do this Google-style. This really does use a cluster of machines, for both parts. In either case, building them on-disk with a single machine or

Re: [Moses-support] Moses: Prepare Data, Build Language Model and Train Model

2008-08-05 Thread Anung Ariwibowo
Hi, I installed Ubuntu Linux 8 and training-factored-model executes successfully. I also tried installing Ubuntu Linux 7.10 on coLinux, and that was successful as well. I don't have any particular idea why the training fails on Solaris. One different thing I noticed is the MERT version wh