Re: [Moses-support] The BELU score from MultiEval is much lower than which generated by the Moses mert-moses.pl script

2013-01-24 Thread Tan, Jun
Hi Barry, Thanks for your information. The scores are calculated by MultEval on the test set, and I used only one reference in development. I recalculated the BLEU score with multi-bleu.pl. BLEU = 29.02, 65.8/36.2/22.0/13.7 (BP=0.996, ratio=0.996, hyp_len=19684, ref_len=19755) It's very
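The figures in the multi-bleu.pl line above can be checked by hand. The following is a minimal sketch, assuming the standard BLEU definition (brevity penalty times the geometric mean of the four n-gram precisions); the function names are illustrative and are not part of Moses or MultEval.

```python
import math

def brevity_penalty(hyp_len, ref_len):
    """BLEU brevity penalty: 1.0 unless the hypothesis is shorter than the reference."""
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)

def bleu_from_precisions(precisions_pct, hyp_len, ref_len):
    """Combine n-gram precisions (given in percent) into a BLEU score in percent."""
    if any(p <= 0 for p in precisions_pct):
        return 0.0  # a zero precision makes the geometric mean zero
    log_mean = sum(math.log(p / 100.0) for p in precisions_pct) / len(precisions_pct)
    return 100.0 * brevity_penalty(hyp_len, ref_len) * math.exp(log_mean)

# Values copied from the multi-bleu.pl output quoted in the message.
bp = brevity_penalty(19684, 19755)
score = bleu_from_precisions([65.8, 36.2, 22.0, 13.7], 19684, 19755)
print(f"BP={bp:.3f} BLEU={score:.2f}")
```

With these inputs the result comes out near 29.0. The small gap to the reported 29.02 is expected, because the precisions in the output line are already rounded to one decimal place.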

Re: [Moses-support] statistical significance tests

2013-01-24 Thread Rico Sennrich
saeed smith writes: > > Thank you all (especially for the paper Chris mentioned). I agree with you Barry. But as Germán said, when the optimizer is not involved in experiments (e.g. evaluating decoder modifications), the tool can be very useful. Am I missing something? I guess the point is that even

Re: [Moses-support] Question about the format of search graphs generated by moses-chart

2013-01-24 Thread Hieu Hoang
The same function in ChartManager should be what you want Hieu Sent while bumping into things On 24 Jan 2013, at 05:55 PM, "Jesús González Rubio" wrote: Thanks Christian. I have read the code of OutputSearchNode and it seems to be designed to write a word graph, not a hypergraph. ¿May be possib

Re: [Moses-support] Question about the format of search graphs generated by moses-chart

2013-01-24 Thread Jesús González Rubio
Thanks Christian. I have read the code of OutputSearchNode and it seems to be designed to write a word graph, not a hypergraph. Could it be that OutputSearchNode is the function called when the -osg option is passed to moses, and a different function is called for the same option of moses-char

Re: [Moses-support] Question about the format of search graphs generated by moses-chart

2013-01-24 Thread Christian Buck
Hi, I am not aware of updated documentation on this. Your best chance is probably to read through void OutputSearchNode in moses/src/Manager.cpp which is pretty readable. cheers, Christian On 24/01/13 17:24, Jesús González Rubio wrote: > Hi, > > I'm generating some translations using the -osg

[Moses-support] Question about the format of search graphs generated by moses-chart

2013-01-24 Thread Jesús González Rubio
Hi, I'm generating some translations using the -osg option of moses-chart and I am having difficulty fully understanding the format in which the search hypergraph is output. Is there a description of the osg format available? Cheers. -- Jesús

Re: [Moses-support] statistical significance tests

2013-01-24 Thread saeed smith
Thank you all (especially for the paper Chris mentioned). I agree with you Barry. But as Germán said, when the optimizer is not involved in experiments (e.g. evaluating decoder modifications), the tool can be very useful. Am I missing something? Cheers, SD -- *NRC Center for Language* On Thu, Jan

Re: [Moses-support] statistical significance tests

2013-01-24 Thread Tom Hoar
The question is, "significance" to what? Physics and other hard sciences aren't the same as a social science with applied technology. I think until someone can define a better significance test for human authorship of both original content and translation, I agree with Barry. It's better to ke

Re: [Moses-support] statistical significance tests

2013-01-24 Thread Barry Haddow
Hi Saeed In my experience, significance tests are often badly applied or interpreted, so I don't get good feelings when I read an MT paper *with* significance tests. I think having such a tool in Moses would make things worse. I don't want to have to read/review papers which claim that "Moses

Re: [Moses-support] Creating Language Model from google 1gram file

2013-01-24 Thread John D. Burger
If you move the count field to the beginning of the line, you can use the -text-has-weights switch of ngram-count: > -text-has-weights > Treat the first field in each text input line as a weight factor by which > the N-gram counts for that line are to be multiplied. More here: http://ww
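Moving the count field to the front could be done with a few lines of Python. This is only a sketch: it assumes each input line is whitespace-separated with the count as the last field, which may not match the actual layout of the Google 1-gram file. The function name `count_first` and the `split_letters` option (for the transliteration case, where each letter is treated as a word) are hypothetical.

```python
def count_first(line, split_letters=False):
    """Rewrite 'token ... count' as 'count token ...'.

    Assumes the count is the last whitespace-separated field (an assumption,
    not a documented property of the Google 1-gram file).
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"expected at least a token and a count: {line!r}")
    *tokens, count = fields
    int(count)  # raises ValueError if the last field is not an integer
    if split_letters:
        # Treat every character of every token as a separate "word".
        tokens = [ch for tok in tokens for ch in tok]
    return " ".join([count] + tokens)

print(count_first("hello\t42"))
print(count_first("hello\t42", split_letters=True))
```

The rewritten lines then carry the weight in the first field, which is the layout the quoted -text-has-weights description asks for.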

Re: [Moses-support] The BELU score from MultiEval is much lower than which generated by the Moses mert-moses.pl script

2013-01-24 Thread Jonathan Clark
Hi Jun, all: I just released a new version of MultEval (V0.5.1) that does not give the strange NaN's, but instead prints a warning message telling you that you're using a single optimizer run so that no value can be calculated and that any conclusions you draw from these numbers may be unreliable

Re: [Moses-support] statistical significance tests

2013-01-24 Thread Germán Sanchis Trilles
Indeed, I fully agree with the point about understanding the limits. In fact, in some multi-reference corpora I have observed variations of more than 10 BLEU points when computing inter-reference BLEU scores (i.e., one reference against the other references). However, this issue is much broader

Re: [Moses-support] statistical significance tests

2013-01-24 Thread Chris Dyer
If you're interested in statistical significance testing, you really ought to read the Clark et al. (2011) paper (http://www.cs.cmu.edu/~jhclark/pubs/significance.pdf). We showed that the Koehn technique and related methods can indicate significance for reasons that have little to do with the experi

Re: [Moses-support] statistical significance tests

2013-01-24 Thread Lane Schwartz
That would be great! On Thursday, January 24, 2013, Germán Sanchis Trilles wrote: > Hi all, > > personally I have an implementation of Koehn's 2004 ACL paper about > statistical significance tests for MT evaluation. It implements both > "stand-alone confidence intervals" (sec.5, bootstrap resamp

Re: [Moses-support] statistical significance tests

2013-01-24 Thread Germán Sanchis Trilles
Hi all, personally I have an implementation of Koehn's 2004 ACL paper about statistical significance tests for MT evaluation. It implements both "stand-alone confidence intervals" (Sec. 5, bootstrap resampling) and paired bootstrap resampling, if a baseline is given. Right now, it computes co
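For readers unfamiliar with paired bootstrap resampling, the idea can be sketched as follows. This is not Germán's implementation, only a minimal illustration: test sentences are resampled with replacement, both systems are scored on each resampled set, and the fraction of samples in which the system beats the baseline is reported. The names `paired_bootstrap` and `mean_score` are hypothetical, and the toy metric (a mean of per-sentence scores) stands in for a real corpus-level metric such as BLEU, which would aggregate n-gram counts instead.

```python
import random

def paired_bootstrap(baseline_stats, system_stats, corpus_metric,
                     n_samples=1000, seed=0):
    """Fraction of bootstrap samples on which the system outscores the baseline.

    baseline_stats / system_stats: per-sentence statistics, aligned by index.
    corpus_metric: callable mapping a list of statistics to a single score.
    """
    if len(baseline_stats) != len(system_stats):
        raise ValueError("both systems must be scored on the same sentences")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(baseline_stats)
    wins = 0
    for _ in range(n_samples):
        # Draw n sentence indices with replacement; use the same draw for both systems.
        idx = [rng.randrange(n) for _ in range(n)]
        base = corpus_metric([baseline_stats[i] for i in idx])
        syst = corpus_metric([system_stats[i] for i in idx])
        if syst > base:
            wins += 1
    return wins / n_samples

def mean_score(stats):
    """Toy corpus metric: average of per-sentence scores (an assumption for illustration)."""
    return sum(stats) / len(stats)

# Made-up per-sentence scores, purely for illustration.
baseline = [0.20, 0.35, 0.10, 0.40, 0.25]
system = [0.22, 0.30, 0.15, 0.45, 0.25]
print(paired_bootstrap(baseline, system, mean_score, n_samples=200))
```

A high fraction only says that the difference is stable under resampling of this test set; as discussed elsewhere in the thread, it does not account for variation between optimizer runs.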

Re: [Moses-support] statistical significance tests

2013-01-24 Thread Kenneth Heafield
Hi, Amusingly enough, the parallel thread regarding MultEval answers your question: https://github.com/jhclark/multeval . Kenneth On 01/24/13 11:15, Patrik Lambert wrote: > Hi Saeed, > > I fully agree with you. I don't think that in Physics, for example, a > paper without a reliable est

Re: [Moses-support] statistical significance tests

2013-01-24 Thread Patrik Lambert
Hi Saeed, I fully agree with you. I don't think that in Physics, for example, a paper without a reliable estimation of the error on the measurements would be publishable, nor would you see in a paper results with more digits than the significant ones. Having easy-to-use statistical significant

Re: [Moses-support] The BELU score from MultiEval is much lower than which generated by the Moses mert-moses.pl script

2013-01-24 Thread Rico Sennrich
Barry Haddow writes: > The NaNs in the MultiEval output are a bit strange. I'm not familiar > with this tool, but Moses contains multi-bleu.pl (in scripts/generic) > which you can also use to calculate Bleu, > > cheers - Barry s_opt is the variance of different optimizer runs. MultEval is int

Re: [Moses-support] Creating Language Model from google 1gram file

2013-01-24 Thread HOANG Cong Duy Vu
Hi, I guess you can run as follows: build-sublm.pl --size --ngrams --sublm [--prune-singletons] [--kneser-ney|--witten-bell] merge-sublm.pl --size --sublm -lm iARPA_LM.gz (then with ARPA files you can use KenLM to build binary LM files) -- Cheers, Vu On Thu, Jan 24, 2013 at 6:14 AM, Pele

Re: [Moses-support] The BELU score from MultiEval is much lower than which generated by the Moses mert-moses.pl script

2013-01-24 Thread Barry Haddow
Hi Jun, mert-moses.pl is not an evaluation script, it's for tuning the MT engine. It will report BLEU scores obtained during tuning, but these are on the development set. The scores you're showing using MultEval are (I hope!) on the test set, which would make them different. It's quite a big d

[Moses-support] Creating Language Model from google 1gram file

2013-01-24 Thread Peled Guy
Hi, I'm working on a transliteration project. The input is a word in one language and the output is the same word in English (not translated). My language model will be created from the Google 1-gram file, with each letter of a word treated as a word. This is the original file: 95119665584 9511