Hi Barry,
Thanks for your information.
The scores are calculated by MultEval on the test set, and I used only one
reference in development.
I re-calculated the BLEU score via multi-bleu.pl.
BLEU = 29.02, 65.8/36.2/22.0/13.7 (BP=0.996, ratio=0.996, hyp_len=19684,
ref_len=19755)
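For reference, multi-bleu.pl combines those four n-gram precisions with the brevity penalty as BLEU = BP * exp((1/4) * sum of log p_n). A quick sketch of that computation (using the rounded precisions printed above, so it only approximately reproduces the reported 29.02):

```python
import math

# N-gram precisions as printed by multi-bleu.pl (rounded to one decimal)
precisions = [65.8, 36.2, 22.0, 13.7]
hyp_len, ref_len = 19684, 19755

# Brevity penalty: exp(1 - ref/hyp) when the hypothesis is shorter
bp = math.exp(1 - ref_len / hyp_len) if hyp_len < ref_len else 1.0

# BLEU = BP * geometric mean of the n-gram precisions
bleu = bp * math.exp(sum(math.log(p / 100) for p in precisions) / len(precisions))
print(round(100 * bleu, 2))  # close to the reported 29.02
```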
It's very
saeed smith writes:
>
> Thank you all (especially for the paper Chris mentioned). I agree with you
Barry. But as Germán said, when optimizer is not involved in experiments (e.g.
evaluating decoder modifications), the tool can be very useful. Am I missing
something?
I guess the point is that even
The same function in ChartManager should be what you want
Hieu
Sent while bumping into things
On 24 Jan 2013, at 05:55 PM, "Jesús González Rubio"
wrote:
Thanks Christian.
I have read the code of OutputSearchNode and it seems to be designed to
write a word-graph, not a hypergraph. Could it be that OutputSearchNode
is the function called when the -osg option is passed to moses, and a
different function is called for the same option of moses-chart?
Hi,
I am not aware of updated documentation on this. Your best chance is
probably to read through
void OutputSearchNode
in moses/src/Manager.cpp which is pretty readable.
cheers,
Christian
On 24/01/13 17:24, Jesús González Rubio wrote:
> Hi,
>
> I'm generating some translations using the -osg
Hi,
I'm generating some translations using the -osg option of moses-chart and I
have some difficulty fully understanding the format in which the search
hypergraph is output. Is there a description of the osg format
available?
Cheers.
--
Jesús
Thank you all (especially for the paper Chris mentioned).
I agree with you Barry. But as Germán said, when optimizer is not involved
in experiments (e.g. evaluating decoder modifications), the tool can be
very useful. Am I missing something?
Cheers,
SD
--
*NRC Center for Language*
On Thu, Jan
The question is, "significance" to what? Physics and other hard
sciences aren't the same as a social science with applied technology.
I think until someone can define a better significance test for human
authorship of both original content and translation, I agree with Barry.
It's better to ke
Hi Saeed
In my experience, significance tests are often badly applied or
interpreted, so I don't get good feelings when I read an MT paper *with*
significance tests.
I think having such a tool in Moses would make things worse. I don't
want to have to read/review papers which claim that "Moses
If you move the count field to the beginning of the line, you can use the
-text-has-weights switch of ngram-count:
> -text-has-weights
> Treat the first field in each text input line as a weight factor by which
> the N-gram counts for that line are to be multiplied.
More here:
http://ww
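A minimal sketch of that preprocessing step (assuming the input lines are "text<TAB>count", as in the Google n-gram files):

```python
def move_count_to_front(line):
    """Turn 'some text<TAB>95119665584' into '95119665584 some text'
    so ngram-count's -text-has-weights can use the count as a weight."""
    text, count = line.rstrip("\n").rsplit("\t", 1)
    return f"{count} {text}"

print(move_count_to_front("a b c\t95119665584"))  # -> "95119665584 a b c"
```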
Hi Jun, all:
I just released a new version of MultEval (V0.5.1) that does not give the
strange NaN's, but instead prints a warning message telling you that you're
using a single optimizer run so that no value can be calculated and that
any conclusions you draw from these numbers may be unreliable
Indeed, I fully agree with the point about understanding the limits. In
fact, in some multi-reference corpora I have observed variations of more
than 10 BLEU points when computing inter-reference BLEU scores (i.e., one
reference against the other references). However, this issue is much
broader
If you're interested in statistical significance testing, you really
ought to read the Clark et al. (2011) paper
(http://www.cs.cmu.edu/~jhclark/pubs/significance.pdf). We showed that
the Koehn technique and related methods can indicate significance for
reasons that have little to do with the experi
That would be great!
On Thursday, January 24, 2013, Germán Sanchis Trilles wrote:
> Hi all,
>
> personally I have an implementation of Koehn's 2004 ACL paper about
> statistical significance tests for MT evaluation. It implements both
> "stand-alone confidence intervals" (sec.5, bootstrap resamp
Hi all,
personally I have an implementation of Koehn's 2004 ACL paper about
statistical significance tests for MT evaluation. It implements both
"stand-alone confidence intervals" (sec.5, bootstrap resampling) and
paired bootstrap resampling, if a baseline is given. Right now, it
computes co
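Not Germán's implementation, but the core of Koehn's (2004) paired bootstrap resampling can be sketched as follows; additive per-sentence scores stand in for a real corpus-level metric here:

```python
import random

def paired_bootstrap(scores_a, scores_b, n_samples=1000, seed=0):
    """Paired bootstrap resampling (Koehn 2004): resample test sentences
    with replacement and count how often system A beats system B.
    scores_a / scores_b are per-sentence scores on the same test set."""
    rng = random.Random(seed)
    n = len(scores_a)
    wins_a = 0
    for _ in range(n_samples):
        idx = [rng.randrange(n) for _ in range(n)]
        if sum(scores_a[i] for i in idx) > sum(scores_b[i] for i in idx):
            wins_a += 1
    return wins_a / n_samples

# Toy example: per-sentence scores for two systems on a 50-sentence test set
p = paired_bootstrap([0.6] * 50, [0.4] * 50, n_samples=200)
print(p)  # A wins every resample here -> 1.0
```

If A wins in, say, at least 95% of the resamples, the improvement is declared significant at that level. Note that real BLEU is corpus-level, not a sum of sentence scores, so a faithful implementation resamples per-sentence sufficient statistics (n-gram match and length counts) and recomputes BLEU on each resample.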
Hi,
Amusingly enough, the parallel thread regarding MultEval answers your
question: https://github.com/jhclark/multeval .
Kenneth
On 01/24/13 11:15, Patrik Lambert wrote:
> Hi Saeed,
>
> I fully agree with you. I don't think that in Physics, for example, a
> paper without a reliable est
Hi Saeed,
I fully agree with you. I don't think that in Physics, for example, a
paper without a reliable estimation of the error on the measurements
would be publishable, nor would you see in a paper results with more
digits than the significant ones.
Having easy-to-use statistical significance
Barry Haddow writes:
> The NaNs in the MultEval output are a bit strange. I'm not familiar
> with this tool, but Moses contains multi-bleu.pl (in scripts/generic)
> which you can also use to calculate BLEU,
>
> cheers - Barry
s_opt is the variance of different optimizer runs. MultEval is int
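As a sketch of what that means (assuming s_opt is the sample standard deviation of the metric across optimizer runs; the run scores below are made up), and why a single run yields no value:

```python
import statistics

# Metric scores from several independent optimizer (e.g. MERT) runs
runs = [29.02, 28.71, 29.35]

# s_opt: sample standard deviation across optimizer runs
s_opt = statistics.stdev(runs)
print(round(s_opt, 2))  # -> 0.32

# With a single run there is nothing to estimate:
try:
    statistics.stdev([29.02])
except statistics.StatisticsError:
    print("need at least two optimizer runs")
```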
Hi,
I guess you can run as follows:
build-sublm.pl --size <size> --ngrams <ngram file> --sublm <sublm prefix>
[--prune-singletons] [--kneser-ney|--witten-bell]
merge-sublm.pl --size <size> --sublm <sublm prefix> -lm iARPA_LM.gz
(then with ARPA files you can use KenLM to build binary LM files)
--
Cheers,
Vu
On Thu, Jan 24, 2013 at 6:14 AM, Pele
Hi Jun
mert-moses.pl is not an evaluation script, it's for tuning the MT
engine. It will report bleu scores obtained during tuning, but these are
on the development set. The scores you're showing using MultEval are (I
hope!) on the test set, which would make them different. It's quite a
big d
Hi,
I'm working on a transliteration project.
The input is a word in one language and the output is the same word in
English (transliterated, not translated).
My language model will be created from the Google 1-gram file, where each
letter of a word should be treated as a word.
This is the original file:
95119665584
9511
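A minimal sketch of the preprocessing this would need (assuming the 1-gram file uses tab-separated "word<TAB>count" lines): put the count first, then split the word into letter tokens so the LM sees each character as a word.

```python
def char_tokenize(line):
    """Turn a Google 1-gram line 'word<TAB>count' into 'count w o r d':
    count first (usable as a weight via ngram-count -text-has-weights),
    then the word split into one token per letter."""
    word, count = line.rstrip("\n").split("\t")
    return count + " " + " ".join(word)

print(char_tokenize("word\t95119665584"))  # -> "95119665584 w o r d"
```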