Hi Jorg,
I recall Marco Turchi had some statistical analysis of these under
different conditions for Spanish-English translation.
I had to create a number of plots for my own paper, covering different
language pairs and corpora, with the data (i.e. BLEU scores) for different
training data
Prasanth K prasanthk.m...@gmail.com wrote:
Hi,
I am trying to use the target side of the parallel corpus to train my LM
from the EMS. Following the example on the EMS webpage, I have defined my
LM section as
[LM:multiun]
lowercased-corpus = [CORPUS:multiun:lowercased]
The LM training crashes, since the pipeline does not link this to the
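For reference, the relevant EMS section might look roughly like the sketch below; the variable names follow the EMS example config, but the raw-corpus alternative and all paths are assumptions, not taken from this thread:

```ini
[LM:multiun]
# Train the LM on the target side of the parallel corpus; EMS is
# expected to resolve the corpus reference and append the
# output-language extension from [GENERAL].
lowercased-corpus = [CORPUS:multiun:lowercased]

# Alternative sketch: point directly at a monolingual file
# (placeholder path).
# raw-corpus = $working-dir/corpus/multiun.$output-extension
```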
Hi,
I have recently moved from a multi-core machine to a cluster setup and am
trying to set up Moses. In testing the installation with minimal datasets
(100 sentences for train, 10 and 5 sentences as devel and test), I have
set up the EMS to use resources on individual nodes conservatively. This
for toy examples. Then when you're actually submitting jobs, tell lmplz to
use less memory than you asked for from PBS/SGE/SLURM.
Kenneth
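Kenneth's advice above can be sketched as a command line; the order, sizes, and file names here are placeholders, not from the thread:

```shell
# Job requested 16 GB from PBS/SGE/SLURM; leave headroom for lmplz.
# -o sets the n-gram order, -S caps the sort-buffer memory (accepts
# sizes like 12G or a percentage), -T points temporary files at a
# local scratch directory.
lmplz -o 5 -S 12G -T /tmp < corpus.tok.en > lm.arpa.en
```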
On 05/07/14 05:58, Prasanth K wrote:
Hi Nadeem,
Miles is right. I think I can add some details of experiments which show
the same thing you have observed.
There have been large scale experiments doing what you are trying to see:
evaluating the translation quality on data that has been used to train the
system. The link to one of
Hi,
I noticed a recent thread about the use of SGE clusters to run Moses. I now
know that Thomas Meyer provided a script to get the Moses decoder running
on a cluster using SGE. I also learned that folks at Edinburgh are using a
large multi-core machine to run Moses (from Hieu's mail in the same thread).
My
Hi Pranjal,
It's not uncommon to observe such differences when changing the direction of
translation. Translation from English to Bengali is relatively harder, as
Bengali is morphologically rich, making it difficult for the correct
surface forms to be generated. Given that BLEU is a pattern
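The point about surface forms can be made concrete with a toy version of the modified n-gram precision that BLEU is built on; the sentences and the wrong inflection below are invented for illustration:

```python
from collections import Counter

def ngram_precision(hyp, ref, n):
    """Clipped n-gram precision over token lists, as in BLEU's
    modified n-gram precision."""
    hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    matched = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
    total = sum(hyp_ngrams.values())
    return matched / total if total else 0.0

# Invented example: one wrong surface form ("was" instead of "were").
ref = "the children were playing in the garden".split()
hyp = "the children was playing in the garden".split()

print(ngram_precision(hyp, ref, 1))  # 6 of 7 unigrams match (~0.857)
print(ngram_precision(hyp, ref, 2))  # 4 of 6 bigrams match (~0.667)
```

Note how a single mis-inflected token costs two bigram matches: an exact-surface-match metric compounds the penalty for morphologically rich output.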
Dec 12, 2013 at 8:58 PM, Prasanth K prasanthk.m...@gmail.com wrote:
hieuho...@gmail.com wrote:
ok, I can't reproduce your error
"Function not implemented"
you should find out exactly how lmplz is being run; it may be that you
have a slightly older version that doesn't know all the arguments you've
given it.
On 26/11/2013 06:47, Prasanth K wrote:
Hello Hieu
22:40, Prasanth K wrote:
Hi Kenneth,
Thanks for the clarification w.r.t. calculating the memory size. But I
am running these on a Mac (10.9 Mavericks). Do you think I should still
port the lmplz code to Mac for the estimation of probabilities?
One thing though, I did change the default
Hi,
I am trying to use KenLM for building a language model on the Europarl
corpus. Following the instructions in (
http://www.statmt.org/moses/?n=FactoredTraining.BuildingLanguageModel#ntoc19),
I added a few lines for getting KenLM to estimate the LM probabilities
(order n = 5) to my config file
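For context, the workflow that page describes is roughly the following; the binary locations assume a standard Moses/KenLM build, and the file names are placeholders:

```shell
# Estimate a 5-gram ARPA model with lmplz, then convert it to
# KenLM's binary format with build_binary for faster loading.
bin/lmplz -o 5 < europarl.tok.en > europarl.arpa.en
bin/build_binary europarl.arpa.en europarl.binlm.en
```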
Hi,
There is work by Marco Turchi where they look at the evolution of BLEU with
respect to increasing the size of the data set used for MERT. The
investigation is primarily for the Spanish-English language pair, so the
inferences might not carry over to a more challenging language pair.
The draft
Hi,
This sounds like a problem with SRILM (the language model toolkit) rather
than Moses. To resolve this:
1. In case you are using Ubuntu, try adding the NLP repository by
Eric Nichols to your apt sources. That should allow you to install SRILM
through apt-get, which is much simpler. Look here
Hi all,
I have tried using the evaluator that Matous is talking about.
The problem is that there is a difference of 0.5-1 BLEU points between the
value given by the evaluator script and the regular mteval-v13.pl script.
Certainly not a case of precision error!
I am evaluating the tokenized
Hi all,
I am facing the same error that Cyrine Nasri mentioned in this thread.
I will try to give more information than what has already been mentioned.
1. I was using a single-threaded version of Moses until recently and had
no problems with the experiments using EMS.
2. I recently
Hi all,
I've recently posted a query about the problem I was having with MERT. I
found a similar thread on the list (by Cyrine Nasri) and posted the
question on the same thread.
Not to repeat myself, but the problem I was facing was that the MERT step
was crashing after running exactly one iteration.
Hello All,
I am conducting a series of experiments to build translation systems using
Moses in which the corpus of the current experiment is a subset of the
corpora used in the previous experiment. I have started with the Europarl
corpora and am likely to repeat this process about 20 times.
Hi Sir,
Thank you for replying to my mail. Yes, I have thought about this solution
for alignments, but the heuristics used in Moses got me thinking, and I
wanted to use the heuristic to obtain the final alignments (since those
alignments are of higher quality). So, my question would be more like,
I am working on Statistical and Hybridized Methods for Machine Translation
in Indian languages. An associate of mine has recently submitted a paper to
CICLing describing an alignment algorithm tailored for Indian languages
(the English-Hindi pair, to be exact). The algorithm reports slightly
better