[Moses-support] decoding a confusion network using Moses' API

2012-04-26 Thread Sylvain Raybaud
Hi all

  I'm using the Moses API to decode a confusion network. The CN is
created from the output of an ASR engine and a confusion matrix. More
precisely (even though it's probably irrelevant to my problem), the ASR
engine provides a string of phonemes (1-best) and the confusion matrix
provides alternatives for each phoneme (the idea is described in Jiang
et al., _Phonetic representation based speech translation_, MT Summit
XIII, 2011).

When the CN is dumped into a file and I use
moses -f moses.phonemes.cn.ini < CN
to decode it, everything is fine.

But when I use the Moses API (loading the same configuration file), I get
incomplete translations, like:

ASR output (French): "nous font sont toujours chimistes plume
rassembleront ch je trouve que le office de ce tout de suite"
Phonetic representation: "n u f on s on t t u ge u r ch i m i s t z p l
y m r a s an b l swa r on ch ge swa t r u v k swa l swa oh f i s swa d
swa s swa t u d s h i t"
Translation: "of"
score: 903011968.00

Note that the transcription is poor (I haven't really tuned the ASR
engine), but still, the translation ought to be more than just "of".
Sometimes it's several words; I guess that's a phrase from the phrase
table. The output generally seems to be the translation of a single word
in the source sentence.
When I use moses on the command line to translate either the 1-best or
the CN, I get a reasonable translation. When I use the API to translate
the 1-best phonetic representation, I also get a reasonable translation.
I think the CN object is created correctly, because moses loads it and
prints it prior to decoding (this is the normal verbose behavior). I also
tried to create a PCN object and got exactly the same results. So I
guess the problem is either how I tell moses to decode it or how I
extract the result from the Hypothesis object. But I'm clueless about
what the problem is, since the code works when I just translate a plain
string. The translation score seems ridiculously high too.
The corresponding code is below.

Decoding and hypothesis extraction:
***
[...]
Manager * manager = new Manager(*cn, staticData.GetSearchAlgorithm(),
                                &system);
manager->ProcessSentence();
const Hypothesis* hypo = manager->GetBestHypothesis();
string hyp = moses_get_hyp(hypo);
[...]
pair->translation_score = UntransformScore(hypo->GetScore());
[...]

string moses_get_hyp(const Hypothesis* hypo) {
    return hypo->GetTargetPhraseStringRep();
}


Creation of the CN:
***

/** new class derived from ConfusionNet, with a new method for directly
creating the CN */
class MyConfusionNet : public ConfusionNet {
  public:
    void addCol(Column);
};

void MyConfusionNet::addCol(Column col) {
    data.push_back(col);
}

/** create a column of the CN */
static MyConfusionNet::Column create_phoneme_col(confusion_matrix_t * cm,
        const char * ph, int width, double thresh,
        const vector<FactorType> &factor_order) {

    MyConfusionNet::Column col;

    phoneme_conf_t * ph_conf =
        (phoneme_conf_t*)g_hash_table_lookup(cm->matrix, ph);
    if (ph_conf == NULL) {
        return col;
    }

    int i;
    for (i = 0; i < cm->n_phonemes; i++) {
        vector<float> scores;
        float score = float(ph_conf[i].p);
        if ((width <= 0 || i < width) && score >= thresh) {
            string wd(cm->phonemes[ph_conf[i].phoneme]);
            Word word;
            word.CreateFromString(Input, factor_order, wd, false);
            scores.push_back(score);
            pair<Word, vector<float> > linkdata(word, scores);
            col.push_back(linkdata);
        }
    }

    return col;
}

/** creates a confusion network from a NULL-terminated phoneme list and
a phoneme confusion matrix */
static MyConfusionNet * phonemes_to_cn(confusion_matrix_t * cm,
        const char ** phonemes, int width, double thresh,
        const vector<FactorType> &factor_order) {
    debug("start");

    MyConfusionNet * cn = new MyConfusionNet();

    int i = 0;
    while (phonemes[i] != NULL) {
        debug("%s", phonemes[i]);
        MyConfusionNet::Column col =
            create_phoneme_col(cm, phonemes[i], width, thresh, factor_order);
        cn->addCol(col);
        i += 1;
    }

    return cn;
}

So, if anyone has an idea about what's wrong here, thanks!

cheers,

-- 
Sylvain Raybaud
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] decoding a confusion network using Moses' API

2012-04-26 Thread Barry Haddow
Hi Sylvain

I'm not familiar with this part of the code, but the strange score suggests 
that there's some uninitialised memory somewhere. You could try running it 
through valgrind; it might give some clues.

cheers - Barry



[Moses-support] default inputtype changed?

2012-04-26 Thread John Morgan
Hello,
My EMS system builds that worked last week are now failing at the
tuning step.
If I set
decoder-settings = "-inputtype 3"
under the TUNING stanza in my experiment.perl configuration file,
tuning runs again.
Did something change?
Thanks,
-- 
Regards,
John J Morgan


Re: [Moses-support] decoding a confusion network using Moses' API

2012-04-26 Thread Sylvain Raybaud
Hi Barry

  Thanks for the tip, that sounds likely indeed. I'll try it again, but
last time I ran the software through valgrind I got so many errors in
external libs that I just gave up.

In the meantime, here is the complete function that handles the
decoding, in case someone sees something obviously wrong in it...

static void moses_translate_phonemes(manager_data_t * pool,
        translation_pair_t * pair) {
    debug("starting");

    const TranslationSystem& system =
        StaticData::Instance().GetTranslationSystem(TranslationSystem::DEFAULT);
    /* there is only one translation system for now */
    const StaticData &staticData = StaticData::Instance();
    const vector<FactorType> &inputFactorOrder =
        staticData.GetInputFactorOrder();

    MyConfusionNet * cn = phonemes_to_cn(pool->mp_engine->phonemes_cm,
        pair->source->phonemes, pool->mp_config->cn_width,
        pool->mp_config->cn_thresh, inputFactorOrder);

    Manager * manager = new Manager(*cn, staticData.GetSearchAlgorithm(),
                                    &system);
    manager->ProcessSentence();
    const Hypothesis* hypo = manager->GetBestHypothesis();

    string hyp = moses_get_hyp(hypo);
    char * hyp_ret = (char*)malloc((strlen(hyp.c_str())+1)*sizeof(char));
    strcpy(hyp_ret, hyp.c_str());

    pair->translation_score = UntransformScore(hypo->GetScore());
    translation_pair_set_target(pair, hyp_ret, NULL);

    delete manager;
    delete cn;
}

cheers,

Sylvain


Re: [Moses-support] decoding a confusion network using Moses' API

2012-04-26 Thread Sylvain Raybaud
Wild guess here: in TranslationTask::Run I see there are many
alternatives for processing the sentence, like doLatticeMBR etc., not
just running Manager::ProcessSentence().
Maybe one of these alternatives has to be run for processing confusion
networks?

cheers

Sylvain



Re: [Moses-support] decoding a confusion network using Moses' API

2012-04-26 Thread Barry Haddow
Hi Sylvain

I think ProcessSentence() is the right method to call. If you look at the 
moses server code you'll see a less cluttered example of how to use the 
Moses API. It may be that your moses_get_hyp() is not backtracking through 
the hypothesis chain correctly.
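
For what it's worth, here is a minimal sketch of what that backtracking could 
look like, using only the two accessors already present in the posted code 
(GetTargetPhraseStringRep() and GetPrevHypo()); it is meant as an illustration 
of the idea, not as the canonical moses-cmd output code:

string moses_get_hyp(const Hypothesis* hypo) {
    // GetBestHypothesis() returns the *last* hypothesis on the best path;
    // each hypothesis only carries the target phrase it added itself, so
    // walk back through GetPrevHypo() and collect all of them.
    vector<string> phrases;
    for (const Hypothesis* h = hypo; h != NULL; h = h->GetPrevHypo()) {
        string rep = h->GetTargetPhraseStringRep();
        if (!rep.empty()) {
            phrases.push_back(rep);
        }
    }
    // The chain was walked end-to-start, so reverse it for left-to-right output.
    string out;
    for (vector<string>::reverse_iterator it = phrases.rbegin();
         it != phrases.rend(); ++it) {
        if (!out.empty()) out += " ";
        out += *it;
    }
    return out;
}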

Note that you are calling UntransformScore(), which probably explains your 
odd translation score. It doesn't make much sense to do this, as you won't 
get a probability out of it (the score isn't normalised). It is unusual, 
though, that you appear to have a positive translation score (in log space).
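
Concretely (a one-line sketch, reusing the accessor from the posted code), it 
is probably more meaningful to keep the raw log-linear model score:

/* keep the model score in log space instead of calling UntransformScore() */
pair->translation_score = hypo->GetScore();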

If you increase the verbosity of moses (to 2 or 3) you'll get a better idea 
of what it is doing, and you can see whether it really is producing "of" as 
the translation, and why.
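
If it helps, the verbosity can be set in the configuration file itself, so the 
same ini file works for both the command line and the API; as far as I 
remember the parameter is simply called "verbose" (the -v option), but please 
double-check against your Moses version:

[verbose]
2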

cheers - Barry

[Moses-support] Higher BLEU/METEOR score than usual for EN-DE

2012-04-26 Thread Daniel Schaut
Hi all,

I'm running some experiments for my thesis and I've been told by a more
experienced user that the BLEU/METEOR scores achieved by my MT engine are
too good to be true. Since this is the very first MT engine I've ever
built and I am not experienced with interpreting scores, I really don't
know how to judge them. The first test set achieves a BLEU score of 0.6508
(v13). METEOR's final score is 0.7055 (v1.3, exact, stem, paraphrase). A
second test set gives a slightly lower BLEU score of 0.6267 and a METEOR
score of 0.6748.

Here are some basic facts about my system:
Decoding direction: EN-DE
Training corpus: 1.8 mil sentences
Tuning runs: 5
Test sets: a) 2,000 sentences, b) 1,000 sentences (both in-domain)
LM type: trigram
TM type: unfactored

I'm now trying to figure out whether these scores are realistic at all, as
various papers report far lower BLEU scores, e.g. Koehn and Hoang
2011. Any comments regarding this decoding direction and the scores to
expect would be much appreciated.

Best,
Daniel


Re: [Moses-support] Higher BLEU/METEOR score than usual for EN-DE

2012-04-26 Thread Barry Haddow
Hi Daniel

BLEU scores do vary according to the test set, but the scores you report are 
much higher than usual.

The most likely explanation is that some of your test set is included in your 
training set.

cheers - Barry

 
--
Barry Haddow
University of Edinburgh
+44 (0) 131 651 3173

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



Re: [Moses-support] Higher BLEU/METEOR score than usual for EN-DE

2012-04-26 Thread Francis Tyers
Did you try looking at the sentences? 1,000 is few enough to eyeball
them. Have you tried the same system with a different corpus (e.g.
Europarl)? Have you checked that your test set and your training set do
not intersect?

If the scores don't seem believable, then they probably aren't :)

Fran



Re: [Moses-support] Higher BLEU/METEOR score than usual for EN-DE

2012-04-26 Thread John D Burger
I =think= I recall that pairwise BLEU scores for human translators are usually 
around 0.50, so anything much better than that is indeed suspect.

- JB



Re: [Moses-support] Higher BLEU/METEOR score than usual for EN-DE

2012-04-26 Thread Miles Osborne
Very short sentences will give you high scores.

Also, multiple references will boost them.

Miles
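
(For reference, both effects enter through the standard BLEU definition of 
Papineni et al., 2002, with modified n-gram precisions p_n, weights w_n, 
candidate length c and reference length r:

\mathrm{BLEU} = \mathrm{BP} \cdot \exp\Big(\sum_{n=1}^{N} w_n \log p_n\Big),
\qquad \mathrm{BP} = \min\big(1,\; e^{\,1 - r/c}\big)

Multiple references raise the p_n, since an n-gram only has to match one of 
the references, and sentence length enters through both the p_n and the 
brevity penalty BP.)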


Re: [Moses-support] Merging language models with IRSTLM..?

2012-04-26 Thread Marcello Federico
Hi,

we are currently working on a project that includes incremental training of
LMs. Hence, there are plans to introduce quick adaptation in IRSTLM, but not
soon.

The question is really how often you need to adapt the LM. If you are working
with large news LMs, then it seems that adapting once a week is enough (you
simply do not collect enough data in fewer days to significantly change the
LM).

If you want to update the LM continuously, you can also consider using
external interpolation: you interpolate two distinct LMs, one fixed and one
smaller that is continuously retrained (which should be fast to do), using
the interpolate-lm command (see the manual).
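
In equation form this is just the usual linear mixture of the two models, with 
the interpolation weight lambda estimated on held-out in-domain data (this is 
the general idea; see the IRSTLM manual for what interpolate-lm actually 
expects):

P(w \mid h) = \lambda \, P_{\mathrm{fixed}}(w \mid h)
            + (1 - \lambda) \, P_{\mathrm{new}}(w \mid h),
\qquad 0 \le \lambda \le 1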

Greetings,
Marcello

 

On Apr 22, 2012, at 9:12 PM, Pratyush Banerjee wrote:

> Hi,
> 
> I have recently been trying to create incrementally adapted language models 
> using IRSTLM.
> 
> I have an in-domain data set on which the mixture adaptation weights are 
> computed using the -lm=mix option, and I have a larger out-of-domain dataset 
> from which I incrementally add data to create adapted LMs of different sizes.
> 
> Currently, every time saveBIN is called, the entire lmtable is estimated and 
> saved, which makes the process slow...
> 
> Is there functionality in IRSTLM to incrementally train/save adapted 
> language models?
> 
> Secondly, given an existing adapted language model in ARPA format (old) and 
> another small language model built on the incremental data (new), 
> 
> would it be safe to update the smoothed probabilities (fstar) using the 
> following formula:
> c_sum(wh) = c_old(wh) + c_new(wh)
> f*_old(w|h)*(c_old(wh)/c_sum(wh)) + f*_new(w|h)*(c_new(wh)/c_sum(wh))
> 
> where the c_old and c_new counts are estimated from the ngram tables?
> 
> 
> Thanks and Regards,
> 
> Pratyush
> 

