Re: [Moses-support] language models options

2016-04-06 Thread Christophe Servan
Hello Vincent,

In my experience, your option 3 is the best one.
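To make option 3 concrete: interpolating two LMs means taking a weighted linear combination of their probabilities, with the weight usually chosen to minimize perplexity on a held-out set. Below is a minimal toy sketch in Python; the unigram models and all numbers are hypothetical, and in practice you would interpolate full n-gram models (e.g. with SRILM's `ngram -mix-lm`):

```python
import math

def interpolate(p_a, p_b, lam):
    """Mix two probability estimates, with weight lam on model A."""
    return lam * p_a + (1.0 - lam) * p_b

def perplexity(model, dev_words):
    """Perplexity of a model (a word -> probability function) on dev data."""
    log_sum = sum(math.log(model(w)) for w in dev_words)
    return math.exp(-log_sum / len(dev_words))

# Toy unigram distributions estimated from corpora A and B (hypothetical numbers).
p_a = {"the": 0.5, "cat": 0.3, "dog": 0.2}
p_b = {"the": 0.4, "cat": 0.1, "dog": 0.5}

# Held-out "dev set" used to pick the interpolation weight.
dev = ["the", "dog", "the", "cat"]

# Grid-search lambda to minimize perplexity on the dev set.
best_lam = min(
    (l / 10 for l in range(11)),
    key=lambda lam: perplexity(lambda w: interpolate(p_a[w], p_b[w], lam), dev),
)
print(best_lam)
```

The decoder then sees a single LM feature, so there is no extra per-LM slowdown at decoding time, which is part of why this option tends to win.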

 

Cheers,

Christophe

From: moses-support-boun...@mit.edu [mailto:moses-support-boun...@mit.edu] On behalf of Vincent Nguyen
Sent: Wednesday, April 6, 2016 17:11
To: Philipp Koehn
Cc: moses-support
Subject: Re: [Moses-support] language models options

Sorry Philipp, I did not ask my question properly.

I was not talking about the phrase table.

I was talking about the language model options that we have. When I said "corpus" I was referring to the data for the LM itself.

And by "performance" I was talking more about the impact on quality.

So:
option 1: 2 LMs built from 2 data corpora A and B, with 2 weights in moses.ini
option 2: 1 LM built from the merged data corpus A+B
option 3: 2 LMs built from corpora A and B, then interpolated into 1 single LM

Hope it's clearer.
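For reference, option 1 amounts to declaring two LM feature functions in moses.ini, each with its own weight. A sketch with hypothetical paths, using the KENLM feature syntax of recent Moses versions (older versions use a numbered [lmodel-file] section instead):

```ini
[feature]
KENLM name=LM0 factor=0 order=5 path=/path/to/lmA.arpa
KENLM name=LM1 factor=0 order=5 path=/path/to/lmB.arpa

[weight]
LM0= 0.5
LM1= 0.5
```

The two weights here are only starting values; tuning (e.g. MERT) then sets them along with the other feature weights.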




On 06/04/2016 16:53, Philipp Koehn wrote:

Hi,

the number of phrase tables should not matter much, but the number of language models has a significant impact on speed. There are no general hard numbers on this, since it depends on a lot of other settings, but adding a second language model will slow down the decoder by around 30-50%.

The size of phrase tables and language models matters, too, but not as much, and it seems that in your scenario you are just wondering about splitting up a fixed pool of data.

-phi

 

On Wed, Apr 6, 2016 at 6:50 AM, Vincent Nguyen <vngu...@neuf.fr> wrote:

Hi,

What are (in terms of performance) the differences between the 3
following solutions:

2 corpora, 2 LMs, 2 weights calculated at tuning time

2 corpora merged into one, 1 LM

2 corpora, 2 LMs interpolated into 1 LM with tuning

Will the results be different in the end?

thanks.
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

 

 





