Re: [Moses-support] GHKM translation is slow

2016-04-06 Thread Ayah El Maghraby
My decoding options are the defaults. I tried adjusting the cube pruning
limit and the stack size, but it is still not fast enough.
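For chart (syntax) decoding, these are the flags that usually matter most for speed. A hedged sketch follows; the values are illustrative starting points, not recommendations, and flag availability should be checked against `moses --help` for your build:

```shell
# Trade some search accuracy for speed in the chart decoder.
# -cube-pruning-pop-limit : hypotheses popped per chart cell
# -ttable-limit           : translation options kept per source span
# -max-chart-span         : longest span that non-glue rules may cover
~/mosesdecoder/bin/moses -f moses.ini \
    -cube-pruning-pop-limit 400 \
    -ttable-limit 20 \
    -max-chart-span 25 \
    -threads 2 \
    < input.txt > output.txt
```

With only 4 GB of RAM, swapping is also a likely culprit: keeping the rule table on disk (as you already do) and tightening these limits together tends to help more than any single flag.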

On Wed, Apr 6, 2016 at 8:02 PM, Ayah ElMaghraby wrote:

> Hello
>
> I am trying to create an SMT system using GHKM extraction, but it is very
> slow during translation: it translated 53 sentences in 24 hours on 64-bit
> Ubuntu 14.04 with 4 GB of RAM.
>
> I changed the rule table to an on-disk table using the createOnDiskPt
> executable.
>
> Is there anything I can do to speed it up a bit?
>
> I am currently using these options:
>
> --target-syntax -glue-grammar -max-phrase-length 5 -ghkm
> --score-options="--GoodTuring" -external-bin-dir
> ~/mosesdecoder/tools/mgizapp -mgiza -mgiza-cpus 2 -snt2cooc snt2cooc.pl
> -parallel -sort-batch-size 253 -sort-compress gzip --giza-option
> m1=5,m2=2,mh=0,m3=3,m4=3
>
>
>
>
>
> Regards,
> Ayah ElMaghraby
>
>
>
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] Moses server with --output-search-graph

2016-04-06 Thread Lane Schwartz
When running mosesserver with --output-search-graph, I don't get a search
graph file created. Is this the expected behavior? Or is something else
going on?

Thanks,
Lane


Re: [Moses-support] language models options

2016-04-06 Thread Christophe Servan
Hello Vincent,

In my experience, option number 3 is the best one.

Cheers,

Christophe
From: moses-support-boun...@mit.edu [mailto:moses-support-boun...@mit.edu] On behalf of Vincent Nguyen
Sent: Wednesday, April 6, 2016 17:11
To: Philipp Koehn
Cc: moses-support
Subject: Re: [Moses-support] language models options

Sorry Philipp, I did not ask my question properly.

I was not talking about the phrase table.

I was talking about the language model options that we have. When I said
"corpus", I was referring to the data for the LM itself.

And by "performance" I meant the impact on quality.

So:
option 1: 2 LMs built from 2 corpora, A and B, with 2 weights in moses.ini
option 2: 1 LM built from corpus A+B
option 3: 2 LMs built from corpora A and B, then interpolated into a single LM

Hope that's clearer.
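For reference, option 3 is commonly done by estimating mixture weights on a held-out set and then writing out one interpolated model. A sketch using SRILM; the file names and the 0.7 weight are purely illustrative, and the actual weight should come from the compute-best-mix step:

```shell
# Per-sentence perplexity stats for each LM on the held-out set.
ngram -lm A.lm -ppl heldout.txt -debug 2 > A.ppl
ngram -lm B.lm -ppl heldout.txt -debug 2 > B.ppl

# Estimate the best linear mixture weight from those stats.
compute-best-mix A.ppl B.ppl

# Write a single interpolated LM, using the estimated weight for A.lm.
ngram -lm A.lm -mix-lm B.lm -lambda 0.7 -write-lm interpolated.lm
```

The resulting single LM decodes as fast as one model, avoiding the per-extra-LM slowdown of option 1.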




On 06/04/2016 16:53, Philipp Koehn wrote:

Hi,

the number of phrase tables should not matter much, but the number of
language models has a significant impact on speed. There are no general
hard numbers on this, since it depends on a lot of other settings, but
adding a second language model will slow the decoder down by around 30-50%.

The size of phrase tables and language models matters, too, but not
as much, and it seems that in your scenario you are just wondering
about splitting up a fixed pool of data.

-phi

On Wed, Apr 6, 2016 at 6:50 AM, Vincent Nguyen wrote:

Hi,

What are (in terms of performance) the differences between the 3
following solutions:

2 corpora, 2 LMs, 2 weights calculated at tuning time

2 corpora merged into one, 1 LM

2 corpora, 2 LMs interpolated into 1 LM with tuning

Will the results be different in the end?

thanks.


Re: [Moses-support] language models options

2016-04-06 Thread Vincent Nguyen

Sorry Philipp, I did not ask my question properly.

I was not talking about the phrase table.

I was talking about the language model options that we have. When I said
"corpus", I was referring to the data for the LM itself.

And by "performance" I meant the impact on quality.

So:
option 1: 2 LMs built from 2 corpora, A and B, with 2 weights in moses.ini
option 2: 1 LM built from corpus A+B
option 3: 2 LMs built from corpora A and B, then interpolated into a single LM

Hope that's clearer.



On 06/04/2016 16:53, Philipp Koehn wrote:

Hi,

the number of phrase tables should not matter much, but the number of
language models has a significant impact on speed. There are no general
hard numbers on this, since it depends on a lot of other settings, but
adding a second language model will slow the decoder down by around 30-50%.

The size of phrase tables and language models matters, too, but not
as much, and it seems that in your scenario you are just wondering
about splitting up a fixed pool of data.

-phi

On Wed, Apr 6, 2016 at 6:50 AM, Vincent Nguyen wrote:

Hi,

What are (in terms of performance) the differences between the 3
following solutions:

2 corpora, 2 LMs, 2 weights calculated at tuning time

2 corpora merged into one, 1 LM

2 corpora, 2 LMs interpolated into 1 LM with tuning

Will the results be different in the end?

thanks.


Re: [Moses-support] language models options

2016-04-06 Thread Philipp Koehn
Hi,

the number of phrase tables should not matter much, but the number of
language models has a significant impact on speed. There are no general
hard numbers on this, since it depends on a lot of other settings, but
adding a second language model will slow the decoder down by around 30-50%.

The size of phrase tables and language models matters, too, but not
as much, and it seems that in your scenario you are just wondering
about splitting up a fixed pool of data.

-phi
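For concreteness, option 1 corresponds to two LM feature functions in moses.ini, each with its own tuned weight. A sketch, where the paths, order, and weights are purely illustrative:

```
[feature]
KENLM name=LM0 factor=0 path=/path/to/A.binlm order=5
KENLM name=LM1 factor=0 path=/path/to/B.binlm order=5

[weight]
LM0= 0.5
LM1= 0.5
```

Option 3 would replace both entries with a single KENLM line pointing at the interpolated model, which is where the speed difference comes from.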

On Wed, Apr 6, 2016 at 6:50 AM, Vincent Nguyen wrote:

> Hi,
>
> What are (in terms of performance) the differences between the 3
> following solutions:
>
> 2 corpora, 2 LMs, 2 weights calculated at tuning time
>
> 2 corpora merged into one, 1 LM
>
> 2 corpora, 2 LMs interpolated into 1 LM with tuning
>
> Will the results be different in the end?
>
> thanks.


[Moses-support] language models options

2016-04-06 Thread Vincent Nguyen
Hi,

What are (in terms of performance) the differences between the 3
following solutions:

2 corpora, 2 LMs, 2 weights calculated at tuning time

2 corpora merged into one, 1 LM

2 corpora, 2 LMs interpolated into 1 LM with tuning

Will the results be different in the end?

thanks.


Re: [Moses-support] Filtering Binarized LM

2016-04-06 Thread Kenneth Heafield
Probing-format models can't be filtered because they only retain hashes
of n-grams.

Trie-format models can be filtered and dumped, but only with the very
hacky and undocumented dump_trie program in the bounded-noquant branch.
It hasn't been a priority to make it release quality; volunteers?

Kenneth

On 04/06/2016 11:13 AM, liling tan wrote:
> Dear Moses devs/users,
> 
> The filter tool in KenLM can filter an LM based on a dev set
> (https://kheafield.com/code/kenlm/filter/), but it only accepts raw|arpa
> files.
>
> Is there another tool that filters binarized LMs? Given a binarized LM,
> is there a way to "debinarize" the LM?
> 
> Thanks in advance for the tips!
> 
> Regards,
> Liling
> 
> 


Re: [Moses-support] compile.sh with --static

2016-04-06 Thread liling tan
Dear Matthias and Kenneth,

Thank you for the note on the --static options!

Regards,
Liling


[Moses-support] Filtering Binarized LM

2016-04-06 Thread liling tan
Dear Moses devs/users,

The filter tool in KenLM can filter an LM based on a dev set
(https://kheafield.com/code/kenlm/filter/), but it only accepts raw|arpa files.

Is there another tool that filters binarized LMs? Given a binarized LM, is
there a way to "debinarize" the LM?
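For the ARPA case the filter tool does handle, the invocation looks roughly like this. This is a sketch; the exact argument syntax should be checked by running `bin/filter` with no arguments on your KenLM build:

```shell
# 'union' keeps every n-gram whose words all appear somewhere in dev.txt.
bin/filter union lm.arpa filtered.arpa < dev.txt

# Re-binarize the filtered model afterwards, e.g. as a trie.
bin/build_binary trie filtered.arpa filtered.binlm
```

Binarized models themselves can't be filtered directly; as Kenneth notes elsewhere in this thread, probing models only retain hashes of the n-grams, so keeping the original ARPA file around is the practical workaround.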

Thanks in advance for the tips!

Regards,
Liling