Dear Moses devs/users,

*How should one work with big models?*

Originally, I have 4.5 million parallel sentences and ~13 million sentences
of monolingual data for the source and target languages.

After cleaning with
https://github.com/alvations/mosesdecoder/blob/master/scripts/other/gacha_filter.py
and
https://github.com/moses-smt/mosesdecoder/blob/master/scripts/training/clean-corpus-n.perl
I got 2.6 million parallel sentences.
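
(For reference, the length/ratio cleaning step was the usual
clean-corpus-n.perl invocation; $MOSES stands for the mosesdecoder checkout,
and the language codes and the 1/80 sentence-length limits below are
placeholders rather than the exact values we used:)

    perl $MOSES/scripts/training/clean-corpus-n.perl \
        corpus src trg corpus.clean 1 80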


After training a phrase-based model with lexicalized reordering, I got:

9.9GB of phrase-table.gz
3.2GB of reordering-table.gz
~45GB of language-model.arpa.gz
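
(The training itself was essentially the standard train-model.perl recipe;
the paths, language codes, LM order and GIZA++ location below are
placeholders:)

    perl $MOSES/scripts/training/train-model.perl -root-dir train \
        -corpus corpus.clean -f src -e trg \
        -alignment grow-diag-final-and -reordering msd-bidirectional-fe \
        -lm 0:5:/path/to/language-model.binary:8 \
        -external-bin-dir /path/to/giza-tools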


As for the language model, I binarized it and ended up with

~75GB of language-model.binary
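
(Assuming the KenLM toolchain that Moses uses by default, the binarization
was roughly as below; the paths are placeholders, and whether build_binary
reads the .arpa.gz directly depends on how KenLM was compiled, so decompress
first if needed:)

    # probing (the default structure) is fast to query but large on disk;
    # "build_binary trie ..." would give a smaller file
    $MOSES/bin/build_binary language-model.arpa language-model.binary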

We ran mert-moses.pl and it completed tuning in 3-4 days for both
directions on the dev set (3,000 sentences), after filtering the model to
the dev set:


364M phrase-table.gz
1.8GB reordering-table.gz
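
(Concretely, the tuning setup was along these lines; the file names, paths
and the decoder thread count below are placeholders:)

    # filter the trained model down to the dev set
    perl $MOSES/scripts/training/filter-model-given-input.pl \
        filtered-dev train/model/moses.ini dev.src

    # tune on the filtered model
    perl $MOSES/scripts/training/mert-moses.pl dev.src dev.trg \
        $MOSES/bin/moses filtered-dev/moses.ini \
        --mertdir $MOSES/bin --decoder-flags="-threads 10"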


On the test set, we did the filtering too, but when decoding it took 18
hours to load just 50% of the phrase table:

1.5GB phrase-table.gz
6.7GB reordering-table.gz
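
(The test-set run was the same filtering step followed by plain decoding,
roughly as below; the file names are placeholders, mert-work/moses.ini is
the tuned config, and the thread count is illustrative:)

    perl $MOSES/scripts/training/filter-model-given-input.pl \
        filtered-test mert-work/moses.ini test.src

    $MOSES/bin/moses -f filtered-test/moses.ini -threads 10 \
        < test.src > test.out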


So we decided to compact the phrase table and reordering table.

For the phrase table and reordering table, we ran processPhraseTableMin and
processLexicalTableMin, and I'm still waiting for the minimized tables. They
have been running for 3 hours, each on 10 threads of 2.5GHz cores.
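
(For concreteness, the compacting commands were along these lines; the
-nscores value is an assumption and has to match the number of scores in the
phrase table, while the thread count is the 10 mentioned above:)

    # compact phrase table
    processPhraseTableMin -in phrase-table.gz -out phrase-table \
        -nscores 4 -threads 10

    # compact lexicalized reordering table
    processLexicalTableMin -in reordering-table.gz -out reordering-table \
        -threads 10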

*Does anyone have a rough idea of how small the phrase table and reordering
table would get?*



*With that kind of model, how much RAM would be necessary? And how long
would it take to load the model into RAM? Any other tips/hints for
working with big models efficiently?*

*Is it even possible for us to use models of this size on our small
server (24 cores, 2.5GHz, 128GB RAM)? If not, how big should our server be?*

Regards,
Liling
