Dear Moses devs/users,

*How should one work with big models?*
Originally, I had 4.5 million parallel sentences and ~13 million monolingual sentences for the source and target languages. After cleaning with
https://github.com/alvations/mosesdecoder/blob/master/scripts/other/gacha_filter.py
and
https://github.com/moses-smt/mosesdecoder/blob/master/scripts/training/clean-corpus-n.perl ,
I was left with 2.6 million parallel sentences.

After training a phrase-based model with reordering, I get:

  9.9 GB  phrase-table.gz
  3.2 GB  reordering-table.gz
 ~45 GB   language-model.arpa.gz

Binarizing the language model grew it to ~75 GB (language-model.binary).

We ran mert-moses.pl, and tuning completed in 3-4 days for both directions on the dev set (3,000 sentences). After filtering the model down to the dev set:

  364 MB  phrase-table.gz
  1.8 GB  reordering-table.gz

We filtered for the test set too, but when decoding it took 18 hours to load only 50% of the phrase table:

  1.5 GB  phrase-table.gz
  6.7 GB  reordering-table.gz

So we decided to compact the phrase table. We ran processPhraseTableMin and processLexicalTableMin on the phrase table and reordering table, and I am still waiting for the minimised phrase table; it has been running for 3 hours with 10 threads on 2.5 GHz cores.

*Does anyone have a rough idea how small the phrase table and lexical table will get?*

*With a model of this size, how much RAM would be necessary? And how long would it take to load the model into RAM? Any other tips/hints on working with big models efficiently?*

*Is it even possible for us to use models of this size on our small server (24 cores @ 2.5 GHz, 128 GB RAM)? If not, how big should our server be?*

Regards,
Liling
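P.S. For the record, the compaction step was along these lines. This is a sketch from memory, not the exact invocation: the file paths are placeholders, and `-nscores 4` assumes the standard four translation scores of a phrase-based model.

```shell
# Compact the filtered phrase table; -nscores gives the number of
# score components per phrase pair, -threads the worker count.
processPhraseTableMin -in phrase-table.gz -out phrase-table \
    -nscores 4 -threads 10

# Compact the lexicalised reordering table the same way.
processLexicalTableMin -in reordering-table.gz -out reordering-table \
    -threads 10
```

If I read the docs correctly, these should write phrase-table.minphr and reordering-table.minlexr, which the decoder can then use without loading the whole table into RAM up front.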
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support