Re: [Moses-support] processPhraseTableMin Cannot encode numbers larger than 268435455

2019-05-09 Thread He Shiming
Ok, thank you Marcin.

On Fri, May 10, 2019 at 11:53 AM Marcin Junczys-Dowmunt wrote:
> Hi,
>
> Yes, a smaller phrase table should help. I wrote the table, but that was
> in 2012 and I cannot really remember what goes on in there. I think making
> sure that you do not have too many target phrases

Re: [Moses-support] processPhraseTableMin Cannot encode numbers larger than 268435455

2019-05-09 Thread Marcin Junczys-Dowmunt
Hi,

Yes, a smaller phrase table should help. I wrote the table, but that was in 2012 and I cannot really remember what goes on in there. I think making sure that you do not have too many target phrases per source phrase should help.

From: He Shiming
Sent: Thursday, May 9, 2019 8:49 PM
To: moses
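The suggestion above, capping the number of target phrases per source phrase, could be sketched roughly as follows. This is only an illustration, not a tool that ships with Moses: the function name `prune_phrase_table` and the parameters `max_targets` and `score_index` are hypothetical. It assumes the usual plain-text phrase table layout (`source ||| target ||| scores ...`), that entries for the same source phrase are adjacent, and that the third score (index 2) is the direct translation probability, which holds for the standard four-score table but should be checked against your own.

```python
import heapq
from itertools import groupby


def prune_phrase_table(lines, max_targets=20, score_index=2):
    """Yield at most max_targets entries per source phrase.

    Assumptions (not confirmed by the thread): fields are separated by
    " ||| ", entries for one source phrase are adjacent, and the score at
    score_index is the one to rank by (higher is better).
    """
    def source_of(line):
        return line.split(" ||| ", 1)[0]

    def score_of(line):
        fields = line.rstrip("\n").split(" ||| ")
        return float(fields[2].split()[score_index])

    for _, group in groupby(lines, key=source_of):
        entries = list(group)
        # Indices of the best-scoring entries for this source phrase.
        keep = set(heapq.nlargest(max_targets, range(len(entries)),
                                  key=lambda i: score_of(entries[i])))
        # Emit survivors in their original order so the file stays sorted.
        for i, entry in enumerate(entries):
            if i in keep:
                yield entry
```

In practice the generator would be fed from the gzipped table, for example with `gzip.open(path, "rt", encoding="utf-8")`, and written back out line by line, so the 90GB file never has to fit in memory.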

[Moses-support] processPhraseTableMin Cannot encode numbers larger than 268435455

2019-05-09 Thread He Shiming
Hi,

I'm training a Chinese-to-English phrase-based model, using 33 million sentence pairs. My phrase table is 90GB gzipped, and the reordering table is 27GB gzipped. When running processPhraseTableMin, it dies in step 3 because of the following error:

Intermezzo: Calculating Huffman code sets
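A note on the number in the subject line: 268435455 is exactly 2^28 - 1, the largest value that fits in 28 bits. That would be consistent with an encoder that stores values in a 28-bit payload (for instance a 32-bit word with 4 bits reserved), but that is a guess about the implementation and is not confirmed anywhere in this thread. The small check below only verifies the arithmetic; the helper `fits_in_payload` is hypothetical.

```python
LIMIT = 268435455  # value quoted in the error message

# The limit is the largest unsigned integer representable in 28 bits.
assert LIMIT == 2**28 - 1
assert LIMIT.bit_length() == 28


def fits_in_payload(value, payload_bits=28):
    """Return True if value is representable in payload_bits unsigned bits.

    payload_bits=28 is an assumption inferred from the error message only.
    """
    return 0 <= value < (1 << payload_bits)
```

Under that assumption, any count or offset of 268435456 or more would trigger the error, which fits with the advice that a smaller phrase table should help.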