I have managed to replicate the issue on a smaller corpus. Do you need the training corpus, the tables (phrase- and reordering-), or all of them?
Vito

2016-09-30 13:30 GMT+02:00 Hieu Hoang <hieuho...@gmail.com>:

> wow, that looks like a serious problem.
>
> I've not seen this before. If you can make the data file available for
> download, it would be much appreciated.
>
> Hieu Hoang
> http://www.hoang.co.uk/hieu
>
> On 30 September 2016 at 09:11, Vito Mandorino
> <vito.mandorino@linguacustodia.com> wrote:
>
>> I tried the following command:
>>
>> perl /home/Moses/mosesdecoder/scripts/generic/binarize4moses2.perl
>> --phrase-table=/home/vito/phrase-table.sorted
>> --lex-ro=/home/vito/reordering-table.sorted
>> --output-dir=/home/vito/integrated_phrase-reordering/ --num-lex-scores=6
>>
>> but it gets stuck with the following message:
>>
>> Reading phrase table finished, writing remaining files to disk.
>> terminate called after throwing an instance of
>> 'util::ProbingSizeException'
>> what(): ./util/probing_hash_table.hh:150 in
>> util::ProbingHashTable<EntryT, HashT, EqualT, ModT>::Entry*
>> util::ProbingHashTable<EntryT, HashT, EqualT, ModT>::Insert(const T&) [with
>> T = Moses2::Entry; EntryT = Moses2::Entry; HashT = boost::hash<long
>> unsigned int>; EqualT = std::equal_to<long unsigned int>; ModT =
>> util::DivMod; util::ProbingHashTable<EntryT, HashT, EqualT,
>> ModT>::MutableIterator = Moses2::Entry*; util::ProbingHashTable<EntryT,
>> HashT, EqualT, ModT>::Entry = Moses2::Entry] threw ProbingSizeException
>> because `++entries_ >= buckets_'.
>> Hash table with 1 buckets is full.
>>
>> 2016-09-29 16:25 GMT+02:00 Hieu Hoang <hieuho...@gmail.com>:
>>
>>> use the script:
>>> scripts/generic/binarize4moses2.perl
>>> It takes as input the (text) phrase-table and the (text) lexro model. It
>>> will give you the probing pt which contains the info for both.
>>>
>>> To use this script, Moses MUST be compiled with the flag --with-cmph.
>>> Also, the program in contrib/sigtest-filter MUST have been successfully
>>> compiled.
>>>
>>> On 29 September 2016 at 15:19, Vito Mandorino
>>> <vito.mandor...@linguacustodia.com> wrote:
>>>
>>>> Ok thank you, I'll check that. Do you know how to perform the
>>>> integration? Would juxtaposing the 4 phrase-table scores and the 6
>>>> reordering scores before calling CreateProbingPT2 be enough?
>>>>
>>>> (I have used the CreateProbingPT2 binary and not CreateProbingPT so far)
>>>>
>>>> Vito
>>>>
>>>> 2016-09-29 16:07 GMT+02:00 Hieu Hoang <hieuho...@gmail.com>:
>>>>
>>>>> you will get another big speedup from integrating the lexro into the
>>>>> pt
>>>>>
>>>>> On 29 September 2016 at 15:03, Vito Mandorino
>>>>> <vito.mandor...@linguacustodia.com> wrote:
>>>>>
>>>>>> Yes, the model includes a lexicalised reordering model, but it is not
>>>>>> integrated into the probingPT. The size of the LM is 1.8G.
>>>>>>
>>>>>> 2016-09-29 15:59 GMT+02:00 Hieu Hoang <hieuho...@gmail.com>:
>>>>>>
>>>>>>> ps. how big is your LM?
>>>>>>>
>>>>>>> On 29 September 2016 at 14:58, Hieu Hoang <hieuho...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> great, thanks. Do you use the lexicalised reordering model, and is
>>>>>>>> it integrated into the phrase-table in Moses2?
>>>>>>>>
>>>>>>>> There is latency in communicating with the server. As Moses2 is
>>>>>>>> much faster now, the client can't feed it fast enough. You should
>>>>>>>> see that the moses2 command line will max out the CPU, whereas the
>>>>>>>> server won't. I'm thinking of extending the server to process
>>>>>>>> multiple sentences at a time to speed it up.
>>>>>>>>
>>>>>>>> On 29 September 2016 at 14:49, Vito Mandorino
>>>>>>>> <vito.mandor...@linguacustodia.com> wrote:
>>>>>>>>
>>>>>>>>> Yes, here are some data:
>>>>>>>>>
>>>>>>>>> Average source sentence length: 29 tokens
>>>>>>>>> Phrase-table size, probingPT: 11G
>>>>>>>>> Phrase-table size, compact phrase-table: 2.1G
>>>>>>>>>
>>>>>>>>> Translation time Moses2 with 32 threads: 1m36.511s
>>>>>>>>> Translation time Moses with 32 threads: 6m14.248s
>>>>>>>>> Translation time Moses2 with 32 threads in server mode: 16m30.137s
>>>>>>>>> Translation time Moses with 32 threads in server mode: 62m33.208s
>>>>>>>>>
>>>>>>>>> RAM consumption during decoding: 4G for Moses2, 5G for Moses
>>>>>>>>>
>>>>>>>>> So Moses2 is about 4 times faster in batch mode (3.9x), and nearly
>>>>>>>>> as much in server mode (3.8x).
>>>>>>>>>
>>>>>>>>> Do you know why in server mode the speed is so much slower with
>>>>>>>>> respect to batch mode, for both Moses and Moses2?
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Vito
>>>>>>>>>
>>>>>>>>> 2016-09-28 18:52 GMT+02:00 Hieu Hoang <hieuho...@gmail.com>:
>>>>>>>>>
>>>>>>>>>> cool. do you have any indications of speed, especially when using
>>>>>>>>>> multiple threads? model sizes and average input sentence length
>>>>>>>>>> are also relevant.

--
M. Vito MANDORINO -- Chief Scientist
The Translation Trustee
1, Place Charles de Gaulle, 78180 Montigny-le-Bretonneux
Tel : +33 1 30 44 04 23 Mobile : +33 6 84 65 68 89
Email : vito.mandor...@linguacustodia.com
Website : www.linguacustodia.finance <http://www.linguacustodia.com/>
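
For readers following the thread, the condition named in the quoted exception (`++entries_ >= buckets_`) can be illustrated with a toy table. This is only a sketch in Python, not the code in util/probing_hash_table.hh: it assumes a fixed-size linear-probing table that always keeps one bucket free, and the names ToyProbingTable, ProbingSizeError and first_insert_message are made up for the example.

```python
class ProbingSizeError(Exception):
    """Raised when an insert would leave the table without a free bucket."""


class ToyProbingTable:
    """Fixed-size open-addressing table with linear probing (illustration only).

    Duplicate keys are not handled; each insert is assumed to use a new key.
    """

    def __init__(self, buckets):
        self.buckets = buckets
        self.entries = 0
        self.slots = [None] * buckets

    def insert(self, key, value):
        # Same shape as the check quoted in the error message:
        # count the new entry first, then refuse it if entries >= buckets.
        self.entries += 1
        if self.entries >= self.buckets:
            raise ProbingSizeError(
                f"Hash table with {self.buckets} buckets is full."
            )
        i = hash(key) % self.buckets
        while self.slots[i] is not None:
            i = (i + 1) % self.buckets  # linear probing: try the next bucket
        self.slots[i] = (key, value)

    def find(self, key):
        # Terminates because at least one bucket is always left empty.
        i = hash(key) % self.buckets
        while self.slots[i] is not None:
            if self.slots[i][0] == key:
                return self.slots[i][1]
            i = (i + 1) % self.buckets
        return None


def first_insert_message(buckets):
    """Return the error text of the first insert, or "ok" if it succeeds."""
    table = ToyProbingTable(buckets)
    try:
        table.insert(42, "phrase")
    except ProbingSizeError as err:
        return str(err)
    return "ok"
```

In this toy version, a table created with a single bucket rejects its very first insert, which is the situation the message "Hash table with 1 buckets is full." describes. If the real table behaves the same way, that would suggest it was sized for no entries at all, but nothing in the thread confirms the cause.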
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support