I have managed to replicate the issue on a smaller corpus. Do you need the
training corpus, the phrase and reordering tables, or all of them?

Vito

2016-09-30 13:30 GMT+02:00 Hieu Hoang <hieuho...@gmail.com>:

> wow, that looks like a serious problem.
>
> I've not seen this before. If you can make the data file available for
> download, it would be much appreciated.
>
> Hieu Hoang
> http://www.hoang.co.uk/hieu
>
> On 30 September 2016 at 09:11, Vito Mandorino
> <vito.mandorino@linguacustodia.com> wrote:
>
>> I tried the following command:
>>
>> perl /home/Moses/mosesdecoder/scripts/generic/binarize4moses2.perl \
>>   --phrase-table=/home/vito/phrase-table.sorted \
>>   --lex-ro=/home/vito/reordering-table.sorted \
>>   --output-dir=/home/vito/integrated_phrase-reordering/ \
>>   --num-lex-scores=6
>>
>> but it gets stuck with the following message:
>>
>> Reading phrase table finished, writing remaining files to disk.
>> terminate called after throwing an instance of
>> 'util::ProbingSizeException'
>>   what():  ./util/probing_hash_table.hh:150 in
>> util::ProbingHashTable<EntryT, HashT, EqualT, ModT>::Entry*
>> util::ProbingHashTable<EntryT, HashT, EqualT, ModT>::Insert(const T&) [with
>> T = Moses2::Entry; EntryT = Moses2::Entry; HashT = boost::hash<long
>> unsigned int>; EqualT = std::equal_to<long unsigned int>; ModT =
>> util::DivMod; util::ProbingHashTable<EntryT, HashT, EqualT,
>> ModT>::MutableIterator = Moses2::Entry*; util::ProbingHashTable<EntryT,
>> HashT, EqualT, ModT>::Entry = Moses2::Entry] threw ProbingSizeException
>> because `++entries_ >= buckets_'.
>> Hash table with 1 buckets is full.
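A note on reading the exception above: the message says the insert check `++entries_ >= buckets_` fired on a table with a single bucket, so the very first insert is bound to fail. The following is a toy Python model of that check, not the actual `util::ProbingHashTable` code; the class and method names are hypothetical, and only the condition and the message text are taken from the log. It does not explain why the table was sized at 1 bucket, which the log alone does not show.

```python
class ProbingSizeError(Exception):
    """Stand-in for util::ProbingSizeException (hypothetical name)."""


class ToyProbingTable:
    """Toy linear-probing table that keeps at least one bucket empty."""

    def __init__(self, buckets: int):
        self.buckets = buckets
        self.entries = 0
        self.slots = [None] * buckets

    def insert(self, key: int) -> int:
        # Same condition as quoted in the log: `++entries_ >= buckets_`.
        self.entries += 1
        if self.entries >= self.buckets:
            raise ProbingSizeError(
                f"Hash table with {self.buckets} buckets is full."
            )
        # Linear probing: walk forward until a free slot is found.
        # The check above guarantees that at least one slot is free.
        i = hash(key) % self.buckets
        while self.slots[i] is not None:
            i = (i + 1) % self.buckets
        self.slots[i] = key
        return i


if __name__ == "__main__":
    try:
        ToyProbingTable(1).insert(42)
    except ProbingSizeError as err:
        print(err)
```

In this model a table with N buckets accepts at most N - 1 entries, so a 1-bucket table accepts none.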
>>
>>
>>
>>
>> 2016-09-29 16:25 GMT+02:00 Hieu Hoang <hieuho...@gmail.com>:
>>
>>> use the script:
>>>    scripts/generic/binarize4moses2.perl
>>> It takes as input the (text) phrase-table and the (text) lexro model. It
>>> will give you the probing pt which contains the info for both.
>>>
>>> To use this script, Moses MUST be compiled with the flag --with-cmph.
>>> Also, the program in contrib/sigtest-filter MUST have been successfully
>>> compiled.
>>>
>>>
>>>
>>> Hieu Hoang
>>> http://www.hoang.co.uk/hieu
>>>
>>> On 29 September 2016 at 15:19, Vito Mandorino <
>>> vito.mandor...@linguacustodia.com> wrote:
>>>
>>>> Ok, thank you, I'll check that. Do you know how to perform the
>>>> integration? Would juxtaposing the 4 phrase-table scores and the 6
>>>> reordering scores before calling CreateProbingPT2 be enough?
>>>>
>>>> (I have used the CreateProbingPT2 binary and not CreateProbingPT so far)
>>>>
>>>> Vito
>>>>
>>>> 2016-09-29 16:07 GMT+02:00 Hieu Hoang <hieuho...@gmail.com>:
>>>>
>>>>> you will get another big speedup from integrating the lexro into the
>>>>> pt
>>>>>
>>>>> Hieu Hoang
>>>>> http://www.hoang.co.uk/hieu
>>>>>
>>>>> On 29 September 2016 at 15:03, Vito Mandorino <
>>>>> vito.mandor...@linguacustodia.com> wrote:
>>>>>
>>>>>> Yes, the model includes a lexicalised reordering model, but it is not
>>>>>> integrated into the probingPT. The size of the LM is 1.8G.
>>>>>>
>>>>>> 2016-09-29 15:59 GMT+02:00 Hieu Hoang <hieuho...@gmail.com>:
>>>>>>
>>>>>>> ps. how big is your LM?
>>>>>>>
>>>>>>> Hieu Hoang
>>>>>>> http://www.hoang.co.uk/hieu
>>>>>>>
>>>>>>> On 29 September 2016 at 14:58, Hieu Hoang <hieuho...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> great, thanks. Do you use the lexicalised reordering model, and is
>>>>>>>> it integrated into the phrase-table in Moses2?
>>>>>>>>
>>>>>>>> There is latency in communicating with the server. As Moses2 is
>>>>>>>> much faster now, the client can't feed it fast enough. You should
>>>>>>>> see that the moses2 command line will max out the CPU, whereas the
>>>>>>>> server won't. I'm thinking of extending the server to process
>>>>>>>> multiple sentences at a time to speed it up.
>>>>>>>>
>>>>>>>> Hieu Hoang
>>>>>>>> http://www.hoang.co.uk/hieu
>>>>>>>>
>>>>>>>> On 29 September 2016 at 14:49, Vito Mandorino <
>>>>>>>> vito.mandor...@linguacustodia.com> wrote:
>>>>>>>>
>>>>>>>>> Yes, here are some data:
>>>>>>>>>
>>>>>>>>> Average source sentence length: 29 tokens
>>>>>>>>> Phrase-table size, probingPT: 11G
>>>>>>>>> Phrase-table size, compact phrase-table: 2.1G
>>>>>>>>>
>>>>>>>>> Translation time Moses2 with 32 threads: 1m36.511s
>>>>>>>>> Translation time Moses with 32 threads: 6m14.248s
>>>>>>>>> Translation time Moses2 with 32 threads in server mode: 16m30.137s
>>>>>>>>> Translation time Moses with 32 threads in server mode: 62m33.208s
>>>>>>>>>
>>>>>>>>> Ram consumption during decoding: 4G for Moses2, 5G for Moses
>>>>>>>>>
>>>>>>>>> So Moses2 is roughly 4 times faster, in both batch mode (about 3.9x)
>>>>>>>>> and server mode (about 3.8x).
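As a quick check of these ratios, here is a minimal Python sketch that recomputes them from the wall-clock timings quoted above. The helper names are made up for illustration; the only inputs are the four `time`-style stamps from the message.

```python
def to_seconds(stamp: str) -> float:
    """Convert a `time`-style stamp such as '1m36.511s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# Timings copied from the message (32 threads in all four runs).
TIMINGS = {
    "batch": {"moses": "6m14.248s", "moses2": "1m36.511s"},
    "server": {"moses": "62m33.208s", "moses2": "16m30.137s"},
}


def speedup(mode: str) -> float:
    """How many times faster Moses2 is than Moses in the given mode."""
    run = TIMINGS[mode]
    return to_seconds(run["moses"]) / to_seconds(run["moses2"])


def server_slowdown(system: str) -> float:
    """How many times slower server mode is than batch mode for one system."""
    return to_seconds(TIMINGS["server"][system]) / to_seconds(TIMINGS["batch"][system])


if __name__ == "__main__":
    for mode in TIMINGS:
        print(f"Moses2 speedup in {mode} mode: {speedup(mode):.2f}x")
    for system in ("moses", "moses2"):
        print(f"{system} server-mode slowdown: {server_slowdown(system):.1f}x")
```

On these numbers the Moses2 speedup is just under 4x in both modes, and server mode is about 10 times slower than batch mode for both decoders.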
>>>>>>>>>
>>>>>>>>> Do you know why server mode is so much slower than batch mode,
>>>>>>>>> for both Moses and Moses2?
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Vito
>>>>>>>>>
>>>>>>>>> 2016-09-28 18:52 GMT+02:00 Hieu Hoang <hieuho...@gmail.com>:
>>>>>>>>>
>>>>>>>>>> cool. do you have any indications of speed, especially when using
>>>>>>>>>> multiple threads? model sizes and average input sentence length are
>>>>>>>>>> also relevant.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> *M**. Vito MANDORINO -- Chief Scientist*
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [image: Description : Description : lingua_custodia_final full
>>>>>>>>> logo]
>>>>>>>>>
>>>>>>>>>  *The Translation Trustee*
>>>>>>>>>
>>>>>>>>> *1, Place Charles de Gaulle, **78180 Montigny-le-Bretonneux*
>>>>>>>>>
>>>>>>>>> *Tel : +33 1 30 44 04 23   Mobile : +33 6 84 65 68 89
>>>>>>>>> <%2B33%206%2084%2065%2068%2089>*
>>>>>>>>>
>>>>>>>>> *Email :*  *vito.mandor...@linguacustodia.com
>>>>>>>>> <massinissa.ah...@linguacustodia.com>*
>>>>>>>>>
>>>>>>>>> *Website :*
>>>>>>>>> *www.linguacustodia.finance <http://www.linguacustodia.com/>*
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>


_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
