Hi Marcos

Command-line Moses has to output the sentences in the same order as they 
are input. So if, say, you have 4 threads and feed Moses a large file of 
input sentences, it will give one sentence to each of the 4 threads. But 
no output is produced until sentence 1 is translated, so if all 
sentences are about the same length there will be a delay for the first 
one, and then all the others will be output very quickly.
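To illustrate the ordering effect (a minimal Python sketch, not Moses code): a pool of workers translates in parallel, but results are emitted strictly in input order, so nothing appears until the first sentence finishes.

```python
# Illustrative sketch only: a thread pool processes sentences in
# parallel, but results come back in submission order, which is why
# the first sentence gates all output.
from concurrent.futures import ThreadPoolExecutor
import time

def translate(sentence):
    time.sleep(0.01)  # stand-in for real decoding work
    return sentence.upper()

sentences = ["uno", "dos", "tres", "cuatro"]
with ThreadPoolExecutor(max_workers=4) as pool:
    # Executor.map yields results in input order, even if a later
    # sentence happens to finish decoding first.
    outputs = list(pool.map(translate, sentences))
print(outputs)
```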

The MTMonkey paper would have used an older version of Moses, which had 
a single global translation options cache, and this could explain the 
difference between your observed behaviour and their reported 
performance. The change to the cache happened sometime in 2013, and 
unfortunately seems to have had a negative impact on Moses server. As 
Kenneth has pointed out, though, it is something that needs to be fixed 
in Moses server.
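As a rough sketch of the difference (illustrative Python, not the actual Moses internals): the old behaviour is like one shared, locked cache that every thread can warm for the others, while the current behaviour is like a lock-free cache per thread, so a freshly spawned thread always starts cold.

```python
# Illustrative sketch only -- names are hypothetical, not Moses code.
import threading

def expensive_table_lookup(phrase):
    # Stand-in for a real phrase-table query.
    return phrase[::-1]

# Pre-2013 style: one global cache shared by all threads. Every thread
# benefits from every lookup, at the cost of taking a lock.
shared_cache = {}
shared_lock = threading.Lock()

def lookup_shared(phrase):
    with shared_lock:
        if phrase not in shared_cache:
            shared_cache[phrase] = expensive_table_lookup(phrase)
        return shared_cache[phrase]

# Current style: one lock-free cache per thread. A brand-new thread
# (e.g. one spawned per connection) starts cold, so its first request
# pays the full lookup cost.
tls = threading.local()

def lookup_per_thread(phrase):
    if not hasattr(tls, "cache"):
        tls.cache = {}
    if phrase not in tls.cache:
        tls.cache[phrase] = expensive_table_lookup(phrase)
    return tls.cache[phrase]
```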

cheers - Barry

On 10/03/14 09:22, Marcos Fernandez wrote:
>> Hi Marcos
>>
>> I think the problem is that the rules (or phrase pairs) are now cached
>> on a per thread basis. This is good for command-line Moses as it uses a
>> pool of threads, and having per-thread caches means that there is no
>> locking on the caches, as there used to be.
> Barry, I am not sure that that is the cause, because in that case my "20
> short sentences" file would get translated much faster with command-line
> Moses; however, it takes 4.5 seconds to translate it. That is, 2 seconds
> (measured) for loading the tables into memory, plus the same 2.5 seconds as
> with MosesServer to translate the file. The behaviour is also the same as
> before: the first sentence in the file takes much longer than the rest.
>
> However, what you say could perhaps explain the difference in time
> between using a different xmlrpc ServerProxy object for each request
> (probably, in this case, each request is executed in a different thread)
> and reusing a single ServerProxy for all the requests (where there would
> be only one thread, so it could take advantage of the cache).
>
> What I understand, then, is that the cache stores information from
> previously translated sentences to accelerate the translation of the next
> ones. But that does not eliminate the problem of the first, slower request.
> As you can see, I am finding this issue even in command-line Moses (not with
> the first "request", but with the first sentence in a file).
>
> I am thinking that perhaps I have no problem, and this is just the usual way
> in which Moses works. Just to make sure:
>
> 1. Would you say that 2-3 seconds for translating (spa-eng) a single
> sentence (~15-20 words) could be a normal response time, discounting the
> time for loading the tables? (Intel Xeon with 32GB RAM)
> 2. If you now write 3 similar sentences in a file and run Moses
> (command-line, serial) over this file, would you expect it to take much
> less time (perhaps half) than the sum of the times for the 3 single
> sentences?
>
> If the answer is yes to both (especially the second), then I am probably
> worrying in vain. My worries started when I read the "MTMonkey" paper:
> http://ufal.mff.cuni.cz/~pecina/files/pbml-2013.pdf
>
> Here the authors use the approach of creating a new ServerProxy instance
> each time a request is sent to MosesServer (the worst-case scenario for me),
> and they get great results, so I thought they were not experiencing that
> overhead for every request. But perhaps they just used sentences that get
> translated very fast even with that overhead.
>
> Well, if this is the usual way Moses works, and if what Kenneth suggests
> is possible, that would eliminate the overhead almost completely for
> MosesServer. I mean, there would still be an overhead for the first request
> that each thread serves after the server is launched, but never again after
> that, because the caches would be filled with useful information from that
> point on. I think that this would be extremely interesting for web
> translation services, or for translating web pages on the fly.
>
> However, I don't see a way to avoid that overhead in command-line Moses, as
> it "dies" after each execution.
>
> Marcos.
>
>
>> mosesserver, afaik, creates a new thread for each connection, so it
>> can't take advantage of the cache. This is done in the xmlrpc-c library
>> so we don't have much control over it. If you dig around in the xmlrpc-c
>> documentation (or code!) you might find a way to control the threading
>> policy.
>>
>> I just spoke to Marcin about the problem, and we're not sure if loading
>> the compact phrase table into memory would help, as you still would need
>> the higher level cache (in PhraseDictionary). But you could try this anyway.
>>
>> cheers - Barry
>>
>> On 06/03/14 17:20, Marcos Fernandez wrote:
>>> Hi, I am having an issue with MosesServer.
>>>
>>> I am using compact phrase and reordering table, and KENLM.
>>>
>>> The problem is this (I'll explain with an example):
>>>
>>> - I have one file with 20 very short sentences. I split and tokenize
>>> them and send one XMLRPC request per sentence to MosesServer
>>> - If I create just one XMLRPC ServerProxy instance and I use it to send
>>> all the requests through it, all the sentences get translated in approx
>>> 2.5 sec. The problem is that the first sentence takes almost 2 seconds
>>> to get translated, while the other 19 are much faster
>>> - If I create one ServerProxy instance per request, the translation time
>>> rises to 30 sec (now every sentence takes almost 2 sec)
>>>
>>> I don't understand the reason for that delay on the first request. I
>>> have traced the source of this delay to the function:
>>>
>>> GetTargetPhraseCollectionLEGACY(const Phrase& src)
>>>
>>> in the file: ...TranslationModel/PhraseDictionary.cpp
>>>
>>> It seems that for the first request it needs to look something up in
>>> the phrase table, while for subsequent requests it can (most of the
>>> time) be retrieved from a cache.
>>>
>>> But, as the sentences in my file are not related to one another in any
>>> way, the information in this cache cannot be sentence-dependent, so why
>>> wouldn't it be possible to preload the cache with the information
>>> needed?
>>>
>>> I think that perhaps I have something misconfigured, because I have seen
>>> other people use the approach of creating one ServerProxy object for
>>> each XMLRPC request (which would make things a lot easier for me), so I
>>> don't think they are experiencing this overhead. Perhaps using the
>>> compact formats has something to do with it?
>>>
>>> Any help would be much appreciated. I paste my moses.ini below, in case
>>> it helps:
>>>
>>> Thanks :)
>>>
>>> ### MOSES CONFIG FILE ###
>>> ###################
>>>
>>> # input factors
>>> [input-factors]
>>> 0
>>>
>>> # mapping steps
>>> [mapping]
>>> 0 T 0
>>>
>>> # translation tables: table type (hierarchical(0), textual (0),
>>> # binary (1)), source-factors, target-factors, number of scores, file
>>> # OLD FORMAT is still handled for back-compatibility
>>> # OLD FORMAT translation tables: source-factors, target-factors,
>>> # number of scores, file
>>> # OLD FORMAT a binary table type (1) is assumed
>>> [ttable-file]
>>> 12 0 0 5 /opt/moses-compiling/modelos/es-en/phrase-model/phrase-table
>>>
>>> # no generation models, no generation-file section
>>>
>>> # language models: type(srilm/irstlm), factors, order, file
>>> [lmodel-file]
>>> 8 0 5
>>> /opt/moses-compiling/modelos/es-en/lm/13-19-03gen_intec_head8m_sb5LM.kenlm
>>>
>>>
>>> # limit on how many phrase translations e for each phrase f are loaded
>>> # 0 = all elements loaded
>>> [ttable-limit]
>>> 10
>>>
>>> # distortion (reordering) files
>>> [distortion-file]
>>> 0-0 wbe-msd-bidirectional-fe-allff 6
>>> /opt/moses-compiling/modelos/es-en/phrase-model/reordering-table
>>>
>>> # distortion (reordering) weight
>>> [weight-d]
>>> 0.097107
>>> 0.150373
>>> -0.0551767
>>> -0.0307787
>>> 0.114613
>>> 0.214587
>>> 0.0467398
>>>
>>> # language model weights
>>> [weight-l]
>>> 0.0442748
>>>
>>>
>>> # translation model weights
>>> [weight-t]
>>> 0.00370888
>>> 0.0425665
>>> 0.0719956
>>> 0.0202699
>>> 0.071147
>>>
>>> # no generation models, no weight-generation section
>>>
>>> # word penalty
>>> [weight-w]
>>> 0.0366626
>>>
>>> [distortion-limit]
>>> 6
>>>
>>> [v]
>>> 0
>>>
>>>
>
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
