Hi, Raymond.

Interesting. Is your parallelization also tolerant to random job failures? How 
does it decide when to stop waiting? Couldn't it degrade to optimizing on, 
e.g., only one of the splits, with all the others failing?
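To make the question concrete, here is a minimal sketch (entirely hypothetical, 
not Raymond's code; the function and state names are made up) of the kind of 
guard I mean: after the jobs finish or a deadline passes, check whether enough 
splits actually succeeded before continuing to optimize.

```python
# Hypothetical sketch (not Raymond's code): decide whether tuning may
# continue after the parallel jobs have finished or timed out.
def enough_splits_ok(states, min_fraction=0.5):
    """states: final job states, e.g. ["ok", "failed", "ok"].

    Return True only if enough splits succeeded. Requiring at least two
    successful splits guards against silently degrading to optimizing
    on a single surviving split."""
    ok = sum(1 for s in states if s == "ok")
    return ok >= max(2, int(min_fraction * len(states)))
```

With such a check, a run where all but one split failed would abort loudly 
rather than report a score tuned on a fraction of the data.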

One option for committing your code in a more visible way is to put it in the 
main branch under a different name, if the change affects just one script. But 
I agree it's not very clean.
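For instance (a hypothetical illustration: the script names are made up, and 
the throwaway repository only stands in for a real Moses checkout):

```shell
# Hypothetical illustration of the "different name on the main branch"
# route. A scratch repo stands in for the real Moses checkout.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "You"
mkdir -p scripts/training
printf '#!/usr/bin/perl\n' > scripts/training/mert-moses.pl
git add -A
git commit -qm "baseline"
# Commit the modified script alongside the original instead of on a branch:
cp scripts/training/mert-moses.pl scripts/training/mert-moses-nosync.pl
git add scripts/training/mert-moses-nosync.pl
git commit -qm "Add no-sync SGE variant of mert-moses.pl"
git ls-files
```

The variant then lives next to the original on the main branch, visible to 
everyone, at the cost of some duplication.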

Cheers, O.

On April 16, 2014 5:38:35 PM CEST, Hieu Hoang <hieu.ho...@ed.ac.uk> wrote:
>hi raymond
>
>you're welcome to create a branch on the Moses github repository and add
>your code there. It's unlikely anyone will look at it or use it, but at
>least it won't get lost.
>
>Maybe in future, you or someone else might want to merge it with the
>master branch, where it will get much more exposure.
>
>
>On 16 April 2014 16:21, Raymond W. M. Ng <wm...@sheffield.ac.uk> wrote:
>
>> Dear Ondrej,
>>
>> I checked with Hieu when I met him in February; it seems that the SGE
>> submission in MERT uses the -sync mode, which makes submission awkward
>> (the user remains in a submission state until all jobs end). In short,
>> the modification runs in a "no-sync" mode.
>>
>> In terms of efficiency, for the reasons you mentioned, the combined
>> wallclock time of N machines (N times the actual program runtime) may
>> be longer than single-threaded execution. But in a lot of shared
>> computing environments, a single-threaded job running for 20+ hours is
>> not favoured (and sometimes disallowed). So by using the parallel mode,
>> the runtime of individual jobs is shortened. In my experience, the
>> parallel mode shortens tuning time from 30 hours (single-threaded) to
>> 3 hours (20 threads on different machines). We do have InfiniBand
>> access among the nodes, though, which is a bit more sophisticated than
>> NFS mounting.
>>
>> best
>> raymond
>>
>>
>> On 16 April 2014 13:48, Ondrej Bojar <bo...@ufal.mff.cuni.cz> wrote:
>>
>>> Dear Raymond,
>>>
>>> The existing scripts have always allowed running MERT in parallel
>>> jobs on SGE; one just had to use generic/moses-parallel as the "moses
>>> executable".
>>>
>>> Is there some other functionality that your modifications now bring?
>>>
>>> Btw, in my experience, parallelization into SGE jobs can be even less
>>> efficient than single-job multi-threaded execution. It is hard to
>>> describe the exact circumstances, but in general, if your models are
>>> big and loaded from NFS and you run many experiments at the same
>>> time, the network slowdown multiplied across the many SGE jobs makes
>>> the parallelization much more wasteful and sometimes slower (in
>>> wallclock time).
>>>
>>> Cheers, Ondrej.
>>>
>>> On April 16, 2014 1:07:37 PM CEST, "Raymond W. M. Ng" <
>>> wm...@sheffield.ac.uk> wrote:
>>> >Hi Moses support,
>>> >
>>> >I am not sure whether this is the right mailing list for this
>>> >enquiry... I have some modified scripts for parallel MERT tuning
>>> >which can run on SGE. Now I would like to share them. They are based
>>> >on an old version of Moses (around April 2012); what is the best way
>>> >to share them?
>>> >
>>> >Best
>>> >raymond
>>> >
>>> >
>>>
>>------------------------------------------------------------------------
>>> >
>>> >_______________________________________________
>>> >Moses-support mailing list
>>> >Moses-support@mit.edu
>>> >http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>> --
>>> Ondrej Bojar (mailto:o...@cuni.cz / bo...@ufal.mff.cuni.cz)
>>> http://www.cuni.cz/~obo
>>>
>>>
>>

-- 
Ondrej Bojar (mailto:o...@cuni.cz / bo...@ufal.mff.cuni.cz)
http://www.cuni.cz/~obo

