Re: [Moses-support] Parallelising Giza++ for supercomputers

Qin Gao Fri, 20 Feb 2009 05:42:42 -0800

Hi, James,

PGIZA++ does not use OpenMPI; it only uses shared storage to transfer model
files, which could be a bottleneck. MGIZA++ just uses multi-threading. So
neither is quite complete and both can be further improved. The advantage of
PGIZA++ is that it already decomposes every training step (E/M, Model 1, HMM,
Models 3, 4, 5, etc.), so it should be easy for you to understand the logic.
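
To make the E/M decomposition concrete, here is a minimal sketch of IBM
Model 1 training with the E-step run per chunk and the M-step merging the
partial counts. It is not taken from PGIZA++ or MGIZA++: the function names
(e_step, m_step, train), the round-robin chunking, and the omission of the
NULL word are all assumptions made for illustration. In a real setup each
e_step call would run on a separate process or thread.

```python
from collections import defaultdict


def e_step(chunk, t):
    """Collect expected counts for one chunk of (source, target) sentence pairs."""
    counts = defaultdict(float)
    for src, tgt in chunk:
        for f in tgt:
            # Normalise over every source word that could have produced f.
            z = sum(t[(e, f)] for e in src)
            for e in src:
                counts[(e, f)] += t[(e, f)] / z
    return counts


def m_step(partial_counts):
    """Merge the per-chunk counts and renormalise them into a new table t(f|e)."""
    merged = defaultdict(float)
    totals = defaultdict(float)
    for counts in partial_counts:
        for (e, f), c in counts.items():
            merged[(e, f)] += c
            totals[e] += c
    return {(e, f): c / totals[e] for (e, f), c in merged.items()}


def train(corpus, n_chunks, iterations):
    """Run Model 1 EM; the corpus is split into n_chunks for the E-step."""
    # Uniform start: one entry per co-occurring word pair (hypothetical layout).
    t = {(e, f): 1.0 for src, tgt in corpus for e in src for f in tgt}
    # Naive decomposition: deal sentence pairs out round-robin.
    chunks = [corpus[i::n_chunks] for i in range(n_chunks)]
    for _ in range(iterations):
        # Each e_step is independent given t, so these calls could run in parallel.
        partial = [e_step(chunk, t) for chunk in chunks]
        t = m_step(partial)
    return t
```

Because the E-step only reads the table and writes its own counts, the
number of chunks should not change the result beyond floating-point
rounding; what it changes is how many workers need a copy of the table.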


The main bottleneck is the huge T-table (translation table), which is
largest during Model 1 training and then becomes smaller as training goes on.
Every child process has to get the table (typically several gigabytes for
Model 1 on large data). I think OpenMPI or something similar is the right way
to go. Please let me know if you have any questions about PGIZA++.
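
As a rough illustration of why the table is so large at the start of
Model 1, the sketch below counts the distinct co-occurring word pairs in a
corpus and turns that into a size estimate. It assumes the table holds one
entry per co-occurring pair; the function names and the 16 bytes per entry
are hypothetical figures for illustration, not GIZA++ internals.

```python
def cooccurring_pairs(corpus):
    """Return the set of (source word, target word) pairs seen in the same sentence pair."""
    pairs = set()
    for src, tgt in corpus:
        for e in set(src):
            for f in set(tgt):
                pairs.add((e, f))
    return pairs


def table_bytes(n_entries, bytes_per_entry=16):
    """Estimate table size; bytes_per_entry is an assumed figure, not a measured one."""
    return n_entries * bytes_per_entry
```

For example, table_bytes(len(cooccurring_pairs(corpus))) gives the amount
of data every child would have to read under these assumptions.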

Best,
Qin

On Thu, Feb 19, 2009 at 4:54 PM, James Read <j.rea...@sms.ed.ac.uk> wrote:

> Wow!
>
> Thanks for that. That was great. I've had a quick read through your paper.
> I'm guessing the basis of PGiza++ is OpenMPI calls and the basis of MGiza++
> is OpenMP calls right?
>
> Your paper was very fascinating. You mentioned I/O bottlenecks quite a lot
> with reference to PGiza++ which is to be expected. Did you run any
> experiments to find what those bottlenecks typically are? How many
> processors did you hit before you started to lose speed up? Did this number
> vary for different data sets?
>
> Also, you mention breaking up the files into chunks and working on them on
> different processors. Obviously you're referring to some kind of data
> decomposition plan. Does your algorithm have any kind of intelligent data
> decomposition strategy for reducing communications? Or is it just a case of
> cutting the file up into n bits and assigning each one to a processor?
>
> The reason I ask is that our project would now have to come up with some
> kind of superior data decomposition plan in order to justify proceeding with
> the project.
>
> Thanks
>
> James
>
>
> Quoting Qin Gao <q...@cs.cmu.edu>:
>
>  Hi James,
>>
>> GIZA++ is a very typical EM algorithm, and you probably want to
>> parallelize the E-step since it takes longer than the M-step. You may
>> want to check out the PGIZA++ and MGIZA++ implementations, which you can
>> download from my homepage:
>>
>> http://www.cs.cmu.edu/~qing
>>
>> And you may also be interested in a paper describing the work:
>>
>> www.aclweb.org/anthology-new/W/W08/W08-0509.pdf
>>
>> Please let me know if there is anything I can help with.
>>
>> Best,
>> Qin
>>
>> On Thu, Feb 19, 2009 at 4:12 PM, James Read <j.rea...@sms.ed.ac.uk>
>> wrote:
>>
>>  Hi all,
>>>
>>> As the title suggests, I am involved in a project which may involve
>>> parallelising the code of Giza++ so that it will scale to n processors
>>> on supercomputers. This would have obvious benefits to any researchers
>>> making regular use of Giza++ who would like it to finish in minutes
>>> rather than hours.
>>>
>>> The first step of such a project was profiling Giza++ to see where the
>>> executable spends most of its time on a typical run. Such profiling
>>> indicated a number of candidate functions. One of which was
>>> model1::em_loop found in the model1.cpp file.
>>>
>>> In order to parallelise such a function (using OpenMPI) it is
>>> necessary to first come up with some kind of data decomposition
>>> strategy which minimizes the latency of interprocessor communication
>>> but ensures that the parallelisation has no side effects other than
>>> running faster on a number of processors up to some optimal number of
>>> processors where the latency of communication begins to outweigh the
>>> benefits of throwing more processors at the job.
>>>
>>> In order to do this I am trying to gain an understanding of the logic
>>> in the model1::em_loop function. However, intuitive comments are
>>> lacking in the code. Does anyone on this list have a good internal
>>> knowledge of this function? Enough to give a rough outline of the
>>> logic it contains in some kind of readable pseudocode?
>>>
>>> Thanks
>>>
>>> P.S. Apologies to anybody to whom this email was not of interest.
>>>
>>> --
>>> The University of Edinburgh is a charitable body, registered in
>>> Scotland, with registration number SC005336.
>>>
>>>
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> Moses-support@mit.edu
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>>
>>
>>
>> --
>> ==========================================
>> Qin Gao
>> Language Technology Institution
>> Carnegie Mellon University
>> http://geek.kyloo.net
>>
>> ------------------------------------------------------------------------------------
>> Please help improving NLP articles on Wikipedia
>> ==========================================
>>
>>
>
>


