On Apr 24, 2014, at 10:47 PM, Ralph Castain <r...@open-mpi.org> wrote:

>> And, as George pointed out, I see a trend towards heterogeneity in
>> HPC, so I'd say this feature will be rather more important in the
>> future.
> 
> We have been hearing about such "trends" for a long time, but have yet to see 
> them actually happen. Not saying it couldn't some day - just saying it still 
> hasn't happened in production.

+1

MPI was designed to support heterogeneity all the way back from MPI-1.0 (1994) 
on these same kinds of arguments.  It hasn't really panned out for more than a 
handful of users.

Keep in mind that data size heterogeneity is an unsolved problem.  What do you 
do if one process sends a 4-byte integer of value 0xff000000 to a peer with 
only 2-byte integers?
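
To make the problem concrete, here is a minimal sketch in C.  The helper 
names are made up for illustration (they are not MPI or Open MPI functions), 
and it assumes unsigned fixed-width types so that the arithmetic is 
well-defined.  It only shows the two obvious choices a receiver has, neither 
of which preserves the value:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: try to store a 4-byte sender value in a receiver
 * whose integer is only 2 bytes wide.  Returns true if the value fits;
 * otherwise returns false and leaves *out untouched. */
static bool narrow_to_u16(uint32_t value, uint16_t *out)
{
    assert(out != NULL);
    if (value > UINT16_MAX) {
        return false;   /* no 2-byte representation exists */
    }
    *out = (uint16_t)value;
    return true;
}

/* Hypothetical helper: what silent truncation would do instead,
 * i.e. keep only the low 16 bits of the sender's value. */
static uint16_t truncate_low16(uint32_t value)
{
    return (uint16_t)(value & 0xFFFFu);
}
```

With value 0xff000000, narrow_to_u16() has to report failure, and 
truncate_low16() silently hands the receiver 0.  Either the receive errors 
out or the data is quietly wrong.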

>> So, would repairing the code be significantly more complicated than a
>> clean extraction?
> 
> So here's what I suggest: if someone is willing to take the lead in fixing 
> hetero operations, and has the hardware upon which to verify it, then please 
> step forward. Otherwise, I agree with Jeff that we should remove it and move 
> on.


The broken part(s) are likely somewhere in the datatype and/or PML code (my 
guess).  Keep in mind that my only testing of this feature is in *homogeneous* 
mode -- i.e., I compile with --enable-heterogeneous and then run tests on 
homogeneous machines.  Meaning: it's not only broken for actual heterogeneity, 
it's also broken in the "unity"/homogeneous case.

So which is more complicated: fix or remove?  I don't know; as George mentions, 
I suspect removal is likely to be a little tricky.  

But ask that question a little differently: which is more complicated, 
long-term maintenance of a feature which no one really tests (or even has the 
hardware setup to test) or removal?  

To me, the answer is a little clearer that way.

That being said, there are 3 disagreements with this RFC so far:

1. George: on principle
2. Andreas: (might) use heterogeneity if it worked
3. Siegmar: uses heterogeneity in older OMPI versions in his SPARC+Intel setups

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
