Gleb --

How about making a tarball with this patch in it that can be thrown at everyone's MTT? (we can put the tarball on www.open-mpi.org somewhere)


On Dec 11, 2007, at 4:14 PM, Richard Graham wrote:

I will re-iterate my concern. The code that is there now is mostly nine years old (with some mods made when it was brought over to Open MPI). It took about two months of testing on systems with 5-13 way network parallelism to track down all KNOWN race conditions. This code is at the center of MPI correctness, so I am VERY concerned about changing it without some very strong reasons. Not opposed, just very cautious.

Rich


On 12/11/07 11:47 AM, "Gleb Natapov" <gl...@voltaire.com> wrote:

On Tue, Dec 11, 2007 at 08:36:42AM -0800, Andrew Friedley wrote:
Possibly, though I have results from a benchmark I've written indicating the reordering happens at the sender. I believe it is due to the QP striping trick I use to get more bandwidth -- if you back down to one QP (there's a define in the code you can change), the reordering rate drops.
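
For context, a minimal sketch of the striping pattern being described, assuming a round-robin choice over a small array of QPs (the names NUM_QPS, qps, and post_frag are illustrative, not the actual UD BTL code):

#include <infiniband/verbs.h>

#define NUM_QPS 4                    /* assumed stripe width; 1 avoids reordering */

static struct ibv_qp *qps[NUM_QPS];  /* connected/ready UD QPs (setup omitted) */
static unsigned next_qp = 0;

static int post_frag(struct ibv_send_wr *wr)
{
    struct ibv_send_wr *bad_wr;
    /* Round-robin across QPs: consecutive frags go out on different
     * QPs, so the HCA is free to transmit them in any relative order
     * -- the reordering originates at the sender. */
    struct ibv_qp *qp = qps[next_qp++ % NUM_QPS];
    return ibv_post_send(qp, wr, &bad_wr);
}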
Ah, OK. My assumption was based just on reading the code, so I may be wrong.


Also, I do not make any recursive calls to progress -- at least not directly in the BTL; I can't speak for the upper layers. The reason I process many completions at once is that it is a big help in turning around receive buffers, making it harder to run out of buffers and drop frags. I want to say there was some performance benefit as well, but I can't say for sure.
Currently the upper layers of Open MPI may call a BTL progress function recursively. I hope this will change some day.
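
To illustrate what that recursion looks like, here is a minimal self-contained sketch (all names are hypothetical, not the real Open MPI call chain): the progress loop hands a completion to an upper-layer callback, and the callback spins on global progress while waiting for a resource, which re-enters the same BTL progress function.

#include <stdio.h>

static int depth = 0;

static void opal_progress_sketch(void);

/* Upper-layer callback: while waiting for a resource it spins on
 * global progress, re-entering the BTL that invoked it. */
static void frag_callback(void)
{
    if (depth < 3)                 /* stand-in for "resource still missing" */
        opal_progress_sketch();
}

static void btl_progress_sketch(void)
{
    depth++;
    printf("btl progress entered, depth=%d\n", depth);
    frag_callback();               /* deliver a completion upward */
    depth--;
}

static void opal_progress_sketch(void)
{
    btl_progress_sketch();         /* global progress polls each BTL */
}

int main(void)
{
    opal_progress_sketch();        /* prints depths 1..3 */
    return 0;
}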


Andrew

Gleb Natapov wrote:
On Tue, Dec 11, 2007 at 08:03:52AM -0800, Andrew Friedley wrote:
Try UD; frags are reordered at a very high rate, so it should be a good test.
Good idea, I'll try this. BTW I think the reason for such a high rate of reordering in UD is that it polls for MCA_BTL_UD_NUM_WC completions (500) and processes them one by one; if the progress function is called recursively, the next 500 completions will be reordered relative to the previous completions (the reordering happens on the receiver, not the sender).
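
A minimal sketch of that polling pattern, assuming a simple batch loop (MCA_BTL_UD_NUM_WC is the real define; poll_batch and handle_completion are illustrative):

#include <infiniband/verbs.h>

#define MCA_BTL_UD_NUM_WC 500   /* batch size mentioned above */

/* Hypothetical handler: in the real BTL each completion is handed to
 * the upper layer, which may re-enter progress before this batch is
 * drained -- completions from the nested poll then overtake the rest
 * of this batch, i.e. receiver-side reordering. */
extern void handle_completion(struct ibv_wc *wc);

void poll_batch(struct ibv_cq *cq)
{
    struct ibv_wc wc[MCA_BTL_UD_NUM_WC];
    int i, n = ibv_poll_cq(cq, MCA_BTL_UD_NUM_WC, wc);

    for (i = 0; i < n; i++)
        handle_completion(&wc[i]);   /* may recurse into progress */
}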

Andrew

Richard Graham wrote:
Gleb,
I would suggest that before this is checked in, it be tested on a system that has N-way network parallelism, where N is as large as you can find. This is a key bit of code for MPI correctness, and out-of-order operations will break it, so you want to maximize the chance for such operations.

Rich


On 12/11/07 10:54 AM, "Gleb Natapov" <gl...@voltaire.com> wrote:

Hi,

I did a rewrite of the matching code in OB1. I made it much simpler and half the size (which is good: less code, fewer bugs). I also got rid of the huge macros -- very helpful if you need to debug something. There is no performance degradation; actually, I even see a very small performance improvement. I ran MTT with this patch and the result is the same as on the trunk. I would like to commit this to the trunk. The patch is attached for everybody to try.
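
For anyone not familiar with this part of the code, here is a hedged sketch of the general technique the matching logic relies on: per-peer sequence numbers, with frags that arrive early parked on an out-of-order list (the types and names below are illustrative, not the actual OB1 structures):

#include <stdint.h>
#include <stddef.h>

typedef struct frag {
    uint16_t     seq;        /* sequence number carried by the frag */
    struct frag *next;
} frag_t;

typedef struct {
    uint16_t expected_seq;   /* next sequence we can match */
    frag_t  *ooo_list;       /* frags that arrived too early */
} peer_t;

extern void match_frag(frag_t *frag);   /* hand off to MPI matching */

void frag_arrived(peer_t *peer, frag_t *frag)
{
    if (frag->seq != peer->expected_seq) {
        /* Early frag: park it until its predecessors show up. */
        frag->next = peer->ooo_list;
        peer->ooo_list = frag;
        return;
    }
    match_frag(frag);
    peer->expected_seq++;

    /* Drain any parked frags that are now in order. */
    for (;;) {
        frag_t **pp = &peer->ooo_list, *f;
        for (f = *pp; f != NULL && f->seq != peer->expected_seq;
             pp = &f->next, f = f->next)
            ;
        if (f == NULL)
            break;
        *pp = f->next;
        match_frag(f);
        peer->expected_seq++;
    }
}

Out-of-order arrival is exactly the case Rich is worried about, which is why testing on a system with high network parallelism stresses this path.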

--
Gleb.

--
Gleb.

--
Gleb.


--
Jeff Squyres
Cisco Systems
