Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-24 Thread Sébastien Boisvert
"Innovation comes only from an assault on the unknown" -Sydney Brenner /* Ray Copyright (C) 2010 Sébastien Boisvert http://DeNovoAssembler.SourceForge.Net/ This program is free software: you can redistribute it and/or modify it under the

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-24 Thread Christopher Samuel
On 24/11/10 16:32, Sébastien Boisvert wrote: > Yes, Ray versions 0.1.0 and below are not fully compliant > with MPI 2.2. > > I will release Ray 1.0.0 as soon as my regression tests > are done. That should be tomorrow. Wonderful, thank you! :-)

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-24 Thread Sébastien Boisvert
Yes, Ray versions 0.1.0 and below are not fully compliant with MPI 2.2. I will release Ray 1.0.0 as soon as my regression tests are done. That should be tomorrow. On Wednesday, 24 November 2010 at 00:01 -0500, Christopher Samuel wrote: > On 2

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-24 Thread Christopher Samuel
On 24/11/10 09:17, Sébastien Boisvert wrote: > As Mr. George Bosilca underlined, since the same test case works for > small messages, the problem is congestion of the FIFOs, which leads > to resource locking, and as you wrote, deadlock. Hmm, we'

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-23 Thread Sébastien Boisvert
On Tuesday, 23 November 2010 at 20:21 -0500, Jeff Squyres (jsquyres) wrote: > Beware that MPI_Request_free on active buffers is valid but evil. You CANNOT > be sure when the buffer is available for reuse. Yes, but as I said, in my program an MPI rank never floods other MPI ranks. (I like to thin

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-23 Thread Sébastien Boisvert
Thank you! Your support is outstanding! On Tuesday, 23 November 2010 at 22:25 -0500, Eugene Loh wrote: > Jeff Squyres (jsquyres) wrote: > > >Ya, it sounds like we should fix this eager limit help text so that others > >aren't misled. We did say "attempt", but that's probably a bit too subtle.

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-23 Thread Eugene Loh
Jeff Squyres (jsquyres) wrote: Ya, it sounds like we should fix this eager limit help text so that others aren't misled. We did say "attempt", but that's probably a bit too subtle. Eugene - iirc: this is in the btl base (or some other central location) because it's shared between all btls.

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-23 Thread Jeff Squyres (jsquyres)
Beware that MPI_Request_free on active buffers is valid but evil. You CANNOT be sure when the buffer is available for reuse. There was a sentence or paragraph added to MPI 2.2 describing exactly this case. Sent from my PDA. No type good. On Nov 23, 2010, at 5:36 PM, Sébastien Boisvert wro

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-23 Thread Jeff Squyres (jsquyres)
Ya, it sounds like we should fix this eager limit help text so that others aren't misled. We did say "attempt", but that's probably a bit too subtle. Eugene - iirc: this is in the btl base (or some other central location) because it's shared between all btls. Sent from my PDA. No type good.

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-23 Thread Sébastien Boisvert
Whoa! Thanks, I will try that. On Tuesday, 23 November 2010 at 18:03 -0500, George Bosilca wrote: > If you know the max size of the receives I would take a different approach. "max size" is the maximum buffer size required, right? In my case, it is 4096. > Post a few persistent receives, and man

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-23 Thread George Bosilca
If you know the max size of the receives I would take a different approach. Post a few persistent receives, and manage them in a circular buffer. Instead of doing an MPI_Iprobe, use MPI_Test on the current head of your circular buffer. Once you use the data related to the receive, just do an MPI_S
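The bookkeeping behind this suggestion can be sketched without MPI. The block below is a minimal, library-free model, not George Bosilca's code: `Slot`, `ReceiveRing`, `deliver` and `poll` are hypothetical names, and the comments mark where real code would presumably call MPI_Recv_init, MPI_Test and MPI_Start (the last name is an assumption, since the original sentence is cut off). It also assumes every slot is posted with the same source and tag pattern, so that incoming messages match the slots in posting order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One ring entry; stands in for a persistent receive request and its buffer.
struct Slot {
    std::vector<char> buffer;  // sized once for the largest expected message
    std::size_t length = 0;    // bytes of the message currently held
    bool completed = false;    // what MPI_Test would report for this request
};

// Hypothetical, MPI-free model of a ring of persistent receives.
class ReceiveRing {
public:
    ReceiveRing(std::size_t slot_count, std::size_t max_bytes)
        : slots_(slot_count), max_bytes_(max_bytes) {
        assert(slot_count > 0);
        // Real code would call MPI_Recv_init and MPI_Start once per slot here.
        for (Slot& s : slots_) s.buffer.resize(max_bytes);
    }

    // Simulates a message arriving from the network. It lands in the oldest
    // slot that is still waiting. Returns false if the message is too large
    // or every slot already holds an unread message.
    bool deliver(const char* data, std::size_t n) {
        if (n > max_bytes_ || pending_ == slots_.size()) return false;
        Slot& s = slots_[(head_ + pending_) % slots_.size()];
        std::copy(data, data + n, s.buffer.begin());
        s.length = n;
        s.completed = true;
        ++pending_;
        return true;
    }

    // Checks only the head slot (real code: MPI_Test on its request).
    // On success, copies the message out, re-arms the slot and advances.
    bool poll(std::vector<char>& out) {
        Slot& s = slots_[head_];
        if (!s.completed) return false;
        out.assign(s.buffer.begin(), s.buffer.begin() + s.length);
        s.completed = false;  // real code: re-arm the persistent request
        head_ = (head_ + 1) % slots_.size();
        --pending_;
        return true;
    }

private:
    std::vector<Slot> slots_;
    std::size_t max_bytes_;
    std::size_t head_ = 0;     // index of the next slot to test
    std::size_t pending_ = 0;  // completed slots not yet consumed
};
```

In a main loop, `poll` would be called once per iteration in place of an MPI_Iprobe followed by a receive; because the buffers are allocated once and reused, a 4096-byte maximum message size only needs `slot_count` buffers of 4096 bytes.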

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-23 Thread Eugene Loh
George Bosilca wrote: Moreover, eager send can improve performance if and only if the matching receives are already posted on the peer. If not, the data will become unexpected, and there will be one additional memcpy. I don't think the first sentence is strictly true. There is a cost associ

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-23 Thread Eugene Loh
Sébastien Boisvert wrote: On Tuesday, 23 November 2010 at 16:07 -0500, Eugene Loh wrote: Sébastien Boisvert wrote: Case 1: 30 MPI ranks, message size is 4096 bytes File: mpirun-np-30-Program-4096.txt Outcome: It hangs -- I killed the poor thing after 30 seconds or s

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-23 Thread Sébastien Boisvert
On Tuesday, 23 November 2010 at 17:38 -0500, George Bosilca wrote: > The eager size reported by ompi_info includes the Open MPI internal headers. > They are anywhere between 20 and 64 bytes long (potentially more for some > particular networks), so what Eugene suggested was a safe boundary. I see

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-23 Thread George Bosilca
The eager size reported by ompi_info includes the Open MPI internal headers. They are anywhere between 20 and 64 bytes long (potentially more for some particular networks), so what Eugene suggested was a safe boundary. Moreover, eager send can improve performance if and only if the matching rec
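As a rough illustration of the header arithmetic described here: if the reported eager limit already counts Open MPI's internal header, the largest payload that can still qualify as eager is the limit minus the header size. The sketch below assumes a 4096-byte eager limit purely for illustration (the real value depends on the BTL and should be checked with ompi_info) and uses the 20 to 64 byte header range quoted above. The function names are hypothetical and not part of Open MPI.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: largest user payload that still fits under an eager
// limit that already includes the library's internal header.
std::size_t max_eager_payload(std::size_t eager_limit, std::size_t header_bytes) {
    assert(header_bytes <= eager_limit);  // a header larger than the limit is a usage error
    return eager_limit - header_bytes;
}

// Hypothetical helper: true if payload plus header stays within the limit.
bool fits_eagerly(std::size_t payload, std::size_t eager_limit, std::size_t header_bytes) {
    return payload + header_bytes <= eager_limit;
}
```

Under these assumptions, `max_eager_payload(4096, 64)` is 4032, so a 4096-byte message would not qualify as eager even with the smallest 20-byte header, while a 4000-byte message would qualify even with a 64-byte header. Staying under the limit is still no guarantee: as noted elsewhere in the thread, no message is eager if there is congestion.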

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-23 Thread Sébastien Boisvert
On Tuesday, 23 November 2010 at 17:28 -0500, George Bosilca wrote: > Sebastien, > > Using MPI_Isend doesn't guarantee asynchronous progress. As you might be > aware, the non-blocking communications are guaranteed to progress only when > the application is in the MPI library. Currently very few MP

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-23 Thread Sébastien Boisvert
On Tuesday, 23 November 2010 at 16:07 -0500, Eugene Loh wrote: > Sébastien Boisvert wrote: > > >Now I can describe the cases. > > > > > The test cases can all be explained by the test requiring eager messages > (something that test4096.cpp does not require). > > >Case 1: 30 MPI ranks, message s

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-23 Thread George Bosilca
Sebastien, Using MPI_Isend doesn't guarantee asynchronous progress. As you might be aware, the non-blocking communications are guaranteed to progress only when the application is in the MPI library. Currently very few MPI implementations progress asynchronously (and unfortunately Open MPI is no

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-23 Thread George Bosilca
No message is eager if there is congestion. 64K is eager for TCP only if the kernel buffer has enough room to hold the 64k. For SM it only works if there are ready buffers. In fact, eager is an optimization of the MPI library, not something the users should be aware of, or base their application

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-23 Thread Sébastien Boisvert
On Tuesday, 23 November 2010 at 15:17 -0500, Jeff Squyres (jsquyres) wrote: > Sorry for the delay in replying - many of us were at SC last week. Nothing to be sorry for! > > Admittedly, I'm looking at your code on a PDA, so I might be missing some > things. But I have 2 q's: You got it all ri

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-23 Thread Eugene Loh
Sébastien Boisvert wrote: Now I can describe the cases. The test cases can all be explained by the test requiring eager messages (something that test4096.cpp does not require). Case 1: 30 MPI ranks, message size is 4096 bytes File: mpirun-np-30-Program-4096.txt Outcome: It hangs -- I kil

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-23 Thread Eugene Loh
To add to Jeff's comments: Sébastien Boisvert wrote: The reason is that I am developing MPI-based software, and I use Open-MPI as it is the only implementation I am aware of that sends messages eagerly (powerful feature, that is). As wonderful as OMPI is, I am fairly sure other MPI implem

Re: [OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-23 Thread Jeff Squyres (jsquyres)
Sorry for the delay in replying - many of us were at SC last week. Admittedly, I'm looking at your code on a PDA, so I might be missing some things. But I have 2 q's: 1 your send routine doesn't seem to protect from sending to yourself. Correct? 2 you're not using nonblocking sends, which, if

[OMPI devel] Simple program (103 lines) makes Open-1.4.3 hang

2010-11-16 Thread Sébastien Boisvert
Dear awesome community, Over the last months, I closely followed the evolution of bug 2043, entitled 'sm BTL hang with GCC 4.4.x'. https://svn.open-mpi.org/trac/ompi/ticket/2043 The reason is that I am developing MPI-based software, and I use Open-MPI as it is the only implementation I am a