Re: [OMPI devel] Error in VT
Hi Leonardo, I guess that your program uses POSIX threads and needs the MPI thread support level MPI_THREAD_MULTIPLE, right? Unfortunately, the OMPI-integrated version of VT supports neither Pthreads nor any MPI thread level. The latest stand-alone version of VT (5.6.3) supports at least Pthreads and the MPI thread support levels MPI_THREAD_SINGLE and MPI_THREAD_FUNNELED. So if you can change the MPI thread level requirement to MPI_THREAD_SINGLE or MPI_THREAD_FUNNELED, tracing of your code should work. You can download the latest VT version at http://www.tu-dresden.de/zih/vampirtrace. Please give it a try.

Regards,
Matthias Jurenz

On Mon, 2009-03-30 at 19:04 +0200, Leonardo Fialho wrote:
> Hi Jeff,
>
> Here they are...
>
> Thanks a lot,
> Leonardo
>
> Jeff Squyres wrote:
> > Can you send all the information listed here:
> >
> > http://www.open-mpi.org/community/help/
> >
> > On Mar 30, 2009, at 11:46 AM, Leonardo Fialho wrote:
> >
> >> Hi,
> >>
> >> I'm experiencing the following errors while using Open MPI release
> >> 1.3.1 combined with VT.
> >>
> >> STAT P 2.258062 43.% 488.997562 0
> >> STAT P 2.260121 44.% 485.672638 0
> >> STAT P 2.262175 45.% 486.854935 0
> >> RFG_Regions_stackPop(): Error: Stack underflow
> >> RFG_Regions_stackPop(): Error: Stack underflow
> >> VampirTrace [vt_otf_trc.c:1300]: Resource temporarily unavailable
> >> [nodo1][[43845,1],0][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv]
> >> mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
> >> VampirTrace [vt_otf_trc.c:1300]: Resource temporarily unavailable
> >> RFG_Regions_stackPop(): Error: Stack underflow
> >> VampirTrace [vt_otf_trc.c:1300]: Resource temporarily unavailable
> >> --
> >>
> >> mpirun has exited due to process rank 1 with PID 8814 on
> >> node nodo2 exiting without calling "finalize". This may
> >> have caused other processes in the application to be
> >> terminated by signals sent by mpirun (as reported here).
> >> --
> >>
> >> [fialho@aoclsd gmwat]$
> >>
> >> Across different executions, the error occurs at different points.
> >>
> >> Any help?
> >>
> >> Thanks,
> >>
> >> --
> >> Leonardo Fialho
> >> Computer Architecture and Operating Systems Department - CAOS
> >> Universidad Autonoma de Barcelona - UAB
> >> ETSE, Edifcio Q, QC/3088
> >> http://www.caos.uab.es
> >> Phone: +34-93-581-2888
> >> Fax: +34-93-581-2478
>
> plain text document attachment (environ.txt)
> declare -x
> LD_LIBRARY_PATH="/home/fialho/local/tau-2.18.1p1/i386_linux/lib:/home/fialho/OSS/lib:/home/fialho/gmate/lib:/home/fialho/local/openmpi-1.3.1/lib:/home/fialho/dyninst/lib:/home/fialho/local/lib:/home/fialho/local/gsl-1.9/lib/:/home/fialho/dyninst/lib"
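The change Matthias suggests is made at MPI initialization time. A minimal sketch of an application requesting MPI_THREAD_FUNNELED instead of MPI_THREAD_MULTIPLE might look like this (illustrative code only, not Leonardo's actual program):

    /* Illustrative only: request MPI_THREAD_FUNNELED so a VT-instrumented
     * build can trace the run.  Only the thread that called
     * MPI_Init_thread() may then make MPI calls. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided;

        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        if (provided < MPI_THREAD_FUNNELED) {
            fprintf(stderr, "MPI_THREAD_FUNNELED not available (got %d)\n", provided);
            MPI_Abort(MPI_COMM_WORLD, 1);
        }

        /* ... application work; only the main thread calls MPI ... */

        MPI_Finalize();
        return 0;
    }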
Re: [OMPI devel] SM init failures
On Mar 31, 2009, at 11:00 AM, Jeff Squyres wrote:

> On Mar 31, 2009, at 3:45 AM, Sylvain Jeaugey wrote:
> > Sorry to continue off-topic, but going to System V shm would for me be like going back in the past. System V shared memory used to be the main way to do shared memory on MPICH, and from my (little) experience it was truly painful:
> > - Cleanup issues: does shmctl(IPC_RMID) solve _all_ cases? (even kill -9?)
> > - Naming issues: shm segments are identified by a 32-bit key, potentially causing conflicts between applications, or between layers of the same application, on one node
> > - Space issues: the total shm size on a system is bounded by /proc/sys/kernel/shmmax, needing admin configuration and causing conflicts between MPI applications running on the same node
>
> Indeed. The one saving grace here is that the cleanup issues apparently can be solved on Linux with a special flag that indicates "automatically remove this shmem when all processes attaching to it have died." That was really the impetus for [re-]investigating sysv shm. I, too, remember the sysv pain because we used it in LAM...

What about the other issues? I remember those being a PITA about 15 to 20 years ago, but obviously a lot could have improved in the meantime.

Iain
[OMPI devel] mallopt fixes
Inevitably, when you're testing in your own private environment, everything works great. You test, test, test, and are finally convinced that it's all perfect. Seconds after you merge it into the main SVN trunk, you find a dozen little mistakes. Sigh. :-\

After a bunch of SVN commits, I think I have all the mallopt fixes on the SVN trunk. Please test as much as you can. We should let this soak on the trunk for a day or three before moving it to the v1.3 branch (CMR already filed).

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r20926
Ah -- good catch. Thanks.

Should the same fixes be applied to type_create_keyval_f.c and win_create_keyval_f.c?

On Apr 1, 2009, at 3:31 PM, wrote:

Author: igb
Date: 2009-04-01 15:31:46 EDT (Wed, 01 Apr 2009)
New Revision: 20926
URL: https://svn.open-mpi.org/trac/ompi/changeset/20926

Log: Fix Fortran bindings for MPI_KEYVAL_CREATE and MPI_COMM_CREATE_KEYVAL. The EXTRA_STATE parameter is passed by reference, and thus should be dereferenced before it is stored. Similarly, the stored value should be passed by reference to the copy and delete routines. This fixes #1864.

Text files modified:
   trunk/ompi/attribute/attribute.c          | 16 ++--
   trunk/ompi/mpi/f77/comm_create_keyval_f.c |  2 +-
   trunk/ompi/mpi/f77/keyval_create_f.c      |  2 +-
   3 files changed, 12 insertions(+), 8 deletions(-)

Modified: trunk/ompi/attribute/attribute.c
==============================================================================
--- trunk/ompi/attribute/attribute.c    (original)
+++ trunk/ompi/attribute/attribute.c    2009-04-01 15:31:46 EDT (Wed, 01 Apr 2009)
@@ -249,9 +249,10 @@
     /* MPI-1 Fortran-style */ \
     if (0 != (keyval_obj->attr_flag & OMPI_KEYVAL_F77_MPI1)) { \
         MPI_Fint attr_val = translate_to_fortran_mpi1(attribute); \
+        MPI_Fint extra_state = (MPI_Fint)keyval_obj->extra_state; \
         (*((keyval_obj->delete_attr_fn).attr_mpi1_fortran_delete_fn)) \
             (&(((ompi_##type##_t *)object)->attr_##type##_f), \
-             &f_key, &attr_val, (int*)keyval_obj->extra_state, &f_err); \
+             &f_key, &attr_val, &extra_state, &f_err); \
         if (MPI_SUCCESS != OMPI_FINT_2_INT(f_err)) { \
             if (need_lock) { \
                 OPAL_THREAD_UNLOCK(&alock); \
@@ -262,9 +263,10 @@
     /* MPI-2 Fortran-style */ \
     else { \
         MPI_Aint attr_val = translate_to_fortran_mpi2(attribute); \
+        MPI_Aint extra_state = (MPI_Aint)keyval_obj->extra_state; \
         (*((keyval_obj->delete_attr_fn).attr_mpi2_fortran_delete_fn)) \
             (&(((ompi_##type##_t *)object)->attr_##type##_f), \
-             &f_key, (int*)&attr_val, (int*)keyval_obj->extra_state, &f_err); \
+             &f_key, (int*)&attr_val, &extra_state, &f_err); \
         if (MPI_SUCCESS != OMPI_FINT_2_INT(f_err)) { \
             if (need_lock) { \
                 OPAL_THREAD_UNLOCK(&alock); \
@@ -297,11 +299,12 @@
     ompi_fortran_logical_t f_flag; \
     /* MPI-1 Fortran-style */ \
     if (0 != (keyval_obj->attr_flag & OMPI_KEYVAL_F77_MPI1)) { \
-        MPI_Fint in, out; \
+        MPI_Fint in, out, extra_state; \
         in = translate_to_fortran_mpi1(in_attr); \
+        extra_state = (MPI_Fint)keyval_obj->extra_state; \
         (*((keyval_obj->copy_attr_fn).attr_mpi1_fortran_copy_fn)) \
             (&(((ompi_##type##_t *)old_object)->attr_##type##_f), \
-             &f_key, (int*)keyval_obj->extra_state, \
+             &f_key, &extra_state, \
              &in, &out, &f_flag, &f_err); \
         if (MPI_SUCCESS != OMPI_FINT_2_INT(f_err)) { \
             OPAL_THREAD_UNLOCK(&alock); \
@@ -313,11 +316,12 @@
     } \
     /* MPI-2 Fortran-style */ \
     else { \
-        MPI_Aint in, out; \
+        MPI_Aint in, out, extra_state; \
         in = translate_to_fortran_mpi2(in_attr); \
+        extra_state = (MPI_Aint)keyval_obj->extra_state; \
         (*((keyval_obj->copy_attr_fn).attr_mpi2_fortran_copy_fn)) \
             (&(((ompi_##type##_t *)old_object)->attr_##type##_f), \
-             &f_key, keyval_obj->extra_state, &in, &out, \
+             &f_key, &extra_state, &in, &out, \
              &f_flag, &f_err); \
         if (MPI_SUCCESS != OMPI_FINT_2_INT(f_err)) { \
             OPAL_THREAD_UNLOCK(&alock); \

Modified: trunk/ompi/mpi/f77/comm_create_keyval_f.c
==============================================================================
--- trunk/ompi/mpi/f77/comm_create_keyval_f.c    (original)
+++ trunk/ompi/mpi/f77/comm_create_keyval_f.c    2009-04-01 15:31:46 EDT (Wed, 01 Apr 2009)
@@ -79,7 +79,7 @@
        to the old MPI-1 INTEGER-parameter functions). */
     ret = ompi_attr_create_keyval(COMM_ATTR, copy_fn, del_fn,
-                                  comm_keyval, extra_state, OMPI_KEYVAL_F77,
+                                  comm_keyval, (void*)*extra_state, OMPI_KEYVAL_F77,
                                   NULL);
     if (MPI_SUCCESS != ret) {

Modified: trunk/ompi/mpi/f77/keyval_create_f.c
==============================================================================
--- trunk/ompi/mpi/f77/keyval_create_f.c    (original)
+++ trunk/ompi/mpi/f77/keyval_create_f.c
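The bug being fixed is easier to see outside the macro machinery. Here is a stand-alone sketch of the pattern with invented names (not the actual OMPI internals): the Fortran caller passes EXTRA_STATE by reference, so the binding must dereference the pointer when storing the value, and later pass the address of the stored value back to the Fortran callbacks.

    /* Illustrative sketch only -- simplified, not the real OMPI code. */
    #include <stdio.h>

    typedef long fint;                 /* stand-in for MPI_Fint/MPI_Aint */

    static fint stored_extra_state;    /* value stored at keyval-creation time */

    /* Fortran-style callback: every argument arrives by reference. */
    static void delete_fn(fint *key, fint *attr_val, fint *extra_state, fint *ierr)
    {
        printf("delete: key=%ld extra_state=%ld\n", (long)*key, (long)*extra_state);
        *ierr = 0;
    }

    /* Called from the Fortran binding: extra_state is a POINTER to the value. */
    static void create_keyval(const fint *extra_state)
    {
        /* Wrong (the old bug): keeping the pointer loses the caller's value
         * once the caller's variable goes away, and passing it to the
         * callbacks as if it were the value itself is bogus. */
        stored_extra_state = *extra_state;   /* right: dereference and store */
    }

    static void trigger_delete(fint key, fint attr_val)
    {
        fint ierr;
        /* Pass the stored value by reference, as Fortran expects. */
        delete_fn(&key, &attr_val, &stored_extra_state, &ierr);
    }

    int main(void)
    {
        fint extra = 42;
        create_keyval(&extra);
        trigger_delete(7, 99);
        return 0;
    }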
Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r20926
On Apr 1, 2009, at 4:29 PM, Jeff Squyres wrote:
> Should the same fixes be applied to type_create_keyval_f.c and win_create_keyval_f.c?

Good question. I'll have a look at them.

Iain
Re: [OMPI devel] SM init failures
On Tue, 2009-03-31 at 11:00 -0400, Jeff Squyres wrote:
> On Mar 31, 2009, at 3:45 AM, Sylvain Jeaugey wrote:
> > System V shared memory used to be the main way to do shared memory on
> > MPICH and from my (little) experience, this was truly painful:
> > - Cleanup issues: does shmctl(IPC_RMID) solve _all_ cases? (even kill -9?)
>
> Indeed. The one saving grace here is that the cleanup issues
> apparently can be solved on Linux with a special flag that indicates
> "automatically remove this shmem when all processes attaching to it
> have died." That was really the impetus for [re-]investigating sysv
> shm. I, too, remember the sysv pain because we used it in LAM, too...

Unless there is something newer than IPC_RMID that I haven't heard of, this is far from a complete solution. Setting IPC_RMID causes the segment to be deleted when the attach count drops to zero, so it handles the kill -9 case; however, it has the downside that once the flag is set, no further processes can attach to the memory. You therefore have to leave a window during init before setting it, and any crash inside that window leaves the memory behind.

I've always been of the opinion that mmap'ing shared files is a much better solution.

Ashley Pittman.
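To make the window concrete, here is a minimal sketch of the SysV lifecycle being discussed (illustrative only, not Open MPI code): the segment can gain new attachers only until IPC_RMID is set, and a crash before that point leaves it behind.

    /* Illustrative sketch of the SysV shm lifecycle being discussed. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    int main(void)
    {
        size_t size = 1 << 20;                       /* 1 MB segment */

        /* Create the segment (bounded by /proc/sys/kernel/shmmax). */
        int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
        if (id < 0) { perror("shmget"); return 1; }

        /* Attach it.  A crash between shmget() and the IPC_RMID below
         * would leave the segment around -- this is the "window". */
        void *base = shmat(id, NULL, 0);
        if (base == (void *)-1) { perror("shmat"); return 1; }

        /* Mark for removal: the kernel deletes the segment when the attach
         * count reaches zero (even after kill -9), but from this point on
         * no additional process can attach to it. */
        if (shmctl(id, IPC_RMID, NULL) < 0) { perror("shmctl"); return 1; }

        memset(base, 0, size);                       /* use the memory */

        shmdt(base);                                 /* last detach frees it */
        return 0;
    }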
[OMPI devel] Open MPI 2009 released
The Open MPI Team, representing a consortium of bailed-out banks, car manufacturers, and insurance companies, is pleased to announce the release of the "unbreakable" / bug-free version Open MPI 2009 (expected to be available by mid-2011). This release is essentially a complete rewrite of Open MPI based on new technologies such as C#, Java, and object-oriented Cobol (so say we all!). Buffer overflows and memory leaks are now things of the past. We strongly recommend that all users upgrade to Windows 7 to fully take advantage of the new powers embedded in Open MPI.

This version can be downloaded from The Onion web site or from many BitTorrent networks (seeding now; the Open MPI ISO is approximately 3.97GB -- please wait for the full upload).

Here is an abbreviated list of changes in Open MPI 2009 as compared to the previous version:

- Dropped support for MPI 2 in favor of the newly enhanced MPI 11.7 standard. MPI_COOK_DINNER support is only available with additional equipment (some assembly may be required). An experimental PVM-like API has been introduced to deal with the current limitations of the MPI 11.7 API.
- Added a Twitter network transport capable of achieving peta-scale per second bandwidth (but only on useless data).
- Dropped support for the barely-used x86 and x86_64 architectures in favor of the most recent ARM6 architecture. As a direct result, several Top500 sites are planning to convert from their now-obsolete peta-scale machines to high-reliability iPhone clusters using the low-latency AT&T 3G network.
- The iPhone iMPI app (powered by iOpen MPI) is now downloadable from the iTunes Store. Blackberry support will be included in a future release.
- Fixed all compiler errors related to the PGI 8.0 compiler by completely dropping support.
- Added some "green" features for energy savings. The new "--bike" mpirun option will run your parallel jobs only during the operating hours of the official Open MPI biking team. The "--preload-result" option will directly embed the final result in the parallel execution, leading to more scalable and reliable runs and decreasing the execution time of any parallel application under the real-time limit of 1 second. Open MPI is therefore EnergyStar compliant when used with these options.
- In addition to moving Open MPI's lowest point-to-point transports to be an external project, limited support will be offered for industry-standard platforms. Our focus will now be to develop highly scalable transports based on widely distributed technologies such as SMTP, High Performance Gopher (v3.8 and later), OLE COMM, RSS/Atom, DNS, and Bonjour.
- Opportunistic integration with Conficker in order to utilize free resources distributed world-wide.
- Dropped support for all Fortran versions prior to Fortran 2020.

Make today an Open MPI day!
Re: [OMPI devel] SM init failures
So everyone hates SYSV. Ok. :-)

Given that some of the problems we've been having with mmap have been due to filesystem issues, should we just unlink() the file once all processes have mapped it? I believe we didn't do that originally for two reasons:

- leave it around for debugging purposes
- possibly supporting MPI-2 dynamics someday

We still don't support the sm BTL for dynamics, so why not unlink()? (I'm probably forgetting something obvious...?)

On Apr 1, 2009, at 5:12 PM, Ashley Pittman wrote:
> Unless there is something newer than IPC_RMID that I haven't heard of, this is far from a complete solution. [...]
> I've always been of the opinion that mmap'ing shared files is a much better solution.

--
Jeff Squyres
Cisco Systems
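For reference, the proposed pattern looks roughly like this (a minimal sketch with a made-up file path, not the actual sm BTL code): once every local process has mapped the backing file, the directory entry can be unlinked; the mapping stays valid until the last process exits, even on kill -9, and nothing is left behind in the filesystem.

    /* Illustrative sketch: create, map, then unlink a shared backing file. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        const char *path = "/tmp/sm_backing_file";   /* hypothetical path */
        size_t size = 1 << 20;

        int fd = open(path, O_CREAT | O_RDWR, 0600);
        if (fd < 0) { perror("open"); return 1; }
        if (ftruncate(fd, (off_t)size) < 0) { perror("ftruncate"); return 1; }

        void *base = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (base == MAP_FAILED) { perror("mmap"); return 1; }

        /* ... wait here until every local process has mapped the file ... */

        /* Once everyone is attached, the directory entry is no longer needed:
         * the pages live until the last mapping goes away, even on kill -9. */
        if (unlink(path) < 0) { perror("unlink"); return 1; }
        close(fd);

        /* ... use the shared memory through 'base' ... */

        munmap(base, size);
        return 0;
    }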
Re: [OMPI devel] SM init failures
IIRC, we certainly used to unlink the file after init. Are you sure somebody changed that?

On Apr 1, 2009, at 4:29 PM, Jeff Squyres wrote:
> So everyone hates SYSV. Ok. :-)
>
> Given that some of the problems we've been having with mmap have been due to filesystem issues, should we just unlink() the file once all processes have mapped it? [...]
>
> We still don't support the sm BTL for dynamics, so why not unlink()? (I'm probably forgetting something obvious...?)
Re: [OMPI devel] Open MPI 2009 released
Bravo!! This is beautiful. By far my favorite part is "Cobol (so say we all!)". However, I question why ARM6 was targeted as opposed to ARM7 ;-)

-Paul

George Bosilca wrote:
> The Open MPI Team, representing a consortium of bailed-out banks, car manufacturers, and insurance companies, is pleased to announce the release of the "unbreakable" / bug-free version Open MPI 2009. [...]
> Make today an Open MPI day!

--
Paul H. Hargrove                 phhargr...@lbl.gov
Future Technologies Group        Tel: +1-510-495-2352
HPC Research Department          Fax: +1-510-486-6900
Lawrence Berkeley National Laboratory
Re: [OMPI devel] Open MPI 2009 released
My wife thought it was frackin' brilliant. :)

-jms
Sent from my PDA. No type good.

- Original Message -
From: devel-boun...@open-mpi.org
To: Open MPI Developers
Sent: Wed Apr 01 18:58:55 2009
Subject: Re: [OMPI devel] Open MPI 2009 released

Bravo!! This is beautiful. By far my favorite part is "Cobol (so say we all!)". However, I question why ARM6 was targeted as opposed to ARM7 ;-)

-Paul

George Bosilca wrote:
> The Open MPI Team, representing a consortium of bailed-out banks, car manufacturers, and insurance companies, is pleased to announce the release of the "unbreakable" / bug-free version Open MPI 2009. [...]
> Make today an Open MPI day!
Re: [OMPI devel] SM init failures
On Apr 1, 2009, at 6:58 PM, Ralph Castain wrote:
> IIRC, we certainly used to unlink the file after init. Are you sure somebody changed that?

It looks like we unlink() it during btl sm component close (effectively during MPI_FINALIZE), not before.

--
Jeff Squyres
Cisco Systems
[OMPI devel] trac 1857: SM btl hangs when msg >=4k
In osu_bw, process 0 pumps lots of Isends to process 1, and process 1 in turn sets up lots of matching Irecvs. Many messages are in flight. The question is what happens when resources are exhausted and OMPI cannot handle so much in-flight traffic. Let's specifically consider the case of long, rendezvous messages. There are at least two situations:

1) When the sender no longer has any fragments (nor can grow its free list any more), it queues the send up with add_request_to_send_pending() and somehow life is good. The PML seems to handle this case "correctly".

2) When the receiver -- specifically mca_pml_ob1_recv_request_ack_send_btl() -- no longer has any fragments to send ACKs back to confirm readiness for rendezvous, the resource-exhaustion signal travels up the call stack to mca_pml_ob1_recv_request_ack_send(), which does a MCA_PML_OB1_ADD_ACK_TO_PENDING(). In short, the PML adds the ACK to pckt_pending. Somehow, this code path doesn't work.

The reason we see the problem now is that I added "autosizing" of the shared-memory area. We used to mmap *WAY* too much shared memory for small-np jobs. (Yes, that's a subjective statement.) Meanwhile, at large np, we didn't mmap enough and jobs wouldn't start. (Objective statement there.) So, I added heuristics to size the shared area "appropriately". The heuristics basically targeted the needs of MPI_Init(). If you want fragment free lists to grow on demand after MPI_Init(), you now basically have to bump mpool_sm_min_size up explicitly.

I'd like feedback on a fix. Here are two options:

A) Someone (could be I) increases the default resources. E.g., we could start with a larger eager free list. Or, I could change those "heuristics" to allow some amount of headroom for free lists to grow on demand. Either way, I'd appreciate feedback on how big to set these things.

B) Someone (not I, since I don't know how) fixes the ob1 PML to handle scenario 2 above correctly.
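For anyone unfamiliar with the intent of pckt_pending, the behavior scenario 2 is supposed to have -- park the ACK when no fragment is available and retry it when one is returned -- is sketched below with invented names (a toy illustration only, not the ob1 code):

    /* Toy sketch of "queue work when a fixed free list is exhausted, retry
     * later".  Names are invented for illustration; this is not the ob1 PML. */
    #include <stdbool.h>
    #include <stdio.h>

    #define FREE_FRAGS 2      /* deliberately tiny free list */
    #define NUM_ACKS   5

    static int frags_available = FREE_FRAGS;

    static int pending[NUM_ACKS];   /* ACKs we could not send yet */
    static int num_pending = 0;

    /* Try to send one ACK; fails when no fragment is available. */
    static bool try_send_ack(int ack)
    {
        if (frags_available == 0) {
            return false;                   /* resource exhaustion */
        }
        --frags_available;
        printf("sent ACK %d\n", ack);
        return true;
    }

    /* Called when a fragment is returned; must drain the pending queue,
     * otherwise the peer waits forever (the hang described in the ticket). */
    static void fragment_returned(void)
    {
        ++frags_available;
        while (num_pending > 0 && frags_available > 0) {
            int ack = pending[--num_pending];
            try_send_ack(ack);
        }
    }

    int main(void)
    {
        for (int i = 0; i < NUM_ACKS; ++i) {
            if (!try_send_ack(i)) {
                pending[num_pending++] = i; /* park it, like ADD_ACK_TO_PENDING */
                printf("queued ACK %d\n", i);
            }
        }
        /* Peer eventually returns fragments; progress must retry the queue. */
        for (int i = 0; i < NUM_ACKS; ++i) {
            fragment_returned();
        }
        return 0;
    }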