Re: [OMPI users] Open MPI 1.4.2 released
On 05/28/2010 08:20 AM, Jeff Squyres wrote:
> On May 16, 2010, at 5:21 AM, Aleksej Saushev wrote:
> > http://cvsweb.netbsd.org/bsdweb.cgi/pkgsrc/parallel/openmpi/patches/
>
> Sorry for the high latency reply...
>
> aa: We haven't added RPATH support yet. We've talked about it but never
> done it. There are some in OMPI who insist that rpath support needs to
> be optional. A full patch solution would be appreciated.

We have problems with rpath overriding LD_RUN_PATH. LD_RUN_PATH is an intrinsic part of the way we configure our users' environment. We effectively use (impose) rpath, but through the flexible, concatenatable LD_RUN_PATH.

David
Re: [hwloc-users] hwloc on systems with more than 64 cpus?
On Thursday 27 May 2010 11:47:25 pm Brice Goglin wrote:
> On 27/05/2010 23:28, Jirka Hladky wrote:
> >> hwloc-calc doesn't accept input from stdin; it only reads the
> >> command line. We have a TODO entry about this, I'll work on it soon.
> >>
> >> For now, you can do:
> >> hwloc-distrib ... | xargs -n 1 utils/hwloc-calc
> >
> > I forgot to use the "-n 1" switch in xargs to send only 1 cpu set per
> > hwloc-calc command.
> >
> > This works just fine: :-)
> > hwloc-distrib --single 8 | xargs -n1 hwloc-calc --taskset
> >
> > Perhaps you can add this example to the hwloc-distrib man page?
>
> I've added the stdin support to hwloc-calc, so I don't think it matters
> anymore: "hwloc-distrib --single 8 | hwloc-calc --taskset" should do
> what you want. I'll add something like this to the manpage.
>
> Brice

Great, thanks!
Jirka
Re: [OMPI users] Open MPI 1.4.2 released
On May 16, 2010, at 5:21 AM, Aleksej Saushev wrote:
> http://cvsweb.netbsd.org/bsdweb.cgi/pkgsrc/parallel/openmpi/patches/

Sorry for the high latency reply...

aa: We haven't added RPATH support yet. We've talked about it but never done it. There are some in OMPI who insist that rpath support needs to be optional. A full patch solution would be appreciated.

ab: This should now be moot on the dev trunk as of r23158. It won't go to v1.4, but it is slated for the v1.5 series. I was waiting for your reply to my off-list pings on testing this stuff before I filed a v1.5 CMR, but I just went ahead and filed one anyway: https://svn.open-mpi.org/trac/ompi/ticket/2423.

ac: ditto to ab
ad: ditto to ab
ae: ditto to ab
af: ditto to ab -- but I might have missed this one. Can you test?
ag: ditto to ab -- but I might have missed this one. Can you test?

ah: this should be applied -- did we miss it? Gah! I just checked and it didn't go. What the heck happened here... (checking) I see that it went into v1.5. It supposedly went into v1.4 in r22890. Gahh! It looks like the commit message on r22890 *says* it put in r22640, but it didn't actually *do* it. :-(

ag: should be moot by ab, above.

ai: I think you explained this to me before, but I forget (sorry!). These are actually configuration files, not example files. Hence, we install them into sysconfdir. Is this a difference of definitions, somehow? (i.e., what you define as usage policies for exampledir and sysconfdir)

aj: ditto to ai
ak: ditto to ai

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] request_get_status: Recheck request status [PATCH]
Thanks for the ping -- sorry it took so long! Committed to the SVN trunk in r23215; I filed CMRs for v1.4 and v1.5. It's technically not a bug, so I don't know if the v1.4 RMs will allow it.

On May 27, 2010, at 12:02 PM, Shaun Jackman wrote:
> Ping.
>
> On Tue, 2010-05-04 at 14:06 -0700, Shaun Jackman wrote:
> > Hi Jeff,
> >
> > request_get_status polls request->req_complete before calling
> > opal_progress. Ideally, it would check req_complete, call
> > opal_progress, and check req_complete one final time. This patch
> > identically mirrors the logic of ompi_request_default_test in
> > ompi/request/req_test.c.
> >
> > We've discussed this patch on the mailing list previously. I think we
> > both agreed it was a good idea, but it never made it around to being
> > applied.
> >
> > Cheers,
> > Shaun
> >
> > 2009-09-14  Shaun Jackman
> >
> > * ompi/mpi/c/request_get_status.c (MPI_Request_get_status):
> > If opal_progress is called then check the status of the request
> > before returning. opal_progress is called only once. This logic
> > parallels MPI_Test (ompi_request_default_test).
> >
> > --- ompi/mpi/c/request_get_status.c.orig 2008-11-04 12:56:27.0 -0800
> > +++ ompi/mpi/c/request_get_status.c 2009-09-24 15:30:09.99585 -0700
> > @@ -41,6 +41,10 @@
> >  int MPI_Request_get_status(MPI_Request request, int *flag,
> >                             MPI_Status *status)
> >  {
> > +#if OMPI_ENABLE_PROGRESS_THREADS == 0
> > +    int do_it_once = 0;
> > +#endif
> > +
> >      MEMCHECKER(
> >          memchecker_request();
> >      );
> > @@ -57,6 +61,9 @@
> >          }
> >      }
> >
> > +#if OMPI_ENABLE_PROGRESS_THREADS == 0
> > + recheck_request_status:
> > +#endif
> >      opal_atomic_mb();
> >      if( (request == MPI_REQUEST_NULL) ||
> >          (request->req_state == OMPI_REQUEST_INACTIVE) ) {
> >          *flag = true;
> > @@ -78,9 +85,17 @@
> >          }
> >          return MPI_SUCCESS;
> >      }
> > -    *flag = false;
> >  #if OMPI_ENABLE_PROGRESS_THREADS == 0
> > -    opal_progress();
> > +    if( 0 == do_it_once ) {
> > +        /**
> > +         * If we run the opal_progress then check the status of the
> > +         * request before leaving. We will call the opal_progress
> > +         * only once per call.
> > +         */
> > +        opal_progress();
> > +        do_it_once++;
> > +        goto recheck_request_status;
> > +    }
> >  #endif
> > +    *flag = false;

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] request_get_status: Recheck request status [PATCH]
Ping.

On Tue, 2010-05-04 at 14:06 -0700, Shaun Jackman wrote:
> Hi Jeff,
>
> request_get_status polls request->req_complete before calling
> opal_progress. Ideally, it would check req_complete, call
> opal_progress, and check req_complete one final time. This patch
> identically mirrors the logic of ompi_request_default_test in
> ompi/request/req_test.c.
>
> We've discussed this patch on the mailing list previously. I think we
> both agreed it was a good idea, but never made it around to being
> applied.
>
> Cheers,
> Shaun
>
> 2009-09-14  Shaun Jackman
>
> * ompi/mpi/c/request_get_status.c (MPI_Request_get_status):
> If opal_progress is called then check the status of the request
> before returning. opal_progress is called only once. This logic
> parallels MPI_Test (ompi_request_default_test).
>
> --- ompi/mpi/c/request_get_status.c.orig 2008-11-04 12:56:27.0 -0800
> +++ ompi/mpi/c/request_get_status.c 2009-09-24 15:30:09.99585 -0700
> @@ -41,6 +41,10 @@
>  int MPI_Request_get_status(MPI_Request request, int *flag,
>                             MPI_Status *status)
>  {
> +#if OMPI_ENABLE_PROGRESS_THREADS == 0
> +    int do_it_once = 0;
> +#endif
> +
>      MEMCHECKER(
>          memchecker_request();
>      );
> @@ -57,6 +61,9 @@
>          }
>      }
>
> +#if OMPI_ENABLE_PROGRESS_THREADS == 0
> + recheck_request_status:
> +#endif
>      opal_atomic_mb();
>      if( (request == MPI_REQUEST_NULL) ||
>          (request->req_state == OMPI_REQUEST_INACTIVE) ) {
>          *flag = true;
> @@ -78,9 +85,17 @@
>          }
>          return MPI_SUCCESS;
>      }
> -    *flag = false;
>  #if OMPI_ENABLE_PROGRESS_THREADS == 0
> -    opal_progress();
> +    if( 0 == do_it_once ) {
> +        /**
> +         * If we run the opal_progress then check the status of the
> +         * request before leaving. We will call the opal_progress
> +         * only once per call.
> +         */
> +        opal_progress();
> +        do_it_once++;
> +        goto recheck_request_status;
> +    }
>  #endif
> +    *flag = false;
Re: [OMPI users] Some Questions on Building OMPI on Linux Em64t
On May 26, 2010, at 3:32 PM, Michael E. Thomadakis wrote:
> How do you handle thread/task and memory affinity? Do you pass the
> requested affinity desires to the batch scheduler and then let it issue
> the specific placements for threads to the nodes?

Not as of yet, no. At the moment, Open MPI only obeys its own affinity settings, usually passed via mpirun (see mpirun(1)).

> This is something we are concerned about, as we are running multiple
> jobs on the same node and we don't want to oversubscribe cores by
> binding their threads inadvertently.
>
> Looking at ompi_info:
> $ ompi_info | grep -i aff
>    MCA paffinity: linux (MCA v2.0, API v2.0, Component v1.4.2)
>    MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.4.2)
>
> does this mean we have the full affinity support included, or do I need
> to involve HWLOC in any way?

Yes, Open MPI processes can bind themselves to sockets / cores. The 1.4 series uses PLPA behind the scenes for processor affinity stuff (the first_use stuff is for memory affinity). The 1.5 series will eventually use hwloc (we just recently imported it into our development trunk, but it's still "soaking" before moving over to the v1.5 branch; we've found at least one minor problem so far). It'll likely be there for the v1.5.1 series.

That being said, you can certainly ignore OMPI's intrinsic binding capabilities and use a standalone program like hwloc-bind or taskset to bind MPI processes.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] Deadlock question
On May 24, 2010, at 20:27, Eugene Loh wrote:
> Gijsbert Wiesenekker wrote:
>
>> My MPI program consists of a number of processes that send 0 or more
>> messages (using MPI_Isend) to 0 or more other processes. The processes
>> check periodically if messages are available to be processed. It was
>> running fine until I increased the message size, and I got deadlock
>> problems. Googling taught me I was running into a classic deadlock
>> problem (see for example
>> http://www.cs.ucsb.edu/~hnielsen/cs140/mpi-deadlocks.html). The
>> workarounds suggested, like changing the order of MPI_Send and
>> MPI_Recv, do not work in my case, as it could be that one process does
>> not send any message at all to the other processes, so MPI_Recv would
>> wait indefinitely.
>> Any suggestions on how to avoid deadlock in this case?
>>
> The problems you describe would seem to arise with blocking functions
> like MPI_Send and MPI_Recv. With the non-blocking variants
> MPI_Isend/MPI_Irecv, there shouldn't be this problem. There should be
> no requirement of ordering the functions in the way that web page
> describes... that workaround is suggested for the blocking calls. It
> feels to me that something is missing from your description.
>
> If you know the maximum size any message will be, you can post an
> MPI_Irecv with wildcard tags and source ranks. You can post MPI_Isend
> calls for whatever messages you want to send. You can use MPI_Test to
> check if any message has been received; if so, process the received
> message and re-post the MPI_Irecv. You can use MPI_Test to check if any
> send messages have completed; if so, you can reuse those send buffers.
> You need some signal to indicate to processes that no further messages
> will be arriving.

My program was running fine using the methods you describe (MPI_Isend/MPI_Test/MPI_Irecv), until I increased the message size. My program was not running very efficiently because of the MPI overhead associated with sending/receiving a large number of small messages, so I decided to combine messages before sending them, and then I got the deadlock problems: the MPI_Test calls never returned true, so the MPI_Isend calls never completed. As described at the link given above, the reason was that I exhausted the MPI system buffer space, in combination with the unsafe ordering of the send/receive calls (but I cannot see how I can change that order given the nature of my program). See for example also http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp?topic=/com.ibm.cluster.pe.doc/pe_422/am10600481.html:

'Destination buffer space unavailability cannot cause a safe MPI program to fail, but could cause hangs in unsafe MPI programs. An unsafe program is one that assumes MPI can guarantee system buffering of sent data until the receive is posted.'

Gijsbert