Interesting idea.

One obvious solution would be to mpirun your controller tasks and, as you mentioned, use MPI to communicate between them. Then you can use MPI_COMM_SPAWN to launch the actual MPI job that you want to monitor.
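Roughly, the controller side could look something like this (the target binary name and the process count are just placeholders; all controllers call MPI_Comm_spawn collectively):

#include <mpi.h>

int main(int argc, char *argv[])
{
    MPI_Comm children;
    int errcodes[4];                 /* one slot per spawned process */

    MPI_Init(&argc, &argv);

    /* Collective over the controllers' MPI_COMM_WORLD: spawn 4 copies of
       the target application and get back an intercommunicator to them. */
    MPI_Comm_spawn("target_app", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
                   0 /* root */, MPI_COMM_WORLD, &children, errcodes);

    /* ...controllers talk to each other over MPI_COMM_WORLD and to the
       target job over the "children" intercommunicator... */

    MPI_Comm_free(&children);
    MPI_Finalize();
    return 0;
}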

However, this will only more-or-less work. OMPI currently polls aggressively to make message passing progress, so if you end up over-subscribing nodes (because you filled up the cores on one node with all the target MPI processes but also have 1 or more controller processes running on the same node), they'll thrash each other and you'll get -- at best -- unreliable/unrepeatable performance fraught with lots of race conditions.

Another issue is that OMPI's MPI_COMM_SPAWN does not give you good options for specific process placement, so it might be a little dicey to get processes to land exactly where you want them.

Alternatively, you could simply fork()/exec() your target process locally from the controller. But the MPI spec does state that the behavior of fork() within an MPI process is undefined. Indeed, if you are using a high-speed network such as InfiniBand or Myrinet and you call fork() after MPI_INIT, Bad Things(tm) will happen (we can explain more if you care). But if you're only using TCP, you should be fine.
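A bare-bones sketch of that local fork()/exec() route, just to illustrate (the binary path is a placeholder, and I'm omitting the DynInst attach and any MPI calls in the controller):

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return EXIT_FAILURE;
    }
    if (pid == 0) {
        /* Child: becomes the target MPI process (it calls MPI_Init itself).
           "./target_app" is just a placeholder path. */
        execl("./target_app", "target_app", (char *) NULL);
        perror("execl");             /* only reached if the exec fails */
        _exit(EXIT_FAILURE);
    }

    /* Parent (controller): attach with DynInst, instrument, etc., and
       eventually reap the child. */
    waitpid(pid, NULL, 0);
    return EXIT_SUCCESS;
}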

Another option might be to mpirun your target MPI app, have it wait in some kind of local barrier, and then mpirun your controllers on the same machines. The controllers find/attach to your target processes, release them from the local barrier, and then you're good to go -- both your controllers and your target app are fully up and running under MPI. You'll still have the spinning/performance issue, though -- so you won't want to oversubscribe nodes.
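One way to implement that local barrier on the target side -- just a sketch, the flag name is made up -- is to have the target spin on a flag right after MPI_Init that the controller flips (e.g., via DynInst) once it has attached:

#include <mpi.h>
#include <unistd.h>

/* The controller (e.g. via DynInst) sets this to 1 once it has attached. */
volatile int controller_released = 0;

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    /* "Local barrier": park here until the controller releases us. */
    while (!controller_released) {
        usleep(100000);              /* sleep rather than busy-spin */
    }

    /* ...normal application code... */

    MPI_Finalize();
    return 0;
}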

Does this help?


On Oct 1, 2007, at 10:49 PM, Oleg Morajko wrote:

Hello,

In the context of my PhD research, I have been developing a run-time performance analyzer for MPI-based applications. My tool provides a controller process for each MPI task. In particular, when an MPI job is started, a special wrapper script is generated that first starts my controller processes, and then each controller spawns an actual MPI task (which performs MPI_Init, etc.). I use the dynamic instrumentation API (DynInst) to control and instrument MPI tasks.

The point is that my controller processes need to communicate with each other; in particular, I need point-to-point communication between arbitrary pairs of controllers. So it seems reasonable to take advantage of MPI itself and use it for this communication. However, I am not sure what the impact of calling MPI_Init and communicating from the controller processes would be, given that both the controllers and the actual MPI processes were started with the same mpirun invocation. Actually, I would need to ensure that the controllers have a separate MPI execution environment, while the application has another one.

Any suggestions on how to achieve that? Obviously, another option is to use sockets for the controllers to communicate, but with MPI available that seems like overkill.

Thank you in advance for your help.

Regards,
--Oleg

PhD student, Universitat Autonoma de Barcelona, Spain



--
Jeff Squyres
Cisco Systems
