Thanks for your help. By any chance, do you have more info on that one? Or a faint idea of where I can find some info on that? I never found anything like that.

Best,
Suraj

On Feb 22, 2014, at 6:38 PM, Ralph Castain wrote:

>
> On Feb 22, 2014, at 9:30 AM, Suraj Prabhakaran <suraj.prabhaka...@gmail.com> wrote:
>
>> Thanks Ralph.
>>
>> I cannot get rid of Torque, since I am actually working on dynamic allocation of nodes for a running job on Torque. What I actually want to do is spawn processes on the dynamically assigned nodes, since that is the easiest way to expand MPI processes when a resource allocation is expanded.
>
> No problem - just set "-mca plm rsh" on your cmd line. You'll still use the Torque allocator, but use ssh to launch the processes.
>
>> I also considered whether any of my changes to the Torque daemons could be the problem, but it cannot be, for two reasons.
>>
>> 1. For the cases I have sent you, no dynamic allocation is done - just MPI_Comm_spawn on a normal allocation of resources. So my changes to Torque are irrelevant here, as they are not even called.
>> 2. Further, the processes start successfully on all the nodes. The Torque logs don't report any problems, and the processes do exist on all the nodes. And "sometimes" it all works without a problem!
>>
>> I am not sure whether many users really use MPI_Comm_spawn (spawning many processes) under the Torque environment, which may be why this has not surfaced as a problem. mpiexec itself works just fine for any number of processes.
>
> Not many - comm_spawn is only used by researchers on occasion. I haven't seen a "real" application yet, though we may just not have heard about it. Still, we do have users with Torque, and perhaps someone can check it.
>
>> Any suggestions or hints on this would be highly appreciated.
>> OpenMPI also seems to be the only implementation we can use for this work at the moment, because of the "add-host" info argument for MPI_Comm_spawn, which we are using comfortably when spawning onto dynamically allocated hosts that were not part of the original allocation.
>
> Yeah, we added those capabilities specifically for this purpose. Indeed, another researcher added this to Torque a couple of years ago, though it didn't get pushed upstream. It was also added to Slurm.
>
> Sadly, I no longer have access to a Torque machine, so I can only offer advice. OMPI is executing a state machine, so you could look at one of the procs on a machine where they are stalled (look for one not reporting out of the modex) and see where it hung. You can also watch it move through the state machine by setting
>
> -mca state_base_verbose 10
>
> on your command line.
>
> Happy to provide advice - sorry for the problem
> Ralph
>
>
>> Best,
>> Suraj
>>
>>
>> On Feb 22, 2014, at 4:30 PM, Ralph Castain wrote:
>>
>>>
>>> On Feb 21, 2014, at 5:55 PM, Suraj Prabhakaran <suraj.prabhaka...@gmail.com> wrote:
>>>
>>>> Hmm... but in actuality, the MPI_Comm_spawn of the parents and the MPI_Init of the children never returned!
>>>
>>> Understood - my point was that the output shows no errors or issues. For some reason, the progress thread appears to just stop. This usually indicates some kind of recursive behavior, but that isn't showing up in the output.
>>>
>>>> I configured MPI with
>>>>
>>>> ./configure --prefix=/dir/ --enable-debug --with-tm=/usr/local/
>>>
>>> Should be fine. I don't have access to a Torque-based system, and we aren't hearing of issues from other Torque users, so this may have something to do with how Torque is configured on your system. Perhaps someone with a Torque-based system on the list could also test this?
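For readers following the thread, a minimal sketch of the "add-host" usage being discussed might look like the following. This assumes a working Open MPI installation; the host name "node6", the child binary "./worker", and the child count are placeholders, not values from this thread. It requires an MPI build (mpicc) and an actual cluster to run, so treat it as illustrative only.

```c
/* Sketch: spawning onto a host that was not part of the original
 * allocation, using Open MPI's "add-host" info key as described above.
 * "node6" and "./worker" are placeholders - adjust for your cluster. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Comm intercomm;
    MPI_Info info;

    MPI_Init(&argc, &argv);

    MPI_Info_create(&info);
    /* Open MPI-specific key: add this host to the job's host pool,
     * then place the spawned processes on it. */
    MPI_Info_set(info, "add-host", "node6");
    MPI_Info_set(info, "host", "node6");

    /* Root rank 0 spawns 3 children on the newly added host. */
    MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 3, info, 0,
                   MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);

    MPI_Info_free(&info);
    MPI_Comm_disconnect(&intercomm);
    MPI_Finalize();
    return 0;
}
```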
>>>
>>> Meantime, I would suggest just using rsh/ssh for now (since you said that works), as Torque really isn't doing anything for you in this use case.
>>>
>>>>
>>>> On Feb 22, 2014, at 12:53 AM, Ralph Castain wrote:
>>>>
>>>>> Strange - it all looks just fine. How was OMPI configured?
>>>>>
>>>>> On Feb 21, 2014, at 3:31 PM, Suraj Prabhakaran <suraj.prabhaka...@gmail.com> wrote:
>>>>>
>>>>>> OK, I figured out that it was not a problem with the node grsacc04, because I have now run the same case on a totally different set of nodes.
>>>>>>
>>>>>> I must say that with the --bind-to none option, the program completed "many" more times than before, but it still "sometimes" hangs! Attached is the output of the same case run on a different set of nodes with the --bind-to none option.
>>>>>>
>>>>>> mpiexec -mca plm_base_verbose 5 -mca ess_base_verbose 5 -mca grpcomm_base_verbose 5 --bind-to none -np 3 ./example
>>>>>>
>>>>>> Best,
>>>>>> Suraj
>>>>>>
>>>>>> <output.rtf>
>>>>>>
>>>>>> On Feb 21, 2014, at 5:03 PM, Ralph Castain wrote:
>>>>>>
>>>>>>> Well, that all looks fine. However, I note that the procs on grsacc04 all stopped making progress at the same point, which is why the job hung. All the procs on the other nodes were just fine.
>>>>>>>
>>>>>>> So let's try a couple of things:
>>>>>>>
>>>>>>> 1. Add "--bind-to none" to your cmd line so we avoid any contention issues.
>>>>>>>
>>>>>>> 2. If possible, remove grsacc04 from the allocation (you can just use the -host option on the cmd line to ignore it), and/or replace that host with another one. Let's see if the problem has something to do with that specific node.
>>>>>>>
>>>>>>> On Feb 21, 2014, at 4:08 AM, Suraj Prabhakaran <suraj.prabhaka...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Right, so I have the output here.
>>>>>>>> Same case:
>>>>>>>>
>>>>>>>> mpiexec -mca plm_base_verbose 5 -mca ess_base_verbose 5 -mca grpcomm_base_verbose 5 -np 3 ./simple_spawn
>>>>>>>>
>>>>>>>> Output attached.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Suraj
>>>>>>>>
>>>>>>>> <output>
>>>>>>>>
>>>>>>>> On Feb 21, 2014, at 5:30 AM, Ralph Castain wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Feb 20, 2014, at 7:05 PM, Suraj Prabhakaran <suraj.prabhaka...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks Ralph!
>>>>>>>>>>
>>>>>>>>>> I should have mentioned: without the Torque environment, spawning with ssh works OK, but under the Torque environment it does not.
>>>>>>>>>
>>>>>>>>> Ah, no - you forgot to mention that point.
>>>>>>>>>
>>>>>>>>>> I started simple_spawn with 3 processes and spawned 9 processes (3 per node on 3 nodes).
>>>>>>>>>>
>>>>>>>>>> There is no problem with the Torque environment itself, because all 9 processes are started on the respective nodes. But the MPI_Comm_spawn of the parent and the MPI_Init of the children "sometimes" don't return!
>>>>>>>>>
>>>>>>>>> Seems odd - the launch environment has nothing to do with MPI_Init, so if the processes are indeed being started, they should run. One possibility is that they aren't correctly getting some wireup info.
>>>>>>>>>
>>>>>>>>> Can you configure OMPI with --enable-debug and then rerun the example with "-mca plm_base_verbose 5 -mca ess_base_verbose 5 -mca grpcomm_base_verbose 5" on the command line?
>>>>>>>>>
>>>>>>>>>> This is the output of simple_spawn, which confirms the above statement.
>>>>>>>>>>
>>>>>>>>>> [pid 31208] starting up!
>>>>>>>>>> [pid 31209] starting up!
>>>>>>>>>> [pid 31210] starting up!
>>>>>>>>>> 0 completed MPI_Init
>>>>>>>>>> Parent [pid 31208] about to spawn!
>>>>>>>>>> 1 completed MPI_Init
>>>>>>>>>> Parent [pid 31209] about to spawn!
>>>>>>>>>> 2 completed MPI_Init
>>>>>>>>>> Parent [pid 31210] about to spawn!
>>>>>>>>>> [pid 28630] starting up!
>>>>>>>>>> [pid 28631] starting up!
>>>>>>>>>> [pid 9846] starting up!
>>>>>>>>>> [pid 9847] starting up!
>>>>>>>>>> [pid 9848] starting up!
>>>>>>>>>> [pid 6363] starting up!
>>>>>>>>>> [pid 6361] starting up!
>>>>>>>>>> [pid 6362] starting up!
>>>>>>>>>> [pid 28632] starting up!
>>>>>>>>>>
>>>>>>>>>> Any hints?
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Suraj
>>>>>>>>>>
>>>>>>>>>> On Feb 21, 2014, at 3:44 AM, Ralph Castain wrote:
>>>>>>>>>>
>>>>>>>>>>> Hmmm... I don't see anything immediately glaring. What do you mean by "doesn't work"? Is there some specific behavior you see?
>>>>>>>>>>>
>>>>>>>>>>> You might try the attached program. It's a simple spawn test we use - 1.7.4 seems happy with it.
>>>>>>>>>>>
>>>>>>>>>>> <simple_spawn.c>
>>>>>>>>>>>
>>>>>>>>>>> On Feb 20, 2014, at 10:14 AM, Suraj Prabhakaran <suraj.prabhaka...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I am using 1.7.4!
>>>>>>>>>>>>
>>>>>>>>>>>> On Feb 20, 2014, at 7:00 PM, Ralph Castain wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> What OMPI version are you using?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Feb 20, 2014, at 7:56 AM, Suraj Prabhakaran <suraj.prabhaka...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am having a problem using MPI_Comm_spawn under Torque. It doesn't work when spawning more than 12 processes across various nodes. To be more precise, "sometimes" it works, and "sometimes" it doesn't!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here is my case. I obtain 5 nodes, 3 cores per node, and my $PBS_NODEFILE looks like below.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> node1
>>>>>>>>>>>>>> node1
>>>>>>>>>>>>>> node1
>>>>>>>>>>>>>> node2
>>>>>>>>>>>>>> node2
>>>>>>>>>>>>>> node2
>>>>>>>>>>>>>> node3
>>>>>>>>>>>>>> node3
>>>>>>>>>>>>>> node3
>>>>>>>>>>>>>> node4
>>>>>>>>>>>>>> node4
>>>>>>>>>>>>>> node4
>>>>>>>>>>>>>> node5
>>>>>>>>>>>>>> node5
>>>>>>>>>>>>>> node5
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I started a hello program (which just spawns itself; of course, the children don't spawn again) with
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> mpiexec -np 3 ./hello
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Spawning 3 more processes (on node 2) - works!
>>>>>>>>>>>>>> Spawning 6 more processes (nodes 2 and 3) - works!
>>>>>>>>>>>>>> Spawning 9 processes (nodes 2, 3, 4) - "sometimes" OK, "sometimes" not!
>>>>>>>>>>>>>> Spawning 12 processes (nodes 2, 3, 4, 5) - "mostly" not!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I ideally want to spawn about 32 processes across a larger number of nodes, but this is at the moment impossible. I have attached my hello program to this email.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I will be happy to provide any more info or verbose outputs if you could please tell me what exactly you would like to see.
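[The hello.c attachment itself is not preserved in this archive. A hypothetical sketch of a self-spawning test matching the description above (parent ranks collectively spawn copies of the binary; children detect via MPI_Comm_get_parent that they were spawned and do not spawn again) might look like this. It needs an MPI build to compile and run, and the spawn count 12 is just the "mostly hangs" case from the report.]

```c
/* Hypothetical reconstruction of a self-spawning "hello" test as
 * described above - not the actual attached hello.c. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Comm parent, intercomm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_get_parent(&parent);

    if (parent == MPI_COMM_NULL) {
        /* Original processes: collectively spawn 12 copies of this
         * binary (the case reported to hang "mostly"). */
        MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 12, MPI_INFO_NULL, 0,
                       MPI_COMM_WORLD, &intercomm, MPI_ERRCODES_IGNORE);
        printf("parent %d: MPI_Comm_spawn returned\n", rank);
        MPI_Comm_disconnect(&intercomm);
    } else {
        /* Spawned children: report that MPI_Init completed, then exit
         * without spawning again. */
        printf("child %d: MPI_Init returned\n", rank);
        MPI_Comm_disconnect(&parent);
    }

    MPI_Finalize();
    return 0;
}
```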
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Suraj
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> <hello.c>
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> devel mailing list
>>>>>>>>>>>>>> de...@open-mpi.org
>>>>>>>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel