On 31.01.2012, at 20:38, Ralph Castain wrote:

> Not sure I fully grok this thread, but will try to provide an answer.
>
> When you start a singleton, it spawns off a daemon that is the equivalent of
> "mpirun". This daemon is created for the express purpose of allowing the
> singleton to use MPI dynamics like comm_spawn - without it, the singleton
> would be unable to execute those functions.
>
> The first thing the daemon does is read the local allocation, using the same
> methods as used by mpirun. So whatever allocation is present that mpirun
> would have read, the daemon will get. This includes hostfiles and SGE
> allocations.
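For context before my questions below: Mpitest is a self-spawning program. Its source isn't part of this thread, so the following is only a minimal sketch of the pattern I assume it uses - the "--child" argument visible in the process listings further down points to MPI_Comm_spawn:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    /* Hypothetical sketch, not the real Mpitest: when the program finds no
     * parent communicator (i.e. it was started directly, as a singleton or
     * via "mpiexec -np 1"), it spawns three more copies of itself with a
     * "--child" marker argument - matching the 2+2 slot allocation used in
     * the examples below. */
    MPI_Comm parent;

    MPI_Init(&argc, &argv);
    MPI_Comm_get_parent(&parent);

    if (parent == MPI_COMM_NULL) {
        MPI_Comm children;
        char *child_argv[] = { "--child", NULL };

        /* The spawned processes should be placed according to the
         * allocation the daemon discovered (hostfile or SGE). */
        MPI_Comm_spawn(argv[0], child_argv, 3, MPI_INFO_NULL, 0,
                       MPI_COMM_SELF, &children, MPI_ERRCODES_IGNORE);
        printf("parent: spawned 3 children\n");
    } else {
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        printf("child: rank %d in the spawned job\n", rank);
    }

    MPI_Finalize();
    return 0;
}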
So it should also honor the default hostfile of Open MPI when running outside of SGE, i.e. when started from the command line?

> The exception to this is when the singleton gets started in an altered
> environment - e.g., if SGE changes the environmental variables when launching
> the singleton process. We see this in some resource managers - you can get an
> allocation of N nodes, but when you launch a job, the envar in that job only
> indicates the number of nodes actually running processes in that job. In such
> a situation, the daemon will see the altered value as its "allocation",
> potentially causing confusion.

I'm not sure whether I get this right. When I launch the same application with
"mpiexec -np 1 ./Mpitest" (and get an allocation of 2+2 on the two machines):

27422 ?  Sl   4:12 /usr/sge/bin/lx24-x86/sge_execd
 9504 ?  S    0:00  \_ sge_shepherd-3791 -bg
 9506 ?  Ss   0:00      \_ /bin/sh /var/spool/sge/pc15370/job_scripts/3791
 9507 ?  S    0:00          \_ mpiexec -np 1 ./Mpitest
 9508 ?  R    0:07              \_ ./Mpitest
 9509 ?  Sl   0:00              \_ /usr/sge/bin/lx24-x86/qrsh -inherit -nostdin -V pc15381 orted -mca
 9513 ?  S    0:00              \_ /home/reuti/mpitest/Mpitest --child

 2861 ?  Sl  10:47 /usr/sge/bin/lx24-x86/sge_execd
25434 ?  Sl   0:00  \_ sge_shepherd-3791 -bg
25436 ?  Ss   0:00      \_ /usr/sge/utilbin/lx24-x86/qrsh_starter /var/spool/sge/pc15381/active_jobs/3791.1/1.pc15381
25444 ?  S    0:00          \_ orted -mca ess env -mca orte_ess_jobid 821952512 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri
25447 ?  S    0:01              \_ /home/reuti/mpitest/Mpitest --child
25448 ?  S    0:01              \_ /home/reuti/mpitest/Mpitest --child

This is what I expect: the main process plus one child on the first node, and
two children on the other node. Now I launch the same thing as a singleton
(nothing else changed, still 2+2 granted), "./Mpitest", and get:

27422 ?  Sl   4:12 /usr/sge/bin/lx24-x86/sge_execd
 9546 ?  S    0:00  \_ sge_shepherd-3793 -bg
 9548 ?  Ss   0:00      \_ /bin/sh /var/spool/sge/pc15370/job_scripts/3793
 9549 ?  R    0:00          \_ ./Mpitest
 9550 ?  Ss   0:00              \_ orted --hnp --set-sid --report-uri 6 --singleton-died-pipe 7
 9551 ?  Sl   0:00                  \_ /usr/sge/bin/lx24-x86/qrsh -inherit -nostdin -V pc15381 orted
 9554 ?  S    0:00                  \_ /home/reuti/mpitest/Mpitest --child
 9555 ?  S    0:00                  \_ /home/reuti/mpitest/Mpitest --child

 2861 ?  Sl  10:47 /usr/sge/bin/lx24-x86/sge_execd
25494 ?  Sl   0:00  \_ sge_shepherd-3793 -bg
25495 ?  Ss   0:00      \_ /usr/sge/utilbin/lx24-x86/qrsh_starter /var/spool/sge/pc15381/active_jobs/3793.1/1.pc15381
25502 ?  S    0:00          \_ orted -mca ess env -mca orte_ess_jobid 814940160 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri
25503 ?  S    0:00              \_ /home/reuti/mpitest/Mpitest --child

Only one child is placed on the other node, i.e. I end up with 3+1 instead of
the granted 2+2. The environment is the same in both cases. Is this the
correct behavior?

-- Reuti

> For this reason, I generally recommend that you run dynamic applications
> using mpirun when operating in RM-managed environments to avoid confusion. Or
> at least use "printenv" to check that the envars are going to be right before
> trying to start from a singleton.
>
> HTH
> Ralph
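Regarding the printenv check: the environments really are identical in both runs. What I dump before starting (or right before MPI_Init) is essentially the following - only a sketch, and the variable names are just the ones SGE normally sets for a parallel-environment job, nothing Open MPI- or Mpitest-specific:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Sketch only: print the SGE variables the singleton's daemon is
     * presumably going to look at for its allocation. Standard SGE names;
     * extend the list as needed. */
    const char *vars[] = { "JOB_ID", "NSLOTS", "NHOSTS", "PE_HOSTFILE" };
    size_t i;

    for (i = 0; i < sizeof(vars) / sizeof(vars[0]); i++) {
        const char *val = getenv(vars[i]);
        printf("%-12s = %s\n", vars[i], val ? val : "(unset)");
    }
    return 0;
}

If these (and the PE_HOSTFILE content) match between the two runs, the different placement shouldn't be an environment issue.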
>
> On Jan 31, 2012, at 12:19 PM, Reuti wrote:
>
>> On 31.01.2012, at 20:12, Jeff Squyres wrote:
>>
>>> I only noticed after the fact that Tom is also here at Cisco (it's a big
>>> company, after all :-) ).
>>>
>>> I've contacted him using our proprietary super-secret Cisco handshake
>>> (i.e., the internal phone network); I'll see if I can figure out the issues
>>> off-list.
>>
>> But I would be interested in a statement about a hostlist for singleton
>> startups, or whether it's honoring the tight-integration nodes more by
>> accident than by design. And as said: I see a wrong allocation, as the
>> initial ./Mpitest doesn't count as a process. I get a 3+1 allocation instead
>> of 2+2 (which is what SGE granted). If started with "mpiexec -np 1 ./Mpitest"
>> all is fine.
>>
>> -- Reuti
>>
>>> On Jan 31, 2012, at 1:08 PM, Dave Love wrote:
>>>
>>>> Reuti <re...@staff.uni-marburg.de> writes:
>>>>
>>>>> Maybe it's a side effect of a tight integration that it would start on
>>>>> the correct nodes (but I face an incorrect allocation of slots and an
>>>>> error message at the end if started without mpiexec), as in this case
>>>>> it has no command line option for the hostfile. How to get the
>>>>> requested nodes if started from the command line?
>>>>
>>>> Yes, I wouldn't expect it to work without mpirun/mpiexec and, of course,
>>>> I basically agree with Reuti about the rest.
>>>>
>>>> If there is an actual SGE problem or need for an enhancement, though,
>>>> please file it per https://arc.liv.ac.uk/trac/SGE#mail
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/