Thanks Ralph.
Tetsuya

> I tracked it down - not Torque specific, but impacts all managed
> environments. Will fix
>
> On Apr 1, 2014, at 2:23 AM, tmish...@jcity.maeda.co.jp wrote:
>
> > Hi Ralph,
> >
> > I saw another hangup with openmpi-1.8 when I used more than 4 nodes
> > (having 8 cores each) in a managed environment under Torque. Although
> > I'm not sure you can reproduce it with SLURM, at least with Torque it
> > can be reproduced in this way:
> >
> > [mishima@manage ~]$ qsub -I -l nodes=4:ppn=8
> > qsub: waiting for job 8726.manage.cluster to start
> > qsub: job 8726.manage.cluster ready
> >
> > [mishima@node09 ~]$ mpirun -np 65 ~/mis/openmpi/demos/myprog
> > --------------------------------------------------------------------------
> > There are not enough slots available in the system to satisfy the 65
> > slots that were requested by the application:
> >   /home/mishima/mis/openmpi/demos/myprog
> >
> > Either request fewer slots for your application, or make more slots
> > available for use.
> > --------------------------------------------------------------------------
> > <<< HANG HERE!! >>>
> > Abort is in progress...hit ctrl-c again within 5 seconds to forcibly
> > terminate
> >
> > I found this behavior when I happened to input the wrong number of
> > procs. With fewer than 4 nodes, or with rsh (i.e. in an unmanaged
> > state), it works. I'm afraid I have no idea how to resolve it. I hope
> > you can fix the problem.
> >
> > Tetsuya
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/04/14438.php
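For reference, the test program ~/mis/openmpi/demos/myprog is never shown in the thread. Any trivial MPI program should exercise the same code path, since mpirun's slot check rejects the oversubscribed request before the application binary is ever launched. A minimal stand-in, assuming a plain C MPI "hello world" built with mpicc (the name myprog.c is only illustrative), might look like this:

    /* Hypothetical stand-in for ~/mis/openmpi/demos/myprog; the real
     * program is not shown in the thread. Any MPI program works here,
     * because the "not enough slots" error (and the reported hang) occurs
     * in mpirun before this code ever runs. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Each rank reports itself so a successful launch is visible. */
        printf("rank %d of %d\n", rank, size);

        MPI_Finalize();
        return 0;
    }

Built with, for example, "mpicc -o myprog myprog.c", running "mpirun -np 65 ./myprog" inside the 4-node x 8-core allocation (32 slots total) asks for more slots than are available, which should trigger the "not enough slots" error and, with the bug described above, the subsequent hang during abort.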