Ralph,

If you don't mind, I would like to understand this issue a little better. What exactly is broken in the termination detection?

From a network point of view, there is a slight issue with commit r25245. A direct call to exit will close all pending sockets, with a linger of 60 seconds (quite bad if you use static ports, for example). There are proper protocols to shut down sockets in a reliable way; maybe it is time to implement one of them.

Thanks,
george.

On Oct 10, 2011, at 12:40 , Ralph Castain wrote:

> It wasn't the launcher that was broken, but termination detection, and not
> for all environments (e.g., worked fine for slurm). It is a progress-related
> issue.
>
> Should be fixed in r25245.
>
>
> On Oct 10, 2011, at 8:33 AM, Shamis, Pavel wrote:
>
>> + 1 , I see the same issue.
>>
>>> -----Original Message-----
>>> From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org]
>>> On Behalf Of Yevgeny Kliteynik
>>> Sent: Monday, October 10, 2011 10:24 AM
>>> To: OpenMPI Devel
>>> Subject: [OMPI devel] Launcher in trunk is broken?
>>>
>>> It looks like the process launcher is broken in the OMPI trunk:
>>> If you run any simple test (not necessarily including MPI calls) on 4 or
>>> more nodes, the MPI processes won't be killed after the test finishes.
>>>
>>> $ mpirun -host host_1,host_2,host_3,host_4 -np 4 --mca btl sm,tcp,self
>>> /bin/hostname
>>>
>>> Output:
>>> host_1
>>> host_2
>>> host_3
>>> host_4
>>>
>>> And test is hanging......
>>>
>>> I have an older trunk (r25228), and everything is OK there.
>>> Not sure if it means that something was broken after that, or the problem
>>> existed before, but kicked in only now due to some other change.
>>>
>>> -- YK
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
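
To make the socket shutdown point above concrete, here is a minimal sketch of the usual orderly close for a connected stream socket: half-close the write side, drain until the peer's EOF, then close. It uses only plain POSIX calls. The helper name graceful_close is hypothetical and is not an existing OMPI function, and the sketch assumes a blocking socket whose peer follows the same protocol.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of Open MPI: orderly shutdown of a
 * connected, blocking stream socket.
 * Returns 0 if the peer's EOF was seen and the descriptor closed, -1 on error. */
int graceful_close(int fd)
{
    char buf[4096];
    ssize_t n;
    int rc = 0;

    /* Step 1: announce that we will send nothing more; reading still works. */
    if (shutdown(fd, SHUT_WR) < 0 && errno != ENOTCONN) {
        rc = -1;
    }

    /* Step 2: discard whatever the peer still sends, until it signals EOF. */
    while (rc == 0) {
        n = recv(fd, buf, sizeof(buf), 0);
        if (n == 0) {
            break;              /* peer finished sending */
        }
        if (n < 0 && errno != EINTR) {
            rc = -1;            /* real error, stop draining */
        }
    }

    /* Step 3: release the descriptor in every case. */
    if (close(fd) < 0) {
        rc = -1;
    }
    return rc;
}
```

As written, the drain loop blocks until the peer closes its side, so real code would probably want a timeout (for example SO_RCVTIMEO or poll) before falling back to a plain close. The point of the sequence is that both sides agree the connection is finished before either descriptor is released, instead of leaving that to whatever exit does.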