Ralph,

If you don't mind, I would like to understand this issue a little better. What
exactly is broken in the termination detection?

From a network point of view, there is a slight issue with commit r25245. A
direct call to exit will close all pending sockets with a linger of 60
seconds (quite bad if you use static ports, for example). There are proper
protocols to shut down sockets reliably; maybe it is time to implement
one of them.
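
To make the idea concrete, here is a minimal sketch of one such protocol, an
orderly half-close: stop sending, wait for the peer to finish, then close.
The helper name graceful_close and its timeout parameter are hypothetical and
are not part of the Open MPI code base; the sketch assumes a connected stream
socket and applies the timeout to each wait rather than to the whole
operation.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not an Open MPI function).
 * Returns 0 if the peer's end-of-stream was observed, 1 if a wait timed
 * out, and -1 on error. The descriptor is closed in every case. */
static int graceful_close(int fd, int timeout_ms)
{
    char buf[4096];
    int rc = -1;

    /* Step 1: announce that we will send nothing more; reads still work. */
    if (shutdown(fd, SHUT_WR) < 0) {
        close(fd);
        return -1;
    }

    /* Step 2: drain incoming data until the peer closes its side. */
    for (;;) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int n = poll(&pfd, 1, timeout_ms);
        if (n == 0) {               /* peer did not answer in time */
            rc = 1;
            break;
        }
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            break;
        }

        ssize_t got = recv(fd, buf, sizeof buf, 0);
        if (got == 0) {             /* orderly end-of-stream from the peer */
            rc = 0;
            break;
        }
        if (got < 0) {
            if (errno == EINTR)
                continue;
            break;
        }
        /* got > 0: late data from the peer, discarded in this sketch */
    }

    /* Step 3: release the descriptor. */
    close(fd);
    return rc;
}
```

A caller would invoke something like graceful_close(fd, 1000) on each open
connection before exiting, instead of leaving the cleanup to exit itself.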

Thanks,
  george.

On Oct 10, 2011, at 12:40 , Ralph Castain wrote:

> It wasn't the launcher that was broken, but termination detection, and not 
> for all environments (e.g., worked fine for slurm). It is a progress-related 
> issue.
> 
> Should be fixed in r25245.
> 
> 
> On Oct 10, 2011, at 8:33 AM, Shamis, Pavel wrote:
> 
>> +1, I see the same issue.
>> 
>>> -----Original Message-----
>>> From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org]
>>> On Behalf Of Yevgeny Kliteynik
>>> Sent: Monday, October 10, 2011 10:24 AM
>>> To: OpenMPI Devel
>>> Subject: [OMPI devel] Launcher in trunk is broken?
>>> 
>>> It looks like the process launcher is broken in the OMPI trunk:
>>> If you run any simple test (not necessarily including MPI calls) on 4 or
>>> more nodes, the MPI processes won't be killed after the test finishes.
>>> 
>>> $ mpirun -host host_1,host_2,host_3,host_4 -np 4 --mca btl sm,tcp,self /bin/hostname
>>> 
>>> Output:
>>> host_1
>>> host_2
>>> host_3
>>> host_4
>>> 
>>> And then the test hangs...
>>> 
>>> I have an older trunk (r25228), and everything is OK there.
>>> Not sure if it means that something was broken after that, or the problem
>>> existed before, but kicked in only now due to some other change.
>>> 
>>> -- YK
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel

