Thanks! I won't have time to work on it this week, but appreciate your effort. 
Also, thanks for clarifying the race condition vis-a-vis 1.8 - I agree it is not a 
blocker for that release.

Ralph

On Sep 22, 2014, at 4:49 PM, Gilles Gouaillardet 
<gilles.gouaillar...@gmail.com> wrote:

> Ralph,
> 
> here is the patch i am using so far.
> i will resume working on this from Wednesday (there is still at least one 
> remaining race condition) unless you have the time to take care of it today.
> 
> so far, the race condition has only been observed in real life with the 
> grpcomm/rcd module, and this is not the default in v1.8, so imho this is not 
> a blocker for v1.8.3
> 
> Cheers,
> 
> Gilles
> 
> On Tue, Sep 23, 2014 at 7:46 AM, Ralph Castain <r...@open-mpi.org> wrote:
> Gilles - please let me know if/when you think you'll do this. I'm debating 
> about adding it to 1.8.3, but don't want to delay that release too long. 
> Alternatively, I can take care of it if you don't have time (I'm asking if 
> you can do it solely because you have the reproducer).
> 
> 
> On Sep 21, 2014, at 6:54 AM, Ralph Castain <r...@open-mpi.org> wrote:
> 
>> Sounds fine with me - please go ahead, and thanks
>> 
>> On Sep 20, 2014, at 10:26 PM, Gilles Gouaillardet 
>> <gilles.gouaillar...@gmail.com> wrote:
>> 
>>> Thanks for the pointer George !
>>> 
>>> On Sat, Sep 20, 2014 at 5:46 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>> Or copy the handshake protocol design of the TCP BTL...
>>> 
>>> 
>>> the main difference between oob/tcp and btl/tcp is the way we resolve the 
>>> situation in which two processes send their first message to each other at 
>>> the same time.
>>> 
>>> in oob/tcp, all sockets (i.e. one or two) are closed and the higher vpid is 
>>> directed to retry establishing a connection.
>>> 
>>> in btl/tcp, only the useless socket is closed (i.e. the one that was connect-ed 
>>> on the lower vpid and the one that was accept-ed on the higher vpid).
>>> 
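The two collision-handling schemes described above can be sketched as a pair of small decision functions. This is only an illustration of the rules as stated in this message, not the actual oob/tcp or btl/tcp code: the names `Action`, `resolve_btl_style`, `resolve_oob_style` and `surviving_initiator` are hypothetical, and the assumption that the lower vpid simply waits in the oob/tcp case is mine (the message only says that the higher vpid retries).

```python
from enum import Enum


class Action(Enum):
    """What one process does once it detects a simultaneous connect."""
    KEEP_CONNECTED = "keep our connect()-ed socket, close our accept()-ed one"
    KEEP_ACCEPTED = "keep our accept()-ed socket, close our connect()-ed one"
    CLOSE_ALL_AND_RETRY = "close every socket, then connect() again"
    # Assumption: the lower vpid just waits for the peer's retry.
    CLOSE_ALL_AND_WAIT = "close every socket, then wait for the peer"


def _check(my_vpid, peer_vpid):
    if my_vpid == peer_vpid:
        raise ValueError("a process does not race with itself")


def resolve_btl_style(my_vpid, peer_vpid):
    """btl/tcp-like rule: each side closes only its useless socket."""
    _check(my_vpid, peer_vpid)
    if my_vpid < peer_vpid:
        # Lower vpid drops the socket it connect()-ed.
        return Action.KEEP_ACCEPTED
    # Higher vpid drops the socket it accept()-ed.
    return Action.KEEP_CONNECTED


def resolve_oob_style(my_vpid, peer_vpid):
    """oob/tcp-like rule: close everything, higher vpid reconnects."""
    _check(my_vpid, peer_vpid)
    if my_vpid > peer_vpid:
        return Action.CLOSE_ALL_AND_RETRY
    return Action.CLOSE_ALL_AND_WAIT


def surviving_initiator(my_vpid, peer_vpid):
    """vpid whose connect() produced the socket kept under the btl-like rule."""
    action = resolve_btl_style(my_vpid, peer_vpid)
    return my_vpid if action is Action.KEEP_CONNECTED else peer_vpid


if __name__ == "__main__":
    for me, peer in ((0, 1), (1, 0)):
        print(me, peer, resolve_btl_style(me, peer).name,
              resolve_oob_style(me, peer).name)
```

Under the btl-like rule both sides reach the same answer without exchanging any further message: the connection initiated by the higher vpid is the one that survives, so no reconnect is needed.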
>>> 
>>> my first impression is that oob/tcp is unnecessarily complex and it should 
>>> use the simpler and more efficient protocol of btl/tcp.
>>> that being said, this conclusion could be too naive and, for some good 
>>> reasons i am not aware of, the btl/tcp handshake protocol might not be a 
>>> good fit for oob/tcp.
>>> 
>>> any thoughts ?
>>> 
>>> i will revamp oob/tcp to use the same handshake protocol as btl/tcp, 
>>> starting tomorrow, unless indicated otherwise
>>> 
>>> Cheers,
>>> 
>>> Gilles
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post: 
>>> http://www.open-mpi.org/community/lists/devel/2014/09/15885.php
>> 
> 
> 
> 
> <oobtcp2.patch>
