Ralph,

here is the patch i am using so far.
i will resume working on this on Wednesday (at least one race condition
remains) unless you have time to take care of it today.

so far, the race condition has only been observed in real life with the
grpcomm/rcd module, and this is not the default in v1.8, so imho this is
not a blocker for v1.8.3

Cheers,

Gilles

On Tue, Sep 23, 2014 at 7:46 AM, Ralph Castain <r...@open-mpi.org> wrote:

> Gilles - please let me know if/when you think you'll do this. I'm debating
> about adding it to 1.8.3, but don't want to delay that release too long.
> Alternatively, I can take care of it if you don't have time (I'm asking if
> you can do it solely because you have the reproducer).
>
>
> On Sep 21, 2014, at 6:54 AM, Ralph Castain <r...@open-mpi.org> wrote:
>
> Sounds fine with me - please go ahead, and thanks
>
> On Sep 20, 2014, at 10:26 PM, Gilles Gouaillardet <
> gilles.gouaillar...@gmail.com> wrote:
>
> Thanks for the pointer George !
>
> On Sat, Sep 20, 2014 at 5:46 AM, George Bosilca <bosi...@icl.utk.edu>
> wrote:
>
>> Or copy the handshake protocol design of the TCP BTL...
>>
>>
> the main difference between oob/tcp and btl/tcp is the way we resolve the
> situation in which two processes send their first message to each other at
> the same time.
>
> in oob/tcp, all (i.e. one or two) sockets are closed and the higher vpid
> is directed to retry establishing a connection.
>
> in btl/tcp, only the useless socket is closed (i.e. the one that was
> connect-ed on the lower vpid and the one that was accept-ed on the higher
> vpid).
>
>
> my first impression is that oob/tcp is unnecessarily complex and it should
> use the simpler and more efficient protocol of btl/tcp.
> that being said, this conclusion could be too naive: for good reasons i am
> not aware of, the btl/tcp handshake protocol might not be a good fit
> for oob/tcp.
>
> any thoughts ?
>
> starting tomorrow, i will revamp oob/tcp to use the same handshake protocol
> as btl/tcp, unless indicated otherwise.
>
> Cheers,
>
> Gilles
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/09/15885.php
>
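
The two tie-break rules described in the quoted message above (two processes connecting to each other at the same time) can be sketched as follows. This is only an illustration of the rules as stated in the thread, not code from oob/tcp or btl/tcp: the type `sock_origin_t` and the functions `btl_style_socket_to_close` and `oob_style_must_retry` are hypothetical names, and vpids are assumed to be plain unsigned integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* How a socket to the peer came into existence on the local process. */
typedef enum { SOCK_CONNECTED, SOCK_ACCEPTED } sock_origin_t;

/*
 * btl/tcp-style rule as described in the thread: only the useless socket
 * is closed. The lower vpid closes the socket it connect-ed, the higher
 * vpid closes the socket it accept-ed, so the connection initiated by the
 * higher vpid is the one that survives on both sides.
 */
static sock_origin_t btl_style_socket_to_close(uint32_t my_vpid,
                                               uint32_t peer_vpid)
{
    assert(my_vpid != peer_vpid); /* a process does not race with itself */
    return (my_vpid < peer_vpid) ? SOCK_CONNECTED : SOCK_ACCEPTED;
}

/*
 * oob/tcp-style rule as described in the thread: all sockets are closed
 * on both sides, and only the higher vpid retries the connection.
 * Returns true if the local process is the one that must retry.
 */
static bool oob_style_must_retry(uint32_t my_vpid, uint32_t peer_vpid)
{
    assert(my_vpid != peer_vpid);
    return my_vpid > peer_vpid;
}
```

With vpids 0 and 1, the btl/tcp-style rule leaves exactly one connection open (the one vpid 1 initiated) without any further round trip, whereas the oob/tcp-style rule tears everything down and has vpid 1 connect again.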

Attachment: oobtcp2.patch
Description: Binary data
