----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/59584/#review176195 -----------------------------------------------------------
Ship it! Ship It! - Vinod Kone On May 26, 2017, 12:56 a.m., Benjamin Mahler wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/59584/ > ----------------------------------------------------------- > > (Updated May 26, 2017, 12:56 a.m.) > > > Review request for mesos, Anand Mazumdar, Greg Mann, and Vinod Kone. > > > Bugs: MESOS-5332, MESOS-7057 and MESOS-7569 > https://issues.apache.org/jira/browse/MESOS-5332 > https://issues.apache.org/jira/browse/MESOS-7057 > https://issues.apache.org/jira/browse/MESOS-7569 > > > Repository: mesos > > > Description > ------- > > PID-based v0 executors using Mesos libraries >= 1.1.2 always re-link > with the agent upon receiving the reconnect message. This avoids the > executor replying on a half-open TCP connection to the old agent > (possible if netfilter is dropping packets, see: MESOS-7057). > However, PID-based executors using Mesos libraries < 1.1.2 do not > re-link and are therefore prone to replying on a half-open connection > after the agent restarts. If we only send a single reconnect message, > these "old" executors will reply on their half-open connection, > receive a RST, and think the agent just died. To ensure these "old" > executors can reconnect in the presence of netfilter dropping packets, > we introduced optional retries of the reconnect message. This results > in "old" executors correctly establishing a link when processing the > second reconnect message. > > Generally, users should not enable this flag unless they are affected > by this issue. > > > Diffs > ----- > > src/slave/flags.hpp b66995630f89dfb95a6d0cf66efc5d7590e90cbc > src/slave/flags.cpp 0c8276e425a6a7d22ee68edc6cc25b331635ec44 > src/slave/slave.cpp 15e4d68714556ca30a766acd3b9729367df680c3 > > > Diff: https://reviews.apache.org/r/59584/diff/2/ > > > Testing > ------- > > Added tests in follow up reviews. > > > Thanks, > > Benjamin Mahler > >