[ https://issues.apache.org/jira/browse/MESOS-887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Benjamin Mahler updated MESOS-887:
----------------------------------
    Component/s: slave
                 master
    Description:
The slave already links with the master, but it does not use the built in exited() notification from libprocess to trigger re-registration.

Of particular concern is that, if the socket breaks and subsequent messages are successfully sent on ephemeral sockets, then we don't re-register with the master.

  was:
The Scheduler Driver already links with the master, but it does not use the built in exited() notification from libprocess to detect socket closure.

This would fast-track the delay from zookeeper detecting a leadership change, and would minimize the number of dropped messages leaving the driver.

         Labels: reliability  (was: framework reliability)
        Summary: Slave should use exited() to detect disconnection with Master.  (was: Scheduler Driver should use exited() to detect disconnection with Master.)

> Slave should use exited() to detect disconnection with Master.
> --------------------------------------------------------------
>
>                 Key: MESOS-887
>                 URL: https://issues.apache.org/jira/browse/MESOS-887
>             Project: Mesos
>          Issue Type: Improvement
>          Components: master, slave
>    Affects Versions: 0.13.0, 0.14.0, 0.14.1, 0.14.2, 0.16.0, 0.15.0
>            Reporter: Benjamin Mahler
>              Labels: reliability
>
> The slave already links with the master, but it does not use the built in
> exited() notification from libprocess to trigger re-registration.
> Of particular concern is that, if the socket breaks and subsequent messages
> are successfully sent on ephemeral sockets, then we don't re-register with
> the master.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)