[
https://issues.apache.org/jira/browse/MESOS-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043969#comment-14043969
]
Vinod Kone commented on MESOS-1529:
-----------------------------------
{quote}
It's not clear to me why (2) is required.
{quote}
This is mainly to speed up re-registration instead of waiting for the
timeout to elapse. It is useful when the slave --> master link is broken
but the slave --> ZK link is fine.
{quote}
Will (3) also check the ping is from the leading master and trigger
re-registration if a ping is received from a different master?
{quote}
It will definitely count a ping as successful only if it is from the leading
master. If it receives a ping from a non-leading master, it means that the slave
--> master link is broken while the master --> slave link is fine. In this case a
re-registration should already be in progress. If the ping was a delayed ping
from an old master, the slave should have already re-registered (or be
re-registering) with the new master.
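
To make the ping rule concrete, here is a rough sketch in Python (not the actual Mesos C++ code). The `Slave` class, its fields, and `on_ping` are hypothetical names used only for illustration; the only behavior taken from the discussion is that a ping counts solely when it comes from the leading master, and that a ping from any other master is ignored rather than used as a re-registration trigger.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slave:
    """Hypothetical model of the slave's view of the master (not Mesos code)."""

    # Set when a "new master detected" event arrives from ZK.
    leading_master: Optional[str] = None
    # Time of the last ping that counted as successful.
    last_ping_time: Optional[float] = None

    def on_ping(self, sender: str, now: float) -> bool:
        """Return True if the ping counted as successful."""
        if sender != self.leading_master:
            # Ping from a non-leading (or old) master: ignore it and do not
            # trigger re-registration here. The ping timeout handles that.
            return False
        self.last_ping_time = now
        return True
```

With `leading_master` set to one master, a ping from any other sender leaves `last_ping_time` untouched, so the timeout keeps running.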
{quote}
2) What does an "exit" event signify? Why would we need to check that it was
for a leading master?
{quote}
An "exited" event signifies that the slave --> master link is broken.
This could be due to a network partition or a master failover. We need to check
whether it was for the leading master because, before the slave receives the
"exited" event, it might have received a "new master detected" event from ZK
and re-registered with a new master. In that case, the slave can safely ignore
the "exited" event.
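
As an illustration of that check, a minimal Python sketch follows. The function name and the use of plain strings as master identifiers are assumptions made for the example, not Mesos APIs.

```python
from typing import Optional


def should_reregister_on_exited(exited_master: str,
                                leading_master: Optional[str]) -> bool:
    """Hypothetical helper: does this "exited" event call for re-registration?

    If ZK has already announced a new leader and the slave has moved on, the
    event refers to the old master and can be ignored.
    """
    return exited_master == leading_master
```

The event only matters when it refers to the master the slave currently considers the leader; a stale event for a previous master returns `False`.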
{quote}
3) How is the 75 seconds determined?
{quote}
The timeout should be greater than 75s, which is the timeout the master uses to
remove a slave, so that slaves don't overwhelm masters with re-registration
attempts when the master likely hasn't even removed them. The greater the value,
the longer it will take for the master and slave to reconcile. We can make it
configurable and let the operators choose.
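
A sketch of how a configurable value could be validated against the master's 75s removal timeout, in Python for brevity. The constant and function names are hypothetical, and rejecting values at or below 75s is one possible policy rather than something decided in this ticket.

```python
# Timeout the master uses to remove a slave, per the discussion above.
MASTER_SLAVE_REMOVAL_TIMEOUT_SECS = 75.0


def ping_timed_out(last_ping_time: float, now: float,
                   slave_ping_timeout_secs: float) -> bool:
    """Hypothetical check: has the slave gone too long without a valid ping?"""
    if slave_ping_timeout_secs <= MASTER_SLAVE_REMOVAL_TIMEOUT_SECS:
        # Assumed policy: a shorter timeout could trigger re-registration
        # before the master has even removed the slave.
        raise ValueError("slave ping timeout must exceed the master's timeout")
    return now - last_ping_time > slave_ping_timeout_secs
```

For example, with a 90s setting the slave would not act 80s after the last valid ping, but would after 91s.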
{quote}
Does this lock us into a phased upgrade path if this timeout value needs to
change?
{quote}
I don't see why it would lock us into an upgrade path.
{quote}
If we get a ping from a non-leading master, we should likely ignore it and not
immediately trigger re-registration, i.e., let the timeout take effect.
{quote}
Yes, we will ignore it. See above.
> Handle a network partition between Master and Slave
> ---------------------------------------------------
>
> Key: MESOS-1529
> URL: https://issues.apache.org/jira/browse/MESOS-1529
> Project: Mesos
> Issue Type: Bug
> Reporter: Dominic Hamon
>
> If a network partition occurs between a Master and Slave, the Master will
> remove the Slave (as it fails health check) and mark the tasks being run
> there as LOST. However, the Slave is not aware that it has been removed so
> the tasks will continue to run.
> (To clarify a little bit: neither the master nor the slave receives 'exited'
> event, indicating that the connection between the master and slave is not
> closed).
> There are at least two possible approaches to solving this issue:
> 1. Introduce a health check from Slave to Master so they have a consistent
> view of a network partition. We may still see this issue should a one-way
> connection error occur.
> 2. Be less aggressive about marking tasks and Slaves as lost. Wait until the
> Slave reappears and reconcile then. We'd still need to mark Slaves and tasks
> as potentially lost (zombie state) but maybe the Scheduler can make a more
> intelligent decision.