I'm not following what the bug is.
The code you pointed to is called from here:
https://github.com/apache/mesos/blob/1.4.0/src/slave/status_update_manager.cpp#L762-L776
Where we ignore duplicates and also ensure that the ack matches the latest
update we've sent.
So, from the code you pointed to I
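For readers following the thread, the acknowledgement check described above can be illustrated with a small Python sketch. This is not the Mesos code (which is C++, at the link above). The class name, method names, and return strings are hypothetical, and the sketch assumes an ack is valid only if it matches the oldest unacknowledged update. It models just the two rules stated: a duplicate ack is ignored, and an ack that does not match the update currently awaiting acknowledgement is rejected.

```python
from collections import deque


class UpdateStream:
    """Hypothetical model of one task's status update stream."""

    def __init__(self):
        self.pending = deque()     # UUIDs awaiting acknowledgement, oldest first
        self.acknowledged = set()  # UUIDs that have already been acknowledged

    def enqueue(self, uuid):
        """Record a new status update that must eventually be acknowledged."""
        self.pending.append(uuid)

    def acknowledge(self, uuid):
        """Classify an incoming ack as duplicate, unexpected, or accepted."""
        if uuid in self.acknowledged:
            # Already handled once: ignore (the agent logs a warning here).
            return "duplicate"
        if not self.pending or self.pending[0] != uuid:
            # Ack does not match the update we are waiting on.
            return "unexpected"
        self.acknowledged.add(self.pending.popleft())
        return "accepted"
```

In this model, a warning such as "Duplicate status update acknowledgment" would correspond to the "duplicate" branch: the ack is dropped and the stream state is left unchanged.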
We use explicit acknowledgements from the scheduler.
Here is a snippet of the logs. Please see the entries for status update UUID:
a918f5ed-a604-415a-ad62-5a34fb6334ef
W0416 00:41:25.843505 124530 status_update_manager.cpp:761] Duplicate
status update acknowledgment (UUID: 67f548b4-96cb-4b57-8720-2c8a4ba347e8)
for upda
Do you have logs? Which acknowledgements did the agent receive? Which
TASK_RUNNING in the sequence was it re-sending?
On Tue, Apr 10, 2018 at 6:41 PM, Benjamin Mahler wrote:
> > Issue is that the *old executor reference is held by the slave* (assuming it
> did not receive acknowledgement, whereas maste
> Issue is that the *old executor reference is held by the slave* (assuming it
> did not receive an acknowledgement, whereas the master and scheduler have
> processed the status updates), so it continues to retry TASK_RUNNING
> indefinitely.
The agent only retries so long as it does not get an acknowledgement, is
the
Hi,
We are running into an issue with the slave status update manager. Below is
the behavior I am seeing.
Our use case: we run a stateful container (a Cassandra process). The
executor polls the JMX port at a 60-second interval to get the Cassandra
state and sends that state to agent -> master -> framework.
*RU
(1) Assuming you're referring to the scheduler's acknowledgement of a
status update, the agent will not forward TS2 until TS1 has been
acknowledged. So, TS2 will not be acknowledged before TS1 is acknowledged.
FWICT, we'll ignore any violation of this ordering and log a warning.
(2) To reverse the
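The ordering in (1) can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not the agent's actual implementation: the Forwarder class and its methods are hypothetical, retries are triggered by an explicit retry() call rather than a real timer, and updates are identified by their label (TS1, TS2) instead of a UUID. It shows that only the oldest unacknowledged update is ever forwarded, that it is re-sent until acknowledged, and that an out-of-order ack is ignored.

```python
from collections import deque


class Forwarder:
    """Hypothetical model of in-order status update forwarding."""

    def __init__(self):
        self.queue = deque()  # unacknowledged updates, oldest first
        self.sent = []        # log of every send, including retries

    def update(self, state):
        """Queue a new update; forward it only if nothing is ahead of it."""
        self.queue.append(state)
        if len(self.queue) == 1:
            self._send_front()

    def retry(self):
        """Re-send the oldest update if it is still unacknowledged."""
        if self.queue:
            self._send_front()

    def ack(self, state):
        """Accept an ack only for the oldest update; ignore anything else."""
        if not self.queue or self.queue[0] != state:
            return False  # out-of-order or stale ack: ignored
        self.queue.popleft()
        if self.queue:
            self._send_front()  # the next update may now be forwarded
        return True

    def _send_front(self):
        self.sent.append(self.queue[0])
```

With TS1 and TS2 queued, TS2 never appears in the sent log until ack("TS1") succeeds, which is why TS2 cannot be acknowledged before TS1 in this model.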
Hi,
While designing the correct behavior for one of our frameworks, we
encountered some questions about the behavior of status updates:
The executor continuously polls the workload probe to get the current mode
of the workload (a Cassandra server), and sends various status update states
(STARTING, RUNNING, FAIL