Chun-Hung Hsiao created MESOS-9537:
--------------------------------------
Summary: SLRP sends inconsistent status updates for dropped
operations.
Key: MESOS-9537
URL: https://issues.apache.org/jira/browse/MESOS-9537
Project: Mesos
Issue Type: Bug
Components: storage
Affects Versions: 1.7.0, 1.6.1, 1.6.0, 1.5.2, 1.5.1, 1.5.0, 1.7.1
Reporter: Chun-Hung Hsiao
Assignee: Chun-Hung Hsiao
The bug manifests in the following scenario:
1. Upon receiving profile updates, the SLRP sends an {{UPDATE_STATE}} to the
agent with a new resource version.
2. At the same time, the agent sends an {{APPLY_OPERATION}} to the SLRP with
the original resource version.
3. Because of the resource version mismatch, the SLRP asks the status update
manager (SUM) to send an {{OPERATION_DROPPED}} to the framework. This status
update must be acknowledged, yet the SLRP then simply discards the operation
(i.e., no bookkeeping).
4. The agent finds a missing operation in the {{UPDATE_STATE}} so it sends a
{{RECONCILE_OPERATIONS}}.
5. The SLRP asks the SUM to reply with an {{OPERATION_DROPPED}} to the agent
(without a framework ID set) because it no longer knows about the operation.
6. The SUM returns an error because this second {{OPERATION_DROPPED}} has no
framework ID and is therefore inconsistent with the first one.
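
The sequence above can be sketched as a toy model. This is only an
illustration of the race, not the Mesos implementation: the class and method
names ({{StatusUpdateManager}}, {{ResourceProvider}}, {{reproduce}}) are
hypothetical, and the consistency check is assumed to compare only the
framework ID of updates that share an operation UUID.

```python
# Toy model of the scenario; every name here is hypothetical and does not
# mirror the actual Mesos C++ code.
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class StatusUpdate:
    operation_uuid: str
    state: str
    framework_id: Optional[str]


class StatusUpdateManager:
    """Remembers the first update per operation and rejects inconsistent ones."""

    def __init__(self) -> None:
        self.streams: Dict[str, StatusUpdate] = {}

    def update(self, update: StatusUpdate) -> None:
        earlier = self.streams.get(update.operation_uuid)
        # Assumed check: updates for one operation must agree on framework ID.
        if earlier is not None and earlier.framework_id != update.framework_id:
            raise ValueError(
                f"inconsistent framework ID for operation {update.operation_uuid}"
            )
        self.streams.setdefault(update.operation_uuid, update)


class ResourceProvider:
    def __init__(self, sum_: StatusUpdateManager, resource_version: str) -> None:
        self.sum = sum_
        self.resource_version = resource_version
        self.operations: Dict[str, str] = {}  # operation UUID -> framework ID

    def apply_operation(
        self, uuid: str, framework_id: str, resource_version: str
    ) -> None:
        if resource_version != self.resource_version:
            # Step 3: report the drop, then forget the operation entirely.
            self.sum.update(StatusUpdate(uuid, "OPERATION_DROPPED", framework_id))
            return
        self.operations[uuid] = framework_id

    def reconcile_operations(self, uuid: str) -> None:
        if uuid not in self.operations:
            # Step 5: the operation is unknown, so no framework ID is available.
            self.sum.update(StatusUpdate(uuid, "OPERATION_DROPPED", None))


def reproduce() -> str:
    sum_ = StatusUpdateManager()
    # Step 1: the provider has already moved to a new resource version.
    slrp = ResourceProvider(sum_, resource_version="v2")
    # Step 2: the agent still uses the original resource version.
    slrp.apply_operation("op-1", "framework-1", resource_version="v1")
    try:
        # Step 4: the agent reconciles the operation it found missing.
        slrp.reconcile_operations("op-1")
    except ValueError as error:
        # Step 6: the second update conflicts with the first.
        return f"error: {error}"
    return "ok"
```

In this model, {{reproduce()}} ends in the error branch because the provider
keeps no record of the dropped operation and so cannot supply the framework
ID again during reconciliation.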