[ https://issues.apache.org/jira/browse/MESOS-6922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kone updated MESOS-6922:
------------------------------
    Assignee: Vinod Kone

It looks like the multiple status updates occur because we call `resume` 
on the status update manager twice:

1) when the agent receives SlaveReregisteredMessage from the master after 
recovery
2) when the agent receives UpdateFrameworkMessage from the master after 
re-registration

One way to fix the race is to drop the UpdateFrameworkMessage in the test.
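
The double `resume` described above can be modeled with a small toy sketch. 
This is not Mesos code: `ToyStatusUpdateManager` and `deliveredUpdates` are 
hypothetical names, and the idea that the second `resume` re-sends the update 
only when the acknowledgement has not arrived yet is an assumption about why 
the failure is intermittent.

```cpp
#include <cassert>  // for the checks shown in the usage note
#include <cstddef>

// Toy stand-in for the agent's status update manager (hypothetical, not
// Mesos code). It tracks one pending update and counts how often it is sent.
class ToyStatusUpdateManager {
public:
  // Re-sends the pending update unless it has already been acknowledged.
  void resume() {
    if (!acknowledged_) {
      ++sent_;
    }
  }

  // Marks the pending update as acknowledged by the scheduler.
  void acknowledge() { acknowledged_ = true; }

  std::size_t sent() const { return sent_; }

private:
  bool acknowledged_ = false;
  std::size_t sent_ = 0;
};

// Replays the two master messages against the toy manager and returns how
// many status updates the (mock) scheduler would see.
// ASSUMPTION: the race is modeled as whether the acknowledgement arrives
// before the second resume.
std::size_t deliveredUpdates(bool dropUpdateFrameworkMessage,
                             bool ackBeforeSecondResume) {
  ToyStatusUpdateManager manager;

  // 1) Re-registration message from the master after recovery.
  manager.resume();

  if (ackBeforeSecondResume) {
    manager.acknowledge();
  }

  // 2) UpdateFrameworkMessage from the master, unless the test drops it.
  if (!dropUpdateFrameworkMessage) {
    manager.resume();
  }

  return manager.sent();
}
```

In this model `deliveredUpdates(false, false)` yields 2 (the failing case), 
while `deliveredUpdates(false, true)` and `deliveredUpdates(true, false)` 
both yield 1. In the real test, dropping the message would presumably be done 
with the libprocess test helpers such as `DROP_PROTOBUFS`; the exact call is 
not shown here.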

> SlaveRecoveryTest/0.RecoverTerminatedExecutor is flaky
> ------------------------------------------------------
>
>                 Key: MESOS-6922
>                 URL: https://issues.apache.org/jira/browse/MESOS-6922
>             Project: Mesos
>          Issue Type: Bug
>          Components: tests
>         Environment: CentOS 7
>            Reporter: Greg Mann
>            Assignee: Vinod Kone
>              Labels: tests
>         Attachments: SlaveRecoveryTest.RecoverTerminatedExecutor.txt
>
>
> This was observed on ASF CI. Find attached the log from a failed run; it 
> appears that too many status updates are being received:
> {code}
> /mesos/src/tests/slave_recovery_tests.cpp:1350: Failure
> Mock function called more times than expected - returning directly.
>     Function call: statusUpdate(0x7ffcf00155b8, @0x2b3f4f7ab8c0 120-byte 
> object <50-66 6A-45 3F-2B 00-00 00-00 00-00 00-00 00-00 DF-13 00-00 00-00 
> 00-00 70-59 01-90 3F-2B 00-00 A0-D7 00-90 3F-2B 00-00 05-00 00-00 01-00 00-00 
> D0-01 91-04 00-00 00-00 D0-9C 00-90 3F-2B 00-00 C0-EB 01-90 3F-2B 00-00 18-00 
> 00-00 00-2B 00-00 47-98 7C-B9 92-29 D6-41 90-5B 02-90 3F-2B 00-00 00-00 00-00 
> 00-00 00-00 70-6E 01-90 3F-2B 00-00 00-00 00-00 00-00 00-00>)
>          Expected: to be called once
>            Actual: called twice - over-saturated and active
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)