[ 
https://issues.apache.org/jira/browse/MESOS-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-6245:
----------------------------------
    Description: It seems that the agent code sets 
{{StatusUpdate}}->{{slave_id}} but does not set 
{{TaskStatus}}->{{slave_id}} when it is not already set. When the driver 
receives such a status update and explicit acknowledgements are enabled, it 
passes the {{TaskStatus}} to the scheduler. However, the scheduler has no way 
of acknowledging this update because {{slave_id}} is not present. Note that 
implicit acknowledgements still work since they use the {{slave_id}} from 
{{StatusUpdate}}. Hence, we never noticed this in our tests, as all of them use 
implicit acknowledgements on the driver.  (was: It seems that the driver has an 
old check relying on the `PID`. The `PID` is always `UPID()` for HTTP based 
executors. If a scheduler is using explicit acknowledgements, it won't ever be 
able to acknowledge the update since the driver would clean up the {{uuid}} 
field!

Note that all our tests use implicit acknowledgements and we never got around 
to catching this issue till Marathon started using the HTTP based executors.)

> Driver based schedulers performing explicit acknowledgements cannot 
> acknowledge updates from HTTP based executors.
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-6245
>                 URL: https://issues.apache.org/jira/browse/MESOS-6245
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Anand Mazumdar
>            Assignee: Anand Mazumdar
>              Labels: mesosphere
>             Fix For: 1.1.0, 1.0.2
>
>
> It seems that the agent code sets {{StatusUpdate}}->{{slave_id}} but does not 
> set {{TaskStatus}}->{{slave_id}} when it is not already set. When the driver 
> receives such a status update and explicit acknowledgements are enabled, it 
> passes the {{TaskStatus}} to the scheduler. However, the scheduler has no way 
> of acknowledging this update because {{slave_id}} is not present. Note that 
> implicit acknowledgements still work since they use the {{slave_id}} from 
> {{StatusUpdate}}. Hence, we never noticed this in our tests, as all of them 
> use implicit acknowledgements on the driver.
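
To illustrate the description above, here is a minimal sketch in Python. It is 
not Mesos code: the dataclasses only mimic the two messages named in the issue, 
and the helpers fill_slave_id and can_explicitly_ack are hypothetical names 
made up for this example. It assumes an explicit acknowledgement is built from 
the fields of {{TaskStatus}} alone, and shows one possible remedy (copying the 
outer {{slave_id}} into the nested status when it is unset).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskStatus:
    # Simplified stand-in for the TaskStatus message (illustrative only).
    task_id: str
    uuid: Optional[bytes] = None
    slave_id: Optional[str] = None


@dataclass
class StatusUpdate:
    # Simplified stand-in for the StatusUpdate message (illustrative only).
    slave_id: str
    status: TaskStatus


def can_explicitly_ack(status: TaskStatus) -> bool:
    # Assumption: the scheduler only sees TaskStatus, so it needs both
    # slave_id and uuid from it to acknowledge the update.
    return status.slave_id is not None and status.uuid is not None


def fill_slave_id(update: StatusUpdate) -> StatusUpdate:
    # Hypothetical remedy: copy the outer slave_id into the nested
    # TaskStatus, but only when it has not already been set.
    if update.status.slave_id is None:
        update.status.slave_id = update.slave_id
    return update


if __name__ == "__main__":
    update = StatusUpdate(
        slave_id="agent-1",
        status=TaskStatus(task_id="task-1", uuid=b"\x01"),
    )
    print(can_explicitly_ack(update.status))  # slave_id missing in TaskStatus
    fill_slave_id(update)
    print(can_explicitly_ack(update.status))  # slave_id now present
```

An implicit acknowledgement would not hit this problem in the sketch, since it 
could read update.slave_id directly rather than update.status.slave_id.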



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
