[jira] [Comment Edited] (MESOS-243) driver stop() should block until outstanding requests have been persisted

2016-07-13 Thread Anand Mazumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375240#comment-15375240
 ] 

Anand Mazumdar edited comment on MESOS-243 at 7/13/16 3:51 PM:
---

[~vladap2016] Did you get around to submitting an updated patch after 
discarding this one? https://reviews.apache.org/r/48625

Also, it would be good to address MESOS-5262 first, i.e., to ensure that the 
agent always sends an _acknowledgment_ in all cases. This would avoid 
cases where the driver keeps waiting indefinitely for an 
acknowledgment.
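
To illustrate the concern above, here is a minimal Python sketch of a stop() that blocks until outstanding status updates are acknowledged, but gives up after a timeout so that a missing acknowledgment cannot hang it forever. SketchDriver and all of its methods are hypothetical names chosen for illustration; this is not the Mesos driver API or the patch under review.

```python
import threading


class SketchDriver:
    """Hypothetical stand-in for an executor driver; not the Mesos API."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = set()  # IDs of status updates awaiting acknowledgment

    def send_status_update(self, update_id):
        # Record the update as outstanding; delivery itself is asynchronous.
        with self._cond:
            self._pending.add(update_id)

    def acknowledge(self, update_id):
        # Called once the agent confirms it has persisted the update.
        with self._cond:
            self._pending.discard(update_id)
            self._cond.notify_all()

    def stop(self, timeout=None):
        # Block until nothing is outstanding, or until `timeout` seconds pass.
        # Returns True if all updates were acknowledged, False on timeout.
        with self._cond:
            return self._cond.wait_for(lambda: not self._pending, timeout)


driver = SketchDriver()
driver.send_status_update("task-1-finished")
# Simulate the agent's acknowledgment arriving shortly afterwards.
threading.Timer(0.05, driver.acknowledge, args=["task-1-finished"]).start()
drained = driver.stop(timeout=2.0)
```

With timeout=None the wait is unbounded, which is the situation the comment warns about when an acknowledgment is never sent.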


was (Author: anandmazumdar):
[~vladap2016] Did you get around to submitting an updated patch after 
discarding this? https://reviews.apache.org/r/48625

Also, it would be good to address MESOS-5276 first i.e. to ensure that the 
agent always sends an _acknowledgment_ for all cases. This would avoid 
instances where the driver might keep on waiting indefinitely for an 
acknowledgement.

> driver stop() should block until outstanding requests have been persisted
> -
>
> Key: MESOS-243
> URL: https://issues.apache.org/jira/browse/MESOS-243
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0, 0.14.1, 
> 0.14.2, 0.15.0
> Reporter: brian wickman
> Assignee: Vladimir Petrovic
>
> In our executor, we send a terminal status update message and immediately 
> call driver.stop(). It turns out that the status update is dispatched 
> asynchronously and races with driver shutdown, causing tasks to 
> intermittently go into the LOST state instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (MESOS-243) driver stop() should block until outstanding requests have been persisted

2016-07-13 Thread Anand Mazumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375240#comment-15375240
 ] 

Anand Mazumdar edited comment on MESOS-243 at 7/13/16 3:50 PM:
---

[~vladap2016] Did you get around to submitting an updated patch after 
discarding this one? https://reviews.apache.org/r/48625

Also, it would be good to address MESOS-5276 first, i.e., to ensure that the 
agent always sends an _acknowledgment_ in all cases. This would avoid 
cases where the driver keeps waiting indefinitely for an 
acknowledgment.


was (Author: anandmazumdar):
[~vladap2016] Did you get around to submitting an updated patch after 
discarding this: https://reviews.apache.org/r/48625

Also, it would be good to address MESOS-5276 first i.e. to ensure that the 
agent always sends an _acknowledgment_ for all cases. This would avoid 
instances where the driver might keep on waiting indefinitely for an 
acknowledgement.






[jira] [Comment Edited] (MESOS-243) driver stop() should block until outstanding requests have been persisted

2016-07-19 Thread Vladimir Petrovic (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384410#comment-15384410
 ] 

Vladimir Petrovic edited comment on MESOS-243 at 7/19/16 4:02 PM:
--

I have checked MESOS-5262, but I'm not sure that patching the agent is the 
correct fix for the problem. In my opinion, the problem is with executors 
that don't handle the shutdown message.

With my patch, a driver-based executor will still be killed by the 
ShutdownProcess in the situation described in MESOS-5262.
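
As a rough illustration of that safety net, the following Python sketch assumes a forced kill that fires once a grace period elapses after the shutdown message arrives, regardless of any pending acknowledgments. ShutdownWatchdog, its methods, and the grace_period and kill parameters are hypothetical names, not Mesos identifiers, and the real ShutdownProcess may behave differently.

```python
import threading


class ShutdownWatchdog:
    """Hypothetical illustration of a grace-period kill; not Mesos code."""

    def __init__(self, grace_period, kill):
        # `kill` is the callback that forcibly terminates the executor.
        self._timer = threading.Timer(grace_period, kill)
        self._timer.daemon = True

    def on_shutdown_message(self):
        # Start the countdown as soon as shutdown is requested.
        self._timer.start()

    def on_clean_exit(self):
        # The executor finished on its own, so the forced kill is not needed.
        self._timer.cancel()


killed = threading.Event()
watchdog = ShutdownWatchdog(grace_period=0.05, kill=killed.set)
watchdog.on_shutdown_message()
# The executor never exits cleanly here, so the kill callback fires.
killed.wait(timeout=1.0)
```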



was (Author: vladap2016):
I have checked MESOS-5262 but I'm not sure that patching agent is a correct fix 
for the problem. In my opinion, problem is with executors which don't handle 
shutdown message.

With my patch, driver based executor will properly shutdown when it receives 
shutdown message even if it has pending acknowledgements.

> driver stop() should block until outstanding requests have been persisted
> -
>
> Key: MESOS-243
> URL: https://issues.apache.org/jira/browse/MESOS-243
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0, 0.14.1, 
> 0.14.2, 0.15.0
> Reporter: brian wickman
> Assignee: Vladimir Petrovic
> Fix For: 1.1.0
>
>
> In our executor, we send a terminal status update message and immediately 
> call driver.stop(). It turns out that the status update is dispatched 
> asynchronously and races with driver shutdown, causing tasks to 
> intermittently go into the LOST state instead.


