[jira] [Commented] (MESOS-313) report executor deaths to framework schedulers
[ https://issues.apache.org/jira/browse/MESOS-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15010396#comment-15010396 ]

Adam B commented on MESOS-313:
------------------------------

I've added you as a contributor, and assigned this issue to you. We will review your patch soon.

> report executor deaths to framework schedulers
> ----------------------------------------------
>
> Key: MESOS-313
> URL: https://issues.apache.org/jira/browse/MESOS-313
> Project: Mesos
> Issue Type: Improvement
> Reporter: Charles Reiss
> Assignee: Zhitao Li
> Labels: mesosphere, newbie
>
> The Scheduler interface has a callback for executorLost, but currently it is
> never called.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (MESOS-313) report executor deaths to framework schedulers
[ https://issues.apache.org/jira/browse/MESOS-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15009544#comment-15009544 ]

Adam B commented on MESOS-313:
------------------------------

[~zhitao] You are absolutely correct. The Mesos master needs to send the message to the SchedulerDriver, if it doesn't already; and the SchedulerDriver needs to call the existing executorLost callback on the framework's scheduler. See the SlaveLost/LostSlave messages/methods for a good example.

As [~vinodkone] mentioned, we would like to be able to make this message reliably delivered, like some of the other StatusUpdate messages, but let's leave that for a phase 2 and try to get an MVP that at least attempts to notify the scheduler.
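The flow described in this comment (master notifies the SchedulerDriver, which invokes executorLost on the framework's scheduler) could be sketched roughly as follows. This is only an illustration under stated assumptions: `ExecutorLostMessage`, `SketchMaster`, `SketchSchedulerDriver`, and `RecordingScheduler` are hypothetical stand-ins, not Mesos classes or protobufs, and the delivery is best-effort with no retries, matching the "MVP" idea rather than the reliable "phase 2".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutorLostMessage:
    """Hypothetical master-to-driver message; not a real Mesos protobuf."""
    executor_id: str
    slave_id: str
    status: int


class RecordingScheduler:
    """Stand-in framework scheduler that records the callbacks it receives."""

    def __init__(self):
        self.lost_executors = []

    def executor_lost(self, driver, executor_id, slave_id, status):
        # A real framework would reschedule or clean up here.
        self.lost_executors.append((executor_id, slave_id, status))


class SketchSchedulerDriver:
    """Stand-in driver: dispatches incoming messages to scheduler callbacks."""

    def __init__(self, scheduler):
        self.scheduler = scheduler

    def receive(self, message):
        if isinstance(message, ExecutorLostMessage):
            self.scheduler.executor_lost(
                self, message.executor_id, message.slave_id, message.status
            )
            return True
        return False  # unknown message types are ignored in this sketch


class SketchMaster:
    """Stand-in master: fire-and-forget notification, no acknowledgements."""

    def __init__(self):
        self.drivers = {}

    def register(self, framework_id, driver):
        self.drivers[framework_id] = driver

    def executor_exited(self, framework_id, executor_id, slave_id, status):
        driver = self.drivers.get(framework_id)
        if driver is None:
            return False  # framework not connected; the notification is dropped
        return driver.receive(ExecutorLostMessage(executor_id, slave_id, status))
```

Because nothing is acknowledged or retried, a disconnected framework simply misses the notification, which is the gap the reliable-delivery follow-up would have to close.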
[jira] [Commented] (MESOS-313) report executor deaths to framework schedulers
[ https://issues.apache.org/jira/browse/MESOS-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15010380#comment-15010380 ]

Zhitao Li commented on MESOS-313:
---------------------------------

[~adam-mesos] I created https://reviews.apache.org/r/40429/ but it does not seem to link to the bug here. Do I need to be added to the "contributor" list? (I sent an email for that but got no response.)
[jira] [Commented] (MESOS-313) report executor deaths to framework schedulers
[ https://issues.apache.org/jira/browse/MESOS-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705250#comment-14705250 ]

Frank Scholten commented on MESOS-313:
--------------------------------------

Any update on this one? We are implementing volume-based storage for Elasticsearch and want to handle executor deaths so that containers are restarted on the same slave as their volume.
[jira] [Commented] (MESOS-313) report executor deaths to framework schedulers
[ https://issues.apache.org/jira/browse/MESOS-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644375#comment-14644375 ]

Frank Scholten commented on MESOS-313:
--------------------------------------

We are running into the same issue with the Elasticsearch framework.
[jira] [Commented] (MESOS-313) report executor deaths to framework schedulers
[ https://issues.apache.org/jira/browse/MESOS-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299250#comment-14299250 ]

Ben Whitehead commented on MESOS-313:
-------------------------------------

I recently ran into this issue while developing a framework to run Cassandra. As a workaround, I start a no-op task (with some special data in it so I can identify it) so that I can tell whether the executor is still running. If this no-op task is ever lost, I assume the executor is lost with it, and I can then try to recover from there.

It's quite confusing that {{executorLost}} is in the API and the behavior is documented (at least on the Java interface) when it isn't implemented.
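A rough sketch of the sentinel-task workaround described in this comment might look like the following. It is not the Cassandra framework's code: `SentinelTracker`, the `sentinel-` task ID prefix, the marker payload, and the dictionary used as a task description are all hypothetical, and treating every terminal state (not only `TASK_LOST`) as a sign that the executor is gone is an added assumption.

```python
class SentinelTracker:
    """Tracks one no-op sentinel task per executor to infer executor loss."""

    SENTINEL_DATA = b"executor-sentinel"  # hypothetical marker payload
    # Assumption: any terminal state of the sentinel means the executor is gone.
    TERMINAL_STATES = {"TASK_LOST", "TASK_FAILED", "TASK_KILLED", "TASK_FINISHED"}

    def __init__(self):
        self._executor_by_task = {}

    def sentinel_task(self, executor_id):
        """Build a description of the no-op task to launch on the executor."""
        task_id = f"sentinel-{executor_id}"
        self._executor_by_task[task_id] = executor_id
        return {
            "task_id": task_id,
            "executor_id": executor_id,
            "data": self.SENTINEL_DATA,
        }

    def on_status_update(self, task_id, state):
        """Return the executor presumed lost, or None if nothing was inferred."""
        if state not in self.TERMINAL_STATES:
            return None
        # pop() ensures each executor is reported lost at most once.
        return self._executor_by_task.pop(task_id, None)
```

A scheduler would call `on_status_update` from its status-update callback and start recovery whenever an executor ID comes back. The inference is indirect: it reports the sentinel's fate, so an executor that dies without a status update for the sentinel would still go unnoticed.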
[jira] [Commented] (MESOS-313) report executor deaths to framework schedulers
[ https://issues.apache.org/jira/browse/MESOS-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299294#comment-14299294 ]

Vinod Kone commented on MESOS-313:
----------------------------------

Part of the reason it was not implemented is that the termination of an executor is currently not reliably communicated between the slave and the master. While the resource accounting is done correctly, the termination signal is not reliably delivered. So the reliability needs to be fixed before we expose it in the API.