[
https://issues.apache.org/jira/browse/MESOS-10084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17015863#comment-17015863
]
Benjamin Bannier edited comment on MESOS-10084 at 1/21/20 12:21 PM:
--------------------------------------------------------------------
Reviews:
https://reviews.apache.org/r/72033/
https://reviews.apache.org/r/72034/
https://reviews.apache.org/r/72035/
was (Author: bbannier):
Reviews:
https://reviews.apache.org/r/72002/
https://reviews.apache.org/r/72003/
> Detecting whether executor is generated for command task should work when the
> launcher_dir changes
> --------------------------------------------------------------------------------------------------
>
> Key: MESOS-10084
> URL: https://issues.apache.org/jira/browse/MESOS-10084
> Project: Mesos
> Issue Type: Bug
> Reporter: Andrei Sekretenko
> Assignee: Benjamin Bannier
> Priority: Critical
>
> As currently implemented, on recovery the Mesos agent determines whether an
> executor was generated for a command task by comparing the executor command
> with the current path to the Mesos executor:
> https://github.com/apache/mesos/blob/1.7.x/src/slave/slave.cpp#L9635
> During an upgrade of a production cluster we observed this check break
> because the new launcher_dir differed from that of the checkpointed executor.
> This can cause various kinds of problems: for example, after such an upgrade,
> the Mesos master can begin to treat the checkpointed command executors as
> subject to resource quota.
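
To illustrate the failure mode described above, here is a minimal sketch of a path-based check. It is not the code in slave.cpp: the function name looksLikeCommandExecutor, the example directories, and the assumption that the executor binary is named "mesos-executor" are all hypothetical and chosen for illustration only.

```cpp
#include <string>

// Hypothetical stand-in for the recovery check: an executor counts as a
// command executor only if its checkpointed command contains the path
// "<launcher_dir>/mesos-executor", built from the *current* launcher_dir.
// The binary name "mesos-executor" is an assumption of this sketch.
bool looksLikeCommandExecutor(
    const std::string& checkpointedCommand,
    const std::string& currentLauncherDir)
{
  const std::string expected = currentLauncherDir + "/mesos-executor";
  return checkpointedCommand.find(expected) != std::string::npos;
}
```

With a command checkpointed as "/opt/mesos-1.7/libexec/mesos-executor", this check succeeds while launcher_dir is still "/opt/mesos-1.7/libexec" and fails once the upgraded agent runs with "/opt/mesos-1.9/libexec", even though the executor itself has not changed.
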
> Design considerations:
> - The proper solution is to checkpoint a flag indicating whether the executor
> is a command/docker one.
> - For a correct upgrade from older Mesos versions, we will need some kind of
> workaround to detect command executors after the upgrade; the workaround logic
> should be skipped if a checkpointed flag is present.
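
The two design points above could be combined roughly as follows. This is only a sketch under stated assumptions, not the approach taken in the linked reviews: the struct CheckpointedExecutor, its fields, and the basename-based fallback heuristic are hypothetical, and the sketch covers only the command executor (assumed to be named "mesos-executor"), not the docker one.

```cpp
#include <string>

// Hypothetical view of checkpointed executor state. Older checkpoints would
// not contain the flag, which is modeled here by hasFlag == false.
struct CheckpointedExecutor
{
  std::string command;           // Checkpointed executor command line.
  bool hasFlag;                  // True if the flag was checkpointed.
  bool generatedForCommandTask;  // Meaningful only when hasFlag is true.
};

// Returns the last path component, e.g. "/a/b/c" -> "c".
std::string basenameOf(const std::string& path)
{
  const std::string::size_type pos = path.find_last_of('/');
  return pos == std::string::npos ? path : path.substr(pos + 1);
}

bool isGeneratedForCommandTask(const CheckpointedExecutor& executor)
{
  // Preferred path: trust the checkpointed flag and skip the workaround.
  if (executor.hasFlag) {
    return executor.generatedForCommandTask;
  }

  // Workaround for old checkpoints (an assumed heuristic): look only at the
  // binary name of the first command token, so a changed launcher_dir does
  // not affect the result.
  const std::string binary =
    executor.command.substr(0, executor.command.find(' '));

  return basenameOf(binary) == "mesos-executor";
}
```

In this sketch an old checkpoint with command "/old/libexec/mesos-executor" is still recognized after launcher_dir changes, while any checkpoint that carries the flag bypasses the heuristic entirely. A name-only heuristic can misclassify a custom executor that happens to use the same binary name, which is one reason the checkpointed flag is the preferable long-term solution.
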