Jan Schlicht created MESOS-9885:
-----------------------------------
Summary: Resource provider configuration is only removed after its
container is stopped, causing issues in failover scenarios
Key: MESOS-9885
URL: https://issues.apache.org/jira/browse/MESOS-9885
Project: Mesos
Issue Type: Bug
Components: resource provider
Affects Versions: 1.8.0
Reporter: Jan Schlicht
An agent could crash while it is handling a {{REMOVE_RESOURCE_PROVIDER_CONFIG}}
call. In that case, the resource provider won't be removed, because its
configuration is only removed once the actual resource provider container has
been stopped: in {{LocalResourceProviderDaemonProcess::remove}}, {{os::rm}}
is only called if {{cleanupContainers}} was successful. After agent failover,
the resource provider will still be running. This can be a problem for
frameworks/operators, because there is no feedback channel that informs them
whether their removal request was successful.
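
The ordering described above can be illustrated with a minimal, standalone
sketch. This is not the Mesos implementation: {{removeProviderConfig}} and
{{configsLeftOnDisk}} are hypothetical names, the cleanup step is modeled as a
callback, and an agent crash is approximated by a cleanup that does not report
success.

```cpp
#include <filesystem>
#include <fstream>
#include <functional>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical stand-in for the removal path: the config file is deleted
// only after the container cleanup callback reports success.
bool removeProviderConfig(
    const fs::path& config,
    const std::function<bool()>& cleanupContainers)
{
  if (!cleanupContainers()) {
    // Cleanup failed or was interrupted: the config stays on disk.
    return false;
  }

  std::error_code ec;
  return fs::remove(config, ec);
}

// Hypothetical helper: lists the config files still present in the
// directory, i.e. the providers that would still be considered configured
// after a restart (an assumption made for this sketch).
std::vector<std::string> configsLeftOnDisk(const fs::path& configDir)
{
  std::vector<std::string> names;
  for (const auto& entry : fs::directory_iterator(configDir)) {
    names.push_back(entry.path().filename().string());
  }
  return names;
}
```

If the callback never reports success, the configuration file survives and the
caller learns nothing beyond the boolean result, which mirrors the missing
feedback channel noted above.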