[ https://issues.apache.org/jira/browse/MESOS-152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bill Farner resolved MESOS-152.
-------------------------------
Resolution: Won't Fix
Talked with Ben offline about this. We will hold off until the Mesos API
supports querying for the list of known tasks in the cluster, and then use
that list to reconcile our internal state.
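
For illustration only, here is a rough sketch of what that reconciliation
could look like once such a query exists. The function and type names
(reconcile, ReconcileResult) are hypothetical, and the "list of known tasks"
is assumed to arrive as a plain set of task IDs; none of this is an existing
Mesos API.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>

// Hypothetical result type: the two ways the framework's view and the
// cluster's view of running tasks can disagree.
struct ReconcileResult {
  std::set<std::string> missingFromCluster;  // we track it, the cluster does not
  std::set<std::string> unknownToFramework;  // the cluster runs it, we do not track it
};

// Compares the framework's internal task IDs with the task IDs reported by
// the cluster (assumed to come from a future "list known tasks" query).
ReconcileResult reconcile(const std::set<std::string>& internalTasks,
                          const std::set<std::string>& clusterTasks) {
  ReconcileResult result;

  // Tasks we believe exist but the cluster no longer knows about.
  std::set_difference(
      internalTasks.begin(), internalTasks.end(),
      clusterTasks.begin(), clusterTasks.end(),
      std::inserter(result.missingFromCluster,
                    result.missingFromCluster.end()));

  // Tasks the cluster knows about but we have no record of.
  std::set_difference(
      clusterTasks.begin(), clusterTasks.end(),
      internalTasks.begin(), internalTasks.end(),
      std::inserter(result.unknownToFramework,
                    result.unknownToFramework.end()));

  return result;
}
```

A framework could then mark everything in missingFromCluster as lost and
decide separately what to do with the tasks in unknownToFramework.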
> Slave should forward status updates for unknown tasks
> -----------------------------------------------------
>
> Key: MESOS-152
> URL: https://issues.apache.org/jira/browse/MESOS-152
> Project: Mesos
> Issue Type: Bug
> Reporter: Bill Farner
>
> The slave swallows status updates for tasks that it does not recognize. Due
> to the way we handle tasks and history in the Twitter framework, it would be
> ideal if these messages were passed along to the framework instead.
> Relevant code in slave.cpp:
>   Executor* executor = framework->getExecutor(status.task_id());
>   if (executor != NULL) {
>     executor->updateTaskState(status.task_id(), status.state());
>     // Handle the task appropriately if it's terminated.
>     if (status.state() == TASK_FINISHED ||
>         status.state() == TASK_FAILED ||
>         status.state() == TASK_KILLED ||
>         status.state() == TASK_LOST) {
>       executor->removeTask(status.task_id());
>       dispatch(isolationModule,
>                &IsolationModule::resourcesChanged,
>                framework->id, executor->id, executor->resources);
>     }
>     // Send message and record the status for possible resending.
>     StatusUpdateMessage message;
>     message.mutable_update()->MergeFrom(update);
>     message.set_pid(self());
>     send(master, message);
>     UUID uuid = UUID::fromBytes(update.uuid());
>     // Send us a message to try and resend after some delay.
>     delay(STATUS_UPDATE_RETRY_INTERVAL_SECONDS,
>           self(), &Slave::statusUpdateTimeout,
>           framework->id, uuid);
>     framework->updates[uuid] = update;
>     stats.tasks[status.state()]++;
>     stats.validStatusUpdates++;
>   } else {
>     LOG(WARNING) << "Status update error: couldn't lookup "
>                  << "executor for framework " << update.framework_id();
>     stats.invalidStatusUpdates++;
>   }
> Ideally, this code would behave more like:
>   Look up executor
>   if executor exists:
>     Update executor state
>   else:
>     Log warning
>   send message
> Of course, this is still in a scope where the framework is known.
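
The pseudocode above can be sketched as a small, self-contained C++ model.
This is only an illustration of the proposed control flow, not the actual
slave.cpp code: the Slave, Framework, Executor and StatusUpdate types below
are simplified, hypothetical stand-ins, sending to the master is modeled as
appending to a vector, and resend/retry handling is left out.

```cpp
#include <cassert>
#include <iostream>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the real Mesos types.
enum class TaskState { RUNNING, FINISHED, FAILED, KILLED, LOST };

struct StatusUpdate {
  std::string frameworkId;
  std::string taskId;
  TaskState state;
};

struct Executor {
  std::map<std::string, TaskState> tasks;  // task id -> last known state
};

struct Framework {
  std::string id;
  std::map<std::string, Executor> executors;  // executor id -> executor

  // Returns the executor that owns taskId, or nullptr if none does.
  Executor* getExecutor(const std::string& taskId) {
    for (auto& entry : executors) {
      if (entry.second.tasks.count(taskId) > 0) return &entry.second;
    }
    return nullptr;
  }
};

bool isTerminal(TaskState state) {
  return state == TaskState::FINISHED || state == TaskState::FAILED ||
         state == TaskState::KILLED || state == TaskState::LOST;
}

struct Slave {
  std::vector<StatusUpdate> sentToMaster;  // models send(master, message)
  int validStatusUpdates = 0;
  int invalidStatusUpdates = 0;

  void statusUpdate(Framework& framework, const StatusUpdate& update) {
    // As in the original code, the framework is already known here.
    assert(update.frameworkId == framework.id);

    Executor* executor = framework.getExecutor(update.taskId);
    if (executor != nullptr) {
      // Known task: update executor state, dropping terminated tasks.
      executor->tasks[update.taskId] = update.state;
      if (isTerminal(update.state)) executor->tasks.erase(update.taskId);
      validStatusUpdates++;
    } else {
      // Unknown task: warn, but do not swallow the update.
      std::cerr << "WARNING: status update for unknown task "
                << update.taskId << " of framework " << framework.id << "\n";
      invalidStatusUpdates++;
    }

    // Proposed change: forward the update in both cases.
    sentToMaster.push_back(update);
  }
};
```

The only behavioral difference from the quoted code is that the send happens
after the if/else rather than inside the "executor exists" branch.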