Hi,

I've been making some modifications to the Hadoop framework recently and
have come up against a brick wall. I'm wondering if the concept of killing
an executor from a framework has been discussed before?

Currently we launch two tasks for each Hadoop TaskTracker: one with a small
slice of the CPU and all of the memory, and another with the remaining CPU.
Together they add up to the total resources we want to give each
TaskTracker. This is roughly how Spark does it.
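For concreteness, the launch looks something like the sketch below. This is
not our actual code; the task names, IDs and resource numbers are made up,
and it assumes you are inside resourceOffers() with an offer, an
ExecutorInfo called "executor" and the SchedulerDriver called "driver" in
scope:

    // Assumes org.apache.mesos.Protos and java.util.Arrays are imported.
    // "Control" task: a sliver of CPU plus all of the TaskTracker's memory.
    Protos.Resource controlCpus = Protos.Resource.newBuilder()
        .setName("cpus").setType(Protos.Value.Type.SCALAR)
        .setScalar(Protos.Value.Scalar.newBuilder().setValue(0.5)).build();
    Protos.Resource controlMem = Protos.Resource.newBuilder()
        .setName("mem").setType(Protos.Value.Type.SCALAR)
        .setScalar(Protos.Value.Scalar.newBuilder().setValue(4096)).build();
    Protos.TaskInfo control = Protos.TaskInfo.newBuilder()
        .setName("tasktracker-control")
        .setTaskId(Protos.TaskID.newBuilder().setValue("tt-control-1"))
        .setSlaveId(offer.getSlaveId())
        .setExecutor(executor)          // same executor for both tasks
        .addResources(controlCpus)
        .addResources(controlMem)
        .build();

    // "Slot" task: the rest of the CPU, no memory.
    Protos.Resource slotCpus = Protos.Resource.newBuilder()
        .setName("cpus").setType(Protos.Value.Type.SCALAR)
        .setScalar(Protos.Value.Scalar.newBuilder().setValue(3.5)).build();
    Protos.TaskInfo slots = Protos.TaskInfo.newBuilder()
        .setName("tasktracker-slots")
        .setTaskId(Protos.TaskID.newBuilder().setValue("tt-slots-1"))
        .setSlaveId(offer.getSlaveId())
        .setExecutor(executor)
        .addResources(slotCpus)
        .build();

    // Both tasks go out in one launchTasks() call against the same offer.
    driver.launchTasks(offer.getId(), Arrays.asList(control, slots));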

The reason we do this is so we can free up CPU resources and remove slots
from a TaskTracker (killing it half dead) while keeping the executor alive.
At some undefined point in the future we then want to kill the executor,
which we do by killing the other, "control" task.
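In terms of the hypothetical sketch above, that amounts to something like:

    // Shrink the TaskTracker: kill only the "slot" task. The executor stays
    // up because the "control" task is still running.
    driver.killTask(Protos.TaskID.newBuilder().setValue("tt-slots-1").build());

    // Later, when we want the executor gone, kill the control task too,
    // which in this scheme takes the whole executor down with it.
    driver.killTask(Protos.TaskID.newBuilder().setValue("tt-control-1").build());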

In practice this approach doesn't work very well because of
https://issues.apache.org/jira/browse/MESOS-1812: tasks are not launched in
order on the slave, so there is no way to guarantee that the control task
comes up first, which leads to all sorts of interesting races.

Is this a bad road to go down? I can't use framework messages, as I don't
believe they are a reliable way of sending signals, so I'm not sure where
else to turn.

Cheers,

Tom.
