Thanks, Vinod. I missed that issue when searching!
I did consider sending a shutdown task, though my worry was that there
may be cases where the task might not launch, perhaps due to resource
starvation or simply no offers being received. Presumably it would not
be correct to store the original OfferId and launch a new task from that
offer, as it *could* be days old.

On Tue, Sep 30, 2014 at 2:10 AM, Vinod Kone <vinodk...@gmail.com> wrote:

> Adding a shutdownExecutor() driver call has been discussed before:
> https://issues.apache.org/jira/browse/MESOS-330
>
> As a workaround, have you considered sending a special "kill" task as a
> signal to the executor to commit suicide?
>
> On Mon, Sep 29, 2014 at 5:27 PM, Tom Arnfeld <t...@duedil.com> wrote:
>
>> Hi,
>>
>> I've been making some modifications to the Hadoop framework recently
>> and have come up against a brick wall. I'm wondering if the concept
>> of killing an executor from a framework has been discussed before?
>>
>> Currently we launch two tasks for each Hadoop TaskTracker: one with a
>> small amount of CPU and all of the memory, and another with the rest
>> of the CPU. Together they add up to the resources we want to give
>> each TaskTracker. This is roughly how Spark works, too.
>>
>> The reason we do this is to be able to free up CPU resources and
>> remove slots from a TaskTracker (killing it half dead) while keeping
>> the executor alive. At some undefined point in the future we then
>> want to kill the executor, which we do by killing the other "control"
>> task.
>>
>> This approach doesn't work very well in practice as a result of
>> https://issues.apache.org/jira/browse/MESOS-1812, which means tasks
>> are not launched in order on the slave, so there is no way to
>> guarantee the control task comes up first. This leads to all sorts of
>> interesting races.
>>
>> Is this a bad road to go down? I can't use framework messages, as I
>> don't believe they are a reliable way of sending signals, so I'm not
>> sure where else to turn.
>>
>> Cheers,
>>
>> Tom.
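
Below is a minimal sketch of the "kill task" workaround Vinod describes,
assuming a custom executor written against the Mesos Java API (as the
Hadoop port is). The ShutdownAwareExecutor class and the SHUTDOWN_PREFIX
task-ID convention are illustrative, not anything Mesos or the Hadoop
framework provides: the scheduler launches a sentinel task whose ID the
executor recognizes as a signal to commit suicide.

import org.apache.mesos.Executor;
import org.apache.mesos.ExecutorDriver;
import org.apache.mesos.MesosExecutorDriver;
import org.apache.mesos.Protos.ExecutorInfo;
import org.apache.mesos.Protos.FrameworkInfo;
import org.apache.mesos.Protos.SlaveInfo;
import org.apache.mesos.Protos.Status;
import org.apache.mesos.Protos.TaskID;
import org.apache.mesos.Protos.TaskInfo;
import org.apache.mesos.Protos.TaskState;
import org.apache.mesos.Protos.TaskStatus;

public class ShutdownAwareExecutor implements Executor {

  // Hypothetical convention shared between the scheduler and this
  // executor: any task whose ID starts with this prefix is a shutdown
  // signal rather than real work.
  private static final String SHUTDOWN_PREFIX = "shutdown-signal-";

  @Override
  public void launchTask(ExecutorDriver driver, TaskInfo task) {
    if (task.getTaskId().getValue().startsWith(SHUTDOWN_PREFIX)) {
      // Report the sentinel task as finished so the scheduler knows the
      // signal arrived, then stop the driver, which exits the executor.
      // (A real implementation might wait for the update to be acked
      // before stopping, to avoid losing it.)
      driver.sendStatusUpdate(TaskStatus.newBuilder()
          .setTaskId(task.getTaskId())
          .setState(TaskState.TASK_FINISHED)
          .build());
      driver.stop();
      return;
    }
    // ... normal path: start a TaskTracker / slot for this task ...
  }

  // Remaining callbacks reduced to stubs for brevity.
  @Override public void registered(ExecutorDriver d, ExecutorInfo e,
                                   FrameworkInfo f, SlaveInfo s) {}
  @Override public void reregistered(ExecutorDriver d, SlaveInfo s) {}
  @Override public void disconnected(ExecutorDriver d) {}
  @Override public void killTask(ExecutorDriver d, TaskID t) {}
  @Override public void frameworkMessage(ExecutorDriver d, byte[] data) {}
  @Override public void shutdown(ExecutorDriver d) {}
  @Override public void error(ExecutorDriver d, String message) {}

  public static void main(String[] args) {
    Status status =
        new MesosExecutorDriver(new ShutdownAwareExecutor()).run();
    System.exit(status == Status.DRIVER_STOPPED ? 0 : 1);
  }
}

Note that the sentinel still has to be launched from a current offer
like any other task, so it inherits the concern raised above: if the
slave is starved of resources and no offers arrive, the shutdown signal
never gets delivered.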