Hi,

I need to manually fail a task (a TIP, i.e. TaskInProgress), probably by failing its task attempts, when its tasktracker suddenly becomes unavailable. I want to do this from FairScheduler.update() (I have my own scheduler that extends FairScheduler and overrides its update() method).

The problem is that the job hangs when, after the tasktracker process has been killed, I simply do what JobClient usually does when you fail a task from the command line:

taskTrackerManager.killTask(taskAttemptId, true);

I simulate this as follows:

1. Run the job.
2. While the job is running, kill the tasktracker process.
3. When I detect this in my version of FairScheduler.update(), fail the task so that the JobTracker moves on instead of waiting for the global timeout.

What happens now when I do the above (for all task attempts of this TIP) is that the task attempts are marked as failed/killed, but the task still hangs. Additionally calling tip.kill() after failing all task attempts does not change this behavior.

Do you have any ideas on how to fail this task effectively, more or less instantly, so that the whole process goes on instead of hanging?

-- Wojtek