This SO question was asked about a year ago:
http://stackoverflow.com/questions/31799755/how-to-deal-with-tasks-running-too-long-comparing-to-others-in-job-in-yarn-cli

I answered that question suggesting speculation, but it doesn't quite do
what the OP expects. I have been running into this issue more often lately.
Out of 5,000 tasks, 4,950 complete in about 5 minutes, but the last 50
never really complete; I have tried waiting for 4 hours. This could be a
memory issue, or perhaps an artifact of how Spark's fine-grained mode works
with Mesos. I am trying to enable the JMX sink to get a heap dump.
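For reference, this is roughly how I am enabling the JMX sink, via a
metrics.properties file passed to the executors (paths and scope are just
what I happen to use; adjust as needed):

```properties
# metrics.properties -- enable the JMX sink for all Spark components
*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
```

Then point Spark at it with --files metrics.properties and
--conf spark.metrics.conf=metrics.properties, and attach a JMX-capable tool
to the executor JVMs to trigger the heap dump.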

But in the meantime, is there a better fix for this (in any version of
Spark; I am using 1.5.1 but can upgrade)? It would be great if the last 50
tasks in my example could be killed (timed out) and the stage could still
complete successfully.
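For context, these are the speculation settings I pointed the OP at; the
numeric values here are illustrative, not a recommendation:

```shell
spark-submit \
  --conf spark.speculation=true \
  --conf spark.speculation.interval=100ms \
  --conf spark.speculation.quantile=0.75 \
  --conf spark.speculation.multiplier=1.5 \
  ...
```

The catch is that speculation only launches duplicate copies of slow tasks;
it never kills the originals, so a stage still can't finish until one copy
of each task succeeds.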

-- 
Thanks,
-Utkarsh
