[ https://issues.apache.org/jira/browse/HADOOP-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542852 ]
Milind Bhandarkar commented on HADOOP-2141:
-------------------------------------------

In my past life, using 2x the average to determine outliers has worked well (I don't know the theory behind it ;-)

Regarding the other questions: we have seen stuck tasks mostly for reduces (maybe because reduces write to DFS), but I would prefer a uniform treatment for tasks, regardless of map/reduce. Some streaming users do write to DFS in map tasks as side effects (taking care that each attempt writes to a separate file/directory). This will help them as well. Is there a strong reason for disabling it for maps?

> speculative execution start up condition based on completion time
> -----------------------------------------------------------------
>
>                 Key: HADOOP-2141
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2141
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.15.0
>            Reporter: Koji Noguchi
>            Assignee: Arun C Murthy
>             Fix For: 0.16.0
>
>
> We had one job with speculative execution hang.
> 4 reduce tasks were stuck with 95% completion because of a bad disk.
> Devaraj pointed out
> bq. One of the conditions that must be met for launching a speculative
> instance of a task is that it must be at least 20% behind the average
> progress, and this is not true here.
> It would be nice if speculative execution also starts up when tasks stop
> making progress.
> Devaraj suggested
> bq. Maybe, we should introduce a condition for average completion time for
> tasks in the speculative execution check.
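To make the discussion concrete, here is a minimal, hypothetical sketch (not the actual JobTracker/TaskInProgress code; names such as shouldSpeculate and avgCompletionTimeMs are invented for illustration) of how the existing "20% behind average progress" rule could be combined with a completion-time rule using the "2x the average" outlier heuristic mentioned above:

{code:java}
// Illustrative sketch only: these names are NOT the real Hadoop APIs.
// It combines the existing progress-lag condition with the proposed
// completion-time condition for launching a speculative attempt.
public class SpeculationCheck {

  // Existing condition: task progress lags the average by at least 20%.
  private static final double PROGRESS_LAG = 0.20;

  // Proposed condition: task has been running longer than 2x the average
  // completion time of already-finished sibling tasks.
  private static final double COMPLETION_TIME_FACTOR = 2.0;

  public static boolean shouldSpeculate(double taskProgress,
                                        double avgProgress,
                                        long taskRunTimeMs,
                                        long avgCompletionTimeMs,
                                        boolean alreadySpeculating) {
    if (alreadySpeculating) {
      return false;                       // never launch a second backup
    }
    boolean laggingProgress =
        taskProgress < avgProgress - PROGRESS_LAG;
    boolean stuck =
        avgCompletionTimeMs > 0 &&        // need some finished siblings
        taskRunTimeMs > COMPLETION_TIME_FACTOR * avgCompletionTimeMs;
    // Speculate if the task is far behind in progress OR has been running
    // much longer than its siblings took to finish (i.e. stopped making
    // progress), regardless of whether it is a map or a reduce.
    return laggingProgress || stuck;
  }
}
{code}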