[ https://issues.apache.org/jira/browse/HADOOP-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542900 ]
Owen O'Malley commented on HADOOP-2141:
---------------------------------------

Ok, after talking to Koji, it looks like part of the problem was that the jobs were streaming and thus didn't time out after 10 minutes. I therefore filed HADOOP-2211 that proposes changing the default timeout for streaming to 1 hour instead of infinite.

> speculative execution start up condition based on completion time
> -----------------------------------------------------------------
>
>                 Key: HADOOP-2141
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2141
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.15.0
>            Reporter: Koji Noguchi
>            Assignee: Arun C Murthy
>             Fix For: 0.16.0
>
>
> We had one job with speculative execution hang.
> 4 reduce tasks were stuck with 95% completion because of a bad disk.
> Devaraj pointed out
> bq. One of the conditions that must be met for launching a speculative instance of a task is that it must be at least 20% behind the average progress, and this is not true here.
> It would be nice if speculative execution also starts up when tasks stop making progress.
> Devaraj suggested
> bq. Maybe, we should introduce a condition for average completion time for tasks in the speculative execution check.
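
To make the situation in the issue concrete, here is a minimal sketch of the two start-up conditions under discussion: the existing "at least 20% behind the average progress" check, and the proposed "task has stopped making progress" check. This is not Hadoop's actual scheduling code. All names (`TaskStatus`, `is_lagging`, `is_stalled`, `should_speculate`) are hypothetical, the 20% figure is read here as an absolute gap in progress, and the 600-second stall timeout in the usage note is only an illustrative value.

```python
from dataclasses import dataclass

# From the issue: a task must be at least 20% behind the average progress.
# Assumption: "20% behind" is treated as an absolute gap of 0.20.
PROGRESS_GAP = 0.20


@dataclass
class TaskStatus:
    """Hypothetical view of a running task (not a Hadoop class)."""
    progress: float            # fraction complete, 0.0 to 1.0
    last_progress_time: float  # seconds; when progress last advanced


def is_lagging(task: TaskStatus, avg_progress: float) -> bool:
    """Existing condition: task is at least PROGRESS_GAP behind the average."""
    return task.progress <= avg_progress - PROGRESS_GAP


def is_stalled(task: TaskStatus, now: float, stall_timeout: float) -> bool:
    """Proposed condition: no progress for at least stall_timeout seconds."""
    return (now - task.last_progress_time) >= stall_timeout


def should_speculate(task: TaskStatus, avg_progress: float,
                     now: float, stall_timeout: float) -> bool:
    """Launch a speculative instance if the task is lagging or stalled."""
    if task.progress >= 1.0:
        return False  # finished tasks never need a speculative instance
    return is_lagging(task, avg_progress) or is_stalled(task, now, stall_timeout)
```

With these definitions, a reduce task stuck at 95% while the average is 99% fails the lagging check, which matches the hang described in the issue, but it passes the stall check once it has gone longer than the timeout without progress, e.g. `should_speculate(TaskStatus(0.95, 100.0), 0.99, now=1000.0, stall_timeout=600.0)`. Devaraj's variant would instead compare a task's elapsed running time against the average completion time of finished tasks.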