[
https://issues.apache.org/jira/browse/HADOOP-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Amar Kamat updated HADOOP-4716:
-------------------------------
Attachment: HADOOP-4716-v1.1.patch
Looks like {{TestJobTrackerRestartWithLostTracker}} can still fail. Consider a
map
- that is hosted by a tracker that will be lost
- whose completion event is not yet logged to the job history (i.e. it is still
in the buffer)
Call it a _hanging-map_. Once the reducer reaches the _hanging-map_, it will be
stuck there forever, as the map location is not known to the jobtracker and
hence won't be added to the _ignore-list_. The previous fix solves the issue.
What it does is:
- clears the old/stale mapping of map output locations
- reduces the backoff time so that the reducer doesn't back off for long
Updated the patch to trunk.
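For readers who want a picture of the two changes in the fix described above, here is a rough, self-contained sketch. It is not the code from the patch: the class {{ReducerFetchSketch}}, its methods, the idea of clearing the mapping on a jobtracker restart, and the backoff constants are all assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from the Hadoop code base.
class ReducerFetchSketch {
    // Assumed values, chosen only to make the example concrete.
    static final long BASE_BACKOFF_MS = 1_000L;
    static final long MAX_BACKOFF_MS = 4_000L;

    // map task id -> tracker host believed to hold its output
    private final Map<String, String> mapLocations = new HashMap<>();
    // map task id -> number of consecutive failed fetches
    private final Map<String, Integer> failures = new HashMap<>();

    void recordLocation(String mapId, String host) {
        mapLocations.put(mapId, host);
    }

    String locationOf(String mapId) {
        return mapLocations.get(mapId);
    }

    // Assumed trigger: the reducer notices that the jobtracker restarted.
    // Dropping every cached location means a map whose tracker was lost
    // cannot keep the reducer pointed at a stale host.
    void onJobTrackerRestart() {
        mapLocations.clear();
        failures.clear();
    }

    // Exponential backoff with a low cap, so that the reducer retries
    // (and can notice a problem) quickly instead of sleeping for long.
    long recordFailureAndGetBackoff(String mapId) {
        int attempts = failures.merge(mapId, 1, Integer::sum);
        long delay = BASE_BACKOFF_MS << Math.min(attempts - 1, 10);
        return Math.min(delay, MAX_BACKOFF_MS);
    }
}
```

With these assumed constants, consecutive failures for one map wait 1s, 2s, then 4s, and never longer than 4s; after a restart the stale location is gone and the failure count starts over.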
> testRestartWithLostTracker frequently times out
> -----------------------------------------------
>
> Key: HADOOP-4716
> URL: https://issues.apache.org/jira/browse/HADOOP-4716
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Johan Oskarsson
> Assignee: Amar Kamat
> Priority: Blocker
> Fix For: 0.20.0
>
> Attachments: HADOOP-4716-v1.1.patch, HADOOP-4716-v1.patch, log.txt
>
>
> This test frequently times out:
> org.apache.hadoop.mapred.TestJobTrackerRestartWithLostTracker.testRestartWithLostTracker
> Example:
> http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3637/testReport/org.apache.hadoop.mapred/TestJobTrackerRestartWithLostTracker/testRestartWithLostTracker/