[ https://issues.apache.org/jira/browse/HADOOP-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Douglas updated HADOOP-4302:
----------------------------------
Attachment: 4302-2.patch
*sigh* There's still a race condition in the last patch. If the third output is
being fetched (allocated) but not yet closed when the second closes, it's
possible to merge the first two to disk before allocating the following three,
which triggers a similar fault. In that case, the reduce will begin with all
segments merged to disk. The solution sets {{mapred.job.shuffle.merge.percent}}
high enough to avoid an intermediate merge in the test until the fetch thread
is stalled on the final output.
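
For illustration only, here is a toy model of why raising the merge threshold
avoids the intermediate merge. It is not the code in the patch or in
ReduceTask. The class name, method name, buffer size, and segment sizes are
all hypothetical, and the model is single-threaded, so it shows only the
threshold arithmetic and not the race itself.

```java
public class MergeThresholdSketch {

    /**
     * Hypothetical model of a shuffle buffer: segments are fetched in order,
     * and an intermediate merge flushes everything held in memory to disk
     * once usage reaches mergePercent of the buffer.
     * Returns the number of segments flushed by intermediate merges.
     */
    public static int segmentsMergedToDisk(long bufferBytes,
                                           double mergePercent,
                                           long[] segmentSizes) {
        long used = 0;        // bytes currently held in memory
        int inMemory = 0;     // segments currently held in memory
        int mergedToDisk = 0; // segments flushed by intermediate merges
        for (long size : segmentSizes) {
            used += size;
            inMemory++;
            if (used >= mergePercent * bufferBytes) {
                // Threshold reached: merge all in-memory segments to disk.
                mergedToDisk += inMemory;
                used = 0;
                inMemory = 0;
            }
        }
        return mergedToDisk;
    }

    public static void main(String[] args) {
        long buffer = 100;              // assumed buffer size
        long[] segments = {30, 30, 30}; // assumed map output sizes
        // Low threshold: the first two segments are merged to disk early.
        System.out.println(segmentsMergedToDisk(buffer, 0.5, segments));  // prints 2
        // High threshold: no intermediate merge before the final output.
        System.out.println(segmentsMergedToDisk(buffer, 0.95, segments)); // prints 0
    }
}
```

With the threshold at 0.95 of the buffer, none of the three segments in this
model triggers a merge, which is the condition the test depends on.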
> TestReduceFetch fails intermittently
> ------------------------------------
>
> Key: HADOOP-4302
> URL: https://issues.apache.org/jira/browse/HADOOP-4302
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.19.0
> Reporter: Devaraj Das
> Assignee: Chris Douglas
> Priority: Blocker
> Fix For: 0.19.0
>
> Attachments: 0J5g5b71.html.part, 4302-0.patch, 4302-1.patch,
> 4302-2.patch
>
>
> I see TestReduceFetch failing once in a while. Here is one such failure:
> http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3350/testReport/org.apache.hadoop.mapred/TestReduceFetch/testReduceFromPartialMem/
> Marking this as a blocker until we get to the root cause.