[ https://issues.apache.org/jira/browse/MAPREDUCE-6166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14223009#comment-14223009 ]
Jason Lowe commented on MAPREDUCE-6166:
---------------------------------------

I'd be a little wary of doing this. I believe the MergeManager and MapOutput classes are being used by third-party software like SyncSort, see MAPREDUCE-4808, MAPREDUCE-4039, and related JIRAs. Changing the input stream passed to mapOutput.shuffle to an IFileInputStream subtly changes the behavior of calling read() on the data. Before this change, calling read() would read all the data and the checksum. After the stream is wrapped at a higher level, it won't. If the third-party software is itself wrapping the stream with IFileInputStream to handle the trailing checksum, then after this change the stream would be double-wrapped and checksum verification would fail.

> Reducers do not catch bad map output transfers during shuffle if data
> shuffled directly to disk
> -----------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6166
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6166
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 2.6.0
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>         Attachments: MAPREDUCE-6166.v1.201411221941.txt
>
>
> In very large map/reduce jobs (50000 maps, 2500 reducers), the intermediate
> map partition output gets corrupted on disk on the map side. If this
> corrupted map output is too large to shuffle in memory, the reducer streams
> it to disk without validating the checksum. In jobs this large, it could take
> hours before the reducer finally tries to read the corrupted file and fails.
> Since retries of the failed reduce attempt will also take hours, this delay
> in discovering the failure is multiplied greatly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
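
To make the double-wrapping concern above concrete, here is a rough, self-contained sketch. ChecksummedStream is a hypothetical, simplified stand-in for IFileInputStream (it assumes a simple trailing-CRC32 framing and is not the actual org.apache.hadoop.mapred class, and DoubleWrapDemo is likewise made up for illustration): the inner wrapper consumes and verifies the trailing checksum itself, so an outer wrapper added by third-party code never sees those checksum bytes and verification fails.

{code:java}
// ChecksummedStream: a hypothetical, simplified stand-in for an IFile-style
// stream. The last 8 bytes of the underlying stream are a CRC32 over the
// payload; the wrapper consumes and verifies them, returning only payload.
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;

class ChecksummedStream extends InputStream {
  private final InputStream in;
  private final long dataLen;        // payload length, excluding the trailing checksum
  private final CRC32 crc = new CRC32();
  private long bytesRead = 0;
  private boolean verified = false;

  ChecksummedStream(InputStream in, long totalLen) {
    this.in = in;
    this.dataLen = totalLen - 8;     // last 8 bytes hold the CRC32
  }

  @Override
  public int read() throws IOException {
    if (bytesRead == dataLen) {
      if (!verified) {
        verifyTrailingChecksum();    // consumes the checksum bytes from 'in'
        verified = true;
      }
      return -1;                     // the caller only ever sees the payload
    }
    int b = in.read();
    if (b < 0) {
      throw new IOException("unexpected EOF in payload");
    }
    crc.update(b);
    bytesRead++;
    return b;
  }

  private void verifyTrailingChecksum() throws IOException {
    long stored = 0;
    for (int i = 0; i < 8; i++) {
      int b = in.read();
      if (b < 0) {
        throw new IOException("EOF while reading trailing checksum");
      }
      stored = (stored << 8) | b;
    }
    if (stored != crc.getValue()) {
      throw new IOException("checksum mismatch");
    }
  }
}

public class DoubleWrapDemo {
  public static void main(String[] args) throws IOException {
    // Build a framed buffer: payload followed by its big-endian CRC32.
    byte[] payload = "some shuffled map output".getBytes("UTF-8");
    CRC32 crc = new CRC32();
    crc.update(payload);
    byte[] framed = new byte[payload.length + 8];
    System.arraycopy(payload, 0, framed, 0, payload.length);
    long sum = crc.getValue();
    for (int i = 0; i < 8; i++) {
      framed[payload.length + i] = (byte) (sum >>> (8 * (7 - i)));
    }

    // Wrapped once: the payload is returned and the trailing checksum verifies.
    drain(new ChecksummedStream(new ByteArrayInputStream(framed), framed.length));
    System.out.println("single wrap: ok");

    // Wrapped twice, as would happen if the framework already handed the
    // third-party code a wrapped stream and it wrapped again: the inner
    // wrapper consumes the checksum, so the outer wrapper cannot verify.
    InputStream inner = new ChecksummedStream(new ByteArrayInputStream(framed), framed.length);
    try {
      drain(new ChecksummedStream(inner, framed.length));
    } catch (IOException e) {
      System.out.println("double wrap: " + e.getMessage());
    }
  }

  private static void drain(InputStream in) throws IOException {
    while (in.read() != -1) { /* consume everything */ }
  }
}
{code}

The analogous situation in the shuffle path would be a third-party MapOutput implementation wrapping the stream it receives in shuffle with IFileInputStream to handle the trailing checksum, after the framework has already wrapped it at a higher level.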