[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14017771#comment-14017771
 ] 

Carlo Curino commented on MAPREDUCE-5196:
-----------------------------------------

Answering Remus:

(I am not 100% sure, as I wrote this code over a year ago, but let me try to 
recall) 
As part of the preemption work we explored doing HDFS-based shuffling. 
The benefits of this were:
1) performance enhancements on certain data size ranges (stream-merge on the 
reducers)
2) the reducer checkpoint state was much smaller (no data, just offset of the 
last read key from each map)
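
To illustrate point 2, here is a minimal sketch of what an offsets-only
reducer checkpoint could look like. This is not the code from the patch:
the class name, method names and the text serialization are all hypothetical,
and it assumes map outputs stay readable (e.g. on HDFS) so that a restarted
reducer can seek back to the saved offset.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: a reducer checkpoint that stores offsets, not data. */
public class ReducerCheckpointSketch {
    // map task id -> offset just past the last key read from that map's output
    private final Map<String, Long> offsets = new TreeMap<>();

    /** Record progress on one map output; offsets only move forward. */
    public void recordRead(String mapId, long offset) {
        offsets.merge(mapId, offset, Long::max);
    }

    /** Offset to resume from after a restart (0 if the map was never read). */
    public long resumeOffset(String mapId) {
        return offsets.getOrDefault(mapId, 0L);
    }

    /** The whole checkpoint is one short "mapId=offset" line per map. */
    public String serialize() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : offsets.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    /** Rebuild a checkpoint from the text produced by serialize(). */
    public static ReducerCheckpointSketch deserialize(String text) {
        ReducerCheckpointSketch cp = new ReducerCheckpointSketch();
        for (String line : text.split("\n")) {
            if (line.isEmpty()) {
                continue;
            }
            int sep = line.lastIndexOf('=');
            cp.recordRead(line.substring(0, sep),
                          Long.parseLong(line.substring(sep + 1)));
        }
        return cp;
    }
}
```

The size of such a checkpoint grows with the number of maps rather than with
the amount of shuffled data, which is the property described above.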

That was an initial experiment, but making it robust was non-trivial
(missing map outputs were hard to recover), so we didn't push it yet. In
that context, the mapOutput was not on localFS but on HDFS, and the change
you pointed out was fixing that. But this clearly does not work on Windows.
My guess is that reverting that part should be fine here.



> CheckpointAMPreemptionPolicy implements preemption in MR AM via checkpointing 
> ------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5196
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5196
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mr-am, mrv2
>            Reporter: Carlo Curino
>            Assignee: Carlo Curino
>             Fix For: 3.0.0
>
>         Attachments: MAPREDUCE-5196.1.patch, MAPREDUCE-5196.2.patch, 
> MAPREDUCE-5196.3.patch, MAPREDUCE-5196.patch, MAPREDUCE-5196.patch
>
>
> This JIRA tracks a checkpoint-based AM preemption policy. The policy handles 
> propagation of the preemption requests received from the RM to the 
> appropriate tasks, and bookkeeping of checkpoints. Actual checkpointing of the 
> task state is handled in upcoming JIRAs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
