[ https://issues.apache.org/jira/browse/MAPREDUCE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663200#comment-13663200 ]
Carlo Curino commented on MAPREDUCE-5176:
-----------------------------------------

The current implementation is reactive to a preemption request, since for the reduce phase preemption also means committing partial output for the task and, in general, requires saving possibly large state (which is not amenable to periodic invocations). We are considering a rework of the shuffle phase that would greatly reduce this state, in which case we could consider alternatives (this is related to this [HDFS-based shuffle comment | https://issues.apache.org/jira/browse/YARN-666?focusedCommentId=13659024&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13659024]).

More generally, I think checkpointing is a very basic and flexible tool that can also be used for:
1) dynamic optimizations (e.g., task-splitting to handle skew, or pausing reducers when there is heavy map skew and reducers are sitting idle waiting for the last map to complete)
2) general fault tolerance (e.g., this would allow running a number of tasks that matches the available slots, and using checkpoints to limit the wasted work in case of failures).

This generality is why we factored out a CheckpointService API in MAPREDUCE-5197 instead of simply writing to HDFS from our code (BTW, check out the CheckpointService API if you have time; I am curious to get people's reaction to it).

> Preemptable annotations (to support preemption in MR)
> -----------------------------------------------------
>
>                 Key: MAPREDUCE-5176
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5176
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>            Reporter: Carlo Curino
>            Assignee: Carlo Curino
>         Attachments: MAPREDUCE-5176.1.patch, MAPREDUCE-5176.patch
>
>
> Proposing a patch that introduces a new annotation @Preemptable that
> conveys to the framework a property of user-supplied classes (e.g., Reducer,
> OutputCommitter).
> The intended semantics is that a tagged class is safe to be
> preempted between invocations.
> (This is similar in spirit to the Output Contracts of [Nephele/PACT |
> https://stratosphere.eu/sites/default/files/papers/ComparingMapReduceAndPACTs_11.pdf])
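
To make the intended @Preemptable semantics concrete, here is a minimal Java sketch of how a marker annotation of this kind could be declared and queried. It is an illustration only and is not taken from the attached patches: the retention and target settings, the classes StatelessSum and RunningTotal, and the helper PreemptionCheck.isPreemptable are all hypothetical names chosen for this example, and the real patch may define or consume the annotation differently.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical stand-in for the proposed annotation (assumed shape, not the patch itself).
@Retention(RetentionPolicy.RUNTIME)  // kept at runtime so a framework can inspect it reflectively
@Target(ElementType.TYPE)            // placed on classes, e.g. a Reducer or OutputCommitter
@interface Preemptable {}

// Hypothetical user class that keeps no state between invocations, so its author tags it.
@Preemptable
class StatelessSum {
    int reduce(int[] values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }
}

// Hypothetical user class that accumulates state across invocations; its author leaves it untagged.
class RunningTotal {
    private int total = 0;

    int reduce(int[] values) {
        for (int v : values) {
            total += v;
        }
        return total;
    }
}

class PreemptionCheck {
    // Framework-side check (assumed): only classes that opted in are treated as
    // safe to preempt between invocations.
    static boolean isPreemptable(Class<?> userClass) {
        return userClass.isAnnotationPresent(Preemptable.class);
    }

    public static void main(String[] args) {
        System.out.println(isPreemptable(StatelessSum.class));
        System.out.println(isPreemptable(RunningTotal.class));
    }
}
```

In this sketch the annotation is purely a declaration by the class author; the framework trusts it and does nothing to verify that the tagged class really is safe to preempt.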