[ https://issues.apache.org/jira/browse/MAPREDUCE-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12733494#action_12733494 ]
Owen O'Malley commented on MAPREDUCE-372:
-----------------------------------------

We really should make the context objects into interfaces. I agree that the new API makes this harder, because the run method means you have to allow a pull model instead of a push.

The easiest way to do it would be to use blocking queues, with each stage in the pipeline running in a separate thread. So the first mapper would read from the RecordReader (via the "real" context) and write its outputs into a BlockingQueue. The next step would pull from that BlockingQueue and write to the next BlockingQueue, and so on until the last step writes to the "real" context. Thus each thread is in the "run" method of its own pipeline stage.

Issues include:
1. Needing a thread per step.
2. Needing to clone the keys and values between steps.
3. Needing to figure out the size of the queues. Probably 1 to start with...

> Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api.
> -----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-372
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-372
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
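
The blocking-queue pipeline proposed in the comment above can be sketched in plain Java roughly as follows. This is only an illustration under several assumptions, not the ChainMapper implementation: it uses no Hadoop classes, records are plain strings rather than key/value pairs, and the names ChainPipelineSketch, startStage, runChain and the EOF marker are all hypothetical. A feeder thread stands in for the RecordReader side of the "real" context, and the calling thread stands in for the output side.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

final class ChainPipelineSketch {
    // Hypothetical end-of-stream marker; compared by identity, never by equals().
    private static final String EOF = new String("EOF");

    // One pipeline step in its own thread: pull from `in`, apply `step`, push to `out`.
    static Thread startStage(BlockingQueue<String> in, BlockingQueue<String> out,
                             Function<String, String> step) {
        Thread t = new Thread(() -> {
            try {
                for (String rec = in.take(); rec != EOF; rec = in.take()) {
                    // Hand a fresh copy to the next step. Java strings are immutable,
                    // so the copy is only illustrative here; it marks the point where
                    // reused mutable key/value objects would have to be cloned.
                    out.put(new String(step.apply(rec)));
                }
                out.put(EOF); // tell the next stage that the input is exhausted
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        return t;
    }

    // Runs `input` through `steps` in order, one thread per step, queues of size 1.
    static List<String> runChain(List<String> input, List<Function<String, String>> steps) {
        BlockingQueue<String> head = new ArrayBlockingQueue<>(1);
        List<Thread> threads = new ArrayList<>();

        // Stand-in for the RecordReader: feeds the first queue.
        Thread feeder = new Thread(() -> {
            try {
                for (String rec : input) {
                    head.put(rec);
                }
                head.put(EOF);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        feeder.start();
        threads.add(feeder);

        // Wire each step between the previous queue and a new one.
        BlockingQueue<String> tail = head;
        for (Function<String, String> step : steps) {
            BlockingQueue<String> next = new ArrayBlockingQueue<>(1);
            threads.add(startStage(tail, next, step));
            tail = next;
        }

        // Stand-in for the "real" output context: drain the last queue.
        List<String> result = new ArrayList<>();
        try {
            for (String rec = tail.take(); rec != EOF; rec = tail.take()) {
                result.add(rec);
            }
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining the chain", e);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Function<String, String>> steps = new ArrayList<>();
        steps.add(String::toUpperCase);
        steps.add(s -> s + "!");
        System.out.println(runChain(Arrays.asList("a", "b", "c"), steps)); // prints [A!, B!, C!]
    }
}
```

The sketch touches all three issues listed in the comment: it starts one thread per step, it copies each record before handing it to the next step, and it fixes the queue capacity at 1. A larger capacity would let neighbouring steps overlap more, at the cost of holding more copied records in memory.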