[ https://issues.apache.org/jira/browse/HADOOP-475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jothi Padmanabhan reassigned HADOOP-475:
----------------------------------------
Assignee: Jothi Padmanabhan (was: Vivek Ratan)
> The value iterator to reduce function should be clonable
> --------------------------------------------------------
>
> Key: HADOOP-475
> URL: https://issues.apache.org/jira/browse/HADOOP-475
> Project: Hadoop Core
> Issue Type: New Feature
> Components: mapred
> Reporter: Runping Qi
> Assignee: Jothi Padmanabhan
>
> In the current framework, when the user implements the reduce method of the
> Reducer class, the user can iterate through the value iterator only once.
> This makes it hard to perform join-like operations within the reduce method.
> To address this problem, one approach is to make the input value iterator
> clonable, so that the user can iterate over the values in different ways.
> If the iterator can also be reset, the user can perform nested iterations
> over the data and thus carry out join-like operations.
> The user code in the reduce method would be something like:
>   iterator1 = values.clone();
>   iterator2 = values.clone();
>   while (iterator1.hasNext()) {
>     val1 = iterator1.next();
>     iterator2.reset();
>     while (iterator2.hasNext()) {
>       val2 = iterator2.next();
>       // do something based on val1 and val2
>       // ...
>     }
>   }
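
The nested iteration quoted above can be sketched in plain Java, outside of Hadoop. This is only an illustration under stated assumptions: `BufferedValues`, `copy()`, `reset()` and `selfJoin` are hypothetical names, not part of any Hadoop API, and the sketch buffers every value for the key in memory, which a real implementation might not be able to afford.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ResettableIteratorSketch {

    /** Hypothetical iterator that buffers its source so it can be reset and copied. */
    static final class BufferedValues<T> implements Iterator<T> {
        private final List<T> buffer; // shared between copies, never modified after construction
        private int pos = 0;          // each copy keeps its own position

        BufferedValues(Iterator<T> source) {
            buffer = new ArrayList<>();
            while (source.hasNext()) {
                buffer.add(source.next()); // assumption: all values fit in memory
            }
        }

        private BufferedValues(List<T> shared) {
            buffer = shared;
        }

        @Override
        public boolean hasNext() {
            return pos < buffer.size();
        }

        @Override
        public T next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return buffer.get(pos++);
        }

        /** Rewinds this iterator to the first value. */
        void reset() {
            pos = 0;
        }

        /** Returns an independent iterator over the same buffered values, starting at the first. */
        BufferedValues<T> copy() {
            return new BufferedValues<>(buffer);
        }
    }

    /** Pairs every value with every value, following the nested loop from the description. */
    public static List<String> selfJoin(Iterator<String> values) {
        BufferedValues<String> iterator1 = new BufferedValues<>(values);
        BufferedValues<String> iterator2 = iterator1.copy();
        List<String> pairs = new ArrayList<>();
        while (iterator1.hasNext()) {
            String val1 = iterator1.next();
            iterator2.reset();
            while (iterator2.hasNext()) {
                String val2 = iterator2.next();
                pairs.add(val1 + "," + val2);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(selfJoin(Arrays.asList("a", "b").iterator()));
    }
}
```

Sharing one buffer between copies keeps `copy()` cheap, at the cost of holding all values for the key at once; a disk-backed buffer would be one alternative if the value lists are large.
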
> One possible optimization: if the values are sorted on a secondary key,
> the reset function can take a secondary key as an argument and reset the
> iterator to the beginning position of that secondary key. It would be very
> helpful to have a utility that returns a map of iterators, one per
> secondary key value, from the given iterator:
>   TreeMap getIteratorsBasedOnSecondaryKey(iterator);
> Each entry in the returned map is a pair of <secondary key, iterator
> for the values with that secondary key>.
>
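
A minimal sketch of the secondary-key utility described above, again in plain Java rather than against the Hadoop API. The method name is taken from the description, but the signature is an assumption: the description does not say how the secondary key is obtained from a value, so this sketch takes a hypothetical `secondaryKeyOf` function. It also buffers all values in memory and does not require the input to be pre-sorted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

public class SecondaryKeySketch {

    /**
     * Groups the values by secondary key and returns one iterator per key,
     * ordered by key. The key-extractor parameter is an assumption of this sketch.
     */
    public static <K extends Comparable<K>, V> TreeMap<K, Iterator<V>> getIteratorsBasedOnSecondaryKey(
            Iterator<V> values, Function<V, K> secondaryKeyOf) {
        // First pass: buffer the values into one list per secondary key.
        TreeMap<K, List<V>> groups = new TreeMap<>();
        while (values.hasNext()) {
            V value = values.next();
            groups.computeIfAbsent(secondaryKeyOf.apply(value), k -> new ArrayList<>()).add(value);
        }
        // Second pass: expose each group as an iterator.
        TreeMap<K, Iterator<V>> iterators = new TreeMap<>();
        for (Map.Entry<K, List<V>> entry : groups.entrySet()) {
            iterators.put(entry.getKey(), entry.getValue().iterator());
        }
        return iterators;
    }

    public static void main(String[] args) {
        // Hypothetical values tagged with their source; the tag before ':' acts as the secondary key.
        List<String> values = Arrays.asList("left:1", "right:x", "left:2");
        TreeMap<String, Iterator<String>> byKey = getIteratorsBasedOnSecondaryKey(
                values.iterator(), v -> v.substring(0, v.indexOf(':')));
        for (Map.Entry<String, Iterator<String>> entry : byKey.entrySet()) {
            StringBuilder line = new StringBuilder(entry.getKey()).append(" ->");
            while (entry.getValue().hasNext()) {
                line.append(' ').append(entry.getValue().next());
            }
            System.out.println(line);
        }
    }
}
```

With one iterator per source tag, a join inside reduce becomes a nested loop over two of the returned groups instead of a scan over the whole value list.
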