Tsuyoshi OZAWA created MAPREDUCE-4911:
Summary: Add node-level aggregation flag
feature(setLocalAggregation(boolean)) to JobConf
Key: MAPREDUCE-4911
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4911
Tsuyoshi OZAWA created MAPREDUCE-4910:
Summary: Adding AggregationWaitMap to some components(MRAppMaster,
TaskAttemptListener, JobImpl, MapTaskImpl).
Key: MAPREDUCE-4910
URL: https://issues.apache.org/jira/br
Arpit Agarwal created MAPREDUCE-4909:
Summary: TestKeyValueTextInputFormat fails with Open JDK 7 on
Windows
Key: MAPREDUCE-4909
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4909
Project: H
Hi Suresh,
This has been making good progress for a while; please see
https://issues.apache.org/jira/browse/MAPREDUCE-4502 and its sub-tasks.
On Wed, Jan 2, 2013 at 5:53 PM, Suresh S wrote:
> Hello,
>
> I think, running combiner function at node level (to combine all the
> map task output of the n
Hi Suresh,
I mean that in the current approach the combine phase occurs per mapper
instance only, NOT per node.
I think an additional local shuffle and sort phase would have to happen so
that you can combine per node.
But I am NOT really sure whether this is achievable. Y
It will definitely cost some overhead. But it leads to less intermediate
data movement and less time spent in the reduce phase. This benefit may
improve the overall performance.
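
A rough way to picture the saving is the sketch below. It is plain Python,
not Hadoop code: the input splits are made up, all three map tasks are
assumed to run on one node, and node_level_combine is a hypothetical helper
standing in for the proposed node-level step. It only counts how many
intermediate records would leave the node in each case.

```python
from collections import Counter

def map_and_combine(split):
    """One map task: emit (word, 1) pairs, then combine within this task only."""
    return Counter(split.split())

def node_level_combine(per_map_outputs):
    """Hypothetical extra step: merge the combined outputs of all maps on a node."""
    total = Counter()
    for out in per_map_outputs:
        total.update(out)
    return total

# Three map tasks assumed to run on the same node (made-up input splits).
splits = ["a b a c", "a b b", "c a"]
per_map = [map_and_combine(s) for s in splits]

# Records shipped to the reducers with per-map combining only.
records_per_map = sum(len(out) for out in per_map)
# Records shipped if the node's map outputs were combined once more.
records_per_node = len(node_level_combine(per_map))

print(records_per_map, records_per_node)
```

The gap between the two counts grows with the number of maps per node and
with how often keys repeat across them; the sketch ignores the cost of the
extra merge itself.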
*Regards*
*S.Suresh,*
*Research Scholar,*
*Department of Computer Applications,*
*National Institute of Technology,*
*Tiruchirappalli - 6
Continued,
Also, one more shuffle and sort phase would have to occur so that you can
merge/combine them properly.
So you should decide whether the overhead of the additional shuffle and
sort phase outweighs the benefit of combining per node.
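
To make the extra step concrete, here is a minimal sketch in plain Python
(not Hadoop code). It assumes each map task on the node has already produced
a sorted, combined run of (key, count) pairs; combine_sorted_runs and the
sample runs are hypothetical names and data used only for illustration.

```python
import heapq
from itertools import groupby
from operator import itemgetter

def combine_sorted_runs(runs):
    """Merge already-sorted (key, count) runs and sum the counts per key."""
    # The local "shuffle and sort": a k-way merge of the per-map runs.
    merged = heapq.merge(*runs, key=itemgetter(0))
    # The node-level "combine": one output record per distinct key.
    return [(key, sum(count for _, count in group))
            for key, group in groupby(merged, key=itemgetter(0))]

# Made-up sorted outputs of three map tasks on the same node.
run_a = [("a", 2), ("b", 1), ("c", 1)]
run_b = [("a", 1), ("b", 2)]
run_c = [("a", 1), ("c", 1)]

node_output = combine_sorted_runs([run_a, run_b, run_c])
print(node_output)
```

The merge touches every record of every run once, which is the overhead to
weigh against the smaller output.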
Best,
Mahesh Balija,
Calsoft Labs.
On Wed, Jan 2, 2013 at 6:14 PM, Mahesh B
Hi Suresh,
The combiner function aggregates the data from a single map instance,
NOT from all the maps running on a given node.
AFAIK, as the maps run in individual child JVMs, the intermediate data
would still need to be serialized (moved) so that
you