[
https://issues.apache.org/jira/browse/HADOOP-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jothi Padmanabhan updated HADOOP-4683:
--------------------------------------
Attachment: hadoop-4683.patch
Attaching a patch.
A loadgen run on 100 nodes with 100K maps, each writing 100 bytes, showed roughly a 3x performance improvement
(~800 seconds with the patch, ~2500 seconds without it):
{noformat}
bin/hadoop jar hadoop-$BUILD-test.jar loadgen \
-D test.randomtextwrite.bytes_per_map=$((100)) \
-D test.randomtextwrite.total_bytes=$((100*100000)) \
-D mapred.compress.map.output=false \
-r 1 \
-outKey org.apache.hadoop.io.Text \
-outValue org.apache.hadoop.io.Text \
-outFormat org.apache.hadoop.mapred.lib.NullOutputFormat \
-outdir fakeout
{noformat}
Testpatch results:
[exec] -1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
[exec] Please justify why no tests are needed for this patch.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
[exec]
> Move the call to getMapCompletionEvents in
> ReduceTask.ReduceCopier.fetchOutputs to a separate thread
> ----------------------------------------------------------------------------------------------------
>
> Key: HADOOP-4683
> URL: https://issues.apache.org/jira/browse/HADOOP-4683
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Devaraj Das
> Assignee: Jothi Padmanabhan
> Fix For: 0.20.0
>
> Attachments: hadoop-4683.patch
>
>
> The method ReduceTask.ReduceCopier.fetchOutputs calls
> getMapCompletionEvents on every iteration of its loop. Because there is a
> sleep inside the getMapCompletionEvents method, this call might slow down the
> shuffle scheduler in some cases. The call should be moved out to a separate thread.
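
As an illustration of the change requested above, here is a minimal Java sketch of moving a sleeping poll call onto its own thread so that the fetch loop never waits on it. This is not the attached patch: the class MapEventPollerSketch, the EventSource interface, and the string event names are hypothetical stand-ins, and only the method name getMapCompletionEvents comes from the issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: poll completion events on a dedicated thread. */
final class MapEventPollerSketch {

    /** Hypothetical stand-in for whatever serves map-completion events. */
    interface EventSource {
        /** Returns events starting at fromIndex; may sleep before returning. */
        List<String> getMapCompletionEvents(int fromIndex) throws InterruptedException;
    }

    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final Thread poller;

    MapEventPollerSketch(EventSource source) {
        poller = new Thread(() -> {
            int next = 0;
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    // Any sleep inside this call now delays only this thread.
                    List<String> events = source.getMapCompletionEvents(next);
                    next += events.size();
                    pending.addAll(events);
                }
            } catch (InterruptedException e) {
                // stop() interrupts the poller; exit quietly.
            }
        }, "map-event-poller");
        poller.setDaemon(true);
    }

    void start() {
        poller.start();
    }

    void stop() throws InterruptedException {
        poller.interrupt();
        poller.join();
    }

    /** Called from the fetch loop: returns what is queued without waiting. */
    List<String> drainNewEvents() {
        List<String> out = new ArrayList<>();
        pending.drainTo(out);
        return out;
    }

    /** Small demo: a slow source delivers three events, one per call. */
    static List<String> runDemo() {
        List<String> all = Arrays.asList("map_0", "map_1", "map_2");
        EventSource slow = from -> {
            Thread.sleep(20); // simulates the sleep mentioned in the issue
            if (from < all.size()) {
                return all.subList(from, from + 1);
            }
            return Collections.emptyList();
        };
        MapEventPollerSketch p = new MapEventPollerSketch(slow);
        List<String> seen = new ArrayList<>();
        try {
            p.start();
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (seen.size() < all.size() && System.nanoTime() < deadline) {
                seen.addAll(p.drainNewEvents());
                Thread.sleep(1); // stand-in for scheduling copies
            }
            p.stop();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

In this sketch the fetch loop only drains a queue, so its iteration rate is independent of how long the event source sleeps. A single producer thread and a FIFO queue keep the events in arrival order.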