[
https://issues.apache.org/jira/browse/MAPREDUCE-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123332#comment-14123332
]
Hadoop QA commented on MAPREDUCE-2841:
--------------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12666832/mr-2841-merge.txt
against trunk revision 9e941d9.
{color:green}+1 @author{color}. The patch does not contain any @author
tags.
{color:green}+1 tests included{color}. The patch appears to include 71 new
or modified test files.
{color:red}-1 javac{color}. The applied patch generated 1304 javac
compiler warnings (more than the trunk's current 1264 warnings).
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 3
warning messages.
See
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4855//artifact/trunk/patchprocess/diffJavadocWarnings.txt
for details.
{color:green}+1 eclipse:eclipse{color}. The patch built with
eclipse:eclipse.
{color:red}-1 findbugs{color}. The patch appears to cause Findbugs
(version 2.0.3) to fail.
{color:red}-1 release audit{color}. The applied patch generated 8
release audit warnings.
{color:red}-1 core tests{color}. The test build failed in
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/sdk/example/CustomModule
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4855//testReport/
Release audit warnings:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4855//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Javac warnings:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4855//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4855//console
This message is automatically generated.
> Task level native optimization
> ------------------------------
>
> Key: MAPREDUCE-2841
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2841
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: task
> Environment: x86-64 Linux/Unix
> Reporter: Binglin Chang
> Assignee: Sean Zhong
> Attachments: DESIGN.html, MAPREDUCE-2841.v1.patch,
> MAPREDUCE-2841.v2.patch, MR-2841benchmarks.pdf, dualpivot-0.patch,
> dualpivotv20-0.patch, fb-shuffle.patch,
> hadoop-3.0-mapreduce-2841-2014-7-17.patch, micro-benchmark.txt,
> mr-2841-merge.txt
>
>
> I have recently been working on native optimization for MapTask based on JNI.
> The basic idea is to add a NativeMapOutputCollector to handle k/v pairs
> emitted by the mapper, so that sort, spill, and IFile serialization can all be
> done in native code. A preliminary test (on Xeon E5410, jdk6u24) showed
> promising results:
> 1. Sort is about 3x-10x as fast as Java (only binary string comparison is
> supported)
> 2. IFile serialization is about 3x as fast as Java, about 500MB/s; if hardware
> CRC32C is used, things can get much faster(1G/
> 3. Merge code is not complete yet, so the test uses a large enough io.sort.mb
> to prevent mid-spill
> This leads to a total speedup of 2x~3x for the whole MapTask when
> IdentityMapper (a mapper that does nothing) is used.
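
The "binary string comparison" mentioned in item 1 can be illustrated with a minimal sketch of unsigned, byte-wise lexicographic comparison over serialized keys. This is written in Java for readability and is not the native code from the patch; the class name BinaryCompareSketch and the sample keys are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BinaryCompareSketch {
    // Compare two byte ranges lexicographically, treating each byte as unsigned.
    // A range that is a strict prefix of the other sorts first.
    static int compareBytes(byte[] a, int aOff, int aLen,
                            byte[] b, int bOff, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int x = a[aOff + i] & 0xff; // mask to get the unsigned value
            int y = b[bOff + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return aLen - bLen;
    }

    public static void main(String[] args) {
        // Hypothetical serialized keys; a real collector would sort
        // offsets into a shared buffer rather than separate arrays.
        byte[][] keys = {
            "pear".getBytes(StandardCharsets.UTF_8),
            "apple".getBytes(StandardCharsets.UTF_8),
            "banana".getBytes(StandardCharsets.UTF_8),
        };
        Arrays.sort(keys, (p, q) -> compareBytes(p, 0, p.length, q, 0, q.length));
        for (byte[] k : keys) {
            System.out.println(new String(k, StandardCharsets.UTF_8));
        }
    }
}
```

Because the comparison never deserializes the key, it only gives the intended order for types whose serialized bytes sort the same way as their logical values, which is consistent with the restriction stated in item 1.
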
> There are limitations, of course: currently only Text and BytesWritable are
> supported, and I have not yet thought through many things, such as how to
> support map-side combine. I discussed this with someone familiar with
> Hive, and it seems these limitations won't be much of a problem for Hive to
> benefit from these optimizations, at least. Advice or discussion about
> improving compatibility is most welcome :)
> Currently NativeMapOutputCollector has a static method called canEnable(),
> which checks whether the key/value types, comparator type, and combiner are
> all compatible; if they are, MapTask can choose to enable
> NativeMapOutputCollector.
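
A minimal sketch of what a canEnable()-style compatibility check might look like follows. It is not the method from the patch: the class name CanEnableSketch, the parameter list, and the rule that any combiner disables the native path are assumptions made for illustration. Types are passed as class-name strings so the example runs without Hadoop on the classpath.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class CanEnableSketch {
    // The two types the description lists as supported.
    static final Set<String> SUPPORTED_TYPES = new HashSet<>(Arrays.asList(
        "org.apache.hadoop.io.Text",
        "org.apache.hadoop.io.BytesWritable"));

    // Hypothetical signature: returns true only if every input is compatible
    // with a native collector that sorts by raw bytes and has no combiner
    // support (an assumption based on the limitations in the description).
    static boolean canEnable(String keyClass, String valueClass,
                             boolean usesBinaryComparator, boolean hasCombiner) {
        return SUPPORTED_TYPES.contains(keyClass)
            && SUPPORTED_TYPES.contains(valueClass)
            && usesBinaryComparator
            && !hasCombiner;
    }

    public static void main(String[] args) {
        boolean nativeOk = canEnable(
            "org.apache.hadoop.io.Text",
            "org.apache.hadoop.io.BytesWritable",
            true,   // job uses the default byte-wise comparator
            false); // no combiner configured
        // The caller falls back to the Java collector when the check fails.
        System.out.println(nativeOk ? "use native collector" : "use Java collector");
    }
}
```

Keeping the check in one static method lets the caller decide per job whether to fall back, without constructing the native collector first.
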
> This is only a preliminary test; more work needs to be done. I expect better
> final results, and I believe similar optimizations can be adopted for the
> reduce task and shuffle too.