SammiChen created HADOOP-15499:
----------------------------------

             Summary: Severe performance drop when running 
RawErasureCoderBenchmark with NativeRSRawErasureCoder
                 Key: HADOOP-15499
                 URL: https://issues.apache.org/jira/browse/HADOOP-15499
             Project: Hadoop Common
          Issue Type: Improvement
    Affects Versions: 3.0.2, 3.0.1, 3.0.0
            Reporter: SammiChen
            Assignee: SammiChen


Run RawErasureCoderBenchmark, a micro-benchmark that measures the 
encoding/decoding performance of EC codecs. 

With 50 concurrent threads, the native ISA-L coder has lower throughput than 
with 1 thread (4662 MB/s vs. 5390.37 MB/s in the runs below). This is abnormal. 

 

bin/hadoop jar ./share/hadoop/common/hadoop-common-3.2.0-SNAPSHOT-tests.jar 
org.apache.hadoop.io.erasurecode.rawcoder.RawErasureCoderBenchmark encode 3 1 
1024 1024
Using 126MB buffer.
ISA-L coder encode 1008MB data, with chunk size 1024KB
Total time: 0.19 s.
Total throughput: 5390.37 MB/s
Threads statistics:
1 threads in total.
Min: 0.18 s, Max: 0.18 s, Avg: 0.18 s, 90th Percentile: 0.18 s.

 

bin/hadoop jar ./share/hadoop/common/hadoop-common-3.2.0-SNAPSHOT-tests.jar 
org.apache.hadoop.io.erasurecode.rawcoder.RawErasureCoderBenchmark encode 3 50 
1024 10240
Using 120MB buffer.
ISA-L coder encode 54000MB data, with chunk size 10240KB
Total time: 11.58 s.
Total throughput: 4662 MB/s
Threads statistics:
50 threads in total.
Min: 0.55 s, Max: 11.5 s, Avg: 6.32 s, 90th Percentile: 10.45 s.

 

RawErasureCoderBenchmark shares a single coder among all concurrent threads, 
while NativeRSRawEncoder and NativeRSRawDecoder use the synchronized keyword on 
their doEncode and doDecode functions. As a result, the 50 concurrent threads 
are forced to call the shared coder's encode/decode function one at a time. 
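
The serialization described above can be reproduced outside Hadoop with a small 
sketch. Everything here is hypothetical: SynchronizedCoderSketch, its doEncode 
method, and the sleep that stands in for the native encode call are 
illustrative names, not the actual NativeRSRawEncoder code. The sketch records 
how many threads are inside doEncode at the same moment when several workers 
share one instance.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a coder whose encode method is synchronized. */
public class SynchronizedCoderSketch {
  private final AtomicInteger inside = new AtomicInteger();
  private final AtomicInteger maxInside = new AtomicInteger();

  // The whole call holds the object's monitor, so callers run one at a time.
  public synchronized void doEncode() {
    int now = inside.incrementAndGet();
    maxInside.accumulateAndGet(now, Math::max);
    try {
      Thread.sleep(2); // placeholder for the native encode work (assumption)
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      inside.decrementAndGet();
    }
  }

  public int maxConcurrentCallers() {
    return maxInside.get();
  }

  /** Runs the given number of workers against ONE shared coder; returns the peak overlap. */
  public static int runShared(int threads, int callsPerThread) {
    SynchronizedCoderSketch coder = new SynchronizedCoderSketch();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int t = 0; t < threads; t++) {
      pool.execute(() -> {
        for (int i = 0; i < callsPerThread; i++) {
          coder.doEncode();
        }
      });
    }
    pool.shutdown();
    try {
      pool.awaitTermination(1, TimeUnit.MINUTES);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return coder.maxConcurrentCallers();
  }

  public static void main(String[] args) {
    // The peak is 1 because the monitor admits a single caller at a time.
    System.out.println("peak concurrent callers: " + runShared(8, 5));
  }
}
```

Because the peak overlap stays at 1 regardless of the thread count, adding 
threads to a shared synchronized coder only adds waiting time, which is 
consistent with the wide Min/Max spread in the 50-thread run.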

 

To resolve the issue, there are two approaches. 
 # Refactor RawErasureCoderBenchmark to use a dedicated coder for each 
concurrent thread.
 # Refactor NativeRSRawEncoder and NativeRSRawDecoder for better 
concurrency. The synchronized keyword is there to protect the private 
variable nativeCoder, which is checked in doEncode/doDecode and modified in 
release. We can use a ReentrantReadWriteLock instead to increase concurrency, 
since doEncode/doDecode can be called multiple times without changing the 
nativeCoder state.
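
A minimal sketch of the read/write-lock idea in approach 2 follows. It is not 
the patch itself: RwLockCoderSketch, the long-valued nativeCoder handle with 0 
meaning "released", the XOR loop standing in for the native call, and the use 
of IllegalStateException are all assumptions made for illustration. Encode 
calls take the read lock, which many threads can hold together, while release 
takes the write lock and therefore waits for in-flight encodes to finish.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical coder guarded by a read/write lock instead of synchronized. */
public class RwLockCoderSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private long nativeCoder = 1L; // hypothetical handle; 0 means released

  /** Readers share the lock, so concurrent encodes do not block each other. */
  public int doEncode(byte[] input) {
    lock.readLock().lock();
    try {
      if (nativeCoder == 0) {
        throw new IllegalStateException("coder already released");
      }
      int parity = 0;
      for (byte b : input) {
        parity ^= b; // placeholder for the native encode call (assumption)
      }
      return parity;
    } finally {
      lock.readLock().unlock();
    }
  }

  /** The writer is exclusive: it runs only when no encode is in progress. */
  public void release() {
    lock.writeLock().lock();
    try {
      nativeCoder = 0;
    } finally {
      lock.writeLock().unlock();
    }
  }

  /** Checks that a second thread can take the read lock while another holds it. */
  public static boolean twoReadersOverlap() {
    RwLockCoderSketch coder = new RwLockCoderSketch();
    CountDownLatch held = new CountDownLatch(1);
    CountDownLatch done = new CountDownLatch(1);
    Thread holder = new Thread(() -> {
      coder.lock.readLock().lock();
      try {
        held.countDown();
        done.await(); // keep the read lock until the check has run
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        coder.lock.readLock().unlock();
      }
    });
    holder.start();
    try {
      held.await();
      boolean acquired = coder.lock.readLock().tryLock();
      if (acquired) {
        coder.lock.readLock().unlock();
      }
      return acquired;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      done.countDown();
    }
  }

  public static void main(String[] args) {
    System.out.println("two readers overlap: " + twoReadersOverlap());
  }
}
```

The trade-off in this sketch is that release can be delayed by long-running 
encodes, which seems acceptable if release is called rarely compared with 
doEncode/doDecode.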

 I prefer approach 2 and will upload a patch later. 


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
