[ https://issues.apache.org/jira/browse/HADOOP-15499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
SammiChen updated HADOOP-15499:
-------------------------------
    Summary: Performance severe drop when running RawErasureCoderBenchmark with NativeRSRawErasureCoder  (was: Performance several drop when running RawErasureCoderBenchmark with NativeRSRawErasureCoder)

> Performance severe drop when running RawErasureCoderBenchmark with
> NativeRSRawErasureCoder
> ------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-15499
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15499
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 3.0.0, 3.0.1, 3.0.2
>            Reporter: SammiChen
>            Assignee: SammiChen
>            Priority: Major
>
> Run RawErasureCoderBenchmark, a micro-benchmark that measures EC codec
> encoding/decoding performance.
> With 50 concurrent threads, the native ISA-L coder delivers lower throughput
> than with a single thread. This is abnormal.
>
> bin/hadoop jar ./share/hadoop/common/hadoop-common-3.2.0-SNAPSHOT-tests.jar
> org.apache.hadoop.io.erasurecode.rawcoder.RawErasureCoderBenchmark encode 3 1
> 1024 1024
> Using 126MB buffer.
> ISA-L coder encode 1008MB data, with chunk size 1024KB
> Total time: 0.19 s.
> Total throughput: 5390.37 MB/s
> Threads statistics:
> 1 threads in total.
> Min: 0.18 s, Max: 0.18 s, Avg: 0.18 s, 90th Percentile: 0.18 s.
>
> bin/hadoop jar ./share/hadoop/common/hadoop-common-3.2.0-SNAPSHOT-tests.jar
> org.apache.hadoop.io.erasurecode.rawcoder.RawErasureCoderBenchmark encode 3
> 50 1024 10240
> Using 120MB buffer.
> ISA-L coder encode 54000MB data, with chunk size 10240KB
> Total time: 11.58 s.
> Total throughput: 4662 MB/s
> Threads statistics:
> 50 threads in total.
> Min: 0.55 s, Max: 11.5 s, Avg: 6.32 s, 90th Percentile: 10.45 s.
>
> RawErasureCoderBenchmark shares a single coder among all concurrent
> threads, while NativeRSRawEncoder and NativeRSRawDecoder declare their
> doEncode and doDecode methods with the synchronized keyword.
> As a result, the 50 concurrent threads are forced to call the shared
> coder's encode/decode function one at a time.
>
> To resolve the issue, there are two approaches:
> # Refactor RawErasureCoderBenchmark to use a dedicated coder for each
> concurrent thread.
> # Refactor NativeRSRawEncoder and NativeRSRawDecoder to get better
> concurrency. The synchronized keyword is there to protect the private
> variable nativeCoder, so that it is not modified in release while it is
> being checked in doEncode/doDecode. We can use ReentrantReadWriteLock to
> increase concurrency, since doEncode/doDecode can be called multiple times
> without changing the nativeCoder state.
>
> I prefer approach 2 and will upload a patch later.
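
For illustration only, approach 2 could look roughly like the sketch below. It is not the patch for this issue: the class SharedCoderSketch, its encode method, and the XOR "encoding" are hypothetical stand-ins, and the long field nativeCoder only mimics the private handle described above. The sketch shows encode calls sharing a read lock while release takes the write lock.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a coder that wraps a native handle. */
class SharedCoderSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Assumption: mimics the private nativeCoder handle; 0 means "released".
    private long nativeCoder = 1L;

    /** Many threads may hold the read lock at once, so encodes run in parallel. */
    int encode(byte[] input) {
        lock.readLock().lock();
        try {
            if (nativeCoder == 0) {
                throw new IllegalStateException("coder already released");
            }
            // Placeholder for the native call: XOR all input bytes together.
            int parity = 0;
            for (byte b : input) {
                parity ^= b;
            }
            return parity;
        } finally {
            lock.readLock().unlock();
        }
    }

    /** The write lock is exclusive: release waits until in-flight encodes finish. */
    void release() {
        lock.writeLock().lock();
        try {
            nativeCoder = 0;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

With this layout, concurrent encode calls block each other only while release is running, which is the state change the synchronized keyword was guarding against.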