[jira] [Commented] (HDFS-15202) HDFS-client: boost ShortCircuit Cache

Hudson (Jira) Mon, 18 May 2020 09:54:57 -0700


    [ 
https://issues.apache.org/jira/browse/HDFS-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17110449#comment-17110449
 ]


Hudson commented on HDFS-15202:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18270 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18270/])
Revert "HDFS-15202 Boost short circuit cache (rebase PR-1884) (#2016)" 
(weichiu: rev 4525292d41482330a86f1cc3935e072f9f67c308)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/client/impl/TestBlockReaderLocal.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestEnhancedByteBufferAccess.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderFactory.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/client/impl/TestBlockReaderFactory.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitCache.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ClientContext.java
HDFS-15202 Boost short circuit cache (rebase PR-1884) (#2016) (weichiu: rev 
2abcf7762ae74b936e1cedb60d5d2b4cc4ee86ea)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ClientContext.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/client/impl/TestBlockReaderFactory.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitCache.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestEnhancedByteBufferAccess.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/client/impl/TestBlockReaderLocal.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderFactory.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java


> HDFS-client: boost ShortCircuit Cache
> -------------------------------------
>
>                 Key: HDFS-15202
>                 URL: https://issues.apache.org/jira/browse/HDFS-15202
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: dfsclient
>         Environment: 4 nodes E5-2698 v4 @ 2.20GHz, 700 Gb Mem.
> 8 RegionServers (2 by host)
> 8 tables by 64 regions by 1.88 Gb data in each = 900 Gb total
> Random read in 800 threads via YCSB and a little bit updates (10% of reads)
>            Reporter: Danil Lipovoy
>            Assignee: Danil Lipovoy
>            Priority: Minor
>             Fix For: 3.3.1, 3.4.0
>
>         Attachments: HDFS-15202-Addendum-01.patch, HDFS_CPU_full_cycle.png, 
> cpu_SSC.png, cpu_SSC2.png, hdfs_cpu.png, hdfs_reads.png, hdfs_scc_3_test.png, 
> hdfs_scc_test_full-cycle.png, locks.png, requests_SSC.png
>
>
> ТотI want to propose how to improve reading performance HDFS-client. The 
> idea: create few instances ShortCircuit caches instead of one. 
> The key points:
> 1. Create array of caches (set by 
> clientShortCircuitNum=*dfs.client.short.circuit.num*, see in the pull 
> requests below):
> {code:java}
> private ClientContext(String name, DfsClientConf conf, Configuration config) {
> ...
>     shortCircuitCache = new ShortCircuitCache[this.clientShortCircuitNum];
>     for (int i = 0; i < this.clientShortCircuitNum; i++) {
>       this.shortCircuitCache[i] = ShortCircuitCache.fromConf(scConf);
>     }
> {code}
> 2 Then divide blocks by caches:
> {code:java}
>   public ShortCircuitCache getShortCircuitCache(long idx) {
>     return shortCircuitCache[(int) (idx % clientShortCircuitNum)];
>   }
> {code}
> 3. And how to call it:
> {code:java}
> ShortCircuitCache cache = 
> clientContext.getShortCircuitCache(block.getBlockId());
> {code}
> The last number of offset evenly distributed from 0 to 9 - that's why all 
> caches will full approximately the same.
> It is good for performance. Below the attachment, it is load test reading 
> HDFS via HBase where clientShortCircuitNum = 1 vs 3. We can see that 
> performance grows ~30%, CPU usage about +15%. 
> Hope it is interesting for someone.
> Ready to explain some unobvious things.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-15202) HDFS-client: boost ShortCircuit Cache

Reply via email to