[ https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17046199#comment-17046199 ]
Danil Lipovoy commented on HBASE-23887:
---------------------------------------

OK, full results:

*UNIFORM, hbase.lru.cache.data.block.percent=100*
[OVERALL], RunTime(ms), 36278
[OVERALL], Throughput(ops/sec), 27564.915375709796
[TOTAL_GCS_PS_Scavenge], Count, 10
[TOTAL_GC_TIME_PS_Scavenge], Time(ms), 109
[TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.3004575775952368
[TOTAL_GCS_PS_MarkSweep], Count, 1
[TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 54
[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.1488505430288329
[TOTAL_GCs], Count, 11
[TOTAL_GC_TIME], Time(ms), 163
[TOTAL_GC_TIME_%], Time(%), 0.44930812062406966
[READ], Operations, 1000000
[READ], AverageLatency(us), 3406.607554
[READ], MinLatency(us), 118
[READ], MaxLatency(us), 913919
[READ], 95thPercentileLatency(us), 2725
[READ], 99thPercentileLatency(us), 11823
[READ], Return=OK, 1000000

*UNIFORM, hbase.lru.cache.data.block.percent=8*
[OVERALL], RunTime(ms), 18752
[OVERALL], Throughput(ops/sec), 53327.64505119454
[TOTAL_GCS_PS_Scavenge], Count, 8
[TOTAL_GC_TIME_PS_Scavenge], Time(ms), 108
[TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.575938566552901
[TOTAL_GCS_PS_MarkSweep], Count, 1
[TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 50
[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.2666382252559727
[TOTAL_GCs], Count, 9
[TOTAL_GC_TIME], Time(ms), 158
[TOTAL_GC_TIME_%], Time(%), 0.8425767918088738
[READ], Operations, 1000000
[READ], AverageLatency(us), 1631.166745
[READ], MinLatency(us), 128
[READ], MaxLatency(us), 575487
[READ], 95thPercentileLatency(us), 3599
[READ], 99thPercentileLatency(us), 13751
[READ], Return=OK, 1000000

*LATEST, hbase.lru.cache.data.block.percent=100*
[OVERALL], RunTime(ms), 36015
[OVERALL], Throughput(ops/sec), 27766.20852422602
[TOTAL_GCS_PS_Scavenge], Count, 9
[TOTAL_GC_TIME_PS_Scavenge], Time(ms), 113
[TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.313758156323754
[TOTAL_GCS_PS_MarkSweep], Count, 1
[TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 49
[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.13605442176870747
[TOTAL_GCs], Count, 10
[TOTAL_GC_TIME], Time(ms), 162
[TOTAL_GC_TIME_%], Time(%), 0.44981257809246145
[READ], Operations, 1000000
[READ], AverageLatency(us), 3381.125299
[READ], MinLatency(us), 109
[READ], MaxLatency(us), 944127
[READ], 95thPercentileLatency(us), 3735
[READ], 99thPercentileLatency(us), 9103
[READ], Return=OK, 1000000

*LATEST, hbase.lru.cache.data.block.percent=8*
[OVERALL], RunTime(ms), 21773
[OVERALL], Throughput(ops/sec), 45928.443485050295
[TOTAL_GCS_PS_Scavenge], Count, 9
[TOTAL_GC_TIME_PS_Scavenge], Time(ms), 104
[TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.477655812244523
[TOTAL_GCS_PS_MarkSweep], Count, 1
[TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 42
[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.19289946263721122
[TOTAL_GCs], Count, 10
[TOTAL_GC_TIME], Time(ms), 146
[TOTAL_GC_TIME_%], Time(%), 0.6705552748817343
[READ], Operations, 1000000
[READ], AverageLatency(us), 1915.557332
[READ], MinLatency(us), 124
[READ], MaxLatency(us), 325631
[READ], 95thPercentileLatency(us), 8447
[READ], 99thPercentileLatency(us), 26127
[READ], Return=OK, 1000000

*ZIPFIAN, hbase.lru.cache.data.block.percent=100*
[OVERALL], RunTime(ms), 36480
[OVERALL], Throughput(ops/sec), 27412.280701754386
[TOTAL_GCS_PS_Scavenge], Count, 9
[TOTAL_GC_TIME_PS_Scavenge], Time(ms), 99
[TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.2713815789473684
[TOTAL_GCS_PS_MarkSweep], Count, 1
[TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 43
[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.11787280701754385
[TOTAL_GCs], Count, 10
[TOTAL_GC_TIME], Time(ms), 142
[TOTAL_GC_TIME_%], Time(%), 0.3892543859649123
[READ], Operations, 1000000
[READ], AverageLatency(us), 3315.799223
[READ], MinLatency(us), 122
[READ], MaxLatency(us), 648191
[READ], 95thPercentileLatency(us), 2933
[READ], 99thPercentileLatency(us), 6823
[READ], Return=OK, 1000000

*ZIPFIAN, hbase.lru.cache.data.block.percent=8*
[OVERALL], RunTime(ms), 24828
[OVERALL], Throughput(ops/sec), 40277.10649266957
[TOTAL_GCS_PS_Scavenge], Count, 10
[TOTAL_GC_TIME_PS_Scavenge], Time(ms), 113
[TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.4551313033671661
[TOTAL_GCS_PS_MarkSweep], Count, 1
[TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 53
[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.2134686644111487
[TOTAL_GCs], Count, 11
[TOTAL_GC_TIME], Time(ms), 166
[TOTAL_GC_TIME_%], Time(%), 0.6685999677783148
[READ], Operations, 1000000
[READ], AverageLatency(us), 2223.297153
[READ], MinLatency(us), 127
[READ], MaxLatency(us), 532991
[READ], 95thPercentileLatency(us), 3783
[READ], 99thPercentileLatency(us), 21679
[READ], Return=OK, 1000000

> BlockCache performance improve
> ------------------------------
>
>                 Key: HBASE-23887
>                 URL: https://issues.apache.org/jira/browse/HBASE-23887
>             Project: HBase
>          Issue Type: Improvement
>          Components: BlockCache, Performance
>            Reporter: Danil Lipovoy
>            Priority: Minor
>         Attachments: cmp.png
>
>
> Hi!
> This is my first time here, so please correct me if something is wrong.
> I want to propose a way to improve performance when the data in HFiles is
> much larger than the BlockCache (the usual story in big data). The idea is
> to cache only part of the DATA blocks. This is good because LruBlockCache
> starts to work and saves a huge amount of GC. See the picture in the
> attachment for the test described below: requests per second are higher
> and GC is lower.
>
> The key point of the code:
> Added the parameter *hbase.lru.cache.data.block.percent*, which defaults
> to 100.
>
> If it is set to 0-99, the following logic applies:
>
> {code:java}
> public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
>   // Skip a share of DATA blocks, selected by block offset
>   if (cacheDataBlockPercent != 100 && buf.getBlockType().isData()) {
>     if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) {
>       return;
>     }
>   }
>   ...
>   // the same code as usual
> }
> {code}
>
> Description of the test:
> 4 nodes E5-2698 v4 @ 2.20GHz, 700 Gb Mem.
> 4 RegionServers
> 4 tables by 64 regions by 1.88 Gb data in each = 600 Gb total (only FAST_DIFF)
> Total BlockCache Size = 48 Gb (8 % of data in HFiles)
> Random read in 20 threads
>
> I am going to make a pull request; I hope this is the right way to
> contribute to this cool product.
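
For illustration only, the offset-based filter from the issue description can be tried in isolation. This is a minimal sketch, not the HBase code: the class DataBlockFilterSketch and the methods shouldCache and admittedFraction are hypothetical names, and the block offsets are simulated as uniform random values, which real HFile offsets need not be.

```java
import java.util.Random;

/** Hypothetical stand-alone sketch; this is not the HBase LruBlockCache code. */
final class DataBlockFilterSketch {

    /**
     * Mirrors the condition quoted in the issue: returns true if the block
     * would continue to the usual caching path, false if it is skipped.
     */
    static boolean shouldCache(long offset, boolean isDataBlock, int cacheDataBlockPercent) {
        if (cacheDataBlockPercent != 100 && isDataBlock) {
            // Skip data blocks whose offset falls outside the configured share.
            if (offset % 100 >= cacheDataBlockPercent) {
                return false;
            }
        }
        return true;
    }

    /** Fraction of simulated data blocks admitted for a given percent setting. */
    static double admittedFraction(int cacheDataBlockPercent, int samples, long seed) {
        Random random = new Random(seed);
        int admitted = 0;
        for (int i = 0; i < samples; i++) {
            // Assumption: offsets are uniform non-negative values.
            long offset = random.nextInt(Integer.MAX_VALUE);
            if (shouldCache(offset, true, cacheDataBlockPercent)) {
                admitted++;
            }
        }
        return (double) admitted / samples;
    }

    public static void main(String[] args) {
        // Print the admitted share for a few settings, including the 8 used in the test.
        for (int percent : new int[] {100, 50, 8}) {
            System.out.printf("percent=%d admitted=%.3f%n",
                    percent, admittedFraction(percent, 1_000_000, 42L));
        }
    }
}
```

Under the uniform-offset assumption, the admitted share tracks the configured percent, so a setting of 8 keeps roughly the share of data blocks that the 48 Gb cache (8 % of the HFile data) can hold. Index and other non-data blocks pass through unchanged, as in the quoted snippet.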