[
https://issues.apache.org/jira/browse/HBASE-24915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17182563#comment-17182563
]
fanrui commented on HBASE-24915:
--------------------------------
This is a simple test report:
h3. Test Conditions:
{code:java}
HBase version: 2.1.0
JVM: -Xmx2g -Xms2g
hfile.block.cache.size:0.02(40M)
number of record:10,000,000(HFile 374.0 M)
disk:SSD
OS:CentOS Linux release 7.4.1708 (Core)
JMH Benchmark:
@Fork(value = 3)
@Warmup(iterations = 60)
@Measurement(iterations = 60)
{code}
h3. Test Results:
IDs 1~5 are runs with the patch applied, each with a different BlockCache
size, measuring the performance of get.
IDs 6~10 are the corresponding runs without the patch, again varying the
BlockCache size and measuring the performance of get.
[https://docs.google.com/spreadsheets/d/1fI4rk0nVKweyHANHhlcbtZZavxPh8WktEu-wQi-i-Fs/edit?usp=sharing]
h3. Test conclusion:
Adding the patch does not necessarily improve the measured performance: the
code path the patch changes accounts for only a small proportion of the entire
read process, and read performance fluctuates between runs, so the measured
performance with the patch is occasionally lower.
ID 10 is the case where the data is completely in the BlockCache; its
FlameGraph is attached. It shows that CombinedBlockCache.getBlock accounts
for 12.15% and LruBlockCache.containsBlock for 1.05%. In theory, with the
current patch applied, case 10 can skip LruBlockCache.containsBlock, which
would save about 1% of the CPU.
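To illustrate where that saving would come from, here is a minimal self-contained sketch. It is not the HBase code: it assumes that a lookup without a type hint probes L1 with a contains check before choosing a tier, which is what the containsBlock frame in the FlameGraph suggests. The class TwoTierCacheSketch, the Kind enum, and the l1ContainsCalls counter are hypothetical names used only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a two-tier cache; not the HBase implementation. */
class TwoTierCacheSketch {
    /** Hypothetical simplification of a block category. */
    enum Kind { DATA, META }

    final Map<String, String> l1 = new HashMap<>(); // holds meta blocks
    final Map<String, String> l2 = new HashMap<>(); // holds data blocks
    int l1ContainsCalls = 0; // how often L1 was probed just to pick a tier

    private boolean l1Contains(String key) {
        l1ContainsCalls++;
        return l1.containsKey(key);
    }

    /** Lookup without a type hint: probe L1 first, fall back to L2. */
    String get(String key) {
        return l1Contains(key) ? l1.get(key) : l2.get(key);
    }

    /** Lookup with a type hint: go straight to the tier that holds this kind. */
    String get(String key, Kind kind) {
        if (kind == null) {
            return get(key); // unknown type: keep the old behaviour
        }
        return kind == Kind.META ? l1.get(key) : l2.get(key);
    }

    public static void main(String[] args) {
        TwoTierCacheSketch cache = new TwoTierCacheSketch();
        cache.l1.put("index-1", "meta");
        cache.l2.put("data-1", "data");

        cache.get("data-1");                       // no hint: one L1 probe
        System.out.println(cache.l1ContainsCalls); // prints 1
        cache.get("data-1", Kind.DATA);            // hinted: no L1 probe
        System.out.println(cache.l1ContainsCalls); // still prints 1
    }
}
```

In this model the hinted lookup for a data block never touches L1, so the only work removed is the contains probe. That is consistent with the expected gain being limited to roughly the share of that one call.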
*Image links:*
case 10 FlameGraph:
[https://drive.google.com/file/d/1Q-fyzTiLDKfPXP6Du33pDMe0Tw2Zyrhm/view?usp=sharing]
CombinedBlockCache.getBlock.png:
[https://drive.google.com/file/d/1twwKfxXAZj_8tf8seaXahw6nHE2XfMPO/view?usp=sharing]
LruBlockCache.containsBlock.png:
[https://drive.google.com/file/d/13pQloaqziSgmKC3BEPYXgZpVyX2PFtxi/view?usp=sharing]
h3. Note:
The above is my test plan. In theory, the patch can only improve performance,
not reduce it; the lower numbers in some cases are due to fluctuations in the
measurements.
> Improve BlockCache read performance by specifying BlockType
> -----------------------------------------------------------
>
> Key: HBASE-24915
> URL: https://issues.apache.org/jira/browse/HBASE-24915
> Project: HBase
> Issue Type: Improvement
> Components: BlockCache, Performance
> Reporter: fanrui
> Assignee: fanrui
> Priority: Major
>
> CombinedBlockCache contains l1Cache and l2Cache. l1Cache stores MetaBlock and
> l2Cache stores DataBlock. Because getBlock does not know the BlockType, the
> getBlock of CombinedBlockCache queries l1Cache first, and then l2Cache. But
> querying l1Cache is unnecessary when the block being looked up is a DataBlock.
> Therefore, in cases where the BlockType is known, BlockCache read
> performance can be improved.
> h2. Code:
> BlockCache: the new overload delegates to the existing getBlock by default
> {code:java}
> default Cacheable getBlock(BlockCacheKey cacheKey, boolean caching, boolean repeat,
>     boolean updateCacheMetrics, BlockType blockType) {
>   return getBlock(cacheKey, caching, repeat, updateCacheMetrics);
> }
> {code}
> CombinedBlockCache:
> {code:java}
> @Override
> public Cacheable getBlock(BlockCacheKey cacheKey, boolean caching, boolean repeat,
>     boolean updateCacheMetrics, BlockType blockType) {
>   if (blockType == null) {
>     return getBlock(cacheKey, caching, repeat, updateCacheMetrics);
>   }
>   boolean metaBlock = isMetaBlock(blockType);
>   if (metaBlock) {
>     return l1Cache.getBlock(cacheKey, caching, repeat, updateCacheMetrics);
>   } else {
>     return l2Cache.getBlock(cacheKey, caching, repeat, updateCacheMetrics);
>   }
> }
>
> private boolean isMetaBlock(BlockType blockType) {
>   return blockType.getCategory() != BlockCategory.DATA;
> }
> {code}
> HFileReaderImpl#getCachedBlock calls BlockCache#getBlock(XXX,
> expectedBlockType).
--
This message was sent by Atlassian Jira
(v8.3.4#803005)