Wellington Chevreuil created HBASE-27370:
--------------------------------------------

             Summary: Avoid decompressing blocks when reading from bucket cache prefetch threads
                 Key: HBASE-27370
                 URL: https://issues.apache.org/jira/browse/HBASE-27370
             Project: HBase
          Issue Type: Improvement
            Reporter: Wellington Chevreuil
            Assignee: Wellington Chevreuil


When prefetching blocks into bucket cache, we observed consistent CPU usage of 
around 70% with no other workloads running. For large bucket caches (e.g. when 
using a file based bucket cache), the prefetch can last for some time, and such 
high CPU usage may impact database usage by client applications.

Further analysis of the prefetch threads' stack traces showed that decompress 
logic is very often being executed by these threads:
{noformat}
"hfile-prefetch-1654895061122" #234 daemon prio=5 os_prio=0 tid=0x0000557bb2907000 nid=0x406d runnable [0x00007f294a504000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.decompressBytesDirect(Native Method)
        at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.decompress(SnappyDecompressor.java:235)
        at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:88)
        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:105)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
        - locked <0x00000002d24c0ae8> (a java.io.BufferedInputStream)
        at org.apache.hadoop.hbase.io.util.BlockIOUtils.readFullyWithHeapBuffer(BlockIOUtils.java:105)
        at org.apache.hadoop.hbase.io.compress.Compression.decompress(Compression.java:465)
        at org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultDecodingContext.prepareDecoding(HFileBlockDefaultDecodingContext.java:90)
        at org.apache.hadoop.hbase.io.hfile.HFileBlock.unpack(HFileBlock.java:650)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1342)
{noformat}

This is because *HFileReaderImpl.readBlock* always decompresses blocks, even 
when *hbase.block.data.cachecompressed* is set to true.

This patch proposes an additional flag to differentiate prefetch reads from 
normal reads, so that DATA blocks are not decompressed when prefetching with 
*hbase.block.data.cachecompressed* set to true.
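The idea can be sketched in a few lines. This is a simplified, self-contained illustration only, not the actual patch: the class name *PrefetchSketch*, the *readBlock(..., forPrefetch)* signature, and the Deflater-based codec (standing in for the Snappy codec seen in the stack trace) are all hypothetical.

```java
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch of the proposed behaviour: an extra flag on the read
// path lets prefetch threads skip the CPU-heavy decompress step when blocks
// are cached compressed anyway.
public class PrefetchSketch {

  // Models hbase.block.data.cachecompressed=true
  static final boolean CACHE_DATA_BLOCKS_COMPRESSED = true;

  // Stand-in for the on-disk block codec (Snappy in the reported stack trace).
  static byte[] compress(byte[] raw) {
    Deflater deflater = new Deflater();
    deflater.setInput(raw);
    deflater.finish();
    byte[] buf = new byte[raw.length * 2 + 64];
    int n = deflater.deflate(buf);
    deflater.end();
    return Arrays.copyOf(buf, n);
  }

  static byte[] decompress(byte[] onDisk, int rawLen) throws DataFormatException {
    Inflater inflater = new Inflater();
    inflater.setInput(onDisk);
    byte[] out = new byte[rawLen];
    int n = inflater.inflate(out);
    inflater.end();
    return Arrays.copyOf(out, n);
  }

  // A readBlock carrying a flag that differentiates prefetch from normal
  // reads: when prefetching and caching compressed, the block is returned
  // as-is for caching and decompression is skipped entirely.
  static byte[] readBlock(byte[] onDisk, int rawLen, boolean forPrefetch)
      throws DataFormatException {
    if (forPrefetch && CACHE_DATA_BLOCKS_COMPRESSED) {
      return onDisk; // keep the compressed bytes for the cache
    }
    return decompress(onDisk, rawLen); // normal read path still unpacks
  }

  public static void main(String[] args) throws Exception {
    byte[] raw = "row-1 value-1 row-2 value-2 row-3 value-3".getBytes();
    byte[] onDisk = compress(raw);

    // The prefetch path returns the compressed block untouched.
    if (readBlock(onDisk, raw.length, true) != onDisk) {
      throw new AssertionError("prefetch should not decompress");
    }
    // Normal reads still decompress back to the original bytes.
    if (!Arrays.equals(readBlock(onDisk, raw.length, false), raw)) {
      throw new AssertionError("normal read should decompress");
    }
    System.out.println("ok");
  }
}
```

Blocks cached compressed this way would still be unpacked later, on the first normal read that hits them; the saving is that the prefetch threads themselves no longer burn CPU on decompression.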



--
This message was sent by Atlassian Jira
(v8.20.10#820010)