[ https://issues.apache.org/jira/browse/HBASE-29135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nick Dimiduk resolved HBASE-29135.
----------------------------------
    Resolution: Fixed

Pushed to branch-2.5+. Thanks [~charlesconnell]!

> ZStandard decompression can operate directly on ByteBuffs
> ---------------------------------------------------------
>
>                 Key: HBASE-29135
>                 URL: https://issues.apache.org/jira/browse/HBASE-29135
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Charles Connell
>            Assignee: Charles Connell
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 3.0.0-beta-2, 2.6.3, 2.5.12
>
>         Attachments: create-decompression-stream-zstd.html
>
>
> I've been thinking about ways to improve HBase's performance when reading
> HFiles, and I believe there is significant opportunity. I look at many
> RegionServer profile flamegraphs of my company's servers. A pattern that I've
> discovered is that object allocation in a very hot code path is a performance
> killer. The HFile decoding code makes some effort to avoid this, but it isn't
> totally successful.
> Each time a block is decoded in {{HFileBlockDefaultDecodingContext}}, a new
> {{DecompressorStream}} is allocated and used. This is a lot of allocation,
> and the use of the streaming pattern requires copying every byte to be
> decompressed more times than necessary. Each byte is copied from a
> {{ByteBuff}} into a {{byte[]}}, then decompressed, then copied back to a
> {{ByteBuff}}. For decompressors like
> {{org.apache.hadoop.hbase.io.compress.zstd.ZstdDecompressor}} that only
> operate on direct memory, two additional copies are introduced to move from a
> {{byte[]}} to a direct NIO {{ByteBuffer}}, then back to a {{byte[]}}.
> Aside from the copy inherent in the decompression algorithm (the necessary
> copy from a compressed buffer to an uncompressed buffer), all of these
> other copies can be avoided without sacrificing functionality. Along the way,
> we'll also avoid allocating objects.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)