[
https://issues.apache.org/jira/browse/HBASE-29135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Dimiduk resolved HBASE-29135.
----------------------------------
Resolution: Fixed
Pushed to branch-2.5+. Thanks [~charlesconnell]!
> ZStandard decompression can operate directly on ByteBuffs
> ---------------------------------------------------------
>
> Key: HBASE-29135
> URL: https://issues.apache.org/jira/browse/HBASE-29135
> Project: HBase
> Issue Type: Improvement
> Reporter: Charles Connell
> Assignee: Charles Connell
> Priority: Minor
> Labels: pull-request-available
> Fix For: 3.0.0-beta-2, 2.6.3, 2.5.12
>
> Attachments: create-decompression-stream-zstd.html
>
>
> I've been thinking about ways to improve HBase's performance when reading
> HFiles, and I believe there is significant opportunity. I look at many
> RegionServer profile flamegraphs of my company's servers. A pattern that I've
> discovered is that object allocation in a very hot code path is a performance
> killer. The HFile decoding code makes some effort to avoid this, but it isn't
> totally successful.
> Each time a block is decoded in {{HFileBlockDefaultDecodingContext}}, a new
> {{DecompressorStream}} is allocated and used. This is a lot of allocation,
> and the use of the streaming pattern requires copying every byte to be
> decompressed more times than necessary. Each byte is copied from a
> {{ByteBuff}} into a {{byte[]}}, then decompressed, then copied back to a
> {{ByteBuff}}. For decompressors like
> {{org.apache.hadoop.hbase.io.compress.zstd.ZstdDecompressor}} that only
> operate on direct memory, two additional copies are introduced to move from a
> {{byte[]}} to a direct NIO {{ByteBuffer}}, then back to a {{byte[]}}.
> Aside from the one copy inherent in decompression itself, from a compressed
> buffer to an uncompressed buffer, all of these other copies can be avoided
> without sacrificing functionality. Along the way, we'll also avoid allocating
> objects.
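
The buffer-to-buffer idea described above can be sketched with JDK classes only. This is a minimal illustration under stated assumptions, not the HBase change itself: {{java.util.zip.Inflater}} (zlib) stands in for zstd, the class {{DirectBufferDecompressSketch}} and its methods are hypothetical names, and Java 11 or later is assumed for the {{ByteBuffer}} overloads. It shows one reusable decompressor reading from a direct buffer and writing into a direct buffer with no intermediate {{byte[]}} staging.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical illustration class; not part of HBase.
class DirectBufferDecompressSketch {
  // A single decompressor reused for every block, rather than a new stream per block.
  private final Inflater inflater = new Inflater();

  // Inflates src into dst buffer-to-buffer; returns the number of bytes written.
  int decompress(ByteBuffer src, ByteBuffer dst) {
    inflater.reset();
    inflater.setInput(src);
    int start = dst.position();
    try {
      while (!inflater.finished()) {
        // No progress without reaching the end means bad input or a full dst.
        if (inflater.inflate(dst) == 0 && !inflater.finished()) {
          throw new IllegalArgumentException("truncated input or output buffer too small");
        }
      }
    } catch (DataFormatException e) {
      throw new IllegalArgumentException("corrupt compressed block", e);
    }
    return dst.position() - start;
  }

  // Test helper: produces a compressed direct buffer to play the role of an on-disk block.
  static ByteBuffer compressToDirect(byte[] raw) {
    Deflater deflater = new Deflater();
    deflater.setInput(raw);
    deflater.finish();
    // Generous bound for the small inputs used here; not a general-purpose bound.
    ByteBuffer out = ByteBuffer.allocateDirect(raw.length + 64);
    while (!deflater.finished()) {
      if (deflater.deflate(out) == 0 && !deflater.finished()) {
        throw new IllegalStateException("compressed output buffer too small");
      }
    }
    deflater.end();
    out.flip();
    return out;
  }

  // Compresses and decompresses raw; the final byte[] copy exists only for verification.
  static byte[] roundTrip(byte[] raw) {
    ByteBuffer compressed = compressToDirect(raw);
    ByteBuffer out = ByteBuffer.allocateDirect(raw.length);
    int n = new DirectBufferDecompressSketch().decompress(compressed, out);
    out.flip();
    byte[] result = new byte[n];
    out.get(result);
    return result;
  }

  public static void main(String[] args) {
    byte[] raw = "hbase-block-".repeat(200).getBytes(StandardCharsets.UTF_8);
    System.out.println(Arrays.equals(raw, roundTrip(raw)));
  }
}
```

The point of the design is that {{decompress}} allocates nothing per call and touches each byte once on the way in and once on the way out; the streaming pattern would add a {{byte[]}} hop on each side.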