[ 
https://issues.apache.org/jira/browse/HBASE-29135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk resolved HBASE-29135.
----------------------------------
    Resolution: Fixed

Pushed to branch-2.5+. Thanks [~charlesconnell]!

> ZStandard decompression can operate directly on ByteBuffs
> ---------------------------------------------------------
>
>                 Key: HBASE-29135
>                 URL: https://issues.apache.org/jira/browse/HBASE-29135
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Charles Connell
>            Assignee: Charles Connell
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 3.0.0-beta-2, 2.6.3, 2.5.12
>
>         Attachments: create-decompression-stream-zstd.html
>
>
> I've been thinking about ways to improve HBase's performance when reading 
> HFiles, and I believe there is significant opportunity. I look at many 
> RegionServer profile flamegraphs of my company's servers. A pattern that I've 
> discovered is that object allocation in a very hot code path is a performance 
> killer. The HFile decoding code makes some effort to avoid this, but it isn't 
> totally successful.
> Each time a block is decoded in {{HFileBlockDefaultDecodingContext}}, a new 
> {{DecompressorStream}} is allocated and used. This is a lot of allocation, 
> and the use of the streaming pattern requires copying every byte to be 
> decompressed more times than necessary. Each byte is copied from a 
> {{ByteBuff}} into a {{byte[]}}, then decompressed, then copied back to a 
> {{ByteBuff}}. For decompressors like 
> {{org.apache.hadoop.hbase.io.compress.zstd.ZstdDecompressor}} that only 
> operate on direct memory, two additional copies are introduced to move from a 
> {{byte[]}} to a direct NIO {{ByteBuffer}}, then back to a {{byte[]}}.
> Aside from the one copy inherent in decompression itself, from a compressed 
> buffer to an uncompressed buffer, all of these other copies can be avoided 
> without sacrificing functionality. Along the way, we'll also avoid allocating 
> objects.
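
To make the copy chain described above concrete, here is a minimal sketch
that contrasts the two shapes. It is not the HBase patch: it swaps in the
JDK's java.util.zip.Inflater (Deflate, Java 11+) for ZStandard so that it
runs with no dependencies, and plain NIO ByteBuffers stand in for HBase's
ByteBuff. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical illustration only; Deflate stands in for ZStandard.
final class DirectBufferDecompressionSketch {

    /** Compresses raw bytes into a direct buffer, giving the demo off-heap input. */
    static ByteBuffer compress(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteBuffer out = ByteBuffer.allocateDirect(raw.length + 64);
        while (!deflater.finished()) {
            if (deflater.deflate(out) == 0 && !out.hasRemaining()) {
                throw new IllegalStateException("output buffer too small");
            }
        }
        deflater.end();
        out.flip();
        return out;
    }

    /** Streaming-style shape: stage through byte[] and allocate per block. */
    static ByteBuffer decompressViaArrays(ByteBuffer compressed, int rawLength) {
        byte[] in = new byte[compressed.remaining()];
        compressed.duplicate().get(in);              // copy: buffer -> byte[]
        Inflater inflater = new Inflater();          // fresh object for every block
        try {
            inflater.setInput(in);
            byte[] out = new byte[rawLength];
            int n = 0;
            while (n < rawLength && !inflater.finished()) {
                int got = inflater.inflate(out, n, rawLength - n);
                if (got == 0 && !inflater.finished()) {
                    throw new IllegalArgumentException("truncated or corrupt input");
                }
                n += got;
            }
            ByteBuffer result = ByteBuffer.allocateDirect(rawLength);
            result.put(out, 0, n);                   // copy: byte[] -> buffer
            result.flip();
            return result;
        } catch (DataFormatException e) {
            throw new IllegalArgumentException(e);
        } finally {
            inflater.end();
        }
    }

    /** Buffer-to-buffer shape: reuses the caller's inflater and destination. */
    static ByteBuffer decompressDirect(Inflater inflater, ByteBuffer compressed, ByteBuffer dest) {
        int startPosition = compressed.position();
        try {
            inflater.reset();
            inflater.setInput(compressed);           // no byte[] staging by the caller
            dest.clear();
            while (dest.hasRemaining() && !inflater.finished()) {
                if (inflater.inflate(dest) == 0 && !inflater.finished()) {
                    throw new IllegalArgumentException("truncated or corrupt input");
                }
            }
            dest.flip();
            return dest;
        } catch (DataFormatException e) {
            throw new IllegalArgumentException(e);
        } finally {
            compressed.position(startPosition);      // leave the input as we found it
        }
    }

    public static void main(String[] args) {
        byte[] raw = "row1/cf:q/value ".repeat(256).getBytes(StandardCharsets.UTF_8);
        ByteBuffer compressed = compress(raw);
        ByteBuffer viaArrays = decompressViaArrays(compressed, raw.length);

        Inflater reused = new Inflater();
        ByteBuffer dest = ByteBuffer.allocateDirect(raw.length);
        ByteBuffer direct = decompressDirect(reused, compressed, dest);
        System.out.println("outputs match: " + viaArrays.equals(direct));
        reused.end();
    }
}
```

In the second method the caller owns the Inflater and the destination
buffer, so decoding a block in this sketch allocates nothing and the
caller never stages data through a byte[].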



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
