[ 
https://issues.apache.org/jira/browse/HBASE-29123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk resolved HBASE-29123.
----------------------------------
    Resolution: Fixed

Pushed to branch-2.5+. Thanks a lot [~charlesconnell]!

> A faster CodecPool for HBase
> ----------------------------
>
>                 Key: HBASE-29123
>                 URL: https://issues.apache.org/jira/browse/HBASE-29123
>             Project: HBase
>          Issue Type: Improvement
>          Components: HFile, io
>            Reporter: Charles Connell
>            Assignee: Charles Connell
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 2.7.0, 3.0.0-beta-2, 2.6.3, 2.5.12
>
>         Attachments: borrow-decompressor.html, lease-counting.html, 
> return-decompressor.html
>
>
> I look at many profile flamegraphs of my company's RegionServers. I sometimes 
> see memory allocation inside of {{org.apache.hadoop.io.compress.CodecPool}} 
> taking up roughly 1% of my CPU time. The point of a CodecPool is to avoid 
> allocating short-lived objects, so this is not good. Luckily, these 
> allocations can be avoided. Attached are three flamegraphs showing the 
> allocations I'm talking about.
> I plan this ticket as the first of a series relating to decompression 
> performance. In the context of the overall series, it makes sense to fork 
> CodecPool out of hadoop-common and start a new copy of it in HBase. I'll do 
> that in this ticket and include my improvements:
> Change the pool data structure from {{HashMap<Class<Compressor>, 
> HashSet<Compressor>>}} to {{ConcurrentHashMap<Class<Compressor>, 
> ConcurrentSkipListSet<Compressor>>}}. This allows the "borrow" code:
> {code}
>     T codec = null;
>     Set<T> codecSet;
>     synchronized (pool) {
>       codecSet = pool.get(codecClass);
>     }
>     if (codecSet != null) {
>       synchronized (codecSet) {
>         if (!codecSet.isEmpty()) {
>           codec = codecSet.iterator().next();
>           codecSet.remove(codec);
>         }
>       }
>     }
> {code}
> to be re-written as:
> {code}
>     if (codecClass == null) {
>       return null;
>     }
>     NavigableSet<T> codecSet = pool.get(codecClass);
>     if (codecSet != null) {
>       return codecSet.pollFirst();
>     } else {
>       return null;
>     }
> {code}
> thus avoiding the allocation of an iterator and the necessity of locking.
> The lease counters are only read in unit tests, so I'll stop updating those 
> outside of testing.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to