[ https://issues.apache.org/jira/browse/HBASE-27891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17791688#comment-17791688 ]
Bryan Beaudreault commented on HBASE-27891:
-------------------------------------------

Hey [~wchevreuil]. We aren't working on it right now, but happy to provide review. I'll take a look at those linked issues as well, thanks!

For us, optimization of the BlockCache has fallen off the high-priority list. We'll probably get back to it eventually, but we're not sure exactly when. A bigger caching issue for us is bloom filters, which can be quite large and are currently forced into the on-heap cache. As a result, we end up doing a lot of micromanaging of bloom error rates and heap sizes, which I'd like to avoid. I'd like to investigate pushing those off-heap; maybe we can mitigate some of the BucketCache deserialization overhead by not parsing the full HFileBlock for blooms, and instead keeping it as an unparsed ByteBuff that is passed directly to BloomFilterUtil where necessary. Of course, that would probably require some big changes, and it's unrelated to this issue, but I figured I'd say what sorts of things we're thinking about :)

> Report heap used by BucketCache as a jmx metric
> -----------------------------------------------
>
>                 Key: HBASE-27891
>                 URL: https://issues.apache.org/jira/browse/HBASE-27891
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Bryan Beaudreault
>            Priority: Major
>
> The BucketCache can take a non-trivial amount of heap, especially for very
> large cache sizes. For example, we have a server with 500k blocks in the
> bucket cache, and according to a heap dump it was holding around 260mb. One
> needs to account for this when determining the size of heap to use, so we
> should report it.
> The major contributors I saw were the offsetLock, blocksByHFile set, and
> backingMap.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
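The issue description gives two concrete figures (~500k blocks in the bucket cache holding ~260 MB of heap, spread across the offsetLock, blocksByHFile set, and backingMap), which imply a per-block heap cost. A minimal back-of-envelope sketch of that arithmetic; the class and method names are illustrative, not HBase APIs:

```java
// Rough estimate of on-heap overhead per cached block in the BucketCache,
// using the figures reported in HBASE-27891: ~260 MB of heap retained for
// ~500k cached blocks. Names here are hypothetical, for illustration only.
public class BucketCacheHeapEstimate {

    // Integer bytes of heap attributable to each cached block's bookkeeping
    // (backingMap entry, blocksByHFile entry, share of the striped offsetLock).
    static long perBlockOverheadBytes(long heapBytes, long blockCount) {
        return heapBytes / blockCount;
    }

    public static void main(String[] args) {
        long blocks = 500_000L;                 // blocks resident in the cache
        long heapBytes = 260L * 1024 * 1024;    // ~260 MB observed in a heap dump
        // 272,629,760 / 500,000 ~= 545 bytes of heap per cached block
        System.out.println(perBlockOverheadBytes(heapBytes, blocks) + " bytes/block");
    }
}
```

At ~545 bytes per block, the heap cost scales linearly with cache size, which is why exposing it as a JMX metric (rather than requiring a heap dump) would help operators size their heaps.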