[
https://issues.apache.org/jira/browse/HBASE-30225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wellington Chevreuil updated HBASE-30225:
-----------------------------------------
Attachment: flame-graph-zoomed.png
> Performance degradation observed on ycsb reads benchmark after HBASE-29727
> --------------------------------------------------------------------------
>
> Key: HBASE-30225
> URL: https://issues.apache.org/jira/browse/HBASE-30225
> Project: HBase
> Issue Type: Bug
> Reporter: Wellington Chevreuil
> Assignee: Wellington Chevreuil
> Priority: Major
> Attachments: flame-graph-zoomed.png, flamegraph-high-level.png
>
>
> In HBASE-29727 we replaced a single Path attribute with three String fields for
> the region, column family and file names, respectively. Since these values tend
> to be highly redundant in large caches (the same region, family and file names
> are shared by many different blocks), we introduced a string pool to avoid
> repeating string values and to reduce heap allocation.
> When executing ycsb read workloads, we observed a ~30% latency degradation.
> The problem was that we added the logic for parsing the file Path into region
> and family names, as well as the archiving checks, all in the BlockCacheKey
> constructor used by HFileReaderImpl at the beginning of each block read. As
> seen in the attached flame graphs, which cover a five-minute window on one of
> the RSes, around 30% of the CPU time was spent in the BlockCacheKey
> constructor, calling either Path.getParent() or HFileUtils.isHFileArchived().
> !flame-graph-zoomed.png!
>
>
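To make the description above concrete, here is a minimal sketch of the two ideas it mentions: a string pool that shares one instance per distinct name, and parsing the file path once per file instead of once per block read. This is not the HBase code and not necessarily the fix for this issue. The class and method names (BlockKeySketch, FileNames, Key, intern) are hypothetical, java.nio.file.Path stands in for the Hadoop Path class, and the .../<region>/<family>/<file> layout is an assumption made for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.ConcurrentHashMap;

public class BlockKeySketch {
  // Hypothetical pool: keeps one shared instance per distinct string value.
  static final ConcurrentHashMap<String, String> POOL = new ConcurrentHashMap<>();

  // Returns the pooled instance equal to s, adding s to the pool if absent.
  static String intern(String s) {
    String previous = POOL.putIfAbsent(s, s);
    return previous == null ? s : previous;
  }

  // Hypothetical per-file holder: the path is parsed once, when it is built.
  static final class FileNames {
    final String region;
    final String family;
    final String file;

    FileNames(Path hfilePath) {
      // Assumed layout: .../<region>/<family>/<file>
      Path familyDir = hfilePath.getParent();
      Path regionDir = familyDir.getParent();
      this.file = intern(hfilePath.getFileName().toString());
      this.family = intern(familyDir.getFileName().toString());
      this.region = intern(regionDir.getFileName().toString());
    }
  }

  // Hypothetical cache key: the constructor only stores references,
  // so no path parsing happens on the per-block read path.
  static final class Key {
    final FileNames names;
    final long offset;

    Key(FileNames names, long offset) {
      this.names = names;
      this.offset = offset;
    }
  }

  public static void main(String[] args) {
    Path path = Paths.get("/hbase/data/default/t1/region-a/cf/hfile-1");
    FileNames names = new FileNames(path); // parsed once per file
    Key first = new Key(names, 0L);        // cheap: no parsing per block
    Key second = new Key(names, 65536L);
    System.out.println(first.names.region + "/" + first.names.family
        + "/" + first.names.file + " @ " + second.offset);
  }
}
```

The point of the sketch is only that the per-block constructor stores references to values that were already parsed, so calls like getParent() stay off the hot read path. Where the real code should do that one-time parsing is a separate question.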
--
This message was sent by Atlassian Jira
(v8.20.10#820010)