EungsopYoo commented on PR #7291: URL: https://github.com/apache/hbase/pull/7291#issuecomment-3787316067
@VladRodionov Thank you for your comments. I'd like to respond to the questions you raised.

> * Heap pressure: using Java-heap–resident structures (or reusing BucketCache metadata paths) may increase GC pressure under some workloads. ZGC certainly helps here, but I'm curious how this is expected to behave for row-sized objects with high churn.

Although it will be clearer once we run some tests, under high-churn workloads I expect RowCache, like BlockCache, to experience significant GC pressure. In such cases it would likely be more appropriate not to use RowCache at all.

> * Metadata overhead: for cache entries on the order of a single row, the relative metadata overhead can become significant compared to the payload. It would be interesting to understand how this is being evaluated or measured in the current approach.

That's correct. Because the per-entry overhead is large, I plan to set the default RowCache size to 2% of the heap, which is much smaller than the roughly 40% of the heap typically allocated to BlockCache (see the sizing sketch at the end of this comment).

> * Sparse row access: some applications primarily care about a subset of row data (for example, only the latest versions of selected cells). Google Bigtable's row cache supports sparse rows — are similar access patterns in scope here, or is the focus on full-row caching?
> * In-place mutation: supporting sparse rows often implies updating cached entries in place so the cache always reflects the most recent version of the row. I'm curious whether this is within the intended scope or something to consider later.

In this PR I am not considering caching for sparse rows; the focus is on caching full rows only. Support for sparse row caching can be considered later.
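To make the sizing comparison concrete, here is a small illustrative sketch. The 2% figure is the proposed RowCache default described above, and the 0.4 factor matches the usual BlockCache default (`hfile.block.cache.size`); the class and variable names are just for illustration, not from this PR:

```java
// Illustrative arithmetic only: compares the proposed 2% RowCache default
// against the typical 40% of heap allocated to BlockCache.
public class CacheSizingSketch {
  public static void main(String[] args) {
    long maxHeap = Runtime.getRuntime().maxMemory(); // e.g. ~32 GiB with -Xmx32g
    long rowCache = (long) (maxHeap * 0.02);   // proposed RowCache default: 2% of heap
    long blockCache = (long) (maxHeap * 0.40); // typical hfile.block.cache.size = 0.4
    System.out.printf("heap=%,d  rowCache=%,d  blockCache=%,d (bytes)%n",
        maxHeap, rowCache, blockCache);
  }
}
```

At `-Xmx32g` this works out to roughly 0.64 GiB for RowCache versus about 12.8 GiB for BlockCache, so even a relatively high per-entry metadata overhead stays small in absolute terms.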
