ASF GitHub Bot updated HBASE-30018:
-----------------------------------
Labels: pull-request-available (was: )
> Pluggable HBase Block Cache System
> ----------------------------------
>
> Key: HBASE-30018
> URL: https://issues.apache.org/jira/browse/HBASE-30018
> Project: HBase
> Issue Type: Umbrella
> Components: BlockCache, Performance
> Reporter: Vladimir Rodionov
> Assignee: Vladimir Rodionov
> Priority: Major
> Labels: pull-request-available
>
> h2. Description
> This umbrella tracks work to refactor the HBase block cache into a *pluggable
> architecture* with clear separation of concerns:
> * cache storage (*CacheEngine*)
> * cache topology (*CacheTopology*, L1/L2 orchestration)
> * placement and admission policy (*CachePlacementPolicy*)
> The current design tightly couples these responsibilities across
> *BlockCache*, *CombinedBlockCache*, and *BucketCache*, making it difficult to:
> * introduce new cache implementations
> * evolve cache behavior independently
> * experiment with alternative policies and topologies
> In addition, existing implementations (e.g., *BucketCache*) incur significant
> metadata overhead at large scale (e.g., ~9 GB of metadata for a ~1.6 TB cache
> with 64 KB blocks), reducing effective cache capacity.
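To put the quoted overhead in per-block terms, here is a small arithmetic sketch. It uses only the three figures from the description above; the class and method names are hypothetical, and it assumes the figures use binary units (the per-block result is the same with decimal units, since the ratio is unchanged).

```java
/** Back-of-the-envelope estimate of metadata cost per cached block. */
final class MetadataOverheadEstimate {
    // Assumption: binary units (1 KB = 1024 bytes); the ticket does not say which it uses.
    static final long KB = 1024L;
    static final long GB = KB * KB * KB;
    static final long TB = GB * KB;

    private MetadataOverheadEstimate() {
    }

    /** Number of whole fixed-size blocks that fit in a cache of the given size. */
    static long blockCount(long cacheBytes, long blockBytes) {
        return cacheBytes / blockBytes;
    }

    /** Average metadata bytes attributed to each cached block. */
    static double metadataBytesPerBlock(long metadataBytes, long cacheBytes, long blockBytes) {
        return (double) metadataBytes / blockCount(cacheBytes, blockBytes);
    }

    public static void main(String[] args) {
        long cacheBytes = 16 * TB / 10; // ~1.6 TB, kept in integer arithmetic
        long blockBytes = 64 * KB;      // 64 KB blocks
        long metadataBytes = 9 * GB;    // ~9 GB of metadata

        System.out.println("blocks: " + blockCount(cacheBytes, blockBytes));
        System.out.println("metadata bytes per block: "
            + metadataBytesPerBlock(metadataBytes, cacheBytes, blockBytes));
    }
}
```

Under these assumptions the cache holds roughly 26.8 million blocks, so the quoted overhead works out to about 360 bytes of metadata per 64 KB block.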
> h2. Proposed Architecture
> This effort introduces a layered architecture with explicit responsibilities:
> * *CacheEngine* — storage abstraction for cache backends
> * *CacheTopology* — tier orchestration and coordination
> * *CachePlacementPolicy* — admission, placement, and promotion decisions
> * *CacheAccessService* — unified entry point for read/write cache interactions
> Key improvements include:
> * explicit cache admission control on the write/{{put(...)}} path
> * separation of storage, orchestration, and policy concerns
> * support for multiple cache engines and topologies
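The layering above could be sketched roughly as follows. Only the four type names (CacheEngine, CacheTopology, CachePlacementPolicy, CacheAccessService) come from the description; every method signature, the string keys, and the MapEngine and SingleTierTopology helpers are hypothetical and are not taken from the RFC or from any HBase API.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative layering only; all signatures here are assumptions. */
final class PluggableCacheSketch {
    private PluggableCacheSketch() {
    }

    /** Storage abstraction: a single cache backend. */
    interface CacheEngine {
        Optional<byte[]> get(String key);

        void put(String key, byte[] block);
    }

    /** Orchestration: coordinates one or more engines (tiers). */
    interface CacheTopology {
        Optional<byte[]> get(String key);

        void put(String key, byte[] block);
    }

    /** Policy: decides whether a block may enter the cache at all. */
    interface CachePlacementPolicy {
        boolean admit(String key, byte[] block);
    }

    /** Hypothetical engine backed by an in-memory map, with no eviction. */
    static final class MapEngine implements CacheEngine {
        private final Map<String, byte[]> blocks = new ConcurrentHashMap<>();

        @Override
        public Optional<byte[]> get(String key) {
            return Optional.ofNullable(blocks.get(key));
        }

        @Override
        public void put(String key, byte[] block) {
            blocks.put(key, block);
        }
    }

    /** Hypothetical topology that delegates everything to one engine. */
    static final class SingleTierTopology implements CacheTopology {
        private final CacheEngine engine;

        SingleTierTopology(CacheEngine engine) {
            this.engine = engine;
        }

        @Override
        public Optional<byte[]> get(String key) {
            return engine.get(key);
        }

        @Override
        public void put(String key, byte[] block) {
            engine.put(key, block);
        }
    }

    /** Unified entry point: consults the policy before any write reaches the topology. */
    static final class CacheAccessService {
        private final CacheTopology topology;
        private final CachePlacementPolicy policy;

        CacheAccessService(CacheTopology topology, CachePlacementPolicy policy) {
            this.topology = topology;
            this.policy = policy;
        }

        Optional<byte[]> getBlock(String key) {
            return topology.get(key);
        }

        /** Returns true if the block was admitted and stored, false if the policy rejected it. */
        boolean cacheBlock(String key, byte[] block) {
            if (!policy.admit(key, block)) {
                return false;
            }
            topology.put(key, block);
            return true;
        }
    }
}
```

In this sketch the admission check sits in the access service, so callers never reach an engine directly and a policy can be swapped without touching storage code. A policy can be as small as a lambda, for example `(key, block) -> block.length <= 64 * 1024`.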
> h2. Implementation Plan
> h3. Phase 1 — Foundation
> # Introduce *CacheEngine* and *CacheTopology* abstractions (no behavior
> change)
> # Introduce *CachePlacementPolicy* abstractions and default compatibility
> policy
> # Introduce *CacheAccessService* abstraction
> h3. Phase 2 — Migrate callers
> # Refactor *HFileReaderImpl* to use *CacheAccessService* for block cache
> access
> # Refactor write path, prefetch, and compaction cache population to use
> *CacheAccessService*
> h3. Phase 3 — Extract orchestration
> # Refactor *CombinedBlockCache* into explicit *TieredExclusiveTopology*
> # Add *TieredInclusiveTopology* support (optional / later)
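The ticket does not define the two topologies, so the following is only one possible reading. It assumes "exclusive" means each block lives in exactly one tier, routed by block category in the way the existing CombinedBlockCache sends index and bloom blocks to L1 and data blocks to L2, and that "inclusive" means L2 additionally holds a copy of everything in L1. All class, enum, and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative two-tier placement; not thread-safe and without eviction. */
final class TieredTopologySketch {
    private TieredTopologySketch() {
    }

    /** Hypothetical coarse block classification used for routing. */
    enum BlockCategory { META, DATA }

    static final class TwoTierCache {
        private final Map<String, byte[]> l1 = new HashMap<>();
        private final Map<String, byte[]> l2 = new HashMap<>();
        private final boolean inclusive;

        /** @param inclusive if true, L2 also keeps a copy of every block placed in L1 */
        TwoTierCache(boolean inclusive) {
            this.inclusive = inclusive;
        }

        void put(String key, byte[] block, BlockCategory category) {
            if (category == BlockCategory.META) {
                l1.put(key, block);
                if (inclusive) {
                    l2.put(key, block); // assumed inclusive behavior: L2 mirrors L1
                }
            } else {
                l2.put(key, block); // data blocks go to the larger tier only
            }
        }

        /** Looks in L1 first, then falls back to L2; returns null on a miss. */
        byte[] get(String key) {
            byte[] block = l1.get(key);
            return block != null ? block : l2.get(key);
        }

        boolean inL1(String key) {
            return l1.containsKey(key);
        }

        boolean inL2(String key) {
            return l2.containsKey(key);
        }
    }
}
```

With `new TwoTierCache(false)` a META block is found only in L1, while `new TwoTierCache(true)` stores it in both tiers. DATA blocks land only in L2 in either mode.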
> h3. Phase 4 — Migrate implementations
> # Refactor *LruBlockCache* to implement *CacheEngine*
> # Refactor *BucketCache* to implement *CacheEngine*
> # Remove topology-specific assumptions from *BucketCache* (if a separate
> cleanup is still needed)
> h3. Phase 5 — Cleanup + observability
> # Add metrics for admission decisions and tiered cache topology
> # Deprecate and remove legacy *BlockCache* abstraction
> h3. Phase 6 — New engines
> # Add *CarrotCache*-based *CacheEngine* implementation
> h2. Notes
> * No behavior change is expected in early phases
> * Changes are designed to be incremental and reviewable
> * This umbrella groups the individual tasks required to deliver this
> functionality
> h2. References
> * RFC design doc:
> https://docs.google.com/document/d/1DOoRfqdDzC-Nz9zfCzSed8vbhRe5BMz8/edit?usp=sharing&ouid=107898024699489289958&rtpof=true&sd=true