[ 
https://issues.apache.org/jira/browse/HBASE-30018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-30018:
--------------------------------------
    Description: 
h2. Description

This umbrella tracks work to refactor the HBase block cache into a *pluggable 
architecture* with clear separation of concerns:

* cache storage (*CacheEngine*)
* cache topology (*CacheTopology*, L1/L2 orchestration)
* placement and admission policy (*CachePlacementPolicy*)

The current design tightly couples these responsibilities across *BlockCache*, 
*CombinedBlockCache*, and *BucketCache*, making it difficult to:

* introduce new cache implementations
* evolve cache behavior independently
* experiment with alternative policies and topologies

In addition, existing implementations (e.g., *BucketCache*) incur significant 
metadata overhead at large scale (e.g., ~9GB of metadata for a ~1.6TB cache 
with 64KB blocks), reducing effective cache capacity.
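
To put the quoted figures in per-block terms, here is a back-of-the-envelope calculation. It assumes binary units and a cache filled entirely with 64KB blocks; the class and method names are illustrative only and are not part of HBase. Under those assumptions the numbers work out to roughly 26.8 million blocks and about 360 bytes of metadata per cached block.

```java
// Rough per-block estimate derived from the figures quoted in the description.
// Assumption: binary units (1 KB = 2^10, 1 GB = 2^30, 1 TB = 2^40 bytes).
final class MetadataOverheadEstimate {
    static final long KB = 1L << 10;
    static final long GB = 1L << 30;
    static final long TB = 1L << 40;

    /** Number of fixed-size blocks that fit in a cache of the given capacity. */
    static long blockCount(long cacheBytes, long blockBytes) {
        return cacheBytes / blockBytes;
    }

    /** Average metadata bytes spent per cached block. */
    static double bytesPerBlock(long metadataBytes, long blocks) {
        return (double) metadataBytes / blocks;
    }

    public static void main(String[] args) {
        long cacheBytes = (long) (1.6 * TB);          // ~1.6TB cache
        long blocks = blockCount(cacheBytes, 64 * KB); // 64KB blocks
        double perBlock = bytesPerBlock(9 * GB, blocks); // ~9GB metadata
        System.out.printf("blocks=%d, metadata per block=%.0f bytes%n", blocks, perBlock);
    }
}
```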

h2. Proposed Architecture

This effort introduces a layered architecture with explicit responsibilities:

* *CacheEngine* — storage abstraction for cache backends
* *CacheTopology* — tier orchestration and coordination
* *CachePlacementPolicy* — admission, placement, and promotion decisions
* *CacheAccessService* — unified entry point for read/write cache interactions

Key improvements include:

* explicit cache admission control on the write/{{put(...)}} path
* separation of storage, orchestration, and policy concerns
* support for multiple cache engines and topologies
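
The separation described above could look roughly like the following sketch. Only the type names *CacheEngine*, *CachePlacementPolicy*, and *CacheAccessService* come from this issue; every method signature, the string keys, the byte-array blocks, and the {{MapCacheEngine}} helper are assumptions made for illustration, and *CacheTopology* is left out for brevity. The real interfaces are defined in the RFC and the subtasks, not here.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Storage only: knows nothing about tiers or admission (hypothetical signature). */
interface CacheEngine {
    Optional<byte[]> get(String key);
    void put(String key, byte[] block);
}

/** Policy only: decides whether a block may enter the cache (hypothetical signature). */
interface CachePlacementPolicy {
    boolean admit(String key, byte[] block);
}

/** Trivial heap-backed engine, used here purely as a stand-in backend. */
final class MapCacheEngine implements CacheEngine {
    private final Map<String, byte[]> blocks = new ConcurrentHashMap<>();

    @Override
    public Optional<byte[]> get(String key) {
        return Optional.ofNullable(blocks.get(key));
    }

    @Override
    public void put(String key, byte[] block) {
        blocks.put(key, block);
    }
}

/** Single entry point: callers never talk to the engine or the policy directly. */
final class CacheAccessService {
    private final CacheEngine engine;
    private final CachePlacementPolicy policy;

    CacheAccessService(CacheEngine engine, CachePlacementPolicy policy) {
        this.engine = engine;
        this.policy = policy;
    }

    Optional<byte[]> read(String key) {
        return engine.get(key);
    }

    /** Explicit admission control on the write path; returns true if the block was cached. */
    boolean write(String key, byte[] block) {
        if (!policy.admit(key, block)) {
            return false;
        }
        engine.put(key, block);
        return true;
    }
}

final class CacheSketchDemo {
    public static void main(String[] args) {
        // Example policy (an assumption): reject blocks larger than 64KB.
        CachePlacementPolicy sizeLimit = (key, block) -> block.length <= 64 * 1024;
        CacheAccessService cache = new CacheAccessService(new MapCacheEngine(), sizeLimit);
        System.out.println("small admitted: " + cache.write("small", new byte[1024]));
        System.out.println("large admitted: " + cache.write("large", new byte[128 * 1024]));
    }
}
```

In this sketch, swapping the engine or the policy does not touch the caller, which is the property the layering is meant to provide.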

h2. Implementation Plan

h3. Phase 1 — foundation

# Introduce *CacheEngine* and *CacheTopology* abstractions (no behavior change)
# Introduce the *CachePlacementPolicy* abstraction and a default compatibility policy
# Introduce *CacheAccessService* abstraction

h3. Phase 2 — migrate callers

# Refactor *HFileReaderImpl* to use *CacheAccessService* for block cache access
# Refactor the write path, prefetch, and compaction cache population to use 
*CacheAccessService*

h3. Phase 3 — extract orchestration

# Refactor *CombinedBlockCache* into an explicit *TieredExclusiveTopology*
# Add *TieredInclusiveTopology* support (optional / later)
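
As a rough illustration of what "exclusive" could mean here, the sketch below routes each block to exactly one tier by category, which is how *CombinedBlockCache* splits meta blocks (L1) from data blocks (L2) today. This is an assumption about the intended semantics rather than the design from the RFC; the {{BlockCategory}} enum, the class name, and the in-memory maps standing in for the two tiers are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical block classification used only for this sketch. */
enum BlockCategory { META, DATA }

final class ExclusiveTopologySketch {
    // Plain maps stand in for the L1 and L2 cache engines.
    private final Map<String, byte[]> l1 = new HashMap<>();
    private final Map<String, byte[]> l2 = new HashMap<>();

    /** Exclusive placement: a block is stored in exactly one tier. */
    void put(String key, BlockCategory category, byte[] block) {
        if (category == BlockCategory.META) {
            l1.put(key, block);
        } else {
            l2.put(key, block);
        }
    }

    /** Lookup checks L1 first and falls back to L2; returns null on a miss. */
    byte[] get(String key) {
        byte[] hit = l1.get(key);
        return hit != null ? hit : l2.get(key);
    }

    boolean inL1(String key) { return l1.containsKey(key); }

    boolean inL2(String key) { return l2.containsKey(key); }

    public static void main(String[] args) {
        ExclusiveTopologySketch topology = new ExclusiveTopologySketch();
        topology.put("index-0", BlockCategory.META, new byte[16]);
        topology.put("data-0", BlockCategory.DATA, new byte[64]);
        System.out.println("index-0 in L1 only: "
            + (topology.inL1("index-0") && !topology.inL2("index-0")));
        System.out.println("data-0 in L2 only: "
            + (topology.inL2("data-0") && !topology.inL1("data-0")));
    }
}
```

Under the same reading, an inclusive variant would differ mainly in {{put}}, keeping a copy in L2 for blocks that also live in L1.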

h3. Phase 4 — migrate implementations

# Refactor *LruBlockCache* to implement *CacheEngine*
# Refactor *BucketCache* to implement *CacheEngine*
# Remove topology-specific assumptions from *BucketCache* (if still needed as a 
separate cleanup)

h3. Phase 5 — cleanup + observability

# Add metrics for admission decisions and tiered cache topology
# Deprecate and remove the legacy *BlockCache* abstraction

h3. Phase 6 — new engines

# Add *CarrotCache*-based *CacheEngine* implementation

h2. Notes

* No behavior change is expected in early phases
* Changes are designed to be incremental and reviewable
* This umbrella groups the individual tasks required to deliver this 
functionality

h2. References

* RFC design doc:
https://docs.google.com/document/d/1DOoRfqdDzC-Nz9zfCzSed8vbhRe5BMz8/edit?usp=sharing&ouid=107898024699489289958&rtpof=true&sd=true

  was:
This umbrella tracks work to refactor the HBase block cache into a pluggable 
architecture with clear separation of:
        •       cache storage (engine)
        •       cache topology (L1/L2 orchestration)
        •       placement and admission policy

The current design tightly couples these concerns (BlockCache, 
CombinedBlockCache, BucketCache), making it difficult to introduce new cache 
implementations and evolve cache behavior.

In addition, existing implementations (e.g., BucketCache) incur significant 
metadata overhead at large scale (e.g., ~9GB for ~1.6TB cache with 64KB 
blocks), reducing effective cache capacity.

This effort introduces:
        •       BlockCacheEngine (storage abstraction)
        •       CacheTopology (tier orchestration)
        •       CachePlacementPolicy (admission, placement, promotion)
        •       explicit cache admission control on write/put path
        •       CacheAccessService for read/write path integration

The work will be implemented incrementally:
        •       Phase 1: introduce internal APIs (no behavior change)
        •       Phase 2: refactor existing topology (CombinedBlockCache)
        •       Phase 3: adapt BucketCache to new interfaces
        •       Phase 4: enable new engines (e.g., CarrotCache, EHCache)

This umbrella groups the individual tasks required to deliver this 
functionality.

RFC design doc: 
https://docs.google.com/document/d/1DOoRfqdDzC-Nz9zfCzSed8vbhRe5BMz8/edit?usp=sharing&ouid=107898024699489289958&rtpof=true&sd=true


> Pluggable HBase Block Cache System
> ----------------------------------
>
>                 Key: HBASE-30018
>                 URL: https://issues.apache.org/jira/browse/HBASE-30018
>             Project: HBase
>          Issue Type: Umbrella
>          Components: BlockCache, Performance
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)