Re: [PR] HBASE-30019 Introduce CacheEngine and CacheTopology abstractions [hbase]

via GitHub Mon, 27 Apr 2026 18:49:27 -0700


VladRodionov commented on code in PR #8155:
URL: https://github.com/apache/hbase/pull/8155#discussion_r3151112264



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/cache/CacheEngine.java:
##########
@@ -0,0 +1,319 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.io.hfile.cache;
+
+import java.util.Optional;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hbase.io.hfile.BlockCacheKey;
+import org.apache.hadoop.hbase.io.hfile.BlockType;
+import org.apache.hadoop.hbase.io.hfile.CacheStats;
+import org.apache.hadoop.hbase.io.hfile.Cacheable;
+import org.apache.hadoop.hbase.io.hfile.HFileBlock;
+import org.apache.yetus.audience.InterfaceAudience;
+
+/**
+ * Storage abstraction for a concrete HBase block cache backend.
+ *
+ * <p>A {@code CacheEngine} represents the storage layer only. It is responsible for storing,
+ * retrieving, invalidating, and reporting statistics for cached blocks. It does not perform
+ * tier orchestration, admission control, placement decisions, or promotion/demotion across
+ * cache levels.</p>
+ *
+ * <p>This interface is intentionally aligned with the storage-oriented subset of the current
+ * {@code BlockCache} contract so that existing implementations such as LruBlockCache and
+ * BucketCache can be migrated incrementally with minimal behavioral risk.</p>
+ *
+ * <p>Responsibilities of a {@code CacheEngine} include:</p>
+ * <ul>
+ *   <li>block lookup</li>
+ *   <li>block insertion</li>
+ *   <li>targeted invalidation / eviction</li>
+ *   <li>capacity and occupancy reporting</li>
+ *   <li>engine-local statistics</li>
+ *   <li>optional implementation-specific fit/capability checks</li>
+ * </ul>
+ *
+ * <p>Non-responsibilities include:</p>
+ * <ul>
+ *   <li>L1/L2 topology orchestration</li>
+ *   <li>admission policy</li>
+ *   <li>tier placement decisions</li>
+ *   <li>promotion or demotion across tiers</li>
+ * </ul>
+ */
[email protected]
+public interface CacheEngine {
+
+  /**
+   * Returns a human-readable name for this cache engine instance.
+   *
+   * <p>The name is intended for logging, metrics, diagnostics, and configuration reporting.
+   * It should be stable for the lifetime of the engine instance.</p>
+   *
+   * @return engine name
+   */
+  String getName();
+
+  /**
+   * Returns the engine type.
+   *
+   * <p>This identifies the concrete backend family, such as LRU, BUCKET, or CARROT. The type
+   * is useful for metrics, diagnostics, and topology assembly.</p>
+   *
+   * @return engine type
+   */
+  CacheEngineType getType();
+
+  /**
+   * Adds a block to the cache.
+   *
+   * @param cacheKey block cache key
+   * @param buf block contents
+   * @param inMemory whether the block should be treated as in-memory
+   */
+  void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory);
+
+  /**
+   * Adds a block to the cache, optionally waiting for asynchronous cache backends.
+   *
+   * <p>This is primarily useful for implementations such as BucketCache that may buffer writes
+   * asynchronously.</p>
+   *
+   * @param cacheKey block cache key
+   * @param buf block contents
+   * @param inMemory whether the block should be treated as in-memory
+   * @param waitWhenCache whether to wait for the cache operation to be accepted/flushed
+   */
+  default void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
+      boolean waitWhenCache) {
+    cacheBlock(cacheKey, buf, inMemory);
+  }
+
+  /**
+   * Adds a block to the cache, defaulting to non in-memory treatment.
+   *
+   * @param cacheKey block cache key
+   * @param buf block contents
+   */
+  void cacheBlock(BlockCacheKey cacheKey, Cacheable buf);
+
+  /**
+   * Fetches a block from the cache.
+   *
+   * @param cacheKey block to fetch
+   * @param caching whether caching is enabled for the request; used for metrics
+   * @param repeat whether this is a repeated lookup for the same block; used to avoid
+   *               double-counting misses
+   * @param updateCacheMetrics whether cache metrics should be updated
+   * @return cached block, or {@code null} if not present
+   */
+  Cacheable getBlock(BlockCacheKey cacheKey, boolean caching, boolean repeat,
+      boolean updateCacheMetrics);
+
+  /**
+   * Fetches a block from the cache with an optional block type hint.
+   *
+   * <p>Implementations may ignore the block type if it is not needed.</p>
+   *
+   * @param cacheKey block to fetch
+   * @param caching whether caching is enabled for the request; used for metrics
+   * @param repeat whether this is a repeated lookup for the same block; used to avoid
+   *               double-counting misses
+   * @param updateCacheMetrics whether cache metrics should be updated
+   * @param blockType optional block type hint
+   * @return cached block, or {@code null} if not present
+   */
+  default Cacheable getBlock(BlockCacheKey cacheKey, boolean caching, boolean repeat,
+      boolean updateCacheMetrics, BlockType blockType) {
+    return getBlock(cacheKey, caching, repeat, updateCacheMetrics);
+  }
+
+  /**
+   * Evicts a single block from the cache.
+   *
+   * @param cacheKey block to evict
+   * @return {@code true} if the block existed and was evicted, {@code false} otherwise
+   */
+  boolean evictBlock(BlockCacheKey cacheKey);
+
+  /**
+   * Evicts all cached blocks for the given HFile.
+   *
+   * @param hfileName HFile name
+   * @return number of blocks evicted
+   */
+  int evictBlocksByHfileName(String hfileName);
+
+  /**
+   * Evicts all cached blocks for the given HFile within the specified offset range.
+   *
+   * <p>This is useful for targeted invalidation during file lifecycle events where only a subset
+   * of blocks should be removed.</p>
+   *
+   * @param hfileName HFile name
+   * @param initOffset inclusive start offset
+   * @param endOffset inclusive end offset
+   * @return number of blocks evicted
+   */
+  default int evictBlocksRangeByHfileName(String hfileName, long initOffset, long endOffset) {
+    return 0;
+  }
+
+  /**
+   * Evicts all cached blocks associated with the specified region.
+   *
+   * <p>This is a new API intended to support region-scoped invalidation in a storage-oriented
+   * way, without requiring higher-level code to enumerate files first.</p>
+   *
+   * @param regionName region name
+   * @return number of blocks evicted
+   */
+  default int evictBlocksByRegionName(String regionName) {
+    return 0;
+  }
+
+  /**
+   * Returns engine statistics.
+   *
+   * @return cache statistics
+   */
+  CacheStats getStats();
+
+  /**
+   * Shuts down this cache engine and releases any owned resources.
+   */
+  void shutdown();
+
+  /**
+   * Returns the total configured cache size, in bytes.
+   *
+   * @return total cache size
+   */
+  long size();
+
+  /**
+   * Returns the maximum configured cache size, in bytes.
+   *
+   * @return maximum cache size
+   */
+  long getMaxSize();
+
+  /**
+   * Returns the amount of free space available in the cache, in bytes.
+   *
+   * @return free size
+   */
+  long getFreeSize();
+
+  /**
+   * Returns the currently occupied cache size, in bytes.
+   *
+   * @return occupied size
+   */
+  long getCurrentSize();

Review Comment:
   getMaxSize is the configured capacity, getCurrentSize is actual usage, and
   getFreeSize is an engine-provided fast/accurate view of remaining capacity.
   These are not interchangeable because allocation models differ across
   engines. size() does seem redundant with getCurrentSize, so only one of them
   should stay: I will keep size() as the replacement and remove getCurrentSize.
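
   To illustrate why getFreeSize cannot always be derived as getMaxSize minus
   the occupied size, here is a minimal sketch. It is not HBase code: the
   SizeView interface and the HeapEngine and BucketEngine classes are
   hypothetical toys, and the fixed-size-bucket model is only an assumption
   about how an engine with a different allocation model might behave.

```java
// Hypothetical sketch, not part of the PR: two toy engines with different allocation models.
final class SizeAccessorsSketch {

  /** Reduced, hypothetical view of the size accessors discussed in this thread. */
  interface SizeView {
    long getMaxSize();  // configured capacity, in bytes
    long size();        // bytes occupied by cached payloads
    long getFreeSize(); // the engine's own view of remaining capacity, in bytes
  }

  /** Toy heap-style engine: capacity is consumed byte for byte. */
  static final class HeapEngine implements SizeView {
    private final long maxSize;
    private long used;

    HeapEngine(long maxSize) {
      this.maxSize = maxSize;
    }

    void store(long bytes) {
      if (bytes > maxSize - used) {
        throw new IllegalStateException("block does not fit");
      }
      used += bytes;
    }

    @Override public long getMaxSize() { return maxSize; }
    @Override public long size() { return used; }
    @Override public long getFreeSize() { return maxSize - used; }
  }

  /** Toy bucket-style engine (assumed model): each block occupies one whole bucket. */
  static final class BucketEngine implements SizeView {
    private final int bucketCount;
    private final long bucketSize;
    private int usedBuckets;
    private long used;

    BucketEngine(int bucketCount, long bucketSize) {
      this.bucketCount = bucketCount;
      this.bucketSize = bucketSize;
    }

    void store(long bytes) {
      if (bytes > bucketSize || usedBuckets == bucketCount) {
        throw new IllegalStateException("block does not fit");
      }
      usedBuckets++;
      used += bytes;
    }

    @Override public long getMaxSize() { return bucketCount * bucketSize; }
    @Override public long size() { return used; }
    // Only whole free buckets count, so this can be smaller than getMaxSize() - size().
    @Override public long getFreeSize() { return (bucketCount - usedBuckets) * bucketSize; }
  }

  public static void main(String[] args) {
    HeapEngine heap = new HeapEngine(4096);
    heap.store(100);
    BucketEngine bucket = new BucketEngine(4, 1024);
    bucket.store(100);
    for (SizeView e : new SizeView[] { heap, bucket }) {
      System.out.println(e.getClass().getSimpleName()
          + " max=" + e.getMaxSize()
          + " size=" + e.size()
          + " free=" + e.getFreeSize()
          + " max-size=" + (e.getMaxSize() - e.size()));
    }
  }
}
```

   With a 100-byte block stored in each engine, the toy heap engine reports
   3996 free bytes, which matches max minus size, while the toy bucket engine
   reports 3072 free bytes against a derived value of 3996. That gap is the
   reason to let each engine report its own free size.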



