[D] Proposal for Integrating Redis Distributed Cache alongside Caffeine for Enhanced Scalability and Consistency [gravitino]

via GitHub Fri, 19 Sep 2025 07:33:40 -0700


GitHub user lzh010817 created a discussion: Proposal for Integrating Redis 
Distributed Cache alongside Caffeine for Enhanced Scalability and Consistency


I would like to initiate a discussion regarding the potential integration 
of Redis-based distributed caching to complement the existing Caffeine local 
cache. While Caffeine excels in providing low-latency, in-memory caching for 
single-node deployments, Gravitino may benefit from a distributed caching layer 
to address challenges in high-concurrency scenarios and multi-node 
environments.

**Key Considerations for Redis Integration:**

1. **Scalability & Horizontal Expansion:**
    As Gravitino scales horizontally, each node keeps its own local cache 
(e.g., Caffeine), so an update applied on one node is not visible to the 
others until their entries expire, and a restarted node begins with a cold 
cache. Redis, as a distributed cache, could give all nodes a shared view of 
cached metadata, reducing redundant backend queries and improving throughput.
2. **Cache Consistency & Fault Tolerance:**
    Redis offers features like persistence, replication, and automatic 
failover, which mitigate risks of cache loss during node failures. This aligns 
with Gravitino’s need for reliable metadata management in distributed setups.
3. **Performance Optimization:**
    Caffeine serves reads in-process with no network hop, whereas every Redis 
lookup adds a network round trip. Pipelining and cluster-mode operations can 
keep that overhead small. For frequently accessed metadata (e.g., catalog 
details), Redis could serve as a shared L2 cache, while Caffeine remains the 
node-local L1 cache.
4. **Implementation Approach:**
    4.1 Introduce a cache abstraction layer to support pluggable cache 
providers (e.g., Caffeine for local, Redis for distributed).
    4.2 Leverage Redis Cluster for high availability, and use cache-warming 
strategies to preload hot metadata on startup.
    4.3 Use key-based expiration and invalidation policies to ensure data 
freshness across nodes.
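
To make points 3 and 4.1 concrete, here is a minimal sketch of a pluggable provider interface with a read-through L1/L2 lookup. All names (`CacheProvider`, `MapCacheProvider`, `TieredCache`) are hypothetical and are not existing Gravitino APIs. Both tiers are backed by a plain in-memory map so the example runs on its own; a real L1 would presumably wrap Caffeine and a real L2 would wrap a Redis client.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical provider interface; not an existing Gravitino API.
interface CacheProvider<K, V> {
  Optional<V> get(K key);

  void put(K key, V value);

  void invalidate(K key);
}

// In-memory stand-in used for both tiers in this sketch.
class MapCacheProvider<K, V> implements CacheProvider<K, V> {
  private final Map<K, V> store = new ConcurrentHashMap<>();

  @Override
  public Optional<V> get(K key) {
    return Optional.ofNullable(store.get(key));
  }

  @Override
  public void put(K key, V value) {
    store.put(key, value);
  }

  @Override
  public void invalidate(K key) {
    store.remove(key);
  }
}

// Read-through lookup order: node-local L1, then shared L2, then the backend.
class TieredCache<K, V> {
  private final CacheProvider<K, V> l1;
  private final CacheProvider<K, V> l2;
  private final Function<K, V> backend;

  TieredCache(CacheProvider<K, V> l1, CacheProvider<K, V> l2, Function<K, V> backend) {
    this.l1 = l1;
    this.l2 = l2;
    this.backend = backend;
  }

  V get(K key) {
    Optional<V> local = l1.get(key);
    if (local.isPresent()) {
      return local.get();
    }
    Optional<V> shared = l2.get(key);
    if (shared.isPresent()) {
      l1.put(key, shared.get()); // promote to L1 for later reads on this node
      return shared.get();
    }
    V loaded = backend.apply(key);
    if (loaded != null) {
      l2.put(key, loaded); // make the value visible to other nodes
      l1.put(key, loaded);
    }
    return loaded;
  }

  // Removes the key from both tiers; other nodes' L1 entries are NOT touched.
  void invalidate(K key) {
    l2.invalidate(key);
    l1.invalidate(key);
  }
}
```

With two `TieredCache` instances sharing one L2 provider, a value loaded by the first node is served to the second from L2 without another backend query. Note that `invalidate` only clears the calling node's L1, so cross-node invalidation needs a separate mechanism.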

**Open Questions for Community Feedback:**

- Are there specific use cases in Gravitino where distributed caching would 
provide the most value (e.g., multi-region deployments, frequent schema 
updates)?

- How might we balance the trade-offs between added infrastructure complexity 
(Redis cluster management) and performance gains?

- Would a hybrid cache architecture (Caffeine + Redis) be feasible, and what 
strategies could optimize cache coherence?
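
As a starting point for the coherence question, one possible strategy is to broadcast an invalidation message whenever a node writes, so that every node evicts its local copy. The sketch below simulates this with an in-process bus. `InvalidationBus` and `NodeCache` are hypothetical names, the bus stands in for a broadcast channel such as Redis Pub/Sub, and the shared map stands in for Redis itself.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// In-process stand-in for a broadcast channel (hypothetical).
class InvalidationBus {
  private final List<Consumer<String>> subscribers = new CopyOnWriteArrayList<>();

  void subscribe(Consumer<String> listener) {
    subscribers.add(listener);
  }

  // Delivers the invalidated key to every subscriber, including the sender.
  void publish(String key) {
    for (Consumer<String> subscriber : subscribers) {
      subscriber.accept(key);
    }
  }
}

// One instance per node: a local map in front of a shared store (hypothetical).
class NodeCache {
  private final Map<String, String> local = new ConcurrentHashMap<>();
  private final Map<String, String> shared;
  private final InvalidationBus bus;

  NodeCache(Map<String, String> shared, InvalidationBus bus) {
    this.shared = shared;
    this.bus = bus;
    // Evict the local copy whenever any node announces a change.
    bus.subscribe(key -> local.remove(key));
  }

  String get(String key) {
    String value = local.get(key);
    if (value == null) {
      value = shared.get(key);
      if (value != null) {
        local.put(key, value);
      }
    }
    return value;
  }

  void update(String key, String value) {
    shared.put(key, value); // write the shared copy first
    bus.publish(key);       // then ask every node to drop its stale local copy
  }
}
```

Redis Pub/Sub delivers messages at most once, so a node that is disconnected when a message is published would miss it. A short expiration on local entries would still be needed as a backstop in a real deployment.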

I believe exploring Redis integration could strengthen Gravitino’s performance 
in distributed environments while maintaining backward compatibility. Looking 
forward to your insights and collaboration!



GitHub link: https://github.com/apache/gravitino/discussions/8480
