Hello Patrick, Stefan, and Jeff,

Thanks for the insightful comments! Client-side failover is a useful and
widely adopted approach, especially for individual applications that need
fine-grained control over latency and retry behavior. It gives clients
flexibility in how and when they shift traffic across datacenters.

For a large-scale platform with 100+ independent clients, relying on
client-side failover introduces significant operational and maintenance
overhead. Each client must maintain its own failover logic, leading to
inconsistent behavior and slower incident mitigation across large
deployments. A server-side failover mechanism provides a single, centrally
controlled policy that applies uniformly to all workloads. This ensures
faster mitigation during incidents and eliminates the need to coordinate
changes across dozens or hundreds of client teams.

From a performance perspective, server-side failover also avoids the extra
request cycle inherent in client-side retries. With client-side failover, a
request must first fail locally before being retried remotely, adding an
additional network round trip and increasing tail latency. In contrast,
server-side remote quorum failover avoids that extra round trip during
failover and delivers lower-latency responses under degraded conditions.
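
To make that concrete, here is roughly the pattern each client team has to
carry today with the Java driver to get its own failover path. This is only
a rough sketch; the "remote-dc" execution profile name is illustrative and
would have to be defined in each application's driver configuration with
the backup datacenter as its local DC.

    import com.datastax.oss.driver.api.core.CqlSession;
    import com.datastax.oss.driver.api.core.cql.ResultSet;
    import com.datastax.oss.driver.api.core.cql.SimpleStatement;
    import com.datastax.oss.driver.api.core.servererrors.UnavailableException;

    public final class FailoverReads {

        // Sketch of client-side failover: try the local DC first, and only
        // retry against the backup DC after the local attempt has failed.
        public static ResultSet readWithFailover(CqlSession session, String cql) {
            SimpleStatement stmt = SimpleStatement.newInstance(cql);
            try {
                // First attempt: routed to the local datacenter (e.g. LOCAL_QUORUM).
                return session.execute(stmt);
            } catch (UnavailableException e) {
                // Second attempt: an additional, cross-datacenter round trip
                // through the hypothetical "remote-dc" execution profile.
                // A real client would also need to handle timeouts and
                // AllNodesFailedException here.
                return session.execute(stmt.setExecutionProfileName("remote-dc"));
            }
        }
    }

Multiplied across 100+ independent services, keeping that retry logic and
its timeouts and policies consistent is exactly the operational burden we
would like to avoid.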


On Tue, Dec 2, 2025 at 11:07 AM Jeff Jirsa <[email protected]> wrote:

> I’ve seen this exact application level behavior implemented as a DC
> preference list via extending the LoadBalancingPolicy in the java driver.
>
> Almost literally what you’re describing, just by extending the java
> implementation.
>
> No server side changes required at all. Application just gets a priority
> list and fails over if it gets an unavailable exception.
>
>
>
> On Dec 1, 2025, at 5:38 PM, Qc L <[email protected]> wrote:
>
> Hi Stefan,
>
> Thanks for raising the question about using a custom LoadBalancingPolicy!
> Remote quorum and LB-based failover aim to solve a similar problem
> (maintaining availability during datacenter degradation), so I did a
> comparison study of the server-side remote quorum solution versus relying
> on driver-side logic.
>
>
>    - For our use cases, we found that client-side failover is expensive
>    to operate and leads to fragmented behavior during incidents. We also
>    prefer failover to be triggered only when needed, not automatically. A
>    server-side mechanism gives us controlled, predictable behavior and makes
>    the system easier to operate.
>    - Remote quorum is implemented on the server, where the coordinator
>    can intentionally form a quorum using replicas in a backup region when the
>    local region cannot satisfy LOCAL_QUORUM or EACH_QUORUM. The database
>    determines the fallback region and how replicas are selected, and it
>    ensures that consistency guarantees still hold.
>
>
> By keeping this logic internal to the server, we provide a unified,
> consistent behavior across all clients without requiring any application
> changes.
>
> Happy to discuss further if helpful.
>
> On Mon, Dec 1, 2025 at 12:57 PM Štefan Miklošovič <[email protected]>
> wrote:
>
>> Hello,
>>
>> before going deeper into your proposal ... I just have to ask, have you
>> tried e.g. custom LoadBalancingPolicy in driver, if you use that?
>>
>> I can imagine that if the driver detects a node to be down then based on
>> its "distance" / dc it belongs to you might start to create a different
>> query plan which talks to remote DC and similar.
>>
>>
>> https://docs.datastax.com/en/drivers/java/4.2/com/datastax/oss/driver/api/core/loadbalancing/LoadBalancingPolicy.html
>>
>>
>> On Mon, Dec 1, 2025 at 8:54 PM Qc L <[email protected]> wrote:
>>
>>> Hello team,
>>>
>>> I’d like to propose adding a remote quorum consistency level to Cassandra
>>> for handling cases where simultaneous host unavailability in the local
>>> data center prevents the required quorum from being achieved.
>>>
>>> Background
>>> NetworkTopologyStrategy is the most commonly used strategy at Uber, and
>>> we use LOCAL_QUORUM for reads and writes in many use cases. Our Cassandra
>>> deployment in each data center currently relies on a majority of replicas
>>> being healthy to consistently achieve local quorum.
>>>
>>> Current behavior
>>> When a local data center in a Cassandra deployment experiences outages,
>>> network isolation, or maintenance events, the EACH_QUORUM / LOCAL_QUORUM
>>> consistency levels will fail for both reads and writes if enough replicas
>>> in that local data center are unavailable. In this configuration,
>>> simultaneous host unavailability can temporarily prevent the cluster from
>>> reaching the required quorum for reads and writes. For applications that
>>> require high availability and a seamless user experience, this can lead to
>>> service downtime and a noticeable drop in overall availability. See
>>> attached Figure 1.
>>>
>>> Proposed Solution
>>> To prevent this issue and ensure a seamless user experience, we can use
>>> the Remote Quorum consistency level as a fallback mechanism in scenarios
>>> where local replicas are unavailable. Remote Quorum in Cassandra refers to
>>> a read or write operation that achieves quorum (a majority of replicas) in
>>> the remote data center, rather than relying solely on replicas within the
>>> local data center. See attached Figure 2.
>>>
>>> We will add a feature to override the read/write consistency level on the
>>> server side. When local replicas are not available, the server will
>>> override the write consistency level from EACH_QUORUM to remote quorum.
>>> Note that implementing this change on the client side would require
>>> protocol changes in CQL, so we only add it on the server side, where it
>>> can only be used internally by the server.
>>>
>>> For example, given the following Cassandra setup:
>>>
>>>     CREATE KEYSPACE ks WITH REPLICATION = {
>>>       'class': 'NetworkTopologyStrategy',
>>>       'cluster1': 3,
>>>       'cluster2': 3,
>>>       'cluster3': 3
>>>     };
>>>
>>> The selected approach for this design is to explicitly configure a
>>> backup data center mapping for the local data center, where each data
>>> center defines its preferred failover target. For example:
>>>
>>>     remote_quorum_target_data_center:
>>>       cluster1: cluster2
>>>       cluster2: cluster3
>>>       cluster3: cluster1
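>>>
>>> With that mapping in place, the coordinator-side decision could look
>>> roughly like the sketch below. This is only a self-contained illustration
>>> of the intended decision flow, not actual Cassandra internals; names such
>>> as RemoteQuorumSketch and quorumDatacenter are made up for this example.
>>>
>>>     import java.util.Map;
>>>
>>>     // Hypothetical sketch of the coordinator's fallback decision when
>>>     // the server-side failover feature flag is enabled.
>>>     public final class RemoteQuorumSketch {
>>>
>>>         // Configured backup mapping, e.g. cluster1 -> cluster2.
>>>         private final Map<String, String> backupDcByLocalDc;
>>>         // Toggled at runtime, e.g. via the proposed nodetool command.
>>>         private volatile boolean failoverEnabled;
>>>
>>>         public RemoteQuorumSketch(Map<String, String> backupDcByLocalDc) {
>>>             this.backupDcByLocalDc = backupDcByLocalDc;
>>>         }
>>>
>>>         public void setFailoverEnabled(boolean enabled) {
>>>             this.failoverEnabled = enabled;
>>>         }
>>>
>>>         // Returns the data center whose replicas should form the quorum:
>>>         // the local DC on the normal path, or the configured backup DC
>>>         // when failover is enabled and local quorum cannot be met.
>>>         public String quorumDatacenter(String localDc, int liveLocalReplicas,
>>>                                        int localRf) {
>>>             int requiredQuorum = localRf / 2 + 1;
>>>             if (liveLocalReplicas >= requiredQuorum || !failoverEnabled) {
>>>                 return localDc;
>>>             }
>>>             return backupDcByLocalDc.getOrDefault(localDc, localDc);
>>>         }
>>>     }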
>>>
>>> Implementations
>>> We propose the following features for Cassandra to address data center
>>> failure scenarios:
>>>
>>>    1. Introduce a new consistency level called remote quorum.
>>>    2. A feature to override the read/write consistency level on the
>>>    server side (controlled by a feature flag). A nodetool command is used
>>>    to turn the server-side failover on/off.
>>>
>>> Why remote quorum is useful
>>> As shown in Figure 2, when we have a data center failure in one of the
>>> data centers, LOCAL_QUORUM and EACH_QUORUM will fail because two replicas
>>> are unavailable and the quorum requirement cannot be met in the local
>>> data center.
>>>
>>> During the incident, we can use nodetool to enable failover to remote
>>> quorum. With the failover enabled, reads and writes fail over to the
>>> remote data center, which avoids the availability drop.
>>>
>>> Any thoughts or concerns?
>>> Thanks in advance for your feedback,
>>>
>>> Qiaochu
>>>
>>>
>
