Hi Patrick,
Thanks for reading the CEP and for the thoughtful questions! Replies below.
Driver backward compatibility / mixed rollouts
--------
This is fully opt-in per connection. Older drivers won’t REGISTER for
GRACEFUL_DISCONNECT, so servers won’t send it to them, and those
connections behave exactly as they do today.
REGISTER vs STARTUP for opt-in
-------
There are two plausible ways for a driver to opt in to GRACEFUL_DISCONNECT:
Option A: REGISTER (as proposed today)
| Driver behavior | Server behavior |
|---|---|
| Send `OPTIONS` | Return `SUPPORTED` (`Map<String, List<String>>`) containing `"GRACEFUL_DISCONNECT": ["true"]`. |
| Send `STARTUP` as normal | Optionally handle authentication as normal. Send `READY` as normal. |
| Send `REGISTER` including event type `GRACEFUL_DISCONNECT` | Acknowledge normally (e.g., `READY`). |
This is consistent with the protocol: REGISTER is the standard mechanism to
subscribe to events.
However, this does add an extra round trip per query connection that wants
the event. Today most drivers only REGISTER on the control connection for
cluster-wide events (STATUS_CHANGE / TOPOLOGY_CHANGE / SCHEMA_CHANGE), and
query connections typically do not REGISTER anything. If we want every
query connection to receive GRACEFUL_DISCONNECT (because the signal is
connection-local), then every query connection would need to send REGISTER,
which means one additional message exchange during connection establishment.
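To make the round-trip cost concrete, here is a toy sketch (purely illustrative: the frame names come from the native protocol spec, but the connection object and round-trip counting are hypothetical, not a real driver API):

```python
# Hypothetical simulation of the per-connection handshake under Option A.
# Only the frame names (OPTIONS/SUPPORTED/STARTUP/READY/REGISTER) come from
# the native protocol; everything else is illustrative.

class HandshakeSim:
    def __init__(self, wants_graceful_disconnect: bool):
        self.wants_graceful_disconnect = wants_graceful_disconnect
        self.round_trips = 0
        self.log = []

    def exchange(self, request: str, response: str) -> None:
        """Record one request/response round trip."""
        self.round_trips += 1
        self.log.append((request, response))

    def connect(self) -> int:
        # OPTIONS -> SUPPORTED advertises the capability.
        self.exchange("OPTIONS", 'SUPPORTED {"GRACEFUL_DISCONNECT": ["true"]}')
        # STARTUP -> READY (authentication omitted for brevity).
        self.exchange("STARTUP", "READY")
        # Option A: one extra round trip on every opted-in query connection.
        if self.wants_graceful_disconnect:
            self.exchange("REGISTER [GRACEFUL_DISCONNECT]", "READY")
        return self.round_trips

print(HandshakeSim(False).connect())  # 2 round trips, as today
print(HandshakeSim(True).connect())   # 3 round trips under Option A
```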
Option B: STARTUP opt-in (alternative)
| Driver behavior | Server behavior |
|---|---|
| Send `OPTIONS` | Return `SUPPORTED` (`Map<String, List<String>>`) containing `"GRACEFUL_DISCONNECT": ["true"]`. |
| Send `STARTUP` with an additional entry in the options map, e.g. `{ "CQL_VERSION": "3.0.0", "GRACEFUL_DISCONNECT": "true", ... }` | Optionally handle authentication as normal. Send `READY` as normal. |
This avoids the extra round trip, because the opt-in piggybacks on an
existing step in the handshake. But it introduces new semantics: STARTUP
options would be used to request an event stream subscription, which is
non-standard given that REGISTER already exists for that purpose.
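For clarity, a minimal sketch of what Option B would look like on the wire (the `"GRACEFUL_DISCONNECT"` key matches the proposal; the helper functions are hypothetical, just to show where the opt-in would live):

```python
# Hypothetical sketch of Option B: the opt-in rides in STARTUP's string map
# instead of a separate REGISTER. Parsing logic is illustrative only.

def build_startup_options(graceful_disconnect: bool) -> dict:
    """Driver side: assemble the STARTUP options map."""
    options = {"CQL_VERSION": "3.0.0"}
    if graceful_disconnect:
        options["GRACEFUL_DISCONNECT"] = "true"
    return options

def server_opted_in(startup_options: dict) -> bool:
    """Server side: decide whether to emit GRACEFUL_DISCONNECT later."""
    return startup_options.get("GRACEFUL_DISCONNECT", "false").lower() == "true"

print(server_opted_in(build_startup_options(True)))   # True
print(server_opted_in(build_startup_options(False)))  # False
```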
Given the above, we prefer REGISTER for consistency with existing protocol semantics, even though it costs one additional round trip on each query connection that opts in.
Signal multiplication
--------
The protocol guidance about “don’t REGISTER on all connections” is
primarily aimed at the existing out-of-band events (STATUS_CHANGE /
TOPOLOGY_CHANGE / SCHEMA_CHANGE). Those events are gossip-driven and
broadcast by multiple nodes, so registering on many connections can easily
produce redundant notifications.
Concrete example (duplication with STATUS_CHANGE):
* In a 3-node cluster (node1, node2, node3), node1 is going down.
* Node2 and node3 learn about node1’s state change via gossip.
* Both node2 and node3 will send a STATUS_CHANGE event (“node1 is DOWN”) to
every client connection that registered for STATUS_CHANGE.
* If a driver registers for STATUS_CHANGE on connections to both node2 and
node3, it will receive two notifications for the same cluster event. That’s
the “signal multiplication” the spec warns about.
But the protocol does not stop us from adding an in-band event like
GRACEFUL_DISCONNECT. In the above example of node1 going down:
* GRACEFUL_DISCONNECT is in-band and connection-local, not
gossip/broadcast.
* Only the node that is actually shutting down (node1) emits
GRACEFUL_DISCONNECT, and it emits it only on its own native connections
that opted in.
* Node2 and node3 do not emit GRACEFUL_DISCONNECT for node1’s shutdown,
because they are not the node being drained.
So even if a driver has connections to node2 and node3 that are registered
for other events, it will not receive any GRACEFUL_DISCONNECT from them for
node1 going down.
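The contrast between the two delivery models can be sketched in a toy simulation (hypothetical, heavily simplified; node names and event strings are illustrative):

```python
# Toy model contrasting gossip-broadcast STATUS_CHANGE with the proposed
# connection-local GRACEFUL_DISCONNECT, for node1 shutting down in a
# 3-node cluster. Entirely illustrative, not real server behavior.

def simulate_node1_shutdown(registered_status_change, registered_graceful):
    """Return the events a driver receives when node1 drains.

    registered_status_change: nodes whose connections REGISTERed STATUS_CHANGE
    registered_graceful: nodes whose connections REGISTERed GRACEFUL_DISCONNECT
    """
    events = []
    # STATUS_CHANGE is gossip-driven: every other live node that learns of
    # node1's state pushes the event to its registered connections.
    for node in ("node2", "node3"):
        if node in registered_status_change:
            events.append((node, "STATUS_CHANGE node1 DOWN"))
    # GRACEFUL_DISCONNECT is in-band: only node1 itself emits it, and only
    # on its own opted-in native connections.
    if "node1" in registered_graceful:
        events.append(("node1", "GRACEFUL_DISCONNECT"))
    return events

# Registering both event types on every connection: STATUS_CHANGE arrives
# twice (the duplication the spec warns about), GRACEFUL_DISCONNECT once.
all_nodes = {"node1", "node2", "node3"}
print(simulate_node1_shutdown(all_nodes, all_nodes))
```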
I understand such an in-band event is new. We can add a clarification to the protocol spec explaining that the "don't REGISTER on all connections" recommendation does not apply to in-band, connection-local events like GRACEFUL_DISCONNECT.
Event timing for operators
---------
Servers should emit GRACEFUL_DISCONNECT whenever they need to close a
connection gracefully, regardless of the trigger.
I'll update the CEP to clarify that GRACEFUL_DISCONNECT is emitted whenever
the server intends to close a connection gracefully, including nodetool
drain, nodetool disablebinary + shutdown, rolling restarts, and a controlled
JVM shutdown hook path.
Operator control + observability
---------
+1. I agree to add the server-side configs:
* graceful_disconnect_enabled
* graceful_disconnect_grace_period_ms
* graceful_disconnect_max_drain_ms
And metrics/counters such as:
* connections_draining
* forced_disconnects
I’ll update the CEP accordingly.
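As a strawman, the cassandra.yaml additions might look like this (option names are the ones from this thread; the default values are placeholders pending the CEP update):

```yaml
# Hypothetical cassandra.yaml fragment; names from this thread,
# defaults are placeholders, not decided yet.
graceful_disconnect_enabled: false        # opt-in by default
graceful_disconnect_grace_period_ms: 5000 # time for drivers to quiesce
graceful_disconnect_max_drain_ms: 30000   # hard cap before forced close
```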
Thanks again—this feedback is super helpful for tightening the proposal.
Regards,
Jane
On Wed, Jan 14, 2026 at 1:33 PM Patrick McFadin <[email protected]> wrote:
> Hi Jane,
>
> Thank you for the thought-out CEP. I certainly see the use of a feature
> like this to add resilience during cluster state changes. I have a few
> questions after reading the CEP.
>
> Driver compatibility: The way I read this, it's based on an ideal scenario
> where client and server are on the same version to support this feature. In
> my experience, client rollouts are never complete and often lag far behind
> the cluster upgrade. What happens when the driver completely
> ignores GRACEFUL_DISCONNECT? It might mean considering something on the
> server side.
>
> Discovery things: Speaking of the client, you want to use the SUPPORTED as
> listed in the v4 spec[1], but why not add this to STARTUP? You mention
> something in the "Rejected alternatives," but could you expand your
> thinking here?
>
> Signal multiplication: You have this in the CEP "Other protocols (HTTP/2,
> PostgreSQL, Redis Cluster) use connection-local in-band signals to enable
> safe draining." Our protocol guidance[1] explicitly notes that drivers
> often keep multiple connections and should not register for events on all
> of them, as this duplicates traffic. I don't know how you could ensure that
> every connection would be aware of a GRACEFUL_DISCONNECT without changing
> that aspect of the spec.
>
>
> Event timing for operators: It's not clear to me when
> the GRACEFUL_DISCONNECT is emitted when you do something like a drain,
> disablebinary or just a JVM shutdown hook. This is crucial for operators to
> understand how this could work and should be in the CEP spec for clarity. I
> think it will matter to a lot of people.
>
> Operator control: I've been on this push for a while and so I have to
> mention it. Opt-in vs default. We need more controls in the config YAML.
> graceful_disconnect_enabled
>
> If there is a server-side component:
> graceful_disconnect_grace_period_ms
> graceful_disconnect_max_drain_ms
>
> And finally, it needs more observability...
> logging/metrics counters: connections_draining, forced_disconnects
>
>
> Thanks for proposing this!
>
> Patrick
>
> 1 -
> https://cassandra.apache.org/doc/latest/cassandra/_attachments/native_protocol_v4.html
>
> On Tue, Jan 13, 2026 at 4:30 PM Jane H <[email protected]> wrote:
>
>> Hi all,
>>
>> I’d like to start a discussion on a CEP proposal: *CEP-59: Graceful
>> Disconnect*, to make intentional node shutdown/drain less disruptive for
>> clients (link:
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=406619103
>> ).
>>
>> Today, intentional node shutdown (e.g., rolling restarts) can still be
>> disruptive from a client perspective. Drivers often ignore DOWN events
>> because they are not reliable, and outstanding requests can end up as
>> client-facing TimeOut exceptions.
>>
>> The proposed solution is to add an in-band GRACEFUL_DISCONNECT event
>> that both control and query connections can opt into via REGISTER. When
>> a node is shutting down, it will emit the event to all subscribed
>> connections. Drivers will stop sending new queries on that connection/host,
>> allow in-flight requests to finish, then reconnect with exponential backoff.
>>
>> If you have thoughts on the proposed protocol, server shutdown behavior,
>> driver expectations, edge cases, or general feedback, I’d really appreciate
>> it.
>>
>> Regards,
>> Jane
>>
>