Garvit Rajput created ZOOKEEPER-4945:
----------------------------------------
Summary: Support automatic renewal of ephemeral nodes via client
heartbeats
Key: ZOOKEEPER-4945
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4945
Project: ZooKeeper
Issue Type: New Feature
Components: c client, server
Reporter: Garvit Rajput
### Motivation
Currently, a ZooKeeper client must keep its session alive to retain its
ephemeral nodes. If the session expires because of a network glitch or a long
GC pause, the server deletes these nodes even if the client recovers shortly
afterward.
### Proposed Feature
Introduce a configurable **auto-renewal heartbeat mechanism** that lets a
ZooKeeper client **extend the lifetime of its ephemeral nodes** for a grace
period after a temporary session disconnection, essentially a soft-reconnect
buffer.
This feature would:
- Reduce unintended ephemeral node deletion due to transient network failures.
- Improve stability for clients with flaky connections.
- Help cloud-native workloads where short-lived network interruptions are
common.
### Implementation Ideas
- Introduce a `znode.ephemeral.gracePeriod` config on the server/client.
- Allow clients to reattach to their ephemeral nodes within this window.
- Maintain consistency and fencing semantics using a version hash or ephemeral
token.
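The ideas above could fit together roughly as follows. This is a minimal, language-neutral sketch in Python, not ZooKeeper code: `GraceTracker`, `Session`, `State`, and the `reattach` call are hypothetical names, the grace period is assumed to be in milliseconds, and the fencing token is modeled as a random per-session secret.

```python
import secrets
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    CONNECTED = "connected"
    GRACE = "grace"      # session timeout elapsed, ephemeral nodes retained
    EXPIRED = "expired"  # grace period elapsed, ephemeral nodes deleted


@dataclass
class Session:
    timeout_ms: int
    last_heartbeat_ms: int
    # Hypothetical fencing token; only its holder may reattach.
    token: str = field(default_factory=lambda: secrets.token_hex(16))
    state: State = State.CONNECTED
    ephemerals: set = field(default_factory=set)


class GraceTracker:
    """Toy model of session expiry with an opt-in grace period."""

    def __init__(self, grace_period_ms: int = 0):
        # 0 models today's behaviour: delete as soon as the timeout elapses.
        self.grace_period_ms = grace_period_ms

    def tick(self, session: Session, now_ms: int) -> State:
        """Advance the session state based on time since the last heartbeat."""
        if session.state is State.EXPIRED:
            return session.state
        idle = now_ms - session.last_heartbeat_ms
        if idle >= session.timeout_ms + self.grace_period_ms:
            session.state = State.EXPIRED
            session.ephemerals.clear()
        elif idle >= session.timeout_ms:
            session.state = State.GRACE
        return session.state

    def reattach(self, session: Session, token: str, now_ms: int) -> bool:
        """Reclaim ephemeral nodes if still in the grace window with a valid token."""
        if self.tick(session, now_ms) is not State.GRACE:
            return False
        if not secrets.compare_digest(token, session.token):
            return False
        session.state = State.CONNECTED
        session.last_heartbeat_ms = now_ms
        return True
```

With a 10 s session timeout and a 5 s grace period, a client that returns 12 s after its last heartbeat could reattach and keep its nodes, while one that returns after 15 s or more would find them deleted, as it would today.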
### Benefits
This change would improve ZooKeeper's resilience in distributed environments
without breaking the ephemeral node contract, as the node would still expire if
the client doesn't reconnect within the grace period.
### Impact
- Fully backward-compatible
- Opt-in via configuration
- May require slight changes to session expiration logic
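One way to keep the feature opt-in is to treat a missing setting as a grace period of zero, which leaves existing deployments unchanged. The sketch below is illustrative Python rather than ZooKeeper's actual configuration code: the function name is hypothetical, the value is assumed to be in milliseconds, and only simple `key=value` lines are handled.

```python
def grace_period_ms(config_text: str,
                    key: str = "znode.ephemeral.gracePeriod") -> int:
    """Return the configured grace period in ms, or 0 if the key is absent."""
    result = 0  # default: feature disabled, current expiry behaviour
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            ms = int(value.strip())
            if ms < 0:
                raise ValueError(f"{key} must be non-negative, got {ms}")
            result = ms  # a later occurrence overrides an earlier one
    return result
```

A configuration without the key yields 0, so the server would expire sessions exactly as it does now.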
Let me know if this is a direction you'd consider. Happy to discuss design or
help contribute a patch.