Till Rohrmann created FLINK-22636:
-------------------------------------
Summary: Group job specific ZooKeeper HA services under common
jobs/<JobID> zNode
Key: FLINK-22636
URL: https://issues.apache.org/jira/browse/FLINK-22636
Project: Flink
Issue Type: Improvement
Components: Runtime / Coordination
Affects Versions: 1.12.3, 1.13.0, 1.14.0
Reporter: Till Rohrmann
Fix For: 1.14.0
In order to better clean up ZooKeeper HA services, I suggest grouping
job-specific services under a common {{jobs/<JobID>}} zNode. That way, it
becomes trivial to clean up the job-specific ZooKeeper data (simply delete
the {{jobs/<JobID>}} node).
Currently, our ZooKeeper layout is not well structured: job-specific nodes
are spread across several top-level nodes. The current layout looks like this:
{code}
clusterID -> jobgraphs          -> <job-id>
          -> checkpoints        -> <job-id> -> checkpoint-1
          -> checkpoint-counter -> <job-id> -> counter
          -> leaderlatch        -> dispatcher_lock
                                -> resource_manager_lock
                                -> <job-id>
          -> leader             -> dispatcher_lock
                                -> resource_manager_lock
                                -> <job-id>
{code}
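To illustrate why cleanup is awkward with this layout, here is a minimal sketch that lists the nodes holding data for a single job. It only builds path strings from the tree above and does not talk to ZooKeeper; the helper name {{current_job_paths}} and the leading-slash path format are assumptions made for illustration, not Flink code.

```python
from typing import List

# Parent nodes that contain a <job-id> child in the current layout,
# taken from the tree above.
CURRENT_JOB_PARENTS = [
    "jobgraphs",
    "checkpoints",
    "checkpoint-counter",
    "leaderlatch",
    "leader",
]


def current_job_paths(cluster_id: str, job_id: str) -> List[str]:
    """Return every per-job node a cleanup would have to delete (hypothetical helper)."""
    return [f"/{cluster_id}/{parent}/{job_id}" for parent in CURRENT_JOB_PARENTS]


if __name__ == "__main__":
    for path in current_job_paths("clusterID", "job-42"):
        print(path)
```

Under these assumptions, removing one job means deleting five separate nodes, each under a different parent.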
The new layout could look like this:
{code}
clusterID -> jobgraphs -> <job-id>
          -> jobs      -> <job-id> -> checkpoints           -> checkpoint-1
                                   -> checkpoint_id_counter -> counter
                                   -> leader                -> latch
                                                            -> connection_info
          -> leader    -> dispatcher       -> latch
                                           -> connection_info
                       -> resource_manager -> latch
                                           -> connection_info
{code}
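As a rough sketch of the cleanup this layout enables, the snippet below models the proposed tree as an in-memory set of paths and removes one job with a single subtree delete. The function names, the leading-slash path format, and the set-based model are assumptions for illustration only; against a real ensemble this would be a recursive delete through a ZooKeeper client.

```python
from typing import Set


def new_layout_paths(cluster_id: str, job_id: str) -> Set[str]:
    """Leaf nodes of the proposed layout for one job (hypothetical helper)."""
    root = f"/{cluster_id}"
    job = f"{root}/jobs/{job_id}"
    return {
        f"{root}/jobgraphs/{job_id}",
        f"{job}/checkpoints/checkpoint-1",
        f"{job}/checkpoint_id_counter/counter",
        f"{job}/leader/latch",
        f"{job}/leader/connection_info",
        f"{root}/leader/dispatcher/latch",
        f"{root}/leader/dispatcher/connection_info",
        f"{root}/leader/resource_manager/latch",
        f"{root}/leader/resource_manager/connection_info",
    }


def delete_subtree(nodes: Set[str], path: str) -> Set[str]:
    """Return the nodes left after removing `path` and everything below it."""
    prefix = path + "/"
    return {n for n in nodes if n != path and not n.startswith(prefix)}


if __name__ == "__main__":
    nodes = new_layout_paths("clusterID", "job-42")
    remaining = delete_subtree(nodes, "/clusterID/jobs/job-42")
    for path in sorted(remaining):
        print(path)
```

Note that in the tree as proposed, {{jobgraphs/<job-id>}} still sits outside {{jobs/<job-id>}}, so in this model it survives the subtree delete and would need to be removed separately.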
--
This message was sent by Atlassian Jira
(v8.3.4#803005)