[GitHub] [flink] TisonKun opened a new pull request #9878: [FLINK-14149][coordination] Implement ZooKeeperLeaderElectionServiceNG

GitBox Thu, 10 Oct 2019 02:12:19 -0700

TisonKun opened a new pull request #9878: [FLINK-14149][coordination] Implement
ZooKeeperLeaderElectionServiceNG
URL: https://github.com/apache/flink/pull/9878

## What is the purpose of the change

Following the discussion in [FLINK-10333](https://issues.apache.org/jira/browse/FLINK-10333),
we reached a consensus to refactor ZK-based storage around a transaction store
mechanism. The overall design is described in the design document linked
[here](https://docs.google.com/document/d/1cBY1t0k5g1xNqzyfZby3LcPu4t-wpx57G1xf-nmWrCo/edit).

This subtask introduces a prerequisite for adopting the transaction store,
namely a new leader election service for the ZK scenario. It is necessary
because we have to retrieve the corresponding latch path per contender,
following the algorithm described in
[FLINK-10333](https://issues.apache.org/jira/browse/FLINK-10333).

Here are the details of the implementation.

We adopt the optimized version of this
[recipe](https://zookeeper.apache.org/doc/current/recipes.html#sc_leaderElection).
The two most important differences from the former implementation are
described in the sections below.
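For readers unfamiliar with the recipe, here is a minimal sketch of its core rule, assuming latch znodes are created as ephemeral sequential nodes: the contender with the lowest sequence number is the leader, and every other contender watches only its immediate predecessor instead of the whole group. `LatchOrdering`, `sequenceOf`, `nodeToWatch`, and the `latch-` name prefix are hypothetical and only serve the illustration; they are not taken from this PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper modelling the "watch your predecessor" rule; not part of this PR. */
final class LatchOrdering {

    /** Parses the zero-padded 10-digit suffix ZooKeeper appends to sequential znodes. */
    static int sequenceOf(String znodeName) {
        return Integer.parseInt(znodeName.substring(znodeName.length() - 10));
    }

    /**
     * Returns empty if {@code self} has the lowest sequence number (it is the leader).
     * Otherwise returns the znode with the next lower sequence number, which is the
     * only node this contender needs to watch.
     */
    static Optional<String> nodeToWatch(List<String> children, String self) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingInt(LatchOrdering::sequenceOf));
        int index = sorted.indexOf(self);
        if (index < 0) {
            // Our own ephemeral znode is gone, e.g. after session expiration.
            throw new IllegalStateException("latch znode not found: " + self);
        }
        return index == 0 ? Optional.empty() : Optional.of(sorted.get(index - 1));
    }

    public static void main(String[] args) {
        List<String> children =
                Arrays.asList("latch-0000000007", "latch-0000000003", "latch-0000000005");
        for (String child : children) {
            System.out.println(child + " watches " + nodeToWatch(children, child));
        }
    }
}
```

Watching only the predecessor means that one znode deletion wakes up one contender rather than all of them.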

### Leader election is a one-shot service.

Specifically, we create only one latch for a given contender. We tolerate
SUSPENDED (a.k.a. CONNECTIONLOSS), so the only situation in which we lose
leadership is session expiration, which implies that the ephemeral latch znode
has been deleted. We don't re-participate as a contender, so after
revokeLeadership a contender will never be granted leadership again. This is
not a problem, but we can refactor the contender side further for better
behavior.
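The one-shot contract can be modelled as a small state machine. This is a sketch under assumptions, not the code in this PR: `OneShotLeadership` and its method names are hypothetical, the event names mirror Curator's `ConnectionState` values, and `LOST` stands in for session expiration.

```java
/** Hypothetical model of the one-shot leadership contract; not the actual Flink class. */
final class OneShotLeadership {

    /** Connection events; LOST is assumed here to mean the session has expired. */
    enum ConnectionEvent { SUSPENDED, RECONNECTED, LOST }

    enum Status { WAITING, LEADER, REVOKED }

    private Status status = Status.WAITING;

    Status status() {
        return status;
    }

    /** Called when our latch znode becomes the one with the lowest sequence number. */
    void onLatchAcquired() {
        // REVOKED is terminal: the contender never re-participates.
        if (status == Status.WAITING) {
            status = Status.LEADER;
        }
    }

    void onConnectionEvent(ConnectionEvent event) {
        if (event == ConnectionEvent.LOST) {
            // Session expired, so the ephemeral latch znode has been deleted.
            status = Status.REVOKED;
        }
        // SUSPENDED and RECONNECTED are tolerated: the latch znode still exists.
    }

    public static void main(String[] args) {
        OneShotLeadership leadership = new OneShotLeadership();
        leadership.onLatchAcquired();
        leadership.onConnectionEvent(ConnectionEvent.SUSPENDED);
        System.out.println("after SUSPENDED: " + leadership.status());
        leadership.onConnectionEvent(ConnectionEvent.LOST);
        leadership.onLatchAcquired();
        System.out.println("after LOST and a late grant: " + leadership.status());
    }
}
```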

### Leader info znode is PERSISTENT.

This is because we now regard create/setData on the leader info znode as a
leader-only operation and thus perform it in a transaction. If we kept using an
ephemeral znode, it would be hard to test: because we share the ZK client, the
ephemeral znode is not deleted, so we would have to deal with complex znode
stat that a transaction cannot simply handle. Since the znode is PERSISTENT, we
introduce a concealLeaderInfo method, called back on contender stop, to clean
it up.
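The leader-only write can be pictured as "check that my latch znode still exists, then write the leader info" executed as one atomic unit. Below is a minimal in-memory sketch of that idea. `LeaderInfoStore`, its methods, and the paths are hypothetical stand-ins for a znode tree, and guarding the clean-up with the same latch check is an assumption rather than something stated in this description.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a znode tree; illustrates check-then-write only. */
final class LeaderInfoStore {

    private final Map<String, String> znodes = new HashMap<>();

    synchronized void createLatch(String latchPath) {
        znodes.put(latchPath, "");
    }

    /** Simulates session expiration deleting the ephemeral latch znode. */
    synchronized void expireLatch(String latchPath) {
        znodes.remove(latchPath);
    }

    /**
     * Atomically checks that the latch exists and then creates or updates the leader
     * info. Returns false and changes nothing if the check fails.
     */
    synchronized boolean publishLeaderInfo(String latchPath, String leaderInfoPath, String address) {
        if (!znodes.containsKey(latchPath)) {
            return false;
        }
        znodes.put(leaderInfoPath, address);
        return true;
    }

    /** Clean-up on contender stop; the latch check here is an assumption of this sketch. */
    synchronized boolean concealLeaderInfo(String latchPath, String leaderInfoPath) {
        if (!znodes.containsKey(latchPath)) {
            return false;
        }
        znodes.remove(leaderInfoPath);
        return true;
    }

    synchronized String read(String path) {
        return znodes.get(path);
    }

    public static void main(String[] args) {
        LeaderInfoStore store = new LeaderInfoStore();
        store.createLatch("/latch/a");
        System.out.println(store.publishLeaderInfo("/latch/a", "/leader", "address-of-a"));
        store.expireLatch("/latch/a");
        System.out.println(store.publishLeaderInfo("/latch/a", "/leader", "stale-write"));
        System.out.println(store.read("/leader"));
    }
}
```

Because the leader info entry is not tied to a session in this model, it outlives the latch, which is why an explicit clean-up step is needed.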

**A known downside is that the retriever cannot be notified when the former
leader is lost and the new leader starts to serve. However, the state will
eventually converge, and connecting to the outdated former leader will fail.**

**What we gain**

1. The basis for the overall goal under FLINK-10333.
2. The leader info node is modified only by the current leader. Thus we can
remove a lot of concurrency handling logic from the current ZLES, including the
use of NodeCache and the handling of the complex stat of an ephemeral leader
info node.

## Verifying this change

We made sure that all HA tests pass as before. During the implementation we
added a dedicated test, `ZooKeeperLeaderElectionServiceNGTest`, which was
eventually merged into `ZooKeeperLeaderElectionServiceTest`.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with
`@Public(Evolving)`: (no)
- The serializers: (no)
- The runtime per-record code paths (performance sensitive): (no)
- Anything that affects deployment or recovery: JobManager (and its
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes, we introduce a new
implementation of ZK leader service and replace the old one)
- The S3 file system connector: (no)

## Documentation

- Does this pull request introduce a new feature? (no)
- If yes, how is the feature documented? (not applicable)

CC @tillrohrmann

