[DISCUSS] FLIP-285: Refactoring the leader election code base in Flink

2023-01-05 Thread Matthias Pohl
Hi everyone, I previously brought up FLINK-26522 [1] in the mailing list discussion about consolidating the HighAvailabilityServices interfaces [2]. There, it was concluded that the community still wants the ability to have per-component leader election and, therefore, keep the HighAvailabilitySer

Re: [DISCUSS] FLIP-285: Refactoring the leader election code base in Flink

2023-01-12 Thread Yang Wang
Thanks Matthias for preparing this thorough FLIP, which has led us to review the multiple component leader election. I am totally in favor of doing the code clean-up. The current implementation is not very readable due to legacy compatibility. I just have a few comments. # HA

Re: [DISCUSS] FLIP-285: Refactoring the leader election code base in Flink

2023-01-12 Thread Matthias Pohl
Thanks Yang Wang for sharing your view on this. Please find my responses below. # HA data format in the HA backend (e.g. ZK, K8s ConfigMap) > We have already changed the HA data format after introducing the multiple > component leader election in FLINK-24038. For K8s HA, > the number of ConfigMaps red

Re: [DISCUSS] FLIP-285: Refactoring the leader election code base in Flink

2023-01-15 Thread Yang Wang
Thanks Matthias for updating the FLIP. Given that we do not touch the dedicated ConfigMap for the checkpoint, this FLIP will not affect the LAST_STATE recovery for the Flink Kubernetes Operator. # LeaderContender#initiateLeaderElection Could we simply execute *leaderElectionService.register(conte

Re: [DISCUSS] FLIP-285: Refactoring the leader election code base in Flink

2023-01-17 Thread Matthias Pohl
Thanks Yang for getting back to me. I checked the connection information that's stored in the HA backend once more. My previous proposal is based on a wrong assumption: The address we store is the RPC endpoint's address. That address should be unique per component which means that we shouldn't cha

Re: [DISCUSS] FLIP-285: Refactoring the leader election code base in Flink

2023-01-17 Thread Yang Wang
Thanks Matthias for the detailed explanation. For the HA backend data structure, you are right. Even though the different components run in the same JVM, they have completely different connection information. But it is not urgent to use a single ZNode to store multiple connection entries for now. I lean

Re: [DISCUSS] FLIP-285: Refactoring the leader election code base in Flink

2023-01-18 Thread Chesnay Schepler
There are a lot of good things in this, and until the Extension bit I'm fully on board. With the extension, how does the leader contender get access to the LeaderElection? I would've assumed that LEService returns a LeaderElection when register is called, but according to the diagram this met
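
The shape Chesnay says he would have assumed, a service whose register call hands a LeaderElection back to the contender, could look roughly like the sketch below. This is a toy, single-process illustration and not Flink's actual API: InMemoryLeaderElectionService, RecordingContender, and every method signature here are hypothetical names chosen for the example, and leadership simply goes to the earliest registered contender.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// All names below are hypothetical; this is not Flink's real interface set.
interface LeaderContender {
    void grantLeadership(UUID leaderSessionId);
    void revokeLeadership();
}

// Handle returned to the contender; the contender never talks to the service directly again.
interface LeaderElection extends AutoCloseable {
    boolean hasLeadership(UUID leaderSessionId);
    @Override
    void close();
}

// Toy in-memory service: the earliest registered contender is the leader.
final class InMemoryLeaderElectionService {
    private final List<LeaderContender> contenders = new ArrayList<>();
    private LeaderContender leader;
    private UUID sessionId;

    // Assumption from the discussion: register returns the LeaderElection handle.
    LeaderElection register(LeaderContender contender) {
        contenders.add(contender);
        electIfVacant();
        return new LeaderElection() {
            @Override
            public boolean hasLeadership(UUID id) {
                return leader == contender && sessionId != null && sessionId.equals(id);
            }

            @Override
            public void close() {
                deregister(contender);
            }
        };
    }

    private void deregister(LeaderContender contender) {
        contenders.remove(contender);
        if (leader == contender) {
            leader = null;
            sessionId = null;
            contender.revokeLeadership();
            electIfVacant();
        }
    }

    private void electIfVacant() {
        if (leader == null && !contenders.isEmpty()) {
            leader = contenders.get(0);
            sessionId = UUID.randomUUID();
            leader.grantLeadership(sessionId);
        }
    }
}

// Contender that only records its current session ID (null while not leader).
final class RecordingContender implements LeaderContender {
    UUID currentSession;

    @Override
    public void grantLeadership(UUID leaderSessionId) {
        currentSession = leaderSessionId;
    }

    @Override
    public void revokeLeadership() {
        currentSession = null;
    }
}
```

In this sketch, closing the returned handle is the only way a contender leaves the election, so the service keeps all bookkeeping about who is registered to itself.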

Re: [DISCUSS] FLIP-285: Refactoring the leader election code base in Flink

2023-01-18 Thread Matthias Pohl
Thanks for participating in the discussion, Yang & Chesnay. The LeaderElection interface extension gave me a headache as well. I added it initially because I thought it would be of more value. But essentially, it doesn't help and only makes the code harder to understand (as your questions rightfully point ou

Re: [DISCUSS] FLIP-285: Refactoring the leader election code base in Flink

2023-01-18 Thread Matthias Pohl
After another round of discussion, I came up with a (hopefully) final proposal. The previously discussed approach was still not optimal because the contender ID lived in the LeaderContender even though it is actually LeaderElectionService-internal knowledge. Fixing that helped fix the overall archi

Re: [DISCUSS] FLIP-285: Refactoring the leader election code base in Flink

2023-01-23 Thread Chesnay Schepler
Thanks for updating the design. From my side this looks good. On 18/01/2023 17:59, Matthias Pohl wrote: After another round of discussion, I came up with a (hopefully) final proposal. The previously discussed approach was still not optimal because the contender ID lived in the LeaderContender ev

Re: [DISCUSS] FLIP-285: Refactoring the leader election code base in Flink

2023-01-24 Thread Yang Wang
Having the *start()* in the *LeaderContender* interface and bringing back the *LeaderElection* with some new methods makes sense to me. I have no more concerns now. >- *LeaderContender*: The LeaderContender is integrated as usual except >that it accesses the LeaderElection instead of the Lead

Re: [DISCUSS] FLIP-285: Refactoring the leader election code base in Flink

2023-01-25 Thread Matthias Pohl
Thanks Yang and Chesnay for your feedback. I'm going to go ahead and start a voting thread, as this discussion thread has already been open for some time. Best, Matthias On Wed, Jan 25, 2023 at 4:08 AM Yang Wang wrote: > Having the *start()* in *LeaderContender* interface and bringing back the > *LeaderE