XComp opened a new pull request, #24561:
URL: https://github.com/apache/flink/pull/24561

   ## What is the purpose of the change
   
   There is a risk of the leader election running into deadlocks due to nested 
locking. One of these cases was reported in FLINK-34672.
   
   The cause of this issue is that the contender's lock and the leader 
elector's lock are acquired without a defined order. This can be fixed by no 
longer exposing leadership information to contenders. Instead, contenders 
create callbacks that run in the leader election executor. This way, leadership 
is always checked before the run state of the contender is checked.
   
   ## Brief change log
   
   * Removes `DefaultLeaderElectionService#getLeaderSessionID`: This becomes 
internal knowledge and shouldn't be exposed.
   * Introduces `LeaderElection#runAsyncIfLeader`: This method can be used by 
contenders to execute logic that is leadership-related. 
   * `DefaultLeaderElectionService` runs the `runAsyncIfLeader` calls on the 
internal leader operation executor. `#confirmLeadership` utilizes the new API 
as well.
   * Removes `LeaderElection#hasLeadership`
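
The callback approach described above can be sketched roughly as follows. This is a simplified illustration and not the code from this PR: `LeaderElectorSketch`, `ContenderSketch`, `grantLeadership`, `revokeLeadership`, and the `runAsyncIfLeader(UUID, Runnable)` signature shown here are hypothetical stand-ins, and the real `LeaderElection#runAsyncIfLeader` may look different. The point is only the ordering: the leadership check happens first, on the elector's single executor thread, and the contender's lock is taken afterwards inside the callback.

```java
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for the leader elector; not Flink's actual class. */
class LeaderElectorSketch implements AutoCloseable {
    // Leadership state is only read and written on this single thread,
    // so the elector itself needs no lock in this sketch.
    private final ExecutorService leaderOperationExecutor =
            Executors.newSingleThreadExecutor();

    // null while this instance does not hold leadership
    private UUID currentSessionId;

    CompletableFuture<Void> grantLeadership(UUID sessionId) {
        return CompletableFuture.runAsync(
                () -> currentSessionId = sessionId, leaderOperationExecutor);
    }

    CompletableFuture<Void> revokeLeadership() {
        return CompletableFuture.runAsync(
                () -> currentSessionId = null, leaderOperationExecutor);
    }

    /**
     * Runs the callback on the executor thread only if the expected session ID
     * still matches. The future completes with true if the callback ran.
     */
    CompletableFuture<Boolean> runAsyncIfLeader(UUID expectedSessionId, Runnable callback) {
        return CompletableFuture.supplyAsync(
                () -> {
                    // Step 1: leadership check (elector state).
                    if (!expectedSessionId.equals(currentSessionId)) {
                        return false;
                    }
                    // Step 2: contender logic, which may take the contender's lock.
                    callback.run();
                    return true;
                },
                leaderOperationExecutor);
    }

    @Override
    public void close() {
        leaderOperationExecutor.shutdownNow();
    }
}

/** Hypothetical contender that guards its run state with its own lock. */
class ContenderSketch {
    private final Object lock = new Object();
    private boolean running = true;
    private int leaderActions = 0;

    /** Intended to be passed as a callback to runAsyncIfLeader. */
    void onLeaderAction() {
        synchronized (lock) {
            if (running) {
                leaderActions++;
            }
        }
    }

    void stop() {
        synchronized (lock) {
            running = false;
        }
    }

    int leaderActions() {
        synchronized (lock) {
            return leaderActions;
        }
    }
}
```

In this sketch a contender would call something like `elector.runAsyncIfLeader(sessionId, contender::onLeaderAction)` instead of querying leadership itself. One caveat of the sketch: the contender should not block on the returned future while holding its own lock, since the callback needs that lock on the executor thread.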
   
   ## Verifying this change
   
   * `DefaultLeaderElectionServiceTest` was extended to verify that the 
callbacks are only executed if the leadership is obtained
   * Other tests needed to be adapted/cleaned up
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
     - The serializers: no
     - The runtime per-record code paths (performance sensitive): no
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: yes
     - The S3 file system connector: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature? no
     - If yes, how is the feature documented? not applicable


