[ 
https://issues.apache.org/jira/browse/CURATOR-518?focusedWorklogId=849096&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-849096
 ]

ASF GitHub Bot logged work on CURATOR-518:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Mar/23 10:07
            Start Date: 04/Mar/23 10:07
    Worklog Time Spent: 10m 
      Work Description: tisonkun commented on code in PR #446:
URL: https://github.com/apache/curator/pull/446#discussion_r1125432578


##########
curator-recipes/src/main/java/org/apache/curator/framework/recipes/leader/LeaderSelector.java:
##########
@@ -379,10 +423,31 @@ public boolean hasLeadership()
      */
     public synchronized void interruptLeadership()
     {
-        Future<?> task = ourTask.get();
-        if ( task != null )
+        // Correctness with requeue:
+        // * requeue, interrupt and ourTask are guarded by synchronized(this), 
so requeue is not
+        //   able to execute interleaving.
+        // * hasLeadership is true, so we are not cancelling before task 
execution.
+        // * auto-requeue is always executed before task termination and has 
no interruption point.
+        // * it is harmless to interrupt nearly completed task(e.g. lost or 
release leadership).
+        if ( ourTask != null && hasLeadership )
         {
-            task.cancel(true);
+            ourTask.cancel(true);
+        }
+    }
+
+    /**
+     * Cancel ongoing election and leadership.
+     *
+     * <p>Subsequent {@link #requeue()} will succeed if {@link #autoRequeue()} 
is not enabled.
+     */
+    private synchronized void cancelElection() {
+        if ( ourTask != null ) {
+            ourTask.cancel(true);
+            isQueued = false;

Review Comment:
   ```suggestion
               clearIsQueued();
   ```
   
   Keep consistent.



##########
curator-recipes/src/main/java/org/apache/curator/framework/recipes/leader/LeaderSelector.java:
##########
@@ -237,13 +242,15 @@ private synchronized boolean internalRequeue()
         if ( !isQueued && (state.get() == State.STARTED) )
         {
             isQueued = true;
-            Future<Void> task = executorService.submit(new Callable<Void>()
+            Future<?> oldTask = ourTask;
+            ourTask = executorService.submit(new Callable<Void>()
             {
                 @Override
                 public Void call() throws Exception
                 {
                     try
                     {
+                        waitTaskDone(oldTask, Duration.ofMillis(500), 
Duration.ofSeconds(5));

Review Comment:
   It seems this is a best-effort waiting call and we don't have happens-before 
relation to ensure that the previous task must be done or cancelled.
   
   Filter only `hasLeadership` will proceed `interruptLeadership` seems enough 
for fixing 518?





Issue Time Tracking
-------------------

    Worklog Id:     (was: 849096)
    Time Spent: 0.5h  (was: 20m)

> Curator. LeaderSelector. Two successive calls to interruptLeadership() will 
> break autoRequeue.
> ----------------------------------------------------------------------------------------------
>
>                 Key: CURATOR-518
>                 URL: https://issues.apache.org/jira/browse/CURATOR-518
>             Project: Apache Curator
>          Issue Type: Improvement
>          Components: Recipes
>    Affects Versions: 4.0.1, 4.2.0
>         Environment: Windows 8, JRE 1.8.0_181
>            Reporter: Bulatov Oleg
>            Priority: Major
>              Labels: newbie
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> h1. Curator. LeaderSelector. Two successive calls to interruptLeadership() 
> will break autoRequeue
> If we set autoRequeue to TRUE. But during execution interruptLeadership() 
> will be called from another thread before internalRequeue() completed its 
> work. Then it will break recursive call to internalRequeue(), so that client 
> will not ask for leadership and get stuck.
> We can solve this problem if we check hasLeadership() before calling 
> interruptLeadership(). But it is strange that such check curator library does 
> not do internally.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to