[
https://issues.apache.org/jira/browse/CURATOR-518?focusedWorklogId=849096&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-849096
]
ASF GitHub Bot logged work on CURATOR-518:
------------------------------------------
Author: ASF GitHub Bot
Created on: 04/Mar/23 10:07
Start Date: 04/Mar/23 10:07
Worklog Time Spent: 10m
Work Description: tisonkun commented on code in PR #446:
URL: https://github.com/apache/curator/pull/446#discussion_r1125432578
##########
curator-recipes/src/main/java/org/apache/curator/framework/recipes/leader/LeaderSelector.java:
##########
@@ -379,10 +423,31 @@ public boolean hasLeadership()
*/
public synchronized void interruptLeadership()
{
- Future<?> task = ourTask.get();
- if ( task != null )
+ // Correctness with requeue:
+ // * requeue, interrupt and ourTask are guarded by synchronized(this),
so requeue is not
+ // able to execute interleaving.
+ // * hasLeadership is true, so we are not cancelling before task
execution.
+ // * auto-requeue is always executed before task termination and has
no interruption point.
+ // * it is harmless to interrupt nearly completed task(e.g. lost or
release leadership).
+ if ( ourTask != null && hasLeadership )
{
- task.cancel(true);
+ ourTask.cancel(true);
+ }
+ }
+
+ /**
+ * Cancel ongoing election and leadership.
+ *
+ * <p>Subsequent {@link #requeue()} will succeed if {@link #autoRequeue()}
is not enabled.
+ */
+ private synchronized void cancelElection() {
+ if ( ourTask != null ) {
+ ourTask.cancel(true);
+ isQueued = false;
Review Comment:
```suggestion
clearIsQueued();
```
Keep consistent.
##########
curator-recipes/src/main/java/org/apache/curator/framework/recipes/leader/LeaderSelector.java:
##########
@@ -237,13 +242,15 @@ private synchronized boolean internalRequeue()
if ( !isQueued && (state.get() == State.STARTED) )
{
isQueued = true;
- Future<Void> task = executorService.submit(new Callable<Void>()
+ Future<?> oldTask = ourTask;
+ ourTask = executorService.submit(new Callable<Void>()
{
@Override
public Void call() throws Exception
{
try
{
+ waitTaskDone(oldTask, Duration.ofMillis(500),
Duration.ofSeconds(5));
Review Comment:
It seems this is a best-effort waiting call and we don't have happens-before
relation to ensure that the previous task must be done or cancelled.
Filter only `hasLeadership` will proceed `interruptLeadership` seems enough
for fixing 518?
Issue Time Tracking
-------------------
Worklog Id: (was: 849096)
Time Spent: 0.5h (was: 20m)
> Curator. LeaderSelector. Two successive calls to interruptLeadership() will
> break autoRequeue.
> ----------------------------------------------------------------------------------------------
>
> Key: CURATOR-518
> URL: https://issues.apache.org/jira/browse/CURATOR-518
> Project: Apache Curator
> Issue Type: Improvement
> Components: Recipes
> Affects Versions: 4.0.1, 4.2.0
> Environment: Windows 8, JRE 1.8.0_181
> Reporter: Bulatov Oleg
> Priority: Major
> Labels: newbie
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> h1. Curator. LeaderSelector. Two successive calls to interruptLeadership()
> will break autoRequeue
> If we set autoRequeue to TRUE. But during execution interruptLeadership()
> will be called from another thread before internalRequeue() completed its
> work. Then it will break recursive call to internalRequeue(), so that client
> will not ask for leadership and get stuck.
> We can solve this problem if we check hasLeadership() before calling
> interruptLeadership(). But it is strange that such check curator library does
> not do internally.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)