[ https://issues.apache.org/jira/browse/CURATOR-518?focusedWorklogId=849101&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-849101 ]
ASF GitHub Bot logged work on CURATOR-518: ------------------------------------------ Author: ASF GitHub Bot Created on: 04/Mar/23 11:42 Start Date: 04/Mar/23 11:42 Worklog Time Spent: 10m Work Description: kezhuw commented on code in PR #446: URL: https://github.com/apache/curator/pull/446#discussion_r1125447099 ########## curator-recipes/src/main/java/org/apache/curator/framework/recipes/leader/LeaderSelector.java: ########## @@ -237,13 +242,15 @@ private synchronized boolean internalRequeue() if ( !isQueued && (state.get() == State.STARTED) ) { isQueued = true; - Future<Void> task = executorService.submit(new Callable<Void>() + Future<?> oldTask = ourTask; + ourTask = executorService.submit(new Callable<Void>() { @Override public Void call() throws Exception { try { + waitTaskDone(oldTask, Duration.ofMillis(500), Duration.ofSeconds(5)); Review Comment: I thought about the guarantee `LeaderSelector` try to provide, and conclude to that: * It is crucial for `LeaderSelectorListener.takeLeadership` to not swallow interruption silently without return. Otherwise, states(e.g. `hasLeadership`, `mutex`) could be changed by dated task. * Interruption for `LeaderSelectorListener.takeLeadership` is a must regardless of leadership. Otherwise, `LeaderSelectorListener.takeLeadership` could run with dated leadership and no future interruption. * Without `autoRequeue`, `requeue` is hard to use. * An attempt could be cancelled due to session state change. * `requeue` after `LeaderSelector.interruptLeadership` might be a noop. For your questions: 1. Yes. It may be better to enforce only one `LeaderSelectorListener.takeLeadership` call simultaneously. But that could make this pr not concentrated. 2. Yes, but `LeaderSelector.interruptLeadership` is not enough as it could introduce dated leadership. I think I messed up things here, it might be better to make this pr more concentrated on CURATOR-518 and leaves other concerns(concurrency of `LeaderSelector.doWork` and `requeue` usage) in followups. Issue Time Tracking ------------------- Worklog Id: (was: 849101) Time Spent: 40m (was: 0.5h) > Curator. LeaderSelector. Two successive calls to interruptLeadership() will > break autoRequeue. > ---------------------------------------------------------------------------------------------- > > Key: CURATOR-518 > URL: https://issues.apache.org/jira/browse/CURATOR-518 > Project: Apache Curator > Issue Type: Improvement > Components: Recipes > Affects Versions: 4.0.1, 4.2.0 > Environment: Windows 8, JRE 1.8.0_181 > Reporter: Bulatov Oleg > Priority: Major > Labels: newbie > Time Spent: 40m > Remaining Estimate: 0h > > h1. Curator. LeaderSelector. Two successive calls to interruptLeadership() > will break autoRequeue > If we set autoRequeue to TRUE. But during execution interruptLeadership() > will be called from another thread before internalRequeue() completed its > work. Then it will break recursive call to internalRequeue(), so that client > will not ask for leadership and get stuck. > We can solve this problem if we check hasLeadership() before calling > interruptLeadership(). But it is strange that such check curator library does > not do internally. -- This message was sent by Atlassian Jira (v8.20.10#820010)