[
https://issues.apache.org/jira/browse/CURATOR-518?focusedWorklogId=849101&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-849101
]
ASF GitHub Bot logged work on CURATOR-518:
------------------------------------------
Author: ASF GitHub Bot
Created on: 04/Mar/23 11:42
Start Date: 04/Mar/23 11:42
Worklog Time Spent: 10m
Work Description: kezhuw commented on code in PR #446:
URL: https://github.com/apache/curator/pull/446#discussion_r1125447099
##########
curator-recipes/src/main/java/org/apache/curator/framework/recipes/leader/LeaderSelector.java:
##########
@@ -237,13 +242,15 @@ private synchronized boolean internalRequeue()
if ( !isQueued && (state.get() == State.STARTED) )
{
isQueued = true;
- Future<Void> task = executorService.submit(new Callable<Void>()
+ Future<?> oldTask = ourTask;
+ ourTask = executorService.submit(new Callable<Void>()
{
@Override
public Void call() throws Exception
{
try
{
+ waitTaskDone(oldTask, Duration.ofMillis(500),
Duration.ofSeconds(5));
Review Comment:
I thought about the guarantee `LeaderSelector` try to provide, and conclude
to that:
* It is crucial for `LeaderSelectorListener.takeLeadership` to not swallow
interruption silently without return. Otherwise, states(e.g. `hasLeadership`,
`mutex`) could be changed by dated task.
* Interruption for `LeaderSelectorListener.takeLeadership` is a must
regardless of leadership. Otherwise, `LeaderSelectorListener.takeLeadership`
could run with dated leadership and no future interruption.
* Without `autoRequeue`, `requeue` is hard to use.
* An attempt could be cancelled due to session state change.
* `requeue` after `LeaderSelector.interruptLeadership` might be a noop.
For your questions:
1. Yes. It may be better to enforce only one
`LeaderSelectorListener.takeLeadership` call simultaneously. But that could
make this pr not concentrated.
2. Yes, but `LeaderSelector.interruptLeadership` is not enough as it could
introduce dated leadership.
I think I messed up things here, it might be better to make this pr more
concentrated on CURATOR-518 and leaves other concerns(concurrency of
`LeaderSelector.doWork` and `requeue` usage) in followups.
Issue Time Tracking
-------------------
Worklog Id: (was: 849101)
Time Spent: 40m (was: 0.5h)
> Curator. LeaderSelector. Two successive calls to interruptLeadership() will
> break autoRequeue.
> ----------------------------------------------------------------------------------------------
>
> Key: CURATOR-518
> URL: https://issues.apache.org/jira/browse/CURATOR-518
> Project: Apache Curator
> Issue Type: Improvement
> Components: Recipes
> Affects Versions: 4.0.1, 4.2.0
> Environment: Windows 8, JRE 1.8.0_181
> Reporter: Bulatov Oleg
> Priority: Major
> Labels: newbie
> Time Spent: 40m
> Remaining Estimate: 0h
>
> h1. Curator. LeaderSelector. Two successive calls to interruptLeadership()
> will break autoRequeue
> If we set autoRequeue to TRUE. But during execution interruptLeadership()
> will be called from another thread before internalRequeue() completed its
> work. Then it will break recursive call to internalRequeue(), so that client
> will not ask for leadership and get stuck.
> We can solve this problem if we check hasLeadership() before calling
> interruptLeadership(). But it is strange that such check curator library does
> not do internally.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)