I've upgraded to Curator 2.3.0. LeaderSelector still uses thread interruption to signal the thread running takeLeadership() to stop, right? Inside my takeLeadership() I do some database operations, and before committing I check whether I was interrupted and roll back if I was. However, some code in between clears the interrupt flag (e.g. logback does this), so I end up committing even though the connection was lost or suspended.
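To make the failure mode concrete, here is a minimal sketch in plain Java with no Curator involved. The class name and `noisyLibraryCall()` are hypothetical stand-ins for whatever code clears the flag on the way to the commit. This is not logback's actual implementation, only an illustration of how `Thread.interrupted()` consumes the flag.

```java
public class InterruptFlagDemo {
    // Hypothetical stand-in for intermediate code that swallows the interrupt.
    // NOT logback's real code; it only mimics the observable effect.
    static void noisyLibraryCall() {
        // Thread.interrupted() returns the current flag AND clears it.
        Thread.interrupted();
    }

    // Returns true if the interrupt is still visible at "commit time".
    static boolean interruptVisibleAtCommit(boolean callLibraryInBetween) {
        // Simulates the listener calling ourThread.interrupt().
        Thread.currentThread().interrupt();
        if (callLibraryInBetween) {
            noisyLibraryCall();
        }
        // isInterrupted() reads the flag without clearing it.
        boolean visible = Thread.currentThread().isInterrupted();
        // Clear the flag so the demo does not leak it to later code.
        Thread.interrupted();
        return visible;
    }

    public static void main(String[] args) {
        System.out.println(interruptVisibleAtCommit(false)); // prints "true"
        System.out.println(interruptVisibleAtCommit(true));  // prints "false"
    }
}
```

In the second case the commit check sees no interrupt, which matches what I observe.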
I need some other criteria to decide if I can commit or not. hasLeadership only checks a local flag, which is always true inside takeLeadership(). Do I have another flag I can check?

--
Henrik Nordvik

On Tue, Nov 5, 2013 at 5:21 PM, Jordan Zimmerman <[email protected]> wrote:

> This sounds like a variation of
> https://issues.apache.org/jira/browse/CURATOR-54 - The next release of
> Curator (later this week) provides a more robust way of canceling
> leadership that doesn’t require thread interruption.
>
> -Jordan
>
> On Nov 5, 2013, at 1:47 AM, Henrik Nordvik <[email protected]> wrote:
>
> Hi,
>
> I'm getting some strange behaviour when stopping zookeeper in one
> environment that I can't reproduce locally.
> The result is that the leader selector "quits" even though it is set as
> auto-requeue. (I think that happens because the retry loop inside
> LeaderSelector checks the interrupt-flag, which is set again even when I
> cleared it).
>
> I think it boils down to getting
>
> 2013-11-04 18:22:32,501 INFO [main-EventThread ] c.n.c.f.state.ConnectionStateManager - State change: LOST
> 2013-11-04 18:22:32,501 DEBUG [ectionStateManager-0] s.f.s.a.feed.MyListener - Interrupting thread Thread[LeaderSelector-0,5,main]
> 2013-11-04 18:22:32,503 INFO [main-EventThread ] c.n.c.f.state.ConnectionStateManager - State change: SUSPENDED
> 2013-11-04 18:22:32,504 DEBUG [ectionStateManager-0] s.f.s.a.feed.MyListener - Interrupting thread Thread[LeaderSelector-0,5,main]
>
> ... then I handle the interrupt in the leader thread.
>
> Then I get this:
> 2013-11-04 18:22:36,465 INFO [main-EventThread ] c.n.c.f.state.ConnectionStateManager - State change: LOST
> 2013-11-04 18:22:36,465 INFO [main-EventThread ] c.n.c.f.state.ConnectionStateManager - State change: SUSPENDED
> 2013-11-04 18:22:36,465 DEBUG [ectionStateManager-0] s.f.s.a.feed.MyListener - StateChanged: LOST
> 2013-11-04 18:22:36,465 DEBUG [ectionStateManager-0] s.f.s.a.feed.MyListener - Interrupting thread Thread[LeaderSelector-0,5,main]
> 2013-11-04 18:22:36,466 DEBUG [ectionStateManager-0] s.f.s.a.feed.MyListener - StateChanged: SUSPENDED
> 2013-11-04 18:22:36,466 DEBUG [ectionStateManager-0] s.f.s.a.feed.MyListener - Interrupting thread Thread[LeaderSelector-0,5,main]
>
>
> Full log is here: https://gist.github.com/zerd/7316258
>
> The code follows the old leader selector example pretty well:
>
> @Override
> public void takeLeadership(CuratorFramework curatorFramework) throws Exception {
>     ourThread = Thread.currentThread();
>     logger.debug(format("(%s) Got leadership", ourThread));
>     try {
>         waitForAndPerformWork();
>     } catch (InterruptedException e) {
>         logger.debug(format("(%s) Interrupted ", ourThread), e);
>     } finally {
>         logger.debug(format("(%s) No longer leader", ourThread));
>     }
> }
>
> @Override
> public void stateChanged(CuratorFramework curatorFramework, ConnectionState newState) {
>     logger.debug("StateChanged: " + newState);
>
>     if ((newState == ConnectionState.LOST) || (newState == ConnectionState.SUSPENDED)) {
>         if (ourThread != null) {
>             logger.debug("Interrupting thread " + ourThread);
>             ourThread.interrupt();
>         } else {
>             logger.debug("Thread is null");
>         }
>     }
> }
>
> Is it supposed to go back and forth from lost to suspended?
> My goal is to get it to resume trying to get the leadership when zookeeper
> comes back. Do I have to requeue it manually when this happens?
> Would upgrading to latest curator with CancelLeadershipException fix this?
>
> Thank you very much for your time.
>
> --
> Henrik Nordvik
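
On the commit check asked about at the top of this message: one possible stop-gap, sketched here in plain Java without Curator on the classpath, is a sticky flag that the connection-state listener sets and that no intermediate library can clear. `LeadershipGuard` and its method names are hypothetical, not part of Curator's API. The sketch assumes `markLost()` is called from stateChanged() on LOST or SUSPENDED and `reset()` at the start of each takeLeadership() call.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper, not a Curator class.
public class LeadershipGuard {
    // Unlike the thread's interrupt flag, only this class can clear it.
    private final AtomicBoolean lost = new AtomicBoolean(false);

    // Call from the listener on LOST or SUSPENDED (next to ourThread.interrupt()).
    public void markLost() {
        lost.set(true);
    }

    // Call at the start of each leadership term.
    public void reset() {
        lost.set(false);
    }

    // Check immediately before committing; roll back when this is false.
    public boolean safeToCommit() {
        return !lost.get();
    }

    public static void main(String[] args) {
        LeadershipGuard guard = new LeadershipGuard();
        guard.reset();

        // Listener reacts to a connection problem.
        guard.markLost();
        Thread.currentThread().interrupt();

        // Some intermediate code swallows the interrupt...
        Thread.interrupted();

        // ...but the sticky flag still reports the loss.
        System.out.println(guard.safeToCommit()); // prints "false"
    }
}
```

This only narrows the window. A state change that arrives between the check and the commit is still missed, and calling `reset()` at the start of a term can in principle race with a late notification from the previous one.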
