[ 
https://issues.apache.org/jira/browse/SENTRY-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441571#comment-16441571
 ] 

Na Li commented on SENTRY-2203:
-------------------------------

The log below shows the call stack at the point where releasing the leader 
lock failed. It also shows the cause of the failure: CuratorFrameworkImpl was 
not in the started state.
{code}
2018-04-08 04:34:31,760 INFO sentry.org.apache.curator.framework.imps.CuratorFrameworkImpl: backgroundOperationsLoop exiting   <-- CuratorFrameworkImpl is closed
2018-04-08 04:34:31,762 INFO org.apache.sentry.provider.db.service.persistent.LeaderStatusMonitor: LeaderStatusMonitor: interrupted
2018-04-08 04:34:31,762 INFO org.apache.sentry.service.thrift.SentryService: Attempting to stop sentry thrift service...
2018-04-08 04:34:31,762 INFO org.apache.sentry.provider.db.service.persistent.LeaderStatusMonitor: LeaderStatusMonitor: becoming standby
2018-04-08 04:34:31,762 INFO org.apache.sentry.service.thrift.SentryService: Attempting to stop sentry web service...
2018-04-08 04:34:31,762 ERROR sentry.org.apache.curator.framework.recipes.leader.LeaderSelector: The leader threw an exception
java.lang.IllegalStateException: instance must be started before calling this method
        at com.google.common.base.Preconditions.checkState(Preconditions.java:145)   <-- CuratorFrameworkImpl is not in started state
        at sentry.org.apache.curator.framework.imps.CuratorFrameworkImpl.delete(CuratorFrameworkImpl.java:359)
        at sentry.org.apache.curator.framework.recipes.locks.LockInternals.deleteOurPath(LockInternals.java:339)
        at sentry.org.apache.curator.framework.recipes.locks.LockInternals.releaseLock(LockInternals.java:123)
        at sentry.org.apache.curator.framework.recipes.locks.InterProcessMutex.release(InterProcessMutex.java:154)
        at sentry.org.apache.curator.framework.recipes.leader.LeaderSelector.doWork(LeaderSelector.java:427)
        at sentry.org.apache.curator.framework.recipes.leader.LeaderSelector.doWorkLoop(LeaderSelector.java:444)
        at sentry.org.apache.curator.framework.recipes.leader.LeaderSelector.access$100(LeaderSelector.java:64)
        at sentry.org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:245)
        at sentry.org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:239)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
{code}
d) CuratorFrameworkImpl code: delete() requires the instance to be in the 
STARTED state
{code}
  public DeleteBuilder delete() {
    Preconditions.checkState(this.getState() == CuratorFrameworkState.STARTED,
        "instance must be started before calling this method");
    return new DeleteBuilderImpl(this);
  }
{code}
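
The effect of that precondition can be reproduced with a small stand-alone 
model. This is only a sketch: ModelClient and its State enum are hypothetical 
stand-ins written for illustration, not the real CuratorFrameworkImpl or 
CuratorFrameworkState. It shows that once the client has been closed, any 
delete (and therefore any lock release that relies on it) throws 
IllegalStateException.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a Curator client; NOT the real CuratorFrameworkImpl.
final class ModelClient implements AutoCloseable {
    enum State { LATENT, STARTED, STOPPED }

    private final AtomicReference<State> state = new AtomicReference<>(State.LATENT);

    void start() {
        state.set(State.STARTED);
    }

    @Override
    public void close() {
        state.set(State.STOPPED);
    }

    State getState() {
        return state.get();
    }

    // Mirrors the guard shown above: refuse to operate unless STARTED.
    String delete(String path) {
        if (state.get() != State.STARTED) {
            throw new IllegalStateException(
                "instance must be started before calling this method");
        }
        return "deleted " + path;
    }
}
```

In this model, calling delete() after close() fails in the same way as the 
stack trace above, which is why the order of "release lock" and "close client" 
matters during shutdown.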

> Leader Lock is not released when Sentry service shuts down
> ----------------------------------------------------------
>
>                 Key: SENTRY-2203
>                 URL: https://issues.apache.org/jira/browse/SENTRY-2203
>             Project: Sentry
>          Issue Type: Bug
>          Components: Sentry
>    Affects Versions: 2.1.0
>            Reporter: Na Li
>            Assignee: Na Li
>            Priority: Critical
>         Attachments: SENTRY-2203.001.patch
>
>
> In our testing of Sentry HA, we found that after restarting the Sentry 
> service without restarting the ZooKeeper service, it is possible that none 
> of the Sentry servers is elected as leader to sync with HMS.
> What happened was:
> 1) When a leader is elected, that Sentry server host holds the leader lock. 
> The lock is identified by the mutexPath. All Sentry servers in a cluster use 
> the same mutexPath.
> 2) When the Sentry service was shut down, the HAContext was shut down, so 
> its contained CuratorFrameworkImpl was shut down, but the leader lock was 
> still held by the Sentry server host.
> 3) When the interrupt from shutdown interrupted the leader election thread, 
> releasing the leader lock failed because CuratorFrameworkImpl was not in the 
> started state.
> 4) When the Sentry server restarted, acquiring the leader lock failed 
> because it had not been released. So no active Sentry server was the leader.
> 5) If the leader lock is released before CuratorFrameworkImpl is shut down, 
> this issue does not happen. It also does not happen if ZooKeeper is 
> restarted after the Sentry service restarts.
> To fix this issue, the Sentry LeaderStatusMonitor can deactivate the leader 
> to release the leader lock when it is closed, so that the leader lock is 
> guaranteed to be released before CuratorFrameworkImpl is shut down.
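
The shutdown ordering proposed in the description can be sketched with a 
plain-Java model. Everything here is an assumption made for illustration: 
LockStore stands in for ZooKeeper, SketchClient for the Curator client, 
SketchLeaderMonitor for LeaderStatusMonitor, and the path "/sentry/leader" 
used in the usage note is a made-up mutexPath. It is not the code in the 
attached patch.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for ZooKeeper: remembers which lock paths are held.
final class LockStore {
    private final Set<String> held = ConcurrentHashMap.newKeySet();

    boolean tryAcquire(String path) { return held.add(path); }
    void remove(String path) { held.remove(path); }
    boolean isHeld(String path) { return held.contains(path); }
}

// Hypothetical stand-in for the Curator client, with the same kind of state guard.
final class SketchClient {
    private final LockStore store;
    private volatile boolean started = true;

    SketchClient(LockStore store) { this.store = store; }

    void close() { started = false; }

    void delete(String path) {
        if (!started) {
            throw new IllegalStateException(
                "instance must be started before calling this method");
        }
        store.remove(path);
    }
}

// Hypothetical monitor: closing it deactivates leadership, releasing the lock.
final class SketchLeaderMonitor implements AutoCloseable {
    private final SketchClient client;
    private final LockStore store;
    private final String mutexPath;
    private boolean leader;

    SketchLeaderMonitor(SketchClient client, LockStore store, String mutexPath) {
        this.client = client;
        this.store = store;
        this.mutexPath = mutexPath;
    }

    boolean becomeLeader() {
        leader = store.tryAcquire(mutexPath);
        return leader;
    }

    void deactivate() {
        if (leader) {
            client.delete(mutexPath); // needs a started client
            leader = false;
        }
    }

    @Override
    public void close() { deactivate(); }
}

final class SketchShutdown {
    // Release the leader lock first, then close the client.
    static void shutdown(SketchLeaderMonitor monitor, SketchClient client) {
        monitor.close();
        client.close();
    }
}
```

With this ordering, a monitor created after a restart can acquire 
"/sentry/leader" again. If the client is closed first, the monitor's close() 
throws and the lock stays held in the store, which corresponds to steps 2) 
to 4) of the description.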



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
