[ 
https://issues.apache.org/jira/browse/SENTRY-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Na Li updated SENTRY-2203:
--------------------------
    Description: 
In our testing of Sentry HA, we found that after restarting the Sentry service 
without restarting the ZooKeeper service, it is possible that none of the 
Sentry servers is elected as the leader to sync with HMS.

What happened:
1) When a leader is elected, the Sentry server host holds the leader lock. The 
lock is identified by the mutexPath. All Sentry servers in a cluster use the 
same mutexPath.
2) When the Sentry service is shut down, the HAContext is shut down, so the 
CuratorFrameworkImpl it contains is shut down as well, but the leader lock is 
still held by the Sentry server host.
3) When the interrupt signal from the shutdown interrupts the leader election 
thread, releasing the leader lock fails because CuratorFrameworkImpl is no 
longer in the started state.
4) When the Sentry server restarts, acquiring the leader lock fails because it 
was never released, so no active Sentry server is the leader.
5) If the leader lock is released before CuratorFrameworkImpl is shut down, 
this issue does not happen. If ZooKeeper is restarted after the Sentry service 
restarts, this issue does not happen either.
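
The sequence in steps 1 to 4 can be replayed with a small toy model. This is 
only a sketch under stated assumptions: ToyClient, ToyLeaderLock, the LOCKS 
map, and the "/sentry/leader" path are hypothetical stand-ins, not Sentry, 
Curator, or ZooKeeper code. The model simply assumes, as described above, that 
a lock can only be released while its client is started and that the lock 
record outlives the Sentry restart.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model only: none of these classes are Sentry, Curator, or ZooKeeper APIs.
class LeaderLockFailureSketch {

    // Stands in for the lock state that outlives a Sentry restart.
    static final Map<String, String> LOCKS = new ConcurrentHashMap<>();

    // Stands in for the client; it must be started for any lock operation.
    static class ToyClient {
        private boolean started = true;
        void close() { started = false; }
        boolean isStarted() { return started; }
    }

    // Stands in for the leader lock identified by a shared mutex path.
    static class ToyLeaderLock {
        private final ToyClient client;
        private final String mutexPath;
        private final String owner;

        ToyLeaderLock(ToyClient client, String mutexPath, String owner) {
            this.client = client;
            this.mutexPath = mutexPath;
            this.owner = owner;
        }

        // Returns true only if nobody currently holds the lock at mutexPath.
        boolean tryAcquire() {
            requireStarted();
            return LOCKS.putIfAbsent(mutexPath, owner) == null;
        }

        // Removes the lock record, but only while the client is started.
        void release() {
            requireStarted();
            LOCKS.remove(mutexPath, owner);
        }

        private void requireStarted() {
            if (!client.isStarted()) {
                throw new IllegalStateException("client is not started");
            }
        }
    }

    // Replays steps 1 to 4; returns true if the restarted server became leader.
    static boolean restartAfterBadShutdown() {
        LOCKS.clear();
        String path = "/sentry/leader"; // hypothetical mutex path
        ToyClient oldClient = new ToyClient();
        ToyLeaderLock oldLock = new ToyLeaderLock(oldClient, path, "server-1");
        oldLock.tryAcquire();           // step 1: leader holds the lock
        oldClient.close();              // step 2: client shut down first
        try {
            oldLock.release();          // step 3: release is attempted too late
        } catch (IllegalStateException e) {
            System.out.println("release failed: " + e.getMessage());
        }
        ToyClient newClient = new ToyClient();
        ToyLeaderLock newLock = new ToyLeaderLock(newClient, path, "server-1-restarted");
        return newLock.tryAcquire();    // step 4: stale lock blocks the election
    }

    public static void main(String[] args) {
        System.out.println("leader after restart: " + restartAfterBadShutdown());
    }
}
```

In this model the restarted server never becomes leader, because the record 
left behind in LOCKS is still present when it tries to acquire the lock.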

To fix this issue, Sentry LeaderStatusMonitor can deactivate the leader and 
release the leader lock when it is closed, so that the leader lock is 
guaranteed to be released before CuratorFrameworkImpl is shut down.
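
The ordering that the fix relies on can be sketched as follows. This is a toy 
illustration, not the actual patch: ToyClient and ToyLeaderMonitor are 
hypothetical stand-ins for CuratorFrameworkImpl and LeaderStatusMonitor, and 
the use of try-with-resources is just one way to get a fixed close order in 
Java (resources are closed in reverse order of declaration).

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these classes are hypothetical, not Sentry or Curator code.
class OrderedShutdownSketch {

    // Records the order in which shutdown steps happen.
    static final List<String> EVENTS = new ArrayList<>();

    // Stands in for the client that lock operations depend on.
    static class ToyClient implements AutoCloseable {
        private boolean started = true;
        boolean isStarted() { return started; }
        @Override public void close() {
            started = false;
            EVENTS.add("client closed");
        }
    }

    // Stands in for the monitor that owns leadership and the leader lock.
    static class ToyLeaderMonitor implements AutoCloseable {
        private final ToyClient client;
        private boolean leader;

        ToyLeaderMonitor(ToyClient client) { this.client = client; }

        void becomeLeader() {
            leader = true;
            EVENTS.add("lock acquired");
        }

        // Gives up leadership; only possible while the client is started.
        void deactivate() {
            if (!leader) {
                return;
            }
            if (!client.isStarted()) {
                throw new IllegalStateException("client is not started");
            }
            leader = false;
            EVENTS.add("lock released");
        }

        // Closing the monitor releases the lock as part of its own shutdown.
        @Override public void close() { deactivate(); }
    }

    static List<String> runAndShutDown() {
        EVENTS.clear();
        // Declared client first, monitor second, so the monitor is closed
        // first and the client is still started when the lock is released.
        try (ToyClient client = new ToyClient();
             ToyLeaderMonitor monitor = new ToyLeaderMonitor(client)) {
            monitor.becomeLeader();
        }
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        // prints [lock acquired, lock released, client closed]
        System.out.println(runAndShutDown());
    }
}
```

Tying the release to the monitor's own close() means the release no longer 
depends on when the interrupted election thread happens to run.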

  was:
In our testing of Sentry HA, we found that after restarting the Sentry service 
without restarting the ZooKeeper service, it is possible that none of the 
Sentry servers is elected as the leader.

What happened:
1) When a leader is elected, the Sentry server host holds the leader lock. The 
lock is identified by the mutexPath. All Sentry servers in a cluster use the 
same mutexPath.
2) When the Sentry service is shut down, the HAContext is shut down, so the 
CuratorFrameworkImpl it contains is shut down as well, but the leader lock is 
still held by the Sentry server host.
3) When the interrupt signal from the shutdown interrupts the leader election 
thread, releasing the leader lock fails because CuratorFrameworkImpl is no 
longer in the started state.
4) When the Sentry server restarts, acquiring the leader lock fails because it 
was never released, so no active Sentry server is the leader.
5) If the leader lock is released before CuratorFrameworkImpl is shut down, 
this issue does not happen. If ZooKeeper is restarted after the Sentry service 
restarts, this issue does not happen either.

To fix this issue, Sentry LeaderStatusMonitor can deactivate the leader and 
release the leader lock when it is closed, so that the leader lock is 
guaranteed to be released before CuratorFrameworkImpl is shut down.


> Leader Lock is not released when Sentry service shuts down
> ----------------------------------------------------------
>
>                 Key: SENTRY-2203
>                 URL: https://issues.apache.org/jira/browse/SENTRY-2203
>             Project: Sentry
>          Issue Type: Bug
>          Components: Sentry
>    Affects Versions: 2.1.0
>            Reporter: Na Li
>            Assignee: Na Li
>            Priority: Critical



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
