[jira] [Commented] (HBASE-16144) Replication queue's lock will live forever if RS acquiring the lock has dead

Ted Yu (JIRA) Wed, 29 Jun 2016 19:49:11 -0700

    [ https://issues.apache.org/jira/browse/HBASE-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15356350#comment-15356350 ]


Ted Yu commented on HBASE-16144:
--------------------------------

{code}
+public class ReplicationZKLockCleanerChore extends ScheduledChore {
{code}
Add an InterfaceAudience annotation to the class (for example, @InterfaceAudience.Private).
{code}
+          String[] array = rsServerNameZnode.split("/");
+          String znode = array[array.length - 1];
{code}
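Should array.length be checked before accessing the array? String.split() drops trailing empty strings, so a path such as "/" yields an empty array and array[array.length - 1] would throw ArrayIndexOutOfBoundsException.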
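A minimal sketch of such a check, not taken from the patch. The class name ZnodePaths and the method lastSegment are hypothetical, and returning null for a path with no usable segment is an assumption about what the caller would want:

```java
final class ZnodePaths {
    private ZnodePaths() {}

    /**
     * Returns the last non-empty segment of a '/'-separated znode path,
     * or null when the path has no such segment.
     * Hypothetical helper for illustration; not part of HBase.
     */
    static String lastSegment(String znodePath) {
        if (znodePath == null) {
            return null;
        }
        String[] parts = znodePath.split("/");
        // split() removes trailing empty strings, so "/" produces a
        // zero-length array; guard before indexing with length - 1.
        if (parts.length == 0) {
            return null;
        }
        String last = parts[parts.length - 1];
        // "" splits into a single empty string; treat that as "no segment".
        return last.isEmpty() ? null : last;
    }
}
```

The caller can then skip a znode whose name cannot be extracted instead of failing the whole chore run.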
{code}
+            if (s != null && System.currentTimeMillis() - s.getMtime() > TTL) {
{code}
Use EnvironmentEdgeManager instead of System.currentTimeMillis(), so that tests can control the clock.
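To illustrate why an injectable clock helps here, the following sketch uses java.util.function.LongSupplier as a stand-in for HBase's EnvironmentEdge. The class LockExpiry and its members are hypothetical and are not the patch's code; only the TTL comparison mirrors the quoted line:

```java
import java.util.function.LongSupplier;

/** Hypothetical TTL check with an injectable clock; not HBase API. */
final class LockExpiry {
    private final LongSupplier clock;   // supplies "now" in milliseconds
    private final long ttlMillis;

    LockExpiry(LongSupplier clock, long ttlMillis) {
        this.clock = clock;
        this.ttlMillis = ttlMillis;
    }

    /** True when the lock's modification time is more than the TTL in the past. */
    boolean isExpired(long lockMtimeMillis) {
        return clock.getAsLong() - lockMtimeMillis > ttlMillis;
    }
}
```

Production code could pass System::currentTimeMillis, while a test passes a supplier it advances by hand, so the expiry path can be exercised without sleeping.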
{code}
+        } catch (InterruptedException e) {
+          LOG.warn("zk operation interrupted", e);
{code}
Restore the interrupt status after catching InterruptedException.
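The usual Java idiom for this is to call Thread.currentThread().interrupt() in the catch block. A small self-contained sketch follows; the class InterruptAware and the method sleepQuietly are hypothetical, and Thread.sleep stands in for the blocking ZooKeeper call:

```java
/** Hypothetical example of restoring the interrupt flag; not from the patch. */
final class InterruptAware {
    private InterruptAware() {}

    /**
     * Sleeps for the given time. Returns false if the thread was interrupted,
     * in which case the interrupt flag is set again for the caller to observe.
     */
    static boolean sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            // Catching InterruptedException clears the flag; set it again so
            // code further up the stack can still react to the interrupt.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Without the restore, a chore that only logs the exception would hide the interrupt from whatever is trying to shut the thread down.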

> Replication queue's lock will live forever if RS acquiring the lock has dead
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-16144
>                 URL: https://issues.apache.org/jira/browse/HBASE-16144
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.2.1, 1.1.5, 0.98.20
>            Reporter: Phil Yang
>            Assignee: Phil Yang
>         Attachments: HBASE-16144-v1.patch
>
>
> By default, we use a multi operation when we claimQueues from ZK. But if 
> we set hbase.zookeeper.useMulti=false, we add a lock first, then copy 
> nodes, and finally clean up the old queue and the lock. 
> However, if the RS that acquired the lock crashes before claimQueues is done, 
> the lock will stay there forever and other RSs can never claim the queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
