[ 
https://issues.apache.org/jira/browse/HBASE-22079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799237#comment-16799237
 ] 

Yu Li commented on HBASE-22079:
-------------------------------

bq. I see at least one leak with client ZK quorum stuff, but we don't use that. 
The {{clientZKWatcher}} is not created unless 
{{hbase.client.zookeeper.quorum}} is set, so it's strange to me that this 
watcher shows up with the default configuration. It may be worth 
double-checking the configuration.

bq. And what is the client zk watcher used for?
After HBASE-20159, if {{hbase.client.zookeeper.quorum}} is set but 
{{hbase.client.zookeeper.observer.mode}} is not, {{HMaster}} takes care of 
watching the master/meta addresses and synchronizing them to the client 
ZooKeeper.
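
To make the two conditions above concrete, here is a minimal Java sketch of the decision logic. It is not the actual {{HMaster}} code: a plain {{Map}} stands in for the HBase configuration object, the class and method names are hypothetical, and treating an unset observer mode as {{false}} is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

public class ClientZkConfigSketch {
    // Property names as quoted in this comment.
    static final String CLIENT_QUORUM = "hbase.client.zookeeper.quorum";
    static final String OBSERVER_MODE = "hbase.client.zookeeper.observer.mode";

    /** Hypothetical check: a client watcher is expected only when a client quorum is configured. */
    static boolean createsClientWatcher(Map<String, String> conf) {
        String quorum = conf.get(CLIENT_QUORUM);
        return quorum != null && !quorum.trim().isEmpty();
    }

    /** Hypothetical check: the master syncs addresses only if the quorum is set and observer mode is off. */
    static boolean masterSyncsToClientZk(Map<String, String> conf) {
        // Assumption: an unset observer mode is treated as false.
        boolean observerMode = Boolean.parseBoolean(conf.getOrDefault(OBSERVER_MODE, "false"));
        return createsClientWatcher(conf) && !observerMode;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println("default config creates watcher: " + createsClientWatcher(conf));

        conf.put(CLIENT_QUORUM, "zk1.example.com,zk2.example.com");
        System.out.println("quorum set, master syncs: " + masterSyncsToClientZk(conf));

        conf.put(OBSERVER_MODE, "true");
        System.out.println("observer mode on, master syncs: " + masterSyncsToClientZk(conf));
    }
}
```

Under these assumptions a default configuration should never produce a client watcher, which is why seeing one without the quorum setting is surprising.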

bq. We do not close the MetaLocationSyncer?
Correct. In {{ClientZKSyncer$ClientZkUpdater#run}} the {{while}} loop exits 
once the server is stopped, so we didn't add an explicit {{stop}} method for 
it. However, after a second look, it's true that the {{clientZKWatcher}} is 
leaked, and I think the fix here is necessary. The only strange thing is why 
this client zk watcher is started without {{hbase.client.zookeeper.quorum}} 
being set, as Sergey mentioned...
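
The distinction between the updater loop ending on its own and the watcher still needing an explicit close can be sketched roughly as follows. This is an illustration only: {{FakeWatcher}} and {{Updater}} are hypothetical stand-ins, not the real {{ZKWatcher}} or {{ClientZkUpdater}} classes, and no ZooKeeper connection is involved.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class WatcherLeakSketch {
    /** Hypothetical stand-in for a watcher that owns a connection and must be closed. */
    static class FakeWatcher implements AutoCloseable {
        private final AtomicBoolean open = new AtomicBoolean(true);
        boolean isOpen() { return open.get(); }
        @Override public void close() { open.set(false); }
    }

    /** Hypothetical stand-in for the updater thread: it loops until the server is stopped. */
    static class Updater extends Thread {
        private final AtomicBoolean stopped;
        Updater(AtomicBoolean stopped) {
            this.stopped = stopped;
            setDaemon(true);
        }
        @Override public void run() {
            while (!stopped.get()) {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    /** Stops the updater, optionally closes the watcher, and reports whether the watcher is still open. */
    static boolean watcherLeakedAfterShutdown(boolean closeWatcher) {
        AtomicBoolean stopped = new AtomicBoolean(false);
        FakeWatcher watcher = new FakeWatcher();
        Updater updater = new Updater(stopped);
        updater.start();

        stopped.set(true); // the loop exits without any explicit stop method
        try {
            updater.join(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        if (closeWatcher) {
            watcher.close(); // the step that was missing for the watcher itself
        }
        return watcher.isOpen();
    }

    public static void main(String[] args) {
        System.out.println("leaked without close: " + watcherLeakedAfterShutdown(false));
        System.out.println("leaked with close: " + watcherLeakedAfterShutdown(true));
    }
}
```

The point of the sketch is that stopping the loop and releasing the resource it uses are separate steps, so the loop exiting on server stop does not by itself close the watcher.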

> master leaks ZK on shutdown and gets stuck because of netty threads if netty 
> socket is used
> -------------------------------------------------------------------------------------------
>
>                 Key: HBASE-22079
>                 URL: https://issues.apache.org/jira/browse/HBASE-22079
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>            Priority: Major
>         Attachments: HBASE-22079.patch
>
>
> {noformat}
> "master/...:17000:becomeActiveMaster-SendThread(...1)" #311 daemon prio=5 os_prio=0 tid=0x0000000058c61800 nid=0x2dd0 waiting on condition [0x0000000c477fe000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x00000000c4a5b3c0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>       at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>       at java.util.concurrent.LinkedBlockingDeque.pollFirst(LinkedBlockingDeque.java:522)
>       at java.util.concurrent.LinkedBlockingDeque.poll(LinkedBlockingDeque.java:684)
>       at org.apache.zookeeper.ClientCnxnSocketNetty.doTransport(ClientCnxnSocketNetty.java:232)
>       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146)
> {noformat}
> This also appears to leak a bunch of netty threads, and these are not 
> daemon threads (by design, apparently).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
