[ https://issues.apache.org/jira/browse/HBASE-20597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493845#comment-16493845 ]
Sean Busbey commented on HBASE-20597:
-------------------------------------

BookKeeper jobs were stomping on system resources last week, though I don't know if that's specifically what caused the process killing here.

> Use a lock to serialize access to a shared reference to ZooKeeperWatcher in HBaseReplicationEndpoint
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-20597
>                 URL: https://issues.apache.org/jira/browse/HBASE-20597
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 1.3.2, 1.4.4
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Minor
>             Fix For: 3.0.0, 2.1.0, 1.5.0, 1.3.3, 2.0.1, 1.4.5
>
>         Attachments: HBASE-20597-branch-1.addendum-v2.0.patch, HBASE-20597-branch-1.patch, HBASE-20597.addendum.0.patch, HBASE-20597.patch
>
>
> The code that closes down a ZKW that fails to initialize when attempting to connect to the remote cluster is not MT safe and can in theory leak ZooKeeperWatcher instances. The allocation of a new ZKW and the store to the reference are not atomic. Concurrent allocations might occur with only one winning store, leading to leaked ZKW instances.
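
For readers unfamiliar with the race described in the issue, here is a minimal sketch of the general technique named in the title: serializing the close, allocate, and store steps under one lock so that no allocated instance can be orphaned. This is not the HBASE-20597 patch. `SerializedWatcherReload`, `FakeWatcher`, `reload`, `get`, and `shutdown` are hypothetical names invented for illustration, and `FakeWatcher` is only a stand-in that counts open instances; it does not talk to ZooKeeper.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SerializedWatcherReload {

    /** Hypothetical stand-in for a ZooKeeperWatcher; tracks how many instances are open. */
    static final class FakeWatcher implements AutoCloseable {
        static final AtomicInteger OPEN = new AtomicInteger();

        FakeWatcher() {
            OPEN.incrementAndGet();
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    private final Object lock = new Object();
    private FakeWatcher watcher; // guarded by lock

    /** Closes the current watcher (if any) and installs a new one, all under the lock. */
    void reload() {
        synchronized (lock) {
            if (watcher != null) {
                watcher.close();
            }
            // Allocation and store happen while holding the lock, so two threads
            // cannot both allocate and then race to overwrite the reference.
            watcher = new FakeWatcher();
        }
    }

    /** Returns the current watcher, or null if none is installed. */
    FakeWatcher get() {
        synchronized (lock) {
            return watcher;
        }
    }

    /** Closes and clears the current watcher. */
    void shutdown() {
        synchronized (lock) {
            if (watcher != null) {
                watcher.close();
                watcher = null;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SerializedWatcherReload endpoint = new SerializedWatcherReload();
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    endpoint.reload();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        // With the lock in place, exactly one watcher remains open.
        System.out.println("open watchers after reloads: " + FakeWatcher.OPEN.get());
        endpoint.shutdown();
        System.out.println("open watchers after shutdown: " + FakeWatcher.OPEN.get());
    }
}
```

If the `synchronized` blocks were removed from `reload`, two threads could each allocate a watcher and only the later store would survive, leaving the other instance open with nothing referencing it, which is the leak the issue describes.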