[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-14370:
---
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Why is this open again? Commit, then resolve. Don't commit and not resolve. Thanks for helping RMs keep JIRA clean!

> Use separate thread for calling ZKPermissionWatcher#refreshNodes()
> --
>
>                 Key: HBASE-14370
>                 URL: https://issues.apache.org/jira/browse/HBASE-14370
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>             Fix For: 2.0.0, 1.3.0, 0.98.15
>
>         Attachments: 14370-0.98-v10.txt, 14370-branch-1-v10.txt,
> 14370-branch-1-v10.txt, 14370-branch-1-v10.txt, 14370-v1.txt, 14370-v10.txt,
> 14370-v3.txt, 14370-v5.txt, 14370-v7.txt, 14370-v8.txt,
> 14370-wait-nofity-v2.txt, 14370-wait-nofity.txt, hbase-14370_v4.patch,
> test-acl3-branch-1.stack
>
>
> I came off a support case (0.98.0) where main zk thread was seen doing the
> following:
> {code}
>   at org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.refreshAuthManager(ZKPermissionWatcher.java:152)
>   at org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.refreshNodes(ZKPermissionWatcher.java:135)
>   at org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.nodeChildrenChanged(ZKPermissionWatcher.java:121)
>   at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:348)
>   at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> {code}
> There were 62000 nodes under /acl due to lack of fix from HBASE-12635,
> leading to slowness in table creation because zk notification for region
> offline was blocked by the above.
> The attached patch separates refreshNodes() call into its own thread.
> Thanks to Enis and Devaraj for offline discussion.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
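A minimal sketch of the idea in the issue description, not the attached patch: the callback that runs on the ZooKeeper event thread only enqueues work, and a dedicated worker thread performs the slow refresh. The class name OffloadingWatcherSketch, the thread name, and the Runnable standing in for refreshNodes() are all hypothetical; the real ZKPermissionWatcher is not reproduced here.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a watcher whose callback is invoked on the ZooKeeper event thread. */
final class OffloadingWatcherSketch {
  // One dedicated worker: refreshes run in submission order, off the caller's thread.
  private final ExecutorService refreshExecutor = Executors.newSingleThreadExecutor(r -> {
    Thread t = new Thread(r, "acl-refresh-sketch");
    t.setDaemon(true);
    return t;
  });
  // Assumption: stands in for the slow refreshNodes() work.
  private final Runnable slowRefresh;
  private final AtomicInteger completed = new AtomicInteger();

  OffloadingWatcherSketch(Runnable slowRefresh) {
    this.slowRefresh = slowRefresh;
  }

  /** Called on the event thread; only enqueues the refresh and returns immediately. */
  void nodeChildrenChanged() {
    refreshExecutor.execute(() -> {
      slowRefresh.run();
      completed.incrementAndGet();
    });
  }

  int completedRefreshes() {
    return completed.get();
  }

  /** Stops accepting work and waits for queued refreshes; true if everything drained in time. */
  boolean shutdownAndAwait(long timeoutMillis) {
    refreshExecutor.shutdown();
    try {
      return refreshExecutor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

With this shape, nodeChildrenChanged() returns right away, so other notifications (such as the region-offline event mentioned in the description) would not wait behind a refresh of a large /acl subtree.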
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-14370:
---
    Status: Patch Available  (was: Reopened)

>                 Key: HBASE-14370
>             Fix For: 2.0.0, 0.98.15
>         Attachments: 14370-0.98-v10.txt, 14370-branch-1-v10.txt,
> 14370-branch-1-v10.txt, 14370-branch-1-v10.txt, 14370-v1.txt, 14370-v10.txt,
> 14370-v3.txt, 14370-v5.txt, 14370-v7.txt, 14370-v8.txt,
> 14370-wait-nofity-v2.txt, 14370-wait-nofity.txt, hbase-14370_v4.patch,
> test-acl3-branch-1.stack
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-14370:
---
    Fix Version/s: 1.3.0

>                 Key: HBASE-14370
>             Fix For: 2.0.0, 1.3.0, 0.98.15
>         Attachments: 14370-0.98-v10.txt, 14370-branch-1-v10.txt,
> 14370-branch-1-v10.txt, 14370-branch-1-v10.txt, 14370-v1.txt, 14370-v10.txt,
> 14370-v3.txt, 14370-v5.txt, 14370-v7.txt, 14370-v8.txt,
> 14370-wait-nofity-v2.txt, 14370-wait-nofity.txt, hbase-14370_v4.patch,
> test-acl3-branch-1.stack
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-14370:
---
    Attachment: 14370-branch-1-v10.txt

Re-attaching for test run.

>                 Key: HBASE-14370
>             Fix For: 2.0.0, 0.98.15
>         Attachments: 14370-0.98-v10.txt, 14370-branch-1-v10.txt,
> 14370-branch-1-v10.txt, 14370-branch-1-v10.txt, 14370-v1.txt, 14370-v10.txt,
> 14370-v3.txt, 14370-v5.txt, 14370-v7.txt, 14370-v8.txt,
> 14370-wait-nofity-v2.txt, 14370-wait-nofity.txt, hbase-14370_v4.patch,
> test-acl3-branch-1.stack
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-14370:
---
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

This has been committed, the issue should be resolved as Fixed.

>                 Key: HBASE-14370
>             Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3
>         Attachments: 14370-0.98-v10.txt, 14370-branch-1-v10.txt,
> 14370-branch-1-v10.txt, 14370-v1.txt, 14370-v10.txt, 14370-v3.txt,
> 14370-v5.txt, 14370-v7.txt, 14370-v8.txt, 14370-wait-nofity-v2.txt,
> 14370-wait-nofity.txt, hbase-14370_v4.patch, test-acl3-branch-1.stack
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-14370:
---
    Fix Version/s:     (was: 1.1.3)
                       (was: 1.0.3)
                       (was: 1.3.0)
                       (was: 1.2.0)

Update fix versions when committed to other branches

>                 Key: HBASE-14370
>             Fix For: 2.0.0, 0.98.15
>         Attachments: 14370-0.98-v10.txt, 14370-branch-1-v10.txt,
> 14370-branch-1-v10.txt, 14370-v1.txt, 14370-v10.txt, 14370-v3.txt,
> 14370-v5.txt, 14370-v7.txt, 14370-v8.txt, 14370-wait-nofity-v2.txt,
> 14370-wait-nofity.txt, hbase-14370_v4.patch, test-acl3-branch-1.stack
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-14370:
---
    Status: Patch Available  (was: Open)

>                 Key: HBASE-14370
>         Attachments: 14370-branch-1-v10.txt, 14370-branch-1-v10.txt,
> 14370-v1.txt, 14370-v10.txt, 14370-v3.txt, 14370-v5.txt, 14370-v7.txt,
> 14370-v8.txt, 14370-wait-nofity-v2.txt, 14370-wait-nofity.txt,
> hbase-14370_v4.patch, test-acl3-branch-1.stack
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-14370:
---
    Attachment: 14370-branch-1-v10.txt

>                 Key: HBASE-14370
>         Attachments: 14370-branch-1-v10.txt, 14370-branch-1-v10.txt,
> 14370-v1.txt, 14370-v10.txt, 14370-v3.txt, 14370-v5.txt, 14370-v7.txt,
> 14370-v8.txt, 14370-wait-nofity-v2.txt, 14370-wait-nofity.txt,
> hbase-14370_v4.patch, test-acl3-branch-1.stack
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-14370:
---
    Attachment: test-acl3-branch-1.stack

>                 Key: HBASE-14370
>         Attachments: 14370-branch-1-v10.txt, 14370-v1.txt, 14370-v10.txt,
> 14370-v3.txt, 14370-v5.txt, 14370-v7.txt, 14370-v8.txt,
> 14370-wait-nofity-v2.txt, 14370-wait-nofity.txt, hbase-14370_v4.patch,
> test-acl3-branch-1.stack
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-14370:
---
     Hadoop Flags: Reviewed
    Fix Version/s: 1.1.3
                   1.0.3
                   0.98.15
                   1.3.0
                   1.2.0
                   2.0.0

[~apurtell]:
I am preparing backport to 0.98 branch.
Let me know if you have comments.

Thanks

>                 Key: HBASE-14370
>             Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3
>         Attachments: 14370-branch-1-v10.txt, 14370-branch-1-v10.txt,
> 14370-v1.txt, 14370-v10.txt, 14370-v3.txt, 14370-v5.txt, 14370-v7.txt,
> 14370-v8.txt, 14370-wait-nofity-v2.txt, 14370-wait-nofity.txt,
> hbase-14370_v4.patch, test-acl3-branch-1.stack
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-14370:
---
    Attachment: 14370-v7.txt

Patch v7 adds preemption for nodeChildrenChanged events on top of patch v5.

>                 Key: HBASE-14370
>         Attachments: 14370-v1.txt, 14370-v3.txt, 14370-v5.txt, 14370-v7.txt,
> 14370-wait-nofity-v2.txt, 14370-wait-nofity.txt, hbase-14370_v4.patch
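The v7 patch itself is not shown in this thread, so the following is only one possible reading of "preemption for nodeChildrenChanged events": when a newer event arrives, the refresh submitted for the previous event is cancelled, on the assumption that the newer refresh rereads the full state anyway. PreemptingRefreshSketch and its members are hypothetical names, and the sketch assumes the callback is only ever invoked from a single event thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: a newer children-changed event preempts the previous refresh. */
final class PreemptingRefreshSketch {
  private final ExecutorService executor = Executors.newSingleThreadExecutor();
  // Assumption: stands in for the refreshNodes() work.
  private final Runnable refresh;
  // Only touched from the single event thread in this sketch, so no extra synchronization.
  private Future<?> latest;

  PreemptingRefreshSketch(Runnable refresh) {
    this.refresh = refresh;
  }

  /** Called on the event thread for every children-changed notification. */
  void nodeChildrenChanged() {
    if (latest != null) {
      // Skips the previous refresh if still queued; interrupts it if already running.
      latest.cancel(true);
    }
    latest = executor.submit(refresh);
  }

  /** Stops accepting work and waits for the remaining refresh; true if drained in time. */
  boolean shutdownAndAwait(long timeoutMillis) {
    executor.shutdown();
    try {
      return executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

Interrupting a running refresh only helps if the refresh work responds to interruption; a refresh that ignores the interrupt would simply run to completion before the newer one starts.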
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-14370:
---
    Attachment: 14370-v8.txt

>                 Key: HBASE-14370
>         Attachments: 14370-v1.txt, 14370-v3.txt, 14370-v5.txt, 14370-v7.txt,
> 14370-v8.txt, 14370-wait-nofity-v2.txt, 14370-wait-nofity.txt,
> hbase-14370_v4.patch
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-14370:
---
    Attachment:     (was: 14370-v8.txt)

>                 Key: HBASE-14370
>         Attachments: 14370-v1.txt, 14370-v3.txt, 14370-v5.txt, 14370-v7.txt,
> 14370-v8.txt, 14370-wait-nofity-v2.txt, 14370-wait-nofity.txt,
> hbase-14370_v4.patch
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-14370:
---
    Attachment: 14370-v8.txt

Patch v8 moves the private runnable inside nodeChildrenChanged(), with a comment explaining what it does.
Hopefully this makes the code more readable.

>                 Key: HBASE-14370
>         Attachments: 14370-v1.txt, 14370-v3.txt, 14370-v5.txt, 14370-v7.txt,
> 14370-v8.txt, 14370-wait-nofity-v2.txt, 14370-wait-nofity.txt,
> hbase-14370_v4.patch
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-14370:
---
    Attachment: 14370-v10.txt

In patch v10, added test where double ref decrement leads to region server abort.
Also verified that after tables are deleted, total reference count comes down to 0 in normal operation.

>                 Key: HBASE-14370
>         Attachments: 14370-v1.txt, 14370-v10.txt, 14370-v3.txt, 14370-v5.txt,
> 14370-v7.txt, 14370-v8.txt, 14370-wait-nofity-v2.txt, 14370-wait-nofity.txt,
> hbase-14370_v4.patch
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-14370:
---
    Status: Open  (was: Patch Available)

>                 Key: HBASE-14370
>         Attachments: 14370-v1.txt, 14370-v10.txt, 14370-v3.txt, 14370-v5.txt,
> 14370-v7.txt, 14370-v8.txt, 14370-wait-nofity-v2.txt, 14370-wait-nofity.txt,
> hbase-14370_v4.patch
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-14370:
---
    Attachment: 14370-branch-1-v10.txt

Patch for branch-1.

>                 Key: HBASE-14370
>         Attachments: 14370-branch-1-v10.txt, 14370-v1.txt, 14370-v10.txt,
> 14370-v3.txt, 14370-v5.txt, 14370-v7.txt, 14370-v8.txt,
> 14370-wait-nofity-v2.txt, 14370-wait-nofity.txt, hbase-14370_v4.patch
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14370: --- Attachment: 14370-wait-nofity-v2.txt
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-14370: -- Attachment: hbase-14370_v4.patch Ted, can you check the v4 patch to see my points above. I did not run any tests on it though.
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14370: --- Attachment: 14370-v5.txt Patch v5 adds call to watcher.abort() in TableAuthManager#release() when ref counting assertion fails.
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14370: --- Attachment: (was: 14370-v3.txt)
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14370: --- Attachment: 14370-v3.txt
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14370: --- Attachment: 14370-v3.txt See if patch v3 is better.
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14370: --- Attachment: 14370-wait-nofity.txt Please take a look at 14370-wait-nofity.txt
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14370: --- Issue Type: Bug (was: Improvement)
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14370: --- Attachment: 14370-v1.txt
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14370: --- Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14370: --- Attachment: 14370-v1.txt
[jira] [Updated] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
[ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14370: --- Attachment: (was: 14370-v1.txt)