[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently
[ https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13622125#comment-13622125 ]

Hudson commented on HBASE-7871:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #476 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/476/])
HBASE-7871 HBase can be stuck when closing regions concurrently (Ted Yu) (Revision 1464012)

Result = FAILURE
tedyu :
Files :
* /hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java
* /hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java

HBase can be stuck when closing regions concurrently
----------------------------------------------------

Key: HBASE-7871
URL: https://issues.apache.org/jira/browse/HBASE-7871
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.96.0
Reporter: Nicolas Liochon
Assignee: Ted Yu
Priority: Critical
Fix For: 0.95.1, 0.98.0
Attachments: 7871.patch, 7871-v2.patch, 7871-v3.txt, 7871-v4.txt, s1.txt, TestStartStop.java

The attached test fails ~1% of the time on 0.96. It seems it does not fail on 0.94.5.
It's simple: a table creation and some puts.
I attach the stack. The logs say nothing, it seems.
The suspicious part is:
{noformat}
RS_CLOSE_REGION-localhost,57575,1361197489166-2 prio=10 tid=0x7fb0c8775800 nid=0x61ac runnable [0x7fb09f272000]
   java.lang.Thread.State: RUNNABLE
	at java.util.TreeMap.fixAfterDeletion(TreeMap.java:2193)
	at java.util.TreeMap.deleteEntry(TreeMap.java:2151)
	at java.util.TreeMap.remove(TreeMap.java:585)
	at java.util.TreeSet.remove(TreeSet.java:259)
	at org.apache.hadoop.hbase.regionserver.MetricsRegionAggregateSourceImpl.deregister(MetricsRegionAggregateSourceImpl.java:55)
	at org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.close(MetricsRegionSourceImpl.java:86)
	at org.apache.hadoop.hbase.regionserver.MetricsRegion.close(MetricsRegion.java:40)
	at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1063)
	at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:969)
	- locked 0x0006944e2558 (a java.lang.Object)
	at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:146)
	at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:203)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
{noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently
[ https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620739#comment-13620739 ]

Anoop Sam John commented on HBASE-7871:
---------------------------------------

Yes Ram. This fix should solve this test failure also.
[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently
[ https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620871#comment-13620871 ]

Nicolas Liochon commented on HBASE-7871:

Tested, no error on 250 tries = +1 for me.
[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently
[ https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621491#comment-13621491 ]

Hudson commented on HBASE-7871:
-------------------------------

Integrated in hbase-0.95 #121 (See [https://builds.apache.org/job/hbase-0.95/121/])
HBASE-7871 HBase can be stuck when closing regions concurrently (Ted Yu) (Revision 1464013)

Result = SUCCESS
tedyu :
Files :
* /hbase/branches/0.95/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java
* /hbase/branches/0.95/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java
[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently
[ https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621498#comment-13621498 ]

Hudson commented on HBASE-7871:
-------------------------------

Integrated in HBase-TRUNK #4009 (See [https://builds.apache.org/job/HBase-TRUNK/4009/])
HBASE-7871 HBase can be stuck when closing regions concurrently (Ted Yu) (Revision 1464012)

Result = FAILURE
tedyu :
Files :
* /hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java
* /hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java
[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently
[ https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621775#comment-13621775 ]

Hudson commented on HBASE-7871:
-------------------------------

Integrated in hbase-0.95-on-hadoop2 #53 (See [https://builds.apache.org/job/hbase-0.95-on-hadoop2/53/])
HBASE-7871 HBase can be stuck when closing regions concurrently (Ted Yu) (Revision 1464013)

Result = FAILURE
tedyu :
Files :
* /hbase/branches/0.95/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java
* /hbase/branches/0.95/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java
[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently
[ https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620054#comment-13620054 ]

Elliott Clark commented on HBASE-7871:
--------------------------------------

getMetrics is called from a thread spawned by the Hadoop metrics system. The hadoop metrics system calls getMetrics to copy all of the values that a Source has. It's always a thread outside of the control of HBase.

* Anything we do in hadoop1-compat will probably have to be done in hadoop 2.
** https://github.com/apache/hbase/blob/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java#L49
** https://github.com/apache/hbase/blob/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java#L50
* I think we should go with reader writer locks inside of the Aggregate source, where the actual manipulation of the tree map happens.
** getMetrics takes a reader lock
** Adding or removing region sources would take a writer lock
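The reader/writer locking proposed in the comment above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the committed patch: RegionSourceRegistry, register, deregister, and snapshot are hypothetical names, snapshot stands in for getMetrics, and a TreeSet of strings stands in for the set of region sources.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for the aggregate source; not the HBase class. */
class RegionSourceRegistry {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // TreeSet is not thread-safe, so every access goes through the lock.
    private final TreeSet<String> sources = new TreeSet<>();

    /** Adding a region source mutates the tree, so it takes the writer lock. */
    void register(String name) {
        lock.writeLock().lock();
        try {
            sources.add(name);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Removing a region source mutates the tree, so it takes the writer lock. */
    void deregister(String name) {
        lock.writeLock().lock();
        try {
            sources.remove(name);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Plays the role of getMetrics: it only reads, so the reader lock suffices. */
    List<String> snapshot() {
        lock.readLock().lock();
        try {
            return new ArrayList<>(sources);
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        RegionSourceRegistry registry = new RegionSourceRegistry();
        Thread[] closers = new Thread[4];
        for (int t = 0; t < closers.length; t++) {
            final int id = t;
            // Each thread registers, reads, and deregisters its own names.
            closers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    String name = "region-" + id + "-" + i;
                    registry.register(name);
                    registry.snapshot();
                    registry.deregister(name);
                }
            });
            closers[t].start();
        }
        for (Thread closer : closers) {
            closer.join();
        }
        System.out.println("remaining sources: " + registry.snapshot().size());
    }
}
```

Several readers may hold the reader lock at once, while the writer lock excludes both readers and other writers, so two concurrent deregister calls can no longer modify the tree at the same time. Unlocking in a finally block keeps an exception in the guarded section from leaving the lock held.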
[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently
[ https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620100#comment-13620100 ]

Nicolas Liochon commented on HBASE-7871:

I've got some tests running, but I should be able to test it tomorrow.
[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently
[ https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620134#comment-13620134 ]

Elliott Clark commented on HBASE-7871:
--------------------------------------

Functionally it looks good though some comments about locking should be added.
[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently
[ https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620233#comment-13620233 ]

Hadoop QA commented on HBASE-7871:
----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12576633/7871-v3.txt
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100

{color:green}+1 site{color}. The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/5099//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5099//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5099//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5099//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5099//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5099//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5099//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5099//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5099//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/5099//console

This message is automatically generated.
[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently
[ https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620296#comment-13620296 ]

Hadoop QA commented on HBASE-7871:
----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12576647/7871-v4.txt
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100

{color:green}+1 site{color}. The mvn site goal succeeds with this patch.

{color:red}-1 core tests{color}. The patch failed these unit tests:
org.apache.hadoop.hbase.master.TestTableLockManager
org.apache.hadoop.hbase.security.access.TestAccessController

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/5101//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5101//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5101//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5101//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5101//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5101//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5101//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5101//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5101//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/5101//console

This message is automatically generated.
[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently
[ https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13620573#comment-13620573 ]

Anoop Sam John commented on HBASE-7871:
---------------------------------------

Thanks [~eclark] for clarifying the usage. Yes, I was also in favor of having a read-write lock in this case.
[~yuzhih...@gmail.com] I am +1 on the V4 patch. We will wait for the test from [~nkeywal].

HBase can be stuck when closing regions concurrently
----------------------------------------------------

Key: HBASE-7871
URL: https://issues.apache.org/jira/browse/HBASE-7871
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.96.0
Reporter: Nicolas Liochon
Assignee: Ted Yu
Priority: Critical
Fix For: 0.95.0, 0.98.0
Attachments: 7871.patch, 7871-v2.patch, 7871-v3.txt, 7871-v4.txt, s1.txt, TestStartStop.java
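For readers following the read-write lock discussion, here is a minimal sketch of what guarding a TreeSet that way could look like. It is not the content of 7871-v4.txt; the class name LockedRegionRegistry, its method names, and the use of String in place of a region metrics source are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for an aggregate metrics source; not HBase code. */
final class LockedRegionRegistry {
    // TreeSet is not thread-safe, so every access goes through the lock.
    private final TreeSet<String> regionSources = new TreeSet<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Structural change: needs the exclusive write lock. */
    public void register(String source) {
        lock.writeLock().lock();
        try {
            regionSources.add(source);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Structural change: needs the exclusive write lock. */
    public void deregister(String source) {
        lock.writeLock().lock();
        try {
            regionSources.remove(source);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Readers share the read lock and copy the set, so they never iterate the live tree. */
    public List<String> snapshot() {
        lock.readLock().lock();
        try {
            return new ArrayList<>(regionSources);
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        LockedRegionRegistry registry = new LockedRegionRegistry();
        registry.register("region-b");
        registry.register("region-a");
        registry.deregister("region-b");
        System.out.println(registry.snapshot());
    }
}
```

The write lock serializes register/deregister, which is the case that matters for concurrent region closes, while several readers can still take snapshots at the same time.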
[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently
[ https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13620614#comment-13620614 ]

stack commented on HBASE-7871:
------------------------------

[~liochon] Any luck w/ the testing?

Patch looks good to me. +1 on trunk and 0.95

HBase can be stuck when closing regions concurrently
----------------------------------------------------

Key: HBASE-7871
URL: https://issues.apache.org/jira/browse/HBASE-7871
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.96.0
Reporter: Nicolas Liochon
Assignee: Ted Yu
Priority: Critical
Fix For: 0.95.0, 0.98.0
Attachments: 7871.patch, 7871-v2.patch, 7871-v3.txt, 7871-v4.txt, s1.txt, TestStartStop.java
[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently
[ https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13620627#comment-13620627 ]

ramkrishna.s.vasudevan commented on HBASE-7871:
-----------------------------------------------

Got this error in
http://54.241.6.143/job/HBase-TRUNK/org.apache.hbase$hbase-server/74/testReport/junit/org.apache.hadoop.hbase.master/TestTableLockManager/testTableReadLock/
{code}
java.lang.NullPointerException
	at java.util.TreeMap.rotateRight(TreeMap.java:2057)
	at java.util.TreeMap.fixAfterDeletion(TreeMap.java:2199)
	at java.util.TreeMap.deleteEntry(TreeMap.java:2151)
	at java.util.TreeMap.remove(TreeMap.java:585)
	at java.util.TreeSet.remove(TreeSet.java:259)
	at org.apache.hadoop.hbase.regionserver.MetricsRegionAggregateSourceImpl.deregister(MetricsRegionAggregateSourceImpl.java:55)
	at org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.close(MetricsRegionSourceImpl.java:86)
	at org.apache.hadoop.hbase.regionserver.MetricsRegion.close(MetricsRegion.java:40)
	at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:962)
	at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:863)
	at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:147)
{code}
Hopefully this patch will fix this issue as well. I think this was the reason TestTableLockManager hung.

HBase can be stuck when closing regions concurrently
----------------------------------------------------

Key: HBASE-7871
URL: https://issues.apache.org/jira/browse/HBASE-7871
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.96.0
Reporter: Nicolas Liochon
Assignee: Ted Yu
Priority: Critical
Fix For: 0.95.1, 0.98.0
Attachments: 7871.patch, 7871-v2.patch, 7871-v3.txt, 7871-v4.txt, s1.txt, TestStartStop.java
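The failure above comes from several close handlers removing from one TreeSet at the same time. As a rough illustration of how a fix could be exercised, the sketch below races several workers that each remove their own entries from a shared sorted set and then checks that nothing is left behind. It is an assumption-laden stand-in, not the attached TestStartStop.java and not HBase code: the class ConcurrentCloseStress and its run method are invented here, and Collections.synchronizedSortedSet is used only as one simple way to make the set safe for concurrent add/remove.

```java
import java.util.Collections;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stress harness; names and structure are illustrative only. */
final class ConcurrentCloseStress {

    /** Returns the number of entries left after all workers finish; 0 means every removal took effect. */
    public static int run(int workers, int regionsPerWorker) {
        // The synchronized wrapper serializes add/remove on the underlying TreeSet.
        final SortedSet<String> sources =
            Collections.synchronizedSortedSet(new TreeSet<String>());
        for (int w = 0; w < workers; w++) {
            for (int r = 0; r < regionsPerWorker; r++) {
                sources.add(w + "-" + r);
            }
        }

        final CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int w = 0; w < workers; w++) {
            final int id = w;
            pool.execute(() -> {
                try {
                    start.await(); // release all workers together to maximize contention
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int r = 0; r < regionsPerWorker; r++) {
                    sources.remove(id + "-" + r);
                }
            });
        }

        start.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish; possible hang");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sources.size();
    }

    public static void main(String[] args) {
        System.out.println("entries left: " + run(8, 1000));
    }
}
```

Swapping the wrapper for a bare TreeSet turns this into a reproducer attempt for the kind of corruption reported in this thread, although such a race is not guaranteed to show up on any given run. Note also that the synchronized wrapper still requires callers to synchronize on the set manually while iterating over it.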
[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently
[ https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13615800#comment-13615800 ]

Ted Yu commented on HBASE-7871:
-------------------------------

[~eclark]:
Your opinion on this issue would be valuable.

HBase can be stuck when closing regions concurrently
----------------------------------------------------

Key: HBASE-7871
URL: https://issues.apache.org/jira/browse/HBASE-7871
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.96.0
Reporter: Nicolas Liochon
Assignee: Ted Yu
Priority: Critical
Fix For: 0.95.0, 0.98.0
Attachments: 7871.patch, 7871-v2.patch, s1.txt, TestStartStop.java
[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently
[ https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13614305#comment-13614305 ]

Anoop Sam John commented on HBASE-7871:
---------------------------------------

Thanks [~nkeywal] for testing.
How is MetricsRegionAggregateSourceImpl#getMetrics() being used? I have only one doubt: can iterating over the TreeSet concurrently with a register/deregister throw ConcurrentModificationException? Maybe [~eclark] can tell us about the usage.

HBase can be stuck when closing regions concurrently
----------------------------------------------------

Key: HBASE-7871
URL: https://issues.apache.org/jira/browse/HBASE-7871
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.96.0
Reporter: Nicolas Liochon
Assignee: Ted Yu
Priority: Critical
Fix For: 0.95.0, 0.98.0
Attachments: 7871.patch, 7871-v2.patch, s1.txt, TestStartStop.java
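On the ConcurrentModificationException question: TreeSet iterators are fail-fast, so a structural change made while an iteration is in progress is normally detected on the next call to next(). The sketch below shows this deterministically in a single thread. The class name and the string elements are invented for illustration and say nothing about how getMetrics() is actually called.

```java
import java.util.ConcurrentModificationException;
import java.util.TreeSet;

/** Hypothetical demo of the fail-fast TreeSet iterator; not HBase code. */
final class IterateWhileRemoving {

    /** Returns true if removing from the set in the middle of a for-each loop raised CME. */
    public static boolean removalDuringIterationFails() {
        TreeSet<String> sources = new TreeSet<>();
        sources.add("region-a");
        sources.add("region-b");
        sources.add("region-c");
        try {
            for (String current : sources) {
                // Structural change behind the iterator's back, like a deregister
                // arriving while the set is being walked.
                sources.remove("region-c");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            // The iterator noticed the modification count changed on its next step.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("CME thrown: " + removalDuringIterationFails());
    }
}
```

With real threads the check is best effort only, as the java.util Javadoc states: an unsynchronized reader may get the exception, or may observe a half-updated tree instead, which fits the NullPointerException in rotateRight and the thread spinning in fixAfterDeletion reported in this issue.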
[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently
[ https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13614360#comment-13614360 ]

Ted Yu commented on HBASE-7871:
-------------------------------

I used the following command at the root of trunk to see where getMetrics(MetricsBuilder metricsBuilder, boolean all) is called:

  find . -name '*.java' -exec grep 'getMetrics(' {} \; -print | grep -v 'getMetrics()'

I only saw references in tests, such as hbase-hadoop1-compat/src/test/java/org/apache/hadoop/hbase/test/MetricsAssertHelperImpl.java, etc.

HBase can be stuck when closing regions concurrently
----------------------------------------------------

Key: HBASE-7871
URL: https://issues.apache.org/jira/browse/HBASE-7871
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.96.0
Reporter: Nicolas Liochon
Assignee: Ted Yu
Priority: Critical
Fix For: 0.95.0, 0.98.0
Attachments: 7871.patch, 7871-v2.patch, s1.txt, TestStartStop.java