[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently

2013-04-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13622125#comment-13622125
 ] 

Hudson commented on HBASE-7871:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #476 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/476/])
HBASE-7871 HBase can be stuck when closing regions concurrently (Ted Yu) 
(Revision 1464012)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java
* 
/hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java


 HBase can be stuck when closing regions concurrently 
 -

 Key: HBASE-7871
 URL: https://issues.apache.org/jira/browse/HBASE-7871
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.0
Reporter: Nicolas Liochon
Assignee: Ted Yu
Priority: Critical
 Fix For: 0.95.1, 0.98.0

 Attachments: 7871.patch, 7871-v2.patch, 7871-v3.txt, 7871-v4.txt, 
 s1.txt, TestStartStop.java


 The attached test fails ~1% of the time on 0.96. It does not seem to 
 fail on 0.94.5. It's simple: a table creation and some puts.
 I attach the stack. The logs seem to say nothing.
 The suspicious part is:
 {noformat}
 RS_CLOSE_REGION-localhost,57575,1361197489166-2 prio=10 tid=0x7fb0c8775800 nid=0x61ac runnable [0x7fb09f272000]
 java.lang.Thread.State: RUNNABLE
  at java.util.TreeMap.fixAfterDeletion(TreeMap.java:2193)
  at java.util.TreeMap.deleteEntry(TreeMap.java:2151)
  at java.util.TreeMap.remove(TreeMap.java:585)
  at java.util.TreeSet.remove(TreeSet.java:259)
  at org.apache.hadoop.hbase.regionserver.MetricsRegionAggregateSourceImpl.deregister(MetricsRegionAggregateSourceImpl.java:55)
  at org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.close(MetricsRegionSourceImpl.java:86)
  at org.apache.hadoop.hbase.regionserver.MetricsRegion.close(MetricsRegion.java:40)
  at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1063)
  at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:969)
  - locked 0x0006944e2558 (a java.lang.Object)
  at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:146)
  at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:203)
  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  at java.lang.Thread.run(Thread.java:662)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently

2013-04-03 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620739#comment-13620739
 ] 

Anoop Sam John commented on HBASE-7871:
---

Yes Ram. This fix should solve this test failure also.



[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently

2013-04-03 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620871#comment-13620871
 ] 

Nicolas Liochon commented on HBASE-7871:


Tested: no errors in 250 tries, so +1 from me.
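
A repeated-run check of this kind could be sketched as below. This is only an illustration, not the attached TestStartStop.java: the class `ConcurrentCloseStress`, its methods, and the thread and iteration counts are hypothetical, and plain `synchronized` methods stand in for whatever locking the committed patch actually uses. Each run starts several threads that add and remove entries in a shared TreeSet, then waits with a timeout so that a hang shows up as a failed run rather than a stuck test.

```java
import java.util.TreeSet;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical harness; names are illustrative and not taken from HBase.
class ConcurrentCloseStress {
    private final TreeSet<Integer> sources = new TreeSet<>();

    // All access goes through the same monitor, so the tree is never
    // modified by two threads at once.
    synchronized void register(int id) { sources.add(id); }
    synchronized void deregister(int id) { sources.remove(id); }
    synchronized int size() { return sources.size(); }

    // Returns true if every worker finished within the timeout and the
    // set ended up empty; false if the run timed out or was interrupted.
    static boolean runOnce(int threads, int opsPerThread, long timeoutSeconds) {
        final ConcurrentCloseStress target = new ConcurrentCloseStress();
        final CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int base = t * opsPerThread;
            pool.execute(() -> {
                try {
                    start.await(); // release all workers together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < opsPerThread; i++) {
                    target.register(base + i);
                    target.deregister(base + i);
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            boolean finished = pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
            if (!finished) {
                pool.shutdownNow(); // best effort; a spinning thread may not stop
            }
            return finished && target.size() == 0;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            pool.shutdownNow();
            return false;
        }
    }
}
```

Calling `ConcurrentCloseStress.runOnce(4, 1000, 30)` in a loop of 250 iterations would mirror the number of tries reported here. A clean result only suggests the race is gone; it does not prove it.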



[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently

2013-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621491#comment-13621491
 ] 

Hudson commented on HBASE-7871:
---

Integrated in hbase-0.95 #121 (See 
[https://builds.apache.org/job/hbase-0.95/121/])
HBASE-7871 HBase can be stuck when closing regions concurrently (Ted Yu) 
(Revision 1464013)

 Result = SUCCESS
tedyu : 
Files : 
* 
/hbase/branches/0.95/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java
* 
/hbase/branches/0.95/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java




[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently

2013-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621498#comment-13621498
 ] 

Hudson commented on HBASE-7871:
---

Integrated in HBase-TRUNK #4009 (See 
[https://builds.apache.org/job/HBase-TRUNK/4009/])
HBASE-7871 HBase can be stuck when closing regions concurrently (Ted Yu) 
(Revision 1464012)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java
* 
/hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java




[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently

2013-04-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621775#comment-13621775
 ] 

Hudson commented on HBASE-7871:
---

Integrated in hbase-0.95-on-hadoop2 #53 (See 
[https://builds.apache.org/job/hbase-0.95-on-hadoop2/53/])
HBASE-7871 HBase can be stuck when closing regions concurrently (Ted Yu) 
(Revision 1464013)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/branches/0.95/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java
* 
/hbase/branches/0.95/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java




[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently

2013-04-02 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620054#comment-13620054
 ] 

Elliott Clark commented on HBASE-7871:
--

getMetrics is called from a thread spawned by the Hadoop metrics system. The 
Hadoop metrics system calls getMetrics to copy all of the values that a Source 
has. It's always a thread outside of the control of HBase.

* Anything we do in hadoop1-compat will probably have to be done in hadoop2-compat as well.
** https://github.com/apache/hbase/blob/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java#L49
** https://github.com/apache/hbase/blob/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java#L50
* I think we should go with reader-writer locks inside of the Aggregate source, 
where the actual manipulation of the tree map happens.
** getMetrics takes a reader lock
** Adding or removing region sources would take a writer lock
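
The locking scheme proposed above can be sketched roughly as follows. This is a minimal illustration, not the committed patch: `GuardedAggregateSource` is a hypothetical stand-in for the aggregate source, region sources are reduced to plain strings, and `getMetrics` simply returns a snapshot instead of writing to a Hadoop metrics collector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the aggregate source; not the actual HBase class.
class GuardedAggregateSource {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    // TreeSet is not thread-safe, so every access below holds the lock.
    private final TreeSet<String> regionSources = new TreeSet<>();

    // Adding a region source mutates the tree: take the writer lock.
    void register(String regionName) {
        lock.writeLock().lock();
        try {
            regionSources.add(regionName);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Removing a region source also mutates the tree: take the writer lock.
    void deregister(String regionName) {
        lock.writeLock().lock();
        try {
            regionSources.remove(regionName);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Called from the metrics thread: iterate under the reader lock so
    // that no add or remove can restructure the tree mid-iteration.
    List<String> getMetrics() {
        lock.readLock().lock();
        try {
            return new ArrayList<>(regionSources);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

With this shape, several metrics snapshots can proceed in parallel, while region open and close operations serialize against each other and against any snapshot in progress.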

 HBase can be stuck when closing regions concurrently 
 -

 Key: HBASE-7871
 URL: https://issues.apache.org/jira/browse/HBASE-7871
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.0
Reporter: Nicolas Liochon
Assignee: Ted Yu
Priority: Critical
 Fix For: 0.95.0, 0.98.0

 Attachments: 7871.patch, 7871-v2.patch, s1.txt, TestStartStop.java




[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently

2013-04-02 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620100#comment-13620100
 ] 

Nicolas Liochon commented on HBASE-7871:


I've got some tests running, but I should be able to test it tomorrow.

 HBase can be stuck when closing regions concurrently 
 -

 Key: HBASE-7871
 URL: https://issues.apache.org/jira/browse/HBASE-7871
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.0
Reporter: Nicolas Liochon
Assignee: Ted Yu
Priority: Critical
 Fix For: 0.95.0, 0.98.0

 Attachments: 7871.patch, 7871-v2.patch, 7871-v3.txt, s1.txt, 
 TestStartStop.java




[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently

2013-04-02 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620134#comment-13620134
 ] 

Elliott Clark commented on HBASE-7871:
--

Functionally it looks good, though some comments about the locking should be added.

 HBase can be stuck when closing regions concurrently 
 -

 Key: HBASE-7871
 URL: https://issues.apache.org/jira/browse/HBASE-7871
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.0
Reporter: Nicolas Liochon
Assignee: Ted Yu
Priority: Critical
 Fix For: 0.95.0, 0.98.0

 Attachments: 7871.patch, 7871-v2.patch, 7871-v3.txt, s1.txt, 
 TestStartStop.java




[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently

2013-04-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620233#comment-13620233
 ] 

Hadoop QA commented on HBASE-7871:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12576633/7871-v3.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5099//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5099//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5099//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5099//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5099//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5099//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5099//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5099//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5099//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5099//console

This message is automatically generated.

 HBase can be stuck when closing regions concurrently 
 -

 Key: HBASE-7871
 URL: https://issues.apache.org/jira/browse/HBASE-7871
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.0
Reporter: Nicolas Liochon
Assignee: Ted Yu
Priority: Critical
 Fix For: 0.95.0, 0.98.0

 Attachments: 7871.patch, 7871-v2.patch, 7871-v3.txt, s1.txt, 
 TestStartStop.java



[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently

2013-04-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620296#comment-13620296
 ] 

Hadoop QA commented on HBASE-7871:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12576647/7871-v4.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.master.TestTableLockManager
  org.apache.hadoop.hbase.security.access.TestAccessController

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5101//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5101//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5101//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5101//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5101//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5101//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5101//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5101//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5101//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5101//console

This message is automatically generated.


[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently

2013-04-02 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13620573#comment-13620573
 ] 

Anoop Sam John commented on HBASE-7871:
---

Thanks [~eclark] for clarifying the usage. Yes, I was also in favor of 
using a read-write lock in this case.
[~yuzhih...@gmail.com] I am +1 on the V4 patch. We will wait for the test results from 
[~nkeywal].
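
For illustration, the read-write lock idea discussed in this comment can be sketched as follows. This is a minimal sketch and not the contents of 7871-v4.txt: the class GuardedRegistry and its method names are hypothetical stand-ins, and only TreeSet and ReentrantReadWriteLock are real JDK classes. Mutations take the write lock, and readers copy the set under the read lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for an aggregate that tracks per-region sources.
// Names are illustrative and are not taken from the HBase patch.
class GuardedRegistry {
    private final TreeSet<String> sources = new TreeSet<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Writers take the exclusive write lock before touching the tree.
    void register(String source) {
        lock.writeLock().lock();
        try {
            sources.add(source);
        } finally {
            lock.writeLock().unlock();
        }
    }

    void deregister(String source) {
        lock.writeLock().lock();
        try {
            sources.remove(source);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers share the read lock and work on a copy, so no iterator
    // over the live set escapes the locked region.
    List<String> snapshot() {
        lock.readLock().lock();
        try {
            return new ArrayList<>(sources);
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        GuardedRegistry registry = new GuardedRegistry();
        for (int i = 0; i < 1000; i++) {
            registry.register("region-" + i);
        }
        // Four threads deregister disjoint slices concurrently.
        Thread[] closers = new Thread[4];
        for (int t = 0; t < closers.length; t++) {
            final int offset = t;
            closers[t] = new Thread(() -> {
                for (int i = offset; i < 1000; i += 4) {
                    registry.deregister("region-" + i);
                }
            });
            closers[t].start();
        }
        for (Thread c : closers) {
            c.join();
        }
        System.out.println(registry.snapshot().size()); // prints 0
    }
}
```

Returning a copy from snapshot() trades some allocation for a simple guarantee: callers never iterate the live TreeSet, so a later register or deregister cannot disturb them.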

 HBase can be stuck when closing regions concurrently 
 -

 Key: HBASE-7871
 URL: https://issues.apache.org/jira/browse/HBASE-7871
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.0
Reporter: Nicolas Liochon
Assignee: Ted Yu
Priority: Critical
 Fix For: 0.95.0, 0.98.0

 Attachments: 7871.patch, 7871-v2.patch, 7871-v3.txt, 7871-v4.txt, 
 s1.txt, TestStartStop.java


 The attached test fails ~1% of the time on 0.96. It seems it does not 
 fail on 0.94.5. It's simple: a table creation and some puts.
 I attach the stack. The logs say nothing, it seems.
 The suspicious part is:
 {noformat}
 RS_CLOSE_REGION-localhost,57575,1361197489166-2 prio=10 
 tid=0x7fb0c8775800 nid=0x61ac runnable [0x7fb09f272000]
java.lang.Thread.State: RUNNABLE
 at java.util.TreeMap.fixAfterDeletion(TreeMap.java:2193)
 at java.util.TreeMap.deleteEntry(TreeMap.java:2151)
 at java.util.TreeMap.remove(TreeMap.java:585)
 at java.util.TreeSet.remove(TreeSet.java:259)
 at 
 org.apache.hadoop.hbase.regionserver.MetricsRegionAggregateSourceImpl.deregister(MetricsRegionAggregateSourceImpl.java:55)
 at 
 org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.close(MetricsRegionSourceImpl.java:86)
 at 
 org.apache.hadoop.hbase.regionserver.MetricsRegion.close(MetricsRegion.java:40)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1063)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:969)
 - locked 0x0006944e2558 (a java.lang.Object)
 at 
 org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:146)
 at 
 org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:203)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently

2013-04-02 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13620614#comment-13620614
 ] 

stack commented on HBASE-7871:
--

[~liochon] Any luck w/ the testing?  Patch looks good to me.  +1 on trunk and 
0.95



[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently

2013-04-02 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13620627#comment-13620627
 ] 

ramkrishna.s.vasudevan commented on HBASE-7871:
---

Got this error in 
http://54.241.6.143/job/HBase-TRUNK/org.apache.hbase$hbase-server/74/testReport/junit/org.apache.hadoop.hbase.master/TestTableLockManager/testTableReadLock/
{code}
java.lang.NullPointerException
at java.util.TreeMap.rotateRight(TreeMap.java:2057)
at java.util.TreeMap.fixAfterDeletion(TreeMap.java:2199)
at java.util.TreeMap.deleteEntry(TreeMap.java:2151)
at java.util.TreeMap.remove(TreeMap.java:585)
at java.util.TreeSet.remove(TreeSet.java:259)
at 
org.apache.hadoop.hbase.regionserver.MetricsRegionAggregateSourceImpl.deregister(MetricsRegionAggregateSourceImpl.java:55)
at 
org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.close(MetricsRegionSourceImpl.java:86)
at 
org.apache.hadoop.hbase.regionserver.MetricsRegion.close(MetricsRegion.java:40)
at 
org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:962)
at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:863)
at 
org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:147)
{code}
I hope this patch fixes this issue as well. 
I think this was the reason TestTableLockManager hung.  
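
The trace above passes through an unsynchronized java.util.TreeSet.remove() call. The TreeSet Javadoc states that the class is not synchronized and must be synchronized externally when several threads modify it. As a hedged illustration only (this is not the patch attached to this issue, and the class and method names are hypothetical), the following sketch removes entries from several threads through Collections.synchronizedSortedSet, which is one of the external-synchronization options the JDK provides.

```java
import java.util.Collections;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical demo class; not part of HBase.
class SynchronizedDeregisterDemo {
    // Removes ids 0..ids-1 from the shared set using `threads` workers
    // and returns how many entries remain afterwards.
    static int closeConcurrently(SortedSet<Integer> shared, int ids, int threads) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int start = t;
            workers[t] = new Thread(() -> {
                // Each worker removes a disjoint slice of the ids.
                for (int id = start; id < ids; id += threads) {
                    shared.remove(id);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return shared.size();
    }

    public static void main(String[] args) {
        // The wrapper serializes every call on the underlying TreeSet.
        SortedSet<Integer> regions =
            Collections.synchronizedSortedSet(new TreeSet<Integer>());
        for (int id = 0; id < 10_000; id++) {
            regions.add(id);
        }
        System.out.println(closeConcurrently(regions, 10_000, 8)); // prints 0
    }
}
```

Passing a bare TreeSet to closeConcurrently instead would reintroduce the unsynchronized access; the outcome of that is undefined and can vary from run to run.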



[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently

2013-03-27 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13615800#comment-13615800
 ] 

Ted Yu commented on HBASE-7871:
---

[~eclark]:
Your opinion on this issue would be valuable.



[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently

2013-03-26 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13614305#comment-13614305
 ] 

Anoop Sam John commented on HBASE-7871:
---

Thanks [~nkeywal] for testing.
How is MetricsRegionAggregateSourceImpl#getMetrics() being used?
I have only one doubt: can iterating over the TreeSet concurrently with a 
register/deregister throw ConcurrentModificationException?  Maybe 
[~eclark] can explain the usage.
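
The concern about ConcurrentModificationException can be reproduced deterministically in a single thread, because TreeSet iterators are fail-fast. The sketch below is illustrative only: the class name and the "region-*" strings are hypothetical, and the remove() call inside the loop merely stands in for a deregister arriving during a scan. Across threads the same check is best-effort, so the exception is possible but not guaranteed.

```java
import java.util.ConcurrentModificationException;
import java.util.TreeSet;

// Hypothetical demo class; not part of HBase.
class IterateWhileDeregistering {
    // Returns true if removing an element while iterating triggers the
    // fail-fast check of the TreeSet iterator.
    static boolean removalDuringIterationFails() {
        TreeSet<String> sources = new TreeSet<>();
        sources.add("region-a");
        sources.add("region-b");
        sources.add("region-c");
        try {
            for (String source : sources) {
                // Stands in for a deregister() arriving in the middle of a scan.
                sources.remove("region-c");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            // The iterator noticed the structural change on its next step.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(removalDuringIterationFails()); // prints true
    }
}
```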



[jira] [Commented] (HBASE-7871) HBase can be stuck when closing regions concurrently

2013-03-26 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13614360#comment-13614360
 ] 

Ted Yu commented on HBASE-7871:
---

I used the following command at the root of trunk to see where 
getMetrics(MetricsBuilder metricsBuilder, boolean all) is called:

find . -name '*.java' -exec grep 'getMetrics(' {} \; -print | grep -v 'getMetrics()'

I only saw references in tests, such as 
hbase-hadoop1-compat/src/test/java/org/apache/hadoop/hbase/test/MetricsAssertHelperImpl.java,
 etc.
