[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-03-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918945#comment-13918945
 ] 

Hudson commented on HBASE-10632:


FAILURE: Integrated in hbase-0.96-hadoop2 #223 (See 
[https://builds.apache.org/job/hbase-0.96-hadoop2/223/])
HBASE-10632 Region lost in limbo after ArrayIndexOutOfBoundsException during 
assignment (enis: rev 1573727)
* /hbase/branches/0.96
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
* 
/hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBaseLoadBalancer.java


> Region lost in limbo after ArrayIndexOutOfBoundsException during assignment
> ---
>
> Key: HBASE-10632
> URL: https://issues.apache.org/jira/browse/HBASE-10632
> Project: HBase
> Issue Type: Bug
> Components: Region Assignment
> Affects Versions: hbase-10070
> Reporter: Nick Dimiduk
> Assignee: Enis Soztutar
> Fix For: 0.96.2, 0.98.1, 0.99.0, hbase-10070
>
> Attachments: hbase-10632_v1.patch
>
>
> Discovered while running IntegrationTestBigLinkedList. Region 
> 24d68aa7239824e42390a77b7212fcbf is scheduled for move from hor13n19 to 
> hor13n13. During the process an exception is thrown.
> {noformat}
> 2014-02-25 15:30:42,613 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
> master.RegionStates: Transitioning {24d68aa7239824e42390a77b7212fcbf 
> state=OPENING, ts=1393342207107, 
> server=hor13n19.gq1.ygridcore.net,60020,1393341563552} will be handled by SSH 
> for hor13n19.gq1.ygridcore.net,60020,1393341563552
> 2014-02-25 15:30:42,613 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
> handler.ServerShutdownHandler: Reassigning 7 region(s) that 
> hor13n19.gq1.ygridcore.net,60020,1393341563552 was carrying (and 0 regions(s) 
> that were opening on this server)
> 2014-02-25 15:30:42,613 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
> handler.ServerShutdownHandler: Reassigning region with rs = 
> {24d68aa7239824e42390a77b7212fcbf state=OPENING, ts=1393342207107, 
> server=hor13n19.gq1.ygridcore.net,60020,1393341563552} and deleting zk node 
> if exists
> 2014-02-25 15:30:42,623 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
> master.RegionStates: Transitioned {24d68aa7239824e42390a77b7212fcbf 
> state=OPENING, ts=1393342207107, 
> server=hor13n19.gq1.ygridcore.net,60020,1393341563552} to 
> {24d68aa7239824e42390a77b7212fcbf state=OFFLINE, ts=1393342242623, 
> server=hor13n19.gq1.ygridcore.net,60020,1393341563552}
> 2014-02-25 15:30:42,623 DEBUG [AM.ZK.Worker-pool2-t46] 
> master.AssignmentManager: Znode 
> IntegrationTestBigLinkedList,\x80\x06\x1A,1393342105093.24d68aa7239824e42390a77b7212fcbf.
>  deleted, state: {24d68aa7239824e42390a77b7212fcbf state=OFFLINE, 
> ts=1393342242623, server=hor13n19.gq1.ygridcore.net,60020,1393341563552}
> ...
> 2014-02-25 15:30:43,993 ERROR [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
> executor.EventHandler: Caught throwable while processing event 
> M_SERVER_SHUTDOWN
> java.lang.ArrayIndexOutOfBoundsException: 0
>   at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.<init>(BaseLoadBalancer.java:250)
>   at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.createCluster(BaseLoadBalancer.java:921)
>   at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.roundRobinAssignment(BaseLoadBalancer.java:860)
>   at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2482)
>   at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:282)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:722)
> {noformat}
> After that, region is left in limbo and is never reassigned.
> {noformat}
> 2014-02-25 15:35:11,581 INFO  [FifoRpcScheduler.handler1-thread-6] 
> master.HMaster: Client=hrt_qa//68.142.246.29 move 
> hri=IntegrationTestBigLinkedList,\x80\x06\x1A,1393342105093.24d68aa7239824e42390a77b7212fcbf.,
>  src=hor13n19.gq1.ygridcore.net,60020,1393341563552, 
> dest=hor13n13.gq1.ygridcore.net,60020,139334275, running balancer
> 2014-02-25 15:35:11,581 INFO  [FifoRpcScheduler.handler1-thread-6] 
> master.AssignmentManager: Ignored moving region not assigned: {ENCODED => 
> 24d68aa7239824e42390a77b7212fcbf, NAME => 
> 'IntegrationTestBigLinkedList,\x80\x06\x1A,1393342105093.24d68aa7239824e423

[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-03-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918946#comment-13918946
 ] 

Hudson commented on HBASE-10632:


SUCCESS: Integrated in hbase-0.96 #324 (See 
[https://builds.apache.org/job/hbase-0.96/324/])
HBASE-10632 Region lost in limbo after ArrayIndexOutOfBoundsException during 
assignment (enis: rev 1573727)
* /hbase/branches/0.96
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
* 
/hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBaseLoadBalancer.java



[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-03-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918926#comment-13918926
 ] 

Hudson commented on HBASE-10632:


FAILURE: Integrated in HBase-TRUNK-on-Hadoop-1.1 #106 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/106/])
HBASE-10632 Region lost in limbo after ArrayIndexOutOfBoundsException during 
assignment (enis: rev 1573723)
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBaseLoadBalancer.java



[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-03-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918918#comment-13918918
 ] 

Hudson commented on HBASE-10632:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #184 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/184/])
HBASE-10632 Region lost in limbo after ArrayIndexOutOfBoundsException during 
assignment (enis: rev 1573725)
* /hbase/branches/0.98
* 
/hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
* 
/hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
* 
/hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBaseLoadBalancer.java



[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-03-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918868#comment-13918868
 ] 

Hudson commented on HBASE-10632:


FAILURE: Integrated in HBase-0.98 #196 (See 
[https://builds.apache.org/job/HBase-0.98/196/])
HBASE-10632 Region lost in limbo after ArrayIndexOutOfBoundsException during 
assignment (enis: rev 1573725)
* /hbase/branches/0.98
* 
/hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
* 
/hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
* 
/hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBaseLoadBalancer.java



[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-03-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918713#comment-13918713
 ] 

Hudson commented on HBASE-10632:


FAILURE: Integrated in HBase-TRUNK #4973 (See 
[https://builds.apache.org/job/HBase-TRUNK/4973/])
HBASE-10632 Region lost in limbo after ArrayIndexOutOfBoundsException during 
assignment (enis: rev 1573723)
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBaseLoadBalancer.java



[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-03-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918333#comment-13918333
 ] 

stack commented on HBASE-10632:
---

Skimmed patch.  lgtm.  +1 for 0.96.  Thanks.

> Region lost in limbo after ArrayIndexOutOfBoundsException during assignment
> ---
>
> Key: HBASE-10632
> URL: https://issues.apache.org/jira/browse/HBASE-10632
> Project: HBase
> Issue Type: Bug
> Components: Region Assignment
> Affects Versions: hbase-10070
> Reporter: Nick Dimiduk
> Assignee: Enis Soztutar
> Fix For: 0.96.2, 0.98.1, 0.99.0, hbase-10070
>
> Attachments: hbase-10632_v1.patch
>
>
> Discovered while running IntegrationTestBigLinkedList. Region 
> 24d68aa7239824e42390a77b7212fcbf is scheduled for move from hor13n19 to 
> hor13n13. During the process an exception is thrown.
> {noformat}
> 2014-02-25 15:30:42,613 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
> master.RegionStates: Transitioning {24d68aa7239824e42390a77b7212fcbf 
> state=OPENING, ts=1393342207107, 
> server=hor13n19.gq1.ygridcore.net,60020,1393341563552} will be handled by SSH 
> for hor13n19.gq1.ygridcore.net,60020,1393341563552
> 2014-02-25 15:30:42,613 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
> handler.ServerShutdownHandler: Reassigning 7 region(s) that 
> hor13n19.gq1.ygridcore.net,60020,1393341563552 was carrying (and 0 regions(s) 
> that were opening on this server)
> 2014-02-25 15:30:42,613 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
> handler.ServerShutdownHandler: Reassigning region with rs = 
> {24d68aa7239824e42390a77b7212fcbf state=OPENING, ts=1393342207107, 
> server=hor13n19.gq1.ygridcore.net,60020,1393341563552} and deleting zk node 
> if exists
> 2014-02-25 15:30:42,623 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
> master.RegionStates: Transitioned {24d68aa7239824e42390a77b7212fcbf 
> state=OPENING, ts=1393342207107, 
> server=hor13n19.gq1.ygridcore.net,60020,1393341563552} to 
> {24d68aa7239824e42390a77b7212fcbf state=OFFLINE, ts=1393342242623, 
> server=hor13n19.gq1.ygridcore.net,60020,1393341563552}
> 2014-02-25 15:30:42,623 DEBUG [AM.ZK.Worker-pool2-t46] 
> master.AssignmentManager: Znode 
> IntegrationTestBigLinkedList,\x80\x06\x1A,1393342105093.24d68aa7239824e42390a77b7212fcbf.
>  deleted, state: {24d68aa7239824e42390a77b7212fcbf state=OFFLINE, 
> ts=1393342242623, server=hor13n19.gq1.ygridcore.net,60020,1393341563552}
> ...
> 2014-02-25 15:30:43,993 ERROR [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
> executor.EventHandler: Caught throwable while processing event 
> M_SERVER_SHUTDOWN
> java.lang.ArrayIndexOutOfBoundsException: 0
>   at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.<init>(BaseLoadBalancer.java:250)
>   at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.createCluster(BaseLoadBalancer.java:921)
>   at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.roundRobinAssignment(BaseLoadBalancer.java:860)
>   at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2482)
>   at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:282)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:722)
> {noformat}
> After that, region is left in limbo and is never reassigned.
> {noformat}
> 2014-02-25 15:35:11,581 INFO  [FifoRpcScheduler.handler1-thread-6] 
> master.HMaster: Client=hrt_qa//68.142.246.29 move 
> hri=IntegrationTestBigLinkedList,\x80\x06\x1A,1393342105093.24d68aa7239824e42390a77b7212fcbf.,
>  src=hor13n19.gq1.ygridcore.net,60020,1393341563552, 
> dest=hor13n13.gq1.ygridcore.net,60020,139334275, running balancer
> 2014-02-25 15:35:11,581 INFO  [FifoRpcScheduler.handler1-thread-6] 
> master.AssignmentManager: Ignored moving region not assigned: {ENCODED => 
> 24d68aa7239824e42390a77b7212fcbf, NAME => 
> 'IntegrationTestBigLinkedList,\x80\x06\x1A,1393342105093.24d68aa7239824e42390a77b7212fcbf.',
>  STARTKEY => '\x80\x06\x1A', ENDKEY => ''}, {24d68aa7239824e42390a77b7212fcbf 
> state=OFFLINE, ts=1393342242623, 
> server=hor13n19.gq1.ygridcore.net,60020,1393341563552}
> ...
> 2014-02-25 15:35:26,586 DEBUG 
> [hor13n12.gq1.ygridcore.net,6,1393341917402-BalancerChore] 
> master.HMaster: Not running balancer because 1 region(s) in transition: 
> {24d68aa7239824e42390a77b7212fcbf={24d68aa7239824e42390a77b7212fcbf 
> state=OFFLINE, ts=1393342242623, 
> server=hor13n19.gq1.ygridcore.net,60020,1393341563552}}
> ...
> 2014

[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-03-02 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917721#comment-13917721
 ] 

Andrew Purtell commented on HBASE-10632:


Thanks Enis, very much appreciated


[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-03-02 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917718#comment-13917718
 ] 

Enis Soztutar commented on HBASE-10632:
---

bq. Will there be patches here or should we have backport issues?
Attached patch applies to trunk, 0.98 and 0.96. 
Will commit it tomorrow. 


[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-02-28 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13916759#comment-13916759
 ] 

Andrew Purtell commented on HBASE-10632:


bq. This should affect 0.98 and 0.96 code lines as well.

Will there be patches here or should we have backport issues?


[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-02-28 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13916174#comment-13916174
 ] 

Enis Soztutar commented on HBASE-10632:
---

This should affect 0.98 and 0.96 code lines as well. 


[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-02-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13915501#comment-13915501
 ] 

Hadoop QA commented on HBASE-10632:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12631658/hbase-10632_v1.patch
  against trunk revision .
  ATTACHMENT ID: 12631658

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+  regionsPerServer[serverIndex] = new int[entry.getValue().size() + regionsPerServer[serverIndex].length];
+(serversToIndex.get(loc.get(i).getHostAndPort()) == null ? -1 : serversToIndex.get(loc.get(i).getHostAndPort()));

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8845//console

This message is automatically generated.

> Region lost in limbo after ArrayIndexOutOfBoundsException during assignment
> ---
>
> Key: HBASE-10632
> URL: https://issues.apache.org/jira/browse/HBASE-10632
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: hbase-10070
>Reporter: Nick Dimiduk
>Assignee: Enis Soztutar
> Fix For: 0.99.0, hbase-10070
>
> Attachments: hbase-10632_v1.patch

[jira] [Commented] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-02-27 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13915439#comment-13915439
 ] 

Ted Yu commented on HBASE-10632:


lgtm
nit:
{code}
+(serversToIndex.get(loc.get(i).getHostAndPort()) == null ? -1 : serversToIndex.get(loc.get(i).getHostAndPort()));
{code}
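
The nit is not spelled out in the comment. One plausible reading is that the quoted expression looks up the same key twice. A minimal sketch of a single-lookup alternative follows, assuming serversToIndex is a Map<String, Integer> (its declared type is not shown in this thread). The class and method names are hypothetical and are not taken from the patch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of BaseLoadBalancer.
final class ServerIndexLookup {

    // Returns the index registered for hostAndPort, or -1 when the host is unknown.
    // The map is consulted once and the boxed result is reused.
    static int indexOf(Map<String, Integer> serversToIndex, String hostAndPort) {
        Integer index = serversToIndex.get(hostAndPort);
        return index == null ? -1 : index;
    }

    // Resolves every host in the list, keeping -1 for hosts that are not in the map.
    static int[] indicesOf(Map<String, Integer> serversToIndex, List<String> hosts) {
        int[] result = new int[hosts.size()];
        for (int i = 0; i < hosts.size(); i++) {
            result[i] = indexOf(serversToIndex, hosts.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> serversToIndex = new HashMap<>();
        serversToIndex.put("hor13n13:60020", 0); // assumed host:port key format

        List<String> hosts = Arrays.asList("hor13n13:60020", "hor13n19:60020");
        System.out.println(Arrays.toString(indicesOf(serversToIndex, hosts)));
    }
}
```

Pulling the lookup into a local variable also shortens the line, which would address the lineLengths warning for the same expression. On Java 8 or later, Map.getOrDefault(key, -1) expresses the same fallback in a single call.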

> Region lost in limbo after ArrayIndexOutOfBoundsException during assignment
> ---
>
> Key: HBASE-10632
> URL: https://issues.apache.org/jira/browse/HBASE-10632
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: hbase-10070
>Reporter: Nick Dimiduk
>Assignee: Enis Soztutar
> Fix For: 0.99.0, hbase-10070
>
> Attachments: hbase-10632_v1.patch