[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2020-11-17 Thread Truong Duc Kien (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17233520#comment-17233520
 ] 

Truong Duc Kien commented on HBASE-20552:
-

Possibly related issue (already fixed):

https://issues.apache.org/jira/browse/HBASE-21421

> HBase RegionServer was shutdown due to UnexpectedStateException
> ---
>
> Key: HBASE-20552
> URL: https://issues.apache.org/jira/browse/HBASE-20552
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Romil Choksi
>Assignee: Umesh Agashe
>Priority: Critical
> Attachments: 
> 102143-master-ctr-e138-1518143905142-279227-01-03.hwx.site.log, 
> 102143-master-ctr-e138-1518143905142-279227-01-05.hwx.site.log, 
> 102143-regionserver-ctr-e138-1518143905142-279227-01-02.hwx.site.log, 
> 102143-regionserver-ctr-e138-1518143905142-279227-01-07.hwx.site.log, 
> 102143-regionserver-ctr-e138-1518143905142-279227-01-08.hwx.site.log
>
>
> This was observed during cluster testing (source code sync'ed with hbase-2.0, 
> built May 2nd):
> {code}
> 2018-05-02 05:44:10,089 ERROR 
> [RpcServer.default.FPBQ.Fifo.handler=28,queue=1,port=2] 
> master.MasterRpcServices: Region server 
> ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 reported 
> a fatal error:
> * ABORTING region server 
> ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474: 
> org.apache.hadoop.hbase.YouAreDeadException: rit=OPEN, location=ctr-e138- 
> 1518143905142-279227-01-07.hwx.site,16020,1525239609353, 
> table=test_hbase_ha_load_test_tool_hbase, 
> region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on server=ctr-e138-  
> 1518143905142-279227-01-02.hwx.site,16020,1525239334474 but state has 
> otherwise.
>   at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1065)
>   at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.reportOnlineRegions(AssignmentManager.java:987)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.regionServerReport(MasterRpcServices.java:459)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:13118)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> Caused by: org.apache.hadoop.hbase.exceptions.UnexpectedStateException: 
> rit=OPEN, 
> location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353,
>  table=test_hbase_ha_load_test_tool_hbase, 
> region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on 
> server=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 
> but state  has otherwise.
>   at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1037)
>   ... 7 more
>  *
> Cause:
> org.apache.hadoop.hbase.YouAreDeadException: 
> org.apache.hadoop.hbase.YouAreDeadException: rit=OPEN, 
> location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353,
>table=test_hbase_ha_load_test_tool_hbase, 
> region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on 
> server=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 
> but state  has otherwise.
>   at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1065)
>   at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.reportOnlineRegions(AssignmentManager.java:987)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.regionServerReport(MasterRpcServices.java:459)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:13118)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> Caused by: org.apache.hadoop.hbase.exceptions.UnexpectedStateException: 
> rit=OPEN, 
> location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353,
>  table=test_hbase_ha_load_test_tool_hbase, 
> region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on 
> server=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 
> but state  has 

[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2020-07-10 Thread Josh Elser (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155669#comment-17155669
 ] 

Josh Elser commented on HBASE-20552:


[~shenshengli], what version are you on, by chance? I haven't seen this 
recently.

[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2020-06-28 Thread shenshengli (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147512#comment-17147512
 ] 

shenshengli commented on HBASE-20552:
-

I reproduced the problem in my own environment, with more than 10,000 regions 
on each RS. Raising 'hbase.regionserver.msginterval' from 3 s to 30 s greatly 
reduces the risk of the problem. Conversely, if you lower it from 3 s to below 
1 s, the problem is almost certain to occur.
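
For reference, a minimal sketch of the setting described above. It assumes the 
property is set in hbase-site.xml on the region servers and that its value is 
in milliseconds; the 30000 figure simply mirrors the 30 s mentioned above and 
is not a tuned recommendation.

```xml
<!-- hbase-site.xml: illustrative fragment only -->
<property>
  <name>hbase.regionserver.msginterval</name>
  <!-- assumed to be milliseconds; 30000 = 30 s, up from 3 s -->
  <value>30000</value>
</property>
```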

[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-21 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16482748#comment-16482748
 ] 

Josh Elser commented on HBASE-20552:


Just to give you all an update, Ted and I have both made some internal changes 
to HBase to try to get some more insight around this if it happens again.

I ran through a dozen or so test scenarios end of last week, none of which 
showed this again. I'm apt to close this one as CannotRepro for now.

I can try to bring this extra debugging stuff out to Apache if y'all think it 
would be beneficial.

[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-18 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480777#comment-16480777
 ] 

Josh Elser commented on HBASE-20552:


Ok! Thanks for the info, Umesh. Glad we both worked towards the same 
conclusion. I feel a little bit better knowing that at least we think we did 
the right thing in HBase.

[~stack], my understanding is that HDFS is 3.1.0ish. I'm not sure, but your 
thinking does seem reasonable. I have a system up trying to get a live 
environment (so I can poke the pv2 WAL), but that's also been unsuccessful for 
me.

[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-17 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479955#comment-16479955
 ] 

stack commented on HBASE-20552:
---

Long shot: Something up w/ your HDFS over there [~elserj] and crew where lease 
recovery is dropping the end of the WAL? Doing some bad math on file length?  
It does it for two different logs here... The Master WAL Proc and the 
regionserver hosting hbase:meta's WAL. Some interesting version of HDFS? Thanks.

[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-17 Thread Umesh Agashe (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479831#comment-16479831
 ] 

Umesh Agashe commented on HBASE-20552:
--

[~elserj], I don't have a repro. I thought I had one, but it was due to a bug 
that was inadvertently introduced in a recent commit and fixed in an addendum 
(HBASE-20564). So far I have found 2 instances of missing edits around the 
same time: first, in the master proc WAL, where 003 is not able to read pids 
468 onwards; and second, in the meta region:

pid=475 on 005 started with:
{code:java}
2018-05-02 05:39:45,811 INFO  [PEWorker-6] assignment.AssignProcedure: Starting 
pid=475, ppid=471, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure 
table=test_hbase_ha_load_test_tool_hbase, 
region=94f6ca283dbb4445b2bcdc321b734d28; rit=OFFLINE, 
location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502; 
forceNewPlan=false, retain=true
{code}
After this it was updated twice on 005:
{code:java}
2018-05-02 05:39:45,983 INFO  [PEWorker-1] assignment.RegionStateStore: pid=475 
updating hbase:meta row=94f6ca283dbb4445b2bcdc321b734d28, regionState=OPENING
2018-05-02 05:39:46,580 INFO  [PEWorker-1] assignment.RegionStateStore: pid=475 
updating hbase:meta row=94f6ca283dbb4445b2bcdc321b734d28, regionState=OPEN, 
openSeqNum=13401, 
regionLocation=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474
{code}
But when 003 read and printed meta, it has:
{code:java}
2018-05-02 05:44:08,236 INFO  
[master/ctr-e138-1518143905142-279227-01-03:2] 
assignment.RegionStateStore: Load hbase:meta entry 
region=94f6ca283dbb4445b2bcdc321b734d28, regionState=OPEN, 
lastHost=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474, 
regionLocation=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502
{code}
The location server, including its timestamp, matches the one logged when 
pid=471 started: 
"location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502".
So 2 updates from pid=471 to meta are missing.
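
To make the mismatch concrete, here is a minimal Python sketch (not part of 
HBase; the helper names region_location and same_location are made up for 
illustration). It extracts the regionLocation field from the two log excerpts 
above and compares them. The log strings are copied from this comment with the 
line wraps removed.

```python
import re

# Log excerpts copied from the comment above (line wraps removed).
META_UPDATE_ON_005 = (
    "pid=475 updating hbase:meta row=94f6ca283dbb4445b2bcdc321b734d28, "
    "regionState=OPEN, openSeqNum=13401, "
    "regionLocation=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474"
)
META_LOADED_ON_003 = (
    "Load hbase:meta entry region=94f6ca283dbb4445b2bcdc321b734d28, "
    "regionState=OPEN, "
    "lastHost=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474, "
    "regionLocation=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502"
)

# A server name in these logs has the form host,port,startcode.
SERVER_NAME = r"[\w.-]+,\d+,\d+"


def region_location(log_line):
    """Return the regionLocation server name in a log line, or None."""
    match = re.search(r"regionLocation=(" + SERVER_NAME + r")", log_line)
    return match.group(1) if match else None


def same_location(written_line, loaded_line):
    """True only when both lines carry the same host, port and startcode."""
    written = region_location(written_line)
    loaded = region_location(loaded_line)
    return written is not None and written == loaded


if __name__ == "__main__":
    print("written by 005:", region_location(META_UPDATE_ON_005))
    print("loaded by 003: ", region_location(META_LOADED_ON_003))
    print("consistent:", same_location(META_UPDATE_ON_005, META_LOADED_ON_003))
```

Run against these two excerpts, the comparison reports that the location 
written on 005 and the location later loaded on 003 differ in both host and 
startcode.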

[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-17 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479767#comment-16479767
 ] 

Josh Elser commented on HBASE-20552:


Trying to help pick this one up too..
{quote}bq. On M005, pid=471 is SCP for R007 which also hosts meta. Meta is 
re-assigned with pid=472 to R002 which is followed by other region assignments
{quote}
I'm coming to think this is our problem, too.

pid=471 is an SCP for r007 from pv2-004.log which finished at 05:39:47,288 on 
m005. When m003 takes over and reads the tracker from pv2-002.log, the largest 
pid we have is pid=467.

My hunch (which I need to back up with code) is that because m003 never sees 
the completed SCP, it thinks that r002 is holding this region (overriding what 
meta says, maybe?), claiming it to be on r007 instead. The following is the 
"largest" proc from the pv2-004 log that m003 reads.
{noformat}
2018-05-02 05:43:33,876 DEBUG 
[master/ctr-e138-1518143905142-279227-01-03:2] 
procedure2.ProcedureExecutor: Completed pid=465, state=SUCCESS; 
MoveRegionProcedure hri=94f6ca283dbb4445b2bcdc321b734d28, 
source=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474, 
destination=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502
{noformat}
Then, m003 initializes RegionStateStore, saying:
{noformat}
2018-05-02 05:44:08,236 INFO  
[master/ctr-e138-1518143905142-279227-01-03:2] 
assignment.RegionStateStore: Load hbase:meta entry 
region=94f6ca283dbb4445b2bcdc321b734d28, regionState=OPEN, 
lastHost=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474, 
regionLocation=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502
{noformat}
This makes me wonder if Umesh's findings about pid=507 (SCP for r007 putting 
the region back on r007) are related...

You get anywhere on a repro, [~uagashe]? I have some nodes running through this 
internal scenario which has triggered this before. Might try my hand at 
repro'ing in an IT, but unsure how hard that will be ;)
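
For anyone following the stack trace in the description, the check that aborted r002 can be pictured roughly like this. It is a simplified sketch, not the actual AssignmentManager code: the class OnlineReportCheckSketch, its methods, and the use of IllegalStateException are assumptions made for illustration. The master holds one recorded location per region and rejects a region server report that claims a region the master has placed elsewhere.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, heavily simplified model of a master-side online-region report check. */
final class OnlineReportCheckSketch {
    // Region encoded name -> server the master believes is hosting it.
    private final Map<String, String> locationByRegion = new HashMap<>();

    void recordLocation(String region, String server) {
        locationByRegion.put(region, server);
    }

    /** Throws if a server reports a region as OPEN while the master has it elsewhere. */
    void checkReport(String reportingServer, String region) {
        String recorded = locationByRegion.get(region);
        if (recorded != null && !recorded.equals(reportingServer)) {
            throw new IllegalStateException("region=" + region
                + " reported OPEN on server=" + reportingServer
                + " but master has location=" + recorded);
        }
    }

    public static void main(String[] args) {
        OnlineReportCheckSketch master = new OnlineReportCheckSketch();
        String region = "94f6ca283dbb4445b2bcdc321b734d28";
        // State the new master ends up with: the region is on the restarted r007.
        master.recordLocation(region,
            "ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353");
        try {
            // r002 still serves the region and says so in its periodic report.
            master.checkReport(
                "ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474", region);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this picture the region server is not at fault. The report is rejected because the master's view went stale, which fits the idea that the new master missed completed procedures.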


[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-16 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478047#comment-16478047
 ] 

Ted Yu commented on HBASE-20552:


The tests were run using branch-2.0 code


[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-16 Thread Umesh Agashe (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478044#comment-16478044
 ] 

Umesh Agashe commented on HBASE-20552:
--

[~yuzhih...@gmail.com], just want to confirm: did you see this on branch-2.0 or 
master?


[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16473180#comment-16473180
 ] 

Ted Yu commented on HBASE-20552:


w.r.t. the warning from ProcWal, I saw the following in a successful run 
(another test run):
{code}
2018-05-09 01:39:58,463 INFO  
[master/ctr-e138-1518143905142-296213-01-03:2] 
wal.ProcedureWALFormatReader: Rebuilding tracker for 
hdfs://mycluster/apps/hbase/data/MasterProcWALs/pv2-0001.log
2018-05-09 01:39:58,550 WARN  
[master/ctr-e138-1518143905142-296213-01-03:2] 
wal.ProcedureWALFormatReader: Nothing left to decode. Exiting with missing EOF, 
log=hdfs://mycluster/apps/hbase/data/MasterProcWALs/pv2-0001.log
2018-05-09 01:39:58,659 DEBUG 
[master/ctr-e138-1518143905142-296213-01-03:2] 
procedure2.ProcedureExecutor: Completed pid=40, state=SUCCESS; 
ServerCrashProcedure 
server=ctr-e138-1518143905142-296213-01-03.hwx.site,16020,1525829363193, 
splitWal=true, meta=false
{code}
I am not sure if the 'Nothing left to decode' was related to the cause of this 
issue (unexpected state).


[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-11 Thread Umesh Agashe (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472885#comment-16472885
 ] 

Umesh Agashe commented on HBASE-20552:
--

I think it's a real problem in the code. Working on a repro and a patch.


[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-11 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472867#comment-16472867
 ] 

stack commented on HBASE-20552:
---

Go [~uagashe]!

Usually we'll complain if we fail to read procedures from Master WAL. Any 
evidence of us skipping Procedure steps?

(Seems like a good one!!)


[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-11 Thread Umesh Agashe (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472482#comment-16472482
 ] 

Umesh Agashe commented on HBASE-20552:
--

Further, M003 starts SCP with pid=507 for R007:
{code:java}
2018-05-02 05:44:08,413 INFO  [PEWorker-6] procedure.ServerCrashProcedure: 
Start pid=507, state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure 
server=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502, 
splitWal=true, meta=false{code}
This starts AssignProcedure with pid=508 for region 
94f6ca283dbb4445b2bcdc321b734d28:
{code:java}
2018-05-02 05:44:08,480 INFO  [PEWorker-6] assignment.AssignProcedure: Starting 
pid=508, ppid=507, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure 
table=test_hbase_ha_load_test_tool_hbase, 
region=94f6ca283dbb4445b2bcdc321b734d28; rit=OFFLINE, 
location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502; 
forceNewPlan=false, retain=true
2018-05-02 05:44:08,659 INFO  [PEWorker-11] assignment.RegionStateStore: 
pid=508 updating hbase:meta row=94f6ca283dbb4445b2bcdc321b734d28, 
regionState=OPENING, 
regionLocation=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353
2018-05-02 05:44:08,727 INFO  [PEWorker-11] 
assignment.RegionTransitionProcedure: Dispatch pid=508, ppid=507, 
state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure 
table=test_hbase_ha_load_test_tool_hbase, 
region=94f6ca283dbb4445b2bcdc321b734d28; rit=OPENING, 
location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353
...
2018-05-02 05:44:09,213 DEBUG 
[RpcServer.default.FPBQ.Fifo.handler=28,queue=1,port=2] 
assignment.RegionTransitionProcedure: Received report OPENED seqId=13402, 
pid=508, ppid=507, state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure 
table=test_hbase_ha_load_test_tool_hbase, 
region=94f6ca283dbb4445b2bcdc321b734d28; rit=OPENING, 
location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353
2018-05-02 05:44:09,213 DEBUG [PEWorker-12] 
assignment.RegionTransitionProcedure: Finishing pid=508, ppid=507, 
state=RUNNABLE:REGION_TRANSITION_FINISH; AssignProcedure 
table=test_hbase_ha_load_test_tool_hbase, 
region=94f6ca283dbb4445b2bcdc321b734d28; rit=OPENING, 
location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353
2018-05-02 05:44:09,214 INFO [PEWorker-12] assignment.RegionStateStore: pid=508 
updating hbase:meta row=94f6ca283dbb4445b2bcdc321b734d28, regionState=OPEN, 
openSeqNum=13402, 
regionLocation=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353
2018-05-02 05:44:09,258 INFO [PEWorker-12] procedure2.ProcedureExecutor: 
Finished subprocedure(s) of pid=507, state=RUNNABLE:SERVER_CRASH_HANDLE_RIT2; 
ServerCrashProcedure 
server=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502, 
splitWal=true, meta=false; resume parent processing.
2018-05-02 05:44:09,258 INFO [PEWorker-12] procedure2.ProcedureExecutor: 
Finished pid=508, ppid=507, state=SUCCESS; AssignProcedure 
table=test_hbase_ha_load_test_tool_hbase, 
region=94f6ca283dbb4445b2bcdc321b734d28 in 764msec
2018-05-02 05:44:09,273 INFO [PEWorker-14] procedure2.ProcedureExecutor: 
Finished pid=507, state=SUCCESS; ServerCrashProcedure 
server=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502, 
splitWal=true, meta=false in 975msec{code}

The strange thing is that the SCP for R007 assigns the region back to R007!
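One detail worth checking in the log lines above: the server names are host,port,startcode triples, and the SCP's server= value and the AssignProcedure's final location= value differ only in the last field. A minimal sketch of that comparison follows; the helper names are hypothetical and the two strings are copied from the logs quoted in this comment.

```python
from typing import NamedTuple


class ServerName(NamedTuple):
    """Hypothetical holder for a 'host,port,startcode' server name."""
    host: str
    port: int
    start_code: int


def parse_server_name(text: str) -> ServerName:
    # Server names in the logs look like "host,16020,1525238558502".
    host, port, start_code = text.strip().split(",")
    return ServerName(host, int(port), int(start_code))


def same_host_new_instance(a: ServerName, b: ServerName) -> bool:
    # Same host and port, but a different start code.
    return (a.host, a.port) == (b.host, b.port) and a.start_code != b.start_code


# server= from the ServerCrashProcedure (pid=507) log line
crashed = parse_server_name(
    "ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502")
# location= from the AssignProcedure (pid=508) log lines after OPENING
target = parse_server_name(
    "ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353")

print(same_host_new_instance(crashed, target))
```

If the check holds, the region went back to the same host but under a newer start code than the one the SCP was processing.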


[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-11 Thread Umesh Agashe (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472456#comment-16472456
 ] 

Umesh Agashe commented on HBASE-20552:
--

bq. Log for server 0002 was attached already.

Thanks! Could you attach the log for server 0007 as well?


[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-11 Thread Umesh Agashe (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472438#comment-16472438
 ] 

Umesh Agashe commented on HBASE-20552:
--

bq. Was there any region on 0008 you're interested in?

670f6b815d2acac905130e5440d59304
1d954f21d711345a9587d995cecea136
91f73e76bbe7bc8a61b1b1299d34c6ab


[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-11 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472422#comment-16472422
 ] 

Ted Yu commented on HBASE-20552:


Log for server 0002 was attached already.

Was there any region on 0008 you're interested in?




[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-11 Thread Umesh Agashe (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472409#comment-16472409
 ] 

Umesh Agashe commented on HBASE-20552:
--

The following warnings can usually be ignored, but these messages followed by 
"Completed pid=" look like trouble. When M003 became active at around 
2018-05-02 05:43:33, there were a few warnings while reading the master proc WAL:
{code:java}
2018-05-02 05:43:33,529 WARN 
[master/ctr-e138-1518143905142-279227-01-03:2] wal.WALProcedureStore: 
Unable to read tracker for 
hdfs://mycluster/apps/hbase/data/MasterProcWALs/pv2-0004.log - 
Invalid Trailer version. got 8 expected 1
2018-05-02 05:43:33,638 DEBUG 
[master/ctr-e138-1518143905142-279227-01-03:2] wal.WALProcedureStore: 
Roll new state log: 5
2018-05-02 05:43:33,655 INFO 
[master/ctr-e138-1518143905142-279227-01-03:2] 
procedure2.ProcedureExecutor: Recovered WALProcedureStore lease in 219msec
2018-05-02 05:43:33,681 INFO 
[master/ctr-e138-1518143905142-279227-01-03:2] 
wal.ProcedureWALFormatReader: Rebuilding tracker for 
hdfs://mycluster/apps/hbase/data/MasterProcWALs/pv2-0004.log
2018-05-02 05:43:33,816 WARN 
[master/ctr-e138-1518143905142-279227-01-03:2] 
wal.ProcedureWALFormatReader: Nothing left to decode. Exiting with missing EOF, 
log=hdfs://mycluster/apps/hbase/data/MasterProcWALs/pv2-0004.log

2018-05-02 05:43:33,875 DEBUG 
[master/ctr-e138-1518143905142-279227-01-03:2] 
procedure2.ProcedureExecutor: Completed pid=467, state=SUCCESS; 
MoveRegionProcedure hri=4c37ee7a4e1210e481debdc2933fc4d2, 
source=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474, 
destination=ctr-e138-1518143905142-279227-01-03.hwx.site,16020,1525239425826
2018-05-02 05:43:33,876 DEBUG [master/ctr-e138-1518143905142-279227-01-03:2] 
procedure2.ProcedureExecutor: Completed pid=465, state=SUCCESS; 
MoveRegionProcedure hri=94f6ca283dbb4445b2bcdc321b734d28, 
source=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474, 
destination=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502
2018-05-02 05:43:33,876 DEBUG 
[master/ctr-e138-1518143905142-279227-01-03:2] 
procedure2.ProcedureExecutor: Completed pid=462, state=SUCCESS; 
MoveRegionProcedure hri=a8ff96226d546f0ea151823ae73e5a1b, 
source=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474, 
destination=ctr-e138-1518143905142-279227-01-08.hwx.site,16020,1525238658606{code}
During startup, M003 has no log messages for procedures with ids 468 to 504, 
even though they ran and completed on M005. This is unusual. 
RecoverMetaProcedure on M003 starts with id 505, which is correct.
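A gap like this can be checked mechanically by collecting every pid= value in a master log and listing the ids that never appear. The sketch below is only an illustration: the function name is hypothetical, and the sample lines are shortened from the log excerpt above rather than read from the attached log file.

```python
import re

# Matches "pid=467" but not the "pid=" inside "ppid=507".
PID_RE = re.compile(r"\bpid=(\d+)")


def missing_pids(log_lines, first, last):
    """Return the procedure ids in [first, last] never mentioned in log_lines."""
    seen = set()
    for line in log_lines:
        for match in PID_RE.finditer(line):
            seen.add(int(match.group(1)))
    return [pid for pid in range(first, last + 1) if pid not in seen]


# Shortened sample lines based on the excerpt above.
sample = [
    "procedure2.ProcedureExecutor: Completed pid=467, state=SUCCESS; MoveRegionProcedure",
    "procedure2.ProcedureExecutor: Completed pid=465, state=SUCCESS; MoveRegionProcedure",
    "procedure2.ProcedureExecutor: Completed pid=462, state=SUCCESS; MoveRegionProcedure",
]

print(missing_pids(sample, 462, 468))
```

Run against the full M003 log with the range 468 to 504, the same function would list which of those ids are absent there.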

Orthogonal to the above observation, we have a meta update issue as well. On 
M005, pid=471 is the SCP for R007, which also hosts meta. Meta is re-assigned 
to R002 with pid=472, followed by the other region assignments:
{code:java}
pid=478 e75a388bc2011feed75bdc1a0e99a9a9   
regionLocation=ctr-e138-1518143905142-279227-01-02.hwx.site
pid=474 670f6b815d2acac905130e5440d59304   
regionLocation=ctr-e138-1518143905142-279227-01-08.hwx.site
pid=479 c963eb77dbdc6dbab886dbe4eebba5ad  
regionLocation=ctr-e138-1518143905142-279227-01-06.hwx.site
pid=481 b5180eee96b616afdf79578309c66a11   
regionLocation=ctr-e138-1518143905142-279227-01-02.hwx.site
pid=486 8dc6fd2022c2fdf8c065fbd16cadaaca   
regionLocation=ctr-e138-1518143905142-279227-01-03.hwx.site
pid=480 f3db9f9879ed03f488dcb89bea834237   
regionLocation=ctr-e138-1518143905142-279227-01-02.hwx.site
pid=484 c078deb2474e9c19b85b5fdb9efaa47d   
regionLocation=ctr-e138-1518143905142-279227-01-06.hwx.site
pid=475 94f6ca283dbb4445b2bcdc321b734d28   
regionLocation=ctr-e138-1518143905142-279227-01-02.hwx.site
pid=483 1d954f21d711345a9587d995cecea136   
regionLocation=ctr-e138-1518143905142-279227-01-08.hwx.site
pid=476 1595f38ee901be7c67b997fe2fc95951   
regionLocation=ctr-e138-1518143905142-279227-01-06.hwx.site
pid=482 a6e0d7561c4f19e78f94d37462588281   
regionLocation=ctr-e138-1518143905142-279227-01-06.hwx.site
pid=485 91f73e76bbe7bc8a61b1b1299d34c6ab   
regionLocation=ctr-e138-1518143905142-279227-01-08.hwx.site
pid=477 a0620fc83de532a37f6a9bb8f99cc6c4   
regionLocation=ctr-e138-1518143905142-279227-01-03.hwx.site{code}
From the logs, all the procedures finished successfully without skipping steps. 
Meta doesn't seem to have been updated for 4 of these assignments. When M003 
logs all regions from meta at startup, the locations for the following 4 
regions don't match the target locations in the above procedures:
{code:java}
670f6b815d2acac905130e5440d59304   
ctr-e138-1518143905142-279227-01-08.hwx.site 
lastHost=ctr-e138-1518143905142-279227-01-07.hwx.site 
regionLocation=ctr-e138-1518143905142-279227-01-07.hwx.site
94f6ca283dbb4445b2bcdc321b734d28   

[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-09 Thread Umesh Agashe (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469544#comment-16469544
 ] 

Umesh Agashe commented on HBASE-20552:
--

Thanks for attaching the logs. I need to go through them to see if it's 
similar to what we have seen so far...


[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-09 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469380#comment-16469380
 ] 

stack commented on HBASE-20552:
---

Sometimes in a procedure we'll look at the current state of things and 
determine that we can 'pass' on a step because it looks like all has been done 
already.

We have to be careful when we do this. There is an outstanding grey area 
identified by [~uagashe] where we should be updating hbase:meta though it looks 
like we don't have to... A more fundamental problem was addressed, but this 
may be a new case of it. Will look in the logs.


[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-09 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469240#comment-16469240
 ] 

Ted Yu commented on HBASE-20552:


From master-ctr-e138-1518143905142-279227-01-03.hwx.site.log:
{code}
2018-05-02 05:44:08,236 INFO  
[master/ctr-e138-1518143905142-279227-01-03:2] 
assignment.RegionStateStore: Load hbase:meta entry 
region=94f6ca283dbb4445b2bcdc321b734d28, regionState=OPEN, 
lastHost=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474, 
regionLocation=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502
{code}
It seems master 0005 might not have persisted the assignment to server 0002 in 
hbase:meta: the regionLocation loaded above is server 0007.

So when server 0002 reported region 94f6ca283dbb4445b2bcdc321b734d28 as OPEN, 
the report was rejected.
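
The rejection described above can be pictured as a comparison between the location the master has recorded for a region and the server that reports the region as open. The following is only a minimal sketch of that idea, not the actual AssignmentManager code: the names `RegionRecord`, `UnexpectedStateError`, `check_online_regions_report` and the short server labels are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


class UnexpectedStateError(Exception):
    """Stand-in for HBase's UnexpectedStateException (hypothetical name)."""


@dataclass
class RegionRecord:
    state: str      # region state as the master recorded it, e.g. "OPEN"
    location: str   # server the master believes is hosting the region


def check_online_regions_report(master_view, reporting_server, reported_regions):
    """Raise if a server reports a region that the master has OPEN elsewhere."""
    for region in reported_regions:
        record = master_view.get(region)
        if record is None:
            continue  # region unknown to this sketch; real handling is out of scope
        if record.state == "OPEN" and record.location != reporting_server:
            raise UnexpectedStateError(
                f"region={region} reported OPEN on server={reporting_server} "
                f"but master has location={record.location}"
            )


# The master loaded the region from hbase:meta as OPEN on server 0007.
view = {"94f6ca283dbb4445b2bcdc321b734d28": RegionRecord("OPEN", "server-0007")}

# A report from the recorded host is accepted without error.
check_online_regions_report(view, "server-0007", list(view))
```

In this model, a report for the same region from `server-0002` raises `UnexpectedStateError`, which corresponds to the point at which the region server in this issue was told to abort.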

> HBase RegionServer was shutdown due to UnexpectedStateException
> ---
>
> Key: HBASE-20552
> URL: https://issues.apache.org/jira/browse/HBASE-20552
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.0.0
> Reporter: Romil Choksi
> Priority: Critical
> Attachments: 
> 102143-master-ctr-e138-1518143905142-279227-01-03.hwx.site.log, 
> 102143-master-ctr-e138-1518143905142-279227-01-05.hwx.site.log, 
> 102143-regionserver-ctr-e138-1518143905142-279227-01-02.hwx.site.log
>
>
> This was observed during cluster testing (source code sync'ed with hbase-2.0, 
> built May 2nd):
> {code}
> 2018-05-02 05:44:10,089 ERROR 
> [RpcServer.default.FPBQ.Fifo.handler=28,queue=1,port=2] 
> master.MasterRpcServices: Region server 
> ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 reported 
> a fatal error:
> * ABORTING region server 
> ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474: 
> org.apache.hadoop.hbase.YouAreDeadException: rit=OPEN, location=ctr-e138- 
> 1518143905142-279227-01-07.hwx.site,16020,1525239609353, 
> table=test_hbase_ha_load_test_tool_hbase, 
> region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on server=ctr-e138-  
> 1518143905142-279227-01-02.hwx.site,16020,1525239334474 but state has 
> otherwise.
>   at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1065)
>   at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.reportOnlineRegions(AssignmentManager.java:987)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.regionServerReport(MasterRpcServices.java:459)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:13118)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> Caused by: org.apache.hadoop.hbase.exceptions.UnexpectedStateException: 
> rit=OPEN, 
> location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353,
>  table=test_hbase_ha_load_test_tool_hbase, 
> region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on 
> server=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 
> but state  has otherwise.
>   at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1037)
>   ... 7 more
>  *
> Cause:
> org.apache.hadoop.hbase.YouAreDeadException: 
> org.apache.hadoop.hbase.YouAreDeadException: rit=OPEN, 
> location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353,
>table=test_hbase_ha_load_test_tool_hbase, 
> region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on 
> server=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 
> but state  has otherwise.
>   at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1065)
>   at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager.reportOnlineRegions(AssignmentManager.java:987)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.regionServerReport(MasterRpcServices.java:459)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:13118)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> Caused by: 

[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-09 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469217#comment-16469217
 ] 

Ted Yu commented on HBASE-20552:


bq. Did all recover after the RS ABORTED?

The reported incident happened during a nightly run. We didn't have a chance to 
fully evaluate the health of master 03 before the cluster was gone.
From what I can tell, pid=465 doesn't seem to be a parent procedure.
There was no pid=466 in the log of master 03.
pid=467 was for a different region:
{code}
2018-05-02 05:43:33,875 DEBUG 
[master/ctr-e138-1518143905142-279227-01-03:2] 
procedure2.ProcedureExecutor: Completed pid=467, state=SUCCESS; 
MoveRegionProcedure hri=4c37ee7a4e1210e481debdc2933fc4d2, 
source=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474, 
destination=ctr-e138-1518143905142-279227-01-03.hwx.site,16020,   
1525239425826
{code}

Master logs have been attached.


[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-09 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469192#comment-16469192
 ] 

stack commented on HBASE-20552:
---

Repeat: "Did all recover after the RS ABORTED?"

bq. Pid 475 was executed on master 05. I didn't find it mentioned in log of 
server 03.

Right. It looks like a new assign that came out of the startup of the new 
master after it read the content of hbase:meta. There should be two assigns for 
the same region now: the one that was a subprocedure of pid=465 (?466?467?) and 
the new one, pid=475. The completion of 467? would make it so 465 could mark 
itself successful.

High-level, there is not enough to go on here in the posted snippets. I'm just 
trying to teach you how to fish. I will not be able to answer what happened 
unless you post the full log from both masters.

Thanks.




[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-09 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469176#comment-16469176
 ] 

Ted Yu commented on HBASE-20552:


bq. It seems to have succeeded in getting the region assigned (to 0002 it 
seems).

The move procedure would assign to server 07.

bq. This is pid=475

Pid 475 was executed on master 05. I didn't find it mentioned in log of 
server 03.

bq. Now there are two assigns for the region

Assignment of region to server 0002 was done by master 05 (M1). Assignment 
to 0007 was done by M2.

bq. What happens to pid=475? It succeeds?

From the master 05 log, we can see that both 465 and 475 succeeded:
{code}
2018-05-02 05:38:59,773 INFO  [PEWorker-9] procedure2.ProcedureExecutor: 
Finished pid=465, state=SUCCESS; MoveRegionProcedure 
hri=94f6ca283dbb4445b2bcdc321b734d28, source=ctr-e138- 
1518143905142-279227-01-02.hwx.site,16020,1525239334474, 
destination=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502
 in 748msec
...
2018-05-02 05:39:46,700 INFO  [PEWorker-1] procedure2.ProcedureExecutor: 
Finished pid=475, ppid=471, state=SUCCESS; AssignProcedure 
table=test_hbase_ha_load_test_tool_hbase,
region=94f6ca283dbb4445b2bcdc321b734d28 in 976msec
{code}
In the master 03 log, pid=465 was mentioned only once (shown in the 
description). pid=475 didn't appear.
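
Tracing a pid across two master logs, as done above, can be scripted. The sketch below is one hedged way to do it: it assumes each log entry sits on a single line in the raw log file (the excerpts in this thread are wrapped by the mailer), and the regular expression is inferred only from the "Finished"/"Completed" lines quoted in this thread. The function name `finished_procedures` is hypothetical.

```python
import re

# Matches e.g. "Finished pid=475, ppid=471, state=SUCCESS; AssignProcedure ..."
# and "Completed pid=467, state=SUCCESS; MoveRegionProcedure ...".
FINISHED = re.compile(
    r"(?:Finished|Completed) pid=(\d+)(?:, ppid=(\d+))?, state=(\w+); (\w+)"
)


def finished_procedures(lines):
    """Return (pid, ppid or None, state, procedure type) for each completion line."""
    results = []
    for line in lines:
        match = FINISHED.search(line)
        if match:
            pid, ppid, state, proc_type = match.groups()
            results.append((int(pid), int(ppid) if ppid else None, state, proc_type))
    return results


sample = [
    "2018-05-02 05:38:59,773 INFO  [PEWorker-9] procedure2.ProcedureExecutor: "
    "Finished pid=465, state=SUCCESS; MoveRegionProcedure "
    "hri=94f6ca283dbb4445b2bcdc321b734d28",
    "2018-05-02 05:39:46,700 INFO  [PEWorker-1] procedure2.ProcedureExecutor: "
    "Finished pid=475, ppid=471, state=SUCCESS; AssignProcedure "
    "table=test_hbase_ha_load_test_tool_hbase",
]
for entry in finished_procedures(sample):
    print(entry)
```

Running the same function over each master's log and comparing the pid sets would show which procedures one master completed that the other never mentions.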

bq. What is pid=507?

It was crash processing:
{code}
2018-05-02 05:44:08,409 DEBUG 
[master/ctr-e138-1518143905142-279227-01-03:2] 
procedure2.ProcedureExecutor: Stored pid=507, state=RUNNABLE:SERVER_CRASH_START;
{code}


[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException

2018-05-09 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469113#comment-16469113
 ] 

stack commented on HBASE-20552:
---

Thanks for the report. We throw UnexpectedStateException when we meet a 
condition we do not know how to handle. I expect there are a few of these 
lurking in AMv2.

Did all recover after the RS ABORTED?

The pid=465 on M2 is the original move, started back on M1, being replayed and 
concluding that it is done. It seems to have succeeded in getting the region 
assigned (to 0002 it seems).

A move is composed of an unassign followed by an assign. The unassign seems to 
have completed (a sub-procedure of pid=465), but what happened to the assign 
that was a sub-procedure of pid=465? It looks like we create a new assign when 
processing the crashed server. This is pid=475. Now there are two assigns for 
the region. Only one should prevail: the second should give up when it notices 
the other assign. There is a lock on the region while an assign runs, so only 
one can run at a time.
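
The expected behavior described here (a move as unassign plus assign, with a per-region lock so that a second assign gives up) can be sketched as follows. This is an illustration of the intent only, not the AMv2 implementation: the `Region` class and the `assign`, `unassign` and `move` functions are hypothetical, and a plain `threading.Lock` stands in for the procedure framework's region lock.

```python
import threading


class Region:
    """Toy region with a lock that serializes assign/unassign."""

    def __init__(self, name, location=None):
        self.name = name
        self.location = location      # None means the region is not assigned
        self.lock = threading.Lock()


def unassign(region):
    with region.lock:
        region.location = None


def assign(region, target):
    """Assign the region unless another assign already placed it."""
    with region.lock:
        if region.location is not None:
            return False              # someone else won; this assign gives up
        region.location = target
        return True


def move(region, target):
    """A move is an unassign followed by an assign."""
    unassign(region)
    return assign(region, target)


region = Region("94f6ca283dbb4445b2bcdc321b734d28", location="server-0002")
moved = move(region, "server-0007")        # the move's own assign places the region
second = assign(region, "server-0002")     # a competing assign should give up
print(moved, second, region.location)
```

In this model the region can never end up open on two servers. The report in this issue suggests that, across the master failover, two assigns for the same region both ran to completion, which is the situation the lock is meant to prevent.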

What happens to pid=475? It succeeds? Or did it give up because it saw that 
the assign subprocedure of pid=465 had succeeded?

What is pid=507? Bulk assign? Or crash processing?

We need to figure out which state change went without an update (an update of 
hbase:meta), or which procedure ran presuming a state that wasn't true for some 
reason.

Looks like a good one. Thanks.


