[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-21464:
---
    Fix Version/s: (was: 1.5.0)

> Splitting blocked with meta NSRE during split transaction
> ---
>
> Key: HBASE-21464
> URL: https://issues.apache.org/jira/browse/HBASE-21464
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.5.0, 1.4.3, 1.4.4, 1.4.5, 1.4.6, 1.4.8, 1.4.7
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Priority: Blocker
> Fix For: 1.4.9
>
> Attachments: HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch
>
>
> Splitting is blocked during a split transaction. The split worker is trying to update meta but isn't able to relocate it after an NSRE:
> {noformat}
> 2018-11-09 17:50:45,277 INFO [regionserver/ip-172-31-5-92.us-west-2.compute.internal/172.31.5.92:8120-splits-1541785709434] client.RpcRetryingCaller: Call exception, tries=13, retries=350, started=88590 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on ip-172-31-13-83.us-west-2.compute.internal,8120,1541785618832
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3088)
>   at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1271)
>   at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2198)
>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36617)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2396)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:297)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:277)row 'test,,1541785709452.5ba6596f0050c2dab969d152829227c6.44' on table 'hbase:meta' at region=hbase:meta,1.1588230740, hostname=ip-172-31-15-225.us-west-2.compute.internal,8120,1541785640586, seqNum=0{noformat}
> Clients, in this case YCSB, are hung with part of the keyspace missing:
> {noformat}
> 2018-11-09 17:51:06,033 DEBUG [hconnection-0x5739e567-shared--pool1-t165] client.ConnectionManager$HConnectionImplementation: locateRegionInMeta parentTable=hbase:meta, metaLocation=, attempt=14 of 35 failed; retrying after sleep of 20158 because: No server address listed in hbase:meta for region test,user307326104267982763,1541785754600.ef90030b05cb02305b75e9bfbc3ee081. containing row user3301635648728421323{noformat}
> Balancing is blocked indefinitely because the split transaction is stuck:
> {noformat}
> 2018-11-09 17:49:55,478 DEBUG [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=8100] master.HMaster: Not running balancer because 3 region(s) in transition: [{ef90030b05cb02305b75e9bfbc3ee081 state=SPLITTING_NEW, ts=1541785754606, server=ip-172-31-5-92.us-west-2.compute.internal,8120,1541785626417}, {5ba6596f0050c2dab969d152829227c6 state=SPLITTING, ts=1541785754606, server=ip-172-31-5-92.us-west-2.compute{noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
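
The symptom in the report, a caller that keeps retrying an hbase:meta update against a server that no longer hosts meta, can be illustrated with a toy model. The sketch below is only an illustration of that general failure pattern, not the HBASE-21464 patch and not HBase client code: `Cluster`, `update_meta`, `NotServingRegionError`, the `relocate_on_nsre` flag, and the server names `rs-a`/`rs-b` are all hypothetical names made up for the example.

```python
class NotServingRegionError(Exception):
    """Hypothetical stand-in for HBase's NotServingRegionException (NSRE)."""


class Cluster:
    """Toy cluster that only tracks which server currently hosts meta."""

    def __init__(self, meta_host):
        self.meta_host = meta_host

    def lookup_meta(self):
        # Authoritative lookup that bypasses any client-side cache.
        return self.meta_host

    def put(self, server, row):
        # A server that does not host meta rejects the write with an NSRE.
        if server != self.meta_host:
            raise NotServingRegionError(f"hbase:meta is not online on {server}")
        return f"wrote {row} via {server}"


def update_meta(cluster, cache, row, max_tries=3, relocate_on_nsre=True):
    """Retry a meta update; optionally drop the cached location after an NSRE."""
    for _ in range(max_tries):
        server = cache.get("meta")
        if server is None:
            # Cache miss: relocate meta and remember the answer.
            server = cache["meta"] = cluster.lookup_meta()
        try:
            return cluster.put(server, row)
        except NotServingRegionError:
            if relocate_on_nsre:
                # Forget the stale location so the next try relocates meta.
                cache.pop("meta", None)
            # Otherwise the next try reuses the same stale server.
    raise RuntimeError(f"gave up after {max_tries} tries")


if __name__ == "__main__":
    cluster = Cluster("rs-a")
    cache = {"meta": "rs-a"}
    cluster.meta_host = "rs-b"  # meta moves; the cached location is now stale
    print(update_meta(cluster, cache, "split-row"))
```

With `relocate_on_nsre=False` every retry goes back to the stale server and the call eventually gives up, which mirrors the repeated `tries=13, retries=350` log line quoted in the report.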
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-21464:
---
    Resolution: Fixed
    Status: Resolved (was: Patch Available)

Resolved but not pushed yet. I have a 1.4.9 rc1 candidate tagged, also not pushed yet. Testing first.

> Splitting blocked with meta NSRE during split transaction
> ---
>
> Key: HBASE-21464
> URL: https://issues.apache.org/jira/browse/HBASE-21464
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.5.0, 1.4.3, 1.4.4, 1.4.5, 1.4.6, 1.4.8, 1.4.7
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Priority: Blocker
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-21464:
---
    Attachment: HBASE-21464-branch-1.patch

> Splitting blocked with meta NSRE during split transaction
> ---
>
> Key: HBASE-21464
> URL: https://issues.apache.org/jira/browse/HBASE-21464
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.5.0, 1.4.3, 1.4.4, 1.4.5, 1.4.6, 1.4.8, 1.4.7
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Priority: Blocker
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-21464:
---
    Status: Patch Available (was: Open)

> Splitting blocked with meta NSRE during split transaction
> ---
>
> Key: HBASE-21464
> URL: https://issues.apache.org/jira/browse/HBASE-21464
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.4.7, 1.4.8, 1.4.6, 1.4.5, 1.4.4, 1.4.3, 1.5.0
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Priority: Blocker
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-21464:
---
    Attachment: HBASE-21464-branch-1.patch

> Splitting blocked with meta NSRE during split transaction
> ---
>
> Key: HBASE-21464
> URL: https://issues.apache.org/jira/browse/HBASE-21464
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.5.0, 1.4.3, 1.4.4, 1.4.5, 1.4.6, 1.4.8, 1.4.7
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Priority: Blocker
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-21464:
---
    Attachment: (was: HBASE-21464-branch-1.patch)

> Splitting blocked with meta NSRE during split transaction
> ---
>
> Key: HBASE-21464
> URL: https://issues.apache.org/jira/browse/HBASE-21464
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.5.0, 1.4.3, 1.4.4, 1.4.5, 1.4.6, 1.4.8, 1.4.7
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Priority: Blocker
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-21464:
---
    Status: Open (was: Patch Available)

> Splitting blocked with meta NSRE during split transaction
> ---
>
> Key: HBASE-21464
> URL: https://issues.apache.org/jira/browse/HBASE-21464
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.4.7, 1.4.8, 1.4.6, 1.4.5, 1.4.4, 1.4.3, 1.5.0
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Priority: Blocker
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-21464:
---
    Attachment: (was: HBASE-21464-branch-1.patch)

> Splitting blocked with meta NSRE during split transaction
> ---
>
> Key: HBASE-21464
> URL: https://issues.apache.org/jira/browse/HBASE-21464
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.5.0, 1.4.3, 1.4.4, 1.4.5, 1.4.6, 1.4.8, 1.4.7
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Priority: Blocker
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-21464:
---
    Attachment: HBASE-21464-branch-1.patch

> Splitting blocked with meta NSRE during split transaction
> ---
>
> Key: HBASE-21464
> URL: https://issues.apache.org/jira/browse/HBASE-21464
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.5.0, 1.4.3, 1.4.4, 1.4.5, 1.4.6, 1.4.8, 1.4.7
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Priority: Blocker
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-21464:
---
    Attachment: (was: HBASE-21464-branch-1.patch)

> Splitting blocked with meta NSRE during split transaction
> ---
>
> Key: HBASE-21464
> URL: https://issues.apache.org/jira/browse/HBASE-21464
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.5.0, 1.4.3, 1.4.4, 1.4.5, 1.4.6, 1.4.8, 1.4.7
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Priority: Blocker
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-21464: --- Status: Patch Available (was: Open) > Splitting blocked with meta NSRE during split transaction > - > > Key: HBASE-21464 > URL: https://issues.apache.org/jira/browse/HBASE-21464 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.7, 1.4.8, 1.4.6, 1.4.5, 1.4.4, 1.4.3, 1.5.0 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Blocker > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch, > HBASE-21464-branch-1.patch
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-21464: --- Attachment: HBASE-21464-branch-1.patch > Splitting blocked with meta NSRE during split transaction > - > > Key: HBASE-21464 > URL: https://issues.apache.org/jira/browse/HBASE-21464 > Project: HBase > Issue Type: Bug >Affects Versions: 1.5.0, 1.4.3, 1.4.4, 1.4.5, 1.4.6, 1.4.8, 1.4.7 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Blocker > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch, > HBASE-21464-branch-1.patch
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-21464: --- Status: Open (was: Patch Available) > Splitting blocked with meta NSRE during split transaction > - > > Key: HBASE-21464 > URL: https://issues.apache.org/jira/browse/HBASE-21464 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.7, 1.4.8, 1.4.6, 1.4.5, 1.4.4, 1.4.3, 1.5.0 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Blocker > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-21464: --- Attachment: HBASE-21464-branch-1.patch > Splitting blocked with meta NSRE during split transaction > - > > Key: HBASE-21464 > URL: https://issues.apache.org/jira/browse/HBASE-21464 > Project: HBase > Issue Type: Bug >Affects Versions: 1.5.0, 1.4.3, 1.4.4, 1.4.5, 1.4.6, 1.4.8, 1.4.7 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Blocker > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch, > HBASE-21464-branch-1.patch
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-21464: --- Status: Patch Available (was: Reopened) > Splitting blocked with meta NSRE during split transaction > - > > Key: HBASE-21464 > URL: https://issues.apache.org/jira/browse/HBASE-21464 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.7, 1.4.8, 1.4.6, 1.4.5, 1.4.4, 1.4.3, 1.5.0 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Blocker > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch, > HBASE-21464-branch-1.patch
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-21464: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Pushed to branch-1.4 and branch-1 > Splitting blocked with meta NSRE during split transaction > - > > Key: HBASE-21464 > URL: https://issues.apache.org/jira/browse/HBASE-21464 > Project: HBase > Issue Type: Bug >Affects Versions: 1.5.0, 1.4.3, 1.4.4, 1.4.5, 1.4.6, 1.4.8, 1.4.7 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Blocker > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-21464: --- Attachment: HBASE-21464-branch-1.patch > Splitting blocked with meta NSRE during split transaction > - > > Key: HBASE-21464 > URL: https://issues.apache.org/jira/browse/HBASE-21464 > Project: HBase > Issue Type: Bug >Affects Versions: 1.5.0, 1.4.3, 1.4.4, 1.4.5, 1.4.6, 1.4.8, 1.4.7 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Blocker > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21464-branch-1.patch, HBASE-21464-branch-1.patch
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-21464: --- Status: Patch Available (was: Open)

Attached a fix. The actual fix is in ConnectionManager. I've also included fixes for resource leaks I noticed in MetaTableAccessor while looking at the code.

In ConnectionManager we put the locations of hbase:meta into the region location cache (implemented in MetaCache). Lookups from the region location cache use both the table name and the row key as keys. For meta we assume the row key is always the empty byte array and that the data structure will compare the contents of the keys, not their identity. Some change along the way has broken this assumption. I poked around the commit history on branch-1 but didn't find an obvious change; I have probably missed something that will be obvious to someone else.

So instead I made two small, related changes to how meta locations are cached, which preserve the intent but offer more assurance of a correct result. First, when looking up meta region locations, we now truly ignore the row key. Second, when looking up meta region locations with useCache == false (that is, the client is relocating after an NSRE, perhaps an NSRE of meta itself), we remove all prior entries for meta region locations from the cache before going to ZooKeeper to find and cache the latest location.

After these two changes the row key truly doesn't matter when looking up meta region locations in the cache, and I can no longer reproduce the problem.

> Splitting blocked with meta NSRE during split transaction > - > > Key: HBASE-21464 > URL: https://issues.apache.org/jira/browse/HBASE-21464 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.7, 1.4.8, 1.4.6, 1.4.5, 1.4.4, 1.4.3, 1.5.0 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Blocker > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21464-branch-1.patch > > > Splitting is blocked during split transaction.
The split worker is trying to > update meta but isn't able to relocate it after NSRE: > {noformat} > 2018-11-09 17:50:45,277 INFO > [regionserver/ip-172-31-5-92.us-west-2.compute.internal/172.31.5.92:8120-splits-1541785709434] > client.RpcRetryingCaller: Call exception, tries=13, retries=350, > started=88590 ms ago, cancelled=false, > msg=org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 > is not online on ip-172-31-13-83.us-west-2.compute.internal,8120,1541785618832 > at > org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3088) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1271) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2198) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36617) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2396) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:297) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:277)row > 'test,,1541785709452.5ba6596f0050c2dab969d152829227c6.44' on table > 'hbase:meta' at region=hbase:meta,1.1588230740, > hostname=ip-172-31-15-225.us-west-2.compute.internal,8120,1541785640586, > seqNum=0{noformat} > Clients, in this case YCSB, are hung with part of the keyspace missing: > {noformat} > 2018-11-09 17:51:06,033 DEBUG [hconnection-0x5739e567-shared--pool1-t165] > client.ConnectionManager$HConnectionImplementation: locateRegionInMeta > parentTable=hbase:meta, metaLocation=, attempt=14 of 35 failed; retrying > after sleep of 20158 because: No server address listed in hbase:meta for > region > test,user307326104267982763,1541785754600.ef90030b05cb02305b75e9bfbc3ee081. 
> containing row user3301635648728421323{noformat} > Balancing is blocked indefinitely because the split transaction is stuck: > {noformat} > 2018-11-09 17:49:55,478 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=8100] master.HMaster: > Not running balancer because 3 region(s) in transition: > [{ef90030b05cb02305b75e9bfbc3ee081 state=SPLITTING_NEW, ts=1541785754606, > server=ip-172-31-5-92.us-west-2.compute.internal,8120,1541785626417}, > {5ba6596f0050c2dab969d152829227c6 state=SPLITTING, ts=1541785754606, > server=ip-172-31-5-92.us-west-2.compute{noformat} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
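The caching assumption and the two changes described in the comment above can be illustrated with a small standalone Java sketch. This is not the attached patch and does not use HBase classes: MetaCacheSketch, BY_CONTENT, and every method name below are hypothetical stand-ins, and the real logic lives in ConnectionManager and MetaCache. The sketch only shows why byte[] keys must be compared by content, how a meta lookup can ignore the row key, and how stale meta entries can be dropped before a fresh location is cached.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical, simplified model of a region location cache (not HBase code).
public class MetaCacheSketch {
    public static final String META = "hbase:meta";

    // Compares byte[] keys by content (unsigned, lexicographic), not by identity.
    public static final Comparator<byte[]> BY_CONTENT = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    };

    // table name -> (region start key -> server name); single-threaded sketch.
    private final Map<String, ConcurrentSkipListMap<byte[], String>> cache = new HashMap<>();

    public void put(String table, byte[] startKey, String server) {
        cache.computeIfAbsent(table,
                t -> new ConcurrentSkipListMap<byte[], String>(BY_CONTENT))
             .put(startKey, server);
    }

    public String get(String table, byte[] row) {
        ConcurrentSkipListMap<byte[], String> regions = cache.get(table);
        if (regions == null || regions.isEmpty()) return null;
        if (META.equals(table)) {
            // Change 1 (as described in the comment): ignore the row key for meta.
            return regions.firstEntry().getValue();
        }
        // Other tables: the region whose start key is the greatest one <= row.
        Map.Entry<byte[], String> e = regions.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    // Change 2 (as described): on relocation, drop every prior meta entry
    // before caching the freshly discovered location.
    public void relocateMeta(String freshServer) {
        cache.remove(META);
        put(META, new byte[0], freshServer);
    }

    // Shows the pitfall: a HashMap keyed on byte[] uses identity equality,
    // so an equal-content but distinct array is not found.
    public static boolean identityMapFinds(byte[] stored, byte[] probe) {
        Map<byte[], String> m = new HashMap<>();
        m.put(stored, "rs1");
        return m.containsKey(probe);
    }

    public static void main(String[] args) {
        MetaCacheSketch c = new MetaCacheSketch();
        c.put(META, new byte[0], "rs-old");
        System.out.println(c.get(META, "any-row".getBytes()));
        c.relocateMeta("rs-new");
        System.out.println(c.get(META, new byte[0]));
        System.out.println(identityMapFinds(new byte[0], new byte[0]));
    }
}
```

Clearing the meta entries before re-caching means a stale location cannot survive a relocation even if some code path stored it under an unexpected key.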
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-21464: --- Attachment: HBASE-21464-branch-1.patch > Splitting blocked with meta NSRE during split transaction > - > > Key: HBASE-21464 > URL: https://issues.apache.org/jira/browse/HBASE-21464 > Project: HBase > Issue Type: Bug >Affects Versions: 1.5.0, 1.4.3, 1.4.4, 1.4.5, 1.4.6, 1.4.8, 1.4.7 >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Blocker > Fix For: 1.5.0, 1.4.9 > > Attachments: HBASE-21464-branch-1.patch
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-21464:
---
Description:
Splitting is blocked during the split transaction. The split worker is trying to update meta but isn't able to relocate it after an NSRE:
{noformat}
2018-11-09 17:50:45,277 INFO [regionserver/ip-172-31-5-92.us-west-2.compute.internal/172.31.5.92:8120-splits-1541785709434] client.RpcRetryingCaller: Call exception, tries=13, retries=350, started=88590 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on ip-172-31-13-83.us-west-2.compute.internal,8120,1541785618832
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3088)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1271)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2198)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36617)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2396)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:297)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:277)row 'test,,1541785709452.5ba6596f0050c2dab969d152829227c6.44' on table 'hbase:meta' at region=hbase:meta,1.1588230740, hostname=ip-172-31-15-225.us-west-2.compute.internal,8120,1541785640586, seqNum=0
{noformat}
Clients, in this case YCSB, are hung with part of the keyspace missing:
{noformat}
2018-11-09 17:51:06,033 DEBUG [hconnection-0x5739e567-shared--pool1-t165] client.ConnectionManager$HConnectionImplementation: locateRegionInMeta parentTable=hbase:meta, metaLocation=, attempt=14 of 35 failed; retrying after sleep of 20158 because: No server address listed in hbase:meta for region test,user307326104267982763,1541785754600.ef90030b05cb02305b75e9bfbc3ee081. containing row user3301635648728421323
{noformat}
Balancing is blocked indefinitely because the split transaction is stuck:
{noformat}
2018-11-09 17:49:55,478 DEBUG [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=8100] master.HMaster: Not running balancer because 3 region(s) in transition: [{ef90030b05cb02305b75e9bfbc3ee081 state=SPLITTING_NEW, ts=1541785754606, server=ip-172-31-5-92.us-west-2.compute.internal,8120,1541785626417}, {5ba6596f0050c2dab969d152829227c6 state=SPLITTING, ts=1541785754606, server=ip-172-31-5-92.us-west-2.compute
{noformat}

was:
Splitting is blocked during the split transaction. The split worker is trying to update meta but isn't able to relocate it after an NSRE:
{noformat}
2018-11-09 17:50:45,277 INFO [regionserver/ip-172-31-5-92.us-west-2.compute.internal/172.31.5.92:8120-splits-1541785709434] client.RpcRetryingCaller: Call exception, tries=13, retries=350, started=88590 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on ip-172-31-13-83.us-west-2.compute.internal,8120,1541785618832
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3088)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1271)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2198)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36617)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2396)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:297)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:277)row 'test,,1541785709452.5ba6596f0050c2dab969d152829227c6.44' on table 'hbase:meta' at region=hbase:meta,1.1588230740, hostname=ip-172-31-15-225.us-west-2.compute.internal,8120,1541785640586, seqNum=0
{noformat}
Clients, in this case YCSB, are hung with part of the keyspace missing:
{noformat}
2018-11-09 17:51:06,033 DEBUG [hconnection-0x5739e567-shared--pool1-t165] client.ConnectionManager$HConnectionImplementation: locateRegionInMeta parentTable=hbase:meta, metaLocation=, attempt=14 of 35 failed; retrying after sleep of 20158 because: No server address listed in hbase:meta for region test,user307326104267982763,1541785754600.ef90030b05cb02305b75e9bfbc3ee081. containing row user3301635648728421323
{noformat}
Additional confirmation of the problem on the master: balancing is blocked indefinitely because the split transaction is stuck:
{noformat}
2018-11-09 17:49:55,478 DEBUG
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-21464:
---
Description:
Splitting is blocked during the split transaction. The split worker is trying to update meta but isn't able to relocate it after an NSRE:
{noformat}
2018-11-09 17:50:45,277 INFO [regionserver/ip-172-31-5-92.us-west-2.compute.internal/172.31.5.92:8120-splits-1541785709434] client.RpcRetryingCaller: Call exception, tries=13, retries=350, started=88590 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on ip-172-31-13-83.us-west-2.compute.internal,8120,1541785618832
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3088)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1271)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2198)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36617)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2396)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:297)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:277)row 'test,,1541785709452.5ba6596f0050c2dab969d152829227c6.44' on table 'hbase:meta' at region=hbase:meta,1.1588230740, hostname=ip-172-31-15-225.us-west-2.compute.internal,8120,1541785640586, seqNum=0
{noformat}
Clients, in this case YCSB, are hung with part of the keyspace missing:
{noformat}
2018-11-09 17:51:06,033 DEBUG [hconnection-0x5739e567-shared--pool1-t165] client.ConnectionManager$HConnectionImplementation: locateRegionInMeta parentTable=hbase:meta, metaLocation=, attempt=14 of 35 failed; retrying after sleep of 20158 because: No server address listed in hbase:meta for region test,user307326104267982763,1541785754600.ef90030b05cb02305b75e9bfbc3ee081. containing row user3301635648728421323
{noformat}
Additional confirmation of the problem on the master: balancing is blocked indefinitely because the split transaction is stuck:
{noformat}
2018-11-09 17:49:55,478 DEBUG [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=8100] master.HMaster: Not running balancer because 3 region(s) in transition: [{ef90030b05cb02305b75e9bfbc3ee081 state=SPLITTING_NEW, ts=1541785754606, server=ip-172-31-5-92.us-west-2.compute.internal,8120,1541785626417}, {5ba6596f0050c2dab969d152829227c6 state=SPLITTING, ts=1541785754606, server=ip-172-31-5-92.us-west-2.compute
{noformat}
Unfortunately I don't have a lot of time to debug this before heading out for the weekend. Will pick it up on Monday. I saved all of the cluster logs.

was:
ITBLL tests with an internal fork of 1.4.7 looked fine, but the same tests with an internal fork of 1.4.8 showed an alarming performance problem and eventual test failure. Can repro with the 1.4.8 upstream release. I didn't try 1.4.7 and will need to do it as a sanity check, but let's assume for now there is a bad bug introduced somewhere between 1.4.7 and 1.4.8.
Splitting is blocked when meta relocates during the split transaction because the splitting thread does not try to relocate meta.
The split worker is trying to update meta but doesn't relocate it even after an NSRE:
{noformat}
2018-11-09 17:50:45,277 INFO [regionserver/ip-172-31-5-92.us-west-2.compute.internal/172.31.5.92:8120-splits-1541785709434] client.RpcRetryingCaller: Call exception, tries=13, retries=350, started=88590 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on ip-172-31-13-83.us-west-2.compute.internal,8120,1541785618832
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3088)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1271)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2198)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36617)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2396)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:297)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:277)row 'test,,1541785709452.5ba6596f0050c2dab969d152829227c6.44' on table 'hbase:meta' at region=hbase:meta,1.1588230740, hostname=ip-172-31-15-225.us-west-2.compute.internal,8120,1541785640586, seqNum=0
{noformat}
Clients, in this case YCSB, are hung with part of the keyspace missing:
{noformat}
[jira] [Updated] (HBASE-21464) Splitting blocked with meta NSRE during split transaction
[ https://issues.apache.org/jira/browse/HBASE-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-21464:
---
Summary: Splitting blocked with meta NSRE during split transaction (was: Splitting blocked when meta relocates during split transaction)

> Splitting blocked with meta NSRE during split transaction
> -
>
> Key: HBASE-21464
> URL: https://issues.apache.org/jira/browse/HBASE-21464
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.5.0, 1.4.3, 1.4.4, 1.4.5, 1.4.6, 1.4.8, 1.4.7
> Reporter: Andrew Purtell
> Priority: Blocker
> Fix For: 1.5.0, 1.4.9
>
>
> ITBLL tests with an internal fork of 1.4.7 looked fine, but the same tests with an internal fork of 1.4.8 showed an alarming performance problem and eventual test failure. Can repro with the 1.4.8 upstream release. I didn't try 1.4.7 and will need to do it as a sanity check, but let's assume for now there is a bad bug introduced somewhere between 1.4.7 and 1.4.8.
> Splitting is blocked when meta relocates during the split transaction because the splitting thread does not try to relocate meta.
> The split worker is trying to update meta but doesn't relocate it even after an NSRE:
> {noformat}
> 2018-11-09 17:50:45,277 INFO [regionserver/ip-172-31-5-92.us-west-2.compute.internal/172.31.5.92:8120-splits-1541785709434] client.RpcRetryingCaller: Call exception, tries=13, retries=350, started=88590 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on ip-172-31-13-83.us-west-2.compute.internal,8120,1541785618832
> at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3088)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1271)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2198)
> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36617)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2396)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:297)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:277)row 'test,,1541785709452.5ba6596f0050c2dab969d152829227c6.44' on table 'hbase:meta' at region=hbase:meta,1.1588230740, hostname=ip-172-31-15-225.us-west-2.compute.internal,8120,1541785640586, seqNum=0
> {noformat}
> Clients, in this case YCSB, are hung with part of the keyspace missing:
> {noformat}
> 2018-11-09 17:51:06,033 DEBUG [hconnection-0x5739e567-shared--pool1-t165] client.ConnectionManager$HConnectionImplementation: locateRegionInMeta parentTable=hbase:meta, metaLocation=, attempt=14 of 35 failed; retrying after sleep of 20158 because: No server address listed in hbase:meta for region test,user307326104267982763,1541785754600.ef90030b05cb02305b75e9bfbc3ee081. containing row user3301635648728421323
> {noformat}
> Additional confirmation of the problem on the master: balancing is blocked indefinitely because the split transaction is stuck:
> {noformat}
> 2018-11-09 17:49:55,478 DEBUG [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=8100] master.HMaster: Not running balancer because 3 region(s) in transition: [{ef90030b05cb02305b75e9bfbc3ee081 state=SPLITTING_NEW, ts=1541785754606, server=ip-172-31-5-92.us-west-2.compute.internal,8120,1541785626417}, {5ba6596f0050c2dab969d152829227c6 state=SPLITTING, ts=1541785754606, server=ip-172-31-5-92.us-west-2.compute
> {noformat}
> Unfortunately I don't have a lot of time to debug this before heading out for the weekend. Will pick it up on Monday. I saved all of the cluster logs.