Hey,
We're continuing to get heavy use out of our normalizers (split point PR
still to come), and we're seeing an issue with daughter regions getting
merged. The situation: a region splits and becomes the parent of regions
A and B. One of these daughters, say B, is then compacted out, while
region A is still waiting to be compacted. Our normalization plan then
merges region B with its neighbor region, C, producing region D. This
leaves us with regions A and D and the metadata for the parent.
Now, when a query comes in that spans A through D, the query planner sees
the reference to B in the parent's metadata and tries to find that
region. After the timeout period we get an exception saying that the child
region (B) of the parent isn't online yet but should be soon:
19/07/03 18:33:06 ERROR Executor: Exception in task 3.0 in stage 9.0 (TID 43)
org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
    at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:316)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:156)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:212)
    at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:314)
    at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:289)
    at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:164)
    at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:159)
    at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:796)
    ...
Caused by: org.apache.hadoop.hbase.client.RegionOfflineException: the only available region for the required row is a split parent, the daughters should be online soon: table_z3_v2,\x01\x09\xFC}\x18\xFA\xD2o\x99\x8D\xE5,1550014268044.7b0d3c6836cf417b007771ab48d0450f.
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1307)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1183)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
From what I gather we should be removing the reference to B from the
parent's metadata when B gets merged with C.
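To make the sequence concrete, here is a toy Python model of it. This is not HBase code: ToyMeta, RegionOffline, build, and the drop_parent_refs flag are all made-up names for illustration, and the lookup only mirrors the behavior as I described it above, not what the client actually does internally.

```python
class RegionOffline(Exception):
    """Raised when a row maps to a daughter that is no longer online."""


class ToyMeta:
    """Toy stand-in for region metadata. Key ranges are half-open [start, end)."""

    def __init__(self):
        self.online = {}         # region name -> (start, end)
        self.split_parents = {}  # parent name -> {daughter name: (start, end)}

    def add(self, name, start, end):
        self.online[name] = (start, end)

    def split(self, parent, mid, left, right):
        start, end = self.online.pop(parent)
        self.online[left] = (start, mid)
        self.online[right] = (mid, end)
        # The parent goes offline but keeps references to both daughters.
        self.split_parents[parent] = {left: (start, mid), right: (mid, end)}

    def merge(self, left, right, merged, drop_parent_refs=False):
        start, _ = self.online.pop(left)
        _, end = self.online.pop(right)
        self.online[merged] = (start, end)
        if drop_parent_refs:
            # Hypothetical cleanup: forget daughters that were merged away.
            for daughters in self.split_parents.values():
                daughters.pop(left, None)
                daughters.pop(right, None)

    def locate(self, row):
        # A parent's reference to a daughter that is gone blocks the lookup.
        for parent, daughters in self.split_parents.items():
            for name, (start, end) in daughters.items():
                if start <= row < end and name not in self.online:
                    raise RegionOffline(f"daughter {name} of {parent} is not online")
        for name, (start, end) in self.online.items():
            if start <= row < end:
                return name
        raise RegionOffline(f"no region covers row {row}")


def build(drop_parent_refs):
    """Replay the sequence: P splits into A and B, then B merges with C into D."""
    meta = ToyMeta()
    meta.add("P", 0, 20)
    meta.add("C", 20, 30)
    meta.split("P", 10, "A", "B")
    meta.merge("B", "C", "D", drop_parent_refs=drop_parent_refs)
    return meta
```

With build(False), rows in A and in the old C range still resolve, but locate(15) raises RegionOffline because the parent still points at B. With build(True), the same row resolves to D, which is the behavior I would expect if the merge cleared the parent's reference.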
Is this something people deal with or is there a good way to prevent this?
Thanks,
Austin