We are running HBase 0.98.5 with Phoenix 4.2.2. All was well for a period of
time (a year or more). But now, some users can no longer search. Others do not
have problems for a similar search. Granted, the searched data is different.
The 'asset' table was created and maintained using an HBase client. Search uses
a Phoenix view on this table. The row key is L,C,A,R.
A search that is failing is what we call a multiple locality search. A single
locality search is working. In multi-locality search, there is an IN L=(...)
clause.
Note, the L is the first column in the row key. The query is shown below.
SELECT * from "asset" WHERE R=(SELECT MAX(R) from "asset" WHERE L=T1.L and
C=T1.C AND A=T1.A AND C=? AND L IN (?))
AND (T1.L in (?) AND T1.T=? AND T1.N ILIKE ? order by T1.N
asc nulls first limit 500
The subquery selects the latest revision of each 'asset'. The additional
conditions (C=? and L IN (?)) was added to make the subquery more efficient. I
did notice there is no corresponding C=? in the primary WHERE clause.
The root cause is "Could not find hash cache for joinId: The cache might have
expired and have been removed." One recommendation on the Phoenix site is to
increase the phoenix.coprocessor.maxSearchCacheTimeToLiveMs. We use the
default. This recommendation does not seem applicable because the query fails
almost immediately - not after 30 seconds. Some bugs have reported a similar
problem. But, they all seem to be multi-tenant. We do not use the multi-tenant
property. We enforce via the WHERE clause. We do use a connection pool. Once
again, the problem with using a connection pool appears to be if you use
multi-tenant. Are there other reasons not to use a connection pool? We are also
experiencing an unbalanced workload across our region servers. I have not
confirmed, at this time, whether the localities for the users that have failing
searches reside on the overloaded region server.
Unfortunately, the problem only exists in the HBase clustered environment with
controlled access, so it is difficult to enable debug logging. Does anyone have
suggestions on how to troubleshoot or theories as to what could be the cause?
Other considerations are
- rework the query.
- upgrade to Phoenix 4.7. But, it has a dependency on HBase 0.98.17. I had all
kinds of problems upgrading so I've taken that off the table.
- provide an alternative to connection pooling. Not sure this would fix the
problem. If it was the cause, why would search consistently work for some users?
- replace Phoenix (obviously not a short-term option). Given the search table
is maintained by HBase we can't leverage secondary indexes. Therefore, future
search enhancements may be limited.
<ERROR> 1 2016-05-05T13:07:49.624Z localhost - - - [logger="AssetsResource"]
[hpSDID@11 deviceId="-" reqId="33214548506
4175617" threadId="330108"] Error searching assets:
query=%7Bname%3Asynthetic%7D,sort=%7Bname%7D,limit=100,skip=0,
... Internal database server error.
Caused by: org.apache.phoenix.exception.PhoenixIOException:
org.apache.phoenix.exception.PhoenixIOException:
org.apache.hadoop.hbase.DoNotRetryIOException:
Could not find hash cache for joinId: ^@?g^R(g{. The cache might have expired
and have been removed.
at
org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
at
org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
at
org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
at
org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
at
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
at java.lang.Thread.run(Unknown Source)
at
org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:108)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:536)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.MergeSortResultIterator.getIterators(MergeSortResultIterator.java:48)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.MergeSortResultIterator.minIterator(MergeSortResultIterator.java:84)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.MergeSortResultIterator.next(MergeSortResultIterator.java:111)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.MergeSortTopNResultIterator.next(MergeSortTopNResultIterator.java:87)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.DelegateResultIterator.next(DelegateResultIterator.java:44)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:739)
~[phoenix-core-4.2.2.jar:4.2.2]
at
com.hp.cmx.metadata.connector.phoenix.PhoenixSearchDao.queryUsingJDBC(PhoenixSearchDao.java:556)
~[cmx-core-metadata-core-3.2.0.27.jar:?]
... 53 more
Caused by: java.util.concurrent.ExecutionException:
org.apache.phoenix.exception.PhoenixIOException:
org.apache.hadoop.hbase.DoNotRetryIOException: Could no
t find hash cache for joinId: ^@?g^R(g{. The cache might have expired and have
been removed.
at
org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
at
org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
at
org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
at
org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
at
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
at java.lang.Thread.run(Unknown Source)
at java.util.concurrent.FutureTask.report(Unknown Source) ~[?:1.7.0_51]
at java.util.concurrent.FutureTask.get(Unknown Source) ~[?:1.7.0_51]
at
org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:532)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.MergeSortResultIterator.getIterators(MergeSortResultIterator.java:48)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.MergeSortResultIterator.minIterator(MergeSortResultIterator.java:84)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.MergeSortResultIterator.next(MergeSortResultIterator.java:111)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.MergeSortTopNResultIterator.next(MergeSortTopNResultIterator.java:87)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.DelegateResultIterator.next(DelegateResultIterator.java:44)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:739)
~[phoenix-core-4.2.2.jar:4.2.2]
at
com.hp.cmx.metadata.connector.phoenix.PhoenixSearchDao.queryUsingJDBC(PhoenixSearchDao.java:556)
~[cmx-core-metadata-core-3.2.0.27.jar:?]
... 53 more
Caused by: org.apache.phoenix.exception.PhoenixIOException:
org.apache.hadoop.hbase.DoNotRetryIOException: Could not find hash cache for
joinId: ^@?g^R(g{.
The cache might have expired and have been removed.
at
org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
at
org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
at
org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
at
org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
at
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
at java.lang.Thread.run(Unknown Source)
at
org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:108)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.TableResultIterator.getDelegate(TableResultIterator.java:68)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:85)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:77)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:88)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:83)
~[phoenix-core-4.2.2.jar:4.2.2]
at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
~[?:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
~[?:1.7.0_51]
... 1 more
Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
org.apache.hadoop.hbase.DoNotRetryIOException: Could not find hash cache for
joinId: ^@?g^R(g{. Th
e cache might have expired and have been removed.
at
org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
at
org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
at
org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
at
org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
at
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
at java.lang.Thread.run(Unknown Source)
at sun.reflect.GeneratedConstructorAccessor288.newInstance(Unknown
Source) ~[?:?]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown
Source) ~[?:1.7.0_51]
at java.lang.reflect.Constructor.newInstance(Unknown Source)
~[?:1.7.0_51]
at
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
~[hadoop-common-2.7.1.jar:?]
at
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
~[hadoop-common-2.7.1.jar:?]
at
org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:285)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:316)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:164)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:283)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:188)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:183)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:110)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:738)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.phoenix.iterate.TableResultIterator.getDelegate(TableResultIterator.java:65)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:85)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:77)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:88)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:83)
~[phoenix-core-4.2.2.jar:4.2.2]
at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
~[?:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
~[?:1.7.0_51]
... 1 more
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException:
org.apache.hadoop.hbase.DoNotRetryIOException: Could not find hash cache for
joinId: ^@?g
^R(g{. The cache might have expired and have been removed.
at
org.apache.phoenix.coprocessor.HashJoinRegionScanner.<init>(HashJoinRegionScanner.java:96)
at
org.apache.phoenix.coprocessor.ScanRegionObserver.doPostScannerOpen(ScanRegionObserver.java:195)
at
org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:144)
at
org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1976)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3093)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
at
org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
at
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
at java.lang.Thread.run(Unknown Source)
at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1457)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
~[hbase-client-0.98.5-hadoop2.jar:
0.98.5-hadoop2]
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:29990)
~[hbase-protocol-0.98.5-hadoop2.
jar:0.98.5-hadoop2]
at
org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:308)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:164)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:283)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:188)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-had
oop2]
at
org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:183)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:110)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:738)
~[hbase-client-0.98.5-hadoop2.jar:0.98.5-hadoop2]
at
org.apache.phoenix.iterate.TableResultIterator.getDelegate(TableResultIterator.java:65)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:85)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.TableResultIterator.<init>(TableResultIterator.java:77)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:88)
~[phoenix-core-4.2.2.jar:4.2.2]
at
org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:83)
~[phoenix-core-4.2.2.jar:4.2.2]
at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
~[?:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
~[?:1.7.0_51]
... 1 more
-Jerry