Hi,
I am using hbase 1.1 with phoenix 4.6.
I have a table with row key as (current_timestamp, id) which is salted and
index on (id). This table has ~3 million records.
I have a query like given below.
SELECT ID, CURRENT_TIMESTAMP, <SOME OTHER COLUMNS> from TBL
as a inner join ( select
max(CURRENT_TIMESTAMP) as mct, ID as tid from TBL where ID like
'42ecf4abd4bd7e7606025dc8eee3de6a3cc04418cbc2619ddc01f54d88d7c3bf%' group by
ID) as tmp on a.ID=tmp.tid and
a.CURRENT_TIMESTAMP=tmp.mct
The query hangs for long and finally fails with a timeout. I have 12 region
servers each with 5GB heap and also the total records satisfying the above
query is 62K whose CSV dump is ~10MB only.
DoNotRetryIOException: Could not find hash cache for join Id: Ӧ�8�D�. The cache
might have expired and have been removed
and -
Caused by: java.sql.SQLException: Encountered exception in sub plan [0]
execution.
at
org.apache.phoenix.execute.HashJoinPlan.iterator(HashJoinPlan.java:175)
at
com.infa.products.ldm.ingestion.server.java.hadoop.impl.FixPhoenixIngestInputFormat.getQueryPlan(FixPhoenixIngestInputFormat.java:94)
... 22 more
and
Caused by: java.sql.SQLException:
java.util.concurrent.TimeoutException at
org.apache.phoenix.cache.ServerCacheClient.addServerCache(ServerCacheClient.java:264)
I can try playing around with parameters such as
phoenix.coprocessor.maxServerCacheTimeToLiveMs and switching to sort_merge_join
actually helped.
But my question is as per Joins | Apache Phoenix in a case such as lhs INNER
JOIN rhs, it is rhs which will be built as hash table in server cache. So, in
the above query I assume this gets cached?
select max(CURRENT_TIMESTAMP) as mct, ID as tid from TBL where ID like
'42ecf4abd4bd7e7606025dc8eee3de6a3cc04418cbc2619ddc01f54d88d7c3bf%' group by
ID) as tmp
Thanks,
Sumit