Hello Drill,
We're noticing somewhat odd behavior with the following query against an
HBase table.
The key of the table is, roughly speaking,
*8byteHash(string1)8byteHash(string2)*
SELECT CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'BIGINT') p1_long, ...
from {table}
WHERE CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'BIGINT_BE') =
hash_to_long('key_part1') limit 10
The query does seem to return the correct result set, but it times out
on larger tables. hash_to_long is a UDF that I wrote; it converts a
string to a long such that the above equality can be satisfied.
It appears that Drill doesn't push this filter down into the sub-scan (i.e.
a prefix HBase scan), even though the operator profile shows HBASE_SUB_SCAN:
[image: Inline image 1]
The physical plan starts with an unconstrained full table scan:
Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec
[tableName={table}, startRow=null, stopRow=null, filter=null],
How can we force the WHERE clause to be reflected in the scan bounds?
We're running the latest Drill, 1.6.
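To make the expectation concrete, here is a minimal sketch of the bounds we would hope to see in the scan spec. It assumes the hash is stored big-endian in the first 8 bytes of the key (consistent with 'BIGINT_BE' in the WHERE clause). The class and method names are made up for illustration, and the long passed in stands in for whatever hash_to_long returns; none of this is Drill or HBase API.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper, not part of Drill or HBase: derives the row range
// that covers every key starting with a given 8-byte big-endian hash.
public class PrefixScanBounds {

    // startRow: the 8-byte big-endian encoding of the hash.
    // ByteBuffer is big-endian by default.
    public static byte[] startRow(long hash) {
        return ByteBuffer.allocate(Long.BYTES).putLong(hash).array();
    }

    // stopRow (exclusive): the smallest key that sorts after every key with
    // this prefix. Trailing 0xFF bytes are dropped and the last remaining
    // byte is incremented. An empty result means "scan to end of table".
    public static byte[] stopRow(byte[] prefix) {
        int i = prefix.length - 1;
        while (i >= 0 && prefix[i] == (byte) 0xFF) {
            i--;
        }
        if (i < 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, i + 1);
        stop[i]++;
        return stop;
    }

    public static void main(String[] args) {
        long hash = 1L; // placeholder value, not a real hash_to_long result
        byte[] start = startRow(hash);
        byte[] stop = stopRow(start);
        System.out.println("startRow=" + Arrays.toString(start));
        System.out.println("stopRow=" + Arrays.toString(stop));
    }
}
```

In other words, we would expect startRow and stopRow in the HBaseScanSpec to be populated with values like these rather than null.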
Andrey