Are you suggesting taking advantage of the sorted order to seek to the key
mentioned in a SARG?
Pretty much, yes. It's essentially the same use case as predicate
pushdown for the live table case (already implemented), which converts
predicates into a scan, and we should be able to reuse a significant
amount of that code. It is perhaps a somewhat limited use case, but I'd
argue that it's a reasonably significant one for Hive on HBase--if
you've designed your HBase row key based on your query patterns, it's
reasonable to expect that most queries over snapshots will be SARGable
(that's certainly true for our use case, though I can't speak so much to
others).
Given that, does it seem worthwhile enough to file a ticket? We may
implement it either way (depending on how our preliminary performance
testing of queries over snapshots goes).
Thanks!
Andrew
On 3/30/15 8:03 PM, Gopal Vijayaraghavan wrote:
Looking at the current implementation on trunk, Hive's HBase integration
doesn't currently seem to support predicate pushdown for queries over
HBase snapshots. Does this seem like a reasonable feature to add?
It would be nice to have relative feature parity between queries running
over snapshots and queries running over live tables.
Are you suggesting taking advantage of the sorted order to seek to the key
mentioned in a SARG?
That particular method will be limited to simple filters on exactly one
key or, perhaps with a few seeks, the more generic IN/BETWEEN SARGs.
But for that case, it will provide a significant boost.
Cheers,
Gopal
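
The seek-based idea discussed in this thread can be illustrated with a
minimal sketch. It is not Hive or HBase code: a sorted in-memory list
stands in for the sorted row keys of a snapshot, and the names
key_ranges and seek_scan, as well as the tuple encoding of predicates,
are hypothetical and chosen only for illustration. Each supported
predicate on the row key becomes one or more inclusive key ranges, and
each range is answered with a binary-search seek instead of a full scan.

```python
from bisect import bisect_left, bisect_right


def key_ranges(predicate):
    """Turn a simple row-key predicate into inclusive (start, stop) ranges.

    The tuple encoding is an assumption made for this sketch:
      ("=", key), ("BETWEEN", low, high), ("IN", [key, ...])
    """
    op = predicate[0]
    if op == "=":
        return [(predicate[1], predicate[1])]
    if op == "BETWEEN":
        return [(predicate[1], predicate[2])]
    if op == "IN":
        # One point range per distinct key, i.e. one seek per key.
        return [(k, k) for k in sorted(set(predicate[1]))]
    raise ValueError("predicate is not expressible as a key range: %r" % (op,))


def seek_scan(sorted_keys, predicate):
    """Return matching keys by seeking into the sorted key list."""
    hits = []
    for start, stop in key_ranges(predicate):
        lo = bisect_left(sorted_keys, start)   # first key >= start
        hi = bisect_right(sorted_keys, stop)   # one past the last key <= stop
        hits.extend(sorted_keys[lo:hi])
    return hits


# Example data: row keys in sorted order, as HBase stores them.
keys = ["row01", "row03", "row05", "row07", "row09"]
point = seek_scan(keys, ("=", "row05"))
between = seek_scan(keys, ("BETWEEN", "row02", "row07"))
in_list = seek_scan(keys, ("IN", ["row09", "row01", "row04"]))
```

A predicate that cannot be reduced to key ranges raises an error here;
a real implementation would presumably fall back to a full scan with
the filter applied afterwards.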