Are you suggesting taking advantage of the sorted order to seek to the key
mentioned in a SARG?

Pretty much, yes. It's essentially the same use case as predicate pushdown for live tables (already implemented), which converts predicates into a scan, so we should be able to reuse a significant amount of that code. It's a somewhat limited use case, but I'd argue it's a significant one for Hive on HBase: if you've designed your HBase row key around your query patterns, it's reasonable to expect that most queries over snapshots will be SARGable (that's certainly true for our use case, though I can't speak for others).
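
To make the idea concrete, here is a rough sketch of turning key predicates into start/stop key ranges. It is not the actual Hive or HBase code: the class and method names (SargToKeyRanges, KeyRange, equalsTo, between, inList, scan) are hypothetical, a TreeMap stands in for HBase's sorted row keys, and plain ASCII strings stand in for byte[] row keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeSet;

// Hypothetical sketch only: a sorted map plays the role of a snapshot's
// sorted row keys, and each predicate becomes one or more key ranges.
class SargToKeyRanges {

    /** Half-open range [startInclusive, stopExclusive), like a scan's start/stop rows. */
    static final class KeyRange {
        final String startInclusive;
        final String stopExclusive;

        KeyRange(String startInclusive, String stopExclusive) {
            this.startInclusive = startInclusive;
            this.stopExclusive = stopExclusive;
        }
    }

    /** key = 'k': appending the lowest character gives the smallest key after k. */
    static KeyRange equalsTo(String key) {
        return new KeyRange(key, key + '\u0000');
    }

    /** key BETWEEN low AND high, inclusive on both ends. */
    static KeyRange between(String low, String high) {
        return new KeyRange(low, high + '\u0000');
    }

    /** key IN (...): one point range per distinct key, visited in sorted order. */
    static List<KeyRange> inList(String... keys) {
        List<KeyRange> ranges = new ArrayList<>();
        for (String key : new TreeSet<>(Arrays.asList(keys))) {
            ranges.add(equalsTo(key));
        }
        return ranges;
    }

    /** Seeks to each range in turn instead of reading every row. */
    static List<String> scan(NavigableMap<String, String> rows, List<KeyRange> ranges) {
        List<String> matchedKeys = new ArrayList<>();
        for (KeyRange r : ranges) {
            if (r.startInclusive.compareTo(r.stopExclusive) >= 0) {
                continue; // empty range, e.g. BETWEEN with low > high
            }
            matchedKeys.addAll(
                rows.subMap(r.startInclusive, true, r.stopExclusive, false).keySet());
        }
        return matchedKeys;
    }
}
```

An equality filter becomes a single seek, BETWEEN becomes one bounded range, and IN becomes a handful of seeks. Real row keys are compared as bytes, so an actual implementation would need to build the stop key on byte[] rather than on strings.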

Given that, does it seem worthwhile enough to file a ticket? We may implement it either way (depending on how our preliminary performance testing of queries over snapshots goes).

Thanks!

Andrew

On 3/30/15 8:03 PM, Gopal Vijayaraghavan wrote:
Looking at the current implementation on trunk, Hive's HBase integration
doesn't currently seem to support predicate pushdown for queries over
HBase snapshots. Does this seem like a reasonable feature to add?
It would be nice to have relative feature parity between queries running
over snapshots and queries running over live tables.
Are you suggesting taking advantage of the sorted order to seek to the key
mentioned in a SARG?

That particular method will be limited to simple filters on exactly one
key or, with a few seeks, the more generic IN/BETWEEN SARGs.

But for that case, it will provide a significant boost.

Cheers,
Gopal


