Efficient time based queries - TIMERANGE or STARTROW/STOPROW?

Josh Wed, 12 Apr 2017 10:34:07 -0700

Hi,

I am just getting started with HBase, and have a question about the
efficiency of timestamp-based scans.


My table's row key has structure `uuid#reverse_timestamp` where
reverse_timestamp is (java.lang.Long.MAX_VALUE - time in millis when the
row was written). For a given uuid I want to be able to retrieve the most
recent 10 rows in the table where timestamp is greater than x. It's
possible that a given uuid may have many thousands of rows (with different
timestamps).

I found there are two ways to run my query:
1. use HBase's built-in timestamps and scan a time range:
scan 'mytable', {STARTROW => '647b2194-fbb8-46af-95ba-f498ddc8adcc',
TIMERANGE => [x, current_time], LIMIT => 10}

2. use only my row keys to do the scan, with STARTROW and STOPROW:
scan 'mytable', {STARTROW => '647b2194-fbb8-46af-95ba-f498ddc8adcc',
STOPROW => '647b2194-fbb8-46af-95ba-f498ddc8adcc#x', LIMIT => 10}
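
To make the second option concrete, here is a minimal sketch of how the row keys and scan bounds could be computed. It assumes the key is stored as a plain string with the reverse timestamp written in decimal, and that the `x` in the stop row stands for the reversed value (Long.MAX_VALUE - x). The class and method names are hypothetical, and `inRange` only imitates the inclusive-start, exclusive-stop behaviour of a scan; it does not talk to HBase.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of HBase: models the uuid#reverse_timestamp layout.
final class ReverseTimestampKeys {

    // Builds "uuid#reverse_timestamp". For any realistic non-negative millis value
    // the reversed number has 19 digits, so string order matches numeric order.
    public static String rowKey(String uuid, long writeMillis) {
        return uuid + "#" + (Long.MAX_VALUE - writeMillis);
    }

    // Exclusive stop row for "written strictly after sinceMillis":
    // newer rows have a smaller reversed value, so they sort before this key.
    public static String stopRow(String uuid, long sinceMillis) {
        return rowKey(uuid, sinceMillis);
    }

    // Imitates scan bounds: start row inclusive, stop row exclusive.
    public static boolean inRange(String key, String start, String stop) {
        return key.compareTo(start) >= 0 && key.compareTo(stop) < 0;
    }

    public static void main(String[] args) {
        String uuid = "647b2194-fbb8-46af-95ba-f498ddc8adcc";
        long x = 1_491_000_000_000L; // assumed example value for the cutoff

        List<String> keys = new ArrayList<>();
        for (long t : new long[] {x - 5, x, x + 5, x + 10}) {
            keys.add(rowKey(uuid, t));
        }
        Collections.sort(keys); // most recent write sorts first

        String start = uuid;            // same start row as in the shell command
        String stop = stopRow(uuid, x); // uuid#(Long.MAX_VALUE - x)
        for (String key : keys) {
            if (inRange(key, start, stop)) {
                System.out.println(key); // only rows written after x
            }
        }
    }
}
```

With these bounds the rows come back newest first, so LIMIT => 10 would return the 10 most recent rows written after x.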

Both of these seem to work - but is one more efficient than the other?

Thanks for any advice,
Josh
