Hi Ted,

Thanks for the fast reply!

OK, I see. Just out of interest: if I changed my row key to uuid#timestamp (instead of uuid#reverse_timestamp), would the timestamp (TIMERANGE) approach still be equally efficient? I just want to understand whether the timestamp approach relies on the ordering of my row keys.
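
To make the two layouts concrete, here is a minimal sketch of how the keys would sort. It is plain Python with no HBase client involved; the helper names `reverse_key` and `forward_key` and the 19-digit zero-padding are assumptions for illustration, not something my table necessarily does. It relies only on the fact that HBase keeps rows sorted by the raw bytes of the row key.

```python
LONG_MAX = 2**63 - 1  # same value as java.lang.Long.MAX_VALUE


def reverse_key(uuid: str, millis: int) -> bytes:
    # uuid#reverse_timestamp; zero-padding (an assumption) keeps byte order
    # consistent with numeric order.
    return f"{uuid}#{LONG_MAX - millis:019d}".encode()


def forward_key(uuid: str, millis: int) -> bytes:
    # uuid#timestamp, padded the same way.
    return f"{uuid}#{millis:019d}".encode()


uuid = "647b2194-fbb8-46af-95ba-f498ddc8adcc"
writes = [1491990000000, 1492000000000, 1492010000000]  # oldest to newest

# Sorting by the key bytes mimics the order in which a scan returns rows.
reverse_order = sorted(writes, key=lambda t: reverse_key(uuid, t))
forward_order = sorted(writes, key=lambda t: forward_key(uuid, t))

print(reverse_order)  # newest write first
print(forward_order)  # oldest write first
```

So with uuid#reverse_timestamp a scan starting at the uuid meets the newest rows first, while with uuid#timestamp it meets the oldest rows first. That is why I am wondering whether LIMIT => 10 combined with TIMERANGE would have to read past many older rows in the second layout.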
Josh

On Wed, Apr 12, 2017 at 6:39 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Since STARTROW is specified (with uuid) in both of your examples, I think
> their efficiency should be tantamount.
>
> Cheers
>
> On Wed, Apr 12, 2017 at 10:33 AM, Josh <jof...@gmail.com> wrote:
>
> > Hi,
> >
> > I am just getting started with HBase, and have a question about the
> > efficiency of timestamp based scans.
> >
> > My table's row key has structure `uuid#reverse_timestamp` where
> > reverse_timestamp is (java.lang.Long.MAX_VALUE - time in millis when the
> > row was written). For a given uuid I want to be able to retrieve the most
> > recent 10 rows in the table where timestamp is greater than x. It's
> > possible that a given uuid may have many thousands of rows (with different
> > timestamps).
> >
> > I found there are two ways to run my query:
> >
> > 1. use HBase's built in timestamps and scan a time range:
> >
> > scan 'mytable', {STARTROW => '647b2194-fbb8-46af-95ba-f498ddc8adcc',
> > TIMERANGE => [x, current_time], LIMIT => 10}
> >
> > 2. use only my row keys to do the scan, with STARTROW and STOPROW:
> >
> > scan 'mytable', {STARTROW => '647b2194-fbb8-46af-95ba-f498ddc8adcc',
> > STOPROW => '647b2194-fbb8-46af-95ba-f498ddc8adcc#x', LIMIT => 10}
> >
> > Both of these seem to work - but is one more efficient than the other?
> >
> > Thanks for any advice,
> > Josh