Thanks Michael,

> Not sure why you have timestamp in the key... assuming that message id
> would be incremented therefore rows would be in time order anyways.

I will need to run queries like "give me the messages from timestamp1 to timestamp2".

> You will want to use a separate table.

That's what I thought as well. If I don't have a separate table, I will end up doing a table scan. But what about atomicity? What if a write succeeds on one table but fails on the other? HBase has no concept of a transaction in this case.

Shengjie

On 20 December 2012 15:59, Michael Segel <[email protected]> wrote:

> Not sure why you have timestamp in the key... assuming that message id
> would be incremented therefore rows would be in time order anyways.
>
> But to answer your question...
> You will want to use a separate table.
>
> In both instances you will end up doing a full table scan, however the
> number of rows in a distinct user table would be much less than your user's
> table.
>
> HTH
>
> -Mike
>
> On Dec 20, 2012, at 8:55 AM, Shengjie Min <[email protected]> wrote:
>
> > I have a hbase table called "users", rowkey consists of three parts:
> >
> > 1. userid
> > 2. messageid
> > 3. timestamp
> >
> > rowkey looks like: ${userid}_${messageid}_${timestamp}
> >
> > Given I can hash the userid and make the length of the field fixed, is
> > there anyway I can do a query like SQL query:
> >
> > select distinct(userid) from users
> >
> > If rowkey doesn't allow me to query like this, does that mean I need to
> > create a separated table just contains all the user ids? I guess if I do
> > something like that, it won't be atomic anymore when I insert a record in,
> > becoz I am dealing with two tables without transaction.
> >
> > --
> > All the best,
> > Shengjie Min

--
All the best,
Shengjie Min
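
To make the two ideas in this thread concrete (a per-user timestamp range query and a separate table of distinct user ids), here is a minimal sketch in plain Python. It is not HBase client code: SortedTable is a hypothetical in-memory stand-in that only mimics HBase keeping rows in lexicographic rowkey order, and message_key, write_message and messages_between are made-up helper names. It also assumes, unlike the original layout, that the timestamp sits before the messageid and is zero-padded to 13 digits, so that a key range lines up with a time range. The write order (user-id table first, then the message row) is one possible way to live without cross-table transactions: writing the same userid again just overwrites the same row, so a retry is harmless, and a failure in between leaves at worst a userid with no messages yet.

```python
import bisect


class SortedTable:
    """Toy stand-in for a table whose rows are kept in lexicographic key order."""

    def __init__(self):
        self._keys = []   # sorted row keys
        self._rows = {}   # row key -> value

    def put(self, key, value):
        # Re-putting an existing key only overwrites its value.
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = value

    def scan(self, start, stop):
        """Yield (key, value) pairs with start <= key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        for key in self._keys[lo:hi]:
            yield key, self._rows[key]

    def keys(self):
        return list(self._keys)


def message_key(userid, ts, messageid):
    # Assumption: fixed-length userid, timestamp zero-padded to 13 digits
    # so that string order matches numeric order.
    return f"{userid}_{ts:013d}_{messageid}"


def write_message(users, user_ids, userid, ts, messageid, body):
    # Index row first; if the second put fails, the index may list a user
    # that has no messages yet, but never misses a user that has some.
    user_ids.put(userid, "")
    users.put(message_key(userid, ts, messageid), body)


def messages_between(users, userid, ts1, ts2):
    """Return message bodies for userid with ts1 <= timestamp <= ts2."""
    start = f"{userid}_{ts1:013d}_"
    stop = f"{userid}_{ts2 + 1:013d}_"
    return [body for _, body in users.scan(start, stop)]


def distinct_users(user_ids):
    # Equivalent of "select distinct(userid)": scan the small table only.
    return user_ids.keys()
```

With this layout the range query touches only one user's rows, and the distinct-user query scans the small table instead of every message row.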
