Re: HBASE - select distinct query against the rowkey

Shengjie Min Thu, 20 Dec 2012 08:09:08 -0800

Thanks Michael,

> Not sure why you have timestamp in the key... assuming that message id
> would be incremented therefore rows would be in time order anyways.


I need to run queries like "give me the messages from timestamp1 to
timestamp2".
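
To make the requirement concrete, here is a rough sketch in plain Python. An
in-memory sorted list stands in for the table; this is not the HBase client
API. It assumes the userid is a fixed-length hash containing no underscore
and that messageid and timestamp are zero-padded integers. The helper names
make_rowkey and scan_user_time_range are hypothetical. Because the timestamp
is the last key component, the sketch walks one user's rows and filters on
the timestamp suffix rather than seeking directly to a time range.

```python
from bisect import bisect_left

SEP = "_"


def make_rowkey(userid, messageid, timestamp):
    """Build ${userid}_${messageid}_${timestamp}.

    Zero-padding is an assumption: it makes lexicographic order match
    numeric order for the two integer fields.
    """
    return f"{userid}{SEP}{messageid:010d}{SEP}{timestamp:013d}"


def scan_user_time_range(sorted_keys, userid, ts_start, ts_end):
    """Return one user's rowkeys with ts_start <= timestamp < ts_end.

    sorted_keys is a sorted list of rowkeys, standing in for a table
    that keeps rows ordered by key.
    """
    prefix = userid + SEP
    # Jump to the first key at or after the user's prefix.
    i = bisect_left(sorted_keys, prefix)
    matches = []
    # Walk only this user's contiguous block of rows.
    while i < len(sorted_keys) and sorted_keys[i].startswith(prefix):
        # The timestamp is the last component of the key.
        ts = int(sorted_keys[i].rsplit(SEP, 1)[1])
        if ts_start <= ts < ts_end:
            matches.append(sorted_keys[i])
        i += 1
    return matches
```

The scan touches every row of the given user, so its cost grows with the
number of messages per user, not with the width of the time range.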

>You will want to use a separate table.
That's what I thought as well. If I don't have a separate table, I will
end up doing a full table scan. But what about atomicity? What if a write
succeeds on one table but fails on the other? HBase has no concept of a
transaction in this case.
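
One way to bound the failure mode, sketched below in plain Python with dicts
standing in for the two tables (not the HBase API; put_message, user_ids,
and messages are hypothetical names): write the userid to the separate table
first, as an idempotent put, and only then write the message row. This does
not make the two writes atomic. It assumes an orphan userid with no messages
is tolerable, and it ensures only that a stored message always has its
userid recorded.

```python
class WriteFailed(Exception):
    """Raised to simulate a failed write to the message table."""


def put_message(user_ids, messages, userid, rowkey, value, fail_second=False):
    """Write to the distinct-user table first, then to the message table.

    user_ids and messages are plain dicts standing in for two tables.
    fail_second is a test hook (an assumption of this sketch) that
    simulates the second write failing.
    """
    # Step 1: idempotent put. Repeating it for the same userid is harmless.
    user_ids[userid] = b""
    # Step 2: the message row itself. If this fails, the caller can retry
    # the whole function; step 1 will simply overwrite the same cell.
    if fail_second:
        raise WriteFailed(rowkey)
    messages[rowkey] = value
```

If the second write fails and is never retried, the separate table lists a
user who has no messages yet. The reverse case, a message whose userid is
missing from the separate table, cannot occur with this ordering.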

Shengjie


On 20 December 2012 15:59, Michael Segel <[email protected]> wrote:

> Not sure why you have timestamp in the key... assuming that message id
> would be incremented therefore rows would be in time order anyways.
>
> But to answer your question...
> You will want to use a separate table.
>
> In both instances you will end up doing a full table scan; however, the
> number of rows in a distinct user table would be far fewer than in your
> users table.
>
>
> HTH
>
> -Mike
>
> On Dec 20, 2012, at 8:55 AM, Shengjie Min <[email protected]> wrote:
>
> > I have an HBase table called "users"; the rowkey consists of three parts:
> >
> >   1. userid
> >   2. messageid
> >   3. timestamp
> >
> > rowkey looks like: ${userid}_${messageid}_${timestamp}
> >
> > Given I can hash the userid and make the length of the field fixed, is
> > there any way I can do a query like this SQL query:
> >
> > select distinct(userid) from users
> >
> > If the rowkey doesn't allow me to query like this, does that mean I need
> > to create a separate table that just contains all the user ids? I guess
> > if I do something like that, it won't be atomic anymore when I insert a
> > record, because I am dealing with two tables without a transaction.
> > --
> > All the best,
> > Shengjie Min
>
>


-- 
All the best,
Shengjie Min
