Re: Rowkey design question

Mohammad Tariq Tue, 19 Feb 2013 09:35:42 -0800

No, put the hash before the timestamp. HBase keeps rows sorted by row key,
so keys that are close to each other end up in the same region. This is the
default HBase behavior and is meant to improve performance. But sometimes a
single machine gets overloaded because all the reads and writes are
concentrated on that one region. Time series data is the typical example,
since every new key sorts right after the previous one. So it's better to
hash the keys so that they are spread evenly across all the machines. HTH
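
To make the idea concrete, here is a minimal sketch of one way to put a
hash-derived prefix in front of the key. It uses only the JDK, not the HBase
API. The class name SaltedRowKey, the method saltedKey, the "NN-" prefix
format and the bucket count of 16 are all made up for illustration and are
not something HBase prescribes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

class SaltedRowKey {
    // Hypothetical number of prefixes (buckets) to spread writes over.
    static final int BUCKETS = 16;

    // Builds "<bucket>-<timestamp>+<ip>", where the bucket is derived from
    // an MD5 hash of the natural key, so the same input always maps to the
    // same prefix and a single row can still be looked up directly.
    static String saltedKey(long timestamp, String ip) {
        String natural = timestamp + "+" + ip;
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(natural.getBytes(StandardCharsets.UTF_8));
            int bucket = (digest[0] & 0xFF) % BUCKETS;
            // Zero-padded so that all prefixes have the same width.
            return String.format(Locale.ROOT, "%02d-%s", bucket, natural);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(saltedKey(1357020000000L, "192.168.178.9"));
    }
}
```

The catch with any such prefix is that rows of one time range are no longer
contiguous, so a time range scan has to be run once per prefix and the
results merged on the client.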


BTW, did that range query work??
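
In case it helps, here is why the string-based start and stop rows should
match the timestamp+ipaddress keys. HBase compares row keys byte by byte,
and the new keys are stored as text, so the bounds have to be text as well;
Bytes.toBytes on a long produces 8 binary bytes, which do not line up with
the ASCII digits. The sketch below mimics that comparison with the JDK only.
StringKeyRange, compare and inRange are hypothetical names, and it assumes
every timestamp has the same number of digits (true for 13-digit millisecond
values).

```java
import java.nio.charset.StandardCharsets;

class StringKeyRange {
    // Unsigned, byte-wise lexicographic comparison of two keys.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        // When one key is a prefix of the other, the shorter one sorts first.
        return a.length - b.length;
    }

    // True if rowKey falls in [startMillis, endMillis + 1), with both
    // bounds encoded as decimal text, as in the suggested scan.
    static boolean inRange(String rowKey, long startMillis, long endMillis) {
        byte[] key = rowKey.getBytes(StandardCharsets.UTF_8);
        byte[] start = Long.toString(startMillis).getBytes(StandardCharsets.UTF_8);
        byte[] stop = Long.toString(endMillis + 1).getBytes(StandardCharsets.UTF_8);
        return compare(key, start) >= 0 && compare(key, stop) < 0;
    }

    public static void main(String[] args) {
        String row = "1357020000000+192.168.178.9";
        System.out.println(inRange(row, 1357020000000L, 1357020000000L));
    }
}
```

The "+ 1" on the stop row matters because the stop row is exclusive: without
it, rows whose timestamp equals the end date would sort after the stop row
(they are longer than it) and be left out.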

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com


On Tue, Feb 19, 2013 at 9:54 PM, Paul van Hoven <
paul.van.ho...@googlemail.com> wrote:

> Hey Tariq,
>
> thanks for your quick answer. I'm not sure if I got the idea in the
> second part of your answer. You mean if I use a timestamp as a rowkey I
> should append a hash like this:
>
> 1357279200000+MD5HASH
>
> and then the data would be distributed more equally?
>
>
> 2013/2/19 Mohammad Tariq <donta...@gmail.com>:
> > Hello Paul,
> >
> >     Try this and see if it works :
> >        scan.setStartRow(Bytes.toBytes(startDate.getTime() + ""));
> >        scan.setStopRow(Bytes.toBytes(endDate.getTime() + 1 + ""));
> >
> > Also try not to use TS as the rowkey, as it may lead to RS hotspotting.
> > Just add a hash to your rowkeys so that data is distributed evenly on all
> > the RSs.
> >
> > Warm Regards,
> > Tariq
> > https://mtariq.jux.com/
> > cloudfront.blogspot.com
> >
> >
> > On Tue, Feb 19, 2013 at 9:41 PM, Paul van Hoven <
> > paul.van.ho...@googlemail.com> wrote:
> >
> >> Hi,
> >>
> >> I'm currently playing with hbase. The design of the rowkey seems to be
> >> critical.
> >>
> >> The rowkey for a certain database table of mine is:
> >>
> >> timestamp+ipaddress
> >>
> >> It looks something like this when performing a scan on the table in the
> >> shell:
> >> hbase(main):012:0> scan 'ToyDataTable'
> >> ROW                                         COLUMN+CELL
> >>  1357020000000+192.168.178.9                column=CF:SampleCol, timestamp=1361288601717, value=Entry_1 = 2013-01-01 07:00:00
> >>
> >> Since I have several rows for different timestamps, I'd like to
> >> restrict a scan to just a range of the table, for example from
> >> 2013-01-07 to 2013-01-09. Previously I only had a timestamp as the
> >> rowkey and I could restrict the scan like this:
> >>
> >> SimpleDateFormat formatter = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
> >> Date startDate = formatter.parse("2013-01-07 07:00:00");
> >> Date endDate = formatter.parse("2013-01-10 07:00:00");
> >>
> >> HTableInterface toyDataTable = pool.getTable("ToyDataTable");
> >> Scan scan = new Scan(Bytes.toBytes(startDate.getTime()),
> >>                      Bytes.toBytes(endDate.getTime()));
> >>
> >> But this no longer works with my new design.
> >>
> >> Is there a way to tell the scan object to filter the rows with respect
> >> to the timestamp, or do I have to use a filter object?
> >>
>
