Re: Is it necessary to set MD5 on rowkey?

Damien Hardy Tue, 18 Dec 2012 01:34:01 -0800

Hello,

There is a middle ground between sequential keys (hot-spotting risk) and MD5
keys (heavy scans, since hashing destroys the key order):
  * you can use composite keys with a leading field that segregates the data
(hostname, product name, metric name), like OpenTSDB
  * or use a salt with a limited number of values (for example
substr(md5(rowid),0,1) = 16 values),
    so that a scan becomes a combination of 16 filters, one per salt value.
    You can base your code on HBaseWD by Sematext:


http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
https://github.com/sematext/HBaseWD
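
To make the salt idea concrete, here is a minimal sketch in Python. It is
only an illustration under stated assumptions: the function names
(salt_for, salted_key, scan_ranges, scan_by_date) and the "|" separator are
hypothetical, a sorted in-memory list stands in for an HBase table, and none
of this is HBaseWD's actual API.

```python
import bisect
import hashlib

# Assumption: one hex character of salt, hence 16 possible prefixes.
SALT_CHARS = "0123456789abcdef"
SEP = "|"  # hypothetical separator between salt and original row id


def salt_for(row_id: str) -> str:
    """Return the first hex character of md5(row_id): one of 16 values."""
    return hashlib.md5(row_id.encode("utf-8")).hexdigest()[0]


def salted_key(row_id: str) -> str:
    """Prefix the row id with its salt so writes spread over 16 key ranges."""
    return salt_for(row_id) + SEP + row_id


def scan_ranges(start: str, stop: str):
    """Build one [start, stop) key range per salt value (16 in total)."""
    return [(s + SEP + start, s + SEP + stop) for s in SALT_CHARS]


def range_scan(sorted_keys, start, stop):
    """Stand-in for a table range scan over lexicographically sorted keys."""
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_left(sorted_keys, stop)
    return sorted_keys[lo:hi]


def scan_by_date(sorted_keys, start, stop):
    """Run the 16 per-salt scans, strip the salt, and merge the results."""
    rows = []
    for lo, hi in scan_ranges(start, stop):
        for key in range_scan(sorted_keys, lo, hi):
            rows.append(key.split(SEP, 1)[1])
    return sorted(rows)


if __name__ == "__main__":
    dates = ["201212%02d" % day for day in range(1, 32)]
    table = sorted(salted_key(d) for d in dates)
    # Rows for 10..14 December, gathered from all 16 salt ranges.
    print(scan_by_date(table, "20121210", "20121215"))
```

Within each salt prefix the original keys keep their order, so a date range
is still a contiguous scan there; the cost is 16 small scans instead of one.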

Cheers,


2012/12/18 bigdata <[email protected]>

> Many articles tell me that using MD5 for the rowkey, or part of it, is a good
> method to balance the records stored in different parts. But what if I want
> to search for records with sequential rowkeys, for example when the date is
> all or part of the rowkey? Once the date is hashed with MD5, I cannot use a
> rowkey filter to scan a range of dates in one pass. How to balance this issue?
> Thanks.
>
>




-- 
Damien HARDY
