Re: Improving HBase scanner

Seraph Imalia Wed, 05 May 2010 08:01:40 -0700

Yeah, that is exactly why we are using GUID for the row key :)

Michelan is busy writing code to add secondary indexing - the table is about 200 Gigs big so it's gonna take a while to run, but it looks like the only option we have.



On 05 May 2010, at 10:29 AM, TuX RaceR wrote:

Also be aware that using a time based key, you will probably create 'hot spots', i.e. the nodes will get all the load one after the other at writing time, and possibly at read time too, if you query only recent data.
But I do not see any way to avoid that, as you do need a scanner,
cheers
TuX


TuX RaceR wrote:
Seraph Imalia wrote:
Hi Ryan,

Thanks for your response - I am also working on this project.
I was hoping that HBase perhaps treated the time range differently, which would prevent a full table scan. I suppose our only next option is to implement indexing?
Yes I would say so, except if a time-based key can naturally identify a record, or if you will always retrieve your records using time queries. In that case you could create a key which is a concat of a timestamp and your old SQL uid,
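
A minimal sketch of such a concatenated key, in plain Java with no HBase dependency. The class name TimeKey, the method names, and the layout (8-byte big-endian timestamp followed by the UTF-8 bytes of the uid) are assumptions for illustration only, not something taken from your schema. The idea is that row keys sort as unsigned bytes, so for non-negative timestamps a big-endian prefix keeps rows in time order and a time range becomes a start row / stop row pair instead of a full table scan:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class TimeKey {
    // Hypothetical layout: 8-byte big-endian timestamp, then the uid as UTF-8.
    static byte[] rowKey(long timestampMillis, String uid) {
        byte[] id = uid.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(Long.BYTES + id.length)
                .putLong(timestampMillis) // ByteBuffer is big-endian by default
                .put(id)
                .array();
    }

    // Boundary key holding only the timestamp prefix. It sorts before every
    // full row key with the same timestamp, so it can serve as an inclusive
    // start row or an exclusive stop row of a time-range scan.
    static byte[] boundary(long timestampMillis) {
        return ByteBuffer.allocate(Long.BYTES).putLong(timestampMillis).array();
    }

    // Unsigned lexicographic byte comparison (requires Java 9+), used here
    // only to check that the keys order the way the scan would expect.
    static int compare(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }
}
```

For a query over [from, to) you would pass boundary(from) and boundary(to) as the start and stop rows of the scan. Note this only holds for timestamps >= 0, and it does nothing about the hot spot issue mentioned above.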
cheers
TuX
