Yeah, that is exactly why we are using a GUID for the row key :)

Michelan is busy writing code to add secondary indexing. The table is about 200 GB, so it's going to take a while to run, but it looks like the only option we have.


On 05 May 2010, at 10:29 AM, TuX RaceR wrote:

Also be aware that with a time-based key you will probably create 'hot spots': at write time all the load lands on one node after another, and possibly at read time too if you query only recent data.
But I do not see any way to avoid that, as you do need a scanner.
cheers
TuX
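
To illustrate the hot-spot effect described above, here is a minimal simulation sketch. It is not HBase code: the split points, the four-region layout, and the helper names (`region_for`, `time_key`, `load_per_region`) are all hypothetical. It only assumes that rows are assigned to regions by sorted key range.

```python
import bisect
import struct
import uuid
from collections import Counter

# Hypothetical split points: four regions dividing the key space by first byte.
SPLITS = [b"\x40", b"\x80", b"\xc0"]

def region_for(key: bytes) -> int:
    """Return the index of the region whose key range contains `key`."""
    return bisect.bisect_right(SPLITS, key)

def time_key(millis: int) -> bytes:
    """8-byte big-endian timestamp, so byte order matches time order."""
    return struct.pack(">Q", millis)

def load_per_region(keys) -> Counter:
    """Count how many of the given keys land in each region."""
    return Counter(region_for(k) for k in keys)

start = 1273048140000  # an arbitrary millisecond timestamp
time_keys = [time_key(start + i) for i in range(1000)]
guid_keys = [uuid.uuid4().bytes for _ in range(1000)]

# Consecutive timestamps share a key prefix, so they all hit one region;
# random GUIDs spread across the whole key space.
print("time-based:", dict(load_per_region(time_keys)))
print("GUID:", dict(load_per_region(guid_keys)))
```

In this toy layout every time-based key falls into a single region, while the GUID keys are spread over all four. In a real cluster the hot region would be whichever one currently holds the newest keys.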


TuX RaceR wrote:
Seraph Imalia wrote:
Hi Ryan,

Thanks for your response - I am also working on this project.

I was hoping that HBase perhaps treated the time range differently, which would prevent a full table scan. I suppose our only remaining option is to implement indexing?

Yes, I would say so, unless a time-based key can naturally identify a record, or you will always retrieve your records using time queries. In that case you could create a key that is the concatenation of a timestamp and your old SQL uid.

cheers
TuX
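
A rough sketch of the timestamp-plus-uid key suggested above, in plain Python rather than against the HBase client. It assumes the old SQL uid is a non-negative integer that fits in 64 bits, and the names `make_row_key`, `split_row_key`, and `scan_range` are made up for illustration. Both parts are encoded fixed-width and big-endian so that sorting the keys as bytes also sorts them by time.

```python
import struct

def make_row_key(millis: int, uid: int) -> bytes:
    """Concatenate an 8-byte big-endian timestamp with an 8-byte big-endian uid."""
    return struct.pack(">QQ", millis, uid)

def split_row_key(key: bytes):
    """Recover (millis, uid) from a key built by make_row_key."""
    return struct.unpack(">QQ", key)

def scan_range(start_millis: int, stop_millis: int):
    """Start (inclusive) and stop (exclusive) keys covering [start, stop) in time."""
    return make_row_key(start_millis, 0), make_row_key(stop_millis, 0)

# A sorted list stands in for the table, which keeps rows ordered by key.
rows = [(2000, 7), (1000, 42), (1500, 3), (1000, 5)]
keys = sorted(make_row_key(t, u) for t, u in rows)

# A time-range query becomes a contiguous key range instead of a full scan.
lo, hi = scan_range(1000, 2000)
hits = [split_row_key(k) for k in keys if lo <= k < hi]
print(hits)  # [(1000, 5), (1000, 42), (1500, 3)]
```

The start and stop keys would play the role of a scanner's start row (inclusive) and stop row (exclusive). The trade-off is the hot-spot issue mentioned earlier in the thread, since new rows always sort to the end.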





