Yeah, that is exactly why we are using a GUID for the row key :)
Michelan is busy writing code to add secondary indexing. The table is
about 200 GB, so it is going to take a while to run, but it looks
like the only option we have.
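
A minimal sketch of the secondary-indexing idea, assuming the index is a
second structure sorted by time that points back to the GUID row key. It
uses plain Python containers as stand-ins for the tables and is not the
code being written for this project; the names main_table, time_index,
put and query_time_range are hypothetical.

```python
import bisect
import uuid

# Stand-in for the main table: GUID row key -> record.
main_table = {}
# Stand-in for the index table: (timestamp_ms, guid) pairs kept sorted by time.
time_index = []


def put(record_time_ms, payload):
    """Store a record under a random GUID and add a matching index entry."""
    guid = str(uuid.uuid4())
    main_table[guid] = {"time": record_time_ms, "payload": payload}
    bisect.insort(time_index, (record_time_ms, guid))
    return guid


def query_time_range(start_ms, stop_ms):
    """Return records with start_ms <= time < stop_ms via the index only."""
    lo = bisect.bisect_left(time_index, (start_ms, ""))
    hi = bisect.bisect_left(time_index, (stop_ms, ""))
    # Each index hit becomes a direct lookup by GUID in the main table,
    # so the main table is never scanned in full.
    return [main_table[guid] for _, guid in time_index[lo:hi]]
```

The cost is one extra write per record, plus a one-off backfill pass over
the existing rows to populate the index.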
On 05 May 2010, at 10:29 AM, TuX RaceR wrote:
Also be aware that with a time-based key you will probably create
'hot spots', i.e. the nodes will take all the load one after the
other at write time, and possibly at read time too, if you query
only recent data.
But I do not see any way to avoid that, as you do need a scanner,
cheers
TuX
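
The hot-spot effect described above can be illustrated with a small
simulation. This is only a sketch under assumptions: four regions
partitioned by sorted key range, with made-up split points and made-up
timestamps. It calls no HBase API.

```python
import bisect
import uuid
from collections import Counter


def region_for(key, split_points):
    """Index of the region holding `key`, given the sorted split keys."""
    return bisect.bisect_right(split_points, key)


# Hypothetical layout: four regions over zero-padded hex timestamp keys,
# split at boundaries that were chosen when older data was loaded.
time_splits = ["%016x" % t for t in (1000, 2000, 3000)]
# New writes carry increasing timestamps later than every split point.
time_keys = ["%016x" % t for t in range(5000, 5100)]
time_load = Counter(region_for(k, time_splits) for k in time_keys)

# Same four-region layout for GUID hex keys, split evenly over the key space.
guid_splits = [c + "0" * 31 for c in ("4", "8", "c")]
guid_keys = [uuid.uuid4().hex for _ in range(100)]
guid_load = Counter(region_for(k, guid_splits) for k in guid_keys)

print("time-based keys per region:", dict(time_load))
print("GUID keys per region:", dict(guid_load))
```

With the time-based keys every write lands in the last region, while the
random GUID keys spread across the regions.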
TuX RaceR wrote:
Seraph Imalia wrote:
Hi Ryan,
Thanks for your response - I am also working on this project.
I was hoping that HBase perhaps treated the time range differently,
which would prevent a full table scan. I suppose our only remaining
option is to implement indexing?
Yes, I would say so, unless a time-based key can naturally
identify a record, or you will always retrieve your records
using time queries.
In that case you could create a key which is a concatenation of a
timestamp and your old SQL uid,
cheers
TuX
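
A minimal sketch of such a concatenated key, assuming the timestamp is in
milliseconds and the old SQL uid is a 16-byte GUID. The helper names
make_row_key, split_row_key and scan_bounds are hypothetical. The
timestamp is packed big-endian so that byte-wise key order matches time
order, which is what lets a scanner cover a time range with a start key
and a stop key.

```python
import struct
import uuid


def make_row_key(timestamp_ms, uid):
    """Concatenate an 8-byte big-endian timestamp with the 16-byte uid."""
    return struct.pack(">Q", timestamp_ms) + uid.bytes


def split_row_key(row_key):
    """Recover (timestamp_ms, uid) from a key built by make_row_key."""
    (timestamp_ms,) = struct.unpack(">Q", row_key[:8])
    return timestamp_ms, uuid.UUID(bytes=row_key[8:])


def scan_bounds(start_ms, stop_ms):
    """Start key (inclusive) and stop key (exclusive) for a time range."""
    return struct.pack(">Q", start_ms), struct.pack(">Q", stop_ms)


if __name__ == "__main__":
    key = make_row_key(1273055340000, uuid.uuid4())
    print(len(key), split_row_key(key))
```

The uid suffix keeps keys unique when two records share the same
millisecond.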