Everything you say is totally true. One last comment: if your update rate is fairly low and the IDs might carry some meaning, you might be better served by a counter, e.g. user IDs (max value = 6 billion ;-)) or anything else that might end up needing to be semi-readable by humans.
-ryan

On Mon, Mar 15, 2010 at 4:11 PM, Michael Segel <michael_se...@hotmail.com> wrote:
>
>> Date: Mon, 15 Mar 2010 08:15:10 +0100
>> Subject: Re: UUID as key wuz: RE: worth choosing the shortest possible column names/keys?
>> From: timrobertson...@gmail.com
>> To: hbase-user@hadoop.apache.org
>>
>> Sure, understood. UUID aims to be globally unique, whereas I am only
>> looking for in-cluster uniqueness across a couple billion items, but with an
>> algorithm that allows ID minting by machines in parallel.
>>
> And if you use a serial counter, you have a single counter and a single point
> of failure, or a point of contention.
> If you're running a hadoop/mapreduce job and each node inserts into HBase as
> it runs, then you have to coordinate counter access.
>
> Using UUID, you don't have that problem. Of course, you also don't get the
> sequence that you would with a counter.
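
To make the counter-versus-UUID trade-off in this thread concrete, here is a rough Python sketch. It is only an illustration, not anything from HBase itself: the names `SharedCounter`, `mint_with_counter`, `mint_with_uuid` and `run_workers` are made up for this example, threads stand in for cluster nodes, and an in-process lock stands in for whatever coordination a real cluster-wide counter would need.

```python
import threading
import uuid


class SharedCounter:
    """Hypothetical serial counter: every worker must go through one lock."""

    def __init__(self, start=0):
        self._value = start
        self._lock = threading.Lock()  # stand-in for cluster-wide coordination

    def next_id(self):
        # All workers serialize here; this is the point of contention.
        with self._lock:
            self._value += 1
            return self._value


def mint_with_counter(counter, n):
    """Mint n short, ordered IDs by asking the shared counter each time."""
    return [counter.next_id() for _ in range(n)]


def mint_with_uuid(n):
    """Mint n IDs with no shared state; each worker can run independently."""
    return [uuid.uuid4().hex for _ in range(n)]


def run_workers(mint, workers=4):
    """Run `mint` in several threads and return all minted IDs in one list."""
    results = [None] * workers

    def work(i):
        results[i] = mint()

    threads = [threading.Thread(target=work, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [item for batch in results for item in batch]


if __name__ == "__main__":
    counter = SharedCounter()
    counter_ids = run_workers(lambda: mint_with_counter(counter, 1000))
    uuid_ids = run_workers(lambda: mint_with_uuid(1000))
    print("counter: smallest", min(counter_ids), "largest", max(counter_ids))
    print("uuid: sample", uuid_ids[0], "unique", len(set(uuid_ids)))
```

The counter IDs come out small and dense, so they stay semi-readable and sequential, but every mint passes through the same lock. The UUIDs need no coordination between workers, at the cost of longer keys and no ordering.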