Todd,

Can you define what a Locality Group is for HBase and how it would
function?  E.g., it sounds like the same thing as a column-family, and
it's not clear how one would benefit from the usage of an LG.

Jason

On Mon, Jun 13, 2011 at 8:45 PM, Todd Lipcon <t...@cloudera.com> wrote:
> Keep in mind that BigTable can have a large number of CFs because they also
> have Locality Groups. HBase has a 1:1 mapping of CF -> Locality Group.
>
> I don't know for sure, but I imagine most of these BT tables have a very
> small number of locality groups, even if they have 20+ CFs.
>
> Would be nice to extract CFs from LGs in HBase some day, if anyone has a
> month free ;-)
>
> -Todd
>
> On Mon, Jun 13, 2011 at 2:44 PM, Jason Rutherglen <
> jason.rutherg...@gmail.com> wrote:
>
>> > Table 2 provides some actual CF/table numbers.  One of the crawl tables has
>> > 16 CFs and one of the Google Base tables had 29 CFs
>>
>> What's Google doing in BigTable that enables so many CFs?
>>
>> Is the cost in HBase the seek to each individual key in the CFs, or is
>> it the cost of loading each block into RAM (?), which could be
>> alleviated through bypassing the block cache and accessing the blocks
>> as if they're local?
>>
>> On Mon, Jun 13, 2011 at 2:35 PM, Leif Wickland <leifwickl...@gmail.com>
>> wrote:
>> > Thanks for replying, J-D.
>> >
>> > My interpretation is that they try to keep that number low, from page 2:
>> >>
>> >> "It is our intent that the number of distinct column families in a
>> >> table be small (in the hundreds at most)"
>> >>
>> >
>> > Table 2 provides some actual CF/table numbers.  One of the crawl tables has
>> > 16 CFs and one of the Google Base tables had 29 CFs.
>> >
>> >
>> >> Could you just store that in the same family?
>> >>
>> >
>> > Yup.  I could.  There would be a little weirdness to it, but I think it's
>> > doable.  It seems like that's the consensus suggestion.
>> >
>> >
>> >> Row locking is rarely a good idea: it doesn't scale, and the locks currently
>> >> aren't persisted anywhere except the RS memory (so if it dies...).
>> >> Using a single family might be better for you.
>> >
>> >
>> > Thanks for the pointer.
>> >
>> > Leif
>> >
>>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
