thanks, i think it makes sense. i will go through the hbase architecture again to make sure i fully understand the mapping to "regions" and "stores".
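As a rough illustration of that mapping, here is a minimal sketch in plain Python (not HBase code). It follows the description in the quoted replies below: one store per region and column family combination, and a flush that covers all stores of a region. The `Table` class, the two helper functions, and all the numbers are made up for illustration, and the flush rule is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    regions: int           # regions the table is currently split into
    column_families: int   # column families declared on the table


def store_count(tables):
    """One store per (region, column family) pair, summed over all tables."""
    return sum(t.regions * t.column_families for t in tables)


def hfile_sizes_at_flush(write_share_pct, flush_threshold_bytes):
    """Approximate HFile size per column family for one region flush.

    Simplifying assumption: the region flushes once its memstores together
    reach flush_threshold_bytes, and every store is flushed at that moment.
    write_share_pct maps column family name -> integer percent of writes.
    """
    return {cf: flush_threshold_bytes * pct // 100
            for cf, pct in write_share_pct.items()}


# Hypothetical layouts: five single-family tables vs. one five-family table.
many_tables = [Table(f"t{i}", regions=10, column_families=1) for i in range(5)]
one_table = [Table("t", regions=10, column_families=5)]

# Hypothetical skew: one family takes 99% of the writes.
skewed = hfile_sizes_at_flush({"big": 99, "small": 1}, 128 * 1024 * 1024)
```

Under these assumptions both layouts come out at the same number of stores (5 x 10 x 1 and 1 x 10 x 5), which is why a large number of tables is not free either. In the skewed case the "small" family is flushed together with the "big" one and ends up with an HFile of roughly 1% of the flush threshold, which is the lopsided HFile distribution described below.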
On Fri, Aug 23, 2013 at 10:33 AM, Michael Segel <michael_se...@hotmail.com> wrote:

> I think the issue which a lot of people miss is why you want to use a
> column family in the first place.
>
> Column families are part of the same table structure, and each family is
> kept separate.
>
> So in your design, do you have tables which are related, but are not
> always used at the same time?
>
> The example that I use when I teach about HBase or do a
> lecture/presentation is an Order Entry system.
> Here you have an order being entered, then you have one or many pick
> slips being generated, same for shipping, then there's the invoicing.
> All separate processes which relate back to the same order.
>
> So here it makes sense to use column families.
>
> Other areas could be metadata surrounding a transaction. Again... few
> column families are tied together.
>
> Does this make sense?
>
>
> On Aug 23, 2013, at 12:35 AM, lars hofhansl <la...@apache.org> wrote:
>
> > You can think of it this way: Every combination of region and column
> > family is a "store" in HBase. Each store has a memstore and its own set
> > of HFiles in HDFS.
> > The more stores you have, the more there is to manage.
> >
> > So you want to limit the number of stores. Also note that the word
> > "Table" is somewhat of a misnomer in HBase; it would have been better
> > called "Keyspace".
> > The extra caution for the number of column families per table stems from
> > the fact that HBase flushes by region (which means all stores of that
> > region are flushed). This in turn means that unless all column families
> > hold roughly the same amount of data you end up with very lopsided
> > distributions of HFile sizes.
> >
> > -- Lars
> >
> >
> > ________________________________
> > From: Koert Kuipers <ko...@tresata.com>
> > To: user@hbase.apache.org; vrodio...@carrieriq.com
> > Sent: Thursday, August 22, 2013 12:30 PM
> > Subject: Re: one column family but lots of tables
> >
> >
> > if that is the case, how come people keep warning about limiting the number
> > of column families to only a handful (with more, hbase performance will
> > degrade supposedly), yet there seem to be no similar warnings for the number
> > of tables? see for example here:
> > http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/27616
> >
> > if a table means at least one column family then the number of tables
> > should also be kept to a minimum, no?
> >
> >
> > On Thu, Aug 22, 2013 at 1:58 PM, Vladimir Rodionov
> > <vrodio...@carrieriq.com> wrote:
> >
> >> Nope. Column family is per table (it's a sub-directory inside your table
> >> directory in HDFS).
> >> If you have N tables you will always have at least N distinct CFs (even
> >> if they have the same name).
> >>
> >> Best regards,
> >> Vladimir Rodionov
> >> Principal Platform Engineer
> >> Carrier IQ, www.carrieriq.com
> >> e-mail: vrodio...@carrieriq.com
> >>
> >> ________________________________________
> >> From: Koert Kuipers [ko...@tresata.com]
> >> Sent: Thursday, August 22, 2013 8:06 AM
> >> To: user@hbase.apache.org
> >> Subject: one column family but lots of tables
> >>
> >> i read in multiple places that i should try to limit the number of column
> >> families in hbase.
> >>
> >> do i understand it correctly that when i create lots of tables, but they
> >> all use the same column family (by name), that i am just using one column
> >> family and i am OK with respect to limiting the number of column families?
> >>
> >> thanks!
> >> koert
>
> The opinions expressed here are mine, while they may reflect a cognitive
> thought, that is purely accidental.
> Use at your own risk.
> Michael Segel
> michael_segel (AT) hotmail.com