Hey,

The Bigtable paper talks more about column families, but in HBase each column family is stored in its own files. That means data in one family sits together on disk, separately from the other families. The canonical use is to put web crawl data in one family and metadata (such as derived metadata) in another. That way, scanning just the metadata is far cheaper than scanning the web page crawl dump.
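
To make the locality point concrete, here is a rough sketch in Python. It is a toy model, not the HBase client API: `ToyTable`, its methods, and the family names `content` and `meta` are all made up for illustration. Each family gets its own in-memory "file", so a scan only pays for the family it reads.

```python
class ToyTable:
    """Toy model only (hypothetical, not HBase code): one list per column family."""

    def __init__(self, families):
        # Families are fixed when the table is created.
        self.files = {family: [] for family in families}

    def put(self, row, family, qualifier, value):
        if family not in self.files:
            raise KeyError(f"unknown column family: {family}")
        # The cell is written only to its own family's "file".
        self.files[family].append((row, qualifier, value))

    def scan(self, family):
        """Read a single family's file; return (cells, bytes_read)."""
        cells = sorted(self.files[family], key=lambda c: (c[0], c[1]))
        bytes_read = sum(len(value) for _, _, value in cells)
        return cells, bytes_read


table = ToyTable(["content", "meta"])
for i in range(3):
    row = f"com.example/page{i}"
    table.put(row, "content", "html", b"x" * 10_000)  # large crawl dump
    table.put(row, "meta", "lang", b"en")             # small derived metadata

_, meta_bytes = table.scan("meta")
_, content_bytes = table.scan("content")
print(meta_bytes, content_bytes)
```

Scanning `meta` never touches the `content` list, which is the whole point of splitting the two into separate families.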
Column families are predefined (the "schema", for what it's worth), but the qualifier within a family is chosen dynamically by the client. In the terminology of the article, HBase would be more 'row oriented', but with the column family wrinkle it isn't that simple. Since data from different families is stored in different files, read efficiency depends on which column families a query touches.

-ryan

On Fri, Jul 31, 2009 at 12:02 AM, Angus He<[email protected]> wrote:
> Hi Ryan,
>
> 1. If it is not the case , what is the purpose of introduction of
> "column family"?
> Does the contents from different column family stored in different
> files in HBase?
>
> BTW, in the bigtable paper, we can find the following text:
> "Access control and both disk and memory accounting are performed at
> the column-family level."
>
> 2. I was wondering if HBase shares the benefits described in the
> "Benefits" sections of wikipedia article. If not, what is the meaning
> of "column-stores" in HBase?
>
> On Fri, Jul 31, 2009 at 2:30 PM, Ryan Rawson<[email protected]> wrote:
>> HBase and bigtable are referred to column-stores, but we arent a
>> 'column oriented dbms' as described in the wikipedia.
>>
>> At the storage level, hbase stores key-values, where the key is a
>> triple of row / column / timestamp. Files are ordered lists of these
>> key/values, and they are sorted in that order, hence rows are stored
>> together, then sorted by column then reverse by timestamp (newest on
>> top).
>>
>> Thus hbase is not a 'column store' in the sense listed in the wikipedia
>> entry.
>>
>> On Thu, Jul 30, 2009 at 11:23 PM, Angus He<[email protected]> wrote:
>>> Why don't you try to google it first?
>>> After googling with the keyword "Column-oriented", the first result is
>>> exactly what you want.
>>> http://en.wikipedia.org/wiki/Column-oriented_DBMS
>>>
>>> 2009/7/31 <[email protected]>:
>>>> Hi,
>>>> Does anyone can tell me the benefit of Column-oriented data modal?
>>>> Thank you
>>>>
>>>> Fleming
>>>> 宏明
>>>
>>> --
>>> Regards
>>> Angus
>
> --
> Regards
> Angus
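
For anyone reading the thread later, the storage order described in the quoted reply (row, then column, then timestamp with the newest first) can be sketched in a few lines of Python. This only illustrates the ordering and is not HBase code; `sort_key` and the sample cells are made up, and all cells are placed in a single hypothetical family `meta` because each family lives in its own files.

```python
def sort_key(cell):
    """Order by row, then column, then timestamp descending (newest first)."""
    row, column, timestamp, _value = cell
    return (row, column, -timestamp)


# Hypothetical cells: (row, column, timestamp, value), all in one family.
cells = [
    ("row2", "meta:lang", 5, "en"),
    ("row1", "meta:lang", 1, "fr"),
    ("row1", "meta:title", 3, "Home"),
    ("row1", "meta:lang", 9, "en"),
]

ordered = sorted(cells, key=sort_key)
for row, column, timestamp, value in ordered:
    print(row, column, timestamp, value)
```

All of `row1` comes out before `row2`, and within `row1` the two versions of `meta:lang` sit next to each other with the newer timestamp first. That is why the layout behaves more like a row-ordered store than the column-oriented DBMS in the Wikipedia article.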
