> If you stored only 1 column per family, it would resemble a
> column-store, however as you stored more columns per family, they
> would be stored in "row order", ie: columns from the same row are
> stored next to each other.

I know. In my previous post I wrote: 'You cannot equate the "column" in that Wikipedia article with the "column" in HBase. So we should consider the "column" in Wikipedia as "column-family" in HBase.'

Anyway, Ryan, do you agree that HBase is a "column-family oriented" DB system?

>
> On Fri, Jul 31, 2009 at 1:05 AM, Angus He <[email protected]> wrote:
>> OK, OK, OK.
>>
>> If data is stored row-by-row in hbase, how could you explain the text
>> under section "Physical Storage View" in
>> http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture.
>> Is the page stale, or is something else wrong?
>>
>> On Fri, Jul 31, 2009 at 3:50 PM, Ryan Rawson <[email protected]> wrote:
>>> Data is stored row-by-row in the hbase store files (aka hfiles).
>>> HBase is not a column-oriented-store as described in the wikipedia
>>> article: http://en.wikipedia.org/wiki/Column-oriented_DBMS
>>>
>>> Have a look at the bigtable paper, do some searches, lots of material
>>> out there describing the benefits of a flexible store like
>>> bigtable/hbase.
>>>
>>> -ryan
>>>
>>> On Fri, Jul 31, 2009 at 12:42 AM, Angus He <[email protected]> wrote:
>>>> Hi Ryan,
>>>>
>>>> You cannot equate the "column" in that article of wikipedia to the
>>>> "column" in HBase.
>>>>
>>>> We should assume that the word "column" in "column-oriented" is
>>>> predefined, otherwise, it is meaningless.
>>>>
>>>> So we should consider the "column" in wikipedia as "column-family" in
>>>> HBase. In this way, the article can answer 宏明's question.
>>>>
>>>> On Fri, Jul 31, 2009 at 3:18 PM, Ryan Rawson <[email protected]> wrote:
>>>>> Hey,
>>>>>
>>>>> The bigtable paper talks more about column families, but in HBase each
>>>>> column family is stored in its own file. That means there is disk
>>>>> locality for different column families. The canonical use is to put
>>>>> web crawl data in one family, and meta data (like derived meta data)
>>>>> in another.
>>>>> That way scanning just the meta data is not as expensive
>>>>> as scanning the web page crawl dump.
>>>>>
>>>>> Column families are pre-defined - the "schema" for what it's worth -
>>>>> but the 'qualifier' within a family is dynamically determined by the
>>>>> client.
>>>>>
>>>>> In the terminology of the article, hbase would be more 'row oriented',
>>>>> but with the column family snag, it isn't that simple. Since rows from
>>>>> different families are stored in different files, reading efficiency
>>>>> is related to which column families you are reading in a query.
>>>>>
>>>>> -ryan
>>>>>
>>>>> On Fri, Jul 31, 2009 at 12:02 AM, Angus He <[email protected]> wrote:
>>>>>> Hi Ryan,
>>>>>>
>>>>>> 1. If that is not the case, what is the purpose of introducing
>>>>>> "column family"?
>>>>>> Are the contents of different column families stored in different
>>>>>> files in HBase?
>>>>>>
>>>>>> BTW, in the bigtable paper, we can find the following text:
>>>>>> "Access control and both disk and memory accounting are performed at
>>>>>> the column-family level."
>>>>>>
>>>>>> 2. I was wondering if HBase shares the benefits described in the
>>>>>> "Benefits" section of the wikipedia article. If not, what is the meaning
>>>>>> of "column-stores" in HBase?
>>>>>>
>>>>>> On Fri, Jul 31, 2009 at 2:30 PM, Ryan Rawson <[email protected]> wrote:
>>>>>>> HBase and bigtable are referred to as column-stores, but we aren't a
>>>>>>> 'column oriented dbms' as described in the wikipedia.
>>>>>>>
>>>>>>> At the storage level, hbase stores key-values, where the key is a
>>>>>>> triple of row / column / timestamp. Files are ordered lists of these
>>>>>>> key/values, and they are sorted in that order, hence rows are stored
>>>>>>> together, then sorted by column then reverse by timestamp (newest on
>>>>>>> top).
>>>>>>>
>>>>>>> Thus hbase is not a 'column store' in the sense listed in the wikipedia
>>>>>>> entry.
>>>>>>>
>>>>>>> On Thu, Jul 30, 2009 at 11:23 PM, Angus He <[email protected]> wrote:
>>>>>>>> Why don't you try to google it first?
>>>>>>>> After googling with the keyword "Column-oriented", the first result is
>>>>>>>> exactly what you want.
>>>>>>>> http://en.wikipedia.org/wiki/Column-oriented_DBMS
>>>>>>>>
>>>>>>>> 2009/7/31 <[email protected]>:
>>>>>>>>> Hi,
>>>>>>>>> Can anyone tell me the benefit of the column-oriented data model?
>>>>>>>>> Thank you
>>>>>>>>>
>>>>>>>>> Fleming
>>>>>>>>> 宏明
>>>>>>>>
>>>>>>>> --
>>>>>>>> Regards
>>>>>>>> Angus
>>>>>>
>>>>>> --
>>>>>> Regards
>>>>>> Angus
>>>>
>>>> --
>>>> Regards
>>>> Angus
>>
>> --
>> Regards
>> Angus

--
Regards
Angus
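
For anyone reading this thread later, the storage layout Ryan describes (key-values keyed by row / column / timestamp, one file per column family) can be sketched in a few lines of Python. This is only an illustration under assumptions, not HBase code: `build_store_files`, the tuple layout, and the sample cells are all hypothetical, and real HBase compares keys as bytes inside HFiles rather than sorting Python tuples.

```python
from collections import defaultdict


def build_store_files(cells):
    """Group cells by column family, then sort each group.

    Hypothetical model: each cell is
    (row, family, qualifier, timestamp, value).
    """
    files = defaultdict(list)
    for row, family, qualifier, timestamp, value in cells:
        # One "file" per column family, as described in the thread.
        files[family].append((row, qualifier, timestamp, value))
    for kvs in files.values():
        # Row ascending, then column qualifier ascending,
        # then timestamp descending (newest on top).
        kvs.sort(key=lambda kv: (kv[0], kv[1], -kv[2]))
    return dict(files)


# Made-up sample data: crawl content in one family, metadata in another.
cells = [
    ("row2", "meta", "lang", 1, "en"),
    ("row1", "content", "html", 1, "<html>v1</html>"),
    ("row1", "meta", "lang", 1, "en"),
    ("row1", "meta", "lang", 2, "fr"),
    ("row1", "meta", "charset", 1, "utf-8"),
]

store_files = build_store_files(cells)
for family, kvs in sorted(store_files.items()):
    print(family)
    for kv in kvs:
        print("  ", kv)
```

Within the "meta" file, all of row1's columns sit next to each other ("row order"), with the newest version of a column first. A scan that only needs "meta" never touches the "content" file, which is the locality benefit mentioned for the web crawl example.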
