On Fri, Jul 31, 2009 at 4:23 PM, Ryan Rawson<[email protected]> wrote:
> Not really, only storing 1 value per column family is a fairly
> degenerate case and not really the primary mechanism by which people
> use hbase.  The column family storage model may superficially appear
> to be like a column-store, but it can do so much more and is much more
> flexible.

Yes, I couldn't agree more, Ryan.

And that's why we choose hbase over other column-oriented DBMSs:
it gives us much more flexibility.

But from a conceptual point of view, hbase and Google bigtable are
indeed column-family-oriented database systems, and consequently they
share the benefits described in
http://en.wikipedia.org/wiki/Column-oriented_DBMS .
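
To make the two views in this thread concrete, here is a minimal Python
sketch of the layout described in the quoted messages below: one store
file per column family, and inside each file key/values sorted by row,
then column qualifier, then timestamp with the newest first. Everything
in it is an assumption for illustration only: the sample cells, the
"content" and "meta" family names, and the helpers build_store_files and
scan_family are made up and are not HBase APIs.

```python
from collections import defaultdict

# Each cell is (row, family, qualifier, timestamp, value).
# Rows, families, and values are invented sample data.
cells = [
    ("com.example/b", "content", "html", 2, "<html>v2</html>"),
    ("com.example/a", "meta", "lang", 1, "en"),
    ("com.example/b", "content", "html", 1, "<html>v1</html>"),
    ("com.example/a", "content", "html", 1, "<html>a</html>"),
    ("com.example/b", "meta", "lang", 1, "fr"),
    ("com.example/a", "meta", "charset", 1, "utf-8"),
]


def build_store_files(cells):
    """Group cells by column family: one sorted list ("file") per family."""
    files = defaultdict(list)
    for row, family, qualifier, ts, value in cells:
        files[family].append((row, qualifier, ts, value))
    for family in files:
        # Sort by row, then qualifier, then timestamp descending (newest first).
        files[family].sort(key=lambda kv: (kv[0], kv[1], -kv[2]))
    return dict(files)


def scan_family(files, family):
    """Return every key/value of one family without reading the others."""
    return list(files.get(family, []))


store_files = build_store_files(cells)
meta_scan = scan_family(store_files, "meta")
for kv in meta_scan:
    print(kv)
```

Scanning "meta" never reads the larger "content" list, which is the
column-family benefit; within each list the columns of a row still sit
next to each other, which is the row-ordered part.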



> On Fri, Jul 31, 2009 at 1:20 AM, Angus He<[email protected]> wrote:
>>> If you stored only 1 column per family, it would resemble a
>>> column-store, however as you stored more columns per family, they
>>> would be stored in "row order", ie: columns from the same row are
>>> stored next to each other.
>>
>> I know. And in my previous post, I mentioned: "You cannot equate the
>> "column" in that wikipedia article with the "column" in HBase.
>> So we should consider the "column" in wikipedia as "column-family" in
>> HBase".
>>
>> Anyway,
>> Ryan, do you agree that hbase is a "column-family oriented db system"?
>>
>>
>>
>>
>>>
>>> On Fri, Jul 31, 2009 at 1:05 AM, Angus He<[email protected]> wrote:
>>>> OK,OK,OK.
>>>>
>>>> If data is stored row-by-row in hbase, how do you explain the text
>>>> under the section "Physical Storage View" in
>>>> http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture ?
>>>> Is the page stale, or is something else wrong?
>>>>
>>>> On Fri, Jul 31, 2009 at 3:50 PM, Ryan Rawson<[email protected]> wrote:
>>>>> Data is stored row-by-row in the hbase store files (aka hfiles).
>>>>> HBase is not a column-oriented-store as described in the wikipedia
>>>>> article: http://en.wikipedia.org/wiki/Column-oriented_DBMS
>>>>>
>>>>> Have a look at the bigtable paper, do some searches, lots of material
>>>>> out there describing the benefits of a flexible store like
>>>>> bigtable/hbase.
>>>>>
>>>>> -ryan
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jul 31, 2009 at 12:42 AM, Angus He<[email protected]> wrote:
>>>>>> Hi Ryan,
>>>>>>
>>>>>> You cannot equate the "column" in that wikipedia article with the
>>>>>> "column" in HBase.
>>>>>>
>>>>>> We should assume that the word "column" in "column-oriented" is
>>>>>> predefined, otherwise, it is meaningless.
>>>>>>
>>>>>> So we should consider the "column" in wikipedia as "column-family" in
>>>>>> HBase.  In this way, the article can answer 宏明's question.
>>>>>>
>>>>>>
>>>>>> On Fri, Jul 31, 2009 at 3:18 PM, Ryan Rawson<[email protected]> wrote:
>>>>>>> Hey,
>>>>>>>
>>>>>>> The bigtable paper talks more about column families, but in HBase each
>>>>>>> column family is stored in its own file.  That means there is disk
>>>>>>> locality for different column families.  The canonical use is to put
>>>>>>> web crawl data in one family, and meta data (like derived meta data)
>>>>>>> in another.  That way scanning just the meta data is not as expensive
>>>>>>> as scanning the web page crawl dump.
>>>>>>>
>>>>>>> Column families are pre-defined - the "schema" for what it's worth -
>>>>>>> but the 'qualifier' within a family is dynamically determined by the
>>>>>>> client.
>>>>>>>
>>>>>>> In the terminology of the article, hbase would be more 'row oriented',
>>>>>>> but with the column family snag, it isn't that simple.  Since rows from
>>>>>>> different families are stored in different files, reading efficiency
>>>>>>> is related to which column families you are reading in a query.
>>>>>>>
>>>>>>> -ryan
>>>>>>>
>>>>>>> On Fri, Jul 31, 2009 at 12:02 AM, Angus He<[email protected]> wrote:
>>>>>>>> Hi Ryan,
>>>>>>>>
>>>>>>>> 1. If that is not the case, what is the purpose of introducing
>>>>>>>> "column families"?
>>>>>>>> Are the contents of different column families stored in different
>>>>>>>> files in HBase?
>>>>>>>>
>>>>>>>> BTW, in the bigtable paper, we can find the following text:
>>>>>>>> "Access control and both disk and memory accounting are performed at
>>>>>>>> the column-family level."
>>>>>>>>
>>>>>>>> 2. I was wondering if HBase shares the benefits described in the
>>>>>>>> "Benefits" section of the wikipedia article. If not, what does
>>>>>>>> "column-store" mean for HBase?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Jul 31, 2009 at 2:30 PM, Ryan Rawson<[email protected]> wrote:
>>>>>>>>> HBase and bigtable are referred to as column-stores, but we aren't a
>>>>>>>>> 'column oriented dbms' as described in the wikipedia article.
>>>>>>>>>
>>>>>>>>> At the storage level, hbase stores key-values, where the key is a
>>>>>>>>> triple of row / column / timestamp.  Files are ordered lists of these
>>>>>>>>> key/values, sorted in that order: cells of the same row are stored
>>>>>>>>> together, sorted by column and then by timestamp in reverse (newest
>>>>>>>>> on top).
>>>>>>>>>
>>>>>>>>> Thus hbase is not a 'column store' in the sense listed in the 
>>>>>>>>> wikipedia entry.
>>>>>>>>>
>>>>>>>>> On Thu, Jul 30, 2009 at 11:23 PM, Angus He<[email protected]> wrote:
>>>>>>>>>> Why don't you try to google it first?
>>>>>>>>>> After googling with the keyword "Column-oriented", the first result
>>>>>>>>>> is exactly what you want.
>>>>>>>>>> http://en.wikipedia.org/wiki/Column-oriented_DBMS
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2009/7/31  <[email protected]>:
>>>>>>>>>>> Hi,
>>>>>>>>>>> Can anyone tell me the benefits of the column-oriented data model?
>>>>>>>>>>> Thank you
>>>>>>>>>>>
>>>>>>>>>>> Fleming
>>>>>>>>>>> 宏明
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Regards
>>>>>>>>>> Angus



-- 
Regards
Angus
