Hi,

Planet-scale data exploration and data mining will almost always involve some sequential scans. So how can we speed up sequential scans? The BigTable paper shows how:
* Column-oriented storage (it reduces I/O)
* Data compression
* PDP (parallel distributed processing) using Map/Reduce

Also, column-major matrices typically perform better with column-oriented operations, and likewise row-major matrices with row-oriented operations. See the Hama/Heart project (http://incubator.apache.org/hama, http://wiki.apache.org/incubator/HeartProposal) on Hadoop + HBase.

Salesman, Edward :)

On Wed, Aug 27, 2008 at 4:57 PM, Mork0075 <[EMAIL PROTECTED]> wrote:
> I'm still really interested in these three questions :)
>
>> Your example with the friends makes perfect sense. Can you imagine a
>> scenario where storing the data in a column-oriented instead of a
>> row-oriented db (a counterexample, if you will) causes such a huge
>> performance mismatch, like the friends one in the row/column comparison?
>
>> Can you please provide an example of "good de-normalization" in HBase
>> and how it is kept consistent (in your friends example in a relational
>> db, there would be a cascading delete)? As I think of the users table:
>> if I delete a user with userid='123', do I then have to walk through
>> all of the other users' "friends" column family to guarantee
>> consistency?! Is de-normalization in HBase only used to avoid joins?
>> Our webapp doesn't use joins at the moment anyway.
>
>> As you describe it, it's a problem of implementation. BigTable is
>> designed to scale; there are routines to shard the data and distribute
>> it to the pool of connected servers. Could MySQL perhaps decide
>> tomorrow to implement something similar, or does the relational model
>> prevent this?
>
>>
>> Jonathan Gray wrote:
>>> A few very big differences...
>>>
>>> - HBase/BigTable don't have "transactions" in the same way that a
>>> relational database does. While it is possible (and was just recently
>>> implemented for HBase, see HBASE-669) it is not at the core of this
>>> design. A major bottleneck of distributed multi-master relational
>>> databases is distributed transactions/locks.
>>>
>>> - There's a very big difference between storage of
>>> relational/row-oriented databases and column-oriented databases. For
>>> example, if I have a table of 'users' and I need to store friendships
>>> between these users... In a relational database my design is something
>>> like:
>>>
>>> Table: users(pkey = userid)
>>> Table: friendships(userid,friendid,...) which contains one (or maybe
>>> two depending on how it's implemented) row for each friendship.
>>>
>>> In order to look up a given user's friends: SELECT * FROM friendships
>>> WHERE userid = 'myid';
>>>
>>> This query would use an index on the friendships table to retrieve all
>>> the necessary rows. Depending on the relational database you might
>>> also be fetching each and every row (entirely) off of disk to be
>>> read. In a sharded relational database, this would require hitting
>>> every node to get whichever friendships were stored on that node.
>>> There's lots of room for optimizations here but any way you slice it,
>>> you're likely pulling non-sequential blocks off disk. When you add in
>>> the overhead of ACID transactions this can get slow.
>>>
>>> The cost of this relational query continues to increase as a user adds
>>> more friends. You also begin to have practical limits. If I have
>>> millions of users, each with many thousands of potential friends, the
>>> size of these indexes grows dramatically and things get nasty
>>> quickly. Rather than friendships, imagine I'm storing activity logs
>>> of actions taken by users.
>>>
>>> In a column-oriented database these things scale continuously with
>>> minimal difference between 10 users and 10,000,000 users, 10
>>> friendships and 10,000 friendships.
>>>
>>> Rather than a friendships table, you could just have a friendships
>>> column family in the users table. Each column in that family would
>>> contain the ID of a friend. The value could store anything else you
>>> would have stored in the friendships table in the relational model.
>>> As column families are stored together/sequentially on a per-row
>>> basis, reading a user with 1 friend versus a user with 10,000 friends
>>> is virtually the same. The biggest difference is just in the shipping
>>> of this information across the network which is unavoidable. In this
>>> system a user could have 10,000,000 friends. In a relational database
>>> the size of the friendship table would grow massively and the indexes
>>> would be out of control.
>>>
>>>
>>> It's certainly possible to make relational databases "scale". What
>>> that is about is usually massive optimizations, manual sharding, being
>>> very clever about how you query things, and often de-normalizing.
>>> Index bloat and table bloat can thrash a relational db.
>>>
>>> In HBase, de-normalizing is usually a good thing. Storage space is
>>> often considered free (not a large cost associated with storing
>>> something in multiple places). In a relational database this is not
>>> so much the case.
>>>
>>> In HBase, the sharding and distribution come for free out of the box.
>>> With a relational database you must jump through hoops and often end
>>> up implementing your own sharding/distributed hashing algorithms so
>>> you can distribute across machines.
>>>
>>> Yes, you lose the relational primitives. Sometimes you wish you could
>>> do a simple join. But if you get in the right mindset, you learn how
>>> to put your data into this new data model. And in the end the payoffs
>>> are huge.
>>>
>>>
>>> In addition to all this, with HBase/Hadoop you get MapReduce. It's
>>> feasible to implement something like that on top of a distributed
>>> relational database but again the complexity is enormous. With
>>> HBase/Hadoop it's a built-in part of the system, a system which is
>>> very intelligent at keeping logic close to data, etc.
>>>
>>>
>>> To directly answer your question, it's "simpler" to scale HBase
>>> because the scaling comes for free out of the box.
>>> You get automatic sharding/distribution of storage and queries.
>>> There's nothing simpler than that. Distributing a relational
>>> database is never simple.
>>>
>>> I hope this starts to shed some light on what the differences are.
>>>
>>>
>>> Jonathan Gray
>>> Streamy Inc.
>>>
>>> -----Original Message-----
>>> From: Mork0075 [mailto:[EMAIL PROTECTED]
>>> Sent: Thursday, August 21, 2008 8:48 AM
>>> To: core-user@hadoop.apache.org; [EMAIL PROTECTED]
>>> Subject: Re: Why is scaling HBase much simpler then scaling a
>>> relational db?
>>>
>>> Thank you, but I still don't get it.
>>>
>>> I've read tons of websites and papers, but there's no clear and
>>> well-founded answer to "why use BigTable instead of relational
>>> databases".
>>>
>>> MySQL Cluster seems to offer the same scalability and level of
>>> abstraction, without switching to a non-relational paradigm. Lots of
>>> blog posts are highly emotional, without answering the core question:
>>>
>>> "Why don't RDBMSs scale, and why does something like BigTable?" Often
>>> you read something like this:
>>>
>>> "They have also built a system called BigTable, which is a Column
>>> Oriented Database, which splits a table into columns rather than rows
>>> making it much simpler to distribute and parallelize."
>>>
>>> Why?
>>>
>>> Really confusing ... ;)
>>>
>>> Stuart Sierra wrote:
>>>> On Tue, Aug 19, 2008 at 9:44 AM, Mork0075 <[EMAIL PROTECTED]>
>>>> wrote:
>>>>> Can you please explain why someone should use HBase for horizontal
>>>>> scaling instead of a relational database? One reason for me would be
>>>>> that I don't have to implement the sharding logic myself. Are there
>>>>> others?
>>>> A slight tangent -- there are various tools that implement sharding
>>>> over relational databases like MySQL.
>>>> Two that I know of are DBSlayer,
>>>> http://code.nytimes.com/projects/dbslayer
>>>> and MySQL Proxy,
>>>> http://forge.mysql.com/wiki/MySQL_Proxy
>>>>
>>>> I don't know of any formal comparisons between sharding traditional
>>>> database servers and distributed databases like HBase.
>>>> -Stuart
>>>>

--
Best regards, Edward J. Yoon
[EMAIL PROTECTED]
http://blog.udanax.org
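
To make the friends example from Jonathan's message above a bit more concrete, here is a minimal sketch in plain Python. It only models the two layouts with lists and dicts; it does not use a real HBase or SQL client, and every name in it (friendships, users, friends_relational, friends_column_family, the user IDs and the "since" attribute) is made up for illustration.

```python
# Toy model only: plain Python containers stand in for storage.
# No HBase or SQL client is involved; all names are hypothetical.

# Relational layout: a separate table with one row per friendship.
friendships = [
    ("alice", "bob", {"since": 2007}),
    ("alice", "carol", {"since": 2008}),
    ("bob", "alice", {"since": 2007}),
]


def friends_relational(userid):
    """Rough analogue of: SELECT * FROM friendships WHERE userid = ?

    The list comprehension stands in for an index lookup that collects
    many separate rows belonging to the same user.
    """
    return {fid: attrs for uid, fid, attrs in friendships if uid == userid}


# Column-family layout: one row per user. Inside the "friends" family,
# each column name is a friend's ID and the value holds the extra data.
users = {
    "alice": {
        "info": {"name": "Alice"},
        "friends": {"bob": {"since": 2007}, "carol": {"since": 2008}},
    },
    "bob": {"info": {"name": "Bob"}, "friends": {"alice": {"since": 2007}}},
    "carol": {"info": {"name": "Carol"}, "friends": {}},
}


def friends_column_family(userid):
    """Read one row by key and return its whole "friends" family."""
    return users[userid]["friends"]


if __name__ == "__main__":
    print(sorted(friends_relational("alice")))
    print(sorted(friends_column_family("alice")))
```

Both functions return the same friends for a given user. The difference is that the second one touches a single row key, which is the property the quoted explanation relies on when it says reading a user with 1 friend or 10,000 friends is virtually the same.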