Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "Hbase/DataModel" page has been changed by Misty: https://wiki.apache.org/hadoop/Hbase/DataModel?action=diff&rev1=9&rev2=10

The HBase Wiki is in the process of being decommissioned. The info that used to be on this page has moved to http://hbase.apache.org/book.html#datamodel. Please update your bookmarks.

'''This page is a work in progress'''

 * [[#intro|Introduction]]
 * [[#overview|Overview]]
 * [[#row|Rows]]
 * [[#columns|Column Families]]
 * [[#ts|Timestamps]]
 * [[#famatt|Family Attributes]]
 * [[#example|Real Life Example]]
 * [[#relational|The Source ERD]]
 * [[#hbaseschema|The HBase Target Schema]]

<<Anchor(intro)>>
= Introduction =

The Bigtable data model, and therefore the HBase data model too since it is a clone, is particularly well adapted to data-intensive systems. You cannot get high scalability from a relational database simply by adding more machines, because its data model is based on a single-machine architecture. For example, a JOIN between two tables is done in memory and does not take into account the possibility that the data has to go over the wire. Companies that did propose distributed relational databases had a lot of redesign to do, which is why they have high licensing costs. The other option is to use replication, and when the slaves are overloaded with ''writes'', the last resort is to begin sharding the tables into sub-databases. At that point, data normalization is a thing you only remember seeing in class, which is why going with the data model presented in this page shouldn't bother you at all.

<<Anchor(overview)>>
= Overview =

To put it simply, HBase can be reduced to a Map<byte[], Map<byte[], Map<byte[], Map<Long, byte[]>>>>. The first Map maps row keys to their ''column families''. The second maps column families to their ''column keys''. The third one maps column keys to their ''timestamps''. Finally, the last one maps the timestamps to a single value.
The keys are typically strings, the timestamp is a long, and the value is an uninterpreted array of bytes. The column key is always preceded by its family and is represented like this: ''family:key''. Since a family maps to another map, a single column family can contain a theoretically unbounded number of column keys. So, to retrieve a single value, the user has to do a ''get'' using three keys:

 row key + column key + timestamp -> value

<<Anchor(row)>>
= Rows =

The row key is treated by HBase as an array of bytes, but it must have a string representation. A special property of the row key Map is that it keeps the keys in lexicographical order. For example, the numbers from 1 to 100 will be ordered like this:

 1,10,100,11,12,13,14,15,16,17,18,19,2,20,21,...,9,91,92,93,94,95,96,97,98,99

To keep the natural ordering of integers, the row keys have to be left-padded with zeros. To take advantage of this ordering, the row key Map is augmented with a scanner which takes a ''start row key'' (if not specified, the first one in the table) and a ''stop row key'' (if not specified, the last one in the table). For example, if the row keys are dates in the format YYYYMMDD, getting the month of July 2008 is a matter of opening a scanner from ''20080700'' to ''20080800''. It does not matter whether the specified row keys exist or not; the only thing to keep in mind is that the stop row key will not be returned, which is why the first of August is given to the scanner.

<<Anchor(columns)>>
= Column Families =

A column family groups data of the same nature in HBase and has no constraint on the type. The families are part of the table schema and stay the same for each row; what differs from row to row is that the column keys can be very sparse.
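The nested-map view from the Overview, the three-key ''get'', and the start/stop scanner can all be sketched in a few lines of Java. This is a toy model only: the class `ToyTable` and its method names are illustrative, not the HBase client API, and Strings stand in for byte arrays.

```java
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;

// A toy model of the HBase data model:
// row key -> family -> column key -> timestamp -> value.
public class ToyTable {
    private final NavigableMap<String,
            NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>>>
            rows = new TreeMap<>();

    public void put(String row, String family, String column, long ts, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(family, f -> new TreeMap<>())
            .computeIfAbsent(column, c -> new TreeMap<>())
            .put(ts, value);
    }

    // Retrieving a single value takes three keys:
    // row key + column key + timestamp -> value.
    public String get(String row, String family, String column, long ts) {
        NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> families = rows.get(row);
        if (families == null) return null;
        NavigableMap<String, NavigableMap<Long, String>> columns = families.get(family);
        if (columns == null) return null;
        NavigableMap<Long, String> versions = columns.get(column);
        if (versions == null) return null;
        return versions.get(ts);
    }

    // A scanner runs from the start row key (inclusive) to the stop row key
    // (exclusive); neither key has to exist in the table. The TreeMap keeps
    // row keys in lexicographical order, just like HBase.
    public Set<String> scan(String startRow, String stopRow) {
        return rows.subMap(startRow, true, stopRow, false).keySet();
    }
}
```

Because `TreeMap` sorts keys lexicographically, scanning from "20080700" to "20080800" returns exactly the July rows, and the stop key itself is never returned.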
For example, row "20080702" may have in its "info:" family the following column keys:

||info:aaa||
||info:bbb||
||info:ccc||

While row "20080703" only has:

||info:12342||

Developers have to be very careful when using column keys, since a key with a length of zero is permitted, which means that in the previous example data can be inserted in column key "info:". We strongly suggest using the empty column key only when no other keys will be specified. Also, since the data in a family has the same nature, many attributes can be specified regarding [[#famatt|performance and timestamps]].

<<Anchor(ts)>>
= Timestamps =

The values in HBase may have multiple versions, kept according to the family configuration. By default, HBase sets the timestamp of each new value to the current time in milliseconds and returns the latest version when a cell is retrieved. Developers can also provide their own timestamps when inserting data, just as they can specify a certain timestamp when fetching it.

<<Anchor(famatt)>>
= Family Attributes =

The following attributes can be specified for each family:

Implemented
 * Compression
  * Record: each exact value found at a row key + column key + timestamp is compressed independently.
  * Block: blocks in HDFS are compressed. A block may contain multiple records if they are shorter than one HDFS block, or may only contain part of a record if the record is longer than an HDFS block.
 * Timestamps
  * Max number: the maximum number of different versions a value can have.
  * Time to live: versions older than the specified time will be garbage collected.
 * Block Cache: caches blocks fetched from HDFS in an LRU-style queue. Improves random read performance and is a nice feature while waiting for full in-memory storage.

Still not implemented
 * In memory: all values of that family will be kept in memory.
 * Length: values written will not be longer than the specified number of bytes.
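The per-family timestamp behaviour described above (latest version returned by default, older versions garbage collected past the family's maximum) can be sketched as a toy versioned cell. The class name and the limit of 3 versions are illustrative; this is not HBase code.

```java
import java.util.Collections;
import java.util.NavigableMap;
import java.util.TreeMap;

// A toy model of one HBase cell whose family keeps at most 3 versions.
public class VersionedCell {
    static final int MAX_VERSIONS = 3;

    // Timestamps sorted in descending order, so the first entry is the newest.
    private final NavigableMap<Long, String> versions =
            new TreeMap<>(Collections.reverseOrder());

    public void put(long timestamp, String value) {
        versions.put(timestamp, value);
        // Garbage-collect versions beyond the family's maximum
        // (lastKey() is the oldest timestamp under the reversed order).
        while (versions.size() > MAX_VERSIONS) {
            versions.remove(versions.lastKey());
        }
    }

    // Without an explicit timestamp, the latest version is returned.
    public String getLatest() {
        return versions.firstEntry().getValue();
    }

    // With an explicit timestamp, the value at exactly that version.
    public String getAt(long timestamp) {
        return versions.get(timestamp);
    }
}
```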
See [[https://issues.apache.org/jira/browse/HBASE-742|HBASE-742]]

<<Anchor(example)>>
= Real Life Example =

The following example is the same one given during the HBase ETS presentation, available in French on the [[HBase/HBasePresentations|presentations page]].

A good way to demonstrate the HBase data model is a blog, because of its simple features and domain. Suppose the following mini-SRS (software requirements specification):
 * The blog entries, which consist of a title, an under title, a date, an author, a type (or tag), a text, and comments, can be created and updated by logged-in users.
 * The users, which consist of a username, a password, and a name, can log in and log out.
 * The comments, which consist of a title, an author, and a text, can be written anonymously by visitors as long as their identity is verified by a captcha.

<<Anchor(relational)>>
== The Source ERD ==

Let us consider the ERD (entity relationship diagram) below:

{{http://people.apache.org/~jdcryans/db_blog.jpg}}

<<Anchor(hbaseschema)>>
== The HBase Target Schema ==

A first solution could be:

||Table||Row Key||Family||Attributes||
||blogtable||TTYYYYMMDDHHmmss||info:||Always contains the column keys author, title, under_title. Should be IN-MEMORY and keep 1 version||
|| || ||text:||No column key. 3 versions||
|| || ||comment_title:||Column keys are written like YYYYMMDDHHmmss. Should be IN-MEMORY and keep 1 version||
|| || ||comment_author:||Same keys. 1 version||
|| || ||comment_text:||Same keys. 1 version||
||usertable||login_name||info:||Always contains the column keys password and name. 1 version||

The row key for blogtable is a concatenation of its type (shortened to 2 letters) and its timestamp. This way, the rows will be gathered first by type and then by date throughout the cluster, which means more chances of hitting a single region to fetch the needed data.
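Building such a row key can be sketched as follows. The 2-letter code "BL" for blog entries is an assumption for illustration; the page does not fix the actual codes, and the class name is hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class BlogRowKey {
    // Matches the TTYYYYMMDDHHmmss row key layout from the schema table.
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    // Row key = 2-letter type code + timestamp, e.g. "BL20080702143000".
    // Rows therefore cluster first by type, then by date.
    public static String rowKey(String typeCode, LocalDateTime when) {
        return typeCode + when.format(TS);
    }
}
```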
Also, you can see that the one-to-many relationship between BLOGENTRY and COMMENT is handled by storing each attribute of the comments as a family in blogtable; by using a date as the column key, all comments are already sorted.

One advantage of this design is that when you show the "front page" of your blog, you only have to fetch the family "info:" from blogtable. When you show an actual blog entry, you fetch a whole row. Another advantage is that by using timestamps in the row key, your scanner will fetch sequential rows if you want to show, for example, the entries from the last month.
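The "entries from one month" scan works with the same start/stop row key trick as before. Below is a sketch over a plain sorted map standing in for blogtable; the "BL" type code is an illustrative assumption.

```java
import java.util.SortedMap;
import java.util.TreeMap;

public class MonthScan {
    // Returns the blog-entry rows for one month, given as "yyyyMM".
    // The start key is inclusive and the stop key is exclusive, so July 2008
    // is scanned as ["BL20080700000000", "BL20080800000000").
    public static SortedMap<String, String> entriesForMonth(
            TreeMap<String, String> blogtable, String yyyyMM) {
        String start = "BL" + yyyyMM + "00000000";
        String stop = "BL" + (Integer.parseInt(yyyyMM) + 1) + "00000000";
        return blogtable.subMap(start, stop);
    }
}
```

Because the row keys are date-ordered, the matching rows are stored sequentially, which is exactly what makes this scan cheap.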