Thank you for your reply.

I set the replication factor to 1, that is, there is no replication; I use the cluster for research. Here is what I observe. When you store a small amount of data in HBase, it uses a huge amount of disk space: 3 million messages, which take 1 GB on disk as text in the Linux file system, take 10 GB in HBase. As you keep adding data, HBase uses more disk, but the increase is proportionally smaller: 200 million messages, which take 60 GB as text, take 180 GB in HBase. Continuing further, 6 billion messages, which take 2 TB as text, take 3 TB in HBase.

I hope that is clear. I would like to know why HBase uses 10 GB for only 3 million messages, and why disk usage does not grow linearly, that is, why it does not grow to 600 GB for 200 million messages or to 36 TB for 6 billion messages. I know this is a good property for storing big data in HBase; I want to understand why.

Could you help me?

Thank you
---------------------
Guanhua Tian

From: varun kumar [mailto:varun....@gmail.com]
Sent: January 21, 2013 16:56
To: guanhua.t...@ia.ac.cn
Cc: user@hbase.apache.org
Subject: Re: confused about Data/Disk ratio

Hi Tian,

What replication factor have you set in HDFS?

Regards,
Varun Kumar.P

On Mon, Jan 21, 2013 at 12:17 PM, tgh <guanhua.t...@ia.ac.cn> wrote:

Hi,

I use HBase to store data, and I have an observation: when HBase stores 1 GB of data, HDFS uses 10 GB of disk space; when the data is 60 GB, HDFS uses 180 GB; and when the data is about 2 TB, HDFS uses 3 TB. That is, the data/disk ratio is not linear. Why is that?

Could you help me?

Thank you
---------------------
Guanhua Tian

--
Regards,
Varun Kumar.P
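
For reference, the disk-to-text ratios implied by the three measurements reported at the top of this message can be worked out with a short script. This is only a sketch of the arithmetic: the variable and function names are made up for illustration, and the conversion 1 TB = 1024 GB is an assumption, since the message does not say which convention was used.

```python
# Measurements as reported in this thread:
# (number of messages, size as text in GB, size in HBase in GB).
# Assumption: 1 TB = 1024 GB, used to convert the 2 TB / 3 TB figures.
measurements = [
    (3_000_000, 1, 10),
    (200_000_000, 60, 180),
    (6_000_000_000, 2 * 1024, 3 * 1024),
]


def disk_to_text_ratio(text_gb, hbase_gb):
    """Return how many GB HBase uses per GB of raw text."""
    return hbase_gb / text_gb


ratios = [disk_to_text_ratio(text_gb, hbase_gb)
          for _, text_gb, hbase_gb in measurements]

for (count, text_gb, hbase_gb), ratio in zip(measurements, ratios):
    print(f"{count:>13,} messages: {text_gb} GB text -> "
          f"{hbase_gb} GB in HBase, ratio {ratio:.1f}x")
```

The ratio falls from 10x to 3x to 1.5x as the data set grows, which is the non-linear behaviour the question is about.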