We've been doing some testing with HBase, and one problem we ran into is that our machines are not homogeneous in disk capacity. A few of them have only 80 GB drives, while the rest have 250 GB drives. As blocks were distributed evenly across the cluster, the smaller machines filled up first, ran completely out of disk space, and came to a crashing halt. Since one of these machines was also the namenode, its failure took down the rest of the cluster.

What I'm wondering is whether there should be a way to tell HDFS to use only a fixed fraction, say 80%, of the available disk space before considering a machine full. Would this be a useful feature, or should we approach the problem from another angle, such as putting the HDFS data on its own partition?
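
To make the idea concrete, here is a rough Python sketch of the behavior I have in mind. It is only a toy model, not HDFS code: the function name place_blocks, the max_fill parameter, the 1 GB block size, and the simple round-robin placement are all assumptions made up for illustration.

```python
def place_blocks(capacities_gb, num_blocks, block_gb=1.0, max_fill=0.8):
    """Toy model: round-robin block placement that treats a node as full
    once it would exceed max_fill of its capacity. Not real HDFS logic."""
    used = [0.0] * len(capacities_gb)
    unplaced = 0   # blocks that no node could accept
    start = 0      # index of the node to try first for the next block
    n = len(capacities_gb)
    for _ in range(num_blocks):
        for attempt in range(n):
            i = (start + attempt) % n
            # Accept the block only if the node stays within its fill limit.
            if used[i] + block_gb <= capacities_gb[i] * max_fill:
                used[i] += block_gb
                start = i + 1
                break
        else:
            # Every node is at its limit; the block cannot be stored.
            unplaced += 1
    return used, unplaced


# One 80 GB node and two 250 GB nodes, 300 blocks of 1 GB each.
capped, _ = place_blocks([80, 250, 250], 300, max_fill=0.8)
uncapped, _ = place_blocks([80, 250, 250], 300, max_fill=1.0)
print("with 80% limit:", capped)
print("no limit:      ", uncapped)
```

In this model, the run without a limit fills the 80 GB node completely, which is what happened to us, while the run with the 80% limit stops writing to it at 64 GB and sends the remaining blocks to the larger nodes.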

-Bryan
