Good discussion folks. I've opened HBASE-13323 for this effort to be pursued.
https://issues.apache.org/jira/browse/HBASE-13323

On Mon, Mar 23, 2015 at 7:50 AM, Michael Segel <michael_se...@hotmail.com> wrote:

> @lars,
>
> How does the HDFS load balancer impact the load balancing of HBase?
>
> Of course there are two loads… one is the number of regions managed by a
> region server; that’s HBase’s load, right?
> And then there’s the data distribution of HBase files, which is really
> managed by the HDFS load balancer, right?
>
> OP’s question is about a heterogeneous cluster where he would like to see
> a more even distribution of data/free space based on the capacity of the
> newer machines in the cluster.
>
> This is a storage question, not a memory/cpu core question.
>
> Or am I missing something?
>
> -Mike
>
> On Mar 22, 2015, at 10:56 PM, lars hofhansl <la...@apache.org> wrote:
> >
> > Seems that it should not be too hard to add that to the stochastic load
> > balancer.
> > We could add a spaceCost or something.
> >
> > ----- Original Message -----
> > From: Jean-Marc Spaggiari <jean-m...@spaggiari.org>
> > To: user <user@hbase.apache.org>
> > Cc: Development <developm...@mentacapital.com>
> > Sent: Thursday, March 19, 2015 12:55 PM
> > Subject: Re: introducing nodes w/ more storage
> >
> > You can extend the default balancer and assign the regions based on
> > that. But at the end, the replicated blocks might still go all over the
> > cluster and your "small" nodes are going to be full and will not be able to
> > get any more writes even for the regions they are supposed to get.
> >
> > I'm not sure there is a good solution for what you are looking for :(
> >
> > I built my own balancer but because of differences in the CPUs, not because
> > of differences of the storage space...
> >
> > 2015-03-19 15:50 GMT-04:00 Nick Dimiduk <ndimi...@gmail.com>:
> >
> >> Seems more fantasy than fact, I'm afraid. The default load balancer [0]
> >> takes store file size into account, but has no concept of capacity. It
> >> doesn't know that nodes in a heterogeneous environment have different
> >> capacity.
> >>
> >> This would be a good feature to add though.
> >>
> >> [0]:
> >> https://github.com/apache/hbase/blob/branch-1.0/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
> >>
> >> On Tue, Mar 17, 2015 at 7:26 AM, Ted Tuttle <t...@mentacapital.com> wrote:
> >>
> >>> Hello-
> >>>
> >>> Sometime back I asked a question about introducing new nodes w/ more
> >>> storage than existing nodes. I was told at the time that HBase will not be
> >>> able to utilize the additional storage; I assumed at the time that regions
> >>> are allocated to nodes in something like a round-robin fashion and the node
> >>> with the least storage sets the limit for how much each node can utilize.
> >>>
> >>> My question this time around has to do with nodes w/ unequal numbers of
> >>> volumes: Does HBase allocate regions based on nodes or volumes on the
> >>> nodes? I am hoping I can add a node with 8 volumes totaling 8X TB and all
> >>> the volumes will be filled. This even though legacy nodes have 5 volumes
> >>> and total storage of 5X TB.
> >>>
> >>> Fact or fantasy?
> >>>
> >>> Thanks,
> >>> Ted
>
> The opinions expressed here are mine, while they may reflect a cognitive
> thought, that is purely accidental.
> Use at your own risk.
> Michael Segel
> michael_segel (AT) hotmail.com
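
To make the spaceCost idea from the quoted thread a bit more concrete, here is a rough sketch of what a capacity-aware cost could compute. This is only an illustration under assumptions: the class name SpaceCostSketch, the method spaceCost, and the "used units / capacity units" inputs are all hypothetical, it does not touch the real StochasticLoadBalancer API, and it assumes the cost should land between 0 and 1. It also says nothing about where HDFS actually places the replicated blocks, which is the concern raised above.

```java
/**
 * Hypothetical sketch of a capacity-aware "space cost". Not HBase code:
 * the names and the input shape are assumptions made for illustration.
 */
final class SpaceCostSketch {

    /**
     * Returns 0.0 when every server holds data in proportion to its capacity,
     * and grows toward 1.0 as the data becomes more out of proportion.
     *
     * @param usedUnits     data currently hosted per server (any consistent unit)
     * @param capacityUnits storage capacity per server (same unit)
     */
    static double spaceCost(long[] usedUnits, long[] capacityUnits) {
        if (usedUnits.length == 0 || usedUnits.length != capacityUnits.length) {
            throw new IllegalArgumentException("need one used/capacity pair per server");
        }
        double totalUsed = 0;
        double totalCapacity = 0;
        for (int i = 0; i < usedUnits.length; i++) {
            if (usedUnits[i] < 0 || capacityUnits[i] <= 0) {
                throw new IllegalArgumentException("invalid values for server " + i);
            }
            totalUsed += usedUnits[i];
            totalCapacity += capacityUnits[i];
        }
        if (totalUsed == 0) {
            return 0.0; // an empty cluster is trivially balanced
        }
        // Fill ratio every server would have if data followed capacity exactly.
        double targetFill = totalUsed / totalCapacity;
        double deviation = 0;
        for (int i = 0; i < usedUnits.length; i++) {
            // Distance between what the server holds and its proportional share.
            deviation += Math.abs(usedUnits[i] - targetFill * capacityUnits[i]);
        }
        // Surplus and deficit each sum to at most totalUsed, so this stays in [0, 1].
        return deviation / (2.0 * totalUsed);
    }

    public static void main(String[] args) {
        // Two legacy nodes with 5 units each and one new node with 8 units.
        long[] capacity = {5, 5, 8};
        System.out.println("even spread:         " + spaceCost(new long[] {3, 3, 3}, capacity));
        System.out.println("proportional spread: " + spaceCost(new long[] {5, 5, 8}, new long[] {10, 10, 16}));
    }
}
```

The sketch measures each server against its proportional share rather than against an equal share, so an even spread across 5/5/8 nodes gets a nonzero cost while a spread that follows capacity gets zero.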