Re: Rebalancing after adding a new node

Vladimir Rodionov Thu, 03 Sep 2015 11:18:07 -0700

HBase does that automatically for you. The HBase balancer will
redistribute the regions, and data locality will be restored after the
next major compaction. But ... the HBase balancer works at a global level
(across all tables) and cannot rebalance just one table. Besides this,
there is a separate beast, the HDFS balancer, which makes its own
decisions and does not care much about HBase data. For this reason it is
recommended to disable the HDFS balancer in an HBase cluster.
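
To make the two steps above concrete, here is a rough toy model in Python.
It is only a sketch under simplifying assumptions, not HBase's actual
balancer: every region counts the same, "locality" is simply whether a
region's files sit on the node that serves it, and all names (balance,
locality, major_compact, node1..node4) are made up for illustration.

```python
def balance(assignment, servers):
    """Toy balancer: move regions from the fullest server to the emptiest
    one until region counts differ by at most one. Only the assignment
    changes; the underlying files stay where they were written."""
    by_server = {s: [] for s in servers}
    for region, server in assignment.items():
        by_server[server].append(region)
    while True:
        fullest = max(servers, key=lambda s: len(by_server[s]))
        emptiest = min(servers, key=lambda s: len(by_server[s]))
        if len(by_server[fullest]) - len(by_server[emptiest]) <= 1:
            break
        region = by_server[fullest].pop()
        by_server[emptiest].append(region)
        assignment[region] = emptiest
    return assignment


def locality(assignment, data_location):
    """Fraction of regions whose files live on the server hosting them."""
    local = sum(1 for r, s in assignment.items() if data_location[r] == s)
    return local / len(assignment)


def major_compact(assignment, data_location):
    """Toy major compaction: each region's files are rewritten by the
    server that currently hosts the region, so they become local again."""
    for region, server in assignment.items():
        data_location[region] = server


# Twelve regions spread over three nodes, all data local to start with.
servers = ["node1", "node2", "node3"]
assignment = {f"region{i}": servers[i % 3] for i in range(12)}
data_location = dict(assignment)

# Add a fourth node and let the toy balancer even things out.
servers.append("node4")
balance(assignment, servers)
after_balance = locality(assignment, data_location)

# Locality comes back only once the moved regions are compacted.
major_compact(assignment, data_location)
after_compaction = locality(assignment, data_location)

print("locality after balancing: ", after_balance)
print("locality after compaction:", after_compaction)
```

In this model the new node picks up regions right away, but the regions
it received read their data remotely until the compaction step runs.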


-Vlad



On Thu, Sep 3, 2015 at 1:32 AM, James Heather <james.heat...@mendeley.com>
wrote:

> Suppose I create a table with a billion rows, on a cluster with N nodes.
> Then I want to increase performance, so I add a new node to the cluster.
> Obviously the data is still stored on the first N nodes, and not on the new
> one. Is there a way of redistributing the data (online) to take advantage
> of the new node?
>
> I realise the answer might depend on the configuration of the table. If
> there are schemas that fit this notion well, and schemas that don't, I'd be
> interested to know about that too.
>
> (This will be running on CDH5, if that makes a difference.)
>
> James
>
