Dear all,
I'm sorry to disturb you.
Our cluster has 200 nodes now. In order to increase its capacity, we hope
to add 60 nodes to the current cluster. However, we don't know what
will happen if we add so many nodes at the same time. Could you give me some
tips and notes?
If you add these nodes, data will be put on them as you add data to the
cluster.
Soon after adding the nodes, you should rebalance the storage to avoid
age-related surprises in how files are arranged in your cluster.
Other than that, your addition should cause little in the way of surprises.
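For context on what "rebalance" means here: as I understand the documented semantics, the balancer's threshold argument (`-t` in the command discussed later in this thread) says a datanode is a rebalancing candidate when its disk utilization deviates from the cluster-wide average by more than that many percentage points. A rough sketch of that test, with invented byte figures:

```shell
# Sketch of the balancer's threshold check (my reading of the documented
# semantics; the byte figures below are made up for illustration).
needs_balancing() {  # args: used_bytes capacity_bytes avg_util_pct threshold_pct
  util=$(( 100 * $1 / $2 ))
  dev=$(( util - $3 ))
  [ "$dev" -lt 0 ] && dev=$(( -dev ))
  [ "$dev" -gt "$4" ]
}

# Cluster averages 60% full with a 10-point threshold:
needs_balancing 850 1000 60 10 && echo "node at 85%: candidate for rebalancing"
needs_balancing 650 1000 60 10 || echo "node at 65%: within threshold"
```

A lower threshold gives a more evenly balanced cluster but makes the balancer move more blocks, so it runs longer.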
On
Also, if you haven't yet configured rack awareness, now's a good time to
start :)
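For anyone following along: in this era of Hadoop, rack awareness is enabled by pointing the `topology.script.file.name` property (in hadoop-site.xml / core-site.xml) at an executable script. Hadoop invokes the script with one or more datanode IPs or hostnames and expects one rack path per line of output. A minimal sketch, where the subnets and rack names are invented examples:

```shell
#!/bin/sh
# Hypothetical rack-topology script. Hadoop passes datanode IPs/hostnames
# as arguments and reads one rack path per line from stdout.
resolve_rack() {
  case "$1" in
    10.1.*) echo /dc1/rack1 ;;
    10.2.*) echo /dc1/rack2 ;;
    *)      echo /default-rack ;;   # fallback for unknown hosts
  esac
}

for host in "$@"; do
  resolve_rack "$host"
done
```

Anything the script doesn't recognize should map to a sensible default rack, as above, or block placement can misbehave.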
- Aaron
On Tue, Aug 11, 2009 at 11:27 PM, Ted Dunning ted.dunn...@gmail.com wrote:
If you add these nodes, data will be put on them as you add data to the
cluster. [...]
Thank you for teaching me that.
I'm trying to use the balancer tool (bin/hadoop balancer -t xxx). However, the
data transfer is so slow that it will take a very long time.
Is there a good way to speed it up?
What's more, I have a question. The situation is that we rarely use the
existing data in the
On Thu, Aug 13, 2009 at 8:06 AM, yang song hadoop.ini...@gmail.com wrote:
I'm trying to use the balance tool (bin/hadoop balancer -t xxx). [...]
There is a parameter (dfs.balance.bandwidthPerSec) that limits the
rebalancing bandwidth. The default is rather low (1 MB/s per datanode).
See http://developer.yahoo.com/hadoop/tutorial/module2.html#rebalancing
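For reference, the setting goes in hdfs-site.xml; the 10 MB/s value below is only an example, and I believe in this era of Hadoop the datanodes need a restart before it takes effect:

```xml
<property>
  <name>dfs.balance.bandwidthPerSec</name>
  <!-- bytes per second per datanode; 10485760 = 10 MB/s (example value) -->
  <value>10485760</value>
</property>
```

Raising it speeds up the balancer at the cost of more network load on a live cluster, so pick a value your workload can tolerate.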
On Wed, Aug 12, 2009 at 7:36 PM, yang song hadoop.ini...@gmail.com wrote:
I'm trying to use the balance tool [...]