Re: Datanode disk considerations

Felix Chern Wed, 06 Aug 2014 13:52:33 -0700

Run the “hadoop balancer” command on the namenode. It is used for balancing 
skewed data.
http://hadoop.apache.org/docs/r1.0.4/commands_manual.html#balancer



On Aug 6, 2014, at 1:45 PM, Brian C. Huffman <bhuff...@etinternational.com> 
wrote:

> All,
> 
> We currently have a Hadoop 2.2.0 cluster with the following characteristics:
> - 4 nodes
> - Each node is a datanode
> - Each node has 3 physical disks for data: 2 x 500GB and 1 x 2TB disk.
> - HDFS replication factor of 3
> 
> It appears that our 500GB disks are filling up first (the alternative would 
> be to put 4 times the number of blocks on the 2TB disks per node).  I'm 
> concerned that once the 500GB disks fill, our performance will slow down 
> (fewer spindles being read / written at the same time per node).  Is this 
> correct?  Is there anything we can do to change this behavior?
> 
> Thanks,
> Brian
> 
>
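
The arithmetic behind the concern in the quoted message can be sketched in Python. This is a simplified model, not HDFS code: it assumes a datanode rotates writes evenly across its data disks and skips any disk that is already full, and it counts data in whole gigabytes rather than in blocks. The function name round_robin_fill is hypothetical.

```python
def round_robin_fill(capacities_gb, total_gb):
    """Place total_gb one GB at a time, rotating over disks that still have room.

    Assumption: writes rotate evenly across disks and skip full ones.
    Returns the number of GB used on each disk.
    """
    used = [0] * len(capacities_gb)
    disk = 0
    for _ in range(total_gb):
        skipped = 0
        # Skip disks that are already full.
        while used[disk] >= capacities_gb[disk]:
            disk = (disk + 1) % len(capacities_gb)
            skipped += 1
            if skipped >= len(capacities_gb):
                raise ValueError("all disks are full")
        used[disk] += 1
        disk = (disk + 1) % len(capacities_gb)
    return used


# One node from the quoted message: 2 x 500GB and 1 x 2TB.
disks_gb = [500, 500, 2000]
print(round_robin_fill(disks_gb, 1500))  # [500, 500, 500]
print(round_robin_fill(disks_gb, 2000))  # [500, 500, 1000]
```

Under these assumptions the 500GB disks are full once a node holds 1500GB, while the 2TB disk is only a quarter used. Every write after that point lands on the single remaining disk.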
