Hey Vaibhav,

Two notes beforehand:
1) When asking questions, please include the Hadoop version you are running.
2) Please also send to only one mailing list at a time; it is a common courtesy.

Can you send the list the output of "df -h"? Also, can you share what your namenode web interface reports in the configured capacity, used, non-dfs used, and remaining columns for your node?
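If it helps, here is a rough sketch of how to gather that information on one datanode. It assumes the two data directories named in your message, and that the `hadoop` launcher is on your PATH. The helper name `report_data_dirs` is made up for illustration. The exact fields printed by `dfsadmin -report` vary between Hadoop versions, so treat this as a starting point rather than a recipe.

```shell
#!/bin/sh
# Print disk usage for each DFS data directory passed as an argument.
# report_data_dirs is a hypothetical helper name, not part of Hadoop.
report_data_dirs() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            # df -h shows size, used, and available space for the
            # filesystem that holds this directory.
            df -h "$dir"
        else
            echo "missing: $dir"
        fi
    done
}

# Paths taken from the original message; adjust to your dfs.data.dir value.
report_data_dirs /mnt/hadoop-dfs/data /mnt2/hadoop-dfs/data

# Ask the namenode for its view of capacity and usage per datanode.
# Assumption: the hadoop launcher script is reachable via PATH.
if command -v hadoop >/dev/null 2>&1; then
    hadoop dfsadmin -report
else
    echo "hadoop not found on PATH; try bin/hadoop dfsadmin -report from the install directory"
fi
```

Running this on a datanode where one disk has filled up, and pasting the output into your reply, should show whether the two directories really sit on separate filesystems.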


On Mar 16, 2009, at 7:19 AM, Vaibhav J wrote:


From: Vaibhav J [mailto:vaibh...@rediff.co.in]
Sent: Monday, March 16, 2009 5:46 PM
To: 'nutch-...@lucene.apache.org'; 'nutch-u...@lucene.apache.org'
Subject: Problem : data distribution is non uniform between two different disks on datanode.

We have 27 datanodes and the replication factor is 1 (data size is ~6.75 TB).

We have specified two different disks for the DFS data directory on each
datanode by using the property dfs.data.dir in the hadoop-site.xml file in
the conf directory.

(value of property dfs.data.dir : /mnt/hadoop-dfs/data,

When we set the replication factor to 2, the data distribution is biased
toward the first disk: more data is copied to /mnt/hadoop-dfs/data, and after
some data has been copied the first disk becomes full and shows no available
space, while we still have enough space on the second disk
(/mnt2/hadoop-dfs/data).

So it is difficult to achieve replication factor 2.

Data traffic is reaching the second disk (/mnt2/hadoop-dfs/data) as well, but
it looks like more data is copied to the first disk (/mnt/hadoop-dfs/data).

What should we do to get uniform data distribution between the two disks on
each datanode, so that we can achieve replication factor 2?


Vaibhav J.
