On 2/7/11 2:06 PM, Jonathan Disher wrote:
Currently I have a 48 node cluster using Dell R710's with 12 disks - two
250GB SATA drives in RAID1 for OS, and ten 1TB SATA disks as a JBOD
(mounted on /data/0 through /data/9) and listed separately in
hdfs-site.xml. It works... mostly. The big issue you will encounter is
losing a disk: the DataNode process will crash, and if you comment out
the affected drive, when you replace it you will have 9 disks full to N%
and one empty disk.

If the DataNode is going down after a single disk failure, then you probably haven't set dfs.datanode.failed.volumes.tolerated in hdfs-site.xml. It defaults to 0, which means any volume failure shuts the DataNode down. You can raise that number to allow the DataNode to tolerate dead drives.
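
For illustration, the setting would look something like the fragment below in hdfs-site.xml. The value of 1 is only an assumption for a ten-disk node like the one described; pick whatever number of dead drives you are comfortable running with, keeping it below the number of configured data directories.

```xml
<!-- Sketch only: goes inside the existing <configuration> element of hdfs-site.xml -->
<property>
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <!-- Assumed example value: keep running with up to 1 failed data volume -->
  <value>1</value>
</property>
```

The DataNode reads this at startup, so it needs a restart to pick up the change.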

- Adam
