On Tue, May 10, 2011 at 12:03:09AM -0400, Rita wrote:
> what filesystem are they using and what is the size of each filesystem?

It sounds nuts, but each disk has its own ext3 filesystem. Beyond switching to the deadline I/O scheduler, we haven't done much tuning or tweaking.

A script runs every ten minutes to test all of the data mounts; if necessary, it reconfigures hdfs-site.xml and restarts the datanode. So far, this approach has let us avoid losing space to RAID without correlating the risk of disk failure, as building larger RAID0s would. In the future, we expect to deprecate the script and rely on the datanode process itself to handle missing or failing disks.

-- 
Will Maier - UW High Energy Physics
cel: 608.438.6162
tel: 608.263.9692
web: http://www.hep.wisc.edu/~wcmaier/
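
A minimal sketch of the kind of mount check described above, not the actual script. The function names are hypothetical, and it assumes the data directories are listed in the dfs.data.dir property of hdfs-site.xml. Restarting the datanode is site-specific and left to the caller.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def usable(mount):
    """Return True if a small file can be created, written and removed in mount."""
    # Hypothetical health test: a real check would probably also confirm
    # that the path is an active mount point (e.g. with os.path.ismount).
    try:
        with tempfile.NamedTemporaryFile(dir=mount) as f:
            f.write(b"ok")
            f.flush()
            os.fsync(f.fileno())
        return True
    except OSError:
        return False


def healthy_mounts(mounts):
    """Keep only the mounts that pass the write test, preserving order."""
    return [m for m in mounts if usable(m)]


def update_data_dirs(conf_path, dirs, prop="dfs.data.dir"):
    """Set prop to the comma-separated dirs; return True if the file changed.

    Assumption: prop names the datanode's data directories. Note that
    ElementTree drops XML comments and processing instructions on rewrite.
    """
    tree = ET.parse(conf_path)
    root = tree.getroot()
    new_value = ",".join(dirs)
    for p in root.findall("property"):
        if p.findtext("name") == prop:
            value = p.find("value")
            if value is None:
                value = ET.SubElement(p, "value")
            if value.text == new_value:
                return False  # nothing to do, so no restart is needed
            value.text = new_value
            break
    else:
        # Property not present yet: add it.
        p = ET.SubElement(root, "property")
        ET.SubElement(p, "name").text = prop
        ET.SubElement(p, "value").text = new_value
    tree.write(conf_path)
    return True
```

A periodic job (cron, for example) could call healthy_mounts() on the configured data directories, pass the result to update_data_dirs(), and restart the datanode only when it returns True.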