On Thu, Oct 15, 2009 at 12:32 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>
> >>No need for dedicated SATA drives with
> >>RAID for your OS. Most of that is accessed during boot time so it won't
> >>contend that much with HDFS.
>
> You may want to RAID your OS. If you lose a datanode with a large
> volume of data (say, 8 TB), Hadoop will begin the process of
> re-replicating that data elsewhere, which can use cluster resources.
>
> You MIGHT want to avoid that, or maybe you do not care.
>
> Having 2 disks for the OS is a waste of bays, so we got clever. Take a
> system with 8 drives @ 1 TB. Slice off ~30 GB from two of the disks and
> use Linux software RAID-1 MIRROR for the OS + swap.
>
> Now you don't need separate disks for the OS and you don't run the
> risk of losing that one disk that takes down the entire DataNode.
>

Forgot to mention it, but that is exactly what we do. We considered
net-booting as an option, but we were time-constrained and so didn't look
that deeply into it. I'd be interested in hearing from others who have used
networked boot...
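
For anyone weighing the quoted layout against two dedicated OS disks, here is a rough sketch of the capacity arithmetic. It assumes decimal units (1 TB = 1000 GB), takes the "~30 GB" slice as exactly 30 GB, and ignores filesystem overhead; the function names are made up for illustration and are not part of any Hadoop tooling.

```python
# Hypothetical capacity sketch for an 8-bay DataNode.
# Assumptions (not from the thread): 1 TB = 1000 GB, the OS slice is
# exactly 30 GB, and filesystem/reserved-space overhead is ignored.

def sliced_layout(num_disks=8, disk_gb=1000, os_slice_gb=30, mirrored_disks=2):
    """OS + swap on a RAID-1 slice carved from `mirrored_disks` data disks."""
    # RAID-1 stores the same data on every member, so usable OS space
    # equals the size of a single slice.
    os_usable_gb = os_slice_gb
    # Only the mirrored disks give up a slice; everything else is HDFS space.
    hdfs_raw_gb = num_disks * disk_gb - mirrored_disks * os_slice_gb
    return os_usable_gb, hdfs_raw_gb

def dedicated_layout(num_disks=8, disk_gb=1000, os_disks=2):
    """Two whole bays reserved for the OS; the rest go to HDFS."""
    return (num_disks - os_disks) * disk_gb

if __name__ == "__main__":
    os_gb, hdfs_gb = sliced_layout()
    print("sliced RAID-1: OS usable", os_gb, "GB, HDFS raw", hdfs_gb, "GB")
    print("dedicated OS disks: HDFS raw", dedicated_layout(), "GB")
```

Under those assumptions the sliced layout leaves 7940 GB raw for HDFS versus 6000 GB with two dedicated OS disks, which is the "waste of bays" in numbers.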