What kind of RAID are you doing? It sounds like RAID0, which means that any single disk failure has a 100% chance of taking down the entire box. If you choose just one disk, let's say sda, to host the OS, then with three disks a single disk failure has only a 33% chance of taking down the box, assuming of course that all disks have the same failure probability.
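
To put rough numbers on that, here is a minimal Python sketch. It assumes disks fail independently with the same probability p; the 5% figure and the function names are made up for illustration, not measured values.

```python
def p_lose_os_striped(p: float, n: int) -> float:
    """OS partition striped (RAID0) across n disks.

    The OS is lost if at least one of the n disks fails,
    assuming independent failures with probability p each.
    """
    return 1 - (1 - p) ** n


def p_lose_os_single(p: float) -> float:
    """OS partition on one dedicated disk: lost only if that disk fails."""
    return p


def p_os_hit_given_one_failure(n: int, os_disks: int) -> float:
    """Given that exactly one of n equally likely disks failed,
    the chance that it is one of the disks holding the OS."""
    return os_disks / n


if __name__ == "__main__":
    n = 3        # disks per box, as in the original question
    p = 0.05     # hypothetical per-disk failure probability

    print("striped OS, overall:      ", p_lose_os_striped(p, n))
    print("single-disk OS, overall:  ", p_lose_os_single(p))
    print("striped OS, given 1 fail: ", p_os_hit_given_one_failure(n, os_disks=n))
    print("single OS, given 1 fail:  ", p_os_hit_given_one_failure(n, os_disks=1))
```

The last two lines correspond to the 100% and 33% figures above; the first two show the same comparison without conditioning on a failure having already happened.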
What we do is install the OS on disk1 (sda), then have 4 JBODs, and I put our logs on disk1 as well. log4j is tricky because it can cause issues on disk corruption or I/O error events, but I have seen systems continue to operate even when log4j can't write to disk because the disk is full.

There is almost no non-HDFS data; you can literally wedge it into about 8 GB. The biggest things that are not HDFS data are logs, and those can go into the HDFS partition. They tend to be low volume, but they can add up over time since the default is not to reap them.

On Thu, Sep 30, 2010 at 4:17 PM, Daniel Einspanjer <deinspan...@mozilla.com> wrote:
> Right now, most of our boxes have 3 disks in them. We take a small partition
> on each of those and raid stripe them together to use as the OS partition
> then allocate the rest of the disks as JBOD for HDFS storage.
>
> We are building out a new cluster and I'm wondering if there are any better
> ideas for balancing the need for storage and speed of the HDFS disks with
> having *some place* to put the OS and non-HDFS data.
>
> What are other people doing about that?
>
> -Daniel