We have been using 2U boxes with 12 x 1TB disks. The first disk is used for
OS/scratch/laziness; the other 11 disks are formatted as individual (~900GB)
volumes and mounted separately. We have /data-[a-k] mounted and
configured in our cluster and have not had any issues with unbalanced
loading. Our files vary in size from small to really large, and
Hadoop just seems to figure it out for us.
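
For anyone who wants to convince themselves, here is a rough sketch of why
round-robin block placement (as Colin describes below) evens out across the
volumes even with mixed file sizes. This is not Hadoop's code: the 64MB block
size, the file-size mix, and the simulate() helper are all assumptions made up
for illustration.

```python
import random

BLOCK_MB = 64                                     # assumed block size
VOLUMES = [f"/data-{c}" for c in "abcdefghijk"]   # the 11 data volumes

def simulate(file_sizes_mb, volumes=VOLUMES, block_mb=BLOCK_MB):
    """Split each file into blocks and hand them to volumes in strict rotation."""
    used = {v: 0 for v in volumes}
    nxt = 0
    for size in file_sizes_mb:
        remaining = size
        while remaining > 0:
            chunk = min(block_mb, remaining)      # last block of a file may be short
            used[volumes[nxt]] += chunk
            nxt = (nxt + 1) % len(volumes)
            remaining -= chunk
    return used

random.seed(0)
# Hypothetical mix of small to really large files, sizes in MB.
sizes = [random.choice([1, 10, 500, 20000]) for _ in range(1000)]
used = simulate(sizes)

mean = sum(used.values()) / len(used)
spread = (max(used.values()) - min(used.values())) / mean
print(f"mean per volume: {mean:.0f} MB, max-min spread: {spread:.2%}")
```

Since every volume takes one block per turn, the only imbalance comes from
short final blocks, which stay small next to the total.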

We have had single drive failures. We just bounce the datanode software and
all is happy. When we get around to replacing the failed drive (I am just about
to go do one), we format it, mount it, and then bounce the datanode. That
replacement volume is then not very well balanced, but that is not typically an
issue for us: we add lots of data every day, so it does get filled up. We
have run the rebalancer to address huge disparities in node utilization (like
when we add a new node). That made us feel better more than anything
else.
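
A back-of-the-envelope sketch of how a replacement volume catches up, assuming
new blocks are spread evenly over all volumes. The daily ingest and starting
usage below are hypothetical numbers, not measurements from our cluster, and
fill_fraction() is just an illustrative helper.

```python
def fill_fraction(days, daily_gb, start_gb, capacity_gb=900.0):
    """Fraction of a volume's capacity in use after `days`, capped at 1.0."""
    return min(1.0, (start_gb + days * daily_gb) / capacity_gb)

DAILY_GB = 5.0         # hypothetical new data per volume per day
OLD_START_GB = 400.0   # hypothetical usage on the surviving volumes

for days in (0, 30, 60, 100):
    old = fill_fraction(days, DAILY_GB, OLD_START_GB)
    new = fill_fraction(days, DAILY_GB, 0.0)
    print(f"day {days:3d}: old volume {old:.0%}, replacement {new:.0%}")
```

Under that assumption the replacement trails the older volumes by a fixed
amount until they fill, which matches it not being a practical problem when
data arrives every day.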

Cheers

On Mon, Jan 12, 2009 at 2:00 PM, David Ritch <david.ri...@gmail.com> wrote:

> Thank you!  I'm glad to hear that you have actually tested this.
>
> I believe that a failure of any disk - even with JBOD - will cause dataNode
> to bring the node down.  Presumably, we could bring it right back up, but
> this does sort of diminish the availability argument for JBOD.
>
> Sounds like it's basically a toss-up.  I'm a bit concerned about the
> potential for uneven distribution - both of amount of data, and of transfer
> load - across the spindles.  Unless I hear otherwise, I will probably go
> with RAID-0.
>
> On Mon, Jan 12, 2009 at 12:17 PM, Colin Evans <co...@metaweb.com> wrote:
>
> > Currently, Hadoop does round-robin allocation of blocks and data across
> > multiple JBOD disks.  We did some testing and found that there weren't
> > significant differences between RAID-0 and JBOD.  We went with JBOD because
> > we figured that RAID-0 has a higher failure rate than JBOD -- any disk
> > failure in a 3-disk RAID-0 configuration causes the whole node to go down,
> > but if there is a single disk failure in a JBOD configuration, Hadoop will
> > go on serving from the other disks.
>
