On Thu, Jun 30, 2011 at 5:24 PM, Todd Lipcon <t...@cloudera.com> wrote:
>
> I'd advise you to look at "stock hadoop" again. This used to be true, but
> was fixed a long while back by HDFS-457 and several followup JIRAs.
>
> If MapR does something fancier, I'm sure we'd be interested to hear about it
> so we can compare the approaches.
>
> -Todd
>

MapR tracks disk responsiveness. In other words, a moving histogram of IO-completion times is maintained internally, and if a disk starts getting really slow, it is pre-emptively taken offline so it does not create long tails for running jobs (and the data on the disk is re-replicated using whatever re-replication policy is in place). One of the benefits of managing the disks directly instead of through ext3 / xfs / or other ...

All these stats can be fed into Ganglia (or pushed out centrally via a text file that can be pulled out using NFS) if historical info about disk behavior (and failures) needs to be preserved.

- Srivas.
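
For readers who want something concrete, the moving-histogram idea described above can be sketched roughly as follows in Python. This is only an illustration and not MapR's implementation: the class name `DiskLatencyMonitor`, the bucket boundaries, the window size, and the "slow" thresholds are all assumptions made up for the example.

```python
from bisect import bisect_left
from collections import deque

# Hypothetical bucket upper bounds in milliseconds (assumed, not from MapR).
BUCKET_BOUNDS_MS = [1, 2, 5, 10, 20, 50, 100, 200, 500, 1000]


class DiskLatencyMonitor:
    """Moving histogram of IO-completion times for a single disk."""

    def __init__(self, window=1000, slow_ms=200, slow_fraction=0.2,
                 min_samples=100):
        self.window = window                # number of recent IOs tracked
        self.slow_ms = slow_ms              # should match a bucket bound
        self.slow_fraction = slow_fraction  # share of slow IOs that trips
        self.min_samples = min_samples      # do not judge on too little data
        self.samples = deque()              # bucket index of each recent IO
        # One counter per bucket, plus a final overflow bucket.
        self.counts = [0] * (len(BUCKET_BOUNDS_MS) + 1)
        self.online = True

    def record(self, latency_ms):
        """Add one IO-completion time; return whether the disk stays online."""
        idx = bisect_left(BUCKET_BOUNDS_MS, latency_ms)
        self.samples.append(idx)
        self.counts[idx] += 1
        # Slide the window: forget the oldest IO once the window is full.
        if len(self.samples) > self.window:
            self.counts[self.samples.popleft()] -= 1
        if self.online and self.is_slow():
            # A real system would also trigger re-replication here.
            self.online = False
        return self.online

    def is_slow(self):
        """True if enough recent IOs landed in buckets above slow_ms."""
        total = len(self.samples)
        if total < self.min_samples:
            return False
        first_slow_bucket = bisect_left(BUCKET_BOUNDS_MS, self.slow_ms) + 1
        slow = sum(self.counts[first_slow_bucket:])
        return slow / total >= self.slow_fraction
```

A caller would feed `record()` the completion time of every IO on a disk and stop scheduling work on that disk once it returns `False`. Keeping bucket counts rather than raw latencies means the same counters could be exported as-is to a monitoring system for historical tracking.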