Although this thread is wandering a bit, I strongly disagree that it is
inappropriate to discuss vendor-specific features (or the features of
competing compute platforms) on general@.  The topic has become the factors
that influence hardware purchase choices, and one of those is how the
system deals with disk failure.  Comparing and contrasting with other
platforms is healthy for the Hadoop project!
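
To make the comparison concrete, here is a rough sketch of the kind of
disk-responsiveness tracking Srivas describes in the quoted message. It is
only an illustration, not MapR's implementation: the class name, the
window size, the 200 ms threshold, and the 95th-percentile rule are all
assumptions, and it keeps a sliding window of raw samples rather than a
true histogram.

```python
from collections import deque


class DiskLatencyTracker:
    """Sliding window of recent IO-completion times for one disk.

    Hypothetical sketch: every name and default here is an assumption.
    """

    def __init__(self, window=1000, slow_ms=200.0, quantile=0.95, min_samples=100):
        self.samples = deque(maxlen=window)  # oldest samples fall off automatically
        self.slow_ms = slow_ms               # assumed "really slow" threshold
        self.quantile = quantile             # which tail of the window to watch
        self.min_samples = min_samples       # do not judge a disk on too little data
        self.online = True

    def record(self, latency_ms):
        """Record one IO completion time; return False once the disk is flagged."""
        self.samples.append(latency_ms)
        if self.online and len(self.samples) >= self.min_samples:
            if self.tail_latency() > self.slow_ms:
                # A real system would start re-replicating the disk's data here.
                self.online = False
        return self.online

    def tail_latency(self):
        """Latency at the configured quantile over the current window."""
        ordered = sorted(self.samples)
        index = min(len(ordered) - 1, int(self.quantile * len(ordered)))
        return ordered[index]


tracker = DiskLatencyTracker(window=200, slow_ms=200.0, min_samples=50)
for _ in range(200):
    tracker.record(5.0)      # healthy disk: fast completions
healthy = tracker.online     # still online
for _ in range(50):
    tracker.record(900.0)    # disk starts stalling
flagged = not tracker.online # taken offline once the tail latency crosses the threshold
```

Watching a high quantile rather than the mean is a deliberate choice in
this sketch: it is the slow tail of IO completions that creates long
tails for running jobs, and a mean can hide it.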

On 6/30/11 9:47 PM, "Ian Holsman" <had...@holsman.net> wrote:

>
>On Jul 1, 2011, at 2:08 PM, M. C. Srivas wrote:
>
>> On Thu, Jun 30, 2011 at 5:24 PM, Todd Lipcon <t...@cloudera.com> wrote:
>> 
>>> 
>>> I'd advise you to look at "stock hadoop" again. This used to be true, but
>>> was fixed a long while back by HDFS-457 and several followup JIRAs.
>>> 
>>> If MapR does something fancier, I'm sure we'd be interested to hear about
>>> it so we can compare the approaches.
>>> 
>>> -Todd
>>> 
>>> 
>> MapR tracks disk responsiveness. In other words, a moving histogram of
>> IO-completion times is maintained internally, and if a disk starts getting
>> really slow, it is pre-emptively taken offline so it does not create long
>> tails for running jobs (and the data on the disk is re-replicated using
>> whatever re-replication policy is in place).  One of the benefits of
>> managing the disks directly instead of through ext3 / xfs / or other ...
>> 
>> All these stats can be fed into Ganglia (or pushed out centrally via a text
>> file that can be pulled out using NFS)  if historical info about disk
>> behavior (and failures) needs to be preserved.
>> 
>> - Srivas.
>
>While I am intrigued by how MapR performs internally, I don't think
>this is the forum for it.
>Please keep MapR (and other vendor-specific discussions) on their
>respective support forums.
>
>Thanks!
>
>Ian.
>
