At 12:30 PM 2/22/2007, David Mathog wrote:
Jim Lux wrote:

> >Now we need to know exactly how you defined "failed".
>
> The paper defined failed as "requiring the computer to be pulled"
> whether or not the disk was actually dead.

That was sort of my point: if you're looking for indicators that
lead to "failed disk", there should be a precise definition of
what "failed disk" is.  How am I to know what criteria Google uses
for classifying a machine as nonfunctioning?  If the system is
pulled because the CPU blew up, that's one thing, but if they pulled it
for any disk-related reason, we need to know how bad "bad" was.

True. There's a paragraph or so on how they determined "failed" (e.g., they didn't include drives removed from service because of scheduled replacement).



> I would make the case that it's not worth it to even glance at the
> outside of the case of a dead unit, much less do failure analysis on
> the power supply.  FA is expensive, new computers are not.  Pitch the
> dead (or "not quite dead yet, but suspect") computer, slap in a new
> one and go on.

Well, they cared enough to do the study!

Or, more realistically, the money spent on the study to identify a possible connection was small enough that it's probably down in the overall budgetary noise floor.


I think the heart of the problem is that disk failures are a bit like
airplane crashes: everything looks great until something snaps and then
the plane goes down shortly thereafter.

I think one of the values of the study was that it actually did demonstrate just that: you really can't do a very good job of predicting failures in advance, so you'd better have a system in place to deal with the inevitable failures while the disks are in service.

And, of course, they have some "real numbers" on failure rates, which are useful in and of themselves, regardless of whether the failures could be predicted.


James Lux, P.E.
Spacecraft Radio Frequency Subsystems Group
Flight Communications Systems Section
Jet Propulsion Laboratory, Mail Stop 161-213
4800 Oak Grove Drive
Pasadena CA 91109
tel: (818)354-2075
fax: (818)393-6875

_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf