Darren J Moffat writes:
> > Rather than being an additional stress-test (which is what the DEBUG
> > behavior in Solaris is), it sounds like the intent is to work around
> > bug-ridden drivers.  If so, that's verging far too close for my taste
> > to continuing operation after internal consistency has been lost and
> > risking data corruption.
> 
> In which case isn't that what we have FMA for ?

No.  If a memory page fails, it's possible to determine what was using
that page and kill/disable/threaten whatever it was that has been
affected.  If a kernel-resident hunk of code fails a consistency
check, there's no way to predict what might have been caught up in the
mess.  A stray write to kernel memory due to a software design defect
is very bad news.

I agree that if the hardware itself is sick, then FMA events are the
right solution.  I disagree that FMA ought to (or even in principle
_could_) deal sensibly with software design or coding flaws,
particularly those in the kernel, where all scoping is lost.

-- 
James Carlson, KISS Network                    <[EMAIL PROTECTED]>
Sun Microsystems / 1 Network Drive         71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677
_______________________________________________
opensolaris-discuss mailing list
opensolaris-discuss@opensolaris.org
