Darren J Moffat writes:
> > Rather than being an additional stress-test (which is what the DEBUG
> > behavior in Solaris is), it sounds like the intent is to work around
> > bug-ridden drivers. If so, that's verging far too close for my taste
> > to continuing operation after internal consistency has been lost and
> > risking data corruption.
>
> In which case isn't that what we have FMA for ?

No. If a memory page fails, it's possible to determine what was using
that page and kill/disable/threaten whatever it was that has been
affected.

If a kernel-resident hunk of code fails a consistency check, there's
no way to predict what might have been caught up in the mess. A stray
write to kernel memory due to a software design defect is very bad
news.

I agree that if the hardware itself is sick, then FMA events are the
right solution. I disagree that FMA ought to (or even in principle
_could_) deal sensibly with software design or coding flaws,
particularly those in the kernel, where all scoping is lost.

-- 
James Carlson, KISS Network                    <[EMAIL PROTECTED]>
Sun Microsystems / 1 Network Drive         71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677