On Tue, Feb 14, 2012 at 11:57 AM, Julian Elischer <jul...@freebsd.org> wrote: > but I'm interested in any answers people may have
The way other OSes handle this is by detecting any abnormal amounts of faults (sometimes it's not the fault of the hardware - eg. when a partical from the outerspace hits a core and flips the bit), then the disable the core(s). Solaris & mainframe (z/OS) handle it this way, but you should google and find more info since I don't remember all the details. Also, see this presentation: "Getting to know the Solaris Fault Management Architecture (FMA)": http://www.prefetch.net/presentations/SolarisFaultManagement_Presentation.pdf Rayson ================================= Open Grid Scheduler / Grid Engine http://gridscheduler.sourceforge.net/ Scalable Grid Engine Support Program http://www.scalablelogic.com/ > > >> _______________________________________________ >> freebsd-hackers@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers >> To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org" >> > > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org" -- Rayson ================================================== Open Grid Scheduler - The Official Open Source Grid Engine http://gridscheduler.sourceforge.net/ _______________________________________________ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"