2011/8/11 Jeremy Chadwick <free...@jdc.parodius.com>:
> On Thu, Aug 11, 2011 at 09:59:36AM +0100, Steven Hartland wrote:
>> That's not the issue as its happening across board over 130 machines :(
>
> Agreed, bad hardware sounds unlikely here. I could believe some strange
> incompatibility (e.g. BIOS quirk or the like[1]) that might cause problems
> en masse across many servers, but hardware issues are unlikely in this
> situation.
>
> [1]: I mention this because we had something similar happen at my
> workplace. For months we used a specific model of system from our
> vendor which worked reliably, zero issues. Then we got a new shipment
> of boxes (same model as prior) which started acting very odd (often AHCI
> timeout issues or MCEs which when decoded would usually turn out to be
> nonsensical). It took weeks to determine the cause given how slow the
> vendor was to respond: root cause turned out to be that the vendor
> decided, on a whim, to start shipping a newer BIOS version which wasn't
> "as compatible" with Solaris as previous BIOSes. Downgrading all the
> systems to the older BIOS fixed the problem.
That falls in the "hw problem" category for me.

Anyway, we really need much more information before we can take any
action. Would it be possible to get access to one of the panicking
machines? Is it always the same panic, or does it vary (e.g. a page
fault one time, a fatal double fault another, a fatal trap another)?
Whatever information you can provide may be valuable here.

Thanks,
Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"