Re: Problems with crashing IBM X3630 M3/ZFS

2012-07-11 Thread Ivan Voras
On 06/07/2012 20:56, Bob Healey wrote:
> Hello.  I've got a quartet of IBM x3630 M3 with one that is frequently
> hard locking under heavy NFS load.  I am running 9.0-RELEASE with all
> the patches from freebsd-update.
> 
> My problem machine has eight 16-core clients, each doing IO-intensive
> tasks, connected to it via a Procurve and the onboard igb0 interface.
> The load is mostly network reads, typically 10 MB read per MB written.
> When the machine locks under load, none of the consoles respond, nor can
> I reach the machine via Ethernet.  I can break into DDB via the
> serial-over-LAN interface, and am running a debug/witness kernel at the
> moment (I was running GENERIC previously).  During the boot sequence,
> witness tosses me into DDB ~10 times before I get a login prompt. Before
> this machine started acting up, it had multiple 802.1q VLANs and ran 9K
> packets on its private network to the compute clients.
> 
> A dmesg can be found at http://boyle.che.rpi.edu/~healer/boomer/dmesg
> /etc/rc.conf can be found at
> http://boyle.che.rpi.edu/~healer/boomer/rc.conf
> A listing of installed ports can be found at
> http://boyle.che.rpi.edu/~healer/boomer/pkg_info
> The output of ps auxwwo wchan against my two crash dumps can be found at
> http://boyle.che.rpi.edu/~healer/boomer/crash1-psaux-wchan and
> http://boyle.che.rpi.edu/~healer/boomer/crash2-psaux-wchan
> 
> I'm not entirely convinced this is software, but I've run out of local
> experts to ask, and can't prove it's hardware.

Hi,

I recently tested an IBM machine similar to yours (I don't know if it
was exactly the same model, but it was probably an M3), and observed a
number of lockups which seemed to be related to the RAID card (IBM's
ServeRAID, a re-branded LSI). I don't know if this has anything to do
with your problems, but IIRC in my case there were some kernel messages
on the console relating to the driver and/or PCI bus errors on the slot
with the RAID controller prior to the lockups. Maybe you can check for
these.
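
A minimal sketch of such a check, assuming you have the console or
dmesg output saved as text. The keyword list is a guess, since the
exact messages are not quoted in this thread: mfi, mpt and mps are
FreeBSD drivers for LSI-based controllers, and the remaining patterns
are generic. The function name suspicious_lines is hypothetical.

```python
import re

# Assumed keywords only: adjust to whatever driver the controller attaches as.
SUSPECT = re.compile(
    r"\b(mfi|mpt|mps)\d+:|pci.*error|timeout|parity",
    re.IGNORECASE,
)


def suspicious_lines(log_text):
    """Return (line_number, line) pairs that match the assumed keywords."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), 1)
        if SUSPECT.search(line)
    ]
```

It could be run against the boot log, for example
suspicious_lines(open("/var/run/dmesg.boot").read()), or against
/var/log/messages from around the time of a lockup.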

I have other bad experiences with IBM's hardware and have given up on
them for running FreeBSD.





Problems with crashing IBM X3630 M3/ZFS

2012-07-06 Thread Bob Healey
Hello.  I've got a quartet of IBM x3630 M3 with one that is frequently 
hard locking under heavy NFS load.  I am running 9.0-RELEASE with all 
the patches from freebsd-update.


My problem machine has eight 16-core clients, each doing IO-intensive
tasks, connected to it via a Procurve and the onboard igb0 interface.
The load is mostly network reads, typically 10 MB read per MB written.
When the machine locks under load, none of the consoles respond, nor can
I reach the machine via Ethernet.  I can break into DDB via the
serial-over-LAN interface, and am running a debug/witness kernel at the
moment (I was running GENERIC previously).  During the boot sequence,
witness tosses me into DDB ~10 times before I get a login prompt. Before
this machine started acting up, it had multiple 802.1q VLANs and ran 9K
packets on its private network to the compute clients.


A dmesg can be found at http://boyle.che.rpi.edu/~healer/boomer/dmesg
/etc/rc.conf can be found at http://boyle.che.rpi.edu/~healer/boomer/rc.conf
A listing of installed ports can be found at 
http://boyle.che.rpi.edu/~healer/boomer/pkg_info
The output of ps auxwwo wchan against my two crash dumps can be found at
http://boyle.che.rpi.edu/~healer/boomer/crash1-psaux-wchan and 
http://boyle.che.rpi.edu/~healer/boomer/crash2-psaux-wchan


I'm not entirely convinced this is software, but I've run out of local
experts to ask, and can't prove it's hardware.


--
Bob Healey
Systems Administrator
Biocomputation and Bioinformatics Constellation
and Molecularium
hea...@rpi.edu
(518) 276-4407

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"