I have a serial console set up with logging via conserver, but so far there has been no further crash. We also swapped some hardware around: another 4-node cluster of DL360g5s had been running without a crash for several weeks, and we swapped those 4 nodes in for the first 4 in the 6-node cluster.

> -----Original Message-----
> From: Sunil Mushran [mailto:[EMAIL PROTECTED]
> Sent: Monday, July 30, 2007 10:21
> To: Ulf Zimmermann
> Cc: ocfs2-users@oss.oracle.com
> Subject: Re: [Ocfs2-users] 6 node cluster with unexplained reboots
>
> Do you have a netconsole setup? If not, set it up. That will capture the
> real reason for the reset. Well, it typically does.
>
> Ulf Zimmermann wrote:
> > We just installed a new cluster with 6 HP DL380g5, dual single port
> > Qlogic 24xx HBAs connected via two HP 4/16 Storageworks switches to a
> > 3Par S400. We are using the 3Par recommended config for the Qlogic
> > driver and device-mapper-multipath, giving us 4 paths to the SAN. We do
> > see some SCSI errors where DM-MP fails a path after getting a 0x2000
> > error from the SAN controller, but the path is put back in service in
> > less than 10 seconds.
> >
> > This needs to be fixed, but I don't think it is what is causing our
> > reboots. 2 of the nodes rebooted once while being idle (ocfs2 and
> > clusterware were running, no db), and one node rebooted while idle
> > (another node was copying our 9i db using fscat from ocfs1 to the ocfs2
> > data volume) and once while some load was put on it via the upgraded
> > 10g database. In all cases it is as if someone pressed a hardware reset
> > button. No kernel panic (at least not one leading to a stop with a
> > visible message); we can get a dirty write cache for the internal cciss
> > controller.
> >
> > The only messages we get on the nodes are when the crashed node is
> > already in reset and it missed its ocfs2 heartbeat (set to the default
> > of 7), followed later by crs moving the vip.
> >
> > Any hints on troubleshooting this would be appreciated.
> >
> > Regards, Ulf.
_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users