2009/8/10 Alessandro FAGLIA <[email protected]>:
> Hi list.
>
> For about a week now, the mpt-statusd daemon running on a PE840 (equipped
> with a SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X
> Fusion-MPT SAS (rev 01), OS Debian Lenny) has been sending me many mails.
>
> $ sudo dpkg -l | grep mpt-status
> ii  mpt-status  1.2.0-4.2
>
> $ sudo omreport storage controller
> Controller SAS 5/iR Adapter (Slot 1)
>
> Controllers
> ID                                : 0
> Status                            : Ok
> Name                              : SAS 5/iR Adapter
> Slot ID                           : PCI Slot 1
> State                             : Ready
> Firmware Version                  : 00.10.51.00.06.12.05.00
> Minimum Required Firmware Version : Not Applicable
> Driver Version                    : 3.04.07
> Minimum Required Driver Version   : Not Applicable
> Number of Connectors              : 1
> Rebuild Rate                      : Not Applicable
> BGI Rate                          : Not Applicable
> Check Consistency Rate            : Not Applicable
> Reconstruct Rate                  : Not Applicable
> Alarm State                       : Not Applicable
> Cluster Mode                      : Not Applicable
> SCSI Initiator ID                 : Not Applicable
> Cache Memory Size                 : Not Applicable
> Patrol Read Mode                  : Not Applicable
> Patrol Read State                 : Not Applicable
> Patrol Read Rate                  : Not Applicable
> Patrol Read Iterations            : Not Applicable
>
> First it sends me a mail with this body (the subject is "info: mpt raid
> status change on mybox"):
>
> This is a RAID status update from mpt-statusd. The mpt-status
> program reports that one of the RAIDs changed state:
>
> Report from /etc/init.d/mpt-statusd on mybox
>
> Then, after about 10 minutes, it sends me another mail with the same
> subject and this body:
>
> This is a RAID status update from mpt-statusd. The mpt-status
> program reports that one of the RAIDs changed state:
>
> ioc0 vol_id 0 type IM, 2 phy, 148 GB, state OPTIMAL, flags ENABLED
> ioc0 phy 1 scsi_id 32 ATA WDC WD1601ABYS-1 6H05, 149 GB, state ONLINE, flags NONE
> ioc0 phy 0 scsi_id 1 ATA WDC WD1601ABYS-1 6H05, 149 GB, state ONLINE, flags NONE
>
> Report from /etc/init.d/mpt-statusd on mybox
>
> In /var/log/messages I read the following:
>
> mpt-statusd: detected non-optimal RAID status
>
> and also:
>
> kernel: mptbase: ioc0: WARNING - IOC is in FAULT state (1600h)!!!
> kernel: mptbase: ioc0: WARNING - Issuing HardReset from mpt_fault_reset_work!!
> kernel: mptbase: ioc0: Initiating recovery
> kernel: mptbase: ioc0: WARNING - IOC is in FAULT state!!!
> kernel: mptbase: ioc0: WARNING - FAULT code = 1600h
> kernel: sd 0:1:0:0: mptscsih: ioc0: completing cmds: fw_channel 0, fw_id 0, sc=f6e68780, mf = f7bc2e80, idx=d
> kernel: sd 0:1:0:0: mptscsih: ioc0: completing cmds: fw_channel 0, fw_id 0, sc=f6e686c0, mf = f7bc3080, idx=11
> (...)
> kernel: sd 0:1:0:0: mptscsih: ioc0: completing cmds: fw_channel 0, fw_id 0, sc=f0465200, mf = f7bcb480, idx=119
> kernel: mptbase: ioc0: Recovered from IOC FAULT
> kernel: mptbase: ioc0: WARNING - mpt_fault_reset_work: HardReset: success
> kernel: mptbase: ioc0: Initiating recovery
>
> OMSA tells me nothing about these events, and the server apparently
> runs fine, so I'm asking whether I should be seriously worried. Any
> similar experience on the list?
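For anyone wanting to check the volume state by hand rather than wait for mpt-statusd's mails, here is a minimal sketch. The `raid_state` helper is my own illustration, not part of mpt-status; the sample line is the volume line quoted above. (mpt-status itself needs the mptctl kernel module loaded.)

```shell
#!/bin/sh
# raid_state: print the volume state word (OPTIMAL, DEGRADED, ...) from
# mpt-status output fed on stdin. Helper name is illustrative only.
raid_state() {
    sed -n 's/.*state \([A-Z]*\),.*/\1/p'
}

# Live use (after `modprobe mptctl`):
#   mpt-status -i 0 | raid_state
# Here, the volume line quoted in the original mail:
echo "ioc0 vol_id 0 type IM, 2 phy, 148 GB, state OPTIMAL, flags ENABLED" \
    | raid_state
# prints: OPTIMAL
```

A cron job could compare that word against "OPTIMAL" and alert only on a real change, instead of the every-10-minutes mails mpt-statusd sends while the IOC is recovering.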
We had similar dmesg errors on a 1435 running a SAS 5/iR. After replacing
the RAID card and the cables, it turned out to be a faulty HDD: all the
tests came back fine except that there were SMART errors on one of the
drives. So I suggest you check both discs with smartmontools.

--
Lazy

_______________________________________________
Linux-PowerEdge mailing list
[email protected]
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
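To act on the smartmontools suggestion, a minimal sketch, assuming the package is installed and the two array members appear as /dev/sda and /dev/sdb (device names are examples; adjust to your system). The `health_verdict` helper is my own illustration, not part of smartctl:

```shell
#!/bin/sh
# health_verdict: pull the PASSED/FAILED verdict out of `smartctl -H`
# output fed on stdin. Helper name is illustrative only.
health_verdict() {
    sed -n 's/.*overall-health self-assessment test result: //p'
}

# Live use, per drive:
#   smartctl -H /dev/sda | health_verdict      # quick verdict
#   smartctl -A /dev/sda                       # full attribute table
#   smartctl -t long /dev/sda                  # start an extended self-test
#
# Sample line in the format smartctl prints for ATA drives:
echo "SMART overall-health self-assessment test result: PASSED" \
    | health_verdict
# prints: PASSED
```

Even when the verdict is PASSED, it is worth eyeballing the `-A` attribute table for non-zero Reallocated_Sector_Ct or Current_Pending_Sector, since a drive can pass the overall check while already remapping sectors.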
