I put the drive that's missing a path in its own pool and did some reading and writing (filled the drive with zeros using 'dd', then read them back off). Other than a handful of errors in iostat and /var/adm/messages (like the ones I reported before), everything appeared to work fine:

# iostat -En c1t5000039478CA7150d0
c1t5000039478CA7150d0 Soft Errors: 0 Hard Errors: 2 Transport Errors: 29
Vendor: TOSHIBA Product: MG03SCA300 Revision: 0108 Serial No: Z2H0A008FTP3
Size: 3000.59GB <3000592982016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 2 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

So the port on the backplane appears (at least partially) functional. Where do you think I should go from here?

Thanks again,
Kevin

On 10/30/2013 12:13 PM, Kevin Swab wrote:
> The problem drive is currently configured as a hot-spare (it replaced
> the old hot-spare, which kicked in when the original drive failed), but
> I'll remove it from the pool and do some testing with it and report back...
>
> Thanks!
> Kevin
>
> On 10/30/2013 12:02 PM, Johan Kragsterman wrote:
>> Hi, Kevin!
>>
>> What if you replace the drive with one of the hotspares? I mean, let the
>> hotspare stay in its place, and configure it to replace the
>> problematic drive. Then you will find out whether the backplane has a bad
>> port or not. Always start by trying to narrow it down.
>>
>> Rgrds Johan
>>
>>
>>
>> -----"OmniOS-discuss" <omnios-discuss-boun...@lists.omniti.com> wrote: -----
>> To: omnios-discuss@lists.omniti.com
>> From: Kevin Swab
>> Sent by: "OmniOS-discuss"
>> Date: 2013.10.30 18:38
>> Subject: [OmniOS-discuss] multipath problem when replacing a failed SAS drive
>>
>> Hello,
>>
>> I'm running OmniOS r151006p on the following system:
>>
>> - Supermicro X8DT6 board, Xeon E5606 CPU, 48GB RAM
>> - Supermicro SC847 chassis, 36 drive bays, SAS expanders, LSI 9211-8i
>> controller
>> - 34 x Toshiba 3T SAS drives MG03SCA300 in one pool w/ 16 mirrored sets
>> + 2 hot spares
>>
>> 'mpathadm list lu' showed all drives as having two paths to the controller.
>>
>> Yesterday, one of the drives failed and was replaced. The new drive is
>> only showing one path in mpathadm, and errors have started showing up
>> periodically in /var/adm/messages:
>>
>>
>>
>> # mpathadm list lu /dev/rdsk/c1t5000039478CA7150d0
>> mpath-support: libmpscsi_vhci.so
>> /dev/rdsk/c1t5000039478CA7150d0s2
>> Total Path Count: 1
>> Operational Path Count: 1
>>
>> Oct 30 09:30:22 hagler scsi: [ID 243001 kern.warning] WARNING:
>> /pci@0,0/pci8086,3410@9/pci1000,3020@0 (mpt_sas0):
>> Oct 30 09:30:22 hagler mptsas_handle_event_sync: IOCStatus=0x8000,
>> IOCLogInfo=0x31120101
>> Oct 30 09:30:22 hagler scsi: [ID 243001 kern.warning] WARNING:
>> /pci@0,0/pci8086,3410@9/pci1000,3020@0 (mpt_sas0):
>> Oct 30 09:30:22 hagler mptsas_handle_event: IOCStatus=0x8000,
>> IOCLogInfo=0x31120101
>> Oct 30 09:30:22 hagler scsi: [ID 365881 kern.info]
>> /pci@0,0/pci8086,3410@9/pci1000,3020@0 (mpt_sas0):
>> Oct 30 09:30:22 hagler Log info 0x31120101 received for target 89.
>> Oct 30 09:30:22 hagler scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
>> Oct 30 09:30:22 hagler scsi: [ID 243001 kern.warning] WARNING:
>> /pci@0,0/pci8086,3410@9/pci1000,3020@0 (mpt_sas0):
>> Oct 30 09:30:22 hagler mptsas_handle_event_sync: IOCStatus=0x8000,
>> IOCLogInfo=0x31120101
>> Oct 30 09:30:22 hagler scsi: [ID 243001 kern.warning] WARNING:
>> /pci@0,0/pci8086,3410@9/pci1000,3020@0 (mpt_sas0):
>> Oct 30 09:30:22 hagler mptsas_handle_event: IOCStatus=0x8000,
>> IOCLogInfo=0x31120101
>> Oct 30 09:30:22 hagler scsi: [ID 365881 kern.info]
>> /pci@0,0/pci8086,3410@9/pci1000,3020@0 (mpt_sas0):
>> Oct 30 09:30:22 hagler Log info 0x31120101 received for target 89.
>> Oct 30 09:30:22 hagler scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
>> Oct 30 09:30:22 hagler scsi: [ID 243001 kern.warning] WARNING:
>> /pci@0,0/pci8086,3410@9/pci1000,3020@0 (mpt_sas0):
>> Oct 30 09:30:22 hagler mptsas_handle_event_sync: IOCStatus=0x8000,
>> IOCLogInfo=0x31120101
>> Oct 30 09:30:22 hagler scsi: [ID 365881 kern.info]
>> /pci@0,0/pci8086,3410@9/pci1000,3020@0 (mpt_sas0):
>> Oct 30 09:30:22 hagler Log info 0x31120101 received for target 89.
>> Oct 30 09:30:22 hagler scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
>> Oct 30 09:30:22 hagler scsi: [ID 243001 kern.warning] WARNING:
>> /pci@0,0/pci8086,3410@9/pci1000,3020@0 (mpt_sas0):
>> Oct 30 09:30:22 hagler mptsas_handle_event: IOCStatus=0x8000,
>> IOCLogInfo=0x31120101
>> Oct 30 09:30:22 hagler scsi: [ID 243001 kern.warning] WARNING:
>> /pci@0,0/pci8086,3410@9/pci1000,3020@0 (mpt_sas0):
>> Oct 30 09:30:22 hagler mptsas_handle_event_sync: IOCStatus=0x8000,
>> IOCLogInfo=0x31120101
>> Oct 30 09:30:22 hagler scsi: [ID 365881 kern.info]
>> /pci@0,0/pci8086,3410@9/pci1000,3020@0 (mpt_sas0):
>> Oct 30 09:30:22 hagler Log info 0x31120101 received for target 89.
>> Oct 30 09:30:22 hagler scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
>> Oct 30 09:30:22 hagler scsi: [ID 243001 kern.warning] WARNING:
>> /pci@0,0/pci8086,3410@9/pci1000,3020@0 (mpt_sas0):
>> Oct 30 09:30:22 hagler mptsas_handle_event: IOCStatus=0x8000,
>> IOCLogInfo=0x31120101
>>
>>
>>
>> The error messages refer to target 89, which I can confirm corresponds
>> to the missing path for my replacement drive using "lsiutil":
>>
>>
>>
>> # lsiutil -p 1 16
>>
>> LSI Logic MPT Configuration Utility, Version 1.63, June 4, 2009
>>
>> 1 MPT Port found
>>
>>      Port Name         Chip Vendor/Type/Rev    MPT Rev  Firmware Rev  IOC
>>  1.  mpt_sas0          LSI Logic SAS2008 03      200      0d000100     0
>>
>> SAS2008's links are 6.0 G, 6.0 G, 6.0 G, 6.0 G, 6.0 G, 6.0 G, 6.0 G, 6.0 G
>>
>>  B___T     SASAddress     PhyNum  Handle  Parent  Type
>> [ ... cut ... ]
>>  0  89  5000039478ca7152    17     0059    0032   SAS Target
>>  0  90  5000039478ca7153    17     005a    000a   SAS Target
>> [ ... cut ... ]
>>
>>
>>
>> When I ask "lsiutil" to rescan the bus, I see the following error when
>> it gets to target 89:
>>
>>
>>
>> # lsiutil -p 1 8
>>
>> LSI Logic MPT Configuration Utility, Version 1.63, June 4, 2009
>>
>> 1 MPT Port found
>>
>>      Port Name         Chip Vendor/Type/Rev    MPT Rev  Firmware Rev  IOC
>>  1.  mpt_sas0          LSI Logic SAS2008 03      200      0d000100     0
>>
>> SAS2008's links are 6.0 G, 6.0 G, 6.0 G, 6.0 G, 6.0 G, 6.0 G, 6.0 G, 6.0 G
>>
>>  B___T___L  Type       Vendor   Product          Rev
>> [ ... cut ... ]
>> ScsiIo to Bus 0 Target 89 failed, IOCStatus = 004b (IOC Terminated)
>>  0  90   0  Disk       TOSHIBA  MG03SCA300       0108  5000039478ca7153
>> 17
>> [ ... cut ... ]
>>
>>
>>
>> This problem has happened to me once before on a similar system. At
>> that time, I tried reseating the drive, and tried several different
>> replacement drives; all had the same issue. I even tried rebooting the
>> system and that didn't help.
>>
>> Does anyone know how I can clear this issue up? I'd be happy to provide
>> any additional information that might be helpful.
>>
>> TIA,
>> Kevin
>>
>>
>>
>> --
>> -------------------------------------------------------------------
>> Kevin Swab                          UNIX Systems Administrator
>> ACNS                                Colorado State University
>> Phone: (970)491-6572                Email: kevin.s...@colostate.edu
>> GPG Fingerprint: 7026 3F66 A970 67BD 6F17 8EB8 8A7D 142F 2392 791C
>> _______________________________________________
>> OmniOS-discuss mailing list
>> OmniOS-discuss@lists.omniti.com
>> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>>
>

--
-------------------------------------------------------------------
Kevin Swab                          UNIX Systems Administrator
ACNS                                Colorado State University
Phone: (970)491-6572                Email: kevin.s...@colostate.edu
GPG Fingerprint: 7026 3F66 A970 67BD 6F17 8EB8 8A7D 142F 2392 791C
_______________________________________________
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss