Re: [zfs-discuss] solaris 10u8 hangs with message Disconnected command timeout for Target 0
On Aug 15, 2011, at 11:17 PM, Ding Honghui wrote:

> My Solaris storage server hangs. I went to the console and there are
> messages [1] displayed on it.
> I can't log in at the console, and it seems I/O is totally blocked.
>
> The system is Solaris 10u8 on a Dell R710 with a Dell MD3000 disk array.
> Two HBA cables connect the server to the MD3000.
> The symptom is random.

This symptom is consistent with a broken SATA disk behind a SAS expander. Unfortunately, the mpt driver is closed source, so we can only infer what it does from the open source mpt_sas driver, which is (hopefully) a derivative.

> It would be much appreciated if anyone can help me out.
>
> Regards,
> Ding
>
> [1]
> Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /pci@0,0/pci8086,3410@9/pci8086,32c@0/pci1028,1f04@8 (mpt1):
> Aug 16 13:14:16 nas-hz-02 Disconnected command timeout for Target 0

A command did not complete, so the mpt driver reset the target. If that target is an expander, then everything behind the expander can reset, aborting any in-flight commands, as follows...

> Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa1802a44b8f0ded (sd47):
> Aug 16 13:14:16 nas-hz-02 Error for Command: write(10) Error Level: Retryable
> Aug 16 13:14:16 nas-hz-02 scsi: Requested Block: 1380679073 Error Block: 1380679073
> Aug 16 13:14:16 nas-hz-02 scsi: Vendor: DELL Serial Number:
> Aug 16 13:14:16 nas-hz-02 scsi: Sense Key: Unit Attention
> Aug 16 13:14:16 nas-hz-02 scsi: ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0
> Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa18029e4b8f0d61 (sd41):
> Aug 16 13:14:16 nas-hz-02 Error for Command: write(10) Error Level: Retryable
> Aug 16 13:14:16 nas-hz-02 scsi: Requested Block: 1380679072 Error Block: 1380679072
> Aug 16 13:14:16 nas-hz-02 scsi: Vendor: DELL Serial Number:
> Aug 16 13:14:16 nas-hz-02 scsi: Sense Key: Unit Attention
> Aug 16 13:14:16 nas-hz-02 scsi: ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0
> Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa1802a24b8f0dc5 (sd45):
> Aug 16 13:14:16 nas-hz-02 Error for Command: write(10) Error Level: Retryable
> Aug 16 13:14:16 nas-hz-02 scsi: Requested Block: 1380679073 Error Block: 1380679073
> Aug 16 13:14:16 nas-hz-02 scsi: Vendor: DELL Serial Number:
> Aug 16 13:14:16 nas-hz-02 scsi: Sense Key: Unit Attention
> Aug 16 13:14:16 nas-hz-02 scsi: ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0
> Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa18029c4b8f0d35 (sd39):
> Aug 16 13:14:16 nas-hz-02 Error for Command: write(10) Error Level: Retryable
> Aug 16 13:14:16 nas-hz-02 scsi: Requested Block: 1380679072 Error Block: 1380679072
> Aug 16 13:14:16 nas-hz-02 scsi: Vendor: DELL Serial Number:
> Aug 16 13:14:16 nas-hz-02 scsi: Sense Key: Unit Attention
> Aug 16 13:14:16 nas-hz-02 scsi: ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0
> Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa1802984b8f0cd2 (sd35):
> Aug 16 13:14:16 nas-hz-02 Error for Command: write(10) Error Level: Retryable
> Aug 16 13:14:16 nas-hz-02 scsi: Requested Block: 1380679072 Error Block: 1380679072
> Aug 16 13:14:16 nas-hz-02 scsi: Vendor: DELL Serial Number:
> Aug 16 13:14:16 nas-hz-02 scsi: Sense Key: Unit Attention
> Aug 16 13:14:16 nas-hz-02 scsi: ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0

You will be happiest if you do not use SATA disks directly connected to SAS expanders.

 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
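The pattern described in the reply above (one mpt timeout, then errors on every disk behind that target) can be checked mechanically in a saved copy of the log. The following is only a rough sketch, not anything the driver or Solaris provides: the function name `resets_after_timeouts` and both regular expressions are made up for this illustration, and they assume the log has one syslog record per line in the same shape as the excerpt in this thread.

```python
import re
from collections import defaultdict

# Assumed line shapes, taken from the excerpt in this thread (hypothetical patterns).
TIMEOUT_RE = re.compile(
    r"(\w{3} +\d+ [\d:]{8}) \S+ +Disconnected command timeout for Target (\d+)"
)
DISK_RE = re.compile(
    r"(\w{3} +\d+ [\d:]{8}) \S+ scsi: WARNING: (\S+) \((sd\d+)\):"
)


def resets_after_timeouts(log_text):
    """Map each (timestamp, target) timeout to the sd instances that
    logged a warning with the same timestamp."""
    disks_by_time = defaultdict(list)
    for timestamp, _path, sd in DISK_RE.findall(log_text):
        disks_by_time[timestamp].append(sd)
    return {
        (timestamp, int(target)): disks_by_time.get(timestamp, [])
        for timestamp, target in TIMEOUT_RE.findall(log_text)
    }


# Shortened sample built from the first lines of the posted log.
SAMPLE = """\
Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /pci@0,0/pci8086,3410@9/pci8086,32c@0/pci1028,1f04@8 (mpt1):
Aug 16 13:14:16 nas-hz-02 Disconnected command timeout for Target 0
Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa1802a44b8f0ded (sd47):
Aug 16 13:14:16 nas-hz-02 Error for Command: write(10) Error Level: Retryable
Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa18029e4b8f0d61 (sd41):
"""

if __name__ == "__main__":
    for (timestamp, target), disks in resets_after_timeouts(SAMPLE).items():
        print(f"{timestamp}: timeout on target {target}, then errors on {disks}")
```

Matching on identical timestamps is a deliberately crude heuristic; it only shows which disks complained in the same second as the timeout and says nothing about which disk (if any) caused it.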
Re: [zfs-discuss] solaris 10u8 hangs with message Disconnected command timeout for Target 0
Ding Honghui wrote:

> Hi,
>
> My Solaris storage server hangs. I went to the console and there are
> messages [1] displayed on it.
> I can't log in at the console, and it seems I/O is totally blocked.
>
> The system is Solaris 10u8 on a Dell R710 with a Dell MD3000 disk array.
> Two HBA cables connect the server to the MD3000.
> The symptom is random.
>
> It would be much appreciated if anyone can help me out.

The SCSI target you are talking to is being reset. "Unit Attention" means the device has forgotten the operating parameters it negotiated with the system; it is a warning that the device might have been changed without the system knowing. Here it is telling you this happened because of a "device internal reset". That sort of thing can happen if the firmware in the SCSI target crashes and restarts, or the power supply blips, or the device was swapped.

I don't know anything about the Dell MD3000, but given that it happened on lots of disks at the same moment following a timeout, it looks like the array power cycled or the array firmware (if any) rebooted. (I'm not sure whether a SCSI bus reset can do this.)
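To make the sense data in the log easier to read, here is a small sketch that pulls the ASC/ASCQ pair out of a log line and looks it up. The table holds only the 0x29 entries from the SCSI additional sense code list as I recall them, and the names `ASC_ASCQ`, `SENSE_RE` and `decode` are invented for this example; it assumes the line format shown in the posted log.

```python
import re

# Excerpt of the SCSI ASC/ASCQ assignments for the 0x29 family only.
ASC_ASCQ = {
    (0x29, 0x00): "power on, reset, or bus device reset occurred",
    (0x29, 0x01): "power on occurred",
    (0x29, 0x02): "SCSI bus reset occurred",
    (0x29, 0x03): "bus device reset function occurred",
    (0x29, 0x04): "device internal reset",
}

# Matches e.g. "ASC: 0x29 (device internal reset), ASCQ: 0x4" (assumed format).
SENSE_RE = re.compile(
    r"ASC:\s*(0x[0-9a-fA-F]+)\s*\([^)]*\),\s*ASCQ:\s*(0x[0-9a-fA-F]+)"
)


def decode(line):
    """Return (asc, ascq, description) for a log line, or None if the
    line carries no ASC/ASCQ pair."""
    match = SENSE_RE.search(line)
    if match is None:
        return None
    asc = int(match.group(1), 16)
    ascq = int(match.group(2), 16)
    return asc, ascq, ASC_ASCQ.get((asc, ascq), "not in this excerpt")


if __name__ == "__main__":
    line = ("Aug 16 13:14:16 nas-hz-02 scsi: ASC: 0x29 "
            "(device internal reset), ASCQ: 0x4, FRU: 0x0")
    print(decode(line))
```

The neighbouring 0x29 codes are listed because they distinguish a power-on or bus reset from an internal reset, which is the distinction the explanation above turns on.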
> [1]
> Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /pci@0,0/pci8086,3410@9/pci8086,32c@0/pci1028,1f04@8 (mpt1):
> Aug 16 13:14:16 nas-hz-02 Disconnected command timeout for Target 0
> Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa1802a44b8f0ded (sd47):
> Aug 16 13:14:16 nas-hz-02 Error for Command: write(10) Error Level: Retryable
> Aug 16 13:14:16 nas-hz-02 scsi: Requested Block: 1380679073 Error Block: 1380679073
> Aug 16 13:14:16 nas-hz-02 scsi: Vendor: DELL Serial Number:
> Aug 16 13:14:16 nas-hz-02 scsi: Sense Key: Unit Attention
> Aug 16 13:14:16 nas-hz-02 scsi: ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0
> Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa18029e4b8f0d61 (sd41):
> Aug 16 13:14:16 nas-hz-02 Error for Command: write(10) Error Level: Retryable
> Aug 16 13:14:16 nas-hz-02 scsi: Requested Block: 1380679072 Error Block: 1380679072
> Aug 16 13:14:16 nas-hz-02 scsi: Vendor: DELL Serial Number:
> Aug 16 13:14:16 nas-hz-02 scsi: Sense Key: Unit Attention
> Aug 16 13:14:16 nas-hz-02 scsi: ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0
> Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa1802a24b8f0dc5 (sd45):
> Aug 16 13:14:16 nas-hz-02 Error for Command: write(10) Error Level: Retryable
> Aug 16 13:14:16 nas-hz-02 scsi: Requested Block: 1380679073 Error Block: 1380679073
> Aug 16 13:14:16 nas-hz-02 scsi: Vendor: DELL Serial Number:
> Aug 16 13:14:16 nas-hz-02 scsi: Sense Key: Unit Attention
> Aug 16 13:14:16 nas-hz-02 scsi: ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0
> Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa18029c4b8f0d35 (sd39):
> Aug 16 13:14:16 nas-hz-02 Error for Command: write(10) Error Level: Retryable
> Aug 16 13:14:16 nas-hz-02 scsi: Requested Block: 1380679072 Error Block: 1380679072
> Aug 16 13:14:16 nas-hz-02 scsi: Vendor: DELL Serial Number:
> Aug 16 13:14:16 nas-hz-02 scsi: Sense Key: Unit Attention
> Aug 16 13:14:16 nas-hz-02 scsi: ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0
> Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa1802984b8f0cd2 (sd35):
> Aug 16 13:14:16 nas-hz-02 Error for Command: write(10) Error Level: Retryable
> Aug 16 13:14:16 nas-hz-02 scsi: Requested Block: 1380679072 Error Block: 1380679072
> Aug 16 13:14:16 nas-hz-02 scsi: Vendor: DELL Serial Number:
> Aug 16 13:14:16 nas-hz-02 scsi: Sense Key: Unit Attention
> Aug 16 13:14:16 nas-hz-02 scsi: ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0

-- Andrew Gabriel
[zfs-discuss] solaris 10u8 hangs with message Disconnected command timeout for Target 0
Hi,

My Solaris storage server hangs. I went to the console and there are messages [1] displayed on it.
I can't log in at the console, and it seems I/O is totally blocked.

The system is Solaris 10u8 on a Dell R710 with a Dell MD3000 disk array. Two HBA cables connect the server to the MD3000.
The symptom is random.

It would be much appreciated if anyone can help me out.

Regards,
Ding

[1]
Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /pci@0,0/pci8086,3410@9/pci8086,32c@0/pci1028,1f04@8 (mpt1):
Aug 16 13:14:16 nas-hz-02 Disconnected command timeout for Target 0
Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa1802a44b8f0ded (sd47):
Aug 16 13:14:16 nas-hz-02 Error for Command: write(10) Error Level: Retryable
Aug 16 13:14:16 nas-hz-02 scsi: Requested Block: 1380679073 Error Block: 1380679073
Aug 16 13:14:16 nas-hz-02 scsi: Vendor: DELL Serial Number:
Aug 16 13:14:16 nas-hz-02 scsi: Sense Key: Unit Attention
Aug 16 13:14:16 nas-hz-02 scsi: ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0
Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa18029e4b8f0d61 (sd41):
Aug 16 13:14:16 nas-hz-02 Error for Command: write(10) Error Level: Retryable
Aug 16 13:14:16 nas-hz-02 scsi: Requested Block: 1380679072 Error Block: 1380679072
Aug 16 13:14:16 nas-hz-02 scsi: Vendor: DELL Serial Number:
Aug 16 13:14:16 nas-hz-02 scsi: Sense Key: Unit Attention
Aug 16 13:14:16 nas-hz-02 scsi: ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0
Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa1802a24b8f0dc5 (sd45):
Aug 16 13:14:16 nas-hz-02 Error for Command: write(10) Error Level: Retryable
Aug 16 13:14:16 nas-hz-02 scsi: Requested Block: 1380679073 Error Block: 1380679073
Aug 16 13:14:16 nas-hz-02 scsi: Vendor: DELL Serial Number:
Aug 16 13:14:16 nas-hz-02 scsi: Sense Key: Unit Attention
Aug 16 13:14:16 nas-hz-02 scsi: ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0
Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa18029c4b8f0d35 (sd39):
Aug 16 13:14:16 nas-hz-02 Error for Command: write(10) Error Level: Retryable
Aug 16 13:14:16 nas-hz-02 scsi: Requested Block: 1380679072 Error Block: 1380679072
Aug 16 13:14:16 nas-hz-02 scsi: Vendor: DELL Serial Number:
Aug 16 13:14:16 nas-hz-02 scsi: Sense Key: Unit Attention
Aug 16 13:14:16 nas-hz-02 scsi: ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0
Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa1802984b8f0cd2 (sd35):
Aug 16 13:14:16 nas-hz-02 Error for Command: write(10) Error Level: Retryable
Aug 16 13:14:16 nas-hz-02 scsi: Requested Block: 1380679072 Error Block: 1380679072
Aug 16 13:14:16 nas-hz-02 scsi: Vendor: DELL Serial Number:
Aug 16 13:14:16 nas-hz-02 scsi: Sense Key: Unit Attention
Aug 16 13:14:16 nas-hz-02 scsi: ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0