Bug#700444: multipath-tools: Automatic path recovery not occurring on IBM Power with dual VIOS served disks.
On Sunday 10 March 2013 07:21 PM, Frank Fegert wrote:
> ststnagios02:~# dmesg | tail
> [6835466.341578] sd 0:0:1:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
> [6835466.341592] end_request: I/O error, dev sda, sector 0
> [6835467.341974] sd 1:0:1:0: [sdb] Unhandled error code
> [6835467.341983] sd 1:0:1:0: [sdb] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> [6835467.341990] sd 1:0:1:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
> [6835467.342003] end_request: I/O error, dev sdb, sector 0
> [6835471.342506] sd 0:0:1:0: [sda] Unhandled error code
> [6835471.342516] sd 0:0:1:0: [sda] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> [6835471.342523] sd 0:0:1:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
> [6835471.342536] end_request: I/O error, dev sda, sector 0
>
> ststnagios02:~# /sbin/scsiinfo -l
> /dev/sda /dev/sdb
>
> ststnagios02:~# /sbin/scsiinfo -i /dev/sda
> INQUIRY command status = 1
>
> ststnagios02:~# /sbin/scsiinfo -i /dev/sdb
> INQUIRY command status = 1
>
> The Debian system is currently still in the "floating" state, so if
> there's any other test you'd like me to run, that'd be no problem.
> All other AIX systems on the test hardware successfully recovered their
> paths though.

That means multipath is reporting the correct status. The SCSI devices
never recovered.

Which device driver (and HBA) is in use? Is it supported under Debian?
I see you are on 2.6.39. Have you tried a newer kernel?

You should also check whether rescanning the SCSI bus changes any state.
Use rescan-scsi-bus.

--
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
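To see whether a rescan changes anything, it helps to record the kernel's view of each disk before and after. The following is a minimal sketch, assuming the usual sysfs layout `/sys/block/sdX/device/state`; the function name `scsi_disk_states` and its optional root-directory argument are illustrative and not part of any tool mentioned in this thread.

```shell
# Print "<disk> <state>" for every sd* disk found under a sysfs root.
# The root defaults to /sys; another directory can be passed for testing.
scsi_disk_states() {
    sysfs_root=${1:-/sys}
    for state_file in "$sysfs_root"/block/sd*/device/state; do
        # Skip the unexpanded glob when no sd* disks exist.
        [ -r "$state_file" ] || continue
        # Strip the leading path, then everything after the disk name.
        disk=${state_file#"$sysfs_root"/block/}
        disk=${disk%%/*}
        printf '%s %s\n' "$disk" "$(cat "$state_file")"
    done
}
```

Running `scsi_disk_states` once before and once after the rescan and comparing the two outputs would show whether any disk changed state. A device that reads `running` can still fail I/O, as the INQUIRY results quoted above suggest, so this complements rather than replaces a real SCSI command test.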
Bug#700444: multipath-tools: Automatic path recovery not occurring on IBM Power with dual VIOS served disks.
Hello,

sorry for the delayed reply!

On Wed, Feb 13, 2013 at 11:47:00PM +0100, Frank Fegert wrote:
> > When you thought the paths were back, were they responding to scsi commands?
>
> Sorry, didn't check that at the time.
>
> > You could use tools from the sg3-utils package or use the scsi_id
> > program to confirm that.
>
> I'll try to setup a test environment.

In a test environment, after rebooting each of the two VIOS in turn:

ststnagios02:~# multipath -ll
mpath0 (360050768019181279800023b) dm-0 AIX,VDASD
size=36G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=0 status=enabled
  |- 0:0:1:0 sda 8:0  failed faulty running
  `- 1:0:1:0 sdb 8:16 failed faulty running

ststnagios02:~# dmesg | tail
[6835466.341578] sd 0:0:1:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
[6835466.341592] end_request: I/O error, dev sda, sector 0
[6835467.341974] sd 1:0:1:0: [sdb] Unhandled error code
[6835467.341983] sd 1:0:1:0: [sdb] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[6835467.341990] sd 1:0:1:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
[6835467.342003] end_request: I/O error, dev sdb, sector 0
[6835471.342506] sd 0:0:1:0: [sda] Unhandled error code
[6835471.342516] sd 0:0:1:0: [sda] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[6835471.342523] sd 0:0:1:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
[6835471.342536] end_request: I/O error, dev sda, sector 0

ststnagios02:~# /sbin/scsiinfo -l
/dev/sda /dev/sdb

ststnagios02:~# /sbin/scsiinfo -i /dev/sda
INQUIRY command status = 1

ststnagios02:~# /sbin/scsiinfo -i /dev/sdb
INQUIRY command status = 1

The Debian system is currently still in the "floating" state, so if
there's any other test you'd like me to run, that'd be no problem.
All other AIX systems on the test hardware successfully recovered their
paths though.

Thanks & best regards,

Frank Fegert
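When repeating such a test, the per-path states can be pulled out of the `multipath -ll` output mechanically. This is a rough sketch, assuming the path-line layout shown in this message (an H:C:T:L tuple, the device name, major:minor, then the state columns); the function name `list_path_states` is made up for illustration, and other multipath-tools versions may format these lines differently.

```shell
# Read `multipath -ll` output on stdin and print one line per path:
# "<device> <dm-state> <checker-state>".
list_path_states() {
    awk '
        {
            # Path lines carry an H:C:T:L tuple followed by the sd device name.
            for (i = 1; i < NF; i++) {
                if ($i ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/) {
                    # Fields after the tuple: device, major:minor, states.
                    print $(i + 1), $(i + 3), $(i + 4)
                    break
                }
            }
        }
    '
}
```

Typical use would be `multipath -ll | list_path_states`, which for the output in this message should list both `sda` and `sdb` as `failed faulty`.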
Bug#700444: multipath-tools: Automatic path recovery not occurring on IBM Power with dual VIOS served disks.
Hello,

On Wed, Feb 13, 2013 at 11:47:28AM +0530, Ritesh Raj Sarraf wrote:
> When you said temporary, are you sure when the paths recovered back?

Well, all other LPARs running AIX got their paths back ;-) So the backend
and the VIOSes were providing multiple paths again. After the Debian LPAR
was rebooted, all paths were back to normal.

> When you thought the paths were back, were they responding to scsi commands?

Sorry, didn't check that at the time.

> You could use tools from the sg3-utils package or use the scsi_id
> program to confirm that.

I'll try to set up a test environment.

> Any good reasons for using "no_path_retry 10" ???

Off the top of my head, no. It might be a relic from earlier times, or it
might be derived from the IBM SVC recommendations for Linux host
attachments ('no_path_retry "5"'). In this case, though, the Debian LPAR
gets its SVC-backed disks from the two VIOS rather than directly from the
SVC.

Thanks & best regards,

Frank Fegert
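For a sense of scale: `no_path_retry N` keeps I/O queued for roughly N checker intervals after the last path fails, then fails it. The sketch below does that arithmetic, assuming one retry per polling interval and a `polling_interval` of 5 seconds. That value is an assumption, since the posted configuration does not set it, although the five-second spacing of the errors in the kernel log is consistent with it. The function name `queue_seconds` is illustrative only.

```shell
# Approximate time in seconds that I/O stays queued once all paths are down:
# no_path_retry retries, one per polling interval.
queue_seconds() {
    no_path_retry=$1
    polling_interval=${2:-5}   # assumed 5 s; not set in the posted config
    echo $(( no_path_retry * polling_interval ))
}
```

Under these assumptions, `queue_seconds 10` gives 50 and `queue_seconds 5` gives 25, so either setting would be exhausted well before a VIOS finishes a maintenance reboot.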
Bug#700444: multipath-tools: Automatic path recovery not occurring on IBM Power with dual VIOS served disks.
On Wednesday 13 February 2013 12:12 AM, Frank Fegert wrote:
> Hello,
>
> on an IBM Power LPAR with dual virtual I/O server (VIOS) backed disk
> devices, multipath seems not to be able to recover temporarily failed
> disk paths (e.g. after one VIOS is restarted after maintenance). The
> serial console shows:

When you said temporary, are you sure when the paths recovered back?

When you thought the paths were back, were they responding to SCSI commands?
You could use tools from the sg3-utils package or use the scsi_id
program to confirm that.

> [ 325.695570] ibmvscsi 3003: Virtual adapter failed rc 2!
> [ 325.799041] ibmvscsi 3003: SRP_VERSION: 16.a
> [ 325.799076] ibmvscsi 3003: Partner adapter not ready
> [ 325.799087] ibmvscsi 3003: error after reset
> [ 326.072253] sd 1:0:1:0: [sdb] Unhandled error code
> [ 326.072271] sd 1:0:1:0: [sdb] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> [ 326.072281] sd 1:0:1:0: [sdb] CDB: Read(10): 28 00 04 42 23 23 00 00 08 00
> [ 326.072308] end_request: I/O error, dev sdb, sector 71443235
> [ 326.072321] device-mapper: multipath: Failing path 8:16.
> [ 330.538142] sd 1:0:1:0: [sdb] Unhandled error code
> [ 330.538157] sd 1:0:1:0: [sdb] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> [ 330.538165] sd 1:0:1:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
> [ 330.538183] end_request: I/O error, dev sdb, sector 0
> [ 335.538861] sd 1:0:1:0: [sdb] Unhandled error code
> [ 335.538876] sd 1:0:1:0: [sdb] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> [ 335.538884] sd 1:0:1:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
> [ 335.538902] end_request: I/O error, dev sdb, sector 0
> ...
> ... loops forever ...

Those could very well be the messages triggered by the multipath checker
loop.

> Any ideas what may be causing this and/or how it could be resolved?
>
> Thanks & best regards,
>
> Frank Fegert
>
>
> -- Package-specific info:
> Contents of /etc/multipath.conf:
> defaults {
>     getuid_callout "/lib/udev/scsi_id -g -u -d /dev/%n"
>     no_path_retry 10
>     user_friendly_names yes
> }

Any good reasons for using "no_path_retry 10" ???

--
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."
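The check suggested above, whether each path answers SCSI commands, can be scripted. This is a minimal sketch, assuming that the probe command exits with status 0 when the device answers, as `sg_inq` from sg3-utils does on a successful INQUIRY. The function name `probe_paths` is hypothetical, and the probe command is passed in as an argument so that another tool can be substituted.

```shell
# Run a probe command against each device and report whether it answered.
# Usage: probe_paths <probe-command> <device>...
probe_paths() {
    probe_cmd=$1
    shift
    for dev in "$@"; do
        # A zero exit status from the probe is taken to mean the device answered.
        if "$probe_cmd" "$dev" >/dev/null 2>&1; then
            echo "$dev responding"
        else
            echo "$dev not-responding"
        fi
    done
}
```

For the two paths in this report the call would be `probe_paths sg_inq /dev/sda /dev/sdb`, run as root once the VIOS is believed to be back.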
Bug#700444: multipath-tools: Automatic path recovery not occurring on IBM Power with dual VIOS served disks.
Package: multipath-tools
Version: 0.4.8+git0.761c66f-10
Severity: normal

Hello,

on an IBM Power LPAR with dual virtual I/O server (VIOS) backed disk
devices, multipath seems not to be able to recover temporarily failed
disk paths (e.g. after one VIOS is restarted after maintenance). The
serial console shows:

[ 325.695570] ibmvscsi 3003: Virtual adapter failed rc 2!
[ 325.799041] ibmvscsi 3003: SRP_VERSION: 16.a
[ 325.799076] ibmvscsi 3003: Partner adapter not ready
[ 325.799087] ibmvscsi 3003: error after reset
[ 326.072253] sd 1:0:1:0: [sdb] Unhandled error code
[ 326.072271] sd 1:0:1:0: [sdb] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[ 326.072281] sd 1:0:1:0: [sdb] CDB: Read(10): 28 00 04 42 23 23 00 00 08 00
[ 326.072308] end_request: I/O error, dev sdb, sector 71443235
[ 326.072321] device-mapper: multipath: Failing path 8:16.
[ 330.538142] sd 1:0:1:0: [sdb] Unhandled error code
[ 330.538157] sd 1:0:1:0: [sdb] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[ 330.538165] sd 1:0:1:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
[ 330.538183] end_request: I/O error, dev sdb, sector 0
[ 335.538861] sd 1:0:1:0: [sdb] Unhandled error code
[ 335.538876] sd 1:0:1:0: [sdb] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[ 335.538884] sd 1:0:1:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
[ 335.538902] end_request: I/O error, dev sdb, sector 0
...
... loops forever ...

Any ideas what may be causing this and/or how it could be resolved?

Thanks & best regards,

Frank Fegert

-- Package-specific info:
Contents of /etc/multipath.conf:
defaults {
    getuid_callout "/lib/udev/scsi_id -g -u -d /dev/%n"
    no_path_retry 10
    user_friendly_names yes
}
blacklist {
    devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
    devnode "^hd[a-z][[0-9]*]"
    devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
}
multipaths {
    multipath {
        wwid 360050768019181279800023C
    }
}
devices {
}

-- System Information:
Debian Release: 6.0.6
  APT prefers stable
  APT policy: (990, 'stable'), (500, 'stable-updates')
Architecture: powerpc (ppc64)

Kernel: Linux 2.6.39-bpo.2-powerpc (SMP w/6 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages multipath-tools depends on:
ii  initscripts         2.88dsf-13.1+squeeze1  scripts for initializing and shutt
ii  kpartx              0.4.8+git0.761c66f-10  create device mappings for partiti
ii  libaio1             0.3.107-7              Linux kernel AIO access library -
ii  libc6               2.11.3-4               Embedded GNU C Library: Shared lib
ii  libdevmapper1.02.1  2:1.02.48-5            The Linux Kernel Device Mapper use
ii  libncurses5         5.7+20100313-5         shared libraries for terminal hand
ii  libreadline6        6.1-3                  GNU readline and history libraries
ii  lsb-base            3.2-23.2squeeze1       Linux Standard Base 3.2 init scrip
ii  udev                164-3                  /dev/ and hotplug management daemo

multipath-tools recommends no packages.

Versions of packages multipath-tools suggests:
ii  multipath-tools-bo  0.4.8+git0.761c66f-10  Support booting from multipath dev

-- no debconf information