Your message dated Sun, 15 Sep 2013 18:04:07 +0200
with message-id <[email protected]>
and subject line Re: Bug#700444: multipath-tools: Automatic path recovery not 
occurring on IBM Power with dual VIOS served disks.
has caused the Debian Bug report #700444,
regarding multipath-tools: Automatic path recovery not occurring on IBM Power 
with dual VIOS served disks.
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)


-- 
700444: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=700444
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Package: multipath-tools
Version: 0.4.8+git0.761c66f-10
Severity: normal

Hello,

On an IBM Power LPAR with dual virtual I/O server (VIOS) backed disk
devices, multipath seems unable to recover temporarily failed disk
paths (e.g. after one VIOS is restarted following maintenance). The
serial console shows:

[  325.695570] ibmvscsi 30000003: Virtual adapter failed rc 2!
[  325.799041] ibmvscsi 30000003: SRP_VERSION: 16.a
[  325.799076] ibmvscsi 30000003: Partner adapter not ready
[  325.799087] ibmvscsi 30000003: error after reset
[  326.072253] sd 1:0:1:0: [sdb] Unhandled error code
[  326.072271] sd 1:0:1:0: [sdb]  Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[  326.072281] sd 1:0:1:0: [sdb] CDB: Read(10): 28 00 04 42 23 23 00 00 08 00
[  326.072308] end_request: I/O error, dev sdb, sector 71443235
[  326.072321] device-mapper: multipath: Failing path 8:16.
[  330.538142] sd 1:0:1:0: [sdb] Unhandled error code
[  330.538157] sd 1:0:1:0: [sdb]  Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[  330.538165] sd 1:0:1:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
[  330.538183] end_request: I/O error, dev sdb, sector 0
[  335.538861] sd 1:0:1:0: [sdb] Unhandled error code
[  335.538876] sd 1:0:1:0: [sdb]  Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[  335.538884] sd 1:0:1:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
[  335.538902] end_request: I/O error, dev sdb, sector 0
...
... loops forever ...

Any ideas what might be causing this and/or how it could be resolved?

Thanks & best regards,

Frank Fegert


-- Package-specific info:
Contents of /etc/multipath.conf:
defaults {
        getuid_callout  "/lib/udev/scsi_id -g -u -d /dev/%n"
        no_path_retry  10
        user_friendly_names yes
}
blacklist {
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^hd[a-z][[0-9]*]"
        devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
}
multipaths {
    multipath {
        wwid                    36005076801918127980000000000023C
    }
}
devices {
}


-- System Information:
Debian Release: 6.0.6
  APT prefers stable
  APT policy: (990, 'stable'), (500, 'stable-updates')
Architecture: powerpc (ppc64)

Kernel: Linux 2.6.39-bpo.2-powerpc (SMP w/6 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages multipath-tools depends on:
ii  initscripts        2.88dsf-13.1+squeeze1 scripts for initializing and shutt
ii  kpartx             0.4.8+git0.761c66f-10 create device mappings for partiti
ii  libaio1            0.3.107-7             Linux kernel AIO access library - 
ii  libc6              2.11.3-4              Embedded GNU C Library: Shared lib
ii  libdevmapper1.02.1 2:1.02.48-5           The Linux Kernel Device Mapper use
ii  libncurses5        5.7+20100313-5        shared libraries for terminal hand
ii  libreadline6       6.1-3                 GNU readline and history libraries
ii  lsb-base           3.2-23.2squeeze1      Linux Standard Base 3.2 init scrip
ii  udev               164-3                 /dev/ and hotplug management daemo

multipath-tools recommends no packages.

Versions of packages multipath-tools suggests:
ii  multipath-tools-bo 0.4.8+git0.761c66f-10 Support booting from multipath dev

-- no debconf information

--- End Message ---
--- Begin Message ---
Hello,

On Mon, Mar 11, 2013 at 12:22:22PM +0530, Ritesh Raj Sarraf wrote:
> That means multipath is reporting the correct status. The scsi devices
> never recovered.
> What is the device driver (and HBA) that is under use? Is it supported
> under Debian?
> I see you are on 2.6.39. Have you tried with a newer kernel?
> 
> You should also check if rescanning the SCSI bus changes any state. Use
> rescan-scsi-bus.

I've finally had a chance to clear out a machine for further testing.
Now there is only the Debian LPAR and two VIO server LPARs providing
the virtual disk and network resources. Upon deliberately restarting
one VIO server, and thus effectively cutting off access to sda, I
get the following dmesg entries:

  [2496289.407987] ibmvscsi 30000002: Virtual adapter failed rc 2!
  [2496289.508832] ibmvscsi 30000002: SRP_VERSION: 16.a
  [2496289.508845] ibmvscsi 30000002: Partner adapter not ready
  [2496289.508849] ibmvscsi 30000002: error after reset
  [2496294.813039] sd 0:0:1:0: [sda] Unhandled error code
  [2496294.813046] sd 0:0:1:0: [sda]  Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
  [2496294.813054] sd 0:0:1:0: [sda] CDB: Write(10): 2a 00 00 83 f7 25 00 00 08 00
  [2496294.813072] end_request: I/O error, dev sda, sector 8648485
  [2496294.813086] sd 0:0:1:0: [sda] Unhandled error code
  [2496294.813090] sd 0:0:1:0: [sda]  Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
  [2496294.813096] sd 0:0:1:0: [sda] CDB: Write(10): 2a 00 00 77 e5 cd 00 00 08 00
  [2496294.813112] end_request: I/O error, dev sda, sector 7857613
  [2496294.813119] device-mapper: multipath: Failing path 8:0.
  [2496297.365090] sd 0:0:1:0: [sda] Unhandled error code
  [2496297.365099] sd 0:0:1:0: [sda]  Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
  [2496297.365105] sd 0:0:1:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
  [2496297.365121] end_request: I/O error, dev sda, sector 0
  ...

  ststnagios02:~# multipath -ll
  mpath0 (36005076801918127980000000000023b) dm-0 AIX,VDASD
  size=36G features='1 queue_if_no_path' hwhandler='0' wp=rw
  `-+- policy='round-robin 0' prio=1 status=active
    |- 0:0:1:0 sda 8:0  failed faulty running
    `- 1:0:1:0 sdb 8:16 active ready  running
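
For anyone who wants to spot such a failed path without reading the
tree by eye, here is a minimal sketch in Python. It assumes the
`multipath -ll` layout shown above (tree prefix, H:C:T:L, device node,
major:minor, then three state words); other multipath-tools versions
may print a different layout. The names PATH_RE, parse_paths and
failed_paths are made up for this illustration and are not part of
multipath-tools.

```python
import re

# Sample copied from the `multipath -ll` output above (indentation trimmed).
SAMPLE = """\
mpath0 (36005076801918127980000000000023b) dm-0 AIX,VDASD
size=36G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 0:0:1:0 sda 8:0  failed faulty running
  `- 1:0:1:0 sdb 8:16 active ready  running
"""

# Assumed path-line layout: tree characters, "-", H:C:T:L, device node,
# major:minor, device-mapper state, checker state, SCSI device state.
PATH_RE = re.compile(
    r"^[\s|`]*-\s+(?P<hctl>\d+:\d+:\d+:\d+)\s+(?P<dev>\S+)\s+"
    r"(?P<majmin>\d+:\d+)\s+(?P<dm>\S+)\s+(?P<checker>\S+)\s+(?P<online>\S+)\s*$"
)


def parse_paths(text):
    """Return one dict per path line found in the given output."""
    paths = []
    for line in text.splitlines():
        match = PATH_RE.match(line)
        if match:
            paths.append(match.groupdict())
    return paths


def failed_paths(text):
    """Return device nodes whose path is not reported as 'active ready'."""
    return [
        p["dev"]
        for p in parse_paths(text)
        if p["dm"] != "active" or p["checker"] != "ready"
    ]


if __name__ == "__main__":
    print(failed_paths(SAMPLE))
```

Run against the sample, this reports only sda, which matches the
"failed faulty" line in the output above.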

As you can see, the "ibmvscsi" driver is used in this case. Following
your advice about the kernel version, I looked through the kernel
sources for changes to "ibmvscsi.c" between version 2.6.39, which I
currently use, and the most recent kernel version. There are only a
few, so I decided to apply them individually and narrow the issue down
step by step. After applying the very first code change
(https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/scsi/ibmvscsi/ibmvscsi.c?id=201aed678482f247aa96bd8fcd9e960fefd82d59)
to my otherwise unchanged v2.6.39 kernel and running another test
cycle, I now get the following behaviour:

  [ 2779.759149] ibmvscsi 30000003: Virtual adapter failed rc 2!
  [ 2779.860049] ibmvscsi 30000003: SRP_VERSION: 16.a
  [ 2779.860066] ibmvscsi 30000003: Partner adapter not ready
  [ 2779.860071] ibmvscsi 30000003: error after reset
  [ 2783.796245] sd 1:0:1:0: [sdb] Unhandled error code
  [ 2783.796257] sd 1:0:1:0: [sdb]  Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
  [ 2783.796265] sd 1:0:1:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
  [ 2783.796282] end_request: I/O error, dev sdb, sector 0
  [ 2793.337931] sd 1:0:1:0: [sdb] Unhandled error code
  [ 2793.337945] sd 1:0:1:0: [sdb]  Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
  [ 2793.337952] sd 1:0:1:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
  [ 2793.337969] end_request: I/O error, dev sdb, sector 0
  [ 2793.338056] device-mapper: multipath: Failing path 8:16.
  ...
  [ 2928.357022] sd 1:0:1:0: [sdb] Unhandled error code
  [ 2928.357041] sd 1:0:1:0: [sdb]  Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
  [ 2928.357050] sd 1:0:1:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
  [ 2928.357068] end_request: I/O error, dev sdb, sector 0
  [ 2931.893990] ibmvscsi 30000003: partner initialized
  [ 2931.894073] ibmvscsi 30000003: host srp version: 16.a, host partition vios2-p550-222 (2), OS 3, max io 262144
  [ 2931.894145] ibmvscsi 30000003: Client reserve enabled
  [ 2931.894153] ibmvscsi 30000003: sent SRP login
  [ 2931.894195] ibmvscsi 30000003: SRP_LOGIN succeeded

And with that, the paths are also able to recover successfully! So
it really is a kernel/driver issue, which I can now easily resolve.
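
The difference between the two test cycles can also be checked
mechanically. The following is a rough sketch in Python that scans
dmesg text for the ibmvscsi messages quoted above and records, per
virtual adapter, whether a "Virtual adapter failed" was later followed
by "SRP_LOGIN succeeded". It only assumes the message wording seen in
these logs; LINE_RE and adapter_states are names invented for this
example.

```python
import re

# Matches lines such as:
#   [ 2779.759149] ibmvscsi 30000003: Virtual adapter failed rc 2!
LINE_RE = re.compile(
    r"^\s*\[\s*(?P<ts>\d+\.\d+)\]\s+ibmvscsi\s+(?P<adapter>\S+):\s+(?P<msg>.*)$"
)


def adapter_states(log_text):
    """Map each ibmvscsi adapter to 'failed' or 'recovered'.

    The last relevant message per adapter wins, so a failure that is
    followed by a successful SRP login counts as recovered.
    """
    states = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        msg = match.group("msg")
        if msg.startswith("Virtual adapter failed"):
            states[match.group("adapter")] = "failed"
        elif msg.startswith("SRP_LOGIN succeeded"):
            states[match.group("adapter")] = "recovered"
    return states


# Excerpts copied from the two dmesg logs in this message.
UNPATCHED = """\
  [2496289.407987] ibmvscsi 30000002: Virtual adapter failed rc 2!
  [2496289.508845] ibmvscsi 30000002: Partner adapter not ready
  [2496289.508849] ibmvscsi 30000002: error after reset
"""

PATCHED = """\
  [ 2779.759149] ibmvscsi 30000003: Virtual adapter failed rc 2!
  [ 2779.860071] ibmvscsi 30000003: error after reset
  [ 2931.893990] ibmvscsi 30000003: partner initialized
  [ 2931.894153] ibmvscsi 30000003: sent SRP login
  [ 2931.894195] ibmvscsi 30000003: SRP_LOGIN succeeded
"""

if __name__ == "__main__":
    print(adapter_states(UNPATCHED))
    print(adapter_states(PATCHED))
```

On the unpatched kernel the adapter stays in the failed state, while
with the patch applied the SRP login succeeds once the VIOS is back.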

Thanks & best regards,

    Frank Fegert

--- End Message ---
