Added a new attempt. This time the pick-list is rather excessive. Unfortunately the more likely changes come later. So while there might be some patches which probably can be dropped, I fear the overall set still will be big. (Download again: http://people.canonical.com/~smb/lp1567602/)
5da8bf5 scsi_dh_alua: Disable ALUA handling for non-disk devices 71b066c scsi_dh_alua: Use vpd_pg83 information fc63978 scsi_dh_alua: improved logging 5e60928 scsi_dh_alua: sanitze sense code handling 73a86b1 scsi_dh_alua: use standard logging functions 9ba4fe6 scsi_dh_alua: return standard SCSI return codes in submit_rtpg ae5408e scsi_dh_alua: fixup description of stpg_endio() 79ded51 scsi_dh_alua: use flag for RTPG extended header 9121b4c scsi_dh_alua: use unaligned access macros ce6f796 scsi_dh_alua: rework alua_check_tpgs() to return the tpgs mode 567d885 scsi_dh_alua: simplify sense code handling c6fe649 scsi: Add scsi_vpd_lun_id() f52649f scsi: Add scsi_vpd_tpg_id() cdacdc7 scsi_dh_alua: use scsi_vpd_tpg_id() 6a49629 scsi_dh_alua: Remove stale variables fdd3a8c scsi_dh_alua: Pass buffer as function argument 542a8c0 scsi_dh_alua: separate out alua_stpg() ab7ae03 scsi_dh_alua: Make stpg synchronous fe551f5 scsi_dh_alua: call alua_rtpg() if stpg fails fc55215 scsi_dh_alua: switch to scsi_execute_req_flags() 0b98a60 scsi_dh_alua: allocate RTPG buffer separately e4dee21 scsi_dh_alua: Use separate alua_port_group structure ecec8731 scsi_dh_alua: use unique device id b20a55f scsi_dh_alua: simplify alua_initialize() 81537d5 revert commit a8e5a2d593cb ("[SCSI] scsi_dh_alua: ALUA handler attach sh 487cdf2 scsi_dh_alua: move optimize_stpg evaluation b5cbbc6 scsi_dh_alua: remove 'rel_port' from alua_dh_data structure dd2af31 scsi_dh_alua: Use workqueue for RTPG bdf2fc3 scsi_dh_alua: Allow workqueue to run synchronously a936713 scsi_dh_alua: Add new blacklist flag 'BLIST_SYNC_ALUA' d69dd4d scsi_dh_alua: Recheck state on unit attention abcf58c scsi_dh_alua: update all port states 82d5f45 scsi_dh_alua: Send TEST UNIT READY to poll for transitioning 676fa7c scsi_dh_alua: Update version to 2.0 9f33778 scsi_dh_alua: do not fail for unknown VPD identification -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1567602 Title: FCP devices are not detected correctly nor deterministically Status in Release Notes for Ubuntu: Fix Released Status in Ubuntu on IBM z Systems: In Progress Status in linux package in Ubuntu: Fix Released Status in linux source package in Xenial: In Progress Bug description: Release notes: Usage of SCSI LUNs on DS8870 Storage server with μCode Bundles 87.51.xx.0 (LMC 7.7.51.xx) via NPIV enabled zfcp adaptors causes detection issues. LP #1567602 In that case do not use NPIV enabled zfcp adaptors. Scenario: Using Installer 432. No DASD devices, just two FCP CHPIDs with two LUNs each (configured for NPIV), provided via parmfile. I expect the installer to probe and detect the LUNs automatically once I enable the FCP CHPIDs. Repeated five times, I got different results each time. Detected LUNs vary between 2 and 4. One time 3 LUNs appear on SCSI1 and one on SCSI2 which looks especially odd to me. Subsequent steps to create partitions and file systems in the installer fail. Following is one of the five cases with more details: After enabling CHPID 192b: ~ # cat /proc/scsi/scsi Attached devices: Host: scsi0 Channel: 00 Id: 00 Lun: 1077493890 Vendor: IBM Model: 2107900 Rev: 1.27 Type: Direct-Access ANSI SCSI revision: 05 dmesg showing: [ 221.738816] qdio: 0.0.192b ZFCP on SC f using AI:1 QEBSM:1 PRI:1 TDD:1 SIGA: W A [ 226.895139] scsi 0:0:0:1077493890: Direct-Access IBM 2107900 1.27 PQ: 0 ANSI: 5 [ 226.895816] sd 0:0:0:1077493890: alua: supports implicit TPGS [ 226.896673] sd 0:0:0:1077493890: [sda] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [ 226.897825] sd 0:0:0:1077493890: [sda] Write Protect is off [ 226.897827] sd 0:0:0:1077493890: [sda] Mode Sense: ed 00 00 08 [ 226.898145] sd 0:0:0:1077493890: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 226.902571] sd 0:0:0:1077493890: [sda] Attached SCSI disk [ 287.117057] sd 0:0:0:1077493890: alua: evpd inquiry failed with 30000 [ 287.117062] sd 0:0:0:1077493890: alua: Attach failed (-22) [ 287.117064] sd 0:0:0:1077493890: failed to add device handler: -22 [ 287.147303] scsi 0:0:0:0: Unexpected response from lun 1077493890 while scanning, scan aborted As second step, I additionally enable CHPID 196b: ~ # cat /proc/scsi/scsi Attached devices: Host: scsi0 Channel: 00 Id: 00 Lun: 1077493890 Vendor: IBM Model: 2107900 Rev: 1.27 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi1 Channel: 00 Id: 00 Lun: 1077493890 Vendor: IBM Model: 2107900 Rev: 1.27 Type: Direct-Access ANSI SCSI revision: 05 dmesg showing: [ 384.277394] scsi host1: zfcp [ 384.286516] qdio: 0.0.196b ZFCP on SC 10 using AI:1 QEBSM:1 PRI:1 TDD:1 SIGA: W A [ 385.377511] scsi 1:0:0:1077493890: Direct-Access IBM 2107900 1.27 PQ: 0 ANSI: 5 [ 385.378120] sd 1:0:0:1077493890: alua: supports implicit TPGS [ 385.378781] sd 1:0:0:1077493890: [sdb] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [ 385.380097] sd 1:0:0:1077493890: [sdb] Write Protect is off [ 385.380099] sd 1:0:0:1077493890: [sdb] Mode Sense: ed 00 00 08 [ 385.380408] sd 1:0:0:1077493890: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 385.384969] sd 1:0:0:1077493890: [sdb] Attached SCSI disk [ 446.117041] sd 1:0:0:1077493890: alua: evpd inquiry failed with 30000 [ 446.117046] sd 1:0:0:1077493890: alua: Attach failed (-22) [ 446.117048] sd 1:0:0:1077493890: failed to add device handler: -22 [ 446.147158] scsi 1:0:0:0: Unexpected response from lun 1077493890 while scanning, scan aborted Next, the installer warns that no disk drives were detected. I checked this on a system (installer 445) with two NPIV enabled FCP devices that have 4 LUNS each. I only get the first of four LUNs per adaptor: Apr 6 12:57:08 anna[2607]: DEBUG: retrieving zipl-installer 0.0.33ubuntu2 Apr 6 12:57:08 anna[2607]: 2016-04-06 12:57:08 URL:http://ports.ubuntu.com//pool/main/z/zipl-installer/zipl-installer_0.0.33ubuntu2_s390x.udeb [4994/4994] -> "/var/cache/anna/zipl-installer_0.0.33ubuntu2_s390x.udeb" [1] Apr 6 12:57:09 main-menu[384]: INFO: Menu item 'driver-injection-disk-detect' selected Apr 6 12:57:10 main-menu[384]: INFO: Menu item 'user-setup-udeb' selected Apr 6 12:57:10 main-menu[384]: INFO: Menu item 'clock-setup' selected Apr 6 12:57:10 main-menu[384]: INFO: Menu item 's390-dasd' selected Apr 6 12:57:10 s390-dasd[6877]: INFO: s390-dasd: no channel found Apr 6 12:57:10 main-menu[384]: INFO: Menu item 's390-zfcp' selected Apr 6 12:57:10 s390-zfcp[6891]: DEBUG: DETECT: The zfcp device driver is not loaded Apr 6 12:57:10 s390-zfcp[6891]: WARNING **: Could not open directory: /sys/bus/ccw/drivers/zfcp/: No such file or directory Apr 6 12:57:11 main-menu[384]: INFO: Menu item 'disk-detect' selected Apr 6 12:57:11 anna-install: Installing mmc-modules Apr 6 12:57:11 disk-detect: insmod /lib/modules/4.4.0-17-generic/kernel/drivers/scsi/device_handler/scsi_dh_rdac.ko Apr 6 12:57:11 disk-detect: insmod /lib/modules/4.4.0-17-generic/kernel/drivers/scsi/device_handler/scsi_dh_alua.ko Apr 6 12:57:11 disk-detect: insmod /lib/modules/4.4.0-17-generic/kernel/drivers/scsi/device_handler/scsi_dh_hp_sw.ko Apr 6 12:57:11 disk-detect: insmod /lib/modules/4.4.0-17-generic/kernel/drivers/scsi/device_handler/scsi_dh_emc.ko Apr 6 12:57:11 kernel: [ 40.509881] rdac: device handler registered Apr 6 12:57:11 kernel: [ 40.513219] alua: device handler registered Apr 6 12:57:11 kernel: [ 40.515517] hp_sw: device handler registered Apr 6 12:57:11 kernel: [ 40.517572] emc: device handler registered Apr 6 12:57:11 net/hw-detect.hotplug: Detected hotpluggable network interface encf5f0 Apr 6 12:57:11 net/hw-detect.hotplug: Detected hotpluggable network interface lo Apr 6 12:57:11 apt-install: Queueing package udev for later installation Apr 6 12:57:11 apt-install: Queueing package pciutils for later installation Apr 6 12:57:12 check-missing-firmware: looking at dmesg for the first time Apr 6 12:57:12 check-missing-firmware: saving timestamp for a later use: [ 40.517572] Apr 6 12:57:12 check-missing-firmware: /dev/.udev/firmware-missing does not exist, skipping Apr 6 12:57:12 check-missing-firmware: /run/udev/firmware-missing does not exist, skipping Apr 6 12:57:12 check-missing-firmware: no missing firmware in loaded kernel modules Apr 6 12:58:29 main-menu[384]: (process:6908): unknown udeb mmc-modules Apr 6 12:58:29 main-menu[384]: INFO: Menu item 'disk-detect' succeeded but requested to be left unconfigured. Apr 6 12:58:31 main-menu[384]: INFO: Menu item 's390-zfcp' selected Apr 6 12:58:31 s390-zfcp[7752]: DEBUG: DETECT: Added FCP device: 0.0.1904: online=0 npiv=0 Apr 6 12:58:31 s390-zfcp[7752]: DEBUG: DETECT: Added FCP device: 0.0.1944: online=0 npiv=0 Apr 6 12:58:31 s390-zfcp[7752]: DEBUG: DETECT: Automatic LUN scanning is enabled Apr 6 12:58:31 s390-zfcp[7752]: DEBUG: PRESEED: No preseed data available Apr 6 12:58:40 s390-zfcp[7752]: DEBUG: SELECT: Using FCP device 0.0.1904 Apr 6 12:58:40 kernel: [ 129.591146] scsi host0: scsi_eh_0: sleeping Apr 6 12:58:40 kernel: [ 129.591163] scsi host0: zfcp Apr 6 12:58:40 kernel: [ 129.598187] qdio: 0.0.1904 ZFCP on SC f using AI:1 QEBSM:1 PRI:1 TDD:1 SIGA: W A Apr 6 12:58:41 s390-zfcp[7752]: DEBUG: ENABLE: Activated FCP device 0.0.1904 (npiv=1) Apr 6 12:58:41 kernel: [ 130.873257] scsi 0:0:0:0: scsi scan: INQUIRY pass 1 length 36 Apr 6 12:58:41 kernel: [ 130.873581] scsi 0:0:0:0: scsi scan: INQUIRY successful with code 0x0 Apr 6 12:58:41 kernel: [ 130.873593] scsi 0:0:0:0: scsi scan: INQUIRY pass 2 length 164 Apr 6 12:58:41 kernel: [ 130.873802] scsi 0:0:0:0: scsi scan: INQUIRY successful with code 0x0 Apr 6 12:58:41 kernel: [ 130.873807] scsi 0:0:0:0: scsi scan: peripheral device type of 31, no device added Apr 6 12:58:41 kernel: [ 130.874148] scsi 0:0:0:0: scsi scan: Sending REPORT LUNS to (try 0) Apr 6 12:58:41 kernel: [ 130.875003] scsi 0:0:0:0: scsi scan: REPORT LUNS successful (try 0) result 0x0 Apr 6 12:58:41 kernel: [ 130.875005] scsi 0:0:0:0: scsi scan: REPORT LUN scan Apr 6 12:58:41 kernel: [ 130.875176] scsi 0:0:0:1074937986: scsi scan: INQUIRY pass 1 length 36 Apr 6 12:58:41 kernel: [ 130.875395] scsi 0:0:0:1074937986: scsi scan: INQUIRY successful with code 0x0 Apr 6 12:58:41 kernel: [ 130.875399] scsi 0:0:0:1074937986: scsi scan: INQUIRY pass 2 length 164 Apr 6 12:58:41 kernel: [ 130.875608] scsi 0:0:0:1074937986: scsi scan: INQUIRY successful with code 0x0 Apr 6 12:58:41 kernel: [ 130.875613] scsi 0:0:0:1074937986: Direct-Access IBM 2107900 1.27 PQ: 0 ANSI: 5 Apr 6 12:58:41 kernel: [ 130.876181] sd 0:0:0:1074937986: alua: supports implicit TPGS Apr 6 12:58:41 kernel: [ 130.876323] sd 0:0:0:1074937986: tag#1 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK Apr 6 12:58:41 kernel: [ 130.876326] sd 0:0:0:1074937986: tag#1 CDB: Test Unit Ready 00 00 00 00 00 00 Apr 6 12:58:41 kernel: [ 130.876328] sd 0:0:0:1074937986: tag#1 Sense Key : Unit Attention [current] Apr 6 12:58:41 kernel: [ 130.876333] sd 0:0:0:1074937986: tag#1 Add. Sense: Power on, reset, or bus device reset occurred Apr 6 12:58:41 kernel: [ 130.876722] sd 0:0:0:1074937986: [sda] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) Apr 6 12:58:41 kernel: [ 130.877759] sd 0:0:0:1074937986: [sda] Write Protect is off Apr 6 12:58:41 kernel: [ 130.877761] sd 0:0:0:1074937986: [sda] Mode Sense: ed 00 00 08 Apr 6 12:58:41 kernel: [ 130.878085] sd 0:0:0:1074937986: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA Apr 6 12:58:41 kernel: [ 130.878295] sd 0:0:0:1074937986: tag#1 Done: SUCCESS Result: hostbyte=DID_TARGET_FAILURE driverbyte=DRIVER_OK Apr 6 12:58:41 kernel: [ 130.878297] sd 0:0:0:1074937986: tag#1 CDB: Report supported operation codes a3 0c 01 12 00 00 00 00 02 00 00 00 Apr 6 12:58:41 kernel: [ 130.878298] sd 0:0:0:1074937986: tag#1 Sense Key : Illegal Request [current] Apr 6 12:58:41 kernel: [ 130.878300] sd 0:0:0:1074937986: tag#1 Add. Sense: Invalid field in cdb Apr 6 12:58:41 kernel: [ 130.882710] sd 0:0:0:1074937986: [sda] Attached SCSI disk Apr 6 12:58:43 s390-zfcp[7752]: DEBUG: POPULATE LUN TREE: 0.0.1904:0x50050763070845e3:0x4082401200000000 [0:0:0:1074937986] Apr 6 12:58:43 s390-zfcp[7752]: DEBUG: WRITE CONFIG: /etc/sysconfig/hardware/config-ccw-0.0.1904 Apr 6 12:58:47 s390-zfcp[7752]: DEBUG: SELECT: Using FCP device 0.0.1944 Apr 6 12:58:47 kernel: [ 136.279385] scsi host1: scsi_eh_1: sleeping Apr 6 12:58:47 kernel: [ 136.279396] scsi host1: zfcp Apr 6 12:58:47 kernel: [ 136.290784] qdio: 0.0.1944 ZFCP on SC 10 using AI:1 QEBSM:1 PRI:1 TDD:1 SIGA: W A Apr 6 12:58:52 s390-zfcp[7752]: DEBUG: ENABLE: Activated FCP device 0.0.1944 (npiv=1) Apr 6 12:58:52 kernel: [ 141.278081] scsi 1:0:0:0: scsi scan: INQUIRY pass 1 length 36 Apr 6 12:58:52 kernel: [ 141.278691] scsi 1:0:0:0: scsi scan: INQUIRY successful with code 0x0 Apr 6 12:58:52 kernel: [ 141.278698] scsi 1:0:0:0: scsi scan: INQUIRY pass 2 length 164 Apr 6 12:58:52 kernel: [ 141.278910] scsi 1:0:0:0: scsi scan: INQUIRY successful with code 0x0 Apr 6 12:58:52 kernel: [ 141.278915] scsi 1:0:0:0: scsi scan: peripheral device type of 31, no device added Apr 6 12:58:52 kernel: [ 141.279221] scsi 1:0:0:0: scsi scan: Sending REPORT LUNS to (try 0) Apr 6 12:58:52 kernel: [ 141.280115] scsi 1:0:0:0: scsi scan: REPORT LUNS successful (try 0) result 0x0 Apr 6 12:58:52 kernel: [ 141.280119] scsi 1:0:0:0: scsi scan: REPORT LUN scan Apr 6 12:58:52 kernel: [ 141.280296] scsi 1:0:0:1074937986: scsi scan: INQUIRY pass 1 length 36 Apr 6 12:58:52 kernel: [ 141.280507] scsi 1:0:0:1074937986: scsi scan: INQUIRY successful with code 0x0 Apr 6 12:58:52 kernel: [ 141.280511] scsi 1:0:0:1074937986: scsi scan: INQUIRY pass 2 length 164 Apr 6 12:58:52 kernel: [ 141.280693] scsi 1:0:0:1074937986: scsi scan: INQUIRY successful with code 0x0 Apr 6 12:58:52 kernel: [ 141.280698] scsi 1:0:0:1074937986: Direct-Access IBM 2107900 1.27 PQ: 0 ANSI: 5 Apr 6 12:58:52 kernel: [ 141.281519] sd 1:0:0:1074937986: alua: supports implicit TPGS Apr 6 12:58:52 kernel: [ 141.281798] sd 1:0:0:1074937986: tag#1 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK Apr 6 12:58:52 kernel: [ 141.281801] sd 1:0:0:1074937986: tag#1 CDB: Test Unit Ready 00 00 00 00 00 00 Apr 6 12:58:52 kernel: [ 141.281802] sd 1:0:0:1074937986: tag#1 Sense Key : Unit Attention [current] Apr 6 12:58:52 kernel: [ 141.281804] sd 1:0:0:1074937986: tag#1 Add. Sense: Power on, reset, or bus device reset occurred Apr 6 12:58:52 kernel: [ 141.282130] sd 1:0:0:1074937986: [sdb] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) Apr 6 12:58:52 kernel: [ 141.283353] sd 1:0:0:1074937986: [sdb] Write Protect is off Apr 6 12:58:52 kernel: [ 141.283355] sd 1:0:0:1074937986: [sdb] Mode Sense: ed 00 00 08 Apr 6 12:58:52 kernel: [ 141.283637] sd 1:0:0:1074937986: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA Apr 6 12:58:52 kernel: [ 141.283758] sd 1:0:0:1074937986: tag#1 Done: SUCCESS Result: hostbyte=DID_TARGET_FAILURE driverbyte=DRIVER_OK Apr 6 12:58:52 kernel: [ 141.283760] sd 1:0:0:1074937986: tag#1 CDB: Report supported operation codes a3 0c 01 12 00 00 00 00 02 00 00 00 Apr 6 12:58:52 kernel: [ 141.283761] sd 1:0:0:1074937986: tag#1 Sense Key : Illegal Request [current] Apr 6 12:58:52 kernel: [ 141.283763] sd 1:0:0:1074937986: tag#1 Add. Sense: Invalid field in cdb Apr 6 12:58:52 kernel: [ 141.288209] sd 1:0:0:1074937986: [sdb] Attached SCSI disk Apr 6 12:58:53 s390-zfcp[7752]: DEBUG: POPULATE LUN TREE: 0.0.1944:0x50050763071845e3:0x4082401200000000 [1:0:0:1074937986] Apr 6 12:58:53 s390-zfcp[7752]: DEBUG: WRITE CONFIG: /etc/sysconfig/hardware/config-ccw-0.0.1944 It looks like the LUN scan is being terminated. Kernel parms with: scsi_mod.scsi_logging_level=4605 zfcp.dbflevel=6 zfcp.dbfsize=100 Re-testing with installer version 445 + udeb packages from 2016-04-07 s390-dasd 0.0.36ubuntu1, s390-zfcp 1.0.2ubuntu1, multipath-udeb 0.5.0+git1.656f8865-5ubuntu2, disk-detect 1.117ubuntu1 Some observations: 1. With one path only, the scanning/detection of all LUNs worked deterministically and correctly (including the final re-IPL after successful installation). 2. With two paths, but without the parmfile option 'disk-detect/multipath/enable=true', the LUNs were all detected, but once for every path. I did not want to continue installation that way. 3. With two paths and with the option 'disk-detect/multipath/enable=true', the multipath program was loaded and multipathing was set up correctly. ... 4. ... installation completed fine then... 5. .... but the final re-IPL after successful installation failed. Attaching last console messages of the IPL: 6. I could also see a difference in filesystem creation time for multipath and single path: with multipathing (2 paths) Apr 7 15:27:33 partman: mke2fs 1.42.13 (17-May-2015) Apr 7 15:28:34 kernel: [ 559.624851] EXT4-fs (dm-4): mounted filesystem with ordered data mode. Opts: errors=remount-ro without multipathing Apr 7 15:48:17 partman: mke2fs 1.42.13 (17-May-2015) Apr 7 15:48:50 kernel: [ 413.789054] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: errors=remount-ro To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu-release-notes/+bug/1567602/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp