Public bug reported:

---Problem Description---
We have Ubuntu 16.04.1 installed on our system and are running a DLPAR test for the ZR1 adapter. After some time, the system crashes at lpfc_sli4_scmd_to_wqidx_distr+0x30/0x100.

Machine Type = 9119-MME*1085AE7

---Debugger Data---
e:mon> e
cpu 0xe: Vector: 300 (Data Access) at [c0000003d45335a0]
    pc: d000000003a374e0: lpfc_sli4_scmd_to_wqidx_distr+0x30/0x100 [lpfc]
    lr: d0000000039d749c: lpfc_sli_calc_ring.part.20+0xdc/0x100 [lpfc]
    sp: c0000003d4533820
   msr: 8000000100009033
   dar: 0
 dsisr: 40000000
  current = 0xc0000003e06c2a20
  paca = 0xc000000007af8500 softe: 0 irq_happened: 0x01
    pid = 17983, comm = scsi_eh_23
e:mon> r
R00 = d0000000039d749c   R16 = 0000000000000000
R01 = c0000003d4533820   R17 = c0000003d4533cd0
R02 = d000000003a84160   R18 = c0000003d4533cb8
R03 = c0000003ee76a000   R19 = c0000003d87e5088
R04 = c0000003dad6a800   R20 = c0000003d4533cb0
R05 = c0000003dad6a870   R21 = 000000000000001e
R06 = 0000000000000001   R22 = c0000000018aab78
R07 = d000000003a84160   R23 = c0000003dad6a870
R08 = d000000003a2f830   R24 = c0000003dad6a800
R09 = 0000000000000004   R25 = c0000003d4533978
R10 = 0000000000000000   R26 = 0000000000000001
R11 = d000000003a59a50   R27 = 0000000000000000
R12 = 0000000028533824   R28 = c0000003e841e000
R13 = c000000007af8500   R29 = c0000003ee76a000
R14 = c0000003d87e5000   R30 = c0000003dad6a800
R15 = c0000003d4533cb8   R31 = c0000003ee76a000
pc = d000000003a374e0 lpfc_sli4_scmd_to_wqidx_distr+0x30/0x100 [lpfc]
cfar= c000000000008468 slb_miss_realmode+0x50/0x78
lr = d0000000039d749c lpfc_sli_calc_ring.part.20+0xdc/0x100 [lpfc]
msr = 8000000100009033   cr = 28538828
ctr = c000000000ae3cf0   xer = 0000000020000010   trap = 300
dar = 0000000000000000   dsisr = 40000000
e:mon>

Stack trace output:
e:mon> t
[c0000003d4533850] d0000000039d749c lpfc_sli_calc_ring.part.20+0xdc/0x100 [lpfc]
[c0000003d4533890] d0000000039df680 lpfc_sli_issue_iocb+0xf0/0x330 [lpfc]
[c0000003d45338f0] d0000000039e3824 lpfc_sli_issue_iocb_wait+0x264/0x680 [lpfc]
[c0000003d45339d0] d000000003a32944 lpfc_send_taskmgmt+0x2d4/0x7d0 [lpfc]
[c0000003d4533aa0] d000000003a33564 lpfc_device_reset_handler+0x114/0x210 [lpfc]
[c0000003d4533b60] c00000000075843c scsi_eh_ready_devs+0x68c/0xee0
[c0000003d4533c50] c00000000075a91c scsi_error_handler+0x6bc/0x9e0
[c0000003d4533d80] c0000000000e61e0 kthread+0x110/0x130
[c0000003d4533e30] c000000000009538 ret_from_kernel_thread+0x5c/0xa4
--- Exception: 0 at 0000000000000000
e:mon>
e:mon> dl
[10194.079284] sd 13:0:3:0: [sdaf] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[10194.079293] sd 13:0:3:0: [sdaf] tag#0 CDB: Read(10) 28 00 02 7f ff 80 00 00 80 00
[10194.079297] blk_update_request: I/O error, dev sdaf, sector 41942912
[10194.079313] device-mapper: multipath: Failing path 65:240.
[10194.079351] sd 13:0:2:0: [sdab] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[10194.079360] sd 13:0:2:0: [sdab] tag#0 CDB: Read(10) 28 00 02 7f ff 80 00 00 80 00
[10194.079364] blk_update_request: I/O error, dev sdab, sector 41942912
[10194.079375] device-mapper: multipath: Failing path 65:176.
[10194.102832] scsi 13:0:1:0: alua: Detached
[10194.110320] sd 13:0:1:1: [sdh] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[10194.110324] sd 13:0:1:1: [sdh] tag#0 CDB: Read(10) 28 00 02 7f ff 80 00 00 80 00
[10194.110326] blk_update_request: I/O error, dev sdh, sector 41942912
[10194.110334] device-mapper: multipath: Failing path 8:112.
[10194.110394] sd 13:0:2:1: [sdac] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[10194.110398] sd 13:0:2:1: [sdac] tag#0 CDB: Read(10) 28 00 02 7f ff 80 00 00 80 00
[10194.110401] blk_update_request: I/O error, dev sdac, sector 41942912
[10194.110407] device-mapper: multipath: Failing path 65:192.
[10194.110439] sd 13:0:3:1: [sdag] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK [10194.110448] sd 13:0:3:1: [sdag] tag#0 CDB: Read(10) 28 00 02 7f ff 80 00 00 80 00 [10194.110452] blk_update_request: I/O error, dev sdag, sector 41942912 [10194.110464] device-mapper: multipath: Failing path 66:0. [10194.118851] scsi 13:0:0:1: alua: Detached [10194.122868] sd 13:0:3:0: [sdaf] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK [10194.122879] sd 13:0:3:0: [sdaf] tag#0 CDB: Read(10) 28 00 02 7f ff 80 00 00 80 00 [10194.122887] blk_update_request: I/O error, dev sdaf, sector 41942912 [10194.122911] device-mapper: multipath: Failing path 65:240. [10194.138865] scsi 13:0:2:0: alua: Detached [10194.158852] scsi 13:0:3:0: alua: Detached [10194.162199] sd 13:0:3:1: [sdag] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK [10194.162204] sd 13:0:3:1: [sdag] tag#0 CDB: Read(10) 28 00 02 7f ff 80 00 00 80 00 [10194.162207] blk_update_request: I/O error, dev sdag, sector 41942912 [10194.162216] device-mapper: multipath: Failing path 66:0. [10194.162241] device-mapper: multipath: Failing path 65:192. [10194.194835] scsi 13:0:1:1: alua: Detached [10194.202301] device-mapper: multipath: Failing path 65:208. [10194.202323] device-mapper: multipath: Failing path 8:128. [10194.202359] device-mapper: multipath: Failing path 66:16. [10194.218852] scsi 13:0:0:2: alua: Detached [10194.222391] device-mapper: multipath: Failing path 66:0. [10194.250830] scsi 13:0:2:1: alua: Detached [10194.274829] scsi 13:0:3:1: alua: Detached [10194.278436] device-mapper: multipath: Failing path 66:16. [10194.278467] device-mapper: multipath: Failing path 65:208. [10194.298817] scsi 13:0:1:2: alua: Detached [10194.306356] device-mapper: multipath: Failing path 65:224. [10194.306383] device-mapper: multipath: Failing path 65:160. [10194.306424] device-mapper: multipath: Failing path 66:32. 
[10194.334838] scsi 13:0:0:3: alua: Detached [10194.338579] device-mapper: multipath: Failing path 66:16. [10194.354934] scsi 13:0:2:2: alua: Detached [10194.378850] scsi 13:0:3:2: alua: Detached [10194.382605] device-mapper: multipath: Failing path 66:32. [10194.382643] device-mapper: multipath: Failing path 65:224. [10194.406826] scsi 13:0:1:3: alua: Detached [10194.410973] device-mapper: multipath: Failing path 66:32. [10194.434908] scsi 13:0:2:3: alua: Detached [10194.462920] scsi 13:0:3:3: alua: Detached [10194.587776] iommu: Removing device 0007:01:00.0 from group 0 [10204.593263] pci_bus 0007:01: busn_res: [bus 01-ff] is released [10204.593333] rpadlpar_io: slot PHB 21 removed [10849.383986] PCI host bridge /pci@800000020000015 ranges: [10849.383991] MEM 0x00003fc600000000..0x00003fc67effffff -> 0x0000000080000000 [10849.383993] MEM 0x000030c000000000..0x000030cfffffffff -> 0x0003d0c000000000 [10849.389303] PCI: I/O resource not set for host bridge /pci@800000020000015 (domain 8) [10849.389372] PCI host bridge to bus 0008:01 [10849.389378] pci_bus 0008:01: root bus resource [mem 0x3fc600000000-0x3fc67effffff] (bus address [0x80000000-0xfeffffff]) [10849.389384] pci_bus 0008:01: root bus resource [bus 01-ff] [10849.394162] pci 0008:01:00.1: reg 0x160: [mem 0x00000000-0x0000ffff 64bit pref] [10849.394165] pci 0008:01:00.1: VF(n) BAR0 space: [mem 0x00000000-0x0013ffff 64bit pref] (contains BAR0 for 20 VFs) [10849.405662] pci 0008:01:00.0: reg 0x160: [mem 0x00000000-0x0000ffff 64bit pref] [10849.405664] pci 0008:01:00.0: VF(n) BAR0 space: [mem 0x00000000-0x0013ffff 64bit pref] (contains BAR0 for 20 VFs) [10849.491175] iommu: Adding device 0008:01:00.1 to group 0 [10849.491704] iommu: Adding device 0008:01:00.0 to group 0 [10849.492196] PIAR: overlapping address range [10849.492198] PIAR: overlapping address range [10849.492199] PIAR: overlapping address range [10849.492199] PIAR: overlapping address range [10849.492200] PIAR: overlapping address range 
[10849.492441] lpfc 0008:01:00.1: enabling device (0140 -> 0142) [10849.495406] lpfc 0008:01:00.1: ibm,query-pe-dma-windows(53) 10000 8000000 20000015 returned 0 [10849.542283] lpfc 0008:01:00.1: Using 64-bit direct DMA at offset 800000000000000 [10849.675139] scsi host14: Emulex LPe16000 16Gb PCIe Fibre Channel Adapter on PCI bus 01 device 01 irq 505 [10850.235317] lpfc 0008:01:00.0: enabling device (0140 -> 0142) [10850.239152] lpfc 0008:01:00.0: Using 64-bit direct DMA at offset 800000000000000 [10850.399263] scsi host15: Emulex LPe16000 16Gb PCIe Fibre Channel Adapter on PCI bus 01 device 00 irq 504 [10850.959301] rpaphp: Slot [U78CA.001.CSS003P-P1-C6-C1] registered [10850.959309] rpadlpar_io: slot PHB 21 added [10851.847229] lpfc 0008:01:00.0: 1:1303 Link Up Event x1 received Data: x1 x0 x80 x0 x0 x0 0 [10852.026827] scsi 15:0:0:0: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.027527] sd 15:0:0:0: alua: supports implicit TPGS [10852.027843] sd 15:0:0:0: alua: port group 00 rel port 230 [10852.027890] sd 15:0:0:0: [sda] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.028276] sd 15:0:0:0: alua: port group 00 state A preferred supports tolusnA [10852.028425] sd 15:0:0:0: [sda] Write Protect is off [10852.028431] sd 15:0:0:0: [sda] Mode Sense: f5 00 00 08 [10852.028455] sd 15:0:0:0: Attached scsi generic sg0 type 0 [10852.028711] sd 15:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.029804] scsi 15:0:0:1: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.030789] sd 15:0:0:1: alua: supports implicit TPGS [10852.031484] sd 15:0:0:1: alua: port group 00 rel port 230 [10852.031522] sd 15:0:0:1: [sdb] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.031994] sd 15:0:0:1: alua: port group 00 state A preferred supports tolusnA [10852.032153] sd 15:0:0:1: Attached scsi generic sg1 type 0 [10852.032239] sd 15:0:0:1: [sdb] Write Protect is off [10852.032246] sd 15:0:0:1: [sdb] Mode Sense: ed 00 00 08 
[10852.032596] sd 15:0:0:1: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.033460] scsi 15:0:0:2: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.034530] sd 15:0:0:2: alua: supports implicit TPGS [10852.034917] sd 15:0:0:2: alua: port group 00 rel port 230 [10852.035000] sd 15:0:0:2: [sdd] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.035294] sd 15:0:0:2: alua: port group 00 state A preferred supports tolusnA [10852.035568] sd 15:0:0:2: Attached scsi generic sg2 type 0 [10852.036739] scsi 15:0:0:3: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.037441] sd 15:0:0:3: alua: supports implicit TPGS [10852.037740] sd 15:0:0:3: alua: port group 00 rel port 230 [10852.037798] sd 15:0:0:3: [sde] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.038070] sd 15:0:0:3: alua: port group 00 state A preferred supports tolusnA [10852.038234] sd 15:0:0:3: Attached scsi generic sg3 type 0 [10852.038349] sd 15:0:0:3: [sde] Write Protect is off [10852.038355] sd 15:0:0:3: [sde] Mode Sense: ed 00 00 08 [10852.038683] sd 15:0:0:3: [sde] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.039314] scsi 15:0:1:0: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.039748] sdb: [10852.040238] sd 15:0:1:0: alua: supports implicit TPGS [10852.040632] sd 15:0:1:0: alua: port group 00 rel port 30 [10852.040708] sd 15:0:1:0: [sdg] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.041053] sd 15:0:1:0: alua: port group 00 state A preferred supports tolusnA [10852.041481] sd 15:0:1:0: [sdg] Write Protect is off [10852.041496] sd 15:0:1:0: [sdg] Mode Sense: f5 00 00 08 [10852.041550] sd 15:0:0:1: [sdb] Attached SCSI disk [10852.041786] sd 15:0:1:0: [sdg] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.042390] sd 15:0:1:0: Attached scsi generic sg4 type 0 [10852.044049] scsi 15:0:1:1: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.044795] sd 15:0:1:1: alua: 
supports implicit TPGS [10852.045180] sd 15:0:1:1: alua: port group 00 rel port 30 [10852.045226] sd 15:0:0:0: [sda] Attached SCSI disk [10852.045313] sd 15:0:1:1: [sdh] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.045631] sd 15:0:1:1: alua: port group 00 state A preferred supports tolusnA [10852.045730] sd 15:0:1:1: Attached scsi generic sg5 type 0 [10852.045942] sd 15:0:1:1: [sdh] Write Protect is off [10852.045949] sd 15:0:1:1: [sdh] Mode Sense: ed 00 00 08 [10852.046318] sd 15:0:1:1: [sdh] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.046813] scsi 15:0:1:2: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.047808] sd 15:0:1:2: alua: supports implicit TPGS [10852.048133] sd 15:0:1:2: alua: port group 00 rel port 30 [10852.048358] sd 15:0:1:2: [sdi] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.048520] sd 15:0:1:2: alua: port group 00 state A preferred supports tolusnA [10852.048643] sd 15:0:1:2: Attached scsi generic sg6 type 0 [10852.049296] sdh: [10852.049299] sd 15:0:1:2: [sdi] Write Protect is off [10852.049308] sd 15:0:1:2: [sdi] Mode Sense: ed 00 00 08 [10852.049634] sd 15:0:1:2: [sdi] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.049667] scsi 15:0:1:3: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.050354] sd 15:0:1:3: alua: supports implicit TPGS [10852.050853] sd 15:0:1:3: alua: port group 00 rel port 30 [10852.050943] sd 15:0:1:3: [sdaa] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.051214] sd 15:0:1:1: [sdh] Attached SCSI disk [10852.051287] sd 15:0:1:3: alua: port group 00 state A preferred supports tolusnA [10852.051426] sd 15:0:1:3: Attached scsi generic sg7 type 0 [10852.051646] sd 15:0:1:3: [sdaa] Write Protect is off [10852.051656] sd 15:0:1:3: [sdaa] Mode Sense: ed 00 00 08 [10852.051967] sd 15:0:1:3: [sdaa] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.052323] scsi 15:0:2:0: Direct-Access IBM 2107900 
.149 PQ: 0 ANSI: 5 [10852.052973] sd 15:0:2:0: alua: supports implicit TPGS [10852.053314] sd 15:0:2:0: alua: port group 00 rel port 100 [10852.053406] sd 15:0:2:0: [sdab] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.053686] sd 15:0:2:0: alua: port group 00 state A preferred supports tolusnA [10852.053892] sd 15:0:2:0: Attached scsi generic sg8 type 0 [10852.054069] sd 15:0:2:0: [sdab] Write Protect is off [10852.054078] sd 15:0:2:0: [sdab] Mode Sense: f5 00 00 08 [10852.054391] sd 15:0:2:0: [sdab] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.055081] scsi 15:0:2:1: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.056159] sd 15:0:2:1: alua: supports implicit TPGS [10852.056493] sd 15:0:2:1: alua: port group 00 rel port 100 [10852.056574] sd 15:0:2:1: [sdac] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.056997] sd 15:0:2:1: alua: port group 00 state A preferred supports tolusnA [10852.057274] sd 15:0:2:1: [sdac] Write Protect is off [10852.057280] sd 15:0:2:1: Attached scsi generic sg29 type 0 [10852.057290] sd 15:0:2:1: [sdac] Mode Sense: ed 00 00 08 [10852.057578] sd 15:0:2:1: [sdac] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.058491] scsi 15:0:2:2: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.059132] sde: [10852.059173] sdaa: [10852.060148] sd 15:0:2:2: alua: supports implicit TPGS [10852.060723] sd 15:0:2:2: alua: port group 00 rel port 100 [10852.060814] sd 15:0:0:3: [sde] Attached SCSI disk [10852.060858] sd 15:0:1:3: [sdaa] Attached SCSI disk [10852.060942] sd 15:0:2:2: alua: rtpg failed with 8000002 [10852.061167] sdac: [10852.061313] sd 15:0:2:2: [sdad] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.061363] sd 15:0:2:2: alua: port group 00 state A preferred supports tolusnA [10852.061615] sd 15:0:2:2: Attached scsi generic sg30 type 0 [10852.062278] sd 15:0:2:2: [sdad] Write Protect is off [10852.062291] sd 15:0:2:2: [sdad] Mode Sense: ed 00 
00 08 [10852.062841] scsi 15:0:2:3: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.062902] sd 15:0:2:2: [sdad] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.063740] sd 15:0:2:1: [sdac] Attached SCSI disk [10852.063965] sd 15:0:2:3: alua: supports implicit TPGS [10852.064330] sd 15:0:2:0: [sdab] Attached SCSI disk [10852.064437] sd 15:0:2:3: alua: port group 00 rel port 100 [10852.064507] sd 15:0:2:3: [sdae] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.064927] sd 15:0:2:3: alua: port group 00 state A preferred supports tolusnA [10852.065231] sd 15:0:2:3: Attached scsi generic sg31 type 0 [10852.065348] sd 15:0:2:3: [sdae] Write Protect is off [10852.065358] sd 15:0:2:3: [sdae] Mode Sense: ed 00 00 08 [10852.065859] sd 15:0:2:3: [sdae] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.065872] sdad: [10852.065959] sdi: [10852.066310] scsi 15:0:3:0: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.067721] sd 15:0:2:2: [sdad] Attached SCSI disk [10852.067796] sd 15:0:1:2: [sdi] Attached SCSI disk [10852.067876] sd 15:0:3:0: alua: supports implicit TPGS [10852.068387] sd 15:0:3:0: [sdaf] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.068426] sd 15:0:3:0: alua: port group 00 rel port 300 [10852.068904] sd 15:0:3:0: alua: port group 00 state A preferred supports tolusnA [10852.069151] sdae: [10852.069204] sd 15:0:3:0: Attached scsi generic sg32 type 0 [10852.069657] sd 15:0:3:0: [sdaf] Write Protect is off [10852.069664] sd 15:0:3:0: [sdaf] Mode Sense: f5 00 00 08 [10852.070344] sd 15:0:3:0: [sdaf] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.070954] sd 15:0:2:3: [sdae] Attached SCSI disk [10852.074942] sd 15:0:3:0: [sdaf] Attached SCSI disk [10852.074954] scsi 15:0:3:1: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.076026] sd 15:0:3:1: alua: supports implicit TPGS [10852.076714] sd 15:0:3:1: alua: port group 00 rel port 300 [10852.076745] 
sd 15:0:3:1: [sdag] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.077546] sd 15:0:3:1: alua: port group 00 state A preferred supports tolusnA [10852.077885] sd 15:0:3:1: Attached scsi generic sg33 type 0 [10852.078100] sd 15:0:3:1: [sdag] Write Protect is off [10852.078109] sd 15:0:3:1: [sdag] Mode Sense: ed 00 00 08 [10852.078403] sd 15:0:3:1: [sdag] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.080875] sdag: [10852.083603] sd 15:0:3:1: [sdag] Attached SCSI disk [10852.086470] scsi 15:0:3:2: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.087290] sd 15:0:3:2: alua: supports implicit TPGS [10852.087630] sd 15:0:3:2: alua: port group 00 rel port 300 [10852.087790] sd 15:0:3:2: [sdah] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.087984] sd 15:0:3:2: alua: port group 00 state A preferred supports tolusnA [10852.088145] sd 15:0:3:2: Attached scsi generic sg34 type 0 [10852.088392] sd 15:0:3:2: [sdah] Write Protect is off [10852.088411] sd 15:0:3:2: [sdah] Mode Sense: ed 00 00 08 [10852.088687] sd 15:0:3:2: [sdah] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.089078] scsi 15:0:3:3: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.089911] sd 15:0:3:3: alua: supports implicit TPGS [10852.090323] sd 15:0:3:3: alua: port group 00 rel port 300 [10852.090360] sd 15:0:3:3: [sdai] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.090737] sd 15:0:3:3: alua: port group 00 state A preferred supports tolusnA [10852.091050] sd 15:0:3:3: Attached scsi gen0 [12126.399029] scsi host17: Emulex LPe16000 16Gb PCIe Fibre Channel Adapter on PCI bus 01 device 00 irq 504 [12126.959169] rpaphp: Slot [U78CA.001.CSS003P-P1-C6-C1] registered [12126.959176] rpadlpar_io: slot PHB 21 added [12127.844158] lpfc 0009:01:00.0: 1:1303 Link Up Event x1 received Data: x1 x0 x80 x0 x0 x0 0 [12128.043085] scsi 17:0:0:0: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [12128.043688] sd 17:0:0:0: alua: 
supports implicit TPGS [12128.043969] sd 17:0:0:0: alua: port group 00 rel port 300 [12128.044190] sd 17:0:0:0: [sda] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [12128.044311] sd 17:0:0:0: alua: port group 00 state A preferred supports tolusnA [12128.044430] sd 17:0:0:0: Attached scsi generic sg0 type 0 [12128.044896] sd 17:0:0:0: [sda] Write Protect is off [12128.044903] sd 17:0:0:0: [sda] Mode Sense: f5 00 00 08 [12128.045179] sd 17:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or [sdaf] Mode Sense: f5 00 00 08 [12128.088966] sd 17:0:3:0: Attached scsi generic sg32 type 0 [12128.089258] sd 17:0:3:0: [sdaf] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [12128.089877] sd 17:0:2:3: [sdae] Attached SCSI disk [12128.090349] scsi 17:0:3:1: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [12128.092029] sd 17:0:3:1: alua: supports implicit TPGS [12128.092495] sd 17:0:3:1: alua: port group 00 rel port 30 [12128.092548] sd 17:0:3:1: [sdag] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [12128.092885] sd 17:0:3:1: alua: port group 00 state A preferred supports tolusnA [12128.093103] sd 17:0:3:1: Attached scsi generic sg33 type 0 [12128.093172] sd 17:0:3:1: [sdag] Write Protect is off [12128.093183] sd 17:0:3:1: [sdag] Mode Sense: ed 00 00 08 [12128.093503] sd 17:0:3:1: [sdag] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [12128.094183] scsi 17:0:3:2: Direct-Access IBM th 65:224. [12749.073922] device-mapper: multipath: Failing path 66:32. [12749.073953] device-mapper: multipath: Failing path 65:160. [12749.090309] scsi 17:0:0:3: alua: Detached [12749.094400] device-mapper: multipath: Failing path 66:16. [12749.118398] scsi 17:0:2:2: alua: Detached [12749.138317] scsi 17:0:3:2: alua: Detached [12749.141437] device-mapper: multipath: Failing path 65:224. [12749.141467] device-mapper: multipath: Failing path 66:32. 
[12749.162316] scsi 17:0:1:3: alua: Detached [12749.165448] device-mapper: multipath: Failing path 66:32. [12749.202489] scsi 17:0:2:3: alua: Detached [12749.238436] scsi 17:0:3:3: alua: Detached [12749.371720] iommu: Removing device 0009:01:00.0 from group 0 [12759.378488] pci_bus 0009:01: busn_res: [bus 01-ff] is released [12759.378559] rpadlpar_io: slot PHB 21 removed [13405.725246] PCI host bridge /pci@800000020000015 ranges: [13405.725253] MEM 0x00003fc600000000..0x00003fc67effffff -> 0x0000000080000000 l port 100 [13408.460796] sd 19:0:2:1: [sdac] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [13408.461979] sd 19:0:2:1: [sdac] Write Protect is off [13408.461987] sd 19:0:2:1: [sdac] Mode Sense: ed 00 00 08 [13408.462292] sd 19:0:2:1: [sdac] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [13408.463890] sd 19:0:2:0: [sdab] Attached SCSI disk [13408.465140] sdac: [13408.467041] sd 19:0:2:1: [sdac] Attached SCSI disk [13408.569203] sdd: [13408.569287] sdi: [13408.570556] sd 19:0:0:2: [sdd] Attached SCSI disk [13408.570631] sd 19:0:1:2: [sdi] Attached SCSI disk [13438.697588] sd 19:0:0:1: [sdb] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [13439.326030] rport-19:0-9: blocked FC remote port time out: removing rport [13453.426174] sd 19:0:0:1: [sdb] Write Protect is off [13453.426184] sd 19:0:0:1: [sdb] Mode Sense: ed 00 00 08 [13453.426711] sd 19:0:0:1: [sdb] Write cache: disabled, read cache: enabled, doesn't support Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [14683.037860] scsi 21:0:0:2: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [14683.039069] sd 21:0:0:2: alua: supports implicit TPGS [14683.039472] sd 21:0:0:2: alua: port group 00 rel port 300 [14683.039557] sd 21:0:0:2: [sdd] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [14683.039882] sd 21:0:0:2: alua: port group 00 state A preferred supports tolusnA [14683.040143] sd 21:0:0:2: Attached scsi generic sg2 type 0 [14683.040221] sd 
21:0:0:2: [sdd] Write Protect is off [14683.040231] sd 21:0:0:2: [sdd] Mode Sense: ed 00 00 08 [14683.040517] sd 21:0:0:2: [sdd] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [14683.041295] scsi 21:0:0:3: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [14683.042497] sd 21:0:0:3: alua: supports implicit TPGS [14683.042870] sd 21:0:0:3: [sde] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [1468er: multipath: Failing path 8:16. [15299.302590] iommu: Removing device 000b:01:00.1 from group 0 [15299.317740] scsi 21:0:0:0: alua: Detached [15299.325260] sd 21:0:2:0: [sdab] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK [15299.325265] sd 21:0:2:0: [sdab] tag#0 CDB: Read(10) 28 00 02 7f ff 80 00 00 80 00 [15299.325267] blk_update_request: I/O error, dev sdab, sector 41942912 [15299.325275] device-mapper: multipath: Failing path 65:176. [15299.325296] sd 21:0:3:0: [sdae] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK [15299.325298] sd 21:0:3:0: [sdae] tag#0 CDB: Read(10) 28 00 02 7f ff 80 00 00 80 00 [15299.325301] blk_update_request: I/O error, dev sdae, sector 41942912 [15299.325309] device-mapper: multipath: Failing path 65:224. 
[15299.353733] scsi 21:0:1:0: alua: Detached [15299.361265] sd 21:0:3:1: [sdaf] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK [15299.361269] sd 21:0:3:1: [sdaf] tag#0 CDB958.559922] scsi 23:0:1:1: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [15958.560721] sd 23:0:1:1: alua: supports implicit TPGS [15958.561066] sd 23:0:1:1: alua: port group 00 rel port 100 [15958.561092] sd 23:0:1:1: [sdh] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [15958.561697] sd 23:0:1:1: alua: port group 00 state A preferred supports tolusnA [15958.561880] sd 23:0:1:1: Attached scsi generic sg5 type 0 [15958.561951] sd 23:0:1:1: [sdh] Write Protect is off [15958.561958] sd 23:0:1:1: [sdh] Mode Sense: ed 00 00 08 [15958.562239] sd 23:0:1:1: [sdh] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [15958.562942] scsi 23:0:1:2: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [15958.563577] sd 23:0:1:0: [sdg] Attached SCSI disk [15958.563921] sd 23:0:1:2: alua: supports implicit TPGS [15958.564270] sd 23:0:1:2: alua: port group 00 rel port 100 [15958.564443] sd 23:0:1:2: [sdi] 41943040 512-byte d000000003a37530 7c6307b4 extsw r3,r3 d000000003a37534 38210030 addi r1,r1,48 d000000003a37538 e8010010 ld r0,16(r1) d000000003a3753c ebc1fff0 ld r30,-16(r1) d000000003a37540 ebe1fff8 ld r31,-8(r1) d000000003a37544 7c0803a6 mtlr r0 d000000003a37548 4e800020 blr d000000003a3754c 60420000 ori r2,r2,0 d000000003a37550 813f0b14 lwz r9,2836(r31) d000000003a37554 2b890001 cmplwi cr7,r9,1 d000000003a37558 409dffa8 ble cr7,d000000003a37500 # lpfc_sli4_scmd_to_wqidx_distr+0x50/0x100 [lpfc] d000000003a3755c a14d0008 lhz r10,8(r13) d000000003a37560 a13f0572 lhz r9,1394(r31) d000000003a37564 7f895000 cmpw cr7,r9,r10 d000000003a37568 409dff98 ble cr7,d000000003a37500 # lpfc_sli4_scmd_to_wqidx_distr+0x50/0x100 [lpfc] d000000003a3756c e93f0568 ld r9,1384(r31) e:mon>
---Steps to Reproduce---
cd /kte/tools
./setup dlar
cd /kte/tools/dlpar
./start.dlpar -d 0

Doing some analysis without a crashdump, as it's taking too long.

This looks like a NULL pointer dereference, if my assembly reading/matching to C is correct. Would need to understand why/how this '(struct scsi_cmnd *cmnd)->device' field is NULL.

Analysis
--------

From xmon:

    pc: <...>: lpfc_sli4_scmd_to_wqidx_distr+0x30/0x100 [lpfc]

    R10 = 0000000000000000 R26 = 0000000000000001

From 'objdump -d /usr/lib/debug/<...>/lpfc.ko',

    lpfc_sli4_scmd_to_wqidx_distr = 674b0
    lpfc_sli4_scmd_to_wqidx_distr+0x30 = 674e0 (crash)

and

    00000000000674b0 <lpfc_sli4_scmd_to_wqidx_distr>:
    <...>
       674cc: 78 23 9e 7c mr r30,r4
       674d0: 78 1b 7f 7c mr r31,r3
    <...>
       674dc: 10 00 5e e9 ld r10,16(r30)
       674e0: 00 00 2a e9 ld r9,0(r10) <<-- (crash)
       674e4: 00 00 29 e9 ld r9,0(r9)
    <...>

This is the relevant snippet of code, from Ubuntu 16.04 kernel 4.4.0-22.40 [1]:

    int lpfc_sli4_scmd_to_wqidx_distr(struct lpfc_hba *phba,
                                      struct lpfc_scsi_buf *lpfc_cmd)
    {
        struct scsi_cmnd *cmnd = lpfc_cmd->pCmd;
        <...>
        if (shost_use_blk_mq(cmnd->device->host)) {
        <...>

So, back to the assembly: these seem to be the 2 function parameters passed by register (r4, r3), copied into other registers (r30, r31).
       674cc: 78 23 9e 7c mr r30,r4
       674d0: 78 1b 7f 7c mr r31,r3

Per the load below, r10 is *cmnd, and r30 is *lpfc_cmd; it loads lpfc_cmd->pCmd, which is at offset 16 bytes into struct lpfc_scsi_buf [2] (after 2 pointers * 8 bytes each, from the list_head list [3]):

       674dc: 10 00 5e e9 ld r10,16(r30)

    struct lpfc_scsi_buf {
        struct list_head list;
        struct scsi_cmnd *pCmd;
        <...>

    struct list_head {
        struct list_head *next, *prev;
    };

And the load below hits the crash, because it dereferences r10 (*cmnd), which is zero:

       674e0: 00 00 2a e9 ld r9,0(r10) <<-- (crash)

From xmon:

    R10 = 0000000000000000 R26 = 0000000000000001

That dereference was for cmnd->device; you can see that the load instruction immediately afterward would further dereference the cmnd->device pointer, for the device->host field, which has offset 0 into struct scsi_device [4]. This is a confirmation that the assembly/C matching looks correct.

    if (shost_use_blk_mq(cmnd->device->host)) {

    struct scsi_device {
        struct Scsi_Host *host;
        <...>

[1] http://kernel.ubuntu.com/git/ubuntu/ubuntu-xenial.git/tree/drivers/scsi/lpfc/lpfc_scsi.c?h=Ubuntu-4.4.0-22.40
[2] http://kernel.ubuntu.com/git/ubuntu/ubuntu-xenial.git/tree/drivers/scsi/lpfc/lpfc_scsi.h?h=Ubuntu-4.4.0-22.40#n130
[3] http://kernel.ubuntu.com/git/ubuntu/ubuntu-xenial.git/tree/include/linux/types.h?h=Ubuntu-4.4.0-22.40#n185
[4] http://kernel.ubuntu.com/git/ubuntu/ubuntu-xenial.git/tree/include/scsi/scsi_device.h?h=Ubuntu-4.4.0-22.40#n77

One detail missing..
(In reply to comment #16)
> And the load below hits the crash, because it dereferences r10 (*cmnd) which
> is zero:
>
>    674e0: 00 00 2a e9 ld r9,0(r10) <<-- (crash)
>
> From xmon:
>
>    R10 = 0000000000000000 R26 = 0000000000000001
>
> That dereference was for cmnd->device; [snip]

which has offset zero into struct scsi_cmnd [5]

    if (shost_use_blk_mq(cmnd->device->host)) {

    struct scsi_cmnd {
        struct scsi_device *device;
        <...>

[5] http://kernel.ubuntu.com/git/ubuntu/ubuntu-xenial.git/tree/include/scsi/scsi_cmnd.h?h=Ubuntu-4.4.0-22.40#n59

> Would need to understand why/how this
> '(struct scsi_cmnd *cmnd)->device' field is NULL.

Checking this again today, it occurred to me that the problem is actually *cmnd == NULL, and dereferencing *cmnd (for cmnd->device) hits the crash.

Fix submitted upstream:
http://marc.info/?l=linux-scsi&m=146534119707379&w=2

I didn't provide a test kernel because the system was running regression tests over the weekend, and it takes long to reproduce the problem w/ DLPAR operations -- but the same problem could be reproduced w/ simpler test-cases (see commit), so I worked on it in the background.

If you keep hitting this, let me know and I'll provide a test kernel.

Hi Mauricio,

Installed the fix and verified. I am no longer seeing the issue.

Hi Canonical,

Can you consider picking up this fix, which has not yet made it into the upstream kernel? It's fairly obvious and trivial, well documented in the commit message (w/ test-cases), and has been tested successfully here at IBM (also see commit msg).

http://marc.info/?l=linux-scsi&m=146534119707379&w=2

The adapter vendor's team has not yet reviewed it on the mailing list (nor any other patches for lpfc), so I guess it'll take some time until this makes it in. Is that possible?
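For anyone following the analysis, the offsets and the NULL case can be modeled in user space. This is only a sketch under stated assumptions: the mock_* types are simplified stand-ins that carry just the leading fields quoted above (they are not the kernel definitions), and mock_cmd_uses_blk_mq() is a hypothetical helper written for illustration, not the driver function and not the submitted patch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins: only the leading fields quoted in the analysis. */
struct mock_list_head { struct mock_list_head *next, *prev; };
struct mock_scsi_host { int use_blk_mq; };               /* hypothetical field */
struct mock_scsi_device { struct mock_scsi_host *host; };
struct mock_scsi_cmnd { struct mock_scsi_device *device; };
struct mock_lpfc_scsi_buf {
	struct mock_list_head list;      /* two pointers */
	struct mock_scsi_cmnd *pCmd;     /* follows the list_head */
};

/* pCmd sits right after two pointers: 16 bytes with 8-byte pointers,
 * matching "ld r10,16(r30)". */
static_assert(offsetof(struct mock_lpfc_scsi_buf, pCmd) == 2 * sizeof(void *),
	      "pCmd follows the two list_head pointers");
/* device and host are the first fields, matching "ld r9,0(r10)" and
 * "ld r9,0(r9)". */
static_assert(offsetof(struct mock_scsi_cmnd, device) == 0, "device at 0");
static_assert(offsetof(struct mock_scsi_device, host) == 0, "host at 0");

/* Hypothetical guard: test the command pointer before following
 * cmnd->device->host. Returns 0 when there is no command attached. */
static int mock_cmd_uses_blk_mq(const struct mock_lpfc_scsi_buf *buf)
{
	const struct mock_scsi_cmnd *cmnd = buf->pCmd;

	if (!cmnd)
		return 0;
	return cmnd->device->host->use_blk_mq;
}
```

With pCmd set to NULL the helper returns 0 instead of loading from address 0, which is the load that faulted at +0x30 in the trace.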
Thanks
Mauricio

** Affects: linux (Ubuntu)
     Importance: Undecided
     Assignee: Taco Screen team (taco-screen-team)
         Status: New

** Tags: architecture-ppc64 bugnameltc-141959 severity-high targetmilestone-inin16041

** Tags added: architecture-ppc64 bugnameltc-141959 severity-high targetmilestone-inin16041

** Changed in: ubuntu
     Assignee: (unassigned) => Taco Screen team (taco-screen-team)

** Package changed: ubuntu => linux (Ubuntu)

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1597974

Title:
  ISST-LTE:pVM:monklp5:Ubuntu16.04.1:system crashed at lpfc_sli4_scmd_to_wqidx_distr

Status in linux package in Ubuntu:
  New

Bug description:
  ---Problem Description---
  We have Ubuntu16.04.1 installed on our system and run DLPAR test for ZR1 adapter after some time it crashes at
  lpfc_sli4_scmd_to_wqidx_distr+0x30/0x100

  Machine Type = 9119-MME*1085AE7

  ---Debugger Data---
  e:mon> e
  cpu 0xe: Vector: 300 (Data Access) at [c0000003d45335a0]
      pc: d000000003a374e0: lpfc_sli4_scmd_to_wqidx_distr+0x30/0x100 [lpfc]
      lr: d0000000039d749c: lpfc_sli_calc_ring.part.20+0xdc/0x100 [lpfc]
      sp: c0000003d4533820
     msr: 8000000100009033
     dar: 0
   dsisr: 40000000
    current = 0xc0000003e06c2a20
    paca = 0xc000000007af8500 softe: 0 irq_happened: 0x01
    pid = 17983, comm = scsi_eh_23
  e:mon> r
  R00 = d0000000039d749c R16 = 0000000000000000
  R01 = c0000003d4533820 R17 = c0000003d4533cd0
  R02 = d000000003a84160 R18 = c0000003d4533cb8
  R03 = c0000003ee76a000 R19 = c0000003d87e5088
  R04 = c0000003dad6a800 R20 = c0000003d4533cb0
  R05 = c0000003dad6a870 R21 = 000000000000001e
  R06 = 0000000000000001 R22 = c0000000018aab78
  R07 = d000000003a84160 R23 = c0000003dad6a870
  R08 = d000000003a2f830 R24 = c0000003dad6a800
  R09 = 0000000000000004 R25 = c0000003d4533978
  R10 = 0000000000000000 R26 = 0000000000000001
  R11 = d000000003a59a50 R27 = 0000000000000000
  R12 = 0000000028533824 R28 = c0000003e841e000
  R13 = c000000007af8500 R29 = c0000003ee76a000
  R14 = c0000003d87e5000 R30 = c0000003dad6a800
  R15 = c0000003d4533cb8 R31 = c0000003ee76a000
  pc = d000000003a374e0 lpfc_sli4_scmd_to_wqidx_distr+0x30/0x100 [lpfc]
  cfar= c000000000008468 slb_miss_realmode+0x50/0x78
  lr = d0000000039d749c lpfc_sli_calc_ring.part.20+0xdc/0x100 [lpfc]
  msr = 8000000100009033 cr = 28538828
  ctr = c000000000ae3cf0 xer = 0000000020000010 trap = 300
  dar = 0000000000000000 dsisr = 40000000
  e:mon>

  Stack trace output:
  e:mon> t
  [c0000003d4533850] d0000000039d749c lpfc_sli_calc_ring.part.20+0xdc/0x100 [lpfc]
  [c0000003d4533890] d0000000039df680 lpfc_sli_issue_iocb+0xf0/0x330 [lpfc]
  [c0000003d45338f0] d0000000039e3824 lpfc_sli_issue_iocb_wait+0x264/0x680 [lpfc]
  [c0000003d45339d0] d000000003a32944 lpfc_send_taskmgmt+0x2d4/0x7d0 [lpfc]
  [c0000003d4533aa0] d000000003a33564 lpfc_device_reset_handler+0x114/0x210 [lpfc]
  [c0000003d4533b60] c00000000075843c scsi_eh_ready_devs+0x68c/0xee0
  [c0000003d4533c50] c00000000075a91c scsi_error_handler+0x6bc/0x9e0
  [c0000003d4533d80] c0000000000e61e0 kthread+0x110/0x130
  [c0000003d4533e30] c000000000009538 ret_from_kernel_thread+0x5c/0xa4
  --- Exception: 0 at 0000000000000000
  e:mon>
  e:mon> dl
  [10194.079284] sd 13:0:3:0: [sdaf] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
  [10194.079293] sd 13:0:3:0: [sdaf] tag#0 CDB: Read(10) 28 00 02 7f ff 80 00 00 80 00
  [10194.079297] blk_update_request: I/O error, dev sdaf, sector 41942912
  [10194.079313] device-mapper: multipath: Failing path 65:240.
  [10194.079351] sd 13:0:2:0: [sdab] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
  [10194.079360] sd 13:0:2:0: [sdab] tag#0 CDB: Read(10) 28 00 02 7f ff 80 00 00 80 00
  [10194.079364] blk_update_request: I/O error, dev sdab, sector 41942912
  [10194.079375] device-mapper: multipath: Failing path 65:176.
[10194.102832] scsi 13:0:1:0: alua: Detached [10194.110320] sd 13:0:1:1: [sdh] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK [10194.110324] sd 13:0:1:1: [sdh] tag#0 CDB: Read(10) 28 00 02 7f ff 80 00 00 80 00 [10194.110326] blk_update_request: I/O error, dev sdh, sector 41942912 [10194.110334] device-mapper: multipath: Failing path 8:112. [10194.110394] sd 13:0:2:1: [sdac] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK [10194.110398] sd 13:0:2:1: [sdac] tag#0 CDB: Read(10) 28 00 02 7f ff 80 00 00 80 00 [10194.110401] blk_update_request: I/O error, dev sdac, sector 41942912 [10194.110407] device-mapper: multipath: Failing path 65:192. [10194.110439] sd 13:0:3:1: [sdag] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK [10194.110448] sd 13:0:3:1: [sdag] tag#0 CDB: Read(10) 28 00 02 7f ff 80 00 00 80 00 [10194.110452] blk_update_request: I/O error, dev sdag, sector 41942912 [10194.110464] device-mapper: multipath: Failing path 66:0. [10194.118851] scsi 13:0:0:1: alua: Detached [10194.122868] sd 13:0:3:0: [sdaf] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK [10194.122879] sd 13:0:3:0: [sdaf] tag#0 CDB: Read(10) 28 00 02 7f ff 80 00 00 80 00 [10194.122887] blk_update_request: I/O error, dev sdaf, sector 41942912 [10194.122911] device-mapper: multipath: Failing path 65:240. [10194.138865] scsi 13:0:2:0: alua: Detached [10194.158852] scsi 13:0:3:0: alua: Detached [10194.162199] sd 13:0:3:1: [sdag] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK [10194.162204] sd 13:0:3:1: [sdag] tag#0 CDB: Read(10) 28 00 02 7f ff 80 00 00 80 00 [10194.162207] blk_update_request: I/O error, dev sdag, sector 41942912 [10194.162216] device-mapper: multipath: Failing path 66:0. [10194.162241] device-mapper: multipath: Failing path 65:192. [10194.194835] scsi 13:0:1:1: alua: Detached [10194.202301] device-mapper: multipath: Failing path 65:208. 
[10194.202323] device-mapper: multipath: Failing path 8:128. [10194.202359] device-mapper: multipath: Failing path 66:16. [10194.218852] scsi 13:0:0:2: alua: Detached [10194.222391] device-mapper: multipath: Failing path 66:0. [10194.250830] scsi 13:0:2:1: alua: Detached [10194.274829] scsi 13:0:3:1: alua: Detached [10194.278436] device-mapper: multipath: Failing path 66:16. [10194.278467] device-mapper: multipath: Failing path 65:208. [10194.298817] scsi 13:0:1:2: alua: Detached [10194.306356] device-mapper: multipath: Failing path 65:224. [10194.306383] device-mapper: multipath: Failing path 65:160. [10194.306424] device-mapper: multipath: Failing path 66:32. [10194.334838] scsi 13:0:0:3: alua: Detached [10194.338579] device-mapper: multipath: Failing path 66:16. [10194.354934] scsi 13:0:2:2: alua: Detached [10194.378850] scsi 13:0:3:2: alua: Detached [10194.382605] device-mapper: multipath: Failing path 66:32. [10194.382643] device-mapper: multipath: Failing path 65:224. [10194.406826] scsi 13:0:1:3: alua: Detached [10194.410973] device-mapper: multipath: Failing path 66:32. 
[10194.434908] scsi 13:0:2:3: alua: Detached [10194.462920] scsi 13:0:3:3: alua: Detached [10194.587776] iommu: Removing device 0007:01:00.0 from group 0 [10204.593263] pci_bus 0007:01: busn_res: [bus 01-ff] is released [10204.593333] rpadlpar_io: slot PHB 21 removed [10849.383986] PCI host bridge /pci@800000020000015 ranges: [10849.383991] MEM 0x00003fc600000000..0x00003fc67effffff -> 0x0000000080000000 [10849.383993] MEM 0x000030c000000000..0x000030cfffffffff -> 0x0003d0c000000000 [10849.389303] PCI: I/O resource not set for host bridge /pci@800000020000015 (domain 8) [10849.389372] PCI host bridge to bus 0008:01 [10849.389378] pci_bus 0008:01: root bus resource [mem 0x3fc600000000-0x3fc67effffff] (bus address [0x80000000-0xfeffffff]) [10849.389384] pci_bus 0008:01: root bus resource [bus 01-ff] [10849.394162] pci 0008:01:00.1: reg 0x160: [mem 0x00000000-0x0000ffff 64bit pref] [10849.394165] pci 0008:01:00.1: VF(n) BAR0 space: [mem 0x00000000-0x0013ffff 64bit pref] (contains BAR0 for 20 VFs) [10849.405662] pci 0008:01:00.0: reg 0x160: [mem 0x00000000-0x0000ffff 64bit pref] [10849.405664] pci 0008:01:00.0: VF(n) BAR0 space: [mem 0x00000000-0x0013ffff 64bit pref] (contains BAR0 for 20 VFs) [10849.491175] iommu: Adding device 0008:01:00.1 to group 0 [10849.491704] iommu: Adding device 0008:01:00.0 to group 0 [10849.492196] PIAR: overlapping address range [10849.492198] PIAR: overlapping address range [10849.492199] PIAR: overlapping address range [10849.492199] PIAR: overlapping address range [10849.492200] PIAR: overlapping address range [10849.492441] lpfc 0008:01:00.1: enabling device (0140 -> 0142) [10849.495406] lpfc 0008:01:00.1: ibm,query-pe-dma-windows(53) 10000 8000000 20000015 returned 0 [10849.542283] lpfc 0008:01:00.1: Using 64-bit direct DMA at offset 800000000000000 [10849.675139] scsi host14: Emulex LPe16000 16Gb PCIe Fibre Channel Adapter on PCI bus 01 device 01 irq 505 [10850.235317] lpfc 0008:01:00.0: enabling device (0140 -> 0142) [10850.239152] 
lpfc 0008:01:00.0: Using 64-bit direct DMA at offset 800000000000000 [10850.399263] scsi host15: Emulex LPe16000 16Gb PCIe Fibre Channel Adapter on PCI bus 01 device 00 irq 504 [10850.959301] rpaphp: Slot [U78CA.001.CSS003P-P1-C6-C1] registered [10850.959309] rpadlpar_io: slot PHB 21 added [10851.847229] lpfc 0008:01:00.0: 1:1303 Link Up Event x1 received Data: x1 x0 x80 x0 x0 x0 0 [10852.026827] scsi 15:0:0:0: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.027527] sd 15:0:0:0: alua: supports implicit TPGS [10852.027843] sd 15:0:0:0: alua: port group 00 rel port 230 [10852.027890] sd 15:0:0:0: [sda] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.028276] sd 15:0:0:0: alua: port group 00 state A preferred supports tolusnA [10852.028425] sd 15:0:0:0: [sda] Write Protect is off [10852.028431] sd 15:0:0:0: [sda] Mode Sense: f5 00 00 08 [10852.028455] sd 15:0:0:0: Attached scsi generic sg0 type 0 [10852.028711] sd 15:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.029804] scsi 15:0:0:1: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.030789] sd 15:0:0:1: alua: supports implicit TPGS [10852.031484] sd 15:0:0:1: alua: port group 00 rel port 230 [10852.031522] sd 15:0:0:1: [sdb] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.031994] sd 15:0:0:1: alua: port group 00 state A preferred supports tolusnA [10852.032153] sd 15:0:0:1: Attached scsi generic sg1 type 0 [10852.032239] sd 15:0:0:1: [sdb] Write Protect is off [10852.032246] sd 15:0:0:1: [sdb] Mode Sense: ed 00 00 08 [10852.032596] sd 15:0:0:1: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.033460] scsi 15:0:0:2: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.034530] sd 15:0:0:2: alua: supports implicit TPGS [10852.034917] sd 15:0:0:2: alua: port group 00 rel port 230 [10852.035000] sd 15:0:0:2: [sdd] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.035294] sd 15:0:0:2: alua: port group 00 
state A preferred supports tolusnA [10852.035568] sd 15:0:0:2: Attached scsi generic sg2 type 0 [10852.036739] scsi 15:0:0:3: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.037441] sd 15:0:0:3: alua: supports implicit TPGS [10852.037740] sd 15:0:0:3: alua: port group 00 rel port 230 [10852.037798] sd 15:0:0:3: [sde] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.038070] sd 15:0:0:3: alua: port group 00 state A preferred supports tolusnA [10852.038234] sd 15:0:0:3: Attached scsi generic sg3 type 0 [10852.038349] sd 15:0:0:3: [sde] Write Protect is off [10852.038355] sd 15:0:0:3: [sde] Mode Sense: ed 00 00 08 [10852.038683] sd 15:0:0:3: [sde] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.039314] scsi 15:0:1:0: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.039748] sdb: [10852.040238] sd 15:0:1:0: alua: supports implicit TPGS [10852.040632] sd 15:0:1:0: alua: port group 00 rel port 30 [10852.040708] sd 15:0:1:0: [sdg] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.041053] sd 15:0:1:0: alua: port group 00 state A preferred supports tolusnA [10852.041481] sd 15:0:1:0: [sdg] Write Protect is off [10852.041496] sd 15:0:1:0: [sdg] Mode Sense: f5 00 00 08 [10852.041550] sd 15:0:0:1: [sdb] Attached SCSI disk [10852.041786] sd 15:0:1:0: [sdg] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.042390] sd 15:0:1:0: Attached scsi generic sg4 type 0 [10852.044049] scsi 15:0:1:1: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.044795] sd 15:0:1:1: alua: supports implicit TPGS [10852.045180] sd 15:0:1:1: alua: port group 00 rel port 30 [10852.045226] sd 15:0:0:0: [sda] Attached SCSI disk [10852.045313] sd 15:0:1:1: [sdh] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.045631] sd 15:0:1:1: alua: port group 00 state A preferred supports tolusnA [10852.045730] sd 15:0:1:1: Attached scsi generic sg5 type 0 [10852.045942] sd 15:0:1:1: [sdh] Write Protect is off [10852.045949] 
sd 15:0:1:1: [sdh] Mode Sense: ed 00 00 08 [10852.046318] sd 15:0:1:1: [sdh] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.046813] scsi 15:0:1:2: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.047808] sd 15:0:1:2: alua: supports implicit TPGS [10852.048133] sd 15:0:1:2: alua: port group 00 rel port 30 [10852.048358] sd 15:0:1:2: [sdi] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.048520] sd 15:0:1:2: alua: port group 00 state A preferred supports tolusnA [10852.048643] sd 15:0:1:2: Attached scsi generic sg6 type 0 [10852.049296] sdh: [10852.049299] sd 15:0:1:2: [sdi] Write Protect is off [10852.049308] sd 15:0:1:2: [sdi] Mode Sense: ed 00 00 08 [10852.049634] sd 15:0:1:2: [sdi] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.049667] scsi 15:0:1:3: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.050354] sd 15:0:1:3: alua: supports implicit TPGS [10852.050853] sd 15:0:1:3: alua: port group 00 rel port 30 [10852.050943] sd 15:0:1:3: [sdaa] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.051214] sd 15:0:1:1: [sdh] Attached SCSI disk [10852.051287] sd 15:0:1:3: alua: port group 00 state A preferred supports tolusnA [10852.051426] sd 15:0:1:3: Attached scsi generic sg7 type 0 [10852.051646] sd 15:0:1:3: [sdaa] Write Protect is off [10852.051656] sd 15:0:1:3: [sdaa] Mode Sense: ed 00 00 08 [10852.051967] sd 15:0:1:3: [sdaa] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.052323] scsi 15:0:2:0: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.052973] sd 15:0:2:0: alua: supports implicit TPGS [10852.053314] sd 15:0:2:0: alua: port group 00 rel port 100 [10852.053406] sd 15:0:2:0: [sdab] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.053686] sd 15:0:2:0: alua: port group 00 state A preferred supports tolusnA [10852.053892] sd 15:0:2:0: Attached scsi generic sg8 type 0 [10852.054069] sd 15:0:2:0: [sdab] Write Protect is off 
[10852.054078] sd 15:0:2:0: [sdab] Mode Sense: f5 00 00 08 [10852.054391] sd 15:0:2:0: [sdab] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.055081] scsi 15:0:2:1: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.056159] sd 15:0:2:1: alua: supports implicit TPGS [10852.056493] sd 15:0:2:1: alua: port group 00 rel port 100 [10852.056574] sd 15:0:2:1: [sdac] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.056997] sd 15:0:2:1: alua: port group 00 state A preferred supports tolusnA [10852.057274] sd 15:0:2:1: [sdac] Write Protect is off [10852.057280] sd 15:0:2:1: Attached scsi generic sg29 type 0 [10852.057290] sd 15:0:2:1: [sdac] Mode Sense: ed 00 00 08 [10852.057578] sd 15:0:2:1: [sdac] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.058491] scsi 15:0:2:2: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.059132] sde: [10852.059173] sdaa: [10852.060148] sd 15:0:2:2: alua: supports implicit TPGS [10852.060723] sd 15:0:2:2: alua: port group 00 rel port 100 [10852.060814] sd 15:0:0:3: [sde] Attached SCSI disk [10852.060858] sd 15:0:1:3: [sdaa] Attached SCSI disk [10852.060942] sd 15:0:2:2: alua: rtpg failed with 8000002 [10852.061167] sdac: [10852.061313] sd 15:0:2:2: [sdad] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.061363] sd 15:0:2:2: alua: port group 00 state A preferred supports tolusnA [10852.061615] sd 15:0:2:2: Attached scsi generic sg30 type 0 [10852.062278] sd 15:0:2:2: [sdad] Write Protect is off [10852.062291] sd 15:0:2:2: [sdad] Mode Sense: ed 00 00 08 [10852.062841] scsi 15:0:2:3: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.062902] sd 15:0:2:2: [sdad] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.063740] sd 15:0:2:1: [sdac] Attached SCSI disk [10852.063965] sd 15:0:2:3: alua: supports implicit TPGS [10852.064330] sd 15:0:2:0: [sdab] Attached SCSI disk [10852.064437] sd 15:0:2:3: alua: port group 00 rel port 100 
[10852.064507] sd 15:0:2:3: [sdae] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.064927] sd 15:0:2:3: alua: port group 00 state A preferred supports tolusnA [10852.065231] sd 15:0:2:3: Attached scsi generic sg31 type 0 [10852.065348] sd 15:0:2:3: [sdae] Write Protect is off [10852.065358] sd 15:0:2:3: [sdae] Mode Sense: ed 00 00 08 [10852.065859] sd 15:0:2:3: [sdae] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.065872] sdad: [10852.065959] sdi: [10852.066310] scsi 15:0:3:0: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.067721] sd 15:0:2:2: [sdad] Attached SCSI disk [10852.067796] sd 15:0:1:2: [sdi] Attached SCSI disk [10852.067876] sd 15:0:3:0: alua: supports implicit TPGS [10852.068387] sd 15:0:3:0: [sdaf] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.068426] sd 15:0:3:0: alua: port group 00 rel port 300 [10852.068904] sd 15:0:3:0: alua: port group 00 state A preferred supports tolusnA [10852.069151] sdae: [10852.069204] sd 15:0:3:0: Attached scsi generic sg32 type 0 [10852.069657] sd 15:0:3:0: [sdaf] Write Protect is off [10852.069664] sd 15:0:3:0: [sdaf] Mode Sense: f5 00 00 08 [10852.070344] sd 15:0:3:0: [sdaf] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.070954] sd 15:0:2:3: [sdae] Attached SCSI disk [10852.074942] sd 15:0:3:0: [sdaf] Attached SCSI disk [10852.074954] scsi 15:0:3:1: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.076026] sd 15:0:3:1: alua: supports implicit TPGS [10852.076714] sd 15:0:3:1: alua: port group 00 rel port 300 [10852.076745] sd 15:0:3:1: [sdag] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.077546] sd 15:0:3:1: alua: port group 00 state A preferred supports tolusnA [10852.077885] sd 15:0:3:1: Attached scsi generic sg33 type 0 [10852.078100] sd 15:0:3:1: [sdag] Write Protect is off [10852.078109] sd 15:0:3:1: [sdag] Mode Sense: ed 00 00 08 [10852.078403] sd 15:0:3:1: [sdag] Write cache: disabled, read cache: enabled, 
doesn't support DPO or FUA [10852.080875] sdag: [10852.083603] sd 15:0:3:1: [sdag] Attached SCSI disk [10852.086470] scsi 15:0:3:2: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.087290] sd 15:0:3:2: alua: supports implicit TPGS [10852.087630] sd 15:0:3:2: alua: port group 00 rel port 300 [10852.087790] sd 15:0:3:2: [sdah] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.087984] sd 15:0:3:2: alua: port group 00 state A preferred supports tolusnA [10852.088145] sd 15:0:3:2: Attached scsi generic sg34 type 0 [10852.088392] sd 15:0:3:2: [sdah] Write Protect is off [10852.088411] sd 15:0:3:2: [sdah] Mode Sense: ed 00 00 08 [10852.088687] sd 15:0:3:2: [sdah] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [10852.089078] scsi 15:0:3:3: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [10852.089911] sd 15:0:3:3: alua: supports implicit TPGS [10852.090323] sd 15:0:3:3: alua: port group 00 rel port 300 [10852.090360] sd 15:0:3:3: [sdai] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [10852.090737] sd 15:0:3:3: alua: port group 00 state A preferred supports tolusnA [10852.091050] sd 15:0:3:3: Attached scsi gen0 [12126.399029] scsi host17: Emulex LPe16000 16Gb PCIe Fibre Channel Adapter on PCI bus 01 device 00 irq 504 [12126.959169] rpaphp: Slot [U78CA.001.CSS003P-P1-C6-C1] registered [12126.959176] rpadlpar_io: slot PHB 21 added [12127.844158] lpfc 0009:01:00.0: 1:1303 Link Up Event x1 received Data: x1 x0 x80 x0 x0 x0 0 [12128.043085] scsi 17:0:0:0: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [12128.043688] sd 17:0:0:0: alua: supports implicit TPGS [12128.043969] sd 17:0:0:0: alua: port group 00 rel port 300 [12128.044190] sd 17:0:0:0: [sda] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [12128.044311] sd 17:0:0:0: alua: port group 00 state A preferred supports tolusnA [12128.044430] sd 17:0:0:0: Attached scsi generic sg0 type 0 [12128.044896] sd 17:0:0:0: [sda] Write Protect is off [12128.044903] sd 17:0:0:0: [sda] Mode 
Sense: f5 00 00 08 [12128.045179] sd 17:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or [sdaf] Mode Sense: f5 00 00 08 [12128.088966] sd 17:0:3:0: Attached scsi generic sg32 type 0 [12128.089258] sd 17:0:3:0: [sdaf] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [12128.089877] sd 17:0:2:3: [sdae] Attached SCSI disk [12128.090349] scsi 17:0:3:1: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [12128.092029] sd 17:0:3:1: alua: supports implicit TPGS [12128.092495] sd 17:0:3:1: alua: port group 00 rel port 30 [12128.092548] sd 17:0:3:1: [sdag] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [12128.092885] sd 17:0:3:1: alua: port group 00 state A preferred supports tolusnA [12128.093103] sd 17:0:3:1: Attached scsi generic sg33 type 0 [12128.093172] sd 17:0:3:1: [sdag] Write Protect is off [12128.093183] sd 17:0:3:1: [sdag] Mode Sense: ed 00 00 08 [12128.093503] sd 17:0:3:1: [sdag] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [12128.094183] scsi 17:0:3:2: Direct-Access IBM th 65:224. [12749.073922] device-mapper: multipath: Failing path 66:32. [12749.073953] device-mapper: multipath: Failing path 65:160. [12749.090309] scsi 17:0:0:3: alua: Detached [12749.094400] device-mapper: multipath: Failing path 66:16. [12749.118398] scsi 17:0:2:2: alua: Detached [12749.138317] scsi 17:0:3:2: alua: Detached [12749.141437] device-mapper: multipath: Failing path 65:224. [12749.141467] device-mapper: multipath: Failing path 66:32. [12749.162316] scsi 17:0:1:3: alua: Detached [12749.165448] device-mapper: multipath: Failing path 66:32. 
[12749.202489] scsi 17:0:2:3: alua: Detached [12749.238436] scsi 17:0:3:3: alua: Detached [12749.371720] iommu: Removing device 0009:01:00.0 from group 0 [12759.378488] pci_bus 0009:01: busn_res: [bus 01-ff] is released [12759.378559] rpadlpar_io: slot PHB 21 removed [13405.725246] PCI host bridge /pci@800000020000015 ranges: [13405.725253] MEM 0x00003fc600000000..0x00003fc67effffff -> 0x0000000080000000 l port 100 [13408.460796] sd 19:0:2:1: [sdac] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [13408.461979] sd 19:0:2:1: [sdac] Write Protect is off [13408.461987] sd 19:0:2:1: [sdac] Mode Sense: ed 00 00 08 [13408.462292] sd 19:0:2:1: [sdac] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [13408.463890] sd 19:0:2:0: [sdab] Attached SCSI disk [13408.465140] sdac: [13408.467041] sd 19:0:2:1: [sdac] Attached SCSI disk [13408.569203] sdd: [13408.569287] sdi: [13408.570556] sd 19:0:0:2: [sdd] Attached SCSI disk [13408.570631] sd 19:0:1:2: [sdi] Attached SCSI disk [13438.697588] sd 19:0:0:1: [sdb] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [13439.326030] rport-19:0-9: blocked FC remote port time out: removing rport [13453.426174] sd 19:0:0:1: [sdb] Write Protect is off [13453.426184] sd 19:0:0:1: [sdb] Mode Sense: ed 00 00 08 [13453.426711] sd 19:0:0:1: [sdb] Write cache: disabled, read cache: enabled, doesn't support Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [14683.037860] scsi 21:0:0:2: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [14683.039069] sd 21:0:0:2: alua: supports implicit TPGS [14683.039472] sd 21:0:0:2: alua: port group 00 rel port 300 [14683.039557] sd 21:0:0:2: [sdd] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [14683.039882] sd 21:0:0:2: alua: port group 00 state A preferred supports tolusnA [14683.040143] sd 21:0:0:2: Attached scsi generic sg2 type 0 [14683.040221] sd 21:0:0:2: [sdd] Write Protect is off [14683.040231] sd 21:0:0:2: [sdd] Mode Sense: ed 00 00 08 [14683.040517] sd 
21:0:0:2: [sdd] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA [14683.041295] scsi 21:0:0:3: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5 [14683.042497] sd 21:0:0:3: alua: supports implicit TPGS [14683.042870] sd 21:0:0:3: [sde] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB) [1468er: multipath: Failing path 8:16. [15299.302590] iommu: Removing device 000b:01:00.1 from group 0 [15299.317740] scsi 21:0:0:0: alua: Detached [15299.325260] sd 21:0:2:0: [sdab] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK [15299.325265] sd 21:0:2:0: [sdab] tag#0 CDB: Read(10) 28 00 02 7f ff 80 00 00 80 00 [15299.325267] blk_update_request: I/O error, dev sdab, sector 41942912 [15299.325275] device-mapper: multipath: Failing path 65:176. [15299.325296] sd 21:0:3:0: [sdae] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK [15299.325298] sd 21:0:3:0: [sdae] tag#0 CDB: Read(10) 28 00 02 7f ff 80 00 00 80 00 [15299.325301] blk_update_request: I/O error, dev sdae, sector 41942912 [15299.325309] device-mapper: multipath: Failing path 65:224. 
  [15299.353733] scsi 21:0:1:0: alua: Detached
  [15299.361265] sd 21:0:3:1: [sdaf] tag#0 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
  [15299.361269] sd 21:0:3:1: [sdaf] tag#0 CDB958.559922] scsi 23:0:1:1: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5
  [15958.560721] sd 23:0:1:1: alua: supports implicit TPGS
  [15958.561066] sd 23:0:1:1: alua: port group 00 rel port 100
  [15958.561092] sd 23:0:1:1: [sdh] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB)
  [15958.561697] sd 23:0:1:1: alua: port group 00 state A preferred supports tolusnA
  [15958.561880] sd 23:0:1:1: Attached scsi generic sg5 type 0
  [15958.561951] sd 23:0:1:1: [sdh] Write Protect is off
  [15958.561958] sd 23:0:1:1: [sdh] Mode Sense: ed 00 00 08
  [15958.562239] sd 23:0:1:1: [sdh] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
  [15958.562942] scsi 23:0:1:2: Direct-Access IBM 2107900 .149 PQ: 0 ANSI: 5
  [15958.563577] sd 23:0:1:0: [sdg] Attached SCSI disk
  [15958.563921] sd 23:0:1:2: alua: supports implicit TPGS
  [15958.564270] sd 23:0:1:2: alua: port group 00 rel port 100
  [15958.564443] sd 23:0:1:2: [sdi] 41943040 512-byte
  d000000003a37530 7c6307b4 extsw r3,r3
  d000000003a37534 38210030 addi r1,r1,48
  d000000003a37538 e8010010 ld r0,16(r1)
  d000000003a3753c ebc1fff0 ld r30,-16(r1)
  d000000003a37540 ebe1fff8 ld r31,-8(r1)
  d000000003a37544 7c0803a6 mtlr r0
  d000000003a37548 4e800020 blr
  d000000003a3754c 60420000 ori r2,r2,0
  d000000003a37550 813f0b14 lwz r9,2836(r31)
  d000000003a37554 2b890001 cmplwi cr7,r9,1
  d000000003a37558 409dffa8 ble cr7,d000000003a37500 # lpfc_sli4_scmd_to_wqidx_distr+0x50/0x100 [lpfc]
  d000000003a3755c a14d0008 lhz r10,8(r13)
  d000000003a37560 a13f0572 lhz r9,1394(r31)
  d000000003a37564 7f895000 cmpw cr7,r9,r10
  d000000003a37568 409dff98 ble cr7,d000000003a37500 # lpfc_sli4_scmd_to_wqidx_distr+0x50/0x100 [lpfc]
  d000000003a3756c e93f0568 ld r9,1384(r31)
  e:mon>

  ---Steps to Reproduce---
  cd /kte/tools
  ./setup dlar
  cd /kte/tools/dlpar
  ./start.dlpar -d 0

  Doing some analysis without a crashdump, as it's taking too long.

  This looks like a NULL pointer dereference, if my assembly reading/matching to C is correct.
  Would need to understand why/how this '(struct scsi_cmnd *cmnd)->device' field is NULL.

  Analysis
  --------

  From xmon:

    pc: <...>: lpfc_sli4_scmd_to_wqidx_distr+0x30/0x100 [lpfc]

    R10 = 0000000000000000 R26 = 0000000000000001

  From 'objdump -d /usr/lib/debug/<...>/lpfc.ko',

    lpfc_sli4_scmd_to_wqidx_distr = 674b0
    lpfc_sli4_scmd_to_wqidx_distr+0x30 = 674e0 (crash)

  and

    00000000000674b0 <lpfc_sli4_scmd_to_wqidx_distr>:
    <...>
    674cc: 78 23 9e 7c mr r30,r4
    674d0: 78 1b 7f 7c mr r31,r3
    <...>
    674dc: 10 00 5e e9 ld r10,16(r30)
    674e0: 00 00 2a e9 ld r9,0(r10) <<-- (crash)
    674e4: 00 00 29 e9 ld r9,0(r9)
    <...>

  This is the relevant snippet of code:

  From Ubuntu 16.04 kernel 4.4.0-22.40 [1]

    int lpfc_sli4_scmd_to_wqidx_distr(struct lpfc_hba *phba,
                                      struct lpfc_scsi_buf *lpfc_cmd)
    {
        struct scsi_cmnd *cmnd = lpfc_cmd->pCmd;
        <...>
        if (shost_use_blk_mq(cmnd->device->host)) {
        <...>

  So, back to the assembly: these appear to be the 2 function parameters, passed in registers (r4, r3) and copied into other registers (r30, r31).
    674cc: 78 23 9e 7c mr r30,r4
    674d0: 78 1b 7f 7c mr r31,r3

  Per the load below, r10 holds the cmnd pointer and r30 holds the lpfc_cmd pointer; it loads lpfc_cmd->pCmd, which has an offset of 16 bytes into struct lpfc_scsi_buf [2] (after 2 pointers * 8 bytes each, from list_head list [3])

    674dc: 10 00 5e e9 ld r10,16(r30)

    struct lpfc_scsi_buf {
        struct list_head list;
        struct scsi_cmnd *pCmd;
        <...>

    struct list_head {
        struct list_head *next, *prev;
    };

  And the load below hits the crash, because it dereferences r10 (cmnd), which is zero:

    674e0: 00 00 2a e9 ld r9,0(r10) <<-- (crash)

  From xmon:

    R10 = 0000000000000000 R26 = 0000000000000001

  That dereference was for cmnd->device; you can see that the load instruction immediately afterward would further dereference the cmnd->device pointer, for the device->host field, which has offset 0 into struct scsi_device [4]. This is a confirmation that the assembly/C matching looks correct.

    if (shost_use_blk_mq(cmnd->device->host)) {

    struct scsi_device {
        struct Scsi_Host *host;
        <...>

  [1] http://kernel.ubuntu.com/git/ubuntu/ubuntu-xenial.git/tree/drivers/scsi/lpfc/lpfc_scsi.c?h=Ubuntu-4.4.0-22.40
  [2] http://kernel.ubuntu.com/git/ubuntu/ubuntu-xenial.git/tree/drivers/scsi/lpfc/lpfc_scsi.h?h=Ubuntu-4.4.0-22.40#n130
  [3] http://kernel.ubuntu.com/git/ubuntu/ubuntu-xenial.git/tree/include/linux/types.h?h=Ubuntu-4.4.0-22.40#n185
  [4] http://kernel.ubuntu.com/git/ubuntu/ubuntu-xenial.git/tree/include/scsi/scsi_device.h?h=Ubuntu-4.4.0-22.40#n77

  One detail missing..
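
  The offsets used in the assembly/C matching above can be cross-checked with a small user-space sketch. It is only an illustration under assumptions: the structures are cut-down stand-ins holding just the leading fields quoted in this report (everything shown as <...> is left out), struct scsi_cmnd is modelled with device as its first field as stated elsewhere in this report, and check_layout() and loads_completed() are hypothetical helper names, not lpfc or kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures quoted in this report.
 * Only the leading fields are modelled; these are NOT the real definitions. */
struct list_head { struct list_head *next, *prev; };
struct Scsi_Host { int unused; };
struct scsi_device { struct Scsi_Host *host; };
struct scsi_cmnd { struct scsi_device *device; };
struct lpfc_scsi_buf {
    struct list_head list;
    struct scsi_cmnd *pCmd;
};

/* Hypothetical helper: checks the offsets the analysis relies on.
 * pCmd sits after the two pointers of 'list' (16 bytes on a 64-bit build);
 * device and host are the first members of their structures. */
void check_layout(void)
{
    assert(offsetof(struct lpfc_scsi_buf, pCmd) == 2 * sizeof(void *));
    assert(offsetof(struct scsi_cmnd, device) == 0);
    assert(offsetof(struct scsi_device, host) == 0);
}

/* Hypothetical helper: walks the same chain as the three loads at
 * 674dc, 674e0 and 674e4, and returns how many of them can complete
 * (1 to 3) instead of faulting on a NULL pointer. */
int loads_completed(const struct lpfc_scsi_buf *lpfc_cmd)
{
    const struct scsi_cmnd *cmnd = lpfc_cmd->pCmd;   /* ld r10,16(r30) */
    if (cmnd == NULL)
        return 1;               /* ld r9,0(r10) would fault here */

    const struct scsi_device *dev = cmnd->device;    /* ld r9,0(r10) */
    if (dev == NULL)
        return 2;               /* ld r9,0(r9) would fault here */

    (void)dev->host;                                  /* ld r9,0(r9) */
    return 3;
}
```

  Calling loads_completed() on a buffer whose pCmd is NULL returns 1: only the first load completes, and the second one, the instruction at 674e0, is the access that would fault. That is consistent with R10 = 0 in the xmon register dump. The NULL checks exist only so the sketch can report where the chain stops; they are not a statement about what the driver does.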
  (In reply to comment #16)
  > And the load below hits the crash, because it dereferences r10 (*cmnd) which
  > is zero:
  >
  >   674e0: 00 00 2a e9 ld r9,0(r10) <<-- (crash)
  >
  > From xmon:
  >
  >   R10 = 0000000000000000 R26 = 0000000000000001
  >
  > That dereference was for cmnd->device; [snip]

  which has offset zero into struct scsi_cmnd [5]

    if (shost_use_blk_mq(cmnd->device->host)) {

    struct scsi_cmnd {
        struct scsi_device *device;
        <...>

  [5] http://kernel.ubuntu.com/git/ubuntu/ubuntu-xenial.git/tree/include/scsi/scsi_cmnd.h?h=Ubuntu-4.4.0-22.40#n59

  > Would need to understand why/how this
  > '(struct scsi_cmnd *cmnd)->device' field is NULL.

  Checking this again today, it occurred to me that the problem is actually that the cmnd pointer itself is NULL, and dereferencing it (for cmnd->device) hits the crash.

  Fix submitted upstream:
  http://marc.info/?l=linux-scsi&m=146534119707379&w=2

  I didn't provide a test kernel because the system was running regression tests over the weekend, and it takes a long time to reproduce the problem w/ DLPAR operations. However, the same problem could be reproduced w/ simpler test-cases (see commit), so I worked on it in the background.

  If you keep hitting this, let me know and I'll provide a test kernel.

  Hi Mauricio,

  Installed the fix and verified it. I am no longer hitting the issue.

  Hi Canonical,

  Can you consider picking up this fix that has not yet made it into the upstream kernel?

  It's fairly obvious and trivial, well documented in the commit message (w/ test-cases), and has been tested successfully here at IBM (also see commit msg).

  http://marc.info/?l=linux-scsi&m=146534119707379&w=2

  The adapter vendor's team has not yet reviewed it on the mailing list (nor any other patches for lpfc), so I guess it'll take some time until this makes it in.

  Is that possible?
  Thanks
  Mauricio

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1597974/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp