[Kernel-packages] [Bug 1486180] Comment bridged from LTC Bugzilla

2016-02-11 Thread bugproxy
--- Comment From shriy...@in.ibm.com 2016-02-12 01:54 EDT---
The fix works fine; verified on Ubuntu 16.04 with kernel 4.4.0-4-generic.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1486180

Title:
  Kernel OOPS during DLPAR operation with Fibre Channel adapter

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Wily:
  Fix Released
Status in linux source package in Xenial:
  Fix Released

Bug description:
  -- Problem Description --
  Kernel OOPS during DLPAR operation with Fibre Channel adapter
   
  ---uname output---
  4.1.0-1-generic
   
  ---Additional Hardware Info---
  Fibre Channel: Emulex Corporation Saturn-X: LightPulse Fibre Channel Host Adapter (rev 03)
   
  Machine Type = POWER8 

  ---Steps to Reproduce---
  1) Install Ubuntu 15.10 on a PowerVM LPAR.
  2) Configure and start the rtas_errd daemon.
  3) Via the HMC, try to add a Fibre Channel adapter using dynamic partitioning.
   During the operation, the following OOPS message is observed:

  Oops output:

   !!! 00E0806 Fcode, Copyright (c) 2000-2012 Emulex !!!  Version 3.10x2

  
  !!! 00E0806 Fcode, Copyright (c) 2000-2012 Emulex !!!  Version 3.10x2
  [ 8696.808703] PCI host bridge /pci@8002020  ranges:
  [ 8696.808708]  MEM 0x0003ff84..0x0003ff847eff -> 0x8000
  [ 8696.808716] PCI: I/O resource not set for host bridge /pci@8002020 (domain 1)
  [ 8696.808761] PCI host bridge to bus 0001:01
  [ 8696.808765] pci_bus 0001:01: root bus resource [mem 0x3ff84-0x3ff847eff] (bus address [0x8000-0xfeff])
  [ 8696.808768] pci_bus 0001:01: root bus resource [bus 01-ff]
  [ 8696.897390] rpaphp: Slot [U78C7.001.RCH0042-P1-C8] registered
  [ 8696.897395] rpadlpar_io: slot PHB 32 added
  [ 8696.972155] Emulex LightPulse Fibre Channel SCSI driver 10.5.0.0.
  [ 8696.972157] Copyright(c) 2004-2015 Emulex.  All rights reserved.
  [ 8696.972438] lpfc 0001:01:00.1: enabling device (0140 -> 0142)
  [ 8696.976145] Unable to handle kernel paging request for data at address 0x000c
  [ 8696.976174] Faulting instruction address: 0xc0084cc4
  [ 8696.976182] Oops: Kernel access of bad area, sig: 11 [#1]
  [ 8696.976188] SMP NR_CPUS=2048 NUMA pSeries
  [ 8696.976196] Modules linked in: lpfc(+) scsi_transport_fc rpadlpar_io rpaphp rtc_generic pseries_rng autofs4
  [ 8696.976220] CPU: 3 PID: 1426 Comm: systemd-udevd Not tainted 4.1.0-1-generic #1~dogfoodv1-Ubuntu
  [ 8696.976230] task: c003857737e0 ti: c000fd08c000 task.ti: c000fd08c000
  [ 8696.976239] NIP: c0084cc4 LR: c0084ca8 CTR: 
  [ 8696.976247] REGS: c000fd08f0f0 TRAP: 0300   Not tainted  (4.1.0-1-generic)
  [ 8696.976255] MSR: 80019033   CR: 8222  XER: 2000
  [ 8696.976278] CFAR: c0008468 DAR: 000c DSISR: 4000 SOFTE: 1
 GPR00: c0084ca8 c000fd08f370 c14bda00 
 
 GPR04: 0001 c000fd08f408 0003 
d2c31e60 
 GPR08: c13bda00  c003873e6b80 
d2ca7c98 
 GPR12: 8800 ce831b00 d29421f8 
38ca4522 
 GPR16: c000fd08fdc0 c000fd08fe04 d2941878 
c000fc8054c0 
 GPR20: d238 d238 d2ccff90 
 
 GPR24: c165074c c0038e17e000 c13b5e00 
c0038e17e000 
 GPR28: c13b5e28 ca590600 c13b5df0 
c13b5e20 
  [ 8696.976396] NIP [c0084cc4] enable_ddw+0x254/0x7b0
  [ 8696.976405] LR [c0084ca8] enable_ddw+0x238/0x7b0
  [ 8696.976411] Call Trace:
  [ 8696.976419] [c000fd08f370] [c0084ca8] enable_ddw+0x238/0x7b0 (unreliable)
  [ 8696.976431] [c000fd08f4b0] [c00866d8] dma_set_mask_pSeriesLP+0x218/0x2a0
  [ 8696.976444] [c000fd08f540] [c0023528] dma_set_mask+0x58/0xa0
  [ 8696.976474] [c000fd08f570] [d2c71280] lpfc_pci_probe_one+0xb0/0xc50 [lpfc]
  [ 8696.976486] [c000fd08f610] [c05987fc] local_pci_probe+0x6c/0x140
  [ 8696.976497] [c000fd08f6a0] [c0598a28] pci_device_probe+0x158/0x1e0
  [ 8696.976510] [c000fd08f700] [c067b744] driver_probe_device+0x1c4/0x5a0
  [ 8696.976522] [c000fd08f790] [c067bcdc] __driver_attach+0x11c/0x120
  [ 8696.976533] [c000fd08f7d0] [c067854c] bus_for_each_dev+0x9c/0x110
  [ 8696.976544] [c000fd08f820] [c067adbc] driver_attach+0x3c/0x60
  [ 8696.976555] [c000fd08f850] [c067a768] bus_add_driver+0x208/0x320
  [ 8696.976565] [c000fd08f8e0] [c067c99c] driver_register+0x9c/0x180
  [ 8696.976576] [c000fd08f950] [c05978ec] __pci_register_driver+0x6c/0x90
  [ 8696

[Kernel-packages] [Bug 1486180] Comment bridged from LTC Bugzilla

2016-01-28 Thread bugproxy
--- Comment From shriy...@in.ibm.com 2016-01-29 02:24 EDT---
Verified the fix for Ubuntu Xenial 16.04.

uname output : 4.3.0-7-generic

dmesg : After adding FC Adapter

root@alp2:~# dmesg -c
[55673.527853] PCI host bridge /pci@8002020  ranges:
[55673.527859]  MEM 0x3fc4..0x3fc47eff -> 0x8000
[55673.527861]  MEM 0x3080..0x308f -> 0x0003d080
[55673.532672] PCI: I/O resource not set for host bridge /pci@8002020 (domain 4)
[55673.532730] PCI host bridge to bus 0004:01
[55673.532737] pci_bus 0004:01: root bus resource [mem 0x3fc4-0x3fc47eff] (bus address [0x8000-0xfeff])
[55673.532740] pci_bus 0004:01: root bus resource [mem 0x3080-0x308f] (bus address [0x3d080-0x3d08f])
[55673.532743] pci_bus 0004:01: root bus resource [bus 01-ff]
[55673.616152] iommu: Adding device 0004:01:00.1 to group 0
[55673.616542] iommu: Adding device 0004:01:00.0 to group 0
[55673.617583] lpfc 0004:01:00.1: enabling device (0140 -> 0142)
[55673.619762] lpfc 0004:01:00.1: ibm,query-pe-dma-windows(53) 1 800 2020 returned 0
[55673.621344] lpfc 0004:01:00.1: ibm,create-pe-dma-window(54) 1 800 2020 10 24 returned 0 (liobn = 0x7020 starting addr = 800 0)
[55673.709324] lpfc 0004:01:00.1: Using 64-bit direct DMA at offset 800
[55673.709860] scsi host5: Emulex LPe12000 PCIe Fibre Channel Adapter  on PCI bus 01 device 01 irq 507
[55675.855373] lpfc 0004:01:00.0: enabling device (0140 -> 0142)
[55675.857444] lpfc 0004:01:00.0: Using 64-bit direct DMA at offset 800
[55675.857944] scsi host6: Emulex LPe12000 PCIe Fibre Channel Adapter  on PCI bus 01 device 00 irq 508
[55678.007129] rpaphp: Slot [U78C7.001.RCH0042-P1-C8] registered
[55678.007133] rpadlpar_io: slot PHB 32 added
[55678.361113] lpfc 0004:01:00.0: 1:1303 Link Up Event x1 received Data: x1 x1 x20 x1 x0 x0 0
[55678.361126] lpfc 0004:01:00.0: 1:1309 Link Up Event npiv not supported in loop topology
[55678.362164] lpfc 0004:01:00.0: 1:(0):2858 FLOGI failure Status:x3/x18 TMO:x0
[55678.363126] lpfc 0004:01:00.0: 1:(0):2858 FLOGI failure Status:x3/x18 TMO:x0
[55678.364042] lpfc 0004:01:00.0: 1:(0):2858 FLOGI failure Status:x3/x18 TMO:x0
[55678.364046] lpfc 0004:01:00.0: 1:(0):0100 FLOGI failure Status:x3/x18 TMO:x0

Call trace is not seen.
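
The check described above (no call trace after adding the adapter) could be scripted roughly as follows. This is only a sketch: the marker strings are copied from the oops in the bug description and are not an exhaustive list, the function names are made up for illustration, and the sample logs are short excerpts of the dmesg output quoted in this thread with wrapped lines joined.

```python
import re

# Markers copied from the oops in the bug description. The list is an
# assumption for illustration and is not exhaustive.
OOPS_MARKERS = (
    re.compile(r"Oops: "),
    re.compile(r"Call Trace:"),
    re.compile(r"Unable to handle kernel paging request"),
)


def find_oops_lines(dmesg_text):
    """Return every dmesg line that matches one of the oops markers."""
    return [
        line
        for line in dmesg_text.splitlines()
        if any(marker.search(line) for marker in OOPS_MARKERS)
    ]


def direct_dma_devices(dmesg_text):
    """Return PCI addresses of lpfc functions that reported 64-bit direct DMA."""
    pattern = re.compile(r"lpfc (\S+): Using 64-bit direct DMA")
    return pattern.findall(dmesg_text)


# Excerpt of the dmesg output from the fixed kernel (4.3.0-7-generic).
FIXED_LOG = """\
[55673.617583] lpfc 0004:01:00.1: enabling device (0140 -> 0142)
[55673.709324] lpfc 0004:01:00.1: Using 64-bit direct DMA at offset 800
[55675.857444] lpfc 0004:01:00.0: Using 64-bit direct DMA at offset 800
"""

# Excerpt of the oops from the affected kernel (4.1.0-1-generic).
BROKEN_LOG = """\
[ 8696.972438] lpfc 0001:01:00.1: enabling device (0140 -> 0142)
[ 8696.976145] Unable to handle kernel paging request for data at address 0x000c
[ 8696.976182] Oops: Kernel access of bad area, sig: 11 [#1]
[ 8696.976411] Call Trace:
"""

if __name__ == "__main__":
    print("oops lines (fixed kernel):", len(find_oops_lines(FIXED_LOG)))
    print("oops lines (affected kernel):", len(find_oops_lines(BROKEN_LOG)))
    print("direct DMA enabled on:", direct_dma_devices(FIXED_LOG))
```

On a live system the text would come from the output of dmesg rather than from the embedded samples.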

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Wily:
  Fix Committed
Status in linux source package in Xenial:
  Fix Released


[Kernel-packages] [Bug 1486180] Comment bridged from LTC Bugzilla

2016-01-29 Thread bugproxy
--- Comment From shriy...@in.ibm.com 2016-01-29 04:46 EDT---
Also verified the same on Ubuntu 14.04.4.

uname : 4.2.0-27-generic

Call trace is not seen.

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Wily:
  Fix Committed
Status in linux source package in Xenial:
  Fix Released


[Kernel-packages] [Bug 1486180] Comment bridged from LTC Bugzilla

2015-12-07 Thread bugproxy
--- Comment From gpicc...@br.ibm.com 2015-12-07 21:56 EDT---
Quick update: I wasn't able to perform DLPAR in the way described in the
bug. Normally I use the command-line tool "drmgr", and even with this
tool I couldn't perform DLPAR in the usual way.

I was able to add the adapter to the LPAR via the HMC with the partition
powered off. Once it booted, I could see the adapter, and in this
scenario I could perform DLPAR using the "drmgr" tool without issues,
i.e., the bug wasn't reproduced in this case.

The problem is that if the adapter is not marked as "required" in the
LPAR configuration, i.e., if the adapter is not present at partition
boot time, the DLPAR facility does not work for me. I got messages like

-"Dynamic reconfiguration is not supported for connector type slots on this system";
-"Validating PHB DLPAR capability...yes.
There are no DR capable slots on this system. Could not find drc index for 32, unable to add thePHB."

in the partition console (when using the "drmgr" command), or the message

"RMC network connection to the source partition is not present"

in the HMC, when trying DLPAR via the web interface.

Sachin, do you know what's going on? Can you help me perform the correct
DLPAR operation to reproduce the bug?

BTW, Murilo (my co-worker) suggested that the machine's firmware level
is too outdated - should we upgrade it?

Cheers,

Guilherme

Status in linux package in Ubuntu:
  New


[Kernel-packages] [Bug 1486180] Comment bridged from LTC Bugzilla

2016-01-11 Thread bugproxy
--- Comment From cha...@us.ibm.com 2016-01-11 10:34 EDT---
Status update:

The root cause was found, and a patch is provided.
The problem happens when DLPAR of a PCI device is done in an LPAR with no
PCI devices present at boot time. When DDW is being enabled (specifically
in function query_ddw()), a NULL pointer dereference happens because a
member of struct eeh_dev is NULL.

This happens because EEH is not initialized correctly: PCI devices are
not probed as expected, so the eeh_dev struct is never initialized.

Commit 89a51df5ab1d ("powerpc/eeh: Fix crash in
eeh_add_device_early() on Cell") added a check to function
eeh_add_device_early() to avoid an oops on the Cell architecture. This
function is used to probe PCI devices in hotplug/DLPAR operations. The
check is performed by evaluating the return value of the eeh_enabled()
function.

The issue then happens because, since we have no PCI device at boot time,
EEH is not enabled and this check fails in eeh_add_device_early(). Our
patch changes the way the arch check is done, so this bug no longer
happens.
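
To make the sequence easier to follow, here is a toy model of it in Python. It is not kernel code: the class, the field names and the "guard" parameter are all invented for illustration, and the "platform" guard is only a stand-in for whatever arch check the patch actually uses (see the patch itself for the real change).

```python
from dataclasses import dataclass, field


@dataclass
class ToyPartition:
    """Toy stand-in for an LPAR; every field here is hypothetical."""
    platform_supports_eeh: bool = True
    eeh_enabled: bool = False  # only becomes True once a PCI device is probed
    eeh_devs: dict = field(default_factory=dict)


def probe_device(part, dev):
    """Model of a successful probe: the per-device EEH state now exists."""
    part.eeh_devs[dev] = {"initialized": True}
    part.eeh_enabled = True


def add_device_early(part, dev, guard):
    """Model of hotplug-time probing with one of two guards.

    guard="enabled":  old behaviour, skip probing unless EEH is already on.
    guard="platform": assumed new behaviour, skip only on unsupported platforms.
    """
    if guard == "enabled" and not part.eeh_enabled:
        return
    if guard == "platform" and not part.platform_supports_eeh:
        return
    probe_device(part, dev)


def enable_ddw(part, dev):
    """Model of the DDW setup step that needs the per-device EEH state."""
    edev = part.eeh_devs.get(dev)
    if edev is None:
        # The real kernel dereferences a NULL member here and oopses.
        raise RuntimeError("per-device EEH state is missing")
    return "ddw enabled"


def hotplug(guard, devices_at_boot=()):
    """Boot with the given devices, then hot-add one more and enable DDW."""
    part = ToyPartition()
    for dev in devices_at_boot:
        probe_device(part, dev)
    new_dev = "0001:01:00.1"
    add_device_early(part, new_dev, guard)
    try:
        return enable_ddw(part, new_dev)
    except RuntimeError as err:
        return f"oops: {err}"


if __name__ == "__main__":
    print(hotplug("enabled"))
    print(hotplug("enabled", devices_at_boot=["0000:00:01.0"]))
    print(hotplug("platform"))
```

The second call models the case reported earlier in this bug, where the adapter was already present at boot and DLPAR worked without issues.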

The patch was submitted upstream. I don't know the exact procedure
regarding Canonical; I think we should wait for upstream acceptance and
then ask Canonical to add the patch to Ubuntu's 14.04.4/15.10/16.04
kernels. The patch description provides a few more details about the
issue and the proposed solution.

Link to patch on ppc-dev list:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-January/137695.html

Thanks Shryia for all the help provided.
Cheers,

Guilherme

Status in linux package in Ubuntu:
  New
