[Kernel-packages] [Bug 1944586] Re: kernel bug found when disconnecting one fiber channel interface on Cisco Chassis with fnic DRV_VERSION "1.6.0.47"

2021-11-08 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 5.4.0-90.101

---
linux (5.4.0-90.101) focal; urgency=medium

  * focal/linux: 5.4.0-90.101 -proposed tracker (LP: #1947260)

  * Packaging resync (LP: #1786013)
- debian/dkms-versions -- update from kernel-versions (main/2021.10.18)

  * Add final-checks to check certificates (LP: #1947174)
- [Packaging] Add system trusted and revocation keys final check

  * No sound on Lenovo laptop models Legion 15IMHG05, Yoga 7 14ITL5, and 13s
    Gen2 (LP: #1939052)
- ALSA: hda/realtek: Quirks to enable speaker output for Lenovo Legion 7i
  15IMHG05, Yoga 7i 14ITL5/15ITL5, and 13s Gen2 laptops.
- ALSA: hda/realtek: Fix for quirk to enable speaker output on the Lenovo 13s
  Gen2

  * CVE-2020-36385
- RDMA/cma: Add missing locking to rdma_accept()
- RDMA/ucma: Fix the locking of ctx->file
- RDMA/ucma: Rework ucma_migrate_id() to avoid races with destroy

  * Focal update: v5.4.148 upstream stable release (LP: #1946802)
- rtc: tps65910: Correct driver module alias
- btrfs: wake up async_delalloc_pages waiters after submit
- btrfs: reset replace target device to allocation state on close
- blk-zoned: allow zone management send operations without CAP_SYS_ADMIN
- blk-zoned: allow BLKREPORTZONE without CAP_SYS_ADMIN
- PCI/MSI: Skip masking MSI-X on Xen PV
- powerpc/perf/hv-gpci: Fix counter value parsing
- xen: fix setting of max_pfn in shared_info
- include/linux/list.h: add a macro to test if entry is pointing to the head
- 9p/xen: Fix end of loop tests for list_for_each_entry
- tools/thermal/tmon: Add cross compiling support
- pinctrl: stmfx: Fix hazardous u8[] to unsigned long cast
- pinctrl: ingenic: Fix incorrect pull up/down info
- soc: qcom: aoss: Fix the out of bound usage of cooling_devs
- soc: aspeed: lpc-ctrl: Fix boundary check for mmap
- soc: aspeed: p2a-ctrl: Fix boundary check for mmap
- arm64: head: avoid over-mapping in map_memory
- crypto: public_key: fix overflow during implicit conversion
- block: bfq: fix bfq_set_next_ioprio_data()
- power: supply: max17042: handle fails of reading status register
- dm crypt: Avoid percpu_counter spinlock contention in crypt_page_alloc()
- VMCI: fix NULL pointer dereference when unmapping queue pair
- media: uvc: don't do DMA on stack
- media: rc-loopback: return number of emitters rather than error
- Revert "dmaengine: imx-sdma: refine to load context only once"
- dmaengine: imx-sdma: remove duplicated sdma_load_context
- libata: add ATA_HORKAGE_NO_NCQ_TRIM for Samsung 860 and 870 SSDs
- ARM: 9105/1: atags_to_fdt: don't warn about stack size
- PCI/portdrv: Enable Bandwidth Notification only if port supports it
- PCI: Restrict ASMedia ASM1062 SATA Max Payload Size Supported
- PCI: Return ~0 data on pciconfig_read() CAP_SYS_ADMIN failure
- PCI: xilinx-nwl: Enable the clock through CCF
- PCI: aardvark: Fix checking for PIO status
- PCI: aardvark: Increase polling delay to 1.5s while waiting for PIO response
- PCI: aardvark: Fix masking and unmasking legacy INTx interrupts
- HID: input: do not report stylus battery state as "full"
- f2fs: quota: fix potential deadlock
- scsi: bsg: Remove support for SCSI_IOCTL_SEND_COMMAND
- IB/hfi1: Adjust pkey entry in index 0
- RDMA/iwcm: Release resources if iw_cm module initialization fails
- docs: Fix infiniband uverbs minor number
- pinctrl: samsung: Fix pinctrl bank pin count
- vfio: Use config not menuconfig for VFIO_NOIOMMU
- powerpc/stacktrace: Include linux/delay.h
- RDMA/efa: Remove double QP type assignment
- f2fs: show f2fs instance in printk_ratelimited
- f2fs: reduce the scope of setting fsck tag when de->name_len is zero
- openrisc: don't printk() unconditionally
- dma-debug: fix debugfs initialization order
- SUNRPC: Fix potential memory corruption
- scsi: fdomain: Fix error return code in fdomain_probe()
- pinctrl: single: Fix error return code in pcs_parse_bits_in_pinctrl_entry()
- scsi: smartpqi: Fix an error code in pqi_get_raid_map()
- scsi: qedi: Fix error codes in qedi_alloc_global_queues()
- scsi: qedf: Fix error codes in qedf_alloc_global_queues()
- powerpc/config: Renable MTD_PHYSMAP_OF
- scsi: target: avoid per-loop XCOPY buffer allocations
- HID: i2c-hid: Fix Elan touchpad regression
- KVM: PPC: Book3S HV Nested: Reflect guest PMU in-use to L0 when guest SPRs
  are live
- platform/x86: dell-smbios-wmi: Add missing kfree in error-exit from
  run_smbios_call
- fscache: Fix cookie key hashing
- clk: at91: sam9x60: Don't use audio PLL
- clk: at91: clk-generated: pass the id of changeable parent at registration
- clk: at91: clk-generated: Limit the requested rate to our range
- KVM: PPC: Fix clearing never mapped TCEs in realmode
- f2fs: fix to account 

[Kernel-packages] [Bug 1944586] Re: kernel bug found when disconnecting one fiber channel interface on Cisco Chassis with fnic DRV_VERSION "1.6.0.47"

2021-11-02 Thread Kelsey Skunberg
Thank you, Eric! :)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1944586

Title:
  kernel bug found when disconnecting one fiber channel interface on
  Cisco Chassis with fnic DRV_VERSION "1.6.0.47"

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Focal:
  Fix Committed

Bug description:
  [Impact]

  It has been brought to my attention the following:

  "
  We have been experiencing node lockups and degradation when testing fiber
  channel fail over for multi-path PURESTORAGE drives.

  Testing usually consists of either failing over the fabric or the
  local I/O module for the Cisco chassis which houses a number of
  individual blades.

  After rebooting a local Chassis I/O module we see commands like
  multipath -ll hanging.
  Resetting the blades individual fiber channel interface results in the
  following messages.
  "

  [6051160.241383]  rport-9:0-1: blocked FC remote port time out: removing target and saving binding
  [6051160.252901] BUG: kernel NULL pointer dereference, address: 0040
  [6051160.262267] #PF: supervisor read access in kernel mode
  [6051160.269314] #PF: error_code(0x) - not-present page
  [6051160.276016] PGD 0 P4D 0
  [6051160.279807] Oops:  [#1] SMP NOPTI
  [6051160.284642] CPU: 10 PID: 49346 Comm: kworker/10:2 Tainted: P   O  5.4.0-77-generic #86-Ubuntu
  [6051160.295967] Hardware name: Cisco Systems Inc UCSB-B200-M5/UCSB-B200-M5, BIOS B200M5.4.1.1d.0.0609200543 06/09/2020
  [6051160.308199] Workqueue: fc_dl_9 fc_timeout_deleted_rport [scsi_transport_fc]
  [6051160.316640] RIP: 0010:fnic_terminate_rport_io+0x10f/0x510 [fnic]
  [6051160.324050] Code: 48 89 c3 48 85 c0 0f 84 7b 02 00 00 48 05 20 01 00 00 48 89 45 b0 0f 84 6b 02 00 00 48 8b 83 58 01 00 00 48 8b 80 b8 01 00 00 <48> 8b 78 40 e8 68 e6 06 00 85 c0 0f 84 4c 02 00 00 48 8b 83 58 01
  [6051160.346553] RSP: 0018:bc224f297d90 EFLAGS: 00010082
  [6051160.353115] RAX:  RBX: 90abdd4c4b00 RCX: 90d8ab2c2bb0
  [6051160.361983] RDX: 90d8b5467400 RSI:  RDI: 90d8ab3b4b40
  [6051160.370812] RBP: bc224f297df8 R08: 90d8c08978c8 R09: 90d8b8850800
  [6051160.379518] R10: 90d8a59d64c0 R11: 0001 R12: 90d8ab2c31f8
  [6051160.388242] R13:  R14: 0246 R15: 90d8ab2c27b8
  [6051160.396953] FS:  () GS:90d8c088() knlGS:
  [6051160.406838] CS:  0010 DS:  ES:  CR0: 80050033
  [6051160.414168] CR2: 0040 CR3: 000fc1c0a004 CR4: 007626e0
  [6051160.423146] DR0:  DR1:  DR2:
  [6051160.431884] DR3:  DR6: fffe0ff0 DR7: 0400
  [6051160.440615] PKRU: 5554
  [6051160.444337] Call Trace:
  [6051160.447841]  fc_terminate_rport_io+0x56/0x70 [scsi_transport_fc]
  [6051160.455263]  fc_timeout_deleted_rport.cold+0x1bc/0x2c7 [scsi_transport_fc]
  [6051160.463623]  process_one_work+0x1eb/0x3b0
  [6051160.468784]  worker_thread+0x4d/0x400
  [6051160.473660]  kthread+0x104/0x140
  [6051160.478102]  ? process_one_work+0x3b0/0x3b0
  [6051160.483439]  ? kthread_park+0x90/0x90
  [6051160.488213]  ret_from_fork+0x1f/0x40
  [6051160.492901] Modules linked in: dm_service_time zfs(PO) zunicode(PO) zlua(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) ebtable_filter ebtables ip6table_raw ip6table_mangle ip6table_nat iptable_raw iptable_mangle iptable_nat nf_nat vhost_vsock vmw_vsock_virtio_transport_common vsock unix_diag nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 vhost_net vhost tap 8021q garp mrp bluetooth ecdh_generic ecc tcp_diag inet_diag sctp nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter bpfilter bridge stp llc nls_iso8859_1 dm_queue_length dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua intel_rapl_msr intel_rapl_common isst_if_common skx_edac nfit x86_pkg_temp_thermal intel_powerclamp ipmi_ssif coretemp kvm_intel kvm rapl input_leds joydev intel_cstate mei_me ioatdma mei dca ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad mac_hid sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor
  [6051160.492928]  async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear fnic mgag200 drm_vram_helper i2c_algo_bit ttm drm_kms_helper crct10dif_pclmul syscopyarea hid_generic crc32_pclmul libfcoe sysfillrect ghash_clmulni_intel sysimgblt aesni_intel fb_sys_fops crypto_simd libfc usbhid cryptd scsi_transport_fc hid drm glue_helper enic ahci lpc_ich libahci wmi
  [6051160.632623] CR2: 0040
  [6051160.637043] ---[ end trace 236e6f4850146477 ]---
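
  For triage, the faulting frame can be pulled out of a trace like the
  one above mechanically. What follows is a minimal sketch in Python and
  is not part of the original report: the names Frame and parse_frame are
  hypothetical, and the pattern only assumes the
  "symbol+0xoffset/0xsize [module]" layout seen in the RIP and Call Trace
  lines.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    """One symbolized location from an oops (hypothetical helper type)."""
    symbol: str
    offset: int            # byte offset into the function
    size: int              # total size of the function in bytes
    module: Optional[str]  # None for built-in code or a wrapped line


# symbol+0xOFF/0xSIZE, optionally followed by " [module]"
_FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[^\]]+)\])?"
)


def parse_frame(line: str) -> Optional[Frame]:
    """Return the first symbol+offset/size location on the line, if any."""
    m = _FRAME_RE.search(line)
    if m is None:
        return None
    return Frame(
        symbol=m["symbol"],
        offset=int(m["offset"], 16),
        size=int(m["size"], 16),
        module=m["module"],
    )
```

  Applied to the RIP line of this trace, the sketch reports symbol
  fnic_terminate_rport_io at offset 0x10f of 0x510 in module fnic. The
  module field is None when the bracketed module name has been wrapped
  onto the following line.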

  [Test Plan]

  There are two ways to replicate the bug:

  Reset a single chassis I/O module or fail over a 

[Kernel-packages] [Bug 1944586] Re: kernel bug found when disconnecting one fiber channel interface on Cisco Chassis with fnic DRV_VERSION "1.6.0.47"

2021-11-01 Thread Eric Desrochers
This has been tested on Cisco Hardware by Field Engineering and the bug
is no longer reproducible.

- Eric

** Tags removed: verification-needed-focal
** Tags added: verification-done-focal

[Kernel-packages] [Bug 1944586] Re: kernel bug found when disconnecting one fiber channel interface on Cisco Chassis with fnic DRV_VERSION "1.6.0.47"

2021-10-29 Thread Kelsey Skunberg
Hi Eric, could you please verify that the focal kernel in -proposed
resolves this bug? You can find more instructions for this in comment #4.
Thank you!

[Kernel-packages] [Bug 1944586] Re: kernel bug found when disconnecting one fiber channel interface on Cisco Chassis with fnic DRV_VERSION "1.6.0.47"

2021-10-19 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the linux/5.4.0-90.101 kernel in
-proposed solves the problem. Please test the kernel and update this bug
with the results. If the problem is solved, change the tag
'verification-needed-focal' to 'verification-done-focal'. If the problem
still exists, change the tag 'verification-needed-focal' to
'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!
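
Before re-running the test, it helps to confirm that the node actually
booted the kernel under verification. The following is a minimal sketch
in Python, not an official Ubuntu tool: the helper names parse_release
and has_fix are hypothetical, and the comparison assumes a release
string of the form shown in the trace (for example 5.4.0-77-generic)
from the same 5.4.0 focal series as the fixed package 5.4.0-90.101.

```python
import re
from typing import Tuple


def parse_release(release: str) -> Tuple[int, int, int, int]:
    """Split '5.4.0-77-generic' or '5.4.0-90.101' into (5, 4, 0, ABI)."""
    m = re.match(r"^(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch, abi = (int(part) for part in m.groups())
    return major, minor, patch, abi


def has_fix(release: str, fixed: str = "5.4.0-90.101") -> bool:
    """True if release is at or past the fixed ABI in the same series.

    Assumption: only kernels sharing the upstream base version with the
    fixed package are comparable; anything else is rejected, not guessed.
    """
    running = parse_release(release)
    target = parse_release(fixed)
    if running[:3] != target[:3]:
        raise ValueError(f"{release!r} is not in the {fixed!r} series")
    return running[3] >= target[3]
```

On a test node, has_fix(platform.release()) (after importing the
standard platform module) would be expected to return False on the
5.4.0-77-generic kernel from the report and True once the -proposed
kernel is installed and booted.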


** Tags added: verification-needed-focal

[Kernel-packages] [Bug 1944586] Re: kernel bug found when disconnecting one fiber channel interface on Cisco Chassis with fnic DRV_VERSION "1.6.0.47"

2021-10-12 Thread Kelsey Skunberg
** Changed in: linux (Ubuntu Focal)
   Status: In Progress => Fix Committed

[Kernel-packages] [Bug 1944586] Re: kernel bug found when disconnecting one fiber channel interface on Cisco Chassis with fnic DRV_VERSION "1.6.0.47"

2021-09-27 Thread Steven Parker
** Description changed:

  [Impact]
  
  It has been brought to my attention the following:
  
  "
  We have been experiencing node lockups and degradation when testing fiber 
channel fail over for multi-path PURESTORAGE drives.
  
  Testing usually consists of either failing over the fabric or the local
  I/O module for the Cisco chassis which houses a number of individual
  blades.
  
  After rebooting a local Chassis I/O module we see commands like multipath -ll 
hanging.
  Resetting the blades individual fiber channel interface results in the 
following messages.
  "
  
  6051160.241383]  rport-9:0-1: blocked FC remote port time out: removing 
target and saving binding
  [6051160.252901] BUG: kernel NULL pointer dereference, address: 
0040
  [6051160.262267] #PF: supervisor read access in kernel mode
  [6051160.269314] #PF: error_code(0x) - not-present page
  [6051160.276016] PGD 0 P4D 0
  [6051160.279807] Oops:  [#1] SMP NOPTI
  [6051160.284642] CPU: 10 PID: 49346 Comm: kworker/10:2 Tainted: P   O 
 5.4.0-77-generic #86-Ubuntu
  [6051160.295967] Hardware name: Cisco Systems Inc UCSB-B200-M5/UCSB-B200-M5, 
BIOS B200M5.4.1.1d.0.0609200543 06/09/2020
  [6051160.308199] Workqueue: fc_dl_9 fc_timeout_deleted_rport 
[scsi_transport_fc]
  [6051160.316640] RIP: 0010:fnic_terminate_rport_io+0x10f/0x510 [fnic]
  [6051160.324050] Code: 48 89 c3 48 85 c0 0f 84 7b 02 00 00 48 05 20 01 00 00 
48 89 45 b0 0f 84 6b 02 00 00 48 8b 83 58 01 00 00 48 8b 80 b8 01 00 00 <48> 8b 
78 40 e8 68 e6 06 00 85 c0 0f 84 4c 02 00 00 48 8b 83 58 01
  [6051160.346553] RSP: 0018:bc224f297d90 EFLAGS: 00010082
  [6051160.353115] RAX:  RBX: 90abdd4c4b00 RCX: 
90d8ab2c2bb0
  [6051160.361983] RDX: 90d8b5467400 RSI:  RDI: 
90d8ab3b4b40
  [6051160.370812] RBP: bc224f297df8 R08: 90d8c08978c8 R09: 
90d8b8850800
  [6051160.379518] R10: 90d8a59d64c0 R11: 0001 R12: 
90d8ab2c31f8
  [6051160.388242] R13:  R14: 0246 R15: 
90d8ab2c27b8
  [6051160.396953] FS:  () GS:90d8c088() 
knlGS:
  [6051160.406838] CS:  0010 DS:  ES:  CR0: 80050033
  [6051160.414168] CR2: 0040 CR3: 000fc1c0a004 CR4: 
007626e0
  [6051160.423146] DR0:  DR1:  DR2: 

  [6051160.431884] DR3:  DR6: fffe0ff0 DR7: 
0400
  [6051160.440615] PKRU: 5554
  [6051160.444337] Call Trace:
  [6051160.447841]  fc_terminate_rport_io+0x56/0x70 [scsi_transport_fc]
  [6051160.455263]  fc_timeout_deleted_rport.cold+0x1bc/0x2c7 
[scsi_transport_fc]
  [6051160.463623]  process_one_work+0x1eb/0x3b0
  [6051160.468784]  worker_thread+0x4d/0x400
  [6051160.473660]  kthread+0x104/0x140
  [6051160.478102]  ? process_one_work+0x3b0/0x3b0
  [6051160.483439]  ? kthread_park+0x90/0x90
  [6051160.488213]  ret_from_fork+0x1f/0x40
  [6051160.492901] Modules linked in: dm_service_time zfs(PO) zunicode(PO) 
zlua(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) ebtable_filter 
ebtables ip6table_raw ip6table_mangle ip6table_nat iptable_raw iptable_mangle 
iptable_nat nf_nat vhost_vsock vmw_vsock_virtio_transport_common vsock 
unix_diag nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 
vhost_net vhost tap 8021q garp mrp bluetooth ecdh_generic ecc tcp_diag 
inet_diag sctp nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter 
bpfilter bridge stp llc nls_iso8859_1 dm_queue_length dm_multipath scsi_dh_rdac 
scsi_dh_emc scsi_dh_alua intel_rapl_msr intel_rapl_common isst_if_common 
skx_edac nfit x86_pkg_temp_thermal intel_powerclamp ipmi_ssif coretemp 
kvm_intel kvm rapl input_leds joydev intel_cstate mei_me ioatdma mei dca 
ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad mac_hid 
sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 
async_raid6_recov async_memcpy async_pq async_xor
  [6051160.492928]  async_tx xor raid6_pq libcrc32c raid1 raid0 multipath 
linear fnic mgag200 drm_vram_helper i2c_algo_bit ttm drm_kms_helper 
crct10dif_pclmul syscopyarea hid_generic crc32_pclmul libfcoe sysfillrect 
ghash_clmulni_intel sysimgblt aesni_intel fb_sys_fops crypto_simd libfc usbhid 
cryptd scsi_transport_fc hid drm glue_helper enic ahci lpc_ich libahci wmi
  [6051160.632623] CR2: 0040
  [6051160.637043] ---[ end trace 236e6f4850146477 ]---
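
  When triaging a trace like this one, the RIP line is usually the first
  thing to read: it has the form symbol+offset/size [module], so here the
  fault is at offset 0x10f into fnic_terminate_rport_io (0x510 bytes long)
  in the fnic module. A minimal sketch for pulling those fields out of a
  saved dmesg log follows. The helper name parse_rip and the regular
  expression are illustrative assumptions, not part of any kernel tooling.

```python
import re

# Matches lines such as "RIP: 0010:fnic_terminate_rport_io+0x10f/0x510 [fnic]".
# The trailing "[module]" part is optional (built-in code has no module tag).
RIP_RE = re.compile(
    r"RIP:\s+[0-9a-fA-F]+:(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-fA-F]+)"
    r"/0x(?P<size>[0-9a-fA-F]+)(?:\s+\[(?P<module>\w+)\])?"
)


def parse_rip(line):
    """Return symbol, offset, size and module from an oops RIP line, or None."""
    match = RIP_RE.search(line)
    if match is None:
        return None
    return {
        "symbol": match.group("symbol"),
        "offset": int(match.group("offset"), 16),  # byte offset into the function
        "size": int(match.group("size"), 16),      # total size of the function
        "module": match.group("module"),           # None if no module tag present
    }


# Sample line copied from the trace in this report.
sample = "[6051160.316640] RIP: 0010:fnic_terminate_rport_io+0x10f/0x510 [fnic]"
info = parse_rip(sample)
print(info)
```

  The symbol and offset can then be looked up against a kernel build with
  debug symbols to find the source line that dereferenced the NULL pointer.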
  
  [Test Plan]
  
-  ???
+ There are two ways to replicate the bug:
+ 
+ Specific hardware:
+ Chassis: Cisco UCS 5108 AC2 Chassis
+ Blades: Cisco UCS B200
+ IO module: Cisco UCS 2408
+ 
+ Server load: Ubuntu 20.04 cluster running deployed MAAS, Juju and
+ OpenStack.
+ 
+ 1) Reset a single chassis I/O module or fail over a fabric interconnect
+ (FI) for all chassis in the cluster. We have performed both tests.
+ 
+ Failing over a single chassis I/O module results in at least one node locking up.
+ 
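
The reported symptom after a failover is that commands like multipath -ll
hang. A minimal sketch of how a test run could detect that automatically is
to run the command under a timeout and classify the result. The helper name
probe and the 30 second timeout are assumptions for illustration, not part of
the reporter's procedure.

```python
import subprocess


def probe(cmd, timeout_s):
    """Run cmd and classify it as 'ok', 'failed', 'hung' or 'missing'."""
    try:
        proc = subprocess.Popen(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except FileNotFoundError:
        return "missing"
    try:
        returncode = proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Ask the kernel to kill the child but do not wait for it afterwards:
        # a task stuck in uninterruptible sleep may never exit, and waiting
        # would make the probe itself hang.
        proc.kill()
        return "hung"
    return "ok" if returncode == 0 else "failed"


# Example (assumed timeout): probe(["multipath", "-ll"], timeout_s=30)
```

Running the probe on each node before and after the I/O module reset gives a
simple pass/fail signal; a "hung" result after the reset would match the
behaviour described in this report.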

[Kernel-packages] [Bug 1944586] Re: kernel bug found when disconnecting one fiber channel interface on Cisco Chassis with fnic DRV_VERSION "1.6.0.47"

2021-09-27 Thread Eric Desrochers
A v5.4 test kernel (the kernel series in which the problem was found)
was tested by the field engineer, with the following outcome:

"
-Extensive testing about 4/5 failovers both HWE (v5.11) and the patched kernels seem stable (v5.4).

Thank you this unblocks us for deployment of this cloud.
"

- Eric

** Description changed:

  [Impact]
  
  The following has been brought to my attention:
  
  "
  We have been experiencing node lockups and degradation when testing fiber channel fail over for multi-path PURESTORAGE drives.
  
  Testing usually consists of either failing over the fabric or the local
  I/O module for the Cisco chassis which houses a number of individual
  blades.
  
  After rebooting a local Chassis I/O module we see commands like multipath -ll hanging.
  Resetting the blade's individual fiber channel interface results in the following messages.
  "
  
  [6051160.241383]  rport-9:0-1: blocked FC remote port time out: removing target and saving binding
  [6051160.252901] BUG: kernel NULL pointer dereference, address: 0040
  [6051160.262267] #PF: supervisor read access in kernel mode
  [6051160.269314] #PF: error_code(0x) - not-present page
  [6051160.276016] PGD 0 P4D 0
  [6051160.279807] Oops:  [#1] SMP NOPTI
  [6051160.284642] CPU: 10 PID: 49346 Comm: kworker/10:2 Tainted: P   O  5.4.0-77-generic #86-Ubuntu
  [6051160.295967] Hardware name: Cisco Systems Inc UCSB-B200-M5/UCSB-B200-M5, BIOS B200M5.4.1.1d.0.0609200543 06/09/2020
  [6051160.308199] Workqueue: fc_dl_9 fc_timeout_deleted_rport [scsi_transport_fc]
  [6051160.316640] RIP: 0010:fnic_terminate_rport_io+0x10f/0x510 [fnic]
  [6051160.324050] Code: 48 89 c3 48 85 c0 0f 84 7b 02 00 00 48 05 20 01 00 00 48 89 45 b0 0f 84 6b 02 00 00 48 8b 83 58 01 00 00 48 8b 80 b8 01 00 00 <48> 8b 78 40 e8 68 e6 06 00 85 c0 0f 84 4c 02 00 00 48 8b 83 58 01
  [6051160.346553] RSP: 0018:bc224f297d90 EFLAGS: 00010082
  [6051160.353115] RAX:  RBX: 90abdd4c4b00 RCX: 90d8ab2c2bb0
  [6051160.361983] RDX: 90d8b5467400 RSI:  RDI: 90d8ab3b4b40
  [6051160.370812] RBP: bc224f297df8 R08: 90d8c08978c8 R09: 90d8b8850800
  [6051160.379518] R10: 90d8a59d64c0 R11: 0001 R12: 90d8ab2c31f8
  [6051160.388242] R13:  R14: 0246 R15: 90d8ab2c27b8
  [6051160.396953] FS:  () GS:90d8c088() knlGS:
  [6051160.406838] CS:  0010 DS:  ES:  CR0: 80050033
  [6051160.414168] CR2: 0040 CR3: 000fc1c0a004 CR4: 007626e0
  [6051160.423146] DR0:  DR1:  DR2: 
  [6051160.431884] DR3:  DR6: fffe0ff0 DR7: 0400
  [6051160.440615] PKRU: 5554
  [6051160.444337] Call Trace:
  [6051160.447841]  fc_terminate_rport_io+0x56/0x70 [scsi_transport_fc]
  [6051160.455263]  fc_timeout_deleted_rport.cold+0x1bc/0x2c7 [scsi_transport_fc]
  [6051160.463623]  process_one_work+0x1eb/0x3b0
  [6051160.468784]  worker_thread+0x4d/0x400
  [6051160.473660]  kthread+0x104/0x140
  [6051160.478102]  ? process_one_work+0x3b0/0x3b0
  [6051160.483439]  ? kthread_park+0x90/0x90
  [6051160.488213]  ret_from_fork+0x1f/0x40
  [6051160.492901] Modules linked in: dm_service_time zfs(PO) zunicode(PO) 
zlua(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) ebtable_filter 
ebtables ip6table_raw ip6table_mangle ip6table_nat iptable_raw iptable_mangle 
iptable_nat nf_nat vhost_vsock vmw_vsock_virtio_transport_common vsock 
unix_diag nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 
vhost_net vhost tap 8021q garp mrp bluetooth ecdh_generic ecc tcp_diag 
inet_diag sctp nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter 
bpfilter bridge stp llc nls_iso8859_1 dm_queue_length dm_multipath scsi_dh_rdac 
scsi_dh_emc scsi_dh_alua intel_rapl_msr intel_rapl_common isst_if_common 
skx_edac nfit x86_pkg_temp_thermal intel_powerclamp ipmi_ssif coretemp 
kvm_intel kvm rapl input_leds joydev intel_cstate mei_me ioatdma mei dca 
ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad mac_hid 
sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 
async_raid6_recov async_memcpy async_pq async_xor
  [6051160.492928]  async_tx xor raid6_pq libcrc32c raid1 raid0 multipath 
linear fnic mgag200 drm_vram_helper i2c_algo_bit ttm drm_kms_helper 
crct10dif_pclmul syscopyarea hid_generic crc32_pclmul libfcoe sysfillrect 
ghash_clmulni_intel sysimgblt aesni_intel fb_sys_fops crypto_simd libfc usbhid 
cryptd scsi_transport_fc hid drm glue_helper enic ahci lpc_ich libahci wmi
  [6051160.632623] CR2: 0040
  [6051160.637043] ---[ end trace 236e6f4850146477 ]---
  
  [Test Plan]
  
+  ???
+ 
  [Where problems could occur]
+ 
+ Cisco "fNIC" driver enables FCoE support for the Cisco UCS Virtual
+ Interface Card family of products.
+ 
+ If a problem arises it would be 

[Kernel-packages] [Bug 1944586] Re: kernel bug found when disconnecting one fiber channel interface on Cisco Chassis with fnic DRV_VERSION "1.6.0.47"

2021-09-22 Thread Eric Desrochers
** Summary changed:

- kernel bug found when disconnecting one fiber channel interface on Cisco Chassis with fnic DRV_VERSION " 1.6.0.47"
+ kernel bug found when disconnecting one fiber channel interface on Cisco Chassis with fnic DRV_VERSION "1.6.0.47"

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1944586

Title:
  kernel bug found when disconnecting one fiber channel interface on
  Cisco Chassis with fnic DRV_VERSION "1.6.0.47"

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  [Impact]

  The following has been brought to my attention:

  "
  We have been experiencing node lockups and degradation when testing fiber channel fail over for multi-path PURESTORAGE drives.

  Testing usually consists of either failing over the fabric or the
  local I/O module for the Cisco chassis which houses a number of
  individual blades.

  After rebooting a local Chassis I/O module we see commands like multipath -ll hanging.
  Resetting the blade's individual fiber channel interface results in the following messages.
  "

  [6051160.241383]  rport-9:0-1: blocked FC remote port time out: removing target and saving binding
  [6051160.252901] BUG: kernel NULL pointer dereference, address: 0040
  [6051160.262267] #PF: supervisor read access in kernel mode
  [6051160.269314] #PF: error_code(0x) - not-present page
  [6051160.276016] PGD 0 P4D 0
  [6051160.279807] Oops:  [#1] SMP NOPTI
  [6051160.284642] CPU: 10 PID: 49346 Comm: kworker/10:2 Tainted: P   O  5.4.0-77-generic #86-Ubuntu
  [6051160.295967] Hardware name: Cisco Systems Inc UCSB-B200-M5/UCSB-B200-M5, BIOS B200M5.4.1.1d.0.0609200543 06/09/2020
  [6051160.308199] Workqueue: fc_dl_9 fc_timeout_deleted_rport [scsi_transport_fc]
  [6051160.316640] RIP: 0010:fnic_terminate_rport_io+0x10f/0x510 [fnic]
  [6051160.324050] Code: 48 89 c3 48 85 c0 0f 84 7b 02 00 00 48 05 20 01 00 00 48 89 45 b0 0f 84 6b 02 00 00 48 8b 83 58 01 00 00 48 8b 80 b8 01 00 00 <48> 8b 78 40 e8 68 e6 06 00 85 c0 0f 84 4c 02 00 00 48 8b 83 58 01
  [6051160.346553] RSP: 0018:bc224f297d90 EFLAGS: 00010082
  [6051160.353115] RAX:  RBX: 90abdd4c4b00 RCX: 90d8ab2c2bb0
  [6051160.361983] RDX: 90d8b5467400 RSI:  RDI: 90d8ab3b4b40
  [6051160.370812] RBP: bc224f297df8 R08: 90d8c08978c8 R09: 90d8b8850800
  [6051160.379518] R10: 90d8a59d64c0 R11: 0001 R12: 90d8ab2c31f8
  [6051160.388242] R13:  R14: 0246 R15: 90d8ab2c27b8
  [6051160.396953] FS:  () GS:90d8c088() knlGS:
  [6051160.406838] CS:  0010 DS:  ES:  CR0: 80050033
  [6051160.414168] CR2: 0040 CR3: 000fc1c0a004 CR4: 007626e0
  [6051160.423146] DR0:  DR1:  DR2: 
  [6051160.431884] DR3:  DR6: fffe0ff0 DR7: 0400
  [6051160.440615] PKRU: 5554
  [6051160.444337] Call Trace:
  [6051160.447841]  fc_terminate_rport_io+0x56/0x70 [scsi_transport_fc]
  [6051160.455263]  fc_timeout_deleted_rport.cold+0x1bc/0x2c7 [scsi_transport_fc]
  [6051160.463623]  process_one_work+0x1eb/0x3b0
  [6051160.468784]  worker_thread+0x4d/0x400
  [6051160.473660]  kthread+0x104/0x140
  [6051160.478102]  ? process_one_work+0x3b0/0x3b0
  [6051160.483439]  ? kthread_park+0x90/0x90
  [6051160.488213]  ret_from_fork+0x1f/0x40
  [6051160.492901] Modules linked in: dm_service_time zfs(PO) zunicode(PO) 
zlua(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) ebtable_filter 
ebtables ip6table_raw ip6table_mangle ip6table_nat iptable_raw iptable_mangle 
iptable_nat nf_nat vhost_vsock vmw_vsock_virtio_transport_common vsock 
unix_diag nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 
vhost_net vhost tap 8021q garp mrp bluetooth ecdh_generic ecc tcp_diag 
inet_diag sctp nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter 
bpfilter bridge stp llc nls_iso8859_1 dm_queue_length dm_multipath scsi_dh_rdac 
scsi_dh_emc scsi_dh_alua intel_rapl_msr intel_rapl_common isst_if_common 
skx_edac nfit x86_pkg_temp_thermal intel_powerclamp ipmi_ssif coretemp 
kvm_intel kvm rapl input_leds joydev intel_cstate mei_me ioatdma mei dca 
ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad mac_hid 
sch_fq_codel ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 
async_raid6_recov async_memcpy async_pq async_xor
  [6051160.492928]  async_tx xor raid6_pq libcrc32c raid1 raid0 multipath 
linear fnic mgag200 drm_vram_helper i2c_algo_bit ttm drm_kms_helper 
crct10dif_pclmul syscopyarea hid_generic crc32_pclmul libfcoe sysfillrect 
ghash_clmulni_intel sysimgblt aesni_intel fb_sys_fops crypto_simd libfc usbhid 
cryptd scsi_transport_fc hid drm glue_helper enic ahci lpc_ich libahci wmi
  [6051160.632623] CR2: