Thanks @mfo! That is correct: the crash was seen there, but they
determined it was generic and are pushing this fix to all the Linux OSVs.


Also of note: that patch set is a general driver update, and not all of
its patches are relevant to this bug. I've asked them to pinpoint the
patches that specifically resolve this issue, with the intent of pulling
just those.


** Description changed:

+ [Impact]
  Server crash and Call trace reported on one of the servers running IO and
  switch port bounce test from the 2K login session configuration.
- 
  
  Call Trace:
  [56048.470488] Call Trace:
  [56048.470489]  _raw_spin_lock_irqsave+0x32/0x40
  [56048.470489]  lpfc_dmp_dbg.part.32+0x28/0x220 [lpfc]
  [56048.470490]  lpfc_cmpl_els_fdisc+0x145/0x460 [lpfc]
  [56048.470490]  lpfc_sli_cancel_jobs+0x92/0xd0 [lpfc]
  [56048.470490]  lpfc_els_flush_cmd+0x43c/0x670 [lpfc]
  [56048.470491]  lpfc_els_flush_all_cmd+0x37/0x60 [lpfc]
  [56048.470491]  lpfc_sli4_async_event_proc+0x956/0x1720 [lpfc]
  [56048.470492]  lpfc_do_work+0x1485/0x1d70 [lpfc]
  [56048.470492]  ? __schedule+0x280/0x700
  [56048.470492]  ? finish_wait+0x80/0x80
  [56048.470493]  ? lpfc_unregister_unused_fcf+0x80/0x80 [lpfc]
  [56048.470493]  kthread+0x112/0x130
  [56048.470493]  ? kthread_flush_work_fn+0x10/0x10
  [56048.470494]  ret_from_fork+0x1f/0x40
  [56048.470494] Kernel panic - not syncing: Hard LOCKUP
- [56048.470495] CPU: 0 PID: 682 Comm: lpfc_worker_0 Kdump: loaded Tainted: G
-      IOE    --------- -  - 4.18.0-240.el8.x86_64 #1
+ [56048.470495] CPU: 0 PID: 682 Comm: lpfc_worker_0 Kdump: loaded Tainted: G
+      IOE    --------- -  - 4.18.0-240.el8.x86_64 #1
  [56048.470496] Hardware name: Dell Inc. PowerEdge R740/0DY2X0, BIOS 2.11.2
  004/21/2021
  [56048.470496] Call Trace:
  [56048.470496]  <NMI>
  [56048.470496]  dump_stack+0x5c/0x80
  [56048.470497]  panic+0xe7/0x2a9
  [56048.470497]  ? __switch_to_asm+0x51/0x70
  [56048.470497]  nmi_panic.cold.9+0xc/0xc
  [56048.470498]  watchdog_overflow_callback.cold.7+0x5c/0x70
  [56048.470498]  __perf_event_overflow+0x52/0xf0
  [56048.470499]  handle_pmi_common+0x1db/0x270
  [56048.470499]  ? __set_pte_vaddr+0x32/0x50
  [56048.470499]  ? __native_set_fixmap+0x24/0x30
  [56048.470500]  ? ghes_copy_tofrom_phys+0xd3/0x1c0
  [56048.470500]  ? __ghes_peek_estatus.isra.12+0x49/0xa0
  [56048.470500]  intel_pmu_handle_irq+0xbf/0x160
  [56048.470501]  perf_event_nmi_handler+0x2d/0x50
  [56048.470501]  nmi_handle+0x63/0x110
  [56048.470501]  default_do_nmi+0x4e/0x100
  [56048.470502]  do_nmi+0x128/0x190
  [56048.470502]  end_repeat_nmi+0x16/0x6a
  [56048.470503] RIP: 0010:native_queued_spin_lock_slowpath+0x5d/0x1d0
  [56048.470504] Code: 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4
  09 d0 a9 00 01 ff ff 75 47 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 8b 07 <84> c0 75
  f8 b8 01 00 00 00 66 89 07 c3 8b 37 81 fe 00 01 00 00 75
  [56048.470504] RSP: 0018:ffffacebc7877ca8 EFLAGS: 00000002
  [56048.470505] RAX: 0000000000000101 RBX: 0000000000000246 RCX:
  000000000000001f
  [56048.470505] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
  ffff94dcf5341dc0
  [56048.470506] RBP: ffff94dcf5340000 R08: 0000000000000002 R09:
  0000000000029600
  [56048.470506] R10: 000060d29656a45c R11: ffff94dcf534fd12 R12:
  ffff94dcf5341db0
  [56048.470507] R13: ffff94dcf5341dc0 R14: ffff94dcc4ae8a00 R15:
  0000000000000003
  [56048.470507]  ? native_queued_spin_lock_slowpath+0x5d/0x1d0
  [56048.470507]  ? native_queued_spin_lock_slowpath+0x5d/0x1d0
  [56048.470508]  </NMI>
  [56048.470508]  _raw_spin_lock_irqsave+0x32/0x40
  [56048.470509]  lpfc_dmp_dbg.part.32+0x28/0x220 [lpfc]
  [56048.470509]  lpfc_cmpl_els_fdisc+0x145/0x460 [lpfc]
  [56048.470509]  lpfc_sli_cancel_jobs+0x92/0xd0 [lpfc]
  [56048.470510]  lpfc_els_flush_cmd+0x43c/0x670 [lpfc]
  [56048.470510]  lpfc_els_flush_all_cmd+0x37/0x60 [lpfc]
  [56048.470510]  lpfc_sli4_async_event_proc+0x956/0x1720 [lpfc]
  [56048.470511]  lpfc_do_work+0x1485/0x1d70 [lpfc]
  [56048.470511]  ? __schedule+0x280/0x700
  [56048.470511]  ? finish_wait+0x80/0x80
  [56048.470512]  ? lpfc_unregister_unused_fcf+0x80/0x80 [lpfc]
  [56048.470512]  kthread+0x112/0x130
  [56048.470513]  ? kthread_flush_work_fn+0x10/0x10
  [56048.470513]  ret_from_fork+0x1f/0x40
  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]#
  
- 
  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]# cat
  /etc/redhat-release
  Red Hat Enterprise Linux release 8.3 (Ootpa)
  
  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]# cat
  /sys/module/lpfc/version
  0:14.0.390.2
  
  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]# cat
  /sys/class/scsi_host/host*/modeldesc
  Emulex LightPulse LPe32002-M2 2-Port 32Gb Fibre Channel Adapter
  Emulex LightPulse LPe32002-M2 2-Port 32Gb Fibre Channel Adapter
  
  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]# cat
  /sys/class/scsi_host/host*/fwrev
  14.0.390.1, sli-4:2:c
  14.0.390.1, sli-4:2:c
  
  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]# cat
  /sys/class/fc_host/host*/port_name
  0x10000090faf09459
  0x10000090faf0945a
  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]#
- 
  
  HBA Attributes for 10:00:00:90:fa:f0:94:59
  
  Host Name                     : ms-svr3-10-231-131-160
  Manufacturer                  : Emulex Corporation
  Serial Number                 : FC70793283
  Model                         : LPe32002-M2
  Model Desc                    : Emulex LightPulse LPe32002-M2 2-Port 32Gb Fibre
  Channel Adapter
  Node WWN                      : 20 00 00 90 fa f0 94 59
  Node Symname                  :
  HW Version                    : 0000000c 00000001 00000000
  FW Version                    : 14.0.390.1
  Vendor Spec ID                : 10DF
  Number of Ports               : 1
  Driver Name                   : lpfc
  Driver Version                : 14.0.390.2; HBAAPI(I) v2.3.d, 07-12-10
  Device ID                     : E300
  HBA Type                      : LPe32002-M2
  Operational FW                : 14.0.390.1
  IEEE Address                  : 00 90 fa f0 94 59
  Boot Code                     : Enabled
  Boot Version                  : 14.0.390.1
  Board Temperature             : Normal
  Function Type                 : FC
  Sub Device ID                 : E300
  PCI Bus Number                : 94
  PCI Func Number               : 0
  Sub Vendor ID                 : 10DF
  IPL Filename                  : H62LEX1
  Service Processor FW Name     : 14.0.390.1
  ULP FW Name                   : 14.0.390.1
  FC Universal BIOS Version     : 14.0.390.1
  FC x86 BIOS Version           : 14.0.390.1
  FC EFI BIOS Version           : 14.0.388.0
  FC FCODE Version              : 14.0.386.0
  Flash Firmware Version        : 14.0.390.1
  Secure Firmware               : Enabled
  
  [root@ms-svr3-10-231-131-160 log]# hbacmd portattrib
  10:00:00:90:fa:f0:94:59
  
  Port Attributes for 10:00:00:90:fa:f0:94:59
  
  Node WWN                  : 20 00 00 90 fa f0 94 59
  Port WWN                  : 10 00 00 90 fa f0 94 59
  Port Symname              :
  Port FCID                 : 0000
  Port Type                 : Unknown
  Port State                : Link Down
  Port Service Type         : 8
  Port Supported FC4        : 00 00 01 00 00 00 00 01
-                             00 00 00 00 00 00 00 00
-                             00 00 00 00 00 00 00 00
-                             00 00 00 00 00 00 00 00
+                             00 00 00 00 00 00 00 00
+                             00 00 00 00 00 00 00 00
+                             00 00 00 00 00 00 00 00
  Port Active FC4           : 00 00 01 00 00 00 00 01
-                             00 00 00 00 00 00 00 00
-                             00 00 00 00 00 00 00 00
-                             00 00 00 00 00 00 00 00
+                             00 00 00 00 00 00 00 00
+                             00 00 00 00 00 00 00 00
+                             00 00 00 00 00 00 00 00
  Port Supported Speed      : 8 16 32 Gbit/sec
  Configured Port Speed     : Auto Detect
  Port Speed                : Not Available
  Max Frame Size            : 2048
  OS Device Name            : /sys/class/scsi_host/host15
  Num Discovered Ports      : 0
  Fabric Name               : 00 00 00 00 00 00 00 00
  Function Type             : FC
  FEC                       : Enabled
  
+ [Fixes]
+ The following patch will resolve the issue:
+ scsi: lpfc: Move cfg_log_verbose check before calling lpfc_dmp_dbg()
+ In an attempt to log message 0126 with LOG_TRACE_EVENT, the following hard
+ lockup call trace hangs the system.
+ 
+ [Testcase]
+ 
+ 
  [root@ms-svr3-10-231-131-160 log]#
  Comment 3, James Smart, 2022-04-13 09:12:37 PDT
  Patches pushed upstream 4/12/22:
  
  https://lore.kernel.org/linux-scsi/20220412222008.126521-1-jsmart2...@gmail.com/T/#t

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1971193

Title:
  Server Crash while running IO and switch port bounce test with 2K
  login session

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1971193/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
