** Also affects: ubuntu-power-systems
   Importance: Undecided
       Status: New

** Changed in: ubuntu-power-systems
   Importance: Undecided => High

** Changed in: ubuntu-power-systems
     Assignee: (unassigned) => Canonical Kernel Team (canonical-kernel-team)

** Tags added: triage-g

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1750441

Title:
  Boston-LC:bos1u1: Stress test on Qlogic Fibre Channel on Ubuntu KVM
  guest that caused KVM host crashed in qlt_free_session_done call

Status in The Ubuntu-power-systems project:
  New
Status in linux package in Ubuntu:
  New

Bug description:
  Problem Description:
  =============
  - A Qlogic Fibre Channel adapter is passed through (PCI passthrough)
  from an Ubuntu 18.04 KVM host to an Ubuntu 18.04 KVM guest.

  - A stress test on the Qlogic Fibre Channel adapter in the Ubuntu KVM
  guest caused the KVM host to crash in the qlt_free_session_done call.

  - The stack traces below are from the KVM host:

  
  91:mon> t
  [c000200e4e81fb60] c00800001162f044 qlt_free_session_done+0x4ec/0x680 [qla2xxx] (unreliable)
  [c000200e4e81fc90] c00000000012fbb8 process_one_work+0x298/0x5a0
  [c000200e4e81fd20] c00000000012ff58 worker_thread+0x98/0x630
  [c000200e4e81fdc0] c000000000138ae8 kthread+0x1a8/0x1b0
  [c000200e4e81fe30] c00000000000b528 ret_from_kernel_thread+0x5c/0xb4

  91:mon> e
  cpu 0x91: Vector: 300 (Data Access) at [c000200e4e81f8e0]
      pc: c00800001162ed58: qlt_free_session_done+0x200/0x680 [qla2xxx]
      lr: c00800001162eca8: qlt_free_session_done+0x150/0x680 [qla2xxx]
      sp: c000200e4e81fb60
     msr: 900000000280b033
     dar: 20
   dsisr: 40000000
    current = 0xc000200e4e7b0e00
    paca    = 0xc00000000fae3b00         softe: 0        irq_happened: 0x01
      pid   = 1119, comm = kworker/145:1
  Linux version 4.15.0-041500rc9-generic (kernel@tangerine) (gcc version 7.2.0 (Ubuntu 7.2.0-6ubuntu1)) #201801212130 SMP Mon Jan 22 03:36:42 UTC 2018

  
  91:mon> r
  R00 = c00800001162eca8   R16 = 0000000000000000
  R01 = c000200e4e81fb60   R17 = 0000000000000000
  R02 = c00800001166ad60   R18 = 0000000000000000
  R03 = 0000000000000001   R19 = 0000000000000000
  R04 = c000200e44f8c7f8   R20 = c000200e618e7d80
  R05 = 000000000000f087   R21 = 0000000000000000
  R06 = c00800001165e6c8   R22 = 0000000000000001
  R07 = c00800001164adb0   R23 = c000200e44f99d24
  R08 = 0000000000000000   R24 = 0000000000000402
  R09 = 0000000000000000   R25 = 0000000000000000
  R10 = 0000000000000000   R26 = c000000fe1270c20
  R11 = c00800001163e170   R27 = c000200e44f99000
  R12 = c000000000cfccf0   R28 = c00800001164adb0
  R13 = c00000000fae3b00   R29 = c000000fe1270c00
  R14 = c000000000138948   R30 = c000200e44f8c7f8
  R15 = c000200e4f019440   R31 = c000000fe1270cc0
  pc  = c00800001162ed58 qlt_free_session_done+0x200/0x680 [qla2xxx]
  cfar= c00800001162ed1c qlt_free_session_done+0x1c4/0x680 [qla2xxx]
  lr  = c00800001162eca8 qlt_free_session_done+0x150/0x680 [qla2xxx]
  msr = 900000000280b033   cr  = 28002284
  ctr = c000000000cfccf0   xer = 0000000000000000   trap =  300
  dar = 0000000000000020   dsisr = 40000000
  91:mon>

  The crash location seems close to this one fixed about two weeks ago:

  https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/scsi/qla2xxx/qla_os.c?h=next-20180212&id=2ce87cc5b269510de9ca1185ca8a6e10ec78c069

  scsi: qla2xxx: Fix memory corruption during hba reset test
  This patch fixes memory corruption while performing HBA Reset test.

  Following stack trace is seen:

  [  466.397219] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
  [  466.433669] IP: [<ffffffffc06f5dd0>] qlt_free_session_done+0x260/0x5f0 [qla2xxx]
  [  466.467731] PGD 0
  [  466.476718] Oops: 0000 [#1] SMP
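
  The similarity between the two crashes can be checked mechanically. The
  sketch below is only an illustration, not part of the original analysis:
  it takes the fault lines copied from the two traces above and compares the
  faulting data address and the function name. The helper names
  (xmon_fault_address, oops_fault_address, symbol_of) are hypothetical. The
  instruction offsets are expected to differ, since the traces come from
  different architectures and builds, so a match here is suggestive rather
  than conclusive.

```python
import re

# Lines copied from the two traces in this report.
XMON_DAR = "dar = 0000000000000020   dsisr = 40000000"
X86_OOPS = ("BUG: unable to handle kernel NULL pointer dereference at "
            "0000000000000020")
XMON_PC = "qlt_free_session_done+0x200/0x680"
X86_IP = "qlt_free_session_done+0x260/0x5f0"


def xmon_fault_address(line):
    """Return the faulting data address from an xmon 'dar = ...' line."""
    match = re.search(r"dar\s*=\s*([0-9a-fA-F]+)", line)
    return int(match.group(1), 16)


def oops_fault_address(line):
    """Return the faulting address from an x86 'dereference at ...' line."""
    match = re.search(r"dereference at\s+([0-9a-fA-F]+)", line)
    return int(match.group(1), 16)


def symbol_of(location):
    """Strip the '+offset/size' suffix from a 'symbol+0x.../0x...' string."""
    return location.split("+", 1)[0]


same_address = xmon_fault_address(XMON_DAR) == oops_fault_address(X86_OOPS)
same_function = symbol_of(XMON_PC) == symbol_of(X86_IP)
print(hex(xmon_fault_address(XMON_DAR)), same_address, same_function)
```

  Both traces fault on data address 0x20 inside qlt_free_session_done, which
  is consistent with a field access at offset 0x20 through a NULL pointer.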

  - Luciano built and provided a test kernel that includes the new Qlogic
  change on Friday last week.

  root@bos1u1p1:~/chavez# ls linux-image*
  linux-image-4.15.0-041500rc9-generic_4.15.0-041500rc9.201801212130_ppc64el.deb
  linux-image-extra-4.15.0-041500rc9-generic_4.15.0-041500rc9.201801212130_ppc64el.deb

  - I configured and ran the same test over the weekend and it ran
  cleanly. The KVM host did not crash in the qlt_free_session_done call
  as it did before.

  - So the patch fixed the problem.

  Hi Canonical,

  Please review and consider this a request to pull in commit
  2ce87cc5b269510de9ca1185ca8a6e10ec78c069. Thanks!

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1750441/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
