** Changed in: ubuntu-z-systems
       Status: Triaged => In Progress

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1903682

Title:
  NULL pointer dereference when configuring multi-function with devfn !=
  0 before devfn == 0

Status in Ubuntu on IBM z Systems:
  In Progress
Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Focal:
  Fix Committed
Status in linux source package in Groovy:
  Triaged
Status in linux source package in Hirsute:
  Triaged

Bug description:
  SRU Justification:
  ==================

  [Impact]

  * While handling multifunction devices in zPCI the UID of the PCI
  function with function number 0 (that always exists according to the
  PCI spec) is taken as domain number.

  * Therefore, if functions with a function number larger than 0 are
  hot plugged before function 0, they need to be held in standby
  before creating the domain and bus.

  * This has been tested during development of this feature using a
  patched QEMU and in DPM, but unfortunately never in
  classic/traditional HMC mode.

  * On a classic/traditional mode machine with a multi-function device,
  after hot plugging ("Reassign I/O Path") only the FID of the second
  port to the LPAR, any additional hotplug (and even just deconfiguring
  a PCI device) will hang, and a hotplug makes the entire Linux
  instance unresponsive.

  * The reason for this is a NULL pointer dereference when configuring
  a multi-function device with devfn != 0 before devfn == 0.

  * This issue was introduced with the topology-aware PCI enumeration
  code.

  [Fix]

  * 0b2ca2c7d0c9e2731d01b6c862375d44a7e13923 0b2ca2c7d0c9 "s390/pci: fix
  hot-plug of PCI function missing bus"

  [Test Case]

  * IBM Z or LinuxONE hardware, equipped with hot-pluggable, multi-
  functional PCIe cards (like for example RoCE Express 2 adapters) in
  classic/traditional mode.

  * An Ubuntu OS running in LPAR, that comes with a kernel that includes
  the topology-aware PCI enumeration code (like for example 20.04.1 w/o
  further updates or 20.10 GA kernel).

  * Now on a system that is in classic/traditional mode, hot plug
  ("Reassign I/O Path") a multi-function device, but using the FID of
  the second port.

  [Regression Potential]

  * There is at least some regression risk, but I consider it as low,
  because:

  * Even if the modification is a single if statement (that spans two
  lines) in 'zpci_event_availability', it could harm the zPCI event
  management even more. In the worst case it could break hot plug not
  only for systems in classic/traditional mode, but also in DPM mode
  (making the system hang) or for all ports.

  * In such a case no enabling / disabling of devices would be possible.

  * But the fix is very simple and straightforward: it checks
  zdev->zbus->bus for being NULL and in that case bails out of the
  event handling instead of calling the PCI common code
  pci_scan_single_device() with the NULL pointer.

  * PCIe devices are usually more optional devices on s390x (compared to
  CCW and OSA devices for network) and this affects the zPCI subsystem
  only, which is unique to s390x.

  [Other]

  * The patch was accepted upstream in kernel v5.10-rc3, hence it will
  land sooner or later in Hirsute.

  * The patch has also been tagged for the upstream stable v5.8 series,
  hence will land in Groovy (based on the kernel team's regular 'Groovy
  update: v5.8.x upstream stable release' LP bug).

  * Hence requesting this Kernel SRU for Focal only, since Ubuntu
  releases older than Focal do not have the topology-aware zPCI
  enumeration code.

  __________

  Background:

  When handling multifunction devices in zPCI we take the
  UID of the PCI function with function number 0
  (that always exists according to the PCI spec)
  as domain number.
  Therefore when hot plugging functions with function
  number larger than 0 before function 0, we need
  to hold these in standby before creating the
  domain and bus.
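
  The rule above can be illustrated with a small user-space model.
  This is only a sketch: struct model_bus, model_add_function() and
  model_function_usable() are hypothetical names invented for this
  example and are not the kernel's zPCI data structures or functions.
  It assumes conventional PCI function numbers 0..7.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_FN 8  /* conventional PCI function numbers are 0..7 */

/* Hypothetical user-space model, not the kernel's zPCI bus structure. */
struct model_bus {
    bool     has_domain;            /* set once function 0 has been seen */
    uint32_t domain;                /* UID of function 0 */
    bool     present[MODEL_MAX_FN]; /* functions known so far */
};

/* Register a function; returns true if the domain is known afterwards. */
static bool model_add_function(struct model_bus *b, unsigned devfn, uint32_t uid)
{
    assert(b != NULL && devfn < MODEL_MAX_FN);
    b->present[devfn] = true;
    if (devfn == 0) {
        /* Only function 0 provides the domain number. */
        b->domain = uid;
        b->has_domain = true;
    }
    return b->has_domain;
}

/* A function is usable only when it is present and the domain is known. */
static bool model_function_usable(const struct model_bus *b, unsigned devfn)
{
    assert(b != NULL && devfn < MODEL_MAX_FN);
    return b->has_domain && b->present[devfn];
}
```

  In this model, adding function 1 first leaves it present but not
  usable. It becomes usable only after function 0 has been added and
  its UID has been taken as the domain number.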

  This has been tested during feature development
  using a patched QEMU and with DPM but never in Classic
  Mode.

  Reproduction:

  This issue was introduced with the Topology aware PCI
  Enumeration code so test with a Linux supporting
  that feature. E.g. Upstream, Devel Driver etc.

  On a Classic Mode machine with a multi-function device,
  hot plug ("Reassign I/O Path") only the FID of the
  second port to the LPAR.

  Symptom:

  After this any additional hotplug and even just
  deconfiguring a PCI device will hang. A hotplug
  makes the entire Linux instance unresponsive.

  Analysis:

  The problem occurs in Classic Mode but did not show up in
  previous testing, because the LPAR hypervisor performs
  hot plug/Reassign I/O Path as a two-step process:

  1. zPCI event with PEC 0x0302 to plug the zPCI function in Standby
  2. zPCI event with PEC 0x0301 to configure the zPCI function

  For the first event we create the zdev in clp_add_pci_device()
  in Standby which is all fine so far.
  The problem then occurs in step 2 as we then find
  the existing zdev and try to configure it.
  This however does not work as the PCI bus
  is not yet created (as we still don't know the UID of
  function 0 that will become its domain).
  The bus pointer zdev->zbus->bus is thus still
  NULL but will be accessed by common code, which
  inevitably results in disaster, including
  the above-mentioned hang and (possibly) the
  RCU stall below:

  [  689.724703] rcu: INFO: rcu_sched self-detected stall on CPU
  [  689.724712] rcu:     16-....: (42004 ticks this GP) idle=6ee/1/0x4000000000000002 softirq=1234/1234 fqs=14001
  [  689.724742]  (t=42006 jiffies g=89 q=3770)
  [  689.724743] Task dump for CPU 16:
  [  689.724745] task:kmcheck         state:R  running task     stack:    0 pid:  205 ppid:     2 flags:0x00000004
  [  689.724747] Call Trace:
  [  689.724757]  [<0000000ccde0b5c4>] show_stack+0x8c/0xd8
  [  689.724762]  [<0000000ccd0dabc4>] sched_show_task.part.0+0xe4/0x110
  [  689.724764]  [<0000000ccde0ea5e>] rcu_dump_cpu_stacks+0xde/0x120
  [  689.724767]  [<0000000ccd1465c6>] print_cpu_stall+0x266/0x330
  [  689.724768]  [<0000000ccd14a428>] rcu_sched_clock_irq+0x618/0x670
  [  689.724771]  [<0000000ccd15cd7a>] update_process_times+0xba/0xf0
  [  689.724775]  [<0000000ccd1766fa>] tick_sched_timer+0x9a/0x220
  [  689.724777]  [<0000000ccd15d962>] __hrtimer_run_queues+0x182/0x3a0
  [  689.724779]  [<0000000ccd1602f8>] hrtimer_interrupt+0x138/0x450
  [  689.724782]  [<0000000ccd0451c0>] do_IRQ+0x90/0xa0
  [  689.724784]  [<0000000ccde2be96>] ext_int_handler+0x17e/0x184
  [  689.724790]  [<0000000ccd9f373e>] pci_get_slot+0x5e/0xa0
  [  689.724794]  [<0000000ccd9dc182>] pci_scan_single_device+0x32/0x2a0
  [  689.724797]  [<0000000ccd0868f2>] __zpci_event_availability+0x192/0x360
  [  689.724800]  [<0000000ccdd40c16>] chsc_process_crw+0x2e6/0x300
  [  689.724802]  [<0000000ccdd4b088>] crw_collect_info+0x2b8/0x320
  [  689.724804]  [<0000000ccd0caf3a>] kthread+0x14a/0x170
  [  689.724805]  [<0000000ccde2b814>] ret_from_fork+0x24/0x2c

  The fix is very simple, we check zdev->zbus->bus
  for being NULL and in that case bail from the
  case 0x0301 before calling the PCI common code
  pci_scan_single_device() with the NULL pointer.
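
  The shape of that check can be sketched in plain user-space C.
  This is a simplified model and not the upstream patch: all model_*
  types and functions are hypothetical stand-ins, and
  model_scan_single_device() only imitates the fact that the common
  code dereferences the bus pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the structures named in this report. */
struct model_pci_bus { int scanned; };
struct model_zbus    { struct model_pci_bus *bus; }; /* NULL until function 0 is known */
struct model_zdev    { struct model_zbus *zbus; bool configured; };

/* Stand-in for the common-code scan; it dereferences the bus pointer. */
static void model_scan_single_device(struct model_pci_bus *bus)
{
    assert(bus != NULL); /* the model aborts here; the kernel hung instead */
    bus->scanned++;
}

/* Model of the 0x0301 ("configure") case. Returns true if the bus was scanned. */
static bool model_event_configure(struct model_zdev *zdev)
{
    assert(zdev != NULL && zdev->zbus != NULL);
    zdev->configured = true;
    if (!zdev->zbus->bus)
        return false; /* no bus yet: bail out before the common-code scan */
    model_scan_single_device(zdev->zbus->bus);
    return true;
}
```

  With bus still NULL, model_event_configure() marks the function as
  configured and returns without scanning. Once a bus has been
  attached, the same call reaches the scan.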

  The only subtlety is that we still need to
  do the zpci_enable_device() because the
  code in arch/s390/pci/pci_bus.c assumes
  that it can immediately do a scan of
  all devfn != 0 PCI functions once
  PCI function 0 is found.

  It thereby mimics what happens
  when we only find the FID for a function with
  devfn != 0 in the CLP List PCI Functions.
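
  Why the enable step must not be skipped can be shown with another
  small model. Again this is only a sketch under assumptions: the
  model_* names are hypothetical, and the loop merely imitates the
  described behaviour that all devfn != 0 functions are scanned as
  soon as function 0 is found.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_FN 8  /* conventional PCI function numbers are 0..7 */

/* Hypothetical model of one multi-function device. */
struct model_fn  { bool present; bool enabled; bool scanned; };
struct model_dev { bool bus_created; struct model_fn fn[MODEL_MAX_FN]; };

/* Configure a function: always enable it, scan only if the bus exists. */
static void model_configure(struct model_dev *d, unsigned devfn)
{
    assert(d != NULL && devfn < MODEL_MAX_FN);
    d->fn[devfn].present = true;
    d->fn[devfn].enabled = true; /* done even when we bail out below */

    if (devfn == 0) {
        d->bus_created = true;
        /* Function 0 found: scan every function already present.
         * The scan relies on those functions being enabled. */
        for (unsigned i = 0; i < MODEL_MAX_FN; i++) {
            if (d->fn[i].present) {
                assert(d->fn[i].enabled);
                d->fn[i].scanned = true;
            }
        }
        return;
    }

    if (!d->bus_created)
        return; /* held back until function 0 appears */
    d->fn[devfn].scanned = true;
}
```

  Configuring function 1 first leaves it enabled but unscanned.
  Configuring function 0 afterwards creates the bus and scans both
  functions, and any later function is scanned immediately.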

  This is implemented in the following upstream
  commit:

  0b2ca2c7d0c9e2731d01b6c862375d44a7e13923 s390/pci: fix hot-plug of PCI
  function missing bus

  It is included in v5.10-rc3 and has been tagged for
  stable > v5.8 i.e. all upstream versions with
  the PCI enumeration changes.
  Also it carries the appropriate Fixes tag.

  I have verified that it cherry-picks cleanly
  on current focal master-next and expect
  it to cleanly cherry-pick on newer Ubuntu
  Kernels too.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1903682/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
