Since cosmic is already on kernel 4.18.0.8.9, which includes the mentioned
patches:
$ git log --oneline | grep "Check if IOMMU page is contained in the pinned physical page"
76fa497 KVM: PPC: Check if IOMMU page is contained in the pinned physical page
fheimes@T570:~/ubuntu-cosmic$ git tag --contains 76fa497
Ubuntu-4.18.0-7.8
Ubuntu-4.18.0-8.9
Ubuntu-4.18.0-9.10
v4.18
$ git log --oneline | grep "powerpc/mm/radix: Change pte relax sequence to handle nest MMU hang"
bd5050e powerpc/mm/radix: Change pte relax sequence to handle nest MMU hang
$ git tag --contains bd5050e
Ubuntu-4.18.0-7.8
Ubuntu-4.18.0-8.9
Ubuntu-4.18.0-9.10
v4.18
$ git log --oneline | grep "powerpc/mm: Change function prototype"
e4c1112 powerpc/mm: Change function prototype
$ git tag --contains e4c1112
Ubuntu-4.18.0-7.8
Ubuntu-4.18.0-8.9
Ubuntu-4.18.0-9.10
v4.18
$ git log --oneline | grep "powerpc/mm/hugetlb: Update huge_ptep_set_access_flags to call __ptep_set_access_flags directly"
f069ff3 powerpc/mm/hugetlb: Update huge_ptep_set_access_flags to call __ptep_set_access_flags directly
$ git tag --contains f069ff3
Ubuntu-4.18.0-7.8
Ubuntu-4.18.0-8.9
Ubuntu-4.18.0-9.10
v4.18
this can be set to Fix Released for cosmic, too.
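
The per-commit checks above could also be run in one pass. The following is
only a sketch: the function names tag_listed and check_commits are
hypothetical helpers, and it assumes the current directory is a clone of the
cosmic kernel tree. It relies on nothing beyond `git tag --contains` and
`grep`.

```shell
#!/bin/sh
# tag_listed (hypothetical helper): read a tag list on stdin, one tag per
# line as printed by `git tag --contains <commit>`, and succeed only if the
# tag given as $1 appears as an exact, whole-line match.
tag_listed() {
    grep -Fqx -- "$1"
}

# check_commits (hypothetical helper): for each commit ID after the first
# argument, report whether the release tag in $1 contains it.
# Assumption: run from inside a clone of the kernel tree.
check_commits() {
    tag=$1
    shift
    for commit in "$@"; do
        if git tag --contains "$commit" 2>/dev/null | tag_listed "$tag"; then
            echo "$commit: in $tag"
        else
            echo "$commit: NOT in $tag"
        fi
    done
}
```

With the four commit IDs from the bug description, this would be called as
`check_commits Ubuntu-4.18.0-8.9 bd5050e e4c1112 044003b f069ff3`. The exact
match (-x) keeps a tag such as Ubuntu-4.18.0-8.9 from matching a longer tag
name that merely starts with the same string.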

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1789772

Title:
  tlbie master timeout checkstop (using NVidia/GPU)

Status in The Ubuntu-power-systems project:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Released

Bug description:
  A hung state machine in the chip's NMU logic can trigger a fatal
  condition that will be flagged by hardware through a checkstop. Hence,
  customers that have a Power 9 Witherspoon (equipped with GPUs) will
  experience a crash on their server when using NVIDIA's toolkit.

  The server will crash with the following hardware failing message:
  Unrecoverable Hardware Failure, (Critical) A system checkstop occurred (AffectedSubsystem: Canister/Appliance, PID: 19703), Resolved: 0

  In this case, a `NCUFIR[10] tlbie master timeout` was observed merely
  by starting the NVIDIA ATS driver. The issue is triggered because the
  NMU logic gets stuck when a page is upgraded from RO -> RW without a
  following tlbie.

  This is addressed with the following patches:
  bd5050e38aec3055ff4257ade987d808ac93b582 powerpc/mm/radix: Change pte relax sequence to handle nest MMU hang
  e4c1112c3fc503fc78379fa61450bfda3f0717fe powerpc/mm: Change function prototype
  044003b52a78bcbda7103633c351da16505096cf powerpc/mm/radix: Move function from radix.h to pgtable-radix.c
  f069ff396d657ac7bdb5de866c3ec28b8d08d953 powerpc/mm/hugetlb: Update huge_ptep_set_access_flags to call __ptep_set_access_flags directly

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1789772/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
