This bug was fixed in the package linux - 4.15.0-55.60
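
To see whether a given host already runs a kernel at or past this upload, one option is to compare the ABI number in the `uname -r` string against 55. The following is a minimal sketch, assuming release strings shaped like `4.15.0-48-generic` (the form shown in the uname output in the bug description). The helper name `has_fix` is hypothetical, and because the upload number (`.60`) does not appear in `uname -r`, the check is only an approximation.

```python
import re

FIXED_BASE = (4, 15, 0)  # base version of the Bionic GA kernel series
FIXED_ABI = 55           # ABI number of the upload 4.15.0-55.60


def has_fix(release):
    """Return True if a `uname -r` style string is at or past 4.15.0-55.

    Assumption: only 4.15.0-based release strings are handled; anything
    else raises ValueError rather than guessing.
    """
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    major, minor, patch, abi = (int(part) for part in match.groups())
    if (major, minor, patch) != FIXED_BASE:
        raise ValueError("only 4.15.0-based release strings are handled")
    return abi >= FIXED_ABI
```

On a live system the argument could come from `platform.release()`, which returns the same string as `uname -r`.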

---------------
linux (4.15.0-55.60) bionic; urgency=medium

  * linux: 4.15.0-55.60 -proposed tracker (LP: #1834954)

  * Request backport of ceph commits into bionic (LP: #1834235)
    - ceph: use atomic_t for ceph_inode_info::i_shared_gen
    - ceph: define argument structure for handle_cap_grant
    - ceph: flush pending works before shutdown super
    - ceph: send cap releases more aggressively
    - ceph: single workqueue for inode related works
    - ceph: avoid dereferencing invalid pointer during cached readdir
    - ceph: quota: add initial infrastructure to support cephfs quotas
    - ceph: quota: support for ceph.quota.max_files
    - ceph: quota: don't allow cross-quota renames
    - ceph: fix root quota realm check
    - ceph: quota: support for ceph.quota.max_bytes
    - ceph: quota: update MDS when max_bytes is approaching
    - ceph: quota: add counter for snaprealms with quota
    - ceph: avoid iput_final() while holding mutex or in dispatch thread

  * QCA9377 isn't being recognized sometimes (LP: #1757218)
    - SAUCE: USB: Disable USB2 LPM at shutdown

  * hns: fix ICMP6 neighbor solicitation messages discard problem (LP: #1833140)
    - net: hns: fix ICMP6 neighbor solicitation messages discard problem
    - net: hns: fix unsigned comparison to less than zero

  * Fix occasional boot time crash in hns driver (LP: #1833138)
    - net: hns: Fix probabilistic memory overwrite when HNS driver initialized

  * use-after-free in hns_nic_net_xmit_hw (LP: #1833136)
    - net: hns: fix KASAN: use-after-free in hns_nic_net_xmit_hw()

  * hns: attempt to restart autoneg when disabled should report error
    (LP: #1833147)
    - net: hns: Restart autoneg need return failed when autoneg off

  * systemd 237-3ubuntu10.14 ADT test failure on Bionic ppc64el (test-seccomp)
    (LP: #1821625)
    - powerpc: sys_pkey_alloc() and sys_pkey_free() system calls
    - powerpc: sys_pkey_mprotect() system call

  * [UBUNTU] pkey: Indicate old mkvp only if old and curr. mkvp are different
    (LP: #1832625)
    - pkey: Indicate old mkvp only if old and current mkvp are different

  * [UBUNTU] kernel: Fix gcm-aes-s390 wrong scatter-gather list processing
    (LP: #1832623)
    - s390/crypto: fix gcm-aes-s390 selftest failures

  * System crashes on hot adding a core with drmgr command (4.15.0-48-generic)
    (LP: #1833716)
    - powerpc/numa: improve control of topology updates
    - powerpc/numa: document topology_updates_enabled, disable by default

  * Kernel modules generated incorrectly when system is localized to a non-
    English language (LP: #1828084)
    - scripts: override locale from environment when running recordmcount.pl

  * [UBUNTU] kernel: Fix wrong dispatching for control domain CPRBs
    (LP: #1832624)
    - s390/zcrypt: Fix wrong dispatching for control domain CPRBs

  * CVE-2019-11815
    - net: rds: force to destroy connection if t_sock is NULL in
      rds_tcp_kill_sock().

  * Sound device not detected after resume from hibernate (LP: #1826868)
    - drm/i915: Force 2*96 MHz cdclk on glk/cnl when audio power is enabled
    - drm/i915: Save the old CDCLK atomic state
    - drm/i915: Remove redundant store of logical CDCLK state
    - drm/i915: Skip modeset for cdclk changes if possible

  * Handle overflow in proc_get_long of sysctl (LP: #1833935)
    - sysctl: handle overflow in proc_get_long

  * Dell XPS 13 (9370) defaults to s2idle sleep/suspend instead of deep, NVMe
    drains lots of power under s2idle (LP: #1808957)
    - Revert "UBUNTU: SAUCE: pci/nvme: prevent WDC PC SN720 NVMe from entering D3
      and being disabled"
    - Revert "UBUNTU: SAUCE: nvme: add quirk to not call disable function when
      suspending"
    - Revert "UBUNTU: SAUCE: pci: prevent Intel NVMe SSDPEKKF from entering D3"
    - Revert "SAUCE: nvme: add quirk to not call disable function when
      suspending"
    - Revert "SAUCE: pci: prevent sk hynix nvme from entering D3"
    - PCI: PM: Avoid possible suspend-to-idle issue
    - PCI: PM: Skip devices in D0 for suspend-to-idle
    - nvme-pci: Sync queues on reset
    - nvme: Export get and set features
    - nvme-pci: Use host managed power state for suspend

  * linux v4.15 ftbfs on a newer host kernel (e.g. hwe) (LP: #1823429)
    - selinux: use kernel linux/socket.h for genheaders and mdp

  * 32-bit x86 kernel 4.15.0-50 crash in vmalloc_sync_all (LP: #1830433)
    - x86/mm/pat: Disable preemption around __flush_tlb_all()
    - x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()
    - x86/mm: Disable ioremap free page handling on x86-PAE
    - ioremap: Update pgtable free interfaces with addr
    - x86/mm: Add TLB purge to free pmd/pte page interfaces
    - x86/init: fix build with CONFIG_SWAP=n
    - x86/mm: provide pmdp_establish() helper
    - x86/mm: Use WRITE_ONCE() when setting PTEs

  * hinic: fix oops due to race in set_rx_mode (LP: #1832048)
    - hinic: fix a bug in set rx mode

  * ubuntu 18.04 flickering screen with Radeon X1600 (LP: #1791312)
    - drm/radeon: prefer lower reference dividers

  * Login screen never appears on vmwgfx using bionic kernel 4.15 (LP: #1832138)
    - drm/vmwgfx: use monotonic event timestamps

  * [linux-azure] Block Layer Commits Requested in Azure Kernels (LP: #1834499)
    - block: Clear kernel memory before copying to user
    - block/bio: Do not zero user pages

  * CONFIG_LOG_BUF_SHIFT set to 14 is too low on arm64 (LP: #1824864)
    - [Config] CONFIG_LOG_BUF_SHIFT=18 on all 64bit arches

  * Handle overflow for file-max (LP: #1834310)
    - sysctl: handle overflow for file-max
    - kernel/sysctl.c: fix out-of-bounds access when setting file-max

  * [ALSA] [PATCH] Headset fixup for System76 Gazelle (gaze14) (LP: #1827555)
    - ALSA: hda/realtek - Headset fixup for System76 Gazelle (gaze14)
    - ALSA: hda/realtek - Corrected fixup for System76 Gazelle (gaze14)

  * crashdump fails on HiSilicon D06 (LP: #1828868)
    - iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump
      kernel
    - iommu/arm-smmu-v3: Don't disable SMMU in kdump kernel

  * CVE-2019-11833
    - ext4: zero out the unused memory region in the extent tree block

  * zfs 0.7.9 fixes a bug (https://github.com/zfsonlinux/zfs/pull/7343) that
    hangs the system completely (LP: #1772412)
    - SAUCE: (noup) Update zfs to 0.7.5-1ubuntu16.6

  * does not detect headphone when there is no other output devices
    (LP: #1831065)
    - ALSA: hda/realtek - Fixed hp_pin no value
    - ALSA: hda/realtek - Use a common helper for hp pin reference

  * kernel crash : net_sched  race condition in tcindex_destroy() (LP: #1825942)
    - net_sched: fix NULL pointer dereference when delete tcindex filter
    - RCU, workqueue: Implement rcu_work
    - net_sched: switch to rcu_work
    - net_sched: fix a race condition in tcindex_destroy()
    - net_sched: fix a memory leak in cls_tcindex
    - net_sched: initialize net pointer inside tcf_exts_init()
    - net_sched: fix two more memory leaks in cls_tcindex

  * Support new ums-realtek device (LP: #1831840)
    - USB: usb-storage: Add new ID to ums-realtek

  * amd_iommu possible data corruption (LP: #1823037)
    - iommu/amd: Reserve exclusion range in iova-domain
    - iommu/amd: Set exclusion range correctly

  * Add new sound card PCIID into the alsa driver (LP: #1832299)
    - ALSA: hda: Add Icelake PCI ID
    - ALSA: hda/intel: add CometLake PCI IDs

  * sky2 ethernet card doesn't work after returning from suspend
    (LP: #1807259) // sky2 ethernet card link not up after suspend
    (LP: #1809843)
    - sky2: Disable MSI on Dell Inspiron 1545 and Gateway P-79

  * idle-page oopses when accessing page frames that are out of range
    (LP: #1833410)
    - mm/page_idle.c: fix oops because end_pfn is larger than max_pfn

  * Add pointstick support on HP ZBook 17 G5 (LP: #1833387)
    - Revert "HID: multitouch: Support ALPS PTP stick with pid 0x120A"
    - SAUCE: HID: multitouch: Add pointstick support for ALPS Touchpad

  * [SRU][B/B-OEM/B-OEM-OSP-1/C/D/E] Add trackpoint middle button support of 2
    new ThinkPads (LP: #1833637)
    - Input: elantech - enable middle button support on 2 ThinkPads

  * CVE-2019-11085
    - drm/i915/gvt: Fix mmap range check
    - drm/i915: make mappable struct resource centric
    - drm/i915/gvt: Fix aperture read/write emulation when enable x-no-mmap=on

  * CVE-2019-11884
    - Bluetooth: hidp: fix buffer overflow

  * af_alg06 test from crypto test suite in LTP failed with kernel oops on B/C
    (LP: #1829725)
    - crypto: authenc - fix parsing key with misaligned rta_len

  * CVE-2018-12126 // CVE-2018-12127 // CVE-2018-12130 // CVE-2019-11091
    - SAUCE: Synchronize MDS mitigations with upstream
    - Documentation: Correct the possible MDS sysfs values
    - x86/speculation/mds: Fix documentation typo

  * CVE-2019-11091
    - x86/mds: Add MDSUM variant to the MDS documentation

  * alignment test in powerpc from ubuntu_kernel_selftests failed on B/C Power9
    (LP: #1813118)
    - selftests/powerpc: Remove Power9 copy_unaligned test

  * TRACE_syscall.ptrace_syscall_dropped in seccomp from ubuntu_kernel_selftests
    failed on B/C PowerPC (LP: #1812796)
    - selftests/seccomp: Enhance per-arch ptrace syscall skip tests

  * Add powerpc/alignment_handler test for selftests (LP: #1828935)
    - selftests/powerpc: Add alignment handler selftest
    - selftests/powerpc: Fix to use ucontext_t instead of struct ucontext

  * Cannot build kernel 4.15.0-48.51 due to an in-source-tree ZFS module.
    (LP: #1828763)
    - SAUCE: (noup) Update zfs to 0.7.5-1ubuntu16.5

  * Electrical noise occurred when external headset enters powersaving mode on
    a Dell machine (LP: #1828798)
    - ALSA: hda/realtek - Reduce click noise on Dell Precision 5820 headphone
    - ALSA: hda/realtek - Fixup headphone noise via runtime suspend

  * [18.04/18.10] File libperf-jvmti.so is missing in linux-tools-common deb on
    Ubuntu (LP: #1761379)
    - [Packaging] Support building libperf-jvmti.so

  * TCP : race condition on socket ownership in tcp_close() (LP: #1830813)
    - tcp: do not release socket ownership in tcp_close()

  * bionic: netlink: potential shift overflow in netlink_bind() (LP: #1831103)
    - netlink: Don't shift on 64 for ngroups

  * Add support to Comet Lake LPSS (LP: #1830175)
    - mfd: intel-lpss: Add Intel Comet Lake PCI IDs

  * Reduce NAPI weight in hns driver from 256 to 64 (LP: #1830587)
    - net: hns: Use NAPI_POLL_WEIGHT for hns driver

  * x86: add support for AMD Rome (LP: #1819485)
    - x86: irq_remapping: Move irq remapping mode enum
    - iommu/amd: Add support for higher 64-bit IOMMU Control Register
    - iommu/amd: Add support for IOMMU XT mode
    - hwmon/k10temp, x86/amd_nb: Consolidate shared device IDs
    - hwmon/k10temp: Add support for AMD family 17h, model 30h CPUs
    - x86/amd_nb: Add PCI device IDs for family 17h, model 30h
    - x86/MCE/AMD: Fix the thresholding machinery initialization order
    - x86/amd_nb: Add support for newer PCI topologies

  * nx842 - CRB request time out (-110) when uninstall NX modules and initiate
    NX request (LP: #1827755)
    - crypto/nx: Initialize 842 high and normal RxFIFO control registers

  * Require improved hypervisor detection patch in Ubuntu 18.04 (LP: #1829972)
    - s390/early: improve machine detection

 -- Kleber Sacilotto de Souza <kleber.so...@canonical.com>  Tue, 02 Jul 2019 18:41:49 +0200

** Changed in: linux (Ubuntu Bionic)
       Status: Fix Committed => Fix Released

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2018-12126

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2018-12127

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2018-12130

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2019-11085

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2019-11091

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2019-11815

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2019-11833

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2019-11884

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1833716

Title:
  System crashes on hot adding a core with drmgr command
  (4.15.0-48-generic)

Status in The Ubuntu-power-systems project:
  Fix Committed
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Bionic:
  Fix Released

Bug description:
  [Impact]
  On the Bionic GA kernel (4.15.0), hot-adding a CPU with drmgr crashes the
  kernel. The patches identified to fix this disable changing the NUMA
  associations for CPUs and memory at runtime by default.

  [Test]
  # drmgr -c cpu -r -q 1
  # drmgr -c cpu -a -q 1
  Test kernel available in ppa:ubuntu-power-triage/lp1833716
  Please see comment #2 for before and after results with the patches applied.
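
The remove-then-add cycle in the [Test] section can be scripted for repeated runs. This is a minimal sketch, assuming `drmgr` is on the PATH of a pSeries LPAR and that the caller has root privileges; the function names and the `dry_run` switch are hypothetical, and only the `drmgr` options quoted in this report are used.

```python
import subprocess


def hotplug_cycle_commands(quantity=1):
    """Build the two drmgr invocations from the [Test] section.

    The options are copied from this report; `quantity` fills in the
    value passed to -q.
    """
    if quantity < 1:
        raise ValueError("quantity must be at least 1")
    q = str(quantity)
    return [
        ["drmgr", "-c", "cpu", "-r", "-q", q],  # hot remove
        ["drmgr", "-c", "cpu", "-a", "-q", q],  # hot add
    ]


def run_hotplug_cycle(quantity=1, dry_run=True):
    """Print the commands (dry run) or execute them in order."""
    for cmd in hotplug_cycle_commands(quantity):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the cycle if drmgr exits non-zero.
            subprocess.run(cmd, check=True)
```

Leaving `dry_run=True` as the default keeps the sketch harmless on machines where a hot add could trigger the crash described in this report.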

  [Fix]
  558f86493df0 powerpc/numa: document topology_updates_enabled, disable by default
  2d4d9b308f8f powerpc/numa: improve control of topology updates

  [Regression Potential]
  The two patches touch only powerpc/numa and do not affect other
  architectures or platform code. Regression potential is low.

  [Other Information]
  == Comment: #0 - Hari Krishna Bathini <hbath...@in.ibm.com> - 2019-05-07 13:18:35 ==
  ---Problem Description---
  On the 4.15.0-48-generic kernel, hot-adding a CPU with drmgr crashes the
  kernel with the traces below:

  ---
  root@ubuntu:~# drmgr -c cpu -r -q 1
  Validating CPU DLPAR capability...yes.
  CPU 9
  root@ubuntu:~#
  root@ubuntu:~#
  root@ubuntu:~#
  root@ubuntu:~#
  root@ubuntu:~#
  root@ubuntu:~# drmgr -c cpu -a -q 1
  Validating CPU DLPAR capability...yes.
  [  218.555493] BUG: arch topology borken
  [  218.555503]      the DIE domain not a subset of the NODE domain
  [  218.555512] BUG: arch topology borken
  [  218.555516]      the DIE domain not a subset of the NODE domain
  [  218.555523] BUG: arch topology borken
  [  218.555528]      the DIE domain not a subset of the NODE domain
  [  218.555535] BUG: arch topology borken
  [  218.555539]      the DIE domain not a subset of the NODE domain
  [  218.555545] BUG: arch topology borken
  [  218.555550]      the DIE domain not a subset of the NODE domain
  [  218.555556] BUG: arch topology borken
  [  218.555560]      the DIE domain not a subset of the NODE domain
  [  218.555567] BUG: arch topology borken
  [  218.555571]      the DIE domain not a subset of the NODE domain
  [  218.555577] BUG: arch topology borken
  [  218.555581]      the DIE domain not a subset of the NODE domain
  [  218.555672] Unable to handle kernel paging request for data at address 0x9332ae80f961139f
  [  218.555679] Faulting instruction address: 0xc0000000001768cc
  [  218.555686] Oops: Kernel access of bad area, sig: 11 [#1]
  [  218.555691] LE SMP NR_CPUS=2048 NUMA pSeries
  [  218.555699] Modules linked in: vmx_crypto crct10dif_vpmsum sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ibmvscsi ibmveth crc32c_vpmsum
  [  218.555745] CPU: 8 PID: 276 Comm: kworker/8:1 Not tainted 4.15.0-48-generic #51-Ubuntu
  [  218.555757] Workqueue: events cpuset_hotplug_workfn
  [  218.555763] NIP:  c0000000001768cc LR: c0000000001769a8 CTR: 0000000000000000
  [  218.555770] REGS: c0000001f5f1f530 TRAP: 0380   Not tainted  (4.15.0-48-generic)
  [  218.555776] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22824228  XER: 00000004
  [  218.555789] CFAR: c000000000176920 SOFTE: 1
  [  218.555789] GPR00: c0000000001769a8 c0000001f5f1f7b0 c0000000016eb400 c0000001f7bfd200
  [  218.555789] GPR04: 0000000000000001 0000000000000000 0000000000000008 0000000000000010
  [  218.555789] GPR08: 0000000000000018 ffffffffffffffff c0000001f7bfd408 0000000000000000
  [  218.555789] GPR12: 0000000000008000 c000000007a35800 0000000000000007 c0000001f549d900
  [  218.555789] GPR16: 0000000000000040 c000000001722494 c0000001f0f29400 0000000000000001
  [  218.555789] GPR20: c0000001ffb68580 0000000000000008 c0000000011d8580 c00000000171dd78
  [  218.555789] GPR24: 0000000000000000 ffffffffffffe830 ffffffffffffec30 00000000000012af
  [  218.555789] GPR28: 000000000000102f c0000001f7bfd200 9332ae80f961139f 9332ae80f961139f
  [  218.555859] NIP [c0000000001768cc] free_sched_groups.part.2+0x4c/0xf0
  [  218.555866] LR [c0000000001769a8] destroy_sched_domain+0x38/0xc0
  [  218.555871] Call Trace:
  [  218.555875] [c0000001f5f1f7b0] [ffffffffffffec30] 0xffffffffffffec30 (unreliable)
  [  218.555884] [c0000001f5f1f7f0] [c0000000001769a8] destroy_sched_domain+0x38/0xc0
  [  218.555892] [c0000001f5f1f820] [c000000000176eb0] cpu_attach_domain+0xf0/0x870
  [  218.555900] [c0000001f5f1f960] [c000000000178884] build_sched_domains+0x1254/0x12f0
  [  218.555908] [c0000001f5f1fa90] [c000000000179a70] partition_sched_domains+0x2d0/0x410
  [  218.555916] [c0000001f5f1fb20] [c0000000001ffb60] rebuild_sched_domains_locked+0x60/0x80
  [  218.555924] [c0000001f5f1fb50] [c000000000202e68] rebuild_sched_domains+0x38/0x60
  [  218.555932] [c0000001f5f1fb80] [c000000000202fc8] cpuset_hotplug_workfn+0x138/0xb60
  [  218.555941] [c0000001f5f1fc90] [c000000000135858] process_one_work+0x298/0x5a0
  [  218.555949] [c0000001f5f1fd20] [c000000000135bf8] worker_thread+0x98/0x630
  [  218.555956] [c0000001f5f1fdc0] [c00000000013e7e8] kthread+0x1a8/0x1b0
  [  218.555964] [c0000001f5f1fe30] [c00000000000b658] ret_from_kernel_thread+0x5c/0x84
  [  218.555971] Instruction dump:
  [  218.555975] 7d908026 fbe1fff8 91810008 f8010010 f821ffc1 7c7d1b78 2e240000 7c7f1b78
  [  218.555985] 48000010 7fbee840 7fdff378 419e0074 <ebdf0000> 4192002c 7c0004ac e95f0010
  [  218.555997] ---[ end trace 1d7b9b38e50835a4 ]---
  ---

  ---uname output---
  Linux ubuntu 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 08:26:19 UTC 2019 ppc64le ppc64le ppc64le GNU/Linux

  Machine Type = na

  ---Debugger---
  A debugger is not configured

  ---Steps to Reproduce---
  1. Install a 4.15 kernel (4.15.0-48-generic)
  2. Hot remove a core: drmgr -c cpu -r -q 1
  3. Hot add a core: drmgr -c cpu -a -q 1

  Actual Result:
  System crashes after "drmgr -c cpu -a -q 1" command is issued

  Expected result:
  Hot add succeeds without any crash

  == Comment: #20 - SEETEENA THOUFEEK <sthou...@in.ibm.com> - 2019-06-20 07:00:39 ==
  Please integrate these two patches

  1.
  https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?h=next&id=2d4d9b308f8f8dec68f6dbbff18c68ec7c6bd26f

  powerpc/numa: improve control of topology updates

  When booted with "topology_updates=no", or when "off" is written to
  /proc/powerpc/topology_updates, NUMA reassignments are inhibited for
  PRRN and VPHN events. However, migration and suspend unconditionally
  re-enable reassignments via start_topology_update(). This is
  incoherent.

  Check the topology_updates_enabled flag in
  start/stop_topology_update() so that callers of those APIs need not be
  aware of whether reassignments are enabled. This allows the
  administrative decision on reassignments to remain in force across
  migrations and suspensions.

  Signed-off-by: Nathan Lynch <nath...@linux.ibm.com>
  Signed-off-by: Michael Ellerman <m...@ellerman.id.au>
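
The gating described in this commit message can be pictured with a small toy model. This is only a sketch of the idea, written in Python rather than kernel C; the class and its attributes are hypothetical, and the only names taken from the commit message are `topology_updates_enabled`, `start_topology_update()` and `stop_topology_update()`.

```python
class TopologyUpdateModel:
    """Toy model of the flag check described in the commit message."""

    def __init__(self, topology_updates_enabled=False):
        # Administrative switch (boot option or /proc write in the report).
        self.topology_updates_enabled = topology_updates_enabled
        # Whether NUMA reassignment handling is currently active.
        self.active = False

    def start_topology_update(self):
        # The check lives inside the API, so callers need not know the policy.
        if not self.topology_updates_enabled:
            return False
        self.active = True
        return True

    def stop_topology_update(self):
        if not self.topology_updates_enabled:
            return False
        self.active = False
        return True


def resume_after_migration(model):
    """Stand-in for a migration/suspend path that always calls start."""
    model.start_topology_update()
    return model.active
```

With the switch off, `resume_after_migration()` leaves reassignments inactive even though the caller invokes `start_topology_update()` unconditionally, which is the behavior the commit message asks for.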

  2.
  https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?h=next&id=558f86493df09f68f79fe056d9028d317a3ce8ab

  powerpc/numa: document topology_updates_enabled, disable by default

  Changing the NUMA associations for CPUs and memory at runtime is
  basically unsupported by the core mm, scheduler etc. We see all manner
  of crashes, warnings and instability when the pseries code tries to do
  this. Disable this behavior by default, and document the switch a bit.

  Signed-off-by: Nathan Lynch <nath...@linux.ibm.com>
  Signed-off-by: Michael Ellerman <m...@ellerman.id.au>

  Thanks in advance for your support.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1833716/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
