[Kernel-packages] [Bug 1975582] Re: [UBUNTU 20.04] rcu stalls with many storage key guests

2022-07-11 Thread Frank Heimes
Since Impish will reach its EOL this week, on July 14th,
I'll change Impish here to Won't Fix as well;
that allows this bug to be closed entirely.

** Changed in: linux (Ubuntu Impish)
   Status: Fix Committed => Won't Fix

** Changed in: ubuntu-z-systems
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1975582

Title:
  [UBUNTU 20.04] rcu stalls with many storage key guests

Status in Ubuntu on IBM z Systems:
  Fix Released
Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Focal:
  Fix Released
Status in linux source package in Impish:
  Won't Fix
Status in linux source package in Jammy:
  Fix Released

Bug description:
  SRU Justification:
  ==

  [Impact]

   * Ubuntu on s390x KVM environments with lots of large guests with storage
 keys can be affected by rcu stalls.

   * These rcu stalls can cause the system to crash/dump.

  [Fix]

   * 3ae11dbcfac9 3ae11dbcfac906a8c3a480e98660a823130dc16a "s390/mm: use
  non-quiescing sske for KVM switch to keyed guest"

   * 6d5946274df1 6d5946274df1fff539a7eece458a43be733d1db8 "s390/gmap:
  voluntarily schedule during key setting"

  [Test Plan]

   * There is no trigger, direct test, or way to re-create the
 problem situation, but...

   * an IBM z13 or LinuxONE (or newer) LPAR is needed that
 runs Ubuntu Server 20.04 LTS or 18.04 LTS with HWE kernel
 and acts as KVM host, again with several large guests running
 on top with storage keys.

   * Let such a system run for days under significant load
 and watch the logs for rcu issues.

   * Prior to the submission of this SRU, patched test kernels
 for focal 5.4 and bionic hwe-5.4 were created and tested.
 They ran for days in a staging environment at IBM
 without further issues.

   * The modifications are all limited to s390x.

   * A test kernel was built (see below) that ran in a test environment
 at IBM under appropriate load for several days.
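
  The "watch the logs for rcu issues" step could be automated roughly as
  follows. This is only a sketch: the function name and the pattern are
  illustrative assumptions, based on the stall message quoted in the
  problem description of this bug, and other kernels may word the
  message differently.

```python
import re
from typing import Iterable, List

# Assumption: stall reports contain a line such as
# "rcu: INFO: rcu_sched self-detected stall on CPU" (as quoted in this bug).
# The pattern also tolerates the "detected stalls on" wording.
STALL_RE = re.compile(r"rcu: INFO: \S+ (?:self-)?detected stalls? on")


def find_rcu_stalls(lines: Iterable[str]) -> List[str]:
    """Return the log lines that look like RCU stall reports."""
    return [line.rstrip("\n") for line in lines if STALL_RE.search(line)]
```

  The function accepts any iterable of log lines, for example an open
  file holding saved kernel log output; how the log is collected on the
  KVM host is left open here.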

  [Where problems could occur]

   * Due to the change for the KVM switch to keyed guest
 from classic sske to non-quiescing sske
 the KVM behaviour might have changed and the storage keys harmed.

   * The now more generous scheduling while setting keys
 has an impact on the guest memory management and mapping,
 which can lead to different performance.

   * This, with the introduction of __s390_enable_skey_pmd and
 cond_resched, might increase the overhead in certain situations,
 but eventually improves the responsiveness over time,
 and hence avoids rcu stalls.

  [Other Info]
   
   * Since the patches are upstream in 5.19-rc1,
 they will be included in the kernel that is planned for kinetic (5.19).

   * Hence this is an SRU to jammy, impish and focal.

  __

  ---Problem Description---
  There can be rcu stalls when running lots of large guests with storage keys:

  [1377614.579833] rcu: INFO: rcu_sched self-detected stall on CPU
  [1377614.579845] rcu:   18-: (2099 ticks this GP) idle=54e/1/0x4002 softirq=35598716/35598716 fqs=998
  [1377614.579895](t=2100 jiffies g=155867385 q=20879)
  [1377614.579898] Task dump for CPU 18:
  [1377614.579899] CPU 1/KVM   R  running task0 1030947 256019 0x0604
  [1377614.579902] Call Trace:
  [1377614.579912] ([<001f1f4b4f52>] show_stack+0x7a/0xc0)
  [1377614.579918]  [<001f1ec8e96c>] sched_show_task.part.0+0xdc/0x100
  [1377614.579919]  [<001f1f4b7248>] rcu_dump_cpu_stacks+0xc0/0x100
  [1377614.579924]  [<001f1ecdd10c>] rcu_sched_clock_irq+0x75c/0x980
  [1377614.579926]  [<001f1eceb26c>] update_process_times+0x3c/0x80
  [1377614.579931]  [<001f1ecfcfea>] tick_sched_handle.isra.0+0x4a/0x70
  [1377614.579932]  [<001f1ecfd28e>] tick_sched_timer+0x5e/0xc0
  [1377614.579933]  [<001f1ecec294>] __hrtimer_run_queues+0x114/0x2f0
  [1377614.579935]  [<001f1ececfdc>] hrtimer_interrupt+0x12c/0x2a0
  [1377614.579938]  [<001f1ebecb6a>] do_IRQ+0xaa/0xb0
  [1377614.579942]  [<001f1f4c6d08>] ext_int_handler+0x130/0x134
  [1377614.579945]  [<001f1ec0af10>] ptep_zap_key+0x40/0x60
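
  To pull the symbols out of a call trace like the one above, a small
  parser may help. The sketch below assumes the frame layout shown in
  this report ("[<address>] symbol+0xoffset/0xsize"); the names
  FRAME_RE and parse_frames are hypothetical. In this trace the last
  frame listed, ptep_zap_key, sits below ext_int_handler, which suggests
  it is the code that was running when the timer interrupt arrived.

```python
import re
from typing import List, Tuple

# Matches frames such as "[<001f1ec0af10>] ptep_zap_key+0x40/0x60".
# Symbols may carry compiler suffixes like ".part.0" or ".isra.0".
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+([\w.]+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)"
)


def parse_frames(trace: str) -> List[Tuple[str, int, int]]:
    """Return (symbol, offset, size) for every frame found in the text."""
    return [
        (m.group(1), int(m.group(2), 16), int(m.group(3), 16))
        for m in FRAME_RE.finditer(trace)
    ]
```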

  Contact Information = cborn...@de.ibm.com

  ---uname output---
   RELEASE: 5.4.0-90-generic
   VERSION: #101-Ubuntu SMP Fri Oct 15 19:59:45 UTC 2021
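
  According to the Launchpad notifications in this thread, the fix was
  released in linux 5.4.0-122.138 (focal) and 5.15.0-41.44 (jammy). A
  rough way to compare a running release such as the one above against
  those versions is sketched below. The helper names are hypothetical,
  and the comparison is an assumption that only holds within one kernel
  series of the generic kernel; other flavours number their ABI
  differently.

```python
import re
from typing import Tuple


def abi_tuple(version: str) -> Tuple[int, ...]:
    """Extract (major, minor, patch, abi) from strings such as
    '5.4.0-90-generic' (uname -r) or '5.4.0-122.138' (package version)."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", version)
    if m is None:
        raise ValueError(f"unrecognised kernel version: {version!r}")
    return tuple(int(part) for part in m.groups())


def has_fix(running: str, fixed: str) -> bool:
    """True if `running` is at or beyond `fixed` within the same series.

    The upload number after the ABI (the '.138' in '5.4.0-122.138')
    is deliberately ignored in this sketch.
    """
    run, fix = abi_tuple(running), abi_tuple(fixed)
    if run[:3] != fix[:3]:
        raise ValueError("versions belong to different kernel series")
    return run[3] >= fix[3]
```

  With the values from this report, has_fix("5.4.0-90-generic",
  "5.4.0-122.138") evaluates to False, meaning the reported kernel
  predates the fixed focal package.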

  == Comment: #1 - Christian Borntraeger  - 2022-05-24 03:59:37 ==
  This is a test patch that might address the rcu stalls.

  == Comment: #2 - Christian Borntraeger  - 2022-05-24 04:00:22 ==
  This is a 2nd patch that reduces the cost of key setting.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1975582/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help 

[Kernel-packages] [Bug 1975582] Re: [UBUNTU 20.04] rcu stalls with many storage key guests

2022-07-11 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 5.4.0-122.138

---
linux (5.4.0-122.138) focal; urgency=medium

  * focal/linux: 5.4.0-122.138 -proposed tracker (LP: #1979489)

  * Remove SAUCE patches from test_vxlan_under_vrf.sh in net of
ubuntu_kernel_selftests (LP: #1975691)
- Revert "UBUNTU: SAUCE: selftests: net: Don't fail test_vxlan_under_vrf on
  xfail"
- Revert "UBUNTU: SAUCE: selftests: net: Make test for VXLAN underlay in non-
  default VRF an expected failure"

  * Enable Asus USB-BT500 Bluetooth dongle(0b05:190e) (LP: #1976613)
- Bluetooth: btusb: Add flag to define wideband speech capability
- Bluetooth: btrtl: Add support for RTL8761B
- Bluetooth: btusb: Add 0x0b05:0x190e Realtek 8761BU (ASUS BT500) device.

  * [UBUNTU 20.04] rcu stalls with many storage key guests (LP: #1975582)
- s390/gmap: voluntarily schedule during key setting
- s390/mm: use non-quiescing sske for KVM switch to keyed guest

  * Ubuntu 5.4.0-117.132-generic 5.4.189 has BUG: kernel NULL pointer
dereference, address: 0034 (LP: #1978719)
- mm: rmap: explicitly reset vma->anon_vma in unlink_anon_vmas()

  * Focal update: upstream stable patchset v5.4.192 (LP: #1979014)
- floppy: disable FDRAWCMD by default
- [Config] updateconfigs for BLK_DEV_FD_RAWCMD
- hamradio: defer 6pack kfree after unregister_netdev
- hamradio: remove needs_free_netdev to avoid UAF
- lightnvm: disable the subsystem
- [Config] updateconfigs for NVM, NVM_PBLK
- usb: mtu3: fix USB 3.0 dual-role-switch from device to host
- USB: quirks: add a Realtek card reader
- USB: quirks: add STRING quirk for VCOM device
- USB: serial: whiteheat: fix heap overflow in WHITEHEAT_GET_DTR_RTS
- USB: serial: cp210x: add PIDs for Kamstrup USB Meter Reader
- USB: serial: option: add support for Cinterion MV32-WA/MV32-WB
- USB: serial: option: add Telit 0x1057, 0x1058, 0x1075 compositions
- xhci: stop polling roothubs after shutdown
- xhci: increase usb U3 -> U0 link resume timeout from 100ms to 500ms
- iio: dac: ad5592r: Fix the missing return value.
- iio: dac: ad5446: Fix read_raw not returning set value
- iio: magnetometer: ak8975: Fix the error handling in ak8975_power_on()
- usb: misc: fix improper handling of refcount in uss720_probe()
- usb: typec: ucsi: Fix role swapping
- usb: gadget: uvc: Fix crash when encoding data for usb request
- usb: gadget: configfs: clear deactivation flag in
  configfs_composite_unbind()
- usb: dwc3: core: Fix tx/rx threshold settings
- usb: dwc3: gadget: Return proper request status
- serial: imx: fix overrun interrupts in DMA mode
- serial: 8250: Also set sticky MCR bits in console restoration
- serial: 8250: Correct the clock for EndRun PTP/1588 PCIe device
- arch_topology: Do not set llc_sibling if llc_id is invalid
- hex2bin: make the function hex_to_bin constant-time
- hex2bin: fix access beyond string end
- video: fbdev: udlfb: properly check endpoint type
- arm64: dts: meson: remove CPU opps below 1GHz for G12B boards
- arm64: dts: meson: remove CPU opps below 1GHz for SM1 boards
- mtd: rawnand: fix ecc parameters for mt7622
- USB: Fix xhci event ring dequeue pointer ERDP update issue
- ARM: dts: imx6qdl-apalis: Fix sgtl5000 detection issue
- phy: samsung: Fix missing of_node_put() in exynos_sata_phy_probe
- phy: samsung: exynos5250-sata: fix missing device put in probe error paths
- ARM: OMAP2+: Fix refcount leak in omap_gic_of_init
- phy: ti: omap-usb2: Fix error handling in omap_usb2_enable_clocks
- ARM: dts: at91: Map MCLK for wm8731 on at91sam9g20ek
- phy: mapphone-mdm6600: Fix PM error handling in phy_mdm6600_probe
- phy: ti: Add missing pm_runtime_disable() in serdes_am654_probe
- ARM: dts: Fix mmc order for omap3-gta04
- ARM: dts: am3517-evm: Fix misc pinmuxing
- ARM: dts: logicpd-som-lv: Fix wrong pinmuxing on OMAP35
- ipvs: correctly print the memory size of ip_vs_conn_tab
- mtd: rawnand: Fix return value check of wait_for_completion_timeout
- bpf, lwt: Fix crash when using bpf_skb_set_tunnel_key() from bpf_xmit lwt
  hook
- tcp: md5: incorrect tcp_header_len for incoming connections
- tcp: ensure to use the most recently sent skb when filling the rate sample
- sctp: check asoc strreset_chunk in sctp_generate_reconf_event
- ARM: dts: imx6ull-colibri: fix vqmmc regulator
- arm64: dts: imx8mn-ddr4-evk: Describe the 32.768 kHz PMIC clock
- pinctrl: pistachio: fix use of irq_of_parse_and_map()
- cpufreq: fix memory leak in sun50i_cpufreq_nvmem_probe
- net: hns3: add validity check for message data length
- net/smc: sync err code when tcp connection was refused
- ip_gre: Make o_seqno start from 0 in native mode
- tcp: fix potential xmit stalls caused by TCP_NOTSENT_LOWAT
- bus: sunxi-rsb: Fix the return value 

[Kernel-packages] [Bug 1975582] Re: [UBUNTU 20.04] rcu stalls with many storage key guests

2022-07-11 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 5.15.0-41.44

---
linux (5.15.0-41.44) jammy; urgency=medium

  * jammy/linux: 5.15.0-41.44 -proposed tracker (LP: #1979448)

  * Fix can't boot up after change to vmd  (LP: #1976587)
- PCI: vmd: Assign VMD IRQ domain before enumeration
- PCI: vmd: Revert 2565e5b69c44 ("PCI: vmd: Do not disable MSI-X remapping if
  interrupt remapping is enabled by IOMMU.")

  * [SRU][Jammy/OEM-5.17][PATCH 0/1] Fix calltrace in mac80211 (LP: #1978297)
- mac80211: fix struct ieee80211_tx_info size

  * [SRU][Jammy][PATCH 0/1] Fix amd display corruption on s2idle resume
(LP: #1978244)
- drm/amd/display: Don't reinitialize DMCUB on s0ix resume

  * pl2303 serial adapter not recognized (LP: #1967493)
- USB: serial: pl2303: fix type detection for odd device

  * Remove SAUCE patches from test_vxlan_under_vrf.sh in net of
ubuntu_kernel_selftests (LP: #1975691)
- Revert "UBUNTU: SAUCE: selftests: net: Don't fail test_vxlan_under_vrf on
  xfail"
- Revert "UBUNTU: SAUCE: selftests: net: Make test for VXLAN underlay in non-
  default VRF an expected failure"

  * Fix hp_wmi_read_int() reporting error (0x05) (LP: #1979051)
- platform/x86: hp-wmi: Fix hp_wmi_read_int() reporting error (0x05)

  * Request to back port vmci patches to Ubuntu kernel (LP: #1978145)
- VMCI: dma dg: whitespace formatting change for vmci register defines
- VMCI: dma dg: add MMIO access to registers
- VMCI: dma dg: detect DMA datagram capability
- VMCI: dma dg: set OS page size
- VMCI: dma dg: register dummy IRQ handlers for DMA datagrams
- VMCI: dma dg: allocate send and receive buffers for DMA datagrams
- VMCI: dma dg: add support for DMA datagrams sends
- VMCI: dma dg: add support for DMA datagrams receive
- VMCI: Fix some error handling paths in vmci_guest_probe_device()
- VMCI: Release notification_bitmap in error path
- VMCI: Check exclusive_vectors when freeing interrupt 1
- VMCI: Add support for ARM64
- [Config] Update policies for VMWARE_VMCI and VMWARE_VMCI_VSOCKETS

  * [UBUNTU 20.04] rcu stalls with many storage key guests (LP: #1975582)
- s390/gmap: voluntarily schedule during key setting
- s390/mm: use non-quiescing sske for KVM switch to keyed guest

  * [SRU][OEM-5.14/OEM-5.17/Jammy][PATCH 0/1] Fix i915 calltrace on new ADL BIOS
(LP: #1976214)
- drm/i915: update new TMDS clock setting defined by VBT

  * Revert PPC get_user workaround (LP: #1976248)
- powerpc: Export mmu_feature_keys[] as non-GPL

  * Jammy update: v5.15.39 upstream stable release (LP: #1978240)
- MIPS: Fix CP0 counter erratum detection for R4k CPUs
- parisc: Merge model and model name into one line in /proc/cpuinfo
- ALSA: hda/realtek: Add quirk for Yoga Duet 7 13ITL6 speakers
- ALSA: fireworks: fix wrong return count shorter than expected by 4 bytes
- mmc: sdhci-msm: Reset GCC_SDCC_BCR register for SDHC
- mmc: sunxi-mmc: Fix DMA descriptors allocated above 32 bits
- mmc: core: Set HS clock speed before sending HS CMD13
- gpiolib: of: fix bounds check for 'gpio-reserved-ranges'
- x86/fpu: Prevent FPU state corruption
- KVM: x86/svm: Account for family 17h event renumberings in
  amd_pmc_perf_hw_id
- iommu/vt-d: Calculate mask for non-aligned flushes
- iommu/arm-smmu-v3: Fix size calculation in arm_smmu_mm_invalidate_range()
- drm/amd/display: Avoid reading audio pattern past AUDIO_CHANNELS_COUNT
- drm/amdgpu: do not use passthrough mode in Xen dom0
- RISC-V: relocate DTB if it's outside memory region
- Revert "SUNRPC: attempt AF_LOCAL connect on setup"
- timekeeping: Mark NMI safe time accessors as notrace
- firewire: fix potential uaf in outbound_phy_packet_callback()
- firewire: remove check of list iterator against head past the loop body
- firewire: core: extend card->lock in fw_core_handle_bus_reset
- net: stmmac: disable Split Header (SPH) for Intel platforms
- genirq: Synchronize interrupt thread startup
- ASoC: da7219: Fix change notifications for tone generator frequency
- ASoC: wm8958: Fix change notifications for DSP controls
- ASoC: meson: Fix event generation for AUI ACODEC mux
- ASoC: meson: Fix event generation for G12A tohdmi mux
- ASoC: meson: Fix event generation for AUI CODEC mux
- s390/dasd: fix data corruption for ESE devices
- s390/dasd: prevent double format of tracks for ESE devices
- s390/dasd: Fix read for ESE with blksize < 4k
- s390/dasd: Fix read inconsistency for ESE DASD devices
- can: grcan: grcan_close(): fix deadlock
- can: isotp: remove re-binding of bound socket
- can: grcan: use ofdev->dev when allocating DMA memory
- can: grcan: grcan_probe(): fix broken system id check for errata workaround
  needs
- can: grcan: only use the NAPI poll budget for RX
- nfc: replace improper check device_is_registered() in 

[Kernel-packages] [Bug 1975582] Re: [UBUNTU 20.04] rcu stalls with many storage key guests

2022-06-20 Thread Frank Heimes
OK, thanks for the info.
I've arranged to get both in and submitted the request to the kernel team's
mailing list.
(The kernel team will notice that one is now in stable and will pick it up that way ...)

Title:
  [UBUNTU 20.04] rcu stalls with many storage key guests

Status in Ubuntu on IBM z Systems:
  In Progress
Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Focal:
  In Progress
Status in linux source package in Impish:
  In Progress
Status in linux source package in Jammy:
  In Progress



[Kernel-packages] [Bug 1975582] Re: [UBUNTU 20.04] rcu stalls with many storage key guests

2022-06-10 Thread Frank Heimes
Added test builds for Impish and Jammy on top of the focal ones:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1975582

Title:
  [UBUNTU 20.04] rcu stalls with many storage key guests

Status in Ubuntu on IBM z Systems:
  In Progress
Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Focal:
  In Progress
Status in linux source package in Impish:
  In Progress
Status in linux source package in Jammy:
  In Progress



[Kernel-packages] [Bug 1975582] Re: [UBUNTU 20.04] rcu stalls with many storage key guests

2022-06-10 Thread Frank Heimes
SRU request submitted to the Ubuntu kernel team mailing list for jammy, impish and focal:
https://lists.ubuntu.com/archives/kernel-team/2022-June/thread.html#130934
Changing status to 'In Progress' for jammy, impish and focal.

Title:
  [UBUNTU 20.04] rcu stalls with many storage key guests

Status in Ubuntu on IBM z Systems:
  In Progress
Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Focal:
  In Progress
Status in linux source package in Impish:
  In Progress
Status in linux source package in Jammy:
  In Progress



[Kernel-packages] [Bug 1975582] Re: [UBUNTU 20.04] rcu stalls with many storage key guests

2022-06-10 Thread Frank Heimes
** Also affects: linux (Ubuntu Impish)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Jammy)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu)
   Status: In Progress => Invalid

** Changed in: linux (Ubuntu Focal)
   Status: New => In Progress

** Changed in: linux (Ubuntu Impish)
   Status: New => In Progress

** Changed in: linux (Ubuntu Jammy)
   Status: New => In Progress

** Description changed:

+ SRU Justification:
+ ==
+ 
+ [Impact]
+ 
+  * Ubuntu on s390x KVM environments with lots of large guests with storage
+keys can be affected by rcu stalls.
+ 
+  * These rcu stalls can cause the system to crash/dump.
+ 
+ [Fix]
+ 
+  * 3ae11dbcfac9 3ae11dbcfac906a8c3a480e98660a823130dc16a "s390/mm: use
+ non-quiescing sske for KVM switch to keyed guest"
+ 
+  * 6d5946274df1 6d5946274df1fff539a7eece458a43be733d1db8 "s390/gmap:
+ voluntarily schedule during key setting"
+ 
+ [Test Plan]
+ 
+  * There is no trigger, direct test, or way to re-create the
+problem situation, but...
+ 
+  * an IBM z13 or LinuxONE (or newer) LPAR is needed that
+runs Ubuntu Server 20.04 LTS or 18.04 LTS with HWE kernel
+and acts as KVM host, again with several large guests running
+on top with storage keys.
+ 
+  * Let such a system run for days under significant load
+and watch the logs for rcu issues.
+ 
+  * Prior to the submission of this SRU, patched test kernels
+for focal 5.4 and bionic hwe-5.4 were created and tested.
+They ran for days in a staging environment at IBM
+without further issues.
+ 
+  * The modifications are all limited to s390x.
+ 
+  * A test kernel was built (see below) that ran in a test environment
+at IBM under appropriate load for several days.
+ 
+ [Where problems could occur]
+ 
+  * Due to the change for the KVM switch to keyed guest
+from classic sske to non-quiescing sske
+the KVM behaviour might have changed and the storage keys harmed.
+ 
+  * The now more generous scheduling while setting keys
+has an impact on the guest memory management and mapping,
+which can lead to different performance.
+ 
+  * This, with the introduction of __s390_enable_skey_pmd and
+cond_resched, might increase the overhead in certain situations,
+but eventually improves the responsiveness over time,
+and hence avoids rcu stalls.
+ 
+ [Other Info]
+  
+  * Since the patches are upstream in 5.19-rc1,
+they will be included in the kernel that is planned for kinetic (5.19).
+ 
+  * Hence this is an SRU to jammy, impish and focal.
+ 
+ __
+ 
  ---Problem Description---
  There can be rcu stalls when running lots of large guests with storage keys:
  
  [1377614.579833] rcu: INFO: rcu_sched self-detected stall on CPU
- [1377614.579845] rcu:   18-: (2099 ticks this GP) 
idle=54e/1/0x4002 softirq=35598716/35598716 fqs=998 
+ [1377614.579845] rcu:   18-: (2099 ticks this GP) 
idle=54e/1/0x4002 softirq=35598716/35598716 fqs=998
  [1377614.579895](t=2100 jiffies g=155867385 q=20879)
  [1377614.579898] Task dump for CPU 18:
  [1377614.579899] CPU 1/KVM   R  running task0 1030947 256019 
0x0604
  [1377614.579902] Call Trace:
  [1377614.579912] ([<001f1f4b4f52>] show_stack+0x7a/0xc0)
- [1377614.579918]  [<001f1ec8e96c>] sched_show_task.part.0+0xdc/0x100 
- [1377614.579919]  [<001f1f4b7248>] rcu_dump_cpu_stacks+0xc0/0x100 
- [1377614.579924]  [<001f1ecdd10c>] rcu_sched_clock_irq+0x75c/0x980 
- [1377614.579926]  [<001f1eceb26c>] update_process_times+0x3c/0x80 
- [1377614.579931]  [<001f1ecfcfea>] tick_sched_handle.isra.0+0x4a/0x70 
- [1377614.579932]  [<001f1ecfd28e>] tick_sched_timer+0x5e/0xc0 
- [1377614.579933]  [<001f1ecec294>] __hrtimer_run_queues+0x114/0x2f0 
- [1377614.579935]  [<001f1ececfdc>] hrtimer_interrupt+0x12c/0x2a0 
- [1377614.579938]  [<001f1ebecb6a>] do_IRQ+0xaa/0xb0 
- [1377614.579942]  [<001f1f4c6d08>] ext_int_handler+0x130/0x134 
- [1377614.579945]  [<001f1ec0af10>] ptep_zap_key+0x40/0x60 
+ [1377614.579918]  [<001f1ec8e96c>] sched_show_task.part.0+0xdc/0x100
+ [1377614.579919]  [<001f1f4b7248>] rcu_dump_cpu_stacks+0xc0/0x100
+ [1377614.579924]  [<001f1ecdd10c>] rcu_sched_clock_irq+0x75c/0x980
+ [1377614.579926]  [<001f1eceb26c>] update_process_times+0x3c/0x80
+ [1377614.579931]  [<001f1ecfcfea>] tick_sched_handle.isra.0+0x4a/0x70
+ [1377614.579932]  [<001f1ecfd28e>] tick_sched_timer+0x5e/0xc0
+ [1377614.579933]  [<001f1ecec294>] __hrtimer_run_queues+0x114/0x2f0
+ [1377614.579935]  [<001f1ececfdc>] hrtimer_interrupt+0x12c/0x2a0
+ [1377614.579938]  [<001f1ebecb6a>] do_IRQ+0xaa/0xb0
+ [1377614.579942]  [<001f1f4c6d08>] ext_int_handler+0x130/0x134
+ [1377614.579945]  [<001f1ec0af10>] ptep_zap_key+0x40/0x60
  
-  
- 

[Kernel-packages] [Bug 1975582] Re: [UBUNTU 20.04] rcu stalls with many storage key guests

2022-06-07 Thread Frank Heimes
Another issue was just reported which looks very similar: LP#1977837
I asked to re-try the cert. suite with the same test kernel that was built for
this ticket.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1975582

Title:
  [UBUNTU 20.04] rcu stalls with many storage key guests

Status in Ubuntu on IBM z Systems:
  In Progress
Status in linux package in Ubuntu:
  In Progress

Bug description:
  ---Problem Description---
  There can be rcu stalls when running lots of large guests with storage keys:

  [1377614.579833] rcu: INFO: rcu_sched self-detected stall on CPU
  [1377614.579845] rcu:   18-: (2099 ticks this GP) idle=54e/1/0x4002 softirq=35598716/35598716 fqs=998
  [1377614.579895](t=2100 jiffies g=155867385 q=20879)
  [1377614.579898] Task dump for CPU 18:
  [1377614.579899] CPU 1/KVM   R  running task0 1030947 256019 0x0604
  [1377614.579902] Call Trace:
  [1377614.579912] ([<001f1f4b4f52>] show_stack+0x7a/0xc0)
  [1377614.579918]  [<001f1ec8e96c>] sched_show_task.part.0+0xdc/0x100 
  [1377614.579919]  [<001f1f4b7248>] rcu_dump_cpu_stacks+0xc0/0x100 
  [1377614.579924]  [<001f1ecdd10c>] rcu_sched_clock_irq+0x75c/0x980 
  [1377614.579926]  [<001f1eceb26c>] update_process_times+0x3c/0x80 
  [1377614.579931]  [<001f1ecfcfea>] tick_sched_handle.isra.0+0x4a/0x70 
  [1377614.579932]  [<001f1ecfd28e>] tick_sched_timer+0x5e/0xc0 
  [1377614.579933]  [<001f1ecec294>] __hrtimer_run_queues+0x114/0x2f0 
  [1377614.579935]  [<001f1ececfdc>] hrtimer_interrupt+0x12c/0x2a0 
  [1377614.579938]  [<001f1ebecb6a>] do_IRQ+0xaa/0xb0 
  [1377614.579942]  [<001f1f4c6d08>] ext_int_handler+0x130/0x134 
  [1377614.579945]  [<001f1ec0af10>] ptep_zap_key+0x40/0x60 

   
  Contact Information = cborn...@de.ibm.com 
   
  ---uname output---
   RELEASE: 5.4.0-90-generic
   VERSION: #101-Ubuntu SMP Fri Oct 15 19:59:45 UTC 2021

  == Comment: #1 - Christian Borntraeger  - 2022-05-24 03:59:37 ==
  This is a test patch that might address the rcu stalls.

  == Comment: #2 - Christian Borntraeger  - 2022-05-24 04:00:22 ==
  This is a 2nd patch that reduces the cost of key setting.
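
  The stall report above has a regular shape, so watching a KVM host's
  kernel log for it can be automated. What follows is only a minimal sketch
  in Python, assuming the log format shown in this report; the function name
  find_rcu_stalls and the dictionary keys are illustrative and not part of
  any tool mentioned in this bug.

```python
import re
from typing import Dict, Iterable, List, Optional

# Header printed when a CPU detects its own RCU stall,
# e.g. "rcu: INFO: rcu_sched self-detected stall on CPU".
STALL_RE = re.compile(r"rcu: INFO: (\w+) self-detected stall on CPU")
# Summary line that follows, e.g. "(t=2100 jiffies g=... q=...)".
JIFFIES_RE = re.compile(r"\(t=(\d+) jiffies")
# Line naming the stalled CPU, e.g. "Task dump for CPU 18:".
CPU_RE = re.compile(r"Task dump for CPU (\d+):")


def find_rcu_stalls(lines: Iterable[str]) -> List[Dict[str, Optional[object]]]:
    """Return one record per self-detected RCU stall found in the log lines.

    Hypothetical helper: the record layout (flavor, jiffies, cpu) is an
    assumption made for this sketch.
    """
    stalls: List[Dict[str, Optional[object]]] = []
    current: Optional[Dict[str, Optional[object]]] = None
    for line in lines:
        match = STALL_RE.search(line)
        if match:
            # A new stall report starts here.
            current = {"flavor": match.group(1), "jiffies": None, "cpu": None}
            stalls.append(current)
            continue
        if current is None:
            continue  # Not inside a stall report yet.
        match = JIFFIES_RE.search(line)
        if match and current["jiffies"] is None:
            current["jiffies"] = int(match.group(1))
        match = CPU_RE.search(line)
        if match and current["cpu"] is None:
            current["cpu"] = int(match.group(1))
    return stalls


# First lines of the trace from this report, used as sample input.
SAMPLE = """\
[1377614.579833] rcu: INFO: rcu_sched self-detected stall on CPU
[1377614.579895](t=2100 jiffies g=155867385 q=20879)
[1377614.579898] Task dump for CPU 18:
"""

RESULT = find_rcu_stalls(SAMPLE.splitlines())
```

  For the sample lines taken from this report, RESULT holds a single record
  with flavor rcu_sched, 2100 jiffies and CPU 18. On a long-running test
  host, the same function could be fed the saved kernel log line by line.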

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1975582/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1975582] Re: [UBUNTU 20.04] rcu stalls with many storage key guests

2022-05-24 Thread Frank Heimes
A focal 5.4 and a bionic hwe-5.4 kernel are now built for testing in this PPA:
https://launchpad.net/~fheimes/+archive/ubuntu/lp1975582
respectively:
https://launchpad.net/~fheimes/+archive/ubuntu/lp1975582/+packages

They both include (top-most commits):
"s390/pgtable: use non-quiescing sske for KVM switch to keyed"
"s390/gmap: voluntarily schedule during key setting"
"KVM: s390: vsie/gmap: reduce gmap_rmap overhead"
"NFS: Fix up nfs_ctx_key_to_expire()"

** Changed in: linux (Ubuntu)
   Status: New => In Progress

** Changed in: ubuntu-z-systems
   Status: New => In Progress

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1975582

Title:
  [UBUNTU 20.04] rcu stalls with many storage key guests

Status in Ubuntu on IBM z Systems:
  In Progress
Status in linux package in Ubuntu:
  In Progress



[Kernel-packages] [Bug 1975582] Re: [UBUNTU 20.04] rcu stalls with many storage key guests

2022-05-24 Thread Frank Heimes
Okay, happy to do that.
I'll kick off a test build in PPA soon.
Will do focal 5.4 and bionic hwe-5.4.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1975582

Title:
  [UBUNTU 20.04] rcu stalls with many storage key guests

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New



[Kernel-packages] [Bug 1975582] Re: [UBUNTU 20.04] rcu stalls with many storage key guests

2022-05-24 Thread Frank Heimes
Hi, I have some questions on these patches:
It looks to me like these are not upstream yet?
At least I couldn't find them in 'linux-next'.
We would actually need to have them upstream first, before we can pick them up.
Is it planned to get them accepted upstream? And if so, with which kernel
version?
And would it make sense to tag these as stable updates?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1975582

Title:
  [UBUNTU 20.04] rcu stalls with many storage key guests

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New



[Kernel-packages] [Bug 1975582] Re: [UBUNTU 20.04] rcu stalls with many storage key guests

2022-05-24 Thread Frank Heimes
** Changed in: ubuntu-z-systems
   Importance: Undecided => Medium

** Changed in: linux (Ubuntu)
   Importance: Undecided => Medium

** Changed in: ubuntu-z-systems
 Assignee: (unassigned) => Skipper Bug Screeners (skipper-screen-team)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1975582

Title:
  [UBUNTU 20.04] rcu stalls with many storage key guests

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New



[Kernel-packages] [Bug 1975582] Re: [UBUNTU 20.04] rcu stalls with many storage key guests

2022-05-24 Thread Frank Heimes
** Also affects: ubuntu-z-systems
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1975582

Title:
  [UBUNTU 20.04] rcu stalls with many storage key guests

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New
