Bug#995997: linux-image-5.14.0-2-amd64: Apps issue lots of clock_gettime calls on 5.14, performance drop in some cases

2021-10-09 Thread Vladimir K
Forcing clocksource back to tsc with tsc=nowatchdog fixes the performance 
regression.



Bug#995997: linux-image-5.14.0-2-amd64: Apps issue lots of clock_gettime calls on 5.14, performance drop in some cases.

2021-10-09 Thread Vladimir K
Seems to be related: 5.14 chooses hpet clocksource, while 5.10 uses tsc:


kernel: Linux version 5.10.0-8-amd64 (debian-ker...@lists.debian.org) (gcc-10 
(Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 
SMP Debian 5.10.46-4 (2021-08-03)
kernel: tsc: Fast TSC calibration using PIT
kernel: tsc: Detected 2595.027 MHz processor
kernel: clocksource: refined-jiffies: mask: 0x max_cycles: 0x, 
max_idle_ns: 7645519600211568 ns
kernel: clocksource: hpet: mask: 0x max_cycles: 0x, 
max_idle_ns: 133484873504 ns
kernel: clocksource: tsc-early: mask: 0x max_cycles: 
0x2567e268986, max_idle_ns: 440795272522 ns
kernel: clocksource: jiffies: mask: 0x max_cycles: 0x, 
max_idle_ns: 764504178510 ns
kernel: hpet0: at MMIO 0xfed0, IRQs 2, 8, 0
kernel: hpet0: 3 comparators, 32-bit 14.318180 MHz counter
kernel: clocksource: Switched to clocksource tsc-early
kernel: clocksource: acpi_pm: mask: 0xff max_cycles: 0xff, max_idle_ns: 
2085701024 ns
kernel: rtc_cmos 00:01: setting system clock to 2021-10-08T21:35:56 UTC 
(1633728956)
kernel: rtc_cmos 00:01: alarms up to one month, 114 bytes nvram, hpet irqs
kernel: sched_clock: Marking stable (940089949, 397183)->(945335255, -4848123)
kernel: tsc: Refined TSC clocksource calibration: 2615.448 MHz
kernel: clocksource: tsc: mask: 0x max_cycles: 0x25b33d90d1e, 
max_idle_ns: 440795316202 ns
kernel: clocksource: Switched to clocksource tsc




kernel: Linux version 5.14.0-2-amd64 (debian-ker...@lists.debian.org) (gcc-10 
(Debian 10.3.0-11) 10.3.0, GNU ld (GNU Binutils for Debian) 2.37) #1 SMP Debian 
5.14.9-2 (2021-10-03)
kernel: tsc: Fast TSC calibration using PIT
kernel: tsc: Detected 2595.027 MHz processor
kernel: clocksource: refined-jiffies: mask: 0x max_cycles: 0x, 
max_idle_ns: 7645519600211568 ns
kernel: clocksource: hpet: mask: 0x max_cycles: 0x, 
max_idle_ns: 133484873504 ns
kernel: clocksource: tsc-early: mask: 0x max_cycles: 
0x2567e268986, max_idle_ns: 440795272522 ns
kernel: clocksource: jiffies: mask: 0x max_cycles: 0x, 
max_idle_ns: 764504178510 ns
kernel: hpet0: at MMIO 0xfed0, IRQs 2, 8, 0
kernel: hpet0: 3 comparators, 32-bit 14.318180 MHz counter
kernel: clocksource: Switched to clocksource tsc-early
kernel: clocksource: acpi_pm: mask: 0xff max_cycles: 0xff, max_idle_ns: 
2085701024 ns
kernel: rtc_cmos 00:01: setting system clock to 2021-10-08T21:38:57 UTC 
(1633729137)
kernel: rtc_cmos 00:01: alarms up to one month, 114 bytes nvram, hpet irqs
kernel: sched_clock: Marking stable (1012272366, 406541)->(1016715343, -4036436)
kernel: tsc: Refined TSC clocksource calibration: 2615.168 MHz
kernel: clocksource: tsc: mask: 0x max_cycles: 0x25b234d5bf1, 
max_idle_ns: 440795256376 ns
kernel: clocksource: Switched to clocksource tsc
kernel: clocksource: timekeeping watchdog on CPU3: Marking clocksource 'tsc' as 
unstable because the skew is too large:
kernel: clocksource:   'hpet' wd_nsec: 515995748 wd_now: 
1bec100 wd_last: 14e0528 mask: 
kernel: clocksource:   'tsc' cs_nsec: 512040924 cs_now: 
547cae902 cs_last: 4f7fa46c4 mask: 
kernel: clocksource:   'tsc' is current clocksource.
kernel: tsc: Marking TSC unstable due to clocksource watchdog
kernel: TSC found unstable after boot, most likely due to broken BIOS. Use 
'tsc=unstable'.
kernel: sched_clock: Marking unstable (2098965976, 406565)<-(2103408384, 
-4036436)
kernel: clocksource: Checking clocksource tsc synchronization from CPU 1 to 
CPUs 0,2,4,6.
kernel: clocksource: Switched to clocksource hpet



Bug#995997: linux-image-5.14.0-2-amd64: Apps issue lots of clock_gettime calls on 5.14, performance drop in some cases.

2021-10-09 Thread Vladimir K
Package: src:linux
Version: 5.14.9-2
Severity: normal

Dear Maintainer, after upgrade from 5.10 to 5.14 I noticed severe pefrormance 
regression when running
GZDoom with content above some threshold of complexity.
More info here: https://github.com/coelckers/gzdoom/issues/1479
It seems to be AMD-specific and not GPU-related.

The gist of the symptoms: judging by strace output, on 5.14 applications (not 
just gzdoom) issue lots of
clock_gettime(CLOCK_MONOTONIC,... calls whereas on 5.10 they do not.
In particular case of gzdoom when the game content crosses some degree of 
complexity,
gzdoom process does almost nothing but flooding clock_gettime(), and 
performance drops to standstill.

>From strace
...
clock_gettime(CLOCK_MONOTONIC, {tv_sec=55, tv_nsec=7221346}) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=55, tv_nsec=7248727}) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=55, tv_nsec=7275614}) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=55, tv_nsec=7303488}) = 0
...

This does not happen on 5.10 at all.

-- Package-specific info:
** Version:
Linux version 5.14.0-2-amd64 (debian-ker...@lists.debian.org) (gcc-10 (Debian 
10.3.0-11) 10.3.0, GNU ld (GNU Binutils for Debian) 2.37) #1 SMP Debian 
5.14.9-2 (2021-10-03)

** Command line:
BOOT_IMAGE=/@/boot/vmlinuz-5.14.0-2-amd64 
root=UUID=859ba516-d84d-458c-ac5c-979ce55ca125 ro rootflags=subvol=@ quiet

** Not tainted

** Kernel log:
[21847.948105] ACPI: \_SB_.PLTF.C003: Found 3 idle states
[21847.948383] CPU3 is up
[21847.948415] smpboot: Booting Node 0 Processor 4 APIC 0x4
[21847.948017] microcode: CPU4: patch_level=0x08608103
[21847.949236] ACPI: \_SB_.PLTF.C004: Found 3 idle states
[21847.949648] CPU4 is up
[21847.949681] smpboot: Booting Node 0 Processor 5 APIC 0x5
[21847.949101] microcode: CPU5: patch_level=0x08608103
[21847.949947] ACPI: \_SB_.PLTF.C005: Found 3 idle states
[21847.950282] CPU5 is up
[21847.950314] smpboot: Booting Node 0 Processor 6 APIC 0x6
[21847.949854] microcode: CPU6: patch_level=0x08608103
[21847.950691] ACPI: \_SB_.PLTF.C006: Found 3 idle states
[21847.951321] CPU6 is up
[21847.951381] smpboot: Booting Node 0 Processor 7 APIC 0x7
[21847.950992] microcode: CPU7: patch_level=0x08608103
[21847.951672] ACPI: \_SB_.PLTF.C007: Found 3 idle states
[21847.952095] CPU7 is up
[21847.952988] ACPI: PM: Waking up from system sleep state S3
[21847.957364] ACPI: EC: interrupt unblocked
[21848.004289] ACPI: EC: event unblocked
[21848.004709] pci :00:00.2: can't derive routing for PCI INT A
[21848.004715] pci :00:00.2: PCI INT A: no GSI
[21848.004776] [drm] PCIE GART of 1024M enabled.
[21848.004779] [drm] PTB located at 0x00F40090
[21848.004795] [drm] PSP is resuming...
[21848.024666] [drm] reserve 0x40 from 0xf47f80 for PSP TMR
[21848.034112] nvme nvme1: 16/0/0 default/read/poll queues
[21848.105569] nvme nvme0: 16/0/0 default/read/poll queues
[21848.133966] amdgpu :06:00.0: amdgpu: RAS: optional ras ta ucode is not 
available
[21848.144289] amdgpu :06:00.0: amdgpu: RAP: optional rap ta ucode is not 
available
[21848.144293] amdgpu :06:00.0: amdgpu: SECUREDISPLAY: securedisplay ta 
ucode is not available
[21848.144298] amdgpu :06:00.0: amdgpu: SMU is resuming...
[21848.144345] amdgpu :06:00.0: amdgpu: dpm has been disabled
[21848.145269] amdgpu :06:00.0: amdgpu: SMU is resumed successfully!
[21848.146596] [drm] kiq ring mec 2 pipe 1 q 0
[21848.147535] [drm] DMUB hardware initialized: version=0x01010019
[21848.199782] ACPI: \: failed to evaluate _DSM (0x1001)
[21848.199787] ACPI: \: failed to evaluate _DSM (0x1001)
[21848.199788] ACPI: \: failed to evaluate _DSM (0x1001)
[21848.199788] ACPI: \: failed to evaluate _DSM (0x1001)
[21848.259234] usb 3-3: reset high-speed USB device number 3 using xhci_hcd
[21848.261103] usb 1-3: reset high-speed USB device number 3 using xhci_hcd
[21848.319629] ata1: SATA link down (SStatus 0 SControl 300)
[21848.323298] ata2: SATA link down (SStatus 0 SControl 300)
[21848.541531] usb 3-4: reset full-speed USB device number 4 using xhci_hcd
[21848.545081] [drm] VCN decode and encode initialized successfully(under DPG 
Mode).
[21848.545134] [drm] JPEG decode initialized successfully.
[21848.545332] amdgpu :06:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
[21848.545334] amdgpu :06:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 
on hub 0
[21848.545336] amdgpu :06:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 
on hub 0
[21848.545336] amdgpu :06:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 
on hub 0
[21848.545337] amdgpu :06:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 
on hub 0
[21848.545338] amdgpu :06:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 
on hub 0
[21848.545339] amdgpu :06:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 
on hub 0
[21848.545339] amdgpu :06:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 
on hub 0
[21848.545340] amdgpu :06:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 
on hub 0