Question: should the default be changed from KMS to FKMS for the impish release? The freezing seems severe enough that we should either not ship, or ship with a less-than-perfect default until more time can be devoted to investigating and resolving it.
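For anyone who wants to try the workaround from the bug description quoted below in the meantime, here is a minimal Python sketch of the one-line change to /boot/firmware/config.txt. The function name `use_fkms` and the sample config text are my own illustration and are not part of the image or the bug report. The sketch only transforms a string; reading the real file, keeping a backup, and writing the result back are left out on purpose.

```python
import re

# Matches an active "dtoverlay=vc4-kms-v3d" line, optionally followed by
# overlay parameters after a comma. Commented-out lines (starting with "#")
# and other overlays whose names merely start with the same text are skipped.
KMS_LINE = re.compile(r"^([ \t]*dtoverlay=)vc4-kms-v3d(?=[,\s]|$)", re.MULTILINE)


def use_fkms(config_text: str) -> str:
    """Return config_text with the full KMS overlay replaced by fkms."""
    return KMS_LINE.sub(r"\1vc4-fkms-v3d", config_text)


# Hypothetical config.txt contents, for illustration only.
sample = (
    "[pi4]\n"
    "dtoverlay=vc4-kms-v3d\n"
    "#dtoverlay=vc4-kms-v3d\n"
    "max_framebuffers=2\n"
)

print(use_fkms(sample))
```

Running the function a second time leaves the text unchanged, so it should be safe to apply repeatedly. A reboot is needed before the new overlay takes effect.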
On Mon, 11 Oct 2021 at 15:50, Dave Jones <1946...@bugs.launchpad.net> wrote:
>
> To summarize the current findings after several days of testing:
>
> * In all monitors the freeze manifests the same, with the same dmesg output
> * Some monitors, in particular it seems higher-frame-rate (100+Hz) monitors
>   (but by no means exclusive to these) trigger the freeze quite easily, often
>   before the login screen even appears
> * Other monitors, in particular lower-frame-rate (60Hz) monitors do trigger
>   the freeze, but only after some activity
> * A few monitor(s) (so far only a 60Hz one) don't trigger the freeze at all
>   (thanks to brian-murray for additional testing on this)
>
> Over the weekend I tested the 5.11.0-1019 and 5.11.0-1021 linux-raspi
> kernels from hirsute, installing them onto the impish image. With my
> (60Hz) monitor, I was unable to reproduce the crash with either of
> these, but jawn-smith *did* reproduce the issue with the hirsute kernel
> on his high-frame-rate monitor.
>
> In other words, even rolling all the way back to the hirsute kernel
> won't eliminate the freeze for everyone. The good news is that fkms does
> appear to reliably work around the issue (thanks for knoedelfan and
> fprietog for the reports in #3 and #4), although that in itself comes
> with its own issues (though none as severe as the display freezing).
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1946368
>
> Title:
> HDMI output freezes under current/proposed impish kernels
>
> Status in linux-raspi package in Ubuntu:
> Confirmed
> Status in linux-raspi source package in Impish:
> Confirmed
>
> Bug description:
> Under the current (5.13.0-1007.8) or proposed (5.13.0-1008.9) kernels
> for the Ubuntu Pi pre-installed desktop impish release, the HDMI
> output occasionally freezes.
> A known workaround at this time is to
> change the following line in /boot/firmware/config.txt:
>
> dtoverlay=vc4-kms-v3d
>
> To the following:
>
> dtoverlay=vc4-fkms-v3d
>
> In other words, to use the "fake" KMS overlay (fkms) instead of the
> "full" KMS overlay (kms).
>
> I've been unable to determine a reliable method of guaranteeing a
> freeze, but it appears to occur much more readily when video playback
> is occurring, and when other interactions (especially moving windows
> around, minimizing, restoring) occurs simultaneously. Display suspend
> also periodically causes the same hang, which made me suspect this
> might be related to #1944397 but it appears that had a separate cause
> (now resolved).
>
> The following dmesg outputs have been observed immediately after the
> display hang; this one from 1007.8:
>
> [ 513.762138] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
> [ 513.762288] [drm:drm_atomic_helper_wait_for_dependencies
> [drm_kms_helper]] *ERROR* [CRTC:76:crtc-3] commit wait timed out
> [ 513.762381] [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]]
> *ERROR* [CRTC:76:crtc-3] flip_done timed out
> [ 524.002211] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
> [ 524.002404] [drm:drm_atomic_helper_wait_for_dependencies
> [drm_kms_helper]] *ERROR* [PLANE:205:plane-25] commit wait timed out
> [ 534.242499] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
> [ 534.242657] vc4-drm gpu: [drm] *ERROR* Timed out waiting for commit
> [ 534.250685] ------------[ cut here ]------------
> [ 534.250701] refcount_t: underflow; use-after-free.
> [ 534.250735] WARNING: CPU: 1 PID: 120 at lib/refcount.c:87
> refcount_dec_not_one+0xa0/0xbc
> [ 534.250758] Modules linked in: rfcomm cmac algif_hash algif_skcipher
> af_alg hci_uart btqca btrtl btbcm btintel bnep snd_soc_hdmi_codec vc4
> btsdio snd_soc_core input_leds bluetooth snd_compress snd_bcm2835(C)
> snd_pcm_dmaengine ecdh_generic ecc snd_pcm brcmfmac snd_seq_midi
> snd_seq_midi_event bcm2835_codec(C) bcm2835_isp(C) bcm2835_v4l2(C)
> brcmutil snd_rawmidi v4l2_mem2mem bcm2835_mmal_vchiq(C)
> videobuf2_dma_contig videobuf2_vmalloc cfg80211 videobuf2_memops
> videobuf2_v4l2 snd_seq videobuf2_common videodev snd_seq_device mc
> snd_timer vc_sm_cma(C) raspberrypi_hwmon snd bcm2835_gpiomem rpivid_mem
> uio_pdrv_genirq uio sch_fq_codel ip_tables x_tables autofs4 btrfs
> blake2b_generic xor xor_neon zstd_compress hid_generic usbhid raid6_pq
> libcrc32c dm_mirror dm_region_hash dm_log spidev dwc2 v3d roles
> udc_core gpu_sched crct10dif_ce i2c_brcmstb i2c_bcm2835 spi_bcm2835
> drm_kms_helper syscopyarea xhci_pci xhci_pci_renesas sysfillrect
> sysimgblt fb_sys_fops cec drm phy_generic ac97_bus aes_arm64
> [ 534.251066] CPU: 1 PID: 120 Comm: kworker/1:2 Tainted: G WC
> 5.13.0-1007-raspi #8-Ubuntu
> [ 534.251076] Hardware name: Raspberry Pi 400 Rev 1.1 (DT)
> [ 534.251083] Workqueue: events drm_mode_rmfb_work_fn [drm]
> [ 534.251239] pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--)
> [ 534.251248] pc : refcount_dec_not_one+0xa0/0xbc
> [ 534.251257] lr : refcount_dec_not_one+0xa0/0xbc
> [ 534.251265] sp : ffff8000118cbb50
> [ 534.251269] x29: ffff8000118cbb50 x28: ffff6cf7ee894400 x27: ffff6cf80502e000
> [ 534.251285] x26: ffff6cf80502e000 x25: 0000000000000006 x24: ffff6cf7ee94c500
> [ 534.251300] x23: ffffa94dff246018 x22: ffff6cf833068880 x21: ffff6cf805027c80
> [ 534.251314] x20: ffff6cf88ead75ac x19: ffff6cf88ead7400 x18: 0000000000000000
> [ 534.251328] x17: 0000000000000000 x16: ffffa94e0a243314 x15: 0000000000000000
> [ 534.251342]
> x14: 0000000000000000 x13: 0000000000000030 x12: ffff800010035000
> [ 534.251356] x11: ffffa94e0b30dfd0 x10: 00000000fffff000 x9 : ffffa94e09d09f54
> [ 534.251370] x8 : 00000000ffffefff x7 : ffffa94e0b30dfd0 x6 : 0000000000000000
> [ 534.251384] x5 : ffff6cf8b799f948 x4 : 0000000000000000 x3 : 0000000000000027
> [ 534.251397] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff6cf800abec80
> [ 534.251411] Call trace:
> [ 534.251416] refcount_dec_not_one+0xa0/0xbc
> [ 534.251424] vc4_bo_dec_usecnt+0x2c/0x120 [vc4]
> [ 534.251473] vc4_cleanup_fb+0x3c/0x4c [vc4]
> [ 534.251518] drm_atomic_helper_cleanup_planes+0x74/0xa0 [drm_kms_helper]
> [ 534.251604] vc4_atomic_commit_tail+0x24c/0x36c [vc4]
> [ 534.251648] commit_tail+0xac/0x190 [drm_kms_helper]
> [ 534.251723] drm_atomic_helper_commit+0x168/0x380 [drm_kms_helper]
> [ 534.251796] drm_atomic_commit+0x58/0x70 [drm]
> [ 534.251936] atomic_remove_fb+0x2a8/0x2f4 [drm]
> [ 534.252073] drm_framebuffer_remove+0x164/0x18c [drm]
> [ 534.252209] drm_mode_rmfb_work_fn+0x50/0x70 [drm]
> [ 534.252346] process_one_work+0x200/0x4d0
> [ 534.252359] worker_thread+0x2c8/0x470
> [ 534.252367] kthread+0x12c/0x140
> [ 534.252374] ret_from_fork+0x10/0x3c
> [ 534.252386] ---[ end trace a97341262fc57e44 ]---
>
> And a similar one from 1008.9 (note that most of the time, the stack
> trace *doesn't* appear hence I'm not sure if it's related to the
> display freeze itself, or something auxiliary):
>
> [ 221.914617] [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]]
> *ERROR* [CRTC:76:crtc-3] flip_done timed out
> [ 221.914617] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
> [ 221.914795] [drm:drm_atomic_helper_wait_for_dependencies
> [drm_kms_helper]] *ERROR* [CRTC:76:crtc-3] commit wait timed out
> [ 232.154711] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
> [ 232.154898] vc4-drm gpu: [drm] *ERROR* Timed out waiting for commit
>
> I can produce this same stack trace, but *without* a
> corresponding freeze by manually locking the desktop (Super+L) and
> waiting for the display fade. However, after the stack trace appears,
> one can press a key to bring the display back and login again happily.
> Here are several repeated traces from such activity under the 1008.9
> proposed kernel:
>
> [ 1043.431061] [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]]
> *ERROR* [CRTC:76:crtc-3] flip_done timed out
> [ 1043.431136] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
> [ 1043.431384] [drm:drm_atomic_helper_wait_for_dependencies
> [drm_kms_helper]] *ERROR* [CRTC:76:crtc-3] commit wait timed out
> [ 1053.671415] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
> [ 1053.671705] [drm:drm_atomic_helper_wait_for_dependencies
> [drm_kms_helper]] *ERROR* [PLANE:70:plane-3] commit wait timed out
> [ 1063.911800] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
> [ 1063.912147] vc4-drm gpu: [drm] *ERROR* Timed out waiting for commit
> [ 1181.162739] [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]]
> *ERROR* [CRTC:76:crtc-3] flip_done timed out
> [ 1181.418774] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
> [ 1181.419072] [drm:drm_atomic_helper_wait_for_dependencies
> [drm_kms_helper]] *ERROR* [CRTC:76:crtc-3] commit wait timed out
> [ 1191.658996] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
> [ 1191.659289] [drm:drm_atomic_helper_wait_for_dependencies
> [drm_kms_helper]] *ERROR* [PLANE:70:plane-3] commit wait timed out
> [ 1201.899168] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
> [ 1201.899438] vc4-drm gpu: [drm] *ERROR* Timed out waiting for commit
> [ 1332.461450] [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]]
> *ERROR* [CRTC:76:crtc-3] flip_done timed out
> [ 1332.461460] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
> [ 1332.461717] [drm:drm_atomic_helper_wait_for_dependencies
> [drm_kms_helper]] *ERROR* [CRTC:76:crtc-3]
> commit wait timed out
> [ 1342.701602] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
> [ 1342.701890] [drm:drm_atomic_helper_wait_for_dependencies
> [drm_kms_helper]] *ERROR* [PLANE:70:plane-3] commit wait timed out
> [ 1352.941798] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
> [ 1352.942067] vc4-drm gpu: [drm] *ERROR* Timed out waiting for commit
>
> Note that even when the display is frozen (and cannot be resurrected),
> only the display driver appears to crash; the system remains
> operational (at least for some time) happily responding to SSH login
> requests or, if video is playing, continuing audio output.
>
> The same occurs with the current linux-firmware-raspi2 release (which
> the above stack traces were taken from), or with the latest firmware
> (installed from an experimental package in my
> https://launchpad.net/~waveform/+archive/ubuntu/firmware PPA).
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/ubuntu/+source/linux-raspi/+bug/1946368/+subscriptions