To: <fill-from-perl scripts/get_maintainer.pl --bug -f ... output>
Cc: [email protected], [email protected]
Hello DRM/vkms maintainers,

We observed repeated RCU stalls during syzkaller-style fuzz testing on an
upstream Linux kernel. With PREEMPT(full) enabled, the kernel reports that
the rcu_preempt grace-period kthread is starved for >10 seconds while
multiple CPUs spin in native_queued_spin_lock_slowpath() in DRM-related
paths (vblank handling, drm_ioctl, and DRM file teardown). This appears to
be lock contention/livelock under adversarial workloads that prevents RCU
GP progress.

Environment:

* Kernel: 6.18.0 (locally built from upstream)
* Config: PREEMPT(full)
* Arch: x86_64
* Hardware: QEMU Standard PC (i440FX + PIIX)
* Workload: syz-executor (syzkaller-style fuzzing)

Observed symptom:

* "INFO: rcu_preempt detected stalls on CPUs/tasks"
* "rcu_preempt kthread timer wakeup didn't happen for ~10k jiffies"
* "rcu_preempt kthread starved for over ~10k jiffies"

Triggering context (representative):

Task context (ioctl path):

  drm_ioctl
  drm_ioctl_kernel
  drm_mode_createblob_ioctl
  drm_property_create_blob
  __kvmalloc_node_noprof / __vmalloc_node_range_noprof

IRQ context (vkms vblank simulation):

  native_queued_spin_lock_slowpath
  drm_handle_vblank
  vkms_vblank_simulate
  hrtimer_interrupt
  sysvec_apic_timer_interrupt

Task context (file teardown path on some runs):

  native_queued_spin_lock_slowpath
  drm_file_free
  drm_close_helper
  drm_release
  __fput
  task_work_run

RCU GP kthread:

  rcu_gp_fqs_loop
  rcu_gp_kthread

Reproducer:

No reliable standalone reproducer yet; the issue was observed multiple
times under fuzzing workloads. We can provide full dmesg logs (including
NMI backtraces), kernel config, and the fuzzing programs/executor logs
upon request.

Expected behavior:

DRM ioctls, vblank handling, and file teardown should not lead to
prolonged RCU stalls even under adversarial userspace workloads; the
rcu_preempt GP kthread should be able to make progress.

Actual behavior:

RCU reports prolonged stalls, CPUs spin in
native_queued_spin_lock_slowpath(), and the rcu_preempt grace period does
not complete for >10 seconds.

Thanks,
Mingyu Wang
