[Kernel-packages] [Bug 2039782] Re: Stalls on CPUs/tasks on VisionFive 2 with external GPU
@xypron I know, https://rvspace.org/en/project/JH7110_Upstream_Plan
Can't wait until everything is green on this site :)

Good to know that NVMe support is working correctly. I was very surprised to see that an external GPU worked "out of the box" with Ubuntu 23.10. But then it crashed a lot :(

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-riscv in Ubuntu.
https://bugs.launchpad.net/bugs/2039782

Title:
  Stalls on CPUs/tasks on VisionFive 2 with external GPU

Status in linux-riscv package in Ubuntu:
  Confirmed

Bug description:
  I am trying to install Ubuntu Mantic on the StarFive VisionFive 2 1.3B board using
  https://cdimage.ubuntu.com/releases/23.10/release/ubuntu-23.10-live-server-riscv64.img.gz

  I have connected an Nvidia GT710 graphics card to the NVMe connector and see rcu_sched stalls. I have not observed this behavior on StarFive VisionFive 2 1.3B boards without an external GPU.

  The U-Boot installed on SPI flash is
  https://launchpad.net/~ubuntu-risc-v-team/+archive/ubuntu/release/+files/u-boot-starfive_2023.09.22-next-5d2fae79c7d6-0ubuntu1~ppa5_riscv64.deb

  [ 93.102845] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
  [ 93.114452] rcu: 0-...!: (1 GPs behind) idle=c69c/1/0x4002 softirq=2431/2431 fqs=41
  [ 93.128724] rcu: (detected by 2, t=15008 jiffies, g=4353, q=2369 ncpus=4)
  [ 93.140996] Task dump for CPU 0:
  [ 93.149549] task:swapper/0 state:R running task stack:0 pid:0 ppid:0 flags:0x
  [ 93.164907] Call Trace:
  [ 93.172715] [] __schedule+0x27a/0x82e
  [ 93.183385] rcu: rcu_sched kthread timer wakeup didn't happen for 14937 jiffies! g4353 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x200
  [ 93.200202] rcu: Possible timer handling issue on cpu=0 timer-softirq=890
  [ 93.212733] rcu: rcu_sched kthread starved for 14945 jiffies! g4353 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x200 ->cpu=0
  [ 93.228777] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
  [ 93.243573] rcu: RCU grace-period kthread stack dump:
  [ 93.254522] task:rcu_sched state:R stack:0 pid:15 ppid:2 flags:0x
  [ 93.268895] Call Trace:
  [ 93.277340] [] __schedule+0x27a/0x82e
  [ 93.288646] [] schedule+0x4e/0xde
  [ 93.299623] [] schedule_timeout+0x8c/0x15e
  [ 93.311380] [] rcu_gp_fqs_loop+0x2fc/0x3d4
  [ 93.323170] [] rcu_gp_kthread+0x11a/0x142
  [ 93.334901] [] kthread+0xc4/0xe4
  [ 93.345833] [] ret_from_fork+0xe/0x20

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-riscv/+bug/2039782/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2039782] Re: Stalls on CPUs/tasks on VisionFive 2 with external GPU
I have the same problem

[ 14.873826] Console: switching to colour frame buffer device 240x67
[ 15.120436] radeon 0001:01:00.0: [drm] fb0: radeondrmfb frame buffer device
[ 75.138845] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 75.145779] rcu: 0-...0: (2 ticks this GP) idle=ae24/1/0x4002 softirq=881/882 fqs=7501
[ 75.156363] rcu: hardirqs softirqs csw/system
[ 75.162878] rcu: number: 17595499 00
[ 75.169396] rcu: cputime:0 00 ==> 30016(ms)
[ 75.177538] rcu: (detected by 3, t=15011 jiffies, g=121, q=273 ncpus=4)
[ 75.185375] Task dump for CPU 0:
[ 75.189151] task:swapper/0 state:R running task stack:0 pid:0 ppid:0 flags:0x0008
[ 75.200753] Call Trace:
[ 75.203615] [] __schedule+0x27a/0x82e
Timed out for waiting the udev queue being empty.
[ 198.633202] watchdog: Watchdog detected hard LOCKUP on cpu 0
[ 198.639832] Modules linked in: radeon(+) hid_generic usbhid hid motorcomm video drm_suballoc_helper i2c_algo_bit drm_ttm_helper ttm drm_display_helper dwmac_starfive stmmac_platform cec rc_core stmmac drm_kms_helper drm axp20x_regulator pcs_xpcs xhci_pci dw_mmc_starfive dw_mmc_pltfm phylink backlight xhci_pci_renesas pinctrl_starfive_jh7110_aon dw_mmc clk_starfive_jh7110_aon axp20x_i2c jh7110_trng clk_starfive_jh7110_isp clk_starfive_jh7110_vout axp20x spi_cadence_quadspi phy_jh7110_usb
[ 242.654895] INFO: task kworker/1:2:83 blocked for more than 120 seconds.
[ 242.662754] Not tainted 6.5.0-10-generic #10.1-Ubuntu
[ 242.669286] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 242.678454] task:kworker/1:2 state:D stack:0 pid:83 ppid:2 flags:0x
[ 242.688246] Workqueue: events output_poll_execute [drm_kms_helper]
[ 242.695736] Call Trace:
[ 242.698600] [] __schedule+0x27a/0x82e
[ 242.704733] [] schedule+0x4e/0xde
[ 242.710452] [] schedule_preempt_disabled+0x18/0x20
[ 242.717901] [] __mutex_lock.constprop.0+0x3ce/0x6e0
[ 242.725452] [] __mutex_lock_slowpath+0x1a/0x26
[ 242.732495] [] mutex_lock+0x48/0x58
[ 242.738419] [] drm_client_dev_hotplug+0x7c/0x10a [drm]
[ 242.746934] [] output_poll_execute+0x1e2/0x21c [drm_kms_helper]
[ 242.755890] [] process_one_work+0x1dc/0x3b4
[ 242.762628] [] worker_thread+0x88/0x456
[ 242.768959] [] kthread+0xc4/0xe4
[ 242.774576] [] ret_from_fork+0xe/0x20
[ 242.780717] INFO: task (udev-worker):117 blocked for more than 120 seconds.
[ 242.788875] Not tainted 6.5.0-10-generic #10.1-Ubuntu
[ 242.795410] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 242.804577] task:(udev-worker) state:D stack:0 pid:117 ppid:111 flags:0x0006
[ 242.814360] Call Trace:
[ 242.817230] [] __schedule+0x27a/0x82e
[ 242.823366] [] schedule+0x4e/0xde
[ 242.829088] [] schedule_timeout+0x128/0x15e
[ 242.835831] [] __wait_for_common+0x17c/0x24a
[ 242.897655] [] wait_for_completion+0x26/0x36
[ 242.959630] [] __wait_rcu_gp+0xec/0x192
[ 243.021222] [] synchronize_rcu+0x118/0x126
[ 243.083262] [] __sysrq_swap_key_ops+0xa2/0xf4
[ 243.145935] [] register_sysrq_key+0x1a/0x26
[ 243.208817] [] __drm_fb_helper_initial_config_and_unlock+0x1ae/0x24a [drm_kms_helper]
[ 243.276529] [] drm_fb_helper_initial_config+0x3a/0x46 [drm_kms_helper]
[ 243.342843] [] radeon_fbdev_client_hotplug+0xb4/0xba [radeon]
[ 243.410095] [] drm_client_register+0x4c/0x92 [drm]
[ 243.475748] [] radeon_fbdev_setup+0xaa/0xf8 [radeon]
[ 243.542662] [] radeon_pci_probe+0xe8/0x158 [radeon]
[ 243.609846] [] local_pci_probe+0x40/0x88
[ 243.675177] [] pci_call_probe+0x60/0x17a
[ 243.740214] [] pci_device_probe+0x7c/0xdc
[ 243.804828] [] call_driver_probe+0x22/0x142
[ 243.869782] [] really_probe+0x9a/0x2a2
[ 243.934343] [] __driver_probe_device+0x7e/0x146
[ 243.999560] [] driver_probe_device+0x38/0xd0
[ 244.064017] [] __driver_attach+0xee/0x1e8
[ 244.128136] [] bus_for_each_dev+0x6c/0xc4
[ 244.192404] [] driver_attach+0x26/0x34
[ 244.256463] [] bus_add_driver+0x112/0x21e
[ 244.321058] [] driver_register+0x52/0x106
[ 244.385902] [] __pci_register_driver+0x4c/0x60
[ 244.451082] [] radeon_module_init+0x6c/0x1000 [radeon]
[ 244.518090] [] do_one_initcall+0x5c/0x1e2
[ 244.582278] [] do_init_module+0x5e/0x224
[ 244.645610] [] load_module+0x7b4/0x8de
[ 244.708029] [] init_module_from_file+0x82/0xd4
[ 244.771181] [] sys_finit_module+0x194/0x330
[ 244.834148] [] do_trap_ecall_u+0xd6/0x154
[ 244.896674] [] ret_from_exception+0x0/0x64
[ 255.222845] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 255.285167] rcu: 0-...0: (2 ticks this GP) idle=ae24/1/0x4002 softirq=881/882 fqs=2
[ 255.351454] rcu: hardirqs softirqs csw/system
[ 255.414129] rcu: number: 122510523 00
[ 255.476719] rcu: cputime:0 00 ==> 210144(ms)
[ 255