[Kernel-packages] [Bug 2039782] Re: Stalls on CPUs/tasks on VisionFive 2 with external GPU
@xypron I know, https://rvspace.org/en/project/JH7110_Upstream_Plan
Can't wait until everything is green on this site :)

Good to know that NVMe support is working correctly. I was very surprised to see that an external GPU worked "out of the box" with Ubuntu 23.10. But then it crashed a lot :(

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-riscv in Ubuntu.
https://bugs.launchpad.net/bugs/2039782

Title:
  Stalls on CPUs/tasks on VisionFive 2 with external GPU

Status in linux-riscv package in Ubuntu:
  Confirmed

Bug description:
  I am trying to install Ubuntu Mantic on the StarFive VisionFive 2 1.3B board using
  https://cdimage.ubuntu.com/releases/23.10/release/ubuntu-23.10-live-server-riscv64.img.gz

  I have connected an Nvidia GT710 graphics card to the NVMe connector and see rcu_sched stalls. I have not observed this behavior on StarFive VisionFive 2 1.3B boards without an external GPU.

  The U-Boot installed on SPI flash is
  https://launchpad.net/~ubuntu-risc-v-team/+archive/ubuntu/release/+files/u-boot-starfive_2023.09.22-next-5d2fae79c7d6-0ubuntu1~ppa5_riscv64.deb

  [ 93.102845] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
  [ 93.114452] rcu: 0-...!: (1 GPs behind) idle=c69c/1/0x4002 softirq=2431/2431 fqs=41
  [ 93.128724] rcu: (detected by 2, t=15008 jiffies, g=4353, q=2369 ncpus=4)
  [ 93.140996] Task dump for CPU 0:
  [ 93.149549] task:swapper/0 state:R running task stack:0 pid:0 ppid:0 flags:0x
  [ 93.164907] Call Trace:
  [ 93.172715] [] __schedule+0x27a/0x82e
  [ 93.183385] rcu: rcu_sched kthread timer wakeup didn't happen for 14937 jiffies! g4353 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x200
  [ 93.200202] rcu: Possible timer handling issue on cpu=0 timer-softirq=890
  [ 93.212733] rcu: rcu_sched kthread starved for 14945 jiffies! g4353 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x200 ->cpu=0
  [ 93.228777] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
  [ 93.243573] rcu: RCU grace-period kthread stack dump:
  [ 93.254522] task:rcu_sched state:R stack:0 pid:15 ppid:2 flags:0x
  [ 93.268895] Call Trace:
  [ 93.277340] [] __schedule+0x27a/0x82e
  [ 93.288646] [] schedule+0x4e/0xde
  [ 93.299623] [] schedule_timeout+0x8c/0x15e
  [ 93.311380] [] rcu_gp_fqs_loop+0x2fc/0x3d4
  [ 93.323170] [] rcu_gp_kthread+0x11a/0x142
  [ 93.334901] [] kthread+0xc4/0xe4
  [ 93.345833] [] ret_from_fork+0xe/0x20

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-riscv/+bug/2039782/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
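For anyone triaging similar reports, the "detected by" line of an RCU stall warning can be turned into a wall-clock duration. The following is only a minimal sketch: the names `StallReport`, `parse_stall` and `ASSUMED_HZ` are hypothetical, and the tick rate of 250 Hz is an assumption that should be checked against CONFIG_HZ of the kernel that produced the log.

```python
import re
from typing import NamedTuple, Optional

# Assumption: the kernel was built with CONFIG_HZ=250. Verify on the
# affected system before trusting the converted durations.
ASSUMED_HZ = 250

# Matches lines such as:
# [ 93.128724] rcu: (detected by 2, t=15008 jiffies, g=4353, q=2369 ncpus=4)
STALL_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+rcu:\s+\(detected by (?P<cpu>\d+), "
    r"t=(?P<jiffies>\d+) jiffies, g=(?P<gp>-?\d+), q=(?P<q>\d+) "
    r"ncpus=(?P<ncpus>\d+)\)"
)


class StallReport(NamedTuple):
    """Fields extracted from one 'detected by' stall line (hypothetical type)."""
    timestamp: float      # dmesg timestamp in seconds
    detecting_cpu: int    # CPU that noticed the stall
    jiffies: int          # ticks since the grace period started
    grace_period: int     # grace-period number (g=...)
    ncpus: int            # number of CPUs reported

    def stalled_seconds(self, hz: int = ASSUMED_HZ) -> float:
        """Convert the jiffies count to seconds for the given tick rate."""
        return self.jiffies / hz


def parse_stall(line: str) -> Optional[StallReport]:
    """Return a StallReport for a 'detected by' line, or None otherwise."""
    m = STALL_RE.search(line)
    if m is None:
        return None
    return StallReport(
        timestamp=float(m["ts"]),
        detecting_cpu=int(m["cpu"]),
        jiffies=int(m["jiffies"]),
        grace_period=int(m["gp"]),
        ncpus=int(m["ncpus"]),
    )


if __name__ == "__main__":
    sample = ("[ 93.128724] rcu: (detected by 2, t=15008 jiffies, "
              "g=4353, q=2369 ncpus=4)")
    report = parse_stall(sample)
    if report is not None:
        print(f"CPU {report.detecting_cpu} reported a stall of "
              f"{report.stalled_seconds():.1f} s")
```

Under the 250 Hz assumption, the 15008 jiffies in the report above correspond to roughly 60 seconds.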
[Kernel-packages] [Bug 2039782] Re: Stalls on CPUs/tasks on VisionFive 2 with external GPU
@opvolger

Thank you for confirming the issue.

The upstreaming of PCIe for the StarFive VisionFive 2 board is not finalized yet. The kernel team will revisit this issue once there is proper upstream support. They picked up what was available on the kernel list which brought us NVMe support for which I am grateful.
[Kernel-packages] [Bug 2039782] Re: Stalls on CPUs/tasks on VisionFive 2 with external GPU
** Changed in: linux-riscv (Ubuntu)
       Status: New => Confirmed
[Kernel-packages] [Bug 2039782] Re: Stalls on CPUs/tasks on VisionFive 2 with external GPU
I have the same problem

[ 14.873826] Console: switching to colour frame buffer device 240x67
[ 15.120436] radeon 0001:01:00.0: [drm] fb0: radeondrmfb frame buffer device
[ 75.138845] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 75.145779] rcu: 0-...0: (2 ticks this GP) idle=ae24/1/0x4002 softirq=881/882 fqs=7501
[ 75.156363] rcu: hardirqs softirqs csw/system
[ 75.162878] rcu: number: 17595499 00
[ 75.169396] rcu: cputime:0 00 ==> 30016(ms)
[ 75.177538] rcu: (detected by 3, t=15011 jiffies, g=121, q=273 ncpus=4)
[ 75.185375] Task dump for CPU 0:
[ 75.189151] task:swapper/0 state:R running task stack:0 pid:0 ppid:0 flags:0x0008
[ 75.200753] Call Trace:
[ 75.203615] [] __schedule+0x27a/0x82e
Timed out for waiting the udev queue being empty.
[ 198.633202] watchdog: Watchdog detected hard LOCKUP on cpu 0
[ 198.639832] Modules linked in: radeon(+) hid_generic usbhid hid motorcomm video drm_suballoc_helper i2c_algo_bit drm_ttm_helper ttm drm_display_helper dwmac_starfive stmmac_platform cec rc_core stmmac drm_kms_helper drm axp20x_regulator pcs_xpcs xhci_pci dw_mmc_starfive dw_mmc_pltfm phylink backlight xhci_pci_renesas pinctrl_starfive_jh7110_aon dw_mmc clk_starfive_jh7110_aon axp20x_i2c jh7110_trng clk_starfive_jh7110_isp clk_starfive_jh7110_vout axp20x spi_cadence_quadspi phy_jh7110_usb
[ 242.654895] INFO: task kworker/1:2:83 blocked for more than 120 seconds.
[ 242.662754] Not tainted 6.5.0-10-generic #10.1-Ubuntu
[ 242.669286] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 242.678454] task:kworker/1:2 state:D stack:0 pid:83 ppid:2 flags:0x
[ 242.688246] Workqueue: events output_poll_execute [drm_kms_helper]
[ 242.695736] Call Trace:
[ 242.698600] [] __schedule+0x27a/0x82e
[ 242.704733] [] schedule+0x4e/0xde
[ 242.710452] [] schedule_preempt_disabled+0x18/0x20
[ 242.717901] [] __mutex_lock.constprop.0+0x3ce/0x6e0
[ 242.725452] [] __mutex_lock_slowpath+0x1a/0x26
[ 242.732495] [] mutex_lock+0x48/0x58
[ 242.738419] [] drm_client_dev_hotplug+0x7c/0x10a [drm]
[ 242.746934] [] output_poll_execute+0x1e2/0x21c [drm_kms_helper]
[ 242.755890] [] process_one_work+0x1dc/0x3b4
[ 242.762628] [] worker_thread+0x88/0x456
[ 242.768959] [] kthread+0xc4/0xe4
[ 242.774576] [] ret_from_fork+0xe/0x20
[ 242.780717] INFO: task (udev-worker):117 blocked for more than 120 seconds.
[ 242.788875] Not tainted 6.5.0-10-generic #10.1-Ubuntu
[ 242.795410] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 242.804577] task:(udev-worker) state:D stack:0 pid:117 ppid:111 flags:0x0006
[ 242.814360] Call Trace:
[ 242.817230] [] __schedule+0x27a/0x82e
[ 242.823366] [] schedule+0x4e/0xde
[ 242.829088] [] schedule_timeout+0x128/0x15e
[ 242.835831] [] __wait_for_common+0x17c/0x24a
[ 242.897655] [] wait_for_completion+0x26/0x36
[ 242.959630] [] __wait_rcu_gp+0xec/0x192
[ 243.021222] [] synchronize_rcu+0x118/0x126
[ 243.083262] [] __sysrq_swap_key_ops+0xa2/0xf4
[ 243.145935] [] register_sysrq_key+0x1a/0x26
[ 243.208817] [] __drm_fb_helper_initial_config_and_unlock+0x1ae/0x24a [drm_kms_helper]
[ 243.276529] [] drm_fb_helper_initial_config+0x3a/0x46 [drm_kms_helper]
[ 243.342843] [] radeon_fbdev_client_hotplug+0xb4/0xba [radeon]
[ 243.410095] [] drm_client_register+0x4c/0x92 [drm]
[ 243.475748] [] radeon_fbdev_setup+0xaa/0xf8 [radeon]
[ 243.542662] [] radeon_pci_probe+0xe8/0x158 [radeon]
[ 243.609846] [] local_pci_probe+0x40/0x88
[ 243.675177] [] pci_call_probe+0x60/0x17a
[ 243.740214] [] pci_device_probe+0x7c/0xdc
[ 243.804828] [] call_driver_probe+0x22/0x142
[ 243.869782] [] really_probe+0x9a/0x2a2
[ 243.934343] [] __driver_probe_device+0x7e/0x146
[ 243.999560] [] driver_probe_device+0x38/0xd0
[ 244.064017] [] __driver_attach+0xee/0x1e8
[ 244.128136] [] bus_for_each_dev+0x6c/0xc4
[ 244.192404] [] driver_attach+0x26/0x34
[ 244.256463] [] bus_add_driver+0x112/0x21e
[ 244.321058] [] driver_register+0x52/0x106
[ 244.385902] [] __pci_register_driver+0x4c/0x60
[ 244.451082] [] radeon_module_init+0x6c/0x1000 [radeon]
[ 244.518090] [] do_one_initcall+0x5c/0x1e2
[ 244.582278] [] do_init_module+0x5e/0x224
[ 244.645610] [] load_module+0x7b4/0x8de
[ 244.708029] [] init_module_from_file+0x82/0xd4
[ 244.771181] [] sys_finit_module+0x194/0x330
[ 244.834148] [] do_trap_ecall_u+0xd6/0x154
[ 244.896674] [] ret_from_exception+0x0/0x64
[ 255.222845] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 255.285167] rcu: 0-...0: (2 ticks this GP) idle=ae24/1/0x4002 softirq=881/882 fqs=2
[ 255.351454] rcu: hardirqs softirqs csw/system
[ 255.414129] rcu: number: 122510523 00
[ 255.476719] rcu: cputime:0 00 ==> 210144(ms)
[