[Kernel-packages] [Bug 2039782] Re: Stalls on CPUs/tasks on VisionFive 2 with external GPU

2023-11-01 Thread Opvolger
@xypron

I know, https://rvspace.org/en/project/JH7110_Upstream_Plan

Can't wait until everything is green on this site :)

Good to know that NVMe support is working correctly.

I was very surprised to see that an external GPU worked "out of the box"
with Ubuntu 23.10. But then it crashed a lot :(

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-riscv in Ubuntu.
https://bugs.launchpad.net/bugs/2039782

Title:
  Stalls on CPUs/tasks on VisionFive 2 with external GPU

Status in linux-riscv package in Ubuntu:
  Confirmed

Bug description:
  I am trying to install Ubuntu Mantic on the StarFive VisionFive 2 1.3B
  board using
  https://cdimage.ubuntu.com/releases/23.10/release/ubuntu-23.10-live-server-riscv64.img.gz

  I have connected an Nvidia GT710 graphics card to the NVMe connector
  and see rcu_sched stalls. I have not observed this behavior on
  StarFive VisionFive 2 1.3B boards without an external GPU.

  The U-Boot installed on SPI flash is
  https://launchpad.net/~ubuntu-risc-v-team/+archive/ubuntu/release/+files/u-boot-starfive_2023.09.22-next-5d2fae79c7d6-0ubuntu1~ppa5_riscv64.deb

  [   93.102845] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
  [   93.114452] rcu: 0-...!: (1 GPs behind) idle=c69c/1/0x4002 softirq=2431/2431 fqs=41
  [   93.128724] rcu: (detected by 2, t=15008 jiffies, g=4353, q=2369 ncpus=4)
  [   93.140996] Task dump for CPU 0:
  [   93.149549] task:swapper/0   state:R  running task stack:0 pid:0 ppid:0  flags:0x
  [   93.164907] Call Trace:
  [   93.172715] [] __schedule+0x27a/0x82e
  [   93.183385] rcu: rcu_sched kthread timer wakeup didn't happen for 14937 jiffies! g4353 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x200
  [   93.200202] rcu: Possible timer handling issue on cpu=0 timer-softirq=890
  [   93.212733] rcu: rcu_sched kthread starved for 14945 jiffies! g4353 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x200 ->cpu=0
  [   93.228777] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
  [   93.243573] rcu: RCU grace-period kthread stack dump:
  [   93.254522] task:rcu_sched   state:R stack:0 pid:15 ppid:2  flags:0x
  [   93.268895] Call Trace:
  [   93.277340] [] __schedule+0x27a/0x82e
  [   93.288646] [] schedule+0x4e/0xde
  [   93.299623] [] schedule_timeout+0x8c/0x15e
  [   93.311380] [] rcu_gp_fqs_loop+0x2fc/0x3d4
  [   93.323170] [] rcu_gp_kthread+0x11a/0x142
  [   93.334901] [] kthread+0xc4/0xe4
  [   93.345833] [] ret_from_fork+0xe/0x20
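
When comparing stall reports like the one above across boots, it can help to pull the "detected by" lines out of a saved dmesg capture and convert the jiffies count to seconds. The following is only a minimal sketch: the names `find_stalls` and `ASSUMED_HZ` are hypothetical, and the tick rate of 250 Hz is an assumption that is not stated anywhere in this report (check CONFIG_HZ on the running kernel before relying on the conversion).

```python
import re

# Assumption: the kernel was built with CONFIG_HZ=250. The bug report does
# not state the tick rate, so verify it before trusting the seconds value.
ASSUMED_HZ = 250

# Matches lines such as:
# [   93.128724] rcu: (detected by 2, t=15008 jiffies, g=4353, q=2369 ncpus=4)
STALL_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+rcu:\s+"
    r"\(detected by (?P<cpu>\d+), t=(?P<jiffies>\d+) jiffies"
)


def find_stalls(log_text, hz=ASSUMED_HZ):
    """Return one dict per RCU stall 'detected by' line found in log_text."""
    stalls = []
    for match in STALL_RE.finditer(log_text):
        jiffies = int(match.group("jiffies"))
        stalls.append(
            {
                "timestamp": float(match.group("ts")),
                "detected_by_cpu": int(match.group("cpu")),
                "jiffies": jiffies,
                "seconds": jiffies / hz,  # only valid if hz is correct
            }
        )
    return stalls


sample = (
    "[   93.128724] rcu: (detected by 2, t=15008 jiffies, "
    "g=4353, q=2369 ncpus=4)"
)
for stall in find_stalls(sample):
    print(stall)
```

Under the assumed 250 Hz tick rate, t=15008 jiffies would be roughly 60 seconds without a completed grace period.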

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-riscv/+bug/2039782/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 2039782] Re: Stalls on CPUs/tasks on VisionFive 2 with external GPU

2023-11-01 Thread Opvolger
I have the same problem:

[   14.873826] Console: switching to colour frame buffer device 240x67
[   15.120436] radeon 0001:01:00.0: [drm] fb0: radeondrmfb frame buffer device
[   75.138845] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[   75.145779] rcu: 0-...0: (2 ticks this GP) idle=ae24/1/0x4002 softirq=881/882 fqs=7501
[   75.156363] rcu:           hardirqs   softirqs   csw/system
[   75.162878] rcu:   number: 17595499          0            0
[   75.169396] rcu:  cputime:        0          0            0   ==> 30016(ms)
[   75.177538] rcu: (detected by 3, t=15011 jiffies, g=121, q=273 ncpus=4)
[   75.185375] Task dump for CPU 0:
[   75.189151] task:swapper/0   state:R  running task stack:0 pid:0 ppid:0  flags:0x0008
[   75.200753] Call Trace:
[   75.203615] [] __schedule+0x27a/0x82e
Timed out for waiting the udev queue being empty.
[  198.633202] watchdog: Watchdog detected hard LOCKUP on cpu 0
[  198.639832] Modules linked in: radeon(+) hid_generic usbhid hid motorcomm video drm_suballoc_helper i2c_algo_bit drm_ttm_helper ttm drm_display_helper dwmac_starfive stmmac_platform cec rc_core stmmac drm_kms_helper drm axp20x_regulator pcs_xpcs xhci_pci dw_mmc_starfive dw_mmc_pltfm phylink backlight xhci_pci_renesas pinctrl_starfive_jh7110_aon dw_mmc clk_starfive_jh7110_aon axp20x_i2c jh7110_trng clk_starfive_jh7110_isp clk_starfive_jh7110_vout axp20x spi_cadence_quadspi phy_jh7110_usb
[  242.654895] INFO: task kworker/1:2:83 blocked for more than 120 seconds.
[  242.662754]   Not tainted 6.5.0-10-generic #10.1-Ubuntu
[  242.669286] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  242.678454] task:kworker/1:2 state:D stack:0 pid:83 ppid:2  flags:0x
[  242.688246] Workqueue: events output_poll_execute [drm_kms_helper]
[  242.695736] Call Trace:
[  242.698600] [] __schedule+0x27a/0x82e
[  242.704733] [] schedule+0x4e/0xde
[  242.710452] [] schedule_preempt_disabled+0x18/0x20
[  242.717901] [] __mutex_lock.constprop.0+0x3ce/0x6e0
[  242.725452] [] __mutex_lock_slowpath+0x1a/0x26
[  242.732495] [] mutex_lock+0x48/0x58
[  242.738419] [] drm_client_dev_hotplug+0x7c/0x10a [drm]
[  242.746934] [] output_poll_execute+0x1e2/0x21c [drm_kms_helper]
[  242.755890] [] process_one_work+0x1dc/0x3b4
[  242.762628] [] worker_thread+0x88/0x456
[  242.768959] [] kthread+0xc4/0xe4
[  242.774576] [] ret_from_fork+0xe/0x20
[  242.780717] INFO: task (udev-worker):117 blocked for more than 120 seconds.
[  242.788875]   Not tainted 6.5.0-10-generic #10.1-Ubuntu
[  242.795410] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  242.804577] task:(udev-worker)   state:D stack:0 pid:117   ppid:111 flags:0x0006
[  242.814360] Call Trace:
[  242.817230] [] __schedule+0x27a/0x82e
[  242.823366] [] schedule+0x4e/0xde
[  242.829088] [] schedule_timeout+0x128/0x15e
[  242.835831] [] __wait_for_common+0x17c/0x24a
[  242.897655] [] wait_for_completion+0x26/0x36
[  242.959630] [] __wait_rcu_gp+0xec/0x192
[  243.021222] [] synchronize_rcu+0x118/0x126
[  243.083262] [] __sysrq_swap_key_ops+0xa2/0xf4
[  243.145935] [] register_sysrq_key+0x1a/0x26
[  243.208817] [] __drm_fb_helper_initial_config_and_unlock+0x1ae/0x24a [drm_kms_helper]
[  243.276529] [] drm_fb_helper_initial_config+0x3a/0x46 [drm_kms_helper]
[  243.342843] [] radeon_fbdev_client_hotplug+0xb4/0xba [radeon]
[  243.410095] [] drm_client_register+0x4c/0x92 [drm]
[  243.475748] [] radeon_fbdev_setup+0xaa/0xf8 [radeon]
[  243.542662] [] radeon_pci_probe+0xe8/0x158 [radeon]
[  243.609846] [] local_pci_probe+0x40/0x88
[  243.675177] [] pci_call_probe+0x60/0x17a
[  243.740214] [] pci_device_probe+0x7c/0xdc
[  243.804828] [] call_driver_probe+0x22/0x142
[  243.869782] [] really_probe+0x9a/0x2a2
[  243.934343] [] __driver_probe_device+0x7e/0x146
[  243.999560] [] driver_probe_device+0x38/0xd0
[  244.064017] [] __driver_attach+0xee/0x1e8
[  244.128136] [] bus_for_each_dev+0x6c/0xc4
[  244.192404] [] driver_attach+0x26/0x34
[  244.256463] [] bus_add_driver+0x112/0x21e
[  244.321058] [] driver_register+0x52/0x106
[  244.385902] [] __pci_register_driver+0x4c/0x60
[  244.451082] [] radeon_module_init+0x6c/0x1000 [radeon]
[  244.518090] [] do_one_initcall+0x5c/0x1e2
[  244.582278] [] do_init_module+0x5e/0x224
[  244.645610] [] load_module+0x7b4/0x8de
[  244.708029] [] init_module_from_file+0x82/0xd4
[  244.771181] [] sys_finit_module+0x194/0x330
[  244.834148] [] do_trap_ecall_u+0xd6/0x154
[  244.896674] [] ret_from_exception+0x0/0x64
[  255.222845] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  255.285167] rcu: 0-...0: (2 ticks this GP) idle=ae24/1/0x4002 softirq=881/882 fqs=2
[  255.351454] rcu:           hardirqs   softirqs   csw/system
[  255.414129] rcu:   number: 122510523         0            0
[  255.476719] rcu:  cputime:        0          0            0   ==> 210144(ms)
[  255