[Kernel-packages] [Bug 2039782] Re: Stalls on CPUs/tasks on VisionFive 2 with external GPU

2023-11-01 Thread Opvolger
@xypron

I know, https://rvspace.org/en/project/JH7110_Upstream_Plan

Can't wait until everything is green on this site :)

Good to know that NVMe support is working correctly.

I was very surprised to see that an external GPU worked "out of the box"
with Ubuntu 23.10. But then it crashed a lot :(

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-riscv in Ubuntu.
https://bugs.launchpad.net/bugs/2039782

Title:
  Stalls on CPUs/tasks on VisionFive 2 with external GPU

Status in linux-riscv package in Ubuntu:
  Confirmed

Bug description:
  I am trying to install Ubuntu Mantic on the StarFive VisionFive 2 1.3B
  board using
  https://cdimage.ubuntu.com/releases/23.10/release/ubuntu-23.10-live-server-riscv64.img.gz

  I have connected an Nvidia GT710 graphics card to the NVMe connector
  and see rcu_sched stalls. I have not observed this behavior on
  StarFive VisionFive 2 1.3B boards without an external GPU.

  The U-Boot installed on SPI flash is
  https://launchpad.net/~ubuntu-risc-v-team/+archive/ubuntu/release/+files/u-boot-starfive_2023.09.22-next-5d2fae79c7d6-0ubuntu1~ppa5_riscv64.deb

  [   93.102845] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
  [   93.114452] rcu: 0-...!: (1 GPs behind) idle=c69c/1/0x4002 softirq=2431/2431 fqs=41
  [   93.128724] rcu: (detected by 2, t=15008 jiffies, g=4353, q=2369 ncpus=4)
  [   93.140996] Task dump for CPU 0:
  [   93.149549] task:swapper/0   state:R  running task stack:0 pid:0 ppid:0  flags:0x
  [   93.164907] Call Trace:
  [   93.172715] [] __schedule+0x27a/0x82e
  [   93.183385] rcu: rcu_sched kthread timer wakeup didn't happen for 14937 jiffies! g4353 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x200
  [   93.200202] rcu: Possible timer handling issue on cpu=0 timer-softirq=890
  [   93.212733] rcu: rcu_sched kthread starved for 14945 jiffies! g4353 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x200 ->cpu=0
  [   93.228777] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
  [   93.243573] rcu: RCU grace-period kthread stack dump:
  [   93.254522] task:rcu_sched   state:R stack:0 pid:15 ppid:2  flags:0x
  [   93.268895] Call Trace:
  [   93.277340] [] __schedule+0x27a/0x82e
  [   93.288646] [] schedule+0x4e/0xde
  [   93.299623] [] schedule_timeout+0x8c/0x15e
  [   93.311380] [] rcu_gp_fqs_loop+0x2fc/0x3d4
  [   93.323170] [] rcu_gp_kthread+0x11a/0x142
  [   93.334901] [] kthread+0xc4/0xe4
  [   93.345833] [] ret_from_fork+0xe/0x20
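
To put the "t=15008 jiffies" figure in the log above into wall-clock terms, here is a minimal Python sketch that extracts the detecting CPU and stall length from a "detected by" line. The tick rate of 250 Hz is an assumption, not something stated in this report (check CONFIG_HZ for the running kernel), and the names ASSUMED_HZ, STALL_RE and stall_seconds are made up for this illustration.

```python
import re

# Assumption: the kernel tick rate. It is not given in the bug report;
# verify it against CONFIG_HZ in the kernel configuration before relying on it.
ASSUMED_HZ = 250

# Matches the RCU summary line, e.g.
# "rcu: (detected by 2, t=15008 jiffies, g=4353, q=2369 ncpus=4)"
STALL_RE = re.compile(r"\(detected by (?P<cpu>\d+), t=(?P<jiffies>\d+) jiffies")


def stall_seconds(line, hz=ASSUMED_HZ):
    """Return (detecting_cpu, stall_duration_in_seconds), or None if no match."""
    match = STALL_RE.search(line)
    if match is None:
        return None
    return int(match.group("cpu")), int(match.group("jiffies")) / hz


if __name__ == "__main__":
    sample = ("[   93.128724] rcu: (detected by 2, t=15008 jiffies, "
              "g=4353, q=2369 ncpus=4)")
    print(stall_seconds(sample))
```

Under the 250 Hz assumption, the stall reported above would have lasted roughly a minute before CPU 2 flagged it; with a different CONFIG_HZ the duration scales accordingly.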

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-riscv/+bug/2039782/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 2039782] Re: Stalls on CPUs/tasks on VisionFive 2 with external GPU

2023-11-01 Thread Heinrich Schuchardt
@opvolger

Thank you for confirming the issue.

The upstreaming of PCIe support for the StarFive VisionFive 2 board is
not finalized yet. The kernel team will revisit this issue once there is
proper upstream support. They picked up what was available on the kernel
list, which brought us NVMe support, for which I am grateful.



[Kernel-packages] [Bug 2039782] Re: Stalls on CPUs/tasks on VisionFive 2 with external GPU

2023-11-01 Thread Heinrich Schuchardt
** Changed in: linux-riscv (Ubuntu)
   Status: New => Confirmed



[Kernel-packages] [Bug 2039782] Re: Stalls on CPUs/tasks on VisionFive 2 with external GPU

2023-11-01 Thread Opvolger
I have the same problem:

[   14.873826] Console: switching to colour frame buffer device 240x67
[   15.120436] radeon 0001:01:00.0: [drm] fb0: radeondrmfb frame buffer device
[   75.138845] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[   75.145779] rcu: 0-...0: (2 ticks this GP) idle=ae24/1/0x4002 softirq=881/882 fqs=7501
[   75.156363] rcu:           hardirqs   softirqs   csw/system
[   75.162878] rcu:   number: 17595499          0            0
[   75.169396] rcu:  cputime:        0          0            0   ==> 30016(ms)
[   75.177538] rcu: (detected by 3, t=15011 jiffies, g=121, q=273 ncpus=4)
[   75.185375] Task dump for CPU 0:
[   75.189151] task:swapper/0   state:R  running task stack:0 pid:0 ppid:0  flags:0x0008
[   75.200753] Call Trace:
[   75.203615] [] __schedule+0x27a/0x82e
Timed out for waiting the udev queue being empty.
[  198.633202] watchdog: Watchdog detected hard LOCKUP on cpu 0
[  198.639832] Modules linked in: radeon(+) hid_generic usbhid hid motorcomm video drm_suballoc_helper i2c_algo_bit drm_ttm_helper ttm drm_display_helper dwmac_starfive stmmac_platform cec rc_core stmmac drm_kms_helper drm axp20x_regulator pcs_xpcs xhci_pci dw_mmc_starfive dw_mmc_pltfm phylink backlight xhci_pci_renesas pinctrl_starfive_jh7110_aon dw_mmc clk_starfive_jh7110_aon axp20x_i2c jh7110_trng clk_starfive_jh7110_isp clk_starfive_jh7110_vout axp20x spi_cadence_quadspi phy_jh7110_usb
[  242.654895] INFO: task kworker/1:2:83 blocked for more than 120 seconds.
[  242.662754]   Not tainted 6.5.0-10-generic #10.1-Ubuntu
[  242.669286] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  242.678454] task:kworker/1:2 state:D stack:0 pid:83 ppid:2  flags:0x
[  242.688246] Workqueue: events output_poll_execute [drm_kms_helper]
[  242.695736] Call Trace:
[  242.698600] [] __schedule+0x27a/0x82e
[  242.704733] [] schedule+0x4e/0xde
[  242.710452] [] schedule_preempt_disabled+0x18/0x20
[  242.717901] [] __mutex_lock.constprop.0+0x3ce/0x6e0
[  242.725452] [] __mutex_lock_slowpath+0x1a/0x26
[  242.732495] [] mutex_lock+0x48/0x58
[  242.738419] [] drm_client_dev_hotplug+0x7c/0x10a [drm]
[  242.746934] [] output_poll_execute+0x1e2/0x21c [drm_kms_helper]
[  242.755890] [] process_one_work+0x1dc/0x3b4
[  242.762628] [] worker_thread+0x88/0x456
[  242.768959] [] kthread+0xc4/0xe4
[  242.774576] [] ret_from_fork+0xe/0x20
[  242.780717] INFO: task (udev-worker):117 blocked for more than 120 seconds.
[  242.788875]   Not tainted 6.5.0-10-generic #10.1-Ubuntu
[  242.795410] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  242.804577] task:(udev-worker)   state:D stack:0 pid:117   ppid:111 flags:0x0006
[  242.814360] Call Trace:
[  242.817230] [] __schedule+0x27a/0x82e
[  242.823366] [] schedule+0x4e/0xde
[  242.829088] [] schedule_timeout+0x128/0x15e
[  242.835831] [] __wait_for_common+0x17c/0x24a
[  242.897655] [] wait_for_completion+0x26/0x36
[  242.959630] [] __wait_rcu_gp+0xec/0x192
[  243.021222] [] synchronize_rcu+0x118/0x126
[  243.083262] [] __sysrq_swap_key_ops+0xa2/0xf4
[  243.145935] [] register_sysrq_key+0x1a/0x26
[  243.208817] [] __drm_fb_helper_initial_config_and_unlock+0x1ae/0x24a [drm_kms_helper]
[  243.276529] [] drm_fb_helper_initial_config+0x3a/0x46 [drm_kms_helper]
[  243.342843] [] radeon_fbdev_client_hotplug+0xb4/0xba [radeon]
[  243.410095] [] drm_client_register+0x4c/0x92 [drm]
[  243.475748] [] radeon_fbdev_setup+0xaa/0xf8 [radeon]
[  243.542662] [] radeon_pci_probe+0xe8/0x158 [radeon]
[  243.609846] [] local_pci_probe+0x40/0x88
[  243.675177] [] pci_call_probe+0x60/0x17a
[  243.740214] [] pci_device_probe+0x7c/0xdc
[  243.804828] [] call_driver_probe+0x22/0x142
[  243.869782] [] really_probe+0x9a/0x2a2
[  243.934343] [] __driver_probe_device+0x7e/0x146
[  243.999560] [] driver_probe_device+0x38/0xd0
[  244.064017] [] __driver_attach+0xee/0x1e8
[  244.128136] [] bus_for_each_dev+0x6c/0xc4
[  244.192404] [] driver_attach+0x26/0x34
[  244.256463] [] bus_add_driver+0x112/0x21e
[  244.321058] [] driver_register+0x52/0x106
[  244.385902] [] __pci_register_driver+0x4c/0x60
[  244.451082] [] radeon_module_init+0x6c/0x1000 [radeon]
[  244.518090] [] do_one_initcall+0x5c/0x1e2
[  244.582278] [] do_init_module+0x5e/0x224
[  244.645610] [] load_module+0x7b4/0x8de
[  244.708029] [] init_module_from_file+0x82/0xd4
[  244.771181] [] sys_finit_module+0x194/0x330
[  244.834148] [] do_trap_ecall_u+0xd6/0x154
[  244.896674] [] ret_from_exception+0x0/0x64
[  255.222845] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  255.285167] rcu: 0-...0: (2 ticks this GP) idle=ae24/1/0x4002 softirq=881/882 fqs=2
[  255.351454] rcu:           hardirqs   softirqs   csw/system
[  255.414129] rcu:   number: 122510523          0            0
[  255.476719] rcu:  cputime:        0          0            0   ==> 210144(ms)
[