On Wed, 18 May 2022, Peter Geis wrote:

> On Wed, May 18, 2022 at 7:56 AM Lee Jones <lee.jo...@linaro.org> wrote:
> >
> > Looping in a few relevant/active kernel people/lists for full coverage.
> >
> > On Sun, 01 Dec 2019, Hugh Cole-Baker wrote:
> > > > On 29 Nov 2019, at 01:06, Vasily Khoruzhick <anars...@gmail.com> wrote:
> > > > On Thu, Nov 28, 2019 at 4:59 PM Kever Yang <kever.y...@rock-chips.com>
> > > > wrote:
> > > >>
> > > >> Hi Vasily,
> > > >>
> > > >> On 2019/11/28 11:51 PM, Vasily Khoruzhick wrote:
> > > >>> On Thu, Nov 28, 2019 at 1:23 AM Kever Yang
> > > >>> <kever.y...@rock-chips.com> wrote:
> > > >>>> Hi Vasily,
> > > >>>>
> > > >>>> I think this should not be needed, see comments below.
> > > >>> Hi Kever,
> > > >>>
> > > >>> I've spent 2 weeks of my evenings debugging this issue but
> > > >>
> > > >> I can understand you work pretty hard on make it work, it's not so easy
> > > >> to identify the root cause
> > > >>
> > > >> some times, thanks very much for working on this.
> > > >>
> > > >>> unfortunately I don't have a proper fix. This is the only solution
> > > >>> that makes my rockpro64 reboot reliably with mainline u-boot and ATF.
> > > >>> See my comments below.
> > >
> > > I also had a problem where Linux would hang or panic after rebooting, with
> > > mainline u-boot and ATF on a rockpro64. This patch does fix the issue for
> > > me, I have tested it by performing 10 reboots from Linux in a row and I've
> > > seen no hangs or panics.
> > >
> > > I noticed the Armbian project have recently included a patch to ATF [1]
> > > which switches all power domains on before ATF performs a soft reset. I
> > > have also tested using u-boot mainline, without any patches to u-boot, but
> > > including ATF patched with your reset fix [2] and the Armbian power domains
> > > patch [1]. This also fixes the same hanging on reboot issue for me without
> > > modifications to u-boot, I've also tested 10 reboots in a row with this ATF
> > > and seen no hangs.
> > >
> > > So this u-boot patch may not be needed if ATF is patched instead to switch
> > > power domains on before soft reset.
> > >
> > > FWIW, when I was able to see panic messages from Linux when it panicked on
> > > boot, the call trace always seemed to include rockchip_pd_power_off() [3].
> > >
> > > [1] https://github.com/armbian/build/blob/master/patch/atf/atf-rk3399/switch-power-domains-on-before-reset.patch
> > > [2] https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/2512
> > > [3] https://gist.github.com/sigmaris/c0e155c8cb0a325d84f549185f9a568c
> >
> > This last paste looks remarkably similar to an issue currently seen on
> > the Radxa ROCK Pi 4B (RK3399) during power-up after a soft reboot
> > (`sudo reboot`) is issued. We're presently running v5.15.35 [0].
>
> Good Evening,
Hi, Peter,

Thank you so much for your reply.

> That's definitely not stock v5.15.35, it's been tagged as an android kernel.
> 5.15.35-android13-5-00092-g525d77310a20

It's not stock, no. Although the differences from RockPi's perspective
are minimal. The main difference is the way the kernel is configured.
It's GKI:

  https://android.googlesource.com/kernel/common/+/refs/heads/android13-5.15/arch/arm64/configs/gki_defconfig

Plus a few non-GKI specifics:

  https://android.googlesource.com/kernel/common/+/refs/heads/android13-5.15/arch/arm64/configs/rockpi4_gki.fragment

> > It's not clear how this issue (present 3 years ago) was finally
> > resolved. From the thread, it looks as if the fix might have made its
> > way into ATF, but I'm 87.6% sure ATF is not running on this platform
> > (yet).
>
> The rk3399 SoC has a hardware bug where the power domains are not
> reset upon a soft reset. This leads to situations like this one where
> power domains are shut down during shutdown but aren't restored on
> reboot.

I assume this isn't something we can patch in the kernel driver?

> Mainline TF-A was patched to force all power domains online
> when a soft reboot is triggered, which solved that issue.

Okay, this is what I figured.

> What particular issues are you having initializing modern u-boot on
> this device?

This is the output:

  https://pastebin.ubuntu.com/p/d5DmsSBnrR/

Speaking with one of the guys who supports RockPi 4 in AOSP, he
suspects the DDR settings. Apparently settings for older SoCs
sometimes get clobbered when support for newer SoCs is added.

I am yet to investigate the u-boot story in any detail. It's on my
TODO list for today.

> Is there a particular reason it isn't using Mainline TF-A?

We're not using Trusted Firmware yet. Although I'm starting to think
this should be re-prioritised.

> I've also run into issues on rk356x where the regulator powering a
> power domain isn't powered due to a soft reset, which also causes
> faults like this. Set your main regulators to always-on and see if it
> helps with the issue.

I'll do that. Thanks for the tip.

Our main issue currently is an RCU-lock-up, again on soft reboot:

[ 21.226951][ C0] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 21.227637][ C0] rcu: 5-...!: (1 GPs behind) idle=3de/1/0x4000000000000000 softirq=9/10 fqs=3 last_accelerate: 0000/efb9 dyntick_enabled: 0
[ 21.228890][ C0] (detected by 0, t=5252 jiffies, g=-1167, q=46)

Do you think these issues could all be related?

Thanks ever so much for your reply Peter. You've potentially saved us
hours and hours of debugging.

Kind regards,
Lee

> > Note that the u-boot we're using is also quite old:
> >
> >   U-Boot 2019.10-09248-g8511c75bb4 (Jan 08 2020 - 17:13:03 -0800)
> >
> > ... so this could easily be the root cause. The current plan is to
> > try to update this ASAP. However early attempts are yet to result in
> > a successful boot.
> >
> > I see that Brian recently added a few patches related to PD/DVFS, but
> > again, these appear to be ATF related.
> >
> > Would anyone be able to shed some light onto this for me please?
> >
> > As always, any help would be gratefully received.
> >
> > Kind regards,
> > Lee
> >
> > [0] Full reboot log can be seen at: https://pastebin.ubuntu.com/p/MjZP2V6kQ3/
> >
> > [ 0.699736][ T1] initcall __initstub__kmod_iommu__362_155_iommu_subsys_init4+0x0/0x8 returned 0 after 0 usecs
> > [ 0.700737][ T1] calling __initstub__kmod_rockchip_iommu__348_1415_rk_iommu_init4+0x0/0x8 @ 1
> > [ 0.702238][ C5] SError Interrupt on CPU5, code 0xbf000002 -- SError
> > [ 0.702248][ C5] CPU: 5 PID: 48 Comm: kworker/5:1 Not tainted 5.15.35-android13-5-00092-g525d77310a20 #1
> > [ 0.702261][ C5] Hardware name: Radxa ROCK Pi 4B (DT)
> > [ 0.702266][ C5] Workqueue: pm genpd_power_off_work_fn.cfi_jt
> > [ 0.702289][ C5] pstate: 804000c5 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [ 0.702301][ C5] pc : regmap_mmio_read32le+0x14/0x2c
> > [ 0.702318][ C5] lr : regmap_mmio_read+0x68/0xd0
> > [ 0.702331][ C5] sp : ffffffc00b6d3b40
> > [ 0.702335][ C5] x29: ffffffc00b6d3b40 x28: 0000000000000000 x27: 0000000000000000
> > [ 0.702351][ C5] x26: ffffff8000923680 x25: ffffffc009abc2a0 x24: ffffff8000930c00
> > [ 0.702364][ C5] x23: 0000000000000014 x22: ffffff8000930c00 x21: 0000000000000008
> > [ 0.702378][ C5] x20: ffffff8000922300 x19: ffffff8000923680 x18: ffffffc00b66d058
> > [ 0.702391][ C5] x17: 000000000000ba7e x16: ffffffc00a4dee04 x15: 000000000000b67e
> > [ 0.702405][ C5] x14: 00000000028dd7a0 x13: 0000000000000040 x12: 0000000000000000
> > [ 0.702419][ C5] x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000005
> > [ 0.702432][ C5] x8 : 0000000000000000 x7 : 00756d6d6f692e30 x6 : 3030383035366666
> > [ 0.702445][ C5] x5 : 0000000000000001 x4 : 028dea248fba33d6 x3 : 0000000000000000
> > [ 0.702457][ C5] x2 : ffffff8000923680 x1 : 0000000000000008 x0 : 0000000000000000
> > [ 0.702472][ C5] Kernel panic - not syncing: Asynchronous SError Interrupt
> > [ 0.702477][ C5] CPU: 5 PID: 48 Comm: kworker/5:1 Not tainted 5.15.35-android13-5-00092-g525d77310a20 #1
> > [ 0.702487][ C5] Hardware name: Radxa ROCK Pi 4B (DT)
> > [ 0.702492][ C5] Workqueue: pm genpd_power_off_work_fn.cfi_jt
> > [ 0.702506][ C5] Call trace:
> > [ 0.702508][ C5] dump_backtrace.cfi_jt+0x0/0x8
> > [ 0.702525][ C5] dump_stack_lvl+0x80/0xb8
> > [ 0.702536][ C5] panic+0x180/0x444
> > [ 0.702547][ C5] arm64_serror_panic+0x1c0/0x210
> > [ 0.702561][ C5] do_serror+0x17c/0x218
> > [ 0.702572][ C5] el1h_64_error_handler+0x38/0x50
> > [ 0.702581][ C5] el1h_64_error+0x7c/0x80
> > [ 0.702589][ C5] regmap_mmio_read32le+0x14/0x2c
> > [ 0.702603][ C5] _regmap_bus_reg_read+0x3c/0x90
> > [ 0.702614][ C5] _regmap_read+0xb0/0x24c
> > [ 0.702623][ C5] rockchip_pd_power+0x6c4/0xbc0
> > [ 0.702638][ C5] rockchip_pd_power_off+0x18/0x28
> > [ 0.702652][ C5] _genpd_power_off+0x178/0x388
> > [ 0.702663][ C5] genpd_power_off+0x188/0x2e4
> > [ 0.702673][ C5] genpd_power_off_work_fn+0x54/0xe4
> > [ 0.702683][ C5] process_one_work+0x254/0x5a0
> > [ 0.702696][ C5] worker_thread+0x3ec/0x920
> > [ 0.702707][ C5] kthread+0x168/0x1dc
> > [ 0.702716][ C5] ret_from_fork+0x10/0x20
> > [ 0.702726][ C5] SMP: stopping secondary CPUs
> >
> >
> > _______________________________________________
> > Linux-rockchip mailing list
> > linux-rockc...@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-rockchip

-- 
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog