Hi Stephen, while we do not have this particular Rockchip SoC right now, we have A76 + A55 cores. Thanks!
BR, Adam On Mon Oct 13, 2025 at 10:28:14 +0800, yy18513676366 wrote: > Hi Adam, > > We’re using the Radxa SiRider S1 development board, which is based on the > Rockchip RK3588 SoC. > You can find more details here: https://arace.tech/products/radxa-sirider-s1. > The CPU consists of 4 Cortex-A76 cores and 2 Cortex-A55 cores. > > Best, > Stephen.yang > > > > > > At 2025-10-06 06:00:44, "Adam Lackorzynski" <[email protected]> wrote: > >Hi Stephen, > > > >I doubt this is hardware, it seldomly is. Would you be able to share > >which Arm core it is if you can? > >I'll try to reproduce here, your indication of increasing the timer > >frequency is a good hint. And knowing which core or at least category of > >core can be helpful. > > > > > >BR, Adam > > > >On Thu Oct 02, 2025 at 13:02:50 +0800, yy18513676366 wrote: > >> Hi Adam, > >> > >> > >> I truly appreciate your reply. > >> I actually encountered this issue on real hardware rather than QEMU. > >> May I ask if this problem could be related to the hardware itself? I’m not > >> quite sure I fully understand. > >> > >> > >> Best regards > >> Stephen.yang > >> > >> > >> > >> > >> > >> > >> > >> 在 2025-09-30 15:50:22,"Adam Lackorzynski" <[email protected]> 写道: > >> >Hi Stephen, > >> > > >> >ok, thanks, that's tricky indeed. > >> > > >> >In case you are doing this with QEMU, could you please make sure you > >> >have the following change in your QEMU?: > >> >https://lists.gnu.org/archive/html/qemu-devel/2024-09/msg02207.html > >> > > >> >Or do you see this on hardware? > >> > > >> > > >> >Thanks, > >> >Adam > >> > > >> >On Tue Sep 30, 2025 at 11:14:57 +0800, yy18513676366 wrote: > >> >> Hi Adam, > >> >> > >> >> > >> >> Thank you very much for your reply — it really gave me some hope. > >> >> > >> >> > >> >> This issue is indeed difficult to reproduce reliably, which has been > >> >> one of the main challenges during my debugging. > >> >> So far, I have found that increasing the vtimer interrupt frequency, > >> >> while keeping the traditional handling mode (i.e., without direct > >> >> injection), > >> >> makes the problem significantly easier to reproduce. > >> >> > >> >> > >> >> The relevant changes are as follows. > >> >> 1、In this setup, the vtimer is adjusted from roughly one trigger per > >> >> millisecond to approximately one trigger per microsecond, > >> >> and the system remains stable and functional: > >> >> > >> >> > >> >> diff --git a/src/kern/arm/timer-arm-generic.cpp > >> >> b/src/kern/arm/timer-arm-generic.cpp > >> >> index a040cf46..b4cbbceb 100644 > >> >> --- a/src/kern/arm/timer-arm-generic.cpp > >> >> +++ b/src/kern/arm/timer-arm-generic.cpp > >> >> @@ -64,7 +64,8 @@ void Timer::init(Cpu_number cpu) > >> >> if (cpu == Cpu_number::boot_cpu()) > >> >> { > >> >> _freq0 = frequency(); > >> >> - _interval = Unsigned64{_freq0} * Config::Scheduler_granularity / > >> >> 1000000; > >> >> + //_interval = Unsigned64{_freq0} * Config::Scheduler_granularity > >> >> / 1000000; > >> >> + _interval = Unsigned64{_freq0} * Config::Scheduler_granularity / > >> >> 1000000000; > >> >> printf("ARM generic timer: freq=%ld interval=%ld cnt=%lld\n", > >> >> _freq0, _interval, Gtimer::counter()); > >> >> assert(_freq0); > >> >> > >> >> > >> >> 2、In addition, I selected the mode where interrupts are not directly > >> >> injected: > >> >> diff --git a/src/Kconfig b/src/Kconfig > >> >> index 4391c996..55deeb1c 100644 > >> >> --- a/src/Kconfig > >> >> +++ b/src/Kconfig > >> >> @@ -367,7 +367,7 @@ config IOMMU > >> >> config IRQ_DIRECT_INJECT > >> >> bool "Support direct interrupt forwarding to guests" > >> >> depends on CPU_VIRT && HAS_IRQ_DIRECT_INJECT_OPTION > >> >> - default y > >> >> + default n > >> >> help > >> >> Adds support in the kernel to allow the VMM to let Fiasco > >> >> directly > >> >> forward hardware interrupts to a guest. This enables just the > >> >> > >> >> At the moment, this is the only way I have found that can noticeably > >> >> increase the reproduction rate. > >> >> Once again, thank you for your valuable time and feedback! > >> >> > >> >> Best regards, > >> >> Stephen.yang > >> >> > >> >> > >> >> > >> >> > >> >> At 2025-09-29 00:11:41, "Adam Lackorzynski" <[email protected]> wrote: > >> >> >Hi, > >> >> > > >> >> >On Wed Sep 17, 2025 at 13:57:43 +0800, yy18513676366 wrote: > >> >> >> When running a virtual machine, I encounter an assertion failure > >> >> >> after the VM > >> >> >> has been up for some time. The kernel crashes in src/kern/arm/ > >> >> >> thread-arm-hyp.cpp, specifically in the function > >> >> >> vcpu_vgic_upcall(unsigned > >> >> >> virq): > >> >> >> > >> >> >> vcpu_vgic_upcall(unsigned virq) > >> >> >> { > >> >> >> ...... > >> >> >> assert(state() & Thread_vcpu_user); > >> >> >> ...... > >> >> >> } > >> >> >> > >> >> >> Based on source code inspection and preliminary debugging, the > >> >> >> problem seems to > >> >> >> be related to the management of the Thread_vcpu_user state. > >> >> >> > >> >> >> 1 Under normal circumstances, the vcpu_resume path (transitioning > >> >> >> from the > >> >> >> kernel back to the guest OS) updates the vCPU state to include > >> >> >> Thread_vcpu_user. However, if an interrupt is delivered during this > >> >> >> transition > >> >> >> while the receiving side is not yet ready, the vCPU frequently > >> >> >> return to the > >> >> >> kernel (via vcpu_return_to_kernel) and subsequently process the > >> >> >> interrupt > >> >> >> through guest_irq in vcpu_entries. In this situation, the expected > >> >> >> update of > >> >> >> Thread_vcpu_user may not yet have taken place, which seems result in > >> >> >> the assert > >> >> >> being triggered when a VGIC interrupt is involved. > >> >> >> > >> >> >> 2 A similar condition seems to occur in the vcpu_async_ipc path. > >> >> >> At the end > >> >> >> of IPC handling, this function explicitly clears the > >> >> >> Thread_vcpu_user flag. If > >> >> >> a VGIC interrupt is delivered during this phase, the absence of the > >> >> >> expected > >> >> >> Thread_vcpu_user state seems to lead to the same assertion failure. > >> >> >> > >> >> >> I would like to confirm if the two points above are correct, and > >> >> >> what steps I > >> >> >> should take next to further debug this issue. > >> >> > > >> >> >Thanks for repording. At least the description sounds reasonable to me. > >> >> > > >> >> >Do you have a good way of reliably reproducing this situation? > >> >> > > >> >> >> In addition, I have some assumptions I would like to confirm: > >> >> >> > >> >> >> First, for IPC between non-vcpu threads, the L4 microkernel handles > >> >> >> message > >> >> >> delivery and scheduling (wake/schedule) directly, without requiring > >> >> >> any > >> >> >> forwarding through uvmm. Similarly, interrupts bound via the > >> >> >> interrupt > >> >> >> controller (ICU) to a non-vcpu thread or handler are also managed by > >> >> >> the kernel > >> >> >> and scheduler, and therefore do not necessarily involve uvmm. > >> >> > > >> >> >IPCs between threads are handled by the microkernel. vcpu-thread vs. > >> >> >non-vcpu-thread is just making the difference regarding how it is > >> >> >delivered to the thread. For a non-vcpu thread the receiver has to wait > >> >> >in IPC to get it, in vcpu mode the IPC is received by causing a vcpu > >> >> >event and bringing the vcpu to its entry. This also works without > >> >> >virtualization (note that vcpus also work without hw-virtualization). > >> >> >For interrupts it is the same. For non-vcpu threads they have to block > >> >> >in IPC to get an interrupt, or for vcpu threads, they will be brought > >> >> >to > >> >> >their entry. > >> >> > > >> >> >> Second, passthrough interrupts, when not delivered in > >> >> >> direct-injection mode, > >> >> >> are routed to uvmm for handling if they are bound to a vCPU. > >> >> >> Likewise, services > >> >> >> provided by uvmm (such as virq) are also bound to a vCPU and > >> >> >> therefore require > >> >> >> forwarding through uvmm. > >> >> > > >> >> >Yes. Direct injection will only happen when the vcpu is running. > >> >> > > >> >> >> There seems to have been a similar question in the past, but it does > >> >> >> not seem > >> >> >> to have been resolved. > >> >> >> > >> >> >> Re: Assertion failure error in kernel vgic interrupt processing - > >> >> >> l4-hackers - > >> >> >> OS Site > >> >> >> > >> >> >> I wonder if my questions are related to that post, and if any > >> >> >> solutions exist. > >> >> > > >> >> >Thanks, we need to work on it. Reproducing this situation on our side > >> >> >would be very valuable. > >> >> > > >> >> > > >> >> >Thanks, Adam > >> >_______________________________________________ > >> >l4-hackers mailing list -- [email protected] > >> >To unsubscribe send an email to [email protected] > >_______________________________________________ > >l4-hackers mailing list -- [email protected] > >To unsubscribe send an email to [email protected] _______________________________________________ l4-hackers mailing list -- [email protected] To unsubscribe send an email to [email protected]
