Hi Colin,

Thanks for digging into this. Your theory sounds very reasonable, but I'll need to look into it in more detail.

There seem to be two problems here:
1. The Double_insertion seems to be triggered by a preceding Out_of_ram
   during DMA buffer allocation. This is definitely a bug.
2. The initial Out_of_ram exception might also be an indicator that
   something strange is happening. What component is logging the
   exception? Normally, the DMA buffer allocation is retried by the
   Client with additional RAM. The allocation should therefore succeed
   during the second try and not log any exception.
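
To illustrate the retry behaviour described in point 2, here is a rough, self-contained sketch. It is not the actual Genode code: Session, alloc_with_retry and the quota numbers are hypothetical, and Out_of_ram is a local stand-in for Genode::Out_of_ram.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>

// Local stand-in for Genode::Out_of_ram (assumption, not the real type).
struct Out_of_ram : std::exception { };

// Hypothetical server-side session that accounts allocations against a quota.
struct Session {
    std::size_t quota;
    std::size_t used = 0;

    explicit Session(std::size_t q) : quota(q) { }

    // The client donates additional RAM to the session.
    void upgrade(std::size_t bytes) { quota += bytes; }

    // Throws Out_of_ram if the remaining quota does not cover the request.
    // Returns the offset of the new buffer as a fake "address".
    std::size_t alloc_dma_buffer(std::size_t size) {
        if (used + size > quota) throw Out_of_ram();
        used += size;
        return used - size;
    }
};

// Hypothetical client-side wrapper: on Out_of_ram, upgrade the session
// quota and try again, giving up after max_attempts tries.
std::size_t alloc_with_retry(Session &session, std::size_t size,
                             std::size_t upgrade_step, unsigned max_attempts,
                             unsigned &attempts_used)
{
    assert(max_attempts > 0);
    for (attempts_used = 1; ; ++attempts_used) {
        try {
            return session.alloc_dma_buffer(size);
        } catch (Out_of_ram const &) {
            if (attempts_used >= max_attempts) throw;
            session.upgrade(upgrade_step);
        }
    }
}
```

In this model the first Out_of_ram is caught and handled by the client-side wrapper, so nothing would be logged unless every attempt fails.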

Would you mind creating a new GitHub issue so that we can move the discussion there?

Cheers
Johannes


On 20/06/2024 13:22, Colin Parker wrote:
Hello Genodians,
     It's been a while, but I decided to try out Sculpt 24.04. I have mostly
good news: the ps2 issue I mentioned last time seems to be resolved, thanks
for that. The system boots up nicely and looks to have a lot of cool
features. However, I can't get the nvme driver to work. It reads the
partition table, but following any access to a partition, the platform_drv
component fails with the message:

Error: Uncaught exception of type
'Genode::Final_table<Intel::Level_1_descriptor<12u> >::Double_insertion'

I was able to trace this a little: the exception is thrown within
Driver::Session_component::alloc_dma_buffer(unsigned long, Genode::Cache),
specifically within
Genode::Registry<Driver::Io_mmu_domain>::for_each<...>(...), so I assume
this is related to IO_mmu in some way. If it helps, the address it attempts
to allocate is 0x800000 with size 0x400000.
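
As a rough check of those numbers, the following sketch computes how many pages and how many last-level tables such a region would touch. It assumes that the 12u template argument denotes 4 KiB pages and that each last-level table holds 512 entries; neither assumption is confirmed in this thread, and the names below are purely illustrative.

```cpp
#include <cstdint>

// Assumption: 12u in Level_1_descriptor<12u> is log2 of the page size (4 KiB).
constexpr std::uint64_t page_size_log2 = 12;

// Assumption: 512 entries per last-level table (9 index bits per level).
constexpr std::uint64_t entries_per_table = 512;

// Address range covered by one last-level table: 512 * 4 KiB = 2 MiB.
constexpr std::uint64_t table_span = entries_per_table << page_size_log2;

// Number of pages needed to cover `size` bytes, rounded up.
constexpr std::uint64_t pages(std::uint64_t size)
{
    return (size + (std::uint64_t(1) << page_size_log2) - 1) >> page_size_log2;
}

// Number of last-level tables touched by the range [addr, addr + size).
constexpr std::uint64_t final_tables(std::uint64_t addr, std::uint64_t size)
{
    return size == 0 ? 0
                     : (addr + size - 1) / table_span - addr / table_span + 1;
}

static_assert(pages(0x400000) == 1024, "4 MiB is 1024 pages of 4 KiB");
static_assert(final_tables(0x800000, 0x400000) == 2,
              "the region spans two last-level tables");
```

Under those assumptions the request needs 1024 entries spread over two last-level tables, so a failure partway through could leave some of those entries already in place.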

Interestingly, an exception is thrown twice when attempting to allocate that
region, the first time on startup. In that case, based on the log this is
probably happening when the nvme partition table is read. However, this
exception is not fatal, because it's of type Genode::Out_of_ram. The second
time is when trying to use a partition and the exception is
Double_insertion. One theory is that, prior to the first exception, the
page table in IO_mmu gets partially filled, but the allocator ignores that
this happened due to the exception. So, when another request is made, the
table is partially full, although the allocator thinks it's a free region.
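
The theory can be reproduced in a toy model. This is only a sketch: Toy_table, its budget counter and both insert functions are hypothetical and do not reflect how platform_drv actually manages its page tables, and the two exception types are local stand-ins for the Genode ones.

```cpp
#include <bitset>
#include <cstddef>
#include <exception>

// Local stand-ins for the Genode exception types (assumption).
struct Out_of_ram       : std::exception { };
struct Double_insertion : std::exception { };

// Hypothetical table: each inserted entry consumes one unit of budget.
struct Toy_table {
    static constexpr std::size_t entries = 8;
    std::bitset<entries> present;
    std::size_t budget;

    explicit Toy_table(std::size_t b) : budget(b) { }

    void insert_one(std::size_t i) {
        if (present.test(i)) throw Double_insertion();
        if (budget == 0)     throw Out_of_ram();
        --budget;
        present.set(i);
    }

    // No cleanup: entries inserted before the exception stay in the table.
    void insert_range_leaky(std::size_t first, std::size_t count) {
        for (std::size_t i = 0; i < count; ++i)
            insert_one(first + i);
    }

    // Rolls back the entries inserted so far before re-throwing.
    void insert_range_safe(std::size_t first, std::size_t count) {
        std::size_t done = 0;
        try {
            for (; done < count; ++done)
                insert_one(first + done);
        } catch (...) {
            while (done > 0) {
                --done;
                present.reset(first + done);
                ++budget;
            }
            throw;
        }
    }
};
```

With the leaky variant, a range insertion that runs out of budget halfway leaves the first entries in place, so a later attempt on the same range hits Double_insertion. With the rollback variant, the later attempt starts from an empty range.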

I considered giving more RAM to various components (platform_drv and nvme),
but that didn't help, and no component appears to be actually close to its
limit. The next step in my debugging would be to try to figure out
where/why the Out_of_ram is thrown. I suppose that this is in a higher-level
page allocator during allocation of space for a lower-level page
table? But why isn't this fixed by increasing the quota? I'm curious if
anyone here can offer insights.

Best regards,
CP


_______________________________________________
users mailing list -- users@lists.genode.org
To unsubscribe send an email to users-le...@lists.genode.org
Archived at 
https://lists.genode.org/mailman3/hyperkitty/list/users@lists.genode.org/message/KNOZW6HPFTTA676OCPQ43ARAQZXB2JED/
_______________________________________________
users mailing list -- users@lists.genode.org
To unsubscribe send an email to users-le...@lists.genode.org
Archived at 
https://lists.genode.org/mailman3/hyperkitty/list/users@lists.genode.org/message/7H5XYDP3H3GK7DVMNUK2YGZYQTXSHF44/
