On 15/04/2026 00:57, Samuel Thibault wrote:
> Hello,
>
> Michael Kelly, on Sat. 04 April 2026 20:50:05 +0100, wrote:
>> I'm confident that a solution to the rumpdisk_device_read/write could be
>> implemented using bounce buffers allocated using vm_allocate_contiguous.
>
> Yes, that'd be better than hacking gnumach :)

I've spent some time looking at different methods of solving this problem.

I can't see a method of disabling readdisklabel(), and other parts of the rumpkernel might rely on 32-bit physical memory addresses, if not now then in the future.

I thought it might be possible to disable DMA during rump_sys_open and then enable it after that so that rumpdisk_device_read/write would use DMA. I couldn't see a method of doing this without having knowledge of the specific hardware in use.

My final approach has been more successful. All rumpkernel memory allocations are supplied by rumpuser_malloc. I overloaded the implementation in librumpuser to supply pages using vm_allocate_contiguous, with the restriction that 32-bit physical addresses are required. Booting with the i440fx chipset and more than 4G of RAM then succeeds, provided that suitable bounce buffers are used for all reads/writes from rumpdisk.
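
As a rough illustration only, the overloaded allocator could be shaped something like the sketch below. It keeps the rumpuser_malloc(size, alignment, memp) argument shape, but everything prefixed sk_ is hypothetical: the backend is a simulated stand-in for vm_allocate_contiguous (whose real call is not reproduced here) so that the sketch runs outside the Hurd, and the "physical" addresses it reports are invented.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SK_PAGE_SIZE  4096u
#define SK_PHYS_LIMIT 0x100000000ULL /* first physical address beyond 32 bits */

/* Hypothetical backend standing in for vm_allocate_contiguous: returns 0 on
 * success and reports the virtual and physical address of a physically
 * contiguous region lying entirely below pmax. */
typedef int (*sk_contig_alloc_fn)(size_t size, uint64_t pmax, size_t align,
                                  void **vaddr, uint64_t *paddr);

/* Simulated backend: memory comes from the C library and the "physical"
 * address is made up from a counter. Assumption for demonstration only. */
static uint64_t sk_fake_next_paddr = 0x10000000ULL;
uint64_t sk_last_paddr; /* recorded only so the sketch can be checked */

static int sk_fake_backend(size_t size, uint64_t pmax, size_t align,
                           void **vaddr, uint64_t *paddr)
{
    uint64_t pa = (sk_fake_next_paddr + align - 1) & ~((uint64_t)align - 1);
    size_t padded = (size + align - 1) / align * align; /* aligned_alloc rule */
    void *va;

    if (pa + size > pmax)
        return -1;
    va = aligned_alloc(align, padded);
    if (va == NULL)
        return -1;
    sk_fake_next_paddr = pa + size;
    *vaddr = va;
    *paddr = pa;
    return 0;
}

static sk_contig_alloc_fn sk_backend = sk_fake_backend;

/* Returns 0 on success or an errno value. Only whole pages are handled;
 * sub-page requests are out of scope for this sketch. */
int sk_rumpuser_malloc(size_t howmuch, int alignment, void **memp)
{
    size_t align = alignment > 0 ? (size_t)alignment : SK_PAGE_SIZE;
    void *va;
    uint64_t pa;

    assert(memp != NULL);
    if (align < SK_PAGE_SIZE)
        align = SK_PAGE_SIZE;
    if (howmuch == 0 || howmuch % SK_PAGE_SIZE != 0 ||
        (align & (align - 1)) != 0)
        return EINVAL;
    /* Every allocation is forced below 4G, whether or not it needs to be. */
    if (sk_backend(howmuch, SK_PHYS_LIMIT, align, &va, &pa) != 0)
        return ENOMEM;
    sk_last_paddr = pa;
    *memp = va;
    return 0;
}
```

The point of the sketch is only that the 32-bit limit is applied unconditionally inside the allocator, since the caller passes no hint about whether the memory will ever be used for DMA.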

The obvious drawback is a performance hit from using the bounce buffer when it is not necessarily required, for example when running the qemu q35 chipset. Might it be possible to allocate the buffer supplied to rumpdisk_device_read/write with 32-bit physaddr memory in the first instance, perhaps by adding some property to the mach_port of the server? This is merely thinking aloud (I've not even looked at where the buffer gets allocated), but I was imagining some optimisation like we did for MACH_PORT_SET_KTYPE. Maybe the performance hit will be minor anyway.
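
To make the cost concrete, here is a minimal sketch of a write that goes through a bounce buffer. It is not the rumpdisk code: sk_bounce_write, the sk_raw_write_fn callback and the in-memory sk_mem_disk are all hypothetical names, and the bounce buffer is assumed to have been allocated from DMA-safe memory by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical low-level transfer: writes len bytes from a DMA-safe buffer
 * at byte offset off. Returns 0 on success. */
typedef int (*sk_raw_write_fn)(void *dev, const void *dma_buf,
                               size_t len, size_t off);

/* Copy the caller's data through the bounce buffer one chunk at a time.
 * The memcpy per chunk is the overhead paid even when the caller's own
 * buffer would have been usable directly. */
int sk_bounce_write(void *dev, sk_raw_write_fn raw_write,
                    void *bounce, size_t bounce_len,
                    const void *data, size_t len, size_t off)
{
    const unsigned char *src = data;

    assert(raw_write != NULL && bounce != NULL);
    if (bounce_len == 0)
        return -1;
    while (len > 0) {
        size_t chunk = len < bounce_len ? len : bounce_len;
        int err;

        memcpy(bounce, src, chunk);
        err = raw_write(dev, bounce, chunk, off);
        if (err != 0)
            return err;
        src += chunk;
        off += chunk;
        len -= chunk;
    }
    return 0;
}

/* In-memory stand-in for a disk, used only to exercise the sketch. */
struct sk_mem_disk {
    unsigned char bytes[16384];
    size_t transfers;
};

int sk_mem_disk_write(void *dev, const void *dma_buf, size_t len, size_t off)
{
    struct sk_mem_disk *d = dev;

    if (off > sizeof d->bytes || len > sizeof d->bytes - off)
        return -1;
    memcpy(d->bytes + off, dma_buf, len);
    d->transfers++;
    return 0;
}
```

If the incoming buffer were already guaranteed to sit below 4G, the memcpy and the chunking could be skipped and the buffer handed to the transfer directly, which is the optimisation being wondered about above.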

There might be some memory allocations in rumpkernel that require neither 32-bit physaddr nor contiguous allocation. No context is provided to rumpuser_malloc to make such determinations, so this solution is all or nothing.

I've only seen rumpuser_malloc() called with size being a multiple of PAGE_SZ. For other cases the function would need to implement a custom memory allocator using vm_allocate_contiguous. For the time being I could emit a warning and just use a whole page for small chunks.
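
The interim behaviour for small chunks could be as simple as the following sketch, which rounds a request up to whole pages and warns when it had to. The name sk_pages_for and the 4096-byte page size are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdio.h>

#define SK_PAGE_SZ 4096u /* assumed page size for the sketch */

/* Number of whole pages needed to hold howmuch bytes. Emits a warning on
 * stderr when the request is not already a multiple of the page size. */
size_t sk_pages_for(size_t howmuch)
{
    size_t pages = howmuch / SK_PAGE_SZ + (howmuch % SK_PAGE_SZ != 0);

    if (howmuch % SK_PAGE_SZ != 0)
        fprintf(stderr,
                "rumpuser_malloc: %zu bytes is not a multiple of the page "
                "size, using %zu whole page(s)\n", howmuch, pages);
    return pages;
}
```

A 100-byte request would then consume a full page, which is wasteful but harmless as long as such requests stay rare.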

Compiling gnumach illustrated that rumpuser_malloc is only called rarely after boot. It seems to me therefore that there is no reason to be concerned with any performance issues associated with rumpuser_malloc() and that its robustness is the main issue.

I haven't yet addressed the following points that have been raised in this thread:

1) How NetBSD manages to guarantee that 32 bit physical addresses are used within the i440fx driver. The memory is allocated using 'pools' which perhaps are either implicitly or explicitly restricted to certain memory segments.

2) I haven't tested this entire problem with i386-pae but -DPAE is specified in the rumpkernel build so I'm supposing that I should.

3) I haven't looked at why q35 works normally but I think that Samuel is probably correct that SATA permits more than 32 bits of physical memory addressing.

The main purpose in sending this update is to gauge whether this is an appropriate direction. If not, it's back to the drawing board, but if it is then I can continue developing it with greater confidence.

Tasks on the list:

1) Fix vm_allocate_contiguous to take into account larger alignment requirements. This issue is already known: rumpdisk has a suboptimal workaround for it.

2) Work out the best method of replacing the current rumpuser_malloc/free with custom ones within the rumpkernel build system.
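
For the alignment item, the arithmetic involved can be sketched as below. This is the generic over-allocate-and-round-up technique, shown only to illustrate what a larger alignment requirement costs when the allocator guarantees page alignment and nothing more; it is not claimed to be what rumpdisk currently does, and the sk_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round a (physical) address up to the next multiple of align, which must
 * be a power of two. */
uint64_t sk_align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Size to request so that a block of size bytes aligned to align is
 * guaranteed to fit, when the allocator only guarantees base_align.
 * Both alignments are assumed to be powers of two. */
size_t sk_padded_size(size_t size, size_t align, size_t base_align)
{
    return align > base_align ? size + (align - base_align) : size;
}
```

With 4096-byte pages, an 8192-byte block that must be 64K-aligned needs up to 61440 extra bytes of padding, which is why handling the alignment inside the allocator would be preferable.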

Cheers,

Mike.







