On 21/04/2026 01:10, Samuel Thibault wrote:
> Hello,
>
> Michael Kelly, on Mon. 20 April 2026 14:29:43 +0100, wrote:
> > There is no context provided to rumpuser_malloc to make such
> > determinations so this solution is an all or nothing.
>
> Yes, unfortunately. That being said, if people have >32b memory, they
> have a lot of memory, so dedicating quite a bit of it for rumpdisk is
> not that bad. I'd however really prefer to avoid it when we know for
> sure that the driver is fine with 64b.

It would certainly be better not to alter any of the memory allocations or make use of bounce buffers when they are not required. I don't think, though, that there is any performance degradation to be overly concerned about. I've made some measurements:

1) There are 0x13c pages allocated using rumpuser_malloc whilst using i440fx and 0x180 pages using q35. Almost all of these are allocated during boot, with about 4 further pages allocated during a test compilation of Hurd. This is seemingly the maximum, as an additional build of gnumach does not allocate further pages. So it seems that at most 1.5M of memory is allocated below HIGHMEM rather than above it. rumpdisk.static has quite a large resident set size, so most of its memory is allocated by other means. I've confirmed that the free pages in each of the memory segments are very similar to those of the existing code after boot, although the figures don't quite tally with the description above. This is using q35 with 4G RAM:

New code after boot:
free:                        15M
free:                       887M
free:                       996M
free:                      1678M

Existing code after boot:
free:                        15M
free:                       888M
free:                       998M
free:                      1682M

Running with a second disk device makes no difference to the memory allocation via rumpuser_malloc.
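
For anyone wanting to check the arithmetic in (1), here is a minimal sketch in C. It assumes a 4 KiB page size (the usual x86 value, not stated above), and the helper names pages_to_kib and total_free_delta are made up for illustration. The free-memory figures are the rounded values listed above, so the summed difference is only approximate.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages, the usual x86 page size. */
#define SKETCH_PAGE_BYTES 4096ul

/* Convert a page count into KiB. */
unsigned long pages_to_kib(unsigned long pages)
{
    return pages * (SKETCH_PAGE_BYTES / 1024ul);
}

/*
 * Sum the per-segment difference (existing - new) of free memory.
 * Inputs are in whatever unit the caller uses (M in the listing above).
 */
long total_free_delta(const long *existing, const long *changed, size_t n)
{
    assert(existing != NULL && changed != NULL);

    long delta = 0;
    for (size_t i = 0; i < n; i++)
        delta += existing[i] - changed[i];
    return delta;
}
```

With these helpers, 0x13c pages come to 1264 KiB and 0x180 pages to 1536 KiB (1.5M), matching the maximum quoted above. Summing the four listed segments gives a difference of 7M between the two boots, which is larger than 1.5M and is consistent with the remark that the figures do not quite tally.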

2) rumpdisk_device_read and rumpdisk_device_write must now use intermediate buffers allocated from memory with physical addresses within the 32-bit space. rumpdisk_device_read, however, already uses a bounce buffer to avoid alignment issues. The performance loss for reads therefore arises only from calling vm_allocate_contiguous rather than vm_allocate. rumpdisk_device_write previously used an intermediate buffer (allocated with vm_allocate_contiguous) only when the supplied buffer was not page aligned. I had to change this to always use the intermediate buffer regardless.

I haven't sampled what percentage of calls to rumpdisk_device_write are actually aligned, so I can't really quantify the performance loss. I cannot, however, measure any significant difference in average compilation times with and without these changes, so possibly misaligned writes dominate.
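
To make the change to the write path in (2) concrete, here is a rough, self-contained sketch in C. It is not the rumpdisk code: alloc_low_buffer is a hypothetical stand-in that uses aligned_alloc and does not actually constrain physical addresses (the real code uses vm_allocate_contiguous for that), and write_with_bounce, record_backend and struct sink are illustrative names. The always_bounce flag selects between the old policy (bounce only when misaligned) and the new one (always bounce).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */

/* Hypothetical stand-in for a page-aligned, low-memory allocation. */
void *alloc_low_buffer(size_t len)
{
    size_t rounded = (len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE
                     * SKETCH_PAGE_SIZE;
    if (rounded == 0)
        rounded = SKETCH_PAGE_SIZE;
    return aligned_alloc(SKETCH_PAGE_SIZE, rounded);
}

int is_page_aligned(const void *p)
{
    return ((uintptr_t)p % SKETCH_PAGE_SIZE) == 0;
}

/* Toy backend: records how much was written and whether the buffer
   it was handed was page aligned. */
struct sink {
    size_t bytes;
    int last_aligned;
};

int record_backend(const void *buf, size_t len, void *ctx)
{
    struct sink *s = ctx;
    s->bytes += len;
    s->last_aligned = is_page_aligned(buf);
    return 0;
}

typedef int (*backend_write_fn)(const void *buf, size_t len, void *ctx);

/*
 * Write `len` bytes from `data` through `backend`.
 * always_bounce == 0: copy only when `data` is not page aligned.
 * always_bounce != 0: always copy into an intermediate buffer first.
 * `*bounced` reports whether the extra copy was made.
 */
int write_with_bounce(const void *data, size_t len, int always_bounce,
                      backend_write_fn backend, void *ctx, int *bounced)
{
    assert(data != NULL && backend != NULL && bounced != NULL);

    *bounced = always_bounce || !is_page_aligned(data);
    if (!*bounced)
        return backend(data, len, ctx);

    void *tmp = alloc_low_buffer(len);
    if (tmp == NULL)
        return -1;
    memcpy(tmp, data, len);
    int rc = backend(tmp, len, ctx);
    free(tmp);
    return rc;
}
```

Under the old policy only misaligned callers pay for the allocation and the memcpy; under the new policy every write does. If most writes were already misaligned, the two policies would cost about the same, which would fit the compilation timings being indistinguishable.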

Cheers,

Mike.

