On 29.05.24 08:48, Michal Prívozník wrote:
On 5/28/24 18:47, David Hildenbrand wrote:
Am 28.05.24 um 18:15 schrieb Michal Privoznik:
./build/qemu-system-x86_64 \ -m
size=8389632k,slots=16,maxmem=25600000k \ -object
'{"qom-type":"memory-backend-file","id":"ram-node0","mem-path":"hugepages2M","prealloc":true,"size":8590983168,"host-nodes":[0],"policy":"bind"}'
 \ -numa node,nodeid=0,cpus=0,memdev=ram-node0

For DIMMs and friends we now (again) enforce that the size must be
aligned to the page size:

commit 540a1abbf0b243e4cfb4333c5d30a041f7080ba4
Author: David Hildenbrand <da...@redhat.com>
Date:   Wed Jan 17 14:55:54 2024 +0100

     memory-device: reintroduce memory region size check

     We used to check that the memory region size is multiples of the
overall
     requested address alignment for the device memory address.

     We removed that check, because there are cases (i.e., hv-balloon) where
     devices unconditionally request an address alignment that has a very
large
     alignment (i.e., 32 GiB), but the actual memory device size might
not be
     multiples of that alignment.

     However, this change:

     (a) allows for some practically impossible DIMM sizes, like "1GB+1
byte".
     (b) allows for DIMMs that partially cover hugetlb pages, previously
         reported in [1].
...

Partial hugetlb pages do not particularly make sense; wasting memory. Do
we expect people to actually ave working setup that we might break when
disallowing such configurations? Or would we actually help them identify
that something non-sensical is happening?

When using memory-backend-memfd, we already do get a proper error:

  ./build/qemu-system-x86_64 -m 2047m -object
memory-backend-memfd,id=ram-node0,hugetlb=on,size=2047m,reserve=off
-numa node,nodeid=0,cpus=0,memdev=ram-node0 -S
qemu-system-x86_64: failed to resize memfd to 2146435072: Invalid argument

So this only applies to memory-backend-file, where we maybe should fail
in a similar way?


Yeah, let's fail in that case. But non-aligned length is just one of
many reasons madvise()/mbind() can fail. What about the others? Should
we make them report an error or just a warning?

Regarding madvise(), we should report at least a warning.

In qemu_ram_setup_dump() we print an error if QEMU_MADV_DONTDUMP failed.

But we swallow any errors from memory_try_enable_merging() ... in general, we likely have to distinguish the "not supported by the OS" case from " actually supported but failed" case.

In the second patch, maybe we should really fail if something unexpected happens, instead of fake-changing the properties.

--
Cheers,

David / dhildenb


Reply via email to