On 20.01.23 14:47, Daniil Tatianin wrote:
> This series introduces new qemu_prealloc_mem_with_timeout() api,
> which allows limiting the maximum amount of time to be spent on memory
> preallocation. It also adds prealloc statistics collection that is
> exposed via an optional timeout handler.
> This new api is then utilized by hostmem for guest RAM preallocation
> controlled via new object properties called 'prealloc-timeout' and
> 'prealloc-timeout-fatal'.
> This is useful for limiting VM startup time on systems with
> unpredictable page allocation delays due to memory fragmentation or the
> backing storage. The timeout can be configured to either simply emit a
> warning and continue VM startup without having preallocated the entire
> guest RAM or just abort startup entirely if that is not acceptable for
> a specific use case.
The major use case for preallocation is memory resources that cannot be
overcommitted (hugetlb, file blocks, ...), to avoid running out of such
resources later, while the guest is already running, and crashing it.
Allocating only a fraction "because it takes too long" looks quite
useless in that (main use-case) context. We shouldn't encourage QEMU
users to play with fire in such a way. IOW, there should be no way
around "prealloc-timeout-fatal". Either preallocation succeeded and the
guest can run, or it failed, and the guest can't run.
... but then, management tools can simply start QEMU with "-S", start
their own timer, and zap QEMU if it doesn't manage to come up in time,
and simply start a new QEMU instance without preallocation enabled.
The "good" thing about that approach is that it will also cover any
implicit memory preallocation, like using mlock() or VFIO, that doesn't
run in the ordinary per-hostmem preallocation context. If setting QEMU up
takes too long, you might want to try on a different hypervisor in your
cluster instead.
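
To illustrate the management-side approach, here is a rough sketch in
Python. It is not taken from any existing management tool: the function
names and the is_ready() callback are hypothetical, and how readiness is
actually detected (for example, the QMP monitor of a QEMU started with
"-S" answering a status query) is left to the caller.

```python
import subprocess
import sys
import time


def start_with_deadline(argv, is_ready, deadline_s, poll_s=0.05):
    """Start argv and wait for it to become ready.

    Returns the Popen object if is_ready(proc) becomes true within
    deadline_s seconds. Otherwise the process is killed (if it is still
    running) and None is returned.
    """
    proc = subprocess.Popen(argv)
    end = time.monotonic() + deadline_s
    while time.monotonic() < end:
        if proc.poll() is not None:
            # The process exited on its own: setup failed.
            return None
        if is_ready(proc):
            return proc
        time.sleep(poll_s)
    # Took too long (e.g. stuck preallocating): zap it.
    proc.kill()
    proc.wait()
    return None


def start_guest(argv_prealloc, argv_fallback, is_ready, deadline_s):
    """Try the preallocating command line first, then the fallback.

    Returns (proc, which), where which is "prealloc" or "fallback".
    proc is None if the fallback did not come up in time either.
    """
    proc = start_with_deadline(argv_prealloc, is_ready, deadline_s)
    if proc is not None:
        return proc, "prealloc"
    proc = start_with_deadline(argv_fallback, is_ready, deadline_s)
    return proc, "fallback"
```

Here argv_prealloc would be the QEMU command line with "-S" and
preallocation enabled, and argv_fallback the same command line without
preallocation. Instead of a local fallback, the caller could just as
well retry on a different hypervisor. Because the deadline covers the
whole setup phase, it does not matter where in QEMU the time is spent.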
I don't immediately see why we want to make our preallocation+hostmem
implementation in QEMU more complicated for such a use case.
--
Thanks,
David / dhildenb