Si-Wei Liu <si-wei....@oracle.com> writes:

> On 5/15/2025 11:40 PM, Markus Armbruster wrote:
>> Jason Wang <jasow...@redhat.com> writes:
>>
>>> On Thu, May 8, 2025 at 2:47 AM Jonah Palmer <jonah.pal...@oracle.com> wrote:
>>>> Current memory operations like pinning may take a lot of time at the
>>>> destination.  Currently they are done after the source of the migration
>>>> is stopped, and before the workload is resumed at the destination.  This
>>>> is a period where neither traffic can flow nor the VM workload can
>>>> continue (downtime).
>>>>
>>>> We can do better, because we know the memory layout of the guest RAM at
>>>> the destination from the moment all devices are initialized.  Moving
>>>> that operation earlier allows QEMU to communicate the maps to the kernel
>>>> while the workload is still running on the source, so Linux can start
>>>> mapping them.
>>>>
>>>> As a small drawback, there is a period during initialization where QEMU
>>>> cannot respond to QMP etc.  By some testing, this time is about
>>>> 0.2 seconds.
>>>
>>> Adding Markus to see if this is a real problem or not.
>>
>> I guess the answer is "depends", and to get a more useful one, we need
>> more information.
>>
>> When all you care about is the time from executing qemu-system-FOO to the
>> guest finishing boot, and the guest takes 10s to boot, then an extra 0.2s
>> won't matter much.
>
> There's no extra delay of 0.2s or more per se; it's just shifting the page
> pinning hiccup, whether it is 0.2s or something else, from guest boot time
> to before the guest is booted.  This saves guest boot time or start-up
> delay, but in turn the same delay is effectively charged to VM launch time.
> We follow the same model as VFIO, which sees the same hiccup during launch
> (at an early stage that no real mgmt software would care about).
>
>> When a management application runs qemu-system-FOO several times to
>> probe its capabilities via QMP, then even milliseconds can hurt.
>>
> It's not like that: the page pinning hiccup happens only once, at a very
> early stage of launching QEMU, i.e. there is no recurring delay every time
> a QMP command is issued.  How long the QMP response is delayed at that
> point depends on how much memory the VM has, but this is specific to VMs
> with VFIO or vDPA devices that have to pin memory for DMA.  That said,
> there is no extra delay at all if the QEMU command line has no vDPA device
> assignment; on the other hand, the same delay or QMP hiccup exists when
> VFIO is on the QEMU command line.
>
>> In what scenarios exactly is QMP delayed?
>
> As said, this is not a new problem for QEMU in particular; this QMP delay
> is not peculiar to vDPA, it exists with VFIO as well.
In what scenarios exactly is QMP delayed compared to before the patch?

> Thanks,
> -Siwei
>
>> You told us an absolute delay you observed.  What's the relative delay,
>> i.e. what's the delay with and without these patches?

Can you answer this question?

>> We need QMP to become available earlier in the startup sequence for
>> other reasons.  Could we bypass the delay that way?  Please understand
>> that this would likely be quite difficult: we know from experience that
>> messing with the startup sequence is prone to introduce subtle
>> compatibility breaks and even bugs.
>>
>>> (I remember VFIO has some optimization in the speed of the pinning,
>>> could vDPA do the same?)
>>
>> That's well outside my bailiwick :)
>>
>> [...]
>>
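
One possible way to quantify the relative delay asked about above is to
launch QEMU with a QMP socket and time how long it takes until query-status
answers, then repeat the run with and without the vDPA device (and with and
without the patches).  The sketch below is untested; the socket path and the
QEMU command line are placeholders for the configuration under test, while
-qmp unix:...,server=on,wait=off, -S, qmp_capabilities and query-status are
standard QEMU/QMP pieces.

#!/usr/bin/env python3
# Rough sketch (untested): measure how long after launching QEMU the QMP
# monitor answers query-status.  QMP_SOCK and QEMU_ARGV are placeholders;
# fill in the real binary, memory size, and vhost-vdpa/VFIO device
# arguments for the configuration under test.
import json
import socket
import subprocess
import time

QMP_SOCK = "/tmp/qmp-probe.sock"                  # placeholder path
QEMU_ARGV = [                                     # placeholder command line
    "qemu-system-x86_64",
    "-qmp", "unix:%s,server=on,wait=off" % QMP_SOCK,
    "-S",                                         # don't run guest CPUs yet
    # ... -m, -netdev vhost-vdpa / -device vfio-pci, etc. ...
]

def qmp_query_status(path, timeout=120.0):
    """Connect, read the greeting, negotiate capabilities, run query-status."""
    deadline = time.monotonic() + timeout
    while True:                                   # wait for the socket to accept
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.connect(path)
            break
        except OSError:
            sock.close()
            if time.monotonic() > deadline:
                raise
            time.sleep(0.01)
    f = sock.makefile("rw")
    json.loads(f.readline())                      # QMP greeting
    reply = None
    for cmd in ("qmp_capabilities", "query-status"):
        f.write(json.dumps({"execute": cmd}) + "\n")
        f.flush()
        while True:                               # skip asynchronous events
            reply = json.loads(f.readline())
            if "return" in reply or "error" in reply:
                break
    sock.close()
    return reply

start = time.monotonic()
qemu = subprocess.Popen(QEMU_ARGV)
status = qmp_query_status(QMP_SOCK)
print("query-status answered after %.3fs: %s" % (time.monotonic() - start, status))
qemu.terminate()
qemu.wait()

Running it once against an unpatched build and once against a patched one,
with the same vDPA or VFIO configuration, would give the with/without
comparison; dropping the device from the command line shows the baseline
time for QMP to become responsive.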