On Wed, Oct 25, 2023 at 04:40:52PM +0100, Daniel P. Berrangé wrote:
> On Wed, Oct 25, 2023 at 11:36:27AM -0400, Peter Xu wrote:
> > On Wed, Oct 25, 2023 at 04:25:23PM +0100, Daniel P. Berrangé wrote:
> > > > Libvirt will still use fixed-ram for live snapshot purposes,
> > > > especially for Windows? Then auto-pause may still be useful to
> > > > distinguish that from what Fabiano wants to achieve here (which
> > > > is, in reality, non-live)?
> > > >
> > > > IIRC from the previous discussion, that was the major point:
> > > > libvirt can still leverage fixed-ram for a live case, since
> > > > Windows lacks an efficient live snapshot (the background-snapshot
> > > > feature).
> > >
> > > Libvirt will use fixed-ram for all APIs it has that involve saving
> > > to disk, with CPUs both running and paused.
> >
> > There are still two scenarios. How should we identify them, then?
> > For sure we can always make it live, but QEMU needs that information
> > to make it efficient for the non-live case.
> >
> > Considering when there's no auto-pause, Libvirt will still need to
> > know the scenario first, then decide whether to pause the VM before
> > migration or do nothing, am I right?
>
> libvirt will issue a 'stop' before invoking 'migrate' if it
> needs to. QEMU should be able to optimize that scenario if
> it sees CPUs already stopped when migrate is started?
>
> > If so, can Libvirt replace that "pause VM" operation with setting
> > auto-pause=on here? Again, the benefit is that QEMU can take
> > advantage of it.
> >
> > I think when pausing, Libvirt can still receive an event, so it can
> > cooperate with the state change? Meanwhile, auto-pause=on will have
> > been set by Libvirt too, so Libvirt will even have the expectation
> > that the QMP migrate later on will pause the VM.
> >
> > > > From that POV it sounds like auto-pause is a good knob for that.
> > >
> > > From libvirt's POV auto-pause will create extra work for
> > > integration for no gain.
> >
> > Yes, I agree that for Libvirt there's no gain, as the gain is on
> > QEMU's side. Could you elaborate on what the complexity is for
> > Libvirt to support it?
>
> It increases the code paths, because we will have to support
> and test different behaviour wrt CPU state for fixed-ram
> vs. non-fixed-ram usage.
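[Editor's note: a rough sketch of the two QMP flows being compared above. The "stop" and "migrate" commands are existing QMP API; "auto-pause" is the capability proposed in this series (not merged API), and the file URI/path is only illustrative:

    # Today: libvirt pauses explicitly, then migrates to a file.
    { "execute": "stop" }
    { "execute": "migrate",
      "arguments": { "uri": "file:/path/to/save.img" } }

    # Proposed: libvirt sets auto-pause and lets QEMU pause the CPUs
    # itself, so QEMU knows up front the migration is non-live.
    { "execute": "migrate-set-capabilities",
      "arguments": { "capabilities": [
          { "capability": "auto-pause", "state": true } ] } }
    { "execute": "migrate",
      "arguments": { "uri": "file:/path/to/save.img" } }

The difference under discussion is whether QEMU should infer the non-live case from seeing CPUs already stopped (first flow), or be told explicitly via the capability (second flow).]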
To me, if the user scenario is different, it makes sense to have a flag
showing what the user wants to do. Guessing that from "whether the VM is
running or not" could work in many cases, but not all.

It means that, at least for dirty tracking, we only have one option to
make it fully transparent: start dirty tracking when the VM starts
during such a migration. The complexity moves from Libvirt into
migration / KVM in this respect.

Meanwhile, we lose some other potential optimizations for good: early
release of resources will never be possible anymore, because they need
to be prepared to be reused very soon, even when we know they never will
be. But maybe that's not a major concern.

No strong opinion from my side. I'll leave it to Fabiano.

I didn't see any further optimization yet with the new cap in this
series. I think the trick is that the current extra overheads are just
not high enough for us to care, even if we know some of the work is pure
overhead. Then indeed we can also postpone the optimizations until they
are justified as worthwhile.

Thanks,

-- 
Peter Xu