On Wed, Oct 25, 2023 at 04:40:52PM +0100, Daniel P. Berrangé wrote:
> On Wed, Oct 25, 2023 at 11:36:27AM -0400, Peter Xu wrote:
> > On Wed, Oct 25, 2023 at 04:25:23PM +0100, Daniel P. Berrangé wrote:
> > > > Libvirt will still use fixed-ram for live snapshot purposes, especially
> > > > for Windows?  Then auto-pause may still be useful to distinguish that
> > > > from what Fabiano wants to achieve here (which is, in reality, non-live)?
> > > > 
> > > > 
> > > > IIRC from previous discussion, that was the major point: libvirt can
> > > > still leverage fixed-ram for a live case - since Windows lacks an
> > > > efficient live snapshot (the background-snapshot feature).
> > > 
> > > Libvirt will use fixed-ram for all APIs it has that involve saving to
> > > disk, with CPUs both running and paused.
> > 
> > There are still two scenarios.  How should we distinguish them, then?  For
> > sure we can always make it live, but QEMU needs that information to make
> > the non-live case efficient.
> > 
> > Consider the case where there's no auto-pause: Libvirt will still need to
> > know the scenario first, in order to decide whether to pause the VM before
> > migration or do nothing, am I right?
> 
> libvirt will issue a 'stop' before invoking 'migrate' if it
> needs to. QEMU should be able to optimize that scenario if
> it sees CPUs already stopped when migrate is started ?
> 
> > If so, can Libvirt replace that "pause VM" operation with setting
> > auto-pause=on here?  Again, the benefit is that QEMU can exploit it.
> > 
> > I think Libvirt can still receive an event when the VM is paused, so it
> > can cooperate with the state change?  Meanwhile, since auto-pause=on will
> > be set by Libvirt itself, Libvirt will already expect that the later QMP
> > migrate command will pause the VM.
> > 
> > > 
> > > > From that POV it sounds like auto-pause is a good knob for that.
> > > 
> > > From libvirt's POV auto-pause will create extra work for integration
> > > for no gain.
> > 
> > Yes, I agree for Libvirt there's no gain, as the gain is on QEMU's side.
> > Could you elaborate what is the complexity for Libvirt to support it?
> 
> It increases the code paths, because we would have to support
> and test different behaviour wrt CPU state for fixed-ram
> vs non-fixed-ram usage.

To me, if the user scenarios are different, it makes sense to have a flag
indicating what the user wants to do.

Guessing the scenario from "whether the VM is running or not" could work in
many cases, but not all.
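For illustration, the flow Daniel describes (libvirt issuing 'stop' before
'migrate', with QEMU inferring "non-live" from the runstate rather than from
a dedicated auto-pause flag) could be sketched as a QMP command sequence.
This is only a rough sketch, not libvirt code; the file path is hypothetical,
and the capability name follows the fixed-ram series under discussion:

```python
# Sketch (not libvirt code): the QMP commands a management app would send
# for a fixed-ram save.  For a non-live save it stops the CPUs first, so
# QEMU could detect the paused runstate when 'migrate' arrives and skip
# live-migration overhead (e.g. dirty tracking) on its own.

def build_save_sequence(live: bool):
    """Return the QMP command list for a fixed-ram save to a file."""
    cmds = []
    if not live:
        # Pause the guest CPUs before starting the migration.
        cmds.append({"execute": "stop"})
    cmds.append({
        "execute": "migrate-set-capabilities",
        "arguments": {"capabilities": [
            {"capability": "fixed-ram", "state": True},
        ]},
    })
    cmds.append({
        "execute": "migrate",
        "arguments": {"uri": "file:/tmp/vm.img"},  # hypothetical path
    })
    return cmds

non_live = build_save_sequence(live=False)
```

The point of the sketch is that the only difference between the two
scenarios is whether "stop" precedes "migrate"; QEMU already has enough
information to tell them apart without an extra capability.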

It means that, at least for dirty tracking, we have only one option to make
it fully transparent: start dirty tracking only when the VM is resumed
during such a migration.  In that respect, the complexity moves from Libvirt
into the migration / KVM code.

Meanwhile, we lose some other potential optimizations for good: early
release of resources will no longer be possible, because they must stay
prepared for reuse at any moment, even when we know they will never be
reused.  But maybe that's not a major concern.

No strong opinion from my side; I'll leave it to Fabiano.  I haven't seen
any further optimization with the new cap in this series yet.  I think the
catch is that the current extra overheads are just not high enough for us to
care about, even though we know some of the work is pure overhead.  In that
case we can indeed postpone the optimizations until they are justified.

Thanks,

-- 
Peter Xu

