On Wed, Oct 25, 2023 at 06:31:53PM +0100, Daniel P. Berrangé wrote:
> On Wed, Oct 25, 2023 at 01:20:52PM -0400, Peter Xu wrote:
> > On Wed, Oct 25, 2023 at 04:40:52PM +0100, Daniel P. Berrangé wrote:
> > > On Wed, Oct 25, 2023 at 11:36:27AM -0400, Peter Xu wrote:
> > > > On Wed, Oct 25, 2023 at 04:25:23PM +0100, Daniel P. Berrangé wrote:
> > > > > > Libvirt will still use fixed-ram for live snapshot purpose,
> > > > > > especially for Windows? Then auto-pause may still be useful to
> > > > > > identify that from what Fabiano wants to achieve here (which is
> > > > > > in reality, non-live)?
> > > > > >
> > > > > > IIRC of previous discussion that was the major point that
> > > > > > libvirt can still leverage fixed-ram for a live case - since
> > > > > > Windows lacks efficient live snapshot (background-snapshot
> > > > > > feature).
> > > > >
> > > > > Libvirt will use fixed-ram for all APIs it has that involve saving to
> > > > > disk, with CPUs both running and paused.
> > > >
> > > > There are still two scenarios. How should we identify them, then? For
> > > > sure we can always make it live, but QEMU needs that information to
> > > > make it efficient for non-live.
> > > >
> > > > Considering when there's no auto-pause, then Libvirt will still need to
> > > > know the scenario first then to decide whether pausing VM before
> > > > migration or do nothing, am I right?
> > >
> > > libvirt will issue a 'stop' before invoking 'migrate' if it
> > > needs to. QEMU should be able to optimize that scenario if
> > > it sees CPUs already stopped when migrate is started ?
> > >
> > > > If so, can Libvirt replace that "pause VM" operation with setting
> > > > auto-pause=on here? Again, the benefit is QEMU can benefit from it.
> > > >
> > > > I think when pausing Libvirt can still receive an event, then it can
> > > > cooperate with state changes?
> > > > Meanwhile auto-pause=on will be set by
> > > > Libvirt too, so Libvirt will even have that expectation that QMP migrate
> > > > later on will pause the VM.
> > > >
> > > > >
> > > > > > From that POV it sounds like auto-pause is a good knob for that.
> > > > >
> > > > > From libvirt's POV auto-pause will create extra work for integration
> > > > > for no gain.
> > > >
> > > > Yes, I agree for Libvirt there's no gain, as the gain is on QEMU's side.
> > > > Could you elaborate what is the complexity for Libvirt to support it?
> > >
> > > It increases the code paths because we will have to support
> > > and test different behaviour wrt CPU state for fixed-ram
> > > vs non-fixed ram usage.
> >
> > To me if the user scenario is different, it makes sense to have a flag
> > showing what the user wants to do.
> >
> > Guessing that from "whether VM is running or not" could work in many cases
> > but not all.
> >
> > It means at least for dirty tracking, we only have one option to make it
> > fully transparent, starting dirty tracking when VM starts during such
> > migration. The complexity moves from Libvirt into migration / kvm from
> > this aspect.
>
> Even with auto-pause we can't skip dirty tracking, as we don't
> guarantee the app won't run 'cont' at some point.
>
> We could have an explicit capability 'dirty-tracking' which an app
> could set as an explicit "promise" that it won't ever need to
> (re)start CPUs while migration is running. If dirty-tracking==no,
> then any attempt to run 'cont' should return an hard error while
> migration is running.

I did have some thoughts on disabling dirty tracking even before this
series, but so far I think it might be better to keep "dirty tracking"
hidden as an internal flag, decided by other migration caps/parameters.

For example, postcopy-only migration will not require dirty tracking in
any form. But that can be a higher-level "postcopy-only" capability, or an
even higher-level concept, which then sets dirty_tracking=false internally.

I tried to list our options in the previous email. Quoting from that:

https://lore.kernel.org/qemu-devel/ZTktCM%2FccipYaJ80@x1n/

  1) Allow VM starts later

    1.a) Start dirty tracking right at this point

         Not prefer this.  This will make all things transparent but IMHO
         unnecessary complexity on maintaining dirty tracking status.

    1.b) Fail the migration

         Can be a good option, IMHO, treating auto-pause as a promise from
         the user that VM won't need to be running anymore.  If VM starts,
         promise break, migration fails.

  2) Doesn't allow VM starts later

     Can also be a good option.  In this case VM resources (I think mostly,
     RAM) can be freed right after migrated.  If user request VM start,
     fail the start instead of migration itself.  Migration must succeed or
     data lost.

So indeed we can fail the migration already if auto-pause=on.

> > Meanwhile we lose some other potential optimizations for good, early
> > releasing of resources will never be possible anymore because they need to
> > be prepared to be reused very soon, even if we know they will never. But
> > maybe that's not a major concern.
>
> What resources can we release early, without harming our ability to
> restart the current QEMU on failure ?

Indeed, we can't if we always allow a restart.

I think releasing resources early may not be a major benefit here even with
the option; it depends on whether that can make a difference in any of the
use cases. I don't see much yet.
Consider release-ram for postcopy: that makes sense only because we'll
start two QEMUs, so the early release keeps total memory consumption to,
more or less, ~1 VM. Here we have only a single VM anyway, so it may not be
a problem to release everything later.

However, I still think there can be something done by QEMU if QEMU knows
for sure the VM won't ever be restarted. Omitting dirty tracking is one
such thing.

One simple example that extends beyond dirty tracking overhead: consider
the case where a device doesn't support dirty tracking. It would normally
need to block live migration, but it'll work if auto-pause=true, because
tracking is not needed. But once such a migration starts, we can only
either fail the migration if the VM restarts, or reject the VM restart
request. So that can be more than the "dirty tracking overhead" itself.

Thanks,

-- 
Peter Xu
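
P.S. To make options 1.b and 2 a bit more concrete, here is a toy Python
model of the semantics being discussed. It is only a sketch, not QEMU code:
every name in it (Migration, RestartPolicy, cont(), blocked_by_device(),
...) is hypothetical and made up for illustration. It assumes that
auto-pause=on is treated as a promise that the VM stays stopped, and that
dirty tracking is an internal flag derived from that knob.

```python
from dataclasses import dataclass
from enum import Enum


class RestartPolicy(Enum):
    """What to do if 'cont' arrives while an untracked migration is active."""
    FAIL_MIGRATION = "1.b"  # let the VM run, give up on the migration
    REJECT_START = "2"      # refuse the start, keep the migration going


class StartRejected(Exception):
    """Raised when a VM start request is refused (option 2)."""


@dataclass
class Migration:
    # Hypothetical user-visible knob from this thread.
    auto_pause: bool
    policy: RestartPolicy = RestartPolicy.REJECT_START
    active: bool = False
    failed: bool = False
    vm_running: bool = True

    @property
    def dirty_tracking(self) -> bool:
        # Internal flag, decided by other parameters rather than set by
        # the user: no tracking is needed if the VM is promised to stay
        # paused for the whole migration.
        return not self.auto_pause

    def blocked_by_device(self, device_supports_tracking: bool) -> bool:
        # A device without dirty tracking support only blocks migrations
        # that actually need tracking.
        return self.dirty_tracking and not device_supports_tracking

    def start(self) -> None:
        if self.auto_pause:
            self.vm_running = False
        self.active = True

    def cont(self) -> None:
        """Model of a request to (re)start the VM."""
        if self.active and not self.dirty_tracking:
            if self.policy is RestartPolicy.REJECT_START:
                raise StartRejected("VM start refused while migrating")
            # Option 1.b: the promise is broken, so the migration fails.
            self.active = False
            self.failed = True
        self.vm_running = True
```

With auto_pause=False the model behaves like a normal live migration:
cont() is always allowed and tracking stays on. Option 1.a (start tracking
when the VM is restarted) is deliberately left out of the sketch.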