Peter Xu <pet...@redhat.com> writes:

> On Fri, Dec 01, 2023 at 11:23:33AM -0500, Steven Sistare wrote:
>> >> @@ -109,6 +117,7 @@ static int global_state_post_load(void *opaque, int version_id)
>> >>          return -EINVAL;
>> >>      }
>> >>      s->state = r;
>> >> +    vm_set_suspended(s->vm_was_suspended || r == RUN_STATE_SUSPENDED);
>> > 
>> > IIUC current vm_was_suspended (based on my read of your patch) was not the
>> > same as a boolean representing "whether VM is suspended", but only a
>> > temporary field to remember that for a VM stop request.  To be explicit, I
>> > didn't see this flag set in qemu_system_suspend() in your previous patch.
>> > 
>> > If so, we can already do:
>> > 
>> >   vm_set_suspended(s->vm_was_suspended);
>> > 
>> > Regardless of RUN_STATE_SUSPENDED?
>> 
>> We need both terms of the expression.
>> 
>> If the vm *is* suspended (RUN_STATE_SUSPENDED), then vm_was_suspended = false.
>> We call global_state_store prior to vm_stop_force_state, so the incoming
>> side sees s->state = RUN_STATE_SUSPENDED and s->vm_was_suspended = false.
>
> Right.
>
>> However, the runstate is RUN_STATE_INMIGRATE.  When incoming finishes by
>> calling vm_start, we need to restore the suspended state.  Thus in 
>> global_state_post_load, we must set vm_was_suspended = true.
>
> With the above, shouldn't global_state_get_runstate() (on dest) fetch SUSPENDED
> already?  Then I think it should call vm_start(SUSPENDED) if it is to start.
>
> Maybe you're talking about the special case where autostart==false?  We
> used to have this (existing process_incoming_migration_bh()):
>
>     if (!global_state_received() ||
>         global_state_get_runstate() == RUN_STATE_RUNNING) {
>         if (autostart) {
>             vm_start();
>         } else {
>             runstate_set(RUN_STATE_PAUSED);
>         }
>     }
>
> If so, maybe I get you, because in the "else" path we do seem to lose the
> SUSPENDED state again, but in that case IMHO we should logically set
> vm_was_suspended only when we "lose" it - we didn't lose it during
> migration, but only once we decided to switch to PAUSED (due to
> autostart==false). IOW, change the above to something like:
>
>     state = global_state_get_runstate();
>     if (!global_state_received() || runstate_is_alive(state)) {
>         if (autostart) {
>             vm_start(state);
>         } else {
>             if (runstate_is_suspended(state)) {
>                 /* Remember suspended state before setting system to stopped */
>                 vm_was_suspended = true;
>             }
>             runstate_set(RUN_STATE_PAUSED);
>         }
>     }
>
> It may or may not have a functional difference even with the current patch,
> though.  However, it may be clearer to follow vm_was_suspended's strict
> definition.
>
>> 
>> If the vm *was* suspended, but is currently stopped (eg RUN_STATE_PAUSED),
>> then vm_was_suspended = true.  Migration from that state sets
>> vm_was_suspended = s->vm_was_suspended = true in global_state_post_load and 
>> ends with runstate_set(RUN_STATE_PAUSED).
>> 
>> I will add a comment here in the code.
>>  
>> >>      return 0;
>> >>  }
>> >> @@ -134,6 +143,7 @@ static const VMStateDescription vmstate_globalstate = {
>> >>      .fields = (VMStateField[]) {
>> >>          VMSTATE_UINT32(size, GlobalState),
>> >>          VMSTATE_BUFFER(runstate, GlobalState),
>> >> +        VMSTATE_BOOL(vm_was_suspended, GlobalState),
>> >>          VMSTATE_END_OF_LIST()
>> >>      },
>> >>  };
>> > 
>> > I think this will break migration between old/new, unfortunately.  And
>> > since the global state exists for almost every VM, all VM setups would be
>> > affected, across all archs.
>> 
>> Thanks, I keep forgetting that my binary tricks are no good here.  However,
>> I have one other trick up my sleeve, which is to store vm_was_running in
>> global_state.runstate[strlen(runstate) + 2].  It is forwards and backwards
>> compatible, since that byte is always 0 in older qemu.  It can be implemented
>> with a few lines of code change confined to global_state.c, versus many lines
>> spread across files to do it the conventional way using a compat property and
>> a subsection.  Sound OK?
>
> Tricky!  But sounds okay to me.  I think you're inventing your own way of
> being compatible, with the benefit of not relying on machine type.  If you
> go this route, please document the layout clearly, and also what it looked
> like in old binaries.
>
> I think maybe it'll be good to keep using strings, so in the new binaries
> we allow >1 strings, then we properly define those strings (index 0:
> runstate, existed since start; index 2: suspended, perhaps using "1"/"0" to
> express, while 0x00 means old binary, etc.).
>
> I hope this trick will need less code than the subsection solution,
> otherwise I'd still consider going with that, which is the "common
> solution".
>
> Let's also see whether Juan/Fabiano/others have any opinions.

Can't we pack the structure and just go ahead and slash 'runstate' in
half? That would claim some unused bytes for future backward
compatibility issues.
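
To make the idea concrete, here is a minimal sketch of what splitting the
buffer could look like. It is only an illustration, not a patch: it assumes
the migrated runstate buffer is 100 bytes, that every run state name fits in
49 characters plus the NUL, and that old binaries zero-fill the bytes after
the name. WireRunstate, WIRE_LEN, old_sender_fill() and wire_was_suspended()
are hypothetical names that do not exist in the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WIRE_LEN 100   /* assumed size of the migrated runstate buffer */

/*
 * Hypothetical split layout: still exactly WIRE_LEN bytes on the wire.
 * The first half keeps the NUL-terminated run state name; the second
 * half is claimed for new fields.
 */
typedef struct {
    uint8_t runstate[WIRE_LEN / 2];      /* run state name, NUL-terminated */
    uint8_t vm_was_suspended;            /* 0 in streams from old binaries */
    uint8_t reserved[WIRE_LEN / 2 - 1];  /* zero-filled, for future use */
} __attribute__((packed)) WireRunstate;

_Static_assert(sizeof(WireRunstate) == WIRE_LEN,
               "wire size must not change");

/* Fill the buffer the way an old sender is assumed to: name, then zeros. */
static void old_sender_fill(uint8_t buf[WIRE_LEN], const char *name)
{
    memset(buf, 0, WIRE_LEN);
    memcpy(buf, name, strlen(name) + 1);
}

/* New receiver: read the flag from the second half of the buffer. */
static bool wire_was_suspended(const uint8_t buf[WIRE_LEN])
{
    WireRunstate w;

    memcpy(&w, buf, sizeof(w));
    return w.vm_was_suspended != 0;
}
```

With this layout a stream from an old sender reads as vm_was_suspended ==
false on a new receiver, and an old receiver would still see an unchanged
name at offset 0, provided it only parses up to the first NUL.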
