HVM: split restore state checking from state loading

Jan Beulich Tue, 19 Dec 2023 07:24:16 -0800

On 19.12.2023 15:36, Roger Pau Monné wrote:
> On Mon, Dec 18, 2023 at 03:39:55PM +0100, Jan Beulich wrote:
>> ..., at least as reasonably feasible without making a check hook
>> mandatory (in particular strict vs relaxed/zero-extend length checking
>> can't be done early this way).
>>
>> Note that only one of the two uses of "real" hvm_load() is accompanied
>> with a "checking" one. The other directly consumes hvm_save() output,
>> which ought to be well-formed. This means that while input data related
>> checks don't need repeating in the "load" function when already done by
>> the "check" one (albeit assertions to this effect may be desirable),
>> domain state related checks (e.g. has_xyz(d)) will be required in both
>> places.
>>
>> Suggested-by: Roger Pau Monné <roger....@citrix.com>
>> Signed-off-by: Jan Beulich <jbeul...@suse.com>
>> ---
>> Now that this re-arranges hvm_load() anyway, wouldn't it be better to
>> down the vCPU-s ahead of calling arch_hvm_load() (which is now easy to
>> arrange for)?
> 
> Seems OK to me.


As is, or with the suggested adjustment, or either way?

>> Do we really need all the copying involved in use of _hvm_read_entry()
>> (backing hvm_load_entry()? Zero-extending loads are likely easier to
>> handle that way, but for strict loads all we gain is a reduced risk of
>> unaligned accesses (compared to simply pointing into h->data[]).
> 
> I do feel it's safer to copy the data so the checks are done against
> what's loaded.  Albeit hvm_load() is already using hvm_get_entry().

The comment is about individual handlers, not hvm_load() itself. In some
cases - when copying directly into guest state structures - the copying
makes sense. In other cases, where there is separate copying anyway,
things could be done with less duplication of data (see hpet_load(),
which was already converted; along those lines hvm_load() itself also
was already switched away from the original copying).

Checking in any event can't be done against what _is_ loaded (as during
checking we want to avoid altering guest state); it'll always be only
against what is going to be loaded. The difference would be whether we
check data in the incoming buffer or in a copy of that data in a local
variable on the stack. But that applies to checking done in the load
hooks only anyway; the cases with split off check handlers should never
need to do any copying.

>> --- a/xen/arch/x86/domctl.c
>> +++ b/xen/arch/x86/domctl.c
>> @@ -379,8 +379,12 @@ long arch_do_domctl(
>>          if ( copy_from_guest(c.data, domctl->u.hvmcontext.buffer, c.size) 
>> != 0 )
>>              goto sethvmcontext_out;
>>  
>> +        ret = hvm_load(d, false, &c);
>> +        if ( ret )
>> +            goto sethvmcontext_out;
>> +
>>          domain_pause(d);
>> -        ret = hvm_load(d, &c);
>> +        ret = hvm_load(d, true, &c);
> 
> Now that the check has been done ahead, do we want to somehow assert
> that this cannot fail?  AIUI that's the expectation.

We certainly can't until all checking was moved out of the load handlers.
And even then I think there are still cases where load might produce an
error. (In fact I would have refused a little more strongly to folding
the prior hvm_check() into hvm_load() if indeed a separate hvm_load()
could have ended up returning void in the long run.)

>> @@ -275,12 +281,10 @@ int hvm_save(struct domain *d, hvm_domai
>>      return 0;
>>  }
>>  
>> -int hvm_load(struct domain *d, hvm_domain_context_t *h)
>> +int hvm_load(struct domain *d, bool real, hvm_domain_context_t *h)
> 
> Maybe the 'real' parameter should instead be an enum:
> 
> enum hvm_load_action {
>     CHECK,
>     LOAD,
> };
> int hvm_load(struct domain *d, enum hvm_load_action action,
>              hvm_domain_context_t *h);

Hmm, yes, it could. I'm not a fan of enums for boolean-like things,
though.

> Otherwise a comment might be warranted about how 'real' affects the
> logic in the function.

I can certainly add a comment immediately ahead of the function:

/*
 * @real = false requests checking of the incoming state, while @real = true
 * requests actual loading, which will then assume that checking was already
 * done or is unnecessary.
 */

>> @@ -291,50 +295,91 @@ int hvm_load(struct domain *d, hvm_domai
>>      if ( !hdr )
>>          return -ENODATA;
>>  
>> -    rc = arch_hvm_load(d, hdr);
>> -    if ( rc )
>> -        return rc;
>> +    rc = arch_hvm_check(d, hdr);
> 
> Shouldn't this _check function only be called when real == false?

Possibly. In v4 I directly transformed what I had in v3:

    ASSERT(!arch_hvm_check(d, hdr));

I.e. it is now the call above plus ...

>> +    if ( real )
>> +    {
>> +        struct vcpu *v;
>> +
>> +        ASSERT(!rc);

... this assertion. Really the little brother of the call site assertion
you're asking for (see above).

>> +        arch_hvm_load(d, hdr);
>>  
>> -    /* Down all the vcpus: we only re-enable the ones that had state saved. 
>> */
>> -    for_each_vcpu(d, v)
>> -        if ( !test_and_set_bit(_VPF_down, &v->pause_flags) )
>> -            vcpu_sleep_nosync(v);
>> +        /*
>> +         * Down all the vcpus: we only re-enable the ones that had state
>> +         * saved.
>> +         */
>> +        for_each_vcpu(d, v)
>> +            if ( !test_and_set_bit(_VPF_down, &v->pause_flags) )
>> +                vcpu_sleep_nosync(v);
>> +    }
>> +    else if ( rc )
>> +        return rc;
>>  
>>      for ( ; ; )
>>      {
>> +        const char *name;
>> +        hvm_load_handler load;
>> +
>>          if ( h->size - h->cur < sizeof(struct hvm_save_descriptor) )
>>          {
>>              /* Run out of data */
>>              printk(XENLOG_G_ERR
>>                     "HVM%d restore: save did not end with a null entry\n",
>>                     d->domain_id);
>> +            ASSERT(!real);
>>              return -ENODATA;
>>          }
>>  
>>          /* Read the typecode of the next entry  and check for the 
>> end-marker */
>>          desc = (struct hvm_save_descriptor *)(&h->data[h->cur]);
>> -        if ( desc->typecode == 0 )
>> +        if ( desc->typecode == HVM_SAVE_CODE(END) )
>> +        {
>> +            /* Reset cursor for hvm_load(, true, ). */
>> +            if ( !real )
>> +                h->cur = 0;
>>              return 0;
>> +        }
>>  
>>          /* Find the handler for this entry */
>> -        if ( (desc->typecode > HVM_SAVE_CODE_MAX) ||
>> -             ((handler = hvm_sr_handlers[desc->typecode].load) == NULL) )
>> +        if ( desc->typecode >= ARRAY_SIZE(hvm_sr_handlers) ||
>> +             !(name = hvm_sr_handlers[desc->typecode].name) ||
>> +             !(load = hvm_sr_handlers[desc->typecode].load) )
>>          {
>>              printk(XENLOG_G_ERR "HVM%d restore: unknown entry typecode 
>> %u\n",
>>                     d->domain_id, desc->typecode);
> 
> The message is not very accurate here, it does fail when the typecode
> is unknown, but also fails when such typecode has no name or load
> function setup.

Yes and no, and it's not changing in this patch. Are you suggesting I should
change it despite being unrelated? If so, there not being a name (which is
the new check I'm adding) still suggests the code is unknown. There not being
a load handler really indicates a bug in Xen (yet no reason to e.g. BUG() in
that case, the failed loading will hopefully be noticeable enough).

Jan

Re: [PATCH v4 1/5] x86/HVM: split restore state checking from state loading

Reply via email to