On Tue, Nov 08, 2022 at 06:03:54PM +0100, Jan Beulich wrote:
> On 08.11.2022 17:43, Roger Pau Monné wrote:
> > On Tue, Nov 08, 2022 at 05:14:40PM +0100, Jan Beulich wrote:
> >> On 08.11.2022 12:38, Roger Pau Monne wrote:
> >>> Like on the Arm side, return -EINVAL when attempting to do a p2m
> >>> operation on dying domains.
> >>>
> >>> The current logic returns 0 and leaves the domctl parameter
> >>> uninitialized for any parameter fetching operations (like the
> >>> GET_ALLOCATION operation), which is not helpful from a toolstack point
> >>> of view, because there's no indication that the data hasn't been
> >>> fetched.
> >>
> >> While I can see how the present behavior is problematic when it comes
> >> to consuming supposedly returned data, ...
> >>
> >>> --- a/xen/arch/x86/mm/paging.c
> >>> +++ b/xen/arch/x86/mm/paging.c
> >>> @@ -694,9 +694,10 @@ int paging_domctl(struct domain *d, struct 
> >>> xen_domctl_shadow_op *sc,
> >>>  
> >>>      if ( unlikely(d->is_dying) )
> >>>      {
> >>> -        gdprintk(XENLOG_INFO, "Ignoring paging op on dying domain %u\n",
> >>> +        gdprintk(XENLOG_INFO,
> >>> +                 "Tried to do a paging domctl op on dying domain %u\n",
> >>>                   d->domain_id);
> >>> -        return 0;
> >>> +        return -EINVAL;
> >>>      }
> >>
> >> ... going from "success" to "failure" here has a meaningful risk of
> >> regressing callers. It is my understanding that it was deliberate to
> >> mimic success in this case (without meaning to assign "good" or "bad"
> >> to that decision).
> > 
> > I would assume that was the original intention, yes, albeit the commit
> > message doesn't go into details about why mimicking success is
> > required, it's very well possible the code relying on this was xend.
> 
> Quite possible, but you never know who else has cloned code from there.
> 
> >> Can you instead fill the data to be returned in
> >> some simple enough way? I assume a mere memset() isn't going to be
> >> good enough, though (albeit public/domctl.h doesn't explicitly name
> >> any input-only fields, so it may not be necessary to preserve
> >> anything). Maybe zeroing ->mb and ->stats would do?
> > 
> > Hm, it still feels kind of wrong.  We do return errors elsewhere for
> > operations attempted against dying domains, and that seems all fine,
> > not sure why paging operations need to be different in this regard.
> > Arm does also return -EINVAL in that case.
> > 
> > So what about postponing this change to 4.18 in order to avoid
> > surprises, but then taking it in its current form at the start of the
> > development window, as to have time to detect any issues?
> 
> Maybe, but to be honest I'm not convinced. Arm can't really be taken
> for comparison, since the op is pretty new there iirc.

Indeed, but the tools code paths are likely shared between x86 and
Arm, as the hypercalls are the same.

This is a domctl interface, so we are fine to do such changes.  I
understand that we want to avoid such interface changes as much as
possible, but I think we need to fix the hypercall to return error
codes rather than implementing workarounds to try to cope with a wrong
interface behavior in the first place.  Or else we could be
accumulation workarounds here in order to fool caller into thinking
the hypercall has somehow succeed, and provide kind of suitable
looking data for the output parameters.

> >> As a minor remark: _If_ you're changing the printk(), then please
> >> also switch to using %pd.
> > 
> > I've considered this, but then printing: "Tried to do a paging domctl
> > op on dying domain dX" felt kind of repetitive to me because of the
> > usage of domain and dX in the same sentence.  Anyway, will adjust.
> 
> Simply drop the word "domain", as we've done elsewhere when switching
> to %pd?

OK.

Thanks, Roger.

Reply via email to