On Thu, Jul 17, 2025 at 12:27:48PM +0200, Alejandro Vallejo wrote:
> Hi,
> 
> Sorry I'm so late to this. I have a few suggestions to improve the ergonomics
> of domid handling in dom0less/Hyperlaunch.

Thanks for the feedback!

> 
> On Mon Jun 23, 2025 at 8:28 PM CEST, dmkhn wrote:
> > From: Denis Mukhin <dmuk...@ford.com>
> >
> > Currently, there are two different domain ID allocation implementations:
> >
> >   1) Sequential IDs allocation in dom0less Arm code based on max_init_domid;
> >
> >   2) Sequential IDs allocation in XEN_DOMCTL_createdomain; does not use
> >      max_init_domid (both Arm and x86).
> >
> > The domain ID allocation covers dom0 or late hwdom, predefined domains,
> > post-boot domains, excluding Xen system domains (domid >=
> > DOMID_FIRST_RESERVED).
> >
> > It makes sense to have a common helper code for such task across
> > architectures (Arm and x86) and between dom0less / toolstack domU allocation.
> >
> > Note, fixing dependency on max_init_domid is out of scope of this patch.
> >
> > Wrap the domain ID allocation as an arch-independent function domid_alloc()
> > in common/domain.c based on the bitmap.
> >
> > Allocation algorithm:
> > - If an explicit domain ID is provided, verify its availability and use it
> >   if ID is not used;
> > - If DOMID_INVALID is provided, search the range
> >   [1..DOMID_FIRST_RESERVED-1], starting from the last used ID. IDs are not
> >   wrapped around in dom0less case.
> >   Implementation guarantees that two consecutive calls will never return the
> >   same ID. ID#0 is reserved for the first boot domain (currently, dom0) and
> >   excluded from allocation range.
> >
> > Remove is_free_domid() helper as it is not needed now.
> >
> > No functional change intended.
> >
> > Signed-off-by: Denis Mukhin <dmuk...@ford.com>
> > ---
> > Changes from v9:
> > - dropped unrelated message formatting from create_dom0()
> > - no wraparound of IDs in dom0less case
> > - fixed ID#0 treatment
> >
> > Link to v9: https://lore.kernel.org/r/20250528225030.2652166-2-dmuk...@ford.com
> > ---
> >  xen/arch/arm/domain_build.c             |  7 ++-
> >  xen/arch/x86/setup.c                    |  7 ++-
> >  xen/common/device-tree/dom0less-build.c | 17 +++---
> >  xen/common/domain.c                     | 75 +++++++++++++++++++++++++
> >  xen/common/domctl.c                     | 42 ++------------
> >  xen/include/xen/domain.h                |  3 +
> >  6 files changed, 102 insertions(+), 49 deletions(-)
> >
> > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> > index 4ff161887ec3..9fa5143eb98c 100644
> > --- a/xen/arch/arm/domain_build.c
> > +++ b/xen/arch/arm/domain_build.c
> > @@ -2049,6 +2049,7 @@ void __init create_dom0(void)
> >          .grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version),
> >      };
> >      unsigned int flags = CDF_privileged | CDF_hardware;
> > +    domid_t domid;
> >      int rc;
> >
> >      /* The vGIC for DOM0 is exactly emulating the hardware GIC */
> > @@ -2073,7 +2074,11 @@ void __init create_dom0(void)
> >      if ( !llc_coloring_enabled )
> >          flags |= CDF_directmap;
> >
> > -    dom0 = domain_create(0, &dom0_cfg, flags);
> > +    domid = domid_alloc(0);
> 
> The way I'd expect domid_alloc() to be used is twofold:
> 
>   1. "Give me this specific domid"
> 
> for which this interface looks fine, perhaps renamed to domid_alloc_exact(domid)
> 
>   2. "Give me any domid"
> 
> for which we'd benefit more from a domid_alloc()
> 
> This removes the heuristics from the interface. Worst-case execution remains
> the same, under 500 iterations. (32K minus a little bit, checked 64bits at a
> time).

I think we've settled on a single domid_alloc() whose behavior is partitioned
by input value:
- exact ID allocation if the input value is within
  [0..DOMID_FIRST_RESERVED-1];
- exhaustive search within [1..DOMID_FIRST_RESERVED-1] (skipping reserved
  ID#0) if the input value is DOMID_INVALID.

I originally considered having two calls, but with the APIs split, do_domctl()
would need an extra range check to redirect to the proper alloc variant. The
current implementation does not need that check.
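
To make the partitioning concrete, here is a minimal userspace sketch. It is
not the patch code: the model_* names are hypothetical, there is no locking
(the real function takes domid_lock), and rejecting out-of-range inputs other
than DOMID_INVALID is a choice made for this model only. The search wraps
around, which is the toolstack behavior rather than the dom0less one.

```c
#include <stdint.h>

/* These two values match Xen's public headers; the rest is a model. */
#define DOMID_FIRST_RESERVED 0x7FF0U
#define DOMID_INVALID        0x7FF4U

typedef uint16_t domid_t;

#define BITS_PER_WORD 64U
#define NR_WORDS ((DOMID_FIRST_RESERVED + BITS_PER_WORD - 1) / BITS_PER_WORD)

static uint64_t domid_bitmap[NR_WORDS];
static domid_t domid_last;

/* Set a bit and report whether it was already set. */
static int test_and_set(unsigned int bit)
{
    uint64_t mask = UINT64_C(1) << (bit % BITS_PER_WORD);
    int old = (domid_bitmap[bit / BITS_PER_WORD] & mask) != 0;

    domid_bitmap[bit / BITS_PER_WORD] |= mask;
    return old;
}

/* Hypothetical single-threaded model of domid_alloc(). */
static domid_t model_domid_alloc(domid_t domid)
{
    unsigned int i, candidate;

    /* Partition 1: exact request within [0..DOMID_FIRST_RESERVED-1]. */
    if ( domid < DOMID_FIRST_RESERVED )
        return test_and_set(domid) ? DOMID_INVALID : domid;

    /* Model-only choice: reject any other reserved value. */
    if ( domid != DOMID_INVALID )
        return DOMID_INVALID;

    /* Partition 2: search [1..DOMID_FIRST_RESERVED-1] after the last ID. */
    for ( i = 0; i < DOMID_FIRST_RESERVED - 1; i++ )
    {
        candidate = 1 + (domid_last + i) % (DOMID_FIRST_RESERVED - 1);
        if ( !test_and_set(candidate) )
        {
            domid_last = candidate;
            return candidate;
        }
    }

    return DOMID_INVALID;
}
```

With an empty bitmap, model_domid_alloc(0) hands out ID#0 once and fails on
the second request, while model_domid_alloc(DOMID_INVALID) starts at ID#1 and
never returns ID#0.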

> 
> > +    if ( domid == DOMID_INVALID )
> > +        panic("Error allocating domain ID 0\n");
> > +
> > +    dom0 = domain_create(domid, &dom0_cfg, flags);
> >      if ( IS_ERR(dom0) )
> >          panic("Error creating domain 0 (rc = %ld)\n", PTR_ERR(dom0));
> >
> > diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> > index f32efa7c6045..7adb92d78a18 100644
> > --- a/xen/arch/x86/setup.c
> > +++ b/xen/arch/x86/setup.c
> > @@ -1044,8 +1044,11 @@ static struct domain *__init create_dom0(struct boot_info *bi)
> >      if ( iommu_enabled )
> >          dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
> >
> > -    /* Create initial domain.  Not d0 for pvshim. */
> > -    bd->domid = get_initial_domain_id();
> > +    /* Allocate initial domain ID.  Not d0 for pvshim. */
> > +    bd->domid = domid_alloc(get_initial_domain_id());
> > +    if ( bd->domid == DOMID_INVALID )
> > +        panic("Error allocating domain ID %d\n", get_initial_domain_id());
> > +
> >      d = domain_create(bd->domid, &dom0_cfg,
> >                        pv_shim ? 0 : CDF_privileged | CDF_hardware);
> >      if ( IS_ERR(d) )
> > diff --git a/xen/common/device-tree/dom0less-build.c b/xen/common/device-tree/dom0less-build.c
> > index 3d503c697337..576fdfa6a19a 100644
> > --- a/xen/common/device-tree/dom0less-build.c
> > +++ b/xen/common/device-tree/dom0less-build.c
> > @@ -839,15 +839,13 @@ void __init create_domUs(void)
> >          struct xen_domctl_createdomain d_cfg = {0};
> >          unsigned int flags = 0U;
> >          bool has_dtb = false;
> > +        domid_t domid;
> >          uint32_t val;
> >          int rc;
> >
> >          if ( !dt_device_is_compatible(node, "xen,domain") )
> >              continue;
> >
> > -        if ( (max_init_domid + 1) >= DOMID_FIRST_RESERVED )
> > -            panic("No more domain IDs available\n");
> > -
> >          d_cfg.max_evtchn_port = 1023;
> >          d_cfg.max_grant_frames = -1;
> >          d_cfg.max_maptrack_frames = -1;
> > @@ -965,12 +963,13 @@ void __init create_domUs(void)
> >
> >          arch_create_domUs(node, &d_cfg, flags);
> >
> > -        /*
> > -         * The variable max_init_domid is initialized with zero, so here it's
> > -         * very important to use the pre-increment operator to call
> > -         * domain_create() with a domid > 0. (domid == 0 is reserved for Dom0)
> > -         */
> > -        d = domain_create(++max_init_domid, &d_cfg, flags);
> > +        domid = domid_alloc(DOMID_INVALID);
> > +        if ( domid == DOMID_INVALID )
> > +            panic("Error allocating ID for domain %s\n", dt_node_name(node));
> > +        if ( max_init_domid < domid )
> > +            max_init_domid = domid;
> > +
> > +        d = domain_create(domid, &d_cfg, flags);
> >          if ( IS_ERR(d) )
> >              panic("Error creating domain %s (rc = %ld)\n",
> >                    dt_node_name(node), PTR_ERR(d));
> > diff --git a/xen/common/domain.c b/xen/common/domain.c
> > index 434d32901b1b..be022c720b13 100644
> > --- a/xen/common/domain.c
> > +++ b/xen/common/domain.c
> > @@ -66,6 +66,14 @@ DEFINE_RCU_READ_LOCK(domlist_read_lock);
> >  static struct domain *domain_hash[DOMAIN_HASH_SIZE];
> >  struct domain *domain_list;
> >
> > +/*
> > + * Domain ID allocator.
> > + * Covers dom0 or late hwdom, predefined domains, post-boot domains; excludes
> > + * Xen system domains (ID >= DOMID_FIRST_RESERVED).
> > + */
> > +static DEFINE_SPINLOCK(domid_lock);
> > +static DECLARE_BITMAP(domid_bitmap, DOMID_FIRST_RESERVED);
> > +
> >  /*
> >   * Insert a domain into the domlist/hash.  This allows the domain to be looked
> >   * up by domid, and therefore to be the subject of hypercalls/etc.
> > @@ -1452,6 +1460,8 @@ void domain_destroy(struct domain *d)
> >
> >      TRACE_TIME(TRC_DOM0_DOM_REM, d->domain_id);
> >
> > +    domid_free(d->domain_id);
> > +
> 
> Shouldn't this go after domlist_remove()? Otherwise fun things might happen
> if a domid is allocated while the old domain that still keeps the old domid
> is still in its hash.

Yep, it should! Thanks for the catch.
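
A toy model of the ordering I will use in the next version. Everything here is
hypothetical and only stands in for domain_hash and domid_bitmap; it is a
sketch of the sequence, not the actual domain_destroy() code.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_IDS 8U   /* tiny ID space, for illustration only */

struct model_domain { unsigned int id; };

static bool id_used[MODEL_NR_IDS];                /* stands in for domid_bitmap */
static struct model_domain *by_id[MODEL_NR_IDS];  /* stands in for domain_hash */

/* Look a domain up by ID, as a hypercall path would. */
static struct model_domain *model_lookup(unsigned int id)
{
    return id < MODEL_NR_IDS ? by_id[id] : NULL;
}

static void model_destroy(struct model_domain *d)
{
    /* 1. Unhash first: lookups by ID no longer find the dying domain. */
    by_id[d->id] = NULL;

    /* 2. Only then release the ID, so a new owner never aliases d. */
    id_used[d->id] = false;
}
```

If step 2 ran first, a concurrent allocation could hand out the ID while
model_lookup() still resolved it to the old domain.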

> 
> The domctl lock (maybe) protects this case implicitly, but it's probably
> better to destroy things in a reasonable order.
> 
> >      /* Remove from the domlist/hash. */
> >      domlist_remove(d);
> >
> > @@ -2433,6 +2443,71 @@ void thaw_domains(void)
> >      rcu_read_unlock(&domlist_read_lock);
> >  }
> >
> > +domid_t domid_alloc(domid_t domid)
> > +{
> > +    static domid_t domid_last;
> > +
> > +    spin_lock(&domid_lock);
> > +
> > +    /* Exact match. */
> > +    if ( domid < DOMID_FIRST_RESERVED )
> > +    {
> > +        if ( __test_and_set_bit(domid, domid_bitmap) )
> > +            domid = DOMID_INVALID;
> > +    }
> > +    /*
> > +     * Exhaustive search.
> > +     *
> > +     * Domain ID#0 is reserved for the first boot domain (e.g. control domain)
> > +     * and excluded from allocation.
> > +     *
> > +     * In dom0less build, domains are not dynamically destroyed, so there's no
> > +     * need to do a wraparound of the IDs.
> > +     */
> > +#ifdef CONFIG_DOM0LESS_BOOT
> 
> These ifdef guards are problematic. The fact that a platform supports dom0less
> doesn't mean that every boot is dom0less (I can boot a non-dom0less system on
> a dom0less-capable Xen).

These #ifdefs are meant to align the code with the current Arm behavior, but
I will correct this.

There was v9 feedback around create_domUs() on that:
  https://lore.kernel.org/all/d0829041-1375-4161-b2c4-f8dffadbb...@xen.org/

> 
> Furthermore, the rationale for panicking on wraparound is because of
> exhaustion, but you do have a proper bitmap here to do proper exhaustive
> search, so IMO, this branch is not necessary.
> 
> > +    else if ( domid_last + 1 >= DOMID_FIRST_RESERVED )
> > +    {
> > +        domid = DOMID_INVALID;
> > +    }
> > +#endif
> > +    else
> > +    {
> > +        domid = find_next_zero_bit(domid_bitmap,
> > +                                   DOMID_FIRST_RESERVED,
> > +                                   domid_last + 1);
> > +#ifdef CONFIG_DOM0LESS_BOOT
> > +        if ( domid == DOMID_FIRST_RESERVED )
> > +            domid = find_next_zero_bit(domid_bitmap,
> > +                                       DOMID_FIRST_RESERVED,
> > +                                       1);
> 
> nit: I'd say 0 is fair game. On Hyperlaunch (and soon dom0less) it'll be
> possible to have a domU with domid=0 and a hwdom/ctldom with domids != 0 via
> the domid property on the DTB.
> 
> Starting from 1 might be slightly saner for defence in depth, so it really is
> a nit. I don't think being cautious about dom0 is necessarily a bad thing.

I kept 1 to ensure ID#0 is reserved for dom0.

There was v9 feedback around domid_alloc() on that:
  https://lore.kernel.org/all/d0829041-1375-4161-b2c4-f8dffadbb...@xen.org/

> 
> > +#endif
> > +
> > +        if ( domid < DOMID_FIRST_RESERVED )
> > +        {
> > +            __set_bit(domid, domid_bitmap);
> > +            domid_last = domid;
> 
> Rather than setting domid_last here, I'd move it right before the
> spin_unlock() gated by "if ( domid != DOMID_INVALID )". That'd also bump
> domid_last in the exact match case.
> 
> It also allows to drop the (then) redundant braces.
> 
> > +        }
> > +        else
> > +        {
> 
> nit: redundant braces

Ack.

> 
> > +            domid = DOMID_INVALID;
> > +        }
> > +    }
> > +
> > +    spin_unlock(&domid_lock);
> > +
> > +    return domid;
> > +}
> > +
> > +void domid_free(domid_t domid)
> > +{
> > +    if ( domid < DOMID_FIRST_RESERVED )
> > +    {
> > +        spin_lock(&domid_lock);
> > +        __clear_bit(domid, domid_bitmap);
> > +        spin_unlock(&domid_lock);
> > +    }
> > +}
> > +
> >  /*
> >   * Local variables:
> >   * mode: C
> > diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> > index bfe2e1f9f057..8ef0c147c9b0 100644
> > --- a/xen/common/domctl.c
> > +++ b/xen/common/domctl.c
> > @@ -49,20 +49,6 @@ static int xenctl_bitmap_to_nodemask(nodemask_t *nodemask,
> >                                     MAX_NUMNODES);
> >  }
> >
> > -static inline int is_free_domid(domid_t dom)
> > -{
> > -    struct domain *d;
> > -
> > -    if ( dom >= DOMID_FIRST_RESERVED )
> > -        return 0;
> > -
> > -    if ( (d = rcu_lock_domain_by_id(dom)) == NULL )
> > -        return 1;
> > -
> > -    rcu_unlock_domain(d);
> > -    return 0;
> > -}
> 
> Good riddance. This is racy without the domctl lock.
> 
> > -
> >  void getdomaininfo(struct domain *d, struct xen_domctl_getdomaininfo *info)
> >  {
> >      struct vcpu *v;
> > @@ -421,36 +407,18 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
> >
> >      case XEN_DOMCTL_createdomain:
> >      {
> > -        domid_t        dom;
> > -        static domid_t rover = 0;
> > +        domid_t domid = domid_alloc(op->domain);
> >
> > -        dom = op->domain;
> > -        if ( (dom > 0) && (dom < DOMID_FIRST_RESERVED) )
> > +        if ( domid == DOMID_INVALID )
> >          {
> >              ret = -EEXIST;
> 
> nit: IMO. If createdomain didn't set domctl.domid, we shouldn't return EEXIST,
>      but ENOSPC. It's a very impossible case, so I don't care much though.

I agree, but that would be a behavior change, which I want to avoid.
I kept -EEXIST because I am not sure how users treat the return value.
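
For reference only, since this patch keeps -EEXIST in both cases: the split
suggested above could be sketched roughly as below. model_alloc_errno() is a
hypothetical helper, not part of the patch, and it assumes that any request at
or above DOMID_FIRST_RESERVED means "no specific ID requested".

```c
#include <errno.h>
#include <stdint.h>

/* These two values match Xen's public headers. */
#define DOMID_FIRST_RESERVED 0x7FF0U
#define DOMID_INVALID        0x7FF4U

typedef uint16_t domid_t;

/* Hypothetical helper: map an allocation result to an errno value. */
static int model_alloc_errno(domid_t requested, domid_t allocated)
{
    if ( allocated != DOMID_INVALID )
        return 0;

    /* A specific ID was requested and it is already taken. */
    if ( requested < DOMID_FIRST_RESERVED )
        return -EEXIST;

    /* No specific ID was requested and the whole range is in use. */
    return -ENOSPC;
}
```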

> 
> > -            if ( !is_free_domid(dom) )
> > -                break;
> > -        }
> > -        else
> > -        {
> > -            for ( dom = rover + 1; dom != rover; dom++ )
> > -            {
> > -                if ( dom == DOMID_FIRST_RESERVED )
> > -                    dom = 1;
> > -                if ( is_free_domid(dom) )
> > -                    break;
> > -            }
> > -
> > -            ret = -ENOMEM;
> > -            if ( dom == rover )
> > -                break;
> > -
> > -            rover = dom;
> > +            break;
> >          }
> >
> > -        d = domain_create(dom, &op->u.createdomain, false);
> > +        d = domain_create(domid, &op->u.createdomain, false);
> >          if ( IS_ERR(d) )
> >          {
> > +            domid_free(domid);
> >              ret = PTR_ERR(d);
> >              d = NULL;
> >              break;
> > diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
> > index e10baf2615fd..8aab05ae93c8 100644
> > --- a/xen/include/xen/domain.h
> > +++ b/xen/include/xen/domain.h
> > @@ -38,6 +38,9 @@ void arch_get_domain_info(const struct domain *d,
> >
> >  domid_t get_initial_domain_id(void);
> >
> > +domid_t domid_alloc(domid_t domid);
> > +void domid_free(domid_t domid);
> > +
> >  /* CDF_* constant. Internal flags for domain creation. */
> >  /* Is this a privileged domain? */
> >  #define CDF_privileged           (1U << 0)
> 
> Cheers,
> Alejandro
> 

