> -----Original Message-----
> From: Jonathan Cameron <[email protected]>
> Sent: 19 May 2026 18:34
> To: Manish Honap <[email protected]>
> Cc: Alex Williamson <[email protected]>; Shameer Kolothum Thodi
> <[email protected]>; Ankit Agrawal <[email protected]>;
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; Vikram Sethi <[email protected]>; Neo Jia
> <[email protected]>; Tarun Gupta (SW-GPU) <[email protected]>; Zhi Wang
> <[email protected]>; Krishnakant Jaju <[email protected]>; linux-
> [email protected]; [email protected]; [email protected]; qemu-
> [email protected]
> Subject: Re: [RFC 7/9] hw/vfio+cxl: Program HDM decoder 0 at machine_done
> for firmware-committed devices
> 
> External email: Use caution opening links or attachments
> 
> 
> On Mon, 27 Apr 2026 23:42:33 +0530
> <[email protected]> wrote:
> 
> > From: Manish Honap <[email protected]>
> >
> > setup_locked_hdm() runs as a machine_done notifier after all devices
> > have been realized. It programs HDM decoder 0 with the CFMWS base
> 
> Be more specific on this.  Which HDM Decoder?  i.e. what device is it in.
> Ultimately there may be several in the path.  Are you supporting that?
> The reference to FLR() later would only apply to an endpoint (can't FLR a
> host bridge) + the code is clearly operating on the shadow registers for
> the EP.
> 
> I guess maybe you are relying on pass through host bridge decoders?
> If you are - you need to go check those - or better yet program and lock
> them as well.

Yes, for current RFC I had restricted the scope to endpoint decoders with the
setup having default value for hdm_for_passthrough. You are correct, this
assumption was not enforced in the code.

For next V2, I will add realize-time checks in vfio_cxl_setup() that refuse to
start unless:
    - cxl_get_hb_passthrough(hb) is true
    - pcie_count_ds_ports(hb->bus) == 1
    - no switch is present between the cxl-rp and the vfio-pci

> 
> > address so the guest can fault into device memory from the first
> > instruction.
> >
> > The notifier is only registered when the kernel reports the device as
> > firmware-committed (VFIO_CXL_CAP_FIRMWARE_COMMITTED). The host is
> > responsible for HDM decoder programming; the guest has no mechanism to
> > remap host physical address mappings.
> 
> Maybe say why this matters.  Presumably because only thing we can program
> from the guest is GPA to HPA mappings so linear ranges remain linear etc
> (more or less anyway given constraints on how many regions we can plug
> into qemu!)

okay, I will add the details for this in v2.

> 
> >
> > The function uses cxl->fmws_base (set by the optional cxl-fmws-base
> > device property) if non-zero; otherwise it falls back to the
> 
> Why is that property needed?  If it is add it in a separate patch where
> you can explain the need to force it. Smells like a hack so far.

Yes, I will update in v2 so that this part is handled by the value provided
by CFMWS setting.

> 
> > cxl_fmws_base global captured by cxl_fmws_set_memmap() during machine
> > memory-map init. If neither is set, it warns and returns without
> > programming anything.
> 
> Which CFMWS is that base in?  How is that controlled?  We may well have
> some type 3 devices in same VM and probably want those in different CFMWs
> (we'll need to actually upstream the patch to control constraints so the
> kernel won't try using your CFMWS for those - currently all CFMWS are
> entirely flexible).
> 
> 
> >
> > If COMMIT_LOCK is set in decoder 0 CTRL at machine_done time
> > (left-over from a prior FLR?), it is cleared before writing BASE so
> > the subsequent write is not blocked. COMMIT_LOCK is re-set after
> > programming so the hardware enforces the committed base.
> 
> Hmm. We should look into whether there is a correct way to reset that.
> Though as above from the description I'm not sure which decoder this is
> resetting.

Yes, I will rectify this in V2. This is referring to endpoint decoder.

> 
> >
> > read_region() return is checked; failure aborts programming rather
> > than leaving ctrl uninitialized. All write_region() failures are
> propagated.
> > The function exits cleanly rather than leaving the decoder half-
> programmed.
> >
> > Add cxl_fmws_base as a hwaddr global in cxl-host.c (and a stub in
> > cxl-host-stubs.c). It is set once by cxl_fmws_set_memmap() and read
> > later at machine_done time.
> >
> > Signed-off-by: Zhi Wang <[email protected]>
> > Signed-off-by: Manish Honap <[email protected]>
> 
> 
> > @@ -3486,6 +3646,18 @@ static bool vfio_cxl_setup(VFIOPCIDevice *vdev,
> Error **errp)
> >      trace_vfio_cxl_setup_params(vbasedev->name, cxl-
> >hdm_regs_bar_index,
> >                                   cxl->hdm_regs_offset, cxl-
> >hdm_regs_size,
> >                                   cxl->dpa_size);
> > +
> > +    /*
> > +     * Only pre-program the HDM decoder if the kernel reported the
> device as
> > +     * firmware-committed.  Non-committed devices need guest driver
> > + involvement
> 
> For wording these can we be very clear which kernel.  Though I'm also
> unsure what FIRMWARE_COMMITTED means here.  Does it actually mean host
> firmware, or does it mean something running on the host pre guest boot
> (e.g. could it be qemu?)

yes, this is referring to HOST_FIRMWARE. Sorry for the bad wording.
I will update this flag name in the kernel-v3 and qemu-v2

> 
> > +     * to commit the decoder; registering the notifier for them would
> write an
> > +     * uncommitted BASE value that the hardware ignores.
> > +     */
> > +    if (cap->flags & VFIO_CXL_CAP_FIRMWARE_COMMITTED) {
> > +        cxl->machine_done.notify = setup_locked_hdm;
> > +        qemu_add_machine_init_done_notifier(&cxl->machine_done);
> > +    }
> > +
> >      return true;
> >  }
> >
> 


Reply via email to