> -----Original Message----- > From: Jonathan Cameron <[email protected]> > Sent: 19 May 2026 18:34 > To: Manish Honap <[email protected]> > Cc: Alex Williamson <[email protected]>; Shameer Kolothum Thodi > <[email protected]>; Ankit Agrawal <[email protected]>; > [email protected]; [email protected]; [email protected]; > [email protected]; [email protected]; [email protected]; > [email protected]; [email protected]; [email protected]; > [email protected]; [email protected]; [email protected]; > [email protected]; [email protected]; [email protected]; > [email protected]; Vikram Sethi <[email protected]>; Neo Jia > <[email protected]>; Tarun Gupta (SW-GPU) <[email protected]>; Zhi Wang > <[email protected]>; Krishnakant Jaju <[email protected]>; linux- > [email protected]; [email protected]; [email protected]; qemu- > [email protected] > Subject: Re: [RFC 7/9] hw/vfio+cxl: Program HDM decoder 0 at machine_done > for firmware-committed devices > > External email: Use caution opening links or attachments > > > On Mon, 27 Apr 2026 23:42:33 +0530 > <[email protected]> wrote: > > > From: Manish Honap <[email protected]> > > > > setup_locked_hdm() runs as a machine_done notifier after all devices > > have been realized. It programs HDM decoder 0 with the CFMWS base > > Be more specific on this. Which HDM Decoder? i.e. what device is it in. > Ultimately there may be several in the path. Are you supporting that? > The reference to FLR() later would only apply to an endpoint (can't FLR a > host bridge) + the code is clearly operating on the shadow registers for > the EP. > > I guess maybe you are relying on pass through host bridge decoders? > If you are - you need to go check those - or better yet program and lock > them as well. Yes, for current RFC I had restricted the scope to endpoint decoders with the setup having default value for hdm_for_passthrough. You are correct, this assumption was not enforced in the code. For next V2, I will add realize-time checks in vfio_cxl_setup() that refuse to start unless: - cxl_get_hb_passthrough(hb) is true - pcie_count_ds_ports(hb->bus) == 1 - no switch is present between the cxl-rp and the vfio-pci > > > address so the guest can fault into device memory from the first > > instruction. > > > > The notifier is only registered when the kernel reports the device as > > firmware-committed (VFIO_CXL_CAP_FIRMWARE_COMMITTED). The host is > > responsible for HDM decoder programming; the guest has no mechanism to > > remap host physical address mappings. > > Maybe say why this matters. Presumably because only thing we can program > from the guest is GPA to HPA mappings so linear ranges remain linear etc > (more or less anyway given constraints on how many regions we can plug > into qemu!) okay, I will add the details for this in v2. > > > > > The function uses cxl->fmws_base (set by the optional cxl-fmws-base > > device property) if non-zero; otherwise it falls back to the > > Why is that property needed? If it is add it in a separate patch where > you can explain the need to force it. Smells like a hack so far. Yes, I will update in v2 so that this part is handled by the value provided by CFMWS setting. > > > cxl_fmws_base global captured by cxl_fmws_set_memmap() during machine > > memory-map init. If neither is set, it warns and returns without > > programming anything. > > Which CFMWS is that base in? How is that controlled? We may well have > some type 3 devices in same VM and probably want those in different CFMWs > (we'll need to actually upstream the patch to control constraints so the > kernel won't try using your CFMWS for those - currently all CFMWS are > entirely flexible). > > > > > > If COMMIT_LOCK is set in decoder 0 CTRL at machine_done time > > (left-over from a prior FLR?), it is cleared before writing BASE so > > the subsequent write is not blocked. COMMIT_LOCK is re-set after > > programming so the hardware enforces the committed base. > > Hmm. We should look into whether there is a correct way to reset that. > Though as above from the description I'm not sure which decoder this is > resetting. Yes, I will rectify this in V2. This is referring to endpoint decoder. > > > > > read_region() return is checked; failure aborts programming rather > > than leaving ctrl uninitialized. All write_region() failures are > propagated. > > The function exits cleanly rather than leaving the decoder half- > programmed. > > > > Add cxl_fmws_base as a hwaddr global in cxl-host.c (and a stub in > > cxl-host-stubs.c). It is set once by cxl_fmws_set_memmap() and read > > later at machine_done time. > > > > Signed-off-by: Zhi Wang <[email protected]> > > Signed-off-by: Manish Honap <[email protected]> > > > > @@ -3486,6 +3646,18 @@ static bool vfio_cxl_setup(VFIOPCIDevice *vdev, > Error **errp) > > trace_vfio_cxl_setup_params(vbasedev->name, cxl- > >hdm_regs_bar_index, > > cxl->hdm_regs_offset, cxl- > >hdm_regs_size, > > cxl->dpa_size); > > + > > + /* > > + * Only pre-program the HDM decoder if the kernel reported the > device as > > + * firmware-committed. Non-committed devices need guest driver > > + involvement > > For wording these can we be very clear which kernel. Though I'm also > unsure what FIRMWARE_COMMITTED means here. Does it actually mean host > firmware, or does it mean something running on the host pre guest boot > (e.g. could it be qemu?) yes, this is referring to HOST_FIRMWARE. Sorry for the bad wording. I will update this flag name in the kernel-v3 and qemu-v2 > > > + * to commit the decoder; registering the notifier for them would > write an > > + * uncommitted BASE value that the hardware ignores. > > + */ > > + if (cap->flags & VFIO_CXL_CAP_FIRMWARE_COMMITTED) { > > + cxl->machine_done.notify = setup_locked_hdm; > > + qemu_add_machine_init_done_notifier(&cxl->machine_done); > > + } > > + > > return true; > > } > > >
