On Mon, 9 Feb 2026 03:06:00 +0000 "Zhijian Li (Fujitsu)" <[email protected]> wrote:
> On 04/02/2026 23:00, Alireza Sanaee wrote: > > The CXL address to device decoding logic is complex because of the need > > to correctly decode fine grained interleave. The current implementation > > prevents use with KVM where executed instructions may reside in that > > memory and gives very slow performance even in TCG. > > > Do you mean KVM will work after this patch in 1-way case Yes > > > > > > In many real cases non interleaved memory configurations are useful and > > for those we can use a more conventional memory region alias allowing > > similar performance to other memory in the system. > > > Do you have any benchmark data to demonstrate the performance improvement? Yes, if you setup a simple devdax application which sweeps through the region with reads and writes, it is easy to see the performance difference, in KVM/TCG. I can share more if you will. > > > > > > > Co-developed-by: Jonathan Cameron <[email protected]> > > Signed-off-by: Jonathan Cameron <[email protected]> > > Signed-off-by: Alireza Sanaee <[email protected]> > > --- > > hw/cxl/cxl-component-utils.c | 8 ++ > > hw/cxl/cxl-host.c | 197 ++++++++++++++++++++++++++++++++++- > > hw/mem/cxl_type3.c | 4 + > > include/hw/cxl/cxl.h | 1 + > > include/hw/cxl/cxl_device.h | 1 + > > 5 files changed, 206 insertions(+), 5 deletions(-) > > > > diff --git a/hw/cxl/cxl-component-utils.c b/hw/cxl/cxl-component-utils.c > > index d36162e91b..d459d04b6d 100644 > > --- a/hw/cxl/cxl-component-utils.c > > +++ b/hw/cxl/cxl-component-utils.c > > @@ -142,6 +142,14 @@ static void dumb_hdm_handler(CXLComponentState > > *cxl_cstate, hwaddr offset, > > value = FIELD_DP32(value, CXL_HDM_DECODER0_CTRL, COMMITTED, 0); > > } > > stl_le_p((uint8_t *)cache_mem + offset, value); > > + > > + if (should_commit) { > > + cfmws_update_non_interleaved(true); > > + } > > + > > + if (should_uncommit) { > > + cfmws_update_non_interleaved(false); > > + } > > > This might be slightly better as an if-else block, since should_commit and > should_uncommit are mutually exclusive: > > > if (should_commit) { > cfmws_update_non_interleaved(true);; > } else if (should_uncommit) { > cfmws_update_non_interleaved(false); > } > > > > } > > > > static void bi_handler(CXLComponentState *cxl_cstate, hwaddr offset, > > diff --git a/hw/cxl/cxl-host.c b/hw/cxl/cxl-host.c > > index 9633b01abf..1fcfe01164 100644 > > --- a/hw/cxl/cxl-host.c > > +++ b/hw/cxl/cxl-host.c > > @@ -104,7 +104,7 @@ void cxl_fmws_link_targets(Error **errp) > > } > > > > static bool cxl_hdm_find_target(uint32_t *cache_mem, hwaddr addr, > > - uint8_t *target) > > + uint8_t *target, bool *interleaved) > > { > > int hdm_inc = R_CXL_HDM_DECODER1_BASE_LO - R_CXL_HDM_DECODER0_BASE_LO; > > unsigned int hdm_count; > > @@ -138,6 +138,11 @@ static bool cxl_hdm_find_target(uint32_t *cache_mem, > > hwaddr addr, > > found = true; > > ig_enc = FIELD_EX32(ctrl, CXL_HDM_DECODER0_CTRL, IG); > > iw_enc = FIELD_EX32(ctrl, CXL_HDM_DECODER0_CTRL, IW); > > + > > + if (interleaved) { > > + *interleaved = iw_enc != 0; > > + } > > + > > target_idx = (addr / cxl_decode_ig(ig_enc)) % (1 << iw_enc); > > > > if (target_idx < 4) { > > @@ -157,7 +162,8 @@ static bool cxl_hdm_find_target(uint32_t *cache_mem, > > hwaddr addr, > > return found; > > [...] > > > + > > +static int update_non_interleaved(Object *obj, void *opaque) > > +{ > > + const int hdm_inc = R_CXL_HDM_DECODER1_BASE_LO - > > R_CXL_HDM_DECODER0_BASE_LO; > > + bool commit = *(bool *)opaque; > > + CXLType3Dev *ct3d; > > + uint32_t *cache_mem; > > + unsigned int hdm_count, i; > > + uint32_t cap; > > + uint64_t dpa_base = 0; > > + > > + if (!object_dynamic_cast(obj, TYPE_CXL_TYPE3)) { > > + return 0; > > + } > > + > > + ct3d = CXL_TYPE3(obj); > > + cache_mem = ct3d->cxl_cstate.crb.cache_mem_registers; > > + cap = ldl_le_p(cache_mem + R_CXL_HDM_DECODER_CAPABILITY); > > + hdm_count = cxl_decoder_count_dec(FIELD_EX32(cap, > > + > > CXL_HDM_DECODER_CAPABILITY, > > + DECODER_COUNT)); > > + /* > > + * Walk the decoders and find any committed with iw set to 0 > > + * (non interleaved). > > + */ > > + for (i = 0; i < hdm_count; i++) { > > + uint64_t decoder_base, decoder_size, skip; > > + uint32_t hdm_ctrl, low, high; > > + int iw, committed; > > + > > + hdm_ctrl = ldl_le_p(cache_mem + R_CXL_HDM_DECODER0_CTRL + i * > > hdm_inc); > > + committed = FIELD_EX32(hdm_ctrl, CXL_HDM_DECODER0_CTRL, COMMITTED); > > + if (commit ^ committed) { > > > Is there any specific reason for using ^ here instead of '==' ? > > > > Thanks > Zhijian
