Jonathan Cameron wrote:
> On Thu, 23 Jun 2022 21:19:44 -0700
> Dan Williams <[email protected]> wrote:
>
> > CXL regions (interleave sets) are made up of a set of memory devices
> > where each device maps a portion of the interleave with one of its
> > decoders (see CXL 2.0 8.2.5.12 CXL HDM Decoder Capability Structure).
> > As endpoint decoders are identified by a provisioning tool they can be
> > added to a region provided the region interleave properties are set
> > (way, granularity, HPA) and DPA has been assigned to the decoder.
> >
> > The attach event triggers several validation checks, for example:
> > - is the DPA sized appropriately for the region
> > - is the decoder reachable via the host-bridges identified by the
> > region's root decoder
> > - is the device already active in a different region position slot
> > - are there already regions with a higher HPA active on a given port
> > (per CXL 2.0 8.2.5.12.20 Committing Decoder Programming)
> >
> > ...and the attach event affords an opportunity to collect data and
> > resources relevant to later programming the target lists in switch
> > decoders, for example:
> > - allocate a decoder at each cxl_port in the decode chain
> > - for a given switch port, how many the region's endpoints are hosted
> > through the port
> > - how many unique targets (next hops) does a port need to map to reach
> > those endpoints
> >
> > The act of reconciling this information and deploying it to the decoder
> > configuration is saved for a follow-on patch.
> Hi Dam,
> n
> Only managed to grab a few mins today to debug that crash.. So I know
> the immediate cause but not yet why we got to that state.
>
> Test case (happened to be one I had open) is 2x HB, 2x RP on each,
> direct connected type 3s on all ports.
>
> Manual test script is:
>
> insmod modules/5.19.0-rc3+/kernel/drivers/cxl/core/cxl_core.ko
> insmod modules/5.19.0-rc3+/kernel/drivers/cxl/cxl_acpi.ko
> insmod modules/5.19.0-rc3+/kernel/drivers/cxl/cxl_port.ko
> insmod modules/5.19.0-rc3+/kernel/drivers/cxl/cxl_pci.ko
> insmod modules/5.19.0-rc3+/kernel/drivers/cxl/cxl_mem.ko
> insmod modules/5.19.0-rc3+/kernel/drivers/cxl/cxl_pmem.ko
>
> cd /sys/bus/cxl/devices/decoder0.0/
> cat create_pmem_region
> echo region0 > create_pmem_region
>
> cd region0/
> echo 4 > interleave_ways
> echo $((256 << 22)) > size
> echo 6a6b9b22-e0d4-11ec-9d64-0242ac120002 > uuid
> ls -lh /sys/bus/cxl/devices/endpoint?/upo*
>
> # Then figure out the order hopefully write the correct targets
> echo decoder5.0 > target0
Oh, something simple in the end. Just need to check that DPA is assigned
before region attach. I folded the following into patch 40:
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 0b5acabcc541..d52c97e941fe 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -765,10 +765,17 @@ static int cxl_region_attach(struct cxl_region *cxlr,
return -ENXIO;
}
+ if (!cxled->dpa_res) {
+ dev_dbg(&cxlr->dev, "%s:%s: missing DPA allocation.\n",
+ dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev));
+ return -ENXIO;
+ }
+
if (resource_size(cxled->dpa_res) * p->interleave_ways !=
resource_size(p->res)) {
dev_dbg(&cxlr->dev,
- "decoder-size-%#llx * ways-%d != region-size-%#llx\n",
+ "%s:%s: decoder-size-%#llx * ways-%d !=
region-size-%#llx\n",
+ dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
(u64)resource_size(cxled->dpa_res), p->interleave_ways,
(u64)resource_size(p->res));
return -EINVAL;