Robert Richter wrote: > Dan, > > On 24.11.22 10:34:35, Dan Williams wrote: > > Changes since v3 [1]: > > - Rework / simplify CXL to LIBNVDIMM coordination to remove a > > flush_work() locking dependency from underneath the root device lock. > > - Move the root device rescan to a workqueue > > - Connect RCDs directly as endpoints reachable through a CXL host bridge > > as a dport, i.e. drop the extra dport indirection from v3 > > - Add unit test infrastructure for an RCD configuration > > thank you for this posting. > > Patches #1-#6 are not really prerequisites (except for a trivial > conflict), right? I only reviewed them starting with #6.
In fact they are pre-requisites because of this hunk in: [PATCH v4 10/12] cxl/port: Add RCD endpoint port enumeration http://lore.kernel.org/r/166931493266.2104015.8062923429837042172.st...@dwillia2-xfh.jf.intel.com/ @@ -119,17 +131,22 @@ static int cxl_mem_probe(struct device *dev) return -ENXIO; } - device_lock(&parent_port->dev); - if (!parent_port->dev.driver) { + if (dport->rch) + endpoint_parent = parent_port->uport; + else + endpoint_parent = &parent_port->dev; + + device_lock(endpoint_parent); + if (!endpoint_parent->driver) { dev_err(dev, "CXL port topology %s not enabled\n", dev_name(&parent_port->dev)); rc = -ENXIO; goto unlock; } - rc = devm_cxl_add_endpoint(cxlmd, dport); + rc = devm_cxl_add_endpoint(endpoint_parent, cxlmd, dport); unlock: - device_unlock(&parent_port->dev); + device_unlock(endpoint_parent); put_device(&parent_port->dev); if (rc) return rc; That device_lock(endpoint_parent) in the RCH case locks the ACPI0017 device so that a devm action can be serialized against ACPI0017 being unbound from its driver. Before RCH support this path only locked cxl_port objects to syncrhonize with the cxl_port driver. In the RCH case the port that needs to be locked while the endpoint is attached is the one owned by the cxl_acpi driver. This all happens while holding device_lock(&cxlmd->dev). It means lockdep complains about cxl_bus_rescan() being done under device_lock(&ACPI0017->dev), since rescan may take device_lock(&cxlmd->dev), and it complains about the flush_work() in unregister_nvb() for a similar entanglement. So the initial patches in this series are all about making it possible setup an unregistration chain when the root CXL device goes through ->remove().