Hi Zhijian,

We recreated the failures for the cases you mentioned below. We will add the
fix to v4, which I am working on now.

Regards,
Terry

On 4/7/2025 2:31 AM, Zhijian Li (Fujitsu) wrote:
> Hi Terry,
>
> If I understand correctly, this patch set has only considered the situation
> where the soft reserved area and the region are exactly the same, as in
> pattern 1.
>
> However, I believe we also need to consider situations where these two are
> not equal, which are outlined in patterns 2 and 3 below. Let me explain them:
>
> ===========================================
> Pattern 1:
> - region0 will be created during OS boot because the HDM decoder is programmed
> - After the OS has booted, region0 can be re-created after destroying it
> ┌────────────────────┐
> │       CFMW         │
> └────────────────────┘
> ┌────────────────────┐
> │    reserved0       │
> └────────────────────┘
> ┌────────────────────┐
> │       mem0         │
> └────────────────────┘
> ┌────────────────────┐
> │      region0       │
> └────────────────────┘
>
>
> Pattern 2:
> The HDM decoder is not in a committed state, so during the kernel boot
> process, region0 will not be created automatically. In this case, the soft
> reserved area will not be removed from the iomem tree. After the OS starts,
> users cannot create a region (cxl create-region) either, as the region would
> intersect the soft reserved area.
>
> ┌────────────────────┐
> │       CFMW         │
> └────────────────────┘
> ┌────────────────────┐
> │    reserved0       │
> └────────────────────┘
> ┌────────────────────┐
> │       mem0*        │
> └────────────────────┘
> ┌────────────────────┐
> │        N/A         │ region0
> └────────────────────┘
> *HDM decoder in mem0 is not committed.
>
>
> Pattern 3:
> Region0 is a child of the soft reserved area. In this case, the soft reserved
> area will not be removed from the iomem tree, so region0 cannot be recreated
> after it is destroyed.
> ┌────────────────────┐
> │       CFMW         │
> └────────────────────┘
> ┌────────────────────┐
> │     reserved       │
> └────────────────────┘
> ┌────────────────────┐
> │  mem0    |  mem1*  │
> └────────────────────┘
> ┌────────────────────┐
> │region0   |   N/A   │ region1
> └────────────────────┘
> *HDM decoder in mem1 is not committed.
>
>
> Thanks
> Zhijian
>
>
>
> On 04/04/2025 02:33, Terry Bowman wrote:
>> Add the ability to manage SOFT RESERVE iomem resources prior to them being
>> added to the iomem resource tree. This allows drivers, such as CXL, to
>> remove any pieces of the SOFT RESERVE resource that intersect with created
>> CXL regions.
>>
>> The current approach of leaving the SOFT RESERVE resources as is can cause
>> failures during hotplug of devices, such as CXL, because the resource is
>> not available for reuse after teardown of the device.
>>
>> The approach is to add SOFT RESERVE resources to a separate tree during
>> boot. This allows any drivers to update the SOFT RESERVE resources before
>> they are merged into the iomem resource tree. In addition a notifier chain
>> is added so that drivers can be notified when these SOFT RESERVE resources
>> are added to the iomem resource tree.
>>
>> The CXL driver is modified to use a worker thread that waits for the CXL
>> PCI and CXL mem drivers to be loaded and for their probe routine to
>> complete. Then the driver walks through any created CXL regions to trim any
>> intersections with SOFT RESERVE resources in the iomem tree.
>>
>> The dax driver uses the new soft reserve notifier chain so it can consume
>> any remaining SOFT RESERVES once they're added to the iomem tree.
>>
>> V3 updates:
>> - Remove srmem resource tree from kernel/resource.c, this is no longer
>>   needed in the current implementation. All SOFT RESERVE resources are now
>>   put on the iomem resource tree.
>> - Remove the no longer needed SOFT_RESERVED_MANAGED kernel config option.
>> - Add the 'nid' parameter back to hmem_register_resource();
>> - Remove the no longer used soft reserve notification chain (introduced
>>   in v2). The dax driver is now notified of SOFT RESERVED resources by
>>   the CXL driver.
>>
>> v2 updates:
>> - Add config option SOFT_RESERVE_MANAGED to control use of the
>>   separate srmem resource tree at boot.
>> - Only add SOFT RESERVE resources to the soft reserve tree during
>>   boot, they go to the iomem resource tree after boot.
>> - Remove the resource trimming code in the previous patch to re-use
>>   the existing code in kernel/resource.c
>> - Add functionality for the cxl acpi driver to wait for the cxl PCI
>>   and mem drivers to load.
>>
>> Nathan Fontenot (4):
>>   kernel/resource: Provide mem region release for SOFT RESERVES
>>   cxl: Update Soft Reserved resources upon region creation
>>   dax/hmem: Save the dax hmem platform device pointer
>>   cxl/dax: Delay consumption of SOFT RESERVE resources
>>
>>  drivers/cxl/Kconfig        |  4 ---
>>  drivers/cxl/acpi.c         | 28 +++++++++++++++++++
>>  drivers/cxl/core/Makefile  |  2 +-
>>  drivers/cxl/core/region.c  | 34 ++++++++++++++++++++++-
>>  drivers/cxl/core/suspend.c | 41 ++++++++++++++++++++++++++++
>>  drivers/cxl/cxl.h          |  3 +++
>>  drivers/cxl/cxlmem.h       |  9 -------
>>  drivers/cxl/cxlpci.h       |  1 +
>>  drivers/cxl/pci.c          |  2 ++
>>  drivers/dax/hmem/device.c  | 47 ++++++++++++++++----------------
>>  drivers/dax/hmem/hmem.c    | 10 ++++---
>>  include/linux/dax.h        | 11 +++++---
>>  include/linux/ioport.h     |  3 +++
>>  include/linux/pm.h         |  7 -----
>>  kernel/resource.c          | 55 +++++++++++++++++++++++++++++++++++---
>>  15 files changed, 202 insertions(+), 55 deletions(-)
>>
>>
>> base-commit: aae0594a7053c60b82621136257c8b648c67b512
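
To make the three patterns discussed above easier to reason about, here is a
minimal userspace sketch of how a soft reserved range can relate to a region,
and what is left of the soft reserve once the region's range is trimmed out of
it. This is not code from the patch set: struct span, sr_classify() and
sr_trim() are hypothetical names invented for illustration, and the real
trimming is done on struct resource entries in kernel/resource.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive [start, end] range, like the start/end pair in struct resource. */
struct span {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical labels for the cases in this thread. */
enum sr_case {
	SR_NO_REGION,     /* pattern 2: no region exists over the soft reserve */
	SR_EXACT,         /* pattern 1: region and soft reserve are identical */
	SR_REGION_INSIDE, /* pattern 3: region covers only part of the soft reserve */
	SR_OTHER,         /* disjoint, or region extends past the soft reserve */
};

/* Classify a region against a soft reserved range; region may be NULL. */
static enum sr_case sr_classify(const struct span *sr, const struct span *region)
{
	if (!region)
		return SR_NO_REGION;
	if (region->start == sr->start && region->end == sr->end)
		return SR_EXACT;
	if (region->start >= sr->start && region->end <= sr->end)
		return SR_REGION_INSIDE;
	return SR_OTHER;
}

/*
 * Remove the part of 'sr' covered by 'region'. Writes the remaining pieces
 * (zero, one or two) to 'out' and returns how many were written.
 */
static int sr_trim(struct span sr, struct span region, struct span out[2])
{
	int n = 0;

	assert(sr.start <= sr.end && region.start <= region.end);

	/* No intersection: the soft reserve is left untouched. */
	if (region.end < sr.start || region.start > sr.end) {
		out[n++] = sr;
		return n;
	}
	/* Piece of the soft reserve below the region, if any. */
	if (region.start > sr.start)
		out[n++] = (struct span){ sr.start, region.start - 1 };
	/* Piece of the soft reserve above the region, if any. */
	if (region.end < sr.end)
		out[n++] = (struct span){ region.end + 1, sr.end };
	return n;
}
```

With made-up addresses, a soft reserve at [0x1000, 0x4fff] and a region0 at
[0x1000, 0x2fff] (pattern 3) classify as SR_REGION_INSIDE, and trimming leaves
the single piece [0x3000, 0x4fff], which is the part still uncovered by any
region. In pattern 1 the trim leaves nothing, and in pattern 2 there is no
region to trim against.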