Hi Dan,
On 3/23/2026 12:54 PM, Dan Williams wrote:
Smita Koralahalli wrote:
From: Dan Williams <[email protected]>
Ensure cxl_acpi has published CXL Window resources before HMEM walks Soft
Reserved ranges.
Replace MODULE_SOFTDEP("pre: cxl_acpi") with an explicit, synchronous
request_module("cxl_acpi"). MODULE_SOFTDEP() only guarantees eventual
loading; it does not enforce that the dependency has finished init
before the current module runs. This can cause HMEM to start before
cxl_acpi has populated the resource tree, breaking detection of overlaps
between Soft Reserved and CXL Windows.
Also, request cxl_pci before HMEM walks Soft Reserved ranges. Unlike
cxl_acpi, cxl_pci attach is asynchronous and creates dependent devices
that trigger further module loads. Asynchronous probe flushing
(wait_for_device_probe()) is added later in the series in a deferred
context before HMEM makes ownership decisions for Soft Reserved ranges.
Add an additional explicit Kconfig ordering so that CXL_ACPI and CXL_PCI
must be initialized before DEV_DAX_HMEM. This prevents HMEM from consuming
Soft Reserved ranges before CXL drivers have had a chance to claim them.
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Smita Koralahalli <[email protected]>
Reviewed-by: Dave Jiang <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
Reviewed-by: Alison Schofield <[email protected]>
---
drivers/dax/Kconfig | 2 ++
drivers/dax/hmem/hmem.c | 17 ++++++++++-------
2 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/drivers/dax/Kconfig b/drivers/dax/Kconfig
index d656e4c0eb84..3683bb3f2311 100644
--- a/drivers/dax/Kconfig
+++ b/drivers/dax/Kconfig
@@ -48,6 +48,8 @@ config DEV_DAX_CXL
tristate "CXL DAX: direct access to CXL RAM regions"
depends on CXL_BUS && CXL_REGION && DEV_DAX
default CXL_REGION && DEV_DAX
+ depends on CXL_ACPI >= DEV_DAX_HMEM
+ depends on CXL_PCI >= DEV_DAX_HMEM
As I learned from Keith's recent CXL_PMEM dependency fix for CXL_ACPI
[1], this wants to be:
depends on DEV_DAX_HMEM || !DEV_DAX_HMEM
depends on CXL_ACPI || !CXL_ACPI
depends on CXL_PCI || !CXL_PCI
...to make sure that DEV_DAX_CXL can never be built-in unless all of its
dependencies are built-in.
[1]: http://lore.kernel.org/[email protected]
At this point I am wondering if all of the feedback I have for this
series should just be incremental fixes. I also want to have a canned
unit test that verifies the base expectations. That can also be
something I send incrementally in a reply.
Two things on the Kconfig change, both seen with DEV_DAX_HMEM = y,
CXL_ACPI = m and CXL_PCI = m:
1. Regarding the switch from the >= pattern to the || ! pattern:
The >= pattern disabled DEV_DAX_CXL entirely when DEV_DAX_HMEM = y and
CXL_ACPI/CXL_PCI = m, so HMEM unconditionally owned all ranges and the
CXL deferral path was never entered.
With the || ! pattern, DEV_DAX_CXL is enabled, so the ownership behavior
now depends on when the CXL_ACPI/CXL_PCI probes start.
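To make the difference between the two patterns concrete, here is a small user-space sketch of how Kconfig evaluates them, assuming the documented tristate rules (n < m < y, `!x` is `2 - x`, `&&` is min, `||` is max, comparisons yield y or n). The helper names are made up for illustration, and the model ignores the other `depends on` lines of DEV_DAX_CXL (CXL_BUS, CXL_REGION, DEV_DAX).

```c
#include <stdbool.h>

/* Kconfig tristate values, ordered n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Kconfig expression operators on tristates. */
static enum tristate tri_not(enum tristate a) { return (enum tristate)(2 - a); }
static enum tristate tri_or(enum tristate a, enum tristate b) { return a > b ? a : b; }
static enum tristate tri_and(enum tristate a, enum tristate b) { return a < b ? a : b; }
static enum tristate tri_ge(enum tristate a, enum tristate b) { return a >= b ? TRI_Y : TRI_N; }

/*
 * Upper bound on DEV_DAX_CXL with:
 *   depends on CXL_ACPI >= DEV_DAX_HMEM
 *   depends on CXL_PCI >= DEV_DAX_HMEM
 * Multiple "depends on" lines are ANDed together.
 */
static enum tristate limit_ge(enum tristate hmem, enum tristate acpi,
                              enum tristate pci)
{
    return tri_and(tri_ge(acpi, hmem), tri_ge(pci, hmem));
}

/*
 * Upper bound on DEV_DAX_CXL with:
 *   depends on DEV_DAX_HMEM || !DEV_DAX_HMEM
 *   depends on CXL_ACPI || !CXL_ACPI
 *   depends on CXL_PCI || !CXL_PCI
 * Each "X || !X" term is y for X = n or y, and m for X = m.
 */
static enum tristate limit_or_not(enum tristate hmem, enum tristate acpi,
                                  enum tristate pci)
{
    enum tristate lim = tri_or(hmem, tri_not(hmem));

    lim = tri_and(lim, tri_or(acpi, tri_not(acpi)));
    return tri_and(lim, tri_or(pci, tri_not(pci)));
}
```

For DEV_DAX_HMEM = y with CXL_ACPI = CXL_PCI = m, `limit_ge()` gives n (DEV_DAX_CXL disabled) while `limit_or_not()` gives m (DEV_DAX_CXL allowed, but only as a module), which matches the behavior change described above.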
On my system I see:
[ 7.379] dax_hmem_platform_probe began
[ 7.384] alloc_dev_dax_range: dax0.0
[ 28.560] cxl acpi probe started <- 21 seconds later
HMEM ends up owning the ranges in this case because the CXL windows
aren't published yet when HMEM probes (built-in code runs before modules
load, and request_module() might not work this early??), so
region_intersects() returns DISJOINT for all CXL ranges.
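As a toy illustration of that ordering problem (not the kernel's region_intersects(), which walks the resource tree and can also report a mixed result), the sketch below checks a range against a list of already published windows. All names and addresses are hypothetical. With nothing published yet, every query comes back disjoint, however the windows would have overlapped later.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified result; the real kernel API has a third "mixed" outcome. */
enum toy_intersect { TOY_DISJOINT, TOY_INTERSECTS };

/* Inclusive address range, similar in spirit to struct resource. */
struct toy_range {
    uint64_t start;
    uint64_t end;
};

/*
 * Report whether 'q' overlaps any of the 'n' windows published so far.
 * An empty list (n == 0) models HMEM probing before cxl_acpi has run.
 */
static enum toy_intersect toy_region_intersects(const struct toy_range *published,
                                                size_t n, struct toy_range q)
{
    for (size_t i = 0; i < n; i++) {
        if (published[i].start <= q.end && q.start <= published[i].end)
            return TOY_INTERSECTS;
    }
    return TOY_DISJOINT;
}
```

The same Soft Reserved range is reported as disjoint when the list is empty and as intersecting once the overlapping window has been added, which is why the result depends on probe order.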
But it could go the other way if the CXL ACPI and PCI probes start before
the deferred work is queued in HMEM. (And I think this is the expected
path if DEV_DAX_CXL is enabled.)
Do you think this is okay for now, given the resource exclusion handling?
2. Separate build issue with DEV_DAX_HMEM = y, CXL_BUS/ACPI/PCI = m and
CXL_REGION = y.
I hit this build error while testing the above config (sorry, I should
have checked this config earlier).
When DEV_DAX_HMEM = y and the CXL core is built as a module, hmem.c calls
cxl_region_contains_resource(), which lives in cxl_core.ko, causing an
undefined reference at link time.
This happens with both the >= and || ! Kconfig patterns.
The current #ifdef CONFIG_CXL_REGION guard evaluates to true even when
the region code is only built into the cxl_core module. Changing the
guard in include/cxl/cxl.h to check reachability of the actual module
got me past the error:
-#ifdef CONFIG_CXL_REGION
+#if IS_REACHABLE(CONFIG_CXL_BUS) && defined(CONFIG_CXL_REGION)
bool cxl_region_contains_resource(struct resource *res);
#else
...
I am not sure whether CONFIG_CXL_BUS is the right check here, or whether
it should check CXL_ACPI or CXL_PCI more specifically.
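For reference, a user-space model of why the guard change helps, assuming the usual semantics from include/linux/kconfig.h: IS_ENABLED() is true for y or m, while IS_REACHABLE() is true for y, or for m only when the calling code is itself built as a module. The function names below are illustrative and are not the kernel macros.

```c
#include <stdbool.h>

enum tristate { TRI_N, TRI_M, TRI_Y };

/* Models IS_ENABLED(): set to either y or m. */
static bool toy_is_enabled(enum tristate opt)
{
    return opt != TRI_N;
}

/*
 * Models IS_REACHABLE(): built-in, or modular while the caller is
 * modular too, so the symbol can be resolved at module load time.
 */
static bool toy_is_reachable(enum tristate opt, bool caller_is_module)
{
    return opt == TRI_Y || (opt == TRI_M && caller_is_module);
}

/*
 * Models the proposed guard:
 *   IS_REACHABLE(CONFIG_CXL_BUS) && defined(CONFIG_CXL_REGION)
 * True means the real prototype is used; false means the stub branch.
 */
static bool toy_guard_uses_real_function(enum tristate cxl_bus,
                                         bool cxl_region,
                                         bool caller_is_module)
{
    return toy_is_reachable(cxl_bus, caller_is_module) && cxl_region;
}
```

With CXL_BUS = m, CXL_REGION = y and a built-in hmem.c, the model selects the stub branch, so no reference to a symbol in cxl_core.ko is emitted; a plain enabled-style check would still select the real prototype in that configuration.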
Thanks
Smita