On 11.12.23 23:52, Vishal Verma wrote:
Add a sysfs knob for dax devices to control the memmap_on_memory setting
if the dax device were to be hotplugged as system memory.

The default memmap_on_memory setting for dax devices originating via
pmem or hmem is set to 'false' - i.e. no memmap_on_memory semantics, to
preserve legacy behavior. For dax devices via CXL, the default is on.
The sysfs control allows the administrator to override the above
defaults if needed.

Cc: David Hildenbrand <da...@redhat.com>
Cc: Dan Williams <dan.j.willi...@intel.com>
Cc: Dave Jiang <dave.ji...@intel.com>
Cc: Dave Hansen <dave.han...@linux.intel.com>
Cc: Huang Ying <ying.hu...@intel.com>
Tested-by: Li Zhijian <lizhij...@fujitsu.com>
Reviewed-by: Jonathan Cameron <jonathan.came...@huawei.com>
Reviewed-by: David Hildenbrand <da...@redhat.com>
Signed-off-by: Vishal Verma <vishal.l.ve...@intel.com>
---
  drivers/dax/bus.c                       | 47 +++++++++++++++++++++++++++++++++
  Documentation/ABI/testing/sysfs-bus-dax | 17 ++++++++++++
  2 files changed, 64 insertions(+)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 1ff1ab5fa105..2871e5188f0d 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -1270,6 +1270,52 @@ static ssize_t numa_node_show(struct device *dev,
  }
  static DEVICE_ATTR_RO(numa_node);
+static ssize_t memmap_on_memory_show(struct device *dev,
+                                    struct device_attribute *attr, char *buf)
+{
+       struct dev_dax *dev_dax = to_dev_dax(dev);
+
+       return sprintf(buf, "%d\n", dev_dax->memmap_on_memory);
+}
+
+static ssize_t memmap_on_memory_store(struct device *dev,
+                                     struct device_attribute *attr,
+                                     const char *buf, size_t len)
+{
+       struct device_driver *drv = dev->driver;
+       struct dev_dax *dev_dax = to_dev_dax(dev);
+       struct dax_region *dax_region = dev_dax->region;
+       struct dax_device_driver *dax_drv = to_dax_drv(drv);
+       ssize_t rc;
+       bool val;
+
+       rc = kstrtobool(buf, &val);
+       if (rc)
+               return rc;
+
+       if (dev_dax->memmap_on_memory == val)
+               return len;
+
+       device_lock(dax_region->dev);
+       if (!dax_region->dev->driver) {
+               device_unlock(dax_region->dev);
+               return -ENXIO;
+       }
+
+       if (dax_drv->type == DAXDRV_KMEM_TYPE) {
+               device_unlock(dax_region->dev);
+               return -EBUSY;
+       }
+
+       device_lock(dev);
+       dev_dax->memmap_on_memory = val;
+       device_unlock(dev);
+
+       device_unlock(dax_region->dev);
+       return len;
+}
+static DEVICE_ATTR_RW(memmap_on_memory);
+
  static umode_t dev_dax_visible(struct kobject *kobj, struct attribute *a, int n)
  {
        struct device *dev = container_of(kobj, struct device, kobj);
@@ -1296,6 +1342,7 @@ static struct attribute *dev_dax_attributes[] = {
        &dev_attr_align.attr,
        &dev_attr_resource.attr,
        &dev_attr_numa_node.attr,
+       &dev_attr_memmap_on_memory.attr,
        NULL,
  };
diff --git a/Documentation/ABI/testing/sysfs-bus-dax b/Documentation/ABI/testing/sysfs-bus-dax
index a61a7b186017..b1fd8bf8a7de 100644
--- a/Documentation/ABI/testing/sysfs-bus-dax
+++ b/Documentation/ABI/testing/sysfs-bus-dax
@@ -149,3 +149,20 @@ KernelVersion:     v5.1
  Contact:      nvd...@lists.linux.dev
  Description:
                (RO) The id attribute indicates the region id of a dax region.
+
+What:          /sys/bus/dax/devices/daxX.Y/memmap_on_memory
+Date:          October, 2023
+KernelVersion: v6.8
+Contact:       nvd...@lists.linux.dev
+Description:
+               (RW) Control the memmap_on_memory setting if the dax device
+               were to be hotplugged as system memory. This determines whether
+               the 'altmap' for the hotplugged memory will be placed on the
+               device being hotplugged (memmap_on_memory=1) or if it will be
+               placed on regular memory (memmap_on_memory=0). This attribute
+               must be set before the device is handed over to the 'kmem'
+               driver (i.e.  hotplugged into system-ram). Additionally, this
+               depends on CONFIG_MHP_MEMMAP_ON_MEMORY, and a globally enabled
+               memmap_on_memory parameter for memory_hotplug. This is
+               typically set on the kernel command line -
+               memory_hotplug.memmap_on_memory set to 'true' or 'force'."


Thinking about it, I wonder if we could disallow setting that property to "true" if the current configuration does not allow it.

That is:

1) Removing the "size" parameter from mhp_supports_memmap_on_memory(); it no longer makes sense.

2) Exporting mhp_supports_memmap_on_memory() to modules.

3) When setting memmap_on_memory, check whether mhp_supports_memmap_on_memory() == true.

Then, the user really gets an error when trying to set it to "true".
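
A rough userspace sketch of what step 3 might look like, for illustration only. Everything prefixed with mock_ is hypothetical and not kernel API: mock_dev_dax stands in for struct dev_dax, mock_strtobool() is a much-simplified stand-in for kstrtobool(), and mhp_supports_memmap_on_memory() is stubbed here as the parameterless variant that step 1 would produce. The -EOPNOTSUPP return value is an assumption; the locking and the kmem driver check from the patch are left out to keep the sketch short.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/* Hypothetical switch that drives the stub below; not part of the kernel. */
static bool mock_mhp_supported;

/*
 * Stub for the helper after step 1 (no "size" parameter). In the kernel
 * this would be the real, exported function (step 2).
 */
static bool mhp_supports_memmap_on_memory(void)
{
	return mock_mhp_supported;
}

/* Minimal stand-in for struct dev_dax; only the field of interest. */
struct mock_dev_dax {
	bool memmap_on_memory;
};

/*
 * Simplified stand-in for kstrtobool(): accepts only "0" or "1",
 * optionally followed by a single newline.
 */
static int mock_strtobool(const char *buf, bool *val)
{
	if (buf[0] != '0' && buf[0] != '1')
		return -EINVAL;
	if (buf[1] != '\0' && !(buf[1] == '\n' && buf[2] == '\0'))
		return -EINVAL;
	*val = (buf[0] == '1');
	return 0;
}

/*
 * Sketch of the store path with the proposed check (step 3): enabling
 * the property fails when the current configuration does not support
 * memmap_on_memory. The error code is an assumption.
 */
static ssize_t mock_memmap_on_memory_store(struct mock_dev_dax *dev_dax,
					   const char *buf, size_t len)
{
	bool val;
	int rc;

	assert(dev_dax && buf);

	rc = mock_strtobool(buf, &val);
	if (rc)
		return rc;

	if (val && !mhp_supports_memmap_on_memory())
		return -EOPNOTSUPP;

	dev_dax->memmap_on_memory = val;
	return (ssize_t)len;
}
```

In this sketch, writing "0" always succeeds, so the administrator can still turn the property off on a configuration that does not support it; only the transition to "true" is rejected.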

--
Cheers,

David / dhildenb

