[PATCH v3 8/9] MAINTAINERS: Add maintainers of the CXL driver

2021-02-12 Thread Ben Widawsky
Cc: Dan Williams 
Cc: Vishal Verma 
Cc: Ira Weiny 
Cc: Alison Schofield 
Signed-off-by: Ben Widawsky 
---
 MAINTAINERS | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 6eff4f720c72..93c8694a8f04 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -,6 +,17 @@ M:   Miguel Ojeda 
 S: Maintained
 F: include/linux/compiler_attributes.h
 
+COMPUTE EXPRESS LINK (CXL)
+M: Alison Schofield 
+M: Vishal Verma 
+M: Ira Weiny 
+M: Ben Widawsky 
+M: Dan Williams 
+L: linux-...@vger.kernel.org
+S: Maintained
+F: drivers/cxl/
+F: include/uapi/linux/cxl_mem.h
+
 CONEXANT ACCESSRUNNER USB DRIVER
 L: accessrunner-gene...@lists.sourceforge.net
 S: Orphan
-- 
2.30.0
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


[RFC PATCH 9/9] cxl/mem: Add payload dumping for debug

2021-02-12 Thread Ben Widawsky
It's often useful in debug scenarios to see what the hardware has dumped
out. As it stands today, any device error will result in the payload not
being copied out, so there is no way to triage commands which weren't
expected to fail (and sometimes the payload may have that information).

The functionality is protected by normal kernel security mechanisms as
well as a CONFIG option in the CXL driver.

This was extracted from the original version of the CXL enabling patch
series.

Signed-off-by: Ben Widawsky 
---
 drivers/cxl/Kconfig | 13 +
 drivers/cxl/mem.c   |  8 
 2 files changed, 21 insertions(+)

diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 97dc4d751651..3eec9276e586 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -50,4 +50,17 @@ config CXL_MEM_RAW_COMMANDS
  potential impact to memory currently in use by the kernel.
 
  If developing CXL hardware or the driver say Y, otherwise say N.
+
+config CXL_MEM_INSECURE_DEBUG
+   bool "CXL.mem debugging"
+   depends on CXL_MEM
+   help
+ Enable debug of all CXL command payloads.
+
+ Some CXL devices and controllers support encryption and other
+ security features. The payloads for the commands that enable
+ those features may contain sensitive clear-text security
+ material. Disable debug of those command payloads by default.
+ If you are a kernel developer actively working on CXL
+ security enabling say Y, otherwise say N.
 endif
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index 3bca8451348a..09c11935b824 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -341,6 +341,14 @@ static int __cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
 
/* #5 */
rc = cxl_mem_wait_for_doorbell(cxlm);
+
+   if (!cxl_is_security_command(mbox_cmd->opcode) ||
+   IS_ENABLED(CONFIG_CXL_MEM_INSECURE_DEBUG)) {
+   print_hex_dump_debug("Payload ", DUMP_PREFIX_OFFSET, 16, 1,
+mbox_cmd->payload_in, mbox_cmd->size_in,
+true);
+   }
+
if (rc == -ETIMEDOUT) {
cxl_mem_mbox_timeout(cxlm, mbox_cmd);
return rc;
-- 
2.30.0
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Please take a look here

2021-02-12 Thread Anthony R Chavis
Title: Dear






Dear Sir/Ma,


$150 MILLION USD FROM WORLD HEALTH ORGANIZATION DIVERTED AND READY TO BE 
CLAIMED.

Greetings to you and happy new year 2021. Please take few minutes of your time 
to read this.

My name is Mr. Anthony Ray Chavis from North Carolina USA, I Currently work with 
the
World Health Organization (W.H.O) United Kingdom office and have been with them 
for
many years moving from one country to another which has made me one of the top 
people
in the organization.

You are aware of the recent outbreak of Corona Virus Pandemic which started from 
Wuhan
in China and escalated all over the world which lead to shutdown of many 
countries by
the Government. I am happy that as at today many Government have started 
reopening their
Country for work and businesses.

Since the start of the Covid-19 pandemic, Billions of Dollars came in to World 
Health
Organization as donated by individuals, Companies and Countries for the purpose 
of
curbing and containing the deadly corona virus.


I happen to head one of the department in charge of this funds and since last 
year I have
carefully and perfectly cut out the amount of One Hundred and Fifty Million 
United States
Dollars. ($150,000,000. USD). This amount has been deleted from our Database and 
included
in the office file as part of the fund moved out to be disbursed to some 
Countries as
Palliatives.

I have masterfully planned this out in a way that no traces will be found now 
and forever.
What I did is to divert the $150,000,000.USD to a safe place pending when I have 
a good
partner to claim the fund. Due to the fact that we are always being monitor 
along with our
family members this is why I started searching on the for someone who will be 
capable to
stand in the place to receive the fund as a representative to his/her country.
I will like to work with someone like you because you have a good profile. 

This is real, legitimate and will not put you and I in any problem. I will guide 
you properly
till you have received the fund, but you must make sure no one knows about this 
deal between
you and I. After receiving the fund we shall share the fund equally between us.

I will tell you more about this immediately I receive your response with your 
details,
kindly reconfirm to me your full details as follows:
FULL NAMES 
ADDRESS 
DATE OF BIRTH
GENDER
MARITAL STATUS
PHONE NUMBERS
OCCUPATION
ALTERNATIVE EMAIL ADDRESS
ATTACH COPY OF YOUR ID.

This is a good opportunity that only comes once in a lifetime therefore we must 
make good use
of it in this new year.


Kind regards,


Anthony Ray Chavisaccounts.paya...@worldhealthorgan.org
 



___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


[PATCH v3 7/9] cxl/mem: Add set of informational commands

2021-02-12 Thread Ben Widawsky
Add initial set of formal commands beyond basic identify and command
enumeration.

Signed-off-by: Ben Widawsky 
Reviewed-by: Dan Williams 
Reviewed-by: Jonathan Cameron  (v2)
---
 drivers/cxl/mem.c| 9 +
 include/uapi/linux/cxl_mem.h | 5 +
 2 files changed, 14 insertions(+)

diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index b026c1167bda..3bca8451348a 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -44,12 +44,16 @@
 enum opcode {
CXL_MBOX_OP_INVALID = 0x,
CXL_MBOX_OP_RAW = CXL_MBOX_OP_INVALID,
+   CXL_MBOX_OP_GET_FW_INFO = 0x0200,
CXL_MBOX_OP_ACTIVATE_FW = 0x0202,
CXL_MBOX_OP_GET_SUPPORTED_LOGS  = 0x0400,
CXL_MBOX_OP_GET_LOG = 0x0401,
CXL_MBOX_OP_IDENTIFY= 0x4000,
+   CXL_MBOX_OP_GET_PARTITION_INFO  = 0x4100,
CXL_MBOX_OP_SET_PARTITION_INFO  = 0x4101,
+   CXL_MBOX_OP_GET_LSA = 0x4102,
CXL_MBOX_OP_SET_LSA = 0x4103,
+   CXL_MBOX_OP_GET_HEALTH_INFO = 0x4200,
CXL_MBOX_OP_SET_SHUTDOWN_STATE  = 0x4204,
CXL_MBOX_OP_SCAN_MEDIA  = 0x4304,
CXL_MBOX_OP_GET_SCAN_MEDIA  = 0x4305,
@@ -166,6 +170,11 @@ static struct cxl_mem_command mem_commands[] = {
CXL_CMD(RAW, ~0, ~0, 0),
 #endif
CXL_CMD(GET_SUPPORTED_LOGS, 0, ~0, CXL_CMD_FLAG_FORCE_ENABLE),
+   CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
+   CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
+   CXL_CMD(GET_LSA, 0x8, ~0, 0),
+   CXL_CMD(GET_HEALTH_INFO, 0, 0x12, 0),
+   CXL_CMD(GET_LOG, 0x18, ~0, CXL_CMD_FLAG_FORCE_ENABLE),
 };
 
 /*
diff --git a/include/uapi/linux/cxl_mem.h b/include/uapi/linux/cxl_mem.h
index 3b5bf4b58fb4..7670fe0e605a 100644
--- a/include/uapi/linux/cxl_mem.h
+++ b/include/uapi/linux/cxl_mem.h
@@ -24,6 +24,11 @@
___C(IDENTIFY, "Identify Command"),   \
___C(RAW, "Raw device command"),  \
___C(GET_SUPPORTED_LOGS, "Get Supported Logs"),   \
+   ___C(GET_FW_INFO, "Get FW Info"), \
+   ___C(GET_PARTITION_INFO, "Get Partition Information"),\
+   ___C(GET_LSA, "Get Label Storage Area"),  \
+   ___C(GET_HEALTH_INFO, "Get Health Info"), \
+   ___C(GET_LOG, "Get Log"), \
___C(MAX, "invalid / last command")
 
 #define ___C(a, b) CXL_MEM_COMMAND_ID_##a
-- 
2.30.0
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


[PATCH v3 6/9] cxl/mem: Enable commands via CEL

2021-02-12 Thread Ben Widawsky
CXL devices identified by the memory-device class code must implement
the Device Command Interface (described in 8.2.9 of the CXL 2.0 spec).
While the driver already maintains a list of commands it supports, there
is still a need to be able to distinguish between commands that the
driver knows about from commands that are optionally supported by the
hardware.

The Command Effects Log (CEL) is specified in the CXL 2.0 specification.
The CEL is one of two types of logs, the other being vendor specific.
They are distinguished in hardware/spec via UUID. The CEL is useful for
2 things:
1. Determine which optional commands are supported by the CXL device.
2. Enumerate any vendor specific commands

The CEL is used by the driver to determine which commands are available
in the hardware and therefore which commands userspace is allowed to
execute. The set of enabled commands might be a subset of commands which
are advertised in UAPI via CXL_MEM_SEND_COMMAND IOCTL.

With the CEL enabling comes a internal flag to indicate a base set of
commands that are enabled regardless of CEL. Such commands are required
for basic interaction with the hardware and thus can be useful in debug
cases, for example if the CEL is corrupted.

The implementation leaves the statically defined table of commands and
supplements it with a bitmap to determine commands that are enabled.
This organization was chosen for the following reasons:
- Smaller memory footprint. Doesn't need a table per device.
- Reduce memory allocation complexity.
- Fixed command IDs to opcode mapping for all devices makes development
  and debugging easier.
- Certain helpers are easily achievable, like cxl_for_each_cmd().

Signed-off-by: Ben Widawsky 
Reviewed-by: Dan Williams 
---
 drivers/cxl/cxl.h|   2 +
 drivers/cxl/mem.c| 224 ++-
 include/uapi/linux/cxl_mem.h |   1 +
 3 files changed, 224 insertions(+), 3 deletions(-)

diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 63d7f7e01b83..a1aa9f787a2a 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -67,6 +67,7 @@ struct cxl_memdev;
  *(CXL 2.0 8.2.8.4.3 Mailbox Capabilities Register)
  * @mbox_mutex: Mutex to synchronize mailbox access.
  * @firmware_version: Firmware version for the memory device.
+ * @enabled_commands: Hardware commands found enabled in CEL.
  * @pmem: Persistent memory capacity information.
  * @ram: Volatile memory capacity information.
  */
@@ -82,6 +83,7 @@ struct cxl_mem {
size_t payload_size;
struct mutex mbox_mutex; /* Protects device mailbox and firmware */
char firmware_version[0x10];
+   unsigned long *enabled_cmds;
 
struct range pmem_range;
struct range ram_range;
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index a819f090ffe2..b026c1167bda 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -45,6 +45,8 @@ enum opcode {
CXL_MBOX_OP_INVALID = 0x,
CXL_MBOX_OP_RAW = CXL_MBOX_OP_INVALID,
CXL_MBOX_OP_ACTIVATE_FW = 0x0202,
+   CXL_MBOX_OP_GET_SUPPORTED_LOGS  = 0x0400,
+   CXL_MBOX_OP_GET_LOG = 0x0401,
CXL_MBOX_OP_IDENTIFY= 0x4000,
CXL_MBOX_OP_SET_PARTITION_INFO  = 0x4101,
CXL_MBOX_OP_SET_LSA = 0x4103,
@@ -103,10 +105,28 @@ static DEFINE_IDA(cxl_memdev_ida);
 static struct dentry *cxl_debugfs;
 static bool cxl_raw_allow_all;
 
+enum {
+   CEL_UUID,
+   VENDOR_DEBUG_UUID,
+};
+
+/* See CXL 2.0 Table 170. Get Log Input Payload */
+static const uuid_t log_uuid[] = {
+   [CEL_UUID] = UUID_INIT(0xda9c0b5, 0xbf41, 0x4b78, 0x8f, 0x79, 0x96,
+  0xb1, 0x62, 0x3b, 0x3f, 0x17),
+   [VENDOR_DEBUG_UUID] = UUID_INIT(0xe1819d9, 0x11a9, 0x400c, 0x81, 0x1f,
+   0xd6, 0x07, 0x19, 0x40, 0x3d, 0x86),
+};
+
 /**
  * struct cxl_mem_command - Driver representation of a memory device command
  * @info: Command information as it exists for the UAPI
  * @opcode: The actual bits used for the mailbox protocol
+ * @flags: Set of flags effecting driver behavior.
+ *
+ *  * %CXL_CMD_FLAG_FORCE_ENABLE: In cases of error, commands with this flag
+ *will be enabled by the driver regardless of what hardware may have
+ *advertised.
  *
  * The cxl_mem_command is the driver's internal representation of commands that
  * are supported by the driver. Some of these commands may not be supported by
@@ -118,9 +138,12 @@ static bool cxl_raw_allow_all;
 struct cxl_mem_command {
struct cxl_command_info info;
enum opcode opcode;
+   u32 flags;
+#define CXL_CMD_FLAG_NONE 0
+#define CXL_CMD_FLAG_FORCE_ENABLE BIT(0)
 };
 
-#define CXL_CMD(_id, sin, sout)
\
+#define CXL_CMD(_id, sin, sout, _flags)
\
[CXL_MEM_COMMAND_ID_##_id] = { \
   

[PATCH v3 5/9] cxl/mem: Add a "RAW" send command

2021-02-12 Thread Ben Widawsky
The CXL memory device send interface will have a number of supported
commands. The raw command is not such a command. Raw commands allow
userspace to send a specified opcode to the underlying hardware and
bypass all driver checks on the command. The primary use for this
command is to [begrudgingly] allow undocumented vendor specific hardware
commands.

While not the main motivation, it also allows prototyping new hardware
commands without a driver patch and rebuild.

While this all sounds very powerful it comes with a couple of caveats:
1. Bug reports using raw commands will not get the same level of
   attention as bug reports using supported commands (via taint).
2. Supported commands will be rejected by the RAW command.

With this comes new debugfs knob to allow full access to your toes with
your weapon of choice.

Cc: Ariel Sibley 
Signed-off-by: Ben Widawsky 
Reviewed-by: Dan Williams  (v2)
---
 drivers/cxl/Kconfig  |  18 ++
 drivers/cxl/mem.c| 121 +++
 include/uapi/linux/cxl_mem.h |  12 +++-
 3 files changed, 150 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 9e80b311e928..97dc4d751651 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -32,4 +32,22 @@ config CXL_MEM
  Chapter 2.3 Type 3 CXL Device in the CXL 2.0 specification.
 
  If unsure say 'm'.
+
+config CXL_MEM_RAW_COMMANDS
+   bool "RAW Command Interface for Memory Devices"
+   depends on CXL_MEM
+   help
+ Enable CXL RAW command interface.
+
+ The CXL driver ioctl interface may assign a kernel ioctl command
+ number for each specification defined opcode. At any given point in
+ time the number of opcodes that the specification defines and a device
+ may implement may exceed the kernel's set of associated ioctl function
+ numbers. The mismatch is either by omission, specification is too new,
+ or by design. When prototyping new hardware, or developing / debugging
+ the driver it is useful to be able to submit any possible command to
+ the hardware, even commands that may crash the kernel due to their
+ potential impact to memory currently in use by the kernel.
+
+ If developing CXL hardware or the driver say Y, otherwise say N.
 endif
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index d764a35afea9..a819f090ffe2 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -1,6 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -41,7 +43,14 @@
 
 enum opcode {
CXL_MBOX_OP_INVALID = 0x,
+   CXL_MBOX_OP_RAW = CXL_MBOX_OP_INVALID,
+   CXL_MBOX_OP_ACTIVATE_FW = 0x0202,
CXL_MBOX_OP_IDENTIFY= 0x4000,
+   CXL_MBOX_OP_SET_PARTITION_INFO  = 0x4101,
+   CXL_MBOX_OP_SET_LSA = 0x4103,
+   CXL_MBOX_OP_SET_SHUTDOWN_STATE  = 0x4204,
+   CXL_MBOX_OP_SCAN_MEDIA  = 0x4304,
+   CXL_MBOX_OP_GET_SCAN_MEDIA  = 0x4305,
CXL_MBOX_OP_MAX = 0x1
 };
 
@@ -91,6 +100,8 @@ struct cxl_memdev {
 
 static int cxl_mem_major;
 static DEFINE_IDA(cxl_memdev_ida);
+static struct dentry *cxl_debugfs;
+static bool cxl_raw_allow_all;
 
 /**
  * struct cxl_mem_command - Driver representation of a memory device command
@@ -127,6 +138,49 @@ struct cxl_mem_command {
  */
 static struct cxl_mem_command mem_commands[] = {
CXL_CMD(IDENTIFY, 0, 0x43),
+#ifdef CONFIG_CXL_MEM_RAW_COMMANDS
+   CXL_CMD(RAW, ~0, ~0),
+#endif
+};
+
+/*
+ * Commands that RAW doesn't permit. The rationale for each:
+ *
+ * CXL_MBOX_OP_ACTIVATE_FW: Firmware activation requires adjustment /
+ * coordination of transaction timeout values at the root bridge level.
+ *
+ * CXL_MBOX_OP_SET_PARTITION_INFO: The device memory map may change live
+ * and needs to be coordinated with HDM updates.
+ *
+ * CXL_MBOX_OP_SET_LSA: The label storage area may be cached by the
+ * driver and any writes from userspace invalidates those contents.
+ *
+ * CXL_MBOX_OP_SET_SHUTDOWN_STATE: Set shutdown state assumes no writes
+ * to the device after it is marked clean, userspace can not make that
+ * assertion.
+ *
+ * CXL_MBOX_OP_[GET_]SCAN_MEDIA: The kernel provides a native error list that
+ * is kept up to date with patrol notifications and error management.
+ */
+static u16 cxl_disabled_raw_commands[] = {
+   CXL_MBOX_OP_ACTIVATE_FW,
+   CXL_MBOX_OP_SET_PARTITION_INFO,
+   CXL_MBOX_OP_SET_LSA,
+   CXL_MBOX_OP_SET_SHUTDOWN_STATE,
+   CXL_MBOX_OP_SCAN_MEDIA,
+   CXL_MBOX_OP_GET_SCAN_MEDIA,
+};
+
+/*
+ * Command sets that RAW doesn't permit. All opcodes in this set are
+ * disabled because they pass plain text security payloads over the
+ * user/kernel boundary. This functionality is intended to be wrapped

[PATCH v3 4/9] cxl/mem: Add basic IOCTL interface

2021-02-12 Thread Ben Widawsky
Add a straightforward IOCTL that provides a mechanism for userspace to
query the supported memory device commands. CXL commands as they appear
to userspace are described as part of the UAPI kerneldoc. The command
list returned via this IOCTL will contain the full set of commands that
the driver supports, however, some of those commands may not be
available for use by userspace.

Memory device commands first appear in the CXL 2.0 specification. They
are submitted through a mailbox mechanism specified in the CXL 2.0
specification.

The send command allows userspace to issue mailbox commands directly to
the hardware. The list of available commands to send are the output of
the query command. The driver verifies basic properties of the command
and possibly inspect the input (or output) payload to determine whether
or not the command is allowed (or might taint the kernel).

Reported-by: kernel test robot  # bug in earlier revision
Signed-off-by: Ben Widawsky 
Reviewed-by: Dan Williams  (v2)
---
 .clang-format |   1 +
 .../userspace-api/ioctl/ioctl-number.rst  |   1 +
 drivers/cxl/mem.c | 280 +-
 include/uapi/linux/cxl_mem.h  | 154 ++
 4 files changed, 435 insertions(+), 1 deletion(-)
 create mode 100644 include/uapi/linux/cxl_mem.h

diff --git a/.clang-format b/.clang-format
index 10dc5a9a61b3..3f11c8901b43 100644
--- a/.clang-format
+++ b/.clang-format
@@ -109,6 +109,7 @@ ForEachMacros:
   - 'css_for_each_child'
   - 'css_for_each_descendant_post'
   - 'css_for_each_descendant_pre'
+  - 'cxl_for_each_cmd'
   - 'device_for_each_child_node'
   - 'dma_fence_chain_for_each'
   - 'do_for_each_ftrace_op'
diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst 
b/Documentation/userspace-api/ioctl/ioctl-number.rst
index a4c75a28c839..6eb8e634664d 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -352,6 +352,7 @@ Code  Seq#Include File  
 Comments
  

 0xCC  00-0F  drivers/misc/ibmvmc.h   pseries 
VMC driver
 0xCD  01 linux/reiserfs_fs.h
+0xCE  01-02  uapi/linux/cxl_mem.hCompute 
Express Link Memory Devices
 0xCF  02 fs/cifs/ioctl.c
 0xDB  00-0F  drivers/char/mwave/mwavepub.h
 0xDD  00-3F  ZFCP 
device driver see drivers/s390/scsi/
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index f03482aa155d..d764a35afea9 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
+#include 
 #include 
 #include 
 #include 
@@ -39,6 +40,7 @@
 #define CXL_MAILBOX_TIMEOUT_MS (2 * HZ)
 
 enum opcode {
+   CXL_MBOX_OP_INVALID = 0x,
CXL_MBOX_OP_IDENTIFY= 0x4000,
CXL_MBOX_OP_MAX = 0x1
 };
@@ -90,6 +92,49 @@ struct cxl_memdev {
 static int cxl_mem_major;
 static DEFINE_IDA(cxl_memdev_ida);
 
+/**
+ * struct cxl_mem_command - Driver representation of a memory device command
+ * @info: Command information as it exists for the UAPI
+ * @opcode: The actual bits used for the mailbox protocol
+ *
+ * The cxl_mem_command is the driver's internal representation of commands that
+ * are supported by the driver. Some of these commands may not be supported by
+ * the hardware. The driver will use @info to validate the fields passed in by
+ * the user then submit the @opcode to the hardware.
+ *
+ * See struct cxl_command_info.
+ */
+struct cxl_mem_command {
+   struct cxl_command_info info;
+   enum opcode opcode;
+};
+
+#define CXL_CMD(_id, sin, sout)
\
+   [CXL_MEM_COMMAND_ID_##_id] = { \
+   .info = {  \
+   .id = CXL_MEM_COMMAND_ID_##_id,\
+   .size_in = sin,\
+   .size_out = sout,  \
+   }, \
+   .opcode = CXL_MBOX_OP_##_id,   \
+   }
+
+/*
+ * This table defines the supported mailbox commands for the driver. This table
+ * is made up of a UAPI structure. Non-negative values as parameters in the
+ * table will be validated against the user's input. For example, if size_in is
+ * 0, and the user passed in 1, it is an error.
+ */
+static struct cxl_mem_command mem_commands[] = {
+   CXL_CMD(IDENTIFY, 0, 0x43),
+};
+
+#define cxl_for_each_cmd(cmd)  

[PATCH v3 3/9] cxl/mem: Register CXL memX devices

2021-02-12 Thread Ben Widawsky
From: Dan Williams 

Create the /sys/bus/cxl hierarchy to enumerate:

* Memory Devices (per-endpoint control devices)

* Memory Address Space Devices (platform address ranges with
  interleaving, performance, and persistence attributes)

* Memory Regions (active provisioned memory from an address space device
  that is in use as System RAM or delegated to libnvdimm as Persistent
  Memory regions).

For now, only the per-endpoint control devices are registered on the
'cxl' bus. However, going forward it will provide a mechanism to
coordinate cross-device interleave.

Signed-off-by: Dan Williams 
Signed-off-by: Ben Widawsky 
Reviewed-by: Jonathan Cameron  (v2)
---
 Documentation/ABI/testing/sysfs-bus-cxl   |  26 ++
 .../driver-api/cxl/memory-devices.rst |  17 +
 drivers/cxl/Makefile  |   3 +
 drivers/cxl/bus.c |  29 ++
 drivers/cxl/cxl.h |   3 +
 drivers/cxl/mem.c | 301 +-
 6 files changed, 377 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-cxl
 create mode 100644 drivers/cxl/bus.c

diff --git a/Documentation/ABI/testing/sysfs-bus-cxl 
b/Documentation/ABI/testing/sysfs-bus-cxl
new file mode 100644
index ..2fe7490ad6a8
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-cxl
@@ -0,0 +1,26 @@
+What:  /sys/bus/cxl/devices/memX/firmware_version
+Date:  December, 2020
+KernelVersion: v5.12
+Contact:   linux-...@vger.kernel.org
+Description:
+   (RO) "FW Revision" string as reported by the Identify
+   Memory Device Output Payload in the CXL-2.0
+   specification.
+
+What:  /sys/bus/cxl/devices/memX/ram/size
+Date:  December, 2020
+KernelVersion: v5.12
+Contact:   linux-...@vger.kernel.org
+Description:
+   (RO) "Volatile Only Capacity" as bytes. Represents the
+   identically named field in the Identify Memory Device Output
+   Payload in the CXL-2.0 specification.
+
+What:  /sys/bus/cxl/devices/memX/pmem/size
+Date:  December, 2020
+KernelVersion: v5.12
+Contact:   linux-...@vger.kernel.org
+Description:
+   (RO) "Persistent Only Capacity" as bytes. Represents the
+   identically named field in the Identify Memory Device Output
+   Payload in the CXL-2.0 specification.
diff --git a/Documentation/driver-api/cxl/memory-devices.rst 
b/Documentation/driver-api/cxl/memory-devices.rst
index 43177e700d62..1bad466f9167 100644
--- a/Documentation/driver-api/cxl/memory-devices.rst
+++ b/Documentation/driver-api/cxl/memory-devices.rst
@@ -27,3 +27,20 @@ CXL Memory Device
 
 .. kernel-doc:: drivers/cxl/mem.c
:internal:
+
+CXL Bus
+---
+.. kernel-doc:: drivers/cxl/bus.c
+   :doc: cxl bus
+
+External Interfaces
+===
+
+CXL IOCTL Interface
+---
+
+.. kernel-doc:: include/uapi/linux/cxl_mem.h
+   :doc: UAPI
+
+.. kernel-doc:: include/uapi/linux/cxl_mem.h
+   :internal:
diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
index 4a30f7c3fc4a..a314a1891f4d 100644
--- a/drivers/cxl/Makefile
+++ b/drivers/cxl/Makefile
@@ -1,4 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_CXL_BUS) += cxl_bus.o
 obj-$(CONFIG_CXL_MEM) += cxl_mem.o
 
+ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL
+cxl_bus-y := bus.o
 cxl_mem-y := mem.o
diff --git a/drivers/cxl/bus.c b/drivers/cxl/bus.c
new file mode 100644
index ..58f74796d525
--- /dev/null
+++ b/drivers/cxl/bus.c
@@ -0,0 +1,29 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2020 Intel Corporation. All rights reserved. */
+#include 
+#include 
+
+/**
+ * DOC: cxl bus
+ *
+ * The CXL bus provides namespace for control devices and a rendezvous
+ * point for cross-device interleave coordination.
+ */
+struct bus_type cxl_bus_type = {
+   .name = "cxl",
+};
+EXPORT_SYMBOL_GPL(cxl_bus_type);
+
+static __init int cxl_bus_init(void)
+{
+   return bus_register(&cxl_bus_type);
+}
+
+static void cxl_bus_exit(void)
+{
+   bus_unregister(&cxl_bus_type);
+}
+
+module_init(cxl_bus_init);
+module_exit(cxl_bus_exit);
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 9cd9bc79fc48..63d7f7e01b83 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -55,6 +55,7 @@
(FIELD_GET(CXLMDEV_RESET_NEEDED_MASK, status) !=   \
 CXLMDEV_RESET_NEEDED_NOT)
 
+struct cxl_memdev;
 /**
  * struct cxl_mem - A CXL memory device
  * @pdev: The PCI device associated with this CXL device.
@@ -72,6 +73,7 @@
 struct cxl_mem {
struct pci_dev *pdev;
void __iomem *regs;
+   struct cxl_memdev *cxlmd;
 
void __iomem *status_regs;
void __iomem *mbox_regs;
@@ -85,4 +87,5 @@ struct cxl_mem {
struct range ram_range;
 };
 
+extern struct bus_type cxl_bus_type;
 #endif /* __CXL_H__ */
diff --git a/dr

[PATCH v3 2/9] cxl/mem: Find device capabilities

2021-02-12 Thread Ben Widawsky
Provide enough functionality to utilize the mailbox of a memory device.
The mailbox is used to interact with the firmware running on the memory
device. The flow is proven with one implemented command, "identify".
Because the class code has already told the driver this is a memory
device and the identify command is mandatory.

CXL devices contain an array of capabilities that describe the
interactions software can have with the device or firmware running on
the device. A CXL compliant device must implement the device status and
the mailbox capability. Additionally, a CXL compliant memory device must
implement the memory device capability. Each of the capabilities can
[will] provide an offset within the MMIO region for interacting with the
CXL device.

The capabilities tell the driver how to find and map the register space
for CXL Memory Devices. The registers are required to utilize the CXL
spec defined mailbox interface. The spec outlines two mailboxes, primary
and secondary. The secondary mailbox is earmarked for system firmware,
and not handled in this driver.

Primary mailboxes are capable of generating an interrupt when submitting
a background command. That implementation is saved for a later time.

Link: https://www.computeexpresslink.org/download-the-specification
Signed-off-by: Ben Widawsky 
Reviewed-by: Dan Williams  (v2)
---
 drivers/cxl/cxl.h |  88 
 drivers/cxl/mem.c | 542 +-
 drivers/cxl/pci.h |  14 ++
 3 files changed, 642 insertions(+), 2 deletions(-)
 create mode 100644 drivers/cxl/cxl.h

diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
new file mode 100644
index ..9cd9bc79fc48
--- /dev/null
+++ b/drivers/cxl/cxl.h
@@ -0,0 +1,88 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright(c) 2020 Intel Corporation. */
+
+#ifndef __CXL_H__
+#define __CXL_H__
+
+#include 
+#include 
+#include 
+
+/* CXL 2.0 8.2.8.1 Device Capabilities Array Register */
+#define CXLDEV_CAP_ARRAY_OFFSET 0x0
+#define   CXLDEV_CAP_ARRAY_CAP_ID 0
+#define   CXLDEV_CAP_ARRAY_ID_MASK GENMASK(15, 0)
+#define   CXLDEV_CAP_ARRAY_COUNT_MASK GENMASK(47, 32)
+/* CXL 2.0 8.2.8.2.1 CXL Device Capabilities */
+#define CXLDEV_CAP_CAP_ID_DEVICE_STATUS 0x1
+#define CXLDEV_CAP_CAP_ID_PRIMARY_MAILBOX 0x2
+#define CXLDEV_CAP_CAP_ID_SECONDARY_MAILBOX 0x3
+#define CXLDEV_CAP_CAP_ID_MEMDEV 0x4000
+
+/* CXL 2.0 8.2.8.4 Mailbox Registers */
+#define CXLDEV_MBOX_CAPS_OFFSET 0x00
+#define   CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
+#define CXLDEV_MBOX_CTRL_OFFSET 0x04
+#define   CXLDEV_MBOX_CTRL_DOORBELL BIT(0)
+#define CXLDEV_MBOX_CMD_OFFSET 0x08
+#define   CXLDEV_MBOX_CMD_COMMAND_OPCODE_MASK GENMASK(15, 0)
+#define   CXLDEV_MBOX_CMD_PAYLOAD_LENGTH_MASK GENMASK(36, 16)
+#define CXLDEV_MBOX_STATUS_OFFSET 0x10
+#define   CXLDEV_MBOX_STATUS_RET_CODE_MASK GENMASK(47, 32)
+#define CXLDEV_MBOX_BG_CMD_STATUS_OFFSET 0x18
+#define CXLDEV_MBOX_PAYLOAD_OFFSET 0x20
+
+/* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
+#define CXLMDEV_STATUS_OFFSET 0x0
+#define   CXLMDEV_DEV_FATAL BIT(0)
+#define   CXLMDEV_FW_HALT BIT(1)
+#define   CXLMDEV_STATUS_MEDIA_STATUS_MASK GENMASK(3, 2)
+#define CXLMDEV_MS_NOT_READY 0
+#define CXLMDEV_MS_READY 1
+#define CXLMDEV_MS_ERROR 2
+#define CXLMDEV_MS_DISABLED 3
+#define CXLMDEV_READY(status)  
\
+   (FIELD_GET(CXLMDEV_STATUS_MEDIA_STATUS_MASK, status) ==\
+CXLMDEV_MS_READY)
+#define   CXLMDEV_MBOX_IF_READY BIT(4)
+#define   CXLMDEV_RESET_NEEDED_MASK GENMASK(7, 5)
+#define CXLMDEV_RESET_NEEDED_NOT 0
+#define CXLMDEV_RESET_NEEDED_COLD 1
+#define CXLMDEV_RESET_NEEDED_WARM 2
+#define CXLMDEV_RESET_NEEDED_HOT 3
+#define CXLMDEV_RESET_NEEDED_CXL 4
+#define CXLMDEV_RESET_NEEDED(status)   
\
+   (FIELD_GET(CXLMDEV_RESET_NEEDED_MASK, status) !=   \
+CXLMDEV_RESET_NEEDED_NOT)
+
+/**
+ * struct cxl_mem - A CXL memory device
+ * @pdev: The PCI device associated with this CXL device.
+ * @regs: IO mappings to the device's MMIO
+ * @status_regs: CXL 2.0 8.2.8.3 Device Status Registers
+ * @mbox_regs: CXL 2.0 8.2.8.4 Mailbox Registers
+ * @memdev_regs: CXL 2.0 8.2.8.5 Memory Device Registers
+ * @payload_size: Size of space for payload
+ *(CXL 2.0 8.2.8.4.3 Mailbox Capabilities Register)
+ * @mbox_mutex: Mutex to synchronize mailbox access.
+ * @firmware_version: Firmware version for the memory device.
+ * @pmem: Persistent memory capacity information.
+ * @ram: Volatile memory capacity information.
+ */
+struct cxl_mem {
+   struct pci_dev *pdev;
+   void __iomem *regs;
+
+   void __iomem *status_regs;
+   void __iomem *mbox_regs;
+   void __iomem *memdev_regs;
+
+   size_t payload_size;
+   struct mutex mbox_mutex; /* Protects device mailbox and firmware */
+   char firmware_version[0x10];
+
+   struct range pmem_range;

[PATCH v3 1/9] cxl/mem: Introduce a driver for CXL-2.0-Type-3 endpoints

2021-02-12 Thread Ben Widawsky
From: Dan Williams 

The CXL.mem protocol allows a device to act as a provider of "System
RAM" and/or "Persistent Memory" that is fully coherent as if the memory
was attached to the typical CPU memory controller.

With the CXL-2.0 specification a PCI endpoint can implement a "Type-3"
device interface and give the operating system control over "Host
Managed Device Memory". See section 2.3 Type 3 CXL Device.

The memory range exported by the device may optionally be described by
the platform firmware memory map, or by infrastructure like LIBNVDIMM to
provision persistent memory capacity from one, or more, CXL.mem devices.

A pre-requisite for Linux-managed memory-capacity provisioning is this
cxl_mem driver that can speak the mailbox protocol defined in section
8.2.8.4 Mailbox Registers.

For now just land the initial driver boiler-plate and Documentation/
infrastructure.

Link: https://www.computeexpresslink.org/download-the-specification
Cc: Jonathan Corbet 
Signed-off-by: Dan Williams 
Signed-off-by: Ben Widawsky 
Acked-by: David Rientjes  (v1)
Reviewed-by: Jonathan Cameron 
---
 Documentation/driver-api/cxl/index.rst| 12 
 .../driver-api/cxl/memory-devices.rst | 29 +
 Documentation/driver-api/index.rst|  1 +
 drivers/Kconfig   |  1 +
 drivers/Makefile  |  1 +
 drivers/cxl/Kconfig   | 35 +++
 drivers/cxl/Makefile  |  4 ++
 drivers/cxl/mem.c | 62 +++
 drivers/cxl/pci.h | 17 +
 include/linux/pci_ids.h   |  1 +
 10 files changed, 163 insertions(+)
 create mode 100644 Documentation/driver-api/cxl/index.rst
 create mode 100644 Documentation/driver-api/cxl/memory-devices.rst
 create mode 100644 drivers/cxl/Kconfig
 create mode 100644 drivers/cxl/Makefile
 create mode 100644 drivers/cxl/mem.c
 create mode 100644 drivers/cxl/pci.h

diff --git a/Documentation/driver-api/cxl/index.rst 
b/Documentation/driver-api/cxl/index.rst
new file mode 100644
index ..036e49553542
--- /dev/null
+++ b/Documentation/driver-api/cxl/index.rst
@@ -0,0 +1,12 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+
+Compute Express Link
+
+
+.. toctree::
+   :maxdepth: 1
+
+   memory-devices
+
+.. only::  subproject and html
diff --git a/Documentation/driver-api/cxl/memory-devices.rst 
b/Documentation/driver-api/cxl/memory-devices.rst
new file mode 100644
index ..43177e700d62
--- /dev/null
+++ b/Documentation/driver-api/cxl/memory-devices.rst
@@ -0,0 +1,29 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: 
+
+===
+Compute Express Link Memory Devices
+===
+
+A Compute Express Link Memory Device is a CXL component that implements the
+CXL.mem protocol. It contains some amount of volatile memory, persistent 
memory,
+or both. It is enumerated as a PCI device for configuration and passing
+messages over an MMIO mailbox. Its contribution to the System Physical
+Address space is handled via HDM (Host Managed Device Memory) decoders
+that optionally define a device's contribution to an interleaved address
+range across multiple devices underneath a host-bridge or interleaved
+across host-bridges.
+
+Driver Infrastructure
+=
+
+This section covers the driver infrastructure for a CXL memory device.
+
+CXL Memory Device
+-
+
+.. kernel-doc:: drivers/cxl/mem.c
+   :doc: cxl mem
+
+.. kernel-doc:: drivers/cxl/mem.c
+   :internal:
diff --git a/Documentation/driver-api/index.rst 
b/Documentation/driver-api/index.rst
index 2456d0a97ed8..d246a18fd78f 100644
--- a/Documentation/driver-api/index.rst
+++ b/Documentation/driver-api/index.rst
@@ -35,6 +35,7 @@ available subsections can be seen below.
usb/index
firewire
pci/index
+   cxl/index
spi
i2c
ipmb
diff --git a/drivers/Kconfig b/drivers/Kconfig
index dcecc9f6e33f..62c753a73651 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -6,6 +6,7 @@ menu "Device Drivers"
 source "drivers/amba/Kconfig"
 source "drivers/eisa/Kconfig"
 source "drivers/pci/Kconfig"
+source "drivers/cxl/Kconfig"
 source "drivers/pcmcia/Kconfig"
 source "drivers/rapidio/Kconfig"
 
diff --git a/drivers/Makefile b/drivers/Makefile
index fd11b9ac4cc3..678ea810410f 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -73,6 +73,7 @@ obj-$(CONFIG_NVM) += lightnvm/
 obj-y  += base/ block/ misc/ mfd/ nfc/
 obj-$(CONFIG_LIBNVDIMM)+= nvdimm/
 obj-$(CONFIG_DAX)  += dax/
+obj-$(CONFIG_CXL_BUS)  += cxl/
 obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf/
 obj-$(CONFIG_NUBUS)+= nubus/
 obj-y  += macintosh/
diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
new file mode 100644
index ..9e80b311e928
--- /dev

[PATCH v3 0/9] CXL 2.0 Support

2021-02-12 Thread Ben Widawsky
# Changes since v2 [1]

  * s/mbox_lock/mbox_mutex in kdocs (Ben)
  * Remove stray comments about deleted flags (Ben)
  * Remove flags from CXL_CMD (Ben)
  * Rework cxl_mem_enumerate_cmds() to allow more than 2 commands (Ben, 
Jonathan)
* I misread the spec and this needed more robust handling.
  * Remove validate_payload() as it no longer is useful (Ben)
  * Remove check that CEL returned reasonable command list (Ben)
* It is easy enough to figure this out elsewhere.
* Enable sane set of commands regardless (Ben)
* Remove now useless cxl_enable_cmd() (Ben)
  * Add payload dump debugging regardless of timeout (Dan)
* Extracted to separate RFC patch (Ben)
  * Move PCI_DVSEC_HEADER1_LENGTH_MASK back to cxl.h (Jonathan, Bjorn)
  * Drop duplicated PCI_EXT_CAP_ID_DVSEC (Jonathan)
  * Use PCI_DEVICE_CLASS (Jonathan)
  * Create wrapper for kernel mailbox usage (Jonathan)
* Helps with error conditions
  * Various cosmetic changes (Jonathan)
  * Remove references to removed MUTEX flag (Jonathan)
  * Remove KERNEL flag since not used yet (Jonathan)
  * Remove payload dumping for debug (Jonathan)
  * Show example expansion from macro magic (Jonathan)

---

In addition to the mailing list, please feel free to use #cxl on oftc IRC for
discussion.

---

# Summary

Introduce support for “type-3” memory devices defined in the Compute Express
Link (CXL) 2.0 specification [2]. Specifically, these are the memory devices
defined by section 8.2.8.5 of the CXL 2.0 spec. A reference implementation
emulating these devices has been submitted to the QEMU mailing list [3] and is
available on gitlab [4], but will move to a shared tree on kernel.org after
initial acceptance. “Type-3” is a CXL device that acts as a memory expander for
RAM or Persistent Memory. The device might be interleaved with other CXL devices
in a given physical address range.

In addition to the core functionality of discovering the spec defined registers
and resources, introduce a CXL device model that will be the foundation for
translating CXL capabilities into existing Linux infrastructure for Persistent
Memory and other memory devices. For now, this only includes support for the
management command mailbox the surfacing of type-3 devices. These control
devices fill the role of “DIMMs” / nmemX memory-devices in LIBNVDIMM terms.

## Userspace Interaction

Interaction with the driver and type-3 devices via the CXL drivers is introduced
in this patch series and considered stable ABI. They include

   * sysfs - Documentation/ABI/testing/sysfs-bus-cxl
   * IOCTL - Documentation/driver-api/cxl/memory-devices.rst
   * debugfs - Documentation/ABI/testing/debugfs-debug

Work is in process to add support for CXL interactions to the ndctl project [5]

### Development plans

One of the unique challenges that CXL imposes on the Linux driver model is that
it requires the operating system to perform physical address space management
interleaved across devices and bridges. Whereas LIBNVDIMM handles a list of
established static persistent memory address ranges (for example from the ACPI
NFIT), CXL introduces hotplug and the concept of allocating address space to
instantiate persistent memory ranges. This is similar to PCI in the sense that
the platform establishes the MMIO range for PCI BARs to be allocated, but it is
significantly complicated by the fact that a given device can optionally be
interleaved with other devices and can participate in several interleave-sets at
once. LIBNVDIMM handled something like this with the aliasing between PMEM and
BLOCK-WINDOW mode, but CXL adds flexibility to alias DEVICE MEMORY through up to
10 decoders per device.

All of the above needs to be enabled with respect to PCI hotplug events on
Type-3 memory device which needs hooks to determine if a given device is
contributing to a "System RAM" address range that is unable to be unplugged. In
other words CXL ties PCI hotplug to Memory Hotplug and PCI hotplug needs to be
able to negotiate with memory hotplug.  In the medium term the implications of
CXL hotplug vs ACPI SRAT/SLIT/HMAT need to be reconciled. One capability that
seems to be needed is either the dynamic allocation of new memory nodes, or
default initializing extra pgdat instances beyond what is enumerated in ACPI
SRAT to accommodate hot-added CXL memory.

Patches welcome, questions welcome as the development effort on the post v5.12
capabilities proceeds.

## Running in QEMU

The incantation to get CXL support in QEMU [4] is considered unstable at this
time. Future readers of this cover letter should verify if any changes are
needed. For the novice QEMU user, the following can be copy/pasted into a
working QEMU commandline. It is enough to make the simplest topology possible.
The topology would consist of a single memory window, single type3 device,
single root port, and single host bridge.

+-+
|   CXL PXB   |
| |
|  +---+  |<--+
|  |CXL RP |  |

[PATCH 1/2] libnvdimm: simplify nvdimm_remove()

2021-02-12 Thread Uwe Kleine-König
nvdimm_remove is only ever called after nvdimm_probe() returned
successfully. In this case driver data is always set to a non-NULL value
so the check for driver data being NULL can go away as it's always false.

Signed-off-by: Uwe Kleine-König 
---
 drivers/nvdimm/dimm.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/nvdimm/dimm.c b/drivers/nvdimm/dimm.c
index 7d4ddc4d9322..94be3ae1d29f 100644
--- a/drivers/nvdimm/dimm.c
+++ b/drivers/nvdimm/dimm.c
@@ -117,9 +117,6 @@ static int nvdimm_remove(struct device *dev)
 {
struct nvdimm_drvdata *ndd = dev_get_drvdata(dev);
 
-   if (!ndd)
-   return 0;
-
nvdimm_bus_lock(dev);
dev_set_drvdata(dev, NULL);
nvdimm_bus_unlock(dev);

base-commit: 5c8fe583cce542aa0b84adc939ce85293de36e5e
-- 
2.29.2
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


[PATCH 2/2] libnvdimm: Make remove callback return void

2021-02-12 Thread Uwe Kleine-König
All drivers return 0 in their remove callback and the driver core ignores
the return value of nvdimm_bus_remove() anyhow. So simplify by changing
the driver remove callback to return void and return 0 unconditionally
to the upper layer.

Signed-off-by: Uwe Kleine-König 
---
 drivers/dax/pmem/compat.c |  3 +--
 drivers/nvdimm/blk.c  |  3 +--
 drivers/nvdimm/bus.c  | 13 +
 drivers/nvdimm/dimm.c |  4 +---
 drivers/nvdimm/pmem.c |  4 +---
 drivers/nvdimm/region.c   |  4 +---
 include/linux/nd.h|  2 +-
 7 files changed, 11 insertions(+), 22 deletions(-)

diff --git a/drivers/dax/pmem/compat.c b/drivers/dax/pmem/compat.c
index 863c114fd88c..d81dc35fd65d 100644
--- a/drivers/dax/pmem/compat.c
+++ b/drivers/dax/pmem/compat.c
@@ -41,10 +41,9 @@ static int dax_pmem_compat_release(struct device *dev, void 
*data)
return 0;
 }
 
-static int dax_pmem_compat_remove(struct device *dev)
+static void dax_pmem_compat_remove(struct device *dev)
 {
device_for_each_child(dev, NULL, dax_pmem_compat_release);
-   return 0;
 }
 
 static struct nd_device_driver dax_pmem_compat_driver = {
diff --git a/drivers/nvdimm/blk.c b/drivers/nvdimm/blk.c
index 22e5617b2cea..8a53728e13e6 100644
--- a/drivers/nvdimm/blk.c
+++ b/drivers/nvdimm/blk.c
@@ -310,11 +310,10 @@ static int nd_blk_probe(struct device *dev)
return nsblk_attach_disk(nsblk);
 }
 
-static int nd_blk_remove(struct device *dev)
+static void nd_blk_remove(struct device *dev)
 {
if (is_nd_btt(dev))
nvdimm_namespace_detach_btt(to_nd_btt(dev));
-   return 0;
 }
 
 static struct nd_device_driver nd_blk_driver = {
diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index 2304c6183822..48f0985ca8a0 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -113,18 +113,17 @@ static int nvdimm_bus_remove(struct device *dev)
struct nd_device_driver *nd_drv = to_nd_device_driver(dev->driver);
struct module *provider = to_bus_provider(dev);
struct nvdimm_bus *nvdimm_bus = walk_to_nvdimm_bus(dev);
-   int rc = 0;
 
if (nd_drv->remove) {
debug_nvdimm_lock(dev);
-   rc = nd_drv->remove(dev);
+   nd_drv->remove(dev);
debug_nvdimm_unlock(dev);
}
 
-   dev_dbg(&nvdimm_bus->dev, "%s.remove(%s) = %d\n", dev->driver->name,
-   dev_name(dev), rc);
+   dev_dbg(&nvdimm_bus->dev, "%s.remove(%s)\n", dev->driver->name,
+   dev_name(dev));
module_put(provider);
-   return rc;
+   return 0;
 }
 
 static void nvdimm_bus_shutdown(struct device *dev)
@@ -427,7 +426,7 @@ static void free_badrange_list(struct list_head 
*badrange_list)
list_del_init(badrange_list);
 }
 
-static int nd_bus_remove(struct device *dev)
+static void nd_bus_remove(struct device *dev)
 {
struct nvdimm_bus *nvdimm_bus = to_nvdimm_bus(dev);
 
@@ -446,8 +445,6 @@ static int nd_bus_remove(struct device *dev)
spin_unlock(&nvdimm_bus->badrange.lock);
 
nvdimm_bus_destroy_ndctl(nvdimm_bus);
-
-   return 0;
 }
 
 static int nd_bus_probe(struct device *dev)
diff --git a/drivers/nvdimm/dimm.c b/drivers/nvdimm/dimm.c
index 94be3ae1d29f..91d9163ee303 100644
--- a/drivers/nvdimm/dimm.c
+++ b/drivers/nvdimm/dimm.c
@@ -113,7 +113,7 @@ static int nvdimm_probe(struct device *dev)
return rc;
 }
 
-static int nvdimm_remove(struct device *dev)
+static void nvdimm_remove(struct device *dev)
 {
struct nvdimm_drvdata *ndd = dev_get_drvdata(dev);
 
@@ -121,8 +121,6 @@ static int nvdimm_remove(struct device *dev)
dev_set_drvdata(dev, NULL);
nvdimm_bus_unlock(dev);
put_ndd(ndd);
-
-   return 0;
 }
 
 static struct nd_device_driver nvdimm_driver = {
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 875076b0ea6c..062f0f22bac9 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -564,7 +564,7 @@ static int nd_pmem_probe(struct device *dev)
return pmem_attach_disk(dev, ndns);
 }
 
-static int nd_pmem_remove(struct device *dev)
+static void nd_pmem_remove(struct device *dev)
 {
struct pmem_device *pmem = dev_get_drvdata(dev);
 
@@ -579,8 +579,6 @@ static int nd_pmem_remove(struct device *dev)
pmem->bb_state = NULL;
}
nvdimm_flush(to_nd_region(dev->parent), NULL);
-
-   return 0;
 }
 
 static void nd_pmem_shutdown(struct device *dev)
diff --git a/drivers/nvdimm/region.c b/drivers/nvdimm/region.c
index bfce87ed72ab..e0c34120df37 100644
--- a/drivers/nvdimm/region.c
+++ b/drivers/nvdimm/region.c
@@ -87,7 +87,7 @@ static int child_unregister(struct device *dev, void *data)
return 0;
 }
 
-static int nd_region_remove(struct device *dev)
+static void nd_region_remove(struct device *dev)
 {
struct nd_region *nd_region = to_nd_region(dev);
 
@@ -108,8 +108,6 @@ static int nd_region_remove(struct device *dev)
 */

Re: [PATCH v2 2/8] cxl/mem: Find device capabilities

2021-02-12 Thread Ben Widawsky
On 21-02-12 13:27:06, Jonathan Cameron wrote:
> On Thu, 11 Feb 2021 07:55:29 -0800
> Ben Widawsky  wrote:
> 
> > On 21-02-11 09:55:48, Jonathan Cameron wrote:
> > > On Wed, 10 Feb 2021 10:16:05 -0800
> > > Ben Widawsky  wrote:
> > >   
> > > > On 21-02-10 08:55:57, Ben Widawsky wrote:  
> > > > > On 21-02-10 15:07:59, Jonathan Cameron wrote:
> > > > > > On Wed, 10 Feb 2021 13:32:52 +
> > > > > > Jonathan Cameron  wrote:
> > > > > > 
> > > > > > > On Tue, 9 Feb 2021 16:02:53 -0800
> > > > > > > Ben Widawsky  wrote:
> > > > > > > 
> > > > > > > > Provide enough functionality to utilize the mailbox of a memory 
> > > > > > > > device.
> > > > > > > > The mailbox is used to interact with the firmware running on 
> > > > > > > > the memory
> > > > > > > > device. The flow is proven with one implemented command, 
> > > > > > > > "identify".
> > > > > > > > Because the class code has already told the driver this is a 
> > > > > > > > memory
> > > > > > > > device and the identify command is mandatory.
> > > > > > > > 
> > > > > > > > CXL devices contain an array of capabilities that describe the
> > > > > > > > interactions software can have with the device or firmware 
> > > > > > > > running on
> > > > > > > > the device. A CXL compliant device must implement the device 
> > > > > > > > status and
> > > > > > > > the mailbox capability. Additionally, a CXL compliant memory 
> > > > > > > > device must
> > > > > > > > implement the memory device capability. Each of the 
> > > > > > > > capabilities can
> > > > > > > > [will] provide an offset within the MMIO region for interacting 
> > > > > > > > with the
> > > > > > > > CXL device.
> > > > > > > > 
> > > > > > > > The capabilities tell the driver how to find and map the 
> > > > > > > > register space
> > > > > > > > for CXL Memory Devices. The registers are required to utilize 
> > > > > > > > the CXL
> > > > > > > > spec defined mailbox interface. The spec outlines two 
> > > > > > > > mailboxes, primary
> > > > > > > > and secondary. The secondary mailbox is earmarked for system 
> > > > > > > > firmware,
> > > > > > > > and not handled in this driver.
> > > > > > > > 
> > > > > > > > Primary mailboxes are capable of generating an interrupt when 
> > > > > > > > submitting
> > > > > > > > a background command. That implementation is saved for a later 
> > > > > > > > time.
> > > > > > > > 
> > > > > > > > Link: 
> > > > > > > > https://www.computeexpresslink.org/download-the-specification
> > > > > > > > Signed-off-by: Ben Widawsky 
> > > > > > > > Reviewed-by: Dan Williams   
> > > > > > > 
> > > > > > > Hi Ben,
> > > > > > > 
> > > > > > > 
> > > > > > > > +/**
> > > > > > > > + * cxl_mem_mbox_send_cmd() - Send a mailbox command to a 
> > > > > > > > memory device.
> > > > > > > > + * @cxlm: The CXL memory device to communicate with.
> > > > > > > > + * @mbox_cmd: Command to send to the memory device.
> > > > > > > > + *
> > > > > > > > + * Context: Any context. Expects mbox_lock to be held.
> > > > > > > > + * Return: -ETIMEDOUT if timeout occurred waiting for 
> > > > > > > > completion. 0 on success.
> > > > > > > > + * Caller should check the return code in @mbox_cmd to 
> > > > > > > > make sure it
> > > > > > > > + * succeeded.  
> > > > > > > 
> > > > > > > cxl_xfer_log() doesn't check mbox_cmd->return_code and for my 
> > > > > > > test it currently
> > > > > > > enters an infinite loop as a result.
> > > > > 
> > > > > I meant to fix that.
> > > > > 
> > > > > > > 
> > > > > > > I haven't checked other paths, but to my mind it is not a good 
> > > > > > > idea to require
> > > > > > > two levels of error checking - the example here proves how easy 
> > > > > > > it is to forget
> > > > > > > one.
> > > > > 
> > > > > Demonstrably, you're correct. I think it would be good to have a 
> > > > > kernel only
> > > > > mbox command that does the error checking though. Let me type 
> > > > > something up and
> > > > > see how it looks.
> > > > 
> > > > Hi Jonathan. What do you think of this? The bit I'm on the fence about 
> > > > is if I
> > > > should validate output size too. I like the simplicity as it is, but it 
> > > > requires
> > > > every caller to possibly check output size, which is kind of the same 
> > > > problem
> > > > you're originally pointing out.  
> > > 
> > > The simplicity is good and this is pretty much what I expected you would 
> > > end up with
> > > (always reassuring)
> > > 
> > > For the output, perhaps just add another parameter to the wrapper for 
> > > minimum
> > > output length expected?
> > > 
> > > Now you mention the length question.  It does rather feel like there 
> > > should also
> > > be some protection on memcpy_fromio() copying too much data if the 
> > > hardware
> > > happens to return an unexpectedly long length.  Should never happen, but
> > > the hardening is worth adding anyway given it's easy to do.
> > > 
> > > Jonathan  
> > 
> > Some backg

Re: [PATCH v2 5/8] cxl/mem: Add a "RAW" send command

2021-02-12 Thread Jonathan Cameron
On Thu, 11 Feb 2021 08:01:48 -0800
Ben Widawsky  wrote:

> On 21-02-11 11:19:24, Jonathan Cameron wrote:
> > On Tue, 9 Feb 2021 16:02:56 -0800
> > Ben Widawsky  wrote:
> >   
> > > The CXL memory device send interface will have a number of supported
> > > commands. The raw command is not such a command. Raw commands allow
> > > userspace to send a specified opcode to the underlying hardware and
> > > bypass all driver checks on the command. This is useful for a couple of
> > > usecases, mainly:
> > > 1. Undocumented vendor specific hardware commands  
> > 
> > This one I get.  There are things we'd love to standardize but often they
> > need proving in a generation of hardware before the data is available to
> > justify taking it to a standards body.  Stuff like performance stats.
> > This stuff will all sit in the vendor defined range.  Maybe there is an
> > argument for in driver hooks to allow proper support even for these
> > (Ben mentioned this in the other branch of the thread).
> >   
> > > 2. Prototyping new hardware commands not yet supported by the driver  
> > 
> > For 2, could just have a convenient place to enable this by one line patch.
> > Some subsystems (SPI comes to mind) do this for their equivalent of raw
> > commands.  The code is all there to enable it but you need to hook it
> > up if you want to use it.  Avoids chance of a distro shipping it.
> >   
> 
> I'm fine to drop #2 as a justification point, or maybe reword the commit 
> message
> to say, "you could also just do... but since we have it for #1 already..."
> 
> > > 
> > > While this all sounds very powerful it comes with a couple of caveats:
> > > 1. Bug reports using raw commands will not get the same level of
> > >attention as bug reports using supported commands (via taint).
> > > 2. Supported commands will be rejected by the RAW command.  
> > 
> > Perhaps I'm missing reading this point 2 (not sure the code actually does 
> > it!)
> > 
> > As stated what worries me as it means when we add support for a new
> > bit of the spec we just broke the userspace ABI.
> >   
> 
> It does not break ABI. The agreement is userspace must always use the QUERY
> command to find out what commands are supported. If it tries to use a RAW
> command that is a supported command, it will be rejected. In the case you
> mention, that's an application bug. If there is a way to document that better
> than what's already in the UAPI kdocs, I'm open to suggestions.
> 
> Unlike perhaps other UAPI, this one only promises to give you a way to 
> determine
> what commands you can use, not the list of what commands you can use.

*crossed fingers* on this.  Users may have a different view when their 
application
just stops working.  It might print a nice error message telling them why
but it still doesn't work and that way lies grumpy Linus and reverts...

Mostly we'll get away with it because no one will notice, but it's unfortunately
still risky.   Personal preference is toplay safer and not allow direct 
userspace
access to commands in the spec (unless we've decided they will always be 
available
directly to userspace).  This includes anything in the ranges reserved for 
future
spec usage.

Jonathan



> 
> > > 
> > > With this comes new debugfs knob to allow full access to your toes with
> > > your weapon of choice.  
> > 
> > A few trivial things inline,
> > 
> > Jonathan
> >   
> > > 
> > > Cc: Ariel Sibley 
> > > Signed-off-by: Ben Widawsky 
> > > Reviewed-by: Dan Williams 
> > > ---
> > >  drivers/cxl/Kconfig  |  18 +
> > >  drivers/cxl/mem.c| 125 ++-
> > >  include/uapi/linux/cxl_mem.h |  12 +++-
> > >  3 files changed, 152 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> > > index c4ba3aa0a05d..08eaa8e52083 100644
> > > --- a/drivers/cxl/Kconfig
> > > +++ b/drivers/cxl/Kconfig
> > > @@ -33,6 +33,24 @@ config CXL_MEM
> > >  
> > > If unsure say 'm'.
> > >  
> > > +config CXL_MEM_RAW_COMMANDS
> > > + bool "RAW Command Interface for Memory Devices"
> > > + depends on CXL_MEM
> > > + help
> > > +   Enable CXL RAW command interface.
> > > +
> > > +   The CXL driver ioctl interface may assign a kernel ioctl command
> > > +   number for each specification defined opcode. At any given point in
> > > +   time the number of opcodes that the specification defines and a device
> > > +   may implement may exceed the kernel's set of associated ioctl function
> > > +   numbers. The mismatch is either by omission, specification is too new,
> > > +   or by design. When prototyping new hardware, or developing / debugging
> > > +   the driver it is useful to be able to submit any possible command to
> > > +   the hardware, even commands that may crash the kernel due to their
> > > +   potential impact to memory currently in use by the kernel.
> > > +
> > > +   If developing CXL hardware or the driver say Y, otherwise say N.
> > > +
> > >  config CXL_MEM

Re: [PATCH v2 3/8] cxl/mem: Register CXL memX devices

2021-02-12 Thread Jonathan Cameron
On Thu, 11 Feb 2021 12:40:45 -0800
Dan Williams  wrote:

> On Thu, Feb 11, 2021 at 2:19 AM Jonathan Cameron
>  wrote:
> >
> > On Wed, 10 Feb 2021 18:17:25 +
> > Jonathan Cameron  wrote:
> >  
> > > On Tue, 9 Feb 2021 16:02:54 -0800
> > > Ben Widawsky  wrote:
> > >  
> > > > From: Dan Williams 
> > > >
> > > > Create the /sys/bus/cxl hierarchy to enumerate:
> > > >
> > > > * Memory Devices (per-endpoint control devices)
> > > >
> > > > * Memory Address Space Devices (platform address ranges with
> > > >   interleaving, performance, and persistence attributes)
> > > >
> > > > * Memory Regions (active provisioned memory from an address space device
> > > >   that is in use as System RAM or delegated to libnvdimm as Persistent
> > > >   Memory regions).
> > > >
> > > > For now, only the per-endpoint control devices are registered on the
> > > > 'cxl' bus. However, going forward it will provide a mechanism to
> > > > coordinate cross-device interleave.
> > > >
> > > > Signed-off-by: Dan Williams 
> > > > Signed-off-by: Ben Widawsky   
> > >
> > > One stray header, and a request for a tiny bit of reordering to
> > > make it easier to chase through creation and destruction.
> > >
> > > Either way with the header move to earlier patch I'm fine with this one.
> > >
> > > Reviewed-by: Jonathan Cameron   
> >
> > Actually thinking more on this, what is the justification for the
> > complexity + overhead of a percpu_refcount vs a refcount  
> 
> A typical refcount does not have the block and drain semantics of a
> percpu_ref. I'm planning to circle back and make this a first class
> facility of the cdev interface borrowing the debugfs approach [1], but
> for now percpu_ref fits the bill locally.
> 
> > I don't think this is a high enough performance path for it to matter.
> > Perhaps I'm missing a usecase where it does?  
> 
> It's less about percpu_ref performance and more about the
> percpu_ref_tryget_live() facility.
> 
> [1]: 
> http://lore.kernel.org/r/CAPcyv4jEYPsyh0bhbtKGRbK3bgp=_+=2rjx4x0gli5-25vv...@mail.gmail.com

Thanks for the reference. Definitely a nasty corner to clean up so I'll
keep an eye open for a new version of that series.

Jonathan

___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v2 2/8] cxl/mem: Find device capabilities

2021-02-12 Thread Jonathan Cameron
On Thu, 11 Feb 2021 07:55:29 -0800
Ben Widawsky  wrote:

> On 21-02-11 09:55:48, Jonathan Cameron wrote:
> > On Wed, 10 Feb 2021 10:16:05 -0800
> > Ben Widawsky  wrote:
> >   
> > > On 21-02-10 08:55:57, Ben Widawsky wrote:  
> > > > On 21-02-10 15:07:59, Jonathan Cameron wrote:
> > > > > On Wed, 10 Feb 2021 13:32:52 +
> > > > > Jonathan Cameron  wrote:
> > > > > 
> > > > > > On Tue, 9 Feb 2021 16:02:53 -0800
> > > > > > Ben Widawsky  wrote:
> > > > > > 
> > > > > > > Provide enough functionality to utilize the mailbox of a memory 
> > > > > > > device.
> > > > > > > The mailbox is used to interact with the firmware running on the 
> > > > > > > memory
> > > > > > > device. The flow is proven with one implemented command, 
> > > > > > > "identify".
> > > > > > > Because the class code has already told the driver this is a 
> > > > > > > memory
> > > > > > > device and the identify command is mandatory.
> > > > > > > 
> > > > > > > CXL devices contain an array of capabilities that describe the
> > > > > > > interactions software can have with the device or firmware 
> > > > > > > running on
> > > > > > > the device. A CXL compliant device must implement the device 
> > > > > > > status and
> > > > > > > the mailbox capability. Additionally, a CXL compliant memory 
> > > > > > > device must
> > > > > > > implement the memory device capability. Each of the capabilities 
> > > > > > > can
> > > > > > > [will] provide an offset within the MMIO region for interacting 
> > > > > > > with the
> > > > > > > CXL device.
> > > > > > > 
> > > > > > > The capabilities tell the driver how to find and map the register 
> > > > > > > space
> > > > > > > for CXL Memory Devices. The registers are required to utilize the 
> > > > > > > CXL
> > > > > > > spec defined mailbox interface. The spec outlines two mailboxes, 
> > > > > > > primary
> > > > > > > and secondary. The secondary mailbox is earmarked for system 
> > > > > > > firmware,
> > > > > > > and not handled in this driver.
> > > > > > > 
> > > > > > > Primary mailboxes are capable of generating an interrupt when 
> > > > > > > submitting
> > > > > > > a background command. That implementation is saved for a later 
> > > > > > > time.
> > > > > > > 
> > > > > > > Link: 
> > > > > > > https://www.computeexpresslink.org/download-the-specification
> > > > > > > Signed-off-by: Ben Widawsky 
> > > > > > > Reviewed-by: Dan Williams   
> > > > > > 
> > > > > > Hi Ben,
> > > > > > 
> > > > > > 
> > > > > > > +/**
> > > > > > > + * cxl_mem_mbox_send_cmd() - Send a mailbox command to a memory 
> > > > > > > device.
> > > > > > > + * @cxlm: The CXL memory device to communicate with.
> > > > > > > + * @mbox_cmd: Command to send to the memory device.
> > > > > > > + *
> > > > > > > + * Context: Any context. Expects mbox_lock to be held.
> > > > > > > + * Return: -ETIMEDOUT if timeout occurred waiting for 
> > > > > > > completion. 0 on success.
> > > > > > > + * Caller should check the return code in @mbox_cmd to 
> > > > > > > make sure it
> > > > > > > + * succeeded.  
> > > > > > 
> > > > > > cxl_xfer_log() doesn't check mbox_cmd->return_code and for my test 
> > > > > > it currently
> > > > > > enters an infinite loop as a result.
> > > > 
> > > > I meant to fix that.
> > > > 
> > > > > > 
> > > > > > I haven't checked other paths, but to my mind it is not a good idea 
> > > > > > to require
> > > > > > two levels of error checking - the example here proves how easy it 
> > > > > > is to forget
> > > > > > one.
> > > > 
> > > > Demonstrably, you're correct. I think it would be good to have a kernel 
> > > > only
> > > > mbox command that does the error checking though. Let me type something 
> > > > up and
> > > > see how it looks.
> > > 
> > > Hi Jonathan. What do you think of this? The bit I'm on the fence about is 
> > > if I
> > > should validate output size too. I like the simplicity as it is, but it 
> > > requires
> > > every caller to possibly check output size, which is kind of the same 
> > > problem
> > > you're originally pointing out.  
> > 
> > The simplicity is good and this is pretty much what I expected you would 
> > end up with
> > (always reassuring)
> > 
> > For the output, perhaps just add another parameter to the wrapper for 
> > minimum
> > output length expected?
> > 
> > Now you mention the length question.  It does rather feel like there should 
> > also
> > be some protection on memcpy_fromio() copying too much data if the hardware
> > happens to return an unexpectedly long length.  Should never happen, but
> > the hardening is worth adding anyway given it's easy to do.
> > 
> > Jonathan  
> 
> Some background because I forget what I've said previously... It's unfortunate
> that the spec maxes at 1M mailbox size but has enough bits in the length field
> to support 2M-1. I've made some requests to have this fixed, so maybe 3.0 
> won't
> be awkward like this.

Agreed spec should be tighter

Re: [PATCH v2 2/8] cxl/mem: Find device capabilities

2021-02-12 Thread Jonathan Cameron
On Thu, 11 Feb 2021 10:27:41 -0800
Ben Widawsky  wrote:

> On 21-02-11 09:55:48, Jonathan Cameron wrote:
> > On Wed, 10 Feb 2021 10:16:05 -0800
> > Ben Widawsky  wrote:
> >   
> > > On 21-02-10 08:55:57, Ben Widawsky wrote:  
> > > > On 21-02-10 15:07:59, Jonathan Cameron wrote:
> > > > > On Wed, 10 Feb 2021 13:32:52 +
> > > > > Jonathan Cameron  wrote:
> > > > > 
> > > > > > On Tue, 9 Feb 2021 16:02:53 -0800
> > > > > > Ben Widawsky  wrote:
> > > > > > 
> > > > > > > Provide enough functionality to utilize the mailbox of a memory 
> > > > > > > device.
> > > > > > > The mailbox is used to interact with the firmware running on the 
> > > > > > > memory
> > > > > > > device. The flow is proven with one implemented command, 
> > > > > > > "identify".
> > > > > > > Because the class code has already told the driver this is a 
> > > > > > > memory
> > > > > > > device and the identify command is mandatory.
> > > > > > > 
> > > > > > > CXL devices contain an array of capabilities that describe the
> > > > > > > interactions software can have with the device or firmware 
> > > > > > > running on
> > > > > > > the device. A CXL compliant device must implement the device 
> > > > > > > status and
> > > > > > > the mailbox capability. Additionally, a CXL compliant memory 
> > > > > > > device must
> > > > > > > implement the memory device capability. Each of the capabilities 
> > > > > > > can
> > > > > > > [will] provide an offset within the MMIO region for interacting 
> > > > > > > with the
> > > > > > > CXL device.
> > > > > > > 
> > > > > > > The capabilities tell the driver how to find and map the register 
> > > > > > > space
> > > > > > > for CXL Memory Devices. The registers are required to utilize the 
> > > > > > > CXL
> > > > > > > spec defined mailbox interface. The spec outlines two mailboxes, 
> > > > > > > primary
> > > > > > > and secondary. The secondary mailbox is earmarked for system 
> > > > > > > firmware,
> > > > > > > and not handled in this driver.
> > > > > > > 
> > > > > > > Primary mailboxes are capable of generating an interrupt when 
> > > > > > > submitting
> > > > > > > a background command. That implementation is saved for a later 
> > > > > > > time.
> > > > > > > 
> > > > > > > Link: 
> > > > > > > https://www.computeexpresslink.org/download-the-specification
> > > > > > > Signed-off-by: Ben Widawsky 
> > > > > > > Reviewed-by: Dan Williams   
> > > > > > 
> > > > > > Hi Ben,
> > > > > > 
> > > > > > 
> > > > > > > +/**
> > > > > > > + * cxl_mem_mbox_send_cmd() - Send a mailbox command to a memory 
> > > > > > > device.
> > > > > > > + * @cxlm: The CXL memory device to communicate with.
> > > > > > > + * @mbox_cmd: Command to send to the memory device.
> > > > > > > + *
> > > > > > > + * Context: Any context. Expects mbox_lock to be held.
> > > > > > > + * Return: -ETIMEDOUT if timeout occurred waiting for 
> > > > > > > completion. 0 on success.
> > > > > > > + * Caller should check the return code in @mbox_cmd to 
> > > > > > > make sure it
> > > > > > > + * succeeded.  
> > > > > > 
> > > > > > cxl_xfer_log() doesn't check mbox_cmd->return_code and for my test 
> > > > > > it currently
> > > > > > enters an infinite loop as a result.
> > > > 
> > > > I meant to fix that.
> > > > 
> > > > > > 
> > > > > > I haven't checked other paths, but to my mind it is not a good idea 
> > > > > > to require
> > > > > > two levels of error checking - the example here proves how easy it 
> > > > > > is to forget
> > > > > > one.
> > > > 
> > > > Demonstrably, you're correct. I think it would be good to have a kernel 
> > > > only
> > > > mbox command that does the error checking though. Let me type something 
> > > > up and
> > > > see how it looks.
> > > 
> > > Hi Jonathan. What do you think of this? The bit I'm on the fence about is 
> > > if I
> > > should validate output size too. I like the simplicity as it is, but it 
> > > requires
> > > every caller to possibly check output size, which is kind of the same 
> > > problem
> > > you're originally pointing out.  
> > 
> > The simplicity is good and this is pretty much what I expected you would 
> > end up with
> > (always reassuring)
> > 
> > For the output, perhaps just add another parameter to the wrapper for 
> > minimum
> > output length expected?
> > 
> > Now you mention the length question.  It does rather feel like there should 
> > also
> > be some protection on memcpy_fromio() copying too much data if the hardware
> > happens to return an unexpectedly long length.  Should never happen, but
> > the hardening is worth adding anyway given it's easy to do.
> > 
> > Jonathan
> >   
> 
> I like it.
> 
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index 2e199b05f686..58071a203212 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -293,7 +293,7 @@ static void cxl_mem_mbox_put(struct cxl_mem *cxlm)
>   * See __cxl_mem_mbox_send_cmd()
>   */
>  static int cxl

Re: [PATCH v17 07/10] mm: introduce memfd_secret system call to create "secret" memory areas

2021-02-12 Thread Michal Hocko
On Fri 12-02-21 00:59:29, Mike Rapoport wrote:
> On Thu, Feb 11, 2021 at 01:30:42PM +0100, Michal Hocko wrote:
[...]
> > Have a look how hugetlb proliferates through our MM APIs. I strongly
> > suspect this is strong signal that this won't be any different.
> > 
> > > And even if yes, adding SECRETMEM_HUGE
> > > flag seems to me less confusing than saying "from kernel x.y you can use
> > > MFD_CREATE | MFD_SECRET | MFD_HUGE" etc for all possible combinations.
> > 
> > I really fail to see your point. This is a standard model we have. It is
> > quite natural that flags are added. Moreover adding a new syscall will
> > not make it any less of a problem.
> 
> Nowadays adding a new syscall is not as costly as it used to be. And I
> think it'll provide better extensibility when new features would be added
> to secretmem. 
> 
> For instance, for creating a secretmem fd backed with sealing we'd have
> 
>   memfd_secretm(SECRETMEM_HUGE);

You mean SECRETMEM_HUGE_1G_AND_SEALED or SECRET_HUGE_2MB_WITHOUT_SEALED?
This would be rather an antipatern to our flags design, no? Really there
are orthogonal requirements here and there is absolutely zero reason
to smash everything into a single thing. It is just perfectly fine to
combine those functionalities without a pre-described way how to do
that.

> rather than
> 
>   memfd_create(MFD_ALLOW_SEALING | MFD_HUGETLB | MFD_SECRET);
> 
> 
> Besides, if we overload memfd_secret we add complexity to flags validation
> of allowable flag combinations even with the simplest initial
> implementation.

This is the least of my worry, really. The existing code in
memfd_create, unlike others legacy interfaces, allows extensions just
fine.

> And what it will become when more features are added to secretmem?

Example?

> > > > I by no means do not insist one way or the other but from what I have
> > > > seen so far I have a feeling that the interface hasn't been thought
> > > > through enough.
> > > 
> > > It has been, but we have different thoughts about it ;-)
> > 
> > Then you must be carrying a lot of implicit knowledge which I want you
> > to document.
> 
> I don't have any implicit knowledge, we just have a different perspective.

OK, I will stop discussing now because it doesn't really seem to lead
anywhere.

Just to recap my current understanding. Your main argument so far is
that this is somehow special and you believe it would be confusing
to use an existing interface. I beg to disagree here because memfd
interface is exactly a way to get a file handle to describe a memory
which is what you want. About the only thing that secretmem is special
is that it only operates on mapped areas and read/write interface is
not supported (but I do not see a fundamental reason this couldn't be
added in the future). All the rest is just operating on a memory backed
file. I envison the hugetlb support will follow and sealing sounds like
a useful thing to be requested as well.  All that would have to be
added to a new syscall over time and then we will land at two parallel
interface supporting a largerly overlapping feature set.

To me all the above sounds to be much stronher argument than your worry
this might be confusing.

I will not insist on this but you should have some more thought on those
arguments.
-- 
Michal Hocko
SUSE Labs
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH v17 07/10] mm: introduce memfd_secret system call to create "secret" memory areas

2021-02-12 Thread David Hildenbrand

On 12.02.21 00:09, Mike Rapoport wrote:

On Thu, Feb 11, 2021 at 01:07:10PM +0100, David Hildenbrand wrote:

On 11.02.21 12:27, Mike Rapoport wrote:

On Thu, Feb 11, 2021 at 10:01:32AM +0100, David Hildenbrand wrote:


So let's talk about the main user-visible differences to other memfd files
(especially, other purely virtual files like hugetlbfs). With secretmem:

- File content can only be read/written via memory mappings.
- File content cannot be swapped out.

I think there are still valid ways to modify file content using syscalls:
e.g., fallocate(PUNCH_HOLE). Things like truncate also seems to work just
fine.
  
These work perfectly with any file, so maybe we should have added

memfd_create as a flag to open(2) back then and now the secretmem file
descriptors?


I think open() vs memfd_create() makes sense: for open, the path 
specifies main properties (tmpfs, hugetlbfs, filesystem). On memfd, 
there is no such path and the "type" has to be specified differently.


Also, open() might open existing files - memfd always creates new files.

  

AFAIKS, we would need MFD_SECRET and disallow
MFD_ALLOW_SEALING and MFD_HUGETLB.


So here we start to multiplex.


Yes. And as Michal said, maybe we can support combinations in the future.


Isn't there a general agreement that syscall multiplexing is not a good
thing?


Looking at mmap(), madvise(), fallocate(), I think multiplexing is just 
fine and flags can be mutually exclusive - as long as we're not 
squashing completely unrelated things into a single system call.


As one example: we don't have mmap_private() vs. mmap_shared() vs. 
mmap_shared_validate(). E.g., MAP_SYNC is only available for 
MAP_SHARED_VALIDATE.




memfd_create already has flags validation that does not look very nice.


I assume you're talking about the hugetlb size specifications, right? 
It's not nice but fairly compact.



Adding there only MFD_SECRET will make it a bit less nice, but when we'll
grow new functionality into secretmem that will become horrible.


What do you have in mind? A couple of MFD_SECRET_* flags that only work 
with MFD_SECRET won't hurt IMHO. Just like we allow MFD_HUGE_* only with 
MFD_HUGETLB.


Thanks,

David / dhildenb
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org