On 12/18/2025 10:52 PM, Alison Schofield wrote:
> On Mon, Dec 15, 2025 at 03:36:30PM -0600, Ben Cheatham wrote:
>> Add man pages for the 'cxl-inject-error' and 'cxl-clear-error' commands.
>> These man pages show usage and examples for each of their use cases.
>>
>> Reviewed-by: Dave Jiang <[email protected]>
>> Signed-off-by: Ben Cheatham <[email protected]>
>> ---
>> Documentation/cxl/cxl-clear-error.txt | 67 +++++++++++++
>> Documentation/cxl/cxl-inject-error.txt | 129 +++++++++++++++++++++++++
>> Documentation/cxl/meson.build | 2 +
>> 3 files changed, 198 insertions(+)
>> create mode 100644 Documentation/cxl/cxl-clear-error.txt
>> create mode 100644 Documentation/cxl/cxl-inject-error.txt
>
> snip
>
>> diff --git a/Documentation/cxl/cxl-inject-error.txt b/Documentation/cxl/cxl-inject-error.txt
>> new file mode 100644
>> index 0000000..e1bebd7
>> --- /dev/null
>> +++ b/Documentation/cxl/cxl-inject-error.txt
>> @@ -0,0 +1,129 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +
>> +cxl-inject-error(1)
>> +===================
>> +
>> +NAME
>> +----
>> +cxl-inject-error - Inject CXL errors into CXL devices
>> +
>> +SYNOPSIS
>> +--------
>> +[verse]
>> +'cxl inject-error' <device name> [<options>]
>> +
>> +Inject an error into a CXL device. The types of errors supported depend on the
>> +device specified. The types of devices supported are:
>> +
>> +"Downstream Ports":: A CXL RCH downstream port (dport) or a CXL VH root port.
>> +Eligible CXL 2.0+ ports are dports of ports at depth 1 in the output of cxl-list.
>> +Dports are specified by host name ("0000:0e:01.1").
>
> How are users to find that dport host?
At the moment, the user needs to know the dport host beforehand. More below.
>
> Is there a cxl list "show me the dports where i can inject protocol errors"
> incantation that we can recommend here.
>
> I ended up looking at /sys/kernel/debug/cxl/ to find the hosts.
>
> Would another attribute added to those dports make sense, be possible?
> like is done for the poison injectable memdevs? ie 'protocol_injectable:
> true'
Which ports support error injection depends on the CXL version of the host.
For CXL 1.1 hosts it's any memory-mapped downstream port, while for 2.0+ it's
only CXL root ports (ACPI 6.5 Table 18-31).

The kernel adds a debugfs entry for all downstream ports regardless of those
requirements IIRC. Having the extra entries doesn't break anything since the
platform firmware should reject invalid injection targets, but it does add an
extra hurdle for the user.

I think what I'll do here is submit a kernel patch to clean up the extra
entries (needed to be done anyway) and add a 'protocol_injectable' attribute
for the downstream port when a debugfs entry exists. I'll probably send out
the kernel patch at the same time as v6.

Let me know if any of that sounds unreasonable or you'd rather I do something
else!
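
Until such an attribute exists, candidate dports can be found by scanning
debugfs directly, as was done with /sys/kernel/debug/cxl/ above. A minimal
sketch, assuming each dport entry is a directory under that path containing
an 'einj_inject' file (an assumed layout, not confirmed in this thread);
list_einj_dports is a hypothetical helper, not part of cxl-cli:

```python
import os


def list_einj_dports(debugfs_root="/sys/kernel/debug/cxl"):
    """Return sorted directory names under debugfs_root that hold 'einj_inject'.

    ASSUMPTION: the <dport>/einj_inject layout is illustrative and is not
    confirmed anywhere in this thread.
    """
    try:
        entries = os.listdir(debugfs_root)
    except (FileNotFoundError, PermissionError):
        # debugfs not mounted, EINJ CXL support disabled, or not running as root
        return []
    return sorted(
        name
        for name in entries
        if os.path.isfile(os.path.join(debugfs_root, name, "einj_inject"))
    )


if __name__ == "__main__":
    for dport in list_einj_dports():
        print(dport)
```

Because the kernel currently creates entries for all downstream ports, the
result is a superset: platform firmware may still reject some of the listed
dports as injection targets.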
Thanks,
Ben
>
>
>> +"memdevs":: A CXL memory device. Memory devices are specified by device name
>> +("mem0"), device id ("0"), and/or host device name ("0000:35:00.0").
>> +
>> +There are two types of errors which can be injected: CXL protocol errors
>> +and device poison.
>> +
>> +CXL protocol errors can only be used with downstream ports (as defined above).
>> +Protocol errors follow the format of "<protocol>-<severity>". For example,
>> +a "mem-fatal" error is a CXL.mem fatal protocol error. Protocol errors can be
>> +found with the '-N' option of 'cxl-list' under a CXL bus object. For example:
>> +
>> +----
>> +
>> +# cxl list -NB
>> +[
>> + {
>> + "bus":"root0",
>> + "provider":"ACPI.CXL",
>> + "injectable_protocol_errors":[
>> + "mem-correctable",
>> + "mem-fatal"
>> + ]
>> + }
>> +]
>> +
>> +----
>> +
>> +CXL protocol (CXL.cache/mem) error injection requires the platform to support
>> +ACPI v6.5+ error injection (EINJ). In addition to platform support, the
>> +CONFIG_ACPI_APEI_EINJ and CONFIG_ACPI_APEI_EINJ_CXL kernel configuration options
>> +must be enabled. For more information, see the Linux kernel documentation
>> +on EINJ.
>> +
>> +Device poison can only be used with CXL memory devices. A device physical address
>> +(DPA) is required to do poison injection. DPAs range from 0 to the size of the
>> +device's memory, which can be found using 'cxl-list'. An example injection:
>> +
>> +----
>> +
>> +# cxl inject-error mem0 -t poison -a 0x1000
>> +poison injected at mem0:0x1000
>> +# cxl list -m mem0 -u --media-errors
>> +{
>> + "memdev":"mem0",
>> + "ram_size":"256.00 MiB (268.44 MB)",
>> + "serial":"0",
>> + "host":"0000:0d:00.0",
>> + "firmware_version":"BWFW VERSION 00",
>> + "media_errors":[
>> + {
>> + "offset":"0x1000",
>> + "length":64,
>> + "source":"Injected"
>> + }
>> + ]
>> +}
>> +
>> +----
>> +
>> +Not all devices support poison injection. To see if a device supports poison injection
>> +through debugfs, use 'cxl-list' with the '-N' option and look for the "poison_injectable"
>> +attribute under the device. Example:
>> +
>> +----
>> +
>> +# cxl list -Nu -m mem0
>> +{
>> + "memdev":"mem0",
>> + "ram_size":"256.00 MiB (268.44 MB)",
>> + "serial":"0",
>> + "host":"0000:0d:00.0",
>> + "firmware_version":"BWFW VERSION 00",
>> + "poison_injectable":true
>> +}
>> +
>> +----
>> +
>> +This command depends on the kernel debug filesystem (debugfs) to do CXL protocol
>> +error and device poison injection.
>> +
>> +OPTIONS
>> +-------
>> +-a::
>> +--address::
>> + Device physical address (DPA) to use for poison injection. Address can
>> + be specified in hex or decimal. Required for poison injection.
>> +
>> +-t::
>> +--type::
>> + Type of error to inject into <device name>. The type of error is restricted
>> + by device type. The following shows the possible types under their associated
>> + device type(s):
>> +----
>> +
>> +Downstream Ports: ::
>> + cache-correctable, cache-uncorrectable, cache-fatal, mem-correctable,
>> + mem-fatal
>> +
>> +Memdevs: ::
>> + poison
>> +
>> +----
>> +
>> +--debug::
>> + Enable debug output
>> +
>> +SEE ALSO
>> +--------
>> +linkcxl:cxl-list[1]
>> diff --git a/Documentation/cxl/meson.build b/Documentation/cxl/meson.build
>> index 8085c1c..0b75eed 100644
>> --- a/Documentation/cxl/meson.build
>> +++ b/Documentation/cxl/meson.build
>> @@ -50,6 +50,8 @@ cxl_manpages = [
>> 'cxl-update-firmware.txt',
>> 'cxl-set-alert-config.txt',
>> 'cxl-wait-sanitize.txt',
>> + 'cxl-inject-error.txt',
>> + 'cxl-clear-error.txt',
>> ]
>>
>> foreach man : cxl_manpages
>> --
>> 2.52.0
>>