On 2024/4/18 1:30, Dave Jiang wrote:
On 4/17/24 12:50 AM, Shiyang Ruan wrote:
Currently driver only traces cxl events, poison creation (for both vmem
and pmem type) on cxl memdev is silent. OS needs to be notified then it
could handle poison pages in time. Per CXL spec, the device error event
could be signaled through FW-First and OS-First methods.
Please consider below for better clarity:
Currently the driver only traces CXL events. Poison creation (for both ram
and pmem type) on a CXL memdev is silent. The OS needs to be notified so it
can handle poison pages. Per CXL spec, the device error event
can be signaled through the FW-First method or the OS-First method.
Thanks, this is better.
So, add poison creation event handler in OS-First method:
- Qemu:
- CXL device reports POISON creation event to OS by MSI by sending
GMER/DER after injecting a poison record;
You can probably drop the QEMU changes, since this is the kernel commit log.
Ok.
- CXL driver:
a. parse the POISON event from GMER/DER;
b. translate poisoned DPA to HPA (PFN);
c. enqueue poisoned PFN to memory_failure's work queue;
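For illustration only, steps a to c can be modeled in plain userspace C roughly as below. This is a minimal sketch, not the patch's code: every `sketch_*` name, the single-field event record, the 4 KiB page size, and the linear (non-interleaved) DPA-to-HPA mapping are assumptions made for the example. Real CXL address decoding has to account for interleaving, and the kernel queues the PFN for memory_failure rather than storing it in an array.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT  12   /* assumption: 4 KiB pages */
#define SKETCH_QUEUE_DEPTH 16   /* assumption: tiny fixed-size queue */

/* Hypothetical event record holding only the field this sketch needs. */
struct sketch_poison_event {
	uint64_t dpa;           /* device physical address of the poison */
};

/* Hypothetical linear decode: one DPA window mapped at hpa_base. */
struct sketch_region {
	uint64_t dpa_base;
	uint64_t hpa_base;
	uint64_t size;
};

/* Stand-in for the work queue that later handles poisoned pages. */
struct sketch_queue {
	uint64_t pfn[SKETCH_QUEUE_DEPTH];
	size_t count;
};

/* Step b: translate a DPA to a PFN; -1 if the DPA is outside the region. */
static int sketch_dpa_to_pfn(const struct sketch_region *r, uint64_t dpa,
			     uint64_t *pfn)
{
	if (dpa < r->dpa_base || dpa - r->dpa_base >= r->size)
		return -1;
	*pfn = (r->hpa_base + (dpa - r->dpa_base)) >> SKETCH_PAGE_SHIFT;
	return 0;
}

/* Step c: queue the PFN; -1 if the queue is full. */
static int sketch_enqueue(struct sketch_queue *q, uint64_t pfn)
{
	if (q->count >= SKETCH_QUEUE_DEPTH)
		return -1;
	q->pfn[q->count++] = pfn;
	return 0;
}

/* Steps a to c: read the DPA from the event, translate it, queue the PFN. */
static int sketch_handle_poison(const struct sketch_region *r,
				const struct sketch_poison_event *ev,
				struct sketch_queue *q)
{
	uint64_t pfn;

	if (sketch_dpa_to_pfn(r, ev->dpa, &pfn))
		return -1;
	return sketch_enqueue(q, pfn);
}
```

An event whose DPA falls outside the region is rejected before anything is queued, so the queue only ever holds PFNs that were successfully translated.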
Signed-off-by: Shiyang Ruan <ruansy.f...@fujitsu.com>
---
drivers/cxl/core/mbox.c | 119 +++++++++++++++++++++++++++++++++-----
drivers/cxl/cxlmem.h | 8 +--
include/linux/cxl-event.h | 18 +++++-
3 files changed, 125 insertions(+), 20 deletions(-)
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index f0f54aeccc87..76af0d73859d 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -837,25 +837,116 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
}
EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
-void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
- enum cxl_event_log_type type,
- enum cxl_event_type event_type,
- const uuid_t *uuid, union cxl_event *evt)
+static void cxl_report_poison(struct cxl_memdev *cxlmd, struct cxl_region *cxlr,
I think this needs to be changed to __cxl_report_poison() and the function
below to cxl_report_poison(). Otherwise it goes against the typical Linux
convention of having __functionX() as the raw worker function
called by a functionX() wrapper.
This function was designed to do the real reporting work and could be
called from other places (it actually was in the previous version). Now that
it is called only from the function below in this version, yes, it's better
to change the names.
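The naming convention discussed above can be sketched in standalone C as follows. This is only an illustration under assumed names: `struct demo_dev`, `__demo_report_poison()` and `demo_report_poison()` are hypothetical and do not come from the patch. Note that identifiers starting with a double underscore are reserved in ordinary userspace C; the kernel uses them by convention.

```c
#include <stddef.h>

/* Hypothetical device state used only for this illustration. */
struct demo_dev {
	int reports;
	unsigned long last_pfn;
};

/*
 * Raw worker: does the actual reporting and assumes the caller has
 * already validated its arguments.
 */
static void __demo_report_poison(struct demo_dev *dev, unsigned long pfn)
{
	dev->last_pfn = pfn;
	dev->reports++;
}

/*
 * Wrapper: the entry point other code calls. It performs the checks,
 * then delegates to the raw worker.
 */
static int demo_report_poison(struct demo_dev *dev, unsigned long pfn)
{
	if (!dev)
		return -1;
	__demo_report_poison(dev, pfn);
	return 0;
}
```

Callers use demo_report_poison(); the double-underscore variant stays an internal detail that skips the checks.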
--
Thanks,
Ruan.
DJ