Hi James,

[...]
> 
> The code to signal memory-failure to user-space doesn't depend on the CPU's 
> RAS-extensions.
I have gone through your answer briefly and agree with the general idea.
I will check it in detail later.

I have a question: are you sure that, if the CPU does not support the
RAS-extensions, the kernel can still call memory_failure() to send a signal to
QEMU?

From my reading of the code, the general flow is: the RAS module detects the
error, or the CPU consumes hardware-poisoned data and takes an exception; then
the EL3 firmware records the address in the APEI table and sends a
notification to the kernel. The kernel parses the APEI table to get the address
and calls memory_failure() to identify the page to poison. That is to say,
memory_failure() is usually called after RAS has detected the error; otherwise
the kernel does not know whether the address is poisoned.
I am worried about one thing: if the hardware does not have RAS, the OS cannot
know which address is poisoned, so it cannot identify the address, and the
address delivered to QEMU (user space) may not be right.
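
To make the flow I have in mind concrete, here is a rough toy model of it in
C. It is not kernel code: struct mem_error_record, handle_error_record() and
mock_memory_failure() are hypothetical stand-ins for the CPER memory error
record, the APEI parsing step and memory_failure(), and the 4 KiB page size is
an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumption: 4 KiB pages */
#define NR_PAGES   1024 /* toy model: 4 MiB of "physical" memory */

/* Hypothetical, simplified stand-in for a firmware-written error record. */
struct mem_error_record {
    bool     phys_addr_valid; /* did firmware record an address? */
    uint64_t physical_addr;   /* faulting physical address */
};

/* One flag per page frame; true means the page has been poisoned. */
static bool page_poisoned[NR_PAGES];

/* Stand-in for memory_failure(): mark the page frame as poisoned. */
static int mock_memory_failure(uint64_t pfn)
{
    if (pfn >= NR_PAGES)
        return -1;
    page_poisoned[pfn] = true;
    return 0;
}

/* Stand-in for the parsing step: record -> address -> pfn -> poison. */
static int handle_error_record(const struct mem_error_record *rec)
{
    if (!rec->phys_addr_valid)
        return -1; /* no address recorded, so nothing can be poisoned */
    return mock_memory_failure(rec->physical_addr >> PAGE_SHIFT);
}
```

In this model a record carrying address 0x5000 poisons page frame 5, while a
record without a valid address poisons nothing. That second case is the one I
am asking about.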

As you said, the kernel can call memory_failure() even without RAS support. In
this no-RAS case, how does it decide that the address is poisoned and that it
needs to send SIGBUS to QEMU?

> 
> If Qemu supports notifying the guest about RAS errors using CPER records, it 
> should generate a HEST describing firmware first. It can then
> choose the notification methods, some of which may require optional KVM APIs 
> to support.
> 
> Seattle has a HEST, it doesn't support the CPU RAS-extensions. The kernel can 
> notify user-space about memory_failure() on this machine. I
> would expect Qemu to be able to receive signals and describe memory errors to 
> a guest (1).

Usually we consider the address obtained from the APEI table to be poisoned.
If so, I want to know: without RAS and the APEI table, how is the address to
hwpoison identified?

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm