Neoverse-N1 affected by #1349291 may report an Uncontained RAS Error
as Unrecoverable. The kernel's architecture code already considers
Unrecoverable errors as fatal as without kernel-first support no
further error-handling is possible.

Now that KVM attributes SError to the host/guest more precisely
the host's architecture code will always handle host errors that
become pending during world-switch.
Errors misclassified by this errata that affected the guest will be
re-injected to the guest as an implementation-defined SError, which can
be uncontained.

Until kernel-first support is implemented, no workaround is needed
for this issue.

Signed-off-by: James Morse <james.mo...@arm.com>
---
imp-def SError can mean uncontained. In the RAS spec, 2.4.2 "ESB and other
containable errors":
| It is [imp-def] whether [imp-def] and uncategorized SError interrupts
| are containable or Uncontainable.

 Documentation/arm64/silicon-errata.txt | 1 +
 arch/arm64/kernel/traps.c              | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/Documentation/arm64/silicon-errata.txt 
b/Documentation/arm64/silicon-errata.txt
index 68d9b74fd751..7d010f739146 100644
--- a/Documentation/arm64/silicon-errata.txt
+++ b/Documentation/arm64/silicon-errata.txt
@@ -62,6 +62,7 @@ stable kernels.
 | ARM            | Cortex-A76      | #1165522        | ARM64_ERRATUM_1165522   
    |
 | ARM            | Cortex-A76      | #1286807        | ARM64_ERRATUM_1286807   
    |
 | ARM            | Neoverse-N1     | #1188873        | ARM64_ERRATUM_1188873   
    |
+| ARM            | Neoverse-N1     | #1349291        | N/A                     
    |
 | ARM            | MMU-500         | #841119,#826419 | N/A                     
    |
 |                |                 |                 |                         
    |
 | Cavium         | ThunderX ITS    | #22375, #24313  | CAVIUM_ERRATUM_22375    
    |
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index ade32046f3fe..4f427ad1089d 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -892,6 +892,10 @@ bool arm64_is_fatal_ras_serror(struct pt_regs *regs, 
unsigned int esr)
                /*
                 * The CPU can't make progress. The exception may have
                 * been imprecise.
+                *
+                * Neoverse-N1 #1349291 means a non-KVM SError reported as
+                * Unrecoverable should be treated as Uncontainable. We
+                * call arm64_serror_panic() in both cases.
                 */
                return true;
 
-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

Reply via email to