On 3/24/20 10:57 AM, Michael Ellerman wrote:
Ganesh Goudar <ganes...@linux.ibm.com> writes:
If we hit UE at an instruction with a fixup entry, flag to
ignore the event and set nip to continue execution at the
fixup entry.
You don't explain why we would want to do that. Or what the consequences
are if we *don't* do it.

As such it's unclear if this is an important fix or just a nice-to-have.

We want to avoid a panic if we hit an MCE during memcpy from pmem devices, because
the system is still recoverable and the copy should just result in -EIO. So we flag the
UE event to be ignored here. I will respin with a better commit message.

For powernv these changes are already made by
commit 895e3dceeb97 ("powerpc/mce: Handle UE event for memcpy_mcsafe")
We have masses of code that supposedly abstracts the MCE logic. How did
we end up in the situation where we're having to write the same fix
twice for different platforms?

What is common between pseries and powernv now is saving the MCE event for
deferred handling, and the deferred handling itself. In my view it becomes a bit
messy to return the disposition (UE recovered) from common code. So what we can
have is a common function that searches for the exception table entry and updates
nip with the fixup address, and call it from different places for pseries and
powernv. If you are OK with that, I'll spin the next version.
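
A rough sketch of the kind of common helper described above, modelled in user-space C so it can be compiled on its own. This is only an illustration under stated assumptions: `struct extable_entry`, `struct fake_regs`, `toy_extable`, `find_fixup()` and `mce_try_fixup()` are hypothetical names, not kernel API. In the kernel the lookup would be search_kernel_exception_table() and the fixup address would come from extable_fixup(), as in the patch below.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's exception table entry. */
struct extable_entry {
	unsigned long insn;	/* address of an instruction that may fault */
	unsigned long fixup;	/* address to resume at if it does */
};

/* Hypothetical stand-in for the one pt_regs field used here. */
struct fake_regs {
	unsigned long nip;	/* next instruction pointer */
};

/* Toy table sorted by insn; the addresses are made up for illustration. */
static const struct extable_entry toy_extable[] = {
	{ 0x1000, 0x1f00 },
	{ 0x2000, 0x2f00 },
	{ 0x3000, 0x3f00 },
};

/* Binary search by instruction address; returns NULL if there is no entry. */
static const struct extable_entry *find_fixup(unsigned long addr)
{
	size_t lo = 0;
	size_t hi = sizeof(toy_extable) / sizeof(toy_extable[0]);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (toy_extable[mid].insn == addr)
			return &toy_extable[mid];
		if (toy_extable[mid].insn < addr)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;
}

/*
 * Common helper: if the faulting instruction has a fixup entry, point nip
 * at the fixup address and return true. The platform-specific caller then
 * decides how to report the event as recovered.
 */
static bool mce_try_fixup(struct fake_regs *regs)
{
	const struct extable_entry *entry = find_fixup(regs->nip);

	if (!entry)
		return false;

	regs->nip = entry->fixup;
	return true;
}
```

With a helper of this shape, each platform handler keeps its own reporting: on pseries, a true return would be followed by setting mce_err.ignore_event and disposition = RTAS_DISP_FULLY_RECOVERED, as the patch below does inline.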

cheers

Reviewed-by: Mahesh Salgaonkar <mah...@linux.vnet.ibm.com>
Reviewed-by: Santosh S <sant...@fossix.org>
Signed-off-by: Ganesh Goudar <ganes...@linux.ibm.com>
---
V2: Fixes a trivial checkpatch error in commit msg.
V3: Use proper subject prefix.
---
  arch/powerpc/platforms/pseries/ras.c | 8 ++++++++
  1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 43710b69e09e..58e2483fbb1a 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -10,6 +10,7 @@
  #include <linux/fs.h>
  #include <linux/reboot.h>
  #include <linux/irq_work.h>
+#include <linux/extable.h>
  #include <asm/machdep.h>
  #include <asm/rtas.h>
@@ -505,6 +506,7 @@ static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log *errp)
        int initiator = rtas_error_initiator(errp);
        int severity = rtas_error_severity(errp);
        u8 error_type, err_sub_type;
+       const struct exception_table_entry *entry;
        if (initiator == RTAS_INITIATOR_UNKNOWN)
                mce_err.initiator = MCE_INITIATOR_UNKNOWN;
@@ -558,6 +560,12 @@ static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log *errp)
        switch (mce_log->error_type) {
        case MC_ERROR_TYPE_UE:
                mce_err.error_type = MCE_ERROR_TYPE_UE;
+               entry = search_kernel_exception_table(regs->nip);
+               if (entry) {
+                       mce_err.ignore_event = true;
+                       regs->nip = extable_fixup(entry);
+                       disposition = RTAS_DISP_FULLY_RECOVERED;
+               }
                switch (err_sub_type) {
                case MC_ERROR_UE_IFETCH:
                        mce_err.u.ue_error_type = MCE_UE_ERROR_IFETCH;
--
2.17.2
