** Changed in: linux (Ubuntu Zesty)
Status: In Progress => Fix Committed
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1684054
Title:
[LTCTest][Opal][FW860.20] HMI recoverable errors failed to recover and
system goes to dump state.
Status in The Ubuntu-power-systems project:
In Progress
Status in linux package in Ubuntu:
Fix Released
Status in linux source package in Zesty:
Fix Committed
Bug description:
== Comment: #0 - Pridhiviraj Paidipeddi <[email protected]> - 2017-04-17
06:08:41 ==
---Problem Description---
HMI recoverable error injection tests lead to a system checkstop followed by
a system dump on Ubuntu 17.04 with kernel 4.10.0-19-generic ppc64le.
Contact Information = [email protected]
---uname output---
#21-Ubuntu SMP Thu Apr 6 17:03:05 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
Machine Type = PowerNV 8284-22A
---System Hang---
System is in dumping state. After the dump finishes, the system will IPL to the OS again.
---Debugger---
A debugger is not configured
== Comment: #3 - Pridhiviraj Paidipeddi <[email protected]> - 2017-04-17
06:12:51 ==
# uname -a
#21-Ubuntu SMP Thu Apr 6 17:03:05 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
# cat /etc/os-release
NAME="Ubuntu"
VERSION="17.04 (Zesty Zapus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 17.04"
VERSION_ID="17.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=zesty
UBUNTU_CODENAME=zesty
root@p8wookie:~#
== Comment: #4 - Kevin W. Rudd <[email protected]> - 2017-04-17
11:10:22 ==
== Comment: #5 - MAHESH J. SALGAONKAR <[email protected]> -
2017-04-17 13:34:03 ==
It looks like the commit below is the culprit:
=======================================
commit 2337d207288f163e10bd8d4d7eeb0c1c75046a0c
Author: Nicholas Piggin <[email protected]>
Date: Fri Jan 27 14:24:33 2017 +1000
powerpc/64: CONFIG_RELOCATABLE support for hmi interrupts
The branch from hmi_exception_early to hmi_exception_realmode must use
a "relocatable-style" branch, because it is branching from unrelocated
exception code to beyond __end_interrupts.
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
=======================================
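With the above commit, hmi_exception_realmode() is now called using bctrl,
so it is entered at its global entry point, which recomputes the TOC
pointer (r2) from r12. Because r12 does not hold the function's address at
that point, r2 ends up wrong, and any further access through the new r2
results in unpredictable behaviour.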
----------------------------------------
c000000000025f50 <hmi_exception_realmode>:
c000000000025f50: 3a 01 4c 3c addis r2,r12,314
c000000000025f54: b0 01 42 38 addi r2,r2,432
c000000000025f58: a6 02 08 7c mflr r0
-----------------------------------------
With the above commit, the hmi_exception_early() code jumps to
c000000000025f50 (hmi_exception_realmode+0x0), which then sets up a new
value for r2.
If we revert the above commit, the code jumps to c000000000025f58
(hmi_exception_realmode+0x8), skipping the r2 setup, and the HMI handler
works fine.
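To make the r2 arithmetic concrete, here is a minimal Python sketch of the two
instructions at hmi_exception_realmode+0x0. The entry address and the
immediates (314, 432) are taken from the disassembly above; the helper names
and the stale r12 value are hypothetical and only illustrate the effect.

```python
MASK64 = (1 << 64) - 1

def sext16(imm):
    """Sign-extend a 16-bit immediate, as addi/addis do."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def addis(ra, imm):
    """addis rt,ra,imm: rt = ra + (sign-extended imm << 16)."""
    return (ra + (sext16(imm) << 16)) & MASK64

def addi(ra, imm):
    """addi rt,ra,imm: rt = ra + sign-extended imm."""
    return (ra + sext16(imm)) & MASK64

def global_entry_toc(r12):
    """Model the two-instruction r2 setup shown in the disassembly."""
    r2 = addis(r12, 314)   # addis r2,r12,314
    return addi(r2, 432)   # addi  r2,r2,432

ENTRY = 0xc000000000025f50        # hmi_exception_realmode+0x0

# r12 holds the function's own address: r2 becomes the intended TOC pointer.
good_toc = global_entry_toc(ENTRY)

# Hypothetical stale r12 (illustrative value only): r2 becomes garbage.
STALE_R12 = 0x0000000000001234
bad_toc = global_entry_toc(STALE_R12)

print(hex(good_toc))
print(hex(bad_toc))
```

The computed r2 is always r12 plus a fixed offset, so it is only correct when
r12 equals the address of the function being entered.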
After reverting the above patch I don't see this issue anymore. I have
rebuilt the Ubuntu kernel with the patch reverted, and you can find
the kernel rpm at:
Can you please retry your tests with the above kernel and see if the issue
still persists?
== Comment: #6 - MAHESH J. SALGAONKAR <[email protected]> -
2017-04-17 23:02:31 ==
Spoke to Michael Ellerman this morning. He helped me identify the root
cause and a fix patch, below:
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 857bf7c5b946..7cfeb8768587 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -982,7 +982,7 @@ TRAMP_REAL_BEGIN(hmi_exception_early)
 	EXCEPTION_PROLOG_COMMON_2(PACA_EXGEN)
 	EXCEPTION_PROLOG_COMMON_3(0xe60)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
-	BRANCH_LINK_TO_FAR(r4, hmi_exception_realmode)
+	BRANCH_LINK_TO_FAR(r12, hmi_exception_realmode)
 	/* Windup the stack. */
 	/* Move original HSRR0 and HSRR1 into the respective regs */
 	ld	r9,_MSR(r1)
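The effect of changing the scratch register from r4 to r12 can be sketched
with a toy register model in Python. This is a simplified illustration, not
the kernel's macro definition: it assumes BRANCH_LINK_TO_FAR loads the target
address into the named register and branches through CTR, and the function
names, register dictionary and initial r12 value are hypothetical.

```python
MASK64 = (1 << 64) - 1
HMI_REALMODE = 0xc000000000025f50      # address from the disassembly above
TOC_OFFSET = (314 << 16) + 432         # addis r2,r12,314 ; addi r2,r2,432

def branch_link_to_far(regs, scratch, target):
    """Assumed macro behaviour: load target into scratch, mtctr, bctrl."""
    regs[scratch] = target
    regs["ctr"] = regs[scratch]
    return regs["ctr"]

def global_entry(regs):
    """Callee's global entry point derives r2 from whatever r12 holds."""
    regs["r2"] = (regs["r12"] + TOC_OFFSET) & MASK64
    return regs["r2"]

def call_hmi_realmode(scratch):
    # Hypothetical register state on entry; r12 holds an unrelated value.
    regs = {"r2": 0, "r4": 0, "r12": 0x1234, "ctr": 0}
    target = branch_link_to_far(regs, scratch, HMI_REALMODE)
    assert target == HMI_REALMODE      # the branch target is right either way
    return global_entry(regs)

toc_with_r4 = call_hmi_realmode("r4")    # before the fix: r12 left untouched
toc_with_r12 = call_hmi_realmode("r12")  # after the fix: r12 == entry address

print(hex(toc_with_r4))
print(hex(toc_with_r12))
```

In this model both variants reach the same entry address; only the r12
variant leaves r12 equal to that address, which is what the r2 setup relies on.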
== Comment: #7 - Pridhiviraj Paidipeddi <[email protected]> -
2017-04-18 01:52:03 ==
== Comment: #8 - Pridhiviraj Paidipeddi <[email protected]> - 2017-04-18
01:53:57 ==
Hi Mahesh
Tested all the HMI recoverable errors on the patched kernel below and attached
the corresponding execution logs. All tests are working fine.
#21 SMP Mon Apr 17 12:58:30 EDT 2017 ppc64le ppc64le ppc64le GNU/Linux
Thanks
== Comment: #9 - MAHESH J. SALGAONKAR <[email protected]> -
2017-04-18 06:07:56 ==
(In reply to comment #8)
> Hi Mahesh
> Tested all the HMI Recoverable errors on the below patched kernel, attached
> the corresponding executing logs. All tests are working fine.
>
> Linux p8wookie 4.10.0-19.bz153487-generic #21 SMP Mon Apr 17 12:58:30 EDT
> 2017 ppc64le ppc64le ppc64le GNU/Linux
>
>
> Thanks
Thanks. Michael has posted a fix for this upstream:
http://patchwork.ozlabs.org/patch/751647/
I will rebuild the Ubuntu kernel with the above patch.
== Comment: #12 - Pridhiviraj Paidipeddi <[email protected]> - 2017-04-18
09:27:59 ==
(In reply to comment #11)
> >
> > https://git.kernel.org/powerpc/c/be5c5e843c4afa1c8397cb740b6032
>
> I have built new kernel with above patch and you can find it below path
>
>:/home2/mahesh/u2/bz153487v2/linux-image-4.10.0-19.bz153487v2-
> generic_4.10.0-19.bz153487v2.21_ppc64el.deb
Tested with this new patched kernel, all tests are working fine.
Linux p8wookie 4.10.0-19.bz153487v2-generic #21 SMP Tue Apr 18
07:43:13 EDT 2017 ppc64le ppc64le ppc64le GNU/Linux
Will attach the full execution logs here.
== Comment: #13 - Pridhiviraj Paidipeddi <[email protected]> -
2017-04-18 09:29:43 ==
== Comment: #14 - MAHESH J. SALGAONKAR <[email protected]> -
2017-04-19 03:52:18 ==
(In reply to comment #12)
> (In reply to comment #11)
> > >
> > > https://git.kernel.org/powerpc/c/be5c5e843c4afa1c8397cb740b6032
> >
Thanks for testing. We need to mirror this to Ubuntu so the fix patch can
be included.
>
> Linux p8wookie 4.10.0-19.bz153487v2-generic #21 SMP Tue Apr 18 07:43:13 EDT
> 2017 ppc64le ppc64le ppc64le GNU/Linux
>
> Will attach is full the execution logs here.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1684054/+subscriptions