[Kernel-packages] [Bug 1661684] Re: ISST-LTE:pVM:roselp4:ubuntu 16.04.2: drop in xmon when running dlpar tests under stress
** Tags added: cscc

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1661684

Title:
  ISST-LTE:pVM:roselp4:ubuntu 16.04.2: drop in xmon when running dlpar tests under stress

Status in The Ubuntu-power-systems project:
  Opinion
Status in linux package in Ubuntu:
  Incomplete

Bug description:
  == Comment: #0 - Ping Tian Han - 2016-12-26 21:59:52 ==
  ---Problem Description---
  When testing DLPAR, include slot/cpu/mem, under stress on roselp4, system dropped into xmon:

  roselp4 login: [ 95.511790] sysrq: SysRq : Changing Loglevel
  [ 95.511816] sysrq: Loglevel set to 9
  [ 289.363833] mlx4_en 0292:60:00.0: removed PHC
  [ 293.123896] iommu: Removing device 0292:60:00.0 from group 3
  [ 303.173744] pci_bus 0292:60: busn_res: [bus 60-ff] is released
  [ 303.173865] rpadlpar_io: slot PHB 658 removed
  [ 335.853779] iommu: Removing device 0021:01:00.0 from group 0
  [ 345.893764] pci_bus 0021:01: busn_res: [bus 01-ff] is released
  [ 345.893869] rpadlpar_io: slot PHB 33 removed
  [ 382.204003] min_free_kbytes is not updated to 16885 because user defined value 551564 is preferred
  [ 446.143648] cpu 152 (hwid 152) Ready to die...
  [ 446.464057] cpu 153 (hwid 153) Ready to die...
  [ 446.473525] cpu 154 (hwid 154) Ready to die...
  [ 446.474077] cpu 155 (hwid 155) Ready to die...
  [ 446.483529] cpu 156 (hwid 156) Ready to die...
  [ 446.493532] cpu 157 (hwid 157) Ready to die...
  [ 446.494078] cpu 158 (hwid 158) Ready to die...
  [ 446.503527] cpu 159 (hwid 159) Ready to die...
  [ 446.664534] cpu 144 (hwid 144) Ready to die...
  [ 446.964113] cpu 145 (hwid 145) Ready to die...
  [ 446.973525] cpu 146 (hwid 146) Ready to die...
  [ 446.974094] cpu 147 (hwid 147) Ready to die...
  [ 446.983944] cpu 148 (hwid 148) Ready to die...
  [ 446.984062] cpu 149 (hwid 149) Ready to die...
  [ 446.993518] cpu 150 (hwid 150) Ready to die...
  [ 446.993543] Querying DEAD? cpu 150 (150) shows 2
  [ 446.994098] cpu 151 (hwid 151) Ready to die...
  [ 447.133726] cpu 136 (hwid 136) Ready to die...
  [ 447.403532] cpu 137 (hwid 137) Ready to die...
  [ 447.403772] cpu 138 (hwid 138) Ready to die...
  [ 447.403839] cpu 139 (hwid 139) Ready to die...
  [ 447.403887] cpu 140 (hwid 140) Ready to die...
  [ 447.403937] cpu 141 (hwid 141) Ready to die...
  [ 447.403979] cpu 142 (hwid 142) Ready to die...
  [ 447.404038] cpu 143 (hwid 143) Ready to die...
  [ 447.513546] cpu 128 (hwid 128) Ready to die...
  [ 447.693533] cpu 129 (hwid 129) Ready to die...
  [ 447.693999] cpu 130 (hwid 130) Ready to die...
  [ 447.703530] cpu 131 (hwid 131) Ready to die...
  [ 447.704087] Querying DEAD? cpu 132 (132) shows 2
  [ 447.704102] cpu 132 (hwid 132) Ready to die...
  [ 447.713534] cpu 133 (hwid 133) Ready to die...
  [ 447.714064] Querying DEAD? cpu 134 (134) shows 2
  cpu 0x86: Vector: 300 (Data Access) at [c7b0fd40]
      pc: 1ec3072c
      lr: 1ec2fee0
      sp: 1faf6bd0
     msr: 800102801000
     dar: 212d6c1a2a20c
   dsisr: 4200
    current = 0xc00474c6d600
    paca= 0xc7b6b600 softe: 0 irq_happened: 0x01
      pid = 0, comm = swapper/134
  Linux version 4.8.0-34-generic (buildd@bos01-ppc64el-026) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #36~16.04.1-Ubuntu SMP Wed Dec 21 18:53:20 UTC 2016 (Ubuntu 4.8.0-34.36~16.04.1-generic 4.8.11)
  WARNING: exception is not recoverable, can't continue
  enter ? for help
  SP (1faf6bd0) is in userspace
  86:mon>
  86:mon> t
  SP (1faf6bd0) is in userspace
  86:mon> r
  R00 = 000212d6c1a2a20f   R16 = c0ff1c38
  R01 = 1faf6bd0           R17 = c00474c9c080
  R02 = 1ed1be80           R18 = c00474c9c000
  R03 = 1faf6c80           R19 = c13fdf08
  R04 = 0018               R20 = c00474c9c080
  R05 = 00e0               R21 = c13e8ad0
  R06 = 9e04               R22 = c00474c9c000
  R07 = 1faf6d30           R23 = c0047a9a1c40
  R08 = 1faf6d28           R24 = 0002
  R09 = 000212d6c1a2a20c   R25 = c0fd4e6c
  R10 = 1ec1b118           R26 = c0fd4e6c
  R11 = 1ee7e040           R27 = c14daae0
  R12 = 0163c1d8           R28 =
  R13 = c7b6b600           R29 = 0086
  R14 = c14defb0           R30 = c0fd4e68
  R15 = 0001               R31 = 1faf6bd0
  pc  = 1ec3072c
  cfar= 1ec2fedc
  lr  = 1ec2fee0
  msr = 800102801000   cr  = 4200
  ctr = 1ec48788   xer = 0020   trap = 300
  dar = 000212d6c1a2a20c   dsisr = 4200
  86:mon>

  Contact Information = Ping Tian Han/pt...@cn.ibm.com

  ---uname output---
  Linux roselp4 4.8.0-34-generic #36~16.04.1-Ubuntu SMP Wed Dec 21 18:53:20 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux

  Machine Type = lpar

  ---Debugger Data---
  cpu 0x86:
[Kernel-packages] [Bug 1661684] Re: ISST-LTE:pVM:roselp4:ubuntu 16.04.2: drop in xmon when running dlpar tests under stress
** Changed in: ubuntu-power-systems
     Assignee: (unassigned) => Canonical Kernel Team (canonical-kernel-team)
[Kernel-packages] [Bug 1661684] Re: ISST-LTE:pVM:roselp4:ubuntu 16.04.2: drop in xmon when running dlpar tests under stress
** Changed in: ubuntu-power-systems
       Status: New => Incomplete

** Changed in: ubuntu-power-systems
       Status: Incomplete => Opinion
[Kernel-packages] [Bug 1661684] Re: ISST-LTE:pVM:roselp4:ubuntu 16.04.2: drop in xmon when running dlpar tests under stress
** Also affects: ubuntu-power-systems
   Importance: Undecided
       Status: New
[Kernel-packages] [Bug 1661684] Re: ISST-LTE:pVM:roselp4:ubuntu 16.04.2: drop in xmon when running dlpar tests under stress
** Changed in: linux (Ubuntu)
       Status: New => Incomplete