[Kernel-packages] [Bug 1661684] Re: ISST-LTE:pVM:roselp4:ubuntu 16.04.2: drop in xmon when running dlpar tests under stress

2019-07-24 Thread Brad Figg
** Tags added: cscc

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1661684

Title:
  ISST-LTE:pVM:roselp4:ubuntu 16.04.2: drop in xmon when running dlpar
  tests under stress

Status in The Ubuntu-power-systems project:
  Opinion
Status in linux package in Ubuntu:
  Incomplete

Bug description:
  == Comment: #0 - Ping Tian Han  - 2016-12-26 21:59:52 ==
  ---Problem Description---
  When testing DLPAR (including slot/cpu/mem) under stress on roselp4, the system dropped into xmon:

  roselp4 login: [   95.511790] sysrq: SysRq : Changing Loglevel
  [   95.511816] sysrq: Loglevel set to 9
  [  289.363833] mlx4_en 0292:60:00.0: removed PHC
  [  293.123896] iommu: Removing device 0292:60:00.0 from group 3
  [  303.173744] pci_bus 0292:60: busn_res: [bus 60-ff] is released
  [  303.173865] rpadlpar_io: slot PHB 658 removed
  [  335.853779] iommu: Removing device 0021:01:00.0 from group 0
  [  345.893764] pci_bus 0021:01: busn_res: [bus 01-ff] is released
  [  345.893869] rpadlpar_io: slot PHB 33 removed
  [  382.204003] min_free_kbytes is not updated to 16885 because user defined value 551564 is preferred
  [  446.143648] cpu 152 (hwid 152) Ready to die...
  [  446.464057] cpu 153 (hwid 153) Ready to die...
  [  446.473525] cpu 154 (hwid 154) Ready to die...
  [  446.474077] cpu 155 (hwid 155) Ready to die...
  [  446.483529] cpu 156 (hwid 156) Ready to die...
  [  446.493532] cpu 157 (hwid 157) Ready to die...
  [  446.494078] cpu 158 (hwid 158) Ready to die...
  [  446.503527] cpu 159 (hwid 159) Ready to die...
  [  446.664534] cpu 144 (hwid 144) Ready to die...
  [  446.964113] cpu 145 (hwid 145) Ready to die...
  [  446.973525] cpu 146 (hwid 146) Ready to die...
  [  446.974094] cpu 147 (hwid 147) Ready to die...
  [  446.983944] cpu 148 (hwid 148) Ready to die...
  [  446.984062] cpu 149 (hwid 149) Ready to die...
  [  446.993518] cpu 150 (hwid 150) Ready to die...
  [  446.993543] Querying DEAD? cpu 150 (150) shows 2
  [  446.994098] cpu 151 (hwid 151) Ready to die...
  [  447.133726] cpu 136 (hwid 136) Ready to die...
  [  447.403532] cpu 137 (hwid 137) Ready to die...
  [  447.403772] cpu 138 (hwid 138) Ready to die...
  [  447.403839] cpu 139 (hwid 139) Ready to die...
  [  447.403887] cpu 140 (hwid 140) Ready to die...
  [  447.403937] cpu 141 (hwid 141) Ready to die...
  [  447.403979] cpu 142 (hwid 142) Ready to die...
  [  447.404038] cpu 143 (hwid 143) Ready to die...
  [  447.513546] cpu 128 (hwid 128) Ready to die...
  [  447.693533] cpu 129 (hwid 129) Ready to die...
  [  447.693999] cpu 130 (hwid 130) Ready to die...
  [  447.703530] cpu 131 (hwid 131) Ready to die...
  [  447.704087] Querying DEAD? cpu 132 (132) shows 2
  [  447.704102] cpu 132 (hwid 132) Ready to die...
  [  447.713534] cpu 133 (hwid 133) Ready to die...
  [  447.714064] Querying DEAD? cpu 134 (134) shows 2
  cpu 0x86: Vector: 300 (Data Access) at [c7b0fd40]
      pc: 1ec3072c
      lr: 1ec2fee0
      sp: 1faf6bd0
     msr: 800102801000
     dar: 212d6c1a2a20c
   dsisr: 4200
    current = 0xc00474c6d600
    paca    = 0xc7b6b600   softe: 0   irq_happened: 0x01
      pid   = 0, comm = swapper/134
  Linux version 4.8.0-34-generic (buildd@bos01-ppc64el-026) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #36~16.04.1-Ubuntu SMP Wed Dec 21 18:53:20 UTC 2016 (Ubuntu 4.8.0-34.36~16.04.1-generic 4.8.11)
  WARNING: exception is not recoverable, can't continue
  enter ? for help
  SP (1faf6bd0) is in userspace
  86:mon> 
  86:mon> t
  SP (1faf6bd0) is in userspace
  86:mon> r
  R00 = 000212d6c1a2a20f   R16 = c0ff1c38
  R01 = 1faf6bd0   R17 = c00474c9c080
  R02 = 1ed1be80   R18 = c00474c9c000
  R03 = 1faf6c80   R19 = c13fdf08
  R04 = 0018   R20 = c00474c9c080
  R05 = 00e0   R21 = c13e8ad0
  R06 = 9e04   R22 = c00474c9c000
  R07 = 1faf6d30   R23 = c0047a9a1c40
  R08 = 1faf6d28   R24 = 0002
  R09 = 000212d6c1a2a20c   R25 = c0fd4e6c
  R10 = 1ec1b118   R26 = c0fd4e6c
  R11 = 1ee7e040   R27 = c14daae0
  R12 = 0163c1d8   R28 = 
  R13 = c7b6b600   R29 = 0086
  R14 = c14defb0   R30 = c0fd4e68
  R15 = 0001   R31 = 1faf6bd0
  pc  = 1ec3072c
  cfar= 1ec2fedc
  lr  = 1ec2fee0
  msr = 800102801000   cr  = 4200
  ctr = 1ec48788   xer = 0020   trap =  300
  dar = 000212d6c1a2a20c   dsisr = 4200
  86:mon> 
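
  The following is a minimal sketch, not part of the original report, for
  cross-checking the xmon output above against the kernel log. xmon prints
  the CPU number in hex (0x86) while the kernel log prints it in decimal, so
  the script converts it and also checks which registers hold the faulting
  address (dar). The excerpt string and the summarize() helper are
  illustrative assumptions; the values are copied as they appear in this
  archive.

```python
import re

# Excerpt copied from the log and xmon output above (values as archived).
XMON_EXCERPT = """\
[  447.714064] Querying DEAD? cpu 134 (134) shows 2
cpu 0x86: Vector: 300 (Data Access) at [c7b0fd40]
dar: 212d6c1a2a20c
pid   = 0, comm = swapper/134
R00 = 000212d6c1a2a20f
R09 = 000212d6c1a2a20c
"""


def summarize(text):
    """Extract the faulting CPU, vector, DAR, task name, and GPRs from an xmon excerpt."""
    cpu_hex, vector = re.search(
        r"cpu 0x([0-9a-f]+): Vector: ([0-9a-f]+)", text
    ).groups()
    dar = re.search(r"^dar: ([0-9a-f]+)", text, re.M).group(1)
    comm = re.search(r"comm = (\S+)", text).group(1)
    regs = {
        name: int(value, 16)
        for name, value in re.findall(r"(R\d\d) = ([0-9a-f]+)", text)
    }
    return {
        "cpu": int(cpu_hex, 16),  # xmon reports the CPU number in hex
        "vector": vector,
        "dar": int(dar, 16),
        "comm": comm,
        "regs": regs,
    }


info = summarize(XMON_EXCERPT)
print("faulting cpu (decimal):", info["cpu"])
print("task:", info["comm"])
print("registers equal to dar:",
      [name for name, value in info["regs"].items() if value == info["dar"]])
```

  Run on the excerpt, this shows that cpu 0x86 is CPU 134, the idle task
  (swapper/134) of the last CPU named in a "Querying DEAD?" line, and that
  R09 holds the faulting address while R00 is that address plus 3.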

  
   
  Contact Information = Ping Tian Han/pt...@cn.ibm.com 
   
  ---uname output---
  Linux roselp4 4.8.0-34-generic #36~16.04.1-Ubuntu SMP Wed Dec 21 18:53:20 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux
   
  Machine Type = lpar 
   
  ---Debugger Data---
  cpu 0x86: 

[Kernel-packages] [Bug 1661684] Re: ISST-LTE:pVM:roselp4:ubuntu 16.04.2: drop in xmon when running dlpar tests under stress

2017-08-25 Thread Andrew Cloke
** Changed in: ubuntu-power-systems
 Assignee: (unassigned) => Canonical Kernel Team (canonical-kernel-team)

Status in The Ubuntu-power-systems project:
  Opinion
Status in linux package in Ubuntu:
  Incomplete

[Kernel-packages] [Bug 1661684] Re: ISST-LTE:pVM:roselp4:ubuntu 16.04.2: drop in xmon when running dlpar tests under stress

2017-05-09 Thread Manoj Iyer
** Changed in: ubuntu-power-systems
   Status: New => Incomplete

** Changed in: ubuntu-power-systems
   Status: Incomplete => Opinion

Status in The Ubuntu-power-systems project:
  Opinion
Status in linux package in Ubuntu:
  Incomplete

[Kernel-packages] [Bug 1661684] Re: ISST-LTE:pVM:roselp4:ubuntu 16.04.2: drop in xmon when running dlpar tests under stress

2017-04-26 Thread Manoj Iyer
** Also affects: ubuntu-power-systems
   Importance: Undecided
   Status: New

Status in The Ubuntu-power-systems project:
  New
Status in linux package in Ubuntu:
  Incomplete

[Kernel-packages] [Bug 1661684] Re: ISST-LTE:pVM:roselp4:ubuntu 16.04.2: drop in xmon when running dlpar tests under stress

2017-03-21 Thread Michael Hohnbaum
** Changed in: linux (Ubuntu)
   Status: New => Incomplete

Status in linux package in Ubuntu:
  Incomplete
