[Kernel-packages] [Bug 1767927] Re: ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE!
** Also affects: ubuntu-power-systems
   Importance: Undecided
       Status: New

** Changed in: ubuntu-power-systems
       Status: New => Triaged

** Changed in: ubuntu-power-systems
   Importance: Undecided => High

** Changed in: ubuntu-power-systems
     Assignee: (unassigned) => Canonical Kernel Team (canonical-kernel-team)

** Tags added: triage-g

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1767927

Title:
  ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE!

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  New

Bug description:
  == Comment: #0 - Application Cdeadmin - 2018-03-20 14:10:53 ==

  == Comment: #1 - Application Cdeadmin - 2018-03-20 14:10:54 ==

  == Comment: #2 - Application Cdeadmin - 2018-03-20 14:10:56 ==
  --- Comment From dougmill-ibm 2018-03-20 13:51:47 EDT ---
  This problem is not tied to a Linux distro. It will be fixed in firmware, as I understand it. Let us close any redundant issues for this same problem. Mark them as duplicate.

  == Comment: #3 - Application Cdeadmin - 2018-03-20 15:50:54 ==
  --- Comment From mzipse 2018-03-20 15:44:26 EDT ---
  @stewart-ibm @svaidy, I need you to take a first look. The stop fixes that Vaidy had previously highlighted in a recent note are included in the 3/15 PNOR.

  == Comment: #5 - Application Cdeadmin - 2018-04-04 16:10:56 ==
  --- Comment From haochanh 2018-04-04 16:04:07 EDT ---
  We updated to 0330, bmc=1.18, then we hit bug 1134. Currently we are running with stop5 disabled but still see the watchdog hard lockup.

  After 2 hours of test run, I am seeing the "Watchdog: Lockup" and "became unstuck" messages:

  [Wed Apr 4 13:38:25 2018] Watchdog CPU:42 Hard LOCKUP
  [Wed Apr 4 13:38:25 2018] Modules linked in: vhost_net vhost macvtap macvlan tap xfs xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter rpcsec_gss_krb5 nfsv4 nfs fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) esp6_offload esp6 esp4_offload esp4 xfrm_algo mlx5_fpga_tools(OE) mlx5_ib(OE) mlx5_core(OE) mlxfw(OE) cxl pnv_php mlx4_en(OE) mlx4_ib(OE) ib_core(OE) mlx4_core(OE) devlink mlx_compat(OE) kvm_hv kvm binfmt_misc dm_service_time dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua input_leds joydev mac_hid idt_89hpesx ipmi_powernv
  [Wed Apr 4 13:38:25 2018] vmx_crypto ipmi_devintf at24 ofpart uio_pdrv_genirq cmdlinepart uio powernv_flash ipmi_msghandler mtd crct10dif_vpmsum opal_prd ibmpowernv nfsd sch_fq_codel auth_rpcgss nfs_acl lockd grace sunrpc knem(OE) ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq ses enclosure scsi_transport_sas hid_generic usbhid hid lpfc ast i2c_algo_bit ttm drm_kms_helper nvmet_fc syscopyarea sysfillrect nvmet sysimgblt fb_sys_fops nvme_fc nvme_fabrics crc32c_vpmsum drm i40e scsi_transport_fc aacraid [last unloaded: mlxfw]
  [Wed Apr 4 13:38:25 2018] CPU: 42 PID: 0 Comm: swapper/42 Tainted: G OE 4.15.0-12-generic #13
  [Wed Apr 4 13:38:25 2018] NIP: c00a3ca4 LR: c00a3ca4 CTR: c0008000
  [Wed Apr 4 13:38:25 2018] REGS: c00ff596fc40 TRAP: 0100 Tainted: G OE (4.15.0-12-generic)
  [Wed Apr 4 13:38:25 2018] MSR: 90001033 CR: 24004482 XER: 2004
  [Wed Apr 4 13:38:25 2018] CFAR: c00ff596fda0 SOFTE: 42
  GPR00: c00a3ca4 c00ff596fda0 c16eb200 c00ff596fc40
  GPR04: b0001033 c00a3690 24004484 000ffa45
  GPR08: 0001 c0d10ed8 00ff
  GPR12: 90121033 c7a3ce00 c00ff596ff90
  GPR16: c0047840 c0047810 c11b5380
  GPR20: 0800 c1722484 002a
  GPR24: 00a8 0007 0007
  GPR28: c161d270 c00ffb666fd8 c161d528 0007
  [Wed Apr 4 13:38:25 2018] NIP [c00a3ca4] power9_idle_type+0x24/0x40
  [Wed Apr 4 13:38:25 2018] LR [c00a3ca4] power9_idle_type+0x24/0x40
  [Wed Apr 4 13:38:25 2018] Call Trace:
  [Wed Apr 4 13:38:25 2018] [c00ff596fda0] [c00a3ca4] power9_idle_type+0x24/0x40 (unreliable)
  [Wed Apr 4 13:38:25 2018] [c00ff596fdc0] [c0ad1240] stop_loop+0x40/0x5c
  [Wed Apr 4 13:38:25 2018] [c00ff596fdf0] [c0acd9a4] cpuidle_enter_state+0xa4/0x450
  [Wed Apr 4 13:38:25 2018] [c00ff596fe50] [c017195c] call_cpuidle+0x4
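When comparing lockup reports across firmware levels, it can help to reduce each report to the CPU number and the function named on the NIP line. The following is a minimal sketch, not part of the original report: the function name `summarize_lockups` and the regular expressions are illustrative assumptions, fitted only to the log lines quoted in the bug description.

```python
import re

# Sample lines copied from the hard-lockup report in the bug description.
SAMPLE = """\
[Wed Apr 4 13:38:25 2018] Watchdog CPU:42 Hard LOCKUP
[Wed Apr 4 13:38:25 2018] NIP: c00a3ca4 LR: c00a3ca4 CTR: c0008000
[Wed Apr 4 13:38:25 2018] NIP [c00a3ca4] power9_idle_type+0x24/0x40
[Wed Apr 4 13:38:25 2018] LR [c00a3ca4] power9_idle_type+0x24/0x40
"""

# Assumed patterns: they match the message shapes shown above, nothing more.
LOCKUP_RE = re.compile(r"Watchdog CPU:(\d+) Hard LOCKUP")
NIP_RE = re.compile(r"\bNIP \[([0-9a-f]+)\] ([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def summarize_lockups(log_text):
    """Return one dict per hard-lockup report found in log_text."""
    reports = []
    current = None
    for line in log_text.splitlines():
        match = LOCKUP_RE.search(line)
        if match:
            # A new report starts at each "Hard LOCKUP" line.
            current = {"cpu": int(match.group(1)), "symbol": None, "offset": None}
            reports.append(current)
            continue
        match = NIP_RE.search(line)
        if match and current is not None and current["symbol"] is None:
            # Record the symbol and offset the CPU was executing.
            current["symbol"] = match.group(2)
            current["offset"] = int(match.group(3), 16)
    return reports


if __name__ == "__main__":
    for report in summarize_lockups(SAMPLE):
        print(report)
```

Feeding it a saved `dmesg -T` capture instead of `SAMPLE` would show whether every lockup lands in the same idle path, assuming the messages keep the shape seen here.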
[Kernel-packages] [Bug 1767927] Re: ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE!
** Changed in: linux (Ubuntu)
       Status: New => Triaged

** Changed in: linux (Ubuntu)
   Importance: Undecided => High

** Tags added: kernel-da-key

** Also affects: linux (Ubuntu Bionic)
   Importance: High
     Assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
       Status: Triaged

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Triaged
[Kernel-packages] [Bug 1767927] Re: ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE!
@Gustavo - if the patches have been accepted upstream, there is nothing wrong with your request. But keep in mind that 18.04 was recently released, so all patches that should be added to an already released Ubuntu version now need to follow the SRU process:
https://wiki.ubuntu.com/StableReleaseUpdates

This is a more structured process in which various considerations and approvals are needed, so it is more time-consuming than adding patches to an Ubuntu version that is still in development.

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Triaged
[Kernel-packages] [Bug 1767927] Re: ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE!
Hey Frank,

I understand all of these patches are already upstream in Linus' tree, and most of them seem to be clean cherry-picks (except 15b4dd7981496f51c5f9262a5e0761e48de6655f, which was backported).

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Triaged
[Kernel-packages] [Bug 1767927] Re: ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE!
** Changed in: linux (Ubuntu)
       Status: Fix Committed => Fix Released

** Changed in: ubuntu-power-systems
       Status: Fix Committed => Fix Released

Status in The Ubuntu-power-systems project:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Released
[Kernel-packages] [Bug 1767927] Re: ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE!
** Changed in: linux (Ubuntu Bionic)
       Status: Triaged => Fix Committed

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Fix Committed
[Kernel-packages] [Bug 1767927] Re: ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE!
** Changed in: ubuntu-power-systems
       Status: Triaged => In Progress

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Fix Committed
[Kernel-packages] [Bug 1767927] Re: ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE!
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

** Tags added: verification-needed-bionic

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Fix Committed
[Kernel-packages] [Bug 1767927] Re: ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE!
** Changed in: linux (Ubuntu) Status: Triaged => Fix Committed
** Changed in: ubuntu-power-systems Status: In Progress => Fix Committed

Status in The Ubuntu-power-systems project: Fix Committed
Status in linux package in Ubuntu: Fix Committed
Status in linux source package in Bionic: Fix Committed
[Kernel-packages] [Bug 1767927] Re: ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE!
--- Comment From dougm...@us.ibm.com 2018-05-29 13:09 EDT---
Chanh, are you able to verify the proposed kernel?

** Tags removed: bugnameltc-165882 kernel-da-key severity-high triage-g verification-needed-bionic

Status in The Ubuntu-power-systems project: Fix Committed
Status in linux package in Ubuntu: Fix Committed
Status in linux source package in Bionic: Fix Committed
[Kernel-packages] [Bug 1767927] Re: ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE!
--- Comment From dougm...@us.ibm.com 2018-05-29 13:42 EDT---
Canonical: the "verification-needed-bionic" tag was removed - is this intentional?

** Tags added: bugnameltc-165882 severity-high

Status in The Ubuntu-power-systems project: Fix Committed
Status in linux package in Ubuntu: Fix Committed
Status in linux source package in Bionic: Fix Committed
[Kernel-packages] [Bug 1767927] Re: ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE!
The activity log (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1767927/+activity) shows that the tag was removed by LP user 'bugproxy', probably an accident of the Bugzilla bridge. It looks like the bridge removed several tags (see log entry '2018-05-29 17:19:23'). @IBM, please verify whether this bridge behavior is correct and as expected. For now I have added the tags again.

** Tags added: kernel-da-key verification-needed-bionic

Status in The Ubuntu-power-systems project: Fix Committed
Status in linux package in Ubuntu: Fix Committed
Status in linux source package in Bionic: Fix Committed
[Kernel-packages] [Bug 1767927] Re: ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE!
Any progress on the verification for this? -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1767927 Title: ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE! Status in The Ubuntu-power-systems project: Fix Committed Status in linux package in Ubuntu: Fix Committed Status in linux source package in Bionic: Fix Committed Bug description: == Comment: #0 - Application Cdeadmin - 2018-03-20 14:10:53 == == Comment: #1 - Application Cdeadmin - 2018-03-20 14:10:54 == == Comment: #2 - Application Cdeadmin - 2018-03-20 14:10:56 == --- Comment From dougmill-ibm 2018-03-20 13:51:47 EDT --- This problem is not tied to a Linux distro. It will be fixed in firmware, as I understand it. Let us close any redundant issues for this same problem. Mark them as duplicate. == Comment: #3 - Application Cdeadmin - 2018-03-20 15:50:54 == --- Comment From mzipse 2018-03-20 15:44:26 EDT --- @stewart-ibm @svaidy , I need to you take a first look. The stop fixes that Vaidy had previously highlighted in a recent note are included in the 3/15 PNOR. == Comment: #5 - Application Cdeadmin - 2018-04-04 16:10:56 == --- Comment From haochanh 2018-04-04 16:04:07 EDT --- We update to 0330, bmc=1.18, then we hit bug 1134. Currently we are running with disable stop5 but still see the watchdog: hard lockup. 
After 2 hours of test run, I am seeing the "Watchdog: Lockup' and "became unstuck" [Wed Apr 4 13:38:25 2018] Watchdog CPU:42 Hard LOCKUP [Wed Apr 4 13:38:25 2018] Modules linked in: vhost_net vhost macvtap macvlan tap xfs xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter rpcsec_gss_krb5 nfsv4 nfs fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) esp6_offload esp6 esp4_offload esp4 xfrm_algo mlx5_fpga_tools(OE) mlx5_ib(OE) mlx5_core(OE) mlxfw(OE) cxl pnv_php mlx4_en(OE) mlx4_ib(OE) ib_core(OE) mlx4_core(OE) devlink mlx_compat(OE) kvm_hv kvm binfmt_misc dm_service_time dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua input_leds joydev mac_hid idt_89hpesx ipmi_powernv [Wed Apr 4 13:38:25 2018] vmx_crypto ipmi_devintf at24 ofpart uio_pdrv_genirq cmdlinepart uio powernv_flash ipmi_msghandler mtd crct10dif_vpmsum opal_prd ibmpowernv nfsd sch_fq_codel auth_rpcgss nfs_acl lockd grace sunrpc knem(OE) ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq ses enclosure scsi_transport_sas hid_generic usbhid hid lpfc ast i2c_algo_bit ttm drm_kms_helper nvmet_fc syscopyarea sysfillrect nvmet sysimgblt fb_sys_fops nvme_fc nvme_fabrics crc32c_vpmsum drm i40e scsi_transport_fc aacraid [last unloaded: mlxfw] [Wed Apr 4 13:38:25 2018] CPU: 42 PID: 0 Comm: swapper/42 Tainted: G OE4.15.0-12-generic #13 [Wed Apr 4 13:38:25 2018] NIP: c00a3ca4 LR: c00a3ca4 CTR: c0008000 [Wed Apr 4 13:38:25 2018] REGS: c00ff596fc40 TRAP: 0100 Tainted: G OE (4.15.0-12-generic) [Wed Apr 4 13:38:25 2018] MSR: 90001033 CR: 24004482 XER: 2004 [Wed Apr 4 13:38:25 2018] CFAR: c00ff596fda0 SOFTE: 42 GPR00: c00a3ca4 c00ff596fda0 c16eb200 c00ff596fc40 GPR04: b0001033 c00a3690 24004484 000ffa45 GPR08: 0001 c0d10ed8 00ff GPR12: 
90121033 c7a3ce00 c00ff596ff90 GPR16: c0047840 c0047810 c11b5380 GPR20: 0800 c1722484 002a GPR24: 00a8 0007 0007 GPR28: c161d270 c00ffb666fd8 c161d528 0007 [Wed Apr 4 13:38:25 2018] NIP [c00a3ca4] power9_idle_type+0x24/0x40 [Wed Apr 4 13:38:25 2018] LR [c00a3ca4] power9_idle_type+0x24/0x40 [Wed Apr 4 13:38:25 2018] Call Trace: [Wed Apr 4 13:38:25 2018] [c00ff596fda0] [c00a3ca4] power9_idle_type+0x24/0x40 (unreliable) [Wed Apr 4 13:38:25 2018] [c00ff596fdc0] [c0ad1240] stop_loop+0x40/0x5c [Wed Apr 4 13:38:25 2018] [c00ff596fdf0] [c0acd9a4] cpuidle_enter_state+0xa4/0x450 [Wed Apr 4 13:38:25 2018] [c00ff596fe50] [c017195c] call_cpuidle+0x4c/0x90 [Wed Apr 4 1
[Kernel-packages] [Bug 1767927] Re: ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE!
** Tags added: triage-g

Status in The Ubuntu-power-systems project: Fix Committed
Status in linux package in Ubuntu: Fix Committed
Status in linux source package in Bionic: Fix Committed
[Kernel-packages] [Bug 1767927] Re: ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE!
--- Comment From gwal...@br.ibm.com 2018-06-08 12:57 EDT---
I got a Boston machine, used a kernel with the proposed patches, and did not see a hang or an error in dmesg after the ipmi test in a KVM guest.

** Tags removed: verification-needed-bionic
** Tags added: verification-done-bionic

Status in The Ubuntu-power-systems project: Fix Committed
Status in linux package in Ubuntu: Fix Committed
Status in linux source package in Bionic: Fix Committed
[Kernel-packages] [Bug 1767927] Re: ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE!
This bug was fixed in the package linux - 4.15.0-23.25

---
linux (4.15.0-23.25) bionic; urgency=medium

  * linux: 4.15.0-23.25 -proposed tracker (LP: #1772927)

  * arm64 SDEI support needs trampoline code for KPTI (LP: #1768630)
    - arm64: mmu: add the entry trampolines start/end section markers into sections.h
    - arm64: sdei: Add trampoline code for remapping the kernel

  * Some PCIe errors not surfaced through rasdaemon (LP: #1769730)
    - ACPI: APEI: handle PCIe AER errors in separate function
    - ACPI: APEI: call into AER handling regardless of severity

  * qla2xxx: Fix page fault at kmem_cache_alloc_node() (LP: #1770003)
    - scsi: qla2xxx: Fix session cleanup for N2N
    - scsi: qla2xxx: Remove unused argument from qlt_schedule_sess_for_deletion()
    - scsi: qla2xxx: Serialize session deletion by using work_lock
    - scsi: qla2xxx: Serialize session free in qlt_free_session_done
    - scsi: qla2xxx: Don't call dma_free_coherent with IRQ disabled.
    - scsi: qla2xxx: Fix warning in qla2x00_async_iocb_timeout()
    - scsi: qla2xxx: Prevent relogin trigger from sending too many commands
    - scsi: qla2xxx: Fix double free bug after firmware timeout
    - scsi: qla2xxx: Fixup locking for session deletion

  * Several hisi_sas bug fixes (LP: #1768974)
    - scsi: hisi_sas: dt-bindings: add an property of signal attenuation
    - scsi: hisi_sas: support the property of signal attenuation for v2 hw
    - scsi: hisi_sas: fix the issue of link rate inconsistency
    - scsi: hisi_sas: fix the issue of setting linkrate register
    - scsi: hisi_sas: increase timer expire of internal abort task
    - scsi: hisi_sas: remove unused variable hisi_sas_devices.running_req
    - scsi: hisi_sas: fix return value of hisi_sas_task_prep()
    - scsi: hisi_sas: Code cleanup and minor bug fixes

  * [bionic] machine stuck and bonding not working well when nvmet_rdma module is loaded (LP: #1764982)
    - nvmet-rdma: Don't flush system_wq by default during remove_one
    - nvme-rdma: Don't flush delete_wq by default during remove_one

  * Warnings/hang during error handling of SATA disks on SAS controller (LP: #1768971)
    - scsi: libsas: defer ata device eh commands to libata

  * Hotplugging a SATA disk into a SAS controller may cause crash (LP: #1768948)
    - ata: do not schedule hot plug if it is a sas host

  * ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU ATTEMPT TO RE-ENTER FIRMWARE! (LP: #1767927)
    - powerpc/powernv: Handle unknown OPAL errors in opal_nvram_write()
    - powerpc/64s: return more carefully from sreset NMI
    - powerpc/64s: sreset panic if there is no debugger or crash dump handlers

  * fsnotify: Fix fsnotify_mark_connector race (LP: #1765564)
    - fsnotify: Fix fsnotify_mark_connector race

  * Hang on network interface removal in Xen virtual machine (LP: #1771620)
    - xen-netfront: Fix hang on device removal

  * HiSilicon HNS NIC names are truncated in /proc/interrupts (LP: #1765977)
    - net: hns: Avoid action name truncation

  * Ubuntu 18.04 kernel crashed while in degraded mode (LP: #1770849)
    - SAUCE: powerpc/perf: Fix memory allocation for core-imc based on num_possible_cpus()

  * Switch Build-Depends: transfig to fig2dev (LP: #1770770)
    - [Config] update Build-Depends: transfig to fig2dev

  * smp_call_function_single/many core hangs with stop4 alone (LP: #1768898)
    - cpufreq: powernv: Fix hardlockup due to synchronous smp_call in timer interrupt

  * Add d-i support for Huawei NICs (LP: #1767490)
    - d-i: add hinic to nic-modules udeb

  * unregister_netdevice: waiting for eth0 to become free. Usage count = 5 (LP: #1746474)
    - xfrm: reuse uncached_list to track xdsts

  * Include nfp driver in linux-modules (LP: #1768526)
    - [Config] Add nfp.ko to generic inclusion list

  * Kernel panic on boot (m1.small in cn-north-1) (LP: #1771679)
    - x86/xen: Reset VCPU0 info pointer after shared_info remap

  * CVE-2018-3639 (x86)
    - x86/bugs: Fix the parameters alignment and missing void
    - KVM: SVM: Move spec control call after restore of GS
    - x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP
    - x86/cpufeatures: Disentangle MSR_SPEC_CTRL enumeration from IBRS
    - x86/cpufeatures: Disentangle SSBD enumeration
    - x86/cpufeatures: Add FEATURE_ZEN
    - x86/speculation: Handle HT correctly on AMD
    - x86/bugs, KVM: Extend speculation control for VIRT_SPEC_CTRL
    - x86/speculation: Add virtualized speculative store bypass disable support
    - x86/speculation: Rework speculative_store_bypass_update()
    - x86/bugs: Unify x86_spec_ctrl_{set_guest,restore_host}
    - x86/bugs: Expose x86_spec_ctrl_base directly
    - x86/bugs: Remove x86_spec_ctrl_set()
    - x86/bugs: Rework spec_ctrl base and mask logic
    - x86/speculation, KVM: Implement support for VIRT_SPEC_CTRL/LS_CFG
    - KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD
    - x86/bugs: Rename