[Kernel-packages] [Bug 1620317] xmon debug session
--- Comment on attachment From bjki...@us.ibm.com 2016-09-23 14:47 EDT---

Looking at the blocked tasks, this looks like it could be an issue fixed recently upstream:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/kernel/sched/core.c?id=135e8c9250dd5c8c9aae5984fde6f230d0cbfeaf

Gabriel is building a kernel that has this fix added and we'll kick off a weekend run.

** Attachment added: "xmon debug session"
   https://bugs.launchpad.net/bugs/1620317/+attachment/4747315/+files/xmon.log

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1620317

Title:
  ISST-LTE:pNV: system ben is hung during ST (nvme)

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Committed
Status in linux source package in Yakkety:
  Fix Released

Bug description:
  When we are running I/O-intensive tasks and CPU addition/removal, the
  block layer may hang, stalling the entire machine.

  The backtrace below is one of the symptoms:

  [12747.49] ---[ end trace b4d8d720952460b5 ]---
  [12747.126885] Trying to free IRQ 357 from IRQ context!
  [12747.146930] [ cut here ]
  [12747.166674] WARNING: at /build/linux-iLHNl3/linux-4.4.0/kernel/irq/manage.c:1438
  [12747.184069] Modules linked in: minix nls_iso8859_1 rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) configfs ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) mlx5_ib(OE) mlx4_ib(OE) ib_sa(OE) ib_mad(OE) ib_core(OE) ib_addr(OE) mlx4_en(OE) mlx4_core(OE) binfmt_misc xfs joydev input_leds mac_hid ofpart cmdlinepart powernv_flash ipmi_powernv mtd ipmi_msghandler at24 opal_prd powernv_rng ibmpowernv uio_pdrv_genirq uio sunrpc knem(OE) autofs4 btrfs xor raid6_pq hid_generic usbhid hid uas usb_storage nouveau ast bnx2x i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops mlx5_core(OE) ahci drm mdio libcrc32c mlx_compat(OE) libahci vxlan nvme ip6_udp_tunnel udp_tunnel
  [12747.349013] CPU: 80 PID: 0 Comm: swapper/80 Tainted: GW OEL 4.4.0-21-generic #37-Ubuntu
  [12747.369046] task: c00f1fab89b0 ti: c00f1fb6c000 task.ti: c00f1fb6c000
  [12747.404848] NIP: c0131888 LR: c0131884 CTR: 300303f0
  [12747.808333] REGS: c00f1fb6e550 TRAP: 0700 Tainted: GW OEL (4.4.0-21-generic)
  [12747.867658] MSR: 900100029033 CR: 2802 XER: 2000
  [12747.884783] CFAR: c0aea8f4 SOFTE: 1
  GPR00: c0131884 c00f1fb6e7d0 c15b4200 0028
  GPR04: c00f2a409c50 c00f2a41b4e0 000f2948 33da
  GPR08: 0007 c0f8b27c 000f2948 90011003
  GPR12: 2200 c7b6f800 c00f2a40a938 0100
  GPR16: c00f1148 3a98
  GPR20: d9521008 d95146a0 f000
  GPR24: c4a19ef0 0003 007d
  GPR28: 0165 c00eefeb1800 c00eef830600 0165
  [12748.243270] NIP [c0131888] __free_irq+0x238/0x370
  [12748.254089] LR [c0131884] __free_irq+0x234/0x370
  [12748.269738] Call Trace:
  [12748.286740] [c00f1fb6e7d0] [c0131884] __free_irq+0x234/0x370 (unreliable)
  [12748.289687] [c00f1fb6e860] [c0131af8] free_irq+0x88/0xb0
  [12748.304594] [c00f1fb6e890] [d9514528] nvme_suspend_queue+0xc8/0x150 [nvme]
  [12748.333825] [c00f1fb6e8c0] [d951681c] nvme_dev_disable+0x3fc/0x400 [nvme]
  [12748.340913] [c00f1fb6e9a0] [d9516ae4] nvme_timeout+0xe4/0x260 [nvme]
  [12748.357136] [c00f1fb6ea60] [c0548a34] blk_mq_rq_timed_out+0x64/0x110
  [12748.383939] [c00f1fb6ead0] [c054c540] bt_for_each+0x160/0x170
  [12748.399292] [c00f1fb6eb40] [c054d4e8] blk_mq_queue_tag_busy_iter+0x78/0x110
  [12748.402665] [c00f1fb6eb90] [c0547358] blk_mq_rq_timer+0x48/0x140
  [12748.438649] [c00f1fb6ebd0] [c014a13c] call_timer_fn+0x5c/0x1c0
  [12748.468126] [c00f1fb6ec60] [c014a5fc] run_timer_softirq+0x31c/0x3f0
  [12748.483367] [c00f1fb6ed30] [c00beb78] __do_softirq+0x188/0x3e0
  [12748.498378] [c00f1fb6ee20] [c00bf048] irq_exit+0xc8/0x100
  [12748.501048] [c00f1fb6ee40] [c001f954] timer_interrupt+0xa4/0xe0
  [12748.516377] [c00f1fb6ee70] [c0002714] decrementer_common+0x114/0x180
  [12748.547282] --- interrupt: 901 at arch_local_irq_restore+0x74/0x90
  [12748.547282] LR = arch_local_irq_restore+0x74/0x90
  [12748.574141] [c00f1fb6f160] [0001] 0x1 (unreliable)
  [12748.592405] [c00f1fb6f180] [c0aedc3c] dump_stack+0xd0/0xf0
  [12748.596461] [c00f1fb6f1c0] [c01006fc] deq
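
For readers following the backtrace in the bug description, here is a minimal sketch (not part of the original report) of pulling the frame symbols out of a flattened dmesg excerpt. The helper name `extract_frames`, the regular expression, and the shortened `SAMPLE` string are illustrative assumptions; the sample frames themselves are copied from the trace above. Listing the symbols in order makes it easier to see that `free_irq` is reached from `nvme_timeout`, which itself runs under `run_timer_softirq`, consistent with the "Trying to free IRQ 357 from IRQ context!" warning.

```python
import re

# Matches one call-trace frame such as
#   [c00f1fb6e860] [c0131af8] free_irq+0x88/0xb0
# i.e. stack address, return address, then symbol+offset/size.
# Timestamps like [12748.289687] contain a '.', so they never match
# the hex-only bracket groups.
FRAME_RE = re.compile(
    r"\[([0-9a-f]+)\] \[([0-9a-f]+)\] ([A-Za-z_][\w.]*)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)

def extract_frames(log_text):
    """Return the symbol names of all call-trace frames, innermost first."""
    return [m.group(3) for m in FRAME_RE.finditer(log_text)]

# A few frames copied from the trace in the bug description (shortened).
SAMPLE = (
    "[12748.289687] [c00f1fb6e860] [c0131af8] free_irq+0x88/0xb0 "
    "[12748.304594] [c00f1fb6e890] [d9514528] nvme_suspend_queue+0xc8/0x150 [nvme] "
    "[12748.340913] [c00f1fb6e9a0] [d9516ae4] nvme_timeout+0xe4/0x260 [nvme] "
    "[12748.468126] [c00f1fb6ec60] [c014a5fc] run_timer_softirq+0x31c/0x3f0"
)

frames = extract_frames(SAMPLE)
print(" <- ".join(frames))
```

The same function can be run over the full attached log; frames without a `symbol+offset/size` part (for example `0x1 (unreliable)`) are skipped by design.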
[Kernel-packages] [Bug 1620317] xmon debug session
--- Comment on attachment From bjki...@us.ibm.com 2016-09-23 14:47 EDT---

Looking at the blocked tasks, this looks like it could be an issue fixed recently upstream:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/kernel/sched/core.c?id=135e8c9250dd5c8c9aae5984fde6f230d0cbfeaf

Gabriel is building a kernel that has this fix added and we'll kick off a weekend run.

** Attachment added: "xmon debug session"
   https://bugs.launchpad.net/bugs/1620317/+attachment/4750567/+files/xmon.log

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1620317