[Kernel-packages] [Bug 1702998] Comment bridged from LTC Bugzilla
--- Comment From bren...@br.ibm.com 2017-09-11 14:23 EDT---
Lata, Chandan,

Canonical created a special kernel with this fix. They need us to test it before integrating the patch in the kernel. Could you please test it and let them know the result of this one-off kernel?

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1702998

Title:
  Ubuntu 17.04: Guest crashed @writeback_sb_inodes+0x310/0x590

Status in The Ubuntu-power-systems project:
  Incomplete
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Zesty:
  In Progress

Bug description:
  == Comment: #0 - Lata Kuntal - 2017-03-03 00:50:54 ==
  Ubuntu 17.04 guest dropped at xmon after crashing at writeback_sb_inodes+0x310/0x590. The guest is having XFS rootfs and NPIV disk. It crashed after 30+ hrs of BASE and NFS stress test.

  Crash logs
  ===
  root@guskvm:~# virsh console gusg1 --force
  Connected to domain gusg1
  Escape character is ^]

  0:mon>
  0:mon> t
  [c000a4bc7940] c036f790 writeback_sb_inodes+0x310/0x590
  [c000a4bc7a50] c036faf4 __writeback_inodes_wb+0xe4/0x150
  [c000a4bc7ab0] c036ff1c wb_writeback+0x2cc/0x440
  [c000a4bc7b80] c0370c30 wb_workfn+0x150/0x560
  [c000a4bc7c90] c00ed8c0 process_one_work+0x2b0/0x5a0
  [c000a4bc7d20] c00edc58 worker_thread+0xa8/0x650
  [c000a4bc7dc0] c00f67b4 kthread+0x154/0x1a0
  [c000a4bc7e30] c000b4e8 ret_from_kernel_thread+0x5c/0x74
  0:mon> r
  R00 = c036f790       R16 = c000eca70300
  R01 = c000a4bc78e0   R17 = c000f7035240
  R02 = c143c900       R18 =
  R03 = c000f7035150   R19 =
  R04 = 0019           R20 = c000a4bc4000
  R05 = 0100           R21 = ff7f
  R06 =                R22 = c433d758
  R07 =                R23 = c433d738
  R08 = 00034995       R24 =
  R09 =                R25 =
  R10 = 8000           R26 = c000f70351d8
  R11 = c000a4bc7a40   R27 =
  R12 = 2200           R28 = 0001
  R13 = cfb8           R29 = c433d728
  R14 =                R30 = c000f7035150
  R15 = c000f70351d8   R31 =
  pc  = c036c120 locked_inode_to_wb_and_lock_list+0x50/0x290
  cfar= c00b2a14 kvmppc_save_tm+0x168/0x16c
  lr  = c036f790 writeback_sb_inodes+0x310/0x590
  msr = 80009033       cr  = 24002482
  ctr = c0381e30       xer =                trap = 300
  dar =                dsisr = 4000
  0:mon> e
  cpu 0x0: Vector: 300 (Data Access) at [c000a4bc7660]
      pc: c036c120: locked_inode_to_wb_and_lock_list+0x50/0x290
      lr: c036f790: writeback_sb_inodes+0x310/0x590
      sp: c000a4bc78e0
     msr: 80009033
     dar: 0
   dsisr: 4000
    current = 0xc000fbe96000
    paca= 0xcfb8   softe: 0   irq_happened: 0x01
      pid = 17305, comm = kworker/u16:0
  Linux version 4.10.0-8-generic (buildd@bos01-ppc64el-001) (gcc version 6.3.0 20161229 (Ubuntu 6.3.0-2ubuntu1) ) #10-Ubuntu SMP Mon Feb 13 14:00:06 UTC 2017 (Ubuntu 4.10.0-8.10-generic 4.10.0-rc8)
  0:mon> d
  ||
  0:mon>

  Host and guest kernel build = 4.10.0-8-generic

  OPAL firmware version
  T side: FW860.20 (SV860_078)
  Boot side : FW860.20 (SV860_078)

  == Comment: #4 - VIPIN K. PARASHAR - 2017-03-03 02:55:20 ==
  [140071.761707] Adding 153536k swap on /dev/loop0. Priority:-2 extents:1 across:153536k FS
  [140072.153143] Adding 153472k swap on /dev/loop0. Priority:-2 extents:1 across:153472k FS
  [140072.441833] Unable to handle kernel paging request for data at address 0x
  [140072.442064] Faulting instruction address: 0xc036c120
  0:mon>
  0:mon> e
  cpu 0x0: Vector: 300 (Data Access) at [c000a4bc7660]
      pc: c036c120: locked_inode_to_wb_and_lock_list+0x50/0x290
      lr: c036f790: writeback_sb_inodes+0x310/0x590
      sp: c000a4bc78e0
     msr: 80009033
     dar: 0
   dsisr: 4000
    current = 0xc000fbe96000
    paca= 0xcfb8   softe: 0   irq_happened: 0x01
      pid = 17305, comm = kworker/u16:0
  Linux version 4.10.0-8-generic (buildd@bos01-ppc64el-001) (gcc version 6.3.0 20161229 (Ubuntu 6.3.0-2ubuntu1) ) #10-Ubuntu SMP Mon Feb 13 14:00:06 UTC 2017 (Ubuntu 4.10.0-8.10-generic 4.10.0-rc8)
  0:mon> t
  [c000a4bc7940] c036f790 writeback_sb_inodes+0x310/0x590
  [c000a4bc7a50] c036faf4 __writeback_inodes_wb+0xe4/0x150
  [c000a4bc7ab0] c036ff1c
[Kernel-packages] [Bug 1702998] Comment bridged from LTC Bugzilla
--- Comment From chraj...@in.ibm.com 2017-07-18 01:04 EDT---
I believe the following two commits associated with "writeback" code are required for fixing this bug,

commit 03e262798884b0a5f948b17433afd80606cb3497
Author: Jan Kara
Date:   Thu Mar 23 01:36:53 2017 +0100

    block: Fix bdi assignment to bdev inode when racing with disk delete

    When disk->fops->open() in __blkdev_get() returns -ERESTARTSYS, we restart the process of opening the block device. However we forget to switch bdev->bd_bdi back to noop_backing_dev_info and as a result bdev inode will be pointing to a stale bdi. Fix the problem by setting bdev->bd_bdi later when __blkdev_get() is already guaranteed to succeed.

commit f759741d9d913eb57784a94b9bca78b376fc26a9
Author: Jan Kara
Date:   Thu Mar 23 01:37:00 2017 +0100

    block: Fix oops in locked_inode_to_wb_and_lock_list()

    When block device is closed, we call inode_detach_wb() in __blkdev_put() which sets inode->i_wb to NULL. That is contrary to expectations that inode->i_wb stays valid once set during the whole inode's lifetime and leads to oops in wb_get() in locked_inode_to_wb_and_lock_list() because inode_to_wb() returned NULL.

    The reason why we called inode_detach_wb() is not valid anymore though. BDI is guaranteed to stay along until we call bdi_put() from bdev_evict_inode() so we can postpone calling inode_detach_wb() to that moment.

    Also add a warning to catch if someone uses inode_detach_wb() in a dangerous way.

-- 
https://bugs.launchpad.net/bugs/1702998

Title:
  Ubuntu 17.04: Guest crashed @writeback_sb_inodes+0x310/0x590

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  Triaged
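
The lifetime problem described in commit f759741d9d91 can be illustrated with a small toy model. This is only a sketch in Python, not kernel code: the classes `Writeback` and `Inode` and the helper `close_then_writeback()` are hypothetical stand-ins invented for illustration, and the functions named `inode_detach_wb()` and `wb_get()` merely mimic the behaviour the commit message attributes to their kernel namesakes. It assumes writeback runs once between device close and inode eviction.

```python
class Writeback:
    """Hypothetical stand-in for the writeback structure an inode points to."""

    def __init__(self):
        self.refcnt = 0


class Inode:
    """Hypothetical stand-in for a block device inode and its i_wb pointer."""

    def __init__(self, wb):
        self.i_wb = wb


def inode_detach_wb(inode):
    # Mimics the behaviour described in the commit message: i_wb becomes NULL.
    inode.i_wb = None


def wb_get(inode):
    # Writeback assumes i_wb stays valid for the inode's whole lifetime,
    # so it takes a reference without checking for NULL (None here).
    inode.i_wb.refcnt += 1


def close_then_writeback(detach_on_close):
    """Close the device, let writeback run once, then evict the inode.

    detach_on_close=True models the old ordering (detach at close time);
    detach_on_close=False models the ordering after the fix (detach only
    at eviction).
    """
    inode = Inode(Writeback())
    if detach_on_close:
        inode_detach_wb(inode)  # old ordering: detach while inode is still live
    try:
        wb_get(inode)  # writeback still sees the inode
        outcome = "ok"
    except AttributeError:
        outcome = "oops"  # the model's analogue of the NULL dereference
    inode_detach_wb(inode)  # eviction: detaching here is safe in both orderings
    return outcome


if __name__ == "__main__":
    print("detach at close:", close_then_writeback(True))
    print("detach at evict:", close_then_writeback(False))
```

In this model the old ordering fails because writeback can still reach the inode after close, while postponing the detach to eviction keeps `i_wb` valid for as long as writeback can see the inode.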
[Kernel-packages] [Bug 1702998] Comment bridged from LTC Bugzilla
--- Comment From vipar...@in.ibm.com 2017-07-07 17:18 EDT---
This is the same issue being debugged under LTC bug 149014 / LP1659111.

-- 
https://bugs.launchpad.net/bugs/1702998

Title:
  Ubuntu 17.04: Guest crashed @writeback_sb_inodes+0x310/0x590

Status in The Ubuntu-power-systems project:
  New
Status in linux package in Ubuntu:
  New