Re: kworker with empty task->cpus_allowed (was Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host)
On Tue, Jul 04, 2017 at 09:06:55PM +1000, Michael Ellerman wrote:
> Eryu Guan <eg...@redhat.com> writes:
>
> > On Tue, Jul 04, 2017 at 04:26:11PM +1000, Michael Ellerman wrote:
> >> Eryu Guan <eg...@redhat.com> writes:
> >> > On Fri, Jun 30, 2017 at 08:07:02PM +1000, Michael Ellerman wrote:
> >> >>
> >> >> Can you try this patch and see if it changes anything? (with the debug
> >> >> still applied).
> >> >
> >> > This patch fixes the crash for me. After applying this patch (with all
> >> > other debug patches still applied), kernel didn't print any warnings or
> >> > calltraces or debug messages.
> >>
> >> OK. It's not meant to fix it :)
> >
> > Understand.
> >
> >>
> >> I can't form any connection between your bisection result and that
> >> patch, nothing is making any sense TBH.
> >>
> >> What hardware are you on? And are you doing CPU hotplug or anything like
> >> that?
> >
> > It's a "PowerVM" guest (I'm not familiar with powerpc, I don't know what
> > does that mean..) running on Power8 host. I didn't do any CPU hotplug or
> > anything like that.
>
> OK thanks.
>
> We might have to try and sync up on irc so we can debug this a bit faster.

Sure, where can I find you? I'm in #xfs at freenode, nick eguan. But
maybe tomorrow, I have to take off today.

>
> Can you try this hunk also?

This new WARN_ON didn't trigger (I skipped the other warning messages,
they're the same warnings as in my last reply).

Thanks,
Eryu

>
> cheers
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index c74bf39ef764..7c55721b1f1d 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -3902,6 +3906,7 @@ static int alloc_and_link_pwqs(struct workqueue_struct *wq)
>  			      "ordering guarantee broken for workqueue %s\n", wq->name);
>  		return ret;
>  	} else {
> +		WARN_ON(cpumask_empty(unbound_std_wq_attrs[highpri]->cpumask));
>  		return apply_workqueue_attrs(wq, unbound_std_wq_attrs[highpri]);
>  	}
>  }
Re: kworker with empty task->cpus_allowed (was Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host)
On Tue, Jul 04, 2017 at 04:26:11PM +1000, Michael Ellerman wrote:
> Eryu Guan <eg...@redhat.com> writes:
> > On Fri, Jun 30, 2017 at 08:07:02PM +1000, Michael Ellerman wrote:
> >>
> >> Can you try this patch and see if it changes anything? (with the debug
> >> still applied).
> >
> > This patch fixes the crash for me. After applying this patch (with all
> > other debug patches still applied), kernel didn't print any warnings or
> > calltraces or debug messages.
>
> OK. It's not meant to fix it :)

Understand.

>
> I can't form any connection between your bisection result and that
> patch, nothing is making any sense TBH.
>
> What hardware are you on? And are you doing CPU hotplug or anything like
> that?

It's a "PowerVM" guest (I'm not familiar with powerpc, I don't know what
does that mean..) running on Power8 host. I didn't do any CPU hotplug or
anything like that.

lscpu output:

Architecture:          ppc64le
Byte Order:            Little Endian
CPU(s):                16
On-line CPU(s) list:   0-15
Thread(s) per core:    8
Core(s) per socket:    1
Socket(s):             2
NUMA node(s):          3
Model:                 2.1 (pvr 004b 0201)
Model name:            POWER8 (architected), altivec supported
Hypervisor vendor:     (null)
Virtualization type:   full
L1d cache:             64K
L1i cache:             32K
NUMA node0 CPU(s):     0-7
NUMA node2 CPU(s):     8-15
NUMA node3 CPU(s):

>
> Can you back out the last patch I sent and try this?

I appended the calltraces from the test here, I also attached full dmesg
log, which included the boot log.
[ 74.410871] [ cut here ] [ 74.410895] WARNING: CPU: 0 PID: 2378 at kernel/workqueue.c:3346 alloc_unbound_pwq+0x320/0x690 [ 74.410901] Modules linked in: ext4 jbd2 mbcache sg pseries_rng ghash_generic gf128mul xts vmx_crypto nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod ibmvscsi ibmveth scsi_transport_srp [ 74.410949] CPU: 0 PID: 2378 Comm: mount Not tainted 4.12.0.debug+ #35 [ 74.410954] task: c003f0447280 task.stack: c003f039c000 [ 74.410959] NIP: c011a310 LR: c011a300 CTR: c011a1e4 [ 74.410963] REGS: c003f039f550 TRAP: 0700 Not tainted (4.12.0.debug+) [ 74.410968] MSR: 80010282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> [ 74.410993] CR: 2402 XER: 0001 [ 74.410998] CFAR: c0581584 SOFTE: 1 [ 74.410998] GPR00: c011a590 c003f039f7d0 c1751800 0001 [ 74.410998] GPR04: 00a0 00c0 [ 74.410998] GPR08: 0030 [ 74.410998] GPR12: 0001 cfac 0002 c003fd237000 [ 74.410998] GPR16: c003d1a10400 0002 [ 74.410998] GPR20: c003cb7ac560 c003fd0387a0 c179a294 [ 74.410998] GPR24: c003cb7ac400 c003f02349c0 00a0 c003f0234a00 [ 74.410998] GPR28: 6ca6897b c003cb7ac400 c179a294 [ 74.411082] NIP [c011a310] alloc_unbound_pwq+0x320/0x690 [ 74.411087] LR [c011a300] alloc_unbound_pwq+0x310/0x690 [ 74.411091] Call Trace: [ 74.411095] [c003f039f7d0] [c011a590] alloc_unbound_pwq+0x5a0/0x690 (unreliable) [ 74.411103] [c003f039f830] [c011aad4] apply_wqattrs_prepare+0x1f4/0x340 [ 74.43] [c003f039f8a0] [c011ac5c] apply_workqueue_attrs_locked+0x3c/0xa0 [ 74.411120] [c003f039f8d0] [c011b1a4] apply_workqueue_attrs+0x54/0x90 [ 74.411127] [c003f039f910] [c011d774] __alloc_workqueue_key+0x184/0x5b0 [ 74.411145] [c003f039f9d0] [d00015211768] ext4_fill_super+0x1c68/0x33e0 [ext4] [ 74.411152] [c003f039fb10] [c03910fc] mount_bdev+0x22c/0x260 [ 74.411168] [c003f039fbb0] [d00015209020] ext4_mount+0x20/0x40 [ext4] [ 74.411174] [c003f039fbd0] [c0392544] mount_fs+0x74/0x210 [ 74.411181] [c003f039fc80] [c03c0808] vfs_kern_mount+0x78/0x220 [ 74.411188] [c003f039fd00] [c03c61c4] 
do_mount+0x254/0xf70 [ 74.411194] [c003f039fde0] [c03c7304] SyS_mount+0x94/0x100 [ 74.411201] [c003f039fe30] [c000b190] system_call+0x38/0xe0 [ 74.411206] Instruction dump: [ 74.411211] 554ac03e 7f8ae050 7b9c0020 2fac 409e0290 7f44d378 38a0 484672cd [ 74.411227] 6000 7c63d278 7c630074 7863d182 <0b03> 3ca061c8 3f42001e 60a58647 [ 74.411243] ---[ end trace b720011b125c3341 ]--- [ 74.411253] [ cut here ] [ 74.411258] WARNING: CPU: 0 PID: 2378 at kernel/workqueue.c:3376 alloc_unbound_pwq+0x4b0/0x690 [ 74.411262] Modules linked in: ext4 jbd2 mbcache sg pseries_rng ghash_generic g
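A note on the lscpu output earlier in this message: node3 is listed with no CPUs. Nothing in the thread so far establishes that this is what produces the empty cpumask, so treat it as a lead only. A minimal sketch for spotting such nodes on other hosts, assuming the usual `NUMA nodeN CPU(s):` line format of lscpu; the function name empty_numa_nodes is made up for illustration:

```shell
#!/bin/sh
# empty_numa_nodes: read lscpu-style text on stdin and print the name of
# every NUMA node whose CPU list is empty, one per line.
# (Hypothetical helper; not part of lscpu or of any patch in this thread.)
empty_numa_nodes() {
	awk -F: '/^NUMA node[0-9]+ CPU\(s\)/ {
		cpus = $2
		gsub(/[ \t]/, "", cpus)   # drop the padding after the colon
		if (cpus == "") {
			split($1, f, " ")     # f[1]="NUMA", f[2]="nodeN", f[3]="CPU(s)"
			print f[2]
		}
	}'
}

# Usage on a live system:
#   lscpu | empty_numa_nodes
```

Fed the three NUMA lines from the lscpu output in this message, this should report only node3.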
Re: kworker with empty task->cpus_allowed (was Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host)
On Fri, Jun 30, 2017 at 08:07:02PM +1000, Michael Ellerman wrote:
> Eryu Guan <eg...@redhat.com> writes:
> >
> > I have to update the patch a bit to make it compile.
>
> Sure.
>
> >> +	WARN_ON(cpumask_empty(worker->task->cpus_allowed));
> >> +	WARN_ON(cpumask_empty(pool->attrs->cpumask));
> >
> > Seems only the last two WARN_ON were triggered.
>
> OK thanks.
>
> Can you try this patch and see if it changes anything? (with the debug
> still applied).

This patch fixes the crash for me. After applying this patch (with all
other debug patches still applied), kernel didn't print any warnings or
calltraces or debug messages.

>
> We've been trying to reproduce the bug here but haven't had any luck so far.

I'm using this reproducer:

for i in `seq 5`; do
	mkfs -t ext4 -F /dev/sda5 && sleep 3 && mount /dev/sda5 /mnt/ext4 && umount /dev/sda5
done

Thanks,
Eryu

>
> cheers
>
> diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
> index 4640f6d64f8b..b310ecc07e00 100644
> --- a/arch/powerpc/kernel/setup_64.c
> +++ b/arch/powerpc/kernel/setup_64.c
> @@ -733,6 +733,8 @@ void __init setup_per_cpu_areas(void)
>  	for_each_possible_cpu(cpu) {
>  		__per_cpu_offset[cpu] = delta + pcpu_unit_offsets[cpu];
>  		paca[cpu].data_offset = __per_cpu_offset[cpu];
> +
> +		set_cpu_numa_node(cpu, numa_cpu_lookup_table[cpu]);
>  	}
>  }
>  #endif
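The reproducer loop in this message keeps going when one step fails, because the && chain only short-circuits within a single iteration. A sketch of a variant that stops at the first failing cycle and says which one it was; repro_loop and ext4_cycle are hypothetical helper names, and DEV/MNT default to the device and mount point used in this thread:

```shell
#!/bin/sh
# repro_loop COUNT CMD [ARGS...]
# Run CMD up to COUNT times. Stop at the first failure, report the
# iteration on stderr and return 1; return 0 if every run succeeded.
repro_loop() {
	count=$1
	shift
	i=1
	while [ "$i" -le "$count" ]; do
		if ! "$@"; then
			echo "iteration $i failed" >&2
			return 1
		fi
		i=$((i + 1))
	done
	return 0
}

# One mkfs/mount/umount cycle, the same steps as the original loop.
# WARNING: mkfs destroys whatever is on $DEV.
ext4_cycle() {
	dev=${DEV:-/dev/sda5}
	mnt=${MNT:-/mnt/ext4}
	mkfs -t ext4 -F "$dev" && sleep 3 && mount "$dev" "$mnt" && umount "$dev"
}

# Usage (as root, on a scratch device):
#   repro_loop 5 ext4_cycle
```

This only notices a step that returns non-zero. On the debug kernel, where the mount succeeds and the kernel merely warns, the log still has to be checked separately.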
Re: kworker with empty task->cpus_allowed (was Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host)
On Thu, Jun 29, 2017 at 10:06:31PM +1000, Michael Ellerman wrote: > Eryu Guan <eg...@redhat.com> writes: > > > On Thu, Jun 29, 2017 at 09:12:55PM +1000, Michael Ellerman wrote: > >> Eryu Guan <eg...@redhat.com> writes: > >> > >> > On Thu, Jun 29, 2017 at 06:47:50PM +1000, Balbir Singh wrote: > >> >> On Thu, Jun 29, 2017 at 1:41 PM, Eryu Guan <eg...@redhat.com> wrote: > >> >> > On Thu, Jun 29, 2017 at 03:16:10AM +1000, Balbir Singh wrote: > >> >> >> On Wed, Jun 28, 2017 at 6:32 PM, Eryu Guan <eg...@redhat.com> wrote: > >> >> > >> >> >> Thanks for the excellent bug report, I am a little lost on the stack > >> >> >> trace, it shows a bad page access that we think is triggered by the > >> >> >> mmap changes? The patch changed the return type to integrate the call > >> >> >> into trace-cmd. Could you point me to the tests that can help > >> >> >> reproduce the crash. Could you also suggest how long to try the test > >> >> >> cases for? > >> >> > > >> >> > Sorry, I should have provided it in the first place. It's as simple as > >> >> > mounting an ext4 filesystem on my test ppc64le host, i.e. > >> >> > > >> >> > mkdir -p /mnt/ext4 > >> >> > mkfs -t ext4 -F /dev/sda5 > >> >> > mount /dev/sda5 /mnt/ext4 > >> >> > >> >> I tried this test a few times with the kernel and could not reproduce > >> >> it. > >> >> Could you please share the config and compiler details, I'll retry with > >> >> -rc7. > >> >> > >> >> In the meanwhile, does enabling kmemleak, DEBUG_PAGE_ALLOC, > >> >> slub/slab debug, list corruption, etc catch anything at the time of the > >> >> corruption? > >> > > >> > Testing with debug kernel (config file attached) didn't trigger kernel > >> > crash, but only warnings > >> > >> But the warning says try_to_wake_up() is using a CPU number that's out > >> of bounds, which means when you lookup the runqueue for that CPU you > >> just get junk, and that's what was triggering the crash in your previous > >> report. 
> >> > >> So at least that part of the mystery is solved. > >> > >> > [ 99.686770] [ cut here ] > >> > [ 99.686868] WARNING: CPU: 1 PID: 2272 at > >> > ./include/linux/cpumask.h:121 try_to_wake_up+0x17c/0x8f0 > >> > >> static inline unsigned int cpumask_check(unsigned int cpu) > >> { > >> #ifdef CONFIG_DEBUG_PER_CPU_MAPS > >>WARN_ON_ONCE(cpu >= nr_cpumask_bits); > >> #endif /* CONFIG_DEBUG_PER_CPU_MAPS */ > >>return cpu; > >> } > >> > >> > [ 99.686873] Modules linked in: ext4 jbd2 mbcache sg pseries_rng > >> > ghash_generic gf128mul xts vmx_crypto nfsd auth_rpcgss nfs_acl lockd > >> > grace sunrpc ip_tables xfs libcrc32c sd_mod ibmvscsi ibmveth > >> > scsi_transport_srp > >> > [ 99.686950] CPU: 1 PID: 2272 Comm: mount Not tainted 4.12.0-rc7.debug > >> > #28 > >> > [ 99.686955] task: c003f00b7b00 task.stack: c003f25e > >> > [ 99.686959] NIP: c01359ec LR: c0135ed4 CTR: > >> > c016f940 > >> > [ 99.686964] REGS: c003f25e3420 TRAP: 0700 Not tainted > >> > (4.12.0-rc7.debug) > >> > [ 99.686968] MSR: 80010282b033 > >> > <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> > >> > [ 99.686994] CR: 28028822 XER: 0001 > >> > [ 99.687000] CFAR: c0135cb4 SOFTE: 0 > >> > [ 99.687000] GPR00: c0135da0 c003f25e36a0 c1751800 > >> > 00a0 > >> > [ 99.687000] GPR04: 00a0 00c0 > >> > > >> > [ 99.687000] GPR08: 00a0 > >> > 41e0 > >> > [ 99.687000] GPR12: 8800 cfac0a80 0002 > >> > c003fd20b000 > >> > [ 99.687000] GPR16: c003cabb0400 > >> > 0002 > >> > [ 99.687000] GPR20: c003f7a59d60 c1326300 > >> > c1795d00 > >> > [ 99.687000] GPR24: c000
Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host
On Thu, Jun 29, 2017 at 09:12:55PM +1000, Michael Ellerman wrote: > Eryu Guan <eg...@redhat.com> writes: > > > On Thu, Jun 29, 2017 at 06:47:50PM +1000, Balbir Singh wrote: > >> On Thu, Jun 29, 2017 at 1:41 PM, Eryu Guan <eg...@redhat.com> wrote: > >> > On Thu, Jun 29, 2017 at 03:16:10AM +1000, Balbir Singh wrote: > >> >> On Wed, Jun 28, 2017 at 6:32 PM, Eryu Guan <eg...@redhat.com> wrote: > >> > >> >> Thanks for the excellent bug report, I am a little lost on the stack > >> >> trace, it shows a bad page access that we think is triggered by the > >> >> mmap changes? The patch changed the return type to integrate the call > >> >> into trace-cmd. Could you point me to the tests that can help > >> >> reproduce the crash. Could you also suggest how long to try the test > >> >> cases for? > >> > > >> > Sorry, I should have provided it in the first place. It's as simple as > >> > mounting an ext4 filesystem on my test ppc64le host, i.e. > >> > > >> > mkdir -p /mnt/ext4 > >> > mkfs -t ext4 -F /dev/sda5 > >> > mount /dev/sda5 /mnt/ext4 > >> > >> I tried this test a few times with the kernel and could not reproduce it. > >> Could you please share the config and compiler details, I'll retry with > >> -rc7. > >> > >> In the meanwhile, does enabling kmemleak, DEBUG_PAGE_ALLOC, > >> slub/slab debug, list corruption, etc catch anything at the time of the > >> corruption? > > > > Testing with debug kernel (config file attached) didn't trigger kernel > > crash, but only warnings > > But the warning says try_to_wake_up() is using a CPU number that's out > of bounds, which means when you lookup the runqueue for that CPU you > just get junk, and that's what was triggering the crash in your previous > report. > > So at least that part of the mystery is solved. 
> > > [ 99.686770] [ cut here ] > > [ 99.686868] WARNING: CPU: 1 PID: 2272 at ./include/linux/cpumask.h:121 > > try_to_wake_up+0x17c/0x8f0 > > static inline unsigned int cpumask_check(unsigned int cpu) > { > #ifdef CONFIG_DEBUG_PER_CPU_MAPS > WARN_ON_ONCE(cpu >= nr_cpumask_bits); > #endif /* CONFIG_DEBUG_PER_CPU_MAPS */ > return cpu; > } > > > [ 99.686873] Modules linked in: ext4 jbd2 mbcache sg pseries_rng > > ghash_generic gf128mul xts vmx_crypto nfsd auth_rpcgss nfs_acl lockd grace > > sunrpc ip_tables xfs libcrc32c sd_mod ibmvscsi ibmveth scsi_transport_srp > > [ 99.686950] CPU: 1 PID: 2272 Comm: mount Not tainted 4.12.0-rc7.debug #28 > > [ 99.686955] task: c003f00b7b00 task.stack: c003f25e > > [ 99.686959] NIP: c01359ec LR: c0135ed4 CTR: > > c016f940 > > [ 99.686964] REGS: c003f25e3420 TRAP: 0700 Not tainted > > (4.12.0-rc7.debug) > > [ 99.686968] MSR: 80010282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> > > [ 99.686994] CR: 28028822 XER: 0001 > > [ 99.687000] CFAR: c0135cb4 SOFTE: 0 > > [ 99.687000] GPR00: c0135da0 c003f25e36a0 c1751800 > > 00a0 > > [ 99.687000] GPR04: 00a0 00c0 > > > > [ 99.687000] GPR08: 00a0 > > 41e0 > > [ 99.687000] GPR12: 8800 cfac0a80 0002 > > c003fd20b000 > > [ 99.687000] GPR16: c003cabb0400 > > 0002 > > [ 99.687000] GPR20: c003f7a59d60 c1326300 > > c1795d00 > > [ 99.687000] GPR24: c1799d48 c179a294 > > c003ec786be8 > > [ 99.687000] GPR28: c003ec786680 00a0 > > c003ec786300 > > [ 99.687083] NIP [c01359ec] try_to_wake_up+0x17c/0x8f0 > > [ 99.687088] LR [c0135ed4] try_to_wake_up+0x664/0x8f0 > > [ 99.687092] Call Trace: > > [ 99.687095] [c003f25e36a0] [c0135da0] > > try_to_wake_up+0x530/0x8f0 (unreliable) > > [ 99.687104] [c003f25e3730] [c0114ea8] > > create_worker+0x148/0x220 > > [ 99.687110] [c003f25e37d0] [c011a418] > > alloc_unbound_pwq+0x4c8/0x620 > > [ 99.687117] [c003f25e3830] [c011a9c4] > > apply_wqattrs_prepare+0x1f4/0x340 > > [ 99.687123] [c003f25e38a0] [c011ab4c] > > apply_workqueue_attrs_lock
Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host
On Thu, Jun 29, 2017 at 08:27:11PM +1000, Michael Ellerman wrote:
> Eryu Guan <eg...@redhat.com> writes:
>
> > Hi all,
> >
> > Li Wang and I are constantly seeing ppc64le hosts crashing due to bad
> > page access. But it's not reproducing on every ppc64le host we've
> > tested, but it usually happened in filesystem testings.
> >
>
> > And I've confirmed that reverting above commit 'resolves' the crash.
>
> Do you mean ~v4.12-rc7 with that commit reverted still triggers the
> crash?

Correct. I also confirmed that reverting it when it was HEAD also fixed
the crash.

Thanks,
Eryu
Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host
On Thu, Jun 29, 2017 at 06:47:50PM +1000, Balbir Singh wrote:
> On Thu, Jun 29, 2017 at 1:41 PM, Eryu Guan <eg...@redhat.com> wrote:
> > On Thu, Jun 29, 2017 at 03:16:10AM +1000, Balbir Singh wrote:
> >> On Wed, Jun 28, 2017 at 6:32 PM, Eryu Guan <eg...@redhat.com> wrote:
>
> >> Thanks for the excellent bug report, I am a little lost on the stack
> >> trace, it shows a bad page access that we think is triggered by the
> >> mmap changes? The patch changed the return type to integrate the call
> >> into trace-cmd. Could you point me to the tests that can help
> >> reproduce the crash. Could you also suggest how long to try the test
> >> cases for?
> >
> > Sorry, I should have provided it in the first place. It's as simple as
> > mounting an ext4 filesystem on my test ppc64le host, i.e.
> >
> > mkdir -p /mnt/ext4
> > mkfs -t ext4 -F /dev/sda5
> > mount /dev/sda5 /mnt/ext4
> >
>
> I tried this test a few times with the kernel and could not reproduce it.
> Could you please share the config and compiler details, I'll retry with -rc7.
>
> In the meanwhile, does enabling kmemleak, DEBUG_PAGE_ALLOC,
> slub/slab debug, list corruption, etc catch anything at the time of the
> corruption?

Testing with debug kernel (config file attached) didn't trigger kernel
crash, but only warnings

[ 99.686770] [ cut here ]
[ 99.686868] WARNING: CPU: 1 PID: 2272 at ./include/linux/cpumask.h:121 try_to_wake_up+0x17c/0x8f0
[ 99.686873] Modules linked in: ext4 jbd2 mbcache sg pseries_rng ghash_generic gf128mul xts vmx_crypto nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod ibmvscsi ibmveth scsi_transport_srp
[ 99.686950] CPU: 1 PID: 2272 Comm: mount Not tainted 4.12.0-rc7.debug #28
[ 99.686955] task: c003f00b7b00 task.stack: c003f25e
[ 99.686959] NIP: c01359ec LR: c0135ed4 CTR: c016f940
[ 99.686964] REGS: c003f25e3420 TRAP: 0700 Not tainted (4.12.0-rc7.debug)
[ 99.686968] MSR: 80010282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>
[ 99.686994] CR: 28028822 XER: 0001
[ 99.687000] CFAR: c0135cb4 SOFTE: 0
[ 99.687000] GPR00: c0135da0 c003f25e36a0 c1751800 00a0
[ 99.687000] GPR04: 00a0 00c0
[ 99.687000] GPR08: 00a0 41e0
[ 99.687000] GPR12: 8800 cfac0a80 0002 c003fd20b000
[ 99.687000] GPR16: c003cabb0400 0002
[ 99.687000] GPR20: c003f7a59d60 c1326300 c1795d00
[ 99.687000] GPR24: c1799d48 c179a294 c003ec786be8
[ 99.687000] GPR28: c003ec786680 00a0 c003ec786300
[ 99.687083] NIP [c01359ec] try_to_wake_up+0x17c/0x8f0
[ 99.687088] LR [c0135ed4] try_to_wake_up+0x664/0x8f0
[ 99.687092] Call Trace:
[ 99.687095] [c003f25e36a0] [c0135da0] try_to_wake_up+0x530/0x8f0 (unreliable)
[ 99.687104] [c003f25e3730] [c0114ea8] create_worker+0x148/0x220
[ 99.687110] [c003f25e37d0] [c011a418] alloc_unbound_pwq+0x4c8/0x620
[ 99.687117] [c003f25e3830] [c011a9c4] apply_wqattrs_prepare+0x1f4/0x340
[ 99.687123] [c003f25e38a0] [c011ab4c] apply_workqueue_attrs_locked+0x3c/0xa0
[ 99.687130] [c003f25e38d0] [c011b094] apply_workqueue_attrs+0x54/0x90
[ 99.687137] [c003f25e3910] [c011d674] __alloc_workqueue_key+0x184/0x5b0
[ 99.687155] [c003f25e39d0] [d00013dd1768] ext4_fill_super+0x1c68/0x33e0 [ext4]
[ 99.687162] [c003f25e3b10] [c0390f7c] mount_bdev+0x22c/0x260
[ 99.687178] [c003f25e3bb0] [d00013dc9020] ext4_mount+0x20/0x40 [ext4]
[ 99.687184] [c003f25e3bd0] [c03923c4] mount_fs+0x74/0x210
[ 99.687191] [c003f25e3c80] [c03c0688] vfs_kern_mount+0x78/0x220
[ 99.687197] [c003f25e3d00] [c03c6044] do_mount+0x254/0xf70
[ 99.687204] [c003f25e3de0] [c03c7184] SyS_mount+0x94/0x100
[ 99.687210] [c003f25e3e30] [c000b190] system_call+0x38/0xe0
[ 99.687215] Instruction dump:
[ 99.687220] 409d000c 3924 9121002c 387d0018 4803be2d 6000 7fa3eb78 48911321
[ 99.687236] 6000 2fb7 409e0124 480001e0 <0fe0> 7fca3670 7d4a0194 57c906be
[ 99.687252] ---[ end trace e80d5ad75ae4c2a0 ]---
[ 99.691902] EXT4-fs (sda5): mounted filesystem with ordered data mode. Opts: (null)

Thanks,
Eryu

config-ppc64le-debug.bz2
Description: BZip2 compressed data
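When a debug kernel only warns instead of crashing, it can help to reduce the log to the distinct WARNING sites before comparing runs. A rough sketch, assuming the `WARNING: CPU: N PID: N at file:line func+off/len` line format seen in the trace in this message; warn_sites is a hypothetical name:

```shell
#!/bin/sh
# warn_sites: read kernel log text on stdin and print "COUNT file:line func"
# for each distinct WARNING site, most frequent first.
# (Hypothetical helper; the sed pattern is an assumption based on the
# WARNING lines quoted in this thread.)
warn_sites() {
	sed -n 's/.*WARNING: CPU: [0-9]* PID: [0-9]* at \([^ ]*\) \([^ +]*\)[+].*/\1 \2/p' |
		awk '{ n[$0]++ } END { for (s in n) print n[s], s }' |
		sort -rn
}

# Usage on a live system:
#   dmesg | warn_sites
```

Lines that do not match the pattern are dropped silently, so an empty result means either no warnings or a log format this sketch does not cover.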
Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host
On Thu, Jun 29, 2017 at 06:47:50PM +1000, Balbir Singh wrote:
> On Thu, Jun 29, 2017 at 1:41 PM, Eryu Guan <eg...@redhat.com> wrote:
> > On Thu, Jun 29, 2017 at 03:16:10AM +1000, Balbir Singh wrote:
> >> On Wed, Jun 28, 2017 at 6:32 PM, Eryu Guan <eg...@redhat.com> wrote:
>
> >> Thanks for the excellent bug report, I am a little lost on the stack
> >> trace, it shows a bad page access that we think is triggered by the
> >> mmap changes? The patch changed the return type to integrate the call
> >> into trace-cmd. Could you point me to the tests that can help
> >> reproduce the crash. Could you also suggest how long to try the test
> >> cases for?
> >
> > Sorry, I should have provided it in the first place. It's as simple as
> > mounting an ext4 filesystem on my test ppc64le host, i.e.
> >
> > mkdir -p /mnt/ext4
> > mkfs -t ext4 -F /dev/sda5
> > mount /dev/sda5 /mnt/ext4
> >
>
> I tried this test a few times with the kernel and could not reproduce it.

Yes, it's not reproduced on every host, I'm not sure what makes my test
host so unique yet.

> Could you please share the config and compiler details, I'll retry with -rc7.

[root@ibm-p8-03-lp6 ~]# gcc --version
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[root@ibm-p8-03-lp6 ~]# rpm -q gcc
gcc-4.8.5-16.el7.ppc64le

I attached kernel config file.

>
> In the meanwhile, does enabling kmemleak, DEBUG_PAGE_ALLOC,
> slub/slab debug, list corruption, etc catch anything at the time of the
> corruption?

OK, I'll retry with a debug kernel and report back.

Thanks,
Eryu

config-ppc64le.bz2
Description: BZip2 compressed data
Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host
On Thu, Jun 29, 2017 at 03:16:10AM +1000, Balbir Singh wrote: > On Wed, Jun 28, 2017 at 6:32 PM, Eryu Guan <eg...@redhat.com> wrote: > > Hi all, > > > > Li Wang and I are constantly seeing ppc64le hosts crashing due to bad > > page access. But it's not reproducing on every ppc64le host we've > > tested, but it usually happened in filesystem testings. > > > > [ 207.403459] Unable to handle kernel paging request for unaligned access > > at address 0xc001c52c5e7f > > [ 207.403470] Faulting instruction address: 0xc04d470c > > [ 207.403475] Oops: Kernel access of bad area, sig: 7 [#1] > > [ 207.403477] SMP NR_CPUS=2048 > > [ 207.403478] NUMA > > [ 207.403480] pSeries > > [ 207.403483] Modules linked in: ext4 jbd2 mbcache sg pseries_rng > > ghash_generic gf128mul xts vmx_crypto nfsd auth_rpcgss nfs_acl lockd grace > > sunrpc ip_tables xfs libcrc32c sd_mod ibmveth ibmvscsi scsi_transport_srp > > [ 207.403503] CPU: 0 PID: 2263 Comm: mount Not tainted 4.12.0-rc7 #26 > > [ 207.403506] task: c003ef2fde00 task.stack: c003de394000 > > [ 207.403509] NIP: c04d470c LR: c011cd24 CTR: > > c0130de0 > > [ 207.403512] REGS: c003de397450 TRAP: 0600 Not tainted (4.12.0-rc7) > > [ 207.403515] MSR: 80010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> > > [ 207.403521] CR: 28028844 XER: 0001 > > [ 207.403525] CFAR: c011cd20 DAR: c001c52c5e7f DSISR: > > SOFTE: 0 > > [ 207.403525] GPR00: c011cce8 c003de3976d0 c1049500 > > c003f2c6ec20 > > [ 207.403525] GPR04: c003f2c6ec20 c001c52c5e7f > > 0001 > > [ 207.403525] GPR08: 000c5543cab19830 000198e19900 0008 > > > > [ 207.403525] GPR12: c0130de0 cfac > > c003f1328000 > > [ 207.403525] GPR16: c003de700400 > > c003de700594 > > [ 207.403525] GPR20: 0002 4000 > > c0cc5780 > > [ 207.403525] GPR24: 0001c45ffc5f 0001c45ffc5f > > c107dd00 > > [ 207.403525] GPR28: c003f2c6f434 0004 0800 > > c003f2c6ec00 > > [ 207.403567] NIP [c04d470c] llist_add_batch+0xc/0x40 > > [ 207.403571] LR [c011cd24] try_to_wake_up+0x4a4/0x5b0 > > [ 207.403573] Call Trace: > > [ 
207.403576] [c003de3976d0] [c011cce8] > > try_to_wake_up+0x468/0x5b0 (unreliable) > > [ 207.403581] [c003de397750] [c0102cc8] > > create_worker+0x148/0x250 > > [ 207.403585] [c003de3977f0] [c0105e7c] > > alloc_unbound_pwq+0x3bc/0x4c0 > > [ 207.403589] [c003de397850] [c01064bc] > > apply_wqattrs_prepare+0x2ac/0x320 > > [ 207.403593] [c003de3978c0] [c010656c] > > apply_workqueue_attrs_locked+0x3c/0xa0 > > [ 207.403597] [c003de3978f0] [c0106acc] > > apply_workqueue_attrs+0x4c/0x80 > > [ 207.403601] [c003de397930] [c010866c] > > __alloc_workqueue_key+0x16c/0x4e0 > > [ 207.403615] [c003de3979f0] [d00013de5ce0] > > ext4_fill_super+0x1c70/0x3390 [ext4] > > [ 207.403620] [c003de397b30] [c031739c] mount_bdev+0x21c/0x250 > > [ 207.403633] [c003de397bd0] [d00013dddb80] ext4_mount+0x20/0x40 > > [ext4] > > [ 207.403637] [c003de397bf0] [c0318944] mount_fs+0x74/0x210 > > [ 207.403641] [c003de397ca0] [c0340638] > > vfs_kern_mount+0x68/0x1d0 > > [ 207.403644] [c003de397d10] [c0345348] do_mount+0x278/0xef0 > > [ 207.403648] [c003de397de0] [c03463e4] SyS_mount+0x94/0x100 > > [ 207.403652] [c003de397e30] [c000af84] system_call+0x38/0xe0 > > [ 207.403655] Instruction dump: > > [ 207.403658] 6042 3860 4e800020 6000 6042 7c832378 > > 4e800020 6000 > > [ 207.403663] 6000 e925 f924 7c0004ac <7d4028a8> 7c2a4800 > > 40c20010 7c6029ad > > [ 207.403669] ---[ end trace 4fa94bf890f28f69 ]--- > > > > Today I've finally found a host that could reliably trigger the crash by > > mounting an ext4 filesystem and I've done a git bisect. The first bad > > pointed to this commit: > > Thanks for the excellent bug report, I am a little lost on the stack > trace, it shows a bad page access that we think is triggered by the > mmap changes? The patch changed the return type to integrate the call > into trace-cmd. Could you point me to
[v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host
Hi all,

Li Wang and I are constantly seeing ppc64le hosts crashing due to bad
page access. But it's not reproducing on every ppc64le host we've
tested, but it usually happened in filesystem testings.

[ 207.403459] Unable to handle kernel paging request for unaligned access at address 0xc001c52c5e7f
[ 207.403470] Faulting instruction address: 0xc04d470c
[ 207.403475] Oops: Kernel access of bad area, sig: 7 [#1]
[ 207.403477] SMP NR_CPUS=2048
[ 207.403478] NUMA
[ 207.403480] pSeries
[ 207.403483] Modules linked in: ext4 jbd2 mbcache sg pseries_rng ghash_generic gf128mul xts vmx_crypto nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod ibmveth ibmvscsi scsi_transport_srp
[ 207.403503] CPU: 0 PID: 2263 Comm: mount Not tainted 4.12.0-rc7 #26
[ 207.403506] task: c003ef2fde00 task.stack: c003de394000
[ 207.403509] NIP: c04d470c LR: c011cd24 CTR: c0130de0
[ 207.403512] REGS: c003de397450 TRAP: 0600 Not tainted (4.12.0-rc7)
[ 207.403515] MSR: 80010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>
[ 207.403521] CR: 28028844 XER: 0001
[ 207.403525] CFAR: c011cd20 DAR: c001c52c5e7f DSISR: SOFTE: 0
[ 207.403525] GPR00: c011cce8 c003de3976d0 c1049500 c003f2c6ec20
[ 207.403525] GPR04: c003f2c6ec20 c001c52c5e7f 0001
[ 207.403525] GPR08: 000c5543cab19830 000198e19900 0008
[ 207.403525] GPR12: c0130de0 cfac c003f1328000
[ 207.403525] GPR16: c003de700400 c003de700594
[ 207.403525] GPR20: 0002 4000 c0cc5780
[ 207.403525] GPR24: 0001c45ffc5f 0001c45ffc5f c107dd00
[ 207.403525] GPR28: c003f2c6f434 0004 0800 c003f2c6ec00
[ 207.403567] NIP [c04d470c] llist_add_batch+0xc/0x40
[ 207.403571] LR [c011cd24] try_to_wake_up+0x4a4/0x5b0
[ 207.403573] Call Trace:
[ 207.403576] [c003de3976d0] [c011cce8] try_to_wake_up+0x468/0x5b0 (unreliable)
[ 207.403581] [c003de397750] [c0102cc8] create_worker+0x148/0x250
[ 207.403585] [c003de3977f0] [c0105e7c] alloc_unbound_pwq+0x3bc/0x4c0
[ 207.403589] [c003de397850] [c01064bc] apply_wqattrs_prepare+0x2ac/0x320
[ 207.403593] [c003de3978c0] [c010656c] apply_workqueue_attrs_locked+0x3c/0xa0
[ 207.403597] [c003de3978f0] [c0106acc] apply_workqueue_attrs+0x4c/0x80
[ 207.403601] [c003de397930] [c010866c] __alloc_workqueue_key+0x16c/0x4e0
[ 207.403615] [c003de3979f0] [d00013de5ce0] ext4_fill_super+0x1c70/0x3390 [ext4]
[ 207.403620] [c003de397b30] [c031739c] mount_bdev+0x21c/0x250
[ 207.403633] [c003de397bd0] [d00013dddb80] ext4_mount+0x20/0x40 [ext4]
[ 207.403637] [c003de397bf0] [c0318944] mount_fs+0x74/0x210
[ 207.403641] [c003de397ca0] [c0340638] vfs_kern_mount+0x68/0x1d0
[ 207.403644] [c003de397d10] [c0345348] do_mount+0x278/0xef0
[ 207.403648] [c003de397de0] [c03463e4] SyS_mount+0x94/0x100
[ 207.403652] [c003de397e30] [c000af84] system_call+0x38/0xe0
[ 207.403655] Instruction dump:
[ 207.403658] 6042 3860 4e800020 6000 6042 7c832378 4e800020 6000
[ 207.403663] 6000 e925 f924 7c0004ac <7d4028a8> 7c2a4800 40c20010 7c6029ad
[ 207.403669] ---[ end trace 4fa94bf890f28f69 ]---

Today I've finally found a host that could reliably trigger the crash by
mounting an ext4 filesystem and I've done a git bisect. The first bad
pointed to this commit:

commit 9c355917fcf006af47ffaa5ae43a1a804764a6f6
Author: Balbir Singh
Date:   Wed Apr 12 16:35:19 2017 +1000

    powerpc/tracing: Allow tracing of mmap syscalls

    Currently sys_mmap() and sys_mmap2() (32-bit only), are not visible to the
    syscall tracing machinery. This means users are not able to see the
    execution of mmap() syscalls using the syscall tracer.

    Fix that by using SYSCALL_DEFINE6 for sys_mmap() and sys_mmap2() so that
    the meta-data associated with these syscalls is visible to the syscall
    tracer.

    A side-effect of this change is that the return type has changed from
    unsigned long to long. However this should have no effect, the only code in
    the kernel which uses the result of these syscalls is in the syscall return
    path, which is written in asm and treats the result as unsigned regardless.

    Example output:
        cat-3399 [001] 196.542410: sys_mmap(addr: 7fff922a, len: 2, prot: 3, flags: 812, fd: 3, offset: 1b)
        cat-3399 [001] 196.542443: sys_mmap -> 0x7fff922a
        cat-3399 [001] 196.542668: sys_munmap(addr: 7fff922c, len: 6d2c)
        cat-3399 [001]