Re: kworker with empty task->cpus_allowed (was Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host)

2017-07-04 Thread Eryu Guan
On Tue, Jul 04, 2017 at 09:06:55PM +1000, Michael Ellerman wrote:
> Eryu Guan <eg...@redhat.com> writes:
> 
> > On Tue, Jul 04, 2017 at 04:26:11PM +1000, Michael Ellerman wrote:
> >> Eryu Guan <eg...@redhat.com> writes:
> >> > On Fri, Jun 30, 2017 at 08:07:02PM +1000, Michael Ellerman wrote:
> >> >> 
> >> >> Can you try this patch and see if it changes anything? (with the debug
> >> >> still applied).
> >> >
> >> > This patch fixes the crash for me. After applying this patch (with all
> >> > other debug patches still applied), the kernel didn't print any warnings,
> >> > call traces or debug messages.
> >> 
> >> OK. It's not meant to fix it :)
> >
> > Understood.
> >
> >> 
> >> I can't form any connection between your bisection result and that
> >> patch, nothing is making any sense TBH.
> >> 
> >> What hardware are you on? And are you doing CPU hotplug or anything like 
> >> that?
> >
> > It's a "PowerVM" guest (I'm not familiar with powerpc, so I don't know
> > what that means) running on a Power8 host. I didn't do any CPU hotplug or
> > anything like that.
> 
> OK thanks.
> 
> We might have to try and sync up on irc so we can debug this a bit faster.

Sure, where can I find you? I'm in #xfs on freenode, nick eguan. It will
have to be tomorrow though; I have to take off for today.

> 
> Can you try this hunk also?

This new WARN_ON didn't trigger. (I've omitted the other warning messages;
they're the same warnings as in my last reply.)

Thanks,
Eryu

> 
> cheers
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index c74bf39ef764..7c55721b1f1d 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -3902,6 +3906,7 @@ static int alloc_and_link_pwqs(struct workqueue_struct *wq)
>  			      "ordering guarantee broken for workqueue %s\n", wq->name);
>  		return ret;
>  	} else {
> +		WARN_ON(cpumask_empty(unbound_std_wq_attrs[highpri]->cpumask));
>  		return apply_workqueue_attrs(wq, unbound_std_wq_attrs[highpri]);
>  	}
>  }
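
The WARN_ON in the hunk above guards against an empty cpumask. As a rough
userspace illustration of why that matters (not kernel code; every sketch_*
name and the 16-bit mask size are made up for this example): a "first set
bit" search over an empty mask has nothing to return, so it hands back the
mask size, which is not a valid CPU index. The kernel's cpumask_first()
behaves similarly, returning a value >= nr_cpu_ids for an empty mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mask width; a stand-in for the kernel's nr_cpu_ids. */
#define SKETCH_NR_CPUS 16u

/* Userspace stand-in for a cpumask: one bit per CPU. */
typedef struct { uint32_t bits; } sketch_cpumask;

/* Mark one CPU as allowed. */
static void sketch_mask_set(sketch_cpumask *m, unsigned int cpu)
{
    assert(cpu < SKETCH_NR_CPUS);
    m->bits |= UINT32_C(1) << cpu;
}

/* True when no CPU bit is set, which is what the WARN_ON tests for. */
static bool sketch_mask_empty(const sketch_cpumask *m)
{
    return m->bits == 0;
}

/* First allowed CPU, or SKETCH_NR_CPUS (out of range) if the mask is empty. */
static unsigned int sketch_mask_first(const sketch_cpumask *m)
{
    for (unsigned int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
        if (m->bits & (UINT32_C(1) << cpu))
            return cpu;
    return SKETCH_NR_CPUS;
}
```

A caller that takes the result of sketch_mask_first() on an empty mask and
uses it as a CPU number without checking ends up with an out-of-range index.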
> 
> 


Re: kworker with empty task->cpus_allowed (was Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host)

2017-07-04 Thread Eryu Guan
On Tue, Jul 04, 2017 at 04:26:11PM +1000, Michael Ellerman wrote:
> Eryu Guan <eg...@redhat.com> writes:
> > On Fri, Jun 30, 2017 at 08:07:02PM +1000, Michael Ellerman wrote:
> >> 
> >> Can you try this patch and see if it changes anything? (with the debug
> >> still applied).
> >
> > This patch fixes the crash for me. After applying this patch (with all
> > other debug patches still applied), the kernel didn't print any warnings,
> > call traces or debug messages.
> 
> OK. It's not meant to fix it :)

Understood.

> 
> I can't form any connection between your bisection result and that
> patch, nothing is making any sense TBH.
> 
> What hardware are you on? And are you doing CPU hotplug or anything like that?

It's a "PowerVM" guest (I'm not familiar with powerpc, so I don't know what
that means) running on a Power8 host. I didn't do any CPU hotplug or
anything like that.

lscpu output:
Architecture:          ppc64le
Byte Order:            Little Endian
CPU(s):                16
On-line CPU(s) list:   0-15
Thread(s) per core:    8
Core(s) per socket:    1
Socket(s):             2
NUMA node(s):          3
Model:                 2.1 (pvr 004b 0201)
Model name:            POWER8 (architected), altivec supported
Hypervisor vendor:     (null)
Virtualization type:   full
L1d cache:             64K
L1i cache:             32K
NUMA node0 CPU(s):     0-7
NUMA node2 CPU(s):     8-15
NUMA node3 CPU(s):
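
Note that the lscpu output lists NUMA node3 with no CPUs. As a small
userspace sketch of what that topology looks like when per-node CPU masks are
built from a cpu-to-node table (illustration only; the sketch_* names and
array sizes are hypothetical, and this makes no claim about where the kernel
goes wrong):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_CPUS  16
#define SKETCH_NR_NODES 4   /* room for node ids 0..3 */

/* Fill the table as lscpu reports it: CPUs 0-7 on node 0, 8-15 on node 2. */
static void sketch_fill_cpu_to_node(int cpu_to_node[SKETCH_NR_CPUS])
{
    for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
        cpu_to_node[cpu] = (cpu < 8) ? 0 : 2;
}

/* Build one CPU bitmask per node from the cpu-to-node table. */
static void sketch_build_node_masks(const int cpu_to_node[SKETCH_NR_CPUS],
                                    uint32_t node_mask[SKETCH_NR_NODES])
{
    for (int node = 0; node < SKETCH_NR_NODES; node++)
        node_mask[node] = 0;

    for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
        int node = cpu_to_node[cpu];
        assert(node >= 0 && node < SKETCH_NR_NODES);
        node_mask[node] |= UINT32_C(1) << cpu;
    }
}

/* A node whose mask stayed zero has no CPUs to run work on. */
static bool sketch_node_has_cpus(const uint32_t node_mask[SKETCH_NR_NODES],
                                 int node)
{
    return node_mask[node] != 0;
}
```

With this table, nodes 0 and 2 get eight CPUs each and node 3 ends up with an
empty mask, so any code that picks a CPU "from node 3" has to handle that case.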

> 
> Can you back out the last patch I sent and try this?

I've appended the call traces from the test below, and I've also attached the
full dmesg log, which includes the boot log.

[   74.410871] ------------[ cut here ]------------
[   74.410895] WARNING: CPU: 0 PID: 2378 at kernel/workqueue.c:3346 alloc_unbound_pwq+0x320/0x690
[   74.410901] Modules linked in: ext4 jbd2 mbcache sg pseries_rng ghash_generic gf128mul xts vmx_crypto nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod ibmvscsi ibmveth scsi_transport_srp
[   74.410949] CPU: 0 PID: 2378 Comm: mount Not tainted 4.12.0.debug+ #35
[   74.410954] task: c003f0447280 task.stack: c003f039c000
[   74.410959] NIP: c011a310 LR: c011a300 CTR: c011a1e4
[   74.410963] REGS: c003f039f550 TRAP: 0700   Not tainted  (4.12.0.debug+)
[   74.410968] MSR: 80010282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>
[   74.410993]   CR: 2402  XER: 0001
[   74.410998] CFAR: c0581584 SOFTE: 1
[   74.410998] GPR00: c011a590 c003f039f7d0 c1751800 
0001
[   74.410998] GPR04: 00a0 00c0  

[   74.410998] GPR08:    
0030
[   74.410998] GPR12: 0001 cfac 0002 
c003fd237000
[   74.410998] GPR16: c003d1a10400   
0002
[   74.410998] GPR20:  c003cb7ac560 c003fd0387a0 
c179a294
[   74.410998] GPR24: c003cb7ac400 c003f02349c0 00a0 
c003f0234a00
[   74.410998] GPR28: 6ca6897b c003cb7ac400 c179a294 

[   74.411082] NIP [c011a310] alloc_unbound_pwq+0x320/0x690
[   74.411087] LR [c011a300] alloc_unbound_pwq+0x310/0x690
[   74.411091] Call Trace:
[   74.411095] [c003f039f7d0] [c011a590] alloc_unbound_pwq+0x5a0/0x690 (unreliable)
[   74.411103] [c003f039f830] [c011aad4] apply_wqattrs_prepare+0x1f4/0x340
[   74.411113] [c003f039f8a0] [c011ac5c] apply_workqueue_attrs_locked+0x3c/0xa0
[   74.411120] [c003f039f8d0] [c011b1a4] apply_workqueue_attrs+0x54/0x90
[   74.411127] [c003f039f910] [c011d774] __alloc_workqueue_key+0x184/0x5b0
[   74.411145] [c003f039f9d0] [d00015211768] ext4_fill_super+0x1c68/0x33e0 [ext4]
[   74.411152] [c003f039fb10] [c03910fc] mount_bdev+0x22c/0x260
[   74.411168] [c003f039fbb0] [d00015209020] ext4_mount+0x20/0x40 [ext4]
[   74.411174] [c003f039fbd0] [c0392544] mount_fs+0x74/0x210
[   74.411181] [c003f039fc80] [c03c0808] vfs_kern_mount+0x78/0x220
[   74.411188] [c003f039fd00] [c03c61c4] do_mount+0x254/0xf70
[   74.411194] [c003f039fde0] [c03c7304] SyS_mount+0x94/0x100
[   74.411201] [c003f039fe30] [c000b190] system_call+0x38/0xe0
[   74.411206] Instruction dump:
[   74.411211] 554ac03e 7f8ae050 7b9c0020 2fac 409e0290 7f44d378 38a0 
484672cd
[   74.411227] 6000 7c63d278 7c630074 7863d182 <0b03> 3ca061c8 3f42001e 
60a58647
[   74.411243] ---[ end trace b720011b125c3341 ]---
[   74.411253] ------------[ cut here ]------------
[   74.411258] WARNING: CPU: 0 PID: 2378 at kernel/workqueue.c:3376 alloc_unbound_pwq+0x4b0/0x690
[   74.411262] Modules linked in: ext4 jbd2 mbcache sg pseries_rng 
ghash_generic g

Re: kworker with empty task->cpus_allowed (was Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host)

2017-06-30 Thread Eryu Guan
On Fri, Jun 30, 2017 at 08:07:02PM +1000, Michael Ellerman wrote:
> Eryu Guan <eg...@redhat.com> writes:
> >
> > I have to update the patch a bit to make it compile.
> 
> Sure.
> 
> >> +  WARN_ON(cpumask_empty(worker->task->cpus_allowed));
> >> +  WARN_ON(cpumask_empty(pool->attrs->cpumask));
> >
> > Seems only the last two WARN_ON were triggered.
> 
> OK thanks.
> 
> Can you try this patch and see if it changes anything? (with the debug
> still applied).

This patch fixes the crash for me. After applying this patch (with all
other debug patches still applied), the kernel didn't print any warnings,
call traces or debug messages.

> 
> We've been trying to reproduce the bug here but haven't had any luck so far.

I'm using this reproducer:
for i in `seq 5`; do
	mkfs -t ext4 -F /dev/sda5 && sleep 3 && \
		mount /dev/sda5 /mnt/ext4 && umount /dev/sda5
done

Thanks,
Eryu

> 
> cheers
> 
> diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
> index 4640f6d64f8b..b310ecc07e00 100644
> --- a/arch/powerpc/kernel/setup_64.c
> +++ b/arch/powerpc/kernel/setup_64.c
> @@ -733,6 +733,8 @@ void __init setup_per_cpu_areas(void)
>  	for_each_possible_cpu(cpu) {
>  		__per_cpu_offset[cpu] = delta + pcpu_unit_offsets[cpu];
>  		paca[cpu].data_offset = __per_cpu_offset[cpu];
> +
> +		set_cpu_numa_node(cpu, numa_cpu_lookup_table[cpu]);
>  	}
>  }
>  #endif


Re: kworker with empty task->cpus_allowed (was Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host)

2017-06-29 Thread Eryu Guan
On Thu, Jun 29, 2017 at 10:06:31PM +1000, Michael Ellerman wrote:
> Eryu Guan <eg...@redhat.com> writes:
> 
> > On Thu, Jun 29, 2017 at 09:12:55PM +1000, Michael Ellerman wrote:
> >> Eryu Guan <eg...@redhat.com> writes:
> >> 
> >> > On Thu, Jun 29, 2017 at 06:47:50PM +1000, Balbir Singh wrote:
> >> >> On Thu, Jun 29, 2017 at 1:41 PM, Eryu Guan <eg...@redhat.com> wrote:
> >> >> > On Thu, Jun 29, 2017 at 03:16:10AM +1000, Balbir Singh wrote:
> >> >> >> On Wed, Jun 28, 2017 at 6:32 PM, Eryu Guan <eg...@redhat.com> wrote:
> >> >> 
> >> >> >> Thanks for the excellent bug report, I am a little lost on the stack
> >> >> >> trace, it shows a bad page access that we think is triggered by the
> >> >> >> mmap changes? The patch changed the return type to integrate the call
> >> >> >> into trace-cmd. Could you point me to the tests that can help
> >> >> >> reproduce the crash. Could you also suggest how long to try the test
> >> >> >> cases for?
> >> >> >
> >> >> > Sorry, I should have provided it in the first place. It's as simple as
> >> >> > mounting an ext4 filesystem on my test ppc64le host, i.e.
> >> >> >
> >> >> > mkdir -p /mnt/ext4
> >> >> > mkfs -t ext4 -F /dev/sda5
> >> >> > mount /dev/sda5 /mnt/ext4
> >> >> 
> >> >> I tried this test a few times with the kernel and could not reproduce 
> >> >> it.
> >> >> Could you please share the config and compiler details, I'll retry with 
> >> >> -rc7.
> >> >> 
> >> >> In the meanwhile, does enabling kmemleak, DEBUG_PAGE_ALLOC,
> >> >> slub/slab debug, list corruption, etc catch anything at the time of the
> >> >> corruption?
> >> >
> >> > Testing with a debug kernel (config file attached) didn't trigger a
> >> > kernel crash, only warnings:
> >> 
> >> But the warning says try_to_wake_up() is using a CPU number that's out
> >> of bounds, which means when you lookup the runqueue for that CPU you
> >> just get junk, and that's what was triggering the crash in your previous
> >> report.
> >> 
> >> So at least that part of the mystery is solved.
> >> 
> >> > [   99.686770] [ cut here ]
> >> > [   99.686868] WARNING: CPU: 1 PID: 2272 at 
> >> > ./include/linux/cpumask.h:121 try_to_wake_up+0x17c/0x8f0
> >> 
> >> static inline unsigned int cpumask_check(unsigned int cpu)
> >> {
> >> #ifdef CONFIG_DEBUG_PER_CPU_MAPS
> >>WARN_ON_ONCE(cpu >= nr_cpumask_bits);
> >> #endif /* CONFIG_DEBUG_PER_CPU_MAPS */
> >>return cpu;
> >> }
> >> 
> >> > [   99.686873] Modules linked in: ext4 jbd2 mbcache sg pseries_rng 
> >> > ghash_generic gf128mul xts vmx_crypto nfsd auth_rpcgss nfs_acl lockd 
> >> > grace sunrpc ip_tables xfs libcrc32c sd_mod ibmvscsi ibmveth 
> >> > scsi_transport_srp
> >> > [   99.686950] CPU: 1 PID: 2272 Comm: mount Not tainted 4.12.0-rc7.debug 
> >> > #28
> >> > [   99.686955] task: c003f00b7b00 task.stack: c003f25e
> >> > [   99.686959] NIP: c01359ec LR: c0135ed4 CTR: 
> >> > c016f940
> >> > [   99.686964] REGS: c003f25e3420 TRAP: 0700   Not tainted  
> >> > (4.12.0-rc7.debug)
> >> > [   99.686968] MSR: 80010282b033 
> >> > <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>
> >> > [   99.686994]   CR: 28028822  XER: 0001
> >> > [   99.687000] CFAR: c0135cb4 SOFTE: 0
> >> > [   99.687000] GPR00: c0135da0 c003f25e36a0 c1751800 
> >> > 00a0
> >> > [   99.687000] GPR04: 00a0 00c0  
> >> > 
> >> > [   99.687000] GPR08:  00a0  
> >> > 41e0
> >> > [   99.687000] GPR12: 8800 cfac0a80 0002 
> >> > c003fd20b000
> >> > [   99.687000] GPR16: c003cabb0400   
> >> > 0002
> >> > [   99.687000] GPR20:  c003f7a59d60 c1326300 
> >> > c1795d00
> >> > [   99.687000] GPR24: c000

Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host

2017-06-29 Thread Eryu Guan
On Thu, Jun 29, 2017 at 09:12:55PM +1000, Michael Ellerman wrote:
> Eryu Guan <eg...@redhat.com> writes:
> 
> > On Thu, Jun 29, 2017 at 06:47:50PM +1000, Balbir Singh wrote:
> >> On Thu, Jun 29, 2017 at 1:41 PM, Eryu Guan <eg...@redhat.com> wrote:
> >> > On Thu, Jun 29, 2017 at 03:16:10AM +1000, Balbir Singh wrote:
> >> >> On Wed, Jun 28, 2017 at 6:32 PM, Eryu Guan <eg...@redhat.com> wrote:
> >> 
> >> >> Thanks for the excellent bug report, I am a little lost on the stack
> >> >> trace, it shows a bad page access that we think is triggered by the
> >> >> mmap changes? The patch changed the return type to integrate the call
> >> >> into trace-cmd. Could you point me to the tests that can help
> >> >> reproduce the crash. Could you also suggest how long to try the test
> >> >> cases for?
> >> >
> >> > Sorry, I should have provided it in the first place. It's as simple as
> >> > mounting an ext4 filesystem on my test ppc64le host, i.e.
> >> >
> >> > mkdir -p /mnt/ext4
> >> > mkfs -t ext4 -F /dev/sda5
> >> > mount /dev/sda5 /mnt/ext4
> >> 
> >> I tried this test a few times with the kernel and could not reproduce it.
> >> Could you please share the config and compiler details, I'll retry with 
> >> -rc7.
> >> 
> >> In the meanwhile, does enabling kmemleak, DEBUG_PAGE_ALLOC,
> >> slub/slab debug, list corruption, etc catch anything at the time of the
> >> corruption?
> >
> > Testing with a debug kernel (config file attached) didn't trigger a
> > kernel crash, only warnings:
> 
> But the warning says try_to_wake_up() is using a CPU number that's out
> of bounds, which means when you lookup the runqueue for that CPU you
> just get junk, and that's what was triggering the crash in your previous
> report.
> 
> So at least that part of the mystery is solved.
> 
> > [   99.686770] [ cut here ]
> > [   99.686868] WARNING: CPU: 1 PID: 2272 at ./include/linux/cpumask.h:121 
> > try_to_wake_up+0x17c/0x8f0
> 
> static inline unsigned int cpumask_check(unsigned int cpu)
> {
> #ifdef CONFIG_DEBUG_PER_CPU_MAPS
>   WARN_ON_ONCE(cpu >= nr_cpumask_bits);
> #endif /* CONFIG_DEBUG_PER_CPU_MAPS */
>   return cpu;
> }
> 
> > [   99.686873] Modules linked in: ext4 jbd2 mbcache sg pseries_rng 
> > ghash_generic gf128mul xts vmx_crypto nfsd auth_rpcgss nfs_acl lockd grace 
> > sunrpc ip_tables xfs libcrc32c sd_mod ibmvscsi ibmveth scsi_transport_srp
> > [   99.686950] CPU: 1 PID: 2272 Comm: mount Not tainted 4.12.0-rc7.debug #28
> > [   99.686955] task: c003f00b7b00 task.stack: c003f25e
> > [   99.686959] NIP: c01359ec LR: c0135ed4 CTR: 
> > c016f940
> > [   99.686964] REGS: c003f25e3420 TRAP: 0700   Not tainted  
> > (4.12.0-rc7.debug)
> > [   99.686968] MSR: 80010282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>
> > [   99.686994]   CR: 28028822  XER: 0001
> > [   99.687000] CFAR: c0135cb4 SOFTE: 0
> > [   99.687000] GPR00: c0135da0 c003f25e36a0 c1751800 
> > 00a0
> > [   99.687000] GPR04: 00a0 00c0  
> > 
> > [   99.687000] GPR08:  00a0  
> > 41e0
> > [   99.687000] GPR12: 8800 cfac0a80 0002 
> > c003fd20b000
> > [   99.687000] GPR16: c003cabb0400   
> > 0002
> > [   99.687000] GPR20:  c003f7a59d60 c1326300 
> > c1795d00
> > [   99.687000] GPR24: c1799d48  c179a294 
> > c003ec786be8
> > [   99.687000] GPR28:  c003ec786680 00a0 
> > c003ec786300
> > [   99.687083] NIP [c01359ec] try_to_wake_up+0x17c/0x8f0
> > [   99.687088] LR [c0135ed4] try_to_wake_up+0x664/0x8f0
> > [   99.687092] Call Trace:
> > [   99.687095] [c003f25e36a0] [c0135da0] 
> > try_to_wake_up+0x530/0x8f0 (unreliable)
> > [   99.687104] [c003f25e3730] [c0114ea8] 
> > create_worker+0x148/0x220
> > [   99.687110] [c003f25e37d0] [c011a418] 
> > alloc_unbound_pwq+0x4c8/0x620
> > [   99.687117] [c003f25e3830] [c011a9c4] 
> > apply_wqattrs_prepare+0x1f4/0x340
> > [   99.687123] [c003f25e38a0] [c011ab4c] 
> > apply_workqueue_attrs_lock
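
The point made above, that an out-of-bounds CPU number means the runqueue
lookup "just gets junk", can be sketched in userspace as follows. This is an
illustration only: the sketch_* names and the array size are invented, and
unlike the quoted cpumask_check(), which only warns (under
CONFIG_DEBUG_PER_CPU_MAPS) and returns the CPU unchanged, this lookup refuses
the bad index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical number of per-CPU slots in this sketch. */
#define SKETCH_NR_CPUS 16u

/* Stand-in for a per-CPU runqueue. */
struct sketch_rq {
    unsigned int nr_running;
};

static struct sketch_rq sketch_runqueues[SKETCH_NR_CPUS];

/*
 * Look up the runqueue for a CPU. Indexing the array with cpu >=
 * SKETCH_NR_CPUS would read past its end, so reject it instead.
 */
static struct sketch_rq *sketch_cpu_rq(unsigned int cpu)
{
    if (cpu >= SKETCH_NR_CPUS)
        return NULL;
    return &sketch_runqueues[cpu];
}

/* Queue one task on the given CPU; the caller must pass a valid CPU. */
static void sketch_wake_on(unsigned int cpu)
{
    struct sketch_rq *rq = sketch_cpu_rq(cpu);

    assert(rq != NULL);
    rq->nr_running++;
}
```

Without the range check, sketch_cpu_rq(160) would compute an address well
outside sketch_runqueues, and whatever happened to live there would be
treated as a runqueue.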

Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host

2017-06-29 Thread Eryu Guan
On Thu, Jun 29, 2017 at 08:27:11PM +1000, Michael Ellerman wrote:
> Eryu Guan <eg...@redhat.com> writes:
> 
> > Hi all,
> >
> > Li Wang and I are constantly seeing ppc64le hosts crashing due to bad
> > page access. It doesn't reproduce on every ppc64le host we've tested,
> > but it usually happens during filesystem testing.
> 
> 
> 
> > And I've confirmed that reverting above commit 'resolves' the crash.
> 
> Do you mean ~v4.12-rc7 with that commit reverted still triggers the
> crash?

Correct. I also confirmed that reverting it when it was HEAD fixed the
crash.

Thanks,
Eryu


Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host

2017-06-29 Thread Eryu Guan
On Thu, Jun 29, 2017 at 06:47:50PM +1000, Balbir Singh wrote:
> On Thu, Jun 29, 2017 at 1:41 PM, Eryu Guan <eg...@redhat.com> wrote:
> > On Thu, Jun 29, 2017 at 03:16:10AM +1000, Balbir Singh wrote:
> >> On Wed, Jun 28, 2017 at 6:32 PM, Eryu Guan <eg...@redhat.com> wrote:
> 
> >> Thanks for the excellent bug report, I am a little lost on the stack
> >> trace, it shows a bad page access that we think is triggered by the
> >> mmap changes? The patch changed the return type to integrate the call
> >> into trace-cmd. Could you point me to the tests that can help
> >> reproduce the crash. Could you also suggest how long to try the test
> >> cases for?
> >
> > Sorry, I should have provided it in the first place. It's as simple as
> > mounting an ext4 filesystem on my test ppc64le host, i.e.
> >
> > mkdir -p /mnt/ext4
> > mkfs -t ext4 -F /dev/sda5
> > mount /dev/sda5 /mnt/ext4
> >
> 
> I tried this test a few times with the kernel and could not reproduce it.
> Could you please share the config and compiler details, I'll retry with -rc7.
> 
> In the meanwhile, does enabling kmemleak, DEBUG_PAGE_ALLOC,
> slub/slab debug, list corruption, etc catch anything at the time of the
> corruption?

Testing with a debug kernel (config file attached) didn't trigger a kernel
crash, only warnings:

[   99.686770] ------------[ cut here ]------------
[   99.686868] WARNING: CPU: 1 PID: 2272 at ./include/linux/cpumask.h:121 try_to_wake_up+0x17c/0x8f0
[   99.686873] Modules linked in: ext4 jbd2 mbcache sg pseries_rng ghash_generic gf128mul xts vmx_crypto nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod ibmvscsi ibmveth scsi_transport_srp
[   99.686950] CPU: 1 PID: 2272 Comm: mount Not tainted 4.12.0-rc7.debug #28
[   99.686955] task: c003f00b7b00 task.stack: c003f25e
[   99.686959] NIP: c01359ec LR: c0135ed4 CTR: c016f940
[   99.686964] REGS: c003f25e3420 TRAP: 0700   Not tainted  (4.12.0-rc7.debug)
[   99.686968] MSR: 80010282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>
[   99.686994]   CR: 28028822  XER: 0001
[   99.687000] CFAR: c0135cb4 SOFTE: 0
[   99.687000] GPR00: c0135da0 c003f25e36a0 c1751800 
00a0
[   99.687000] GPR04: 00a0 00c0  

[   99.687000] GPR08:  00a0  
41e0
[   99.687000] GPR12: 8800 cfac0a80 0002 
c003fd20b000
[   99.687000] GPR16: c003cabb0400   
0002
[   99.687000] GPR20:  c003f7a59d60 c1326300 
c1795d00
[   99.687000] GPR24: c1799d48  c179a294 
c003ec786be8
[   99.687000] GPR28:  c003ec786680 00a0 
c003ec786300
[   99.687083] NIP [c01359ec] try_to_wake_up+0x17c/0x8f0
[   99.687088] LR [c0135ed4] try_to_wake_up+0x664/0x8f0
[   99.687092] Call Trace:
[   99.687095] [c003f25e36a0] [c0135da0] try_to_wake_up+0x530/0x8f0 (unreliable)
[   99.687104] [c003f25e3730] [c0114ea8] create_worker+0x148/0x220
[   99.687110] [c003f25e37d0] [c011a418] alloc_unbound_pwq+0x4c8/0x620
[   99.687117] [c003f25e3830] [c011a9c4] apply_wqattrs_prepare+0x1f4/0x340
[   99.687123] [c003f25e38a0] [c011ab4c] apply_workqueue_attrs_locked+0x3c/0xa0
[   99.687130] [c003f25e38d0] [c011b094] apply_workqueue_attrs+0x54/0x90
[   99.687137] [c003f25e3910] [c011d674] __alloc_workqueue_key+0x184/0x5b0
[   99.687155] [c003f25e39d0] [d00013dd1768] ext4_fill_super+0x1c68/0x33e0 [ext4]
[   99.687162] [c003f25e3b10] [c0390f7c] mount_bdev+0x22c/0x260
[   99.687178] [c003f25e3bb0] [d00013dc9020] ext4_mount+0x20/0x40 [ext4]
[   99.687184] [c003f25e3bd0] [c03923c4] mount_fs+0x74/0x210
[   99.687191] [c003f25e3c80] [c03c0688] vfs_kern_mount+0x78/0x220
[   99.687197] [c003f25e3d00] [c03c6044] do_mount+0x254/0xf70
[   99.687204] [c003f25e3de0] [c03c7184] SyS_mount+0x94/0x100
[   99.687210] [c003f25e3e30] [c000b190] system_call+0x38/0xe0
[   99.687215] Instruction dump:
[   99.687220] 409d000c 3924 9121002c 387d0018 4803be2d 6000 7fa3eb78 
48911321
[   99.687236] 6000 2fb7 409e0124 480001e0 <0fe0> 7fca3670 7d4a0194 
57c906be
[   99.687252] ---[ end trace e80d5ad75ae4c2a0 ]---
[   99.691902] EXT4-fs (sda5): mounted filesystem with ordered data mode. Opts: (null)

Thanks,
Eryu


config-ppc64le-debug.bz2
Description: BZip2 compressed data


Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host

2017-06-29 Thread Eryu Guan
On Thu, Jun 29, 2017 at 06:47:50PM +1000, Balbir Singh wrote:
> On Thu, Jun 29, 2017 at 1:41 PM, Eryu Guan <eg...@redhat.com> wrote:
> > On Thu, Jun 29, 2017 at 03:16:10AM +1000, Balbir Singh wrote:
> >> On Wed, Jun 28, 2017 at 6:32 PM, Eryu Guan <eg...@redhat.com> wrote:
> 
> >> Thanks for the excellent bug report, I am a little lost on the stack
> >> trace, it shows a bad page access that we think is triggered by the
> >> mmap changes? The patch changed the return type to integrate the call
> >> into trace-cmd. Could you point me to the tests that can help
> >> reproduce the crash. Could you also suggest how long to try the test
> >> cases for?
> >
> > Sorry, I should have provided it in the first place. It's as simple as
> > mounting an ext4 filesystem on my test ppc64le host, i.e.
> >
> > mkdir -p /mnt/ext4
> > mkfs -t ext4 -F /dev/sda5
> > mount /dev/sda5 /mnt/ext4
> >
> 
> I tried this test a few times with the kernel and could not reproduce it.

Right, it doesn't reproduce on every host. I'm not sure yet what makes my
test host different.

> Could you please share the config and compiler details, I'll retry with -rc7.

[root@ibm-p8-03-lp6 ~]# gcc --version
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[root@ibm-p8-03-lp6 ~]# rpm -q gcc
gcc-4.8.5-16.el7.ppc64le

I've attached the kernel config file.

> 
> In the meanwhile, does enabling kmemleak, DEBUG_PAGE_ALLOC,
> slub/slab debug, list corruption, etc catch anything at the time of the
> corruption?

OK, I'll retry with a debug kernel and report back.

Thanks,
Eryu


config-ppc64le.bz2
Description: BZip2 compressed data


Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host

2017-06-28 Thread Eryu Guan
On Thu, Jun 29, 2017 at 03:16:10AM +1000, Balbir Singh wrote:
> On Wed, Jun 28, 2017 at 6:32 PM, Eryu Guan <eg...@redhat.com> wrote:
> > Hi all,
> >
> > Li Wang and I are constantly seeing ppc64le hosts crashing due to bad
> > page access. It doesn't reproduce on every ppc64le host we've tested,
> > but it usually happens during filesystem testing.
> >
> > [  207.403459] Unable to handle kernel paging request for unaligned access 
> > at address 0xc001c52c5e7f
> > [  207.403470] Faulting instruction address: 0xc04d470c
> > [  207.403475] Oops: Kernel access of bad area, sig: 7 [#1]
> > [  207.403477] SMP NR_CPUS=2048
> > [  207.403478] NUMA
> > [  207.403480] pSeries
> > [  207.403483] Modules linked in: ext4 jbd2 mbcache sg pseries_rng 
> > ghash_generic gf128mul xts vmx_crypto nfsd auth_rpcgss nfs_acl lockd grace 
> > sunrpc ip_tables xfs libcrc32c sd_mod ibmveth ibmvscsi scsi_transport_srp
> > [  207.403503] CPU: 0 PID: 2263 Comm: mount Not tainted 4.12.0-rc7 #26
> > [  207.403506] task: c003ef2fde00 task.stack: c003de394000
> > [  207.403509] NIP: c04d470c LR: c011cd24 CTR: 
> > c0130de0
> > [  207.403512] REGS: c003de397450 TRAP: 0600   Not tainted  (4.12.0-rc7)
> > [  207.403515] MSR: 80010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>
> > [  207.403521]   CR: 28028844  XER: 0001
> > [  207.403525] CFAR: c011cd20 DAR: c001c52c5e7f DSISR:  
> > SOFTE: 0
> > [  207.403525] GPR00: c011cce8 c003de3976d0 c1049500 
> > c003f2c6ec20
> > [  207.403525] GPR04: c003f2c6ec20 c001c52c5e7f  
> > 0001
> > [  207.403525] GPR08: 000c5543cab19830 000198e19900 0008 
> > 
> > [  207.403525] GPR12: c0130de0 cfac  
> > c003f1328000
> > [  207.403525] GPR16:  c003de700400  
> > c003de700594
> > [  207.403525] GPR20: 0002  4000 
> > c0cc5780
> > [  207.403525] GPR24: 0001c45ffc5f  0001c45ffc5f 
> > c107dd00
> > [  207.403525] GPR28: c003f2c6f434 0004 0800 
> > c003f2c6ec00
> > [  207.403567] NIP [c04d470c] llist_add_batch+0xc/0x40
> > [  207.403571] LR [c011cd24] try_to_wake_up+0x4a4/0x5b0
> > [  207.403573] Call Trace:
> > [  207.403576] [c003de3976d0] [c011cce8] 
> > try_to_wake_up+0x468/0x5b0 (unreliable)
> > [  207.403581] [c003de397750] [c0102cc8] 
> > create_worker+0x148/0x250
> > [  207.403585] [c003de3977f0] [c0105e7c] 
> > alloc_unbound_pwq+0x3bc/0x4c0
> > [  207.403589] [c003de397850] [c01064bc] 
> > apply_wqattrs_prepare+0x2ac/0x320
> > [  207.403593] [c003de3978c0] [c010656c] 
> > apply_workqueue_attrs_locked+0x3c/0xa0
> > [  207.403597] [c003de3978f0] [c0106acc] 
> > apply_workqueue_attrs+0x4c/0x80
> > [  207.403601] [c003de397930] [c010866c] 
> > __alloc_workqueue_key+0x16c/0x4e0
> > [  207.403615] [c003de3979f0] [d00013de5ce0] 
> > ext4_fill_super+0x1c70/0x3390 [ext4]
> > [  207.403620] [c003de397b30] [c031739c] mount_bdev+0x21c/0x250
> > [  207.403633] [c003de397bd0] [d00013dddb80] ext4_mount+0x20/0x40 
> > [ext4]
> > [  207.403637] [c003de397bf0] [c0318944] mount_fs+0x74/0x210
> > [  207.403641] [c003de397ca0] [c0340638] 
> > vfs_kern_mount+0x68/0x1d0
> > [  207.403644] [c003de397d10] [c0345348] do_mount+0x278/0xef0
> > [  207.403648] [c003de397de0] [c03463e4] SyS_mount+0x94/0x100
> > [  207.403652] [c003de397e30] [c000af84] system_call+0x38/0xe0
> > [  207.403655] Instruction dump:
> > [  207.403658] 6042 3860 4e800020 6000 6042 7c832378 
> > 4e800020 6000
> > [  207.403663] 6000 e925 f924 7c0004ac <7d4028a8> 7c2a4800 
> > 40c20010 7c6029ad
> > [  207.403669] ---[ end trace 4fa94bf890f28f69 ]---
> >
> > Today I finally found a host that reliably triggers the crash when
> > mounting an ext4 filesystem, so I ran a git bisect. The first bad
> > commit is:
> 
> Thanks for the excellent bug report, I am a little lost on the stack
> trace, it shows a bad page access that we think is triggered by the
> mmap changes? The patch changed the return type to integrate the call
> into trace-cmd. Could you point me to

[v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host

2017-06-28 Thread Eryu Guan
Hi all,

Li Wang and I are constantly seeing ppc64le hosts crashing due to bad
page access. It doesn't reproduce on every ppc64le host we've tested,
but it usually happens during filesystem testing.

[  207.403459] Unable to handle kernel paging request for unaligned access at address 0xc001c52c5e7f
[  207.403470] Faulting instruction address: 0xc04d470c
[  207.403475] Oops: Kernel access of bad area, sig: 7 [#1]
[  207.403477] SMP NR_CPUS=2048
[  207.403478] NUMA
[  207.403480] pSeries
[  207.403483] Modules linked in: ext4 jbd2 mbcache sg pseries_rng ghash_generic gf128mul xts vmx_crypto nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod ibmveth ibmvscsi scsi_transport_srp
[  207.403503] CPU: 0 PID: 2263 Comm: mount Not tainted 4.12.0-rc7 #26
[  207.403506] task: c003ef2fde00 task.stack: c003de394000
[  207.403509] NIP: c04d470c LR: c011cd24 CTR: c0130de0
[  207.403512] REGS: c003de397450 TRAP: 0600   Not tainted  (4.12.0-rc7)
[  207.403515] MSR: 80010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>
[  207.403521]   CR: 28028844  XER: 0001
[  207.403525] CFAR: c011cd20 DAR: c001c52c5e7f DSISR:  SOFTE: 0
[  207.403525] GPR00: c011cce8 c003de3976d0 c1049500 
c003f2c6ec20
[  207.403525] GPR04: c003f2c6ec20 c001c52c5e7f  
0001
[  207.403525] GPR08: 000c5543cab19830 000198e19900 0008 

[  207.403525] GPR12: c0130de0 cfac  
c003f1328000
[  207.403525] GPR16:  c003de700400  
c003de700594
[  207.403525] GPR20: 0002  4000 
c0cc5780
[  207.403525] GPR24: 0001c45ffc5f  0001c45ffc5f 
c107dd00
[  207.403525] GPR28: c003f2c6f434 0004 0800 
c003f2c6ec00
[  207.403567] NIP [c04d470c] llist_add_batch+0xc/0x40
[  207.403571] LR [c011cd24] try_to_wake_up+0x4a4/0x5b0
[  207.403573] Call Trace:
[  207.403576] [c003de3976d0] [c011cce8] try_to_wake_up+0x468/0x5b0 (unreliable)
[  207.403581] [c003de397750] [c0102cc8] create_worker+0x148/0x250
[  207.403585] [c003de3977f0] [c0105e7c] alloc_unbound_pwq+0x3bc/0x4c0
[  207.403589] [c003de397850] [c01064bc] apply_wqattrs_prepare+0x2ac/0x320
[  207.403593] [c003de3978c0] [c010656c] apply_workqueue_attrs_locked+0x3c/0xa0
[  207.403597] [c003de3978f0] [c0106acc] apply_workqueue_attrs+0x4c/0x80
[  207.403601] [c003de397930] [c010866c] __alloc_workqueue_key+0x16c/0x4e0
[  207.403615] [c003de3979f0] [d00013de5ce0] ext4_fill_super+0x1c70/0x3390 [ext4]
[  207.403620] [c003de397b30] [c031739c] mount_bdev+0x21c/0x250
[  207.403633] [c003de397bd0] [d00013dddb80] ext4_mount+0x20/0x40 [ext4]
[  207.403637] [c003de397bf0] [c0318944] mount_fs+0x74/0x210
[  207.403641] [c003de397ca0] [c0340638] vfs_kern_mount+0x68/0x1d0
[  207.403644] [c003de397d10] [c0345348] do_mount+0x278/0xef0
[  207.403648] [c003de397de0] [c03463e4] SyS_mount+0x94/0x100
[  207.403652] [c003de397e30] [c000af84] system_call+0x38/0xe0
[  207.403655] Instruction dump:
[  207.403658] 6042 3860 4e800020 6000 6042 7c832378 4e800020 
6000
[  207.403663] 6000 e925 f924 7c0004ac <7d4028a8> 7c2a4800 40c20010 
7c6029ad
[  207.403669] ---[ end trace 4fa94bf890f28f69 ]---

Today I finally found a host that reliably triggers the crash when
mounting an ext4 filesystem, so I ran a git bisect. The first bad
commit is:

commit 9c355917fcf006af47ffaa5ae43a1a804764a6f6
Author: Balbir Singh 
Date:   Wed Apr 12 16:35:19 2017 +1000

powerpc/tracing: Allow tracing of mmap syscalls

Currently sys_mmap() and sys_mmap2() (32-bit only), are not visible to the
syscall tracing machinery. This means users are not able to see the
execution of mmap() syscalls using the syscall tracer.

Fix that by using SYSCALL_DEFINE6 for sys_mmap() and sys_mmap2() so that the
meta-data associated with these syscalls is visible to the syscall tracer.

A side-effect of this change is that the return type has changed from
unsigned long to long. However this should have no effect, the only code in
the kernel which uses the result of these syscalls is in the syscall return
path, which is written in asm and treats the result as unsigned regardless.

Example output:
  cat-3399  [001]    196.542410: sys_mmap(addr: 7fff922a, len: 2, prot: 3, flags: 812, fd: 3, offset: 1b)
  cat-3399  [001]    196.542443: sys_mmap -> 0x7fff922a
  cat-3399  [001]    196.542668: sys_munmap(addr: 7fff922c, len: 6d2c)
  cat-3399  [001]
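
The commit message above argues that changing the return type from unsigned
long to long is harmless because the consumer treats the value as unsigned
anyway. A minimal userspace sketch of that reasoning, assuming the usual
convention that the top 4095 values of the unsigned range encode errors (the
sketch_* names are hypothetical; this is not the powerpc return path, which
the commit message says is written in asm):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Largest errno value encoded in a return value, as with the kernel's MAX_ERRNO. */
#define SKETCH_MAX_ERRNO 4095UL

/* Reinterpret a signed syscall result the way an unsigned consumer sees it. */
static unsigned long sketch_as_unsigned(long ret)
{
    return (unsigned long)ret;
}

/* Errors occupy the top SKETCH_MAX_ERRNO values of the unsigned range. */
static bool sketch_is_error(unsigned long raw)
{
    return raw >= (unsigned long)-SKETCH_MAX_ERRNO;
}
```

A result of -ENOMEM declared as long and the same bits viewed as unsigned
long land in the same error window, so the check gives the same answer
whichever type the C prototype uses.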