Re: kworker with empty task->cpus_allowed (was Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host)

2017-07-04 Thread Eryu Guan
On Tue, Jul 04, 2017 at 09:06:55PM +1000, Michael Ellerman wrote: > Eryu Guan writes: > > > On Tue, Jul 04, 2017 at 04:26:11PM +1000, Michael Ellerman wrote: > >> Eryu Guan writes: > >> > On Fri, Jun 30, 2017 at 08:07:02PM +1000, Michael Ellerman wrote: > >>

Re: kworker with empty task->cpus_allowed (was Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host)

2017-07-04 Thread Michael Ellerman
Eryu Guan writes: > On Tue, Jul 04, 2017 at 04:26:11PM +1000, Michael Ellerman wrote: >> Eryu Guan writes: >> > On Fri, Jun 30, 2017 at 08:07:02PM +1000, Michael Ellerman wrote: >> >> >> >> Can you try this patch and see if it changes anything? (with the

Re: kworker with empty task->cpus_allowed (was Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host)

2017-07-04 Thread Eryu Guan
On Tue, Jul 04, 2017 at 04:26:11PM +1000, Michael Ellerman wrote: > Eryu Guan writes: > > On Fri, Jun 30, 2017 at 08:07:02PM +1000, Michael Ellerman wrote: > >> > >> Can you try this patch and see if it changes anything? (with the debug > >> still applied). > > > > This patch

Re: kworker with empty task->cpus_allowed (was Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host)

2017-07-04 Thread Michael Ellerman
Eryu Guan writes: > On Fri, Jun 30, 2017 at 08:07:02PM +1000, Michael Ellerman wrote: >> >> Can you try this patch and see if it changes anything? (with the debug >> still applied). > > This patch fixes the crash for me. After applying this patch (with all > other debug

Re: kworker with empty task->cpus_allowed (was Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host)

2017-06-30 Thread Tejun Heo
Hello, Michael. On Fri, Jun 30, 2017 at 11:08:22AM +1000, Michael Ellerman wrote: > Tejun Heo writes: > > > Could be the same problem as the one reported in the following thread. > > > > http://lkml.kernel.org/r/1497266622.15415.39.ca...@abdul.in.ibm.com > > > > The root cause

Re: kworker with empty task->cpus_allowed (was Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host)

2017-06-30 Thread Eryu Guan
On Fri, Jun 30, 2017 at 08:07:02PM +1000, Michael Ellerman wrote: > Eryu Guan writes: > > > > I have to update the patch a bit to make it compile. > > Sure. > > >> + WARN_ON(cpumask_empty(worker->task->cpus_allowed)); > >> + WARN_ON(cpumask_empty(pool->attrs->cpumask)); > >

Re: kworker with empty task->cpus_allowed (was Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host)

2017-06-30 Thread Michael Ellerman
Eryu Guan writes: > > I have to update the patch a bit to make it compile. Sure. >> +WARN_ON(cpumask_empty(worker->task->cpus_allowed)); >> +WARN_ON(cpumask_empty(pool->attrs->cpumask)); > > Seems only the last two WARN_ON were triggered. OK thanks. Can you try this

Re: kworker with empty task->cpus_allowed (was Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host)

2017-06-29 Thread Michael Ellerman
Tejun Heo writes: > Hello, > > Could be the same problem as the one reported in the following thread. > > http://lkml.kernel.org/r/1497266622.15415.39.ca...@abdul.in.ibm.com > > The root cause there is ppc arch code not setting up possible cpu <-> > numa mapping during boot.

Re: kworker with empty task->cpus_allowed (was Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host)

2017-06-29 Thread Tejun Heo
Hello, Could be the same problem as the one reported in the following thread. http://lkml.kernel.org/r/1497266622.15415.39.ca...@abdul.in.ibm.com The root cause there is ppc arch code not setting up possible cpu <-> numa mapping during boot. Thanks. -- tejun

Re: kworker with empty task->cpus_allowed (was Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host)

2017-06-29 Thread Eryu Guan
On Thu, Jun 29, 2017 at 10:06:31PM +1000, Michael Ellerman wrote: > Eryu Guan writes: > > > On Thu, Jun 29, 2017 at 09:12:55PM +1000, Michael Ellerman wrote: > >> Eryu Guan writes: > >> > >> > On Thu, Jun 29, 2017 at 06:47:50PM +1000, Balbir Singh wrote: >

Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host

2017-06-29 Thread Michael Ellerman
Eryu Guan writes: > On Thu, Jun 29, 2017 at 08:27:11PM +1000, Michael Ellerman wrote: >> Eryu Guan writes: >> >> > Hi all, >> > >> > Li Wang and I are constantly seeing ppc64le hosts crashing due to bad >> > page access. But it's not reproducing on every

kworker with empty task->cpus_allowed (was Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host)

2017-06-29 Thread Michael Ellerman
Eryu Guan writes: > On Thu, Jun 29, 2017 at 09:12:55PM +1000, Michael Ellerman wrote: >> Eryu Guan writes: >> >> > On Thu, Jun 29, 2017 at 06:47:50PM +1000, Balbir Singh wrote: >> >> On Thu, Jun 29, 2017 at 1:41 PM, Eryu Guan wrote: >> >>

Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host

2017-06-29 Thread Eryu Guan
On Thu, Jun 29, 2017 at 09:12:55PM +1000, Michael Ellerman wrote: > Eryu Guan writes: > > > On Thu, Jun 29, 2017 at 06:47:50PM +1000, Balbir Singh wrote: > >> On Thu, Jun 29, 2017 at 1:41 PM, Eryu Guan wrote: > >> > On Thu, Jun 29, 2017 at 03:16:10AM +1000,

Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host

2017-06-29 Thread Michael Ellerman
Eryu Guan writes: > On Thu, Jun 29, 2017 at 06:47:50PM +1000, Balbir Singh wrote: >> On Thu, Jun 29, 2017 at 1:41 PM, Eryu Guan wrote: >> > On Thu, Jun 29, 2017 at 03:16:10AM +1000, Balbir Singh wrote: >> >> On Wed, Jun 28, 2017 at 6:32 PM, Eryu Guan

Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host

2017-06-29 Thread Eryu Guan
On Thu, Jun 29, 2017 at 08:27:11PM +1000, Michael Ellerman wrote: > Eryu Guan writes: > > > Hi all, > > > > Li Wang and I are constantly seeing ppc64le hosts crashing due to bad > > page access. But it's not reproducing on every ppc64le host we've > > tested, but it usually

Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host

2017-06-29 Thread Michael Ellerman
Eryu Guan writes: > Hi all, > > Li Wang and I are constantly seeing ppc64le hosts crashing due to bad > page access. But it's not reproducing on every ppc64le host we've > tested, but it usually happened in filesystem testings. > And I've confirmed that reverting above

Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host

2017-06-29 Thread Eryu Guan
On Thu, Jun 29, 2017 at 06:47:50PM +1000, Balbir Singh wrote: > On Thu, Jun 29, 2017 at 1:41 PM, Eryu Guan wrote: > > On Thu, Jun 29, 2017 at 03:16:10AM +1000, Balbir Singh wrote: > >> On Wed, Jun 28, 2017 at 6:32 PM, Eryu Guan wrote: > > >> Thanks for the

Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host

2017-06-29 Thread Balbir Singh
On Thu, Jun 29, 2017 at 1:41 PM, Eryu Guan wrote: > On Thu, Jun 29, 2017 at 03:16:10AM +1000, Balbir Singh wrote: >> On Wed, Jun 28, 2017 at 6:32 PM, Eryu Guan wrote: >> Thanks for the excellent bug report, I am a little lost on the stack >> trace, it shows a

Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host

2017-06-28 Thread Michael Ellerman
Hi Eryu, Thanks for the bug report. Eryu Guan writes: > Hi all, > > Li Wang and I are constantly seeing ppc64le hosts crashing due to bad I'm curious why you're seeing this and not other folks. What compiler are you using? > page access. But it's not reproducing on every

Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host

2017-06-28 Thread Eryu Guan
On Thu, Jun 29, 2017 at 03:16:10AM +1000, Balbir Singh wrote: > On Wed, Jun 28, 2017 at 6:32 PM, Eryu Guan wrote: > > Hi all, > > > > Li Wang and I are constantly seeing ppc64le hosts crashing due to bad > > page access. But it's not reproducing on every ppc64le host we've > >

Re: [v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host

2017-06-28 Thread Balbir Singh
On Wed, Jun 28, 2017 at 6:32 PM, Eryu Guan wrote: > Hi all, > > Li Wang and I are constantly seeing ppc64le hosts crashing due to bad > page access. But it's not reproducing on every ppc64le host we've > tested, but it usually happened in filesystem testings. > > [ 207.403459]

[v4.12-rc1 regression] mount ext4 fs results in kernel crash on PPC64le host

2017-06-28 Thread Eryu Guan
Hi all, Li Wang and I are constantly seeing ppc64le hosts crashing due to bad page access. But it's not reproducing on every ppc64le host we've tested, but it usually happened in filesystem testings. [ 207.403459] Unable to handle kernel paging request for unaligned access at address