Josh Poimboeuf wrote:
> On Tue, Oct 03, 2017 at 10:44:13PM +0900, Tetsuo Handa wrote:
> > Josh Poimboeuf wrote:
> > 
> > > On Tue, Oct 03, 2017 at 12:37:44PM +0200, Borislav Petkov wrote:
> > > > On Tue, Oct 03, 2017 at 07:29:36PM +0900, Tetsuo Handa wrote:
> > > > > Tetsuo Handa wrote:
> > > > > > Tetsuo Handa wrote:
> > > > > > > Tetsuo Handa wrote:
> > > > > > > > I'm seeing below error between
> > > > > > > > 4898b99c261efe32 ("Merge tag 'acpi-4.13-rc7' of 
> > > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm") 
> > > > > > > > (git bisect good (presumably))
> > > > > > > > e6f3faa734a00c60 ("locking/lockdep: Fix workqueue crossrelease 
> > > > > > > > annotation") (git bisect bad) on linux.git .
> > > > > > > 
> > > > > > > F.Y.I. This error remains as of 46c1e79fee417f15 ("Merge branch 
> > > > > > > 'perf-urgent-for-linus' of
> > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip") on 
> > > > > > > linux.git .
> > > > > > > 
> > > > > > 
> > > > > > This error still remains as of 6e80ecdddf4ea6f3 ("Merge branch 
> > > > > > 'libnvdimm-fixes'
> > > > > > of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm") on 
> > > > > > linux.git .
> > > > > > 
> > > > > > I'm suspecting that this error is causing very unstable x86_32 
> > > > > > kernel.
> > > > > > It seems that this error occurs (though rare frequency) even on 
> > > > > > x86_64 kernel.
> > > > > > 
> > > > > > Nobody cares?
> > > > > > 
> > > > > 4.14-rc3 still trivially panics due to this error. Is this problem 
> > > > > known?
> > > 
> > > Can you try with the following patch?  It should hopefully give more
> > > useful information in the dump.
> > > 
> > I see. Here is the result.
> 
> Hm, that's not what I expected to happen...  I suspect this is stack
> corruption, with the result being slightly different every time.  Can
> you see if this patch fixes the panic?

This patch did not fix the problem, but disabling CONFIG_PROVE_LOCKING seems
to avoid it. Since the "git log 4898b99c261efe32...e6f3faa734a00c60" range
includes lockdep changes, this might be a lockdep problem.

----------
# diff .config.old .config
2132c2132
< CONFIG_PROVE_LOCKING=y
---
> # CONFIG_PROVE_LOCKING is not set
2135,2136d2134
< CONFIG_LOCKDEP_CROSSRELEASE=y
< CONFIG_LOCKDEP_COMPLETIONS=y
2142d2139
< CONFIG_TRACE_IRQFLAGS=y
2157c2154
< CONFIG_PROVE_RCU=y
---
> # CONFIG_PROVE_RCU is not set
----------
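
In case it helps anyone comparing their own configs: the diff above shows
that turning off CONFIG_PROVE_LOCKING also drops the options that depend on
it. Below is a minimal sketch (not part of the kernel tree; the function
names parse_config and config_changes are made up for illustration) that
lists which options differ between two .config files, treating an option
that is absent or "is not set" as "n".

```python
import re

# Matches "CONFIG_FOO=value" and "# CONFIG_FOO is not set" lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return {option: value} for .config text; unset options map to 'n'."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            opts[m.group(1)] = "n"
    return opts


def config_changes(old_text, new_text):
    """Return sorted (option, old, new) tuples for options that differ.

    Assumption: an option missing from one file is treated as 'n'.
    """
    old, new = parse_config(old_text), parse_config(new_text)
    changes = []
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name, "n"), new.get(name, "n")
        if before != after:
            changes.append((name, before, after))
    return changes


if __name__ == "__main__":
    # Paths follow the file names used in the diff above.
    with open(".config.old") as f_old, open(".config") as f_new:
        for name, before, after in config_changes(f_old.read(), f_new.read()):
            print(f"{name}: {before} -> {after}")
```

This only compares the two files; it does not evaluate Kconfig
dependencies, so it cannot say why an option was dropped.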

Maybe there is a bug in completion and/or crossrelease handling?
