On Fri, May 29, 2020 at 4:05 PM Paul E. McKenney <[email protected]> wrote:
>
> On Fri, May 29, 2020 at 08:20:12AM +0200, Dmitry Vyukov wrote:
> > On Thu, May 28, 2020 at 10:48 PM Paul E. McKenney <[email protected]> wrote:
> > >
> > > On Thu, May 28, 2020 at 10:19:02PM +0200, Thomas Gleixner wrote:
> > > > Paul,
> > > >
> > > > "Paul E. McKenney" <[email protected]> writes:
> > > > > On Thu, May 28, 2020 at 03:33:44PM +0200, Thomas Gleixner wrote:
> > > > >> syzbot <[email protected]> writes:
> > > > >> Weird. I have no idea how that thing is an EQS here.
> > > > >
> > > > > No argument on the "Weird" part!  ;-)
> > > > >
> > > > > Is this a NO_HZ_FULL=y kernel?
> > > >
> > > > No, it has only NO_HZ_IDLE.
> > > >
> > > >   https://syzkaller.appspot.com/x/.config?x=47b0740d89299c10
> > >
> > > OK, from the .config, another suggestion is to build the kernel
> > > with CONFIG_RCU_EQS_DEBUG=y.  This still requires that this issue be
> > > reproduced, but it might catch the problem earlier.
> >
> > How much does it slow down execution? If we enable it on syzbot, it
> > will affect all fuzzing done by syzbot always.
> > It can tolerate significant slowdown and it's far from a production
> > kernel (it enables KASAN, KCOV, LOCKDEP and more). But I am still
> > asking because some debugging features are built without performance
> > in mind at all (like let's just drop a global lock in every
> > kmalloc/free, which may be too much even for a standard debug build).
>
> It is an extra WARN_ON_ONCE() with a simple comparison, but on almost
> every kernel entry/exit path.
>
> So not something you want in production, but much lighter weight than
> any of the tools you listed above.
>
> Full disclosure:  It usually fires for new architectures or for new
> timer hardware/drivers.  Which might allow you to enable it selectively.


This sounds reasonable. I've enabled it:
https://github.com/google/syzkaller/commit/3905eaae004605f4ec4dab83e6883173796118c8
syzbot will pick it up within a day or so. After that, crash reports will
capture any additional checks that fire.

The arch/hardware is quite old: x86_64/GCE. It also booted for me in
qemu without warnings.




>                                                         Thanx, Paul
>
> > > > > If so, one possibility is that the call
> > > > > to rcu_user_exit() went missing somehow.  If not, then RCU should have
> > > > > been watching userspace execution.
> > > > >
> > > > > Again, the only thing I can think of (should this prove to be
> > > > > reproducible) is the rcu_dyntick trace event.
> > > >
> > > > :)
> > > >
> > > > Thanks,
> > > >
> > > >         tglx
> > >
> > >                                                         Thanx, Paul
> > >
> > > --
> > > You received this message because you are subscribed to the Google Groups 
> > > "syzkaller-bugs" group.
> > > To unsubscribe from this group and stop receiving emails from it, send an 
> > > email to [email protected].
> > > To view this discussion on the web visit 
> > > https://groups.google.com/d/msgid/syzkaller-bugs/20200528204839.GR2869%40paulmck-ThinkPad-P72.
