Re: [RESEND] Re: usb/net/p54: trying to register non-static key in p54_unregister_leds

2017-09-26 Thread Andrey Konovalov
On Sat, Sep 23, 2017 at 9:37 PM, 'Christian Lamparter' via syzkaller
 wrote:
> This got rejected by gmail once. Let's see if it works now.
>
> On Thursday, September 21, 2017 8:22:45 PM CEST Andrey Konovalov wrote:
>> On Wed, Sep 20, 2017 at 9:55 PM, Johannes Berg
>>  wrote:
>> > On Wed, 2017-09-20 at 21:27 +0200, Christian Lamparter wrote:
>> >
>> >> It seems this is caused as a result of:
>> >> -> lock_map_acquire(&work->lockdep_map);
>> >>   lock_map_release(&work->lockdep_map);
>> >>
>> >> in flush_work() [0]
>> >
>> > Agree.
>> >
>> >> This was added by:
>> >>
>> >>   commit 0976dfc1d0cd80a4e9dfaf87bd8744612bde475a
>> >>   Author: Stephen Boyd 
>> >>   Date:   Fri Apr 20 17:28:50 2012 -0700
>> >>
>> >>   workqueue: Catch more locking problems with flush_work()
>> >
>> > Yes, but that doesn't matter.
>> >
>> >> Looking at the Stephen's patch, it's clear that it was made
>> >> with "static DECLARE_WORK(work, my_work)" in mind. However
>> >> p54's led_work is "per-device", hence it is stored in the
>> >> devices context p54_common, which is dynamically allocated.
>> >> So, maybe revert Stephen's patch?
>> >
>> > I disagree - as the lockdep warning says:
>> >
>> >> > INFO: trying to register non-static key.
>> >> > the code is fine but needs lockdep annotation.
>> >> > turning off the locking correctness validator.
>> >
>> > What it needs is to actually correctly go through initializing the work
>> > at least once.
>> >
>> > Without more information, I can't really say what's going on, but I
>> > assume that something is failing and p54_unregister_leds() is getting
>> > invoked without p54_init_leds() having been invoked, so essentially
>> > it's trying to flush a work that was never initialized?
>> >
>> > INIT_DELAYED_WORK() does, after all, initialize the lockdep map
>> > properly via __INIT_WORK().
>
> Ok, thanks. This does indeed explain it.
>
> But this also begs the question: Is this really working then?
> From what I can tell, if CONFIG_LOCKDEP is not set then there's no BUG
> no WARN, no other splat or any other odd system behaviour. Does
> [cancel | flush]_[delayed_]work[_sync] really "just work" by *accident*,
> as long the delayed_work | work_struct is zeroed out?
> And should it work in the future as well?
>
>> Since I'm able to reproduce this, please let me know if you need me to
>> collect some debug traces to help with the triage.
>
> Do you want to take a shot at making a patch too? At a quick glance, it
> should be enough to move the [#ifdef CONFIG_P54_LEDS ... #endif] block
> in p54_unregister_common() into the if (priv->registered) { block
> (preferably before the ieee80211_unregister_hw(dev).

Just mailed a patch.

>
> Regards,
> Christian
>
> --
> You received this message because you are subscribed to the Google Groups 
> "syzkaller" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to syzkaller+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.


Re: [RESEND] Re: usb/net/p54: trying to register non-static key in p54_unregister_leds

2017-09-24 Thread Johannes Berg
On Sat, 2017-09-23 at 21:37 +0200, Christian Lamparter wrote:

> But this also begs the question: Is this really working then?
> From what I can tell, if CONFIG_LOCKDEP is not set then there's no
> BUG no WARN, no other splat or any other odd system behaviour. Does
> [cancel | flush]_[delayed_]work[_sync] really "just work" by
> *accident*, as long the delayed_work | work_struct is zeroed out? 

It looks like it does, but I'm not sure it's not more or less by
accident. Look at get_work_pool() for example, it might actually return
non-NULL in this case, and then in start_flush_work() you'll probably
fall into one of the few "already_gone" cases.

> And should it work in the future as well?

I guess it's not really guaranteed, the API doesn't state anything to
that effect. Not that I'm looking forward to a new workqueue rewrite ;)

johannes


[RESEND] Re: usb/net/p54: trying to register non-static key in p54_unregister_leds

2017-09-23 Thread Christian Lamparter
This got rejected by gmail once. Let's see if it works now.

On Thursday, September 21, 2017 8:22:45 PM CEST Andrey Konovalov wrote:
> On Wed, Sep 20, 2017 at 9:55 PM, Johannes Berg
>  wrote:
> > On Wed, 2017-09-20 at 21:27 +0200, Christian Lamparter wrote:
> >
> >> It seems this is caused as a result of:
> >> -> lock_map_acquire(&work->lockdep_map);
> >>   lock_map_release(&work->lockdep_map);
> >>
> >> in flush_work() [0]
> >
> > Agree.
> >
> >> This was added by:
> >>
> >>   commit 0976dfc1d0cd80a4e9dfaf87bd8744612bde475a
> >>   Author: Stephen Boyd 
> >>   Date:   Fri Apr 20 17:28:50 2012 -0700
> >>
> >>   workqueue: Catch more locking problems with flush_work()
> >
> > Yes, but that doesn't matter.
> >
> >> Looking at the Stephen's patch, it's clear that it was made
> >> with "static DECLARE_WORK(work, my_work)" in mind. However
> >> p54's led_work is "per-device", hence it is stored in the
> >> devices context p54_common, which is dynamically allocated.
> >> So, maybe revert Stephen's patch?
> >
> > I disagree - as the lockdep warning says:
> >
> >> > INFO: trying to register non-static key.
> >> > the code is fine but needs lockdep annotation.
> >> > turning off the locking correctness validator.
> >
> > What it needs is to actually correctly go through initializing the work
> > at least once.
> >
> > Without more information, I can't really say what's going on, but I
> > assume that something is failing and p54_unregister_leds() is getting
> > invoked without p54_init_leds() having been invoked, so essentially
> > it's trying to flush a work that was never initialized?
> >
> > INIT_DELAYED_WORK() does, after all, initialize the lockdep map
> > properly via __INIT_WORK().

Ok, thanks. This does indeed explain it.

But this also begs the question: Is this really working then?
>From what I can tell, if CONFIG_LOCKDEP is not set then there's no BUG
no WARN, no other splat or any other odd system behaviour. Does
[cancel | flush]_[delayed_]work[_sync] really "just work" by *accident*,
as long the delayed_work | work_struct is zeroed out? 
And should it work in the future as well?

> Since I'm able to reproduce this, please let me know if you need me to
> collect some debug traces to help with the triage.

Do you want to take a shot at making a patch too? At a quick glance, it
should be enough to move the [#ifdef CONFIG_P54_LEDS ... #endif] block
in p54_unregister_common() into the if (priv->registered) { block
(preferably before the ieee80211_unregister_hw(dev).

Regards,
Christian