Re: [PATCH] leds: trigger: fix potential deadlock with libata
On 08.03.2021 15:56:53, Marc Kleine-Budde wrote: > > > meanwhile this patch hit v5.10.x stable and caused a performance > > > degradation on our use case: > > > > > > It's an embedded ARM system, 4x Cortex A53, with an SPI attached CAN > > > controller. CAN stands for Controller Area Network and here used to > > > connect to some automotive equipment. Over CAN an ISOTP (a CAN-specific > > > Transport Protocol) transfer is running. With this patch, we see CAN > > > frames delayed for ~6ms, the usual gap between CAN frames is 240µs. > > > > > > Reverting this patch, restores the old performance. > > > > > > What is the best way to solve this dilemma? Identify the critical path > > > in our use case? Is there a way we can get around the irqsave in > > > led_trigger_event()? > > > > > > > Probably, we can change from rwlock to rcu here, POC code as follow, > > only compile tested. Marc, could you see whether this help the > > performance on your platform? Please note that I haven't test it in a > > running kernel and I'm not that familir with led subsystem, so use it > > with caution ;-) > > > > (While at it, I think maybe we miss the leddev_list_lock in net/mac80211 > > in the patch) > > I can confirm, this patch basically restores the old performance. Hmmm - it first looked OK, now it doesn't - let me do some better testing. Sorry for the noise, Marc -- Pengutronix e.K. | Marc Kleine-Budde | Embedded Linux | https://www.pengutronix.de | Vertretung West/Dortmund | Phone: +49-231-2826-924 | Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917- | signature.asc Description: PGP signature
Re: [PATCH] leds: trigger: fix potential deadlock with libata
On 07.03.2021 10:02:32, Boqun Feng wrote: > On Sat, Mar 06, 2021 at 09:39:54PM +0100, Marc Kleine-Budde wrote: > > Hello *, > > > > On 02.11.2020 11:41:52, Andrea Righi wrote: > > > We have the following potential deadlock condition: > > > > > > > > > WARNING: possible irq lock inversion dependency detected > > > 5.10.0-rc2+ #25 Not tainted > > > > > > swapper/3/0 just changed the state of lock: > > > 8880063bd618 (>lock){-...}-{2:2}, at: > > > ata_bmdma_interrupt+0x27/0x200 > > > but this lock took another, HARDIRQ-READ-unsafe lock in the past: > > > (>leddev_list_lock){.+.?}-{2:2} > > > > > > and interrupts could create inverse lock ordering between them. > > > > [...] > > > > > --- > > > drivers/leds/led-triggers.c | 5 +++-- > > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c > > > index 91da90cfb11d..16d1a93a10a8 100644 > > > --- a/drivers/leds/led-triggers.c > > > +++ b/drivers/leds/led-triggers.c > > > @@ -378,14 +378,15 @@ void led_trigger_event(struct led_trigger *trig, > > > enum led_brightness brightness) > > > { > > > struct led_classdev *led_cdev; > > > + unsigned long flags; > > > > > > if (!trig) > > > return; > > > > > > - read_lock(>leddev_list_lock); > > > + read_lock_irqsave(>leddev_list_lock, flags); > > > list_for_each_entry(led_cdev, >led_cdevs, trig_list) > > > led_set_brightness(led_cdev, brightness); > > > - read_unlock(>leddev_list_lock); > > > + read_unlock_irqrestore(>leddev_list_lock, flags); > > > } > > > EXPORT_SYMBOL_GPL(led_trigger_event); > > > > meanwhile this patch hit v5.10.x stable and caused a performance > > degradation on our use case: > > > > It's an embedded ARM system, 4x Cortex A53, with an SPI attached CAN > > controller. CAN stands for Controller Area Network and here used to > > connect to some automotive equipment. Over CAN an ISOTP (a CAN-specific > > Transport Protocol) transfer is running. With this patch, we see CAN > > frames delayed for ~6ms, the usual gap between CAN frames is 240µs. > > > > Reverting this patch, restores the old performance. > > > > What is the best way to solve this dilemma? Identify the critical path > > in our use case? Is there a way we can get around the irqsave in > > led_trigger_event()? > > > > Probably, we can change from rwlock to rcu here, POC code as follow, > only compile tested. Marc, could you see whether this help the > performance on your platform? Please note that I haven't test it in a > running kernel and I'm not that familir with led subsystem, so use it > with caution ;-) > > (While at it, I think maybe we miss the leddev_list_lock in net/mac80211 > in the patch) I can confirm, this patch basically restores the old performance. Marc -- Pengutronix e.K. | Marc Kleine-Budde | Embedded Linux | https://www.pengutronix.de | Vertretung West/Dortmund | Phone: +49-231-2826-924 | Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917- | signature.asc Description: PGP signature
Re: [PATCH] leds: trigger: fix potential deadlock with libata
Hi, On 3/7/21 5:13 PM, Pavel Machek wrote: > Hi! > >>> --- a/drivers/leds/led-triggers.c >>> +++ b/drivers/leds/led-triggers.c >>> @@ -378,14 +378,15 @@ void led_trigger_event(struct led_trigger *trig, >>> enum led_brightness brightness) >>> { >>> struct led_classdev *led_cdev; >>> + unsigned long flags; >>> >>> if (!trig) >>> return; >>> >>> - read_lock(>leddev_list_lock); >>> + read_lock_irqsave(>leddev_list_lock, flags); >>> list_for_each_entry(led_cdev, >led_cdevs, trig_list) >>> led_set_brightness(led_cdev, brightness); >>> - read_unlock(>leddev_list_lock); >>> + read_unlock_irqrestore(>leddev_list_lock, flags); >>> } >>> EXPORT_SYMBOL_GPL(led_trigger_event) >> >> meanwhile this patch hit v5.10.x stable and caused a performance >> degradation on our use case: >> >> It's an embedded ARM system, 4x Cortex A53, with an SPI attached CAN >> controller. CAN stands for Controller Area Network and here used to >> connect to some automotive equipment. Over CAN an ISOTP (a CAN-specific >> Transport Protocol) transfer is running. With this patch, we see CAN >> frames delayed for ~6ms, the usual gap between CAN frames is 240µs. >> >> Reverting this patch, restores the old performance. >> >> What is the best way to solve this dilemma? Identify the critical path >> in our use case? Is there a way we can get around the irqsave in >> led_trigger_event()? > > Hans was pushing for this patch, perhaps he has some ideas... I was not pushing for this particular fix, I was asking about a fix for the lockdep identified potential deadlock. And you replied that this was already fixed in your for-next branch when I asked, so all in all, other then reporting the potential deadlock (after it was already fixed) I have very little do to with this patch. With that all said, I must say that I'm surprised that switching from read_lock() to read_lock_irqsave() causes such a hefty penalty, so I wonder what is really going on here. Using the irqsave version disables interrupts, but AFAIK only on the current core and only for the duration of the led_set_brightness() call(s) . Is the system perhaps pinning IRQs to a specific CPU in combination with a led_set_brightness() somehow taking much longer then it should? Note that led_set_brightness() calls are not allowed to block, if they block they should use the brightness_set_blocking callback in their led_class_dev struct not the regular brightness_set callback. In which case the LED-core will defer the actually setting of the LED to a workqueue. So one thing which might be worthwhile to check is if any of the LED drivers on the system in question are using the brightness_set callback, where they should be using the blocking one. Regards, Hans
Re: [PATCH] leds: trigger: fix potential deadlock with libata
Hi! > > --- > > drivers/leds/led-triggers.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c > > index 91da90cfb11d..16d1a93a10a8 100644 > > --- a/drivers/leds/led-triggers.c > > +++ b/drivers/leds/led-triggers.c > > @@ -378,14 +378,15 @@ void led_trigger_event(struct led_trigger *trig, > > enum led_brightness brightness) > > { > > struct led_classdev *led_cdev; > > + unsigned long flags; > > > > if (!trig) > > return; > > > > - read_lock(>leddev_list_lock); > > + read_lock_irqsave(>leddev_list_lock, flags); > > list_for_each_entry(led_cdev, >led_cdevs, trig_list) > > led_set_brightness(led_cdev, brightness); > > - read_unlock(>leddev_list_lock); > > + read_unlock_irqrestore(>leddev_list_lock, flags); > > } > > EXPORT_SYMBOL_GPL(led_trigger_event); > > meanwhile this patch hit v5.10.x stable and caused a performance > degradation on our use case: > > It's an embedded ARM system, 4x Cortex A53, with an SPI attached CAN > controller. CAN stands for Controller Area Network and here used to > connect to some automotive equipment. Over CAN an ISOTP (a CAN-specific > Transport Protocol) transfer is running. With this patch, we see CAN > frames delayed for ~6ms, the usual gap between CAN frames is 240µs. > > Reverting this patch, restores the old performance. > > What is the best way to solve this dilemma? Identify the critical path > in our use case? Is there a way we can get around the irqsave in > led_trigger_event()? 6ms is quite long. Are you actively using any triggers? Do you have LED blinking on CAN access? Can you verify if it is cli/sti taking too long, or if the led_set_brightness takes too long? Best regards, Pavel -- http://www.livejournal.com/~pavelmachek signature.asc Description: PGP signature
Re: [PATCH] leds: trigger: fix potential deadlock with libata
Hi! > > --- a/drivers/leds/led-triggers.c > > +++ b/drivers/leds/led-triggers.c > > @@ -378,14 +378,15 @@ void led_trigger_event(struct led_trigger *trig, > > enum led_brightness brightness) > > { > > struct led_classdev *led_cdev; > > + unsigned long flags; > > > > if (!trig) > > return; > > > > - read_lock(>leddev_list_lock); > > + read_lock_irqsave(>leddev_list_lock, flags); > > list_for_each_entry(led_cdev, >led_cdevs, trig_list) > > led_set_brightness(led_cdev, brightness); > > - read_unlock(>leddev_list_lock); > > + read_unlock_irqrestore(>leddev_list_lock, flags); > > } > > EXPORT_SYMBOL_GPL(led_trigger_event); > > meanwhile this patch hit v5.10.x stable and caused a performance > degradation on our use case: > > It's an embedded ARM system, 4x Cortex A53, with an SPI attached CAN > controller. CAN stands for Controller Area Network and here used to > connect to some automotive equipment. Over CAN an ISOTP (a CAN-specific > Transport Protocol) transfer is running. With this patch, we see CAN > frames delayed for ~6ms, the usual gap between CAN frames is 240µs. > > Reverting this patch, restores the old performance. > > What is the best way to solve this dilemma? Identify the critical path > in our use case? Is there a way we can get around the irqsave in > led_trigger_event()? Hans was pushing for this patch, perhaps he has some ideas... Pavel -- http://www.livejournal.com/~pavelmachek signature.asc Description: PGP signature
Re: [PATCH] leds: trigger: fix potential deadlock with libata
On Sun, Mar 07, 2021 at 10:02:32AM +0800, Boqun Feng wrote: > On Sat, Mar 06, 2021 at 09:39:54PM +0100, Marc Kleine-Budde wrote: > > Hello *, > > > > On 02.11.2020 11:41:52, Andrea Righi wrote: > > > We have the following potential deadlock condition: > > > > > > > > > WARNING: possible irq lock inversion dependency detected > > > 5.10.0-rc2+ #25 Not tainted > > > > > > swapper/3/0 just changed the state of lock: > > > 8880063bd618 (>lock){-...}-{2:2}, at: > > > ata_bmdma_interrupt+0x27/0x200 > > > but this lock took another, HARDIRQ-READ-unsafe lock in the past: > > > (>leddev_list_lock){.+.?}-{2:2} > > > > > > and interrupts could create inverse lock ordering between them. > > > > [...] > > > > > --- > > > drivers/leds/led-triggers.c | 5 +++-- > > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c > > > index 91da90cfb11d..16d1a93a10a8 100644 > > > --- a/drivers/leds/led-triggers.c > > > +++ b/drivers/leds/led-triggers.c > > > @@ -378,14 +378,15 @@ void led_trigger_event(struct led_trigger *trig, > > > enum led_brightness brightness) > > > { > > > struct led_classdev *led_cdev; > > > + unsigned long flags; > > > > > > if (!trig) > > > return; > > > > > > - read_lock(>leddev_list_lock); > > > + read_lock_irqsave(>leddev_list_lock, flags); > > > list_for_each_entry(led_cdev, >led_cdevs, trig_list) > > > led_set_brightness(led_cdev, brightness); > > > - read_unlock(>leddev_list_lock); > > > + read_unlock_irqrestore(>leddev_list_lock, flags); > > > } > > > EXPORT_SYMBOL_GPL(led_trigger_event); > > > > meanwhile this patch hit v5.10.x stable and caused a performance > > degradation on our use case: > > > > It's an embedded ARM system, 4x Cortex A53, with an SPI attached CAN > > controller. CAN stands for Controller Area Network and here used to > > connect to some automotive equipment. Over CAN an ISOTP (a CAN-specific > > Transport Protocol) transfer is running. With this patch, we see CAN > > frames delayed for ~6ms, the usual gap between CAN frames is 240µs. > > > > Reverting this patch, restores the old performance. > > > > What is the best way to solve this dilemma? Identify the critical path > > in our use case? Is there a way we can get around the irqsave in > > led_trigger_event()? > > > > Probably, we can change from rwlock to rcu here, POC code as follow, > only compile tested. Marc, could you see whether this help the > performance on your platform? Please note that I haven't test it in a > running kernel and I'm not that familir with led subsystem, so use it > with caution ;-) If we don't want to touch the led subsystem at all maybe we could try to fix the problem in libata, we just need to prevent calling led_trigger_blink_oneshot() with >lock held from ata_qc_complete(), maybe doing the led blinking from another context (a workqueue for example)? -Andrea
Re: [PATCH] leds: trigger: fix potential deadlock with libata
On Sat, Mar 06, 2021 at 09:39:54PM +0100, Marc Kleine-Budde wrote: > Hello *, > > On 02.11.2020 11:41:52, Andrea Righi wrote: > > We have the following potential deadlock condition: > > > > > > WARNING: possible irq lock inversion dependency detected > > 5.10.0-rc2+ #25 Not tainted > > > > swapper/3/0 just changed the state of lock: > > 8880063bd618 (>lock){-...}-{2:2}, at: > > ata_bmdma_interrupt+0x27/0x200 > > but this lock took another, HARDIRQ-READ-unsafe lock in the past: > > (>leddev_list_lock){.+.?}-{2:2} > > > > and interrupts could create inverse lock ordering between them. > > [...] > > > --- > > drivers/leds/led-triggers.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c > > index 91da90cfb11d..16d1a93a10a8 100644 > > --- a/drivers/leds/led-triggers.c > > +++ b/drivers/leds/led-triggers.c > > @@ -378,14 +378,15 @@ void led_trigger_event(struct led_trigger *trig, > > enum led_brightness brightness) > > { > > struct led_classdev *led_cdev; > > + unsigned long flags; > > > > if (!trig) > > return; > > > > - read_lock(>leddev_list_lock); > > + read_lock_irqsave(>leddev_list_lock, flags); > > list_for_each_entry(led_cdev, >led_cdevs, trig_list) > > led_set_brightness(led_cdev, brightness); > > - read_unlock(>leddev_list_lock); > > + read_unlock_irqrestore(>leddev_list_lock, flags); > > } > > EXPORT_SYMBOL_GPL(led_trigger_event); > > meanwhile this patch hit v5.10.x stable and caused a performance > degradation on our use case: > > It's an embedded ARM system, 4x Cortex A53, with an SPI attached CAN > controller. CAN stands for Controller Area Network and here used to > connect to some automotive equipment. Over CAN an ISOTP (a CAN-specific > Transport Protocol) transfer is running. With this patch, we see CAN > frames delayed for ~6ms, the usual gap between CAN frames is 240µs. > > Reverting this patch, restores the old performance. > > What is the best way to solve this dilemma? Identify the critical path > in our use case? Is there a way we can get around the irqsave in > led_trigger_event()? > Probably, we can change from rwlock to rcu here, POC code as follow, only compile tested. Marc, could you see whether this help the performance on your platform? Please note that I haven't test it in a running kernel and I'm not that familir with led subsystem, so use it with caution ;-) (While at it, I think maybe we miss the leddev_list_lock in net/mac80211 in the patch) Regards, Boqun --->8 diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c index 4e7b78a84149..ae68ccab6cc9 100644 --- a/drivers/leds/led-triggers.c +++ b/drivers/leds/led-triggers.c @@ -171,10 +171,12 @@ int led_trigger_set(struct led_classdev *led_cdev, struct led_trigger *trig) /* Remove any existing trigger */ if (led_cdev->trigger) { - write_lock_irqsave(_cdev->trigger->leddev_list_lock, flags); - list_del(_cdev->trig_list); - write_unlock_irqrestore(_cdev->trigger->leddev_list_lock, + spin_lock_irqsave(_cdev->trigger->leddev_list_lock, flags); + list_del_rcu(_cdev->trig_list); + spin_unlock_irqrestore(_cdev->trigger->leddev_list_lock, flags); + /* Wait for the readers gone */ + synchronize_rcu(); cancel_work_sync(_cdev->set_brightness_work); led_stop_software_blink(led_cdev); if (led_cdev->trigger->deactivate) @@ -186,9 +188,9 @@ int led_trigger_set(struct led_classdev *led_cdev, struct led_trigger *trig) led_set_brightness(led_cdev, LED_OFF); } if (trig) { - write_lock_irqsave(>leddev_list_lock, flags); - list_add_tail(_cdev->trig_list, >led_cdevs); - write_unlock_irqrestore(>leddev_list_lock, flags); + spin_lock_irqsave(>leddev_list_lock, flags); + list_add_tail_rcu(_cdev->trig_list, >led_cdevs); + spin_unlock_irqrestore(>leddev_list_lock, flags); led_cdev->trigger = trig; if (trig->activate) @@ -223,9 +225,12 @@ int led_trigger_set(struct led_classdev *led_cdev, struct led_trigger *trig) trig->deactivate(led_cdev); err_activate: - write_lock_irqsave(_cdev->trigger->leddev_list_lock, flags); - list_del(_cdev->trig_list); - write_unlock_irqrestore(_cdev->trigger->leddev_list_lock, flags); + spin_lock_irqsave(_cdev->trigger->leddev_list_lock, flags); + list_del_rcu(_cdev->trig_list); + spin_unlock_irqrestore(_cdev->trigger->leddev_list_lock, flags); + + /* XXX could use call_rcu() here? */ +
Re: [PATCH] leds: trigger: fix potential deadlock with libata
Hello *, On 02.11.2020 11:41:52, Andrea Righi wrote: > We have the following potential deadlock condition: > > > WARNING: possible irq lock inversion dependency detected > 5.10.0-rc2+ #25 Not tainted > > swapper/3/0 just changed the state of lock: > 8880063bd618 (>lock){-...}-{2:2}, at: > ata_bmdma_interrupt+0x27/0x200 > but this lock took another, HARDIRQ-READ-unsafe lock in the past: > (>leddev_list_lock){.+.?}-{2:2} > > and interrupts could create inverse lock ordering between them. [...] > --- > drivers/leds/led-triggers.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c > index 91da90cfb11d..16d1a93a10a8 100644 > --- a/drivers/leds/led-triggers.c > +++ b/drivers/leds/led-triggers.c > @@ -378,14 +378,15 @@ void led_trigger_event(struct led_trigger *trig, > enum led_brightness brightness) > { > struct led_classdev *led_cdev; > + unsigned long flags; > > if (!trig) > return; > > - read_lock(>leddev_list_lock); > + read_lock_irqsave(>leddev_list_lock, flags); > list_for_each_entry(led_cdev, >led_cdevs, trig_list) > led_set_brightness(led_cdev, brightness); > - read_unlock(>leddev_list_lock); > + read_unlock_irqrestore(>leddev_list_lock, flags); > } > EXPORT_SYMBOL_GPL(led_trigger_event); meanwhile this patch hit v5.10.x stable and caused a performance degradation on our use case: It's an embedded ARM system, 4x Cortex A53, with an SPI attached CAN controller. CAN stands for Controller Area Network and here used to connect to some automotive equipment. Over CAN an ISOTP (a CAN-specific Transport Protocol) transfer is running. With this patch, we see CAN frames delayed for ~6ms, the usual gap between CAN frames is 240µs. Reverting this patch, restores the old performance. What is the best way to solve this dilemma? Identify the critical path in our use case? Is there a way we can get around the irqsave in led_trigger_event()? regards, Marc -- Pengutronix e.K. | Marc Kleine-Budde | Embedded Linux | https://www.pengutronix.de | Vertretung West/Dortmund | Phone: +49-231-2826-924 | Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917- | signature.asc Description: PGP signature
Re: [PATCH] leds: trigger: fix potential deadlock with libata
On Wed, Nov 25, 2020 at 03:15:18PM +0100, Andrea Righi wrote: ... > > I'd hate to see this in stable 3 days after Linus merges it... > > > > Do these need _irqsave, too? > > > > drivers/leds/led-triggers.c: read_lock(>leddev_list_lock); > > drivers/leds/led-triggers.c: read_unlock(>leddev_list_lock); > > drivers/leds/led-triggers.c: read_lock(>leddev_list_lock); > > drivers/leds/led-triggers.c: read_unlock(>leddev_list_lock); > > > > Best regards, > > I think also led_trigger_blink_setup() needs to use irqsave/irqrestore, > in fact: > > $ git grep "led_trigger_blink(" > drivers/leds/led-triggers.c:void led_trigger_blink(struct led_trigger *trig, > drivers/power/supply/power_supply_leds.c: > led_trigger_blink(psy->charging_blink_full_solid_trig, > include/linux/leds.h:void led_trigger_blink(struct led_trigger *trigger, > unsigned long *delay_on, > include/linux/leds.h:static inline void led_trigger_blink(struct led_trigger > *trigger, > > power_supply_leds.c is using led_trigger_blink() from a workqueue > context, so potentially the same deadlock condition can also happen. > > Let me know if you want me to send a new patch to include also this > case. Just sent (and tested) a v2 of this patch that changes also led_trigger_blink_setup(). -Andrea
Re: [PATCH] leds: trigger: fix potential deadlock with libata
Hi Pavel, On Wed, Nov 25, 2020 at 01:46:48PM +0100, Pavel Machek wrote: > Hi! > > > We have the following potential deadlock condition: > > > > > > WARNING: possible irq lock inversion dependency detected > > 5.10.0-rc2+ #25 Not tainted > > > > swapper/3/0 just changed the state of lock: > > 8880063bd618 (>lock){-...}-{2:2}, at: > > ata_bmdma_interrupt+0x27/0x200 > > but this lock took another, HARDIRQ-READ-unsafe lock in the past: > > (>leddev_list_lock){.+.?}-{2:2} > > > > and interrupts could create inverse lock ordering between them. > > > > other info that might help us debug this: > > Possible interrupt unsafe locking scenario: > > > > CPU0CPU1 > > > >lock(>leddev_list_lock); > > local_irq_disable(); > > lock(>lock); > > lock(>leddev_list_lock); > > > > lock(>lock); > > > > *** DEADLOCK *** > > > > no locks held by swapper/3/0. > > > > the shortest dependencies between 2nd lock and 1st lock: > > -> (>leddev_list_lock){.+.?}-{2:2} ops: 46 { > > HARDIRQ-ON-R at: > >lock_acquire+0x15f/0x420 > >_raw_read_lock+0x42/0x90 > >led_trigger_event+0x2b/0x70 > >rfkill_global_led_trigger_worker+0x94/0xb0 > >process_one_work+0x240/0x560 > >worker_thread+0x58/0x3d0 > >kthread+0x151/0x170 > >ret_from_fork+0x1f/0x30 > > IN-SOFTIRQ-R at: > >lock_acquire+0x15f/0x420 > >_raw_read_lock+0x42/0x90 > >led_trigger_event+0x2b/0x70 > >kbd_bh+0x9e/0xc0 > >tasklet_action_common.constprop.0+0xe9/0x100 > >tasklet_action+0x22/0x30 > >__do_softirq+0xcc/0x46d > >run_ksoftirqd+0x3f/0x70 > >smpboot_thread_fn+0x116/0x1f0 > >kthread+0x151/0x170 > >ret_from_fork+0x1f/0x30 > > SOFTIRQ-ON-R at: > >lock_acquire+0x15f/0x420 > >_raw_read_lock+0x42/0x90 > >led_trigger_event+0x2b/0x70 > >rfkill_global_led_trigger_worker+0x94/0xb0 > >process_one_work+0x240/0x560 > >worker_thread+0x58/0x3d0 > >kthread+0x151/0x170 > >ret_from_fork+0x1f/0x30 > > INITIAL READ USE at: > >lock_acquire+0x15f/0x420 > >_raw_read_lock+0x42/0x90 > >led_trigger_event+0x2b/0x70 > >rfkill_global_led_trigger_worker+0x94/0xb0 > >process_one_work+0x240/0x560 > >worker_thread+0x58/0x3d0 > >kthread+0x151/0x170 > >ret_from_fork+0x1f/0x30 > >} > >... key at: [] __key.0+0x0/0x10 > >... acquired at: > > _raw_read_lock+0x42/0x90 > > led_trigger_blink_oneshot+0x3b/0x90 > > ledtrig_disk_activity+0x3c/0xa0 > > ata_qc_complete+0x26/0x450 > > ata_do_link_abort+0xa3/0xe0 > > ata_port_freeze+0x2e/0x40 > > ata_hsm_qc_complete+0x94/0xa0 > > ata_sff_hsm_move+0x177/0x7a0 > > ata_sff_pio_task+0xc7/0x1b0 > > process_one_work+0x240/0x560 > > worker_thread+0x58/0x3d0 > > kthread+0x151/0x170 > > ret_from_fork+0x1f/0x30 > > > > -> (>lock){-...}-{2:2} ops: 69 { > > IN-HARDIRQ-W at: > > lock_acquire+0x15f/0x420 > > _raw_spin_lock_irqsave+0x52/0xa0 > > ata_bmdma_interrupt+0x27/0x200 > > __handle_irq_event_percpu+0xd5/0x2b0 > > handle_irq_event+0x57/0xb0 > > handle_edge_irq+0x8c/0x230 > > asm_call_irq_on_stack+0xf/0x20 > > common_interrupt+0x100/0x1c0 > > asm_common_interrupt+0x1e/0x40 > > native_safe_halt+0xe/0x10 > > arch_cpu_idle+0x15/0x20 > > default_idle_call+0x59/0x1c0 > > do_idle+0x22c/0x2c0 > > cpu_startup_entry+0x20/0x30 > > start_secondary+0x11d/0x150 > > secondary_startup_64_no_verify+0xa6/0xab > > INITIAL USE at: > > lock_acquire+0x15f/0x420 > > _raw_spin_lock_irqsave+0x52/0xa0 > > ata_dev_init+0x54/0xe0 > >
Re: [PATCH] leds: trigger: fix potential deadlock with libata
On Wed, Nov 25, 2020 at 01:46:48PM +0100, Pavel Machek wrote: > Hi! > > > We have the following potential deadlock condition: > > > > > > WARNING: possible irq lock inversion dependency detected > > 5.10.0-rc2+ #25 Not tainted > > > > swapper/3/0 just changed the state of lock: > > 8880063bd618 (>lock){-...}-{2:2}, at: > > ata_bmdma_interrupt+0x27/0x200 > > but this lock took another, HARDIRQ-READ-unsafe lock in the past: > > (>leddev_list_lock){.+.?}-{2:2} > > > > and interrupts could create inverse lock ordering between them. > > > > other info that might help us debug this: > > Possible interrupt unsafe locking scenario: > > > > CPU0CPU1 > > > >lock(>leddev_list_lock); > > local_irq_disable(); > > lock(>lock); > > lock(>leddev_list_lock); > > > > lock(>lock); > > > > *** DEADLOCK *** > > > > no locks held by swapper/3/0. > > > > the shortest dependencies between 2nd lock and 1st lock: > > -> (>leddev_list_lock){.+.?}-{2:2} ops: 46 { > > HARDIRQ-ON-R at: > >lock_acquire+0x15f/0x420 > >_raw_read_lock+0x42/0x90 > >led_trigger_event+0x2b/0x70 > >rfkill_global_led_trigger_worker+0x94/0xb0 > >process_one_work+0x240/0x560 > >worker_thread+0x58/0x3d0 > >kthread+0x151/0x170 > >ret_from_fork+0x1f/0x30 > > IN-SOFTIRQ-R at: > >lock_acquire+0x15f/0x420 > >_raw_read_lock+0x42/0x90 > >led_trigger_event+0x2b/0x70 > >kbd_bh+0x9e/0xc0 > >tasklet_action_common.constprop.0+0xe9/0x100 > >tasklet_action+0x22/0x30 > >__do_softirq+0xcc/0x46d > >run_ksoftirqd+0x3f/0x70 > >smpboot_thread_fn+0x116/0x1f0 > >kthread+0x151/0x170 > >ret_from_fork+0x1f/0x30 > > SOFTIRQ-ON-R at: > >lock_acquire+0x15f/0x420 > >_raw_read_lock+0x42/0x90 > >led_trigger_event+0x2b/0x70 > >rfkill_global_led_trigger_worker+0x94/0xb0 > >process_one_work+0x240/0x560 > >worker_thread+0x58/0x3d0 > >kthread+0x151/0x170 > >ret_from_fork+0x1f/0x30 > > INITIAL READ USE at: > >lock_acquire+0x15f/0x420 > >_raw_read_lock+0x42/0x90 > >led_trigger_event+0x2b/0x70 > >rfkill_global_led_trigger_worker+0x94/0xb0 > >process_one_work+0x240/0x560 > >worker_thread+0x58/0x3d0 > >kthread+0x151/0x170 > >ret_from_fork+0x1f/0x30 > >} > >... key at: [] __key.0+0x0/0x10 > >... acquired at: > > _raw_read_lock+0x42/0x90 > > led_trigger_blink_oneshot+0x3b/0x90 > > ledtrig_disk_activity+0x3c/0xa0 > > ata_qc_complete+0x26/0x450 > > ata_do_link_abort+0xa3/0xe0 > > ata_port_freeze+0x2e/0x40 > > ata_hsm_qc_complete+0x94/0xa0 > > ata_sff_hsm_move+0x177/0x7a0 > > ata_sff_pio_task+0xc7/0x1b0 > > process_one_work+0x240/0x560 > > worker_thread+0x58/0x3d0 > > kthread+0x151/0x170 > > ret_from_fork+0x1f/0x30 > > > > -> (>lock){-...}-{2:2} ops: 69 { > > IN-HARDIRQ-W at: > > lock_acquire+0x15f/0x420 > > _raw_spin_lock_irqsave+0x52/0xa0 > > ata_bmdma_interrupt+0x27/0x200 > > __handle_irq_event_percpu+0xd5/0x2b0 > > handle_irq_event+0x57/0xb0 > > handle_edge_irq+0x8c/0x230 > > asm_call_irq_on_stack+0xf/0x20 > > common_interrupt+0x100/0x1c0 > > asm_common_interrupt+0x1e/0x40 > > native_safe_halt+0xe/0x10 > > arch_cpu_idle+0x15/0x20 > > default_idle_call+0x59/0x1c0 > > do_idle+0x22c/0x2c0 > > cpu_startup_entry+0x20/0x30 > > start_secondary+0x11d/0x150 > > secondary_startup_64_no_verify+0xa6/0xab > > INITIAL USE at: > > lock_acquire+0x15f/0x420 > > _raw_spin_lock_irqsave+0x52/0xa0 > > ata_dev_init+0x54/0xe0 > > ata_link_init+0x8b/0xd0
Re: [PATCH] leds: trigger: fix potential deadlock with libata
Hi! > We have the following potential deadlock condition: > > > WARNING: possible irq lock inversion dependency detected > 5.10.0-rc2+ #25 Not tainted > > swapper/3/0 just changed the state of lock: > 8880063bd618 (>lock){-...}-{2:2}, at: > ata_bmdma_interrupt+0x27/0x200 > but this lock took another, HARDIRQ-READ-unsafe lock in the past: > (>leddev_list_lock){.+.?}-{2:2} > > and interrupts could create inverse lock ordering between them. > > other info that might help us debug this: > Possible interrupt unsafe locking scenario: > > CPU0CPU1 > >lock(>leddev_list_lock); > local_irq_disable(); > lock(>lock); > lock(>leddev_list_lock); > > lock(>lock); > > *** DEADLOCK *** > > no locks held by swapper/3/0. > > the shortest dependencies between 2nd lock and 1st lock: > -> (>leddev_list_lock){.+.?}-{2:2} ops: 46 { > HARDIRQ-ON-R at: >lock_acquire+0x15f/0x420 >_raw_read_lock+0x42/0x90 >led_trigger_event+0x2b/0x70 >rfkill_global_led_trigger_worker+0x94/0xb0 >process_one_work+0x240/0x560 >worker_thread+0x58/0x3d0 >kthread+0x151/0x170 >ret_from_fork+0x1f/0x30 > IN-SOFTIRQ-R at: >lock_acquire+0x15f/0x420 >_raw_read_lock+0x42/0x90 >led_trigger_event+0x2b/0x70 >kbd_bh+0x9e/0xc0 >tasklet_action_common.constprop.0+0xe9/0x100 >tasklet_action+0x22/0x30 >__do_softirq+0xcc/0x46d >run_ksoftirqd+0x3f/0x70 >smpboot_thread_fn+0x116/0x1f0 >kthread+0x151/0x170 >ret_from_fork+0x1f/0x30 > SOFTIRQ-ON-R at: >lock_acquire+0x15f/0x420 >_raw_read_lock+0x42/0x90 >led_trigger_event+0x2b/0x70 >rfkill_global_led_trigger_worker+0x94/0xb0 >process_one_work+0x240/0x560 >worker_thread+0x58/0x3d0 >kthread+0x151/0x170 >ret_from_fork+0x1f/0x30 > INITIAL READ USE at: >lock_acquire+0x15f/0x420 >_raw_read_lock+0x42/0x90 >led_trigger_event+0x2b/0x70 >rfkill_global_led_trigger_worker+0x94/0xb0 >process_one_work+0x240/0x560 >worker_thread+0x58/0x3d0 >kthread+0x151/0x170 >ret_from_fork+0x1f/0x30 >} >... key at: [] __key.0+0x0/0x10 >... acquired at: > _raw_read_lock+0x42/0x90 > led_trigger_blink_oneshot+0x3b/0x90 > ledtrig_disk_activity+0x3c/0xa0 > ata_qc_complete+0x26/0x450 > ata_do_link_abort+0xa3/0xe0 > ata_port_freeze+0x2e/0x40 > ata_hsm_qc_complete+0x94/0xa0 > ata_sff_hsm_move+0x177/0x7a0 > ata_sff_pio_task+0xc7/0x1b0 > process_one_work+0x240/0x560 > worker_thread+0x58/0x3d0 > kthread+0x151/0x170 > ret_from_fork+0x1f/0x30 > > -> (>lock){-...}-{2:2} ops: 69 { > IN-HARDIRQ-W at: > lock_acquire+0x15f/0x420 > _raw_spin_lock_irqsave+0x52/0xa0 > ata_bmdma_interrupt+0x27/0x200 > __handle_irq_event_percpu+0xd5/0x2b0 > handle_irq_event+0x57/0xb0 > handle_edge_irq+0x8c/0x230 > asm_call_irq_on_stack+0xf/0x20 > common_interrupt+0x100/0x1c0 > asm_common_interrupt+0x1e/0x40 > native_safe_halt+0xe/0x10 > arch_cpu_idle+0x15/0x20 > default_idle_call+0x59/0x1c0 > do_idle+0x22c/0x2c0 > cpu_startup_entry+0x20/0x30 > start_secondary+0x11d/0x150 > secondary_startup_64_no_verify+0xa6/0xab > INITIAL USE at: > lock_acquire+0x15f/0x420 > _raw_spin_lock_irqsave+0x52/0xa0 > ata_dev_init+0x54/0xe0 > ata_link_init+0x8b/0xd0 > ata_port_alloc+0x1f1/0x210 > ata_host_alloc+0xf1/0x130 > ata_host_alloc_pinfo+0x14/0xb0 > ata_pci_sff_prepare_host+0x41/0xa0 > ata_pci_bmdma_prepare_host+0x14/0x30 >
[PATCH] leds: trigger: fix potential deadlock with libata
We have the following potential deadlock condition: WARNING: possible irq lock inversion dependency detected 5.10.0-rc2+ #25 Not tainted swapper/3/0 just changed the state of lock: 8880063bd618 (>lock){-...}-{2:2}, at: ata_bmdma_interrupt+0x27/0x200 but this lock took another, HARDIRQ-READ-unsafe lock in the past: (>leddev_list_lock){.+.?}-{2:2} and interrupts could create inverse lock ordering between them. other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0CPU1 lock(>leddev_list_lock); local_irq_disable(); lock(>lock); lock(>leddev_list_lock); lock(>lock); *** DEADLOCK *** no locks held by swapper/3/0. the shortest dependencies between 2nd lock and 1st lock: -> (>leddev_list_lock){.+.?}-{2:2} ops: 46 { HARDIRQ-ON-R at: lock_acquire+0x15f/0x420 _raw_read_lock+0x42/0x90 led_trigger_event+0x2b/0x70 rfkill_global_led_trigger_worker+0x94/0xb0 process_one_work+0x240/0x560 worker_thread+0x58/0x3d0 kthread+0x151/0x170 ret_from_fork+0x1f/0x30 IN-SOFTIRQ-R at: lock_acquire+0x15f/0x420 _raw_read_lock+0x42/0x90 led_trigger_event+0x2b/0x70 kbd_bh+0x9e/0xc0 tasklet_action_common.constprop.0+0xe9/0x100 tasklet_action+0x22/0x30 __do_softirq+0xcc/0x46d run_ksoftirqd+0x3f/0x70 smpboot_thread_fn+0x116/0x1f0 kthread+0x151/0x170 ret_from_fork+0x1f/0x30 SOFTIRQ-ON-R at: lock_acquire+0x15f/0x420 _raw_read_lock+0x42/0x90 led_trigger_event+0x2b/0x70 rfkill_global_led_trigger_worker+0x94/0xb0 process_one_work+0x240/0x560 worker_thread+0x58/0x3d0 kthread+0x151/0x170 ret_from_fork+0x1f/0x30 INITIAL READ USE at: lock_acquire+0x15f/0x420 _raw_read_lock+0x42/0x90 led_trigger_event+0x2b/0x70 rfkill_global_led_trigger_worker+0x94/0xb0 process_one_work+0x240/0x560 worker_thread+0x58/0x3d0 kthread+0x151/0x170 ret_from_fork+0x1f/0x30 } ... key at: [] __key.0+0x0/0x10 ... acquired at: _raw_read_lock+0x42/0x90 led_trigger_blink_oneshot+0x3b/0x90 ledtrig_disk_activity+0x3c/0xa0 ata_qc_complete+0x26/0x450 ata_do_link_abort+0xa3/0xe0 ata_port_freeze+0x2e/0x40 ata_hsm_qc_complete+0x94/0xa0 ata_sff_hsm_move+0x177/0x7a0 ata_sff_pio_task+0xc7/0x1b0 process_one_work+0x240/0x560 worker_thread+0x58/0x3d0 kthread+0x151/0x170 ret_from_fork+0x1f/0x30 -> (>lock){-...}-{2:2} ops: 69 { IN-HARDIRQ-W at: lock_acquire+0x15f/0x420 _raw_spin_lock_irqsave+0x52/0xa0 ata_bmdma_interrupt+0x27/0x200 __handle_irq_event_percpu+0xd5/0x2b0 handle_irq_event+0x57/0xb0 handle_edge_irq+0x8c/0x230 asm_call_irq_on_stack+0xf/0x20 common_interrupt+0x100/0x1c0 asm_common_interrupt+0x1e/0x40 native_safe_halt+0xe/0x10 arch_cpu_idle+0x15/0x20 default_idle_call+0x59/0x1c0 do_idle+0x22c/0x2c0 cpu_startup_entry+0x20/0x30 start_secondary+0x11d/0x150 secondary_startup_64_no_verify+0xa6/0xab INITIAL USE at: lock_acquire+0x15f/0x420 _raw_spin_lock_irqsave+0x52/0xa0 ata_dev_init+0x54/0xe0 ata_link_init+0x8b/0xd0 ata_port_alloc+0x1f1/0x210 ata_host_alloc+0xf1/0x130 ata_host_alloc_pinfo+0x14/0xb0 ata_pci_sff_prepare_host+0x41/0xa0 ata_pci_bmdma_prepare_host+0x14/0x30 piix_init_one+0x21f/0x600 local_pci_probe+0x48/0x80 pci_device_probe+0x105/0x1c0 really_probe+0x221/0x490 driver_probe_device+0xe9/0x160