Re: [PATCH] irq: Request and release resources for chained IRQs

2020-07-07 Thread Linus Walleij
On Thu, Jun 18, 2020 at 8:42 PM Thomas Gleixner  wrote:
> David Heidelberg  writes:
> > is there chance to get this patch included or could be this issue
> > solved with different approach?
>
> Included into what? This patch is incorrect as I pointed out in review
> here:
>
>   
> https://lore.kernel.org/lkml/alpine.deb.2.21.1812071143480.14...@nanos.tec.linutronix.de/
>
> So why are you even asking?
>
> I recommended to switch this away from chained handler and then the
> whole story ended with this mail:
>
>   
> https://lore.kernel.org/lkml/alpine.deb.2.21.1812071404140.14...@nanos.tec.linutronix.de/
>
> I have no idea how the GPIO people solved that problem, but certainly
> not by applying this.

We didn't solve that problem.

The reason is that changing the platform from chained to regular handler
involves of course changing all the other drivers in the irqchip hierarchy
since chained is an all-or-nothing approach, and that needs to happen
over the whole set of Qualcomm drivers.

Sooner or later I guess I will get to it unless Bjorn or Brian or someone
else beats me to it.

Yours,
Linus Walleij


Re: [PATCH] irq: Request and release resources for chained IRQs

2020-06-18 Thread Thomas Gleixner
David Heidelberg  writes:
> is there chance to get this patch included or could be this issue 
> solved with different approach?

Included into what? This patch is incorrect as I pointed out in review
here:

  
https://lore.kernel.org/lkml/alpine.deb.2.21.1812071143480.14...@nanos.tec.linutronix.de/

So why are you even asking?

I recommended to switch this away from chained handler and then the
whole story ended with this mail:

  
https://lore.kernel.org/lkml/alpine.deb.2.21.1812071404140.14...@nanos.tec.linutronix.de/

I have no idea how the GPIO people solved that problem, but certainly
not by applying this.

Thanks,

tglx


Re: [PATCH] irq: Request and release resources for chained IRQs

2020-06-15 Thread Greg KH
On Mon, Jun 15, 2020 at 10:23:53PM +0200, David Heidelberg wrote:
> Hello,
> 
> is there chance to get this patch included

What patch?

Included where?

> or could be this issue solved
> with different approach?

Such as what?

Totally confused,

greg k-h


Re: [PATCH] irq: Request and release resources for chained IRQs

2018-12-07 Thread Thomas Gleixner
Linus,

On Fri, 7 Dec 2018, Linus Walleij wrote:
> On Fri, Dec 7, 2018 at 12:52 PM Thomas Gleixner  wrote:
> > Needs more thought. Btw, the uninstall path does not invoke irq_deactive()
> > either I so hate that chained handler mess
> 
> I think it is just extremely uncommon to remove a chained handler
> (I don't know if anything exercises that path even) that is why we
> don't see any fallout from it. It's probably just untested code.

git grep irq_set_chained.*NULL

tells that there are users. Whether any of this is ever invoked is a
different questions.

> Do you see that chained handlers have any merit at all or should
> they all be moved to nested? The question needs asking, but IIUC
> there are performance benefits with chaining as opposed to
> nesting. :/

The nested case in gpio land is about nesting in irq thread context, which
is obviously slower. So instead of nesting you can just use a regular
interrupt for demultiplexing.

The thing about chained is:

  1) It's hidden, i.e. not exposed in /proc/interrupt

  2) It's not protected against runaway interrupts, which has happened in
 x86 land. We've converted those over to use regular interrupts to
 catch that case and shutdown the offending interrupt.
 e.g. ba714a9c1dea

  3) All the chained handlers have their own flow control inside the
 handler.

The chained performance benefit compared to a regular handler is
minimal. The main difference are a few conditionals and the desc->action
indirection. You probably can measure it with a high interrupt load, but
I'm still not convinced that it outweighs the downsides for a lot of the
places which use chained interrupts. I can see the benefit for very low
level interrupts, but even there we had hard to debug issues with some ARM
SoCs which ran into interrupt storms.

But hey, we surely can fix that chained issue. It's not going to be pretty
though :)

Thanks,

tglx










Re: [PATCH] irq: Request and release resources for chained IRQs

2018-12-07 Thread Thomas Gleixner
Linus,

On Fri, 7 Dec 2018, Linus Walleij wrote:
> On Fri, Dec 7, 2018 at 12:52 PM Thomas Gleixner  wrote:
> > Needs more thought. Btw, the uninstall path does not invoke irq_deactive()
> > either I so hate that chained handler mess
> 
> I think it is just extremely uncommon to remove a chained handler
> (I don't know if anything exercises that path even) that is why we
> don't see any fallout from it. It's probably just untested code.

git grep irq_set_chained.*NULL

tells that there are users. Whether any of this is ever invoked is a
different questions.

> Do you see that chained handlers have any merit at all or should
> they all be moved to nested? The question needs asking, but IIUC
> there are performance benefits with chaining as opposed to
> nesting. :/

The nested case in gpio land is about nesting in irq thread context, which
is obviously slower. So instead of nesting you can just use a regular
interrupt for demultiplexing.

The thing about chained is:

  1) It's hidden, i.e. not exposed in /proc/interrupt

  2) It's not protected against runaway interrupts, which has happened in
 x86 land. We've converted those over to use regular interrupts to
 catch that case and shutdown the offending interrupt.
 e.g. ba714a9c1dea

  3) All the chained handlers have their own flow control inside the
 handler.

The chained performance benefit compared to a regular handler is
minimal. The main difference are a few conditionals and the desc->action
indirection. You probably can measure it with a high interrupt load, but
I'm still not convinced that it outweighs the downsides for a lot of the
places which use chained interrupts. I can see the benefit for very low
level interrupts, but even there we had hard to debug issues with some ARM
SoCs which ran into interrupt storms.

But hey, we surely can fix that chained issue. It's not going to be pretty
though :)

Thanks,

tglx










Re: [PATCH] irq: Request and release resources for chained IRQs

2018-12-07 Thread Linus Walleij
On Fri, Dec 7, 2018 at 12:52 PM Thomas Gleixner  wrote:

> > > This irqchip is in drivers/pinctrl/qcom/pinctrl-msm.c and known
> > > as TLMM.
>
> TLMM == Totally Lost Manufacturer Mess?

It's pretty OK for the most part actually.

> You cannot invoke those callbacks from __irq_do_set_handler() as that is
> holding the irq descriptor lock with interrupts disabled.

How typical. Hm I wonder how to fix this properly then.

> Needs more thought. Btw, the uninstall path does not invoke irq_deactive()
> either I so hate that chained handler mess

I think it is just extremely uncommon to remove a chained handler
(I don't know if anything exercises that path even) that is why we
don't see any fallout from it. It's probably just untested code.

Do you see that chained handlers have any merit at all or should
they all be moved to nested? The question needs asking, but IIUC
there are performance benefits with chaining as opposed to
nesting. :/

Yours,
Linus Walleij


Re: [PATCH] irq: Request and release resources for chained IRQs

2018-12-07 Thread Linus Walleij
On Fri, Dec 7, 2018 at 12:52 PM Thomas Gleixner  wrote:

> > > This irqchip is in drivers/pinctrl/qcom/pinctrl-msm.c and known
> > > as TLMM.
>
> TLMM == Totally Lost Manufacturer Mess?

It's pretty OK for the most part actually.

> You cannot invoke those callbacks from __irq_do_set_handler() as that is
> holding the irq descriptor lock with interrupts disabled.

How typical. Hm I wonder how to fix this properly then.

> Needs more thought. Btw, the uninstall path does not invoke irq_deactive()
> either I so hate that chained handler mess

I think it is just extremely uncommon to remove a chained handler
(I don't know if anything exercises that path even) that is why we
don't see any fallout from it. It's probably just untested code.

Do you see that chained handlers have any merit at all or should
they all be moved to nested? The question needs asking, but IIUC
there are performance benefits with chaining as opposed to
nesting. :/

Yours,
Linus Walleij


Re: [PATCH] irq: Request and release resources for chained IRQs

2018-12-07 Thread Thomas Gleixner
On Fri, 7 Dec 2018, Linus Walleij wrote:

Sorry for answering late.

> On Thu, Nov 29, 2018 at 2:31 PM Linus Walleij  
> wrote:
> >
> > This irqchip is in drivers/pinctrl/qcom/pinctrl-msm.c and known
> > as TLMM.

TLMM == Totally Lost Manufacturer Mess?

> > However, the irqchip core helpers in GPIO assumes that
> > the .irq_request_resources() callback is always called before
> > .irq_enable(), and the latter is where module refcount and
> > also gpiochip_lock_as_irq() is normally called.
> >
> > When .irq_enable() is called without .irq_request_resources()
> > having been called first, it seems like we are enabling
> > an IRQ on a GPIO line that has not first been locked to be
> > used as IRQ and we get the above warning. This happens since
> > as of
> > commit 461c1a7d4733d ("gpiolib: override irq_enable/disable")
> > this is strictly assumed for all GPIO-based IRQs.
> >
> > As it seems reasonable to assume that .irq_request_resources()
> > is always strictly called before .irq_enable(), we add the
> > irq_[request|release]_resources() functions to the internal
> > API and call them also when adding a chained irqchip to
> > an IRQ.
> >
> > I am a bit uncertain about the call site for
> > irq_released_resources() inside chip.c, but it appears this
> > path is for uninstalling a chained handler.

You cannot invoke those callbacks from __irq_do_set_handler() as that is
holding the irq descriptor lock with interrupts disabled.

The calling convention for irq_released_resources() is that it's called
with the bus lock held (if the chip provides the bus lock callbacks), but
definitely not with desc->lock held.

Needs more thought. Btw, the uninstall path does not invoke irq_deactive()
either I so hate that chained handler mess

Thanks,

tglx


Re: [PATCH] irq: Request and release resources for chained IRQs

2018-12-07 Thread Thomas Gleixner
On Fri, 7 Dec 2018, Linus Walleij wrote:

Sorry for answering late.

> On Thu, Nov 29, 2018 at 2:31 PM Linus Walleij  
> wrote:
> >
> > This irqchip is in drivers/pinctrl/qcom/pinctrl-msm.c and known
> > as TLMM.

TLMM == Totally Lost Manufacturer Mess?

> > However, the irqchip core helpers in GPIO assumes that
> > the .irq_request_resources() callback is always called before
> > .irq_enable(), and the latter is where module refcount and
> > also gpiochip_lock_as_irq() is normally called.
> >
> > When .irq_enable() is called without .irq_request_resources()
> > having been called first, it seems like we are enabling
> > an IRQ on a GPIO line that has not first been locked to be
> > used as IRQ and we get the above warning. This happens since
> > as of
> > commit 461c1a7d4733d ("gpiolib: override irq_enable/disable")
> > this is strictly assumed for all GPIO-based IRQs.
> >
> > As it seems reasonable to assume that .irq_request_resources()
> > is always strictly called before .irq_enable(), we add the
> > irq_[request|release]_resources() functions to the internal
> > API and call them also when adding a chained irqchip to
> > an IRQ.
> >
> > I am a bit uncertain about the call site for
> > irq_released_resources() inside chip.c, but it appears this
> > path is for uninstalling a chained handler.

You cannot invoke those callbacks from __irq_do_set_handler() as that is
holding the irq descriptor lock with interrupts disabled.

The calling convention for irq_released_resources() is that it's called
with the bus lock held (if the chip provides the bus lock callbacks), but
definitely not with desc->lock held.

Needs more thought. Btw, the uninstall path does not invoke irq_deactive()
either I so hate that chained handler mess

Thanks,

tglx


Re: [PATCH] irq: Request and release resources for chained IRQs

2018-12-07 Thread Linus Walleij
On Thu, Nov 29, 2018 at 2:31 PM Linus Walleij  wrote:

> This addresses a bug in the irqchip core that was triggered
> by new code assuming a strict semantic order of the irqchip
> calls:
>
>  .irq_request_resources()
>  .irq_enable()
>  .irq_disable()
>  .irq_release_resources()
>
> Mostly this is working fine when handled inside manage.c,
> the calls are strictly happening in the above assumed order.
>
> However on a Qualcomm platform I get the following in dmesg:
>
> WARNING: CPU: 0 PID: 1 at ../drivers/gpio/gpiolib.c:3513
>  gpiochip_irq_enable+0x18/0x44
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper/0
>  Not tainted 4.20.0-rc4-2-g1b8a75b25c6e-dirty #9
> Hardware name: Generic DT based system
> [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
> [] (show_stack) from [] (dump_stack+0x78/0x8c)
> [] (dump_stack) from [] (__warn+0xe0/0xf8)
> [] (__warn) from [] (warn_slowpath_null+0x40/0x48)
> [] (warn_slowpath_null) from []
>  (gpiochip_irq_enable+0x18/0x44)
> [] (gpiochip_irq_enable) from [] (irq_enable+0x44/0x64)
> [] (irq_enable) from [] (__irq_startup+0xa0/0xa8)
> [] (__irq_startup) from [] (irq_startup+0x4c/0x130)
> [] (irq_startup) from []
>  (irq_set_chained_handler_and_data+0x4c/0x78)
> [] (irq_set_chained_handler_and_data) from []
>  (pm8xxx_probe+0x160/0x22c)
> [] (pm8xxx_probe) from [] (platform_drv_probe+0x48/0x98)
>
> What happens is that when the pm8xxx driver tries to register
> a chained IRQ irq_set_chained_handler_and_data() will eventually
> set the type and call irq_startup() and thus the .irq_enable()
> callback on the irqchip.
>
> This irqchip is in drivers/pinctrl/qcom/pinctrl-msm.c and known
> as TLMM.
>
> However, the irqchip core helpers in GPIO assumes that
> the .irq_request_resources() callback is always called before
> .irq_enable(), and the latter is where module refcount and
> also gpiochip_lock_as_irq() is normally called.
>
> When .irq_enable() is called without .irq_request_resources()
> having been called first, it seems like we are enabling
> an IRQ on a GPIO line that has not first been locked to be
> used as IRQ and we get the above warning. This happens since
> as of
> commit 461c1a7d4733d ("gpiolib: override irq_enable/disable")
> this is strictly assumed for all GPIO-based IRQs.
>
> As it seems reasonable to assume that .irq_request_resources()
> is always strictly called before .irq_enable(), we add the
> irq_[request|release]_resources() functions to the internal
> API and call them also when adding a chained irqchip to
> an IRQ.
>
> I am a bit uncertain about the call site for
> irq_released_resources() inside chip.c, but it appears this
> path is for uninstalling a chained handler.
>
> Cc: sta...@vger.kernel.org
> Cc: Bjorn Andersson 
> Cc: Hans Verkuil 
> Fixes: 461c1a7d4733d ("gpiolib: override irq_enable/disable")
> Signed-off-by: Linus Walleij 

IRQ folks: did you have time to look at this?

It's a very real regression for me.

Yours,
Linus Walleij


Re: [PATCH] irq: Request and release resources for chained IRQs

2018-12-07 Thread Linus Walleij
On Thu, Nov 29, 2018 at 2:31 PM Linus Walleij  wrote:

> This addresses a bug in the irqchip core that was triggered
> by new code assuming a strict semantic order of the irqchip
> calls:
>
>  .irq_request_resources()
>  .irq_enable()
>  .irq_disable()
>  .irq_release_resources()
>
> Mostly this is working fine when handled inside manage.c,
> the calls are strictly happening in the above assumed order.
>
> However on a Qualcomm platform I get the following in dmesg:
>
> WARNING: CPU: 0 PID: 1 at ../drivers/gpio/gpiolib.c:3513
>  gpiochip_irq_enable+0x18/0x44
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper/0
>  Not tainted 4.20.0-rc4-2-g1b8a75b25c6e-dirty #9
> Hardware name: Generic DT based system
> [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
> [] (show_stack) from [] (dump_stack+0x78/0x8c)
> [] (dump_stack) from [] (__warn+0xe0/0xf8)
> [] (__warn) from [] (warn_slowpath_null+0x40/0x48)
> [] (warn_slowpath_null) from []
>  (gpiochip_irq_enable+0x18/0x44)
> [] (gpiochip_irq_enable) from [] (irq_enable+0x44/0x64)
> [] (irq_enable) from [] (__irq_startup+0xa0/0xa8)
> [] (__irq_startup) from [] (irq_startup+0x4c/0x130)
> [] (irq_startup) from []
>  (irq_set_chained_handler_and_data+0x4c/0x78)
> [] (irq_set_chained_handler_and_data) from []
>  (pm8xxx_probe+0x160/0x22c)
> [] (pm8xxx_probe) from [] (platform_drv_probe+0x48/0x98)
>
> What happens is that when the pm8xxx driver tries to register
> a chained IRQ irq_set_chained_handler_and_data() will eventually
> set the type and call irq_startup() and thus the .irq_enable()
> callback on the irqchip.
>
> This irqchip is in drivers/pinctrl/qcom/pinctrl-msm.c and known
> as TLMM.
>
> However, the irqchip core helpers in GPIO assumes that
> the .irq_request_resources() callback is always called before
> .irq_enable(), and the latter is where module refcount and
> also gpiochip_lock_as_irq() is normally called.
>
> When .irq_enable() is called without .irq_request_resources()
> having been called first, it seems like we are enabling
> an IRQ on a GPIO line that has not first been locked to be
> used as IRQ and we get the above warning. This happens since
> as of
> commit 461c1a7d4733d ("gpiolib: override irq_enable/disable")
> this is strictly assumed for all GPIO-based IRQs.
>
> As it seems reasonable to assume that .irq_request_resources()
> is always strictly called before .irq_enable(), we add the
> irq_[request|release]_resources() functions to the internal
> API and call them also when adding a chained irqchip to
> an IRQ.
>
> I am a bit uncertain about the call site for
> irq_released_resources() inside chip.c, but it appears this
> path is for uninstalling a chained handler.
>
> Cc: sta...@vger.kernel.org
> Cc: Bjorn Andersson 
> Cc: Hans Verkuil 
> Fixes: 461c1a7d4733d ("gpiolib: override irq_enable/disable")
> Signed-off-by: Linus Walleij 

IRQ folks: did you have time to look at this?

It's a very real regression for me.

Yours,
Linus Walleij