Re: [PATCH] iio: trigger: sysfs: Disable irqs before calling iio_trigger_poll()

2020-08-13 Thread Lars-Peter Clausen

On 8/12/20 1:01 PM, Christian Eggers wrote:

Hi Lars

On Monday, 3 August 2020, 08:52:54 CEST, Lars-Peter Clausen wrote:

On 8/3/20 8:44 AM, Christian Eggers wrote:

...
is my patch sufficient, or would you prefer a different solution?

The code in normal upstream is correct, there is no need to patch it
since iio_sysfs_trigger_work() always runs with IRQs disabled.


Are you using a non-upstream kernel? Maybe a RT kernel?

I use v5.4.-rt

That explains it. Have a look at
0200-irqwork-push-most-work-into-softirq-context.patch.

The right fix for this issue is to add the following snippet to the RT
patchset.

diff --git a/drivers/iio/trigger/iio-trig-sysfs.c
b/drivers/iio/trigger/iio-trig-sysfs.c
--- a/drivers/iio/trigger/iio-trig-sysfs.c
+++ b/drivers/iio/trigger/iio-trig-sysfs.c
@@ -161,6 +161,7 @@ static int iio_sysfs_trigger_probe(int id)
   iio_trigger_set_drvdata(t->trig, t);

   init_irq_work(&t->work, iio_sysfs_trigger_work);
+t->work.flags = IRQ_WORK_HARD_IRQ;

   ret = iio_trigger_register(t->trig);
   if (ret)

I can confirm that this works for iio-trig-sysfs on 5.4.54-rt32. Currently I
do not use iio-trig-hrtimer, but if I remember correctly, the problem was also
present there.


Similar story, I think. On mainline hrtimers run in hardirq mode by 
default, whereas in RT they run in softirq mode by default. So we 
haven't see the issue in mainline.


To fix this we need to explicitly specify that the IIO hrtimer always 
needs to run in hardirq mode by using the HRTIMER_MODE_REL_HARD flag. 
I'll send a patch.



Do you want to apply your patch for mainline? In contrast to v5.4,
IRQ_WORK_HARD_IRQ is already available there (moved to smp_types.h).
Unfortunately I cannot test it on mainline for now, as my BSP stuff is not
ported yet.


Sounds like a plan :)



Re: [PATCH] iio: trigger: sysfs: Disable irqs before calling iio_trigger_poll()

2020-08-12 Thread Christian Eggers
Hi Lars

On Monday, 3 August 2020, 08:52:54 CEST, Lars-Peter Clausen wrote:
> On 8/3/20 8:44 AM, Christian Eggers wrote:
> > ...
> > is my patch sufficient, or would you prefer a different solution?
> 
> The code in normal upstream is correct, there is no need to patch it
> since iio_sysfs_trigger_work() always runs with IRQs disabled.
> 
> >> Are you using a non-upstream kernel? Maybe a RT kernel?
> > 
> > I use v5.4.-rt
> 
> That explains it. Have a look at
> 0200-irqwork-push-most-work-into-softirq-context.patch.
> 
> The right fix for this issue is to add the following snippet to the RT
> patchset.
> 
> diff --git a/drivers/iio/trigger/iio-trig-sysfs.c
> b/drivers/iio/trigger/iio-trig-sysfs.c
> --- a/drivers/iio/trigger/iio-trig-sysfs.c
> +++ b/drivers/iio/trigger/iio-trig-sysfs.c
> @@ -161,6 +161,7 @@ static int iio_sysfs_trigger_probe(int id)
>   iio_trigger_set_drvdata(t->trig, t);
> 
>   init_irq_work(&t->work, iio_sysfs_trigger_work);
> +t->work.flags = IRQ_WORK_HARD_IRQ;
> 
>   ret = iio_trigger_register(t->trig);
>   if (ret)

I can confirm that this works for iio-trig-sysfs on 5.4.54-rt32. Currently I 
do not use iio-trig-hrtimer, but if I remember correctly, the problem was also 
present there.

Do you want to apply your patch for mainline? In contrast to v5.4, 
IRQ_WORK_HARD_IRQ is already available there (moved to smp_types.h). 
Unfortunately I cannot test it on mainline for now, as my BSP stuff is not 
ported yet.

Best regards
Christian






Re: [PATCH] iio: trigger: sysfs: Disable irqs before calling iio_trigger_poll()

2020-08-02 Thread Lars-Peter Clausen

On 8/3/20 8:44 AM, Christian Eggers wrote:

Hi Lars,

On Monday, 3 August 2020, 08:37:43 CEST, Lars-Peter Clausen wrote:

The sysfs IIO trigger uses irq_work to schedule the iio_trigger_poll()
and the promise of irq_work is that the callback will run in hard IRQ
context. That's the whole point of it.

irq_work_run_list(), which shows up in your callgraph, has as
BUG_ON(!irqs_disabled())[1], so we should never even get to calling
iio_trigger_poll() if IRQs where not disabled at this point. That's the
same condition that triggers the WARN_ON() in __handle_irq_event_percpu.

is my patch sufficient, or would you prefer a different solution?
The code in normal upstream is correct, there is no need to patch it 
since iio_sysfs_trigger_work() always runs with IRQs disabled.



Are you using a non-upstream kernel? Maybe a RT kernel?

I use v5.4.-rt


That explains it. Have a look at 
0200-irqwork-push-most-work-into-softirq-context.patch.


The right fix for this issue is to add the following snippet to the RT 
patchset.


diff --git a/drivers/iio/trigger/iio-trig-sysfs.c 
b/drivers/iio/trigger/iio-trig-sysfs.c

--- a/drivers/iio/trigger/iio-trig-sysfs.c
+++ b/drivers/iio/trigger/iio-trig-sysfs.c
@@ -161,6 +161,7 @@ static int iio_sysfs_trigger_probe(int id)
 iio_trigger_set_drvdata(t->trig, t);

 init_irq_work(&t->work, iio_sysfs_trigger_work);
+    t->work.flags = IRQ_WORK_HARD_IRQ;

 ret = iio_trigger_register(t->trig);
 if (ret)



Re: [PATCH] iio: trigger: sysfs: Disable irqs before calling iio_trigger_poll()

2020-08-02 Thread Christian Eggers
Hi Lars,

On Monday, 3 August 2020, 08:37:43 CEST, Lars-Peter Clausen wrote:
> The sysfs IIO trigger uses irq_work to schedule the iio_trigger_poll()
> and the promise of irq_work is that the callback will run in hard IRQ
> context. That's the whole point of it.
> 
> irq_work_run_list(), which shows up in your callgraph, has as
> BUG_ON(!irqs_disabled())[1], so we should never even get to calling
> iio_trigger_poll() if IRQs where not disabled at this point. That's the
> same condition that triggers the WARN_ON() in __handle_irq_event_percpu.
is my patch sufficient, or would you prefer a different solution?

> Are you using a non-upstream kernel? Maybe a RT kernel?
I use v5.4.-rt

regards
Christian





Re: [PATCH] iio: trigger: sysfs: Disable irqs before calling iio_trigger_poll()

2020-08-02 Thread Lars-Peter Clausen

On 8/3/20 7:16 AM, Christian Eggers wrote:

On Saturday, 1 August 2020, 18:02:34 CEST, Jonathan Cameron wrote:

On Mon, 27 Jul 2020 16:57:13 +0200

Christian Eggers  wrote:

iio_trigger_poll() calls generic_handle_irq(). This function expects to
be run with local IRQs disabled.

Was there an error or warning that lead to this patch?

[   17.448466] 000: [ cut here ]
[   17.448481] 000: WARNING: CPU: 0 PID: 9 at kernel/irq/handle.c:152 
__handle_irq_event_percpu+0x55/0xae
[   17.448511] 000: irq 236 handler irq_default_primary_handler+0x1/0x4 enabled 
interrupts
[   17.448526] 000: Modules linked in: bridge stp llc usb_f_ncm u_ether 
libcomposite sd_mod configfs cdc_acm usb_storage scsi_mod ci_hdrc_imx ci_hdrc 
st_magn_spi ulpi st_sensors_spi ehci_hcd regmap_spi tcpm roles st_magn_i2c 
typec st_sensors_i2c udc_core st_magn as73211 st_sensors imx_thermal usb49xx 
usbcore industrialio_triggered_buffer rtc_rv3028 kfifo_buf at24 usb_common 
nls_base i2c_dev usbmisc_imx phy_mxs_usb anatop_regulator imx2_wdt imx_fan 
spidev leds_pwm leds_gpio led_class iio_trig_sysfs imx6sx_adc industrialio 
fixed at25 spi_imx spi_bitbang imx_napi dev imx_sdma virt_dma nfsv3 nfs lockd 
grace sunrpc ksz9477_i2c ksz9477 tag_ksz ksz_common dsa_core phylink regmap_i2c 
i2c_imx i2c_core fec ptp pps_core micrel
[   17.448712] 000: CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 5.4.47-rt28+ 
#446
[   17.448723] 000: Hardware name: Freescale i.MX6 Ultralite (Device Tree)
[   17.448738] 000: [] (unwind_backtrace) from [] 
(show_stack+0xb/0xc)
[   17.448754] 000: [] (show_stack) from [] 
(__warn+0x7b/0x8c)
[   17.448772] 000: [] (__warn) from [] 
(warn_slowpath_fmt+0x31/0x50)
[   17.448787] 000: [] (warn_slowpath_fmt) from [] 
(__handle_irq_event_percpu+0x55/0xae)
[   17.448807] 000: [] (__handle_irq_event_percpu) from [] 
(handle_irq_event_percpu+0x19/0x40)
[   17.448823] 000: [] (handle_irq_event_percpu) from [] 
(handle_irq_event+0x3f/0x5c)
[   17.448839] 000: [] (handle_irq_event) from [] 
(handle_simple_irq+0x67/0x6a)
[   17.448854] 000: [] (handle_simple_irq) from [] 
(generic_handle_irq+0xd/0x16)
[   17.448870] 000: [] (generic_handle_irq) from [] 
(iio_trigger_poll+0x33/0x44 [industrialio])
[   17.448962] 000: [] (iio_trigger_poll [industrialio]) from 
[] (irq_work_run_list+0x43/0x66)
[   17.449010] 000: [] (irq_work_run_list) from [] 
(run_timer_softirq+0x7/0x3c)
[   17.449029] 000: [] (run_timer_softirq) from [] 
(__do_softirq+0x10f/0x160)
[   17.449045] 000: [] (__do_softirq) from [] 
(run_ksoftirqd+0x19/0x2c)
[   17.449061] 000: [] (run_ksoftirqd) from [] 
(smpboot_thread_fn+0x13b/0x140)
[   17.449078] 000: [] (smpboot_thread_fn) from [] 
(kthread+0xa3/0xac)
[   17.449095] 000: [] (kthread) from [] 
(ret_from_fork+0x11/0x20)
[   17.449110] 000: Exception stack(0xc2063fb0 to 0xc2063ff8)
[   17.449119] 000: 3fa0:   
 
[   17.449130] 000: 3fc0:       
 
[   17.449139] 000: 3fe0:     0013 
[   17.449146] 000: ---[ end trace 0002 ]---




Or can you point to what call in generic_handle_irq is making the
assumption that we are breaking?

Given this is using the irq_work framework I'm wondering if this is
a more general problem?

If I understand correctly, the kernel temporarily disables hardware interrupts
while hardware irq handlers are run. In case of the iio-trig-hrtim and 
iio-trig-sysfs
interrupts, __handle_irq_event_percpu() is not called from a hardware irq
(where interrupts would be disabled), but from software.


The sysfs IIO trigger uses irq_work to schedule the iio_trigger_poll() 
and the promise of irq_work is that the callback will run in hard IRQ 
context. That's the whole point of it.


irq_work_run_list(), which shows up in your callgraph, has as 
BUG_ON(!irqs_disabled())[1], so we should never even get to calling 
iio_trigger_poll() if IRQs where not disabled at this point. That's the 
same condition that triggers the WARN_ON() in __handle_irq_event_percpu.


Are you using a non-upstream kernel? Maybe a RT kernel?

- Lars

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/irq_work.c#n163





Re: [PATCH] iio: trigger: sysfs: Disable irqs before calling iio_trigger_poll()

2020-08-02 Thread Christian Eggers
On Saturday, 1 August 2020, 18:02:34 CEST, Jonathan Cameron wrote:
> On Mon, 27 Jul 2020 16:57:13 +0200
> 
> Christian Eggers  wrote:
> > iio_trigger_poll() calls generic_handle_irq(). This function expects to
> > be run with local IRQs disabled.
> 
> Was there an error or warning that lead to this patch?
[   17.448466] 000: [ cut here ]
[   17.448481] 000: WARNING: CPU: 0 PID: 9 at kernel/irq/handle.c:152 
__handle_irq_event_percpu+0x55/0xae
[   17.448511] 000: irq 236 handler irq_default_primary_handler+0x1/0x4 enabled 
interrupts
[   17.448526] 000: Modules linked in: bridge stp llc usb_f_ncm u_ether 
libcomposite sd_mod configfs cdc_acm usb_storage scsi_mod ci_hdrc_imx ci_hdrc 
st_magn_spi ulpi st_sensors_spi ehci_hcd regmap_spi tcpm roles st_magn_i2c 
typec st_sensors_i2c udc_core st_magn as73211 st_sensors imx_thermal usb49xx 
usbcore industrialio_triggered_buffer rtc_rv3028 kfifo_buf at24 usb_common 
nls_base i2c_dev usbmisc_imx phy_mxs_usb anatop_regulator imx2_wdt imx_fan 
spidev leds_pwm leds_gpio led_class iio_trig_sysfs imx6sx_adc industrialio 
fixed at25 spi_imx spi_bitbang imx_napi dev imx_sdma virt_dma nfsv3 nfs lockd 
grace sunrpc ksz9477_i2c ksz9477 tag_ksz ksz_common dsa_core phylink regmap_i2c 
i2c_imx i2c_core fec ptp pps_core micrel
[   17.448712] 000: CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 5.4.47-rt28+ 
#446
[   17.448723] 000: Hardware name: Freescale i.MX6 Ultralite (Device Tree)
[   17.448738] 000: [] (unwind_backtrace) from [] 
(show_stack+0xb/0xc)
[   17.448754] 000: [] (show_stack) from [] 
(__warn+0x7b/0x8c)
[   17.448772] 000: [] (__warn) from [] 
(warn_slowpath_fmt+0x31/0x50)
[   17.448787] 000: [] (warn_slowpath_fmt) from [] 
(__handle_irq_event_percpu+0x55/0xae)
[   17.448807] 000: [] (__handle_irq_event_percpu) from [] 
(handle_irq_event_percpu+0x19/0x40)
[   17.448823] 000: [] (handle_irq_event_percpu) from [] 
(handle_irq_event+0x3f/0x5c)
[   17.448839] 000: [] (handle_irq_event) from [] 
(handle_simple_irq+0x67/0x6a)
[   17.448854] 000: [] (handle_simple_irq) from [] 
(generic_handle_irq+0xd/0x16)
[   17.448870] 000: [] (generic_handle_irq) from [] 
(iio_trigger_poll+0x33/0x44 [industrialio])
[   17.448962] 000: [] (iio_trigger_poll [industrialio]) from 
[] (irq_work_run_list+0x43/0x66)
[   17.449010] 000: [] (irq_work_run_list) from [] 
(run_timer_softirq+0x7/0x3c)
[   17.449029] 000: [] (run_timer_softirq) from [] 
(__do_softirq+0x10f/0x160)
[   17.449045] 000: [] (__do_softirq) from [] 
(run_ksoftirqd+0x19/0x2c)
[   17.449061] 000: [] (run_ksoftirqd) from [] 
(smpboot_thread_fn+0x13b/0x140)
[   17.449078] 000: [] (smpboot_thread_fn) from [] 
(kthread+0xa3/0xac)
[   17.449095] 000: [] (kthread) from [] 
(ret_from_fork+0x11/0x20)
[   17.449110] 000: Exception stack(0xc2063fb0 to 0xc2063ff8)
[   17.449119] 000: 3fa0:   
 
[   17.449130] 000: 3fc0:       
 
[   17.449139] 000: 3fe0:     0013 
[   17.449146] 000: ---[ end trace 0002 ]---



> Or can you point to what call in generic_handle_irq is making the
> assumption that we are breaking?
> 
> Given this is using the irq_work framework I'm wondering if this is
> a more general problem?

If I understand correctly, the kernel temporarily disables hardware interrupts
while hardware irq handlers are run. In case of the iio-trig-hrtim and 
iio-trig-sysfs
interrupts, __handle_irq_event_percpu() is not called from a hardware irq
(where interrupts would be disabled), but from software.

Similar examples found here:
0a29ac5bd3 ("net: usb: lan78xx: Disable interrupts before calling 
generic_handle_irq()")

and
drivers/i2c/busses/i2c-cht-wc.c:103


> 
> Basically more info please!
> 
> Thanks,
> 
> Jonathan
> 
Regards
Christian





Re: [PATCH] iio: trigger: sysfs: Disable irqs before calling iio_trigger_poll()

2020-08-02 Thread Lars-Peter Clausen

On 8/1/20 6:02 PM, Jonathan Cameron wrote:

On Mon, 27 Jul 2020 16:57:13 +0200
Christian Eggers  wrote:


iio_trigger_poll() calls generic_handle_irq(). This function expects to
be run with local IRQs disabled.

Was there an error or warning that lead to this patch?
Or can you point to what call in generic_handle_irq is making the
assumption that we are breaking?

Given this is using the irq_work framework I'm wondering if this is
a more general problem?

Basically more info please!


There is this series https://lkml.org/lkml/2020/3/6/433 which causes 
generic_handle_irq() to issue an warning if it is called with IRQs on, 
for a IRQ controller that can't handle it.


But I'm not convinced this applies to the IIO code, since this is a 
purely virtual interrupt and is not interfering with any interrupt 
controller hardware.





Re: [PATCH] iio: trigger: sysfs: Disable irqs before calling iio_trigger_poll()

2020-08-01 Thread Jonathan Cameron
On Mon, 27 Jul 2020 16:57:13 +0200
Christian Eggers  wrote:

> iio_trigger_poll() calls generic_handle_irq(). This function expects to
> be run with local IRQs disabled.

Was there an error or warning that lead to this patch?
Or can you point to what call in generic_handle_irq is making the
assumption that we are breaking?

Given this is using the irq_work framework I'm wondering if this is
a more general problem?

Basically more info please!

Thanks,

Jonathan


> 
> Signed-off-by: Christian Eggers 
> Cc: sta...@vger.kernel.org
> ---
>  drivers/iio/trigger/iio-trig-sysfs.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/iio/trigger/iio-trig-sysfs.c 
> b/drivers/iio/trigger/iio-trig-sysfs.c
> index e09e58072872..66a96b1632f8 100644
> --- a/drivers/iio/trigger/iio-trig-sysfs.c
> +++ b/drivers/iio/trigger/iio-trig-sysfs.c
> @@ -94,7 +94,9 @@ static void iio_sysfs_trigger_work(struct irq_work *work)
>   struct iio_sysfs_trig *trig = container_of(work, struct iio_sysfs_trig,
>   work);
>  
> + local_irq_disable();
>   iio_trigger_poll(trig->trig);
> + local_irq_enable();
>  }
>  
>  static ssize_t iio_sysfs_trigger_poll(struct device *dev,