Re: [linux-sunxi] Re: Strange behavior with missing H3 interrupts

2018-04-13 Thread Martin Kelly

On 04/11/2018 05:40 AM, Maxime Ripard wrote:

On Tue, Apr 10, 2018 at 12:33:27PM -0700, Martin Kelly wrote:

On 04/10/2018 06:54 AM, Maxime Ripard wrote:

On Fri, Apr 06, 2018 at 02:52:35PM +0100, Andre Przywara wrote:

On 05/04/18 20:48, Martin Kelly wrote:

On 04/05/2018 06:07 AM, Maxime Ripard wrote:

On Wed, Apr 04, 2018 at 02:50:25PM -0700, Martin Kelly wrote:

Hi,

I've noticed strange behavior on my H3 (nanopi neo air) and am wondering if
anyone has suggestions for further debugging it, as I'm getting stumped.

Specifically, I have configured a device (Invensense MPU9250) to deliver
interrupts at 10Hz to PG_EINT11. For some reason, though, the interrupt
handler is being called at only about 6 Hz.

Looking at a logic analyzer, I see the hardware is interrupting at 10 Hz as
it should, but sometimes the interrupts are just missed from the kernel
side. So you might see a 200 ms gap between calls to the IRQ handler, but
100 ms between hardware IRQ events.


This really looks like you're just missing the edge. Interrupt
handlers in Linux run with interrupts disabled, so if you happen
to have another interrupt running at the time your device is
emitting its own, you'll miss it.


But the software/kernel shouldn't matter in that case, should it?
It is actually the port controller hardware registering the interrupt
cause, and then forwarding this to the GIC, and that to the CPU.
So once the Allwinner port controller has sampled the IRQ, it sets the
pending bit in the PG_EINT_STATUS_REG, from then on the interrupt cannot
be lost anymore. Unless it's configured as a line level IRQ on the pin
controller side, where a lowered line (the end of the pulse) would mean
the pending state is cleared again. So it should really be edge on the
pinctrl side.

Or am I missing something?


That's how I would expect a proper controller to behave, yeah, but
I've never experienced one behaving like that.

It should be pretty easy to test, you just need to read the pending
register once the interrupts are re-enabled.



Would this be in the arm-gic code or in the sunxi-pinctrl code?


The interrupts are masked at the CPU level, so even before the
GIC. However, you can always add a hack to read the PIO registers
whenever the interrupts are re-enabled.


When I instrument the sunxi-pinctrl code, I see lots of calls to
sunxi_pinctrl_irq_ack but no calls to sunxi_pinctrl_irq_mask, so the
controller is not seeing the interrupts at all.


If I remember this properly, _irq_mask will only be called when you
disable that interrupt in particular, through a call to disable_irq
for example.



Yes, based on what I'm seeing, I agree. I tried a hack to readl the PIO 
INT STATUS register for the interrupt I'm dealing with, and it appears 
it's not being set to pending, AFAICT.



If it can use level interrupts, you probably should use that instead.


Well, it's always level between the pinctrl and the GIC, and even if it
would be edge, the GIC would store this state until the CPU acknowledges
it. PSTATE.I=0 shouldn't have an effect.

I would actually expect it to be the other way around: configuring as
*level* on the pinctrl side allows for IRQ *pulses* to be lost.


The behaviour I've seen on some controllers is that it's actually
following the input pin state, which means that if the input pin goes
low, the line between the pin controller and the GIC will also go
low. And since it's level based, you will not notice it.


Could it be that the pinctrl is clocked too slowly, so it can't sample
the pins quickly enough and misses the rather short pulse?


By default it's clocked at 32kHz, which means a period of around
30us. That's indeed not enough if the pulse is around 50us. I guess
you could try to play with the input-debounce property and see if it
makes things better.



Could you expand on that? Since 30us is shorter than the pulse time, I'm
unclear on why the interrupts would still be missed.


Right, never mind, that was a brainfart :)

Still playing with the clock and the clock scaler would be a good idea
too to see if it has any impact.



Ah OK. I will try changing the clock oscillator and see what happens. 
Thanks very much again for your help!
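
The arithmetic behind the exchange above can be checked with a short sketch. It assumes the "32kHz" figure means the common 32768 Hz oscillator and takes the roughly 50 us pulse width from the thread; the 24 MHz value is an assumed alternative clock source, and the function names are made up for illustration. It is not a model of the actual H3 sampling or debounce logic.

```python
import math


def sample_period_us(clock_hz: float) -> float:
    """Time between two consecutive samples, in microseconds."""
    return 1_000_000.0 / clock_hz


def guaranteed_samples(pulse_us: float, clock_hz: float) -> int:
    """Minimum number of sample points that land inside a pulse,
    regardless of how the pulse is aligned to the sampling clock."""
    return math.floor(pulse_us * clock_hz / 1_000_000.0)


if __name__ == "__main__":
    for hz in (32_768, 24_000_000):  # assumed clock choices
        print(hz, round(sample_period_us(hz), 3), guaranteed_samples(50, hz))
```

With these assumptions a 50 us pulse always contains at least one sample at 32768 Hz, which matches the observation that a 30 us period alone does not explain the missed interrupts.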



Maxime



--
You received this message because you are subscribed to the Google Groups 
"linux-sunxi" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to linux-sunxi+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [linux-sunxi] Re: Strange behavior with missing H3 interrupts

2018-04-10 Thread Martin Kelly

On 04/10/2018 06:54 AM, Maxime Ripard wrote:

On Fri, Apr 06, 2018 at 02:52:35PM +0100, Andre Przywara wrote:

On 05/04/18 20:48, Martin Kelly wrote:

On 04/05/2018 06:07 AM, Maxime Ripard wrote:

On Wed, Apr 04, 2018 at 02:50:25PM -0700, Martin Kelly wrote:

Hi,

I've noticed strange behavior on my H3 (nanopi neo air) and am wondering if
anyone has suggestions for further debugging it, as I'm getting stumped.

Specifically, I have configured a device (Invensense MPU9250) to deliver
interrupts at 10Hz to PG_EINT11. For some reason, though, the interrupt
handler is being called at only about 6 Hz.

Looking at a logic analyzer, I see the hardware is interrupting at 10 Hz as
it should, but sometimes the interrupts are just missed from the kernel
side. So you might see a 200 ms gap between calls to the IRQ handler, but
100 ms between hardware IRQ events.


This really looks like you're just missing the edge. Interrupt
handlers in Linux run with interrupts disabled, so if you happen
to have another interrupt running at the time your device is
emitting its own, you'll miss it.


But the software/kernel shouldn't matter in that case, should it?
It is actually the port controller hardware registering the interrupt
cause, and then forwarding this to the GIC, and that to the CPU.
So once the Allwinner port controller has sampled the IRQ, it sets the
pending bit in the PG_EINT_STATUS_REG, from then on the interrupt cannot
be lost anymore. Unless it's configured as a line level IRQ on the pin
controller side, where a lowered line (the end of the pulse) would mean
the pending state is cleared again. So it should really be edge on the
pinctrl side.

Or am I missing something?


That's how I would expect a proper controller to behave, yeah, but
I've never experienced one behaving like that.

It should be pretty easy to test, you just need to read the pending
register once the interrupts are re-enabled.



Would this be in the arm-gic code or in the sunxi-pinctrl code? When I 
instrument the sunxi-pinctrl code, I see lots of calls to 
sunxi_pinctrl_irq_ack but no calls to sunxi_pinctrl_irq_mask, so the 
controller is not seeing the interrupts at all.



If it can use level interrupts, you probably should use that instead.


Well, it's always level between the pinctrl and the GIC, and even if it
would be edge, the GIC would store this state until the CPU acknowledges
it. PSTATE.I=0 shouldn't have an effect.

I would actually expect it to be the other way around: configuring as
*level* on the pinctrl side allows for IRQ *pulses* to be lost.


The behaviour I've seen on some controllers is that it's actually
following the input pin state, which means that if the input pin goes
low, the line between the pin controller and the GIC will also go
low. And since it's level based, you will not notice it.


Could it be that the pinctrl is clocked too slowly, so it can't sample
the pins quickly enough and misses the rather short pulse?


By default it's clocked at 32kHz, which means a period of around
30us. That's indeed not enough if the pulse is around 50us. I guess
you could try to play with the input-debounce property and see if it
makes things better.



Could you expand on that? Since 30us is shorter than the pulse time, I'm 
unclear on why the interrupts would still be missed.




[linux-sunxi] Re: Strange behavior with missing H3 interrupts

2018-04-06 Thread Martin Kelly

On 04/05/2018 10:50 PM, Maxime Ripard wrote:

On Thu, Apr 05, 2018 at 12:48:56PM -0700, Martin Kelly wrote:

On 04/05/2018 06:07 AM, Maxime Ripard wrote:

On Wed, Apr 04, 2018 at 02:50:25PM -0700, Martin Kelly wrote:

Hi,

I've noticed strange behavior on my H3 (nanopi neo air) and am wondering if
anyone has suggestions for further debugging it, as I'm getting stumped.

Specifically, I have configured a device (Invensense MPU9250) to deliver
interrupts at 10Hz to PG_EINT11. For some reason, though, the interrupt
handler is being called at only about 6 Hz.

Looking at a logic analyzer, I see the hardware is interrupting at 10 Hz as
it should, but sometimes the interrupts are just missed from the kernel
side. So you might see a 200 ms gap between calls to the IRQ handler, but
100 ms between hardware IRQ events.


This really looks like you're just missing the edge. Interrupt
handlers in Linux run with interrupts disabled, so if you happen
to have another interrupt running at the time your device is
emitting its own, you'll miss it.

If it can use level interrupts, you probably should use that instead.


Yes, I have to agree with your theory. I switched to level interrupts and
nothing is missed now.


Ok, great.



Thanks very much for the suggestion!



Re: [linux-sunxi] Re: Strange behavior with missing H3 interrupts

2018-04-06 Thread Martin Kelly

On 04/06/2018 06:52 AM, Andre Przywara wrote:

Hi,

On 05/04/18 20:48, Martin Kelly wrote:

On 04/05/2018 06:07 AM, Maxime Ripard wrote:

On Wed, Apr 04, 2018 at 02:50:25PM -0700, Martin Kelly wrote:

Hi,

I've noticed strange behavior on my H3 (nanopi neo air) and am wondering if
anyone has suggestions for further debugging it, as I'm getting stumped.

Specifically, I have configured a device (Invensense MPU9250) to deliver
interrupts at 10Hz to PG_EINT11. For some reason, though, the interrupt
handler is being called at only about 6 Hz.

Looking at a logic analyzer, I see the hardware is interrupting at 10 Hz as
it should, but sometimes the interrupts are just missed from the kernel
side. So you might see a 200 ms gap between calls to the IRQ handler, but
100 ms between hardware IRQ events.


This really looks like you're just missing the edge. Interrupt
handlers in Linux run with interrupts disabled, so if you happen
to have another interrupt running at the time your device is
emitting its own, you'll miss it.


But the software/kernel shouldn't matter in that case, should it?
It is actually the port controller hardware registering the interrupt
cause, and then forwarding this to the GIC, and that to the CPU.
So once the Allwinner port controller has sampled the IRQ, it sets the
pending bit in the PG_EINT_STATUS_REG, from then on the interrupt cannot
be lost anymore. Unless it's configured as a line level IRQ on the pin
controller side, where a lowered line (the end of the pulse) would mean
the pending state is cleared again. So it should really be edge on the
pinctrl side.

Or am I missing something?



Yes, this is what is confusing me; I would expect the pending bit to be
set and the interrupt to still be serviced if only a single edge occurred
while interrupts were disabled (albeit slightly delayed).


Another strange aspect of this is that roughly the same percentage of 
interrupts are missed at low (10 Hz) and higher (500 Hz) frequencies: 
40% or so. This would align well with a sampling rate mismatch.


I am also seeing some very strange extreme cases of nearly a full second
without interrupts, which seems hard to explain under any theory I can
come up with. Here is a screenshot from my logic analyzer:


https://ibb.co/cUQSBc

D0 is the I2C SCL
D1 is the I2C SDA
D2 is the interrupt (falling edge in this case)

The interrupt handler causes I2C traffic, so when you see the traffic
shortly after the interrupt, the interrupt handler fired, while in the
other cases it did not.



If it can use level interrupts, you probably should use that instead.


Well, it's always level between the pinctrl and the GIC, and even if it
would be edge, the GIC would store this state until the CPU acknowledges
it. PSTATE.I=0 shouldn't have an effect.

I would actually expect it to be the other way around: configuring as
*level* on the pinctrl side allows for IRQ *pulses* to be lost.

Could it be that the pinctrl is clocked too slowly, so it can't sample
the pins quickly enough and misses the rather short pulse?

Cheers,
Andre.


Yes, I have to agree with your theory. I switched to level interrupts
and nothing is missed now.

One thing I'm confused about: I had expected that if a single
edge-triggered interrupt is missed because interrupts were disabled, it
would still be resumed later. From my testing, it appears when
interrupts are missed, the entire period is skipped, and we act on
the next interrupt instead.

For example, I'm seeing:
0 ms: interrupt, handler fires
100 ms: interrupt, no handler
200 ms: interrupt, handler fires

I never see a slightly-delayed handler, as I would expect if interrupts
are being processed after they are re-enabled.

I also noticed that, depending on what's going on in the system and the
interrupt frequencies (my tests ranged from 10 to 500 Hz), the pattern
of which interrupts are missed changes. Sometimes at 10 Hz, I see almost
an entire second of missed interrupts, and then a few seconds of every
interrupt being hit, and then another long batch of missed interrupts.
At higher frequencies, the misses are about the same frequency (about
40% misses), but more fine-grained instead of batched up. I'm not sure
what to make of these patterns, but 40% miss rate on a 10 Hz signal (at
near 0% CPU load) really surprises me.





Re: [linux-sunxi] Re: Strange behavior with missing H3 interrupts

2018-04-06 Thread Andre Przywara
Hi,

On 05/04/18 20:48, Martin Kelly wrote:
> On 04/05/2018 06:07 AM, Maxime Ripard wrote:
>> On Wed, Apr 04, 2018 at 02:50:25PM -0700, Martin Kelly wrote:
>>> Hi,
>>>
>>> I've noticed strange behavior on my H3 (nanopi neo air) and am wondering if
>>> anyone has suggestions for further debugging it, as I'm getting stumped.
>>>
>>> Specifically, I have configured a device (Invensense MPU9250) to deliver
>>> interrupts at 10Hz to PG_EINT11. For some reason, though, the interrupt
>>> handler is being called at only about 6 Hz.
>>>
>>> Looking at a logic analyzer, I see the hardware is interrupting at 10 Hz as
>>> it should, but sometimes the interrupts are just missed from the kernel
>>> side. So you might see a 200 ms gap between calls to the IRQ handler, but
>>> 100 ms between hardware IRQ events.
>>
>> This really looks like you're just missing the edge. Interrupt
>> handlers in Linux run with interrupts disabled, so if you happen
>> to have another interrupt running at the time your device is
>> emitting its own, you'll miss it.

But the software/kernel shouldn't matter in that case, should it?
It is actually the port controller hardware registering the interrupt
cause, and then forwarding this to the GIC, and that to the CPU.
So once the Allwinner port controller has sampled the IRQ, it sets the
pending bit in the PG_EINT_STATUS_REG, from then on the interrupt cannot
be lost anymore. Unless it's configured as a line level IRQ on the pin
controller side, where a lowered line (the end of the pulse) would mean
the pending state is cleared again. So it should really be edge on the
pinctrl side.

Or am I missing something?

>> If it can use level interrupts, you probably should use that instead.

Well, it's always level between the pinctrl and the GIC, and even if it
would be edge, the GIC would store this state until the CPU acknowledges
it. PSTATE.I=0 shouldn't have an effect.

I would actually expect it to be the other way around: configuring as
*level* on the pinctrl side allows for IRQ *pulses* to be lost.

Could it be that the pinctrl is clocked too slowly, so it can't sample
the pins quickly enough and misses the rather short pulse?

Cheers,
Andre.
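
The argument above, that a latched edge cannot be lost while a level that merely follows the pin can, can be illustrated with a toy simulation. This is only a sketch of the reasoning, not of the real H3 port controller: the function name, the one-step time grid, and the "edge"/"level" mode strings are all invented for illustration, and the pulse is modelled as active-high for simplicity.

```python
def handled_events(pin_samples, irq_enabled, mode):
    """Return the time steps at which the handler runs.

    pin_samples[t] is True while the device asserts the line.
    irq_enabled[t] is True when the CPU is able to take the interrupt.
    mode "edge" latches a pending bit on the inactive-to-active transition;
    mode "level" makes the pending state simply follow the pin.
    """
    pending = False
    previous = False
    handled = []
    for t, (pin, enabled) in enumerate(zip(pin_samples, irq_enabled)):
        if mode == "edge":
            if pin and not previous:
                pending = True      # latched: survives the end of the pulse
        else:
            pending = pin           # level: cleared as soon as the line drops
        if pending and enabled:
            handled.append(t)
            pending = False         # handler acknowledges the interrupt
        previous = pin
    return handled


# A two-step pulse arrives while interrupts are masked for steps 2..4.
pulse = [False, False, True, True, False, False, False]
enabled = [True, True, False, False, False, True, True]
print(handled_events(pulse, enabled, "edge"))   # delayed, but not lost
print(handled_events(pulse, enabled, "level"))  # pulse is gone by step 5
```

In this model the edge case is serviced late at step 5, while the level case never fires, which is the opposite of what was observed on the board and is why the behaviour is puzzling.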

> Yes, I have to agree with your theory. I switched to level interrupts
> and nothing is missed now.
> 
> One thing I'm confused about: I had expected that if a single
> edge-triggered interrupt is missed because interrupts were disabled, it
> would still be resumed later. From my testing, it appears when
> interrupts are missed, the entire period is skipped, and we act on
> the next interrupt instead.
> 
> For example, I'm seeing:
> 0 ms: interrupt, handler fires
> 100 ms: interrupt, no handler
> 200 ms: interrupt, handler fires
> 
> I never see a slightly-delayed handler, as I would expect if interrupts
> are being processed after they are re-enabled.
> 
> I also noticed that, depending on what's going on in the system and the
> interrupt frequencies (my tests ranged from 10 to 500 Hz), the pattern
> of which interrupts are missed changes. Sometimes at 10 Hz, I see almost
> an entire second of missed interrupts, and then a few seconds of every
> interrupt being hit, and then another long batch of missed interrupts.
> At higher frequencies, the misses are about the same frequency (about
> 40% misses), but more fine-grained instead of batched up. I'm not sure
> what to make of these patterns, but 40% miss rate on a 10 Hz signal (at
> near 0% CPU load) really surprises me.
> 



[linux-sunxi] Re: Strange behavior with missing H3 interrupts

2018-04-05 Thread Martin Kelly

On 04/05/2018 06:07 AM, Maxime Ripard wrote:

On Wed, Apr 04, 2018 at 02:50:25PM -0700, Martin Kelly wrote:

Hi,

I've noticed strange behavior on my H3 (nanopi neo air) and am wondering if
anyone has suggestions for further debugging it, as I'm getting stumped.

Specifically, I have configured a device (Invensense MPU9250) to deliver
interrupts at 10Hz to PG_EINT11. For some reason, though, the interrupt
handler is being called at only about 6 Hz.

Looking at a logic analyzer, I see the hardware is interrupting at 10 Hz as
it should, but sometimes the interrupts are just missed from the kernel
side. So you might see a 200 ms gap between calls to the IRQ handler, but
100 ms between hardware IRQ events.


This really looks like you're just missing the edge. Interrupt
handlers in Linux run with interrupts disabled, so if you happen
to have another interrupt running at the time your device is
emitting its own, you'll miss it.

If it can use level interrupts, you probably should use that instead.

Maxime



Yes, I have to agree with your theory. I switched to level interrupts 
and nothing is missed now.


One thing I'm confused about: I had expected that if a single
edge-triggered interrupt is missed because interrupts were disabled, it
would still be resumed later. From my testing, it appears when
interrupts are missed, the entire period is skipped, and we act on
the next interrupt instead.


For example, I'm seeing:
0 ms: interrupt, handler fires
100 ms: interrupt, no handler
200 ms: interrupt, handler fires

I never see a slightly-delayed handler, as I would expect if interrupts 
are being processed after they are re-enabled.


I also noticed that, depending on what's going on in the system and the 
interrupt frequencies (my tests ranged from 10 to 500 Hz), the pattern 
of which interrupts are missed changes. Sometimes at 10 Hz, I see almost 
an entire second of missed interrupts, and then a few seconds of every 
interrupt being hit, and then another long batch of missed interrupts. 
At higher frequencies, the misses are about the same frequency (about 
40% misses), but more fine-grained instead of batched up. I'm not sure 
what to make of these patterns, but 40% miss rate on a 10 Hz signal (at 
near 0% CPU load) really surprises me.
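
A miss rate like the one described above can be computed from two lists of timestamps, one from the logic analyzer and one from the handler. The following is a minimal sketch: the timestamps are made-up illustrative values, not measurements from this thread, and the function name and the 5 ms matching tolerance are assumptions.

```python
def miss_rate(event_times_ms, handler_times_ms, tolerance_ms=5):
    """Fraction of hardware events with no handler call shortly afterwards."""
    missed = 0
    for event in event_times_ms:
        # An event counts as handled if a handler call follows it
        # within the tolerance window.
        if not any(0 <= h - event <= tolerance_ms for h in handler_times_ms):
            missed += 1
    return missed / len(event_times_ms)


# Hypothetical data: a 10 Hz source for one second, handler runs 6 times.
events = list(range(0, 1000, 100))
handlers = [1, 201, 301, 501, 601, 801]
print(miss_rate(events, handlers))
```

Matching each event to a handler call inside a short window also makes a slightly delayed handler visible as a hit, which separates "delayed" from "skipped entirely".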

