On Wed, 2012-04-11 at 11:33 +1000, Benjamin Herrenschmidt wrote: > On Wed, 2012-04-11 at 11:13 +1000, Benjamin Herrenschmidt wrote: > > On Sat, 2012-04-07 at 14:27 +0200, Andreas Schwab wrote: > > > Benjamin Herrenschmidt <b...@kernel.crashing.org> writes: > > > > > > > It's arguable that this irq_set_irq_type(,NONE) shouln't be there but > > > > still ... it's been around for ever and things worked :-) So something > > > > -else- is causing the problem and I'd like to understand what exactly. > > > > > > AFAICS before a09b659cd68c10ec6a30cb91ebd2c327fcd5bfe5 > > > irq_set_irq_type(,NONE) was actually a no-op. > > > > So I'm still a bit baffled... ie, I understand some of what's happening > > but not why it breaks things, I haven't yet managed to reproduce but I > > haven't tried too hard just yet (was away from the HW) : > > Allright, I have a repro-case, I'll dig.
Ok, so it's Grant's fault :-) So basically, it's quite subtle and I'm only 99% sure of the details but I believe what happens is: - The audio interrupts get virq 64 and 65 (so above NR_IRQS) - The reverse map isn't pre-filled at map time (we should probably do it nowadays), we do it lazily so ... - The audio interrupt happens, it tries to pre-fill the reverse map and ... fails (see below) - This cause irq_linear_revmap() to returns 0 - This hits another bug in mpic where when that happens it doesn't EOI the interrupt, which means the priority remains stuck high and we don't get any other interrupt. There's thus two bugs here: - We don't properly establish the reverse mapping for 64 and 65 (well, not always, again see below) - We don't EOI an interrupt we can't reverse map (we should warn & EOI, I'll send a patch for that). The second problem is a bug we shouldn't hit if the first one didn't happen, the first one comes from way irq_find_mapping() works. Basically when the revmap entry is 0, we call it to find the mapping & populate the revmap... and that fails. That fails because it will only search up to irq_virq_count which is statically initialized to NR_IRQS, which in our case is 64 so it won't find our interrupts 64 and 65... The reason it works if you don't do the set_type(NONE) is a fluke, it will crash as soon as you actually do audio: The set_type(NONE) has the effect in mpic to configure the interrupt to level low (default). This is also the idle state of the audio interrupt (which is positive edge), and the MPIC appears to be latching it (yeah odd, it does seem to latch level interrupts). So as soon as the audio driver enables it, it fires, triggering the bug. Without the set_type(NONE) the occurence of the bug is delayed to the fist time you get a real audio interrupt. Now what is the solution ? Well, there's several things I think we need to do: - MPIC should properly EOI interrupts it can't revmap (I'll fix that) - MPIC should probably not do the set_type(NONE) on map, it's not useful - However, we do have a risk of discrepency between the default trigger type in the irq_desc vs. the default HW value at startup/map time, so we do need to do something to reconcile them (that was the intend of NONE I think)... without that, we risk having irq_create_of_mapping() not setting the trigger because it thinks it thinks it's right... What do you think is best here ? We can probably read the HW config and sync the irq desc appropriately ... - The whole business with irq_virq_count needs fixing. Basically the default value shouldn't be NR_IRQ. I suggest making it 0 and have all the use sites do something like: max = irq_virq_count ? irq_virq_count : nr_irqs; (Grant, can you take care of that ?) - We still need to clear up all the NR_IRQS occurences in arch/powerpc Cheers, Ben. _______________________________________________ devicetree-discuss mailing list devicetree-discuss@lists.ozlabs.org https://lists.ozlabs.org/listinfo/devicetree-discuss