On Mon, Aug 8, 2016 at 2:31 AM, Jon Hunter <jonath...@nvidia.com> wrote:
>
> On 06/08/16 00:45, John Stultz wrote:
>> On Mon, Aug 1, 2016 at 3:26 AM, Jon Hunter <jonath...@nvidia.com> wrote:
>>> Hi John,
>>>
>>> On 30/07/16 05:39, John Stultz wrote:
>>>> Hey Jon,
>>>>   So after rebasing my nexus7 patch stack onto pre-4.8-rc1 tree, I
>>>> noticed the power/volume buttons stopped working.
>>>>
>>>> I did a manual rebased bisection and chased it down to your commit
>>>> 1e2a7d78499e ("irqdomain: Don't set type when mapping an IRQ").
>>>>
>>>> Reverting that patch makes things work again, so I wanted to see if
>>>> there was any debugging info I could provide to try to help narrow
>>>> down the problem here. (Sorry, I'd tinker myself with it some and try
>>>> to debug the issue, but after burning my friday night on this, I'm
>>>> eager to get away from the keyboard for the weekend).
>>>
>>> Before this commit bad IRQ type settings in device-tree were not getting
>>> reported and so failures to set the IRQ type were going unnoticed. It's
>>> most likely a bad IRQ type settings somewhere.
>>>
>>> As Thomas mentioned hopefully dmesg will shed a bit more light.
>>>
>>> Otherwise it can be worth looking at the ->irq_set_type() function for
>>> the irqchips in the path of the interrupt requested to see if any are
>>> failing. Looking at the nexus7 (assuming qcom variant), it looks like
>>> there are 3 irqchips in the path (pm8921 --> apq8064-pinctrl --> gic).
>>> The pm8xxx_irq_set_type() could return a failure when setting up the IRQ
>>> type and could be worth checking. It does not look like the set_type for
>>> the apq8064-pinctrl should ever fail (apart from calling BUG() which
>>> would be obvious). The gic can also return a failure for setting the
>>> type, but I did not see anything at first glance that looks incorrect in
>>> the dts.
>>>
>>> If we can narrow down irqchip, then hopefully it will be clearer.
>>
>> The pm_8xxx_irq_set_type doesn't seem to be failing as far as I can see..
>>
>> Looking at the patch that seems to cause the trouble, I narrowed it
>> down to just the following chunk:
>>
>> @@ -614,7 +615,11 @@ unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec)
>>                  * it now and return the interrupt number.
>>                  */
>>                 if (irq_get_trigger_type(virq) == IRQ_TYPE_NONE) {
>> -                       irq_set_irq_type(virq, type);
>> +                       irq_data = irq_get_irq_data(virq);
>> +                       if (!irq_data)
>> +                               return 0;
>> +
>> +                       irqd_set_trigger_type(irq_data, type);
>>                         return virq;
>>                 }
>>
>> If I revert just that, it works again.
>>
>> I was worried we were hitting an early failure from !irq_data, but it
>> seems there's some subtle difference between irqd_set_trigger_type and
>> irq_set_type that makes the former break for me.
>
> Thanks this is good info and at the same time odd.
>
> I am guessing that it is failing above because the irq_data is not found
> for the irq?

So actually no. The !irq_data check never triggers: we do reach the
irqd_set_trigger_type() call, but something still doesn't work.

Interestingly, just adding irq_set_irq_type(virq, type); to the top of
that block (leaving the rest of the code as is) also makes things work.

> What is odd, is that the above sequence is only executed if a irq
> mapping exists and so really, AFAICT this should not happen. Ie. the irq
> descriptor should have been allocated for the mapping to exist. We
> should probably warn if this happens.
>
> Without reverting the above, can you add a print to show the
> domain->name, hwirq and virq information if !irq_data? That will confirm
> the domain for us.

So I put a printk in there (printing unconditionally, since I'm never
seeing the !irq_data case happen):

[    1.514217] JDB: virq: 93 hwirq: 74 domain name: msmgpio
[    1.838342] JDB: virq: 25 hwirq: 6 domain name: msmgpio

Which is odd when compared against:

shell@flo:/ $ cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
 16:       1159       1138       1332       1574     GIC-0  18 Edge      gp_timer
 25:          0          0          0          0   msmgpio   6 Edge      ekth3500
111:          6          0          0          0     GIC-0  51 Edge      qcom_rpm_ack
112:          0          0          0          0     GIC-0  53 Edge      qcom_rpm_err
113:          0          0          0          0     GIC-0  54 Edge      qcom_rpm_wakeup
114:         48          0          0          0     GIC-0 132 Edge      msm_otg, ci_hdrc_msm
115:        796          0          0          0     GIC-0 130 Level     bam_dma
116:          0          0          0          0     GIC-0 128 Level     bam_dma
117:          0          0          0          0     GIC-0 127 Level     bam_dma
118:       2627          0          0          0     GIC-0 136 Level     mmci-pl18x (cmd)
119:         54          0          0          0     GIC-0 226 Level     i2c_qup
120:         21          0          0          0     GIC-0 183 Level     i2c_qup
122:          0          0          0          0     GIC-0 189 Level     i2c_qup
123:        202          0          0          0     GIC-0 190 Level     msm_serial0
124:          0          0          0          0     GIC-0  70 Edge      smsm
125:          0          0          0          0     GIC-0 121 Edge      smsm
126:          0          0          0          0     GIC-0 236 Edge      smsm
127:          0          0          0          0     GIC-0 169 Edge      smsm
131:          0          0          0          0    pm8xxx 195 Edge      Volume Up
165:          0          0          0          0    pm8xxx 229 Edge      Volume Down
184:          0          0          0          0    pm8xxx  39 Edge      pm8xxx_rtc_alarm
185:          0          0          0          0    pm8xxx  50 Edge      pmic8xxx_pwrkey_release
186:          0          0          0          0    pm8xxx  51 Edge      pmic8xxx_pwrkey_press
IPI0:          0          1          1          1  CPU wakeup interrupts
IPI1:          0          0          0          0  Timer broadcast interrupts
IPI2:        944        539       1015        529  Rescheduling interrupts
IPI3:          1          4          6          4  Function call interrupts
IPI4:          0          0          0          0  CPU stop interrupts
IPI5:          0          0          0          0  IRQ work interrupts
IPI6:          0          0          0          0  completion interrupts
Err:          0

virq 25 maps to the ekth3500 (the touch panel, which is still working
fine), but 93/74 doesn't seem to map to anything in that list, and the
problematic irqs are the volume keys (pm8xxx 195/229) and the power key
(pm8xxx 50/51), neither of which matches the two prints above.

thanks
-john
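
For anyone following along, here is a rough userspace model of how the two calls in the quoted hunk appear to differ. It is not kernel code, and every model_* name below is made up for illustration. As far as the kernel sources read, irq_set_irq_type() goes through the irqchip's ->irq_set_type() callback, while irqd_set_trigger_type() only updates the trigger type recorded in irq_data. Whether that difference is what actually breaks the pm8xxx keys here is still an open question.

```c
#include <assert.h>

/* Trigger types; values mirror the kernel's IRQ_TYPE_* constants. */
enum {
    MODEL_TYPE_NONE        = 0,
    MODEL_TYPE_EDGE_RISING = 1,
    MODEL_TYPE_LEVEL_HIGH  = 4,
};

struct model_irq;

/* Hypothetical stand-in for an irqchip: only set_type is modelled. */
struct model_chip {
    int (*set_type)(struct model_irq *irq, unsigned int type);
};

struct model_irq {
    const struct model_chip *chip;
    unsigned int recorded_type; /* what irq_get_trigger_type() would report */
    unsigned int hw_type;       /* what the pretend hardware is programmed to */
    int set_type_calls;         /* how often the chip callback ran */
};

/* Pretend chip callback: "programs the hardware" and always succeeds. */
static int model_chip_set_type(struct model_irq *irq, unsigned int type)
{
    irq->hw_type = type;
    irq->set_type_calls++;
    return 0;
}

static const struct model_chip model_fake_chip = {
    .set_type = model_chip_set_type,
};

/* Modelled on irq_set_irq_type(): call the chip, record the type on success. */
static int model_irq_set_irq_type(struct model_irq *irq, unsigned int type)
{
    int ret = irq->chip->set_type(irq, type);

    if (ret == 0)
        irq->recorded_type = type;
    return ret;
}

/* Modelled on irqd_set_trigger_type(): bookkeeping only, chip untouched. */
static void model_irqd_set_trigger_type(struct model_irq *irq, unsigned int type)
{
    irq->recorded_type = type;
}

/* Runs both paths and returns 0 if they behave as described above. */
int model_selfcheck(void)
{
    struct model_irq old_path = { .chip = &model_fake_chip };
    struct model_irq new_path = { .chip = &model_fake_chip };

    model_irq_set_irq_type(&old_path, MODEL_TYPE_EDGE_RISING);
    model_irqd_set_trigger_type(&new_path, MODEL_TYPE_EDGE_RISING);

    /* Both report the same trigger type afterwards... */
    assert(old_path.recorded_type == new_path.recorded_type);
    /* ...but only the old path ever reached the chip. */
    assert(old_path.set_type_calls == 1 && new_path.set_type_calls == 0);
    assert(new_path.hw_type == MODEL_TYPE_NONE);
    return 0;
}
```

In this model the new path leaves the chip unprogrammed until something later calls the set_type callback, so any chip in the path that relied on being configured at mapping time would see a different sequence than before.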