Thanks for you information.
It is running in AP mode, and uses compat-wireless-3.5-3, but the "irq 16: 
nobody cared" issue still happens when I update to compat-wireless-3.6.6-1.
I found a mail list talk about this, 
https://patchwork.kernel.org/patch/1536701/, I see this patch not in 3.6.6-1, 
and just set the mask to ATH9K_INT_FATAL, 
then the ath_isr() will just reset the hardware, it is just same with what I 
think in my mind, and I will take a try for it.

Another question I said, do you think we should spin_lock_irqsave and 
spin_unlock_irqrestore in the ath9k_tasklet instead of spin_lock/spin_unlock?

Thanks,
- Felix

-----Original Message-----
From: adrian.ch...@gmail.com [mailto:adrian.ch...@gmail.com] On Behalf Of 
Adrian Chadd
Sent: Monday, November 12, 2012 5:44 PM
To: Felix Liao
Cc: ath9k-devel@lists.ath9k.org
Subject: Re: [ath9k-devel] ath9k_tasklet() can be interrupted by a hardware IRQ

Well, any sync error like that where the PCIe glue throws an error like that is 
a bad thing.

It could be from a variety of places. It could be a busted PCIe interconnect 
somehow. I've seen this happen before. The only real way to diagnose that is to 
grab a PCIe bus analyser.

It however could also be from the NIC going in and out of network sleep. Is 
this in STA or AP mode?


Adrian

On 12 November 2012 01:06, Felix Liao <felix.l...@watchguard.com> wrote:
> Hi ath9k team,
> I trace the ath9k_hw_getisr() today to find out why I get the zero ISR 
> mask, and I found that when I get the zero ISR mask, this function 
> return the boolean true, then I trace into the function 
> ar9003_hw_get_isr(),
>
> static bool ar9003_hw_get_isr(struct ath_hw *ah, enum ath9k_int 
> *masked) {
>         ....
>         async_cause = REG_READ(ah, AR_INTR_ASYNC_CAUSE);
>
>         if (async_cause & (AR_INTR_MAC_IRQ | AR_INTR_ASYNC_MASK_MCI)) {
>                 if ((REG_READ(ah, AR_RTC_STATUS) & AR_RTC_STATUS_M)
>                                 == AR_RTC_STATUS_ON)
>                         isr = REG_READ(ah, AR_ISR);
>         }
>
>
>         sync_cause = REG_READ(ah, AR_INTR_SYNC_CAUSE) & 
> AR_INTR_SYNC_DEFAULT;
>
>         *masked = 0;
>
>         if (!isr && !sync_cause && !async_cause)
>                 return false;
>
>         if (isr == 0) {
>                 //debug value here is isr=0, async_cause=0, sync_cause=0x20 
> --> AR_INTR_SYNC_HOST1_FATAL
>         }
>         ...
> }
>
> I have no idea why I always get the isr=0 and sync_cause=0x20 which means 
> AR_INTR_SYNC_HOST1_FATAL, but we can't deal with this value in the source 
> code, do you know any about this?
>
> Thanks,
> - Felix
>
> -----Original Message-----
> From: Felix Liao
> Sent: Monday, November 12, 2012 10:10 AM
> To: 'ath9k-devel@lists.ath9k.org'
> Subject: RE: ath9k_tasklet() can be interrupted by a hardware IRQ
>
> Hi ath9k team:
> Another interest question is since the ath9k interrupt was disabled, how can 
> we get a new IRQ to interrupt the tasklet?
> So I investigated and found that the cause of the issue "irq 16: nobody 
> cared" in my box is the ath9k's irq handler return too many IRQ_NONE since I 
> read the ISR register always return 0 as below.
>
>         /*
>          * Figure out the reason(s) for the interrupt.  Note
>          * that the hal returns a pseudo-ISR that may include
>          * bits we haven't explicitly enabled so we mask the
>          * value to insure we only process bits we requested.
>          */
>         ath9k_hw_getisr(ah, &status);   /* NB: clears ISR too */
>         status &= ah->imask;    /* discard unasked-for bits */
>
>         /*
>          * If there are no status bits set, then this interrupt was not
>          * for me (should have been caught above).
>          */
>         if (!status)
>                 return IRQ_NONE;
>
> if I go to reset chip instead of return IRQ_NONE in this position, the things 
> become a little healthier than before, I will never see the message "irq 16: 
> nobody cared".
>
> I check each IRQ using "if(AR_SREV_9300(ah))" and they are all passed, that 
> is, no other hardware share the irq with ath9k.
> I need to know why we read the ISR register and get value 0, and return 
> IRQ_NONE, the chip I used is AR9300.
> Can we using this change below in the ath_isr() function:
>         if (!status)
>                 goto reset_chip;
>
> Thanks,
> - Felix
> -----
> From: Felix Liao
> Sent: Friday, November 09, 2012 9:33 AM
> To: 'ath9k-devel@lists.ath9k.org'
> Subject: ath9k_tasklet() can be interrupted by a hardware IRQ
>
> Hi all,
> I met a kernel call trace below using the ath9k driver, we can see that the 
> ath9k_tasklet() was interrupted by a hardware IRQ, and at this time the ath9k 
> IRQ was disabled, so the kernel will complain that there is none irq handler 
> for this interrupt.
>
> [   60.977474] irq 16: nobody cared (try booting with the "irqpoll" option) [ 
>   60.984194] Call Trace:
> [   60.986660] [dfff1f50] [c000843c] show_stack+0x70/0x1c8 (unreliable) [   
> 60.993041] [dfff1f90] [c00777a0] __report_bad_irq+0x44/0xe8 [   60.998718] 
> [dfff1fb0] [c0077a44] note_interrupt+0x200/0x260 [   61.004396] [dfff1fe0] 
> [c0078704] handle_fasteoi_irq+0xc4/0x10c [   61.010340] [dfff1ff0] [c0010fdc] 
> call_handle_irq+0x18/0x28 [   61.015929] [dfff3e50] [c00051c8] 
> do_IRQ+0xcc/0x1b4 [   61.020825] [dfff3e80] [c0011fc4] 
> ret_from_except+0x0/0x18 [   61.026349] --- Exception: 501 at 
> ioread32+0x8/0x14 [   61.026355]     LR = ath_descdma_setup+0x294/0x664 
> [ath9k] [   61.036733] [dfff3f40] [c04f5bdc] softirq_to_name+0x0/0x28 
> (unreliable) [   61.043390] [dfff3f50] [e5ad66fc] 
> ath9k_hw_enable_interrupts+0x160/0x1b0 [ath9k_hw] [   61.051074] [dfff3f70] 
> [e5b41368] ath9k_tasklet+0xc0/0x208 [ath9k] [   61.057274] [dfff3f90] 
> [c0043298] tasklet_action+0xcc/0xf8 [   61.062777] [dfff3fb0] [c0043d48] 
> __do_softirq+0xdc/0x18c [   61.068194] [dfff3ff0] [c0010fb4] 
> call_do_softirq+0x14/0x24 [   61.073784] [d8613e60] [c0004fec] 
> do_softirq+0x8c/0x98 [   61.078939] [d8613e80] [c0043bd0] 
> local_bh_enable+0x9c/0xa0 [   61.084536] [d8613e90] [c036d910] 
> unix_create1+0x184/0x1ac [   61.090039] [d8613eb0] [c036d9a0] 
> unix_create+0x68/0xbc [   61.095286] [d8613ec0] [c02c6c34] 
> __sock_create+0x114/0x248 [   61.100877] [d8613ef0] [c02c7008] 
> sys_socket+0x54/0x84 [   61.106031] [d8613f10] [c02c70e8] 
> sys_socketcall+0xb0/0x258 [   61.111622] [d8613f40] [c0011970] 
> ret_from_syscall+0x0/0x3c [   61.117338] --- Exception: c01 at 0xeff41fc [   
> 61.117343]     LR = 0xeff4350 [   61.124578] handlers:
> [   61.126851] [<e5b3eb44>] (ath_isr+0x0/0x21c [ath9k]) [   61.131840] 
> Disabling IRQ #16 [   61.143234] ath: phy0: Failed to stop TX DMA, 
> queues=0x001!
>
> I made a patch below for this issue and use it to reduce the probability of 
> this issue, you can see that, I can't disable the IRQ from start to end of 
> this function, I need to restore the IRQ flags before we call ath_rx_tasklet 
> and ath_tx_tasklet, since these functions will disable and enable the local 
> softirq using spin_lock_bh and spin_unlock_bh,  since with this gap of time, 
> the tasklet still can be interrupted by the hardware IRQ.
>
> diff -pNaur 
> compat-wireless-3.6.6-1-orig/drivers/net/wireless/ath/ath9k/main.c 
> compat-wireless-3.6.6-1-fixed/drivers/net/wireless/ath/ath9k/main.c
> --- compat-wireless-3.6.6-1-orig/drivers/net/wireless/ath/ath9k/main.c    
> 2012-11-08 00:17:08.000000000 +0800
> +++ compat-wireless-3.6.6-1-fixed/drivers/net/wireless/ath/ath9k/main.
> +++ c
> +++ 2012-11-09 07:27:28.459837051 +0800
> @@ -368,6 +368,7 @@ void ath9k_tasklet(unsigned long data)
>               u32 status = sc->intrstatus;
>               u32 rxmask;
>
> +             local_irq_save(flags);
>               ath9k_ps_wakeup(sc);
>               spin_lock(&sc->sc_pcu_lock);
>
> @@ -383,7 +384,7 @@ void ath9k_tasklet(unsigned long data)
>                              goto out;
>               }
>
> -              spin_lock_irqsave(&sc->sc_pm_lock, flags);
> +             spin_lock(&sc->sc_pm_lock);
>               if ((status & ATH9K_INT_TSFOOR) && sc->ps_enabled) {
>                              /*
>                               * TSF sync does not look correct; remain awake 
> to sync with @@ -392,7 +393,7 @@ void ath9k_tasklet(unsigned long data)
>                              ath_dbg(common, PS, "TSFOOR - Sync with next 
> Beacon\n");
>                              sc->ps_flags |= PS_WAIT_FOR_BEACON | 
> PS_BEACON_SYNC;
>               }
> -              spin_unlock_irqrestore(&sc->sc_pm_lock, flags);
> +             spin_unlock(&sc->sc_pm_lock);
>
>                if (ah->caps.hw_caps & ATH9K_HW_CAP_EDMA)
>                              rxmask = (ATH9K_INT_RXHP | ATH9K_INT_RXLP | 
> ATH9K_INT_RXEOL | @@ -400,6 +401,7 @@ void ath9k_tasklet(unsigned long data)
>               else
>                              rxmask = (ATH9K_INT_RX | ATH9K_INT_RXEOL 
> | ATH9K_INT_RXORN);
>
> +             local_irq_restore(flags);
>               if (status & rxmask) {
>                              /* Check for high priority Rx first */
>                              if ((ah->caps.hw_caps & ATH9K_HW_CAP_EDMA) && @@ 
> -416,6 +418,7 @@ void ath9k_tasklet(unsigned long data)
>                                             ath_tx_tasklet(sc);
>               }
>
> +             local_irq_save(flags);
>               ath9k_btcoex_handle_interrupt(sc, status);
>
>  out:
> @@ -424,6 +427,7 @@ out:
>
>                spin_unlock(&sc->sc_pcu_lock);
>               ath9k_ps_restore(sc);
> +             local_irq_restore(flags);
> }
>
>  irqreturn_t ath_isr(int irq, void *dev)
>
> but what I need to say is we should disable the IRQ entire of this function, 
> so we need to use the spin_lock/spin_unlock instead of 
> spin_lock_bh/spin_unlock_bh in the functions called by it, but I know it is a 
> big job to do, so I ask for your suggestion.
> In a simple way,  this patch can fix this issue incompletely.
>
> Thanks,
> - Felix
>
> _______________________________________________
> ath9k-devel mailing list
> ath9k-devel@lists.ath9k.org
> https://lists.ath9k.org/mailman/listinfo/ath9k-devel
_______________________________________________
ath9k-devel mailing list
ath9k-devel@lists.ath9k.org
https://lists.ath9k.org/mailman/listinfo/ath9k-devel

Reply via email to