On Thu, Apr 23 2020, Stefan Sperling <s...@stsp.name> wrote:
> I have observed a uvm fault in ieee80211_mira_probe_timeout_up() while
> testing with iwm(4) and tcpbench:
>
> void
> ieee80211_mira_probe_timeout_up(void *arg)
> {
>       struct ieee80211_mira_node *mn = arg;
>       int s;
>
>       s = splnet();
>       mn->probe_timer_expired[IEEE80211_MIRA_PROBE_TO_UP] = 1;
>       DPRINTFN(3, ("probe up timeout fired\n"));
>       splx(s);
> }
>
> One obvious possibility is that the 'mn' pointer became invalid before the
> timeout was executed. But I am not certain what happened exactly; the info
> in ddb was inconclusive since the console switching ran into splassert
> failures and I didn't see a good backtrace. But r12 in 'show regs' contained
> the address of ieee80211_mira_probe_timeout_up() and it looked like the
> kernel was in softclock context.
>
> In any case, it looks like cancelling timeouts before scheduling the
> iwm_newstate_task can lead to a race:
>
>  - Timeouts are cancelled and iwm_newstate_task is scheduled
>  - Tx done interrupts feed frames to MiRA which adds a new timeout
>  - iwm_newstate_task runs and switches state without cancelling this timeout
>
> So cancel timeouts when we are actually switching state in the task.
>
> While here, initialize MiRA timeouts and other rate scaling state earlier,
> when the node is allocated.
>
> ok?

Works fine so far on:

  iwm0 at pci2 dev 0 function 0 "Intel Dual Band Wireless-AC 8265" rev 0x78, msi
  iwm0: hw rev 0x230, fw ver 34.0.1, address f8:59:71:xx:xx:xx
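
For readers following the ordering argument in the quoted mail, here is a
small userland model of the three steps listed there. It is only a sketch:
none of the names below come from iwm(4) or net80211, the two booleans are
stand-ins for a pending MiRA timeout and for the rate scaling state the
timeout touches, and it assumes that state really is gone after the switch
(the quoted mail only names that as one possibility).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; these names are NOT from the iwm(4) driver. */
struct model {
	bool timer_armed;	/* stands in for a pending MiRA probe timeout */
	bool node_valid;	/* false once the state switch has happened */
};

static void
cancel_timeouts(struct model *m)
{
	m->timer_armed = false;
}

/* A Tx done interrupt feeds a frame to rate scaling, which re-arms a timer. */
static void
tx_done_intr(struct model *m)
{
	m->timer_armed = true;
}

/* The deferred state switch; optionally cancels timers itself. */
static void
newstate_task(struct model *m, bool cancel_in_task)
{
	assert(m->node_valid);	/* the task runs once per model */
	if (cancel_in_task)
		cancel_timeouts(m);
	m->node_valid = false;	/* assumption: old state is torn down here */
}

/*
 * Replays the sequence from the quoted mail and reports whether a timer
 * is still armed against state that is no longer valid.
 */
bool
stale_timer_after_switch(bool cancel_in_task)
{
	struct model m = { .timer_armed = true, .node_valid = true };

	if (!cancel_in_task)
		cancel_timeouts(&m);	/* old ordering: cancel, then schedule */
	/* The task is scheduled at this point but has not run yet. */
	tx_done_intr(&m);		/* interrupt lands in the window */
	newstate_task(&m, cancel_in_task);

	return m.timer_armed && !m.node_valid;
}
```

In this model, stale_timer_after_switch(false) reports a leftover timer and
stale_timer_after_switch(true) does not, which matches the reasoning for
moving the cancellation into the task.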

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE
