Hi Kalle (and Johannes; I'll reply to Johannes' response separately too),

On Mon, Jun 05, 2017 at 06:54:18PM +0300, Kalle Valo wrote:
> Brian Norris <briannor...@chromium.org> writes:
> > That's not to say that there aren't such bugs out there. I'd still be
> > willing to bet there are. And IMO, it seems wise to just do the same
> > teardown/setup as one would do for (e.g.) 'rmmod', to prevent exposing
> > *too* many new permutations of "wiphy is available but rest of the
> > driver is torn down".
>
> This feels like a sledge hammer approach causing all sort of problems

Yes, it is a sledge hammer. But I'm working with what we have here. With
this approach, it's also easier to tell that things aren't out-of-sync,
since I'm never quite sure how much state was held in the firmware (and
now won't match what user space thinks). A full removal / re-init makes
this clear -- user space should expect *everything* to be reset.

I'm open to learning better approaches if possible, but this also might
be difficult if I don't get any support from Marvell on this. (They seem
quite happy to let sleeping dogs lie.)

> for user space and I really like the mac80211 approach more. For
> example, if an ath10k firmware crash happens user only sees a few second
> pause in data traffic and a warning in kernel log, otherwise everything
> happens behind the scenes. Of course there are very likely races
> somewhere but at least I haven't seen that many reports related to
> firmware restart functionality.

Yes, that all sounds nice. But for my sake, can you describe better
what's actually going on there (e.g., can you point me at which code does
this)? I'm really not familiar with mac80211 (though I was aware of the
above general behavior). But to my knowledge, mac80211 drivers keep a lot
more state managed in the kernel, so it's a little easier and more
natural to get the driver/FW back to "the same state" than it is with a
full-MAC driver.

> > But if none of this is convincing to you, I can take a stab at a
> > different solution.
>
> I don't have any problem applying this patch but more about being
> curious why doing it like this. And hopefully finding a less intrusive
> solution in the future.

OK, sure. I'll see what I can do, but I don't see an easy path at the
moment toward fixing (i.e., completely rewriting) this long-standing
driver behavior.

[trim]

Thanks,
Brian