Re: [PATCH 1/3] mwifiex: disable ps_mode explicitly by default instead

2020-11-26 Thread Tsuchiya Yuto
On Fri, 2020-11-20 at 13:04 -0800, Brian Norris wrote:
> On Fri, Oct 30, 2020 at 1:04 AM Tsuchiya Yuto  wrote:
> > On Thu, 2020-10-29 at 11:25 -0700, Brian Norris wrote:
> > > For the record, Chrome OS supports plenty of mwifiex systems with 8897
> > > (SDIO only) and 8997 (PCIe), with PS enabled, and you're hurting
> > > those. Your problem sounds to be exclusively a problem with the PCIe
> > > 8897 firmware.
> > 
> > Actually, I already know that some Chromebooks use these mwifiex cards
> > (but not out PCIe-88W8897) because I personally like chromiumos. I'm
> > always wondering what is the difference. If the difference is firmware,
> > our PCIe-88W8897 firmware should really be fixed instead of this stupid
> > series.
> 
> PCIe is a very different beast. (For one, it uses DMA and
> memory-mapped registers, where SDIO has neither.) It was a very
> difficult slog to get PCIe/8997 working reliably for the few
> Chromebooks that shipped it, and lots of that work is in firmware. I
> would not be surprised if the PCIe-related changes Marvell made for
> 8997 never fed back into their PCIe-8897 firmware. Or maybe they only
> ever launched PCIe-8897 for Windows, and the Windows driver included
> workarounds that were never published to their Linux driver. But now
> I'm just speculating.

Thanks. Yeah, this is indeed hard work. Actually, I (and maybe also other
users) am already thankful that there is wifi driver/firmware available
on Linux :) and it'll be greater if we can fix ps_mode-related issues.

> > Yes, I'm sorry that I know this series is just a stupid one but I have to
> > send this anyway because this stability issue has not been fixed for a
> > long time. I should have added this buglink to every commit as well:
> > 
> > BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=109681
> > 
> > If the firmware can't be fixed, I'm afraid I have to go this way. It makes
> > no sense to keep enabling power_save for the affected devices if we know
> > it's broken.
> 
> Condolences and sympathy, seriously. You likely have little chance of
> getting the firmware fixed, so without new information (e.g,. other
> workarounds?), this is the probably the right way to go.

Thank you for the pointer!

There are two issues regarding ps_mode:
1) fw crashes with "Firmware wakeup failed"
   (I haven't mentioned in this series, but ps_mode also causes fw crashes)
2) connection instability (like large ping delay or even ping not reaching)

If anyone is ever interested in dmesg log with debug_mask=0x and
device_dump, I posted them to the Bugzilla [1] before.

Regarding the #2, although this is even not a workaround but I found
scanning APs will fix this. So, when I encounter this issue, I keep
scanning APs like "watch -n10 sudo iw dev ${dev_name} scan". So, it
seems that scanning APs will somehow wake wifi up? In other words, wifi
is sleeping when it shouldn't? or wifi somehow failed to wake up when
it should?

Regarding #1, we don't have any ideas yet. There is a guess that memory
leak will occur in the fw every time wifi goes into sleep, but don't know.

We even don't have the exact reproducers for both #1 and #2. What we
know so far is that, enabling ps_mode causes these issues.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=109681#c130

> Brian




Re: [PATCH] mwifiex: pcie: add enable_device_dump module parameter

2020-11-26 Thread Tsuchiya Yuto
On Fri, 2020-11-20 at 12:53 -0800, Brian Norris wrote:
> (Sorry if anything's a bit slow here. I don't really have time to
> write out full proposals myself.)

Don’t worry, I (and other owners) am already glad to hear from upstream
devs, including you :)

(and I'm also sorry for the late reply from me. It was difficult to find
spare time these days.)

> On Fri, Oct 30, 2020 at 3:30 AM Tsuchiya Yuto  wrote:
> > Let me know if splitting this patch like this works. 1) The first patch
> > is to add this module parameter but don't change the default behavior.
> 
> That *could* be OK with me, although it's already been said that there
> are many people who dislike extra module parameters. I also don't see
> why this needs to be a module parameter at all. If you do #2 right,
> you don't really need this, as there are already several standard ways
> of doing this (e.g., via Kconfig, or via nl80211 on a per-device
> basis).
> 
> > 2) The second patch is to change the parameter value depending on the
> > DMI matching or something so that it doesn't break the existing users.
> 
> Point 2 sounds good, and this is the key point. Note that you can do
> point 2 without making it a module parameter. Just keep a flag in the
> driver-private structures.

I chose to also provide the module parameter because I thought it would
be a little bit complicated for users to re-enable this dump feature if
I chose not to provide this parameter.

If I don't provide the parameter but still want to allow users to
re-enable this dump feature, we can provide a way to change the bit flags
of quirks (via a new debugfs entry or another module parameter). That
said, is there a way to easily change the quirk flags only for this dump
feature?

For example, assume that the following three quirks are enabled by default:

/* quirks */
#define QUIRK_FW_RST_D3COLD BIT(0)
#define QUIRK_NO_DEVICE_DUMPBIT(1)
#define QUIRK_FOO   BIT(2)

.driver_data = (void *)(QUIRK_FW_RST_D3COLD |
QUIRK_NO_DEVICE_DUMP |
QUIRK_FOO),

card->quirks = (uintptr_t)dmi_id->driver_data;

/* and assume that card->quirks can be changed by users by a file named
 * "quirks" under debugfs.
 */

So, the card->quirks will be "7" in decimal by default. Then, if users
want to re-enable the dump feature, as far as I know, users need to know
the default value "7", then need to know that device_dump is controlled
by bit 1. Finally, users can set the new quirk flags "5" in decimal (101
in binary).

echo 5 > /sys/kernel/debug/mwifiex/mlan0/quirks

I'm glad if there is another nice way to control only the device_dump
quirk flag, if we only provide a way to change quirk flags.

But at the same time I also think that if someone dare to want to
re-enable this feature, maybe the person won't feel it's complicated haha.
So, I'll drop the device_dump module parameter and switch to use the quirk
framework, adding a way to change the quirk flags value by users.

That said, we may even drop disabling the dump. Take a look at my comment
I wrote below.

> > But what I want to say here as well is that, if the firmware can be fixed,
> > we don't need a patch like this.
> 
> Sure. That's also where we don't necessarily need more ways to control
> this from user space (e.g., module parameters), but just better
> detection of currently broken systems (in the driver). And if firmware
> ever gets fixed, we can undo the "broken device" detection.

There are two types of firmware crashes we observes:
1) cmd_timedout (other than ps_mode-related)
2) Firmware wakeup failed (ps_mode-related)

The #2 is observed when we enabled ps_mode. The #1 is observed for the
other causes. And hopefully, verdre (in Cc) found a "fix" [1] for the
#1 fw crash. We are trying the fix now.

The pull req (still WIP) basically addresses fw crashing by using
"non-posted" register writes and uninterruptible wait queue for PCI
operations in the kernel driver side.

We still don't have any ideas to "fix" the #2 fw crash, but now we don't
see the #1 crash anymore so far.

I'd like to see where the attempt goes before I start working on this
patch here again.

Thank you, everyone.

[1] https://github.com/linux-surface/kernel/pull/70
[WIP] Properly fix wifi firmware crashes by jonas2515 · Pull Request #70 · 
linux-surface/kernel

> Brian




Re: [PATCH] mwifiex: pcie: skip cancel_work_sync() on reset failure path

2020-11-11 Thread Tsuchiya Yuto
On Tue, 2020-11-10 at 18:51 +, Kalle Valo wrote:
> Tsuchiya Yuto  wrote:
> 
> > If a reset is performed, but even the reset fails for some reasons (e.g.,
> > on Surface devices, the fw reset requires another quirks),
> > cancel_work_sync() hangs in mwifiex_cleanup_pcie().
> > 
> > # firmware went into a bad state
> > [...]
> > [ 1608.281690] mwifiex_pcie :03:00.0: info: shutdown mwifiex...
> > [ 1608.282724] mwifiex_pcie :03:00.0: rx_pending=0, tx_pending=1,   
> > cmd_pending=0
> > [ 1608.292400] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
> > [ 1608.292405] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
> > # reset performed after firmware went into a bad state
> > [ 1609.394320] mwifiex_pcie :03:00.0: WLAN FW already running! Skip 
> > FW dnld
> > [ 1609.394335] mwifiex_pcie :03:00.0: WLAN FW is active
> > # but even the reset failed
> > [ 1619.499049] mwifiex_pcie :03:00.0: mwifiex_cmd_timeout_func: 
> > Timeout cmd id = 0xfa, act = 0xe000
> > [ 1619.499094] mwifiex_pcie :03:00.0: num_data_h2c_failure = 0
> > [ 1619.499103] mwifiex_pcie :03:00.0: num_cmd_h2c_failure = 0
> > [ 1619.499110] mwifiex_pcie :03:00.0: is_cmd_timedout = 1
> > [ 1619.499117] mwifiex_pcie :03:00.0: num_tx_timeout = 0
> > [ 1619.499124] mwifiex_pcie :03:00.0: last_cmd_index = 0
> > [ 1619.499133] mwifiex_pcie :03:00.0: last_cmd_id: fa 00 07 01 07 
> > 01 07 01 07 01
> > [ 1619.499140] mwifiex_pcie :03:00.0: last_cmd_act: 00 e0 00 00 00 
> > 00 00 00 00 00
> > [ 1619.499147] mwifiex_pcie :03:00.0: last_cmd_resp_index = 3
> > [ 1619.499155] mwifiex_pcie :03:00.0: last_cmd_resp_id: 07 81 07 81 
> > 07 81 07 81 07 81
> > [ 1619.499162] mwifiex_pcie :03:00.0: last_event_index = 2
> > [ 1619.499169] mwifiex_pcie :03:00.0: last_event: 58 00 58 00 58 00 
> > 58 00 58 00
> > [ 1619.499177] mwifiex_pcie :03:00.0: data_sent=0 cmd_sent=1
> > [ 1619.499185] mwifiex_pcie :03:00.0: ps_mode=0 ps_state=0
> > [ 1619.499215] mwifiex_pcie :03:00.0: info: _mwifiex_fw_dpc: 
> > unregister device
> > # mwifiex_pcie_work hang happening
> > [ 1823.233923] INFO: task kworker/3:1:44 blocked for more than 122 
> > seconds.
> > [ 1823.233932]   Tainted: GWC OE 5.10.0-rc1-1-mainline 
> > #1
> > [ 1823.233935] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
> > disables this message.
> > [ 1823.233940] task:kworker/3:1 state:D stack:0 pid:   44 ppid: 
> > 2 flags:0x4000
> > [ 1823.233960] Workqueue: events mwifiex_pcie_work [mwifiex_pcie]
> > [ 1823.233965] Call Trace:
> > [ 1823.233981]  __schedule+0x292/0x820
> > [ 1823.233990]  schedule+0x45/0xe0
> > [ 1823.233995]  schedule_timeout+0x11c/0x160
> > [ 1823.234003]  wait_for_completion+0x9e/0x100
> > [ 1823.234012]  __flush_work.isra.0+0x156/0x210
> > [ 1823.234018]  ? flush_workqueue_prep_pwqs+0x130/0x130
> > [ 1823.234026]  __cancel_work_timer+0x11e/0x1a0
> > [ 1823.234035]  mwifiex_cleanup_pcie+0x28/0xd0 [mwifiex_pcie]
> > [ 1823.234049]  mwifiex_free_adapter+0x24/0xe0 [mwifiex]
> > [ 1823.234060]  _mwifiex_fw_dpc+0x294/0x560 [mwifiex]
> > [ 1823.234074]  mwifiex_reinit_sw+0x15d/0x300 [mwifiex]
> > [ 1823.234080]  mwifiex_pcie_reset_done+0x50/0x80 [mwifiex_pcie]
> > [ 1823.234087]  pci_try_reset_function+0x5c/0x90
> > [ 1823.234094]  process_one_work+0x1d6/0x3a0
> > [ 1823.234100]  worker_thread+0x4d/0x3d0
> > [ 1823.234107]  ? rescuer_thread+0x410/0x410
> > [ 1823.234112]  kthread+0x142/0x160
> > [ 1823.234117]  ? __kthread_bind_mask+0x60/0x60
> > [ 1823.234124]  ret_from_fork+0x22/0x30
> > [...]
> > 
> > This is a deadlock caused by calling cancel_work_sync() in
> > mwifiex_cleanup_pcie():
> > 
> > - Device resets are done via mwifiex_pcie_card_reset()
> > - which schedules card->work to call mwifiex_pcie_card_reset_work()
> > - which calls pci_try_reset_function().
> > - This leads to mwifiex_pcie_reset_done() be called on the same workqueue,
> >   which in turn calls
> > - mwifiex_reinit_sw() and that calls
> > - _mwifiex_fw_dpc().
> > 
> > The problem is now that _mwifiex_fw_dpc() calls mwifiex_free_adapter()
> > in case firmware initialization fails. That ends up calling
> > mwifiex_cleanup_pcie().
> > 
> > Note that all those calls are still 

Re: [PATCH 1/2] mwifiex: fix mwifiex_shutdown_sw() causing sw reset failure

2020-11-08 Thread Tsuchiya Yuto
On Sun, 2020-11-08 at 13:31 +0200, Kalle Valo wrote:
> Tsuchiya Yuto  writes:
> 
> > When FLR is performed but without fw reset for some reasons (e.g. on
> > Surface devices, fw reset requires another quirk), it fails to reset
> > properly. You can trigger the issue on such devices via debugfs entry
> > for reset:
> > 
> > $ echo 1 | sudo tee /sys/kernel/debug/mwifiex/mlan0/reset
> > 
> > and the resulting dmesg log:
> > 
> > [   45.740508] mwifiex_pcie :03:00.0: Resetting per request
> > [   45.742937] mwifiex_pcie :03:00.0: info: successfully 
> > disconnected from [BSSID]: reason code 3
> > [   45.744666] mwifiex_pcie :03:00.0: info: shutdown mwifiex...
> > [   45.751530] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
> > [   45.751539] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
> > [   45.771691] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
> > [   45.771695] mwifiex_pcie :03:00.0: deleting the crypto keys
> > [   45.771697] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
> > [   45.771698] mwifiex_pcie :03:00.0: deleting the crypto keys
> > [   45.771699] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
> > [   45.771701] mwifiex_pcie :03:00.0: deleting the crypto keys
> > [   45.771702] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
> > [   45.771703] mwifiex_pcie :03:00.0: deleting the crypto keys
> > [   45.771704] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
> > [   45.771705] mwifiex_pcie :03:00.0: deleting the crypto keys
> > [   45.771707] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
> > [   45.771708] mwifiex_pcie :03:00.0: deleting the crypto keys
> > [   53.099343] mwifiex_pcie :03:00.0: info: trying to associate to 
> > '[SSID]' bssid [BSSID]
> > [   53.241870] mwifiex_pcie :03:00.0: info: associated to bssid 
> > [BSSID] successfully
> > [   75.377942] mwifiex_pcie :03:00.0: cmd_wait_q terminated: -110
> > [   85.385491] mwifiex_pcie :03:00.0: info: successfully 
> > disconnected from [BSSID]: reason code 15
> > [   87.539408] mwifiex_pcie :03:00.0: cmd_wait_q terminated: -110
> > [   87.539412] mwifiex_pcie :03:00.0: deleting the crypto keys
> > [   99.699917] mwifiex_pcie :03:00.0: cmd_wait_q terminated: -110
> > [   99.699925] mwifiex_pcie :03:00.0: deleting the crypto keys
> > [  111.859802] mwifiex_pcie :03:00.0: cmd_wait_q terminated: -110
> > [  111.859808] mwifiex_pcie :03:00.0: deleting the crypto keys
> > [...]
> > 
> > When comparing mwifiex_shutdown_sw() with mwifiex_pcie_remove(), it
> > lacks mwifiex_init_shutdown_fw().
> > 
> > This commit fixes mwifiex_shutdown_sw() by adding the missing
> > mwifiex_init_shutdown_fw().
> > 
> > Fixes: 4c5dae59d2e9 ("mwifiex: add PCIe function level reset support")
> > Signed-off-by: Tsuchiya Yuto 
> 
> Otherwise looks good to me, but what is FLR? I can add the description
> to the commit log if you tell me what it is.
> 

Thanks! It's a PCIe Function Level Reset (FLR). In addition to this,
it may be better to add a vendor name (Microsoft) of the devices where
we observed this issue?
(The other patch I sent also lacks the vendor name [1])

Based on the two improvements, I think the commit log should look like
this (also fixed some grammatical errors):

When a PCIe function level reset (FLR) is performed but without fw reset
for some reasons (e.g., on Microsoft Surface devices, fw reset requires
other quirks), it fails to reset wifi properly. You can trigger the issue
on such devices via debugfs entry for reset:
[...]

I'd appreciate it if you could add the above changes (and/or your changes
if required). I can also resend this series if this is more preferable.

[1] [PATCH] mwifiex: pcie: skip cancel_work_sync() on reset failure path

https://lore.kernel.org/linux-wireless/20201028142346.18355-1-kita...@gmail.com/




Re: [RFC PATCH] mwifiex: pcie: use shutdown_sw()/reinit_sw() on suspend/resume

2020-11-02 Thread Tsuchiya Yuto
On Sat, 2020-10-31 at 00:27 +0900, Tsuchiya Yuto wrote:
> On Wed, 2020-10-28 at 23:27 +0900, Tsuchiya Yuto wrote:
>> On Microsoft Surface devices (PCIe-88W8897), there are issues with S0ix
>> achievement and AP scanning after suspend with the current Host Sleep
>> method.
>>
>> When using the Host Sleep method, it prevents the platform to reach S0ix
>> during suspend. Also, sometimes AP scanning won't work, resulting in
>> non-working wifi after suspend.
>>
>> To fix such issues, perform shutdown_sw()/reinit_sw() instead of Host
>> Sleep on suspend/resume.
>>
>> Signed-off-by: Tsuchiya Yuto 
>> ---
>> As a side effect, this patch disables wakeups (means that Wake-On-WLAN
>> can't be used anymore, if it was working before), and might also reset
>> some internal states.
>>
>> Of course it's the best to rather fix Host Sleep itself. But if it's
>> difficult, I'm afraid we have to go this way.
>>
>> I reused the contents of suspend()/resume() functions as much as possible,
>> and removed only the parts that are incompatible or redundant with
>> shutdown_sw()/reinit_sw().
>>
>> - Removed wait_for_completion() as redundant
>>   mwifiex_shutdown_sw() does this.
>> - Removed flush_workqueue() as incompatible
>>   Causes kernel crashing.
>> - Removed mwifiex_enable_wake()/mwifiex_disable_wake()
>>   as incompatible and redundant because the driver will be shut down
>>   instead of entering Host Sleep.
>>
>> I'm worried about why flush_workqueue() causes kernel crash with this
>> suspend method. Is it OK to just drop it? At least We Microsoft Surface
>> devices users used this method for about one month and haven't observed
>> any issues.
>>
>> Note that suspend() no longer checks if it's already suspended.
>> With the previous Host Sleep method, the check was done by looking at
>> adapter->hs_activated in mwifiex_enable_hs() [sta_ioctl.c], but not
>> MWIFIEX_IS_SUSPENDED. So, what the previous method checked was instead
>> Host Sleep state, not suspend itself.
>>
>> Therefore, there is no need to check the suspend state now.
>> Also removed comment for suspend state check at top of suspend()
>> accordingly.
>
> This patch depends on the following mwifiex_shutdown_sw() fix I sent
> separately.
>
> [PATCH 1/2] mwifiex: fix mwifiex_shutdown_sw() causing sw reset failure
> https://lore.kernel.org/linux-wireless/20201028142110.18144-2-kita...@gmail.com/

The AP scanning issue with Host Sleep is now difficult to reproduce on
v5.10-rc2. It might be already gone but not yet so sure.

Here are the details about AP scanning issue with Host Sleep for the
record (as of Apr 2020):

When using Host Sleep on suspend, after resuming from suspend, it
(sometimes) can't connect to APs because it fails to scan APs. When
I set debug_mask to 0x, I noticed that scanning is being blocked
with this message:

kern  :info  : [99952.621609] mwifiex_pcie :03:00.0: info: received 
scan request on mlan0
kern  :info  : [99952.621613] mwifiex_pcie :03:00.0: cmd: Scan already 
in process..

What is worse, when this issue happened, the subsequent suspend
(sometimes) fails with the following message:

kern  :info  : [101844.423427] mwifiex_pcie :03:00.0: 
hs_activate_wait_q terminated
kern  :info  : [101844.423433] mwifiex_pcie :03:00.0: cmd: failed to 
suspend
kern  :err   : [101844.423446] PM: pci_pm_suspend(): 
mwifiex_pcie_suspend+0x0/0xd0 [mwifiex_pcie] returns -14
kern  :err   : [101844.423453] PM: dpm_run_callback(): 
pci_pm_suspend+0x0/0x160 returns -14
kern  :err   : [101844.423466] PM: Device :03:00.0 failed to suspend 
async: error -14
kern  :debug : [101844.423525] PM: suspend of devices aborted after 
10064.914 msecs
kern  :debug : [101844.423529] PM: start suspend of devices aborted after 
10065.318 msecs
kern  :err   : [101844.423531] PM: Some devices failed to suspend, or early 
wake event detected

The message is from the following code in mwifiex_cfg80211_scan()
[cfg80211.c].

/* Block scan request if scan operation or scan cleanup when interface
 * is disabled is in process
 */
if (priv->scan_request || priv->scan_aborting) {
mwifiex_dbg(priv->adapter, WARN,
"cmd: Scan already in process..\n");
return -EBUSY;
}

Further print debugging showed that scan_request was not true but
scan_aborting was true. And the scan_aborting was set by mwifiex_close()
[main.c].

Regarding the S0ix achievement, I don't have any idea how I can fix it
with the Host Sleep method. So, I sent this patch. Any suggestions for
fixing it with Host Sleep are welcome.

If I understand correctly, the mwifiex card is in fully wor

Re: [RFC PATCH 1/3] mwifiex: pcie: add DMI-based quirk impl for Surface devices

2020-11-02 Thread Tsuchiya Yuto
On Wed, 2020-10-28 at 17:27 +0200, Andy Shevchenko wrote:
> On Wed, Oct 28, 2020 at 11:27:51PM +0900, Tsuchiya Yuto wrote:
>> This commit adds quirk implementation based on DMI matching with DMI
>> table for Microsoft Surface devices that uses mwifiex chip (currently,
>> all the devices that use mwifiex equip PCIe-88W8897 chip).
>>
>> This implementation can be used for quirks later.
>
> I guess you might need to resend this (and possible other PCI pm related)
> patches to linux-pci@ and Bjorn in Cc.
>
> They may advise possible other (better) solutions.

Thanks for the advice! I also feel it's better. And if I resend this to
the linux-pci mailing list, I feel that I should try to use
drivers/pci/quirks.c instead if possible. But if I do so with this quirk
implementation, it might be possible that it'll be too big change.

So, I'll just resend this series (with changes so that it can be used
for powr_save quirk as well like I said in the reply to another series)
to the linux-pci mailing list and Bjorn (and linux-wireless mailing list
as well) and see what they think.

>> Signed-off-by: Tsuchiya Yuto 
>> ---
>>  drivers/net/wireless/marvell/mwifiex/Makefile |   1 +
>>  drivers/net/wireless/marvell/mwifiex/pcie.c   |   4 +
>>  drivers/net/wireless/marvell/mwifiex/pcie.h   |   2 +
>>  .../wireless/marvell/mwifiex/pcie_quirks.c| 114 ++
>>  .../wireless/marvell/mwifiex/pcie_quirks.h|  11 ++
>>  5 files changed, 132 insertions(+)
>>  create mode 100644 drivers/net/wireless/marvell/mwifiex/pcie_quirks.c
>>  create mode 100644 drivers/net/wireless/marvell/mwifiex/pcie_quirks.h
>>
>> diff --git a/drivers/net/wireless/marvell/mwifiex/Makefile 
>> b/drivers/net/wireless/marvell/mwifiex/Makefile
>> index fdfd9bf15ed46..8a1e7c5b9c6e2 100644
>> --- a/drivers/net/wireless/marvell/mwifiex/Makefile
>> +++ b/drivers/net/wireless/marvell/mwifiex/Makefile
>> @@ -49,6 +49,7 @@ mwifiex_sdio-y += sdio.o
>>  obj-$(CONFIG_MWIFIEX_SDIO) += mwifiex_sdio.o
>>  
>>  mwifiex_pcie-y += pcie.o
>> +mwifiex_pcie-y += pcie_quirks.o
>>  obj-$(CONFIG_MWIFIEX_PCIE) += mwifiex_pcie.o
>>  
>>  mwifiex_usb-y += usb.o
>> diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c 
>> b/drivers/net/wireless/marvell/mwifiex/pcie.c
>> index 6a10ff0377a24..362cf10debfa0 100644
>> --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
>> +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
>> @@ -27,6 +27,7 @@
>>  #include "wmm.h"
>>  #include "11n.h"
>>  #include "pcie.h"
>> +#include "pcie_quirks.h"
>>  
>>  #define PCIE_VERSION"1.0"
>>  #define DRV_NAME"Marvell mwifiex PCIe"
>> @@ -410,6 +411,9 @@ static int mwifiex_pcie_probe(struct pci_dev *pdev,
>>  return ret;
>>  }
>>  
>> +/* check quirks */
>> +mwifiex_initialize_quirks(card);
>> +
>>  if (mwifiex_add_card(card, >fw_done, _ops,
>>   MWIFIEX_PCIE, >dev)) {
>>  pr_err("%s failed\n", __func__);
>> diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.h 
>> b/drivers/net/wireless/marvell/mwifiex/pcie.h
>> index 843d57eda8201..09839a3bd1753 100644
>> --- a/drivers/net/wireless/marvell/mwifiex/pcie.h
>> +++ b/drivers/net/wireless/marvell/mwifiex/pcie.h
>> @@ -242,6 +242,8 @@ struct pcie_service_card {
>>  struct mwifiex_msix_context share_irq_ctx;
>>  struct work_struct work;
>>  unsigned long work_flags;
>> +
>> +unsigned long quirks;
>>  };
>>  
>>  static inline int
>> diff --git a/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c 
>> b/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c
>> new file mode 100644
>> index 0..929aee2b0a60a
>> --- /dev/null
>> +++ b/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c
>> @@ -0,0 +1,114 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * File for PCIe quirks.
>> + */
>> +
>> +/* The low-level PCI operations will be performed in this file. Therefore,
>> + * let's use dev_*() instead of mwifiex_dbg() here to avoid troubles (e.g.
>> + * to avoid using mwifiex_adapter struct before init or wifi is powered
>> + * down, or causes NULL ptr deref).
>> + */
>> +
>> +#include 
>> +
>> +#include "pcie_quirks.h"
>> +
>> +/* quirk table based on DMI matching */
>> +static const struct dmi_system_id mwifiex_quirk_table[] = {
>> +{
>> +.ident = "S

Re: [PATCH 2/3] mwifiex: add allow_ps_mode module parameter

2020-11-02 Thread Tsuchiya Yuto
On Fri, 2020-10-30 at 13:02 +0200, Andy Shevchenko wrote:
> On Fri, Oct 30, 2020 at 04:58:33PM +0900, Tsuchiya Yuto wrote:
> > On Wed, 2020-10-28 at 15:04 -0700, Brian Norris wrote:
> 
> ...
> 
> > On the other hand, I agree that I don't want to break the existing users.
> > As you mentioned in the reply to the first patch, I can set the default
> > value of this parameter depending on the chip id (88W8897) or DMI matching.
> 
> Since it's a PCIe device you already have ID table where you may add a
> driver_data with what ever quirks are needed.

Sorry that my comment was misleading. I meant using the quirk framework
(that is based on DMI matching) I sent in another series. This applies
to the other replies from me.

However, thanks to your comment, I remembered that currently, the quirk
framework can be used only within pcie.c file. For example, the quirk
initialization is currently done in pcie.c file. The mwifiex driver is
divided into interface-specific modules (PCIe, SDIO, USB) (e.g.,
mwifiex_pcie module for PCIe interface) + common module (mwifiex module).

So, I need to extend the quirk framework so that it can be used by the
mwifiex module globally.

I'll make a v2 version of this series with using the updated quirk
framework so that it won't change behaviors for existing users.




Re: [RFC PATCH] mwifiex: pcie: use shutdown_sw()/reinit_sw() on suspend/resume

2020-10-30 Thread Tsuchiya Yuto
On Wed, 2020-10-28 at 23:27 +0900, Tsuchiya Yuto wrote:
> On Microsoft Surface devices (PCIe-88W8897), there are issues with S0ix
> achievement and AP scanning after suspend with the current Host Sleep
> method.
>
> When using the Host Sleep method, it prevents the platform to reach S0ix
> during suspend. Also, sometimes AP scanning won't work, resulting in
> non-working wifi after suspend.
>
> To fix such issues, perform shutdown_sw()/reinit_sw() instead of Host
> Sleep on suspend/resume.
>
> Signed-off-by: Tsuchiya Yuto 
> ---
> As a side effect, this patch disables wakeups (means that Wake-On-WLAN
> can't be used anymore, if it was working before), and might also reset
> some internal states.
>
> Of course it's the best to rather fix Host Sleep itself. But if it's
> difficult, I'm afraid we have to go this way.
>
> I reused the contents of suspend()/resume() functions as much as possible,
> and removed only the parts that are incompatible or redundant with
> shutdown_sw()/reinit_sw().
>
> - Removed wait_for_completion() as redundant
>   mwifiex_shutdown_sw() does this.
> - Removed flush_workqueue() as incompatible
>   Causes kernel crashing.
> - Removed mwifiex_enable_wake()/mwifiex_disable_wake()
>   as incompatible and redundant because the driver will be shut down
>   instead of entering Host Sleep.
>
> I'm worried about why flush_workqueue() causes kernel crash with this
> suspend method. Is it OK to just drop it? At least We Microsoft Surface
> devices users used this method for about one month and haven't observed
> any issues.
>
> Note that suspend() no longer checks if it's already suspended.
> With the previous Host Sleep method, the check was done by looking at
> adapter->hs_activated in mwifiex_enable_hs() [sta_ioctl.c], but not
> MWIFIEX_IS_SUSPENDED. So, what the previous method checked was instead
> Host Sleep state, not suspend itself.
>
> Therefore, there is no need to check the suspend state now.
> Also removed comment for suspend state check at top of suspend()
> accordingly.

This patch depends on the following mwifiex_shutdown_sw() fix I sent
separately.

[PATCH 1/2] mwifiex: fix mwifiex_shutdown_sw() causing sw reset failure
https://lore.kernel.org/linux-wireless/20201028142110.18144-2-kita...@gmail.com/

>  drivers/net/wireless/marvell/mwifiex/pcie.c | 29 +++--
>  1 file changed, 10 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c 
> b/drivers/net/wireless/marvell/mwifiex/pcie.c
> index 6a10ff0377a24..3b5c614def2f5 100644
> --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
> +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
> @@ -293,8 +293,7 @@ static bool mwifiex_pcie_ok_to_access_hw(struct 
> mwifiex_adapter *adapter)
>   * registered functions must have drivers with suspend and resume
>   * methods. Failing that the kernel simply removes the whole card.
>   *
> - * If already not suspended, this function allocates and sends a host
> - * sleep activate request to the firmware and turns off the traffic.
> + * This function shuts down the adapter.
>   */
>  static int mwifiex_pcie_suspend(struct device *dev)
>  {
> @@ -302,31 +301,21 @@ static int mwifiex_pcie_suspend(struct device *dev)
>   struct pcie_service_card *card = dev_get_drvdata(dev);
>  
>  
> - /* Might still be loading firmware */
> - wait_for_completion(>fw_done);
> -
>   adapter = card->adapter;
>   if (!adapter) {
>   dev_err(dev, "adapter is not valid\n");
>   return 0;
>   }
>  
> - mwifiex_enable_wake(adapter);
> -
> - /* Enable the Host Sleep */
> - if (!mwifiex_enable_hs(adapter)) {
> + /* Shut down SW */
> + if (mwifiex_shutdown_sw(adapter)) {
>   mwifiex_dbg(adapter, ERROR,
>   "cmd: failed to suspend\n");
> - clear_bit(MWIFIEX_IS_HS_ENABLING, >work_flags);
> - mwifiex_disable_wake(adapter);
>   return -EFAULT;
>   }
>  
> - flush_workqueue(adapter->workqueue);
> -
>   /* Indicate device suspended */
>   set_bit(MWIFIEX_IS_SUSPENDED, >work_flags);
> - clear_bit(MWIFIEX_IS_HS_ENABLING, >work_flags);
>  
>   return 0;
>  }
> @@ -336,13 +325,13 @@ static int mwifiex_pcie_suspend(struct device *dev)
>   * registered functions must have drivers with suspend and resume
>   * methods. Failing that the kernel simply removes the whole card.
>   *
> - * If already not resumed, this function turns on the traffic and
> - * sends a host sleep cancel request to the firmware.
> + * If already not resumed, this function reinits the adap

Re: [PATCH] mwifiex: pcie: add enable_device_dump module parameter

2020-10-30 Thread Tsuchiya Yuto
On Wed, 2020-10-28 at 17:12 -0700, Brian Norris wrote:
> On Wed, Oct 28, 2020 at 3:58 PM Tsuchiya Yuto  wrote:
> > 
> > The devicve_dump may take a little bit long time and users may want to
> > disable the dump for daily usage.
> > 
> > This commit adds a new module parameter enable_device_dump and disables
> > the device_dump by default.
> 
> As with one of your other patches, please don't change the defaults
> and hide them under a module parameter. If you're adding a module
> parameter, leave the default behavior alone.

Thanks for the review!

I mentioned about power_save stability on the other patches. But I should
have added this fact into the commit message of this patch that even if
we disable that power_save, the firmware crashes a lot on some specific
devices. Really a lot.

For example, as far as I know Surface Pro 5 needs ASPM L1.2 substate
disabled to avoid the firmware crash. Disabling it is still acceptable.
On the other hand, Surface 3 needs L1 ASPM state disabled. This is not
acceptable because this breaks S0ix. Anyway, handling ASPM should be done
in firmware I think.

So, the context of why I sent this patch is the next. We can't fix the
fw crash itself, so, we decided to just let it crash and reset by itself
(with the other fw reset quirks I sent). In this way, the time it does
device_dump is really annoying if fw crashes so often.

Let me know if splitting this patch like this works. 1) The first patch
is to add this module parameter but don't change the default behavior.
2) The second patch is to change the parameter value depending on the
DMI matching or something so that it doesn't break the existing users.

But what I want to say here as well is that, if the firmware can be fixed,
we don't need a patch like this.

> This also seems like something that might be nicer as a user-space
> knob in generic form (similar to "/sys/class/devcoredump/disabled",
> except on a per-device basis, and fed back to the driver so it doesn't
> waste time generating such dumps), but I suppose I can see why a
> module parameter (so you can just stick your configuration in
> /etc/modprobe.d/) might be easier to deal with in some cases.

Agreed.

> Brian




Re: [PATCH 0/3] mwifiex: disable ps_mode by default for stability

2020-10-30 Thread Tsuchiya Yuto
On Wed, 2020-10-28 at 17:21 +0200, Andy Shevchenko wrote:
> On Wed, Oct 28, 2020 at 11:24:30PM +0900, Tsuchiya Yuto wrote:
> > Hello all,
> > 
> > On Microsoft Surface devices (PCIe-88W8897), we are observing stability
> > issues when ps_mode (IEEE power_save) is enabled, then eventually causes
> > firmware crash. Especially on 5GHz APs, the connection is completely
> > unstable and almost unusable.
> > 
> > I think the most desirable change is to fix the ps_mode itself. But is
> > seems to be hard work [1], I'm afraid we have to go this way.
> > 
> > Therefore, the first patch of this series disables the ps_mode by default
> > instead of enabling it on driver init. I'm not sure if explicitly
> > disabling it is really required or not. I don't have access to the details
> > of this chip. Let me know if it's enough to just remove the code that
> > enables ps_mode.
> > 
> > The Second patch adds a new module parameter named "allow_ps_mode". Since
> > other wifi drivers just disable power_save by default by module parameter
> > like this, I also added this.
> > 
> > The third patch adds a message when ps_mode will be changed. Useful when
> > diagnosing connection issues.
> 
> > [1] https://bugzilla.kernel.org/show_bug.cgi?id=109681
> 
> Can you attach this to the actual patch as BugLink: tag?
> 

Thanks! Indeed I should have added this... I wrote it in the replies.
If I send the v2 version of this series, I'll add it to them.




Re: [PATCH 1/3] mwifiex: disable ps_mode explicitly by default instead

2020-10-30 Thread Tsuchiya Yuto
On Thu, 2020-10-29 at 11:25 -0700, Brian Norris wrote:
> On Wed, Oct 28, 2020 at 7:04 PM Tsuchiya Yuto  wrote:
> > 
> > On Microsoft Surface devices (PCIe-88W8897), the ps_mode causes
> > connection unstable, especially with 5GHz APs. Then, it eventually causes
> > fw crash.
> > 
> > This commit disables ps_mode by default instead of enabling it.
> > 
> > Required code is extracted from mwifiex_drv_set_power().
> > 
> > Signed-off-by: Tsuchiya Yuto 
> 
> You should read up on WIPHY_FLAG_PS_ON_BY_DEFAULT and
> CONFIG_CFG80211_DEFAULT_PS, and set/respect those appropriately (hint:
> mwifiex sets WIPHY_FLAG_PS_ON_BY_DEFAULT, and your patch makes this a
> lie). Also, this seems like a quirk that you haven't properly worked
> out -- if you're working on a quirk framework in your other series,
> you should just key into that.

Thanks for the review! I didn't know about the flag, much appreciated.
By setting the flag to false explicitly, indeed userspace doesn't try
to enable power_save now at least for this short amount of time. I wonder
if I can drop the second patch (adding module parameter) now. But I still
want to make sure that power_save won't be enabled by userspace tools by
default.

Regarding quirks, I also don't want to break existing users. So, of course
I can try to use the quirk framework if we really can't fix the firmware.

> For the record, Chrome OS supports plenty of mwifiex systems with 8897
> (SDIO only) and 8997 (PCIe), with PS enabled, and you're hurting
> those. Your problem sounds to be exclusively a problem with the PCIe
> 8897 firmware.

Actually, I already know that some Chromebooks use these mwifiex cards
(but not out PCIe-88W8897) because I personally like chromiumos. I'm
always wondering what is the difference. If the difference is firmware,
our PCIe-88W8897 firmware should really be fixed instead of this stupid
series.

Yes, I'm sorry that I know this series is just a stupid one but I have to
send this anyway because this stability issue has not been fixed for a
long time. I should have added this buglink to every commit as well:

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=109681

If the firmware can't be fixed, I'm afraid I have to go this way. It makes
no sense to keep enabling power_save for the affected devices if we know
it's broken.

> As-is, NAK.
> 
> Brian




Re: [PATCH 2/3] mwifiex: add allow_ps_mode module parameter

2020-10-30 Thread Tsuchiya Yuto
On Thu, 2020-10-29 at 13:04 -0400, Willem de Bruijn wrote:
> On Wed, Oct 28, 2020 at 9:13 PM Brian Norris  wrote:
> > 
> > On Wed, Oct 28, 2020 at 2:56 PM Tsuchiya Yuto  wrote:
> > > 
> > > To make the ps_mode (power_save) control easier, this commit adds a new
> > > module parameter allow_ps_mode and set it false (disallowed) by default.
> 
> This sounds like some form of access control, not something that makes
> power control "easier"? What exactly is the use case.

Thanks for the review!

As I replied to Brian, userspace tools sometimes try to enable power_save
anyway regardless of default driver settings (expecting it's not broken,
but in fact it's broken), the module parameter like this is required.

So, the commit message is misleading. What will be "easier" is letting
userspace tools know power_save should be off, not the procedure of
toggling ps_mode state in the driver.

> Also, module params in networking devices are discouraged.

Even though it should be avoided, some upstream drivers provide a module
parameter like this to let users enable it if needed (since they disable
power_save by default because of stability on some devices) likw this
commit 0172b0292649 ("iwlagn: add power_save module parameter").




Re: [PATCH 2/3] mwifiex: add allow_ps_mode module parameter

2020-10-30 Thread Tsuchiya Yuto
On Wed, 2020-10-28 at 15:04 -0700, Brian Norris wrote:
> On Wed, Oct 28, 2020 at 2:56 PM Tsuchiya Yuto  wrote:
> > 
> > To make the ps_mode (power_save) control easier, this commit adds a new
> > module parameter allow_ps_mode and set it false (disallowed) by default.
> 
> This sounds like a bad idea, as it breaks all the existing users who
> expect this feature to be allowed. Seems like you should flip the
> defaults. Without some better justification, NACK.

Thanks for the review! I wanted to open a discussion widely and wanted
to ask from the upstream developers the direction of how this stability
issue should be resolved.

I added the link to the Bugzilla in the cover-letter (that should have
arrived on the mailing list now), but I should have added this to every
commit as well:

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=109681

This stability issue exists for a long time. I also submitted there the
required kernel log and device_dump more than three months ago. However,
unfortunately, it's not fixed yet. So, I have to send a series like this.

If we know that the power_save feature is broken (on some devices), I
think it should be fixed in either firmware or driver for the affected
devices. It makes no sense to keep enabling the broken features by
default.

Because userspace tools sometimes try to enable power_save anyway
regardless of default driver settings (expecting it's not broken, but
in fact it's broken), the module parameter like this is required in
addition to the first patch of this series. The commit 8298383c2cd5
("ath9k: Do not support PowerSave by default") also does the same thing
for this purpose.

On the other hand, I agree that I don't want to break the existing users.
As you mentioned in the reply to the first patch, I can set the default
value of this parameter depending on the chip id (88W8897) or DMI matching.

> Also, I can't find the other 2 patches in this alleged series. Maybe
> they're still making it through the mailing lists and archives.

Yes, there seems to be a problem with the mailing list at the time.
All the other patches I sent have arrived by now.

> Brian
> 
> > When this parameter is set to false, changing the power_save mode will
> > be disallowed like the following:
> > 
> > $ sudo iw dev mlan0 set power_save on
> >     command failed: Operation not permitted (-1)
> > 
> > Signed-off-by: Tsuchiya Yuto 




[PATCH 0/3] mwifiex: disable ps_mode by default for stability

2020-10-28 Thread Tsuchiya Yuto
Hello all,

On Microsoft Surface devices (PCIe-88W8897), we are observing stability
issues when ps_mode (IEEE power_save) is enabled, then eventually causes
firmware crash. Especially on 5GHz APs, the connection is completely
unstable and almost unusable.

I think the most desirable change is to fix the ps_mode itself. But is
seems to be hard work [1], I'm afraid we have to go this way.

Therefore, the first patch of this series disables the ps_mode by default
instead of enabling it on driver init. I'm not sure if explicitly
disabling it is really required or not. I don't have access to the details
of this chip. Let me know if it's enough to just remove the code that
enables ps_mode.

The Second patch adds a new module parameter named "allow_ps_mode". Since
other wifi drivers just disable power_save by default by module parameter
like this, I also added this.

The third patch adds a message when ps_mode will be changed. Useful when
diagnosing connection issues.

Thanks,
Tsuchiya Yuto

[1] https://bugzilla.kernel.org/show_bug.cgi?id=109681

Tsuchiya Yuto (3):
  mwifiex: disable ps_mode explicitly by default instead
  mwifiex: add allow_ps_mode module parameter
  mwifiex: print message when changing ps_mode

 .../net/wireless/marvell/mwifiex/cfg80211.c   | 23 +++
 .../net/wireless/marvell/mwifiex/sta_cmd.c| 11 ++---
 2 files changed, 31 insertions(+), 3 deletions(-)

-- 
2.29.1



[PATCH 1/3] mwifiex: disable ps_mode explicitly by default instead

2020-10-28 Thread Tsuchiya Yuto
On Microsoft Surface devices (PCIe-88W8897), the ps_mode causes
connection unstable, especially with 5GHz APs. Then, it eventually causes
fw crash.

This commit disables ps_mode by default instead of enabling it.

Required code is extracted from mwifiex_drv_set_power().

Signed-off-by: Tsuchiya Yuto 
---
 drivers/net/wireless/marvell/mwifiex/sta_cmd.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/marvell/mwifiex/sta_cmd.c 
b/drivers/net/wireless/marvell/mwifiex/sta_cmd.c
index d3a968ef21ef9..9b7b52fbc9c45 100644
--- a/drivers/net/wireless/marvell/mwifiex/sta_cmd.c
+++ b/drivers/net/wireless/marvell/mwifiex/sta_cmd.c
@@ -2333,14 +2333,19 @@ int mwifiex_sta_init_cmd(struct mwifiex_private *priv, 
u8 first_sta, bool init)
return -1;
 
if (priv->bss_type != MWIFIEX_BSS_TYPE_UAP) {
-   /* Enable IEEE PS by default */
-   priv->adapter->ps_mode = MWIFIEX_802_11_POWER_MODE_PSP;
+   /* Disable IEEE PS by default */
+   priv->adapter->ps_mode = MWIFIEX_802_11_POWER_MODE_CAM;
ret = mwifiex_send_cmd(priv,
   HostCmd_CMD_802_11_PS_MODE_ENH,
-  EN_AUTO_PS, BITMAP_STA_PS, NULL,
+  DIS_AUTO_PS, BITMAP_STA_PS, NULL,
   true);
if (ret)
return -1;
+   ret = mwifiex_send_cmd(priv,
+  HostCmd_CMD_802_11_PS_MODE_ENH,
+  GET_PS, 0, NULL, false);
+   if (ret)
+   return -1;
}
 
if (drcs) {
-- 
2.29.1



[PATCH 0/2] mwifiex: fixes for shutdown_sw() and reinit_sw()

2020-10-28 Thread Tsuchiya Yuto
Hello all,

This series fixes a software level reset feature. We, Microsoft Surface
devices users observed this issue, where firmware reset requires another
quirk. The other device users might not notice this issue because if
the reset is performed for the firmware level, this issue can't be
observed easily.

While here, update description of shutdown_sw() and reinit_sw() functions
to reflect current state.

Thanks,
Tsuchiya Yuto

Tsuchiya Yuto (2):
  mwifiex: fix mwifiex_shutdown_sw() causing sw reset failure
  mwifiex: update comment for shutdown_sw()/reinit_sw() to reflect
current state

 drivers/net/wireless/marvell/mwifiex/main.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

-- 
2.29.1



[PATCH 1/2] mwifiex: fix mwifiex_shutdown_sw() causing sw reset failure

2020-10-28 Thread Tsuchiya Yuto
When FLR is performed but without fw reset for some reasons (e.g. on
Surface devices, fw reset requires another quirk), it fails to reset
properly. You can trigger the issue on such devices via debugfs entry
for reset:

$ echo 1 | sudo tee /sys/kernel/debug/mwifiex/mlan0/reset

and the resulting dmesg log:

[   45.740508] mwifiex_pcie :03:00.0: Resetting per request
[   45.742937] mwifiex_pcie :03:00.0: info: successfully disconnected 
from [BSSID]: reason code 3
[   45.744666] mwifiex_pcie :03:00.0: info: shutdown mwifiex...
[   45.751530] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
[   45.751539] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
[   45.771691] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
[   45.771695] mwifiex_pcie :03:00.0: deleting the crypto keys
[   45.771697] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
[   45.771698] mwifiex_pcie :03:00.0: deleting the crypto keys
[   45.771699] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
[   45.771701] mwifiex_pcie :03:00.0: deleting the crypto keys
[   45.771702] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
[   45.771703] mwifiex_pcie :03:00.0: deleting the crypto keys
[   45.771704] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
[   45.771705] mwifiex_pcie :03:00.0: deleting the crypto keys
[   45.771707] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
[   45.771708] mwifiex_pcie :03:00.0: deleting the crypto keys
[   53.099343] mwifiex_pcie :03:00.0: info: trying to associate to 
'[SSID]' bssid [BSSID]
[   53.241870] mwifiex_pcie :03:00.0: info: associated to bssid [BSSID] 
successfully
[   75.377942] mwifiex_pcie :03:00.0: cmd_wait_q terminated: -110
[   85.385491] mwifiex_pcie :03:00.0: info: successfully disconnected 
from [BSSID]: reason code 15
[   87.539408] mwifiex_pcie :03:00.0: cmd_wait_q terminated: -110
[   87.539412] mwifiex_pcie :03:00.0: deleting the crypto keys
[   99.699917] mwifiex_pcie :03:00.0: cmd_wait_q terminated: -110
[   99.699925] mwifiex_pcie :03:00.0: deleting the crypto keys
[  111.859802] mwifiex_pcie :03:00.0: cmd_wait_q terminated: -110
[  111.859808] mwifiex_pcie :03:00.0: deleting the crypto keys
[...]

When comparing mwifiex_shutdown_sw() with mwifiex_pcie_remove(), it
lacks mwifiex_init_shutdown_fw().

This commit fixes mwifiex_shutdown_sw() by adding the missing
mwifiex_init_shutdown_fw().

Fixes: 4c5dae59d2e9 ("mwifiex: add PCIe function level reset support")
Signed-off-by: Tsuchiya Yuto 
---
 drivers/net/wireless/marvell/mwifiex/main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/wireless/marvell/mwifiex/main.c 
b/drivers/net/wireless/marvell/mwifiex/main.c
index 9ba8a8f64976b..6283df5aaaf8b 100644
--- a/drivers/net/wireless/marvell/mwifiex/main.c
+++ b/drivers/net/wireless/marvell/mwifiex/main.c
@@ -1471,6 +1471,8 @@ int mwifiex_shutdown_sw(struct mwifiex_adapter *adapter)
priv = mwifiex_get_priv(adapter, MWIFIEX_BSS_ROLE_ANY);
mwifiex_deauthenticate(priv, NULL);
 
+   mwifiex_init_shutdown_fw(priv, MWIFIEX_FUNC_SHUTDOWN);
+
mwifiex_uninit_sw(adapter);
adapter->is_up = false;
 
-- 
2.29.1



[RFC PATCH 3/3] mwifiex: pcie: add reset_wsid quirk for Surface 3

2020-10-28 Thread Tsuchiya Yuto
This commit adds reset_wsid quirk and uses this quirk for Surface 3 on
card reset.

To reset mwifiex on Surface 3, it seems that calling the _DSM method
exists in \_SB.WSID [1] device is required.

On Surface 3, calling the _DSM method removes/re-probes the card by
itself. So, need to place the reset function before performing FLR and
skip performing any other reset-related works.

Note that Surface Pro 3 also has the WSID device [2], but it seems to need
more work. This commit only supports Surface 3 yet.

[1] 
https://github.com/linux-surface/acpidumps/blob/05cba925f3a515f222acb5b3551a032ddde958fe/surface_3/dsdt.dsl#L11947-L12011
[2] 
https://github.com/linux-surface/acpidumps/blob/05cba925f3a515f222acb5b3551a032ddde958fe/surface_pro_3/dsdt.dsl#L12164-L12216

Signed-off-by: Tsuchiya Yuto 
---
Current issues:
1) After reset with this quirk, ASPM settings don't get restored.

Below is the "sudo lspci -nnvvv" diff before/after fw reset on Surface 3:

#
# root port of wifi:
# 00:1c.0 PCI bridge [0604]: Intel Corporation Atom/Celeron/Pentium 
Processor x5-E8000/J3xxx/N3xxx Series PCI Express Port #1 [8086:22c8] (rev 20) 
(prog-if 00 [Normal decode])
#
@@ -103,7 +103,7 @@
DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq- AuxPwr+ 
TransPend-
LnkCap: Port #1, Speed 5GT/s, Width x1, ASPM L0s L1, Exit 
Latency L0s <512ns, L1 <16us
ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp-
-   LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes, Disabled- 
CommClk+
+   LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s (downgraded), Width x1 (ok)
TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-

#
# no changes on wifi regarding ASPM
#

As you see, the ASPM of wifi root port is disabled after fw reset (thus
breaks S0ix).

2) Creating a new device driver for the WSID device is better?

then export the reset function, and call it from mwifiex. As the comments
in code states, on S3, calling the _DSM methods immediately
remove/re-probe mwifiex. So, resetting the wifi externally is a better
idea? (Also in this way, hopefully supporting SP3 may become easier.)

 drivers/net/wireless/marvell/mwifiex/pcie.c   | 10 +++
 .../wireless/marvell/mwifiex/pcie_quirks.c| 77 ++-
 .../wireless/marvell/mwifiex/pcie_quirks.h|  5 ++
 3 files changed, 91 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c 
b/drivers/net/wireless/marvell/mwifiex/pcie.c
index c0c4b5a9149ab..b3aacd19cc48f 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
@@ -2966,6 +2966,16 @@ static void mwifiex_pcie_card_reset_work(struct 
mwifiex_adapter *adapter)
 {
struct pcie_service_card *card = adapter->card;
 
+   /* On Surface 3, reset_wsid method removes then re-probes card by
+* itself. So, need to place it here and skip performing any other
+* reset-related works.
+*/
+   if (card->quirks & QUIRK_FW_RST_WSID_S3) {
+   mwifiex_pcie_reset_wsid_quirk(card->dev);
+   /* skip performing any other reset-related works */
+   return;
+   }
+
/* We can't afford to wait here; remove() might be waiting on us. If we
 * can't grab the device lock, maybe we'll get another chance later.
 */
diff --git a/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c 
b/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c
index edc739c542fea..f0a6fa0a7ae5f 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c
@@ -9,10 +9,21 @@
  * down, or causes NULL ptr deref).
  */
 
+#include 
 #include 
 
 #include "pcie_quirks.h"
 
+/* For reset_wsid quirk */
+#define ACPI_WSID_PATH "\\_SB.WSID"
+#define WSID_REV   0x0
+#define WSID_FUNC_WIFI_PWR_OFF 0x1
+#define WSID_FUNC_WIFI_PWR_ON  0x2
+/* WSID _DSM UUID: "534ea3bf-fcc2-4e7a-908f-a13978f0c7ef" */
+static const guid_t wsid_dsm_guid =
+   GUID_INIT(0x534ea3bf, 0xfcc2, 0x4e7a,
+ 0x90, 0x8f, 0xa1, 0x39, 0x78, 0xf0, 0xc7, 0xef);
+
 /* quirk table based on DMI matching */
 static const struct dmi_system_id mwifiex_quirk_table[] = {
{
@@ -87,7 +98,7 @@ static const struct dmi_system_id mwifiex_quirk_table[] = {
DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Microsoft 
Corporation"),
DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Surface 3"),
},
-   .driver_data = 0,
+   .driver_data = (void *)QUIRK_FW_RST_WSID_S3,
},
{
.ident = "Surface Pro 3",
@@ -113,6 +124,9 @@ void mwifiex_ini

[RFC PATCH] mwifiex: pcie: use shutdown_sw()/reinit_sw() on suspend/resume

2020-10-28 Thread Tsuchiya Yuto
On Microsoft Surface devices (PCIe-88W8897), there are issues with S0ix
achievement and AP scanning after suspend with the current Host Sleep
method.

When using the Host Sleep method, it prevents the platform to reach S0ix
during suspend. Also, sometimes AP scanning won't work, resulting in
non-working wifi after suspend.

To fix such issues, perform shutdown_sw()/reinit_sw() instead of Host
Sleep on suspend/resume.

Signed-off-by: Tsuchiya Yuto 
---
As a side effect, this patch disables wakeups (means that Wake-On-WLAN
can't be used anymore, if it was working before), and might also reset
some internal states.

Of course it's the best to rather fix Host Sleep itself. But if it's
difficult, I'm afraid we have to go this way.

I reused the contents of suspend()/resume() functions as much as possible,
and removed only the parts that are incompatible or redundant with
shutdown_sw()/reinit_sw().

- Removed wait_for_completion() as redundant
  mwifiex_shutdown_sw() does this.
- Removed flush_workqueue() as incompatible
  Causes kernel crashing.
- Removed mwifiex_enable_wake()/mwifiex_disable_wake()
  as incompatible and redundant because the driver will be shut down
  instead of entering Host Sleep.

I'm worried about why flush_workqueue() causes kernel crash with this
suspend method. Is it OK to just drop it? At least We Microsoft Surface
devices users used this method for about one month and haven't observed
any issues.

Note that suspend() no longer checks if it's already suspended.
With the previous Host Sleep method, the check was done by looking at
adapter->hs_activated in mwifiex_enable_hs() [sta_ioctl.c], but not
MWIFIEX_IS_SUSPENDED. So, what the previous method checked was instead
Host Sleep state, not suspend itself.

Therefore, there is no need to check the suspend state now.
Also removed comment for suspend state check at top of suspend()
accordingly.

 drivers/net/wireless/marvell/mwifiex/pcie.c | 29 +++--
 1 file changed, 10 insertions(+), 19 deletions(-)

diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c 
b/drivers/net/wireless/marvell/mwifiex/pcie.c
index 6a10ff0377a24..3b5c614def2f5 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
@@ -293,8 +293,7 @@ static bool mwifiex_pcie_ok_to_access_hw(struct 
mwifiex_adapter *adapter)
  * registered functions must have drivers with suspend and resume
  * methods. Failing that the kernel simply removes the whole card.
  *
- * If already not suspended, this function allocates and sends a host
- * sleep activate request to the firmware and turns off the traffic.
+ * This function shuts down the adapter.
  */
 static int mwifiex_pcie_suspend(struct device *dev)
 {
@@ -302,31 +301,21 @@ static int mwifiex_pcie_suspend(struct device *dev)
struct pcie_service_card *card = dev_get_drvdata(dev);
 
 
-   /* Might still be loading firmware */
-   wait_for_completion(>fw_done);
-
adapter = card->adapter;
if (!adapter) {
dev_err(dev, "adapter is not valid\n");
return 0;
}
 
-   mwifiex_enable_wake(adapter);
-
-   /* Enable the Host Sleep */
-   if (!mwifiex_enable_hs(adapter)) {
+   /* Shut down SW */
+   if (mwifiex_shutdown_sw(adapter)) {
mwifiex_dbg(adapter, ERROR,
"cmd: failed to suspend\n");
-   clear_bit(MWIFIEX_IS_HS_ENABLING, >work_flags);
-   mwifiex_disable_wake(adapter);
return -EFAULT;
}
 
-   flush_workqueue(adapter->workqueue);
-
/* Indicate device suspended */
set_bit(MWIFIEX_IS_SUSPENDED, >work_flags);
-   clear_bit(MWIFIEX_IS_HS_ENABLING, >work_flags);
 
return 0;
 }
@@ -336,13 +325,13 @@ static int mwifiex_pcie_suspend(struct device *dev)
  * registered functions must have drivers with suspend and resume
  * methods. Failing that the kernel simply removes the whole card.
  *
- * If already not resumed, this function turns on the traffic and
- * sends a host sleep cancel request to the firmware.
+ * If already not resumed, this function reinits the adapter.
  */
 static int mwifiex_pcie_resume(struct device *dev)
 {
struct mwifiex_adapter *adapter;
struct pcie_service_card *card = dev_get_drvdata(dev);
+   int ret;
 
 
if (!card->adapter) {
@@ -360,9 +349,11 @@ static int mwifiex_pcie_resume(struct device *dev)
 
clear_bit(MWIFIEX_IS_SUSPENDED, >work_flags);
 
-   mwifiex_cancel_hs(mwifiex_get_priv(adapter, MWIFIEX_BSS_ROLE_STA),
- MWIFIEX_ASYNC_CMD);
-   mwifiex_disable_wake(adapter);
+   ret = mwifiex_reinit_sw(adapter);
+   if (ret)
+   dev_err(dev, "reinit failed: %d\n", ret);
+   else
+   mwifiex_dbg(adapter, INFO, "%s, successful\n", __func__);
 
return 0;
 }
-- 
2.29.1



[PATCH] mwifiex: pcie: add enable_device_dump module parameter

2020-10-28 Thread Tsuchiya Yuto
The devicve_dump may take a little bit long time and users may want to
disable the dump for daily usage.

This commit adds a new module parameter enable_device_dump and disables
the device_dump by default.

Signed-off-by: Tsuchiya Yuto 
---
 drivers/net/wireless/marvell/mwifiex/pcie.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c 
b/drivers/net/wireless/marvell/mwifiex/pcie.c
index 6a10ff0377a24..8254e06fb22ce 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
@@ -33,6 +33,11 @@
 
 static struct mwifiex_if_ops pcie_ops;
 
+static bool enable_device_dump;
+module_param(enable_device_dump, bool, 0644);
+MODULE_PARM_DESC(enable_device_dump,
+"enable device_dump (default: disabled)");
+
 static const struct mwifiex_pcie_card_reg mwifiex_reg_8766 = {
.cmd_addr_lo = PCIE_SCRATCH_0_REG,
.cmd_addr_hi = PCIE_SCRATCH_1_REG,
@@ -2938,6 +2943,12 @@ static void mwifiex_pcie_fw_dump(struct mwifiex_adapter 
*adapter)
 
 static void mwifiex_pcie_device_dump_work(struct mwifiex_adapter *adapter)
 {
+   if (!enable_device_dump) {
+   mwifiex_dbg(adapter, MSG,
+   "device_dump is disabled by module parameter\n");
+   return;
+   }
+
adapter->devdump_data = vzalloc(MWIFIEX_FW_DUMP_SIZE);
if (!adapter->devdump_data) {
mwifiex_dbg(adapter, ERROR,
-- 
2.29.1



[PATCH 3/3] mwifiex: print message when changing ps_mode

2020-10-28 Thread Tsuchiya Yuto
Users may want to know the ps_mode state change (e.g., diagnosing
connection issues). This commit adds the print when changing ps_mode.

Signed-off-by: Tsuchiya Yuto 
---
 drivers/net/wireless/marvell/mwifiex/cfg80211.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/wireless/marvell/mwifiex/cfg80211.c 
b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
index 943bc1e8ceaee..a2eb8df8d3854 100644
--- a/drivers/net/wireless/marvell/mwifiex/cfg80211.c
+++ b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
@@ -451,6 +451,13 @@ mwifiex_cfg80211_set_power_mgmt(struct wiphy *wiphy,
return -EPERM;
}
 
+   if (ps_mode)
+   mwifiex_dbg(priv->adapter, MSG,
+   "Enabling ps_mode, disable if unstable.\n");
+   else
+   mwifiex_dbg(priv->adapter, MSG,
+   "Disabling ps_mode.\n");
+
return mwifiex_drv_set_power(priv, _mode);
 }
 
-- 
2.29.1



[PATCH] mwifiex: pcie: skip cancel_work_sync() on reset failure path

2020-10-28 Thread Tsuchiya Yuto
If a reset is performed, but even the reset fails for some reasons (e.g.,
on Surface devices, the fw reset requires another quirks),
cancel_work_sync() hangs in mwifiex_cleanup_pcie().

# firmware went into a bad state
[...]
[ 1608.281690] mwifiex_pcie :03:00.0: info: shutdown mwifiex...
[ 1608.282724] mwifiex_pcie :03:00.0: rx_pending=0, tx_pending=1,   
cmd_pending=0
[ 1608.292400] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
[ 1608.292405] mwifiex_pcie :03:00.0: PREP_CMD: card is removed
# reset performed after firmware went into a bad state
[ 1609.394320] mwifiex_pcie :03:00.0: WLAN FW already running! Skip FW 
dnld
[ 1609.394335] mwifiex_pcie :03:00.0: WLAN FW is active
# but even the reset failed
[ 1619.499049] mwifiex_pcie :03:00.0: mwifiex_cmd_timeout_func: Timeout 
cmd id = 0xfa, act = 0xe000
[ 1619.499094] mwifiex_pcie :03:00.0: num_data_h2c_failure = 0
[ 1619.499103] mwifiex_pcie :03:00.0: num_cmd_h2c_failure = 0
[ 1619.499110] mwifiex_pcie :03:00.0: is_cmd_timedout = 1
[ 1619.499117] mwifiex_pcie :03:00.0: num_tx_timeout = 0
[ 1619.499124] mwifiex_pcie :03:00.0: last_cmd_index = 0
[ 1619.499133] mwifiex_pcie :03:00.0: last_cmd_id: fa 00 07 01 07 01 07 
01 07 01
[ 1619.499140] mwifiex_pcie :03:00.0: last_cmd_act: 00 e0 00 00 00 00 
00 00 00 00
[ 1619.499147] mwifiex_pcie :03:00.0: last_cmd_resp_index = 3
[ 1619.499155] mwifiex_pcie :03:00.0: last_cmd_resp_id: 07 81 07 81 07 
81 07 81 07 81
[ 1619.499162] mwifiex_pcie :03:00.0: last_event_index = 2
[ 1619.499169] mwifiex_pcie :03:00.0: last_event: 58 00 58 00 58 00 58 
00 58 00
[ 1619.499177] mwifiex_pcie :03:00.0: data_sent=0 cmd_sent=1
[ 1619.499185] mwifiex_pcie :03:00.0: ps_mode=0 ps_state=0
[ 1619.499215] mwifiex_pcie :03:00.0: info: _mwifiex_fw_dpc: unregister 
device
# mwifiex_pcie_work hang happening
[ 1823.233923] INFO: task kworker/3:1:44 blocked for more than 122 seconds.
[ 1823.233932]   Tainted: GWC OE 5.10.0-rc1-1-mainline #1
[ 1823.233935] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
[ 1823.233940] task:kworker/3:1 state:D stack:0 pid:   44 ppid: 
2 flags:0x4000
[ 1823.233960] Workqueue: events mwifiex_pcie_work [mwifiex_pcie]
[ 1823.233965] Call Trace:
[ 1823.233981]  __schedule+0x292/0x820
[ 1823.233990]  schedule+0x45/0xe0
[ 1823.233995]  schedule_timeout+0x11c/0x160
[ 1823.234003]  wait_for_completion+0x9e/0x100
[ 1823.234012]  __flush_work.isra.0+0x156/0x210
[ 1823.234018]  ? flush_workqueue_prep_pwqs+0x130/0x130
[ 1823.234026]  __cancel_work_timer+0x11e/0x1a0
[ 1823.234035]  mwifiex_cleanup_pcie+0x28/0xd0 [mwifiex_pcie]
[ 1823.234049]  mwifiex_free_adapter+0x24/0xe0 [mwifiex]
[ 1823.234060]  _mwifiex_fw_dpc+0x294/0x560 [mwifiex]
[ 1823.234074]  mwifiex_reinit_sw+0x15d/0x300 [mwifiex]
[ 1823.234080]  mwifiex_pcie_reset_done+0x50/0x80 [mwifiex_pcie]
[ 1823.234087]  pci_try_reset_function+0x5c/0x90
[ 1823.234094]  process_one_work+0x1d6/0x3a0
[ 1823.234100]  worker_thread+0x4d/0x3d0
[ 1823.234107]  ? rescuer_thread+0x410/0x410
[ 1823.234112]  kthread+0x142/0x160
[ 1823.234117]  ? __kthread_bind_mask+0x60/0x60
[ 1823.234124]  ret_from_fork+0x22/0x30
[...]

This is a deadlock caused by calling cancel_work_sync() in
mwifiex_cleanup_pcie():

- Device resets are done via mwifiex_pcie_card_reset()
- which schedules card->work to call mwifiex_pcie_card_reset_work()
- which calls pci_try_reset_function().
- This leads to mwifiex_pcie_reset_done() be called on the same workqueue,
  which in turn calls
- mwifiex_reinit_sw() and that calls
- _mwifiex_fw_dpc().

The problem is now that _mwifiex_fw_dpc() calls mwifiex_free_adapter()
in case firmware initialization fails. That ends up calling
mwifiex_cleanup_pcie().

Note that all those calls are still running on the workqueue. So when
mwifiex_cleanup_pcie() now calls cancel_work_sync(), it's really waiting
on itself to complete, causing a deadlock.

This commit fixes the deadlock by skipping cancel_work_sync() on a reset
failure path.

After this commit, when reset fails, the following output is
expected to be shown:

kernel: mwifiex_pcie :03:00.0: info: _mwifiex_fw_dpc: unregister device
kernel: mwifiex: Failed to bring up adapter: -5
kernel: mwifiex_pcie :03:00.0: reinit failed: -5

To reproduce this issue, for example, try putting the root port of wifi
into D3 (replace "00:1d.3" with your setup).

# put into D3 (root port)
sudo setpci -v -s 00:1d.3 CAP_PM+4.b=0b

Cc: Maximilian Luz 
Signed-off-by: Tsuchiya Yuto 
---
I'm not sure which commit is suitable for the fixes tag. So, I didn't
add it.

 drivers/net/wireless/marvell/mwifiex/pcie.c | 18 +-
 drivers/net/wirel

[PATCH 2/2] mwifiex: update comment for shutdown_sw()/reinit_sw() to reflect current state

2020-10-28 Thread Tsuchiya Yuto
The functions mwifiex_shutdown_sw() and mwifiex_reinit_sw() can be used
for more general purposes than the PCIe function level reset. Also, these
are even not PCIe-specific.

So, let's update the comments at the top of each function accordingly.

Signed-off-by: Tsuchiya Yuto 
---
 drivers/net/wireless/marvell/mwifiex/main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/marvell/mwifiex/main.c 
b/drivers/net/wireless/marvell/mwifiex/main.c
index 6283df5aaaf8b..ee52fb839ef77 100644
--- a/drivers/net/wireless/marvell/mwifiex/main.c
+++ b/drivers/net/wireless/marvell/mwifiex/main.c
@@ -1455,7 +1455,7 @@ static void mwifiex_uninit_sw(struct mwifiex_adapter 
*adapter)
 }
 
 /*
- * This function gets called during PCIe function level reset.
+ * This function can be used for shutting down the adapter SW.
  */
 int mwifiex_shutdown_sw(struct mwifiex_adapter *adapter)
 {
@@ -1483,7 +1483,7 @@ int mwifiex_shutdown_sw(struct mwifiex_adapter *adapter)
 }
 EXPORT_SYMBOL_GPL(mwifiex_shutdown_sw);
 
-/* This function gets called during PCIe function level reset. Required
+/* This function can be used for reinitting the adapter SW. Required
  * code is extracted from mwifiex_add_card()
  */
 int
-- 
2.29.1



[RFC PATCH 0/3] mwifiex: add fw reset quirks for Microsoft Surface

2020-10-28 Thread Tsuchiya Yuto
Hello all,

This series adds firmware reset quirks for Microsoft Surface devices
(PCIe-88W8897). Surface devices somehow requires quirks to reset the
firmware. Otherwise, current mwifiex driver can reset only software level.
This is not enough to recover from a bad state.

To do so, in the first patch, I added a DMI-based quirk implementation
for Surface devices that use mwifiex chip.

The required quirk is different by generation. Surface gen3 devices
(Surface 3 and Surface Pro 3) require a quirk that calls _DSM method
(the third patch).
Note that Surface Pro 3 is not yet supported because of the difference
between Surface 3. On Surface 3, the wifi card will be immediately
removed/reprobed after the _DSM call. On the other hand, Surface Pro 3
doesn't. Need to remove/reprobe wifi card ourselves. This behavior makes
the support difficult.

Surface gen4+ devices (Surface Pro 4 and later) require a quirk that
puts wifi into D3cold before FLR.

While here, created new files for quirks (mwifiex/pcie_quirks.c and
mwifiex/pcie_quirks.h) because the changes are a little bit too big to
add into pcie.c.

Thanks,
Tsuchiya Yuto

Tsuchiya Yuto (3):
  mwifiex: pcie: add DMI-based quirk impl for Surface devices
  mwifiex: pcie: add reset_d3cold quirk for Surface gen4+ devices
  mwifiex: pcie: add reset_wsid quirk for Surface 3

 drivers/net/wireless/marvell/mwifiex/Makefile |   1 +
 drivers/net/wireless/marvell/mwifiex/pcie.c   |  21 ++
 drivers/net/wireless/marvell/mwifiex/pcie.h   |   2 +
 .../wireless/marvell/mwifiex/pcie_quirks.c| 246 ++
 .../wireless/marvell/mwifiex/pcie_quirks.h|  17 ++
 5 files changed, 287 insertions(+)
 create mode 100644 drivers/net/wireless/marvell/mwifiex/pcie_quirks.c
 create mode 100644 drivers/net/wireless/marvell/mwifiex/pcie_quirks.h

-- 
2.29.1



[RFC PATCH 1/3] mwifiex: pcie: add DMI-based quirk impl for Surface devices

2020-10-28 Thread Tsuchiya Yuto
This commit adds quirk implementation based on DMI matching with DMI
table for Microsoft Surface devices that uses mwifiex chip (currently,
all the devices that use mwifiex equip PCIe-88W8897 chip).

This implementation can be used for quirks later.

Signed-off-by: Tsuchiya Yuto 
---
 drivers/net/wireless/marvell/mwifiex/Makefile |   1 +
 drivers/net/wireless/marvell/mwifiex/pcie.c   |   4 +
 drivers/net/wireless/marvell/mwifiex/pcie.h   |   2 +
 .../wireless/marvell/mwifiex/pcie_quirks.c| 114 ++
 .../wireless/marvell/mwifiex/pcie_quirks.h|  11 ++
 5 files changed, 132 insertions(+)
 create mode 100644 drivers/net/wireless/marvell/mwifiex/pcie_quirks.c
 create mode 100644 drivers/net/wireless/marvell/mwifiex/pcie_quirks.h

diff --git a/drivers/net/wireless/marvell/mwifiex/Makefile 
b/drivers/net/wireless/marvell/mwifiex/Makefile
index fdfd9bf15ed46..8a1e7c5b9c6e2 100644
--- a/drivers/net/wireless/marvell/mwifiex/Makefile
+++ b/drivers/net/wireless/marvell/mwifiex/Makefile
@@ -49,6 +49,7 @@ mwifiex_sdio-y += sdio.o
 obj-$(CONFIG_MWIFIEX_SDIO) += mwifiex_sdio.o
 
 mwifiex_pcie-y += pcie.o
+mwifiex_pcie-y += pcie_quirks.o
 obj-$(CONFIG_MWIFIEX_PCIE) += mwifiex_pcie.o
 
 mwifiex_usb-y += usb.o
diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c 
b/drivers/net/wireless/marvell/mwifiex/pcie.c
index 6a10ff0377a24..362cf10debfa0 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
@@ -27,6 +27,7 @@
 #include "wmm.h"
 #include "11n.h"
 #include "pcie.h"
+#include "pcie_quirks.h"
 
 #define PCIE_VERSION   "1.0"
 #define DRV_NAME"Marvell mwifiex PCIe"
@@ -410,6 +411,9 @@ static int mwifiex_pcie_probe(struct pci_dev *pdev,
return ret;
}
 
+   /* check quirks */
+   mwifiex_initialize_quirks(card);
+
if (mwifiex_add_card(card, >fw_done, _ops,
 MWIFIEX_PCIE, >dev)) {
pr_err("%s failed\n", __func__);
diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.h 
b/drivers/net/wireless/marvell/mwifiex/pcie.h
index 843d57eda8201..09839a3bd1753 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie.h
+++ b/drivers/net/wireless/marvell/mwifiex/pcie.h
@@ -242,6 +242,8 @@ struct pcie_service_card {
struct mwifiex_msix_context share_irq_ctx;
struct work_struct work;
unsigned long work_flags;
+
+   unsigned long quirks;
 };
 
 static inline int
diff --git a/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c 
b/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c
new file mode 100644
index 0..929aee2b0a60a
--- /dev/null
+++ b/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c
@@ -0,0 +1,114 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * File for PCIe quirks.
+ */
+
+/* The low-level PCI operations will be performed in this file. Therefore,
+ * let's use dev_*() instead of mwifiex_dbg() here to avoid troubles (e.g.
+ * to avoid using mwifiex_adapter struct before init or wifi is powered
+ * down, or causes NULL ptr deref).
+ */
+
+#include 
+
+#include "pcie_quirks.h"
+
+/* quirk table based on DMI matching */
+static const struct dmi_system_id mwifiex_quirk_table[] = {
+   {
+   .ident = "Surface Pro 4",
+   .matches = {
+   DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Microsoft 
Corporation"),
+   DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Surface Pro 4"),
+   },
+   .driver_data = 0,
+   },
+   {
+   .ident = "Surface Pro 5",
+   .matches = {
+   /* match for SKU here due to generic product name 
"Surface Pro" */
+   DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Microsoft 
Corporation"),
+   DMI_EXACT_MATCH(DMI_PRODUCT_SKU, "Surface_Pro_1796"),
+   },
+   .driver_data = 0,
+   },
+   {
+   .ident = "Surface Pro 5 (LTE)",
+   .matches = {
+   /* match for SKU here due to generic product name 
"Surface Pro" */
+   DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Microsoft 
Corporation"),
+   DMI_EXACT_MATCH(DMI_PRODUCT_SKU, "Surface_Pro_1807"),
+   },
+   .driver_data = 0,
+   },
+   {
+   .ident = "Surface Pro 6",
+   .matches = {
+   DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Microsoft 
Corporation"),
+   DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Surface Pro 6"),
+   },
+   .driver_data = 0,
+   },
+   {
+   .ident = "Surface Book 1",
+   .matches = {
+   DMI_EXACT_MA

[PATCH 2/3] mwifiex: add allow_ps_mode module parameter

2020-10-28 Thread Tsuchiya Yuto
To make the ps_mode (power_save) control easier, this commit adds a new
module parameter allow_ps_mode and set it false (disallowed) by default.

When this parameter is set to false, changing the power_save mode will
be disallowed like the following:

$ sudo iw dev mlan0 set power_save on
command failed: Operation not permitted (-1)

Signed-off-by: Tsuchiya Yuto 
---
 drivers/net/wireless/marvell/mwifiex/cfg80211.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/drivers/net/wireless/marvell/mwifiex/cfg80211.c 
b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
index a6b9dc6700b14..943bc1e8ceaee 100644
--- a/drivers/net/wireless/marvell/mwifiex/cfg80211.c
+++ b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
@@ -25,6 +25,11 @@
 static char *reg_alpha2;
 module_param(reg_alpha2, charp, 0);
 
+static bool allow_ps_mode;
+module_param(allow_ps_mode, bool, 0644);
+MODULE_PARM_DESC(allow_ps_mode,
+"allow WiFi power management to be enabled. (default: 
disallowed)");
+
 static const struct ieee80211_iface_limit mwifiex_ap_sta_limits[] = {
{
.max = MWIFIEX_MAX_BSS_NUM,
@@ -435,6 +440,17 @@ mwifiex_cfg80211_set_power_mgmt(struct wiphy *wiphy,
 
ps_mode = enabled;
 
+   /* Allow ps_mode to be enabled only when allow_ps_mode is true */
+   if (ps_mode && !allow_ps_mode) {
+   mwifiex_dbg(priv->adapter, MSG,
+   "Enabling ps_mode disallowed by modparam\n");
+
+   /* Return -EPERM to inform userspace tools that setting
+* power_save to be enabled is not permitted.
+*/
+   return -EPERM;
+   }
+
return mwifiex_drv_set_power(priv, _mode);
 }
 
-- 
2.29.1



[RFC PATCH 2/3] mwifiex: pcie: add reset_d3cold quirk for Surface gen4+ devices

2020-10-28 Thread Tsuchiya Yuto
To reset mwifiex on Surface gen4+ (Pro 4 or later gen) devices, it
seems that putting the wifi device into D3cold is required according
to errata.inf file on Windows installation (Windows/INF/errata.inf).

This patch adds a function that performs power-cycle (put into D3cold
then D0) and call the function at the end of reset_prepare().

Note: Need to also reset the parent device (bridge) of wifi on SB1;
it might be because the bridge of wifi always reports it's in D3hot.
When I tried to reset only the wifi device (not touching parent), it gave
the following error and the reset failed:

acpi device:4b: Cannot transition to power state D0 for parent in D3hot
mwifiex_pcie :03:00.0: can't change power state from D3cold to D0 
(config space inaccessible)

Signed-off-by: Tsuchiya Yuto 
---
Current issue:
* After reset with this quirk, ASPM settings don't get restored.

Below is the "sudo lspci -nnvvv" diff before/after fw reset on Surface Book 1:

#
# 03:00.0 Ethernet controller [0200]: Marvell Technology Group Ltd. 88W8897 
[AVASTAR] 802.11ac Wireless [11ab:2b38]
#
@@ -574,9 +574,9 @@
Capabilities: [168 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ 
L1_PM_Substates+
  PortCommonModeRestoreTime=70us 
PortTPowerOnTime=10us
-   L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
-  T_CommonMode=0us LTR1.2_Threshold=163840ns
-   L1SubCtl2: T_PwrOn=44us
+   L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
+  T_CommonMode=0us LTR1.2_Threshold=0ns
+   L1SubCtl2: T_PwrOn=10us
Kernel driver in use: mwifiex_pcie
Kernel modules: mwifiex_pcie

#
# no changes on root port of wifi regarding ASPM
#

As you see, all of the L1 substates are disabled after fw reset. LTR
value is also changed.

 drivers/net/wireless/marvell/mwifiex/pcie.c   |  7 ++
 .../wireless/marvell/mwifiex/pcie_quirks.c| 73 +--
 .../wireless/marvell/mwifiex/pcie_quirks.h|  3 +-
 3 files changed, 74 insertions(+), 9 deletions(-)

diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c 
b/drivers/net/wireless/marvell/mwifiex/pcie.c
index 362cf10debfa0..c0c4b5a9149ab 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
@@ -529,6 +529,13 @@ static void mwifiex_pcie_reset_prepare(struct pci_dev 
*pdev)
mwifiex_shutdown_sw(adapter);
clear_bit(MWIFIEX_IFACE_WORK_DEVICE_DUMP, >work_flags);
clear_bit(MWIFIEX_IFACE_WORK_CARD_RESET, >work_flags);
+
+   /* For Surface gen4+ devices, we need to put wifi into D3cold right
+* before performing FLR
+*/
+   if (card->quirks & QUIRK_FW_RST_D3COLD)
+   mwifiex_pcie_reset_d3cold_quirk(pdev);
+
mwifiex_dbg(adapter, INFO, "%s, successful\n", __func__);
 }
 
diff --git a/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c 
b/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c
index 929aee2b0a60a..edc739c542fea 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie_quirks.c
@@ -21,7 +21,7 @@ static const struct dmi_system_id mwifiex_quirk_table[] = {
DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Microsoft 
Corporation"),
DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Surface Pro 4"),
},
-   .driver_data = 0,
+   .driver_data = (void *)QUIRK_FW_RST_D3COLD,
},
{
.ident = "Surface Pro 5",
@@ -30,7 +30,7 @@ static const struct dmi_system_id mwifiex_quirk_table[] = {
DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Microsoft 
Corporation"),
DMI_EXACT_MATCH(DMI_PRODUCT_SKU, "Surface_Pro_1796"),
},
-   .driver_data = 0,
+   .driver_data = (void *)QUIRK_FW_RST_D3COLD,
},
{
.ident = "Surface Pro 5 (LTE)",
@@ -39,7 +39,7 @@ static const struct dmi_system_id mwifiex_quirk_table[] = {
DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Microsoft 
Corporation"),
DMI_EXACT_MATCH(DMI_PRODUCT_SKU, "Surface_Pro_1807"),
},
-   .driver_data = 0,
+   .driver_data = (void *)QUIRK_FW_RST_D3COLD,
},
{
.ident = "Surface Pro 6",
@@ -47,7 +47,7 @@ static const struct dmi_system_id mwifiex_quirk_table[] = {
DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Microsoft 
Corporation"),
DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Surface Pro 6"),
},
-   .driver_data = 0,
+   .d