Re: [Bug 215129] New: Linux kernel hangs during power down

2021-11-24 Thread Heiner Kallweit
On 25.11.2021 01:46, Jakub Kicinski wrote:
> Adding Kalle and Heiner.
> 
> On Wed, 24 Nov 2021 14:45:05 -0800 Stephen Hemminger wrote:
>> Begin forwarded message:
>>
>> Date: Wed, 24 Nov 2021 21:14:53 +
>> From: bugzilla-dae...@bugzilla.kernel.org
>> To: step...@networkplumber.org
>> Subject: [Bug 215129] New: Linux kernel hangs during power down
>>
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=215129
>>
>> Bug ID: 215129
>>Summary: Linux kernel hangs during power down
>>Product: Networking
>>Version: 2.5
>> Kernel Version: 5.15
>>   Hardware: All
>> OS: Linux
>>   Tree: Mainline
>> Status: NEW
>>   Severity: normal
>>   Priority: P1
>>  Component: Other
>>   Assignee: step...@networkplumber.org
>>   Reporter: martin.sto...@gmail.com
>> Regression: No
>>
>> Created attachment 299703
>>   --> https://bugzilla.kernel.org/attachment.cgi?id=299703&action=edit
>> Kernel log after timeout occurred
>>
>> On my system the kernel is waiting for a task during shutdown which doesn't
>> complete.
>>
>> The commit which causes this behavior is:
>> [f32a213765739f2a1db319346799f130a3d08820] ethtool: runtime-resume netdev
>> parent before ethtool ioctl ops
>>
>> This bug also causes the system to become unresponsive after starting Steam:
>> https://steamcommunity.com/app/221410/discussions/2/3194736442566303600/
>>
> 

I think the reference to ath10k_pci is misleading; Kalle isn't needed here.
The actual issue is an RTNL deadlock in igb_resume(). See log snippet:

Nov 24 18:56:19 MartinsPc kernel:  igb_resume+0xff/0x1e0 [igb 21bf6a00cb1f20e9b0e8434f7f8748a0504e93f8]
Nov 24 18:56:19 MartinsPc kernel:  pci_pm_runtime_resume+0xa7/0xd0
Nov 24 18:56:19 MartinsPc kernel:  ? pci_pm_freeze_noirq+0x110/0x110
Nov 24 18:56:19 MartinsPc kernel:  __rpm_callback+0x41/0x120
Nov 24 18:56:19 MartinsPc kernel:  ? pci_pm_freeze_noirq+0x110/0x110
Nov 24 18:56:19 MartinsPc kernel:  rpm_callback+0x35/0x70
Nov 24 18:56:19 MartinsPc kernel:  rpm_resume+0x567/0x810
Nov 24 18:56:19 MartinsPc kernel:  __pm_runtime_resume+0x4a/0x80
Nov 24 18:56:19 MartinsPc kernel:  dev_ethtool+0xd4/0x2d80

We have at least two places in net core where runtime_resume() is called
under RTNL. This conflicts with the current structure in a few Intel drivers
that have something like the following in their resume path:

	rtnl_lock();
	if (!err && netif_running(netdev))
		err = __igb_open(netdev, true);

	if (!err)
		netif_device_attach(netdev);
	rtnl_unlock();

Other drivers don't do this, so the question is whether taking RTNL is
actually needed here. Some discussion was started [0], but it ended
without a tangible result, and since then it has been surprisingly quiet.

[0] https://www.spinics.net/lists/netdev/msg736880.html
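
To make the lock ordering concrete, here is a minimal userspace sketch of the
pattern (not kernel code). It assumes a toy non-recursive lock that reports a
second acquire instead of blocking; the real RTNL mutex would simply hang at
that point. All toy_* names are made up for illustration.

```c
#include <stdbool.h>

/* Toy stand-in for a non-recursive lock such as RTNL. */
struct toy_lock {
	bool held;
};

/* Returns 0 on success, -1 if the lock is already held.
 * A real mutex would block forever here instead of returning. */
static int toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return -1;
	l->held = true;
	return 0;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->held = false;
}

static struct toy_lock toy_rtnl;

/* Models a driver resume path; take_rtnl selects whether it grabs the
 * lock itself, as in the snippet quoted above. */
static int toy_driver_resume(bool take_rtnl)
{
	if (take_rtnl && toy_lock_acquire(&toy_rtnl))
		return -1;	/* lock already held by our caller */

	/* ... reopen and reattach the device here ... */

	if (take_rtnl)
		toy_lock_release(&toy_rtnl);
	return 0;
}

/* Models the ethtool ioctl path: the lock is already held when the
 * runtime resume of the parent device is triggered. */
static int toy_dev_ethtool(bool driver_takes_rtnl)
{
	int err;

	if (toy_lock_acquire(&toy_rtnl))
		return -1;
	err = toy_driver_resume(driver_takes_rtnl);
	toy_lock_release(&toy_rtnl);
	return err;
}
```

In this model toy_dev_ethtool(true) fails because the resume path tries to
take a lock its caller already holds, while toy_dev_ethtool(false) succeeds.
Resuming outside the ioctl path, toy_driver_resume(true), is fine either way.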

___
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k


Re: [Bug 215129] New: Linux kernel hangs during power down

2021-11-24 Thread Jakub Kicinski
Adding Kalle and Heiner.

On Wed, 24 Nov 2021 14:45:05 -0800 Stephen Hemminger wrote:
> Begin forwarded message:
> 
> Date: Wed, 24 Nov 2021 21:14:53 +
> From: bugzilla-dae...@bugzilla.kernel.org
> To: step...@networkplumber.org
> Subject: [Bug 215129] New: Linux kernel hangs during power down
> 
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=215129
> 
> Bug ID: 215129
>Summary: Linux kernel hangs during power down
>Product: Networking
>Version: 2.5
> Kernel Version: 5.15
>   Hardware: All
> OS: Linux
>   Tree: Mainline
> Status: NEW
>   Severity: normal
>   Priority: P1
>  Component: Other
>   Assignee: step...@networkplumber.org
>   Reporter: martin.sto...@gmail.com
> Regression: No
> 
> Created attachment 299703
>   --> https://bugzilla.kernel.org/attachment.cgi?id=299703&action=edit
> Kernel log after timeout occurred
> 
> On my system the kernel is waiting for a task during shutdown which doesn't
> complete.
> 
> The commit which causes this behavior is:
> [f32a213765739f2a1db319346799f130a3d08820] ethtool: runtime-resume netdev
> parent before ethtool ioctl ops
> 
> This bug also causes the system to become unresponsive after starting Steam:
> https://steamcommunity.com/app/221410/discussions/2/3194736442566303600/
> 


[PATCH v3] ath10k: Fix the MTU size on QCA9377 SDIO

2021-11-24 Thread Fabio Estevam
On an imx6dl-pico-pi board with a QCA9377 SDIO chip, simply trying to
connect via ssh to another machine causes:

[   55.824159] ath10k_sdio mmc1:0001:1: failed to transmit packet, dropping: -12
[   55.832169] ath10k_sdio mmc1:0001:1: failed to submit frame: -12
[   55.838529] ath10k_sdio mmc1:0001:1: failed to push frame: -12
[   55.905863] ath10k_sdio mmc1:0001:1: failed to transmit packet, dropping: -12
[   55.913650] ath10k_sdio mmc1:0001:1: failed to submit frame: -12
[   55.919887] ath10k_sdio mmc1:0001:1: failed to push frame: -12

, leading to an ssh connection failure.

One user inspected the size of frames on Wireshark and reported
the following:

"I was able to narrow the issue down to the mtu. If I set the mtu for
the wlan0 device to 1486 instead of 1500, the issue does not happen.

The size of frames that I see on Wireshark is exactly 1500 after
setting it to 1486."
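
The 14-byte difference in that report is the size of an Ethernet II header,
so the numbers are at least consistent with on-wire frames above 1500 bytes
being the ones that fail. A minimal sketch of the arithmetic, assuming the
capture shows Ethernet II framing without the FCS; DEMO_ETH_HLEN and
demo_frame_len() are names made up for illustration:

```c
/* Ethernet II header: 6-byte destination, 6-byte source, 2-byte EtherType. */
#define DEMO_ETH_HLEN 14u

/* Largest captured frame (header plus payload, no FCS) for a given MTU. */
static unsigned int demo_frame_len(unsigned int mtu)
{
	return mtu + DEMO_ETH_HLEN;
}
```

With this, demo_frame_len(1486) gives 1500, while the default MTU of 1500
gives 1514.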

Clearing the HI_ACS_FLAGS_ALT_DATA_CREDIT_SIZE avoids the problem and
the ssh command works successfully after that.

Introduce a 'credit_size_workaround' field to ath10k_hw_params for
the QCA9377 SDIO, so that the HI_ACS_FLAGS_ALT_DATA_CREDIT_SIZE
is not set in this case.

Tested with QCA9377 SDIO with firmware WLAN.TF.1.1.1-00061-QCATFSWPZ-1.

Fixes: 2f918ea98606 ("ath10k: enable alt data of TX path for sdio")
Signed-off-by: Fabio Estevam 
---
Changes since v2:
- Set the credit_size_workaround field as true for QCA9377 SDIO.

 drivers/net/wireless/ath/ath10k/core.c | 4 +++-
 drivers/net/wireless/ath/ath10k/hw.h   | 3 +++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
index 72a366aa9f60..8a325ae97b0e 100644
--- a/drivers/net/wireless/ath/ath10k/core.c
+++ b/drivers/net/wireless/ath/ath10k/core.c
@@ -571,6 +571,7 @@ static const struct ath10k_hw_params ath10k_hw_params_list[] = {
 		.ast_skid_limit = 0x10,
 		.num_wds_entries = 0x20,
 		.uart_pin_workaround = true,
+		.credit_size_workaround = true,
 		.dynamic_sar_support = false,
 	},
 	{
@@ -715,6 +716,7 @@ static void ath10k_send_suspend_complete(struct ath10k *ar)
 
 static int ath10k_init_sdio(struct ath10k *ar, enum ath10k_firmware_mode mode)
 {
+	bool mtu_workaround = ar->hw_params.credit_size_workaround;
 	int ret;
 	u32 param = 0;
 
@@ -732,7 +734,7 @@ static int ath10k_init_sdio(struct ath10k *ar, enum ath10k_firmware_mode mode)
 
 	param |= HI_ACS_FLAGS_SDIO_REDUCE_TX_COMPL_SET;
 
-	if (mode == ATH10K_FIRMWARE_MODE_NORMAL)
+	if (mode == ATH10K_FIRMWARE_MODE_NORMAL && !mtu_workaround)
 		param |= HI_ACS_FLAGS_ALT_DATA_CREDIT_SIZE;
 	else
 		param &= ~HI_ACS_FLAGS_ALT_DATA_CREDIT_SIZE;
diff --git a/drivers/net/wireless/ath/ath10k/hw.h b/drivers/net/wireless/ath/ath10k/hw.h
index 6b03c7787e36..591ef7416b61 100644
--- a/drivers/net/wireless/ath/ath10k/hw.h
+++ b/drivers/net/wireless/ath/ath10k/hw.h
@@ -618,6 +618,9 @@ struct ath10k_hw_params {
 	 */
 	bool uart_pin_workaround;
 
+	/* Workaround for the credit size calculation */
+	bool credit_size_workaround;
+
 	/* tx stats support over pktlog */
 	bool tx_stats_over_pktlog;
 
-- 
2.25.1


Re: [PATCH] ath10k: Clean the HI_ACS_FLAGS_ALT_DATA_CREDIT_SIZE flag

2021-11-24 Thread Fabio Estevam

Hi Kalle and Wen,

On 24/11/2021 05:05, Kalle Valo wrote:


> Thanks, I was worried it's something like this. One way to solve this
> would be to add a new field to ath10k_hw_params so that the workaround
> is done only on QCA9377 SDIO.


Thanks for the feedback, appreciate it.

I have done as suggested in v2.

Thanks a lot,

Fabio Estevam
--
DENX Software Engineering GmbH,  Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-60 Fax: (+49)-8142-66989-80 Email: feste...@denx.de


[PATCH v2] ath10k: Fix the MTU size on QCA9377 SDIO

2021-11-24 Thread Fabio Estevam
On an imx6dl-pico-pi board with a QCA9377 SDIO chip, simply trying to
connect via ssh to another machine causes:

[   55.824159] ath10k_sdio mmc1:0001:1: failed to transmit packet, dropping: -12
[   55.832169] ath10k_sdio mmc1:0001:1: failed to submit frame: -12
[   55.838529] ath10k_sdio mmc1:0001:1: failed to push frame: -12
[   55.905863] ath10k_sdio mmc1:0001:1: failed to transmit packet, dropping: -12
[   55.913650] ath10k_sdio mmc1:0001:1: failed to submit frame: -12
[   55.919887] ath10k_sdio mmc1:0001:1: failed to push frame: -12

, leading to an ssh connection failure.

One user inspected the size of frames on Wireshark and reported
the following:

"I was able to narrow the issue down to the mtu. If I set the mtu for
the wlan0 device to 1486 instead of 1500, the issue does not happen.

The size of frames that I see on Wireshark is exactly 1500 after
setting it to 1486."

Clearing the HI_ACS_FLAGS_ALT_DATA_CREDIT_SIZE avoids the problem and
the ssh command works successfully after that.

Introduce a 'credit_size_workaround' field to ath10k_hw_params for
the QCA9377 SDIO, so that the HI_ACS_FLAGS_ALT_DATA_CREDIT_SIZE
is not set in this case.

Tested with QCA9377 SDIO with firmware WLAN.TF.1.1.1-00061-QCATFSWPZ-1.

Fixes: 2f918ea98606 ("ath10k: enable alt data of TX path for sdio")
Signed-off-by: Fabio Estevam 
---
Changes since v1:
- Restrict the workaround only for QCA9377 SDIO

 drivers/net/wireless/ath/ath10k/core.c | 3 ++-
 drivers/net/wireless/ath/ath10k/hw.h   | 3 +++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
index 72a366aa9f60..5a936e643d7a 100644
--- a/drivers/net/wireless/ath/ath10k/core.c
+++ b/drivers/net/wireless/ath/ath10k/core.c
@@ -715,6 +715,7 @@ static void ath10k_send_suspend_complete(struct ath10k *ar)
 
 static int ath10k_init_sdio(struct ath10k *ar, enum ath10k_firmware_mode mode)
 {
+	bool mtu_workaround = ar->hw_params.credit_size_workaround;
 	int ret;
 	u32 param = 0;
 
@@ -732,7 +733,7 @@ static int ath10k_init_sdio(struct ath10k *ar, enum ath10k_firmware_mode mode)
 
 	param |= HI_ACS_FLAGS_SDIO_REDUCE_TX_COMPL_SET;
 
-	if (mode == ATH10K_FIRMWARE_MODE_NORMAL)
+	if (mode == ATH10K_FIRMWARE_MODE_NORMAL && !mtu_workaround)
 		param |= HI_ACS_FLAGS_ALT_DATA_CREDIT_SIZE;
 	else
 		param &= ~HI_ACS_FLAGS_ALT_DATA_CREDIT_SIZE;
diff --git a/drivers/net/wireless/ath/ath10k/hw.h b/drivers/net/wireless/ath/ath10k/hw.h
index 6b03c7787e36..591ef7416b61 100644
--- a/drivers/net/wireless/ath/ath10k/hw.h
+++ b/drivers/net/wireless/ath/ath10k/hw.h
@@ -618,6 +618,9 @@ struct ath10k_hw_params {
 	 */
 	bool uart_pin_workaround;
 
+	/* Workaround for the credit size calculation */
+	bool credit_size_workaround;
+
 	/* tx stats support over pktlog */
 	bool tx_stats_over_pktlog;
 
-- 
2.25.1


Re: [PATCH] ath10k: Clean the HI_ACS_FLAGS_ALT_DATA_CREDIT_SIZE flag

2021-11-24 Thread Kalle Valo
Wen Gong  writes:

> On 11/24/2021 3:46 PM, Kalle Valo wrote:
>> Fabio Estevam  writes:
>>
>>> Hi Kalle,
>>>
>>> On Mon, Nov 15, 2021 at 3:06 PM Fabio Estevam  wrote:
>>>> Hi Kalle,
>>>>
>>>> On Wed, Sep 15, 2021 at 1:05 PM Fabio Estevam  wrote:
>>>>> On an imx6dl-pico-pi board with a QCA9377 SDIO chip, the following
>>>>> errors are observed when the board works in STA mode:
>>>>>
>>>>> Simply running "ssh user@192.168.0.1" causes:
>>>>>
>>>>> [   55.824159] ath10k_sdio mmc1:0001:1: failed to transmit packet, dropping: -12
>>>>> [   55.832169] ath10k_sdio mmc1:0001:1: failed to submit frame: -12
>>>>> [   55.838529] ath10k_sdio mmc1:0001:1: failed to push frame: -12
>>>>> [   55.905863] ath10k_sdio mmc1:0001:1: failed to transmit packet, dropping: -12
>>>>> [   55.913650] ath10k_sdio mmc1:0001:1: failed to submit frame: -12
>>>>> [   55.919887] ath10k_sdio mmc1:0001:1: failed to push frame: -12
>>>>>
>>>>> and it is not possible to connect via ssh to the other machine.
>>>>>
>>>>> One user inspected the size of frames on Wireshark and reported
>>>>> the following:
>>>>>
>>>>> "I was able to narrow the issue down to the mtu. If I set the mtu for
>>>>> the wlan0 device to 1486 instead of 1500, the issue does not happen.
>>>>>
>>>>> The size of frames that I see on Wireshark is exactly 1500 after
>>>>> setting it to 1486."
>>>>>
>>>>> Clearing the HI_ACS_FLAGS_ALT_DATA_CREDIT_SIZE avoids the problem and
>>>>> the ssh command works successfully after that.
>>>>>
>>>>> Tested with QCA9377 SDIO with firmware WLAN.TF.1.1.1-00061-QCATFSWPZ-1.
>>>>>
>>>>> Fixes: 2f918ea98606 ("ath10k: enable alt data of TX path for sdio")
>>>>> Signed-off-by: Fabio Estevam 
>>>> A gentle ping on this one.
>>> Any comments, please? Without this fix, we cannot log in via ssh to
>>> another machine.
>> I don't have much time for ath10k nowadays, so expect long delays in
>> reviews.
>>
>> I'm worried that this breaks QCA6174 SDIO support. Wen, what do you
>> think of this? Is this because of differences between firmware versions
>> or what?
>
> It was added by the commit below; disabling it will significantly affect
> performance.

Thanks, I was worried it's something like this. One way to solve this
would be to add a new field to ath10k_hw_params so that the workaround
is done only on QCA9377 SDIO.
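
A rough standalone sketch of that idea (not the driver code): each hardware
entry carries a boolean, and the flag is only requested when the entry does
not ask for the workaround. The demo_* names and the bit values are
placeholders invented for illustration; the real definitions live in the
ath10k headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values, chosen only for this sketch. */
#define DEMO_FLAG_REDUCE_TX_COMPL	(1u << 0)
#define DEMO_FLAG_ALT_CREDIT_SIZE	(1u << 1)

enum demo_fw_mode {
	DEMO_MODE_NORMAL,
	DEMO_MODE_UTF,
};

/* Per-chip parameters; only the field relevant here is modelled. */
struct demo_hw_params {
	const char *name;
	bool credit_size_workaround;
};

/* Build the flag word: the alternate credit size is enabled only in
 * normal firmware mode and only on chips without the workaround. */
static uint32_t demo_acs_flags(const struct demo_hw_params *hw,
			       enum demo_fw_mode mode)
{
	uint32_t param = 0;

	param |= DEMO_FLAG_REDUCE_TX_COMPL;

	if (mode == DEMO_MODE_NORMAL && !hw->credit_size_workaround)
		param |= DEMO_FLAG_ALT_CREDIT_SIZE;
	else
		param &= ~DEMO_FLAG_ALT_CREDIT_SIZE;

	return param;
}
```

Chips that leave the field unset keep the current behaviour, so the
performance concern would only apply to entries that opt in.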

-- 
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
