Re: [PATCH 1/2] ath10k: add accounting for the extended peer statistics

2016-12-13 Thread Mohammed Shafi Shajakhan
On Tue, Dec 13, 2016 at 01:41:33PM +0100, Christian Lamparter wrote:
> Hello,
> 
> It looks like google put your mail into the spam-can. 
> I'm sorry for not answering sooner.

[shafi] np, thanks for your reply !

> 
> On Wednesday, December 7, 2016 11:58:24 AM CET Mohammed Shafi Shajakhan wrote:
> > On Mon, Dec 05, 2016 at 10:52:45PM +0100, Christian Lamparter wrote:
> > > The 10.4 firmware adds extended peer information to the
> > > firmware's statistics payload. This additional info is
> > > stored as a separate data field and the elements are
> > > stored in their own "peers_extd" list.
> > > 
> > > These elements can pile up in the same way as the peer
> > > information elements. This is because the
> > > ath10k_wmi_10_4_op_pull_fw_stats() function tries to
> > > pull the same amount (num_peer_stats) for every statistic
> > > data unit.
> > > 
> > > Fixes: 4a49ae94a448faa ("ath10k: fix 10.4 extended peer stats update")
> > > Signed-off-by: Christian Lamparter 
> > > ---
> > >  drivers/net/wireless/ath/ath10k/debug.c | 7 +--
> > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/net/wireless/ath/ath10k/debug.c 
> > > b/drivers/net/wireless/ath/ath10k/debug.c
> > > index 82a4c67f3672..4acd9eb65910 100644
> > > --- a/drivers/net/wireless/ath/ath10k/debug.c
> > > +++ b/drivers/net/wireless/ath/ath10k/debug.c
> > > @@ -399,6 +399,7 @@ void ath10k_debug_fw_stats_process(struct ath10k *ar, 
> > > struct sk_buff *skb)
> > >* prevent firmware from DoS-ing the host.
> > >*/
> > >   ath10k_fw_stats_peers_free(>debug.fw_stats.peers);
> > > + 
> > > ath10k_fw_extd_stats_peers_free(>debug.fw_stats.peers_extd);
> > 
> > [shafi] thanks for fixing this !
> > 
> > >   ath10k_warn(ar, "dropping fw peer stats\n");
> > >   goto free;
> > >   }
> > > @@ -409,10 +410,12 @@ void ath10k_debug_fw_stats_process(struct ath10k 
> > > *ar, struct sk_buff *skb)
> > >   goto free;
> > >   }
> > >  
> > > + if (!list_empty())
> > 
> > [shafi] sorry please correct me if i am wrong, for 'extended peer stats' we 
> > are checking
> > for normal 'peer stats' ? Is this the fix intended, i had started a build to
> > check your change and we will keep you posted, does this fix displaying
> > 'rx_duration' in ath10k fw_stats.
> The idea is not to queue any "extended peer stats" when there where no "peer 
> stats" to
> begin with. Because otherwise, the function is still vulnerable to OOM since 
> the 
> extended peers stats will be queued unchecked (not that this is currently a 
> problem).

[shafi] list_splice_tail_init should still check for non-empty 'peers_extd' list
and append if required ? please let me know if i am still missing something

>  
> > > + list_splice_tail_init(_extd,
> > > +   >debug.fw_stats.peers_extd);
> > > +
> > >   list_splice_tail_init(, >debug.fw_stats.peers);
> > >   list_splice_tail_init(, >debug.fw_stats.vdevs);
> > > - list_splice_tail_init(_extd,
> > > -   >debug.fw_stats.peers_extd);
> > >   }
> > >  
> > >   complete(>debug.fw_stats_complete);
> 
> Regards,
> Christian
> 
> 


RE: ATH9 driver issues on ARM64

2016-12-13 Thread Bharat Kumar Gogada
> On Sat, Dec 10, 2016 at 02:40:48PM +, Bharat Kumar Gogada wrote:
> > Hi,
> >
> > After taking some more lecroy traces, we see that after 2nd ASSERT from EP
> on ARM64 we see continuous data movement of 32 dwords or 12 dwords and
> never sign of DEASSERT.
> > Comparatively on working traces (x86) after 2nd assert there are only BAR
> register reads and writes and then DEASSERT, for almost most of the interrupts
> and we haven't seen 12 or 32 dwords data movement on this trace.
> >
> > I did not work on EP wifi/network drivers, any help why EP needs those many
> number of data at scan time ?
> 
> The device doesn't know whether it's in an x86 or an arm64 system.  If it 
> works
> differently, it must be because the PCI core or the driver is programming the
> device differently.
> 
> You should be able to match up Memory transactions from the host in the trace
> with things the driver does.  For example, if you see an Assert_INTx message
> from the device, you should eventually see a Memory Read from the host to get
> the ISR, i.e., some read done in the bowels of ath9k_hw_getisr().
> 
> I don't know how the ath9k device works, but there must be some Memory Read
> or Write done by the driver that tells the device "we've handled this 
> interrupt".
> The device should then send a Deassert_INTx; of course, if the device still
> requires service, e.g., because it has received more packets, it might leave 
> the
> INTx asserted.
> 
> I doubt you'd see exactly the same traces on x86 and arm64 because they aren't
> seeing the same network packets and the driver is executing at different 
> rates.
> But you should at least be able to identify interrupt assertion and the 
> actions of
> the driver's interrupt service routine.


Thanks Bjorn.

As you mentioned we did try to debug in that path. After we start scan after 
2nd ASSERT we see lots of 32 and 12 dword
data, and in function
void ath9k_hw_enable_interrupts(struct ath_hw *ah) 
{
...
..
REG_WRITE(ah, AR_IER, AR_IER_ENABLE);
// EP driver hangs at this 
position after 2nd ASSERT
// The following writes are not 
happening
if (!AR_SREV_9100(ah)) {
REG_WRITE(ah, AR_INTR_ASYNC_ENABLE, async_mask);
REG_WRITE(ah, AR_INTR_ASYNC_MASK, async_mask);

REG_WRITE(ah, AR_INTR_SYNC_ENABLE, sync_default);
REG_WRITE(ah, AR_INTR_SYNC_MASK, sync_default);
}   
ath_dbg(common, INTERRUPT, "AR_IMR 0x%x IER 0x%x\n",
REG_READ(ah, AR_IMR), REG_READ(ah, AR_IER));
}
The above funtion is invoked from tasklet.
I tried several boots every it stops here. The condition (!AR_SREV_9100(ah)) is 
true as per before 1st ASSERT handling.

Regards,
Bharat

> 
> > > Hello there,
> > >
> > > as this is a thread about ath9k and ARM64, i'm not sure if i should
> > > answer here or not, but i have similar "stalls" with ath9k on x86_64
> > > (starting with 4.9rc), stack trace is posted down below where the original
> ARM64 stall traces are.
> > >
> > > Greetings,
> > >
> > > Tobias
> > >
> > >
> > > On 08.12.2016 18:36, Kalle Valo wrote:
> > > > Bharat Kumar Gogada  writes:
> > > >
> > > >>   > [+cc Kalle, ath9k list]
> > > > Thanks, but please also CC linux-wireless. Full thread below for
> > > > the folks there.
> > > >
> > > >>> On Thu, Dec 08, 2016 at 01:49:42PM +, Bharat Kumar Gogada
> wrote:
> > >  Hi,
> > > 
> > >  Did anyone test Atheros ATH9
> > >  driver(drivers/net/wireless/ath/ath9k/)
> > >  on ARM64.  The end point is TP link wifi card with which
> > >  supports only legacy interrupts.
> > > >>> If it works on other arches and the arm64 PCI enumeration works,
> > > >>> my first guess would be an INTx issue, e.g., maybe the driver is
> > > >>> waiting for an interrupt that never arrives.
> > > >> We are not sure for now.
> > >  We are trying to test it on ARM64 with
> > >  (drivers/pci/host/pcie-xilinx-nwl.c) as root port.
> > > 
> > >  EP is getting enumerated and able to link up.
> > > 
> > >  But when we start scan system gets hanged.
> > > >>> When you say the system hangs when you start a scan, I assume
> > > >>> you mean a wifi scan, not the PCI enumeration.  A problem with a
> > > >>> wifi scan might cause a *process* to hang, but it shouldn't hang
> > > >>> the entire system.
> > > >>>
> > > >> Yes wifi scan.
> > >  When we took trace we see that after we start scan assert
> > >  message is sent but there is no de assert from end point.
> > > >>> Are you talking about a trace from a PCIe analyzer?  Do you see
> > > >>> an Assert_INTx PCIe message on the link?
> > > >>>
> > > >> Yes lecroy trace, yes we do see Assert_INTx and Deassert_INTx
> > > >> happening
> > > when we do interface link up.
> > > >> When we have less debug prints in Atheros 

Re: [PATCH v2 0/7] ath9k: EEPROM swapping improvements

2016-12-13 Thread Adrian Chadd
hi,

On 12 December 2016 at 12:05, Martin Blumenstingl
 wrote:

>
> It seems that there are a few devices out there where the whole EEPROM
> is swab16'ed which switches the position of the 1-byte fields
> opCapFlags and eepMisc.
> those still work fine with the new code, however I had a second patch
> in LEDE [0] which results in ath9k_platform_data.endian_check NOT
> being set anymore.
> that endian_check flag was used before to swab16 the whole EEPROM, to
> correct the position of the 1-byte fields again.
> Currently we are fixing this in the firmware hotplug script: [1]
> This is definitely not a blocker for this series though (if we want to
> have a devicetree replacement for "ath9k_platform_data.endian_check"
> then I'd work on that within a separate series, but I somewhat
> consider these EEPROMs as "broken" so fixing them in
> userspace/firmware hotplug script is fine for me)

As a reference - the reference driver has been doign this for a while.
It attempts to detect the endianness by looking at the 0xa55a
signature endian and figuring out which endian the actual contents are
in.

So just FYI yeah, this is a "thing" for reasons I don't quite know.



-adrian

>
>
> Regards,
> Martin
>
>
> [0] 
> https://git.lede-project.org/?p=source.git;a=commitdiff;h=a20616863d32d91163043b6657a63c836bd9c5ba
> [1] 
> https://git.lede-project.org/?p=source.git;a=commitdiff;h=afa37092663d00aa0abf8c61943d9a1b5558b144


Re: ath10k firmware crashes in mesh mode on QCA9880

2016-12-13 Thread Adrian Chadd
Hi!

ok, thanks! I've seen some .. annoying rate control related firmware
crashes if you aren't using 11ac / 11n rates (ie you're /really/
legacy, so I wondered if something similar is going on here.

Thanks!


-a


On 13 December 2016 at 22:06, Alexis Green  wrote:
> Hi Adrian,
>
> I have not done much testing of ath10k and ath9k devices in a single
> encrypted mesh recently, but I have a memory of only having this issue
> when communicating between ath10k devices.
>
> Alexis
>
> On Tue, Dec 13, 2016 at 9:53 PM, Adrian Chadd  wrote:
>> Hi!
>>
>> Hm! So is there a firmware bug if there are 11n only capable nodes in
>> an 11s mesh?
>>
>>
>>
>> -adrian


Re: ath10k firmware crashes in mesh mode on QCA9880

2016-12-13 Thread Alexis Green
Hi Adrian,

I have not done much testing of ath10k and ath9k devices in a single
encrypted mesh recently, but I have a memory of only having this issue
when communicating between ath10k devices.

Alexis

On Tue, Dec 13, 2016 at 9:53 PM, Adrian Chadd  wrote:
> Hi!
>
> Hm! So is there a firmware bug if there are 11n only capable nodes in
> an 11s mesh?
>
>
>
> -adrian


Re: ath10k firmware crashes in mesh mode on QCA9880

2016-12-13 Thread Adrian Chadd
Hi!

Hm! So is there a firmware bug if there are 11n only capable nodes in
an 11s mesh?



-adrian


Re: ath10k firmware crashes in mesh mode on QCA9880

2016-12-13 Thread Alexis Green
Thank you for your help Rajkumar,

We've traced the problem down to a peering issue. Looks like there was
a missing compile flag that caused some kind of incongruence. My best
guest is that beacons are generated by firmware and advertise support
for AC mode, whereas wpa_supplicant, when not compiled with
CONFIG_IEEE80211AC=y, sends mesh peering messages and creates peers
without AC support, causing firmware to get confused. After
recompiling supplicant with the correct flag, no more crashes were
observed in casual testing. I submitted a pull request to LEDE to,
hopefully, fix it in upstream.

Best regards,

Alexis

On Tue, Dec 13, 2016 at 3:51 PM, Manoharan, Rajkumar
 wrote:
>> Tested the 10.2.4.70.59-2 firmware and wpa_supplicant running WITHOUT
>> encryption and it still crashes. I suspect this means wpa_supplicant is 
>> setting up
>> the interface incorrectly and/or transmitting a malformed packet that is 
>> causing
>> the driver to crash.
>>
> Ben,
>
> IIRC mesh support was validated in qca988x in VHT mode while ago.  Either it 
> could
> be regression in driver/fw or lede mac80211 package.
>
> 1) Could you please try plain backports in lede w/o applying ath10k patches.
>  I do see 160MHz support in LEDE.
> 2) There are some peer stats dump from your earlier log. Disable peer stats
>  by "peer_stats" debugfs.
> 3) Please confirm the behavior with older firmware revisions.
> 4) use iw to bring up open mesh to rule out wpa_s config
>
> -Rajkumar
>


RE: ath10k firmware crashes in mesh mode on QCA9880

2016-12-13 Thread Manoharan, Rajkumar
> Tested the 10.2.4.70.59-2 firmware and wpa_supplicant running WITHOUT
> encryption and it still crashes. I suspect this means wpa_supplicant is 
> setting up
> the interface incorrectly and/or transmitting a malformed packet that is 
> causing
> the driver to crash.
> 
Ben,

IIRC mesh support was validated in qca988x in VHT mode while ago.  Either it 
could
be regression in driver/fw or lede mac80211 package.

1) Could you please try plain backports in lede w/o applying ath10k patches.
 I do see 160MHz support in LEDE.
2) There are some peer stats dump from your earlier log. Disable peer stats 
 by "peer_stats" debugfs.
3) Please confirm the behavior with older firmware revisions.
4) use iw to bring up open mesh to rule out wpa_s config

-Rajkumar



Re: ath10k firmware crashes in mesh mode on QCA9880

2016-12-13 Thread Benjamin Morgan
Tested the 10.2.4.70.59-2 firmware and wpa_supplicant running WITHOUT 
encryption and it still crashes. I suspect this means wpa_supplicant is 
setting up the interface incorrectly and/or transmitting a malformed 
packet that is causing the driver to crash.


[  162.010206] ath10k_pci :01:00.0: firmware crashed! (uuid 
d30144f6-a8fb-4c0d-bcdf-6ff3b2c37243)
[  162.019322] ath10k_pci :01:00.0: qca988x hw2.0 target 0x4100016c 
chip_id 0x043202ff sub :
[  162.028687] ath10k_pci :01:00.0: kconfig debug 1 debugfs 1 
tracing 0 dfs 1 testmode 1
[  162.041764] ath10k_pci :01:00.0: firmware ver 10.2.4.70.59-2 api 
5 features no-p2p,raw-mode,mfp,allows-mesh-bcast crc32 4159f498
[  162.053908] ath10k_pci :01:00.0: board_file api 1 bmi_id N/A 
crc32 bebc7c08
[  162.061332] ath10k_pci :01:00.0: htt-ver 2.1 wmi-op 5 htt-op 2 
cal file max-sta 128 raw 0 hwcrypto 1

[  162.072975] ath10k_pci :01:00.0: firmware register dump:
[  162.078732] ath10k_pci :01:00.0: [00]: 0x4100016C 0x15B3 
0x009A45AF 0x00955B31
[  162.086771] ath10k_pci :01:00.0: [04]: 0x009A45AF 0x00060130 
0x0001 0x4000
[  162.094804] ath10k_pci :01:00.0: [08]: 0x0044110C 0x00442074 
0x00407120 0x004436CC
[  162.102849] ath10k_pci :01:00.0: [12]: 0x0009 0x 
0x009A3550 0x009A355E
[  162.110892] ath10k_pci :01:00.0: [16]: 0x00958080 0x0094085D 
0x 0x
[  162.118935] ath10k_pci :01:00.0: [20]: 0x409A45AF 0x0040AAC4 
0x0040AC60 0x0040AC09
[  162.126978] ath10k_pci :01:00.0: [24]: 0x809A44F2 0x0040AB24 
0x 0xC09A45AF
[  162.135011] ath10k_pci :01:00.0: [28]: 0x809A3A16 0x0040AB84 
0x0044110C 0x00442074
[  162.143056] ath10k_pci :01:00.0: [32]: 0x809A601A 0x0040ABB4 
0x0044110C 0x00407120
[  162.151099] ath10k_pci :01:00.0: [36]: 0x809A2EA4 0x0040ABF4 
0x0040AC10 0x1580
[  162.159142] ath10k_pci :01:00.0: [40]: 0x80990F63 0x0040AD04 
0x009C6458 0x004436CC
[  162.167185] ath10k_pci :01:00.0: [44]: 0x80998520 0x0040AD64 
0x004208FC 0x00439E4C
[  162.175225] ath10k_pci :01:00.0: [48]: 0x8099AEA5 0x0040AD84 
0x004208FC 0x004265C4
[  162.183253] ath10k_pci :01:00.0: [52]: 0x809BFC39 0x0040AEE4 
0x00424FE8 0x0002
[  162.191298] ath10k_pci :01:00.0: [56]: 0x80940F18 0x0040AF14 
0x0004 0x004039D0

[  162.297229] ieee80211 phy0: Hardware restart was requested
[  162.302880] ath10k_pci :01:00.0: wmi disable pktlog

~Benjamin


On 12/13/2016 10:42 AM, Benjamin Morgan wrote:
Just tested the latest 10.2.4.70.59-2 firmware and it still crashes 
with wpa_supplicant encrypted mesh =(


[   85.201440] ath10k_pci :01:00.0: firmware crashed! (uuid 
b7f44483-0488-46af-8dff-db88f4b56327)
[   85.210573] ath10k_pci :01:00.0: qca988x hw2.0 target 
0x4100016c chip_id 0x043202ff sub :
[   85.219940] ath10k_pci :01:00.0: kconfig debug 1 debugfs 1 
tracing 0 dfs 1 testmode 1
[   85.233034] ath10k_pci :01:00.0: firmware ver 10.2.4.70.59-2 
api 5 features no-p2p,raw-mode,mfp,allows-mesh-bcast crc32 4159f498
[   85.245177] ath10k_pci :01:00.0: board_file api 1 bmi_id N/A 
crc32 bebc7c08
[   85.252592] ath10k_pci :01:00.0: htt-ver 2.1 wmi-op 5 htt-op 2 
cal file max-sta 128 raw 0 hwcrypto 1

[   85.264235] ath10k_pci :01:00.0: firmware register dump:
[   85.269992] ath10k_pci :01:00.0: [00]: 0x4100016C 0x15B3 
0x009A45AF 0x00955B31
[   85.278031] ath10k_pci :01:00.0: [04]: 0x009A45AF 0x00060130 
0x0002 0x00439E98
[   85.286078] ath10k_pci :01:00.0: [08]: 0x0044110C 0x00442074 
0x00407120 0x004436CC
[   85.294107] ath10k_pci :01:00.0: [12]: 0x0009 0x 
0x009A3550 0x009A355E
[   85.302152] ath10k_pci :01:00.0: [16]: 0x00958080 0x0094085D 
0x 0x
[   85.310195] ath10k_pci :01:00.0: [20]: 0x409A45AF 0x0040AAC4 
0x0040AC60 0x0040AC09
[   85.318239] ath10k_pci :01:00.0: [24]: 0x809A44F2 0x0040AB24 
0x0040 0xC09A45AF
[   85.326282] ath10k_pci :01:00.0: [28]: 0x809A3A16 0x0040AB84 
0x0044110C 0x00442074
[   85.334314] ath10k_pci :01:00.0: [32]: 0x809A601A 0x0040ABB4 
0x0044110C 0x00407120
[   85.342350] ath10k_pci :01:00.0: [36]: 0x809A2EA4 0x0040ABF4 
0x0040AC14 0x1580
[   85.350393] ath10k_pci :01:00.0: [40]: 0x80990F63 0x0040AD04 
0x009C6458 0x004436CC
[   85.358437] ath10k_pci :01:00.0: [44]: 0x80998520 0x0040AD64 
0x004208FC 0x00439E4C
[   85.366479] ath10k_pci :01:00.0: [48]: 0x8099AEA5 0x0040AD84 
0x004208FC 0x00425AAC
[   85.374512] ath10k_pci :01:00.0: [52]: 0x809BFC39 0x0040AEE4 
0x00424FE8 0x0002
[   85.382548] ath10k_pci :01:00.0: [56]: 0x80940F18 0x0040AF14 
0x0004 0x004039D0

[   85.487067] ieee80211 phy0: Hardware restart was requested
[   85.492701] ath10k_pci :01:00.0: wmi disable pktlog

Any new leads on tracking down this issue?

~Benjamin


On 12/06/2016 01:32 PM, Benjamin Morgan wrote:
1. Yes, this appears to happens every time a unicast packet with 
wpa_supplicant encryption in 

[PATCH V2] ath10k: fix incorrect txpower set by P2P_DEVICE interface

2016-12-13 Thread ryanhsu
From: Ryan Hsu 

Ath10k reports the phy capability that supports P2P_DEVICE interface.

When we use the P2P supported wpa_supplicant to start connection, it'll
create two interfaces, one is wlan0 (vdev_id=0) and one is P2P_DEVICE
p2p-dev-wlan0 which is for p2p control channel (vdev_id=1).

ath10k_pci mac vdev create 0 (add interface) type 2 subtype 0
ath10k_add_interface: vdev_id: 0, txpower: 0, bss_power: 0
...
ath10k_pci mac vdev create 1 (add interface) type 2 subtype 1
ath10k_add_interface: vdev_id: 1, txpower: 0, bss_power: 0

And the txpower in per vif bss_conf will only be set to valid tx power when
the interface is assigned with channel_ctx.

But this P2P_DEVICE interface will never be used for any connection, so
that the uninitialized bss_conf.txpower=0 is assinged to the
arvif->txpower when interface created.

Since the txpower configuration is firmware per physical interface.
So the smallest txpower of all vifs will be the one limit the tx power
of the physical device, that causing the low txpower issue on other
active interfaces.

wlan0: Limiting TX power to 21 (24 - 3) dBm
ath10k_pci mac vdev_id 0 txpower 21
ath10k_mac_txpower_recalc: vdev_id: 1, txpower: 0
ath10k_mac_txpower_recalc: vdev_id: 0, txpower: 21
ath10k_pci mac txpower 0

This issue only happens when we use the wpa_supplicant that supports
P2P or if we use the iw tool to create the control P2P_DEVICE interface.

Signed-off-by: Ryan Hsu 
---
 drivers/net/wireless/ath/ath10k/mac.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c 
b/drivers/net/wireless/ath/ath10k/mac.c
index fe42bf5..6e1550b 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -4671,7 +4671,8 @@ static int ath10k_mac_txpower_recalc(struct ath10k *ar)
lockdep_assert_held(>conf_mutex);
 
list_for_each_entry(arvif, >arvifs, list) {
-   WARN_ON(arvif->txpower < 0);
+   if (arvif->txpower <= 0)
+   continue;
 
if (txpower == -1)
txpower = arvif->txpower;
@@ -4679,8 +4680,8 @@ static int ath10k_mac_txpower_recalc(struct ath10k *ar)
txpower = min(txpower, arvif->txpower);
}
 
-   if (WARN_ON(txpower == -1))
-   return -EINVAL;
+   if (txpower == -1)
+   return 0;
 
ret = ath10k_mac_txpower_setup(ar, txpower);
if (ret) {
-- 
1.9.1



Re: [RFC V3 03/11] nl80211: add support for gscan

2016-12-13 Thread Johannes Berg
On Tue, 2016-12-13 at 21:09 +0100, Arend Van Spriel wrote:
> 
> > There's a bit of a weird hard-coded restriction to 16 channels too,
> > that's due to the bucket map?
> 
> Uhm. Is there? I will check, but if you can give me a pointer where
> to look it is appreciated.

Just look for "< 16" or "<= 16" or so in the patch. I do think that's
because the channel map is a u16 though, not sure we'd want to change
that.

johannes


Re: [Patch] NFC: trf7970a:

2016-12-13 Thread Mark Greer
On Thu, Apr 21, 2016 at 05:01:19PM -0700, Mark Greer wrote:
> On Mon, Apr 18, 2016 at 03:48:37PM -0400, Geoff Lansberry wrote:
> 
> Hi Geoff.
> 
> > The current version of the trf7970a driver is missing support for several 
> > features that we needed to operate a custom board. 
> > We feel that these features will be useful to others as well, and we want 
> > to share them.
> > 
> > 1: Support for using a gpio as Slave-Select. Our processor has several 
> > devices on the spi bus, and we ran out of ss lines.  This patch gives 
> > TRF7970A the ability to
> >  drive the ss line of the chip from a gpio that is defined in the device 
> > tree.
> > 
> > 2. When reviewing problems we were having in our implementation with TI 
> > support staff, they recommended that during initialization, address 0x18 
> > should be written to zero.  This patch adds that change
> > 
> > 3. This existing version of the driver assumes that the crystal driving the 
> > trf7970a is 13.56 MHz, because there are several places in the driver code 
> > where the rel
> > ated register is re-written, there is clean way to change to 27.12 MHz.  
> > This patch adds a device tree option for 27 MHz and properly or's in 
> > changes in locations w
> > here the register is changed.
> > 
> > 4. the existing version of the driver assumes that 3.3 volt io is used. The 
> > trf7970a has a special register where you can configure it for 1.8 volt io. 
> >  This patch
> > adds a device tree option to select that this setting should be made.
> > 
> >  [PATCH 1/4] NFC: trf7970a: Add support for gpio as SS
> >  [PATCH 2/4] NFC: trf7970a: add TI recommended write of zero to
> >  [PATCH 3/4] NFC: trf7970a: add device tree option for 27MHz clock
> >  [PATCH 4/4] NFC: trf7970a: Add device tree option of 1.8 Volt IO
> 
> I'm on vacation this week but will be back next week.  I'll take a
> look once I'm back.
> 
> In the meantime, for emails sent to public (techie) list always keep
> the lines less than 80 characters and always bottom-post (i.e., put
> your text *underneath* the text that you are responding to).  Also,
> when you change one or more patches in a series, re-submit the entire
> series with the version number incremented (.e.g., v2, v3, ...) even
> when you change only one of them.  It is a easier for others to know
> what the latest versions are that way.

Hi Geoff.

I know its been a ridiculously long time since I said I would look at
your patches but I have time now.  Do you have an updated version of
your patch series?

Mark
--



Re: [RFC V3 03/11] nl80211: add support for gscan

2016-12-13 Thread Arend Van Spriel


On 13-12-2016 17:19, Johannes Berg wrote:
> On Mon, 2016-12-12 at 11:59 +, Arend van Spriel wrote:
>> This patch adds support for GScan which is a scan offload feature
>> used in Android.
> 
> Found a few places with spaces instead of tabs as indentation, and
> spurious braces around single-statement things, but other than that it
> looks fine from a patch/nl80211 POV.

I added a check in wiphy_register() in this patch which actually is more
appropriate with the "gscan capabilities" patch.

> Haven't really looked into the details of gscan itself now though,
> sorry.
> 
> There's a bit of a weird hard-coded restriction to 16 channels too,
> that's due to the bucket map?

Uhm. Is there? I will check, but if you can give me a pointer where to
look it is appreciated.

Regards,
Arend


Re: [RFC V3 01/11] nl80211: add reporting of gscan capabilities

2016-12-13 Thread Arend Van Spriel


On 13-12-2016 17:15, Johannes Berg wrote:
> 
>> +case 14:
>> +if (!rdev->wiphy.gscan) {
>> +/* done */
>> +state->split_start = 0;
>> +break;
>> +}
>>
> 
> Nit, but I'm not really happy with this - this assumes that case 14 is
> the last case, if anyone ever adds one we break this code but it would
> still work if the device has gscan. Move the gscan stuff into a new
> function and make that return immediately if gscan is NULL or so?

Agree. When coding this piece I was aware that this would need to change
when 'case 15' would be added, which is probably too easy to overlook. I
will change it.

Regards,
Arend


Re: [RFC v2 05/11] ath10k: htc: refactorization

2016-12-13 Thread Michal Kazior
On 13 December 2016 at 19:37, Erik Stromdahl  wrote:
>
>
> On 12/13/2016 06:26 PM, Valo, Kalle wrote:
>> Michal Kazior  writes:
>>
>>> On 13 December 2016 at 14:44, Valo, Kalle  wrote:
 Erik Stromdahl  writes:

> Code refactorization:
>
> Moved the code for ep 0 in ath10k_htc_rx_completion_handler
> to ath10k_htc_control_rx_complete.
>
> This eases the implementation of SDIO/mbox significantly since
> the ep_rx_complete cb is invoked directly from the SDIO/mbox
> hif layer.
>
> Since the ath10k_htc_control_rx_complete already is present
> (only containing a warning message) there is no reason for not
> using it (instead of having a special case for ep 0 in
> ath10k_htc_rx_completion_handler).
>
> Signed-off-by: Erik Stromdahl 

 I tested this on QCA988X PCI board just to see if there are any
 regressions. It crashes immediately during module load, every time, and
 bisected that the crashing starts on this patch:

 [ 1239.715325] ath10k_pci :02:00.0: pci irq msi oper_irq_mode 2 
 irq_mode 0 reset_mode 0
 [ 1239.885125] ath10k_pci :02:00.0: Direct firmware load for 
 ath10k/pre-cal-pci-:02:00.0.bin failed with error -2
 [ 1239.885260] ath10k_pci :02:00.0: Direct firmware load for 
 ath10k/cal-pci-:02:00.0.bin failed with error -2
 [ 1239.885687] ath10k_pci :02:00.0: qca988x hw2.0 target 0x4100016c 
 chip_id 0x043202ff sub :
 [ 1239.885699] ath10k_pci :02:00.0: kconfig debug 1 debugfs 1 tracing 
 1 dfs 1 testmode 1
 [ 1239.885899] ath10k_pci :02:00.0: firmware ver 10.2.4.70.59-2 api 5 
 features no-p2p,raw-mode,mfp,allows-mesh-bcast crc32 4159f498
 [ 1239.941836] ath10k_pci :02:00.0: Direct firmware load for 
 ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
 [ 1239.941993] ath10k_pci :02:00.0: board_file api 1 bmi_id N/A crc32 
 bebc7c08
 [ 1241.136693] BUG: unable to handle kernel NULL pointer dereference at   
 (null)
 [ 1241.136738] IP: [<  (null)>]   (null)
 [ 1241.136759] *pdpt =  *pde = f0002a55f0002a55 [ 
 1241.136781]
 [ 1241.136793] Oops: 0010 [#1] SMP

 What's odd is that when I added some printks on my own and enabled both
 boot and htc debug levels it doesn't crash anymore. After everything
 works normally after that, I can start AP mode and connect to it. Is it
 a race somewhere?
>>>
>>> Yes. htc_wait_target() is called after hif_start(). The ep_rx_complete
>>> is set in htc_wait_target() [changed patch 4, but still too late].
>>>
>>> ep_rx_complete must be set prior to calling hif_start(). You probably
>>> crash on end of ath10k_htc_rx_completion_handler() when trying to call
>>> ep->ep_ops.ep_rx_complete(ar, skb).
>>
>> Yeah, just checked and ep->ep_ops.ep_rx_complete is NULL at the end of
>> ath10k_htc_rx_completion_handler().
>>
> It is indeed correct as Michal points out, there is a risk that the
> first HTC control message (typically an HTC ready message) is received
> before the HTC control endpoint is connected.
>
> I have experienced a similar race with my SDIO implementation as well.
> In this case I did solve the issue by enabling HIF target interrupts
> after the HTC control endpoint was connected. I am not sure however if
> this is the most elegant way to solve this problem.
>
> My SDIO target won't send the HTC ready message before this is done.
> The fix essentially consists of moving the ..._irg_enable call from
> hif_start into another hif op.

It makes more sense to move ep_rx_complete setup/assignment before
hif_start(). This assignment should be done very early as there is
nothing to change/override for this endpoint during operation, is
there? It's known what it needs to store very early on.


> I have made a few updates since I submitted the original RFC and created
> a repo on github:
>
> https://github.com/erstrom/linux-ath
>
> I have a bunch of branches that are all based on the tags on the ath master.
>
> As of this moment the latest version is:
>
> ath-201612131156-ath10k-sdio
>
> This branch contains the original RFC patches plus some addons/fixes.
>
> In the above mentioned branch there are a few commits related to this
> race condition. Perhaps you can have a look at them?
>
> The commits are:
> 821672913328cf737c9616786dc28d2e4e8a4a90

I would avoid if(bus==xx) checks.


> dd7fcf0a1f78e68876d14f90c12bd37f3a700ad7
> 7434b7b40875bd08a3a48a437ba50afed7754931
>
> Perhaps this approach can work with PCIe as well?

I think I did contemplate the unmask/start distinction at some point
but I didn't go with it for some reason I can't recall now.


Michał


Re: ath10k firmware crashes in mesh mode on QCA9880

2016-12-13 Thread Benjamin Morgan
Just tested the latest 10.2.4.70.59-2 firmware and it still crashes with 
wpa_supplicant encrypted mesh =(


[   85.201440] ath10k_pci :01:00.0: firmware crashed! (uuid 
b7f44483-0488-46af-8dff-db88f4b56327)
[   85.210573] ath10k_pci :01:00.0: qca988x hw2.0 target 0x4100016c 
chip_id 0x043202ff sub :
[   85.219940] ath10k_pci :01:00.0: kconfig debug 1 debugfs 1 
tracing 0 dfs 1 testmode 1
[   85.233034] ath10k_pci :01:00.0: firmware ver 10.2.4.70.59-2 api 
5 features no-p2p,raw-mode,mfp,allows-mesh-bcast crc32 4159f498
[   85.245177] ath10k_pci :01:00.0: board_file api 1 bmi_id N/A 
crc32 bebc7c08
[   85.252592] ath10k_pci :01:00.0: htt-ver 2.1 wmi-op 5 htt-op 2 
cal file max-sta 128 raw 0 hwcrypto 1

[   85.264235] ath10k_pci :01:00.0: firmware register dump:
[   85.269992] ath10k_pci :01:00.0: [00]: 0x4100016C 0x15B3 
0x009A45AF 0x00955B31
[   85.278031] ath10k_pci :01:00.0: [04]: 0x009A45AF 0x00060130 
0x0002 0x00439E98
[   85.286078] ath10k_pci :01:00.0: [08]: 0x0044110C 0x00442074 
0x00407120 0x004436CC
[   85.294107] ath10k_pci :01:00.0: [12]: 0x0009 0x 
0x009A3550 0x009A355E
[   85.302152] ath10k_pci :01:00.0: [16]: 0x00958080 0x0094085D 
0x 0x
[   85.310195] ath10k_pci :01:00.0: [20]: 0x409A45AF 0x0040AAC4 
0x0040AC60 0x0040AC09
[   85.318239] ath10k_pci :01:00.0: [24]: 0x809A44F2 0x0040AB24 
0x0040 0xC09A45AF
[   85.326282] ath10k_pci :01:00.0: [28]: 0x809A3A16 0x0040AB84 
0x0044110C 0x00442074
[   85.334314] ath10k_pci :01:00.0: [32]: 0x809A601A 0x0040ABB4 
0x0044110C 0x00407120
[   85.342350] ath10k_pci :01:00.0: [36]: 0x809A2EA4 0x0040ABF4 
0x0040AC14 0x1580
[   85.350393] ath10k_pci :01:00.0: [40]: 0x80990F63 0x0040AD04 
0x009C6458 0x004436CC
[   85.358437] ath10k_pci :01:00.0: [44]: 0x80998520 0x0040AD64 
0x004208FC 0x00439E4C
[   85.366479] ath10k_pci :01:00.0: [48]: 0x8099AEA5 0x0040AD84 
0x004208FC 0x00425AAC
[   85.374512] ath10k_pci :01:00.0: [52]: 0x809BFC39 0x0040AEE4 
0x00424FE8 0x0002
[   85.382548] ath10k_pci :01:00.0: [56]: 0x80940F18 0x0040AF14 
0x0004 0x004039D0

[   85.487067] ieee80211 phy0: Hardware restart was requested
[   85.492701] ath10k_pci :01:00.0: wmi disable pktlog

Any new leads on tracking down this issue?

~Benjamin


On 12/06/2016 01:32 PM, Benjamin Morgan wrote:
1. Yes, this appears to happens every time a unicast packet with 
wpa_supplicant encryption in VHT80 mode is received. I haven't seen a 
successful ping-pong pair.

2. We tried with 10.2.4.70.42-2 firmware and still saw crashes.
3. We ran our experiment again with extra debugging turned on.
Node A: 18:A6:F7:23:6E:66 | 10.230.5.41
Node B: 18:A6:F7:26:0F:21 | 10.230.5.42
The ping command we used was run on Node A was 'ping -s 1500 -i 
0.1 10.230.5.42'

Here is the dmesg log from Node B.

[ 5413.478170] ath10k_pci :01:00.0: WMI_UPDATE_STATS_EVENTID
[ 5413.503954] ath10k_pci :01:00.0: scan event bss channel type 4 
reason 3 freq 5825 req_id 40961 scan_id 40960 vdev_id 0 state running (2)
[ 5413.503985] ath10k_pci :01:00.0: chan info err_code 0 freq 5825 
cmd_flags 1 noise_floor -105 rx_clear_count 7692807 cycle_count 312271423
[ 5413.504029] ath10k_pci :01:00.0: scan event completed type 2 
reason 0 freq 5825 req_id 40961 scan_id 40960 vdev_id 0 state running (2)
[ 5413.525868] ath10k_pci :01:00.0: wmi vdev install key idx 1 
cipher 4 len 16
[ 5413.526014] ath10k_pci :01:00.0: wmi vdev id 0x0 set param 31 
value 1

[ 5413.526193] ath10k_pci :01:00.0: mac vdev 0 set keyidx 1
[ 5413.526216] ath10k_pci :01:00.0: wmi vdev id 0x0 set param 31 
value 1
[ 5413.526532] ath10k_pci :01:00.0: mac chanctx add freq 5180 
width 3 ptr 86db29b0
[ 5413.526556] ath10k_pci :01:00.0: mac monitor recalc started? 0 
needed? 0 allowed? 1
[ 5413.526574] ath10k_pci :01:00.0: mac chanctx assign ptr 
86db29b0 vdev_id 0
[ 5413.526592] ath10k_pci :01:00.0: mac vdev 0 start center_freq 
5180 phymode 11ac-vht80
[ 5413.526616] ath10k_pci :01:00.0: wmi vdev start id 0x0 flags: 
0x0, freq 5180, mode 10, ch_flags: 0xA00, max_power: 46

[ 5413.533099] ath10k_pci :01:00.0: WMI_VDEV_START_RESP_EVENTID
[ 5413.533148] ath10k_pci :01:00.0: mac vdev_id 0 txpower 23
[ 5413.533163] ath10k_pci :01:00.0: mac txpower 23
[ 5413.533180] ath10k_pci :01:00.0: wmi pdev set param 3 value 46
[ 5413.533247] ath10k_pci :01:00.0: wmi pdev set param 4 value 46
[ 5413.533295] ath10k_pci :01:00.0: mac chanctx change freq 5180 
width 3 ptr 86db29b0 changed 10
[ 5413.533318] ath10k_pci :01:00.0: mac chanctx change freq 5180 
width 3 ptr 86db29b0 changed 2
[ 5413.57] ath10k_pci :01:00.0: mac monitor recalc started? 0 
needed? 1 allowed? 1
[ 5413.533357] ath10k_pci :01:00.0: WMI vdev create: id 1 type 4 
subtype 0 macaddr 18:a6:f7:26:0f:21

[ 5413.533412] ath10k_pci :01:00.0: mac monitor vdev 1 created
[ 

Re: [RFC v2 05/11] ath10k: htc: refactorization

2016-12-13 Thread Erik Stromdahl


On 12/13/2016 06:26 PM, Valo, Kalle wrote:
> Michal Kazior  writes:
> 
>> On 13 December 2016 at 14:44, Valo, Kalle  wrote:
>>> Erik Stromdahl  writes:
>>>
 Code refactorization:

 Moved the code for ep 0 in ath10k_htc_rx_completion_handler
 to ath10k_htc_control_rx_complete.

 This eases the implementation of SDIO/mbox significantly since
 the ep_rx_complete cb is invoked directly from the SDIO/mbox
 hif layer.

 Since the ath10k_htc_control_rx_complete already is present
 (only containing a warning message) there is no reason for not
 using it (instead of having a special case for ep 0 in
 ath10k_htc_rx_completion_handler).

 Signed-off-by: Erik Stromdahl 
>>>
>>> I tested this on QCA988X PCI board just to see if there are any
>>> regressions. It crashes immediately during module load, every time, and
>>> bisected that the crashing starts on this patch:
>>>
>>> [ 1239.715325] ath10k_pci :02:00.0: pci irq msi oper_irq_mode 2 
>>> irq_mode 0 reset_mode 0
>>> [ 1239.885125] ath10k_pci :02:00.0: Direct firmware load for 
>>> ath10k/pre-cal-pci-:02:00.0.bin failed with error -2
>>> [ 1239.885260] ath10k_pci :02:00.0: Direct firmware load for 
>>> ath10k/cal-pci-:02:00.0.bin failed with error -2
>>> [ 1239.885687] ath10k_pci :02:00.0: qca988x hw2.0 target 0x4100016c 
>>> chip_id 0x043202ff sub :
>>> [ 1239.885699] ath10k_pci :02:00.0: kconfig debug 1 debugfs 1 tracing 1 
>>> dfs 1 testmode 1
>>> [ 1239.885899] ath10k_pci :02:00.0: firmware ver 10.2.4.70.59-2 api 5 
>>> features no-p2p,raw-mode,mfp,allows-mesh-bcast crc32 4159f498
>>> [ 1239.941836] ath10k_pci :02:00.0: Direct firmware load for 
>>> ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
>>> [ 1239.941993] ath10k_pci :02:00.0: board_file api 1 bmi_id N/A crc32 
>>> bebc7c08
>>> [ 1241.136693] BUG: unable to handle kernel NULL pointer dereference at   
>>> (null)
>>> [ 1241.136738] IP: [<  (null)>]   (null)
>>> [ 1241.136759] *pdpt =  *pde = f0002a55f0002a55 [ 
>>> 1241.136781]
>>> [ 1241.136793] Oops: 0010 [#1] SMP
>>>
>>> What's odd is that when I added some printks on my own and enabled both
>>> boot and htc debug levels it doesn't crash anymore. After everything
>>> works normally after that, I can start AP mode and connect to it. Is it
>>> a race somewhere?
>>
>> Yes. htc_wait_target() is called after hif_start(). The ep_rx_complete
>> is set in htc_wait_target() [changed patch 4, but still too late].
>>
>> ep_rx_complete must be set prior to calling hif_start(). You probably
>> crash on end of ath10k_htc_rx_completion_handler() when trying to call
>> ep->ep_ops.ep_rx_complete(ar, skb).
> 
> Yeah, just checked and ep->ep_ops.ep_rx_complete is NULL at the end of
> ath10k_htc_rx_completion_handler().
> 
It is indeed correct as Michal points out, there is a risk that the
first HTC control message (typically an HTC ready message) is received
before the HTC control endpoint is connected.

I have experienced a similar race with my SDIO implementation as well.
In this case I did solve the issue by enabling HIF target interrupts
after the HTC control endpoint was connected. I am not sure however if
this is the most elegant way to solve this problem.

My SDIO target won't send the HTC ready message before this is done.
The fix essentially consists of moving the ..._irg_enable call from
hif_start into another hif op.

I have made a few updates since I submitted the original RFC and created
a repo on github:

https://github.com/erstrom/linux-ath

I have a bunch of branches that are all based on the tags on the ath master.

As of this moment the latest version is:

ath-201612131156-ath10k-sdio

This branch contains the original RFC patches plus some addons/fixes.

In the above mentioned branch there are a few commits related to this
race condition. Perhaps you can have a look at them?

The commits are:
821672913328cf737c9616786dc28d2e4e8a4a90
dd7fcf0a1f78e68876d14f90c12bd37f3a700ad7
7434b7b40875bd08a3a48a437ba50afed7754931

Perhaps this approach can work with PCIe as well?

/Erik


Re: [RFC v2 05/11] ath10k: htc: refactorization

2016-12-13 Thread Valo, Kalle
Michal Kazior  writes:

> On 13 December 2016 at 14:44, Valo, Kalle  wrote:
>> Erik Stromdahl  writes:
>>
>>> Code refactorization:
>>>
>>> Moved the code for ep 0 in ath10k_htc_rx_completion_handler
>>> to ath10k_htc_control_rx_complete.
>>>
>>> This eases the implementation of SDIO/mbox significantly since
>>> the ep_rx_complete cb is invoked directly from the SDIO/mbox
>>> hif layer.
>>>
>>> Since the ath10k_htc_control_rx_complete already is present
>>> (only containing a warning message) there is no reason for not
>>> using it (instead of having a special case for ep 0 in
>>> ath10k_htc_rx_completion_handler).
>>>
>>> Signed-off-by: Erik Stromdahl 
>>
>> I tested this on QCA988X PCI board just to see if there are any
>> regressions. It crashes immediately during module load, every time, and
>> bisected that the crashing starts on this patch:
>>
>> [ 1239.715325] ath10k_pci :02:00.0: pci irq msi oper_irq_mode 2 irq_mode 
>> 0 reset_mode 0
>> [ 1239.885125] ath10k_pci :02:00.0: Direct firmware load for 
>> ath10k/pre-cal-pci-:02:00.0.bin failed with error -2
>> [ 1239.885260] ath10k_pci :02:00.0: Direct firmware load for 
>> ath10k/cal-pci-:02:00.0.bin failed with error -2
>> [ 1239.885687] ath10k_pci :02:00.0: qca988x hw2.0 target 0x4100016c 
>> chip_id 0x043202ff sub :
>> [ 1239.885699] ath10k_pci :02:00.0: kconfig debug 1 debugfs 1 tracing 1 
>> dfs 1 testmode 1
>> [ 1239.885899] ath10k_pci :02:00.0: firmware ver 10.2.4.70.59-2 api 5 
>> features no-p2p,raw-mode,mfp,allows-mesh-bcast crc32 4159f498
>> [ 1239.941836] ath10k_pci :02:00.0: Direct firmware load for 
>> ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
>> [ 1239.941993] ath10k_pci :02:00.0: board_file api 1 bmi_id N/A crc32 
>> bebc7c08
>> [ 1241.136693] BUG: unable to handle kernel NULL pointer dereference at   
>> (null)
>> [ 1241.136738] IP: [<  (null)>]   (null)
>> [ 1241.136759] *pdpt =  *pde = f0002a55f0002a55 [ 
>> 1241.136781]
>> [ 1241.136793] Oops: 0010 [#1] SMP
>>
>> What's odd is that when I added some printks on my own and enabled both
>> boot and htc debug levels it doesn't crash anymore. After everything
>> works normally after that, I can start AP mode and connect to it. Is it
>> a race somewhere?
>
> Yes. htc_wait_target() is called after hif_start(). The ep_rx_complete
> is set in htc_wait_target() [changed patch 4, but still too late].
>
> ep_rx_complete must be set prior to calling hif_start(). You probably
> crash on end of ath10k_htc_rx_completion_handler() when trying to call
> ep->ep_ops.ep_rx_complete(ar, skb).

Yeah, just checked and ep->ep_ops.ep_rx_complete is NULL at the end of
ath10k_htc_rx_completion_handler().

-- 
Kalle Valo

[PATCH v2] ath9k: do not return early to fix rcu unlocking

2016-12-13 Thread Tobias Klausmann
Starting with commit d94a461d7a7d ("ath9k: use ieee80211_tx_status_noskb
where possible") the driver uses rcu_read_lock() && rcu_read_unlock(), yet on
returning early in ath_tx_edma_tasklet() the unlock is missing leading to stalls
and suspicious RCU usage:

 ===
 [ INFO: suspicious RCU usage. ]
 4.9.0-rc8 #11 Not tainted
 ---
 kernel/rcu/tree.c:705 Illegal idle entry in RCU read-side critical section.!

 other info that might help us debug this:

 RCU used illegally from idle CPU!
 rcu_scheduler_active = 1, debug_locks = 0
 RCU used illegally from extended quiescent state!
 1 lock held by swapper/7/0:
 #0:
  (
 rcu_read_lock
 ){..}
 , at:
 [] ath_tx_edma_tasklet+0x0/0x450 [ath9k]

 stack backtrace:
 CPU: 7 PID: 0 Comm: swapper/7 Not tainted 4.9.0-rc8 #11
 Hardware name: Acer Aspire V3-571G/VA50_HC_CR, BIOS V2.21 12/16/2013
  88025efc3f38 8132b1e5 88017ede4540 0001
  88025efc3f68 810a25f7 88025efcee60 88017edebdd8
  88025eeb5400 0091 88025efc3f88 810c3cd4
 Call Trace:
  
  [] dump_stack+0x68/0x93
  [] lockdep_rcu_suspicious+0xd7/0x110
  [] rcu_eqs_enter_common.constprop.85+0x154/0x200
  [] rcu_irq_exit+0x44/0xa0
  [] irq_exit+0x61/0xd0
  [] do_IRQ+0x65/0x110
  [] common_interrupt+0x89/0x89
  
  [] ? cpuidle_enter_state+0x151/0x200
  [] cpuidle_enter+0x12/0x20
  [] call_cpuidle+0x1e/0x40
  [] cpu_startup_entry+0x146/0x220
  [] start_secondary+0x148/0x170

Signed-off-by: Tobias Klausmann 
Fixes: d94a461d7a7d ("ath9k: use ieee80211_tx_status_noskb where possible")
Cc:  # v4.9

---
v2: break instead of unlock (rename patch) [Felix Fietkau],
fix reference to commit [Kalle Valo]
---
 drivers/net/wireless/ath/ath9k/xmit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath9k/xmit.c 
b/drivers/net/wireless/ath/ath9k/xmit.c
index 52bfbb988611..e47286bf378e 100644
--- a/drivers/net/wireless/ath/ath9k/xmit.c
+++ b/drivers/net/wireless/ath/ath9k/xmit.c
@@ -2787,7 +2787,7 @@ void ath_tx_edma_tasklet(struct ath_softc *sc)
fifo_list = >txq_fifo[txq->txq_tailidx];
if (list_empty(fifo_list)) {
ath_txq_unlock(sc, txq);
-   return;
+   break;
}
 
bf = list_first_entry(fifo_list, struct ath_buf, list);
-- 
2.11.0



Re: [PATCH] orinoco: Use shash instead of ahash for MIC calculations

2016-12-13 Thread Kalle Valo
Andy Lutomirski  writes:

> On Tue, Dec 13, 2016 at 3:35 AM, Kalle Valo  wrote:
>> Andy Lutomirski  writes:
>>
>>> Eric Biggers pointed out that the orinoco driver pointed scatterlists
>>> at the stack.
>>>
>>> Fix it by switching from ahash to shash.  The result should be
>>> simpler, faster, and more correct.
>>>
>>> Cc: sta...@vger.kernel.org # 4.9 only
>>> Reported-by: Eric Biggers 
>>> Signed-off-by: Andy Lutomirski 
>>
>> "more correct"? Does this fix a real user visible bug or what? And why
>> just stable 4.9, does this maybe have something to do with
>> CONFIG_VMAP_STACK?
>
> Whoops, I had that text in some other patches but forgot to put it in
> this one.  It'll blow up with CONFIG_VMAP_STACK=y if a debug option
> like CONFIG_DEBUG_VIRTUAL=y is set.  It may work by accident if
> debugging is off.

Makes sense now, thanks. I'll add that to the commit log and queue this
to 4.10.

-- 
Kalle Valo


Re: [PATCH] orinoco: Use shash instead of ahash for MIC calculations

2016-12-13 Thread Andy Lutomirski
On Tue, Dec 13, 2016 at 3:35 AM, Kalle Valo  wrote:
> Andy Lutomirski  writes:
>
>> Eric Biggers pointed out that the orinoco driver pointed scatterlists
>> at the stack.
>>
>> Fix it by switching from ahash to shash.  The result should be
>> simpler, faster, and more correct.
>>
>> Cc: sta...@vger.kernel.org # 4.9 only
>> Reported-by: Eric Biggers 
>> Signed-off-by: Andy Lutomirski 
>
> "more correct"? Does this fix a real user visible bug or what? And why
> just stable 4.9, does this maybe have something to do with
> CONFIG_VMAP_STACK?

Whoops, I had that text in some other patches but forgot to put it in
this one.  It'll blow up with CONFIG_VMAP_STACK=y if a debug option
like CONFIG_DEBUG_VIRTUAL=y is set.  It may work by accident if
debugging is off.

--Andy


Re: [RFC V3 01/11] nl80211: add reporting of gscan capabilities

2016-12-13 Thread Johannes Berg

> + case 14:
> + if (!rdev->wiphy.gscan) {
> + /* done */
> + state->split_start = 0;
> + break;
> + }
> 

Nit, but I'm not really happy with this - this assumes that case 14 is
the last case, if anyone ever adds one we break this code but it would
still work if the device has gscan. Move the gscan stuff into a new
function and make that return immediately if gscan is NULL or so?

johannes


Re: [RFC V3 03/11] nl80211: add support for gscan

2016-12-13 Thread Johannes Berg
On Mon, 2016-12-12 at 11:59 +, Arend van Spriel wrote:
> This patch adds support for GScan which is a scan offload feature
> used in Android.

Found a few places with spaces instead of tabs as indentation, and
spurious braces around single-statement things, but other than that it
looks fine from a patch/nl80211 POV.

Haven't really looked into the details of gscan itself now though,
sorry.

There's a bit of a weird hard-coded restriction to 16 channels too,
that's due to the bucket map?

johannes


Re: [RFC V3 04/11] nl80211: add driver api for gscan notifications

2016-12-13 Thread Johannes Berg
On Mon, 2016-12-12 at 11:59 +, Arend van Spriel wrote:
> The driver can indicate gscan results are available or gscan
> operation has stopped.

This patch is renumbering the previous patches' nl80211 API, which is
best avoided, even if I do realize it doesn't matter now. :)

Even here it's not clear how things are reported though. Somehow I
thought that gscan was reporting only partial information through the
buckets, or is that not true?

johannes


Re: [PATCH 3/3][RFC] nl80211/mac80211: Accept multiple RSSI thresholds for CQM

2016-12-13 Thread Johannes Berg

> I wasn't clear: nl80211 sets the thresholds so that "high" is higher
> than last known value and "low" is lower than last known value, also
> the distance is at least 2 x hysteresis.  There's no purpose for
> reporting "middle" rssi events because we have to set a new range as
> soon as we receive a high or a low event.  I realize I need to
> document better.

But there can be a delay between reporting and reprogramming, and if
during that time a new event could be reported? I guess it doesn't
matter much if we assume that upon reprogramming the driver will always
report a new event if the current value falls outside the new range
(either high or low)... it just seemed a little bit more consistent to
unconditionally report a new event at the beginning, even if that new
event is "yup - falling into the middle of your range now".

johannes


Re: [PATCH 2/4] cfg80211: Add new NL80211_CMD_SET_BTCOEX_PRIORITY to support BTCOEX

2016-12-13 Thread Johannes Berg

> > >  /**
> > > + * wiphy_btcoex_support_flags
> > > + *   This enum has the driver supported frame types for
> > > BTCOEX.
> > > + * @WIPHY_WLAN_BE_PREFERRED - Supports Best Effort frame for
> > > BTCOEX
> > > + * @WIPHY_WLAN_BK_PREFERRED - supports Background frame for
> > > BTCOEX
> > > + * @WIPHY_WLAN_VI_PREFERRED - supports Video frame for BTCOEX
> > > + * @WIPHY_WLAN_VO_PREFERRED - supports Voice frame for BTCOEX
> > > + * @WIPHY_WLAN_BEACON_PREFERRED - supports Beacon frame for
> > > BTCOEX
> > > + * @WIPHY_WLAN_MGMT_PREFERRED - supports Management frames for
> > > BTCOEX.
> > > + */
> > 
> > That's not making much sense to me?
> > 
> 
> is it fine to have as WIPHY_BTCOEX_BE_PREFERRED ?

It's not really clear to me what you intend to do this - if it's really
support flags then you really should name those better.

> > > +/**
> > > + * enum wiphy_btcoex_priority - BTCOEX priority level
> > > + *   This enum defines priority level for BTCOEX
> > > + * WIPHY_WLAN_PREFERRED_LOW - low priority frames over BT
> > > traffic
> > > + * WIPHY_WLAN_PREFERRED_HIGH - high priority frames over BT
> > > traffic
> > > + */
> > > +
> > > +enum wiphy_btcoex_priority {
> > > + WIPHY_WLAN_PREFERRED_LOW = false,
> > > + WIPHY_WLAN_PREFERRED_HIGH = true,
> > > +};
> > 
> > That false/true seems just strange.
> > 
> 
> I will just use as a enum without assigning false/true.

What do you even need this enum for though?

> > > +enum nl80211_btcoex_priority {
> > > + __NL80211_WLAN_PREFERRED_INVALID,
> > > + NL80211_WLAN_BE_PREFERRED,
> > > + NL80211_WLAN_BK_PREFERRED,
> > > + NL80211_WLAN_VI_PREFERRED,
> > > + NL80211_WLAN_VO_PREFERRED,
> > > + NL80211_WLAN_BEACON_PREFERRED,
> > > + NL80211_WLAN_MGMT_PREFERRED,
> > > + __NL80211_WLAN_PREFERRED_LAST,
> > > + NL80211_WLAN_PREFERRED_MAX =
> > > + __NL80211_WLAN_PREFERRED_LAST - 1,
> > > +};
> > 
> > Wouldn't a bitmap be easier?
> > 
> since this is to distinguish between different btcoex priorities and
> we 
> are not going to do any manipulations on these parameters.
> It is just used as flag attribute.

But why the (parsing) complexity, when a single bitmap would do?

johannes


Re: [PATCH] RFC: Universal scan proposal

2016-12-13 Thread Johannes Berg

> Supporting requests (or more precisely requests and results)
> differentiated by user-space entity can be tricky. Right now we are
> not checking current caller pid, right? Maybe it is also good idea -
> or maybe we can just make result filtering per user-space caller?

Could be done.

You seem to be very worried about the partial results - I'm not too
worried about that I guess, the connection manager itself will always
be able to wait for the full scan to finish before making a decision,
but it may not even want to (see the separate discussion on per-channel 
"done" notifications etc.)

I'm much more worried about the "bucket reporting" since that doesn't
fit into the current full BSS reporting model at all. What's your
suggestion for this?

johannes


Re: [PATCH] RFC: Universal scan proposal

2016-12-13 Thread Johannes Berg
> > Well eventually we also have to clear for location if we run out of
> > memory, that usually means dumping them out to the host, no?
> 
> Being out of memory and consuming more memory are different
> things, but I agree - maybe we don't need to worry about it.

Well, reaching the limit of what we're willing to spend on it is
equivalent I guess :)

> > I'm not entirely sure about this case - surely noticing "we can do
> > better now" is still better than waiting for being able to make the
> > perfect decision?
> 
> Maybe we can just keep flag saying that currently available results
> were not received by usual full scan.

Elsewhere we were planning per-channel results, and a cookie to filter
them - perhaps we could have a similar thing where you may even have to
request these scan results specifically with a certain cookie you got
from the scanning, or so. Or indicate the cookie there so you can tie
it back to the scan request somehow?

> So, let's summarize:
> Instead of creating new type of generic scan with special types,
> we want to go with additional expansion of scheduled scan options and
> parameters (in order not to "multiply entities"), including ability
> to send new scheduled scan request without stopping previous one.
> 
> Is it Ok?

Sounds fine to me.

johannes


Re: [PATCH v2 2/2] cfg80211: Add support to sched scan to report better BSSs

2016-12-13 Thread Johannes Berg
Ok... this is getting complicated :)

Regarding reusing attributes, we have (for the BSS selection thing) the
attribute NL80211_BSS_SELECT_ATTR_RSSI_ADJUST, which is really quite
similar to your new NL80211_ATTR_SCHED_SCAN_RELATIVE_RSSI_5G_PREF since
while connected (which BSS_SELECT_ATTR_* assumes) the current BSS is
always part of the considered BSSes, I'd think.

However, I tend to think now that reusing the attribute is perhaps not
the right thing to do - but defining them with the same semantics would
still make sense.

Assuming that the value defined in NL80211_BSS_SELECT_ATTR_RSSI_ADJUST
applies also to the *current* BSS, it's actually quite pointless to
define there the band to adjust - if you want to adjust 2.4 GHz
positively you might as well adjust 5 GHz negatively, and vice versa,
and both ways are supported.

OTOH, the new NL80211_ATTR_SCHED_SCAN_RELATIVE_RSSI_5G_PREF doesn't
make this quite clear - is the current BSS to be adjusted before
comparing, if it's 5 GHz? If so, the semantics are equivalent. If not,
it doesn't actually make much sense ;-)
So assuming that it is in fact taken into account after the same
adjustment, the two attributes are equivalent, and then perhaps it
would make sense to use struct nl80211_bss_select_rssi_adjust for the
new attribute. If a driver doesn't support arbitrary bands, but just 5
GHz as in your example, it can just flip it around to 2.4 GHz by
switching the sign.

Perhaps we should even consider doing that in cfg80211 and adjusting
the internal API for both that way?

> I am not saying it should be avoided. Just looking at it conceptually
> the scheduled scan request holds so-called matchsets that specify the
> constraints to determine whether a BSS was found that is worth
> notifying the host/user-space about. As such I would expect the
> relative RSSI attribute(s) to be part of the matchset. That way you
> can specify it together with the currently connected SSID in a single
> matchset.

I think this makes a lot of sense.

We already have NL80211_SCHED_SCAN_MATCH_ATTR_RSSI, which asks to be
reporting only networks that have an *absolute* RSSI value above the
value of the attribute - a new attribute to make it relative to the
current network instead would make sense.

That would indeed be equivalent to NL80211_BSS_SELECT_ATTR_RSSI then.

Now, if we consider this, NL80211_ATTR_SCHED_SCAN_RELATIVE_RSSI
actually is equivalent to NL80211_BSS_SELECT_ATTR_RSSI (a flag
attribute indicating whether or not RSSI-based selection/matching is
done) and NL80211_ATTR_SCHED_SCAN_RELATIVE_RSSI_5G_PREF is equivalent
to NL80211_BSS_SELECT_ATTR_RSSI_ADJUST, both need to be given with the
flag and affect operation.

However, NL80211_BSS_SELECT_ATTR_BAND_PREF doesn't exist, and reusing
the BSS_SELECT namespace also doesn't make sense.


So, how about we move these into NL80211_SCHED_SCAN_MATCH_ATTR_* as
suggested by Arend, and define them with the same content as  the
corresponding NL80211_BSS_SELECT_ATTR_*?

If they're part of match attributes, we might even remove the feature
flag entirely - those were always defined to be optional, but it very
well be worthwhile for userspace to know if they're supported if it
wants to behave differently depending on whether they're supported or
not, I'll leave that up to you since presumably you know the userspace
implementation that you're planning to create.

johannes


Re: [PATCH v3] cfg80211: NL80211_ATTR_SOCKET_OWNER support for CMD_CONNECT

2016-12-13 Thread Johannes Berg
[snip]

Please fix coding style, particularly indentation.

> +static void cfg80211_disconnect_wk(struct work_struct *work)
> +{
> +   struct cfg80211_registered_device *rdev;
> +   struct wireless_dev *wdev;
> +
> +   wdev = container_of(work, struct wireless_dev, disconnect_wk);
> +   rdev = wiphy_to_rdev(wdev->wiphy);


Those should also be possible as initializers on the same line, I
guess?

It might also be worthwhile moving this function into a better file,
even if then it needs a prototype in core.h (it can't be inlined anyway
since it's called through a function pointer in the work struct)

> +   if (!wdev->netdev)
> +   return;

This obviously cannot happen.

All the code you added to nl80211.c is racy.

johannes


Re: [PATCH v2 1/2] mac80211: Remove invalid flag operations in mesh TSF synchronization

2016-12-13 Thread Johannes Berg
On Thu, 2016-12-08 at 10:15 +0900, Masashi Honma wrote:
> mesh_sync_offset_adjust_tbtt() implements Extensible synchronization
> framework ([1] 13.13.2 Extensible synchronization framework). It
> shall
> not operate the flag "TBTT Adjusting subfield" ([1] 8.4.2.100.8 Mesh
> Capability), since it is used only for MBCA ([1] 13.13.4 Mesh beacon
> collision avoidance, see 13.13.4.4.3 TBTT scanning and adjustment
> procedures for detail). So this patch remove the flag operations.
> 

Both applied; I changed this patch to remove ifmsh->adjusting_tbtt
completely since it was now unused.

johannes


Re: [PATCH v2 2/2] net: rfkill: Add rfkill-any LED trigger

2016-12-13 Thread Johannes Berg
On Thu, 2016-12-08 at 08:30 +0100, Michał Kępień wrote:
> Add a new "global" (i.e. not per-rfkill device) LED trigger, rfkill-
> any,
> which may be useful on laptops with a single "radio LED" and multiple
> radio transmitters.  The trigger is meant to turn a LED on whenever
> there is at least one radio transmitter active and turn it off
> otherwise.
> 

Also applied, but I moved the discussion of the mutex into the recorded
commit log.

johannes


Re: [PATCH v2 1/2] net: rfkill: Cleanup error handling in rfkill_init()

2016-12-13 Thread Johannes Berg
On Thu, 2016-12-08 at 08:30 +0100, Michał Kępień wrote:
> Use a separate label per error condition in rfkill_init() to make it
> a bit cleaner and easier to extend.

applied.

johannes


Re: [PATCH v2] mac80211: Ensure enough headroom when forwarding mesh pkt

2016-12-13 Thread Johannes Berg
On Wed, 2016-12-07 at 09:59 +, Cedric Izoard wrote:
> When a buffer is duplicated during MESH packet forwarding,
> this patch ensures that the new buffer has enough headroom.

Applied.

johannes


Re: [PATCH] ath9k: unlock rcu read when returning early

2016-12-13 Thread Felix Fietkau
On 2016-12-13 14:41, Tobias Klausmann wrote:
> On 13.12.2016 11:41, Felix Fietkau wrote:
>> On 2016-12-12 19:50, Tobias Klausmann wrote:
>>> diff --git a/drivers/net/wireless/ath/ath9k/xmit.c 
>>> b/drivers/net/wireless/ath/ath9k/xmit.c
>>> index 52bfbb988611..857d5ae09a1d 100644
>>> --- a/drivers/net/wireless/ath/ath9k/xmit.c
>>> +++ b/drivers/net/wireless/ath/ath9k/xmit.c
>>> @@ -2787,6 +2787,7 @@ void ath_tx_edma_tasklet(struct ath_softc *sc)
>>> fifo_list = >txq_fifo[txq->txq_tailidx];
>>> if (list_empty(fifo_list)) {
>>> ath_txq_unlock(sc, txq);
>>> +   rcu_read_unlock();
>> Technically this is fine as well, but I'd prefer a fix where you replace
>> the 'return' with 'break', thus avoiding the duplication of
>> rcu_read_unlock()
> 
> Actually if you want to avoid it, maybe skipping over the rest is better 
> (as originally intended):
> 
> ...
> 
> ath_txq_unlock(sc, txq);
> 
> 
> goto unlock;
> }
> ...
> 
> unlock:
> rcu_read_unlock();
There are already other places that skip to the rcu_read_unlock() part
by using 'break'. I don't see how adding an unnecessary goto makes
things any better.

- Felix


Re: [RFC v2 05/11] ath10k: htc: refactorization

2016-12-13 Thread Michal Kazior
On 13 December 2016 at 14:44, Valo, Kalle  wrote:
> Erik Stromdahl  writes:
>
>> Code refactorization:
>>
>> Moved the code for ep 0 in ath10k_htc_rx_completion_handler
>> to ath10k_htc_control_rx_complete.
>>
>> This eases the implementation of SDIO/mbox significantly since
>> the ep_rx_complete cb is invoked directly from the SDIO/mbox
>> hif layer.
>>
>> Since the ath10k_htc_control_rx_complete already is present
>> (only containing a warning message) there is no reason for not
>> using it (instead of having a special case for ep 0 in
>> ath10k_htc_rx_completion_handler).
>>
>> Signed-off-by: Erik Stromdahl 
>
> I tested this on QCA988X PCI board just to see if there are any
> regressions. It crashes immediately during module load, every time, and
> bisected that the crashing starts on this patch:
>
> [ 1239.715325] ath10k_pci :02:00.0: pci irq msi oper_irq_mode 2 irq_mode 
> 0 reset_mode 0
> [ 1239.885125] ath10k_pci :02:00.0: Direct firmware load for 
> ath10k/pre-cal-pci-:02:00.0.bin failed with error -2
> [ 1239.885260] ath10k_pci :02:00.0: Direct firmware load for 
> ath10k/cal-pci-:02:00.0.bin failed with error -2
> [ 1239.885687] ath10k_pci :02:00.0: qca988x hw2.0 target 0x4100016c 
> chip_id 0x043202ff sub :
> [ 1239.885699] ath10k_pci :02:00.0: kconfig debug 1 debugfs 1 tracing 1 
> dfs 1 testmode 1
> [ 1239.885899] ath10k_pci :02:00.0: firmware ver 10.2.4.70.59-2 api 5 
> features no-p2p,raw-mode,mfp,allows-mesh-bcast crc32 4159f498
> [ 1239.941836] ath10k_pci :02:00.0: Direct firmware load for 
> ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
> [ 1239.941993] ath10k_pci :02:00.0: board_file api 1 bmi_id N/A crc32 
> bebc7c08
> [ 1241.136693] BUG: unable to handle kernel NULL pointer dereference at   
> (null)
> [ 1241.136738] IP: [<  (null)>]   (null)
> [ 1241.136759] *pdpt =  *pde = f0002a55f0002a55 [ 1241.136781]
> [ 1241.136793] Oops: 0010 [#1] SMP
>
> What's odd is that when I added some printks on my own and enabled both
> boot and htc debug levels it doesn't crash anymore. After everything
> works normally after that, I can start AP mode and connect to it. Is it
> a race somewhere?

Yes. htc_wait_target() is called after hif_start(). The ep_rx_complete
is set in htc_wait_target() [changed patch 4, but still too late].

ep_rx_complete must be set prior to calling hif_start(). You probably
crash on end of ath10k_htc_rx_completion_handler() when trying to call
ep->ep_ops.ep_rx_complete(ar, skb).


Michał


Re: [RFC v2 05/11] ath10k: htc: refactorization

2016-12-13 Thread Valo, Kalle
Erik Stromdahl  writes:

> Code refactorization:
>
> Moved the code for ep 0 in ath10k_htc_rx_completion_handler
> to ath10k_htc_control_rx_complete.
>
> This eases the implementation of SDIO/mbox significantly since
> the ep_rx_complete cb is invoked directly from the SDIO/mbox
> hif layer.
>
> Since the ath10k_htc_control_rx_complete already is present
> (only containing a warning message) there is no reason for not
> using it (instead of having a special case for ep 0 in
> ath10k_htc_rx_completion_handler).
>
> Signed-off-by: Erik Stromdahl 

I tested this on QCA988X PCI board just to see if there are any
regressions. It crashes immediately during module load, every time, and
bisected that the crashing starts on this patch:

[ 1239.715325] ath10k_pci :02:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 
reset_mode 0
[ 1239.885125] ath10k_pci :02:00.0: Direct firmware load for 
ath10k/pre-cal-pci-:02:00.0.bin failed with error -2
[ 1239.885260] ath10k_pci :02:00.0: Direct firmware load for 
ath10k/cal-pci-:02:00.0.bin failed with error -2
[ 1239.885687] ath10k_pci :02:00.0: qca988x hw2.0 target 0x4100016c chip_id 
0x043202ff sub :
[ 1239.885699] ath10k_pci :02:00.0: kconfig debug 1 debugfs 1 tracing 1 dfs 
1 testmode 1
[ 1239.885899] ath10k_pci :02:00.0: firmware ver 10.2.4.70.59-2 api 5 
features no-p2p,raw-mode,mfp,allows-mesh-bcast crc32 4159f498
[ 1239.941836] ath10k_pci :02:00.0: Direct firmware load for 
ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
[ 1239.941993] ath10k_pci :02:00.0: board_file api 1 bmi_id N/A crc32 
bebc7c08
[ 1241.136693] BUG: unable to handle kernel NULL pointer dereference at   (null)
[ 1241.136738] IP: [<  (null)>]   (null)
[ 1241.136759] *pdpt =  *pde = f0002a55f0002a55 [ 1241.136781] 
[ 1241.136793] Oops: 0010 [#1] SMP

What's odd is that when I added some printks on my own and enabled both
boot and htc debug levels it doesn't crash anymore. After everything
works normally after that, I can start AP mode and connect to it. Is it
a race somewhere?

-- 
Kalle Valo

Re: [PATCH] ath9k: unlock rcu read when returning early

2016-12-13 Thread Tobias Klausmann



On 13.12.2016 11:41, Felix Fietkau wrote:

On 2016-12-12 19:50, Tobias Klausmann wrote:

Starting with ath9k: use ieee80211_tx_status_noskb where possible
[d94a461d7a7df68991fb9663531173f60ef89c68] the driver uses rcu_read_lock() &&
rcu_read_unlock() yet on returning early in ath_tx_edma_tasklet() the unlock is
missing leading to stalls and suspicious RCU usage:

  ===
  [ INFO: suspicious RCU usage. ]
  4.9.0-rc8 #11 Not tainted
  ---
  kernel/rcu/tree.c:705 Illegal idle entry in RCU read-side critical section.!

  other info that might help us debug this:

  RCU used illegally from idle CPU!
  rcu_scheduler_active = 1, debug_locks = 0
  RCU used illegally from extended quiescent state!
  1 lock held by swapper/7/0:
  #0:
   (
  rcu_read_lock
  ){..}
  , at:
  [] ath_tx_edma_tasklet+0x0/0x450 [ath9k]

  stack backtrace:
  CPU: 7 PID: 0 Comm: swapper/7 Not tainted 4.9.0-rc8 #11
  Hardware name: Acer Aspire V3-571G/VA50_HC_CR, BIOS V2.21 12/16/2013
   88025efc3f38 8132b1e5 88017ede4540 0001
   88025efc3f68 810a25f7 88025efcee60 88017edebdd8
   88025eeb5400 0091 88025efc3f88 810c3cd4
  Call Trace:
   
   [] dump_stack+0x68/0x93
   [] lockdep_rcu_suspicious+0xd7/0x110
   [] rcu_eqs_enter_common.constprop.85+0x154/0x200
   [] rcu_irq_exit+0x44/0xa0
   [] irq_exit+0x61/0xd0
   [] do_IRQ+0x65/0x110
   [] common_interrupt+0x89/0x89
   
   [] ? cpuidle_enter_state+0x151/0x200
   [] cpuidle_enter+0x12/0x20
   [] call_cpuidle+0x1e/0x40
   [] cpu_startup_entry+0x146/0x220
   [] start_secondary+0x148/0x170

Signed-off-by: Tobias Klausmann 
---
  drivers/net/wireless/ath/ath9k/xmit.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/net/wireless/ath/ath9k/xmit.c 
b/drivers/net/wireless/ath/ath9k/xmit.c
index 52bfbb988611..857d5ae09a1d 100644
--- a/drivers/net/wireless/ath/ath9k/xmit.c
+++ b/drivers/net/wireless/ath/ath9k/xmit.c
@@ -2787,6 +2787,7 @@ void ath_tx_edma_tasklet(struct ath_softc *sc)
fifo_list = >txq_fifo[txq->txq_tailidx];
if (list_empty(fifo_list)) {
ath_txq_unlock(sc, txq);
+   rcu_read_unlock();

Technically this is fine as well, but I'd prefer a fix where you replace
the 'return' with 'break', thus avoiding the duplication of
rcu_read_unlock()


Actually if you want to avoid it, maybe skipping over the rest is better 
(as originally intended):


...

ath_txq_unlock(sc, txq);


goto unlock;
}
...

unlock:
rcu_read_unlock();

Thanks,
Tobias


Thanks,

- Felix





Re: [RFC v2 11/11] ath10k: Added sdio support

2016-12-13 Thread Valo, Kalle
Erik Stromdahl  writes:

> Initial HIF sdio/mailbox implementation.
>
> Signed-off-by: Erik Stromdahl 

While testing this I noticed few new warnings:

drivers/net/wireless/ath/ath10k/sdio.c: In function ath10k_sdio_probe:
drivers/net/wireless/ath/ath10k/sdio.c:1723:6: warning: 'ret' may be used 
uninitialized in this function [-Wuninitialized]
drivers/net/wireless/ath/ath10k/sdio.c:375:5: warning: symbol 
'ath10k_sdio_mbox_rxmsg_pending_handler' was not declared. Should it be static?
drivers/net/wireless/ath/ath10k/sdio.c:1018:5: warning: symbol 
'ath10k_sdio_hif_tx_sg' was not declared. Should it be static?
drivers/net/wireless/ath/ath10k/sdio.c:1415:5: warning: symbol 
'ath10k_sdio_hif_exchange_bmi_msg' was not declared. Should it be static?
drivers/net/wireless/ath/ath10k/sdio.c:1555:5: warning: symbol 
'ath10k_sdio_hif_map_service_to_pipe' was not declared. Should it be static?
drivers/net/wireless/ath/ath10k/sdio.c:1635:6: warning: symbol 
'ath10k_sdio_hif_get_default_pipe' was not declared. Should it be static?
drivers/net/wireless/ath/ath10k/htc.c:265: line over 90 characters
drivers/net/wireless/ath/ath10k/htc.c:355: line over 90 characters

-- 
Kalle Valo

Re: [PATCH 1/2] ath10k: add accounting for the extended peer statistics

2016-12-13 Thread Christian Lamparter
Hello,

It looks like google put your mail into the spam-can. 
I'm sorry for not answering sooner.

On Wednesday, December 7, 2016 11:58:24 AM CET Mohammed Shafi Shajakhan wrote:
> On Mon, Dec 05, 2016 at 10:52:45PM +0100, Christian Lamparter wrote:
> > The 10.4 firmware adds extended peer information to the
> > firmware's statistics payload. This additional info is
> > stored as a separate data field and the elements are
> > stored in their own "peers_extd" list.
> > 
> > These elements can pile up in the same way as the peer
> > information elements. This is because the
> > ath10k_wmi_10_4_op_pull_fw_stats() function tries to
> > pull the same amount (num_peer_stats) for every statistic
> > data unit.
> > 
> > Fixes: 4a49ae94a448faa ("ath10k: fix 10.4 extended peer stats update")
> > Signed-off-by: Christian Lamparter 
> > ---
> >  drivers/net/wireless/ath/ath10k/debug.c | 7 +--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/net/wireless/ath/ath10k/debug.c 
> > b/drivers/net/wireless/ath/ath10k/debug.c
> > index 82a4c67f3672..4acd9eb65910 100644
> > --- a/drivers/net/wireless/ath/ath10k/debug.c
> > +++ b/drivers/net/wireless/ath/ath10k/debug.c
> > @@ -399,6 +399,7 @@ void ath10k_debug_fw_stats_process(struct ath10k *ar, 
> > struct sk_buff *skb)
> >  * prevent firmware from DoS-ing the host.
> >  */
> > ath10k_fw_stats_peers_free(>debug.fw_stats.peers);
> > +   
> > ath10k_fw_extd_stats_peers_free(>debug.fw_stats.peers_extd);
> 
> [shafi] thanks for fixing this !
> 
> > ath10k_warn(ar, "dropping fw peer stats\n");
> > goto free;
> > }
> > @@ -409,10 +410,12 @@ void ath10k_debug_fw_stats_process(struct ath10k *ar, 
> > struct sk_buff *skb)
> > goto free;
> > }
> >  
> > +   if (!list_empty())
> 
> [shafi] sorry please correct me if i am wrong, for 'extended peer stats' we 
> are checking
> for normal 'peer stats' ? Is this the fix intended, i had started a build to
> check your change and we will keep you posted, does this fix displaying
> 'rx_duration' in ath10k fw_stats.
The idea is not to queue any "extended peer stats" when there where no "peer 
stats" to
begin with. Because otherwise, the function is still vulnerable to OOM since 
the 
extended peers stats will be queued unchecked (not that this is currently a 
problem).
 
> > +   list_splice_tail_init(_extd,
> > + >debug.fw_stats.peers_extd);
> > +
> > list_splice_tail_init(, >debug.fw_stats.peers);
> > list_splice_tail_init(, >debug.fw_stats.vdevs);
> > -   list_splice_tail_init(_extd,
> > - >debug.fw_stats.peers_extd);
> > }
> >  
> > complete(>debug.fw_stats_complete);

Regards,
Christian




Re: [PATCH v2 0/7] ath9k: EEPROM swapping improvements

2016-12-13 Thread Valo, Kalle
Martin Blumenstingl  writes:

> Hello Kalle,
>
> On Fri, Nov 25, 2016 at 4:06 PM, Valo, Kalle  wrote:
>> Kalle Valo  writes:
>>
>>> Martin Blumenstingl  writes:
>>>
 There are two types of swapping the EEPROM data in the ath9k driver.
 Before this series one type of swapping could not be used without the
 other.

 The first type of swapping looks at the "magic bytes" at the start of
 the EEPROM data and performs swab16 on the EEPROM contents if needed.
 The second type of swapping is EEPROM format specific and swaps
 specific fields within the EEPROM itself (swab16, swab32 - depends on
 the EEPROM format).

 With this series the second part now looks at the EEPMISC register
 inside the EEPROM, which uses a bit to indicate if the EEPROM data
 is Big Endian (this is also done by the FreeBSD kernel).
 This has a nice advantage: currently there are some out-of-tree hacks
 (in OpenWrt and LEDE) where the EEPROM has a Big Endian header on a
 Big Endian system (= no swab16 is performed) but the EEPROM itself
 indicates that it's data is Little Endian. Until now the out-of-tree
 code simply did a swab16 before passing the data to ath9k, so ath9k
 first did the swab16 - this also enabled the format specific swapping.
 These out-of-tree hacks are still working with the new logic, but it
 is recommended to remove them. This implementation is based on a
 discussion with Arnd Bergmann who raised concerns about the
 robustness and portability of the swapping logic in the original OF
 support patch review, see [0].

 After a second round of patches (= v1 of this series) neither Arnd
 Bergmann nor I were really happy with the complexity of the EEPROM
 swapping logic. Based on a discussion (see [1] and [2]) we decided
 that ath9k should use a defined format (specifying the endianness
 of the data - I went with __le16 and __le32) when accessing the
 EEPROM fields. A benefit of this is that we enable the EEPMISC based
 swapping logic by default, just like the FreeBSD driver, see [3]. On
 the devices which I have tested (see below) ath9k now works without
 having to specify the "endian_check" field in ath9k_platform_data (or
 a similar logic which could provide this via devicetree) as ath9k now
 detects the endianness automatically. Only EEPROMs which are mangled
 by some out-of-tree code still need the endian_check flag (or one can
 simply remove that mangling from the out-of-tree code).

 Testing:
 - tested by myself on AR9287 with Big Endian EEPROM
 - tested by myself on AR9227 with Little Endian EEPROM
 - tested by myself on AR9381 (using the ar9003_eeprom implementation,
   which did not suffer from this whole problem)
 - how do we proceed with testing? maybe we could keep this in a
   feature-branch and add these patches to LEDE once we have an ACK to
   get more people to test this

 This series depends on my other series (v7):
 "add devicetree support to ath9k" - see [4]
>>>
>>> I think this looks pretty good. If there's a bug somewhere it should be
>>> quite easy to fix so I'm not that worried and would be willing to take
>>> these as soon as I have applied the dependency series. IIRC your
>>> devicetree patches will have at least one more review round so that will
>>> take some time still. In the meantime it would be great if LEDE folks
>>> could take a look at these and comment (or test).
>>
>> So are everyone happy with this? I haven't seen any comments. If I don't
>> here anything I'm planning to take these, most likely for 4.11.
>
> the patches have been in LEDE for almost two weeks now and I did not
> see any reports of ath9k breakage (footnote below).
>
> It seems that there are a few devices out there where the whole EEPROM
> is swab16'ed which switches the position of the 1-byte fields
> opCapFlags and eepMisc.
> those still work fine with the new code, however I had a second patch
> in LEDE [0] which results in ath9k_platform_data.endian_check NOT
> being set anymore.
> that endian_check flag was used before to swab16 the whole EEPROM, to
> correct the position of the 1-byte fields again.
> Currently we are fixing this in the firmware hotplug script: [1]
> This is definitely not a blocker for this series though (if we want to
> have a devicetree replacement for "ath9k_platform_data.endian_check"
> then I'd work on that within a separate series, but I somewhat
> consider these EEPROMs as "broken" so fixing them in
> userspace/firmware hotplug script is fine for me)

Sounds good to me, thanks for the thorough followup. I'm planning to
apply these any day.

-- 
Kalle Valo

Re: [PATCH] orinoco: Use shash instead of ahash for MIC calculations

2016-12-13 Thread Kalle Valo
Andy Lutomirski  writes:

> Eric Biggers pointed out that the orinoco driver pointed scatterlists
> at the stack.
>
> Fix it by switching from ahash to shash.  The result should be
> simpler, faster, and more correct.
>
> Cc: sta...@vger.kernel.org # 4.9 only
> Reported-by: Eric Biggers 
> Signed-off-by: Andy Lutomirski 

"more correct"? Does this fix a real user visible bug or what? And why
just stable 4.9, does this maybe have something to do with
CONFIG_VMAP_STACK?

I'm just wondering should I push this to 4.10 or -next. This is a driver
for ancient hardware so I'm starting to lean for -next.

-- 
Kalle Valo


Re: [PATCH] ath9k: unlock rcu read when returning early

2016-12-13 Thread Felix Fietkau
On 2016-12-12 19:50, Tobias Klausmann wrote:
> Starting with ath9k: use ieee80211_tx_status_noskb where possible
> [d94a461d7a7df68991fb9663531173f60ef89c68] the driver uses rcu_read_lock() &&
> rcu_read_unlock() yet on returning early in ath_tx_edma_tasklet() the unlock 
> is
> missing leading to stalls and suspicious RCU usage:
> 
>  ===
>  [ INFO: suspicious RCU usage. ]
>  4.9.0-rc8 #11 Not tainted
>  ---
>  kernel/rcu/tree.c:705 Illegal idle entry in RCU read-side critical section.!
> 
>  other info that might help us debug this:
> 
>  RCU used illegally from idle CPU!
>  rcu_scheduler_active = 1, debug_locks = 0
>  RCU used illegally from extended quiescent state!
>  1 lock held by swapper/7/0:
>  #0:
>   (
>  rcu_read_lock
>  ){..}
>  , at:
>  [] ath_tx_edma_tasklet+0x0/0x450 [ath9k]
> 
>  stack backtrace:
>  CPU: 7 PID: 0 Comm: swapper/7 Not tainted 4.9.0-rc8 #11
>  Hardware name: Acer Aspire V3-571G/VA50_HC_CR, BIOS V2.21 12/16/2013
>   88025efc3f38 8132b1e5 88017ede4540 0001
>   88025efc3f68 810a25f7 88025efcee60 88017edebdd8
>   88025eeb5400 0091 88025efc3f88 810c3cd4
>  Call Trace:
>   
>   [] dump_stack+0x68/0x93
>   [] lockdep_rcu_suspicious+0xd7/0x110
>   [] rcu_eqs_enter_common.constprop.85+0x154/0x200
>   [] rcu_irq_exit+0x44/0xa0
>   [] irq_exit+0x61/0xd0
>   [] do_IRQ+0x65/0x110
>   [] common_interrupt+0x89/0x89
>   
>   [] ? cpuidle_enter_state+0x151/0x200
>   [] cpuidle_enter+0x12/0x20
>   [] call_cpuidle+0x1e/0x40
>   [] cpu_startup_entry+0x146/0x220
>   [] start_secondary+0x148/0x170
> 
> Signed-off-by: Tobias Klausmann 
> ---
>  drivers/net/wireless/ath/ath9k/xmit.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/wireless/ath/ath9k/xmit.c 
> b/drivers/net/wireless/ath/ath9k/xmit.c
> index 52bfbb988611..857d5ae09a1d 100644
> --- a/drivers/net/wireless/ath/ath9k/xmit.c
> +++ b/drivers/net/wireless/ath/ath9k/xmit.c
> @@ -2787,6 +2787,7 @@ void ath_tx_edma_tasklet(struct ath_softc *sc)
>   fifo_list = >txq_fifo[txq->txq_tailidx];
>   if (list_empty(fifo_list)) {
>   ath_txq_unlock(sc, txq);
> + rcu_read_unlock();
Technically this is fine as well, but I'd prefer a fix where you replace
the 'return' with 'break', thus avoiding the duplication of
rcu_read_unlock()

Thanks,

- Felix



Re: [PATCH] ath9k: unlock rcu read when returning early

2016-12-13 Thread Kalle Valo
Tobias Klausmann  writes:

> Starting with ath9k: use ieee80211_tx_status_noskb where possible
> [d94a461d7a7df68991fb9663531173f60ef89c68]

The correct format to reference a commit in the commit log is:

Starting with commit d94a461d7a7d ("ath9k: use ieee80211_tx_status_noskb
where possible") the...

> the driver uses rcu_read_lock() && rcu_read_unlock() yet on returning
> early in ath_tx_edma_tasklet() the unlock is missing leading to stalls
> and suspicious RCU usage:
>
>  ===
>  [ INFO: suspicious RCU usage. ]
>  4.9.0-rc8 #11 Not tainted
>  ---
>  kernel/rcu/tree.c:705 Illegal idle entry in RCU read-side critical section.!
>
>  other info that might help us debug this:
>
>  RCU used illegally from idle CPU!
>  rcu_scheduler_active = 1, debug_locks = 0
>  RCU used illegally from extended quiescent state!
>  1 lock held by swapper/7/0:
>  #0:
>   (
>  rcu_read_lock
>  ){..}
>  , at:
>  [] ath_tx_edma_tasklet+0x0/0x450 [ath9k]
>
>  stack backtrace:
>  CPU: 7 PID: 0 Comm: swapper/7 Not tainted 4.9.0-rc8 #11
>  Hardware name: Acer Aspire V3-571G/VA50_HC_CR, BIOS V2.21 12/16/2013
>   88025efc3f38 8132b1e5 88017ede4540 0001
>   88025efc3f68 810a25f7 88025efcee60 88017edebdd8
>   88025eeb5400 0091 88025efc3f88 810c3cd4
>  Call Trace:
>   
>   [] dump_stack+0x68/0x93
>   [] lockdep_rcu_suspicious+0xd7/0x110
>   [] rcu_eqs_enter_common.constprop.85+0x154/0x200
>   [] rcu_irq_exit+0x44/0xa0
>   [] irq_exit+0x61/0xd0
>   [] do_IRQ+0x65/0x110
>   [] common_interrupt+0x89/0x89
>   
>   [] ? cpuidle_enter_state+0x151/0x200
>   [] cpuidle_enter+0x12/0x20
>   [] call_cpuidle+0x1e/0x40
>   [] cpu_startup_entry+0x146/0x220
>   [] start_secondary+0x148/0x170
>
> Signed-off-by: Tobias Klausmann 

A fixes line and cc stable would be good to have:

Fixes: d94a461d7a7d ("ath9k: use ieee80211_tx_status_noskb where possible")
Cc:  # v4.9

I can add those.

I'm planning to push this to 4.10 but would prefer to see an ack from
Felix (the author of d94a461d7a7d) first. I added him to Cc.

-- 
Kalle Valo


[PATCH] mac80211: don't call drv_set_default_unicast_key() for VLANs

2016-12-13 Thread Johannes Berg
From: Johannes Berg 

Since drivers know nothing about AP_VLAN interfaces, trying to
call drv_set_default_unicast_key() just results in a warning
and no call to the driver. Avoid the warning by not calling the
driver for this on AP_VLAN interfaces.

This means that drivers that somehow need this call for AP mode
will fail to work properly in the presence of VLAN interfaces,
but the current drivers don't seem to use it, and mac80211 will
select and indicate the key - so drivers should be OK now.

Reported-by: Jouni Malinen 
Signed-off-by: Johannes Berg 
---
 net/mac80211/key.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/mac80211/key.c b/net/mac80211/key.c
index edd6f2945f69..a98fc2b5e0dc 100644
--- a/net/mac80211/key.c
+++ b/net/mac80211/key.c
@@ -265,7 +265,8 @@ static void __ieee80211_set_default_key(struct 
ieee80211_sub_if_data *sdata,
if (uni) {
rcu_assign_pointer(sdata->default_unicast_key, key);
ieee80211_check_fast_xmit_iface(sdata);
-   drv_set_default_unicast_key(sdata->local, sdata, idx);
+   if (sdata->vif.type != NL80211_IFTYPE_AP_VLAN)
+   drv_set_default_unicast_key(sdata->local, sdata, idx);
}
 
if (multi)
-- 
2.9.3



[PATCH] rfkill: simplify rfkill_set_hw_state() slightly

2016-12-13 Thread Johannes Berg
From: Johannes Berg 

Simplify the two conditions gating the schedule_work() into
a single one and get rid of the additional exit point from
the function in doing so.

Signed-off-by: Johannes Berg 
---
 net/rfkill/core.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/net/rfkill/core.c b/net/rfkill/core.c
index 884027f62783..184bb711a06d 100644
--- a/net/rfkill/core.c
+++ b/net/rfkill/core.c
@@ -478,10 +478,7 @@ bool rfkill_set_hw_state(struct rfkill *rfkill, bool 
blocked)
 
rfkill_led_trigger_event(rfkill);
 
-   if (!rfkill->registered)
-   return ret;
-
-   if (prev != blocked)
+   if (rfkill->registered && prev != blocked)
schedule_work(>uevent_work);
 
return ret;
-- 
2.9.3