[PATCH v2] iwlwifi: mvm: allow monitor mode capture in STA mode

2017-09-14 Thread gavinli
From: Gavin Li 

Open up the filter if there is a monitor interface configured; this
allows all packets on the channel to be captured even if the device is
in STA mode and associated to a BSS.

Signed-off-by: Gavin Li 
---
 drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c |  3 +-
 drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c | 42 +++
 drivers/net/wireless/intel/iwlwifi/mvm/mvm.h  |  3 ++
 3 files changed, 40 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c 
b/drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c
index fd2fc46e2fe5..ea98fc040aa8 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c
@@ -752,7 +752,8 @@ static void iwl_mvm_mac_ctxt_cmd_common(struct iwl_mvm *mvm,
cpu_to_le32(vif->bss_conf.use_short_slot ?
MAC_FLG_SHORT_SLOT : 0);
 
-   cmd->filter_flags = cpu_to_le32(MAC_FILTER_ACCEPT_GRP);
+   cmd->filter_flags = cpu_to_le32(mvm->addl_filter_flags |
+   MAC_FILTER_ACCEPT_GRP);
 
for (i = 0; i < IEEE80211_NUM_ACS; i++) {
u8 txf = iwl_mvm_ac_to_tx_fifo[i];
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c 
b/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
index db5ad222e5ea..7c6d30ea3cf1 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
@@ -1641,6 +1641,13 @@ static u64 iwl_mvm_prepare_multicast(struct ieee80211_hw 
*hw,
return (u64)(unsigned long)cmd;
 }
 
+static void iwl_mvm_addl_ff_iface_iterator(
+   void *_mvm, u8 *mac, struct ieee80211_vif *vif)
+{
+   struct iwl_mvm *mvm = _mvm;
+   iwl_mvm_mac_ctxt_changed(mvm, vif, false, NULL);
+}
+
 static void iwl_mvm_configure_filter(struct ieee80211_hw *hw,
 unsigned int changed_flags,
 unsigned int *total_flags,
@@ -1648,20 +1655,41 @@ static void iwl_mvm_configure_filter(struct 
ieee80211_hw *hw,
 {
struct iwl_mvm *mvm = IWL_MAC80211_GET_MVM(hw);
struct iwl_mcast_filter_cmd *cmd = (void *)(unsigned long)multicast;
+   const u32 supported_filters =
+   FIF_BCN_PRBRESP_PROMISC |
+   FIF_CONTROL |
+   FIF_OTHER_BSS |
+   FIF_PROBE_REQ;
+   u32 addl_filter_flags = 0;
+
+   *total_flags &= supported_filters;
+   if (*total_flags & FIF_BCN_PRBRESP_PROMISC)
+   addl_filter_flags |= MAC_FILTER_IN_BEACON;
+   if (*total_flags & FIF_CONTROL)
+   addl_filter_flags |= MAC_FILTER_IN_CONTROL_AND_MGMT;
+   if (*total_flags & FIF_OTHER_BSS)
+   addl_filter_flags |= MAC_FILTER_IN_CONTROL_AND_MGMT |
+   MAC_FILTER_IN_PROMISC |
+   MAC_FILTER_IN_BEACON;
+   if (*total_flags & FIF_PROBE_REQ)
+   addl_filter_flags |= MAC_FILTER_IN_PROBE_REQUEST;
 
+   /* replace previous configuration */
mutex_lock(>mutex);
 
-   /* replace previous configuration */
+   if (mvm->addl_filter_flags != addl_filter_flags) {
+   mvm->addl_filter_flags = addl_filter_flags;
+   ieee80211_iterate_active_interfaces(
+   mvm->hw, IEEE80211_IFACE_ITER_NORMAL,
+   iwl_mvm_addl_ff_iface_iterator, mvm);
+   }
+
kfree(mvm->mcast_filter_cmd);
mvm->mcast_filter_cmd = cmd;
+   if (cmd)
+   iwl_mvm_recalc_multicast(mvm);
 
-   if (!cmd)
-   goto out;
-
-   iwl_mvm_recalc_multicast(mvm);
-out:
mutex_unlock(>mutex);
-   *total_flags = 0;
 }
 
 static void iwl_mvm_config_iface_filter(struct ieee80211_hw *hw,
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/mvm.h 
b/drivers/net/wireless/intel/iwlwifi/mvm/mvm.h
index 52f8d7a6a7dc..827c413157c7 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/mvm.h
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/mvm.h
@@ -832,6 +832,9 @@ struct iwl_mvm {
/* configured by mac80211 */
u32 rts_threshold;
 
+   /* packets to pass as requested by mac80211 */
+   u32 addl_filter_flags;
+
/* Scan status, cmd (pre-allocated) and auxiliary station */
unsigned int scan_status;
void *scan_cmd;
-- 
2.14.1



rtlwifi/rtl8188ee baseband config explanation request

2017-09-14 Thread Farhan Khan
Hi all,

I am trying to get a high-level understanding of what occurs in
rtl88e_phy_bb_config() in
drivers/net/wireless/realtek/rtlwifi/rtl8188ee/phy.c. I understand the
code up until it runs _rtl88e_phy_bb8188e_config_parafile() .

>From there, I see that phy_config_bb_with_headerfile() will write the
values in RTL8188EEPHY_REG_1TARRAY, runs phy_config_bb_with_pghdr(),
then write the values in RTL8188EEAGCTAB_1TARRAY.

Please provide a high-level explanation of
phy_config_bb_with_headerfile(), and more particularly, of
phy_config_bb_pghdr(). What are their objectives? Also, is there a
reason for this specific order or can it be in any otherwise?

Thanks!

--
Farhan Khan
PGP Fingerprint: B28D 2726 E2BC A97E 3854 5ABE 9A9F 00BC D525 16EE


Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Ben Greear

On 09/14/2017 03:42 PM, Denis Kenzior wrote:

Hi Ben,

On 09/14/2017 05:15 PM, Ben Greear wrote:

On 09/14/2017 02:35 PM, Denis Kenzior wrote:

Hi Ben,


I think it is sane to assume that the IP address _should_ be the same. The 
802.11 spec expects this even.  This is to handle bizarre networks that don't do
this
properly.


Can you point me to the section in the spec about this?



Lets see, 802.11-2012, Section 4.3.4.2:
"The key concept is that the ESS network appears the same to an LLC layer as an 
IBSS network. STAs within an ESS may communicate and mobile STAs may move from
one BSS to another (within the same ESS) transparently to LLC."

Section 4.5.3.2,


Thanks for the pointer.

It looks to me like having same SSID/auth is a requirement to be in the same 
ESS, but
just having the same ESSID/auth does not mean it is definitely in the same ESS.



Sure. But then your PSK has to match, or your credentials need to be accepted.  
I suppose you can have two separate ESSes operating with the same SSID and
security type.  But man, how likely is that?  That is fundamentally broken and 
we should not slow everything down just to take care of a broken case.


What about two APs with ssid LEDE and open-auth, each serving DHCP, each
connected to a switch that connects to a cable modem, etc.




etc.


If not, how is this different from just re-doing DHCP like normal?



You get to use your old IP address.  So e.g. your VoIP call doesn't disappear 
if you decide to switch access points.


And if so, you will in some cases be allowing duplicate IP addresses on
a network?



Life is never perfect ;)


If you are breaking networks while trying to optimize something, then I think 
you
are going about it wrong.

Seems like we would need some way for the DHCP server and/or AP to proactively
notify the station that they can skip DHCP, and default to not skipping.


Not unless you're planning to extend the spec?  802.11 doesn't even mention 
DHCP in any real manner.


So, it could be given out by the DHCP server then.  There are ways to add custom
options to it, right?

User-space can remember the option and use that to decide whether to re-do DHCP
when a station roams to another AP in (the probable) same ESS.  Since this is
a new option, you should not have backwards-compatibility issues.


And now you're breaking layering even more.  DHCP shouldn't care that a given 
client is connecting via ethernet or wifi.  And you're still relying on DHCP.

Think about it, with a normal roam, you're probably 40-70 ms latency. If you 
have to do 802.1x, that's probably 150+ms.  With a fast-transition you're down 
to
maybe 20 ms?  If you need to rely on DHCP, that's 1 second or more.  A user can 
detect latency of 100ms easily.  So if their VoIP call drops for for 1 second or
more, you have failed.


On initial connection to the network, the station does DHCP and gets a 
response, like normal.
But, that response has a special message in it that says "you don't need to re-do 
dhcp if you roam here".
User-space remembers this response for future roams...

After that, then you skip DHCP on roam to this SSID/auth network, so you have 
zero extra network
overhead when roaming.

You could also tell IPv6 to not drop/re-acquite its IP addresses on link-down, 
and then everything
works w/out having to hack on the mac80211 state, I think?



I vaguely recall that FT had some way to verify you were roaming to the same 
dhcp-domain
or not, but honestly, it has been a long time since I read through that...



Do you mean a mobility domain?  This has nothing to do with DHCP...


It seems to be a very convenient way to identify a group of APs that share
common infrastructure, more so that just having the same SSID/auth...

Do you think there would ever be a mobility domain that did not share
a common DHCP server pool?



I would expect fast transitions to work exactly like any other transition 
within an ESS.  Just faster.  And there's nothing stopping one from configuring 
2 FT
capable ESSes with the same MDID, leading to a yet another bizarre case.


Well, if we are being very paranoid, then you have to look for some specific 
and purposeful
identifier, such as the DHCP option I mentioned.  But, I know that at least 
some larger
entities are (or were a year or two ago) treating a mobility domain roam to 
mean that DHCP
is not needed on roam.

Thanks,
Ben


--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com



Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Denis Kenzior

Hi Ben,

On 09/14/2017 05:15 PM, Ben Greear wrote:

On 09/14/2017 02:35 PM, Denis Kenzior wrote:

Hi Ben,

I think it is sane to assume that the IP address _should_ be the 
same. The 802.11 spec expects this even.  This is to handle bizarre 
networks that don't do this

properly.


Can you point me to the section in the spec about this?



Lets see, 802.11-2012, Section 4.3.4.2:
"The key concept is that the ESS network appears the same to an LLC 
layer as an IBSS network. STAs within an ESS may communicate and 
mobile STAs may move from

one BSS to another (within the same ESS) transparently to LLC."

Section 4.5.3.2,


Thanks for the pointer.

It looks to me like having same SSID/auth is a requirement to be in the 
same ESS, but
just having the same ESSID/auth does not mean it is definitely in the 
same ESS.




Sure. But then your PSK has to match, or your credentials need to be 
accepted.  I suppose you can have two separate ESSes operating with the 
same SSID and security type.  But man, how likely is that?  That is 
fundamentally broken and we should not slow everything down just to take 
care of a broken case.



etc.


If not, how is this different from just re-doing DHCP like normal?



You get to use your old IP address.  So e.g. your VoIP call doesn't 
disappear if you decide to switch access points.


And if so, you will in some cases be allowing duplicate IP 
addresses on

a network?



Life is never perfect ;)


If you are breaking networks while trying to optimize something, then 
I think you

are going about it wrong.

Seems like we would need some way for the DHCP server and/or AP to 
proactively

notify the station that they can skip DHCP, and default to not skipping.


Not unless you're planning to extend the spec?  802.11 doesn't even 
mention DHCP in any real manner.


So, it could be given out by the DHCP server then.  There are ways to 
add custom

options to it, right?

User-space can remember the option and use that to decide whether to 
re-do DHCP
when a station roams to another AP in (the probable) same ESS.  Since 
this is

a new option, you should not have backwards-compatibility issues.


And now you're breaking layering even more.  DHCP shouldn't care that a 
given client is connecting via ethernet or wifi.  And you're still 
relying on DHCP.


Think about it, with a normal roam, you're probably 40-70 ms latency. 
If you have to do 802.1x, that's probably 150+ms.  With a 
fast-transition you're down to maybe 20 ms?  If you need to rely on 
DHCP, that's 1 second or more.  A user can detect latency of 100ms 
easily.  So if their VoIP call drops for for 1 second or more, you have 
failed.




I vaguely recall that FT had some way to verify you were roaming to 
the same dhcp-domain
or not, but honestly, it has been a long time since I read through 
that...




Do you mean a mobility domain?  This has nothing to do with DHCP...


It seems to be a very convenient way to identify a group of APs that share
common infrastructure, more so that just having the same SSID/auth...

Do you think there would ever be a mobility domain that did not share
a common DHCP server pool?



I would expect fast transitions to work exactly like any other 
transition within an ESS.  Just faster.  And there's nothing stopping 
one from configuring 2 FT capable ESSes with the same MDID, leading to a 
yet another bizarre case.


Regards,
-Denis


Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Ben Greear

On 09/14/2017 02:35 PM, Denis Kenzior wrote:

Hi Ben,


I think it is sane to assume that the IP address _should_ be the same. The 
802.11 spec expects this even.  This is to handle bizarre networks that don't 
do this
properly.


Can you point me to the section in the spec about this?



Lets see, 802.11-2012, Section 4.3.4.2:
"The key concept is that the ESS network appears the same to an LLC layer as an 
IBSS network. STAs within an ESS may communicate and mobile STAs may move from
one BSS to another (within the same ESS) transparently to LLC."

Section 4.5.3.2,


Thanks for the pointer.

It looks to me like having same SSID/auth is a requirement to be in the same 
ESS, but
just having the same ESSID/auth does not mean it is definitely in the same ESS.


etc.


If not, how is this different from just re-doing DHCP like normal?



You get to use your old IP address.  So e.g. your VoIP call doesn't disappear 
if you decide to switch access points.


And if so, you will in some cases be allowing duplicate IP addresses on
a network?



Life is never perfect ;)


If you are breaking networks while trying to optimize something, then I think 
you
are going about it wrong.

Seems like we would need some way for the DHCP server and/or AP to proactively
notify the station that they can skip DHCP, and default to not skipping.


Not unless you're planning to extend the spec?  802.11 doesn't even mention 
DHCP in any real manner.


So, it could be given out by the DHCP server then.  There are ways to add custom
options to it, right?

User-space can remember the option and use that to decide whether to re-do DHCP
when a station roams to another AP in (the probable) same ESS.  Since this is
a new option, you should not have backwards-compatibility issues.


I vaguely recall that FT had some way to verify you were roaming to the same 
dhcp-domain
or not, but honestly, it has been a long time since I read through that...



Do you mean a mobility domain?  This has nothing to do with DHCP...


It seems to be a very convenient way to identify a group of APs that share
common infrastructure, more so that just having the same SSID/auth...

Do you think there would ever be a mobility domain that did not share
a common DHCP server pool?

Thanks,
Ben


--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com



Re: [EXT] Re: [PATCH 2/2] mwifiex: print URB submit failure error after threshold attemtps

2017-09-14 Thread Brian Norris
Hi Ganapathi,

On Thu, Sep 14, 2017 at 02:14:24PM +, Ganapathi Bhat wrote:
> > On Thu, 2017-08-31 at 01:21 +0530, Ganapathi Bhat wrote:
> > > Current driver prints dev_alloc_skb failures everytime while
> > > submitting RX URBs. This failure might be frequent in some low
> > > resource platforms. So, wait for a threshold failure count before
> > > start priting the error. This change is a follow up for the 'commit
> > > 7b368e3d15c3
> > > ("mwifiex: resubmit failed to submit RX URBs in main thread")'
> > 
> > []
> > 
> > > diff --git a/drivers/net/wireless/marvell/mwifiex/usb.c
> > > b/drivers/net/wireless/marvell/mwifiex/usb.c
> > []
> > > @@ -300,9 +300,16 @@ static int mwifiex_usb_submit_rx_urb(struct
> > urb_context *ctx, int size)
> > >   if (card->rx_cmd_ep != ctx->ep) {
> > >   ctx->skb = dev_alloc_skb(size);
> > >   if (!ctx->skb) {
> > > - mwifiex_dbg(adapter, ERROR,
> > > - "%s: dev_alloc_skb failed\n", __func__);
> > > + if (++card->rx_urb_failure_count >
> > > + MWIFIEX_RX_URB_FAILURE_THRESHOLD) {
> > > + mwifiex_dbg(adapter, ERROR,
> > > + "%s: dev_alloc_skb failed, failure
> > count = %u\n",
> > > + __func__,
> > > + card->rx_urb_failure_count);
> > > + }
> > >   return -ENOMEM;
> > 
> > Why not use a ratelimit?
> Since this is for receive, the packets are from AP side and we cannot
> lower the rate from AP. On some low performance systems this change
> will be helpful.

I think Joe was referring to things like printk_ratelimited() or
dev_err_ratelimited(). Those automatically ratelimit prints for you,
using a static counter. You'd just need to make a small warpper for
mwifiex_dbg() using __ratelimit().

Those sort of rate limits are significantly different than yours though.
You were looking to avoid printing errors when there are only a few
failures in a row, whereas the existing rate-limiting infrastructure
looks to avoid printing errors if too many happen in a row. Those are
different goals.

Brian


Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Denis Kenzior

Hi Ben,

I think it is sane to assume that the IP address _should_ be the same. 
The 802.11 spec expects this even.  This is to handle bizarre networks 
that don't do this

properly.


Can you point me to the section in the spec about this?



Lets see, 802.11-2012, Section 4.3.4.2:
"The key concept is that the ESS network appears the same to an LLC 
layer as an IBSS network. STAs within an ESS may communicate and mobile 
STAs may move from one BSS to another (within the same ESS) 
transparently to LLC."


Section 4.5.3.2,

etc.


If not, how is this different from just re-doing DHCP like normal?



You get to use your old IP address.  So e.g. your VoIP call doesn't 
disappear if you decide to switch access points.



And if so, you will in some cases be allowing duplicate IP addresses on
a network?



Life is never perfect ;)


If you are breaking networks while trying to optimize something, then I 
think you

are going about it wrong.

Seems like we would need some way for the DHCP server and/or AP to 
proactively

notify the station that they can skip DHCP, and default to not skipping.


Not unless you're planning to extend the spec?  802.11 doesn't even 
mention DHCP in any real manner.




I vaguely recall that FT had some way to verify you were roaming to the 
same dhcp-domain

or not, but honestly, it has been a long time since I read through that...



Do you mean a mobility domain?  This has nothing to do with DHCP...

Regards,
-Denis


Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Ben Greear

On 09/14/2017 01:35 PM, Denis Kenzior wrote:

Hi Ben,

On 09/14/2017 03:29 PM, Ben Greear wrote:

On 09/14/2017 01:26 PM, Denis Kenzior wrote:

Hi Ben,


How do you re-confirm them?  There are definitely cases where SSID/Security is 
the same but each
AP has its own DHCP server and roaming between them will require getting a new 
DHCP address (on
the same (NAT'd) subnet and with same gateway, likely as not).



Using DHCPREQUEST to verify obtained parameters, or the DHCPv6 equivalent 
Confirm message.  This obviously requires some integration between the dhcp 
daemon and
the supplicant.


Do you want to allow the just-now-roamed station to use its old IP address(es) 
while you are
confirming?


I think it is sane to assume that the IP address _should_ be the same. The 
802.11 spec expects this even.  This is to handle bizarre networks that don't 
do this
properly.


Can you point me to the section in the spec about this?


If not, how is this different from just re-doing DHCP like normal?



You get to use your old IP address.  So e.g. your VoIP call doesn't disappear 
if you decide to switch access points.


And if so, you will in some cases be allowing duplicate IP addresses on
a network?



Life is never perfect ;)


If you are breaking networks while trying to optimize something, then I think 
you
are going about it wrong.

Seems like we would need some way for the DHCP server and/or AP to proactively
notify the station that they can skip DHCP, and default to not skipping.

I vaguely recall that FT had some way to verify you were roaming to the same 
dhcp-domain
or not, but honestly, it has been a long time since I read through that...

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com



Re: iwlwifi firmware load broken in current -git

2017-09-14 Thread Jens Axboe
On 09/14/2017 02:04 PM, Srinath Mannam wrote:
> Hi Jens Axboe,
> 
> 
> On Thu, Sep 14, 2017 at 11:14 PM, Jens Axboe  wrote:
>> On 09/14/2017 11:35 AM, Jens Axboe wrote:
>>> On 09/14/2017 11:28 AM, Srinath Mannam wrote:
 Hi Bjorn,

 On Thu, Sep 14, 2017 at 10:52 PM, Jens Axboe  wrote:
>
> On 09/14/2017 11:11 AM, Bjorn Helgaas wrote:
>> [+cc linux-pci]
>>
>> On Thu, Sep 14, 2017 at 12:00 PM, Jens Axboe  wrote:
>>> On 09/12/2017 02:04 PM, Johannes Berg wrote:
 On Tue, 2017-09-12 at 13:43 -0600, Jens Axboe wrote:

> CC'ing the guilty part and Bjorn. I'm assuming it's the
> pci_is_enabled() check, since the rest of the patch shouldn't have
> functional changes.

 and pci_enable_bridge() already checks if it's already enabled, but
 still enables mastering in that case if it isn't:

 static void pci_enable_bridge(struct pci_dev *dev)
 {
 [...]
 if (pci_is_enabled(dev)) {
 if (!dev->is_busmaster)
 pci_set_master(dev);
 return;
 }

 so I guess due to the new check we end up with mastering disabled, and
 thus the firmware can't load since that's a DMA thing?
>>>
>>> Bjorn/Srinath, any input here? This is a regression that prevents wifi
>>> from working on a pretty standard laptop. It'd suck to have this be in
>>> -rc1. Seems like the trivial fix would be:
>>>
>>>
>>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>>> index b0002daa50f3..ffbe11dbdd61 100644
>>> --- a/drivers/pci/pci.c
>>> +++ b/drivers/pci/pci.c
>>> @@ -1394,7 +1394,7 @@ static int pci_enable_device_flags(struct pci_dev 
>>> *dev, unsigned long flags)
>>> return 0;   /* already enabled */
>>>
>>> bridge = pci_upstream_bridge(dev);
>>> -   if (bridge && !pci_is_enabled(bridge))
>>> +   if (bridge)
 With this change and keeping "mutex_lock(_bridge_mutex);" in
 pci_enable_bridge functoin will causes a nexted lock.
>>>
>>> Took a look, and looks like you are right. That code looks like a mess,
>>> fwiw.
>>>
>>> I'd strongly suggest that the bad commit is reverted until a proper
>>> solution is found, since the simple one-liner could potentially
>>> introduce a deadlock with your patch applied.
>>
>> BTW, your patch looks pretty bad too, introducing a random mutex
>> deep on code that can be recursive. Why isn't this check in
>> pci_enable_device_flags() enough to prevent double-enable of
>> an upstream bridge?
>>
>> if (atomic_inc_return(>enable_cnt) > 1)
>> return 0;   /* already enabled */
>>
> This check only to verify device enable not for the bus master check.
> But device enable function calls the bridge enable if it has the bridge.
> Bridge enable function enables both device and bus master.
> 
> Here the issue might be because, bridge of endpoint has already set
> device enable without set bus master in some other context. which is
> wrong.
> because all bridges should enable with bridge_enable function only.
> So we see the problem In this context, because "if (bridge &&
> !pci_is_enabled(bridge))" check stops bridge enable which intern stops
> bus master.
> pci_enable_bridge function always makes sure that both device and bus
> master are enabled in any case. If pci_enable_bridge function is not
> called means, that bridge is already has device enable flag set. which
> is not from pci_enable_bridge function.

In any case, your patch introduces a regression on systems. Please get
it reverted now, and then you can come up with a new approach to fix the
double enable of the upstream bridge.

-- 
Jens Axboe



Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Denis Kenzior

Hi Ben,

On 09/14/2017 03:29 PM, Ben Greear wrote:

On 09/14/2017 01:26 PM, Denis Kenzior wrote:

Hi Ben,


How do you re-confirm them?  There are definitely cases where 
SSID/Security is the same but each
AP has its own DHCP server and roaming between them will require 
getting a new DHCP address (on

the same (NAT'd) subnet and with same gateway, likely as not).



Using DHCPREQUEST to verify obtained parameters, or the DHCPv6 
equivalent Confirm message.  This obviously requires some integration 
between the dhcp daemon and

the supplicant.


Do you want to allow the just-now-roamed station to use its old IP 
address(es) while you are

confirming?


I think it is sane to assume that the IP address _should_ be the same. 
The 802.11 spec expects this even.  This is to handle bizarre networks 
that don't do this properly.




If not, how is this different from just re-doing DHCP like normal?



You get to use your old IP address.  So e.g. your VoIP call doesn't 
disappear if you decide to switch access points.



And if so, you will in some cases be allowing duplicate IP addresses on
a network?



Life is never perfect ;)

Regards,
-Denis


Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Ben Greear

On 09/14/2017 01:26 PM, Denis Kenzior wrote:

Hi Ben,


How do you re-confirm them?  There are definitely cases where SSID/Security is 
the same but each
AP has its own DHCP server and roaming between them will require getting a new 
DHCP address (on
the same (NAT'd) subnet and with same gateway, likely as not).



Using DHCPREQUEST to verify obtained parameters, or the DHCPv6 equivalent 
Confirm message.  This obviously requires some integration between the dhcp 
daemon and
the supplicant.


Do you want to allow the just-now-roamed station to use its old IP address(es) 
while you are
confirming?

If not, how is this different from just re-doing DHCP like normal?

And if so, you will in some cases be allowing duplicate IP addresses on
a network?

Thanks,
Ben



Regards,
-Denis




--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com



Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Denis Kenzior

Hi Ben,


How do you re-confirm them?  There are definitely cases where 
SSID/Security is the same but each
AP has its own DHCP server and roaming between them will require getting 
a new DHCP address (on

the same (NAT'd) subnet and with same gateway, likely as not).



Using DHCPREQUEST to verify obtained parameters, or the DHCPv6 
equivalent Confirm message.  This obviously requires some integration 
between the dhcp daemon and the supplicant.


Regards,
-Denis


Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Ben Greear

On 09/14/2017 01:05 PM, Denis Kenzior wrote:

Hi Ben,


There are different meanings for 'roam'.  Are you just talking about
fast-transition roaming?



I'm using 'roam' as any transition between BSSes in the same SSID/Security 
type.  So regular re-association, pre-authenticated association, fast 
transition.


I would think that the decision to restart DHCP (at least ipv4) should
be in user-space.  I'm less sure about how IPv6 should deal with this.

I have tested roaming using FT and normal-ish wpa_supplicant without
doing DHCP, and it works fine.  Of course, it also works if you
choose to re-do DHCP.



Our thinking on this was to assume that DHCP parameters are still valid after a 
roam, but re-confirm them just in case.  If the confirmation fails, then we can
fall back to requesting a new DHCP lease.


How do you re-confirm them?  There are definitely cases where SSID/Security is 
the same but each
AP has its own DHCP server and roaming between them will require getting a new 
DHCP address (on
the same (NAT'd) subnet and with same gateway, likely as not).

Thanks,
Ben



Regards,
-Denis




--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com



Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Denis Kenzior

Hi Ben,


There are different meanings for 'roam'.  Are you just talking about
fast-transition roaming?



I'm using 'roam' as any transition between BSSes in the same 
SSID/Security type.  So regular re-association, pre-authenticated 
association, fast transition.



I would think that the decision to restart DHCP (at least ipv4) should
be in user-space.  I'm less sure about how IPv6 should deal with this.

I have tested roaming using FT and normal-ish wpa_supplicant without
doing DHCP, and it works fine.  Of course, it also works if you
choose to re-do DHCP.



Our thinking on this was to assume that DHCP parameters are still valid 
after a roam, but re-confirm them just in case.  If the confirmation 
fails, then we can fall back to requesting a new DHCP lease.


Regards,
-Denis


Re: iwlwifi firmware load broken in current -git

2017-09-14 Thread Srinath Mannam
Hi Jens Axboe,


On Thu, Sep 14, 2017 at 11:14 PM, Jens Axboe  wrote:
> On 09/14/2017 11:35 AM, Jens Axboe wrote:
>> On 09/14/2017 11:28 AM, Srinath Mannam wrote:
>>> Hi Bjorn,
>>>
>>> On Thu, Sep 14, 2017 at 10:52 PM, Jens Axboe  wrote:

 On 09/14/2017 11:11 AM, Bjorn Helgaas wrote:
> [+cc linux-pci]
>
> On Thu, Sep 14, 2017 at 12:00 PM, Jens Axboe  wrote:
>> On 09/12/2017 02:04 PM, Johannes Berg wrote:
>>> On Tue, 2017-09-12 at 13:43 -0600, Jens Axboe wrote:
>>>
 CC'ing the guilty part and Bjorn. I'm assuming it's the
 pci_is_enabled() check, since the rest of the patch shouldn't have
 functional changes.
>>>
>>> and pci_enable_bridge() already checks if it's already enabled, but
>>> still enables mastering in that case if it isn't:
>>>
>>> static void pci_enable_bridge(struct pci_dev *dev)
>>> {
>>> [...]
>>> if (pci_is_enabled(dev)) {
>>> if (!dev->is_busmaster)
>>> pci_set_master(dev);
>>> return;
>>> }
>>>
>>> so I guess due to the new check we end up with mastering disabled, and
>>> thus the firmware can't load since that's a DMA thing?
>>
>> Bjorn/Srinath, any input here? This is a regression that prevents wifi
>> from working on a pretty standard laptop. It'd suck to have this be in
>> -rc1. Seems like the trivial fix would be:
>>
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index b0002daa50f3..ffbe11dbdd61 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -1394,7 +1394,7 @@ static int pci_enable_device_flags(struct pci_dev 
>> *dev, unsigned long flags)
>> return 0;   /* already enabled */
>>
>> bridge = pci_upstream_bridge(dev);
>> -   if (bridge && !pci_is_enabled(bridge))
>> +   if (bridge)
>>> With this change and keeping "mutex_lock(_bridge_mutex);" in
>>> pci_enable_bridge functoin will causes a nexted lock.
>>
>> Took a look, and looks like you are right. That code looks like a mess,
>> fwiw.
>>
>> I'd strongly suggest that the bad commit is reverted until a proper
>> solution is found, since the simple one-liner could potentially
>> introduce a deadlock with your patch applied.
>
> BTW, your patch looks pretty bad too, introducing a random mutex
> deep on code that can be recursive. Why isn't this check in
> pci_enable_device_flags() enough to prevent double-enable of
> an upstream bridge?
>
> if (atomic_inc_return(>enable_cnt) > 1)
> return 0;   /* already enabled */
>
This check only to verify device enable not for the bus master check.
But device enable function calls the bridge enable if it has the bridge.
Bridge enable function enables both device and bus master.

Here the issue might be because, bridge of endpoint has already set
device enable without set bus master in some other context. which is
wrong.
because all bridges should enable with bridge_enable function only.
So we see the problem In this context, because "if (bridge &&
!pci_is_enabled(bridge))" check stops bridge enable which intern stops
bus master.
pci_enable_bridge function always makes sure that both device and bus
master are enabled in any case. If pci_enable_bridge function is not
called means, that bridge is already has device enable flag set. which
is not from pci_enable_bridge function.

Regards,
Srinath.

> --
> Jens Axboe
>


Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Denis Kenzior

Hi Johannes,


Well I guess any time that the carrier does in fact go down, you'd also
reset oper state...

And now that I think about it, that does in fact happen in
ieee80211_set_disassoc(), which we even go through for FT roaming.

Sounds like something we should fix?



If you want roaming to keep oper state UP in all cases, then yes.  Does 
this work on full mac cards as well?


E.g. if I CMD_CONNECT to AP1, then pre-authenticate to AP2 and issue a 
CMD_CONNECT to AP2?


Regards,
-Denis


iwlwifi: ToF usage

2017-09-14 Thread Joel B

Hi,

Starting to play around with the FTM/ToF support in iwlwifi, but 
documentation is (understandably) scarce at this point. Using a pair of 
8260:s I had lying around.


Can someone give me some hints on how to work the debugfs API for a 
successful measurement?


I've set up one card as AP with hostapd (with ftm_responder=1 and 
ftm_initiator=1 in hostapd.conf), the other as a STA.


From the STA, I've played with 'tof_range_request' in debugfs. Writing 
send_range_request=1, which it doesn't choke on or anything, but nothing 
seems to happen. Guess I need to set things up a bit first, but how?


If it's possible to get this to work at all at this stage, some hints on 
how to do it would be great.



Thanks,
Joel


Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Johannes Berg
On Thu, 2017-09-14 at 21:41 +0200, Johannes Berg wrote:
> 
> And now that I think about it, that does in fact happen in
> ieee80211_set_disassoc(), which we even go through for FT roaming.
> 
> Sounds like something we should fix?

Not simple though - we'd have to remember to do it later if the
auth/assoc fails or so ...

johannes


Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Johannes Berg
On Thu, 2017-09-14 at 14:37 -0500, Denis Kenzior wrote:

> > > My earlier point is that software roams need to have the exact
> > > same behavior as well.  And my understanding is that when we try
> > > to Fast-Transition (e.g. issue a CMD_ASSOCIATE), operstate is no
> > > longer UP.

> I'm talking about the kernel behavior, not wpa_s.

Oh.

> My guys discovered that the kernel seems to twiddle operstate
> automagically?  E.g. see description of 
> https://git.kernel.org/pub/scm/network/wireless/iwd.git/commit/?id=4d
> 20db05d7c8806749db574e9231bd5d1c476c7f

Well I guess any time that the carrier does in fact go down, you'd also
reset oper state...

And now that I think about it, that does in fact happen in
ieee80211_set_disassoc(), which we even go through for FT roaming.

Sounds like something we should fix?

johannes


Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Johannes Berg
On Thu, 2017-09-14 at 14:34 -0500, Denis Kenzior wrote:

> > I think you pretty much have to assume that, otherwise there's no
> > point
> > in roaming at all - you want your connections to stay, possibly
> > voice
> > calls to continue, etc.
> 
> I'm all for using this assumption.  I just wonder if real world 
> disagrees? :)

I think you can probably find networks where this doesn't work, but I
don't think we should cater for that. And corporate networks done with
Cisco, Aruba or similar should have it work correctly, and in fact I'd
assume that it's pretty much only broken if you take two consumer APs
and configure them to the same SSID or something ...

johannes



Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Ben Greear

On 09/14/2017 12:34 PM, Denis Kenzior wrote:

Hi Johannes,

On 09/14/2017 02:17 PM, Johannes Berg wrote:

On Thu, 2017-09-14 at 13:37 -0500, Denis Kenzior wrote:


The question is whether all APs are actually sane after a
roam.  E.g. can the STA assume that the same IP address, DHCP lease,
etc is still valid?  I heard from various people that this might not
be the case, but we haven't had a chance to verify those claims...


I think you pretty much have to assume that, otherwise there's no point
in roaming at all - you want your connections to stay, possibly voice
calls to continue, etc.


I'm all for using this assumption.  I just wonder if real world disagrees? :)


There are different meanings for 'roam'.  Are you just talking about
fast-transition roaming?

I would think that the decision to restart DHCP (at least ipv4) should
be in user-space.  I'm less sure about how IPv6 should deal with this.

I have tested roaming using FT and normal-ish wpa_supplicant without
doing DHCP, and it works fine.  Of course, it also works if you
choose to re-do DHCP.

Thanks,
Ben




However, I'm not really convinced (any more) that this is actually
correct. If I'm reading the supplicant code correctly, then it sets
IF_OPER_UP only once the connection is *completed*, so it's already
doing what I thought it should be doing and couldn't.



Yes.  For a new connection it does something like:

New Key
Set Key
New Key (group)
Set Station (Authorized) (which fails on some drivers)
Set OperState to UP

Regards,
-Denis




--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com



Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Denis Kenzior

Hi Johannes,


My earlier point is that software roams need to have the exact same
behavior as well.  And my understanding is that when we try to
Fast-Transition (e.g. issue a CMD_ASSOCIATE), operstate is no longer
UP.


I'm not sure - I don't know what the state machine in wpa_s goes
through here. Probably easier to test than try to reason about the
code...


I'm talking about the kernel behavior, not wpa_s.  My guys discovered 
that the kernel seems to twiddle operstate automagically?  E.g. see 
description of 
https://git.kernel.org/pub/scm/network/wireless/iwd.git/commit/?id=4d20db05d7c8806749db574e9231bd5d1c476c7f


Regards,
-Denis


Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Denis Kenzior

Hi Johannes,

On 09/14/2017 02:17 PM, Johannes Berg wrote:

On Thu, 2017-09-14 at 13:37 -0500, Denis Kenzior wrote:


The question is whether all APs are actually sane after a
roam.  E.g. can the STA assume that the same IP address, DHCP lease,
etc is still valid?  I heard from various people that this might not
be the case, but we haven't had a chance to verify those claims...


I think you pretty much have to assume that, otherwise there's no point
in roaming at all - you want your connections to stay, possibly voice
calls to continue, etc.


I'm all for using this assumption.  I just wonder if real world 
disagrees? :)



However, I'm not really convinced (any more) that this is actually
correct. If I'm reading the supplicant code correctly, then it sets
IF_OPER_UP only once the connection is *completed*, so it's already
doing what I thought it should be doing and couldn't.



Yes.  For a new connection it does something like:

New Key
Set Key
New Key (group)
Set Station (Authorized) (which fails on some drivers)
Set OperState to UP

Regards,
-Denis


Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Johannes Berg
Hi,

> Yep, but I seem to recall there was some vague language that said the
> AP would delete the PMKSA if the client disconnected.

Ok, not sure about that. But even if the AP does, we could try to send
it and it just can't use it :)

> operstates.txt states that for new connections, operstate should be 
> dormant until 802.1x is complete & successful.  So the !eapol-over-
> nl condition would violate that, no?

As I just wrote in my other email, I think I'm totally confused
regarding this, and the supplicant already does it correctly - and you
can ignore the whole "!eapol-over-nl" conditions, and just read it like
what I thought we could only do in the eapol-over-nl case.

No idea how I ended up with the idea that you could only send data
frames when the netdev was IF_OPER_UP - that doesn't seem to have any
basis in reality.

> > > >    - initialize 1X state machines/timeouts
> > > >  - 1X handshake
> > > >  - send PMK to device for 4-way-HS
> > > >  - AUTHORIZED event
> > > >    - [if eapol-over-nl: toggle oper state up]
> > > > 
> 
> Given your explanation above, should this be [if !eapol-over-nl ..?


> So I agree that OPERSTATE_UP should not change on a roam.  I think
> we're both in agreement here.

Great.

> My earlier point is that software roams need to have the exact same 
> behavior as well.  And my understanding is that when we try to 
> Fast-Transition (e.g. issue a CMD_ASSOCIATE), operstate is no longer
> UP.

I'm not sure - I don't know what the state machine in wpa_s goes
through here. Probably easier to test than try to reason about the
code...

> At the very least there's lots of confusion with what is supposed to 
> happen with operstate and when.  So if we can work out & document a 
> consistent behavior, I'm all for it.

:-)

johannes


Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Johannes Berg
On Thu, 2017-09-14 at 13:37 -0500, Denis Kenzior wrote:

> The question is whether all APs are actually sane after a
> roam.  E.g. can the STA assume that the same IP address, DHCP lease,
> etc is still valid?  I heard from various people that this might not
> be the case, but we haven't had a chance to verify those claims...

I think you pretty much have to assume that, otherwise there's no point
in roaming at all - you want your connections to stay, possibly voice
calls to continue, etc.

> I think it does make sense to tie one into the other.  However, do
> we have a race condition here?  E.g. AUTHORIZED is sent on one
> socket, then OPER_STATE is signaled on rtnl.  Which one do
> applications rely on?

Regular applications wouldn't really look at nl80211.

> > Note that we *can't* do this right now, otherwise we can't transfer
> > the EAPOL frames; but once we do that over nl80211 we'd be able to.

However, I'm not really convinced (any more) that this is actually
correct. If I'm reading the supplicant code correctly, then it sets
IF_OPER_UP only once the connection is *completed*, so it's already
doing what I thought it should be doing and couldn't.

> *wakes up*
> 
> Ah I now seem to remember that I volunteered to look into this before
> my sabbatical :)  I think this was in early June?  I'm certainly
> still interested in doing so.   Let me dust off that portion of my
> brain and come up with a proposal.  Unless you already have a clear
> idea of how things should work?

Not really. I guess a new command/event with the frame, and some flags,
I know that at least we want a "don't encrypt" flag, for example.

johannes


Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Denis Kenzior

Hi Johannes,

On 09/14/2017 01:36 PM, Johannes Berg wrote:

Hi Denis,

I'd actually only Cc'ed you to see if you were actually working on
EAPOL-over-nl80211 :-)


Curious, isn't sending a PMKID for a non-roaming case not strictly
per spec?  Is this just to avoid running 802.1x?


I don't think so, what makes you? In -2016 9.4.2.25.5 for example, the
spec says:

 The PMKID Count and List fields are used only in the RSNE in the
 (Re)Association Request frame to an AP and in FT authentication
 sequence frames.

Why should it only be valid in roaming? You might turn off and on wifi
and still have a PMKSA.


Yep, but I seem to recall there was some vague language that said the AP 
would delete the PMKSA if the client disconnected.  Sorry, getting back 
on track:





How does this mesh with operstates outlined in
Documentation/networking/operstates.txt?  Are you treating the
'roaming' case as 802.1x 'reauthentication'?


I think we have to, because otherwise DHCP/IPv6 could happen again
after roaming - we really can't set the oper state to dormant during
the roaming.



Okay, I can buy this interpretation...


Assuming this assessment is correct, this opens up another
possibility:
tracking the PORT_AUTHORIZED state with a separate event:

   * new connection case - no PMKSA (usage)
 - CONNECT event
   - [if !eapol-over-nl: toggle oper state up]


Doesn't this go against operstates.txt, Section 4?


Why?


operstates.txt states that for new connections, operstate should be 
dormant until 802.1x is complete & successful.  So the !eapol-over-nl 
condition would violate that, no?





Also, I must be dense, but why is the behavior different between
eapol-over-nl and not?


Because for EAPOL-over-data, we actually have to have IF_OPER_UP to be
able to TX/RX the EAPOL frames.



Okay.  But then wouldn't userspace now have to deal with two different 
behaviors now?



   - initialize 1X state machines/timeouts
 - 1X handshake
 - send PMK to device for 4-way-HS
 - AUTHORIZED event
   - [if eapol-over-nl: toggle oper state up]



Given your explanation above, should this be [if !eapol-over-nl ..?


   * new connection case - PMKSA usage
 - CONNECT event
   - [if !eapol-over-nl: toggle oper state up]
   - initialize 1X state machines/timeouts
 - AUTHORIZED event
   - stop 1X state machine/timeouts
   - [if eapol-over-nl: toggle oper state up]

   * roaming case - no PMKSA (usage)
 - ROAM event
   - [no touching oper state]


My understanding is that oper state goes back to dormant in case we
re-associate.  At least we needed to set it back to UP after a fast
transition?


Like I said above, I really don't think we should/can do this, I think
we'd lose at least IPv6 connectivity in the meantime?



So I agree that OPERSTATE_UP should not change on a roam.  I think we're 
both in agreement here.


My earlier point is that software roams need to have the exact same 
behavior as well.  And my understanding is that when we try to 
Fast-Transition (e.g. issue a CMD_ASSOCIATE), operstate is no longer UP.


At the very least there's lots of confusion with what is supposed to 
happen with operstate and when.  So if we can work out & document a 
consistent behavior, I'm all for it.


Regards,
-Denis


Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Denis Kenzior

Hi Johannes,

On 09/14/2017 06:44 AM, Johannes Berg wrote:

On Thu, 2017-09-14 at 13:21 +0200, Arend van Spriel wrote:


Yep. Toggling the OPER_STATE seems to go against what roaming is
about.


Agree.


The question is whether all APs are actually sane after a roam.  E.g. 
can the STA assume that the same IP address, DHCP lease, etc is still 
valid?  I heard from various people that this might not be the case, but 
we haven't had a chance to verify those claims...





Come to think of it, is it a good idea to tightly couple
PORT_AUTHORIZED to OPER_STATE. Aren't these separate concepts in
different layers of the network stack.


Well, I think that coupling would make the most sense, since once you
have oper state UP you'll try to get IPv6 etc., no? And before being
authorized there's no point.



I think it does make sense to tie one into the other.  However, do we 
have a race condition here?  E.g. AUTHORIZED is sent on one socket, then 
OPER_STATE is signaled on rtnl.  Which one do applications rely on?



Note that we *can't* do this right now, otherwise we can't transfer the
EAPOL frames; but once we do that over nl80211 we'd be able to.


In earlier discussions the proposal for a separate event was made by
Jithu (colleague). In brcmfmac it would become a bit less
complicated with a separate event so it has my vote as well. So the
AUTHORIZED event will have no attributes, right? So if the event
occurs it is AUTHORIZED.


I think so, yes. I pondered having the attribute in there so you could
explicitly have a "not authorized" event, but do we really need that?
If you get disconnected that's pretty much implied, so ... I don't
think we need it.



(*) is anyone working on that? I'll throw it on my list if not.


["that" being EAPOL-over-nl80211]


The last I saw on this was Denis Kenzior volunteering for it, but
that was about it.


Oh, thanks for the reminder, I'd forgotten entirely...
Denis?


*wakes up*

Ah I now seem to remember that I volunteered to look into this before my 
sabbatical :)  I think this was in early June?  I'm certainly still 
interested in doing so.   Let me dust off that portion of my brain and 
come up with a proposal.  Unless you already have a clear idea of how 
things should work?


Regards,
-Denis


Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Johannes Berg
Hi Denis,

I'd actually only Cc'ed you to see if you were actually working on
EAPOL-over-nl80211 :-)

> Curious, isn't sending a PMKID for a non-roaming case not strictly
> per spec?  Is this just to avoid running 802.1x?

I don't think so, what makes you? In -2016 9.4.2.25.5 for example, the
spec says:

The PMKID Count and List fields are used only in the RSNE in the
(Re)Association Request frame to an AP and in FT authentication
sequence frames.

Why should it only be valid in roaming? You might turn off and on wifi
and still have a PMKSA.

> How does this mesh with operstates outlined in 
> Documentation/networking/operstates.txt?  Are you treating the
> 'roaming' case as 802.1x 'reauthentication'?

I think we have to, because otherwise DHCP/IPv6 could happen again
after roaming - we really can't set the oper state to dormant during
the roaming.

> > Assuming this assessment is correct, this opens up another
> > possibility:
> > tracking the PORT_AUTHORIZED state with a separate event:
> > 
> >   * new connection case - no PMKSA (usage)
> > - CONNECT event
> >   - [if !eapol-over-nl: toggle oper state up]
> 
> Doesn't this go against operstates.txt, Section 4?

Why?

> Also, I must be dense, but why is the behavior different between 
> eapol-over-nl and not?

Because for EAPOL-over-data, we actually have to have IF_OPER_UP to be
able to TX/RX the EAPOL frames.

> >   - initialize 1X state machines/timeouts
> > - 1X handshake
> > - send PMK to device for 4-way-HS
> > - AUTHORIZED event
> >   - [if eapol-over-nl: toggle oper state up]
> > 
> >   * new connection case - PMKSA usage
> > - CONNECT event
> >   - [if !eapol-over-nl: toggle oper state up]
> >   - initialize 1X state machines/timeouts
> > - AUTHORIZED event
> >   - stop 1X state machine/timeouts
> >   - [if eapol-over-nl: toggle oper state up]
> > 
> >   * roaming case - no PMKSA (usage)
> > - ROAM event
> >   - [no touching oper state]
> 
> My understanding is that oper state goes back to dormant in case we 
> re-associate.  At least we needed to set it back to UP after a fast 
> transition?

Like I said above, I really don't think we should/can do this, I think
we'd lose at least IPv6 connectivity in the meantime?

johannes


Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Denis Kenzior

Hi Johannes,

On 09/14/2017 03:39 AM, Johannes Berg wrote:

Hi Arend, Jouni, all,

Avi and I just discussed another use case for the PORT_AUTHORIZED flag,
which is PMKSA caching.

Assume you have 4-way-HS offloaded, then if you send a PMKID in the new
connection (doesn't matter if it's roaming or not), the AP may chose to
use it and start the 4-way-HS, or it may not be able to and start the
802.1X handshake.


Curious, isn't sending a PMKID for a non-roaming case not strictly per 
spec?  Is this just to avoid running 802.1x?




In the supplicant, if you are waiting for the 1X handshake, then in the
case the PMKSA gets used the supplicant would wait forever, and
eventually terminate the connection because the handshake isn't
successful.

Thus, *both* cases need to know about the PORT_AUTHORIZED flag.

I previously didn't apply the patch for the flag in CONNECT
notification, but nobody is using it in ROAM notification so far, so
that we still have a chance of revisiting this.

Throwing a potential EAPOL-over-nl80211 (*) into the mix complicates
the picture further. For example, in that case the OPER_STATE might not
be set to UP until the port is authorized. In the case of roaming, this
might - on first sight - lead to toggling DOWN/UP if a new 1X handshake
needs to be done and the PORT_AUTHORIZED flag doesn't come with the
ROAM event. However, I think this is actually wrong as it would
probably lose IPv6 address assignment etc. - so I think we don't want
to do this even if we do have EAPOL-over-nl80211 and don't need the
oper_state to be up in order to do the 1X handshake. Do you agree with
this assessment?


How does this mesh with operstates outlined in 
Documentation/networking/operstates.txt?  Are you treating the 'roaming' 
case as 802.1x 'reauthentication'?




Assuming this assessment is correct, this opens up another possibility:
tracking the PORT_AUTHORIZED state with a separate event:

  * new connection case - no PMKSA (usage)
- CONNECT event
  - [if !eapol-over-nl: toggle oper state up]


Doesn't this go against operstates.txt, Section 4?

Also, I must be dense, but why is the behavior different between 
eapol-over-nl and not?



  - initialize 1X state machines/timeouts
- 1X handshake
- send PMK to device for 4-way-HS
- AUTHORIZED event
  - [if eapol-over-nl: toggle oper state up]

  * new connection case - PMKSA usage
- CONNECT event
  - [if !eapol-over-nl: toggle oper state up]
  - initialize 1X state machines/timeouts
- AUTHORIZED event
  - stop 1X state machine/timeouts
  - [if eapol-over-nl: toggle oper state up]

  * roaming case - no PMKSA (usage)
- ROAM event
  - [no touching oper state]


My understanding is that oper state goes back to dormant in case we 
re-associate.  At least we needed to set it back to UP after a fast 
transition?



  - initialize 1X state machines/timeouts
- 1X handshake
- AUTHORIZED event
  - [no touching oper state]

  * roaming case - PMKSA usage
- ROAM event
  - [no touching oper state]
  - initialize 1X state machines/timeouts
- AUTHORIZED event
  - stop 1X state machine/timeouts
  - [no touching oper state]


Does this seem reasonable? If possible, I prefer this to just having
the attribute, because with the attribute the roaming case will be
difficult, the natural code flow in the driver would probably make it
end up looking like this:

  * roaming case - no PMKSA (usage)
- ROAM event
  - [no touching
oper state]
  - initialize 1X state machines/timeouts
- 1X
handshake
- CONNECT event w/ PORT_AUTHORIZED
  - [no touching oper
state]

which seems awkward.

Any other thoughts?


A separate authorized event seems fine to me...

Regards,
-Denis


[PATCH] rtlwifi: rtl8192ee: Fix memory leak when loading firmware

2017-09-14 Thread Larry Finger
In routine rtl92ee_set_fw_rsvdpagepkt(), the driver allocates an skb, but
never calls rtl_cmd_send_packet(), which will free the buffer. All other
rtlwifi drivers perform this operation correctly.

This problem has been in the driver since it was included in the kernel.
Fortunately, each firmware load only leaks 4 buffers, which likely
explains why it has not previously been detected.

Cc: Stable  # 3.18+
Signed-off-by: Larry Finger 
---
 drivers/net/wireless/realtek/rtlwifi/rtl8192ee/fw.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/fw.c 
b/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/fw.c
index 7eae27f8e173..f9563ae301ad 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/fw.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/fw.c
@@ -682,7 +682,7 @@ void rtl92ee_set_fw_rsvdpagepkt(struct ieee80211_hw *hw, 
bool b_dl_finished)
struct rtl_priv *rtlpriv = rtl_priv(hw);
struct rtl_mac *mac = rtl_mac(rtl_priv(hw));
struct sk_buff *skb = NULL;
-
+   bool rtstatus;
u32 totalpacketlen;
u8 u1rsvdpageloc[5] = { 0 };
bool b_dlok = false;
@@ -768,7 +768,9 @@ void rtl92ee_set_fw_rsvdpagepkt(struct ieee80211_hw *hw, 
bool b_dl_finished)
skb = dev_alloc_skb(totalpacketlen);
skb_put_data(skb, _page_packet, totalpacketlen);
 
-   b_dlok = true;
+   rtstatus = rtl_cmd_send_packet(hw, skb);
+   if (rtstatus)
+   b_dlok = true;
 
if (b_dlok) {
RT_TRACE(rtlpriv, COMP_POWER, DBG_LOUD ,
-- 
2.12.3



Re: RTL8192EE PCIe Wireless Network Adapter crashed with linux-4.13

2017-09-14 Thread Larry Finger

On 09/14/2017 08:30 AM, Zwindl wrote:

Dear developers:
I'm using Arch Linux with testing enabled, the current kernel version and 
details are
`Linux zwindl 4.13.2-1-ARCH #1 SMP PREEMPT Thu Sep 14 02:57:34 UTC 2017 x86_64 
GNU/Linux`.
The wireless card can't work properly from the kernel 4.13. Here's the log(in 
attachment) when NetworkManager trying to connect my wifi which is named as 
'TP', my mac addr hided as xx:xx:xx:xx:xx.

What should I provide to help to debug?
ZWindL.


The BUG-ON arises in __intel_map_single() due to dir (for direction of DMA) 
equal to DMA_NONE (3). When rtl8192ee calls pci_map_single(), it uses 
PCI_DMA_TODEVICE (1). I followed the calling sequence through the entire chain, 
and none of the routines made any changes to 'dir', other that changing the type 
from int to enum dma_data_direction. That would not have changed a 1 to a 3.


I built a 4.13.2 system. The problem does not happen here. At this point, the 
system has been up for about two hours. I did discover a small memory leak 
associated with firmware loading, but that should not have caused the problem. 
Nonetheless, I will be sending a patch to fix that problem.


I will continue testing, although I doubt that the problem will happen here.

How long had your system been up when the problem occurred? Your dmesg fragment 
did not show any times. What kernels have you tried besides 4.13.2?


Larry



Re: iwlwifi firmware load broken in current -git

2017-09-14 Thread Johannes Berg
On Thu, 2017-09-14 at 23:14 +0530, Srinath Mannam wrote:
> 
> atomic_inc_return(>enable_cnt) in function
> pci_enable_device_flags enables passed pcie device.
> !pci_is_enabled(bridge) check in "if (bridge &&
> !pci_is_enabled(bridge))"  checks for bridge device of previous pcie
> device.
> So it will not stop bus_master of bridge device.

It also doesn't *enable* it though, does it?

johannes


Re: iwlwifi firmware load broken in current -git

2017-09-14 Thread Jens Axboe
On 09/14/2017 11:44 AM, Srinath Mannam wrote:
> Hi Jens Axboe,
> 
> On Thu, Sep 14, 2017 at 11:05 PM, Jens Axboe  wrote:
>> On 09/14/2017 11:28 AM, Srinath Mannam wrote:
>>> Hi Bjorn,
>>>
>>> On Thu, Sep 14, 2017 at 10:52 PM, Jens Axboe  wrote:

 On 09/14/2017 11:11 AM, Bjorn Helgaas wrote:
> [+cc linux-pci]
>
> On Thu, Sep 14, 2017 at 12:00 PM, Jens Axboe  wrote:
>> On 09/12/2017 02:04 PM, Johannes Berg wrote:
>>> On Tue, 2017-09-12 at 13:43 -0600, Jens Axboe wrote:
>>>
 CC'ing the guilty part and Bjorn. I'm assuming it's the
 pci_is_enabled() check, since the rest of the patch shouldn't have
 functional changes.
>>>
>>> and pci_enable_bridge() already checks if it's already enabled, but
>>> still enables mastering in that case if it isn't:
>>>
>>> static void pci_enable_bridge(struct pci_dev *dev)
>>> {
>>> [...]
>>> if (pci_is_enabled(dev)) {
>>> if (!dev->is_busmaster)
>>> pci_set_master(dev);
>>> return;
>>> }
>>>
>>> so I guess due to the new check we end up with mastering disabled, and
>>> thus the firmware can't load since that's a DMA thing?
>>
>> Bjorn/Srinath, any input here? This is a regression that prevents wifi
>> from working on a pretty standard laptop. It'd suck to have this be in
>> -rc1. Seems like the trivial fix would be:
>>
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index b0002daa50f3..ffbe11dbdd61 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -1394,7 +1394,7 @@ static int pci_enable_device_flags(struct pci_dev 
>> *dev, unsigned long flags)
>> return 0;   /* already enabled */
>>
>> bridge = pci_upstream_bridge(dev);
>> -   if (bridge && !pci_is_enabled(bridge))
>> +   if (bridge)
>>> With this change and keeping "mutex_lock(_bridge_mutex);" in
>>> pci_enable_bridge functoin will causes a nexted lock.
>>
>> Took a look, and looks like you are right. That code looks like a mess,
>> fwiw.
>>
>> I'd strongly suggest that the bad commit is reverted until a proper
>> solution is found, since the simple one-liner could potentially
>> introduce a deadlock with your patch applied.
> atomic_inc_return(>enable_cnt) in function
> pci_enable_device_flags enables passed pcie device.
> !pci_is_enabled(bridge) check in "if (bridge &&
> !pci_is_enabled(bridge))"  checks for bridge device of previous pcie
> device.
> So it will not stop bus_master of bridge device.
> In your case how many bridges in between endpoint and host bridge?
> Please provide the details about your use case.

It's a bog standard Lenovo X1 Carbon, gen4.

# lspci
00:00.0 Host bridge: Intel Corporation Sky Lake Host Bridge/DRAM Registers (rev 
08)
00:02.0 VGA compatible controller: Intel Corporation Sky Lake Integrated 
Graphics (rev 07)
00:08.0 System peripheral: Intel Corporation Sky Lake Gaussian Mixture Model
00:13.0 Non-VGA unclassified device: Intel Corporation Device 9d35 (rev 21)
00:14.0 USB controller: Intel Corporation Sunrise Point-LP USB 3.0 xHCI 
Controller (rev 21)
00:14.2 Signal processing controller: Intel Corporation Sunrise Point-LP 
Thermal subsystem (rev 21)
00:16.0 Communication controller: Intel Corporation Sunrise Point-LP CSME HECI 
(rev 21)
00:1c.0 PCI bridge: Intel Corporation Device 9d10 (rev f1)
00:1c.2 PCI bridge: Intel Corporation Device 9d12 (rev f1)
00:1c.4 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port 
(rev f1)
00:1f.0 ISA bridge: Intel Corporation Sunrise Point-LP LPC Controller (rev 21)
00:1f.2 Memory controller: Intel Corporation Sunrise Point-LP PMC (rev 21)
00:1f.3 Audio device: Intel Corporation Sunrise Point-LP HD Audio (rev 21)
00:1f.4 SMBus: Intel Corporation Sunrise Point-LP SMBus (rev 21)
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection I219-LM (rev 
21)
04:00.0 Network controller: Intel Corporation Wireless 8260 (rev 3a)
05:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD 
Controller (rev 01)

# lspci -t
-[:00]-+-00.0
   +-02.0
   +-08.0
   +-13.0
   +-14.0
   +-14.2
   +-16.0
   +-1c.0-[02]--
   +-1c.2-[04]00.0
   +-1c.4-[05]00.0
   +-1f.0
   +-1f.2
   +-1f.3
   +-1f.4
   \-1f.6

-- 
Jens Axboe



Re: iwlwifi firmware load broken in current -git

2017-09-14 Thread Jens Axboe
On 09/14/2017 11:35 AM, Jens Axboe wrote:
> On 09/14/2017 11:28 AM, Srinath Mannam wrote:
>> Hi Bjorn,
>>
>> On Thu, Sep 14, 2017 at 10:52 PM, Jens Axboe  wrote:
>>>
>>> On 09/14/2017 11:11 AM, Bjorn Helgaas wrote:
 [+cc linux-pci]

 On Thu, Sep 14, 2017 at 12:00 PM, Jens Axboe  wrote:
> On 09/12/2017 02:04 PM, Johannes Berg wrote:
>> On Tue, 2017-09-12 at 13:43 -0600, Jens Axboe wrote:
>>
>>> CC'ing the guilty part and Bjorn. I'm assuming it's the
>>> pci_is_enabled() check, since the rest of the patch shouldn't have
>>> functional changes.
>>
>> and pci_enable_bridge() already checks if it's already enabled, but
>> still enables mastering in that case if it isn't:
>>
>> static void pci_enable_bridge(struct pci_dev *dev)
>> {
>> [...]
>> if (pci_is_enabled(dev)) {
>> if (!dev->is_busmaster)
>> pci_set_master(dev);
>> return;
>> }
>>
>> so I guess due to the new check we end up with mastering disabled, and
>> thus the firmware can't load since that's a DMA thing?
>
> Bjorn/Srinath, any input here? This is a regression that prevents wifi
> from working on a pretty standard laptop. It'd suck to have this be in
> -rc1. Seems like the trivial fix would be:
>
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b0002daa50f3..ffbe11dbdd61 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1394,7 +1394,7 @@ static int pci_enable_device_flags(struct pci_dev 
> *dev, unsigned long flags)
> return 0;   /* already enabled */
>
> bridge = pci_upstream_bridge(dev);
> -   if (bridge && !pci_is_enabled(bridge))
> +   if (bridge)
>> With this change and keeping "mutex_lock(_bridge_mutex);" in
>> pci_enable_bridge functoin will causes a nexted lock.
> 
> Took a look, and looks like you are right. That code looks like a mess,
> fwiw.
> 
> I'd strongly suggest that the bad commit is reverted until a proper
> solution is found, since the simple one-liner could potentially
> introduce a deadlock with your patch applied.

BTW, your patch looks pretty bad too, introducing a random mutex
deep on code that can be recursive. Why isn't this check in
pci_enable_device_flags() enough to prevent double-enable of
an upstream bridge?

if (atomic_inc_return(>enable_cnt) > 1)
return 0;   /* already enabled */

-- 
Jens Axboe



Re: iwlwifi firmware load broken in current -git

2017-09-14 Thread Srinath Mannam
Hi Jens Axboe,

On Thu, Sep 14, 2017 at 11:05 PM, Jens Axboe  wrote:
> On 09/14/2017 11:28 AM, Srinath Mannam wrote:
>> Hi Bjorn,
>>
>> On Thu, Sep 14, 2017 at 10:52 PM, Jens Axboe  wrote:
>>>
>>> On 09/14/2017 11:11 AM, Bjorn Helgaas wrote:
 [+cc linux-pci]

 On Thu, Sep 14, 2017 at 12:00 PM, Jens Axboe  wrote:
> On 09/12/2017 02:04 PM, Johannes Berg wrote:
>> On Tue, 2017-09-12 at 13:43 -0600, Jens Axboe wrote:
>>
>>> CC'ing the guilty part and Bjorn. I'm assuming it's the
>>> pci_is_enabled() check, since the rest of the patch shouldn't have
>>> functional changes.
>>
>> and pci_enable_bridge() already checks if it's already enabled, but
>> still enables mastering in that case if it isn't:
>>
>> static void pci_enable_bridge(struct pci_dev *dev)
>> {
>> [...]
>> if (pci_is_enabled(dev)) {
>> if (!dev->is_busmaster)
>> pci_set_master(dev);
>> return;
>> }
>>
>> so I guess due to the new check we end up with mastering disabled, and
>> thus the firmware can't load since that's a DMA thing?
>
> Bjorn/Srinath, any input here? This is a regression that prevents wifi
> from working on a pretty standard laptop. It'd suck to have this be in
> -rc1. Seems like the trivial fix would be:
>
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b0002daa50f3..ffbe11dbdd61 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1394,7 +1394,7 @@ static int pci_enable_device_flags(struct pci_dev 
> *dev, unsigned long flags)
> return 0;   /* already enabled */
>
> bridge = pci_upstream_bridge(dev);
> -   if (bridge && !pci_is_enabled(bridge))
> +   if (bridge)
>> With this change and keeping "mutex_lock(_bridge_mutex);" in
>> pci_enable_bridge functoin will causes a nexted lock.
>
> Took a look, and looks like you are right. That code looks like a mess,
> fwiw.
>
> I'd strongly suggest that the bad commit is reverted until a proper
> solution is found, since the simple one-liner could potentially
> introduce a deadlock with your patch applied.
atomic_inc_return(>enable_cnt) in function
pci_enable_device_flags enables passed pcie device.
!pci_is_enabled(bridge) check in "if (bridge &&
!pci_is_enabled(bridge))"  checks for bridge device of previous pcie
device.
So it will not stop bus_master of bridge device.
In your case how many bridges in between endpoint and host bridge?
Please provide the details about your use case.
Thank you.
>
> --
> Jens Axboe
>
Regards,
Srinath.


Re: iwlwifi firmware load broken in current -git

2017-09-14 Thread Jens Axboe
On 09/14/2017 11:28 AM, Srinath Mannam wrote:
> Hi Bjorn,
> 
> On Thu, Sep 14, 2017 at 10:52 PM, Jens Axboe  wrote:
>>
>> On 09/14/2017 11:11 AM, Bjorn Helgaas wrote:
>>> [+cc linux-pci]
>>>
>>> On Thu, Sep 14, 2017 at 12:00 PM, Jens Axboe  wrote:
 On 09/12/2017 02:04 PM, Johannes Berg wrote:
> On Tue, 2017-09-12 at 13:43 -0600, Jens Axboe wrote:
>
>> CC'ing the guilty part and Bjorn. I'm assuming it's the
>> pci_is_enabled() check, since the rest of the patch shouldn't have
>> functional changes.
>
> and pci_enable_bridge() already checks if it's already enabled, but
> still enables mastering in that case if it isn't:
>
> static void pci_enable_bridge(struct pci_dev *dev)
> {
> [...]
> if (pci_is_enabled(dev)) {
> if (!dev->is_busmaster)
> pci_set_master(dev);
> return;
> }
>
> so I guess due to the new check we end up with mastering disabled, and
> thus the firmware can't load since that's a DMA thing?

 Bjorn/Srinath, any input here? This is a regression that prevents wifi
 from working on a pretty standard laptop. It'd suck to have this be in
 -rc1. Seems like the trivial fix would be:


 diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
 index b0002daa50f3..ffbe11dbdd61 100644
 --- a/drivers/pci/pci.c
 +++ b/drivers/pci/pci.c
 @@ -1394,7 +1394,7 @@ static int pci_enable_device_flags(struct pci_dev 
 *dev, unsigned long flags)
 return 0;   /* already enabled */

 bridge = pci_upstream_bridge(dev);
 -   if (bridge && !pci_is_enabled(bridge))
 +   if (bridge)
> With this change and keeping "mutex_lock(_bridge_mutex);" in
> pci_enable_bridge functoin will causes a nexted lock.

Took a look, and looks like you are right. That code looks like a mess,
fwiw.

I'd strongly suggest that the bad commit is reverted until a proper
solution is found, since the simple one-liner could potentially
introduce a deadlock with your patch applied.

-- 
Jens Axboe



Re: iwlwifi firmware load broken in current -git

2017-09-14 Thread Srinath Mannam
Hi Bjorn,

On Thu, Sep 14, 2017 at 10:52 PM, Jens Axboe  wrote:
>
> On 09/14/2017 11:11 AM, Bjorn Helgaas wrote:
> > [+cc linux-pci]
> >
> > On Thu, Sep 14, 2017 at 12:00 PM, Jens Axboe  wrote:
> >> On 09/12/2017 02:04 PM, Johannes Berg wrote:
> >>> On Tue, 2017-09-12 at 13:43 -0600, Jens Axboe wrote:
> >>>
>  CC'ing the guilty part and Bjorn. I'm assuming it's the
>  pci_is_enabled() check, since the rest of the patch shouldn't have
>  functional changes.
> >>>
> >>> and pci_enable_bridge() already checks if it's already enabled, but
> >>> still enables mastering in that case if it isn't:
> >>>
> >>> static void pci_enable_bridge(struct pci_dev *dev)
> >>> {
> >>> [...]
> >>> if (pci_is_enabled(dev)) {
> >>> if (!dev->is_busmaster)
> >>> pci_set_master(dev);
> >>> return;
> >>> }
> >>>
> >>> so I guess due to the new check we end up with mastering disabled, and
> >>> thus the firmware can't load since that's a DMA thing?
> >>
> >> Bjorn/Srinath, any input here? This is a regression that prevents wifi
> >> from working on a pretty standard laptop. It'd suck to have this be in
> >> -rc1. Seems like the trivial fix would be:
> >>
> >>
> >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> >> index b0002daa50f3..ffbe11dbdd61 100644
> >> --- a/drivers/pci/pci.c
> >> +++ b/drivers/pci/pci.c
> >> @@ -1394,7 +1394,7 @@ static int pci_enable_device_flags(struct pci_dev 
> >> *dev, unsigned long flags)
> >> return 0;   /* already enabled */
> >>
> >> bridge = pci_upstream_bridge(dev);
> >> -   if (bridge && !pci_is_enabled(bridge))
> >> +   if (bridge)
With this change and keeping "mutex_lock(_bridge_mutex);" in
pci_enable_bridge functoin will causes a nexted lock.

> >> pci_enable_bridge(bridge);
> >>
> >> /* only skip sriov related */
> >>
> >>
> >
> > Looks like a reasonable fix.  I assume it works for you?  I don't have
> > a way to test it, so if you can verify that it works and supply a
> > Signed-off-by, I can merge it.
>
> Booting it right now, I'll send out a proper version in a few minutes.
>
> --
> Jens Axboe
>


Re: iwlwifi firmware load broken in current -git

2017-09-14 Thread Jens Axboe
On 09/14/2017 11:11 AM, Bjorn Helgaas wrote:
> [+cc linux-pci]
> 
> On Thu, Sep 14, 2017 at 12:00 PM, Jens Axboe  wrote:
>> On 09/12/2017 02:04 PM, Johannes Berg wrote:
>>> On Tue, 2017-09-12 at 13:43 -0600, Jens Axboe wrote:
>>>
 CC'ing the guilty part and Bjorn. I'm assuming it's the
 pci_is_enabled() check, since the rest of the patch shouldn't have
 functional changes.
>>>
>>> and pci_enable_bridge() already checks if it's already enabled, but
>>> still enables mastering in that case if it isn't:
>>>
>>> static void pci_enable_bridge(struct pci_dev *dev)
>>> {
>>> [...]
>>> if (pci_is_enabled(dev)) {
>>> if (!dev->is_busmaster)
>>> pci_set_master(dev);
>>> return;
>>> }
>>>
>>> so I guess due to the new check we end up with mastering disabled, and
>>> thus the firmware can't load since that's a DMA thing?
>>
>> Bjorn/Srinath, any input here? This is a regression that prevents wifi
>> from working on a pretty standard laptop. It'd suck to have this be in
>> -rc1. Seems like the trivial fix would be:
>>
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index b0002daa50f3..ffbe11dbdd61 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -1394,7 +1394,7 @@ static int pci_enable_device_flags(struct pci_dev 
>> *dev, unsigned long flags)
>> return 0;   /* already enabled */
>>
>> bridge = pci_upstream_bridge(dev);
>> -   if (bridge && !pci_is_enabled(bridge))
>> +   if (bridge)
>> pci_enable_bridge(bridge);
>>
>> /* only skip sriov related */
>>
>>
> 
> Looks like a reasonable fix.  I assume it works for you?  I don't have
> a way to test it, so if you can verify that it works and supply a
> Signed-off-by, I can merge it.

Booting it right now, I'll send out a proper version in a few minutes.

-- 
Jens Axboe



Re: iwlwifi firmware load broken in current -git

2017-09-14 Thread Bjorn Helgaas
[+cc linux-pci]

On Thu, Sep 14, 2017 at 12:00 PM, Jens Axboe  wrote:
> On 09/12/2017 02:04 PM, Johannes Berg wrote:
>> On Tue, 2017-09-12 at 13:43 -0600, Jens Axboe wrote:
>>
>>> CC'ing the guilty part and Bjorn. I'm assuming it's the
>>> pci_is_enabled() check, since the rest of the patch shouldn't have
>>> functional changes.
>>
>> and pci_enable_bridge() already checks if it's already enabled, but
>> still enables mastering in that case if it isn't:
>>
>> static void pci_enable_bridge(struct pci_dev *dev)
>> {
>> [...]
>> if (pci_is_enabled(dev)) {
>> if (!dev->is_busmaster)
>> pci_set_master(dev);
>> return;
>> }
>>
>> so I guess due to the new check we end up with mastering disabled, and
>> thus the firmware can't load since that's a DMA thing?
>
> Bjorn/Srinath, any input here? This is a regression that prevents wifi
> from working on a pretty standard laptop. It'd suck to have this be in
> -rc1. Seems like the trivial fix would be:
>
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b0002daa50f3..ffbe11dbdd61 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1394,7 +1394,7 @@ static int pci_enable_device_flags(struct pci_dev *dev, 
> unsigned long flags)
> return 0;   /* already enabled */
>
> bridge = pci_upstream_bridge(dev);
> -   if (bridge && !pci_is_enabled(bridge))
> +   if (bridge)
> pci_enable_bridge(bridge);
>
> /* only skip sriov related */
>
>

Looks like a reasonable fix.  I assume it works for you?  I don't have
a way to test it, so if you can verify that it works and supply a
Signed-off-by, I can merge it.

Bjorn


Re: iwlwifi firmware load broken in current -git

2017-09-14 Thread Jens Axboe
On 09/12/2017 02:04 PM, Johannes Berg wrote:
> On Tue, 2017-09-12 at 13:43 -0600, Jens Axboe wrote:
> 
>> CC'ing the guilty part and Bjorn. I'm assuming it's the
>> pci_is_enabled() check, since the rest of the patch shouldn't have
>> functional changes.
> 
> and pci_enable_bridge() already checks if it's already enabled, but
> still enables mastering in that case if it isn't:
> 
> static void pci_enable_bridge(struct pci_dev *dev)
> {
> [...]
> if (pci_is_enabled(dev)) {
> if (!dev->is_busmaster)
> pci_set_master(dev);
> return;
> }
> 
> so I guess due to the new check we end up with mastering disabled, and
> thus the firmware can't load since that's a DMA thing?

Bjorn/Srinath, any input here? This is a regression that prevents wifi
from working on a pretty standard laptop. It'd suck to have this be in
-rc1. Seems like the trivial fix would be:


diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b0002daa50f3..ffbe11dbdd61 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1394,7 +1394,7 @@ static int pci_enable_device_flags(struct pci_dev *dev, 
unsigned long flags)
return 0;   /* already enabled */
 
bridge = pci_upstream_bridge(dev);
-   if (bridge && !pci_is_enabled(bridge))
+   if (bridge)
pci_enable_bridge(bridge);
 
/* only skip sriov related */


-- 
Jens Axboe



Re: Kernel developer to work on wifi injection in mainline

2017-09-14 Thread Steve deRosier
Hi Raphael,

On Thu, Sep 14, 2017 at 7:31 AM, Raphael Hertzog  wrote:
> (Please keep-me in CC and use reply-to all)
>
...
>
> Thus we are looking for a kernel developer that we could work with on a
> contractor basis to:
> - make wifi injection work on the current Kali kernel on various drivers
>   that we want to support
> - upstream as much as possible of changes needed for wifi injection to
>   work
> - ensure continued support of the feature in new Linux releases
>

I do this sort of consulting work. While I kept this reply on the list
due to your instructions, I'll send you information directly from my
consulting address.

> I wonder whether there are non-technical hurdles that would prevent
> merging of wifi injection patches. Are there legal challenges to this for
> instance?
>

I don't think there would be a legal challenge provided the code is
yours to sign-off on and isn't patent encumbered.

Personally, I think the biggest issue is interest of the Linux
community as a whole. In other words, if you can convince maintainers
that the change is:
1. valuable to the Linux community members
2. not harmful to the community as a whole
3. worth maintaining (and in part will be responsible for maintaining)
4. not purely based on closed business reasons
then I think it'll be possible to get it upstream.

I'm not that familiar with the patch(es) in question but from my quick
look I don't see an obvious blocking issue with the concepts here.

- Steve


Kernel developer to work on wifi injection in mainline

2017-09-14 Thread Raphael Hertzog
(Please keep-me in CC and use reply-to all)

Hello everybody,

I'm working for Kali Linux/Offensive Security. Pentesters rely on working
wifi injection to do their work but support of this feature is sub-optimal
in mainline Linux. This is why we carry a patch:
http://git.kali.org/gitweb/?p=packages/linux.git;a=blob;f=debian/patches/features/all/kali-wifi-injection.patch;h=0b872f83f3ca09c967d9c0c520e11cf7d75dfd91;hb=refs/heads/kali/master

This feature also tends to break on various drivers from time to time
as is currently the case with carl9170 and rt2800usb:
https://bugs.kali.org/view.php?id=4220

Thus we are looking for a kernel developer that we could work with on a
contractor basis to:
- make wifi injection work on the current Kali kernel on various drivers
  that we want to support
- upstream as much as possible of changes needed for wifi injection to
  work
- ensure continued support of the feature in new Linux releases

I wonder whether there are non-technical hurdles that would prevent
merging of wifi injection patches. Are there legal challenges to this for
instance?

Cheers,

PS: We track new kernel release out of Debian Testing so we tend to
use all releases for 2 months.
-- 
Raphaël Hertzog ◈ Kali Linux Developer ◈ Debian Developer

Get the Debian Administrator's Handbook:
→ https://debian-handbook.info/get/


RE: [EXT] Re: [PATCH 2/2] mwifiex: print URB submit failure error after threshold attemtps

2017-09-14 Thread Ganapathi Bhat
Hi Joe,

> On Thu, 2017-08-31 at 01:21 +0530, Ganapathi Bhat wrote:
> > Current driver prints dev_alloc_skb failures everytime while
> > submitting RX URBs. This failure might be frequent in some low
> > resource platforms. So, wait for a threshold failure count before
> > start priting the error. This change is a follow up for the 'commit
> > 7b368e3d15c3
> > ("mwifiex: resubmit failed to submit RX URBs in main thread")'
> 
> []
> 
> > diff --git a/drivers/net/wireless/marvell/mwifiex/usb.c
> > b/drivers/net/wireless/marvell/mwifiex/usb.c
> []
> > @@ -300,9 +300,16 @@ static int mwifiex_usb_submit_rx_urb(struct
> urb_context *ctx, int size)
> > if (card->rx_cmd_ep != ctx->ep) {
> > ctx->skb = dev_alloc_skb(size);
> > if (!ctx->skb) {
> > -   mwifiex_dbg(adapter, ERROR,
> > -   "%s: dev_alloc_skb failed\n", __func__);
> > +   if (++card->rx_urb_failure_count >
> > +   MWIFIEX_RX_URB_FAILURE_THRESHOLD) {
> > +   mwifiex_dbg(adapter, ERROR,
> > +   "%s: dev_alloc_skb failed, failure
> count = %u\n",
> > +   __func__,
> > +   card->rx_urb_failure_count);
> > +   }
> > return -ENOMEM;
> 
> Why not use a ratelimit?
Since this is for receive, the packets are from AP side and we cannot lower the 
rate from AP. On some low performance systems this change will be helpful.

Regards,
Ganapathi



RE: [EXT] Re: [PATCH] mwifiex: check for mfg_mode in add_virtual_intf

2017-09-14 Thread Ganapathi Bhat
Hi Kalle,

> -Original Message-
> From: Kalle Valo [mailto:kv...@codeaurora.org]
> Sent: Tuesday, September 12, 2017 11:33 AM
> To: Brian Norris
> Cc: Ganapathi Bhat; linux-wireless@vger.kernel.org; Cathy Luo; Xinming
> Hu; Zhiyuan Yang; James Cao; Mangesh Malusare
> Subject: [EXT] Re: [PATCH] mwifiex: check for mfg_mode in
> add_virtual_intf
> 
> External Email
> 
> --
> Brian Norris  writes:
> 
> > Could have used a 'v2' in the subject, but hopefully that doesn't
> > bother Kalle too much.
> 
> It does create more work for me when sorting patches so please always
> try include the version in the subject. But no need to resend.
Sure. I will keep this in mind for future submits.
> 
> --
> Kalle Valo

Thanks,
Ganapathi


Re: ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Arend van Spriel

On 14-09-17 10:39, Johannes Berg wrote:

Hi Arend, Jouni, all,

Avi and I just discussed another use case for the PORT_AUTHORIZED flag,
which is PMKSA caching.

Assume you have 4-way-HS offloaded, then if you send a PMKID in the new
connection (doesn't matter if it's roaming or not), the AP may chose to
use it and start the 4-way-HS, or it may not be able to and start the
802.1X handshake.

In the supplicant, if you are waiting for the 1X handshake, then in the
case the PMKSA gets used the supplicant would wait forever, and
eventually terminate the connection because the handshake isn't
successful.

Thus, *both* cases need to know about the PORT_AUTHORIZED flag.

I previously didn't apply the patch for the flag in CONNECT
notification, but nobody is using it in ROAM notification so far, so
that we still have a chance of revisiting this.

Throwing a potential EAPOL-over-nl80211 (*) into the mix complicates
the picture further. For example, in that case the OPER_STATE might not
be set to UP until the port is authorized. In the case of roaming, this
might - on first sight - lead to toggling DOWN/UP if a new 1X handshake
needs to be done and the PORT_AUTHORIZED flag doesn't come with the
ROAM event. However, I think this is actually wrong as it would
probably lose IPv6 address assignment etc. - so I think we don't want
to do this even if we do have EAPOL-over-nl80211 and don't need the
oper_state to be up in order to do the 1X handshake. Do you agree with
this assessment?


Yep. Toggling the OPER_STATE seems to go against what roaming is about. 
Come to think of it, is it a good idea to tightly couple PORT_AUTHORIZED 
to OPER_STATE. Aren't these separate concepts in different layers of the 
network stack.

Assuming this assessment is correct, this opens up another possibility:
tracking the PORT_AUTHORIZED state with a separate event:

  * new connection case - no PMKSA (usage)
- CONNECT event
  - [if !eapol-over-nl: toggle oper state up]
  - initialize 1X state machines/timeouts
- 1X handshake
- send PMK to device for 4-way-HS
- AUTHORIZED event
  - [if eapol-over-nl: toggle oper state up]

  * new connection case - PMKSA usage
- CONNECT event
  - [if !eapol-over-nl: toggle oper state up]
  - initialize 1X state machines/timeouts
- AUTHORIZED event
  - stop 1X state machine/timeouts
  - [if eapol-over-nl: toggle oper state up]

  * roaming case - no PMKSA (usage)
- ROAM event
  - [no touching oper state]
  - initialize 1X state machines/timeouts
- 1X handshake
- AUTHORIZED event
  - [no touching oper state]

  * roaming case - PMKSA usage
- ROAM event
  - [no touching oper state]
  - initialize 1X state machines/timeouts
- AUTHORIZED event
  - stop 1X state machine/timeouts
  - [no touching oper state]


Does this seem reasonable? If possible, I prefer this to just having
the attribute, because with the attribute the roaming case will be
difficult, the natural code flow in the driver would probably make it
end up looking like this:

  * roaming case - no PMKSA (usage)
- ROAM event
  - [no touching
oper state]
  - initialize 1X state machines/timeouts
- 1X
handshake
- CONNECT event w/ PORT_AUTHORIZED
  - [no touching oper
state]

which seems awkward.

Any other thoughts?


In earlier discussions the proposal for a separate event was made by 
Jithu (colleague). In brcmfmac it would become a bit less complicated 
with a separate event so it has my vote as well. So the AUTHORIZED event 
will have no attributes, right? So if the event occurs it is AUTHORIZED.



johannes

(*) is anyone working on that? I'll throw it on my list if not.


The last I saw on this was Denis Kenzior volunteering for it, but that 
was about it.


By the way, I still have wpa_supplicant patches for 4way-hs offloading. 
I need to dust it off a little and obviously did not foresee the change 
above ;-) Let me know if you (Intel) plan to submit patches for it.


Regards,
Arend


Re: rtl8821ae keep alive not set, connection lost

2017-09-14 Thread James Cameron
On Wed, Sep 13, 2017 at 07:39:35PM -0500, Larry Finger wrote:
> On 09/13/2017 04:46 PM, James Cameron wrote:
> >
> >I'll give it some more testing and let you know, but it seems as
> >capable of keeping a connection as 4.13 plus my earlier revert.
> >

Testing went well; removing the call to enable ASPM was as good as
changing the DBI read back to 16-bit width.

> The change I sent earlier should be as good as reverting the change
> to write_byte in your reversion.

Yes, that would be the hope.

But with the 16-bit DBI read, the register REG_DBI_CTRL+0 is being
read as well, in the first read in _rtl8821ae_enable_aspm_back_door,
so perhaps reading that register has an unexpected side-effect.

Is there any documentation for that register?  I see other code writes
to REG_DBI_CTRL+3, in _rtl8821ae_check_pcie_dma_hang

Evidence of read from REG_DBI_CTRL was captured with an instrumented
kernel; git diff http://dev.laptop.org/~quozl/y/1dsQ6B.txt yielding
these dmesg lines;

[6.010255] rtl_pci: _rtl_pci_update_default_setting const_amdpci_aspm=03
[6.010338] rtl_pci: rtl_pci_enable_aspm
[6.034295] ieee80211 phy0: Selected rate control algorithm 'rtl_rc'
[6.034806] rtlwifi: rtlwifi: wireless switch is on
[6.196958] rtl8821ae :02:00.0 wlp2s0: renamed from wlan0
[7.979186] rtl_pci: rtl_pci_disable_aspm
[7.979306] rtl8821ae: _rtl8821ae_check_pcie_dma_hang
[8.295360] rtl8821ae: _rtl8821ae_enable_aspm_back_door
[8.295437] rtl8821ae: _rtl8821ae_dbi_read  070f ->  (@034f)
[8.295449] rtl8821ae: _rtl8821ae_dbi_write 070f <- ff (@870c)
[8.295462] rtl8821ae: _rtl8821ae_dbi_read  0719 -> 0200 (@034d)
[8.295474] rtl8821ae: _rtl8821ae_dbi_write 0719 <- 18 (@2718)
[8.295477] rtl_pci: rtl_pci_enable_aspm
[8.469734] rtl_pci: rtl_pci_disable_aspm
[8.469857] rtl8821ae: _rtl8821ae_check_pcie_dma_hang
[8.686955] rtl8821ae: _rtl8821ae_enable_aspm_back_door
[8.687013] rtl8821ae: _rtl8821ae_dbi_read  070f ->  (@034f)
[8.687025] rtl8821ae: _rtl8821ae_dbi_write 070f <- ff (@870c)
[8.687038] rtl8821ae: _rtl8821ae_dbi_read  0719 -> 0218 (@034d)
[8.687050] rtl8821ae: _rtl8821ae_dbi_write 0719 <- 18 (@2718)
[8.687053] rtl_pci: rtl_pci_enable_aspm

Observe how the windowed read of DBI register 0x70f causes a read of
16-bits at 0x34f, which includes first 8-bits of 0x350 REG_DBI_CTRL.

By the way, the cold boot value of DBI register 0x719 is 0x00, and
the warm boot value is 0x18, so I'm confident there isn't a
comprehensive register reset.  It means that BIOS has relevance; and
this BIOS is outside my control.  BIOS variation may explain
difficulty reproducing.

> There has been a report (in Russian unfortunately) at
> https://www.linux.org.ru/forum/desktop/12620193 of delays in ARP
> handling.

Thanks.  I've considered and excluded ARP handling delay.  Though ARP
renewal is typical reason for device sleep to end.

With the call to enable ASPM disabled, instead of changing the DBI
read to 16-bit width, what happens is that the device stops accepting
data from the access point, packets are buffered there, and are
transmitted as soon as the device makes the next transmission.

http://dev.laptop.org/~quozl/z/1dsQBf.txt has the ping and IP tcpdump
to confirm this.

I've a monitor mode tcpdump I can send by private mail if required.
In that the burst of packets shows ICMP echo requests were buffered by
the access point.

> According to Google translate is as follows:
> 
> 
> Periodically, Wi-Fi networker rtl8821ae ceases to respond to ARP,
> which causes the Internet to end. Wireshark looks quite interesting:
> ARP replays can be sent by one large packet a few seconds after
> receiving the requests, ie. they seem to be buffered somewhere.

Yes, buffering at access point.

> I need to explore that ENOBUFS return code.

I've seen ENOBUFS up at the application level with ping too, when the
original problem happens with v4.10 plus stable.

> Your case where the device is unresponsive to pings from another NIC
> until the device transmits may also be an ARP problem.
> 
> For completeness, are you using the 2.4 of 5 GHz band? What is the
> make/model your AP? If possible for you to determine, what firmware
> is it running?

2.4 GHz and 5 GHz reproduces the problem.

Open or WPA reproduces the problem.

Netgear WNDR3800 OpenWrt 12.09-beta, r33312.

Several other access points reproduce the problem, including a
customer's TP-Link TL-WR1042ND with unknown firmware version.

No access point as yet does not reproduce the problem.

Hope that helps, thanks for your ideas.

-- 
James Cameron
http://quozl.netrek.org/


[PATCH v1] iwlwifi: mvm: allow monitor mode capture in STA mode

2017-09-14 Thread gavinli
From: Gavin Li 

Open up the filter if there is a monitor interface configured; this
allows all packets on the channel to be captured even if the device is
in STA mode and associated to a BSS.

Signed-off-by: Gavin Li 
---
 drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c |  6 ++
 drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c | 15 +++
 2 files changed, 21 insertions(+)

diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c 
b/drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c
index a2bf530eeae4..38b933dc8545 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c
@@ -654,6 +654,12 @@ static void iwl_mvm_mac_ctxt_cmd_common(struct iwl_mvm 
*mvm,
 
cmd->filter_flags = cpu_to_le32(MAC_FILTER_ACCEPT_GRP);
 
+   if (mvm->hw->conf.flags & IEEE80211_CONF_MONITOR)
+   cmd->filter_flags |= cpu_to_le32(MAC_FILTER_IN_PROMISC |
+MAC_FILTER_IN_CONTROL_AND_MGMT 
|
+MAC_FILTER_IN_BEACON |
+MAC_FILTER_IN_PROBE_REQUEST);
+
for (i = 0; i < IEEE80211_NUM_ACS; i++) {
u8 txf = iwl_mvm_mac_ac_to_tx_fifo(mvm, i);
 
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c 
b/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
index 15f2d826bb4b..c529fb8228df 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
@@ -1530,8 +1530,23 @@ static void iwl_mvm_mac_remove_interface(struct 
ieee80211_hw *hw,
mutex_unlock(>mutex);
 }
 
+static void iwl_mvm_monitor_changed_iterator(void *_mvm, u8 *mac, struct 
ieee80211_vif *vif)
+{
+   struct iwl_mvm *mvm = _mvm;
+   iwl_mvm_mac_ctxt_changed(mvm, vif, false, NULL);
+}
+
 static int iwl_mvm_mac_config(struct ieee80211_hw *hw, u32 changed)
 {
+   struct iwl_mvm *mvm = IWL_MAC80211_GET_MVM(hw);
+
+   if (changed & IEEE80211_CONF_CHANGE_MONITOR) {
+   mutex_lock(>mutex);
+   ieee80211_iterate_active_interfaces(hw, 
IEEE80211_IFACE_ITER_NORMAL,
+   iwl_mvm_monitor_changed_iterator, mvm);
+   mutex_unlock(>mutex);
+   }
+
return 0;
 }
 
-- 
2.14.1



ROAM/CONNECT event with PORT_AUTHORIZED

2017-09-14 Thread Johannes Berg
Hi Arend, Jouni, all,

Avi and I just discussed another use case for the PORT_AUTHORIZED flag,
which is PMKSA caching.

Assume you have 4-way-HS offloaded, then if you send a PMKID in the new
connection (doesn't matter if it's roaming or not), the AP may chose to
use it and start the 4-way-HS, or it may not be able to and start the
802.1X handshake.

In the supplicant, if you are waiting for the 1X handshake, then in the
case the PMKSA gets used the supplicant would wait forever, and
eventually terminate the connection because the handshake isn't
successful.

Thus, *both* cases need to know about the PORT_AUTHORIZED flag.

I previously didn't apply the patch for the flag in CONNECT
notification, but nobody is using it in ROAM notification so far, so
that we still have a chance of revisiting this.

Throwing a potential EAPOL-over-nl80211 (*) into the mix complicates
the picture further. For example, in that case the OPER_STATE might not
be set to UP until the port is authorized. In the case of roaming, this
might - on first sight - lead to toggling DOWN/UP if a new 1X handshake
needs to be done and the PORT_AUTHORIZED flag doesn't come with the
ROAM event. However, I think this is actually wrong as it would
probably lose IPv6 address assignment etc. - so I think we don't want
to do this even if we do have EAPOL-over-nl80211 and don't need the
oper_state to be up in order to do the 1X handshake. Do you agree with
this assessment?

Assuming this assessment is correct, this opens up another possibility:
tracking the PORT_AUTHORIZED state with a separate event:

 * new connection case - no PMKSA (usage)
   - CONNECT event
 - [if !eapol-over-nl: toggle oper state up]
 - initialize 1X state machines/timeouts
   - 1X handshake
   - send PMK to device for 4-way-HS
   - AUTHORIZED event
 - [if eapol-over-nl: toggle oper state up]

 * new connection case - PMKSA usage
   - CONNECT event
 - [if !eapol-over-nl: toggle oper state up]
 - initialize 1X state machines/timeouts
   - AUTHORIZED event
 - stop 1X state machine/timeouts
 - [if eapol-over-nl: toggle oper state up]

 * roaming case - no PMKSA (usage)
   - ROAM event
 - [no touching oper state]
 - initialize 1X state machines/timeouts
   - 1X handshake
   - AUTHORIZED event
 - [no touching oper state]

 * roaming case - PMKSA usage
   - ROAM event
 - [no touching oper state]
 - initialize 1X state machines/timeouts
   - AUTHORIZED event
 - stop 1X state machine/timeouts
 - [no touching oper state]


Does this seem reasonable? If possible, I prefer this to just having
the attribute, because with the attribute the roaming case will be
difficult, the natural code flow in the driver would probably make it
end up looking like this:

 * roaming case - no PMKSA (usage)
   - ROAM event
 - [no touching
oper state]
 - initialize 1X state machines/timeouts
   - 1X
handshake
   - CONNECT event w/ PORT_AUTHORIZED
 - [no touching oper
state]

which seems awkward.

Any other thoughts?

johannes

(*) is anyone working on that? I'll throw it on my list if not.