On 9/5/2024 9:46 AM, Baochen Qiang wrote:
> 
> 
> On 9/5/2024 2:03 AM, Jeff Johnson wrote:
>> On 8/16/2024 5:04 AM, James Prestwood wrote:
>>> Hi Baochen,
>>>
>>> On 8/16/24 3:19 AM, Baochen Qiang wrote:
>>>>
>>>> On 7/12/2024 9:11 PM, James Prestwood wrote:
>>>>> Hi,
>>>>>
>>>>> I've seen this error mentioned on random forum posts, but it's always 
>>>>> associated with a kernel crash/warning or some very obvious negative 
>>>>> behavior. I've noticed this occasionally and at one location very 
>>>>> frequently during FT roaming, specifically just after CMD_ASSOCIATE is 
>>>>> issued. For our company-run networks I'm not seeing any negative behavior 
>>>>> apart from a 3 second delay in sending the re-association frame since the 
>>>>> kernel waits for this timeout. But we have some networks our clients run 
>>>>> on that we do not own (different vendor), and we are seeing association 
>>>>> timeouts after this error occurs and in some cases the AP is sending a 
>>>>> deauthentication with reason code 8 instead of replying with a 
>>>>> reassociation reply and an error status, which is quite odd.
>>>>>
>>>>> We are chasing down this with the vendor of these APs as well, but the 
>>>>> behavior always happens after we see this key removal failure/timeout on 
>>>>> the client side. So it would appear there is potentially a problem on 
>>>>> both the client and AP. My guess is _something_ about the re-association 
>>>>> frame changes when this error is encountered, but I cannot see how that 
>>>>> would be the case. We are working to get PCAPs now, but it's through a 3rd 
>>>>> party, so that timing is out of my control.
>>>>>
>>>>> From the kernel code this error would appear innocuous: the old key is 
>>>>> failing to be removed but it gets immediately replaced by the new key. 
>>>>> And we don't see that addition failing. Am I understanding that logic 
>>>>> correctly? I.e. this logic:
>>>>>
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/mac80211/key.c#n503
>>>>>
>>>>> Below are a few kernel logs of the issue happening, some with the deauth 
>>>>> being sent by the AP, some with just timeouts:
>>>>>
>>>>> --- No deauth frame sent, just association timeouts after the error ---
>>>>>
>>>>> Jul 11 00:05:30 kernel: wlan0: disconnect from AP <previous BSS> for new 
>>>>> assoc to <new BSS>
>>>>> Jul 11 00:05:33 kernel: ath10k_pci 0000:02:00.0: failed to install key 
>>>>> for vdev 0 peer <previous BSS>: -110
>>>>> Jul 11 00:05:33 kernel: wlan0: failed to remove key (0, <previous BSS>) 
>>>>> from hardware (-110)
>>>>> Jul 11 00:05:33 kernel: wlan0: associate with <new BSS> (try 1/3)
>>>>> Jul 11 00:05:33 kernel: wlan0: associate with <new BSS> (try 2/3)
>>>>> Jul 11 00:05:33 kernel: wlan0: associate with <new BSS> (try 3/3)
>>>>> Jul 11 00:05:33 kernel: wlan0: association with <new BSS> timed out
>>>>> Jul 11 00:05:36 kernel: wlan0: authenticate with <new BSS>
>>>>> Jul 11 00:05:36 kernel: wlan0: send auth to <new BSS> (try 1/3)
>>>>> Jul 11 00:05:36 kernel: wlan0: authenticated
>>>>> Jul 11 00:05:36 kernel: wlan0: associate with <new BSS> (try 1/3)
>>>>> Jul 11 00:05:36 kernel: wlan0: RX AssocResp from <new BSS> (capab=0x1111 
>>>>> status=0 aid=16)
>>>>> Jul 11 00:05:36 kernel: wlan0: associated
>>>>>
>>>>> --- Deauth frame sent amidst the association timeouts ---
>>>>>
>>>>> Jul 11 00:43:18 kernel: wlan0: disconnect from AP <previous BSS> for new 
>>>>> assoc to <new BSS>
>>>>> Jul 11 00:43:21 kernel: ath10k_pci 0000:02:00.0: failed to install key 
>>>>> for vdev 0 peer <previous BSS>: -110
>>>>> Jul 11 00:43:21 kernel: wlan0: failed to remove key (0, <previous BSS>) 
>>>>> from hardware (-110)
>>>>> Jul 11 00:43:21 kernel: wlan0: associate with <new BSS> (try 1/3)
>>>>> Jul 11 00:43:21 kernel: wlan0: deauthenticated from <new BSS> while 
>>>>> associating (Reason: 8=DISASSOC_STA_HAS_LEFT)
>>>>> Jul 11 00:43:24 kernel: wlan0: authenticate with <new BSS>
>>>>> Jul 11 00:43:24 kernel: wlan0: send auth to <new BSS> (try 1/3)
>>>>> Jul 11 00:43:24 kernel: wlan0: authenticated
>>>>> Jul 11 00:43:24 kernel: wlan0: associate with <new BSS> (try 1/3)
>>>>> Jul 11 00:43:24 kernel: wlan0: RX AssocResp from <new BSS> (capab=0x1111 
>>>>> status=0 aid=101)
>>>>> Jul 11 00:43:24 kernel: wlan0: associated
>>>>>
>>>> Hi James, this is QCA6174, right? Could you also share the firmware version?
>>>
>>> Yep, using:
>>>
>>> qca6174 hw3.2 target 0x05030000 chip_id 0x00340aff sub 1dac:0261
>>> firmware ver WLAN.RM.4.4.1-00288- api 6 features wowlan,ignore-otp,mfp 
>>> crc32 bf907c7c
>>>
>>> I did try in one instance the latest firmware, 309, and still saw the 
>>> same behavior but 288 is what all our devices are running.
>>>
>>> Thanks,
>>>
>>> James
>>
>> Baochen, are you looking more into this? Would prefer to fix the root cause
>> rather than take "[RFC 0/1] wifi: ath10k: improvement on key removal failure"
> I asked the CST team to try to reproduce this issue so that we can get a 
> firmware dump for further debugging. What I heard back is that the CST team 
> is currently busy with other critical schedules and plans to debug this 
> ath10k issue once those are finished.
> 

Jeff, I have been notified that the CST team cannot reproduce this issue.
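
For anyone trying to confirm the same pattern in their own logs: the -110 in the quoted output corresponds to -ETIMEDOUT on common Linux architectures, and the timestamps show a 3 second gap between the roam disconnect and the key removal failure. Below is a minimal Python sketch, not something from this thread, that measures that gap from journal-style kernel log lines. The function name `key_removal_stalls`, the `year` parameter, and the `SAMPLE` text (quoted log lines with the wrapped parts rejoined) are assumptions made for illustration only.

```python
import re
from datetime import datetime

# Lines copied from the first quoted log, with wrapped lines rejoined.
SAMPLE = """\
Jul 11 00:05:30 kernel: wlan0: disconnect from AP <previous BSS> for new assoc to <new BSS>
Jul 11 00:05:33 kernel: ath10k_pci 0000:02:00.0: failed to install key for vdev 0 peer <previous BSS>: -110
Jul 11 00:05:33 kernel: wlan0: failed to remove key (0, <previous BSS>) from hardware (-110)
Jul 11 00:05:33 kernel: wlan0: associate with <new BSS> (try 1/3)
"""

# "Mon DD HH:MM:SS kernel: message" as seen in the quoted logs.
LINE_RE = re.compile(r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2}) kernel: (.*)$")


def key_removal_stalls(text, year=2024):
    """Return the seconds between each roam disconnect and the
    key-removal failure that follows it."""
    stalls = []
    roam_start = None
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        # These timestamps carry no year, so `year` is an assumed input.
        stamp = datetime.strptime(f"{year} {match.group(1)}",
                                  "%Y %b %d %H:%M:%S")
        message = match.group(2)
        if "disconnect from AP" in message:
            roam_start = stamp
        elif "failed to remove key" in message and roam_start is not None:
            stalls.append((stamp - roam_start).total_seconds())
            roam_start = None
    return stalls


if __name__ == "__main__":
    print(key_removal_stalls(SAMPLE))
```

Running this over a longer capture would show whether every failed roam is preceded by the same stall, which may help correlate client logs with AP-side captures.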


