Re: rtl8723bu: low signal, fails to associate
On Wed, Aug 29, 2018 at 04:19:01PM -0400, Jes Sorensen wrote: > On 08/23/2018 09:36 PM, James Cameron wrote: > > G'day Carlo, Mylene, and James, > > > > Thanks for your earlier reports about RTL8723bu. Have you any more > > recent experiences you might share? > > > > I'm evaluating a sample laptop which worked fine with Windows 10, but > > not very well with Ubuntu 18.04, and kernel v4.15 or kernel v4.18.4. > > > > The laptop is by Hena, model NT16-PRO-C-E, with a wireless device on > > internal USB (0x0bda:0xb720) which loads rtl8xxxu, identifying as > > RTL8723BU. > > > > http://dev.laptop.org/~quozl/z/1ft0qv.txt (dmesg) > > > > Symptoms are low RSSI on scan, very short range, and often a failure > > to associate over a distance of two metres in a radio quiet location. > > > > Symptoms began after first power off, which suggests that device > > registers programmed by the previous operating system Windows 10 had > > not been reset by reboot into the Ubuntu 18.04 installer. The device > > worked fine in Ubuntu 18.04 before the first power off. > > > > Jes, let me know if there is anything I can do to help. > > It's been a while since I had time to look at the 8723bu support, and > rtl8xxxu doesn't have BT coexist support. I notice that your laptop does > load the bluetooth module for 8723bu which I believe fiddles with the > antenna configuration and is likely to take control of the antennas. I > suspect this is why you see low signal quality on the WiFi side. > > If you blacklist the BT module, does it work better? Thanks. No, it doesn't work any better, or worse. Method: add "blacklist btusb" to /etc/modprobe.d/blacklist.conf, regenerate initramfs, and boot. 
Also, a difference in symptom between cold and warm boot;

* on cold boot after 15 seconds of power off, device connects but has
  short range, with received power at monitor of -76dBm,

  http://dev.laptop.org/~quozl/z/1fv7sW.txt (dmesg, cold boot, no btusb)

* on warm boot with less than 5 seconds of power off, device does not
  connect, dmesg "authentication with xx:yy:zz:aa:bb:cc timed out",
  and probe request, authentication, and association packets are not
  seen by monitor,

  http://dev.laptop.org/~quozl/z/1fv8IO.txt (dmesg, warm boot, no btusb)

In both cases, scan results are normal, and similar signal level. An
active scan for cold boot, and a passive scan for warm boot.

Monitor device is an ath9k about 30cm away; a radio quiet environment
on a farm.

Speculation; device registers are not being reset. Would not have
been a problem for RTL8723BU on removable USB.

-- 
James Cameron
http://quozl.netrek.org/
rtl8723bu: low signal, fails to associate
G'day Carlo, Mylene, and James, Thanks for your earlier reports about RTL8723bu. Have you any more recent experiences you might share? I'm evaluating a sample laptop which worked fine with Windows 10, but not very well with Ubuntu 18.04, and kernel v4.15 or kernel v4.18.4. The laptop is by Hena, model NT16-PRO-C-E, with a wireless device on internal USB (0x0bda:0xb720) which loads rtl8xxxu, identifying as RTL8723BU. http://dev.laptop.org/~quozl/z/1ft0qv.txt (dmesg) Symptoms are low RSSI on scan, very short range, and often a failure to associate over a distance of two metres in a radio quiet location. Symptoms began after first power off, which suggests that device registers programmed by the previous operating system Windows 10 had not been reset by reboot into the Ubuntu 18.04 installer. The device worked fine in Ubuntu 18.04 before the first power off. Jes, let me know if there is anything I can do to help. -- James Cameron http://quozl.netrek.org/
Re: iwlwifi intermittent beacon capture in monitor mode?
G'day Tyler,

I've seen that kind of behaviour when there are multiple APs with the
same beacon timing, and one or more APs are not backing off. In my
case the beacons were colliding. The times without beacons followed a
regular pattern; based on the variance in CPU oscillator clocks of the
APs. Cooling or heating an AP changed the pattern.

Behaviour also varied across cards; RF sensitivity of a batch of cards
follows a statistical normal distribution, with a bit of warping
caused by manufacturing test rejects.

Have you access to a spectrum analyser? You might check what
transmissions are happening at the same time, on or near 2.457 GHz.

Can you exclude all other APs, e.g. by placing the devices inside a
disconnected microwave oven?

Can you monitor the current of the card with a digital storage
oscilloscope?

Can you watch the beacons with an RF probe and an oscilloscope?
Simplest probe is a diode (axial, bandoleer) with leads cut for a
multiple of 2.457 GHz held in oscilloscope probes within an inch or so
of the card antenna.

With both these last two tests, you may see dips corresponding to
beacon transmissions. If they stop, you know you have a firmware or
software problem.

-- 
James Cameron
http://quozl.netrek.org/
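For the probe lead lengths mentioned above, here is a rough sketch of the arithmetic. It assumes the frequency of interest is 2.457 GHz (channel 10 in the 2.4 GHz band) and free-space propagation. The function names are made up for illustration, and a real probe would also be affected by the diode body and the way the leads are held.

```c
#include <stdio.h>

/* Speed of light in vacuum, metres per second (exact by definition). */
#define SPEED_OF_LIGHT_M_S 299792458.0

/* Hypothetical helper: free-space wavelength in millimetres. */
double wavelength_mm(double freq_hz)
{
    return SPEED_OF_LIGHT_M_S / freq_hz * 1000.0;
}

/* Print full, half, and quarter wavelength for a given frequency. */
void print_probe_lengths(double freq_hz)
{
    double full = wavelength_mm(freq_hz);

    printf("full %.1f mm, half %.1f mm, quarter %.1f mm\n",
           full, full / 2.0, full / 4.0);
}
```

At 2.457 GHz the full wavelength comes to roughly 122 mm, so a quarter-wave lead would be roughly 30.5 mm. Treat these as starting points for trimming, not as exact dimensions.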
Re: [PATCH] rtlwifi: rtl8723be: Fix loss of signal
On Thu, Feb 22, 2018 at 02:28:59PM -0600, Larry Finger wrote: > In commit c713fb071edc ("rtlwifi: rtl8821ae: Fix connection lost problem > correctly") a problem in rtl8821ae that caused loss of signal was fixed. > That same problem has now been reported for rtl8723be. Accordingly, > the ASPM L1 latency has been increased from 0 to 7 to fix the instability. > > Signed-off-by: Larry Finger <larry.fin...@lwfinger.net> > Cc: Stable <sta...@vger.kernel.org> Tested-by: James Cameron <qu...@laptop.org> With both patches applied to v4.15 on OLPC NL3 with rtl8723be. Nice catch, well done! May explain some of our problems with rtl8723be that made me withdraw it from production of laptops. -- James Cameron http://quozl.netrek.org/
Re: [PATCH v2] ath9k: mark RSSI as invalid if frame received during channel setup
On Thu, Feb 15, 2018 at 08:52:53AM +, Jean Pierre TOSONI wrote: > > -Message d'origine- > > De : qu...@laptop.org [mailto:qu...@laptop.org] > > Envoyé : jeudi 15 février 2018 08:21 > > À : Kalle Valo > > Cc : Jean Pierre TOSONI; linux-wireless@vger.kernel.org; ath9k- > > de...@qca.qualcomm.com > > Objet : Re: [PATCH v2] ath9k: mark RSSI as invalid if frame received > > during channel setup > > > > On Thu, Feb 15, 2018 at 07:51:28AM +0200, Kalle Valo wrote: > > > James Cameron <qu...@laptop.org> writes: > > > > > >> On Wed, Feb 14, 2018 at 04:26:42PM +, Jean Pierre TOSONI > > wrote: > > >>> ath9k returns a wrong RSSI value for frames received > > >>> in a 30ms time window after a channel change. The > > >>> correct value is typically 10dB below the returned value. > > >> > > >> How was your correct value determined? > > >> > > 1) test setup: > Connecting the AP through coax and attenuators, then making 500 passive scans > off-channel, then drawing an histogram of the beacon signals found by the > chip. The off-channel period is 108 ms. The probability of being in the 30 ms > window is 28%. The histogram shows 2 spikes, one large with the expected > value, one small at around +10dB above. > > 2) value determination > Adjust the delay (CONFIG_HZ=250) by trial and error. 25ms was not enough to > completely absorb the +10dB spike in the histogram, while 30ms was enough. > > Do you think of a better approach? No, I think your approach is fine. I was curious. Thanks for explaining. > Maybe the guys at Qualcomm know the correct value? Yes, that seems likely. > > >>> This was found with a Atheros AR9300 Rev:3 chip (WLE350NX / > > >>> JWX6083 cards), during offchannel scans. > > >>> > > >>> Mark the signal value as invalid in this case. > > >> > > >> Why not adjust by 10dB? > > I considered that also. But, > 1) during how much time should I do this adjustment? Around 30 ms after > channel switch? Yes. 
If RSSI is so critical for your application, you'll do what you can to get a real RSSI rather than drop it. > 2) The histogram shows a scattering of the measures in a +/- 3dB range around > the mean value. Perhaps a sampling error by the device. > So I could not decide for sure if it needed -9dB, -10dB or -11dB? > > > >> > > >> Speculating: in a typical card, RSSI is calculated by firmware > > from > > >> readings of ADCs attached to the receiver. Firmware may average > > >> several readings. Firmware may apply other offsets or > > calibrations, > > >> based on frequency and temperature. This sounds like a firmware > > >> problem. > > > > > > ath9k does not have firmware, only ath9k_htc has it. > > > > Heh. s/firmware/silicon implementation/g > > Oh well, if it's silicon problem, then it's a hardware problem, and > I am right to correct it that way, since there is no other way :-) Yes, if it can be reproduced by every ath9k. -- James Cameron http://quozl.netrek.org/
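To make the "adjust by 10dB" alternative concrete, here is a minimal sketch in plain C. It is not the ath9k patch, which marks the signal as invalid instead. The 30 ms window, the 10 dB offset, and the 108 ms off-channel period come from the thread; all function and macro names are hypothetical. Given the +/- 3dB scatter reported above, a fixed offset would only be approximate.

```c
#include <stdbool.h>

/* Values taken from the thread; measured on one AR9300 test setup. */
#define SETTLE_WINDOW_MS 30 /* time after channel change with wrong RSSI */
#define RSSI_OFFSET_DB   10 /* typical over-read inside that window */

/* Hypothetical helper: true if a frame arrived inside the settle window. */
bool in_settle_window(unsigned int ms_since_channel_change)
{
    return ms_since_channel_change < SETTLE_WINDOW_MS;
}

/* Alternative to marking the signal invalid: subtract the typical offset. */
int corrected_rssi_dbm(int reported_dbm, unsigned int ms_since_channel_change)
{
    if (in_settle_window(ms_since_channel_change))
        return reported_dbm - RSSI_OFFSET_DB;
    return reported_dbm;
}

/* Fraction of an off-channel dwell that falls inside the settle window. */
double window_fraction(double window_ms, double dwell_ms)
{
    return window_ms / dwell_ms;
}
```

With a 30 ms window in a 108 ms off-channel period, window_fraction() gives about 0.28, which agrees with the 28% quoted for the histogram test.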
Re: [PATCH v2] ath9k: mark RSSI as invalid if frame received during channel setup
On Thu, Feb 15, 2018 at 07:51:28AM +0200, Kalle Valo wrote: > James Cameron <qu...@laptop.org> writes: > > > On Wed, Feb 14, 2018 at 04:26:42PM +, Jean Pierre TOSONI wrote: > >> ath9k returns a wrong RSSI value for frames received in a 30ms time > >> window after a channel change. The correct value is typically 10dB > >> below the returned value. > > > > How was your correct value determined? > > > >> This was found with a Atheros AR9300 Rev:3 chip (WLE350NX / JWX6083 > >> cards), during offchannel scans. > >> > >> Mark the signal value as invalid in this case. > > > > Why not adjust by 10dB? > > > > Speculating: in a typical card, RSSI is calculated by firmware from > > readings of ADCs attached to the receiver. Firmware may average > > several readings. Firmware may apply other offsets or calibrations, > > based on frequency and temperature. This sounds like a firmware > > problem. > > ath9k does not have firmware, only ath9k_htc has it. Heh. s/firmware/silicon implementation/g -- James Cameron http://quozl.netrek.org/
Re: [PATCH v2] ath9k: mark RSSI as invalid if frame received during channel setup
On Wed, Feb 14, 2018 at 04:26:42PM +, Jean Pierre TOSONI wrote:
> ath9k returns a wrong RSSI value for frames received in a 30ms time
> window after a channel change. The correct value is typically 10dB
> below the returned value.

How was your correct value determined?

> This was found with a Atheros AR9300 Rev:3 chip (WLE350NX / JWX6083
> cards), during offchannel scans.
>
> Mark the signal value as invalid in this case.

Why not adjust by 10dB?

Speculating: in a typical card, RSSI is calculated by firmware from
readings of ADCs attached to the receiver. Firmware may average
several readings. Firmware may apply other offsets or calibrations,
based on frequency and temperature. This sounds like a firmware
problem.

> Signed-off-by: Jean Pierre TOSONI <jp.tos...@acksys.fr>
> [...]

-- 
James Cameron
http://quozl.netrek.org/
Re: [PATCH] rtlwifi: rtl8821ae: Fix connection lost problem correctly
On Mon, Feb 05, 2018 at 12:38:11PM -0600, Larry Finger wrote: > There has been a coding error in rtl8821ae since it was first introduced, > namely that an 8-bit register was read using a 16-bit read in > _rtl8821ae_dbi_read(). This error was fixed with commit 40b368af4b75 > ("rtlwifi: Fix alignment issues"); however, this change led to > instability in the connection. To restore stability, this change > was reverted in commit b8b8b16352cd ("rtlwifi: rtl8821ae: Fix connection > lost problem"). > > Unfortunately, the unaligned access causes machine checks in ARM > architecture, and we were finally forced to find the actual cause of the > problem on x86 platforms. Following a suggestion from Pkshih > <pks...@realtek.com>, it was found that increasing the ASPM L1 > latency from 0 to 7 fixed the instability. This parameter was varied to > see if a smaller value would work; however, it appears that 7 is the > safest value. A new symbol is defined for this quantity, thus it can be > easily changed if necessary. > > Fixes: b8b8b16352cd ("rtlwifi: rtl8821ae: Fix connection lost problem") > Cc: Stable <sta...@vger.kernel.org> # 4.14+ > Fix-suggested-by: Pkshih <pks...@realtek.com> > Signed-off-by: Larry Finger <larry.fin...@lwfinger.net> Tested-by: James Cameron <qu...@laptop.org> # x86_64 OLPC NL3 Thanks Larry & Pkshih, this does work as well as it did before. -- James Cameron http://quozl.netrek.org/
Re: rtl8821ae keep alive not set, connection lost
On Wed, Jan 31, 2018 at 11:06:12AM -0600, Larry Finger wrote: > On 09/12/2017 05:09 PM, James Cameron wrote: > >Summary: 40b368af4b75 ("rtlwifi: Fix alignment issues") breaks > >rtl8821ae keep alive, causing "Connection to AP lost" and deauth, > >but why? > > > >Wireless connection is lost after a few seconds or minutes, on > >every OLPC NL3 laptop with rtl8821ae, with any stable kernel after > >4.10.1, and any kernel with 40b368af4b75. > > > >dmesg contains > > > > wlp2s0: Connection to AP 2c:b0:5d:a6:86:eb lost > > > >iw event shows > > > > wlp2s0: del station 2c:b0:5d:a6:86:eb > > wlp2s0 (phy #0): deauth 74:c6:3b:09:b5:0d -> 2c:b0:5d:a6:86:eb reason 4: > > Disassociated due to inactivity > > wlp2s0 (phy #0): disconnected (local request) > > > >Workaround is to bounce the link, then reconnect; > > > > ip link set wlp2s0 down > > ip link set wlp2s0 up > > iw dev wlp2s0 connect qz > > > >A nearby monitor host captures a deauthentication packet sent by > >the device. > > > >Bisection showed cause is 40b368af4b75 ("rtlwifi: Fix alignment > >issues") which changes the width of DBI register read. > > > >On the face of it, 40b368af4b75 looks correct, especially compared > >against same function in rtl8723be. > > > >I've no idea why reverting fixes the problem. I'm hoping someone > >here might speculate and suggest ways to test. > > > >As keep alive is set through this path, my guess is that keep alive > >is not being set in the device. Or perhaps reading 16-bits > >perturbs another register. Is there a way to test? > > > >http://dev.laptop.org/~quozl/z/1drtGD.txt dmesg of 4.13 > > > >http://dev.laptop.org/~quozl/z/1drt7c.txt dmesg with 4.13 and > >revert of 40b368af4b75 > > James, > > I'm afraid we are needing to revisit this problem again. Changing > that 8-bit read to a 16-bit version causes an unaligned memory > reference in AARCH64, thus we will need to re-revert. 
To prevent > problems on systems such as yours, PK plans to turn off ASPM > capability and backdoor in certain platforms that will be listed in > a quirks table. Please report the output of 'dmidecode -t system' > for you affected system(s). Thanks for letting me know. We made three production runs, and I'm waiting to get a hold of the dmidecode for two of them. This may take some weeks; we have to find stock and ship it, or we have to ask our contract manufacturer (CM) if they have kept data or units. I've dmidecode for one production run. http://dev.laptop.org/~quozl/z/1eh7JF.txt (my unit nl3-e) I've dmidecode for prototypes, but they have clearly been programmed badly. We did not ask our CM for Windows compatibility, so they may have had no step to verify the data. We also went through several iterations to get serial numbers assigned, so the data I have does not have good provenance. http://dev.laptop.org/~quozl/z/1eh7EE.txt (my unit nl3-c) http://dev.laptop.org/~quozl/z/1eh7EV.txt (my unit nl3-d) http://dev.laptop.org/~quozl/z/1eh7He.txt (my unit nl3-a) http://dev.laptop.org/~quozl/z/1eh8DR.txt (my unit nl3-b) > We hope you will be able to test any proposed patches. Yes, can do. I've just tested v4.15. However, I'm concerned about your plan to use quirks; 1. turning off ASPM may decrease run time on battery, which if it is significant, across several thousand laptops will yield generator fuel or solar budget failure; can the power impact be quantified? 2. why not keep ASPM enabled, and use 8-bit when quirked, or on x86_64, or when not AARCH64? 3. why not find the underlying problem; PK is in the same company as the device firmware engineers, so it should be possible for them to find out why 16-bit access causes the device firmware to hang? We drew a blank trying to reach firmware engineers through our CM and module maker; perhaps we were not large or noisy enough. 4. 
it's not just me; there are others who have reported similar problems, so won't re-reverting affect them? They haven't engaged in the process as thoroughly, and may not be in the quirks table. You also reproduced the problem with different hardware. > Thanks, > > Larry -- James Cameron http://quozl.netrek.org/
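The 8-bit versus 16-bit read question can be pictured with a small userspace model. This is only a sketch: the byte array stands in for the device register space, the function names are hypothetical, and it is not the rtlwifi DBI code. It shows two things discussed in the thread: a 16-bit read starting at an 8-bit register also covers the neighbouring register, and at an odd offset it is not naturally aligned.

```c
#include <stdint.h>

/* Read one byte at a byte offset in a simulated register space. */
uint8_t model_read8(const uint8_t *regs, unsigned int offset)
{
    return regs[offset];
}

/* Read a little-endian 16-bit value starting at the same offset.
 * This also touches regs[offset + 1], the neighbouring register. */
uint16_t model_read16(const uint8_t *regs, unsigned int offset)
{
    return (uint16_t)(regs[offset] | ((uint16_t)regs[offset + 1] << 8));
}

/* True when a 16-bit access at this offset is not naturally aligned. */
int model_is_unaligned16(unsigned int offset)
{
    return (offset & 1u) != 0;
}
```

Whether the wider read has any side effect on the real device is exactly the open question above; the model only shows that the two access widths do not cover the same bytes.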
Re: [PATCH v3 1/3] mwifiex: refactor device dump code to make it generic for usb interface
- int drv_info_size);
> +void mwifiex_drv_info_dump(struct mwifiex_adapter *adapter);
> +void mwifiex_prepare_fw_dump_info(struct mwifiex_adapter *adapter);
> +void mwifiex_upload_device_dump(struct mwifiex_adapter *adapter);
>  void *mwifiex_alloc_dma_align_buf(int rx_len, gfp_t flags);
>  void mwifiex_queue_main_work(struct mwifiex_adapter *adapter);
>  int mwifiex_get_wakeup_reason(struct mwifiex_private *priv, u16 action,
> diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
> index cd31494..f666cb2 100644
> --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
> +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
> @@ -2769,12 +2769,17 @@ static void mwifiex_pcie_fw_dump(struct mwifiex_adapter *adapter)
>
>  static void mwifiex_pcie_device_dump_work(struct mwifiex_adapter *adapter)
>  {
> -       int drv_info_size;
> -       void *drv_info;
> +       adapter->devdump_data = vzalloc(MWIFIEX_FW_DUMP_SIZE);
> +       if (!adapter->devdump_data) {
> +               mwifiex_dbg(adapter, ERROR,
> +                           "vzalloc devdump data failure!\n");
> +               return;
> +       }
>
> -       drv_info_size = mwifiex_drv_info_dump(adapter, &drv_info);
> +       mwifiex_drv_info_dump(adapter);
>         mwifiex_pcie_fw_dump(adapter);
> -       mwifiex_upload_device_dump(adapter, drv_info, drv_info_size);
> +       mwifiex_prepare_fw_dump_info(adapter);
> +       mwifiex_upload_device_dump(adapter);
>  }
>
>  static void mwifiex_pcie_card_reset_work(struct mwifiex_adapter *adapter)
> diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
> index fd5183c..a828801 100644
> --- a/drivers/net/wireless/marvell/mwifiex/sdio.c
> +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
> @@ -2505,15 +2505,21 @@ static void mwifiex_sdio_generic_fw_dump(struct mwifiex_adapter *adapter)
>  static void mwifiex_sdio_device_dump_work(struct mwifiex_adapter *adapter)
>  {
>         struct sdio_mmc_card *card = adapter->card;
> -       int drv_info_size;
> -       void *drv_info;
>
> -       drv_info_size = mwifiex_drv_info_dump(adapter, &drv_info);
> +       adapter->devdump_data = vzalloc(MWIFIEX_FW_DUMP_SIZE);
> +       if (!adapter->devdump_data) {
> +               mwifiex_dbg(adapter, ERROR,
> +                           "vzalloc devdump data failure!\n");
> +               return;
> +       }
> +
> +       mwifiex_drv_info_dump(adapter);
>         if (card->fw_dump_enh)
>                 mwifiex_sdio_generic_fw_dump(adapter);
>         else
>                 mwifiex_sdio_fw_dump(adapter);
> -       mwifiex_upload_device_dump(adapter, drv_info, drv_info_size);
> +       mwifiex_prepare_fw_dump_info(adapter);
> +       mwifiex_upload_device_dump(adapter);
>  }
>
>  static void mwifiex_sdio_work(struct work_struct *work)
> --
> 1.9.1
>

-- 
James Cameron
http://quozl.netrek.org/
Re: rtl8723be on Fedora27
On Tue, Nov 21, 2017 at 09:52:12PM +0100, Rákosi Gergely wrote: > 2017-11-21 21:37 keltezéssel, James Cameron írta: > > On Tue, Nov 21, 2017 at 03:08:16PM +0100, Rákosi Gergely wrote: > >> 2017-11-18 16:52 keltezéssel, Larry Finger írta: > >>> On 11/17/2017 06:22 PM, Rákosi Gergely wrote: > >>>> Hello Larry, > >>>> > >>>> First of all, thanks your help. > >>>> Lets see...here is the kernel version: 4.13.12-300 > >>>> The machine is an Asus ROG 553VE > >>>> > >>>> The firmware which loading in the dmesg is : rtlwifi/rtl8723befw_36.bin > >>>> The output of md5sum is : 1850c1308fbcd95e9f6a7f58ede1e35f > >>> [...] > >>> sudo iw dev wlan0 scan | egrep "SSID|signal" > >>> > >>> Post that output. In addition, copy the dmesg output to some pastebin > >>> and post the link as well. > >>> > >>> Larry > >>> > >> Hello Larry, > >> > >> I hope this email post format is good, and fit to the rules. > >> Here is the output: > >> > >> root@skynet-x2 ~]# iw dev wlp2s0 scan | egrep "SSID|signal" > >> signal: -46.00 dBm > >> SSID: SKYNET-X2 > >> [...] > >> [root@skynet-x2 ~]# > > Scan results seem normal. Was this scan before disconnect? > > Yes, this command output taken while the connection was OK Thanks. > >> And the dmesg output: > >> > >> https://pastebin.com/iqQSu2hD > > Is now v4.13.13. > > Yes, thats the upgraded kernel > > > This is interesting, an H2C command was dropped, but no idea which. > > > > [9.848052] rtl8723be: error H2C cmd because of Fw download fail!!! > > > > Disconnection happened at boot+440 seconds, associate+429 seconds; > > > > [ 439.871033] rtlwifi: AP off, try to reconnect now > > [ 439.871093] wlp2s0: Connection to AP 4c:5e:0c:c7:fa:e3 lost > > > > I cannot tell what causes disconnect. I wonder if the same > > timing of the problem happens always, or if the timing varies. > > The timing always changing, never is the same. And I dont realize > the cause at now... Thanks. I had similar problem with different wireless device. 
> > Init MAC failed was another 30 seconds later; > > > > [ 468.600670] IPv6: ADDRCONF(NETDEV_UP): wlp2s0: link is not ready > > [ 469.618926] rtl8723be: Init MAC failed > > > > Looking at _rtl8723be_init_mac, there are two false returns; > > hardware power on fail, and llt write fail. > > > > Sorry, I don't have rtl8723be hardware. > > > > Rákosi, did any older kernel keep connection? > > The oldest kernel is 4.13.9-300.fc27.x86_64 in Fedora 27 > installation, but did the same. If you can give advise how, and > what I do, then I'll try it. You might try running "Live CD" of Fedora 26 or Fedora 25, without installing, to test if unexpected disconnection happens in two older kernels. Not for permanent solution, just for easy testing. I don't have specific advice for Fedora 27, you might ask Fedora community about that, or use RHBZ. A quick search finds this old 2013 page; https://fedoraproject.org/wiki/User:Ignatenkobrain/Kernel/Bisection -- James Cameron http://quozl.netrek.org/
Re: rtl8723be on Fedora27
On Tue, Nov 21, 2017 at 03:08:16PM +0100, Rákosi Gergely wrote: > 2017-11-18 16:52 keltezéssel, Larry Finger írta: > > On 11/17/2017 06:22 PM, Rákosi Gergely wrote: > >> Hello Larry, > >> > >> First of all, thanks your help. > >> Lets see...here is the kernel version: 4.13.12-300 > >> The machine is an Asus ROG 553VE > >> > >> The firmware which loading in the dmesg is : rtlwifi/rtl8723befw_36.bin > >> The output of md5sum is : 1850c1308fbcd95e9f6a7f58ede1e35f > > > > [...] > > sudo iw dev wlan0 scan | egrep "SSID|signal" > > > > Post that output. In addition, copy the dmesg output to some pastebin > > and post the link as well. > > > > Larry > > > Hello Larry, > > I hope this email post format is good, and fit to the rules. > Here is the output: > > root@skynet-x2 ~]# iw dev wlp2s0 scan | egrep "SSID|signal" > signal: -46.00 dBm > SSID: SKYNET-X2 > [...] > [root@skynet-x2 ~]# Scan results seem normal. Was this scan before disconnect? > And the dmesg output: > > https://pastebin.com/iqQSu2hD Is now v4.13.13. This is interesting, an H2C command was dropped, but no idea which. [9.848052] rtl8723be: error H2C cmd because of Fw download fail!!! Disconnection happened at boot+440 seconds, associate+429 seconds; [ 439.871033] rtlwifi: AP off, try to reconnect now [ 439.871093] wlp2s0: Connection to AP 4c:5e:0c:c7:fa:e3 lost I cannot tell what causes disconnect. I wonder if the same timing of the problem happens always, or if the timing varies. Init MAC failed was another 30 seconds later; [ 468.600670] IPv6: ADDRCONF(NETDEV_UP): wlp2s0: link is not ready [ 469.618926] rtl8723be: Init MAC failed Looking at _rtl8723be_init_mac, there are two false returns; hardware power on fail, and llt write fail. Sorry, I don't have rtl8723be hardware. Rákosi, did any older kernel keep connection? -- James Cameron http://quozl.netrek.org/
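The intervals quoted from dmesg above can be checked mechanically. The following is a small sketch that parses the bracketed timestamp at the start of a dmesg line and computes the gap between two lines. The helper names are hypothetical, and it assumes the default dmesg format with seconds since boot in square brackets.

```c
#include <stdio.h>

/* Hypothetical helper: parse the "[ seconds.micros]" prefix of a dmesg
 * line. Returns the timestamp in seconds, or -1.0 if no prefix is found. */
double dmesg_seconds(const char *line)
{
    double t;

    if (sscanf(line, " [%lf]", &t) == 1)
        return t;
    return -1.0;
}

/* Seconds elapsed between two dmesg lines, earlier line first. */
double dmesg_gap_seconds(const char *earlier, const char *later)
{
    return dmesg_seconds(later) - dmesg_seconds(earlier);
}
```

Applied to the "Connection to AP ... lost" line at 439.871033 and the "Init MAC failed" line at 469.618926, the gap is about 29.7 seconds, consistent with "another 30 seconds later".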
Re: [PATCH v2] mwifiex: do not support change AP interface to station mode
On Tue, Nov 21, 2017 at 08:03:35PM +0800, Xinming Hu wrote:
> Firmware do not support change interface from micro-ap mode to
> station mode, forbidden this operation in driver accordingly.

"forbidden" should be "forbid", for correct tense.

"in driver" is redundant and can be removed.

"accordingly" is also redundant.

Perhaps "Firmware do not support change interface from micro-ap mode
to station mode, forbid this operation."

> Signed-off-by: Cathy Luo <c...@marvell.com>
> Signed-off-by: Xinming Hu <h...@marvell.com>
> ---
> v2: remove unnecessary sta/uap combo check(James Cameron)
>
>  drivers/net/wireless/marvell/mwifiex/cfg80211.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/cfg80211.c b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
> index 6e0d9a9..4d45df8 100644
> --- a/drivers/net/wireless/marvell/mwifiex/cfg80211.c
> +++ b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
> @@ -1180,7 +1180,6 @@ static int mwifiex_deinit_priv_params(struct mwifiex_private *priv)
>         case NL80211_IFTYPE_AP:
>                 switch (type) {
>                 case NL80211_IFTYPE_ADHOC:

Is changing the interface type from micro-ap to adhoc supported?

> -               case NL80211_IFTYPE_STATION:
>                         return mwifiex_change_vif_to_sta_adhoc(dev, curr_iftype,
>                                                                type, params);
>                         break;
> --
> 1.9.1
>

-- 
James Cameron
http://quozl.netrek.org/
Re: [EXT] Re: [PATCH] mwifiex: do not support change AP interface to station mode
On Tue, Nov 21, 2017 at 12:03:19PM +, Xinming Hu wrote: > Hi James, > > > -Original Message- > > From: qu...@laptop.org [mailto:qu...@laptop.org] > > Sent: 2017年11月21日 16:04 > > To: Xinming Hu <h...@marvell.com> > > Cc: Linux Wireless <linux-wireless@vger.kernel.org>; Kalle Valo > > <kv...@codeaurora.org>; Brian Norris <briannor...@chromium.org>; Dmitry > > Torokhov <d...@google.com>; raja...@google.com; Zhiyuan Yang > > <yan...@marvell.com>; Tim Song <song...@marvell.com>; Cathy Luo > > <c...@marvell.com>; James Cao <j...@marvell.com>; Ganapathi Bhat > > <gb...@marvell.com>; Ellie Reeves <ellierev...@gmail.com> > > Subject: [EXT] Re: [PATCH] mwifiex: do not support change AP interface to > > station mode > > > > External Email > > > > -- > > On Tue, Nov 21, 2017 at 03:24:03PM +0800, Xinming Hu wrote: > > > Firmware do not support change interface from micro-ap mode to station > > > mode, forbidden this operation in driver accordingly. > > > > All firmware or specific versions? > > > > This property result from the initial design consideration in > firmware. Thanks. I maintain a product that uses your MV8787 device with firmware sd8787_uapsta.bin and review mwifiex patches for local backport. 
> > > > > > > Signed-off-by: Cathy Luo <c...@marvell.com> > > > Signed-off-by: Xinming Hu <h...@marvell.com> > > > --- > > > drivers/net/wireless/marvell/mwifiex/cfg80211.c | 6 ++ > > > 1 file changed, 6 insertions(+) > > > > > > diff --git a/drivers/net/wireless/marvell/mwifiex/cfg80211.c > > > b/drivers/net/wireless/marvell/mwifiex/cfg80211.c > > > index 6e0d9a9..a87758f 100644 > > > --- a/drivers/net/wireless/marvell/mwifiex/cfg80211.c > > > +++ b/drivers/net/wireless/marvell/mwifiex/cfg80211.c > > > @@ -1181,6 +1181,12 @@ static int mwifiex_deinit_priv_params(struct > > mwifiex_private *priv) > > > switch (type) { > > > case NL80211_IFTYPE_ADHOC: > > > case NL80211_IFTYPE_STATION: > > > + if (mwifiex_get_priv_by_id(priv->adapter, priv->bss_num, > > > +MWIFIEX_BSS_TYPE_STA)){ > > > > Is this test necessary? > > Hhn, yes, Will remove this check, which comes from a fix for combo sta/uap > case. > Thanks for the suggestion. > > > > > dev->ieee80211_ptr->iftype is always NL80211_IFTYPE_AP at this point. > > > > > + mwifiex_dbg(priv->adapter, INFO, > > > + "Skip change virtual interface\n"); > > > > Is this message easy to understand? Other messages in the same function > > seem easier; e.g. "%s: changing to %d not supported\n" > > OK. > > > > > > + return 0; > > > > Should this be -EOPNOTSUPP rather than 0? > > Yes. > > > > > > + } > > > return mwifiex_change_vif_to_sta_adhoc(dev, curr_iftype, > > > type, params); > > > break; > > > -- > > > 1.9.1 > > > > > > > -- > > James Cameron > > http://quozl.netrek.org/ -- James Cameron http://quozl.netrek.org/
Re: [PATCH] mwifiex: do not support change AP interface to station mode
On Tue, Nov 21, 2017 at 03:24:03PM +0800, Xinming Hu wrote:
> Firmware do not support change interface from micro-ap mode to
> station mode, forbidden this operation in driver accordingly.

All firmware or specific versions?

>
> Signed-off-by: Cathy Luo <c...@marvell.com>
> Signed-off-by: Xinming Hu <h...@marvell.com>
> ---
>  drivers/net/wireless/marvell/mwifiex/cfg80211.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/cfg80211.c b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
> index 6e0d9a9..a87758f 100644
> --- a/drivers/net/wireless/marvell/mwifiex/cfg80211.c
> +++ b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
> @@ -1181,6 +1181,12 @@ static int mwifiex_deinit_priv_params(struct mwifiex_private *priv)
>                 switch (type) {
>                 case NL80211_IFTYPE_ADHOC:
>                 case NL80211_IFTYPE_STATION:
> +                       if (mwifiex_get_priv_by_id(priv->adapter, priv->bss_num,
> +                                                  MWIFIEX_BSS_TYPE_STA)){

Is this test necessary?

dev->ieee80211_ptr->iftype is always NL80211_IFTYPE_AP at this point.

> +                               mwifiex_dbg(priv->adapter, INFO,
> +                                           "Skip change virtual interface\n");

Is this message easy to understand? Other messages in the same
function seem easier; e.g. "%s: changing to %d not supported\n"

> +                               return 0;

Should this be -EOPNOTSUPP rather than 0?

> +                       }
>                         return mwifiex_change_vif_to_sta_adhoc(dev, curr_iftype,
>                                                                type, params);
>                         break;
> --
> 1.9.1
>

-- 
James Cameron
http://quozl.netrek.org/
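The point of the -EOPNOTSUPP question can be shown with a small standalone sketch. The enum and function below are hypothetical stand-ins, not the mwifiex or nl80211 definitions. The idea is that an unsupported transition should return a negative errno so the caller does not assume the interface type changed.

```c
#include <errno.h>

/* Hypothetical stand-ins for the interface types named in the patch. */
enum sketch_iftype {
    SKETCH_IFTYPE_ADHOC,
    SKETCH_IFTYPE_STATION,
    SKETCH_IFTYPE_AP,
};

/* Sketch of the review suggestion: when the current interface is an AP and
 * the firmware cannot switch it to station mode, report an error instead of
 * returning 0. */
int sketch_change_iftype(enum sketch_iftype current, enum sketch_iftype wanted)
{
    if (current == SKETCH_IFTYPE_AP && wanted == SKETCH_IFTYPE_STATION)
        return -EOPNOTSUPP;
    return 0; /* other transitions are out of scope for this sketch */
}
```

Returning 0 here would tell the caller that the change succeeded even though the interface is still an AP, which is why the review asks for an error code.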
Re: rtl8821ae dbi read question
On Sun, Nov 05, 2017 at 04:15:36PM -0500, Nik Nyby wrote: > I also want to note that adding rtl8821ae.aspm=0 to my grub kernel > boot command doesn't fix my problem. (I'm building this driver into > the kernel, not as a module). My connection dropping problem is > fixed only if I comment out the aspm init code in the driver, per > this patch: > https://patchwork.kernel.org/patch/9951511/ Interesting; that patch was for testing, and did more than just aspm=0. Check for a BIOS update from your vendor. Device registers can be configured and persist despite warm reboot, which implies there is no register reset in driver start or device reset, which further implies that the BIOS can easily affect the outcome. Check also for different behaviour after cold reboot; that is a power down for 30 seconds then power up. Do you yet have a known working kernel to bisect against? -- James Cameron http://quozl.netrek.org/
Re: RTL usb adapter question
On Fri, Oct 27, 2017 at 10:23:54PM -0500, David Ashley wrote: > On 10/26/17, James Cameron <qu...@laptop.org> wrote: > > Interesting, thanks. It should be a QFN 46 pin chip; you may have > > counted 15 instead of 14 pins on the long edge. Send me a photograph > > of the inside, off-list? > > I uploaded a couple of pictures here: > http://www.linuxmotors.com/RTL8188CUS/ > > You're right, I miscounted, it has 46 pins. Thanks. The BZ5JA might be 5V to 3.3V switching voltage regulator, with inductor above it. Datasheet shows there is internal non-volatile memory, powered from pin 27, which has a trace to an external filter capacitor. The large zero ohm resistor bottom right is interesting; size chosen for accessibility; probably for fault isolation or qualification. In summary, the board design is consistent with the datasheet, and confirms non-volatile memory that will contain configuration data and probably firmware. I agree with Larry; try the firmware file. -- James Cameron http://quozl.netrek.org/
Re: rtlwifi oops
On Sat, Oct 28, 2017 at 12:02:30AM +0300, nirinA raseliarison wrote: > On 10/27/2017 07:57 AM, James Cameron wrote: > >On Fri, Oct 27, 2017 at 04:08:48AM +0300, nirinA raseliarison wrote: > >>hi all, > >>i applied the patch against 4.13.8. i still got some trouble, dmesg > >>is below. > > > >As this new event does not have "disabled by hub (EMI?)", it is a > >different problem to your 19th October post, so I don't think the > >patch is relevant. > > > >>after i plugged the device, it seems to be detected and all modules > >>loaded, but when i tried to connect to an access point, by using > >>wicd, it halted after a while. at this point, all usb ports are > >>broken, there was no more log in dmesg, > > > >If the other USB ports are not responding, then your problem is > >probably wider than the wireless device, and the wireless device > >is acting as the "canary in the mine"; failing first because it is > >the most active. > > > >Can you test to exclude possibility of damaged USB host controller or > >hub? > > yes, dmesg below with an usb audio adapter and a usb mouse plugged at > boot time. then the rtl8192cu plugged, and i'm using it to retrieve > and send this mail. Thanks. Your dmesg shows the mouse is discovered, then disconnects, then reconnects. I can't tell if your mouse normally does this. Can you also test for the wireless problem without the USB mouse, or with a different mouse? Your dmesg also shows "cannot get freq" for USB audio device endpoints, but I'm not sure what this means. > my first guess was also about a damaged device or usb port > as those random crashes are recent. > note that the device i'm using here is not the same as the one > that triggered the previous errors. > > >>lsusb still showed the device even after being unplugged. it got > >>even worse as reboot failed. > > > >Yes, once a USB host controller is failed, organised reboot can be > >difficult. lsusb not updated confirms host controller not responding. 
> > > >>i cannot really trace the error as right now all thing works fine. > > > >Your dmesg looks like you removed and reinserted the wireless device > >several times. Did you do that, or did the system do it without any > >physical action? > > no, the device was always connected. i've only removed it long after > i noticed something went wrong and just before i tried reboot. Okay, thanks. I'm worried that unexpected disconnect suggests a USB host or hub problem. > >A full dmesg from boot may be interesting, at least to better > >understand the USB host controller. > > > > here it is. > thanks, > [...] -- James Cameron http://quozl.netrek.org/
Re: rtlwifi oops
On Fri, Oct 27, 2017 at 04:08:48AM +0300, nirinA raseliarison wrote: > hi all, > i applied the patch against 4.13.8. i still got some trouble, dmesg > is below. As this new event does not have "disabled by hub (EMI?)", it is a different problem to your 19th October post, so I don't think the patch is relevant. > after i plugged the device, it seems to be detected and all modules > loaded, but when i tried to connect to an access point, by using > wicd, it halted after a while. at this point, all usb ports are > broken, there was no more log in dmesg, If the other USB ports are not responding, then your problem is probably wider than the wireless device, and the wireless device is acting as the "canary in the mine"; failing first because it is the most active. Can you test to exclude possibility of damaged USB host controller or hub? > lsusb still showed the device even after being unplugged. it got > even worse as reboot failed. Yes, once a USB host controller is failed, organised reboot can be difficult. lsusb not updated confirms host controller not responding. > i cannot really trace the error as right now all thing works fine. Your dmesg looks like you removed and reinserted the wireless device several times. Did you do that, or did the system do it without any physical action? A full dmesg from boot may be interesting, at least to better understand the USB host controller. -- James Cameron http://quozl.netrek.org/
Re: RTL usb adapter question
Interesting, thanks. It should be a QFN 46 pin chip; you may have counted 15 instead of 14 pins on the long edge. Send me a photograph of the inside, off-list? There's a brief datasheet that I found, but no sign of firmware or registers documentation, as usual; http://www.cnping.com/wp-content/uploads/2015/09/RTL8188CUS_DataSheet_1.01.pdf I've no direct experience with the rtl8188cus chip. I can't prove it, but my experience with other vendors suggests a small non-volatile storage built into the chip for device configuration and firmware. Device configuration often includes USB vendor:product. I've read that Edimax uses rtl8188cus in a device programmed with vendor:product 7392:7811, and the kernel handles this in rtl8xxxu/rtl8xxxu_core.c rtlwifi/rtl8192cu/sw.c rtl8188cus has several configurable pins, so device configuration or firmware would have been programmed to match the circuit layout. As your kernel isn't providing firmware, yet the device works to an extent, there is probably firmware already on the device. I don't know of a way to ask the device for a firmware version, or a firmware dump. You might sacrifice a sample to see if loading rtl8192cu firmware changes behaviour at all. You might also work with your device vendor to improve clarity. ;-) On Thu, Oct 26, 2017 at 08:28:00PM -0500, David Ashley wrote: > I opened up the dongle, it has these things inside (aside from 2 coils > and various resistors and capacitors) > 1) > 48 pin chip (9 pins, 15 pins, 9 pins, 15 pins) > REALTEK > RTL8188CUS > F6J23P2 > GF27 TAIWAN > > 6 pin chip (3 pins,3 pins) > BZ5JA > > 40.0 mhz crystal oscillator > > I was thinking maybe some serial eeprom would be included, but there wasn't > one. > > -Dave > > > On 10/26/17, James Cameron <qu...@laptop.org> wrote: > > Base on your evidence, I'd say the device is different to others and > > has firmware included. > > > > On Thu, Oct 26, 2017 at 04:45:54PM -0500, David Ashley wrote: > >> OK I'm completely baffled. 
> >> > >> I have explicitly removed all rtlwifi/ firmware files from the root > >> filesystem and yet the usb dongle still works, even after a power > >> cycle. How can it possibly be getting its firmware file > >> > >> Here are the relevant kernel messages. There is no file > >> rtl8192cufw.bin anywhere for the kernel to find... > >> root@30046:~# ls -l /lib/firmware/rtlwifi/ > >> total 0 > >> > >> I have ensured there is no *OTHER* route to the internet such that the > >> driver (or udev) can magically get the firmware file from the > >> internet... > >> > >> Here's other info that may be useful... > >> > >> root@30046:~# zcat /proc/config.gz | grep FIRM > >> CONFIG_PREVENT_FIRMWARE_BUILD=y > >> CONFIG_FIRMWARE_IN_KERNEL=y > >> CONFIG_EXTRA_FIRMWARE="am335x-pm-firmware.elf > >> am335x-bone-scale-data.bin am335x-evm-scale-data.bin > >> am43x-evm-scale-data.bin" > >> CONFIG_EXTRA_FIRMWARE_DIR="firmware" > >> # CONFIG_LIBERTAS_THINFIRM is not set > >> # CONFIG_FIRMWARE_MEMMAP is not set > >> # CONFIG_TEST_FIRMWARE is not set > >> root@30046:~# cat /proc/version > >> Linux version 4.1.19-bone20 (dash@DaveDesktop) (gcc version 5.4.0 > >> 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4) ) #2 Tue Oct 3 > >> 17:25:35 CDT 2017 > >> root@30046:~# lsusb > >> Bus 001 Device 002: ID 7392:7811 Edimax Technology Co., Ltd EW-7811Un > >> 802.11n Wireless Adapter [Realtek RTL8188CUS] > >> > >> ... 
ifconfig > >> wlan0 Link encap:Ethernet HWaddr 74:da:38:61:f1:2c > >> inet addr:192.168.10.31 Bcast:192.168.10.255 > >> Mask:255.255.255.0 > >> inet6 addr: fe80::76da:38ff:fe61:f12c/64 Scope:Link > >> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 > >> RX packets:509 errors:0 dropped:0 overruns:0 frame:0 > >> TX packets:146 errors:0 dropped:0 overruns:0 carrier:0 > >> collisions:0 txqueuelen:1000 > >> RX bytes:60812 (59.3 KiB) TX bytes:16365 (15.9 KiB) > >> > >> > >> > >> > >> [9.663796] rtl8192cu: Chip version 0x10 > >> [9.745394] cfg80211: Calling CRDA to update world regulatory domain > >> [9.844311] random: nonblocking pool is initialized > >> [9.877851] rtl8192cu: MAC address: 74:da:38:61:f1:2c > >> [9.877883] rtl8192cu: Board Type 0 > >> [9.877989] rtl_usb: rx_max_size 15360, rx_urb_num 8, in_ep 1 > >> [9.878098] rtl8192cu: Loading fi
Re: RTL usb adapter question
filesystem. > >> > >> Basically I'm trying to understand the theory. We have a product that > >> is making use of the device > >> > >> Bus 001 Device 007: ID 7392:7811 Edimax Technology Co., Ltd > >> EW-7811Un802.11n Wireless Adapter [Realtek RTL8188CUS] > >> > >> It has not been especially reliable. I've never provided firmware > >> files for the device in the root filesystem. I've started to pay > >> attention to the kernel error messages. Now the kernel drivers seem to > >> be loading the rtlwifi/rtl8192cufw_TMSC.bin file and I'm trying to > >> understand if this is actually working, if it makes any difference in > >> reliability... > >> > >> It's like I can't figure out how the usb dongle even worked without > >> its firmware file... > >> > >> My working theory is that the usb dongle comes from the factory with a > >> hardcoded firmware file (rtlwifi/rtl8192cufw.bin) but it is buggy or > >> inferior. And the performance and reliability can be improved if the > >> driver successfully manages to load the rtl8192cufw_TMSC.bin file. I > >> don't know if the firmware load persists across a power cycle (my > >> assumption is it doesn't). > > > > There is NO firmware coded by the factory in the device. It only has enough > > > > intelligence to load the real firmware. The exact file that it loads is > > determined by the model. If you provide the appropriate section of the > > output of > > dmesg where the above firmware messages occur, and a file listing of > > /lib/firmware/rtlwifi/, I can tell you what firmware is being loaded. > > > > No, firmware will not persist across a power failure. > > > > The driver has never been particularly reliable, and the USB group at > > Realtek > > seems not to care. You might try their other driver, but you will be on your > > > > own, as I will not support that particular piece of . > > > > Please reply to all on any followups. > > > > Larry > > > > -- James Cameron http://quozl.netrek.org/
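Larry asks above for a file listing of /lib/firmware/rtlwifi/. A minimal sketch in Python for collecting that listing in a form that is easy to paste into a reply; the function name list_firmware is made up for illustration, and the default path is only the directory named in this thread.

```python
import os


def list_firmware(directory="/lib/firmware/rtlwifi"):
    """Return sorted (name, size_in_bytes) pairs for regular files in directory.

    An empty list means the directory is missing or holds no regular files,
    which corresponds to the "total 0" listing reported in this thread.
    """
    try:
        names = sorted(os.listdir(directory))
    except FileNotFoundError:
        return []
    entries = []
    for name in names:
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            entries.append((name, os.path.getsize(path)))
    return entries


if __name__ == "__main__":
    # Print size and name, one firmware file per line.
    for name, size in list_firmware():
        print(f"{size:>8}  {name}")
```

An empty result alongside a working device would be consistent with the observation that started this thread, but the listing by itself cannot say where the device got its firmware.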
Re: iwlwifi crash with hostapd
On Wed, Oct 25, 2017 at 09:08:17AM +0200, Mario Theodoridis wrote: > On 24/10/17 23:01, James Cameron wrote: > >Summary: WARN_ON(iwl_mvm_is_dqa_supported(mvm)) in > >iwl_mvm_rx_tx_cmd_single with v4.13, but code is since changed. > > > >On Tue, Oct 24, 2017 at 09:56:31PM +0200, Mario Theodoridis wrote: > >>Sorry for skipping the list one the last one. > > > >Sorry, that was my fault. It was a private message you replied to. > > > >>On 19.10.2017 22:59, James Cameron wrote: > >>[...] > > > >You didn't say virtualbox was essential for reproducing the problem, > >so I'm continuing to exclude it from thought. If it is essential for > >reproducing, then you might contact them. > > > >Please do make sure you can exclude virtualbox as a cause. > > Let me clarify the virtualbox thing. The machine in question is a VM host. > It hosts several machines, one of which is my mail server, and another > (openbsd) which acts as a gateway to the internet for all machines. > If i run this machine without virtualbox, then my entire network topology is > off-line. While one could argue, that this is bad design, the alternative > would be to use openbsd as a virtual host, but i haven't seen many tutorials > on that. I also would like to run just one machine 24/7 to keep a tap on the > electricity consumption. > > This machine also bridges several interfaces and acts as a hotspot for my > wlan. > > So i don't know whether virtualbox is responsible, but not running > virtualbox is simply not an option. Thanks. I don't have a machine with the same wireless device, so I can't hope to reproduce the problem or test fixes. I do have a slightly later wireless device which uses the same driver, but I'm not confident it would reproduce the problem, because (a) I've not seen the same stack traces, (b) the WARN_ON relates to device response coded in firmware, and my wireless device may use different firmware, and (c) it isn't clear to me what you did to enable the problem. 
You do have a machine, and you might do tests without virtualbox, but as you say, this is not an option for you. > >>This one pretty quickly loads my syslog with new error stacks. I > >>haven't tested actual behavior yet, but the logs don't look so hot. > > > >Do connections frequently keep dying as before? > > > >>I ran another wireless-info (attached) and appended some of the > >>syslog stuff to it. > > > >Thanks, you identified a line of code and cause; a WARN_ON in > >iwl_mvm_rx_tx_cmd_single; > > > > case TX_STATUS_FAIL_DEST_PS: > > /* In DQA, the FW should have stopped the queue and not > > * return this status > > */ > > WARN_ON(iwl_mvm_is_dqa_supported(mvm)); > > info->flags |= IEEE80211_TX_STAT_TX_FILTERED; > > break; > > > >But it is only a warning. If connections aren't dying, it may not be > >important to you. > > > >Please check you are using the most recent linux-firmware? > > I'm running > ii linux-firmware 1.169 all > from artful. > No difference to the xenial version. Good, thanks. > >>>Several methods, though by far the most common seems to be personal > >>>experience with offsets. > >>> > >>>When you don't have that personal experience, the methods are; > >>> > >>>1. using GDB against the .o file, > >>> > >>>2. using binutils objdump to disassemble .o file or vmlinuz, > >>> > >>>3. using GCC to generate assembly listings, > >>> > >>>See https://wiki.ubuntu.com/Kernel/KernelDebuggingTricks right down > >>>the end of page for the GDB method. > >> > >>I have gotten around to that part, yet, as i was busy with the > >>above, but it seems later versions have issues, too. > > > >However, you're still testing old source code. > > > >Several changes made since are worth testing, please either > >cherry-pick the patches or test a 4.14 rc kernel, and without > >involving dkms or virtualbox. > > Then i'd have to patch those files so they build for 4.14 first. > I've seen patches, but still need to figure out how to get them > applied in the build process. 
It may be more efficient to wait for your dkms packagers to catch up so that the v4.14-rc6 or v4.14 kernel will work with your package configuration. > -- > Mit freundlichen Grüßen/Best Regards > > Mario Theodoridis -- James Cameron http://quozl.netrek.org/
Re: iwlwifi crash with hostapd
Summary: WARN_ON(iwl_mvm_is_dqa_supported(mvm)) in iwl_mvm_rx_tx_cmd_single with v4.13, but code is since changed. On Tue, Oct 24, 2017 at 09:56:31PM +0200, Mario Theodoridis wrote: > Sorry for skipping the list one the last one. Sorry, that was my fault. It was a private message you replied to. > On 19.10.2017 22:59, James Cameron wrote: > >On Thu, Oct 19, 2017 at 08:56:46AM +0200, Mario Theodoridis wrote: > >>On 18/10/17 23:33, James Cameron wrote: > >> > >> For your interest, kernel v4.4.93 in stable series just released has > >> changes in relevant files. > >> > >> https://lwn.net/Articles/736770/ > >> > >>Thanks James, > >> > >>after looking into bisection last night, i found that just before > >>i wanted to test out the 4.4.0-82 kernel, i found 3 stack traces > >>in my syslog. :( > >> > >>I guess, i'm dealing with race conditions now. But it seems the 79 > >>kernel still crashes wifi a lot less than later ones. > >> > >>How do i get line numbers into these traces? > > As the 4.4.0-79 kernel was sometimes crapping out, too, i decided to > try to test the latest kernel instead of bisecting after all. This > took a while because virtualbox was being a bitch. virtualbox-5.0 > doesn't bode well with virtualbox-dkms-51, so i ended up rebuilding > virtualbox-5.1 to prevent dependency hell. The vb-dkms package > doesn't do 4.14, so i ended up going with the 4.13 kernel that comes > with artful. You didn't say virtualbox was essential for reproducing the problem, so I'm continuing to exclude it from thought. If it is essential for reproducing, then you might contact them. Please do make sure you can exclude virtualbox as a cause. > This one pretty quickly loads my syslog with new error stacks. I > haven't tested actual behavior yet, but the logs don't look so hot. Do connections frequently keep dying as before? > I ran another wireless-info (attached) and appended some of the > syslog stuff to it. 
Thanks, you identified a line of code and cause; a WARN_ON in iwl_mvm_rx_tx_cmd_single; case TX_STATUS_FAIL_DEST_PS: /* In DQA, the FW should have stopped the queue and not * return this status */ WARN_ON(iwl_mvm_is_dqa_supported(mvm)); info->flags |= IEEE80211_TX_STAT_TX_FILTERED; break; But it is only a warning. If connections aren't dying, it may not be important to you. Please check that you are using the most recent linux-firmware. > >Several methods, though by far the most common seems to be personal > >experience with offsets. > > > >When you don't have that personal experience, the methods are; > > > >1. using GDB against the .o file, > > > >2. using binutils objdump to disassemble .o file or vmlinuz, > > > >3. using GCC to generate assembly listings, > > > >See https://wiki.ubuntu.com/Kernel/KernelDebuggingTricks right down > >the end of page for the GDB method. > > I haven't gotten around to that part yet, as i was busy with the > above, but it seems later versions have issues, too. However, you're still testing old source code. Several changes made since are worth testing; please either cherry-pick the patches or test a 4.14 rc kernel, and without involving dkms or virtualbox. Or, if new firmware fixes the problem, go with that instead. > -- > Mit freundlichen Grüßen/Best regards > > Mario Theodoridis > > ## wireless info START ## > [...] -- James Cameron http://quozl.netrek.org/
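For the GDB method listed above, the usual input is the symbol+offset pair that a stack trace prints. A small sketch in Python, not taken from anyone's tooling, that splits a trace entry into its parts and builds the corresponding gdb "list" command. The helper names parse_frame and gdb_list_command are hypothetical, and the object file given to gdb is assumed to have been built with debug information.

```python
import re

# Matches trace entries such as "iwl_mvm_disable_txq+0x2a6/0x2c0 [iwlmvm]".
# The trailing "[module]" part is optional; built-in symbols do not have it.
FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[\w-]+)\])?"
)


def parse_frame(text):
    """Return symbol, offset, size and module for the first frame in text."""
    m = FRAME_RE.search(text)
    if m is None:
        return None
    return {
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("module"),
    }


def gdb_list_command(frame):
    """Build the gdb command that lists source around symbol+offset."""
    return f"list *({frame['symbol']}+{frame['offset']:#x})"


if __name__ == "__main__":
    frame = parse_frame("iwl_mvm_disable_txq+0x2a6/0x2c0 [iwlmvm]()")
    print(frame["module"], gdb_list_command(frame))
```

The printed command is meant to be typed at the gdb prompt after loading the object file for the module named in brackets.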
Re: [PATCH] bcma: Use bcma_debug and not pr_cont in MIPS driver
On Wed, Oct 18, 2017 at 10:12:18PM -0700, Joe Perches wrote: > Commit 66cc04424960 ("bcma: use bcma_debug and pr_cont in MIPS driver") > converted a printk(KERN_DEBUG to bcma_debug. > > bcma_debug is guarded by a #define DEBUG via pr_debug. > > This means that the bcma_debug will generally not be emitted > but any pr_cont following the bcma_debug will be emitted. > > Correct this by removing the uses of pr_cont by using a temporary. > > Signed-off-by: Joe Perches <j...@perches.com> > --- > drivers/bcma/driver_mips.c | 11 +++ > 1 file changed, 7 insertions(+), 4 deletions(-) > > diff --git a/drivers/bcma/driver_mips.c b/drivers/bcma/driver_mips.c > index 5904ef1aa624..a929956150eb 100644 > --- a/drivers/bcma/driver_mips.c > +++ b/drivers/bcma/driver_mips.c > @@ -184,11 +184,14 @@ static void bcma_core_mips_print_irq(struct bcma_device > *dev, unsigned int irq) > { > int i; > static const char *irq_name[] = {"2(S)", "3", "4", "5", "6", "D", "I"}; > +char interrupts[20]; > +char *ints = interrupts; Tabs were changed to spaces. > > - bcma_debug(dev->bus, "core 0x%04x, irq :", dev->id.id); > - for (i = 0; i <= 6; i++) > - pr_cont(" %s%s", irq_name[i], i == irq ? "*" : " "); > - pr_cont("\n"); > +for (i = 0; i < ARRAY_SIZE(irq_name); i++) > +ints += sprintf(ints, " %s%c", > + irq_name[i], i == irq ? '*' : ' '); But not on this line. > + > +bcma_debug(dev->bus, "core 0x%04x, irq:%s\n", dev->id.id, > interrupts); > } > > static void bcma_core_mips_dump_irq(struct bcma_bus *bus) > -- > 2.10.0.rc2.1.g053435c > -- James Cameron http://quozl.netrek.org/
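Separately from the whitespace, the size of the temporary in the patch as quoted may deserve a second look. A quick model in Python of the " %s%c" loop, only to count characters; it is not kernel code, and the function names are made up for the purpose.

```python
# Names copied from the irq_name[] array in the quoted patch.
IRQ_NAMES = ["2(S)", "3", "4", "5", "6", "D", "I"]


def format_irqs(irq):
    """Mimic the patch's loop: " %s%c" per entry, '*' marking the active irq."""
    return "".join(
        f" {name}{'*' if i == irq else ' '}" for i, name in enumerate(IRQ_NAMES)
    )


def required_buffer_size(irq=0):
    """Bytes a C char array needs: formatted length plus the trailing NUL."""
    return len(format_irqs(irq).encode("ascii")) + 1


if __name__ == "__main__":
    text = format_irqs(2)
    print(repr(text), len(text), required_buffer_size(2))
```

Each entry costs the name length plus two, so the seven entries come to 6 + 6 * 3 = 24 characters, and 25 bytes with the terminator. If that count is right, char interrupts[20] is too small for the sprintf loop as written.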
Re: rtlwifi oops
00 > [ 239.701327] RIP: rtl_deinit_core+0x2e/0x90 [rtlwifi] RSP: > c99a3b40 > [ 239.702028] CR2: > [ 239.705370] ---[ end trace 6ec9029c0d9c0e13 ]--- > [ 239.706311] udevd[528]: worker [1174] failed while handling > '/devices/pci:00/:00:1d.0/usb2/2-1/2-1.3/2-1.3:1.0' > -- James Cameron http://quozl.netrek.org/
Re: iwlwifi crash with hostapd
On Tue, Oct 17, 2017 at 09:35:39PM +0200, Mario Theodoridis wrote: > On 16.10.2017 05:37, James Cameron wrote: > >On Sun, Oct 15, 2017 at 06:21:36PM +0200, Mario Theodoridis wrote: > >>Thanks for the pointers, James. > >> > >>On 12.10.2017 23:24, James Cameron wrote: > >>>There's a good chance this problem has been fixed already. You > >>>are using a v4.4 kernel with many patches applied by Ubuntu. Here, we > >>>are more concerned with the latest kernels, and v4.4 is quite old. > >>> > >>>Please test some of the later kernels, see > >>>https://wiki.ubuntu.com/Kernel/MainlineBuilds > >>> > >>>In particular, test v4.13 or v4.14-rc4. > >> > >>I'm having a hard time with that, because the virtualbox-dkms build fails > >>with the 4.13 kernel, and virtualbox unfortunately is essential. > > > >Is virtualbox essential for reproducing the problem, or essential for > >your general use? > > It is essential for general use, like Internet connectivity. Okay, good, that means we can ignore virtualbox, and leave that to you. Please test v4.13 or v4.14-rc5, ignoring virtualbox for the time being. > >If the former, then that's interesting. > > > >If the latter, then you might instead test the v4.13 or v14-rc4 > >kernels for only the problem, and then revert to an older kernel after > >testing. > > > >Either way, to use virtualbox-dkms with a later kernel you may be able > >to upgrade just the virtualbox packages from a later Ubuntu release. > > > >See https://packages.ubuntu.com/virtualbox-dkms and > >https://packages.ubuntu.com/virtualbox for the later versions available. > > > >Purpose of the test can be to help isolate the cause, not only to > >solve your problem. > > Thanks for the info. > > > > >[...] > >You might also try with later firmware package. > >See https://packages.ubuntu.com/linux-firmware > > > >You might also test with booting installation media in live-mode, > >ignoring the internal disk. > > Ok, that was completely off the radar. 
Updating linux-firmware may run different firmware on the wireless card, and the change in behaviour may fix the problem. A gamble. A test with later installation media is useful, because you can verify problems with different kernels and wireless firmware without change to configuration. You might try Ubuntu 17.10 Artful ISO. > I ended up going the other way. I still had a 4.4.0-79-generic kernel and > booted that. It does not have this problem. > After checking out > git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/xenial > i tried to find the culprit but was not able to trace the back trace to a > potential null pointer or some such. I got stuck at > iwl_mvm_send_cmd_pdu_status not finding a reference to iwl_mvm_disable_txq > from there. > > I did got the following diff though > > git diff Ubuntu-4.4.0-79.100 Ubuntu-4.4.0-93.116 -- > drivers/net/wireless/iwlwifi/ drivers/net/wireless/mac80211_hwsim.c > > wifi.patch > > I don't know whether this came from upstream or was ubuntu sourced. Upstream. You found your problem was introduced in an Ubuntu kernel, in the update from -79 to -93. This contained Ubuntu backports of two stable kernel patches, which are also upstream patches; 8fbcfeb8a9cc ("mac80211_hwsim: Replace bogus hrtimer clockid") from v4.4.69 50ea05efaf3b ("mac80211: pass block ack session timeout to to driver") from v4.4.77 git log Ubuntu-4.4.0-79.100..Ubuntu-4.4.0-93.116 -- \ drivers/net/wireless/iwlwifi/ drivers/net/wireless/mac80211_hwsim.c git remote add stable \ git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git git fetch stable git log v4.4.68..v4.4.92 -- \ drivers/net/wireless/iwlwifi/ drivers/net/wireless/mac80211_hwsim.c > This fixed the issue for now, but now i'm stuck on that kernel :( Yes. Here in upstream, we would run the latest kernel v4.13 and work to fix that. Trouble you had with virtualbox packages would be eventually solvable, but aren't really a problem with the kernel itself. 
So your next step may be to report an Ubuntu bug, and say that -79 worked fine, and -93 did not. > While i'm perfectly comfortable with user land C, i have no kernel > experience (clue stick links definitely welcome). You might verify the above patches caused the problem by doing a bisection between -79 and -93. https://wiki.ubuntu.com/Kernel/KernelBisection Or by reverting only those patches. Then report to Ubuntu which patch caused the problem. > [...] Hope that helps. -- James Cameron http://quozl.netrek.org/
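The bisection suggested above is a binary search over an ordered list of builds. A sketch in Python of the idea only; git bisect does the real work on commits. The version labels and the is_bad test below are hypothetical stand-ins for "boot this kernel and see whether the WARNING appears", and not every number between 79 and 93 necessarily corresponds to a released Ubuntu kernel.

```python
def first_bad(versions, is_bad):
    """Binary search for the first bad entry in an ordered list.

    Assumes versions[0] is known good, versions[-1] is known bad, and that
    once a version is bad every later one is bad too.
    Returns (first_bad_version, number_of_tests_run).
    """
    good, bad = 0, len(versions) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1
        if is_bad(versions[mid]):
            bad = mid      # problem present: first bad is at mid or earlier
        else:
            good = mid     # problem absent: first bad is after mid
    return versions[bad], tests


if __name__ == "__main__":
    # Hypothetical ABI numbers from the known good -79 to the known bad -93.
    candidates = list(range(79, 94))
    # Hypothetical outcome: pretend the problem first appears in -87.
    print(first_bad(candidates, lambda v: v >= 87))
```

With fifteen candidates the search needs at most four boots, which is why bisection is practical even when each test is slow.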
Re: iwlwifi crash with hostapd
On Sun, Oct 15, 2017 at 06:21:36PM +0200, Mario Theodoridis wrote: > Thanks for the pointers, James. > > On 12.10.2017 23:24, James Cameron wrote: > >There's a good chance this problem has been fixed already. You > >are using a v4.4 kernel with many patches applied by Ubuntu. Here, we > >are more concerned with the latest kernels, and v4.4 is quite old. > > > >Please test some of the later kernels, see > >https://wiki.ubuntu.com/Kernel/MainlineBuilds > > > >In particular, test v4.13 or v4.14-rc4. > > I'm having a hard time with that, because the virtualbox-dkms build fails > with the 4.13 kernel, and virtualbox unfortunately is essential. Is virtualbox essential for reproducing the problem, or essential for your general use? If the former, then that's interesting. If the latter, then you might instead test the v4.13 or v4.14-rc4 kernels for only the problem, and then revert to an older kernel after testing. Either way, to use virtualbox-dkms with a later kernel you may be able to upgrade just the virtualbox packages from a later Ubuntu release. See https://packages.ubuntu.com/virtualbox-dkms and https://packages.ubuntu.com/virtualbox for the later versions available. Purpose of the test can be to help isolate the cause, not only to solve your problem. > >If the problem still happens, capture the same information and send it > >again as a reply. > > > >If the problem doesn't happen, then you can either continue to use the > >new kernel, or find when the problem was fixed; a long but rewarding > >process. > > > >Should the problem have been fixed for v4.10, you might also switch to > >using the Ubuntu package linux-generic-hwe-16.04.
> >https://wiki.ubuntu.com/Kernel/RollingLTSEnablementStack#hwe-16.04 > > The 4.10 kernel readily produced this one > > [ cut here ] > WARNING: CPU: 4 PID: 1617 at > /build/linux-hwe-IJy1zi/linux-hwe-4.10.0/drivers/net/wireless/intel/iwlwifi/mvm/tx.c:510 > iwl_mvm_tx_skb_non_sta+0x39a/0x440 [iwlmvm] > Modules linked in: bnep ccm pci_stub vboxpci(OE) vboxnetadp(OE) > vboxnetflt(OE) vboxdrv(OE) nf_log_ipv4 nf_log_common xt_LOG xt_tcpudp > nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter > ip_tables x_tables snd_hda_codec_hdmi arc4 iwlmvm mac80211 > snd_hda_codec_realtek snd_hda_codec_generic intel_rapl x86_pkg_temp_thermal > intel_powerclamp iwlwifi coretemp snd_hda_intel snd_hda_codec kvm_intel > snd_hda_core snd_hwdep kvm input_leds irqbypass crct10dif_pclmul snd_pcm > bridge crc32_pclmul joydev stp llc ghash_clmulni_intel snd_seq_midi pcbc > snd_seq_midi_event snd_rawmidi aesni_intel snd_seq aes_x86_64 crypto_simd > snd_seq_device glue_helper cfg80211 cryptd snd_timer intel_cstate snd > intel_rapl_perf soundcore shpchp mei_me hci_uart mei btbcm btqca btintel > bluetooth intel_lpss_acpi > acpi_als mac_hid intel_lpss kfifo_buf tpm_infineon industrialio acpi_pad > parport_pc ppdev lp parport autofs4 i915 e1000e i2c_algo_bit drm_kms_helper > syscopyarea sysfillrect sysimgblt fb_sys_fops e100 hid_generic ptp i2c_hid > ahci mii drm pps_core pinctrl_sunrisepoint libahci usbhid e1000 hid wmi > video pinctrl_intel fjes > CPU: 4 PID: 1617 Comm: hostapd Tainted: G OE 4.10.0-37-generic > #41~16.04.1-Ubuntu > Hardware name: Gigabyte Technology Co., Ltd. 
Z170M-D3H/Z170M-D3H-CF, BIOS > F20 11/17/2016 > Call Trace: > dump_stack+0x63/0x90 > __warn+0xcb/0xf0 > warn_slowpath_null+0x1d/0x20 > iwl_mvm_tx_skb_non_sta+0x39a/0x440 [iwlmvm] > iwl_mvm_mac_tx+0x11e/0x1d0 [iwlmvm] > ieee80211_tx_frags+0x14b/0x220 [mac80211] > __ieee80211_tx+0x81/0x180 [mac80211] > ieee80211_tx+0x10f/0x150 [mac80211] > ieee80211_xmit+0x9b/0xf0 [mac80211] > __ieee80211_tx_skb_tid_band+0x5c/0x70 [mac80211] > ieee80211_mgmt_tx+0x42c/0x4a0 [mac80211] > cfg80211_mlme_mgmt_tx+0xdc/0x310 [cfg80211] > nl80211_tx_mgmt+0x212/0x360 [cfg80211] > genl_family_rcv_msg+0x1db/0x3b0 > ? skb_queue_tail+0x43/0x50 > genl_rcv_msg+0x59/0xa0 > ? genl_notify+0x80/0x80 > netlink_rcv_skb+0xa4/0xc0 > genl_rcv+0x28/0x40 > netlink_unicast+0x18c/0x240 > netlink_sendmsg+0x2fb/0x3a0 > ? aa_sock_msg_perm+0x61/0x150 > sock_sendmsg+0x38/0x50 > ___sys_sendmsg+0x2c2/0x2d0 > ? sock_sendmsg+0x38/0x50 > ? SYSC_sendto+0x101/0x190 > ? __check_object_size+0x108/0x1e3 > ? _copy_to_user+0x55/0x60 > __sys_sendmsg+0x54/0x90 > SyS_sendmsg+0x12/0x20 > entry_SYSCALL_64_fastpath+0x1e/0xad > RIP: 0033:0x7fcc38cfe450 > RSP: 002b:7fffdefc9b18 EFLAGS: 0246 ORIG_RAX: 002e > RAX: ffda RBX: 563e91285590 RCX: 7fcc38cfe450 > RDX: RSI: 7fffdefc9ba0 RDI: 0005 > RBP: R08: 000
Re: iwlwifi crash with hostapd
On Thu, Oct 12, 2017 at 10:26:33PM +0200, Mario Theodoridis wrote: > Hello everyone, > > i'm running Kubuntu 16.04 as a Virtualbox VM host, and a wireless AP > with an Intel Wireless 7260. > > My WLAN connections frequently keep dying, so that i need to > disconnect and reconnect in order to use them again. > My syslog is full of these: > > Oct 12 21:48:55 zippy kernel: [3546600.957321] [ cut here > ] > Oct 12 21:48:55 zippy kernel: [3546600.957352] WARNING: CPU: 2 PID: 1571 at > /build/linux-YyUNAI/linux-4.4.0/drivers/net/wireless/iwlwifi/mvm/utils.c:740 > iwl_mvm_disable_txq+0x2a6/0x2c0 [iwlmvm]() > [...] > I'm not sure if this is the right forum to post this. > If it isn't, a pointer to the right place would be appreciated. This is a right place. Another right place is Ubuntu bug reporting. > Please include me in the reply as i'm not on the list. > Let me know, what additional details i need to provide, as i'm > interested in getting this to work. There's a good chance this problem has been fixed already. You are using a v4.4 kernel with many patches applied by Ubuntu. Here, we are more concerned with the latest kernels, and v4.4 is quite old. Please test some of the later kernels, see https://wiki.ubuntu.com/Kernel/MainlineBuilds In particular, test v4.13 or v4.14-rc4. If the problem still happens, capture the same information and send it again as a reply. If the problem doesn't happen, then you can either continue to use the new kernel, or find when the problem was fixed; a long but rewarding process. Should the problem have been fixed for v4.10, you might also switch to using the Ubuntu package linux-generic-hwe-16.04. https://wiki.ubuntu.com/Kernel/RollingLTSEnablementStack#hwe-16.04 Hope that helps. > Thanks. > > Regards > > Mario [...] -- James Cameron http://quozl.netrek.org/
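The WARNING line quoted above already carries a source file and line number, which is often enough to find the code without a debugger. A sketch in Python that pulls those fields out of a syslog line; the helper name parse_warning is hypothetical, and the pattern is written only against the layout of the line quoted in this message.

```python
import re

# Layout assumed from the quoted line:
#   WARNING: CPU: <n> PID: <n> at <path>:<line> <symbol>+0x.../0x...
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at\s+"
    r"(?P<path>\S+):(?P<line>\d+)\s"
)


def parse_warning(text):
    """Return cpu, pid, source path and line number, or None if no match."""
    m = WARNING_RE.search(text)
    if m is None:
        return None
    return {
        "cpu": int(m.group("cpu")),
        "pid": int(m.group("pid")),
        "path": m.group("path"),
        "line": int(m.group("line")),
    }


if __name__ == "__main__":
    sample = (
        "Oct 12 21:48:55 zippy kernel: [3546600.957352] WARNING: CPU: 2 PID: 1571 at "
        "/build/linux-YyUNAI/linux-4.4.0/drivers/net/wireless/iwlwifi/mvm/utils.c:740 "
        "iwl_mvm_disable_txq+0x2a6/0x2c0 [iwlmvm]()"
    )
    print(parse_warning(sample))
```

The path is the build machine's path, so only the part from drivers/ onward is meaningful in a kernel source tree of the same version.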
Re: Contributing to Linux-wireless drivers.
On Tue, Oct 10, 2017 at 05:14:02PM +0530, Himanshu Jha wrote: > Hello everyone, > > Apologies for that forwarded email which I hurriedly sent without > editing here! > > I am an undergraduate student in ECE (3rd year) and wish to contribute to > linux-wireless > drivers. I am familiar with the kernel development process and have many > patches accepted in the past 2 months with a variety of tools used such as > coccinelle, Kasan, smatch, sparse and checkpatch. > > My past contributions can be found here: > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/log/?qt=grep=Himanshu+Jha > > Also, James Cameron suggested that I *not self promote* and other useful > stuff. But I'm not self promoting and the purpose is to avoid the > initial steps that you generally recommend to a newbie like reading the > coding guidelines, submitting patches, learning Git etc. Last time I'll try that privately. Now I'm publicly outed for it. I keep making this mistake. For completeness, what I had said was; > > Self promotion is not often acceptable. For background on > > culture, see http://www.catb.org/esr/faqs/hacker-howto.html and Himanshu said they wanted to avoid being told the initial steps again, to which I replied; > > Good point. However even as a grey beard, I can still get told > > these things; it reflects more on them than me. > > > > An alternate method would be to say what you have done without > > using any words that measure or evaluate what you have done. However, I am curious to know if there will be a GSoC engagement by Linux Foundation in the linux-wireless scope. It would be fun to watch and learn. -- James Cameron http://quozl.netrek.org/
Re: rtl8821ae keep alive not set, connection lost
On Thu, Sep 21, 2017 at 09:40:14AM -0500, Larry Finger wrote: > On 09/21/2017 03:07 AM, James Cameron wrote: > >My test kernel "-qb" was write_readback = false in sw.c, with 8-bit > >read of REG_DBI_RDATA, and has been stable for four hours. I'll > >focus on some more testing of this one. It is a surprise. > > > >http://dev.laptop.org/~quozl/z/1dutXk.txt (dmesg) > > > >Observe how REG_DBI_FLAG+0 is briefly seen as 1, which doesn't > >happen with write_readback = true. > > Again, thanks for your efforts. > > At this point, my system has been up over 17 hours without a single > drop. As a result, I will leave the reversion of commit 40b368af4b75 > in place. It seems safer than turning off write_readback. After we > get more testing, that could still be an option. Thanks for the reversion commit, I'll point others to it. My apologies for sloppy work, the test kernel features got swapped! "-qb" above was with write_readback off, and 16-bit read of REG_DBI_RDATA, not 8-bit. Verified with objdump. It has run for 24 hours without a drop. So at conclusion; - the 16-bit read is good with or without write_readback. - the 8-bit read is bad with or without write_readback, and tends to lose connection much quicker without write_readback. Been a pleasure working with you. Back to lurk mode. -- James Cameron http://quozl.netrek.org/
Re: rtl8821ae keep alive not set, connection lost
On Thu, Sep 21, 2017 at 09:22:28AM +1000, James Cameron wrote: > On Wed, Sep 20, 2017 at 04:48:23PM -0500, Larry Finger wrote: > > On 09/20/2017 04:36 AM, James Cameron wrote: > > >When the problem occurs, register 0x350 bit 25 is set, for which a > > >comment in _rtl8821ae_check_pcie_dma_hang says means there is an RX > > >hang. > > > > > >So perhaps driver should call _rtl8821ae_check_pcie_dma_hang > > >and _rtl8821ae_reset_pcie_interface_dma. > > > > > >Any ideas where to do this? > > > > Thanks for the extended debugging. > > > > I was able to repeat your findings. With the 8-bit read of > > REG_DBI_RDATA, I got poor connection stability. Reverting that part > > made it stable again. For that reason, I pushed the partial > > reversion of commit 40b368af4b75 ("rtlwifi: Fix alignment issues"). > > That's great you were able to reproduce, thanks! > [...] > I'm still pondering a few more theories; > > - change write_readback, it is true now, and the while()/udelay in > _rtl8821ae_dbi_read seems a waste, it never executes, My test kernel "-qb" was write_readback = false in sw.c, with 8-bit read of REG_DBI_RDATA, and has been stable for four hours. I'll focus on some more testing of this one. It is a surprise. http://dev.laptop.org/~quozl/z/1dutXk.txt (dmesg) Observe how REG_DBI_FLAG+0 is briefly seen as 1, which doesn't happen with write_readback = true. > - clearing REG_DBI_CTRL write enable bits at the end of > _rtl8821ae_dbi_write, My test kernel "-qc" had reset of REG_DBI_ADDR as last step in both _rtl8821ae_dbi_read and _rtl8821ae_dbi_write, and was very unstable, not able to connect. http://dev.laptop.org/~quozl/y/1dutbX.txt (git diff v4.13) http://dev.laptop.org/~quozl/z/1dutuM.txt (dmesg) My test kernel "-qd" had reset of REG_DBI_ADDR as last step in only _rtl8821ae_dbi_write, and had poor connection stability. 
http://dev.laptop.org/~quozl/y/1dutr3.txt (git diff v4.13) http://dev.laptop.org/~quozl/z/1duuDc.txt (dmesg connection lost) Based on the above two kernels, clearing REG_DBI_ADDR after a read is a bad idea, and suggests there is some underlying asynchronicity about the DBI access. Almost as if some other condition should signal completion rather than zero in REG_DBI_FLAG+0. > - switching to 32-bit access as used by rtl8192de. My test kernel "-qe" changed REG_DBI_RDATA read to 32-bit, then used a union hack to pull out the desired byte, and had poor connection stability. http://dev.laptop.org/~quozl/y/1duvIC.txt (git diff v4.13) http://dev.laptop.org/~quozl/z/1duwI1.txt (dmesg connection lost) -- James Cameron http://quozl.netrek.org/
Re: rtl8821ae keep alive not set, connection lost
On Wed, Sep 20, 2017 at 04:48:23PM -0500, Larry Finger wrote: > On 09/20/2017 04:36 AM, James Cameron wrote: > >When the problem occurs, register 0x350 bit 25 is set, for which a > >comment in _rtl8821ae_check_pcie_dma_hang says means there is an RX > >hang. > > > >So perhaps driver should call _rtl8821ae_check_pcie_dma_hang > >and _rtl8821ae_reset_pcie_interface_dma. > > > >Any ideas where to do this? > > Thanks for the extended debugging. > > I was able to repeat your findings. With the 8-bit read of > REG_DBI_RDATA, I got poor connection stability. Reverting that part > made it stable again. For that reason, I pushed the partial > reversion of commit 40b368af4b75 ("rtlwifi: Fix alignment issues"). That's great you were able to reproduce, thanks! > Where did you detect that bit 25 of register 0x350 was set? In _rtl8821ae_check_pcie_dma_hang on link up. REG_DBI_FLAG (0x350 bits 16-31) is observed as; - 0x0000 on entry to function after warm boot, - 0x0400 on exit from function; debug bit 26 is set by the function, - 0x0400 on entry to function on link up when the problem has not happened, - 0x0600 on entry to function on link up when the problem has happened. But I don't know if 0x0600 is useful to detect earlier, or if it is only a symptom of link down while device active. Either way, if it truly does signal an RX hang or firmware RX queue full, it's useful. My "-q9" and "-qa" test kernels dump REG_DBI_CTRL and REG_DBI_FLAG. "-q9" is with 8-bit read of REG_DBI_RDATA. "-qa" is with 16-bit read of REG_DBI_RDATA. My "-qa" test kernel; http://dev.laptop.org/~quozl/y/1dunwN.txt (git diff v4.13) http://dev.laptop.org/~quozl/z/1dubX7.txt (dmesg) REG_DBI_CTRL+3 used by _rtl8821ae_check_pcie_dma_hang is effectively REG_DBI_FLAG+1 (0x353). REG_DBI_CTRL is REG_DBI_ADDR; a duplicate register definition. 
I'm still pondering a few more theories; - change write_readback, it is true now, and the while()/udelay in _rtl8821ae_dbi_read seems a waste, it never executes, - clearing REG_DBI_CTRL write enable bits at the end of _rtl8821ae_dbi_write, - switching to 32-bit access as used by rtl8192de. And a giggle from reviewing the code, _rtl8821ae_wowlan_initialize_adapter says "Patch Pcie Rx DMA hang after S3/S4 several times. The root cause has not be found." ... I've learned that root causes that aren't found tend to cause further problems later. ;-) Given this, my gut feel is firmware or silicon problem; RX DMA ceases, the driver does not detect it, the connection is lost. -- James Cameron http://quozl.netrek.org/
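For anyone following along, the REG_DBI_FLAG values in the message above can be mapped back to bit positions in the 32-bit word at 0x350. This is only a user-space sketch, assuming (as the message states) that REG_DBI_FLAG is bits 16-31 of that word; the helper names are made up for illustration and are not driver functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: place a 16-bit REG_DBI_FLAG value (0x350 bits 16-31)
 * into the 32-bit word that starts at register 0x350. The low half is left
 * as zero because only the flag half matters for this check. */
static uint32_t flag_to_word350(uint16_t flag)
{
	return (uint32_t)flag << 16;
}

/* Hypothetical helper: test bit 25 of the word at 0x350, the bit that the
 * comment in _rtl8821ae_check_pcie_dma_hang is reported to call an RX hang. */
static bool rx_hang_bit_set(uint16_t flag)
{
	return ((flag_to_word350(flag) >> 25) & 1u) != 0;
}
```

With the observed values, rx_hang_bit_set(0x0400) is false and rx_hang_bit_set(0x0600) is true: 0x0600 adds 0x0200, which is bit 9 of the flag half and so bit 25 of the word.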
Re: rtl8821ae keep alive not set, connection lost
On Tue, Sep 19, 2017 at 07:42:04PM +1000, James Cameron wrote: > On Thu, Sep 14, 2017 at 07:27:39PM +1000, James Cameron wrote: > > On Wed, Sep 13, 2017 at 07:39:35PM -0500, Larry Finger wrote: > > > On 09/13/2017 04:46 PM, James Cameron wrote: > > > > > > > >I'll give it some more testing and let you know, but it seems as > > > >capable of keeping a connection as 4.13 plus my earlier revert. > > > > > > > > Testing went well; removing the call to enable ASPM was as good as > > changing the DBI read back to 16-bit width. > > > > > The change I sent earlier should be as good as reverting the change > > > to write_byte in your reversion. > > > > Yes, that would be the hope. > > > > But with the 16-bit DBI read, the register REG_DBI_CTRL+0 is being > > read as well, in the first read in _rtl8821ae_enable_aspm_back_door, > > so perhaps reading that register has an unexpected side-effect. > > > > I've ruled that out after testing for several days different kernels > based on v4.13; > > - add an rtl_read_byte of REG_DBI_CTRL+0 in rtl8821ae_hw_init just > after the call to enable_aspm; does not solve problem, > > - add an rtl_read_byte of REG_DBI_CTRL+0 at the start of > _rtl8821ae_check_pcie_dma_hang; does not solve problem, When the problem occurs, register 0x350 bit 25 is set, for which a comment in _rtl8821ae_check_pcie_dma_hang says means there is an RX hang. So perhaps driver should call _rtl8821ae_check_pcie_dma_hang and _rtl8821ae_reset_pcie_interface_dma. Any ideas where to do this? > [...] -- James Cameron http://quozl.netrek.org/
Re: rtl8821ae keep alive not set, connection lost
On Thu, Sep 14, 2017 at 07:27:39PM +1000, James Cameron wrote: > On Wed, Sep 13, 2017 at 07:39:35PM -0500, Larry Finger wrote: > > On 09/13/2017 04:46 PM, James Cameron wrote: > > > > > >I'll give it some more testing and let you know, but it seems as > > >capable of keeping a connection as 4.13 plus my earlier revert. > > > > > Testing went well; removing the call to enable ASPM was as good as > changing the DBI read back to 16-bit width. > > > The change I sent earlier should be as good as reverting the change > > to write_byte in your reversion. > > Yes, that would be the hope. > > But with the 16-bit DBI read, the register REG_DBI_CTRL+0 is being > read as well, in the first read in _rtl8821ae_enable_aspm_back_door, > so perhaps reading that register has an unexpected side-effect. > I've ruled that out after testing for several days different kernels based on v4.13; - add an rtl_read_byte of REG_DBI_CTRL+0 in rtl8821ae_hw_init just after the call to enable_aspm; does not solve problem, - add an rtl_read_byte of REG_DBI_CTRL+0 at the start of _rtl8821ae_check_pcie_dma_hang; does not solve problem, Only way to solve the problem at the moment is either; - reverting 40b368af4b75 ("rtlwifi: Fix alignment issues"), which means using rtl_read_word in _rtl8821ae_dbi_read, or - removing the two lines that enable ASPM, as you asked me to try. > Is there any documentation for that register? I see other code writes > to REG_DBI_CTRL+3, in _rtl8821ae_check_pcie_dma_hang I'll repeat and expand on this. Is there any documentation for this register, or the other REG_DBI_* registers? I see that DBI windowed access in rtl8192de is different and yet very similar. In rtl8821ae, rtl8723be, and rtl8192de the method seems straightforward; there are bits for address, bits for write enable by byte, and flag bits for starting the transfer and completing. 
> Evidence of read from REG_DBI_CTRL was captured with an instrumented > kernel; git diff http://dev.laptop.org/~quozl/y/1dsQ6B.txt yielding > these dmesg lines; > > [6.010255] rtl_pci: _rtl_pci_update_default_setting const_amdpci_aspm=03 > [6.010338] rtl_pci: rtl_pci_enable_aspm > [6.034295] ieee80211 phy0: Selected rate control algorithm 'rtl_rc' > [6.034806] rtlwifi: rtlwifi: wireless switch is on > [6.196958] rtl8821ae :02:00.0 wlp2s0: renamed from wlan0 > [7.979186] rtl_pci: rtl_pci_disable_aspm > [7.979306] rtl8821ae: _rtl8821ae_check_pcie_dma_hang > [8.295360] rtl8821ae: _rtl8821ae_enable_aspm_back_door > [8.295437] rtl8821ae: _rtl8821ae_dbi_read 070f -> (@034f) > [8.295449] rtl8821ae: _rtl8821ae_dbi_write 070f <- ff (@870c) > [8.295462] rtl8821ae: _rtl8821ae_dbi_read 0719 -> 0200 (@034d) > [8.295474] rtl8821ae: _rtl8821ae_dbi_write 0719 <- 18 (@2718) > [8.295477] rtl_pci: rtl_pci_enable_aspm > [8.469734] rtl_pci: rtl_pci_disable_aspm > [8.469857] rtl8821ae: _rtl8821ae_check_pcie_dma_hang > [8.686955] rtl8821ae: _rtl8821ae_enable_aspm_back_door > [8.687013] rtl8821ae: _rtl8821ae_dbi_read 070f -> (@034f) > [8.687025] rtl8821ae: _rtl8821ae_dbi_write 070f <- ff (@870c) > [8.687038] rtl8821ae: _rtl8821ae_dbi_read 0719 -> 0218 (@034d) > [8.687050] rtl8821ae: _rtl8821ae_dbi_write 0719 <- 18 (@2718) > [8.687053] rtl_pci: rtl_pci_enable_aspm > > Observe how the windowed read of DBI register 0x70f causes a read of > 16-bits at 0x34f, which includes first 8-bits of 0x350 REG_DBI_CTRL. > > By the way, the cold boot value of DBI register 0x719 is 0x00, and > the warm boot value is 0x18, so I'm confident there isn't a > comprehensive register reset. It means that BIOS has relevance; and > this BIOS is outside my control. BIOS variation may explain > difficulty reproducing. Is there a register for device reset that I can try? It would help to exclude BIOS. 
> > > There has been a report (in Russian unfortunately) at > > https://www.linux.org.ru/forum/desktop/12620193 of delays in ARP > > handling. > > Thanks. I've considered and excluded ARP handling delay. Though ARP > renewal is typical reason for device sleep to end. > > With the call to enable ASPM disabled, instead of changing the DBI > read to 16-bit width, what happens is that the device stops accepting > data from the access point, packets are buffered there, and are > transmitted as soon as the device makes the next transmission. > > http://dev.laptop.org/~quozl/z/1dsQBf.txt has the ping and IP tcpdump > to confirm this. > > I've a monitor mode tcpdump I can send by private mail if required. > In that the burst of packets shows ICMP echo requests were buffered by > the access point. > &
Re: [TDLS PATCH V2 1/5] mac80211: Enable TDLS peer buffer STA feature
On Tue, Sep 19, 2017 at 10:51:04AM +0800, yint...@qti.qualcomm.com wrote: > From: Yingying Tang <yint...@qti.qualcomm.com> > > Enable TDLS peer buffer STA feature. > Set extended capability bit to enable buffer STA when driver > support it. > > Signed-off-by: Yingying Tang <yint...@qti.qualcomm.com> > --- > include/net/cfg80211.h |3 +++ > net/mac80211/tdls.c|5 - > 2 files changed, 7 insertions(+), 1 deletion(-) > > diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h > index f12fa52..edefc25 100644 > --- a/include/net/cfg80211.h > +++ b/include/net/cfg80211.h > @@ -3249,6 +3249,8 @@ struct cfg80211_ops { > * beaconing mode (AP, IBSS, Mesh, ...). > * @WIPHY_FLAG_HAS_STATIC_WEP: The device supports static WEP key > installation > * before connection. > + * @WIPHY_FLAG_SUPPORT_TDLS_BUFFER_ST: Device support buffer STA when TDLS is "_ST:" should be "_STA:" > + * established. > */ > enum wiphy_flags { > /* use hole at 0 */ > @@ -3275,6 +3277,7 @@ enum wiphy_flags { > WIPHY_FLAG_SUPPORTS_5_10_MHZ= BIT(22), > WIPHY_FLAG_HAS_CHANNEL_SWITCH = BIT(23), > WIPHY_FLAG_HAS_STATIC_WEP = BIT(24), > + WIPHY_FLAG_SUPPORT_TDLS_BUFFER_STA = BIT(25), > }; > > /** > [...] -- James Cameron http://quozl.netrek.org/
Re: rtl8821ae keep alive not set, connection lost
On Wed, Sep 13, 2017 at 07:39:35PM -0500, Larry Finger wrote: > On 09/13/2017 04:46 PM, James Cameron wrote: > > > >I'll give it some more testing and let you know, but it seems as > >capable of keeping a connection as 4.13 plus my earlier revert. > > Testing went well; removing the call to enable ASPM was as good as changing the DBI read back to 16-bit width. > The change I sent earlier should be as good as reverting the change > to write_byte in your reversion. Yes, that would be the hope. But with the 16-bit DBI read, the register REG_DBI_CTRL+0 is being read as well, in the first read in _rtl8821ae_enable_aspm_back_door, so perhaps reading that register has an unexpected side-effect. Is there any documentation for that register? I see other code writes to REG_DBI_CTRL+3, in _rtl8821ae_check_pcie_dma_hang Evidence of read from REG_DBI_CTRL was captured with an instrumented kernel; git diff http://dev.laptop.org/~quozl/y/1dsQ6B.txt yielding these dmesg lines; [6.010255] rtl_pci: _rtl_pci_update_default_setting const_amdpci_aspm=03 [6.010338] rtl_pci: rtl_pci_enable_aspm [6.034295] ieee80211 phy0: Selected rate control algorithm 'rtl_rc' [6.034806] rtlwifi: rtlwifi: wireless switch is on [6.196958] rtl8821ae :02:00.0 wlp2s0: renamed from wlan0 [7.979186] rtl_pci: rtl_pci_disable_aspm [7.979306] rtl8821ae: _rtl8821ae_check_pcie_dma_hang [8.295360] rtl8821ae: _rtl8821ae_enable_aspm_back_door [8.295437] rtl8821ae: _rtl8821ae_dbi_read 070f -> (@034f) [8.295449] rtl8821ae: _rtl8821ae_dbi_write 070f <- ff (@870c) [8.295462] rtl8821ae: _rtl8821ae_dbi_read 0719 -> 0200 (@034d) [8.295474] rtl8821ae: _rtl8821ae_dbi_write 0719 <- 18 (@2718) [8.295477] rtl_pci: rtl_pci_enable_aspm [8.469734] rtl_pci: rtl_pci_disable_aspm [8.469857] rtl8821ae: _rtl8821ae_check_pcie_dma_hang [8.686955] rtl8821ae: _rtl8821ae_enable_aspm_back_door [8.687013] rtl8821ae: _rtl8821ae_dbi_read 070f -> (@034f) [8.687025] rtl8821ae: _rtl8821ae_dbi_write 070f <- ff (@870c) [8.687038] rtl8821ae: 
_rtl8821ae_dbi_read 0719 -> 0218 (@034d) [8.687050] rtl8821ae: _rtl8821ae_dbi_write 0719 <- 18 (@2718) [8.687053] rtl_pci: rtl_pci_enable_aspm Observe how the windowed read of DBI register 0x70f causes a read of 16-bits at 0x34f, which includes first 8-bits of 0x350 REG_DBI_CTRL. By the way, the cold boot value of DBI register 0x719 is 0x00, and the warm boot value is 0x18, so I'm confident there isn't a comprehensive register reset. It means that BIOS has relevance; and this BIOS is outside my control. BIOS variation may explain difficulty reproducing. > There has been a report (in Russian unfortunately) at > https://www.linux.org.ru/forum/desktop/12620193 of delays in ARP > handling. Thanks. I've considered and excluded ARP handling delay. Though ARP renewal is typical reason for device sleep to end. With the call to enable ASPM disabled, instead of changing the DBI read to 16-bit width, what happens is that the device stops accepting data from the access point, packets are buffered there, and are transmitted as soon as the device makes the next transmission. http://dev.laptop.org/~quozl/z/1dsQBf.txt has the ping and IP tcpdump to confirm this. I've a monitor mode tcpdump I can send by private mail if required. In that the burst of packets shows ICMP echo requests were buffered by the access point. > According to Google translate is as follows: > > > Periodically, Wi-Fi networker rtl8821ae ceases to respond to ARP, > which causes the Internet to end. Wireshark looks quite interesting: > ARP replays can be sent by one large packet a few seconds after > receiving the requests, ie. they seem to be buffered somewhere. Yes, buffering at access point. > I need to explore that ENOBUFS return code. I've seen ENOBUFS up at the application level with ping too, when the original problem happens with v4.10 plus stable. > Your case where the device is unresponsive to pings from another NIC > until the device transmits may also be an ARP problem. 
> > For completeness, are you using the 2.4 of 5 GHz band? What is the > make/model your AP? If possible for you to determine, what firmware > is it running? 2.4 GHz and 5 GHz reproduces the problem. Open or WPA reproduces the problem. Netgear WNDR3800 OpenWrt 12.09-beta, r33312. Several other access points reproduce the problem, including a customer's TP-Link TL-WR1042ND with unknown firmware version. No access point as yet does not reproduce the problem. Hope that helps, thanks for your ideas. -- James Cameron http://quozl.netrek.org/
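The address arithmetic behind the instrumented dmesg lines in this message can be checked with a small user-space sketch. The read-data base of 0x34c and the write-control formula are assumptions inferred from the logged values ((@034f), (@034d), (@870c), (@2718)); they agree with the log but are not taken from any register documentation, and the function names are hypothetical.

```c
#include <stdint.h>

/* Assumed from the log: the DBI read-data window starts at 0x34c. */
#define DBI_RDATA_BASE 0x034cu

/* Register offset that a windowed read of DBI address dbi_addr lands on:
 * the low two address bits select a byte lane within the window. */
static uint16_t dbi_read_offset(uint16_t dbi_addr)
{
	return (uint16_t)(DBI_RDATA_BASE + (dbi_addr & 3u));
}

/* Control word for a windowed write: aligned address plus one
 * write-enable bit per byte lane, starting at bit 12 (inferred). */
static uint16_t dbi_write_ctrl(uint16_t dbi_addr)
{
	unsigned int lane = dbi_addr & 3u;

	return (uint16_t)((dbi_addr & 0xfffcu) | (1u << (lane + 12)));
}

/* Last register byte touched by an access of width bytes at offset. */
static uint16_t last_byte_touched(uint16_t offset, unsigned int width)
{
	return (uint16_t)(offset + width - 1u);
}
```

For DBI address 0x70f the read offset is 0x34f, so a 2-byte read ends at 0x350 (the first byte of REG_DBI_CTRL), while a 1-byte read stays at 0x34f. That is the overlap described above.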
Re: rtl8821ae keep alive not set, connection lost
On Wed, Sep 13, 2017 at 10:01:37AM -0500, Larry Finger wrote: > Thank you very much for making the effort to bisect this problem. I > know that several people have reported the problem, which we cannot > duplicate; however, most of them just say it drops the connection > and do nothing more. In fact, we are lucky to have them even report > which kernel version they are running! Yes, in the reported bugs that style is common; almost animistic, very mystical, and based on heuristics rather than analysis. ;-) > As we do not see the problem, we will be relying on you to help > diagnose the issue. Merely changing the read from 8 to 16 bits > should not cause any change. Agreed. > As _rtl8821ae_dbi_read() is only called from > _rtl8821ae_enable_aspm_back_door(), we want to test turning off > ASPM. The following patch will accomplish this. Unfortunately, the > patch is white-space damaged, thus you will need to apply it > manually. Please try it to see if it helps your connection > loss. Note that ASPM settings are preserved through a module > unload/reload sequence. Thus you will need to reboot after > rebuilding the driver. Went back to 4.13, added your test patch, and built kernel. http://dev.laptop.org/~quozl/z/1dsFOW.txt is dmesg. New symptom occurs; after 23 seconds since last transmission, the device becomes unresponsive to ping from another host, but begins to respond if the device transmits. Flurry of responses then it settles down to regular ping. 
64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=39 ttl=64 time=1.71 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=40 ttl=64 time=1.93 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=41 ttl=64 time=1.71 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=42 ttl=64 time=1.66 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=43 ttl=64 time=1.70 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=44 ttl=64 time=1.69 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=45 ttl=64 time=37.7 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=46 ttl=64 time=383 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=47 ttl=64 time=11464 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=48 ttl=64 time=10465 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=49 ttl=64 time=9465 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=50 ttl=64 time=8466 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=51 ttl=64 time=7466 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=52 ttl=64 time=6466 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=53 ttl=64 time=5466 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=54 ttl=64 time=4467 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=55 ttl=64 time=3467 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=56 ttl=64 time=2468 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=57 ttl=64 time=1468 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=58 ttl=64 time=469 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=59 ttl=64 time=1.79 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=60 ttl=64 time=1.75 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=61 ttl=64 time=1.72 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=62 ttl=64 time=1.68 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=63 ttl=64 time=1.68 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=64 ttl=64 time=1.95 ms 64 bytes from nl3-e.lan (10.0.0.94): icmp_seq=65 ttl=64 time=1.68 ms I'll give it some more testing and let you know, but it seems as capable of keeping a connection as 4.13 plus my earlier revert. 
-- James Cameron http://quozl.netrek.org/
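The ping times above fall by roughly one second per sequence number, which is what one would expect if the replies all came back in a single burst. A rough user-space check, assuming ping's default interval of one request per second (the interval is not stated in the message); the struct and function names are invented for this sketch.

```c
#include <stddef.h>

struct ping_reply {
	int seq;	/* icmp_seq from the ping output */
	long rtt_ms;	/* reported round-trip time, rounded to ms */
};

/* Replies 47..58 from the ping output in the message above. */
static const struct ping_reply burst[] = {
	{47, 11464}, {48, 10465}, {49, 9465}, {50, 8466},
	{51, 7466},  {52, 6466},  {53, 5466}, {54, 4467},
	{55, 3467},  {56, 2468},  {57, 1468}, {58, 469},
};

/* Estimated arrival time of a reply in ms, assuming request number seq
 * was sent at seq * 1000 ms (one request per second). */
static long arrival_ms(struct ping_reply r)
{
	return (long)r.seq * 1000L + r.rtt_ms;
}

/* Difference between the latest and earliest estimated arrival (n >= 1). */
static long arrival_spread_ms(const struct ping_reply *r, size_t n)
{
	long lo = arrival_ms(r[0]);
	long hi = lo;

	for (size_t i = 1; i < n; i++) {
		long t = arrival_ms(r[i]);

		if (t < lo)
			lo = t;
		if (t > hi)
			hi = t;
	}
	return hi - lo;
}
```

Under that assumption, twelve replies spanning eleven seconds of requests arrive within about 5 ms of each other, consistent with packets being held somewhere and released together.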
rtl8821ae keep alive not set, connection lost
Summary: 40b368af4b75 ("rtlwifi: Fix alignment issues") breaks rtl8821ae keep alive, causing "Connection to AP lost" and deauth, but why? Wireless connection is lost after a few seconds or minutes, on every OLPC NL3 laptop with rtl8821ae, with any stable kernel after 4.10.1, and any kernel with 40b368af4b75. dmesg contains wlp2s0: Connection to AP 2c:b0:5d:a6:86:eb lost iw event shows wlp2s0: del station 2c:b0:5d:a6:86:eb wlp2s0 (phy #0): deauth 74:c6:3b:09:b5:0d -> 2c:b0:5d:a6:86:eb reason 4: Disassociated due to inactivity wlp2s0 (phy #0): disconnected (local request) Workaround is to bounce the link, then reconnect; ip link set wlp2s0 down ip link set wlp2s0 up iw dev wlp2s0 connect qz A nearby monitor host captures a deauthentication packet sent by the device. Bisection showed cause is 40b368af4b75 ("rtlwifi: Fix alignment issues") which changes the width of DBI register read. On the face of it, 40b368af4b75 looks correct, especially compared against same function in rtl8723be. I've no idea why reverting fixes the problem. I'm hoping someone here might speculate and suggest ways to test. As keep alive is set through this path, my guess is that keep alive is not being set in the device. Or perhaps reading 16-bits perturbs another register. Is there a way to test? http://dev.laptop.org/~quozl/z/1drtGD.txt dmesg of 4.13 http://dev.laptop.org/~quozl/z/1drt7c.txt dmesg with 4.13 and revert of 40b368af4b75 -- James Cameron http://quozl.netrek.org/
Re: [PATCH] libertas: Fix lbs_prb_rsp_limit_set()
On Fri, Jun 23, 2017 at 06:17:38PM +0300, Dan Carpenter wrote: > The kstrtoul() test was reversed so this always returned -ENOTSUPP. > > Fixes: 27d7f47756f4 ("net: wireless: replace strict_strtoul() with > kstrtoul()") > Signed-off-by: Dan Carpenter <dan.carpen...@oracle.com> Reviewed-by: James Cameron <qu...@laptop.org> -- James Cameron http://quozl.netrek.org/
Re: [ldv-project] [net] libertas: potential race condition
On Tue, Jun 14, 2016 at 05:16:11PM +0400, Pavel Andrianov wrote: > 08.06.2016 02:51, James Cameron пишет: > >On Tue, Jun 07, 2016 at 09:39:55AM -0500, Dan Williams wrote: > >>On Tue, 2016-06-07 at 13:30 +0400, Pavel Andrianov wrote: > >>>Hi! > >>> > >>>There is a potential race condition in > >>>drivers/net/wireless/libertas/libertas.ko. > >>>In the function lbs_hard_start_xmit(..), line 159, a socket buffer > >>>is > >>>written to priv->current_skb with a spin_lock protection. > >>>In the function lbs_mac_event_disconnected(..), lines 50-51, the > >>>field > >>>current_skb is cleaned. There is no protection used. The > >>>corresponding > >>>handlers are activated at the same time in lbs_start_card(..) and > >>>then > >>>may be executed simultaneously. Note, there are two structures > >>>lbs_netdev_ops and mesh_netdev_ops, which have the target handler > >>>lbs_hard_start_xmit. > >>>Is it a real race or I have missed something? > >>Yeah, it looks like it should be grabbing priv->driver_lock before > >>clearing priv->currenttxskb in lbs_mac_event_disconnected(). Care to > >>submit a patch after testing? Do you have any of that hardware? > >I've hardware, with serial console. > > > >Can test any patch, on USB (8388) or SDIO (8686). > > > Hi! > > I've prepare the patch for this issue. Could you test it? > > Thank you. Tested on OLPC XO-1 (usb8388) and XO-1.5 (sd8686) with v4.7-rc3. Confirmed that lbs_mac_event_disconnected is being called on the station when hostapd on access point is given SIGHUP. Longer duration test was; - SSH to station and run "top -d 0.2", - send SIGHUP every six seconds, for 300 cycles, You may add my; Tested-by: James Cameron <qu...@laptop.org> -- James Cameron http://quozl.netrek.org/ -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ldv-project] [net] libertas: potential race condition
On Tue, Jun 07, 2016 at 09:39:55AM -0500, Dan Williams wrote: > On Tue, 2016-06-07 at 13:30 +0400, Pavel Andrianov wrote: > > Hi! > > > > There is a potential race condition in > > drivers/net/wireless/libertas/libertas.ko. > > In the function lbs_hard_start_xmit(..), line 159, a socket buffer > > is > > written to priv->current_skb with a spin_lock protection. > > In the function lbs_mac_event_disconnected(..), lines 50-51, the > > field > > current_skb is cleaned. There is no protection used. The > > corresponding > > handlers are activated at the same time in lbs_start_card(..) and > > then > > may be executed simultaneously. Note, there are two structures > > lbs_netdev_ops and mesh_netdev_ops, which have the target handler > > lbs_hard_start_xmit. > > Is it a real race or I have missed something? > > Yeah, it looks like it should be grabbing priv->driver_lock before > clearing priv->currenttxskb in lbs_mac_event_disconnected(). Care to > submit a patch after testing? Do you have any of that hardware? I've hardware, with serial console. Can test any patch, on USB (8388) or SDIO (8686). -- James Cameron http://quozl.netrek.org/
Re: [PATCH] mwifiex: add __GFP_REPEAT to skb allocation call
On Tue, Mar 29, 2016 at 12:47:20PM +0800, Wei-Ning Huang wrote: > "single skb allocation failure" happens when system is under heavy > memory pressure. Add __GFP_REPEAT to skb allocation call so kernel > attempts to reclaim pages and retry the allocation. Oh, that's interesting, we're back to this symptom again. Nice to see this fix. Heavy memory pressure on 3.5 caused dev_alloc_skb failure in this driver. Tracked at OLPC as #12694. -- James Cameron http://quozl.netrek.org/
Re: [PATCH] iw: Memory leak in error condition
On Thu, Oct 01, 2015 at 01:01:18AM +0200, Ola Olsson wrote: > >From 5239e8e9aa79a131b716398efbf7a1203decbd9b Mon Sep 17 00:00:00 2001 > From: Ola Olsson <ola.ols...@sonymobile.com> > Date: Thu, 1 Oct 2015 00:43:06 +0200 > Subject: [PATCH] iw: Memory leak in error condition Signed-off-by: Ola > Olsson <ola.ols...@sonymobile.com> > > --- > scan.c |2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/scan.c b/scan.c > index e959c1b..f248981 100644 > --- a/scan.c > +++ b/scan.c > @@ -446,6 +446,8 @@ static int handle_scan(struct nl80211_state *state, > if (ies || meshid) { > tmpies = (unsigned char *) malloc(ies_len + meshid_len); > if (!tmpies) > + free(ies); > + free(meshid); > goto nla_put_failure; Braces? { } Without them only free(ies) is conditional; free(meshid) and the goto always run. > if (ies) { > memcpy(tmpies, ies, ies_len); > -- > 1.7.9.5 -- James Cameron http://quozl.linux.org.au/
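The effect of the missing braces in the patch above can be shown with a stand-alone sketch. It does not use the real iw code: the functions and the frees counter are hypothetical stand-ins for free(ies), free(meshid) and the goto, and the first function is laid out the way the compiler parses the patched lines.

```c
#include <stdbool.h>

/* Shape of the patched code as the compiler sees it: only the first
 * statement belongs to the if, the other two always execute. */
static int without_braces(bool alloc_failed, int *frees)
{
	if (alloc_failed)
		(*frees)++;	/* stands in for free(ies): conditional */
	(*frees)++;		/* stands in for free(meshid): always runs */
	goto fail;		/* always taken, even when allocation worked */

	return 0;		/* never reached */
fail:
	return -1;
}

/* Intended behaviour: all three statements guarded by the if. */
static int with_braces(bool alloc_failed, int *frees)
{
	if (alloc_failed) {
		(*frees)++;
		(*frees)++;
		goto fail;
	}
	return 0;
fail:
	return -1;
}
```

With alloc_failed false, without_braces still frees once and returns -1, while with_braces frees nothing and returns 0.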
Re: question about potential integer truncation in mwifiex_set_wapi_ie and mwifiex_set_wps_ie
On Tue, Sep 29, 2015 at 05:21:28PM +0200, PaX Team wrote: > hi all, > > in drivers/net/wireless/mwifiex/sta_ioctl.c the following functions > > mwifiex_set_wpa_ie_helper > mwifiex_set_wapi_ie > mwifiex_set_wps_ie > > can truncate the incoming ie_len argument from u16 to u8 when it gets > stored in mwifiex_private.wpa_ie_len, mwifiex_private.wapi_ie_len and > mwifiex_private.wps_ie_len, respectively. based on some light code > reading it seems a length value of 256 is valid (IEEE_MAX_IE_SIZE and > MWIFIEX_MAX_VSIE_LEN seem to limit it) and thus would get truncated > to 0 when stored in those u8 fields. the question is whether this is > intentional or a bug somewhere. i agree, while there is a test to ensure ie_len is not greater than 256, there is a possibility that it will be exactly 256, which means 256 bytes will be given to memcpy but mwifiex_private.{wpa,wapi,wps}_ie_len will be zero. i suggest changing the lengths to u16. not tested. diff --git a/drivers/net/wireless/mwifiex/main.h b/drivers/net/wireless/mwifiex/main.h index fe12560..b66e9a7 100644 --- a/drivers/net/wireless/mwifiex/main.h +++ b/drivers/net/wireless/mwifiex/main.h @@ -512,14 +512,14 @@ struct mwifiex_private { struct mwifiex_wep_key wep_key[NUM_WEP_KEYS]; u16 wep_key_curr_index; u8 wpa_ie[256]; - u8 wpa_ie_len; + u16 wpa_ie_len; u8 wpa_is_gtk_set; struct host_cmd_ds_802_11_key_material aes_key; struct host_cmd_ds_802_11_key_material_v2 aes_key_v2; u8 wapi_ie[256]; - u8 wapi_ie_len; + u16 wapi_ie_len; u8 *wps_ie; - u8 wps_ie_len; + u16 wps_ie_len; u8 wmm_required; u8 wmm_enabled; u8 wmm_qosinfo; -- James Cameron http://quozl.linux.org.au/
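The truncation discussed above is easy to reproduce outside the kernel. This is a sketch only: the two structs and the store functions are hypothetical and merely mirror the shape of the mwifiex fields (a 256-byte buffer with a u8 or u16 length), including the "not greater than 256" check.

```c
#include <stdint.h>
#include <string.h>

#define IE_BUF_SIZE 256

/* Hypothetical stand-ins for the buffer and length pairs in the message. */
struct ie_store_u8 {
	uint8_t ie[IE_BUF_SIZE];
	uint8_t ie_len;		/* can only hold 0..255 */
};

struct ie_store_u16 {
	uint8_t ie[IE_BUF_SIZE];
	uint16_t ie_len;	/* can hold 256 */
};

/* Copy len bytes and record the length in a u8 field. */
static int store_ie_u8(struct ie_store_u8 *s, const uint8_t *ie, uint16_t len)
{
	if (len > sizeof(s->ie))
		return -1;		/* 256 passes this check */
	memcpy(s->ie, ie, len);
	s->ie_len = (uint8_t)len;	/* 256 is stored as 0 */
	return 0;
}

/* Same copy, but the length field is wide enough for 256. */
static int store_ie_u16(struct ie_store_u16 *s, const uint8_t *ie, uint16_t len)
{
	if (len > sizeof(s->ie))
		return -1;
	memcpy(s->ie, ie, len);
	s->ie_len = len;
	return 0;
}
```

Storing exactly 256 bytes succeeds in both cases, but only the u16 variant remembers how many bytes it copied.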
Re: How to set a scan on a given frequency
On Mon, Aug 31, 2015 at 03:53:18PM -0500, Shengrong Yin wrote: > Hello, > > I was using iw to scan a given frequency. > For example, > iw wlan0 scan freq 2412 | grep freq: > However, the result was scanned ssids with different frequencies > across 2.4 GHz band, which is > freq: 2462 > freq: 2462 > freq: 2437 > freq: 2412 > ... > Why this happened? Shouldn't it return only the ssid with 2412? No. A radio receiver in a wireless device can receive beacons on frequencies adjacent to the one it is tuned to. The signal strength will be lower, but not low enough to prevent reception. If you want to restrict results to the frequency you are interested in, then filter the data after you have received it from the kernel. Note that the value returned to you isn't the frequency of the received radio burst; it is the frequency value carried in the beacon packet. Usually these are the same, but faulty devices, deceptive devices, or high speed movement could make them differ. You should specify a frequency in your scan request if you can, because it shortens the time taken by the scan. If you do not specify a frequency, then the scan must be repeated for every channel. There is a time cost for switching, and a time spent listening on each channel. -- James Cameron http://quozl.linux.org.au/
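Filtering after the scan, as suggested above, might look something like the following. It is a sketch under stated assumptions: struct scan_entry and filter_by_freq are hypothetical and stand for whatever form the scan results take once a program has read them; no nl80211 or iw API is used here.

```c
#include <stddef.h>

/* Hypothetical, simplified scan result: just the advertised frequency
 * and the SSID as reported in the beacon. */
struct scan_entry {
	const char *ssid;
	int freq_mhz;
};

/* Compact the array in place, keeping only entries whose advertised
 * frequency equals freq_mhz. Returns the number of entries kept. */
static size_t filter_by_freq(struct scan_entry *e, size_t n, int freq_mhz)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		if (e[i].freq_mhz == freq_mhz)
			e[kept++] = e[i];
	}
	return kept;
}
```

Applied to results with frequencies 2462, 2462, 2437 and 2412, a filter on 2412 keeps the single matching entry. The scan request itself can still name the frequency to keep the scan short.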
Re: [PATCH] rtlwifi: rtl8723be: disable FW control power save
On Mon, Aug 10, 2015 at 06:47:05PM +0800, Kai Heng Feng wrote: > Do you use Ubuntu Trusty (14.04)? Yes, that's what I'm using. > The rtl8723be firmware is not up-to-date in Trusty's linux-firmware. > You can grab the newer one from the upstream linux-firmware. Thanks, but both seem to work fine (apart from the issue in this thread), and I don't have a list of what has changed. -- James Cameron http://quozl.linux.org.au/
Re: [PATCH] rtlwifi: rtl8723be: disable FW control power save
On Mon, Aug 10, 2015 at 11:38:09AM +0800, AceLan Kao wrote:
> I tried using ips=0 today and found it's not working.
> I ping 8.8.8.8 and got below message within one hour.
> ping: sendmsg: No buffer space available

I use both ips=0 and fwlps=0 with Ubuntu kernel and with 4.1, and the
connection remains stable. fwlps=0 alone was not enough. ips=0 alone
was not enough.

Hope that helps.

--
James Cameron
http://quozl.linux.org.au/
Re: [PATCH v4] Add new mac80211 driver mwlwifi.
On Tue, Jun 30, 2015 at 09:18:44AM -0500, Dan Williams wrote:
> On Tue, 2015-06-30 at 10:21 +0200, Johannes Berg wrote:
> > On Tue, 2015-06-30 at 01:49 +, David Lin wrote:
> > > +++ b/drivers/net/wireless/mwlwifi/Kconfig
> > > @@ -0,0 +1,17 @@
> > > +config MWLWIFI
> > > +	tristate "Marvell Wireless WiFi driver (mwlwifi)"
> > > +	depends on PCI && MAC80211 && MWIFIEX_PCIE=n
> >
> > I still think you need to get rid of this so we can build-test this
> > driver properly.
>
> The OLPC 8388 is another device that has two drivers, libertas and
> libertas_tf.

Also 8686.

> I don't think there's any protection between then, you get whatever
> gets loaded first by the kernel. In that case, I think the answer
> was either (a) only put the driver you want onto the system, or

Yes, for end-user.

> (b) manually manage from userspace.

Yes, for developer testing.

> Given that this Marvell hardware is likely intended for more
> customized use-cases (AP, embedded, etc?) perhaps this would be an
> acceptable option for now...

I tend to agree with Johannes here; the builder of the kernel can
certainly adjust CONFIG_MWLWIFI and CONFIG_MWIFIEX to fit their
scenario, including leaving both enabled.

> Dan
>
> > > +	select FW_LOADER
> > > +	select OF
> >
> > This looks OK, though I get a very strange dependency loop warning
> > from Kconfig here.
> >
> > Looks like the driver now builds almost cleanly with sparse/smatch
> > on 64-bit. Two warnings remain, both are bugs:
> >
> > writew(0x00, (void __iomem *)&priv->pcmd_buf[1]);
> >
> > cannot be right. This memory isn't __iomem, it's
> > dma_alloc_coherent, so a simple write should be done.
> >
> > in mwl_rx_ring_init:
> >
> > rx_hndl->psk_buff = dev_alloc_skb(desc->rx_buf_size);
> > if (skb_linearize(rx_hndl->psk_buff)) {
> >
> > *crash*. You also later check rx_hndl->psk_buff, but well after it
> > already crashed.
> >
> > Also, this code sequence is utterly bogus. Please try to understand
> > why and then remove it.
> >
> > You should also use paged RX since you're allocating *very large*
> > buffers. We found that even alloc_pages(1) will fail eventually,
> > you're doing an order-2 allocation here for every RX skb.
> > At least used paged RX to get it down to order-1.
> >
> > johannes

--
James Cameron
http://quozl.linux.org.au/
Re: [Bug ?] 10ec:b723 Realtek RTL8723BE wireless card drops connection
On Sun, Jun 14, 2015 at 06:13:56PM -0500, Larry Finger wrote:
> On 06/14/2015 06:04 PM, James Cameron wrote:
> > On Sun, Jun 14, 2015 at 10:10:30AM -0500, Larry Finger wrote:
> > > To address your problem, power saving does not work correctly on
> > > this device. That is why there are numerous posts on the web
> > > telling people to use ips=0. It seems that Ubuntu people never
> > > look at anything but the Ubuntu literature; however, I'm sure
> > > that I posted this suggestion there as well.
> > >
> > > The Realtek group is currently rewriting the entire dynamic
> > > management code for all their drivers. When complete, this should
> > > improve performance and should help the power-save condition. No,
> > > I do not know when the new code will be ready, or how much
> > > improvement it will make.
> >
> > Thanks for summary. OLPC is also seeing the issue. Power saving
> > mode impacts battery run time; one of our design goals.
> >
> > ips=0 seems to solve with 3.19, but not fully with 4.1-rc7; still
> > some periods of packet loss.
> >
> > I offer to test any rtl8723be changes.
>
> Please do a bisection between 4.1-rc7 and 3.19.

Thanks. But I was too hasty in reporting a good result. Now no
difference across those kernel versions; still some periods of packet
loss with ips=0.

Workaround is to use both ips=0 and fwlps=0. We'll ship with that
unless we hear of a fix.

--
James Cameron
http://quozl.linux.org.au/
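For anyone wanting to make the ips=0 and fwlps=0 workaround persistent, the two module parameters are normally given on an "options" line in a file under /etc/modprobe.d/. A minimal sketch that renders such a line follows; the helper name is made up for illustration, and the choice of file name (for example rtl8723be.conf) is an assumption, not something stated in this thread.

```python
def modprobe_options_line(module, params):
    """Render one modprobe.d `options` line from a dict of parameter values."""
    rendered = " ".join(f"{name}={value}" for name, value in params.items())
    return f"options {module} {rendered}"


# The combination reported as stable in the message above.
print(modprobe_options_line("rtl8723be", {"ips": 0, "fwlps": 0}))
```

The module has to be reloaded, or the machine rebooted, before the new values take effect.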
Re: [Bug ?] 10ec:b723 Realtek RTL8723BE wireless card drops connection
On Sun, Jun 14, 2015 at 10:10:30AM -0500, Larry Finger wrote:
> To address your problem, power saving does not work correctly on
> this device. That is why there are numerous posts on the web telling
> people to use ips=0. It seems that Ubuntu people never look at
> anything but the Ubuntu literature; however, I'm sure that I posted
> this suggestion there as well.
>
> The Realtek group is currently rewriting the entire dynamic
> management code for all their drivers. When complete, this should
> improve performance and should help the power-save condition. No, I
> do not know when the new code will be ready, or how much improvement
> it will make.

Thanks for summary. OLPC is also seeing the issue. Power saving mode
impacts battery run time; one of our design goals.

ips=0 seems to solve with 3.19, but not fully with 4.1-rc7; still some
periods of packet loss.

I offer to test any rtl8723be changes.

(I'm also looking into IBSS, because Sugar desktop relies on ad-hoc.
No beacons on creating an IBSS through NetworkManager, but beacons are
fine with "iw dev wlan0 ibss join x 2437". But I'm not yet ready to
report problem; still some debugging to do.)

--
James Cameron
http://quozl.linux.org.au/
Re: [RFC/PATCH] mwifiex: Driver - Firmware Glitches
On Thu, Apr 16, 2015 at 11:10:02AM +0200, Florian Achleitner wrote:
> Is the necessity of frequent hardware resets a commonly known issue
> with this chip/firmware? Anybody else experiencing these?

Yes, but how frequent?

> Currently, we see three different scenarios. One of them is
> currently not answered by reset. Refer to the upcoming patch.
>
> (1) mwifiex_cmd_timeout_func: Timeout cmd ..
> Ok, after reset.

See this a lot during heavy testing.

> (2) Firmware wakeup failed..
> Ok, after reset.

Never see this.

> (3) DNLD_CMD: host to card failed.
> No reset triggered. See patch.

Very rarely see this.

However, our experience may not be comparable; we are using 8787 with
a 3.5 kernel, because we haven't the resources to use a later kernel
or get backports working.

Also, we use WOL (wake on lan) heavily; frequent automatic suspends,
with a GPIO wakeup in addition to the SDHCI.

--
James Cameron
http://quozl.linux.org.au/
Re: [PATCH] mwifiex: increase number of probes for specific SSID scans
On Wed, Apr 15, 2015 at 02:01:44AM -0700, Amitkumar Karwar wrote:
> Hi James,
>
> > On Tue, Apr 14, 2015 at 07:49:16AM -0700, Amitkumar Karwar wrote:
> > > It's been observed that device sometimes fails to find AP
> > > configured in hidden SSID in busy environment. We will increase
> > > number of probes for specific SSID scans for getting better
> > > results.
> >
> > I don't like this. It worries me. What is the underlying cause?
> > If it is something other than collision, why?
>
> Idea was to have better chance of finding an AP configured with
> hidden SSID when environment is busy by sending multiple probe
> requests.

Yes, I understand the intention, but I don't understand why busy
environment should cause missed probe response from hidden SSID AP.

Speculating ...

Have you tested this?

Are you sure the probe request is being sent when the channel is
clear?

Are collisions detected?

Is recovery from collision correct?

Are you sure it isn't caused by scan results being too large in busy
environment?

Is scan for specific SSID given priority in scan results, by firmware?

I ask because I'm curious; perhaps there is something else happening
to cause scan failure.

I have reports of scan failure with mwifiex, with 8686 and 8787, but
I've not been able to prove the cause of the problem, because of high
complexity of testing. Customer usually unwilling to go into depth.

> > In scenario of tens to a hundred laptops scanning for specific
> > SSID for ad-hoc in the Sugar desktop environment, this patch may
> > decrease free air time considerably.
>
> You are right. Free air time will be decreased. We have discarded
> this approach considering its consequences.
>
> > Should the number of probes be a choice of user space?
>
> Do you see any potential use case for multiple probe requests?

No use case that doesn't risk interference. I've used it in
diagnosis, and in Open Firmware driver.

> I think, we should stick to current implementation of sending 1
> probe request.

That's fine.

> Regards,
> Amitkumar

--
James Cameron
http://quozl.linux.org.au/
Re: [PATCH] mwifiex: increase number of probes for specific SSID scans
On Tue, Apr 14, 2015 at 07:49:16AM -0700, Amitkumar Karwar wrote:
> It's been observed that device sometimes fails to find AP
> configured in hidden SSID in busy environment. We will increase
> number of probes for specific SSID scans for getting better results.

I don't like this. It worries me. What is the underlying cause? If
it is something other than collision, why?

In scenario of tens to a hundred laptops scanning for specific SSID
for ad-hoc in the Sugar desktop environment, this patch may decrease
free air time considerably.

Should the number of probes be a choice of user space?

--
James Cameron
http://quozl.linux.org.au/
Re: [PATCH 2/2] mwifiex: recover from skb allocation failures during RX
On 24/03/2015, at 1:20 AM, Avinash Patil wrote:
> From: Zhaoyang Liu <li...@marvell.com>
>
> This patch adds recovery mechanism for SDIO RX during SKB allocation
> failures. For allocation failures during multiport aggregation, we
> skip and drop RX packets. For single port read case, we will use
> preallocated card->mpa_rx.buf to complete cmd53 read.

Thanks. Dropping RX data packets is considered safe, as the peer will
retry; but does your patch drop events or command responses?

Last year, I tried something similar, and I found that the driver
would be confused if command responses were dropped.

--
James Cameron
Re: ARP dropped during WPA handshake
On Fri, Mar 13, 2015 at 11:29:34AM -0500, Dan Williams wrote:
> On Fri, 2015-03-13 at 16:53 +0100, voncken wrote:
> > > > Below, a tcpdump capture from sta.
> > > >
> > > > 17:43:12.964096 EAPOL key (3) v2, len 95
> > > > 17:43:12.998439 EAPOL key (3) v1, len 117
> > > > 17:43:13.062409 ARP, Request who-has 10.32.61.100 tell 10.32.0.1, length 28
> > > > 17:43:13.079989 EAPOL key (3) v2, len 151
> > > > 17:43:13.082764 EAPOL key (3) v1, len 95
> > > > 17:43:14.062381 ARP, Request who-has 10.32.61.100 tell 10.32.0.1, length 28
> > > > 17:43:14.127101 ARP, Reply 10.32.61.100 is-at b8:88:e3:45:1d:c6 (oui Unknown), length 46
> > > > 17:43:14.127123 IP 10.69.1.201.41690 > 10.32.61.100.5001: UDP, length 1470
> > > > 17:43:14.127136 IP 10.69.1.201.41690 > 10.32.61.100.5001: UDP, length 1470
> > > >
> > > > You can see the ARP request during the WPA Handshake.
> > >
> > > During the initial WPA handshake the connection is not fully set
> > > up, and so no general traffic can (nor should) pass between the
> > > STA and AP. That includes ARP and any L2/L3+ protocols, except
> > > for EAP and wifi management packets.
> > >
> > > The interface itself must be IFF_UP before it can pass traffic,
> > > including the WPA handshake traffic. IFF_UP only means that the
> > > interface can be configured at the L2 level and the hardware is
> > > active, it does *not* mean the interface can pass traffic.
> > >
> > > Whatever is causing the ARPs shouldn't be doing that yet, and
> > > should be fixed to use the interface's operstate or IFF_LOWER_UP
> > > instead of IFF_UP. Only when the supplicant changes the
> > > interface's operstate to IF_OPER_UP is the interface *actually*
> > > ready to pass traffic. IFF_UP is not sufficient.
> >
> > Thanks for your reply.
> >
> > It seems wpa_supplicant set the operstate to IF_OPER_DORMANT when
> > he received the ASSOCIATED Event from the driver (through netlink).
> > And set the operstate to IF_OPER_UP in case of wpa handshake
> > success.
> >
> > Is it normal the local ip stack send arp when netdev it is on
> > IF_OPER_DORMANT state?
>
> I'm not sure the kernel stack cares much as long as the device is
> up. It is requesting the ARP because some application is attempting
> to communicate with that IP address. That application should
> probably be waiting until the interface is actually ready to
> communicate, which means IF_OPER_UP.
>
> But if this is the first WPA handshake with the AP during the initial
> connection, the wifi device shouldn't even have an IP address yet, so
> nothing should be doing ARP on the interface yet.

I thought that ARP was a means to get an IP address before an
interface had an IP address, so the interface spends some time without
an IP address yet generating ARP.

> Perhaps whatever is assigning the IP address to the interface is
> doing it too early, before the interface is IF_OPER_UP?
>
> Dan
>
> > > > Any suggestion will be appreciate.
> > > >
> > > > Cedric.
> >
> > Thanks for your help.
> >
> > Cedric Voncken

--
James Cameron
http://quozl.linux.org.au/
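The point that an application should wait for IF_OPER_UP rather than IFF_UP can be illustrated from user space. This is a minimal sketch, assuming the Linux sysfs file /sys/class/net/<ifname>/operstate, which reads "up" for IF_OPER_UP and "dormant" for IF_OPER_DORMANT; the function name, the default timings and the injectable reader are choices made for this example only.

```python
import time


def wait_for_oper_up(ifname, timeout=10.0, interval=0.2,
                     read_state=None, sleep=time.sleep):
    """Poll the interface operstate; return True once it reads "up".

    Returns False if the state has not become "up" within `timeout` seconds.
    `read_state` and `sleep` can be replaced, which makes the loop testable
    without a real interface.
    """
    if read_state is None:
        def read_state(name):
            # Assumed sysfs location of the RFC 2863 operational state.
            with open(f"/sys/class/net/{name}/operstate") as f:
                return f.read().strip()

    waited = 0.0
    while True:
        if read_state(ifname) == "up":
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval


# Example with a fake reader: dormant during the handshake, then up.
_states = iter(["dormant", "dormant", "up"])
print(wait_for_oper_up("wlan0", read_state=lambda name: next(_states),
                       sleep=lambda seconds: None))
```

Polling is the simplest form to show; a long-running program would more likely subscribe to link notifications over netlink instead of sleeping in a loop.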
Re: [PATCHv2 9/9] mwifiex: delay skb allocation for RX until cmd53 over
On Fri, Mar 13, 2015 at 05:37:59PM +0530, Avinash Patil wrote:
> From: Zhaoyang Liu <li...@marvell.com>
>
> This patch moves SKB allocation for RX packets from current place
> i.e. after reading MP regs to place where we already have read data
> from SDIO bus ie after cmd53. mp_rx_aggr_setup has been modified
> accordingly to set skb_arr to NULL.
>
> Signed-off-by: Zhaoyang Liu <li...@marvell.com>
> Signed-off-by: Shengzhen Li <s...@marvell.com>
> Reviewed-by: Amitkumar Karwar <akar...@marvell.com>
> Reviewed-by: Cathy Luo <c...@marvell.com>
> Reviewed-by: Avinash Patil <pat...@marvell.com>
> ---
>  drivers/net/wireless/mwifiex/sdio.c | 59 ++---
>  drivers/net/wireless/mwifiex/sdio.h |  8 ++---
>  2 files changed, 33 insertions(+), 34 deletions(-)
>
> diff --git a/drivers/net/wireless/mwifiex/sdio.c b/drivers/net/wireless/mwifiex/sdio.c
> index fdeeb67..330e9d0 100644
> --- a/drivers/net/wireless/mwifiex/sdio.c
> +++ b/drivers/net/wireless/mwifiex/sdio.c

[snip]

> @@ -1538,24 +1550,11 @@ static int mwifiex_process_int_status(struct mwifiex_adapter *adapter)
>  					rx_len);
>  				return -1;
>  			}
> -			rx_len = (u16) (rx_blocks * MWIFIEX_SDIO_BLOCK_SIZE);
> -			skb = mwifiex_alloc_dma_align_buf(rx_len,
> -							  GFP_KERNEL |
> -							  GFP_DMA);
> -
> -			if (!skb) {
> -				dev_err(adapter->dev, "%s: failed to alloc skb",
> -					__func__);
> -				return -1;
> -			}

I like it, because I continue to have problems with dev_alloc_skb
failing, and the "return -1;" that you are removing doesn't seem to
leave the card and driver in a useful state. Your patch is hopefully
an improvement.

Have you done any testing of response after skb allocation failure
before and after your patch?

--
James Cameron
http://quozl.linux.org.au/
Re: [PATCH v2 1/3] mwifiex: add support for SD8801
On Fri, Jan 23, 2015 at 05:09:17PM +0530, Avinash Patil wrote:
> From: Yogesh Ashok Powar <yoge...@marvell.com>
>
> SD8801 is Marvell's 1x1 802.11bgn offering. This patch adds Device
> IDs for SD8801 and also defines card structure which has definition
> for register offsets, buffer sizes etc.
>
> Signed-off-by: Yogesh Ashok Powar <yoge...@marvell.com>
> Signed-off-by: Avinash Patil <pat...@marvell.com>
> Signed-off-by: Nishant Sarmukadam <nisha...@marvell.com>
> Signed-off-by: Cathy Luo <c...@marvell.com>
> Signed-off-by: Frank Huang <fra...@marvell.com>

Reviewed-by: James Cameron <qu...@laptop.org>

(Not tested, still on 8787 with 3.5).

--
James Cameron
http://quozl.linux.org.au/
[RFC PATCH] mwifiex: handle command response in aggregation
Firmware does occasionally pass a command response to the host on the
data port. Ensure it is processed.

http://dev.laptop.org/ticket/12749
---
Seen on device firmwares:
14.66.9.p96
14.66.35.p52
Others not tested.

 drivers/net/wireless/mwifiex/sdio.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/sdio.c b/drivers/net/wireless/mwifiex/sdio.c
index 933dae1..8fe6147 100644
--- a/drivers/net/wireless/mwifiex/sdio.c
+++ b/drivers/net/wireless/mwifiex/sdio.c
@@ -1240,8 +1240,7 @@ static int mwifiex_sdio_card_to_host_mp_aggr(struct mwifiex_adapter *adapter,
 			/* copy pkt to deaggr buf */
 			skb_deaggr = card->mpa_rx.skb_arr[pind];
 
-			if ((pkt_type == MWIFIEX_TYPE_DATA) && (pkt_len <=
-					 card->mpa_rx.len_arr[pind])) {
+			if (pkt_len <= card->mpa_rx.len_arr[pind]) {
 
 				memcpy(skb_deaggr->data, curr_ptr, pkt_len);
 
@@ -1251,7 +1250,7 @@ static int mwifiex_sdio_card_to_host_mp_aggr(struct mwifiex_adapter *adapter,
 				mwifiex_decode_rx_packet(adapter, skb_deaggr,
 							 pkt_type);
 			} else {
-				dev_err(adapter->dev, "wrong aggr pkt:"
+				dev_err(adapter->dev, "bad aggr pkt:"
 					" type=%d len=%d max_len=%d\n",
 					pkt_type, pkt_len,
 					card->mpa_rx.len_arr[pind]);
--
1.9.1

--
James Cameron
http://quozl.linux.org.au/
[PATCH] mwifiex: simplify ad hoc join capability info
While preparing an ad-hoc start command, the capability info bitmap is
needlessly set from the command, and then the ESS bit cleared.

Change to set the bitmap directly without reference to the command.

Signed-off-by: James Cameron <qu...@laptop.org>
---
 drivers/net/wireless/mwifiex/join.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/join.c b/drivers/net/wireless/mwifiex/join.c
index 8d6c259..411a6c2 100644
--- a/drivers/net/wireless/mwifiex/join.c
+++ b/drivers/net/wireless/mwifiex/join.c
@@ -880,9 +880,7 @@ mwifiex_cmd_802_11_ad_hoc_start(struct mwifiex_private *priv,
 
 	/* Set Capability info */
 	bss_desc->cap_info_bitmap |= WLAN_CAPABILITY_IBSS;
-	tmp_cap = le16_to_cpu(adhoc_start->cap_info_bitmap);
-	tmp_cap &= ~WLAN_CAPABILITY_ESS;
-	tmp_cap |= WLAN_CAPABILITY_IBSS;
+	tmp_cap = WLAN_CAPABILITY_IBSS;
 
 	/* Set up privacy in bss_desc */
 	if (priv->sec_info.encryption_mode) {
--
1.9.1

--
James Cameron
http://quozl.linux.org.au/
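The bit arithmetic behind this simplification can be checked in isolation. The sketch below models the removed three-line sequence and the single assignment that replaces it, using the ESS and IBSS capability bit positions from the 802.11 capability field (bit 0 and bit 1). It assumes, as the commit message implies, that the capability field in the command buffer is still zero at this point; the function names are illustrative only.

```python
WLAN_CAPABILITY_ESS = 1 << 0
WLAN_CAPABILITY_IBSS = 1 << 1


def old_tmp_cap(cmd_cap_info_bitmap):
    """Model of the removed code: start from the command field, clear ESS, set IBSS."""
    tmp_cap = cmd_cap_info_bitmap
    tmp_cap &= ~WLAN_CAPABILITY_ESS
    tmp_cap |= WLAN_CAPABILITY_IBSS
    return tmp_cap & 0xFFFF  # the field is 16 bits wide


def new_tmp_cap():
    """Model of the replacement: set the bitmap directly."""
    return WLAN_CAPABILITY_IBSS


# With a zeroed command field (the assumption), both forms agree.
print(old_tmp_cap(0) == new_tmp_cap())
# They would differ only if some other bit were already set in the command.
print(old_tmp_cap(0x0010) == new_tmp_cap())
```

The second comparison shows the one case where the two forms diverge, which is why the assumption about the command field matters.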
Re: [Patch v2 3/5] mwifiex: fix out of memory issue observed for USB chipsets
On Wed, Nov 05, 2014 at 05:04:29PM +0530, Amitkumar Karwar wrote:
> On some platforms, system goes out of memory during heavy Rx traffic
> with our USB chipsets.
>
> In case of SDIO/PCIe, after receiving 50 packets in Rx queue we stop
> processing interrupts till packets pending fall below low threshold
> i.e 20. We don't have similar logic for USB, so if host platform is
> slow, we would hit a case where firmware keeps on pushing packets at
> high speed than driver/kernel can process.
>
> We will stop submitting URBs for Rx data when pending packet count
> reaches high threshold and restart them when enough packets are
> consumed to solve the problem.

Other drivers and user activity can deplete memory. How does this
patch solve the problem when dev_alloc_skb fails? I'm worried the
underlying issue remains; handling out of memory.

--
James Cameron
http://quozl.linux.org.au/
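The high and low threshold scheme described in the quoted patch text (pause at 50 pending packets, resume once fewer than 20 remain) is a form of hysteresis. A toy model follows; the class and method names are invented for this sketch and do not correspond to mwifiex symbols.

```python
HIGH_WATERMARK = 50  # pause receive work once this many packets are pending
LOW_WATERMARK = 20   # resume once the backlog has fallen below this


class RxThrottle:
    """Toy model of pausing and resuming receive submission with hysteresis."""

    def __init__(self, high=HIGH_WATERMARK, low=LOW_WATERMARK):
        self.high = high
        self.low = low
        self.pending = 0
        self.paused = False

    def packet_received(self):
        self.pending += 1
        if self.pending >= self.high:
            self.paused = True

    def packet_consumed(self):
        self.pending -= 1
        # Resume only after dropping below the low mark, not at the high
        # mark, so the state does not flap around a single threshold.
        if self.paused and self.pending < self.low:
            self.paused = False


throttle = RxThrottle()
for _ in range(50):
    throttle.packet_received()
print(throttle.paused)
```

This bounds only the driver's own backlog. It does not help when an allocation fails because something else has exhausted memory, which is the concern raised in the reply above.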
Re: mwifiex_usb_submit_rx_urb: dev_alloc_skb failed when conected to 5GHz
On Tue, Oct 14, 2014 at 10:25:01AM +0200, Belisko Marek wrote:
> Hi Amitkumar,
>
> On Tue, Oct 14, 2014 at 9:08 AM, Amitkumar Karwar <akar...@marvell.com> wrote:
> > Hi Marek,
> >
> > > I tried both (slightly modified as we're in 3.9 kernel) but issue
> > > is still reproducible. My patch against 3.9 sources:
> >
> > Thanks a lot for the tests.
> >
> > > One thing which is not yet still clear to me why kernel console
> > > is completely unresponsive when receiving packets in high rates.
> > > When use iperf (on client) with -b40m it is OK but when increase
> > > to -b100m then console is completely unresponsive until iperf
> > > finish.
> >
> > Does the system recover when -b100M iperf is finished? Can we run
> > iperf with -b40M later? Do you see dev_alloc_skb failed messages in
> > dmesg when console is unresponsive?
>
> When we get dev_alloc_skb failed then interface is dead (cannot
> ping ...) so no recovery is possible only system reboot.

This symptom was familiar to me, but on sdio.c, which is very
different code.

I've had a brief look at usb.c and offer the following comments:

- a list of six data endpoint urb is allocated in
  mwifiex_usb_rx_init, because MWIFIEX_RX_DATA_URB is 6,

- when data endpoint urb is submitted, a new skb is allocated, in
  mwifiex_usb_submit_rx_urb, and this is the only source of
  dev_alloc_skb failed message,

- in normal situation, when data endpoint urb is complete, skb is
  either freed or handed up to mwifiex_usb_recv, and the urb is
  resubmitted, which causes a new skb to be allocated.

- if dev_alloc_skb failed message appears, one data endpoint urb has
  been lost and is not re-used,

- if six dev_alloc_skb failed messages appear, the interface should
  be dead for data receive only.

Amitkumar mentioned this on 9th October; corresponding URB won't get
submitted. I think this should be fixed; dev_alloc_skb should be
harmless failure, please retry.

I don't see why interface is dead with only one dev_alloc_skb failed
message.

> I don't see dev_alloc_skb failed when console is unresponsive.
>
> > > Any other ideas what to change to check?
> > > Thanks.
> >
> > Could you please share dmesg log with dynamic debug enabled (using
> > attached script) captured when the problem occurs?
>
> I tried to capture logs but when enable DYNAMIC_DEBUG I cannot
> reproduce issue (running test 30 minutes without allocation
> failure).

Yes, I've seen similar; turn on debugging, and timing critical bug
goes away. Serial console? If so, try turning it off, and logging to
dmesg buffer only.

--
James Cameron
http://quozl.linux.org.au/
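The reasoning in the list above, that each failed allocation permanently removes one of the six receive URBs unless the submission is retried, can be modelled with a small simulation. This is a toy model only: the function, its parameters and the way failures are injected are invented for illustration, and it does not reproduce the real usb.c control flow.

```python
RX_DATA_URBS = 6  # mirrors MWIFIEX_RX_DATA_URB as described above


def simulate(completions, failing_allocs, retry):
    """Return how many receive URBs are still in flight afterwards.

    completions:    number of URB completion events to process.
    failing_allocs: set of 0-based allocation attempt numbers that fail.
    retry:          if True, a URB whose buffer allocation failed is parked
                    and resubmitted on the next completion; if False it is
                    lost for good.
    """
    attempts = 0
    in_flight = RX_DATA_URBS
    parked = 0

    def alloc_ok():
        nonlocal attempts
        ok = attempts not in failing_allocs
        attempts += 1
        return ok

    for _event in range(completions):
        if in_flight == 0:
            break  # no URB left to complete: data receive is dead
        in_flight -= 1            # one URB completes
        to_submit = 1 + parked    # resubmit it, plus any parked URBs
        parked = 0
        for _urb in range(to_submit):
            if alloc_ok():
                in_flight += 1
            elif retry:
                parked += 1
            # without retry the URB silently drops out of the pool
    return in_flight


failures = {3, 10, 20, 30, 40, 50}
print(simulate(100, failures, retry=False))
print(simulate(100, failures, retry=True))
```

In this model the retry is driven by the next completion, so it cannot recover once every URB has been parked; a real fix would presumably need a timer or deferred work to retry in that case.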
Re: mwifiex_usb_submit_rx_urb: dev_alloc_skb failed when conected to 5GHz
On Wed, Sep 17, 2014 at 03:52:52AM -0700, Amitkumar Karwar wrote:
> Hi BR,
>
> > Dear Amitkumar Karwar,
> >
> > some additional info.
> >
> > On Thu, Sep 11, 2014 at 5:09 PM, Amitkumar Karwar <akar...@marvell.com> wrote:
> > > Hi BR,
> > >
> > > > I'm using 3.9 mainline mwifiex driver for wireless usb card.
> > > > Doing some throughput testing (with iperf) in 5GHz I got
> > > > following failures:
> > > > [ 221.521799] usb 1-1: mwifiex_usb_submit_rx_urb: dev_alloc_skb failed
> > >
> > > This is skb allocation failure returned by kernel. 4k buffer is
> > > always allocated for Rx packets. This issue doesn't seem to be
> > > specific to 5Ghz.
> >
> > Yes you're right. I can reproduce issue also with 2.4GHz (doing
> > iperf testing as mentioned in other email) by pinging device with
> > card. I checked which which size fails to allocate and it's 4096
> > bytes. I was looking to changes in never kernel releases but I
> > cannot find anything obvious.
> >
> > > > When connected to 2.4GHz I cannot reproduce issue though. I'm
> > > > using FW version mwifiex 1.0 (14.68.29.p26).
> > >
> > > Could you please provide the platform details? How often the
> > > problem occurs during throughput testing? Are there any specific
> > > steps?
> >
> > One more observation is that when problem occurred complete system
> > is unresponsive (console is almost completely dead).
>
> Thanks for the more information. Skb alloc failure should be
> gracefully handled. We will look into this issue.

If you get time, I'd also appreciate a look into the issue on sdio.c
during data receive. When dev_alloc_skb fails the interrupt handler
does not rewind the driver state in preparation for a retry. This is
not graceful.

http://dev.laptop.org/ticket/12694 has details, and an adequate
solution we are using in 3.5 to rewind the driver state:

http://dev.laptop.org/git/olpc-kernel/commit/?h=arm-3.5&id=59fcaf10cce5bbdc370ec1c262b12aeb66ed1dca

We're using 8787.

> > I can workaround issue by decreasing iperf bandwidth to ~40m. I
> > think in this situation we're running out of memory by exhaustive
> > skb allocations.
>
> Actually 6 4K size buffers are being allocated for Rx and Tx data
> during traffic. Probably your platform runs out of memory after
> these allocations. Could you please try changing this number
> (MWIFIEX_TX_DATA_URB/MWIFIEX_RX_DATA_URB macros) to 3?
>
> Regards,
> Amitkumar Karwar

--
James Cameron
http://quozl.linux.org.au/