Re: [PATCH v2 00/19] prevent bounds-check bypass via speculative execution
Do you think that the appropriate patches could be copied to the
appropriate people please?

On Thu, Jan 11, 2018 at 04:46:24PM -0800, Dan Williams wrote:
> Changes since v1 [1]:
> * fixup the ifence definition to use alternative_2 per recent AMD
>   changes in tip/x86/pti (Tom)
>
> * drop 'nospec_ptr' (Linus, Mark)
>
> * rename 'nospec_array_ptr' to 'array_ptr' (Alexei)
>
> * rename 'nospec_barrier' to 'ifence' (Peter, Ingo)
>
> * clean up occasions of 'variable assignment in if()' (Sergei, Stephen)
>
> * make 'array_ptr' use a mask instead of an architectural ifence by
>   default (Linus, Alexei)
>
> * provide a command line and compile-time opt-in to the ifence
>   mechanism, if an architecture provides 'ifence_array_ptr'.
>
> * provide an optimized mask generation helper, 'array_ptr_mask', for
>   x86 (Linus)
>
> * move 'get_user' hardening from '__range_not_ok' to '__uaccess_begin'
>   (Linus)
>
> * drop "Thermal/int340x: prevent bounds-check..." since userspace does
>   not have arbitrary control over the 'trip' index (Srinivas)
>
> * update the changelog of "net: mpls: prevent bounds-check..." and keep
>   it in the series to continue the debate about Spectre hygiene patches.
>   (Eric).
>
> * record a reviewed-by from Laurent on "[media] uvcvideo: prevent
>   bounds-check..."
>
> * update the cover letter
>
> [1]: https://lwn.net/Articles/743376/
>
> ---
>
> Quoting Mark's original RFC:
>
> "Recently, Google Project Zero discovered several classes of attack
> against speculative execution. One of these, known as variant-1, allows
> explicit bounds checks to be bypassed under speculation, providing an
> arbitrary read gadget. Further details can be found on the GPZ blog [2]
> and the Documentation patch in this series."
>
> This series incorporates Mark Rutland's latest ARM changes and adds
> the x86 specific implementation of 'ifence_array_ptr'.
> That ifence based approach is provided as an opt-in fallback, but the
> default mitigation, '__array_ptr', uses a 'mask' approach that removes
> conditional branch instructions, and otherwise aims to redirect
> speculation to use a NULL pointer rather than a user controlled value.
>
> The mask is generated by the following from Alexei, and Linus:
>
>     mask = ~(long)(_i | (_s - 1 - _i)) >> (BITS_PER_LONG - 1);
>
> ...and Linus provided an optimized mask generation helper for x86:
>
>     asm ("cmpq %1,%2; sbbq %0,%0;"
>             :"=r" (mask)
>             :"r"(sz),"r" (idx)
>             :"cc");
>
> The 'array_ptr' mechanism can be switched between 'mask' and 'ifence'
> via the spectre_v1={mask,ifence} command line option, and the
> compile-time default is set by selecting either CONFIG_SPECTRE1_MASK or
> CONFIG_SPECTRE1_IFENCE.
>
> The 'array_ptr' infrastructure is the primary focus of this patch set. The
> individual patches that perform 'array_ptr' conversions are a point in
> time (i.e. earlier kernel, early analysis tooling, x86 only etc...)
> start at finding some of these gadgets.
>
> Another consideration for reviewing these patches is the 'hygiene'
> argument. When a patch refers to hygiene it is concerned with stopping
> speculation on an unconstrained or insufficiently constrained pointer
> value under userspace control. That by itself is not sufficient for
> attack (per current understanding) [3], but it is a necessary
> pre-condition. So 'hygiene' refers to cleaning up those suspect
> pointers regardless of whether they are usable as a gadget.
>
> These patches are also available via the 'nospec-v2' git branch
> here:
>
>     git://git.kernel.org/pub/scm/linux/kernel/git/djbw/linux nospec-v2
>
> Note that the BPF fix for Spectre variant1 is merged in the bpf.git
> tree [4], and is not included in this branch.
>
> [2]: https://googleprojectzero.blogspot.co.uk/2018/01/reading-privileged-memory-with-side.html
> [3]: https://spectreattack.com/spectre.pdf
> [4]: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=b2157399cc98
>
> ---
>
> Dan Williams (16):
>   x86: implement ifence()
>   x86: implement ifence_array_ptr() and array_ptr_mask()
>   asm-generic/barrier: mask speculative execution flows
>   x86: introduce __uaccess_begin_nospec and ASM_IFENCE
>   x86: use __uaccess_begin_nospec and ASM_IFENCE in get_user paths
>   ipv6: prevent bounds-check bypass via speculative execution
>   ipv4: prevent bounds-check bypass via speculative execution
>   vfs, fdtable: prevent bounds-check bypass via speculative execution
>   userns: prevent bounds-check bypass via speculative execution
>   udf: prevent bounds-check bypass via speculative execution
>   [media] uvcvideo: prevent bounds-check bypass via speculative execution
>   carl9170: prevent bounds-check bypass via speculative execution
>   p54: prevent bounds-check bypass via speculative execution
>   qla2xxx: prevent bounds-check bypass via speculative execution
>
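The mask expression quoted in the cover letter can be checked outside the kernel. What follows is only a rough Python model of that one C expression, assuming a 64-bit long and sizes that fit in the positive range of a long. The helper names `as_signed_long` and `clamp_index` are made up for this illustration and are not part of the patch set (`array_ptr_mask` is borrowed from the series only as a label for the model).

```python
BITS_PER_LONG = 64                     # assumption: 64-bit kernel
WORD_MASK = (1 << BITS_PER_LONG) - 1   # Python ints are unbounded, so wrap by hand


def as_signed_long(value):
    """Reinterpret a 64-bit pattern as a two's-complement signed long."""
    value &= WORD_MASK
    if value >> (BITS_PER_LONG - 1):
        return value - (1 << BITS_PER_LONG)
    return value


def array_ptr_mask(idx, size):
    """Model of: mask = ~(long)(_i | (_s - 1 - _i)) >> (BITS_PER_LONG - 1)."""
    combined = (idx | (size - 1 - idx)) & WORD_MASK
    # The top bit of 'combined' is set when idx is huge or when
    # size - 1 - idx underflowed, i.e. when idx >= size.
    signed = as_signed_long(combined)
    # '~' binds tighter than '>>', and '>>' on a negative Python int is an
    # arithmetic shift, so the sign bit is smeared across the whole word.
    return (~signed >> (BITS_PER_LONG - 1)) & WORD_MASK


def clamp_index(idx, size):
    """Hypothetical helper: an out-of-bounds index collapses to 0."""
    return idx & array_ptr_mask(idx, size)


if __name__ == "__main__":
    for idx in (0, 3, 7, 8, 100, 1 << 63):
        print(f"idx={idx:#x} size=8 mask={array_ptr_mask(idx, 8):#018x}")
```

In-bounds indices yield an all-ones mask and everything else yields zero, without a conditional branch. In the series the mask is applied to a pointer, so the out-of-bounds case becomes NULL; the model above applies it to an index only to keep the example small.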
Re: AP mode with Broadcom 4330
Arend,

Did this bug ever get fixed, or are 4330s still ending up advertising
BRCM_TEST_SSID when they're put into AP mode with mainline kernels?

It would be good to know whether I can drop my patch for this from my
kernel tree and still have working AP mode.

Thanks.

On Mon, Jul 31, 2017 at 10:28:50PM +0200, Arend van Spriel wrote:
> On 31-07-17 14:59, Russell King - ARM Linux wrote:
> > On Fri, Jul 28, 2017 at 09:50:21PM +0200, Arend van Spriel wrote:
> >> I was going to agree with you, but having second thoughts. There are
> >> actually two use-cases that need to be handled properly. The regular AP
> >> case and the MBSS case. In case of MBSS the initial AP interface will
> >> have mbss set to false and subsequent AP interfaces will have mbss set
> >> to true, but in firmware this has to be configured inverted. That is
> >> what the code above is doing. However, this indeed breaks the regular AP
> >> case for firmwares that abuse that setting for testing purposes (no idea
> >> why that is in a released firmware).
> >
> > Maybe detect the BRCM_TEST_SSID string in the firmware file (as it's
> > broken up amongst other data, it's not trivial) and disable mbss for
> > such firmware?  Alternatively, maybe blacklist mbss for some firmware
> > versions?
>
> Well. It seems the 43362 chip also had this and we disabled mbss for that
> chipset. So we may do that for 4330 as well.
>
> > Do the firmware versions that include this "abuse" actually have
> > functional mbss?
>
> Digging through our internal bug database I found a remark that
> BRCM_TEST_SSID showing up means mbss is not functional.
>
> > There's also the obvious question: which firmware is recommended for
> > the 4330?
>
> We tend to rely on what is released to AOSP as our team does not have
> the bandwidth to go through the release process. I checked and it is
> still the same so matches what is in linux-firmware.
>
> Regards,
> Arend

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up
Re: [EXTERNAL] wl1837: poor performance?
On Fri, Oct 13, 2017 at 07:43:37AM +0000, Reizer, Eyal wrote:
> Sorry for top posting.
> Signal level on wlan0 seems low (-79 dbm) are you sure there are antennas
> connected?

Thanks for the reply.

In development environments, it's common to have the AP and station
nearby, which will give good signal.  Out of the development environment,
the AP and station may be separated by several 10s of ft and objects in
the way.  That's the case here, so about -80dBm is to be expected.

As I mentioned, there are two stations each with different chipsets, both
with the same signal level being reported, both at a similar distance
from the AP, but one performs way better than the other.  I've found the
reason for this (see below.)

I should also mention that this is installed remotely, and I'm debugging
it over the 'net, involving this very Wi-Fi connection.

The radio environment is not excessively populated - both systems are
within a tower with >2ft thick stone and lime mortar walls which radio
has difficulty penetrating, which is itself in a park with very few other
buildings around.  I'm using the 2.4G channel 8, I've seen some other APs
on channel 1 when outside but almost never inside the tower.

nmcli (used to?) occasionally reports that the wl1837 can see another AP
("CARWIFI") on channel 1, but that's rare - I suspect it requires someone
to park their car in direct line of literal sight through a tower window
from the machine.  I suspect NM noticed that before I configured the
BSSID of the associated AP as I haven't recently noticed NM reporting any
other networks.  Could also be that non-wifi enabled cars are parked in
those spaces!

> In addition what type of module is used here?

It's a WL1837 integrated onto SolidRun's iMX6 microsom.  I'm not sure what
other information in terms of "module" you're asking for.

> Did you use the configure_device.sh scripts as documented in the
> "wlconf" manual?
No, because quite honestly the wilink tooling is something of a mess in
terms of trying to find the right tooling.  This is what I ended up doing:

1. cloning git://github.com/TI-OpenLink/18xx-ti-utils
2. taking the wl18xx driver default configuration (from debugfs).
3. taking the ti wilink driver conf.h files.
4. massaging the conf.h files into a form that wlconf can read
5. generating a new struct.bin
6. changing:
     wl18xx.phy.number_of_assembled_ant2_4 = 0x01
     wl18xx.phy.number_of_assembled_ant5 = 0x00
   since this variant of the microsom I have only populates one antenna.
   (The layout allows for the design to be extended to a second antenna.)

I've since found the _right_ repository for the tooling
(git://git.ti.com/wilink8-wlan/18xx-ti-utils.git), and compared the
resulting files, and confirmed that my new struct.bin is identical to the
one in the right repository.

There are some differences between the two files (- = mine,
+ = configure_device.sh generated):

-core.sg.params = 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x00aa, 0x0032, 0x, 0x, 0x, 0x00c8, 0x, 0x, 0x, 0x0001, 0x, 0x003c, 0x, 0x04b0, 0x, 0x0001, 0x0003, 0x0006, 0x, 0x, 0x0002, 0x, 0x, 0x0003, 0x, 0x0002, 0x001e, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x
+core.sg.params = 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x000f, 0x001b, 0x0011, 0x00aa, 0x0032, 0x0064, 0x0320, 0x00c8, 0x00c8, 0x, 0x, 0x, 0x0001, 0x, 0x003c, 0x1388, 0x04b0, 0x03e8, 0x0001, 0x0003, 0x0006, 0x000a, 0x000a, 0x0002, 0x0005, 0x001e, 0x0003, 0x000a, 0x0002, 0x, 0x0019, 0x0019, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x
-core.conn.dynamic_ps_timeout = 0x05dc
+core.conn.dynamic_ps_timeout = 0x0096
-core.sched_scan.num_short_intervals = 0x0e
+core.sched_scan.num_short_intervals = 0x0d
-core.fwlog.mem_blocks = 0x00
+core.fwlog.mem_blocks = 0x02
-wl18xx.ap_sleep.max_stations_thresh = 0x00
-wl18xx.ap_sleep.idle_conn_thresh = 0x00
+wl18xx.ap_sleep.max_stations_thresh = 0x04
+wl18xx.ap_sleep.idle_conn_thresh = 0x08

The core.sg.param
wl1837: poor performance?
On Thu, Oct 12, 2017 at 10:59:25AM +0100, Russell King - ARM Linux wrote:
> It looks like ti wilink is unmaintained, so I've added some people who
> have touched the driver recently.
>
> Running wl1837 on a Hummingboard2 (iMX6 Dual core) I've seen one instance
> of the warning below.  Luckily, the recovery worked and connectivity was
> maintained.

I also have a question about seemingly poor performance of this driver,
so starting a new thread.

When running a ping on two clients (and only two clients) connected to
a single AP:

AP: Broadcom brcmfmac bcm4330 Linux machine.

With a ping running on each client machine, the AP reports:

# iw wlan0 station dump
Station a4:xx:xx:xx:xx:xx (on wlan0)    <-- rtl8192eu
        inactive time:  0 ms
        rx packets:     6846
        tx packets:     2108
        tx failed:      0
        tx bitrate:     13.0 MBit/s
        rx bitrate:     19.5 MBit/s
        authorized:     yes
        authenticated:  yes
        WMM/WME:        yes
        TDLS peer:      no
Station 00:0f:00:xx:xx:xx (on wlan0)    <-- wl1837
        inactive time:  0 ms
        rx packets:     589233
        tx packets:     360969
        tx failed:      737
        tx bitrate:     13.0 MBit/s
        rx bitrate:     19.5 MBit/s
        authorized:     yes
        authenticated:  yes
        WMM/WME:        yes
        TDLS peer:      no

Client 1: wl1837 using mainline driver:

# iw wlan0 link
Connected to xx:xx:xx:xx:xx:xx (on wlan0)
        SSID:
        freq: 2447
        RX: 223158 bytes (2307 packets)
        TX: 1502828 bytes (1955 packets)
        signal: -78 dBm
        tx bitrate: 19.5 MBit/s MCS 2

        bss flags:      short-preamble short-slot-time
        dtim period:    2
        beacon int:     100

Ping statistics:

64 bytes from 192.168.250.1: icmp_seq=85 ttl=64 time=10.6 ms
64 bytes from 192.168.250.1: icmp_seq=86 ttl=64 time=10.5 ms
64 bytes from 192.168.250.1: icmp_seq=87 ttl=64 time=10.9 ms
64 bytes from 192.168.250.1: icmp_seq=88 ttl=64 time=10.4 ms
64 bytes from 192.168.250.1: icmp_seq=89 ttl=64 time=12.2 ms
64 bytes from 192.168.250.1: icmp_seq=90 ttl=64 time=12.6 ms

90 packets transmitted, 88 received, 2% packet loss, time 89223ms
rtt min/avg/max/mdev = 3.359/58.323/2275.188/273.244 ms, pipe 3

Client 2: rtl8192eu (using vendor driver):

# iw wlan0 link
Connected to xx:xx:xx:xx:xx:xx (on wlan0)
        SSID:
        freq: 2447
        signal: -78 dBm
        tx bitrate: 65.0 MBit/s

Ping statistics:

64 bytes from 192.168.250.1: icmp_seq=85 ttl=64 time=6.55 ms
64 bytes from 192.168.250.1: icmp_seq=86 ttl=64 time=3.34 ms
64 bytes from 192.168.250.1: icmp_seq=87 ttl=64 time=3.96 ms
64 bytes from 192.168.250.1: icmp_seq=88 ttl=64 time=3.47 ms
64 bytes from 192.168.250.1: icmp_seq=89 ttl=64 time=3.48 ms
64 bytes from 192.168.250.1: icmp_seq=90 ttl=64 time=3.35 ms

90 packets transmitted, 90 received, 0% packet loss, time 89124ms
rtt min/avg/max/mdev = 3.191/4.902/26.607/3.978 ms

The difference is quite marked: the rtl8192eu seems to perform much better
than the wl1837 even though the wl1837 is located closer to the AP than
the rtl8192eu.  Most of the ping times via the wl1837 look to be around
10ms, vs 3ms for the rtl8192eu, as can be seen above.

Individually, the ping times are very similar to the above.

All wifi interfaces have power save disabled, either via the driver
module options where possible, or via

  iw wlan0 set power_save 0

So, why does wl1837 appear to be performing not as well as the rtl8192eu?
Are out of tree vendor drivers in fact better than kernel-merged
drivers? :)

The machines are synchronised via NTP across the wifi link as best they
can manage with the irregularity in the network (the rtl8192eu syncs
/way/ better than the wl1837 - rtl8192eu is always sync'd to within 250us,
the wl1837 keeps reporting milliseconds offset in the NTP loop stats):

From the AP:
23:54:18.026441 IP 192.168.250.4 > 192.168.250.1: ICMP echo request, id 1112, seq 110, length 64
23:54:18.026604 IP 192.168.250.1 > 192.168.250.4: ICMP echo reply, id 1112, seq 110, length 64
23:54:19.036396 IP 192.168.250.4 > 192.168.250.1: ICMP echo request, id 1112, seq 111, length 64
23:54:19.036560 IP 192.168.250.1 > 192.168.250.4: ICMP echo reply, id 1112, seq 111, length 64
23:54:20.036874 IP 192.168.250.4 > 192.168.250.1: ICMP echo request, id 1112, seq 112, length 64
23:54:20.037025 IP 192.168.250.1 > 192.168.250.4: ICMP echo reply, id 1112, seq 112, length 64

From the wl1837 client:
23:54:18.028504 IP 192.168.250.4 > 192.168.250.1: ICMP echo request, id 1112, seq 110, length 64
23:54:18.032074 IP 192.168.250.1 > 192.168.250.4: ICMP echo reply, id 1112, seq 110, length 64
23:54:19.030464 IP 192.168.250.4 > 192.168.250.1: ICMP echo request, id 1112, seq 111, length 64
23:54:19.043282 IP 192.168.250.1 > 192.168.250.4: ICMP echo reply, id 1112, seq 111, length 64
23:54:20.031082 IP 192.168.250.4 > 192.1
wl1837: ERROR SW watchdog interrupt received! starting recovery
It looks like ti wilink is unmaintained, so I've added some people who
have touched the driver recently.

Running wl1837 on a Hummingboard2 (iMX6 Dual core) I've seen one instance
of the warning below.  Luckily, the recovery worked and connectivity was
maintained.

...
wlcore: Association completed.

After 19532s from boot, I saw:

wlcore: ERROR SW watchdog interrupt received! starting recovery.
[ cut here ]
WARNING: CPU: 0 PID: 244 at drivers/net/wireless/ti/wlcore/main.c:796 wl12xx_queue_recovery_work+0x68/0x70 [wlcore]
Modules linked in: nfsd wl18xx wlcore mac80211 cfg80211 caam_jr imx_media_ic(C) imx_media_vdic(C) snd_soc_imx_sgtl5000 snd_soc_fsl_asoc_card imx_media_csi(C) imx_media_capture(C) snd_soc_imx_audmux wlcore_sdio snd_soc_sgtl5000 mux_mmio video_mux mux_core ci_hdrc_imx ci_hdrc caam udc_core usbmisc_imx imx_sdma imx2_wdt coda v4l2_mem2mem videobuf2_v4l2 rc_cec imx_vdoa videobuf2_dma_contig videobuf2_core videobuf2_vmalloc videobuf2_memops imx_thermal snd_soc_fsl_ssi imx_pcm_dma imx_media(C) dw_hdmi_ahb_audio dw_hdmi_cec imx_media_common(C) v4l2_fwnode etnaviv
CPU: 0 PID: 244 Comm: irq/243-wl18xx Tainted: G C 4.14.0-rc1+ #2209
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
Backtrace:
[] (dump_backtrace) from [] (show_stack+0x18/0x1c)
 r6:6013 r5: r4: r3:
[] (show_stack) from [] (dump_stack+0xa4/0xdc)
[] (dump_stack) from [] (__warn+0xdc/0x108)
 r6:bf376a48 r5: r4: r3:c0a41530
[] (__warn) from [] (warn_slowpath_null+0x28/0x30)
 r10:ee309950 r8:ee30973c r7: r6:ee309788 r5:ee309704 r4:ee3096e0
[] (warn_slowpath_null) from [] (wl12xx_queue_recovery_work+0x68/0x70 [wlcore])
[] (wl12xx_queue_recovery_work [wlcore]) from [] (wlcore_irq+0x15c/0x174 [wlcore])
 r4:ee3096e0 r3:0001
[] (wlcore_irq [wlcore]) from [] (irq_thread_fn+0x24/0x3c)
 r10:c00a46ec r8:ee349b00 r7:ef2ffc00 r6:ef2ffc00 r5: r4:ee349b00
[] (irq_thread_fn) from [] (irq_thread+0x128/0x1ec)
 r6:0001 r5: r4:ee349b24 r3:0004
[] (irq_thread) from [] (kthread+0x150/0x198)
[19532.504033] r10:c00a4784 r9:ef111d10 r8:ee349b00 r7:ee2b9680 r6:ee349c00 r5: r4:ee2b9600
[] (kthread) from [] (ret_from_fork+0x14/0x3c)
 r10: r9: r8: r7: r6: r5:c005cc08 r4:ee349c00 r3:ed9a8000
---[ end trace b35f1ada6f716c27 ]---
wlcore: Hardware recovery in progress. FW ver: Rev 8.9.0.0.75
wlcore: pc: 0x116424, hint_sts: 0x count: 1
wlcore: down
ieee80211 phy0: Hardware restart was requested
wlcore: PHY firmware version: Rev 8.2.0.0.240
wlcore: firmware booted (Rev 8.9.0.0.75)
wlcore: Association completed.

The interrupt, according to /proc/interrupts, shows:

           CPU0       CPU1
243:      32387          0  gpio-mxc   4 Level     wl18xx

although that's from about a day or so after boot.

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up
Re: AP mode with Broadcom 4330
On Tue, Aug 15, 2017 at 10:58:42AM +0100, Russell King - ARM Linux wrote:
> Sorry for the confusion - the problem with iwlwifi turned out to be a
> lack of /dev/random entropy, causing hostapd to forcefully deauth the
> client - that was hidden when running hostapd from systemd and is only
> visible if you run hostapd manually.
>
> (I have other problems there - manually starting hostapd works every
> time, but when started using systemctl start hostapd, systemctl status
> hostapd always reports that it's started but exited and it definitely
> isn't running... I'm just hitting one problem after another here with
> wireless, I'm quite sure this tech hates me!)
>
> However, I'm still having problems with the Realtek not getting further
> than _allegedly_ sending the auth frames - I'm not convinced that the
> driver is actually sending anything yet.  I can't see anything suggesting
> it is from iwlwifi in monitor mode, and enabling all the debug for the
> rtl8xxxu driver doesn't give the slightest hint that this driver is
> doing anything remotely useful to transmit these frames.  For instance:
>
> [41742.979480] usb 1-1: rtl8xxxu_read32(0440) = 0x000f, len 4
> [41742.985826] usb 1-1: rtl8xxxu_write32(0440) = 0x000f
> [41742.994470] usb 1-1: rtl8xxxu_write8(0480) = 0x04
> [41742.999430] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 1/3)
> [41743.206163] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 2/3)
> [41743.414148] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 3/3)
> [41743.622138] wlan0: authentication with 6c:ad:f8:1d:4c:d9 timed out
> [41743.629100] usb 1-1: rtl8xxxu_write8(0618) = 0x00
> [41743.636490] usb 1-1: rtl8xxxu_write8(0619) = 0x00
> [41743.641573] usb 1-1: rtl8xxxu_write8(061a) = 0x00
>
> How can it send auth packets without writing to any registers (this is
> with all debug options set in /sys/module/rtl8xxxu/parameters/debug !)
>
> So, I don't think the 4330 has a problem, I think it's all down to
> the Realtek driver being buggy.

And things get even weirder - if I reboot, rtl8xxxu fails to associate:

[   38.377649] wlan0: authenticate with 6c:ad:f8:1d:4c:d9
[   38.410252] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 1/3)
[   38.616678] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 2/3)
[   38.824660] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 3/3)
[   39.032645] wlan0: authentication with 6c:ad:f8:1d:4c:d9 timed out

if I then rmmod the rtl8xxxu module and reinsert the exact same module:

[   49.914724] wlan0: authenticate with 6c:ad:f8:1d:4c:d9
[   49.935765] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 1/3)
[   49.943182] wlan0: authenticated
[   49.956116] wlan0: associate with 6c:ad:f8:1d:4c:d9 (try 1/3)
[   49.971504] wlan0: RX AssocResp from 6c:ad:f8:1d:4c:d9 (capab=0x411 status=0 aid=2)
[   49.985466] usb 1-1: rtl8xxxu_bss_info_changed: HT supported
[   49.997918] wlan0: associated

This looks like a rtl8xxxu driver / core mac80211 bug, so I'll take it
to the rtl8xxx folk now.

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up
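When comparing a failing and a succeeding attempt like the two above, it can help to pull the timestamps out of the kernel log and look at the spacing between events. A small sketch follows; the log excerpt is copied from the failing case in this message, while the regular expression and the helper names (`parse_log`, `auth_gaps`) are assumptions made for this example and not part of any existing tool.

```python
import re

# Failing attempt, copied from the message above.
LOG = """\
[   38.377649] wlan0: authenticate with 6c:ad:f8:1d:4c:d9
[   38.410252] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 1/3)
[   38.616678] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 2/3)
[   38.824660] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 3/3)
[   39.032645] wlan0: authentication with 6c:ad:f8:1d:4c:d9 timed out
"""

# "[ seconds.micros] source: message" as printed by dmesg.
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(\S+): (.*)")


def parse_log(text):
    """Return (timestamp, source, message) tuples for every matching line."""
    return [(float(ts), src, msg) for ts, src, msg in LINE_RE.findall(text)]


def auth_gaps(events):
    """Seconds between consecutive 'send auth' attempts and the final timeout."""
    marks = [ts for ts, _src, msg in events
             if msg.startswith("send auth") or msg.endswith("timed out")]
    return [round(later - earlier, 6) for earlier, later in zip(marks, marks[1:])]


if __name__ == "__main__":
    for gap in auth_gaps(parse_log(LOG)):
        print(f"{gap:.6f} s")
```

For the failing case each gap comes out at a little over 0.2 s, so the retries are evenly spaced and nothing in the log shows an authentication response arriving in between. In the successful case quoted above, "authenticated" is logged within about 8 ms of the first attempt.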
Re: AP mode with Broadcom 4330
On Tue, Aug 15, 2017 at 11:42:12AM +0200, Arend van Spriel wrote:
> On 15-08-17 10:22, Russell King - ARM Linux wrote:
> >On Mon, Aug 14, 2017 at 11:25:03PM -0700, ros...@gmail.com wrote:
> >>If using rtlwifi, you could try using rtl8xxxu and see if you get
> >>similar results. ¯\_(ツ)_/¯
> >
> >I'm using rtl8xxxu.  I don't think rtlwifi supports the 8192eu - it
> >certainly does not list the device id:
> >Bus 001 Device 002: ID 0bda:818b Realtek Semiconductor Corp.
> >
> >As an extra data point, trying to associate to the 4330 with an Intel
> >client gives:
> >
> >[8821752.691490] wlan0: authenticate with 6c:ad:f8:1d:4c:d9
> >[8821752.693448] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 1/3)
> >[8821752.696230] wlan0: authenticated
> >[8821752.697493] wlan0: associate with 6c:ad:f8:1d:4c:d9 (try 1/3)
> >[8821752.700816] wlan0: RX AssocResp from 6c:ad:f8:1d:4c:d9 (capab=0x411 status=0 aid=1)
> >[8821752.704407] wlan0: associated
> >[8821755.814844] wlan0: deauthenticated from 6c:ad:f8:1d:4c:d9 (Reason: 2=PREV_AUTH_NOT_VALID)
> >
> >which gets slightly further but ultimately still fails.
>
> Hi Russell,
>
> On vacation this week, but it is raining over here so have some moments to
> kill. Here a couple of things to try:
>
> 1) try without encryption.
> 2) does hostapd log show anything interesting.
> 3) can you use the Intel client to make a sniff.
>
> I can check what could trigger the firmware after 3 sec. to deauth. Just not
> sure if I will get to that this week. Could be failing EAPOL handshake.

Sorry for the confusion - the problem with iwlwifi turned out to be a
lack of /dev/random entropy, causing hostapd to forcefully deauth the
client - that was hidden when running hostapd from systemd and is only
visible if you run hostapd manually.

(I have other problems there - manually starting hostapd works every
time, but when started using systemctl start hostapd, systemctl status
hostapd always reports that it's started but exited and it definitely
isn't running... I'm just hitting one problem after another here with
wireless, I'm quite sure this tech hates me!)

However, I'm still having problems with the Realtek not getting further
than _allegedly_ sending the auth frames - I'm not convinced that the
driver is actually sending anything yet.  I can't see anything suggesting
it is from iwlwifi in monitor mode, and enabling all the debug for the
rtl8xxxu driver doesn't give the slightest hint that this driver is
doing anything remotely useful to transmit these frames.  For instance:

[41742.979480] usb 1-1: rtl8xxxu_read32(0440) = 0x000f, len 4
[41742.985826] usb 1-1: rtl8xxxu_write32(0440) = 0x000f
[41742.994470] usb 1-1: rtl8xxxu_write8(0480) = 0x04
[41742.999430] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 1/3)
[41743.206163] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 2/3)
[41743.414148] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 3/3)
[41743.622138] wlan0: authentication with 6c:ad:f8:1d:4c:d9 timed out
[41743.629100] usb 1-1: rtl8xxxu_write8(0618) = 0x00
[41743.636490] usb 1-1: rtl8xxxu_write8(0619) = 0x00
[41743.641573] usb 1-1: rtl8xxxu_write8(061a) = 0x00

How can it send auth packets without writing to any registers (this is
with all debug options set in /sys/module/rtl8xxxu/parameters/debug !)

So, I don't think the 4330 has a problem, I think it's all down to
the Realtek driver being buggy.

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up
Re: AP mode with Broadcom 4330
On Mon, Aug 14, 2017 at 11:25:03PM -0700, ros...@gmail.com wrote:
> If using rtlwifi, you could try using rtl8xxxu and see if you get
> similar results. ¯\_(ツ)_/¯

I'm using rtl8xxxu.  I don't think rtlwifi supports the 8192eu - it
certainly does not list the device id:
Bus 001 Device 002: ID 0bda:818b Realtek Semiconductor Corp.

As an extra data point, trying to associate to the 4330 with an Intel
client gives:

[8821752.691490] wlan0: authenticate with 6c:ad:f8:1d:4c:d9
[8821752.693448] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 1/3)
[8821752.696230] wlan0: authenticated
[8821752.697493] wlan0: associate with 6c:ad:f8:1d:4c:d9 (try 1/3)
[8821752.700816] wlan0: RX AssocResp from 6c:ad:f8:1d:4c:d9 (capab=0x411 status=0 aid=1)
[8821752.704407] wlan0: associated
[8821755.814844] wlan0: deauthenticated from 6c:ad:f8:1d:4c:d9 (Reason: 2=PREV_AUTH_NOT_VALID)

which gets slightly further but ultimately still fails.

> On Tue, 2017-08-15 at 00:30 +0100, Russell King - ARM Linux wrote:
> > On Fri, Jul 28, 2017 at 09:50:21PM +0200, Arend van Spriel wrote:
> > > On 28-07-17 19:49, Russell King - ARM Linux wrote:
> > > > Replacing that "(!mbss)" with "mbss" results in AP mode working
> > > > on the 4330.  However, I suspect:
> > > >
> > > >                 if (brcmf_feat_is_enabled(ifp, BRCMF_FEAT_MBSS))
> > > >                         brcmf_fil_iovar_int_set(ifp, "mbss", mbss);
> > > >
> > > > actually makes much more sense.
> > > >
> > > > Given that this is direct firmware interaction, I can't say which
> > > > is correct - all I can say is that mainline kernels are currently
> > > > broken.
> > >
> > > Indeed. I have to come up with a proper fix for both scenarios.
> > > Thanks for the report.
> >
> > I'm now on 4.13-rc4, and running with my patch.  I've configured AP
> > mode:
> >
> > Interface wlan0
> >         ifindex 3
> >         wdev 0x1
> >         addr 6c:ad:f8:1d:4c:d9
> >         ssid Time
> >         type AP
> >         wiphy 0
> >         channel 5 (2432 MHz), width: 20 MHz, center1: 2432 MHz
> >
> > using NetworkManager with a WPA2 key.  However, a Realtek RTL8192EU
> > client is trying to connect to it, but is unable:
> >
> > [ 1750.497436] wlan0: authenticate with 6c:ad:f8:1d:4c:d9
> > [ 1750.530929] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 1/3)
> > [ 1750.738795] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 2/3)
> > [ 1750.946780] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 3/3)
> > [ 1751.154764] wlan0: authentication with 6c:ad:f8:1d:4c:d9 timed out
> >
> > The antennas are definitely within range (finger to thumb distance)
> > and the Realtek can definitely see the BCM4330 in AP mode.  I don't
> > see the interrupts for the SDIO interface increment while the client
> > tries to connect.
> >
> > The WPA2 password is definitely the same on both ends.
> >
> > Any ideas how to debug this?
> >
> > Thanks.
> >

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up
Re: AP mode with Broadcom 4330
On Fri, Jul 28, 2017 at 09:50:21PM +0200, Arend van Spriel wrote:
> On 28-07-17 19:49, Russell King - ARM Linux wrote:
> > Replacing that "(!mbss)" with "mbss" results in AP mode working on the
> > 4330.  However, I suspect:
> >
> >                 if (brcmf_feat_is_enabled(ifp, BRCMF_FEAT_MBSS))
> >                         brcmf_fil_iovar_int_set(ifp, "mbss", mbss);
> >
> > actually makes much more sense.
> >
> > Given that this is direct firmware interaction, I can't say which is
> > correct - all I can say is that mainline kernels are currently broken.
>
> Indeed. I have to come up with a proper fix for both scenarios. Thanks
> for the report.

I'm now on 4.13-rc4, and running with my patch.  I've configured AP mode:

Interface wlan0
        ifindex 3
        wdev 0x1
        addr 6c:ad:f8:1d:4c:d9
        ssid Time
        type AP
        wiphy 0
        channel 5 (2432 MHz), width: 20 MHz, center1: 2432 MHz

using NetworkManager with a WPA2 key.  However, a Realtek RTL8192EU
client is trying to connect to it, but is unable:

[ 1750.497436] wlan0: authenticate with 6c:ad:f8:1d:4c:d9
[ 1750.530929] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 1/3)
[ 1750.738795] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 2/3)
[ 1750.946780] wlan0: send auth to 6c:ad:f8:1d:4c:d9 (try 3/3)
[ 1751.154764] wlan0: authentication with 6c:ad:f8:1d:4c:d9 timed out

The antennas are definitely within range (finger to thumb distance)
and the Realtek can definitely see the BCM4330 in AP mode.  I don't
see the interrupts for the SDIO interface increment while the client
tries to connect.

The WPA2 password is definitely the same on both ends.

Any ideas how to debug this?

Thanks.

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up
Re: AP mode with Broadcom 4330
On Fri, Jul 28, 2017 at 09:50:21PM +0200, Arend van Spriel wrote:
> I was going to agree with you, but having second thoughts. There are
> actually two use-cases that need to be handled properly. The regular AP
> case and the MBSS case. In case of MBSS the initial AP interface will
> have mbss set to false and subsequent AP interfaces will have mbss set
> to true, but in firmware this has to be configured inverted. That is
> what the code above is doing. However, this indeed breaks the regular AP
> case for firmwares that abuse that setting for testing purposes (no idea
> why that is in a released firmware).

Maybe detect the BRCM_TEST_SSID string in the firmware file (as it's
broken up amongst other data, it's not trivial) and disable mbss for
such firmware?  Alternatively, maybe blacklist mbss for some firmware
versions?

Do the firmware versions that include this "abuse" actually have
functional mbss?

There's also the obvious question: which firmware is recommended for
the 4330?

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
Re: AP mode with Broadcom 4330
On Fri, Jul 28, 2017 at 03:15:03PM +0100, Russell King - ARM Linux wrote: > Hi, > > I've been struggling yesterday and today trying to configure AP mode > with the Broadcom 4330 on a SolidRun Hummingboard2, using the 2013 > firmware: > > Firmware version = wl0: Jan 23 2013 17:47:32 version 5.90.195.114 FWID > 01-f9e7e464 > > People tell me that this works with SR's 3.14 kernel, but I'd prefer > to use mainline (4.13-rc2). Whenever I try to configure AP mode via > Network Manager or hostapd (on Debian Jessie), the SSID I ask for and > the MAC address does not appear on other wifi clients. wlan0's > MAC is 6c:ad:f8:1d:4c:d9. I've just found the cause of this. What it comes down to is this commit: commit a44aa4001a86d46f936ca449e5d6c268446bfae2 Author: Hante Meuleman <meule...@broadcom.com> Date: Wed Dec 3 21:05:33 2014 +0100 brcmfmac: add multiple BSS support. This patch adds support for multiple BSS interfaces (AP). In total three AP configurations can be created. In order to use multiple BSS firmware needs to support it. Reviewed-by: Arend Van Spriel <ar...@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <piete...@broadcom.com> Signed-off-by: Hante Meuleman <meule...@broadcom.com> Signed-off-by: Arend van Spriel <ar...@broadcom.com> Signed-off-by: John W. Linville <linvi...@tuxdriver.com> which adds this hunk to brcmf_cfg80211_start_ap() if (dev_role == NL80211_IFTYPE_AP) { + if ((brcmf_feat_is_enabled(ifp, BRCMF_FEAT_MBSS)) && (!mbss)) + brcmf_fil_iovar_int_set(ifp, "mbss", 1); + What this is saying is: "if the device supports MBSS, and MBSS was not requested (from ifp->vif->mbss), then *ENABLE* MBSS." That's clearly nonsense. Replacing that "(!mbss)" with "mbss" results in AP mode working on the 4330. However, I suspect: if (brcmf_feat_is_enabled(ifp, BRCMF_FEAT_MBSS)) brcmf_fil_iovar_int_set(ifp, "mbss", mbss); actually makes much more sense. 
Given that this is direct firmware interaction, I can't say which is correct - all I can say is that mainline kernels are currently broken.
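To make the three variants discussed above concrete, here is a small stand-alone sketch (plain C, not driver code) of the value each one would write to the firmware's "mbss" iovar. The function names and the NO_WRITE marker are made up for illustration; only the conditions come from the hunk and the two alternatives above.

```c
#include <stdbool.h>

#define NO_WRITE (-1)	/* illustration only: the iovar is not touched */

/* Current mainline condition: feature present and MBSS *not* requested. */
static int mbss_mainline(bool feat_mbss, bool mbss)
{
	return (feat_mbss && !mbss) ? 1 : NO_WRITE;
}

/* First alternative: replace "(!mbss)" with "mbss". */
static int mbss_flipped(bool feat_mbss, bool mbss)
{
	return (feat_mbss && mbss) ? 1 : NO_WRITE;
}

/* Second alternative: always pass the requested state through. */
static int mbss_passthrough(bool feat_mbss, bool mbss)
{
	return feat_mbss ? (int)mbss : NO_WRITE;
}
```

For the regular single-AP case (feature present, mbss false) the first variant writes 1, the second writes nothing and the third writes 0. Which of these the firmware actually expects is not something this sketch can answer.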
AP mode with Broadcom 4330
Hi, I've been struggling yesterday and today trying to configure AP mode with the Broadcom 4330 on a SolidRun Hummingboard2, using the 2013 firmware: Firmware version = wl0: Jan 23 2013 17:47:32 version 5.90.195.114 FWID 01-f9e7e464 People tell me that this works with SR's 3.14 kernel, but I'd prefer to use mainline (4.13-rc2). Whenever I try to configure AP mode via Network Manager or hostapd (on Debian Jessie), the SSID I ask for and the MAC address does not appear on other wifi clients. wlan0's MAC is 6c:ad:f8:1d:4c:d9. However, I have recently noticed that this pops up on clients when AP mode is enabled: BSS 00:10:18:f1:f2:f3(on wlan0) TSF: 80810271 usec (0d, 00:01:20) freq: 2412 beacon interval: 10 TUs capability: ESS (0x0001) signal: -15.00 dBm last seen: 3203 ms ago SSID: BRCM_TEST_SSID Supported rates: 1.0* 2.0* 5.5* 11.0* DS Parameter set: channel 1 IBSS ATIM window: 0 TUsBSS 52:0d:10:41:e9:99(on wlan0) TSF: 21849896478 usec (0d, 06:04:09) freq: 2462 beacon interval: 100 TUs capability: ESS Privacy ShortPreamble ShortSlotTime (0x0431) signal: -80.00 dBm last seen: 3020 ms ago Information elements from Probe Response frame: SSID: Virgin Media Supported rates: 1.0* 2.0* 5.5* 11.0* 6.0 9.0 12.0 18.0 DS Parameter set: channel 11 Country: GB Environment: Indoor/Outdoor Channels [1 - 13] @ 20 dBm ERP: Extended supported rates: 24.0 36.0 48.0 54.0 HT capabilities: Capabilities: 0x1ad RX LDPC HT20 SM Power Save disabled RX HT20 SGI TX STBC RX STBC 1-stream Max AMSDU length: 3839 bytes No DSSS/CCK HT40 Maximum RX AMPDU length 65535 bytes (exponent: 0x003) Minimum RX AMPDU time spacing: 8 usec (0x06) HT TX/RX MCS rate indexes supported: 0-15 HT operation: * primary channel: 11 * secondary channel offset: no secondary * STA channel width: 20 MHz * RIFS: 1 * HT protection: no * non-GF present: 1 * OBSS non-GF present: 0 * dual beacon: 0 * dual CTS protection: 0 * STBC beacon: 0 * L-SIG TXOP Prot: 0 * PCO active: 0 * PCO phase: 0 Overlapping BSS scan params: * passive 
dwell: 20 TUs * active dwell: 10 TUs * channel width trigger scan interval: 300 s * scan passive total per channel: 200 TUs * scan active total per channel: 20 TUs * BSS width channel transition delay factor: 5 * OBSS Scan Activity Threshold: 0.25 % Extended capabilities: HT Information Exchange Supported, TFS, WNM-Sleep Mode, TIM Broadcast, BSS Transition, 6 WMM: * Parameter version 1 * u-APSD * BE: CW 15-1023, AIFSN 3 * BK: CW 15-1023, AIFSN 7 * VI: CW 7-15, AIFSN 2, TXOP 3008 usec * VO: CW 3-7, AIFSN 2, TXOP 1504 usec Vendor specific: OUI 00:03:7f, data: 01 01 00 00 ff 7f RSN: * Version: 1 * Group cipher: CCMP * Pairwise ciphers: CCMP * Authentication suites: IEEE 802.1X * Capabilities: 1-PTKSA-RC 1-GTKSA-RC (0x) This is when using this hostapd configuration file: interface=wlan0 driver=nl80211 ssid=Time channel=1 hw_mode=g wpa=2 wpa_passphrase=FooBarBazBat wpa_pairwise=CCMP TKIP Enabling tracing via /sys/kernel/debug/tracing/events/cfg80211/rdev_start_ap/enable gives: hostapd-2213 [000] 15637.517729: rdev_start_ap: phy0, netdev:wlan0(3), AP settings - ssid: Time, band: 0, control freq: 2412, width: 0, cf1: 2412, cf2: 0, beacon interval: 100, dtim period: 2, hidden ssid: 0, wpa versions: 2, privacy: true, auth type: 8, inactivity timeout: 0 So the right SSID is being requested. Enabling debug (4096+6) in the brcmfmac driver gives: brcmfmac: brcmf_sdio_bus_txctl Enter brcmfmac: brcmf_sdio_dpc Enter brcmfmac: brcmf_sdio_isr Enter brcmfmac: brcmf_sdio_dpc Enter brcmfmac: brcmf_sdio_dpc Dongle reports CHIPACTIVE brcmfmac: brcmf_sdio_tx_ctrlframe Enter brcmfmac: brcmf_sdio_bus_rxctl Enter brcmfmac: brcmf_sdio_isr Enter brcmfmac: brcmf_sdio_dpc Enter brcmfmac: brcmf_sdio_readframes Enter brcmfmac: brcmf_sdio_read_control Enter brcmfmac: brcmf_fil_iovar_data_get ifidx=0, name=chanspec, len=4 brcmutil: data : 01 2b 00 00 .+.. brcmfmac: brcmf_cfg80211_get_tx_power Enter
[PATCH 4.10-rc3 00/13] net: dsa: remove unnecessary phy.h include
Including phy.h and phy_fixed.h into net/dsa.h causes phy*.h to be an unnecessary dependency for quite a large amount of the kernel. There's very little which actually requires definitions from phy.h in net/dsa.h - the include itself only wants the declaration of a couple of structures and IFNAMSIZ. Add linux/if.h for IFNAMSIZ, declarations for the structures, phy.h to mv88e6xxx.h as it needs it for phy_interface_t, and remove both phy.h and phy_fixed.h from net/dsa.h. This patch reduces from around 800 files rebuilt to around 40 - even with ccache, the time difference is noticeable. In order to make this change, several drivers need to be updated to include necessary headers that they were picking up through this include. This has resulted in a much larger patch series. I'm assuming the 0-day builder has had 24 hours with this series, and hasn't reported any further issues with it - the last issue was two weeks ago (before I became ill) which I fixed over the last weekend. I'm hoping this doesn't conflict with what's already in net-next...
 arch/mips/cavium-octeon/octeon-platform.c             | 4
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h                 | 1 +
 drivers/net/ethernet/broadcom/bgmac.c                 | 2 ++
 drivers/net/ethernet/cadence/macb.h                   | 2 ++
 drivers/net/ethernet/cavium/liquidio/lio_main.c       | 1 +
 drivers/net/ethernet/cavium/liquidio/lio_vf_main.c    | 1 +
 drivers/net/ethernet/cavium/liquidio/octeon_console.c | 1 +
 drivers/net/ethernet/freescale/fman/fman_memac.c      | 1 +
 drivers/net/ethernet/marvell/mvneta.c                 | 1 +
 drivers/net/ethernet/qualcomm/emac/emac-sgmii.c       | 1 +
 drivers/net/usb/lan78xx.c                             | 1 +
 drivers/net/wireless/ath/ath5k/ahb.c                  | 2 +-
 drivers/target/iscsi/iscsi_target_login.c             | 1 +
 include/net/dsa.h                                     | 6 --
 net/core/netprio_cgroup.c                             | 1 +
 net/sunrpc/xprtrdma/svc_rdma_backchannel.c            | 1 +
 16 files changed, 20 insertions(+), 7 deletions(-)
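The header change relies on the fact that a forward declaration is enough when a header only stores or passes pointers to a structure. A toy illustration of that idea (all names here are hypothetical, not the real net/dsa.h contents):

```c
#include <stddef.h>

#define EXAMPLE_IFNAMSIZ 16	/* stand-in for IFNAMSIZ from linux/if.h */

/* Forward declaration instead of including the defining header. */
struct example_phy;

struct example_port {
	char name[EXAMPLE_IFNAMSIZ];
	struct example_phy *phy;	/* pointer to an incomplete type is fine */
};

/* Only the pointer value is inspected, so the full type is never needed. */
static int example_port_has_phy(const struct example_port *port)
{
	return port->phy != NULL;
}
```

Only code that dereferences the pointer needs the full definition, which is why the drivers that really use phy.h have to include it themselves after this change.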
Re: ath9k ARMv7 OOPS in v4.8.6, v4.2.8
On Wed, Nov 23, 2016 at 08:59:17PM +, Jason Cooper wrote: > As requested on irc: Thanks. > 7f0: ea02b 800> 7f4: e7970102ldr r0, [r7, r2, lsl #2] > 7f8: ebfebl 0 > 7fc: e0844000add r4, r4, r0 > 800: e300a000movwsl, #0 > 804: e28b2001add r2, fp, #1 > 808: e340a000movtsl, #0 > 80c: e3a01004mov r1, #4 > 810: e1aamov r0, sl > 814: ebfebl 0 <_find_next_bit_le> > 818: e5953000ldr r3, [r5] > 81c: e153cmp r0, r3 > 820: e1a0b000mov fp, r0 > 824: e2802008add r2, r0, #8 > 828: baf1blt 7f4 Okay, so i was 0, so running UP probably isn't going to help. r7 is also spec_priv->rfs_chan_spec_scan. So, I think the question is... how is this NULL - and has it always been NULL... -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net.
Re: ath9k ARMv7 OOPS in v4.8.6, v4.2.8
On Wed, Nov 23, 2016 at 07:15:39PM +, Jason Cooper wrote:
> --- oops from v4.8.6 #2 --
> [42059.303625] Unable to handle kernel NULL pointer dereference at virtual
> address 0020
> [42059.311799] pgd = c0004000
> [42059.314522] [0020] *pgd=
> [42059.318162] Internal error: Oops: 17 [#1] SMP ARM
> [42059.322889] Modules linked in: ath9k ath9k_common ath9k_hw ath
> [42059.328809] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.6 #37
> [42059.334755] Hardware name: Marvell Armada 370/XP (Device Tree)
> [42059.340613] task: c0b091c0 task.stack: c0b0
> [42059.345176] PC is at ath_cmn_process_fft+0xa0/0x578 [ath9k_common]
> [42059.351388] LR is at ath_cmn_process_fft+0xc4/0x578 [ath9k_common]
> [42059.357598] pc : []lr : []psr: 8153
> [42059.357598] sp : c0b01cd0 ip : fp :
> [42059.369127] r10: c0b034d4 r9 : 0069 r8 : 006c
> [42059.374374] r7 : r6 : dcfbd340 r5 : c0b03da0 r4 :
> [42059.380930] r3 : 0001 r2 : 0008 r1 : 0004 r0 :

Well, the good news is that it's reproducible. It looks like it could be this:

static int ath_cmn_is_fft_buf_full(struct ath_spec_scan_priv *spec_priv)
{
	for_each_online_cpu(i)
		ret += relay_buf_full(rc->buf[i]);

where i = 8 (r2) and rc->buf is r7. That's just a guess though, as there's precious little to go on with the Code: line - modern GCCs don't give us much with the Code: line anymore to figure out what's going on without the exact object files.

e5933000ldr r3, [r3]
e1d330b4ldrhr3, [r3, #4]
e58d3030str r3, [sp, #48] ; 0x30
ea02b 1ce7970102ldr r0, [r7, r2, lsl #2]

What makes me wonder though is that if i=8, that means you must have a system with 9 online CPUs, which is probably unlikely - or maybe that's the problem, for_each_online_cpu() is going wrong...
If it's not that line of code, I don't see what else it would be based on the output of my compiler - there's only one case in my disassembly that corresponds with the single code line that we have to go on, and it's this: a44: e5983020ldr r3, [r8, #32] a48: e793010aldr r0, [r3, sl, lsl #2] <=== a4c: ebfebl 0 a50: e0844000add r4, r4, r0 a54: e59f9434ldr r9, [pc, #1076] a58: e28a2001add r2, sl, #1 a5c: e3a01004mov r1, #4 a60: e1a9mov r0, r9 a64: ebfebl 0 <_find_next_bit_le> a68: e5953000ldr r3, [r5] a6c: e153cmp r0, r3 a70: e1a0a000mov sl, r0 a74: baf2blt a44 I'm debating now about whether we need to dump more of the code in the oops - both before and after the faulting instruction... -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net.
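For reference, the faulting load `ldr r0, [r7, r2, lsl #2]` reads from r7 + (r2 << 2). A minimal sketch of that arithmetic in plain C (the helper name is made up, and treating r7 as 0 is an assumption, since its value is not legible in the register dump):

```c
#include <stdint.h>

/* Effective address of "ldr rX, [base, index, lsl #shift]". */
static uint32_t ldr_effective_address(uint32_t base, uint32_t index,
				      unsigned int shift)
{
	return base + (index << shift);
}
```

With a base of 0 and an index of 8 (the r2 value in the dump) this gives 0x20, which is consistent with the reported fault address ending in 0020.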
Re: [PATCH] hostap: avoid uninitialized variable use in hfa384x_get_rid
On Wed, Jan 27, 2016 at 02:45:26PM +0100, Arnd Bergmann wrote:
> To ensure we get consistent error handling here, this changes the code
> to only set rlen if we actually read data correctly, which also takes
> care of the warning.

It may be a good idea to do the job better. Looking at the code:

	struct hfa384x_rid_hdr rec;

	spin_lock_bh(>baplock);
	res = hfa384x_setup_bap(dev, BAP0, rid, 0);
	if (!res)
		res = hfa384x_from_bap(dev, BAP0, &rec, sizeof(rec));

The only thing which initialises any of "rec" is that function call. The following lines are:

	if (le16_to_cpu(rec.len) == 0) {
		/* RID not available */
		res = -ENODATA;
	}

	rlen = (le16_to_cpu(rec.len) - 1) * 2;

So, why give the compiler a hard time as you're doing, why make the code harder to read. What's wrong with:

	spin_lock_bh(>baplock);
	res = hfa384x_setup_bap(dev, BAP0, rid, 0);
	if (res)
		goto unlock;

	res = hfa384x_from_bap(dev, BAP0, &rec, sizeof(rec));
	if (res)
		goto unlock;

	if (le16_to_cpu(rec.len) == 0) {
		/* RID not available */
		res = -ENODATA;
		goto unlock;
	}

	rlen = (le16_to_cpu(rec.len) - 1) * 2;
	if (exact_len && rlen != len) {
		printk(KERN_DEBUG "%s: hfa384x_get_rid - RID len mismatch: rid=0x%04x, len=%d (expected %d)\n", dev->name, rid, rlen, len);
		res = -ENODATA;
		goto unlock;
	}

	res = hfa384x_from_bap(dev, BAP0, buf, len);

unlock:
	spin_unlock_bh(>baplock);

?
Re: [PATCH 000/182] Rid struct gpio_chip from container_of() usage
On Wed, Dec 09, 2015 at 02:08:35PM +0100, Linus Walleij wrote: > Because we want to have a proper userspace ABI for GPIO chips, > which involves using a character device that the user opens > and closes. While the character device is open, the underlying > kernel objects must not go away. Okay, so you stop the gpio_chip struct from going away. What about the code which is called via gpio_chip - say, if userspace keeps the chardev open, and someone rmmod's the driver providing the GPIO driver? I'm not sure that splitting up objects in this way really solves anything at all. Yes, it divorces the driver's private data from the subsystem data, but is that really an advantage? Network drivers have a similar issue, and the way this problem is solved there is that alloc_netdev() is always used to allocate the subsystem data structure and any driver private data structure as one allocation, and the lifetime of both objects remains under the control of the subsystem. The allocated memory is only freed when the last user goes away, and net has protection to prevent an unregistered driver from being called (via locks on every path into the layer.) Things get a little more complex with gpio, because there's the issue that some methods are spinlocked while others can take semaphores, but it should be possible to come up with a solution to that - maybe an atomic_t which is incremented whenever we're in some operation provided it's >= 0 (otherwise it fails), and decremented when the operation completes. We can then control in the unregistration path further GPIO accesses, and also prevent new accesses occurring by setting the atomic_t to -1. This shouldn't require any additional locking in any path. It does mean that the unregistration path needs careful thought to ensure that when we set it to -1, we wait for it to be dropped by the appropriate amount.
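The counter idea could look roughly like the sketch below. It is written against C11 <stdatomic.h> rather than the kernel's atomic_t so that it stands alone, the op_gate names are invented for illustration, and it takes the simplest possible shape: unregistration only succeeds once nothing is in flight (0 -> -1), instead of setting -1 first and then waiting for the outstanding operations to drain.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical gate: >= 0 counts operations in flight, -1 means unregistered. */
struct op_gate {
	atomic_int users;
};

static void op_gate_init(struct op_gate *g)
{
	atomic_init(&g->users, 0);
}

/* Enter an operation: increment only while the counter is >= 0. */
static bool op_gate_enter(struct op_gate *g)
{
	int old = atomic_load(&g->users);

	while (old >= 0) {
		/* On failure 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&g->users, &old, old + 1))
			return true;
	}
	return false;	/* already unregistered */
}

/* Leave an operation that was successfully entered. */
static void op_gate_exit(struct op_gate *g)
{
	atomic_fetch_sub(&g->users, 1);
}

/* Try to unregister: succeeds only when nothing is in flight. */
static bool op_gate_try_close(struct op_gate *g)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&g->users, &expected, -1);
}
```

A caller would wrap each GPIO operation in op_gate_enter()/op_gate_exit() and have the unregistration path retry op_gate_try_close() until it succeeds. This simplified form lets new operations keep arriving while unregistration waits, which is exactly the part that needs the careful thought mentioned above.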
Re: using DMA-API on ARM
On Tue, Dec 09, 2014 at 11:19:40AM +0100, Arend van Spriel wrote:
> The issue did not trigger overnight so it seems setting bit 22 Shared Attribute _Override_ Enable solves the issue over here. Now the question is how to move forward with this. As I understood from Catalin this patch was not included as it was not considered responsibility of the linux kernel.

It is preferable for firmware to configure the L2 cache appropriately, which includes things like the prefetch offsets as well as feature bits like bit 22. I think what I'll do is queue up a patch which adds a warning if bit 22 is not set, suggesting that firmware is updated to set this bit.
Re: using DMA-API on ARM
On Mon, Dec 08, 2014 at 04:50:43PM +, Catalin Marinas wrote: On Mon, Dec 08, 2014 at 04:38:57PM +, Arnd Bergmann wrote: On Monday 08 December 2014 17:22:44 Arend van Spriel wrote: The log: first the ring allocation info is printed. Starting at 16.124847, ring 2, 3 and 4 are rings used for device to host. In this log the failure is on a read of ring 3. Ring 3 is 1024 entries of each 16 bytes. The next thing printed is the kernel page tables. Then some OpenWRT info and the logging of part of the connection setup. Then at 1780.130752 the logging of the failure starts. The sequence number is modulo 253 with ring size of 1024 matches an old entry (read 40, expected 52). Then the different pointers are printed followed by the kernel page table. The code does then a cache invalidate on the dma_handle and the next read the sequence number is correct. How do you invalidate the cache? A dma_handle is of type dma_addr_t and we don't define an operation for that, nor does it make sense on an allocation from dma_alloc_coherent(). What happens if you take out the invalidate? dma_sync_single_for_cpu(, DMA_FROM_DEVICE) which ends up invalidating the cache (or that is our suspicion). I'm not sure about that: static void arm_dma_sync_single_for_cpu(struct device *dev, dma_addr_t handle, size_t size, enum dma_data_direction dir) { unsigned int offset = handle & (PAGE_SIZE - 1); struct page *page = pfn_to_page(dma_to_pfn(dev, handle-offset)); __dma_page_dev_to_cpu(page, offset, size, dir); } Assuming a noncoherent linear (no IOMMU, no swiotlb, no dmabounce) mapping, dma_to_pfn will return the correct pfn here, but pfn_to_page will return a page pointer into the kernel linear mapping, Or a highmem page, both should be handled by dma_cache_maint_page().
A valid point, but one which is irrelevant to this thread, because we're talking about a platform with only 128MB, and a PAGE_OFFSET of 2GB (hence no highmem): Memory: 125936K/131072K available (2682K kernel code, 103K rwdata, 744K rodata, 164K init, 188K bss, 5136K reserved) Can we stay on-point to getting this problem solved, rather than drifting off topic please?
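As a side note on arm_dma_sync_single_for_cpu() quoted in this thread, the offset/page split it performs is just masking. A stand-alone sketch, assuming 4K pages (EXAMPLE_PAGE_SIZE and the helper names are illustrative, not kernel API):

```c
#include <stdint.h>

#define EXAMPLE_PAGE_SIZE 4096u	/* assumption: 4K pages */

/* Byte offset of a DMA handle within its page. */
static uint32_t handle_page_offset(uint64_t handle)
{
	return (uint32_t)(handle & (EXAMPLE_PAGE_SIZE - 1));
}

/* Page-aligned part of the handle, i.e. handle - offset. */
static uint64_t handle_page_base(uint64_t handle)
{
	return handle - handle_page_offset(handle);
}
```

The page-aligned part is what gets turned into a pfn and then a struct page; the offset is carried separately into the cache maintenance call.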
Re: using DMA-API on ARM
On Fri, Dec 05, 2014 at 10:22:22AM +0100, Arend van Spriel wrote: For our brcm80211 development we are working on getting brcmfmac driver up and running on a Broadcom ARM-based platform. The wireless device is a PCIe device, which is hooked up to the system behind a PCIe host bridge, and we transfer information between host and device using a descriptor ring buffer allocated using dma_alloc_coherent(). We mostly tested on x86 and seen no issue. However, on this ARM platform (single-core A9) we detect occasionally that the descriptor content is invalid. When this occurs we do a dma_sync_single_for_cpu() and this is retried a number of times if the problem persists. Actually, found out that someone made a mistake by using virt_to_dma(va) to get the dma_handle parameter. So probably we only provided a delay in the retry loop. After fixing that a single call to dma_sync_single_for_cpu() is sufficient. The DMA-API-HOWTO clearly states that: the hardware should guarantee that the device and the CPU can access the data in parallel and will see updates made by each other without any explicit software flushing. So it seems incorrect that we would need to do a dma_sync for this memory. That we do need it seems like this memory can end up in cache(?), or whatever happens, in some rare condition. Is there anyway to investigate this situation either through DMA-API or some low-level ARM specific functions. It's been a long while since I looked at the code, and the code for dma_alloc_coherent() has completely changed since then with the addition of CMA. I'm afraid that anything I would say about it would not be accurate without research into the possible paths through that code - it's no longer just a simple allocator. What you say is correct however: the memory should not have any cache lines associated with it, if it does, there's a bug somewhere. Also, the memory will be weakly ordered, which means that writes to such memory can be reordered. 
If ordering matters, barriers should be used. rmb() and wmb() can be used for this. (Added Marek for comment on dma_alloc_coherent(), Will for comment on barrier stuff.)
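A user-space analogue of the ordering problem, sketched with C11 fences rather than the kernel's wmb()/rmb() (the ring_entry layout and function names are invented for illustration, not taken from brcmfmac): the producer must make the payload visible before the sequence number, and the consumer must not read the payload before it has seen the sequence number.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

struct ring_entry {
	uint32_t payload;
	atomic_uint seq;	/* written last by the producer */
};

/* Producer side: payload first, then the sequence number. */
static void entry_publish(struct ring_entry *e, uint32_t payload,
			  unsigned int seq)
{
	e->payload = payload;
	/* Order the payload store before the seq store (cf. wmb()). */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&e->seq, seq, memory_order_relaxed);
}

/* Consumer side: check the sequence number, then read the payload. */
static bool entry_consume(struct ring_entry *e, unsigned int expected_seq,
			  uint32_t *out)
{
	if (atomic_load_explicit(&e->seq, memory_order_relaxed) != expected_seq)
		return false;	/* entry not written yet */
	/* Order the seq load before the payload load (cf. rmb()). */
	atomic_thread_fence(memory_order_acquire);
	*out = e->payload;
	return true;
}
```

Without the two fences, a weakly ordered machine is free to make the new sequence number visible before the payload, so the consumer could pair a fresh sequence number with stale data. Whether that is what happens on the platform in question is a separate matter, since the device side is not a CPU running this code.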
Re: using DMA-API on ARM
On Fri, Dec 05, 2014 at 10:52:02AM +0100, Arnd Bergmann wrote:
> I'm still puzzled why you'd need a single dma_sync_single_for_cpu() after dma_alloc_coherent though, you should not need any. Is it possible that the driver accidentally uses __raw_readl() instead of readl() in some places and you are just lacking an appropriate barrier?

Digging into the driver, it looks like individual DMA buffers are allocated (via brcmf_pcie_init_dmabuffer_for_device) and registered into a commonring layer.

Whenever the buffer is written to, space is first allocated via a call to brcmf_commonring_reserve_for_write() or brcmf_commonring_reserve_for_write_multiple(), data written to the buffer, followed by a call to brcmf_commonring_write_complete().

brcmf_commonring_write_complete() calls two methods at that point: cr_write_wptr() and cr_ring_bell(), which will be brcmf_pcie_ring_mb_write_wptr() and brcmf_pcie_ring_mb_ring_bell(). The first calls brcmf_pcie_write_tcm16(), which uses iowrite16(), which contains the appropriate barrier. The bell ringing functions also use ioread*/iowrite*(). So, on the write side, it looks fine from the barrier perspective.

On the read side, brcmf_commonring_get_read_ptr() is used before a read access to the ring - which calls the cr_update_wptr() method, which in turn uses an ioread16() call. After the CPU has read data from the ring, brcmf_commonring_read_complete() is used, which uses iowrite16(). So, I don't see a barrier problem on the read side.

However, I did trip over this:

static void *
brcmf_pcie_init_dmabuffer_for_device(struct brcmf_pciedev_info *devinfo,
				     u32 size, u32 tcm_dma_phys_addr,
				     dma_addr_t *dma_handle)
{
	void *ring;
	long long address;

	ring = dma_alloc_coherent(&devinfo->pdev->dev, size, dma_handle, GFP_KERNEL);
	if (!ring)
		return NULL;

	address = (long long)(long)*dma_handle;

Casting to (long) will truncate the DMA handle to 32-bits on a 32-bit architecture, even if it supports 64-bit DMA addresses.

There's a couple of other places where this same truncation occurs:

	address = (long long)(long)devinfo->shared.scratch_dmahandle;

and

	address = (long long)(long)devinfo->shared.ringupd_dmahandle;

In any case, wouldn't using a u64 type for address be better - isn't long long 128-bit on 64-bit architectures?
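To see the truncation without a 32-bit build, the sketch below uses int32_t as a stand-in for a 32-bit long (an assumption for illustration; the helper names are made up). Note that converting an out-of-range value to a signed type is implementation-defined in C; the result described below is what GCC and Clang produce.

```c
#include <stdint.h>

/* Mimics (long long)(long)handle where long is 32 bits wide. */
static int64_t address_via_long32(uint64_t handle)
{
	return (int64_t)(int32_t)handle;
}

/* Keeps all 64 bits of the handle. */
static uint64_t address_via_u64(uint64_t handle)
{
	return handle;
}
```

A handle of 0x180000000 comes out of the first helper as 0xffffffff80000000 (upper bits lost, then sign-extended), and unchanged out of the second. Handles below 2GB survive either way, which would explain why the cast goes unnoticed on most systems.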
Re: using DMA-API on ARM
I've been doing more digging into the current DMA code, and I'm dismayed to see that there's new bugs in it... commit 513510ddba9650fc7da456eefeb0ead7632324f6 Author: Laura Abbott lau...@codeaurora.org Date: Thu Oct 9 15:26:40 2014 -0700 common: dma-mapping: introduce common remapping functions This uses map_vm_area() to achieve the remapping of pages allocated inside dma_alloc_coherent(). dma_alloc_coherent() is documented in a rather round-about way in Documentation/DMA-API.txt: | Part Ia - Using large DMA-coherent buffers | -- | | void * | dma_alloc_coherent(struct device *dev, size_t size, | dma_addr_t *dma_handle, gfp_t flag) | | void | dma_free_coherent(struct device *dev, size_t size, void *cpu_addr, |dma_addr_t dma_handle) | | Free a region of consistent memory you previously allocated. dev, | size and dma_handle must all be the same as those passed into | dma_alloc_coherent(). cpu_addr must be the virtual address returned by | the dma_alloc_coherent(). | | Note that unlike their sibling allocation calls, these routines | may only be called with IRQs enabled. Note that very last paragraph. What this says is that it is explicitly permitted to call dma_alloc_coherent() with IRQs disabled. Now, the question is: is it safe to call map_vm_area() with IRQs disabled? Well, map_vm_area() calls pud_alloc(), pmd_alloc(), and pte_alloc_kernel(). These functions all call into the kernel memory allocator *without* GFP_ATOMIC - in other words, these allocations are permitted to sleep. Except, IRQs are off, so it's a bug to call these functions from dma_alloc_coherent(). Now, if we look at the previous code, it used ioremap_page_range(). This has the same problem: it needs to allocate page tables, and it can only do it via functions which may sleep. 
If we go back even further, we find that the use of ioremap_page_range() in dma_alloc_coherent() was introduced by: commit e9da6e9905e639b0f842a244bc770b48ad0523e9 Author: Marek Szyprowski m.szyprow...@samsung.com Date: Mon Jul 30 09:11:33 2012 +0200 ARM: dma-mapping: remove custom consistent dma region which is the commit which removed my pre-allocated page tables for the DMA re-mapping region - code which I explicitly had to specifically avoid this issue. Obviously, this isn't a big problem, because people haven't reported that they've hit any of the might_sleep() checks in the memory allocators, which I think is our only saving grace - but it's still wrong to the specified calling conditions of the DMA API. If the problem which you (Broadcom) are suffering from is down to the issue I suspect (that being having mappings with different cache attributes) then I'm not sure that there's anything we can realistically do about that. There's a number of issues which make it hard to see a way forward. One example is that if we allocate memory, we need to be able to change (or remove) the cacheable mappings associated with that memory. We'd need to touch the L1 page table, either to change the attributes of the section mapping, or to convert the section mapping to a L2 page table pointer. We need to change the attributes in a break-flush-make sequence to avoid TLB conflicts. However, those mappings may be shared between other CPUs in a SMP system. So, we would need to flush the TLBs on other CPUs before we could proceed to create replacement mappings. That means something like stop_machine() or sending (and waiting for completion) of an IPI to the other CPUs. That is totally impractical due to dma_alloc_coherent() being allowed to be called with IRQs off. 
I'll continue to think about it, but I don't see many possibilities to satisfy dma_alloc_coherent()'s documented requirements other than by pre-allocating a chunk of memory at boot time to be served out as DMA-able memory for these horrid cases. I don't see much point in keeping the map_vm_area() approach on ARM even if we did fallback - if we're re-establishing mappings for the surrounding pages in lowmem, we might as well insert appropriately attributed mappings for the DMA memory as well. On the face of it, it would be better to allocate one section at a time, but my unfortunate experience is that 3.x kernels are a /lot/ more trigger happy with the OOM killer, and the chances of being able to allocate 1MB of memory at a go after the system has been running for a while is near-on impossible. So I don't think that's a reality. Even if we did break up section mappings in this way, it would also mean that over time, we'd end up with much of lowmem mapped using 4K page table entries, which would place significant pressure on the MMU TLBs. So, we might just be far better off pre-allocating enough DMA coherent memory at boot time and be done with it. Those who want it dynamic can use CMA instead. -- FTTC broadband for 0.8mile line: currently at 9.5Mbps down 400kbps up according to
Re: using DMA-API on ARM
On Fri, Dec 05, 2014 at 05:38:39PM +, Catalin Marinas wrote:
> On Fri, Dec 05, 2014 at 11:11:14AM +, Russell King - ARM Linux wrote:
> > In any case, wouldn't using a u64 type for address be better - isn't
> > long long 128-bit on 64-bit architectures?
>
> No, it's still 64-bit. There is no 128-bit integer in the C standard.

Actually, that's a fallacy. The C99 standard (like previous versions) does not define exactly the number of bits in each type. It defines ranks of type, and says that lower ranks are a subrange of integers with higher ranks (for the same signed-ness.) See section 6.2.5. So, it merely states that:

	range(char) <= range(short) <= range(int) <= range(long) <= range(long long)

So, an implementation could have:

	char: 8  short: 16  int: 16  long: 32  long long: 64
	char: 8  short: 16  int: 32  long: 32  long long: 64
	char: 8  short: 16  int: 32  long: 64  long long: 64
	char: 8  short: 16  int: 64  long: 64  long long: 64

or even:

	char: 8  short: 16  int: 32  long: 64  long long: 128

and that would still be compliant with C99, since it continues to meet the criteria about the required data types specified in the standard.
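A quick way to see what a given implementation chose is to compute the widths from sizeof and CHAR_BIT. This is only a sketch (the BITS_OF macro and the function names are mine, and it ignores padding bits, which the standard permits but common compilers do not use). The compile-time checks encode the minimum widths implied by the limits.h magnitudes in C99 5.2.4.2.1.

```c
#include <limits.h>

/* Storage width in bits; equals the value width when there is no padding. */
#define BITS_OF(type) ((unsigned int)(sizeof(type) * CHAR_BIT))

/* Lower bounds only: the standard gives minimums, not exact widths. */
_Static_assert(sizeof(int) * CHAR_BIT >= 16, "int is at least 16 bits");
_Static_assert(sizeof(long) * CHAR_BIT >= 32, "long is at least 32 bits");
_Static_assert(sizeof(long long) * CHAR_BIT >= 64,
	       "long long is at least 64 bits");

static unsigned int bits_long(void)
{
	return BITS_OF(long);
}

static unsigned int bits_long_long(void)
{
	return BITS_OF(long long);
}

/* Checks the char <= short <= int <= long <= long long ordering. */
static int widths_are_ordered(void)
{
	return BITS_OF(char) <= BITS_OF(short) &&
	       BITS_OF(short) <= BITS_OF(int) &&
	       BITS_OF(int) <= BITS_OF(long) &&
	       BITS_OF(long) <= BITS_OF(long long);
}
```

On the usual 32-bit and 64-bit Linux ABIs bits_long_long() reports 64 in both cases, but nothing in the sketch, or in the standard, stops another implementation from reporting 128. That is the argument for a fixed-width type such as u64 when the width matters.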