Re: RTL8723BE performance regression
On Wed, 2018-05-09 at 13:33 -0700, João Paulo Rechi Vita wrote:
> On Tue, May 8, 2018 at 1:37 AM, Pkshih wrote:
> > On Mon, 2018-05-07 at 14:49 -0700, João Paulo Rechi Vita wrote:
> >> On Tue, May 1, 2018 at 10:58 PM, Pkshih wrote:
> >> (...)
Re: RTL8723BE performance regression
On Tue, May 8, 2018 at 1:37 AM, Pkshih wrote:
> On Mon, 2018-05-07 at 14:49 -0700, João Paulo Rechi Vita wrote:
>> On Tue, May 1, 2018 at 10:58 PM, Pkshih wrote:
>> (...)
Re: RTL8723BE performance regression
On Mon, 2018-05-07 at 14:49 -0700, João Paulo Rechi Vita wrote:
> On Tue, May 1, 2018 at 10:58 PM, Pkshih wrote:
> (...)
Re: RTL8723BE performance regression
On Tue, May 1, 2018 at 10:58 PM, Pkshih wrote:
> On Wed, 2018-05-02 at 05:44 +, Pkshih wrote:
>>
>> (...)
>>
>> We have new experimental results in commit af8a41cccf8f46 ("rtlwifi: cleanup
>> 8723be ant_sel definition"). You can use the above commit and do the same
>> experiments (with ant_sel=0, 1 and 2) in your side, and then share your
>> results.
>> Since performance is tied to signal strength, you can only share signal
>> strength.
>>
>
> Please pay attention to cold reboot once ant_sel is changed.
>

I've tested the commit mentioned above and it fixes the problem on top of v4.16 (the latest wireless-drivers-next is also fixed, as it already contains that commit). On v4.15, we also need the following commits before "af8a41cccf8f rtlwifi: cleanup 8723be ant_sel definition" to have a good performance again:

874e837d67d0 rtlwifi: fill FW version and subversion
a44709bba70f rtlwifi: btcoex: Add power_on_setting routine
40d9dd4f1c5d rtlwifi: btcoex: Remove global variables from btcoex

Surprisingly, it seems forcing ant_sel=1 is not needed anymore on these machines, as shown by the numbers below (ant_sel=0 means that actually no parameter was passed to the module). I have powered off the machine and done a cold boot for every test.

It seems something has changed in the antenna auto-selection code since v4.11, the latest point where I could confirm we definitely need to force ant_sel=1. I've been trying to understand what causes this difference, but haven't made progress on that so far, so any suggestions are appreciated (we are trying to decide if we can confidently drop the downstream DMI quirks for these specific machines).

w-d-n ant_sel=0: -14.00 dBm, 69.5 Mbps -> good
w-d-n ant_sel=1: -10.00 dBm, 41.1 Mbps -> good
w-d-n ant_sel=2: -44.00 dBm,  607 kbps -> bad

v4.16 ant_sel=0: -12.00 dBm, 63.0 Mbps -> good
v4.16 ant_sel=1: - 8.00 dBm, 69.0 Mbps -> good
v4.16 ant_sel=2: -50.00 dBm,  224 kbps -> bad

v4.15 ant_sel=0: - 8.00 dBm, 33.0 Mbps -> good
v4.15 ant_sel=1: -10.00 dBm, 38.1 Mbps -> good
v4.15 ant_sel=2: -48.00 dBm,  206 kbps -> bad

--
João Paulo Rechi Vita
http://about.me/jprvita
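For anyone wanting to compare runs like the ones above mechanically, here is a minimal Python sketch that parses the result lines and normalizes the throughput to Mbps. The numbers are copied from the message; the -30 dBm cut-off in classify() is only an assumption picked to sit between the "good" and "bad" signal ranges reported in this thread, not a value taken from the driver.

```python
import re

# Result lines copied from the message above ("w-d-n" = wireless-drivers-next).
RESULTS = """\
w-d-n ant_sel=0: -14.00 dBm, 69.5 Mbps
w-d-n ant_sel=1: -10.00 dBm, 41.1 Mbps
w-d-n ant_sel=2: -44.00 dBm, 607 kbps
v4.16 ant_sel=0: -12.00 dBm, 63.0 Mbps
v4.16 ant_sel=1: - 8.00 dBm, 69.0 Mbps
v4.16 ant_sel=2: -50.00 dBm, 224 kbps
v4.15 ant_sel=0: - 8.00 dBm, 33.0 Mbps
v4.15 ant_sel=1: -10.00 dBm, 38.1 Mbps
v4.15 ant_sel=2: -48.00 dBm, 206 kbps
"""

# The sign and the digits may be separated by a space ("- 8.00 dBm").
LINE_RE = re.compile(
    r"(?P<kernel>\S+) ant_sel=(?P<ant_sel>\d): "
    r"-\s*(?P<dbm>[\d.]+) dBm,\s+(?P<rate>[\d.]+) (?P<unit>[Mk]bps)"
)


def parse_results(text):
    """Return one dict per result line, with throughput converted to Mbps."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        rate = float(match["rate"])
        if match["unit"] == "kbps":
            rate /= 1000.0
        rows.append({
            "kernel": match["kernel"],
            "ant_sel": int(match["ant_sel"]),
            "signal_dbm": -float(match["dbm"]),
            "mbps": rate,
        })
    return rows


def classify(row, min_signal_dbm=-30.0):
    """Label a run by signal strength; the threshold is a hypothetical cut-off."""
    return "good" if row["signal_dbm"] >= min_signal_dbm else "bad"


if __name__ == "__main__":
    for row in parse_results(RESULTS):
        print(row["kernel"], row["ant_sel"], row["signal_dbm"],
              round(row["mbps"], 3), classify(row))
```

Classifying on signal strength alone follows the earlier remark in the thread that performance is tied to signal strength; a stricter check could also require a minimum throughput.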
Re: RTL8723BE performance regression
On Wed, 2018-05-02 at 05:44 +, Pkshih wrote:
>
> (...)
>
> We have new experimental results in commit af8a41cccf8f46 ("rtlwifi: cleanup
> 8723be ant_sel definition"). You can use the above commit and do the same
> experiments (with ant_sel=0, 1 and 2) in your side, and then share your
> results.
> Since performance is tied to signal strength, you can only share signal
> strength.
>

Please make sure to do a cold reboot each time ant_sel is changed.
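As a rough illustration of the test matrix requested above (ant_sel=0, 1 and 2, with a cold boot after each change), the following Python sketch builds the module option line for each value. The "options <module> <param>=<value>" syntax is the standard modprobe.d form; the helper names and the idea of keeping the plan as a list of dicts are hypothetical, not something used in this thread.

```python
VALID_ANT_SEL = (0, 1, 2)


def options_line(ant_sel, module="rtl8723be"):
    """Return a modprobe.d-style options line for the given ant_sel value."""
    if ant_sel not in VALID_ANT_SEL:
        raise ValueError(f"ant_sel must be one of {VALID_ANT_SEL}, got {ant_sel!r}")
    return f"options {module} ant_sel={ant_sel}"


def test_plan(values=VALID_ANT_SEL):
    """One entry per ant_sel value; every change is followed by a cold boot."""
    return [
        {"ant_sel": value, "modprobe_conf": options_line(value), "cold_boot": True}
        for value in values
    ]


if __name__ == "__main__":
    for step in test_plan():
        print(step["modprobe_conf"], "-> power off, cold boot, then measure signal")
```

The line would typically go into a file under /etc/modprobe.d/ (the exact file name is up to the tester), or the same parameter can be given on the kernel command line as rtl8723be.ant_sel=N.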
RE: RTL8723BE performance regression
> -Original Message-
> From: João Paulo Rechi Vita [mailto:jprv...@gmail.com]
> Sent: Wednesday, May 02, 2018 6:41 AM
> To: Larry Finger
> Cc: Steve deRosier; 莊彥宣; Pkshih; Birming Chiu; Shaofu; Steven Ting;
> Chaoming_Li; Kalle Valo; linux-wireless; Network Development; LKML;
> Daniel Drake; João Paulo Rechi Vita; li...@endlessm.com
> Subject: Re: RTL8723BE performance regression
>
> (...)
>
> I have confirmed that the performance regression is indeed tied to
> signal strength: on the good cases signal was between -16 and -8 dBm,
> whereas in bad cases signal was always between -50 to - 40 dBm. I've
> also switched to testing bandwidth in controlled LAN environment using
> iperf3, as suggested by Steve deRosier, with the DUT being the only
> machine connected to the 2.4 GHz radio and the machine running the
> iperf3 server connected via ethernet.
>

We have new experimental results in commit af8a41cccf8f46 ("rtlwifi: cleanup 8723be ant_sel definition"). You can apply that commit and repeat the same experiments (with ant_sel=0, 1 and 2) on your side, and then share your results. Since performance is tied to signal strength, it is enough to share only the signal strength.

Regards
PK
Re: RTL8723BE performance regression
On Tue, Apr 3, 2018 at 7:51 PM, Larry Finger wrote:
> On 04/03/2018 09:37 PM, João Paulo Rechi Vita wrote:
>>
>> (...)
>>
>> What is the recommended way to monitor the signal strength?
>
> The btcoex code is developed for multiple platforms by a different group
> than the Linux driver. I think they made a change that caused ant_sel to
> switch from 1 to 2. At least numerous comments at
> github.com/lwfinger/rtlwifi_new claimed they needed to make that change.
>
> My recommended method is to verify the wifi device name with "iw dev". Then
> using that device
>
> sudo iw dev scan | egrep "SSID|signal"
>

I have confirmed that the performance regression is indeed tied to signal strength: in the good cases signal was between -16 and -8 dBm, whereas in the bad cases signal was always between -50 and -40 dBm. I've also switched to testing bandwidth in a controlled LAN environment using iperf3, as suggested by Steve deRosier, with the DUT being the only machine connected to the 2.4 GHz radio and the machine running the iperf3 server connected via ethernet.

Using those two tests (iperf3 and signal strength) I've dug deeper into the culprit I had found previously, commit 7937f02d1953, reverting it partially and testing the resulting driver, to isolate which change was causing the problem. Besides "hooking up external functions for newer ICs", as described by the commit message, that commit also added code to decide whether ex_btc8723b1ant_*() or ex_btc8723b2ant_*() functions should be used in halbtcoutsrc.c, depending on the value of board_info.btdm_ant_num, whereas before that commit ex_btc8723b2ant_*() were always used. Reverting to always using ex_btc8723b2ant_*() functions fixes the regression on v4.15.

I've also tried to bisect between v4.15..v4.16 to find what else was causing problems there, as the changes mentioned above on top of v4.16 did not solve the problem. The bisect pointed to "874e837d67d0 rtlwifi: fill FW version and subversion", but reverting only that commit plus the changes mentioned above also didn't yield good results.

That's when I decided to get a bit creative: starting on v4.16 I first applied the changes to have ex_btc8723b2ant_*() always being used, as mentioned above, then reverted every commit between v4.15..v4.16 affecting drivers/net/wireless/realtek/rtlwifi/, and verified the resulting kernel had a good performance. Then I started trimming down the history and testing along the way, to reduce to the minimum set of changes that had to be reverted in order to restore the good performance. In addition to the ex_btc8723b2ant_*() changes and reverting "874e837d67d0 rtlwifi: fill FW version and subversion", I've also had to remove the following lines from drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c, which were introduced by "40d9dd4f1c5d rtlwifi: btcoex: Remove global variables from btcoex", in order to restore the upload performance and signal strength.

    /* set default antenna position to main port */
    btcoexist->board_info.btdm_ant_pos = BTC_ANTENNA_AT_MAIN_PORT;

These are the results I've got on v4.16 (similarly on wireless-drivers-next-for-davem-2018-03-29 or v4.15):

$ sudo iw dev wlp2s0 scan | grep -B3 JJ | grep signal
    signal: -42.00 dBm

$ iperf3 -c 192.168.1.254
Connecting to host 192.168.1.254, port 5201
[  4] local 192.168.1.253 port 39678 connected to 192.168.1.254 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec   735 KBytes  6.02 Mbits/sec    1   1.41 KBytes
[  4]   1.00-2.00   sec   274 KBytes  2.25 Mbits/sec    1   1.41 KBytes
[  4]   2.00-3.00   sec  0.00 Bytes   0.00 bits/sec     0   1.41 KBytes
[  4]   3.00-4.00   sec  0.00 Bytes   0.00 bits/sec     0   1.41 KBytes
[  4]   4.00-5.00   sec  0.00 Bytes   0.00 bits/sec     1   28.3 KBytes
[  4]   5.00-6.00   sec   423 KBytes  3.47 Mbits/sec    3   41.0 KBytes
[  4]   6.00-7.00   sec   840 KBytes  6.88 Mbits/sec    0   58.0 KBytes
[  4]   7.00-8.00   sec   830 KBytes  6.79 Mbits/sec    1   1.41 KBytes
[  4]   8.00-9.00   sec  0.00 Bytes   0.00 bits/sec     0   1.41
Re: RTL8723BE performance regression
On Tue, Apr 3, 2018 at 6:51 PM, João Paulo Rechi Vita wrote:
>
> These are the results (testing with speedtest.net) I got at some key points:
>
> Version       Commit   Ping  Down   Up
>
> v4.11         a351e9b  122    5.44  5.99
> v4.11         a351e9b  131   17.02  5.89
>
> v4.13         569dbb8  174   14.08  0.00
> v4.13         569dbb8  261    8.41  0.00
>
> v4.15+revert  d8a5b80  192    3.86  1.41
> v4.15+revert  d8a5b80  189   18.69  1.39
>

I recommend doing throughput testing in a closed system using iperf. speedtest.net is potentially useful for testing your ISP's bandwidth at some particular point in time, but little else, as it exposes you to too many variables. I wouldn't take those numbers to mean much, and the inconclusive results you're getting could be explained by external network loading and have little to do with your bisect effort. I can get that spread in numbers from speedtest.net without making any changes other than the time of day I do the test.

Here's how to do it. Install iperf2 (you could use iperf3, personal choice) on two machines, one being your device under test (DUT). Set up a network configuration that looks similar to this:

    server <==hardwire==> AP <--wireless link--> DUT

Be sure your hardwired link has more bandwidth than your wireless link is capable of, or set it up so that the server is the AP. What you're looking for here is environmental consistency, not maximum throughput numbers.

On the computer hardwired to the network, start the server (we'll assume it has an IP of 192.168.33.18):

    iperf -s

On your DUT:

    iperf -c 192.168.33.18

That's the most basic setup; check the man page for more options. You will get the best results if you can exclude other computers from your test network and other wireless devices from your airspace.

- Steve

--
Steve deRosier
Cal-Sierra Consulting LLC
https://www.cal-sierra.com/
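To make repeated runs of the closed-system test above easier to compare, a small Python sketch along these lines could summarize each run. Note the swap: it assumes iperf3 with its -J (JSON output) option rather than the iperf2 commands shown above, and it assumes a TCP test, where the report carries end.sum_sent, end.sum_received and a per-interval sum. The function name and the "stalled seconds" metric are illustrative choices, not part of iperf3.

```python
import json


def summarize_iperf3(json_text):
    """Summarize a TCP `iperf3 -c <server> -J` report (field names assumed)."""
    report = json.loads(json_text)
    end = report["end"]
    # Per-interval throughput in Mbit/s, one value per reporting interval.
    per_interval = [
        interval["sum"]["bits_per_second"] / 1e6
        for interval in report.get("intervals", [])
    ]
    return {
        "sent_mbps": end["sum_sent"]["bits_per_second"] / 1e6,
        "received_mbps": end["sum_received"]["bits_per_second"] / 1e6,
        # Intervals with no traffic at all point to a stalled link.
        "stalled_intervals": sum(1 for rate in per_interval if rate == 0.0),
    }


if __name__ == "__main__":
    # Tiny hand-written sample in the assumed shape, not real measurement data.
    sample = json.dumps({
        "intervals": [
            {"sum": {"bits_per_second": 6.0e6}},
            {"sum": {"bits_per_second": 0.0}},
        ],
        "end": {
            "sum_sent": {"bits_per_second": 3.0e6},
            "sum_received": {"bits_per_second": 2.5e6},
        },
    })
    print(summarize_iperf3(sample))
```

In practice the JSON text would come from running something like `iperf3 -c 192.168.33.18 -J` on the DUT and capturing its standard output.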
Re: RTL8723BE performance regression
On 04/03/2018 09:37 PM, João Paulo Rechi Vita wrote:
> On Tue, Apr 3, 2018 at 7:28 PM, Larry Finger wrote:
>
> (...)
>
>> As the antenna selection code changes affected your first bisection, do you
>> have one of those HP laptops with only one antenna and the incorrect coding
>> in the FUSE?
>
> Yes, that is why I've been passing ant_sel=1 during my tests -- this
> was needed to achieve a good performance in the past, before this
> regression. I've also opened the laptop chassis and confirmed the
> antenna cable is plugged to the connector labeled with "1" on the
> card.
>
>> If so, please make sure that you still have the same signal
>> strength for good and bad cases. I have tried to keep the driver and the
>> btcoex code in sync, but there may be some combinations of antenna
>> configuration and FUSE contents that cause the code to fail.
>
> What is the recommended way to monitor the signal strength?

The btcoex code is developed for multiple platforms by a different group than the Linux driver. I think they made a change that caused ant_sel to switch from 1 to 2. At least numerous comments at github.com/lwfinger/rtlwifi_new claimed they needed to make that change.

My recommended method is to verify the wifi device name with "iw dev". Then, using that device:

    sudo iw dev scan | egrep "SSID|signal"

Larry
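The same SSID/signal filtering can be done in a script when the readings need to be logged across reboots. The Python sketch below parses text captured from an `iw dev <ifname> scan` run; it assumes the usual layout where each access point starts with a "BSS <mac>" line and its "signal:" line comes before its "SSID:" line. The function name and the sample text are made up for illustration.

```python
import re

BSS_RE = re.compile(r"BSS [0-9a-fA-F:]{17}")
SIGNAL_RE = re.compile(r"signal:\s*(-?[\d.]+) dBm")


def parse_scan(text):
    """Map each SSID to its strongest signal (dBm) found in iw scan output."""
    signals = {}
    current = None  # signal of the BSS block currently being read
    for raw in text.splitlines():
        line = raw.strip()
        if BSS_RE.match(line):
            current = None  # a new access point starts here
            continue
        match = SIGNAL_RE.match(line)
        if match:
            current = float(match.group(1))
            continue
        if line.startswith("SSID:") and current is not None:
            ssid = line[len("SSID:"):].strip()
            # Several access points may share an SSID; keep the strongest.
            signals[ssid] = max(current, signals.get(ssid, current))
    return signals


if __name__ == "__main__":
    # Hand-written sample in the assumed layout, not captured from a device.
    sample = (
        "BSS 00:11:22:33:44:55(on wlp2s0)\n"
        "\tfreq: 2437\n"
        "\tsignal: -42.00 dBm\n"
        "\tSSID: JJ\n"
    )
    print(parse_scan(sample))
```

Recording one such reading per cold boot, next to the ant_sel value in use, gives the signal-strength comparison asked for in this thread.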
Re: RTL8723BE performance regression
On Tue, Apr 3, 2018 at 7:28 PM, Larry Finger wrote:

(...)

> As the antenna selection code changes affected your first bisection, do you
> have one of those HP laptops with only one antenna and the incorrect coding
> in the FUSE?

Yes, that is why I've been passing ant_sel=1 during my tests -- this was needed to achieve a good performance in the past, before this regression. I've also opened the laptop chassis and confirmed the antenna cable is plugged to the connector labeled with "1" on the card.

> If so, please make sure that you still have the same signal
> strength for good and bad cases. I have tried to keep the driver and the
> btcoex code in sync, but there may be some combinations of antenna
> configuration and FUSE contents that cause the code to fail.
>

What is the recommended way to monitor the signal strength?

Thanks for such a quick reply,

--
João Paulo Rechi Vita
http://about.me/jprvita
Re: RTL8723BE performance regression
On 04/03/2018 08:51 PM, João Paulo Rechi Vita wrote:
> Hello,
>
> I've been trying to track a performance regression on the RTL8723BE
> WiFi adapter, which mainly affects the upload bandwidth (although we
> can see a decreased download performance as well, the effect on upload
> is more drastic). This was first reported by users after upgrading
> from our 4.11-based kernel to our 4.13-based kernel, but also
> confirmed to affect our development branch (4.15-based kernel) and
> wireless-drivers-next at the
> wireless-drivers-next-for-davem-2018-03-29 tag. This is happening on
> an HP laptop that needs rtl8723be.ant_sel=1 (and all the following
> tests have been made with that param).
>
> My first bisect attempt pointed me to the following commit:
>
> bcd37f4a0831 rtlwifi: btcoex: 23b 2ant: let bt transmit when hw
> initialisation done
>
> Which I later found to be already fixed by
>
> a33fcba6ec01 rtlwifi: btcoexist: Fix breakage of ant_sel for rtl8723be.
>
> That fix is already included in v4.15 though (and our dev branch as
> well), so I did a second bisect, now cherry-picking a33fcba6ec01 at
> every step, and it pointed me to the following commit:
>
> 7937f02d1953 rtlwifi: btcoex: hook external functions for newer chips
>
> Reverting that commit on top of our development branch fixes the
> problem, but on top of v4.15 I get mixed results: a few times getting
> a good upload performance (~5-6Mbps) but most of the time just getting
> ~1-1.5Mbps (which is still better than the 0.0 then test failure I've
> gotten on most bad points of the bisect).
>
> Bisecting the downstream patches we carry on top of v4.15 (we base our
> kernel on Ubuntu's, so there are quite a few downstream changes) did
> not bring any clarity, as at all bisect points (plus reverting
> 7937f02d1953) the performance was good, so probably there was some
> other difference in the resulting kernels from my initial revert of
> that patch on top of v4.15 and each step during the bisect.
>
> I've experimented a bit with fwlps=0, but it did not bring any
> conclusive results either. I'll try to look at other things that may
> have changed (configuration perhaps?), but I don't have a clear plan
> yet.
>
> Have you seen anything similar, or have any other ideas or suggestions
> to track this problem? Even without crystal clear results, it looks
> like 7937f02d1953 is having a negative impact on the RTL8723BE
> performance, so perhaps it is worth reverting it and reworking it at a
> later point?
>
> These are the results (testing with speedtest.net) I got at some key points:
>
> Version       Commit   Ping  Down   Up
>
> v4.11         a351e9b  122    5.44  5.99
> v4.11         a351e9b  131   17.02  5.89
>
> v4.13         569dbb8  174   14.08  0.00
> v4.13         569dbb8  261    8.41  0.00
>
> v4.15+revert  d8a5b80  192    3.86  1.41
> v4.15+revert  d8a5b80  189   18.69  1.39

As the antenna selection code changes affected your first bisection, do you have one of those HP laptops with only one antenna and the incorrect coding in the FUSE? If so, please make sure that you still have the same signal strength for good and bad cases. I have tried to keep the driver and the btcoex code in sync, but there may be some combinations of antenna configuration and FUSE contents that cause the code to fail.

Larry