Re: RTL8723BE performance regression

2018-05-13 Thread Pkshih
On Wed, 2018-05-09 at 13:33 -0700, João Paulo Rechi Vita wrote:
> On Tue, May 8, 2018 at 1:37 AM, Pkshih <pks...@realtek.com> wrote:
> > On Mon, 2018-05-07 at 14:49 -0700, João Paulo Rechi Vita wrote:
> >> On Tue, May 1, 2018 at 10:58 PM, Pkshih <pks...@realtek.com> wrote:
> >> > On Wed, 2018-05-02 at 05:44 +, Pkshih wrote:
> >> >>
> >> >> > -Original Message-
> >> >> > From: João Paulo Rechi Vita [mailto:jprv...@gmail.com]
> >> >> > Sent: Wednesday, May 02, 2018 6:41 AM
> >> >> > To: Larry Finger
> >> >> > Cc: Steve deRosier; 莊彥宣; Pkshih; Birming Chiu; Shaofu; Steven Ting; 
> >> >> > Chaoming_Li; Kalle Valo;
> >> >> > linux-wireless; Network Development; LKML; Daniel Drake; João Paulo 
> >> >> > Rechi Vita; linux@endlessm.com
> >> >> > Subject: Re: RTL8723BE performance regression
> >> >> >
> >> >> > On Tue, Apr 3, 2018 at 7:51 PM, Larry Finger 
> >> >> > <larry.fin...@lwfinger.net> wrote:
> >> >> > > On 04/03/2018 09:37 PM, João Paulo Rechi Vita wrote:
> >> >> > >>
> >> >> > >> On Tue, Apr 3, 2018 at 7:28 PM, Larry Finger 
> >> >> > >> <larry.fin...@lwfinger.net>
> >> >> > >> wrote:
> >> >> > >>
> >> >> > >> (...)
> >> >> > >>
> >> >> > >>> As the antenna selection code changes affected your first 
> >> >> > >>> bisection, do
> >> >> > >>> you
> >> >> > >>> have one of those HP laptops with only one antenna and the 
> >> >> > >>> incorrect
> >> >> > >>> coding
> >> >> > >>> in the FUSE?
> >> >> > >>
> >> >> > >>
> >> >> > >> Yes, that is why I've been passing ant_sel=1 during my tests -- 
> >> >> > >> this
> >> >> > >> was needed to achieve a good performance in the past, before this
> >> >> > >> regression. I've also opened the laptop chassis and confirmed the
> >> >> > >> antenna cable is plugged to the connector labeled with "1" on the
> >> >> > >> card.
> >> >> > >>
> >> >> > >>> If so, please make sure that you still have the same signal
> >> >> > >>> strength for good and bad cases. I have tried to keep the driver 
> >> >> > >>> and the
> >> >> > >>> btcoex code in sync, but there may be some combinations of antenna
> >> >> > >>> configuration and FUSE contents that cause the code to fail.
> >> >> > >>>
> >> >> > >>
> >> >> > >> What is the recommended way to monitor the signal strength?
> >> >> > >
> >> >> > >
> >> >> > > The btcoex code is developed for multiple platforms by a different 
> >> >> > > group
> >> >> > > than the Linux driver. I think they made a change that caused 
> >> >> > > ant_sel to
> >> >> > > switch from 1 to 2. At least numerous comments at
> >> >> > > github.com/lwfinger/rtlwifi_new claimed they needed to make that 
> >> >> > > change.
> >> >> > >
> >> >> > > My recommended method is to verify the wifi device name with "iw 
> >> >> > > dev". Then
> >> >> > > using that device
> >> >> > >
> >> >> > > sudo iw dev  scan | egrep "SSID|signal"
> >> >> > >
> >> >> >
> >> >> > I have confirmed that the performance regression is indeed tied to
> >> >> > signal strength: on the good cases signal was between -16 and -8 dBm,
> >> >> > whereas in bad cases signal was always between -50 and -40 dBm. I've
> >> >> > also switched to testing bandwidth in controlled LAN environment using
> >> >> > iperf3, as suggested by Steve deRosier, with the DUT being the only
> >> >> > machine connected to the 2.4 GHz radio and the machine running the
> >> >> > i

Re: RTL8723BE performance regression

2018-05-09 Thread João Paulo Rechi Vita
On Tue, May 8, 2018 at 1:37 AM, Pkshih <pks...@realtek.com> wrote:
> On Mon, 2018-05-07 at 14:49 -0700, João Paulo Rechi Vita wrote:
>> On Tue, May 1, 2018 at 10:58 PM, Pkshih <pks...@realtek.com> wrote:
>> > On Wed, 2018-05-02 at 05:44 +, Pkshih wrote:
>> >>
>> >> > -Original Message-
>> >> > From: João Paulo Rechi Vita [mailto:jprv...@gmail.com]
>> >> > Sent: Wednesday, May 02, 2018 6:41 AM
>> >> > To: Larry Finger
>> >> > Cc: Steve deRosier; 莊彥宣; Pkshih; Birming Chiu; Shaofu; Steven Ting; 
>> >> > Chaoming_Li; Kalle Valo;
>> >> > linux-wireless; Network Development; LKML; Daniel Drake; João Paulo 
>> >> > Rechi Vita; linux@endlessm.com
>> >> > Subject: Re: RTL8723BE performance regression
>> >> >
>> >> > On Tue, Apr 3, 2018 at 7:51 PM, Larry Finger 
>> >> > <larry.fin...@lwfinger.net> wrote:
>> >> > > On 04/03/2018 09:37 PM, João Paulo Rechi Vita wrote:
>> >> > >>
>> >> > >> On Tue, Apr 3, 2018 at 7:28 PM, Larry Finger 
>> >> > >> <larry.fin...@lwfinger.net>
>> >> > >> wrote:
>> >> > >>
>> >> > >> (...)
>> >> > >>
>> >> > >>> As the antenna selection code changes affected your first 
>> >> > >>> bisection, do
>> >> > >>> you
>> >> > >>> have one of those HP laptops with only one antenna and the incorrect
>> >> > >>> coding
>> >> > >>> in the FUSE?
>> >> > >>
>> >> > >>
>> >> > >> Yes, that is why I've been passing ant_sel=1 during my tests -- this
>> >> > >> was needed to achieve a good performance in the past, before this
>> >> > >> regression. I've also opened the laptop chassis and confirmed the
>> >> > >> antenna cable is plugged to the connector labeled with "1" on the
>> >> > >> card.
>> >> > >>
>> >> > >>> If so, please make sure that you still have the same signal
>> >> > >>> strength for good and bad cases. I have tried to keep the driver 
>> >> > >>> and the
>> >> > >>> btcoex code in sync, but there may be some combinations of antenna
>> >> > >>> configuration and FUSE contents that cause the code to fail.
>> >> > >>>
>> >> > >>
>> >> > >> What is the recommended way to monitor the signal strength?
>> >> > >
>> >> > >
>> >> > > The btcoex code is developed for multiple platforms by a different 
>> >> > > group
>> >> > > than the Linux driver. I think they made a change that caused ant_sel 
>> >> > > to
>> >> > > switch from 1 to 2. At least numerous comments at
>> >> > > github.com/lwfinger/rtlwifi_new claimed they needed to make that 
>> >> > > change.
>> >> > >
>> >> > > My recommended method is to verify the wifi device name with "iw 
>> >> > > dev". Then
>> >> > > using that device
>> >> > >
>> >> > > sudo iw dev  scan | egrep "SSID|signal"
>> >> > >
>> >> >
>> >> > I have confirmed that the performance regression is indeed tied to
>> >> > signal strength: on the good cases signal was between -16 and -8 dBm,
>> >> > whereas in bad cases signal was always between -50 and -40 dBm. I've
>> >> > also switched to testing bandwidth in controlled LAN environment using
>> >> > iperf3, as suggested by Steve deRosier, with the DUT being the only
>> >> > machine connected to the 2.4 GHz radio and the machine running the
>> >> > iperf3 server connected via ethernet.
>> >> >
>> >>
>> >> We have new experimental results in commit af8a41cccf8f46 ("rtlwifi: 
>> >> cleanup
>> >> 8723be ant_sel definition"). You can use the above commit and do the same
>> >> experiments (with ant_sel=0, 1 and 2) in your side, and then share your 
>> >> results.
>> >> Since performance is tied to signal strength, you can only share signal 
>> >> streng

Re: RTL8723BE performance regression

2018-05-08 Thread Pkshih
On Mon, 2018-05-07 at 14:49 -0700, João Paulo Rechi Vita wrote:
> On Tue, May 1, 2018 at 10:58 PM, Pkshih <pks...@realtek.com> wrote:
> > On Wed, 2018-05-02 at 05:44 +, Pkshih wrote:
> >>
> >> > -Original Message-
> >> > From: João Paulo Rechi Vita [mailto:jprv...@gmail.com]
> >> > Sent: Wednesday, May 02, 2018 6:41 AM
> >> > To: Larry Finger
> >> > Cc: Steve deRosier; 莊彥宣; Pkshih; Birming Chiu; Shaofu; Steven Ting; 
> >> > Chaoming_Li; Kalle Valo;
> >> > linux-wireless; Network Development; LKML; Daniel Drake; João Paulo 
> >> > Rechi Vita; linux@endlessm.com
> >> > Subject: Re: RTL8723BE performance regression
> >> >
> >> > On Tue, Apr 3, 2018 at 7:51 PM, Larry Finger <larry.fin...@lwfinger.net> 
> >> > wrote:
> >> > > On 04/03/2018 09:37 PM, João Paulo Rechi Vita wrote:
> >> > >>
> >> > >> On Tue, Apr 3, 2018 at 7:28 PM, Larry Finger 
> >> > >> <larry.fin...@lwfinger.net>
> >> > >> wrote:
> >> > >>
> >> > >> (...)
> >> > >>
> >> > >>> As the antenna selection code changes affected your first bisection, 
> >> > >>> do
> >> > >>> you
> >> > >>> have one of those HP laptops with only one antenna and the incorrect
> >> > >>> coding
> >> > >>> in the FUSE?
> >> > >>
> >> > >>
> >> > >> Yes, that is why I've been passing ant_sel=1 during my tests -- this
> >> > >> was needed to achieve a good performance in the past, before this
> >> > >> regression. I've also opened the laptop chassis and confirmed the
> >> > >> antenna cable is plugged to the connector labeled with "1" on the
> >> > >> card.
> >> > >>
> >> > >>> If so, please make sure that you still have the same signal
> >> > >>> strength for good and bad cases. I have tried to keep the driver and 
> >> > >>> the
> >> > >>> btcoex code in sync, but there may be some combinations of antenna
> >> > >>> configuration and FUSE contents that cause the code to fail.
> >> > >>>
> >> > >>
> >> > >> What is the recommended way to monitor the signal strength?
> >> > >
> >> > >
> >> > > The btcoex code is developed for multiple platforms by a different 
> >> > > group
> >> > > than the Linux driver. I think they made a change that caused ant_sel 
> >> > > to
> >> > > switch from 1 to 2. At least numerous comments at
> >> > > github.com/lwfinger/rtlwifi_new claimed they needed to make that 
> >> > > change.
> >> > >
> >> > > My recommended method is to verify the wifi device name with "iw 
> >> > > dev". Then
> >> > > using that device
> >> > >
> >> > > sudo iw dev  scan | egrep "SSID|signal"
> >> > >
> >> >
> >> > I have confirmed that the performance regression is indeed tied to
> >> > signal strength: on the good cases signal was between -16 and -8 dBm,
> >> > whereas in bad cases signal was always between -50 and -40 dBm. I've
> >> > also switched to testing bandwidth in controlled LAN environment using
> >> > iperf3, as suggested by Steve deRosier, with the DUT being the only
> >> > machine connected to the 2.4 GHz radio and the machine running the
> >> > iperf3 server connected via ethernet.
> >> >
> >>
> >> We have new experimental results in commit af8a41cccf8f46 ("rtlwifi: 
> >> cleanup
> >> 8723be ant_sel definition"). You can use the above commit and do the same
> >> experiments (with ant_sel=0, 1 and 2) in your side, and then share your 
> >> results.
> >> Since performance is tied to signal strength, you can only share signal 
> >> strength.
> >>
> >
> > Please pay attention to cold reboot once ant_sel is changed.
> >
> 
> I've tested the commit mentioned above and it fixes the problem on top
> of v4.16 (the latest wireless-drivers-next is also fixed, as it already
> contains that commit). On v4.15, we also need the
> following commits before "af8a41cccf8f rtlwifi: c

Re: RTL8723BE performance regression

2018-05-07 Thread João Paulo Rechi Vita
On Tue, May 1, 2018 at 10:58 PM, Pkshih <pks...@realtek.com> wrote:
> On Wed, 2018-05-02 at 05:44 +, Pkshih wrote:
>>
>> > -Original Message-
>> > From: João Paulo Rechi Vita [mailto:jprv...@gmail.com]
>> > Sent: Wednesday, May 02, 2018 6:41 AM
>> > To: Larry Finger
>> > Cc: Steve deRosier; 莊彥宣; Pkshih; Birming Chiu; Shaofu; Steven Ting; 
>> > Chaoming_Li; Kalle Valo;
>> > linux-wireless; Network Development; LKML; Daniel Drake; João Paulo Rechi 
>> > Vita; linux@endlessm.com
>> > Subject: Re: RTL8723BE performance regression
>> >
>> > On Tue, Apr 3, 2018 at 7:51 PM, Larry Finger <larry.fin...@lwfinger.net> 
>> > wrote:
>> > > On 04/03/2018 09:37 PM, João Paulo Rechi Vita wrote:
>> > >>
>> > >> On Tue, Apr 3, 2018 at 7:28 PM, Larry Finger <larry.fin...@lwfinger.net>
>> > >> wrote:
>> > >>
>> > >> (...)
>> > >>
>> > >>> As the antenna selection code changes affected your first bisection, do
>> > >>> you
>> > >>> have one of those HP laptops with only one antenna and the incorrect
>> > >>> coding
>> > >>> in the FUSE?
>> > >>
>> > >>
>> > >> Yes, that is why I've been passing ant_sel=1 during my tests -- this
>> > >> was needed to achieve a good performance in the past, before this
>> > >> regression. I've also opened the laptop chassis and confirmed the
>> > >> antenna cable is plugged to the connector labeled with "1" on the
>> > >> card.
>> > >>
>> > >>> If so, please make sure that you still have the same signal
>> > >>> strength for good and bad cases. I have tried to keep the driver and 
>> > >>> the
>> > >>> btcoex code in sync, but there may be some combinations of antenna
>> > >>> configuration and FUSE contents that cause the code to fail.
>> > >>>
>> > >>
>> > >> What is the recommended way to monitor the signal strength?
>> > >
>> > >
>> > > The btcoex code is developed for multiple platforms by a different group
>> > > than the Linux driver. I think they made a change that caused ant_sel to
>> > > switch from 1 to 2. At least numerous comments at
>> > > github.com/lwfinger/rtlwifi_new claimed they needed to make that change.
>> > >
>> > > My recommended method is to verify the wifi device name with "iw dev". 
>> > > Then
>> > > using that device
>> > >
>> > > sudo iw dev  scan | egrep "SSID|signal"
>> > >
>> >
>> > I have confirmed that the performance regression is indeed tied to
>> > signal strength: on the good cases signal was between -16 and -8 dBm,
>> > whereas in bad cases signal was always between -50 and -40 dBm. I've
>> > also switched to testing bandwidth in controlled LAN environment using
>> > iperf3, as suggested by Steve deRosier, with the DUT being the only
>> > machine connected to the 2.4 GHz radio and the machine running the
>> > iperf3 server connected via ethernet.
>> >
>>
>> We have new experimental results in commit af8a41cccf8f46 ("rtlwifi: cleanup
>> 8723be ant_sel definition"). You can use the above commit and do the same
>> experiments (with ant_sel=0, 1 and 2) in your side, and then share your 
>> results.
>> Since performance is tied to signal strength, you can only share signal 
>> strength.
>>
>
> Please pay attention to cold reboot once ant_sel is changed.
>

I've tested the commit mentioned above and it fixes the problem on top
of v4.16 (the latest wireless-drivers-next is also fixed, as it already
contains that commit). On v4.15, we also need the following commits
before "af8a41cccf8f rtlwifi: cleanup 8723be ant_sel definition" to get
good performance again:

  874e837d67d0 rtlwifi: fill FW version and subversion
  a44709bba70f rtlwifi: btcoex: Add power_on_setting routine
  40d9dd4f1c5d rtlwifi: btcoex: Remove global variables from btcoex

Surprisingly, it seems forcing ant_sel=1 is not needed anymore on
these machines, as shown by the numbers below (ant_sel=0 means that
no parameter was actually passed to the module). I powered off the
machine and did a cold boot for every test. It seems something has
changed in the antenna auto-selection code since v4.11, the latest
point where I could confirm we definitely needed to force ant_sel=1.
I've been trying to understand what causes this difference, but
haven't made progress so far, so any suggestions are appreciated (we
are trying to decide if we can confidently drop the downstream DMI
quirks for these specific machines).

  w-d-n ant_sel=0: -14.00 dBm,  69.5 Mbps -> good
  w-d-n ant_sel=1: -10.00 dBm,  41.1 Mbps -> good
  w-d-n ant_sel=2: -44.00 dBm,   607 kbps -> bad

  v4.16 ant_sel=0: -12.00 dBm,  63.0 Mbps -> good
  v4.16 ant_sel=1:  -8.00 dBm,  69.0 Mbps -> good
  v4.16 ant_sel=2: -50.00 dBm,   224 kbps -> bad

  v4.15 ant_sel=0:  -8.00 dBm,  33.0 Mbps -> good
  v4.15 ant_sel=1: -10.00 dBm,  38.1 Mbps -> good
  v4.15 ant_sel=2: -48.00 dBm,   206 kbps -> bad
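
For a sense of scale, the gap between the good and bad readings above can be converted from dB to a linear power ratio. This is only a small arithmetic sketch in Python; the function name is made up for illustration, and the two readings are the v4.16 ant_sel=1 and ant_sel=2 values from the table.

```python
def power_ratio(dbm_a: float, dbm_b: float) -> float:
    """Return how many times stronger dbm_a is than dbm_b, in linear power."""
    # A difference of 10 dB corresponds to a factor of 10 in power.
    return 10 ** ((dbm_a - dbm_b) / 10)


# v4.16 readings from the table: ant_sel=1 (-8 dBm) vs ant_sel=2 (-50 dBm)
ratio = power_ratio(-8.0, -50.0)
print(f"{ratio:.0f}x")  # 42 dB difference, about 15849x
```

In other words, the "bad" antenna selection sees roughly four orders of magnitude less received power, which fits the throughput collapsing from tens of Mbps to a few hundred kbps.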

--
João Paulo Rechi Vita
http://about.me/jprvita


Re: RTL8723BE performance regression

2018-05-01 Thread Pkshih
On Wed, 2018-05-02 at 05:44 +, Pkshih wrote:
> 
> > -Original Message-
> > From: João Paulo Rechi Vita [mailto:jprv...@gmail.com]
> > Sent: Wednesday, May 02, 2018 6:41 AM
> > To: Larry Finger
> > Cc: Steve deRosier; 莊彥宣; Pkshih; Birming Chiu; Shaofu; Steven Ting; 
> > Chaoming_Li; Kalle Valo;
> > linux-wireless; Network Development; LKML; Daniel Drake; João Paulo Rechi 
> > Vita; linux@endlessm.com
> > Subject: Re: RTL8723BE performance regression
> > 
> > On Tue, Apr 3, 2018 at 7:51 PM, Larry Finger <larry.fin...@lwfinger.net> 
> > wrote:
> > > On 04/03/2018 09:37 PM, João Paulo Rechi Vita wrote:
> > >>
> > >> On Tue, Apr 3, 2018 at 7:28 PM, Larry Finger <larry.fin...@lwfinger.net>
> > >> wrote:
> > >>
> > >> (...)
> > >>
> > >>> As the antenna selection code changes affected your first bisection, do
> > >>> you
> > >>> have one of those HP laptops with only one antenna and the incorrect
> > >>> coding
> > >>> in the FUSE?
> > >>
> > >>
> > >> Yes, that is why I've been passing ant_sel=1 during my tests -- this
> > >> was needed to achieve a good performance in the past, before this
> > >> regression. I've also opened the laptop chassis and confirmed the
> > >> antenna cable is plugged to the connector labeled with "1" on the
> > >> card.
> > >>
> > >>> If so, please make sure that you still have the same signal
> > >>> strength for good and bad cases. I have tried to keep the driver and the
> > >>> btcoex code in sync, but there may be some combinations of antenna
> > >>> configuration and FUSE contents that cause the code to fail.
> > >>>
> > >>
> > >> What is the recommended way to monitor the signal strength?
> > >
> > >
> > > The btcoex code is developed for multiple platforms by a different group
> > > than the Linux driver. I think they made a change that caused ant_sel to
> > > switch from 1 to 2. At least numerous comments at
> > > github.com/lwfinger/rtlwifi_new claimed they needed to make that change.
> > >
> > > My recommended method is to verify the wifi device name with "iw dev". 
> > > Then
> > > using that device
> > >
> > > sudo iw dev  scan | egrep "SSID|signal"
> > >
> > 
> > I have confirmed that the performance regression is indeed tied to
> > signal strength: on the good cases signal was between -16 and -8 dBm,
> > whereas in bad cases signal was always between -50 and -40 dBm. I've
> > also switched to testing bandwidth in controlled LAN environment using
> > iperf3, as suggested by Steve deRosier, with the DUT being the only
> > machine connected to the 2.4 GHz radio and the machine running the
> > iperf3 server connected via ethernet.
> > 
> 
> We have new experimental results in commit af8a41cccf8f46 ("rtlwifi: cleanup 
> 8723be ant_sel definition"). You can use the above commit and do the same 
> experiments (with ant_sel=0, 1 and 2) in your side, and then share your 
> results.
> Since performance is tied to signal strength, you can only share signal 
> strength.
> 

Please make sure to do a cold reboot whenever ant_sel is changed.



RE: RTL8723BE performance regression

2018-05-01 Thread Pkshih


> -Original Message-
> From: João Paulo Rechi Vita [mailto:jprv...@gmail.com]
> Sent: Wednesday, May 02, 2018 6:41 AM
> To: Larry Finger
> Cc: Steve deRosier; 莊彥宣; Pkshih; Birming Chiu; Shaofu; Steven Ting; 
> Chaoming_Li; Kalle Valo;
> linux-wireless; Network Development; LKML; Daniel Drake; João Paulo Rechi 
> Vita; li...@endlessm.com
> Subject: Re: RTL8723BE performance regression
> 
> On Tue, Apr 3, 2018 at 7:51 PM, Larry Finger <larry.fin...@lwfinger.net> 
> wrote:
> > On 04/03/2018 09:37 PM, João Paulo Rechi Vita wrote:
> >>
> >> On Tue, Apr 3, 2018 at 7:28 PM, Larry Finger <larry.fin...@lwfinger.net>
> >> wrote:
> >>
> >> (...)
> >>
> >>> As the antenna selection code changes affected your first bisection, do
> >>> you
> >>> have one of those HP laptops with only one antenna and the incorrect
> >>> coding
> >>> in the FUSE?
> >>
> >>
> >> Yes, that is why I've been passing ant_sel=1 during my tests -- this
> >> was needed to achieve a good performance in the past, before this
> >> regression. I've also opened the laptop chassis and confirmed the
> >> antenna cable is plugged to the connector labeled with "1" on the
> >> card.
> >>
> >>> If so, please make sure that you still have the same signal
> >>> strength for good and bad cases. I have tried to keep the driver and the
> >>> btcoex code in sync, but there may be some combinations of antenna
> >>> configuration and FUSE contents that cause the code to fail.
> >>>
> >>
> >> What is the recommended way to monitor the signal strength?
> >
> >
> > The btcoex code is developed for multiple platforms by a different group
> > than the Linux driver. I think they made a change that caused ant_sel to
> > switch from 1 to 2. At least numerous comments at
> > github.com/lwfinger/rtlwifi_new claimed they needed to make that change.
> >
> > My recommended method is to verify the wifi device name with "iw dev". Then
> > using that device
> >
> > sudo iw dev  scan | egrep "SSID|signal"
> >
> 
> I have confirmed that the performance regression is indeed tied to
> signal strength: on the good cases signal was between -16 and -8 dBm,
> whereas in bad cases signal was always between -50 and -40 dBm. I've
> also switched to testing bandwidth in controlled LAN environment using
> iperf3, as suggested by Steve deRosier, with the DUT being the only
> machine connected to the 2.4 GHz radio and the machine running the
> iperf3 server connected via ethernet.
> 

We have new experimental results in commit af8a41cccf8f46 ("rtlwifi: cleanup
8723be ant_sel definition"). You can use the above commit and do the same
experiments (with ant_sel=0, 1 and 2) on your side, and then share your results.
Since performance is tied to signal strength, it is enough to share only the
signal strength.

Regards
PK



Re: RTL8723BE performance regression

2018-05-01 Thread João Paulo Rechi Vita
On Tue, Apr 3, 2018 at 7:51 PM, Larry Finger  wrote:
> On 04/03/2018 09:37 PM, João Paulo Rechi Vita wrote:
>>
>> On Tue, Apr 3, 2018 at 7:28 PM, Larry Finger 
>> wrote:
>>
>> (...)
>>
>>> As the antenna selection code changes affected your first bisection, do
>>> you
>>> have one of those HP laptops with only one antenna and the incorrect
>>> coding
>>> in the FUSE?
>>
>>
>> Yes, that is why I've been passing ant_sel=1 during my tests -- this
>> was needed to achieve a good performance in the past, before this
>> regression. I've also opened the laptop chassis and confirmed the
>> antenna cable is plugged to the connector labeled with "1" on the
>> card.
>>
>>> If so, please make sure that you still have the same signal
>>> strength for good and bad cases. I have tried to keep the driver and the
>>> btcoex code in sync, but there may be some combinations of antenna
>>> configuration and FUSE contents that cause the code to fail.
>>>
>>
>> What is the recommended way to monitor the signal strength?
>
>
> The btcoex code is developed for multiple platforms by a different group
> than the Linux driver. I think they made a change that caused ant_sel to
> switch from 1 to 2. At least numerous comments at
> github.com/lwfinger/rtlwifi_new claimed they needed to make that change.
>
> My recommended method is to verify the wifi device name with "iw dev". Then
> using that device
>
> sudo iw dev  scan | egrep "SSID|signal"
>

I have confirmed that the performance regression is indeed tied to
signal strength: in the good cases signal was between -16 and -8 dBm,
whereas in the bad cases signal was always between -50 and -40 dBm. I've
also switched to testing bandwidth in a controlled LAN environment using
iperf3, as suggested by Steve deRosier, with the DUT being the only
machine connected to the 2.4 GHz radio and the machine running the
iperf3 server connected via ethernet.

Using those two tests (iperf3 and signal strength) I've dug deeper
into the culprit I had found previously, commit 7937f02d1953,
reverting it partially and testing the resulting driver to isolate
which change was causing the problem. Besides "hooking up external
functions for newer ICs", as described by the commit message, that
commit also added code to decide whether the ex_btc8723b1ant_*() or
ex_btc8723b2ant_*() functions should be used in halbtcoutsrc.c,
depending on the value of board_info.btdm_ant_num, whereas before that
commit ex_btc8723b2ant_*() were always used. Reverting to always using
the ex_btc8723b2ant_*() functions fixes the regression on v4.15.

I've also tried to bisect between v4.15..v4.16 to find what else was
causing problems there, as the changes mentioned above on top of v4.16
did not solve the problem. The bisect pointed to "874e837d67d0
rtlwifi: fill FW version and subversion", but reverting only it plus
the changes mentioned above also didn't yield good results. That's
when I decided to get a bit creative: starting on v4.16 I first
applied the changes to have ex_btc8723b2ant_*() always being used, as
mentioned above, then reverted every commit between v4.15..v4.16
affecting drivers/net/wireless/realtek/rtlwifi/, and verified that the
resulting kernel had good performance. Then I started trimming down
the history and testing along the way, to reduce it to the minimum set
of changes that had to be reverted in order to restore the good
performance. In addition to the ex_btc8723b2ant_*() changes and
reverting "874e837d67d0 rtlwifi: fill FW version and subversion", I
also had to remove the following lines from
drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c, which
were introduced by "40d9dd4f1c5d rtlwifi: btcoex: Remove global
variables from btcoex", in order to restore the upload performance and
signal strength.

/* set default antenna position to main  port */
btcoexist->board_info.btdm_ant_pos = BTC_ANTENNA_AT_MAIN_PORT;
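
The trimming procedure described above (revert everything, confirm the result is good, then drop reverts one at a time while it stays good) can be sketched roughly as follows. This is a simplified Python model, not the process actually used: `minimize_reverts` and `is_good` are hypothetical names, `is_good` stands in for the manual build, cold boot and measurement cycle, and the commit labels in the toy run are placeholders rather than real hashes. It also assumes reverts can be dropped independently, which real kernel history does not guarantee.

```python
from typing import Callable, List, Sequence


def minimize_reverts(candidates: Sequence[str],
                     is_good: Callable[[List[str]], bool]) -> List[str]:
    """Greedily shrink a set of reverted commits that is known to be good.

    is_good(reverts) models one manual test: build a kernel with exactly
    these commits reverted and check signal strength and throughput.
    """
    needed = list(candidates)
    if not is_good(needed):
        raise ValueError("reverting every candidate must give a good result")
    for commit in list(candidates):
        trial = [c for c in needed if c != commit]
        if is_good(trial):  # still good without this revert, so drop it
            needed = trial
    return needed


# Toy stand-in for real testing: pretend only "c2" and "c5" matter.
commits = ["c1", "c2", "c3", "c4", "c5"]
print(minimize_reverts(commits, lambda r: {"c2", "c5"} <= set(r)))
# ['c2', 'c5']
```

Each candidate costs one build-and-test cycle, so the number of kernels to test grows linearly with the number of commits touching the driver.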

These are the results I've got on v4.16 (similarly on
wireless-drivers-next-for-davem-2018-03-29 or v4.15):

 $ sudo iw dev wlp2s0 scan | grep -B3 JJ | grep signal
signal: -42.00 dBm
 $ iperf3 -c 192.168.1.254
 Connecting to host 192.168.1.254, port 5201
 [  4] local 192.168.1.253 port 39678 connected to 192.168.1.254 port 5201
 [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
 [  4]   0.00-1.00   sec   735 KBytes  6.02 Mbits/sec    1   1.41 KBytes
 [  4]   1.00-2.00   sec   274 KBytes  2.25 Mbits/sec    1   1.41 KBytes
 [  4]   2.00-3.00   sec  0.00 Bytes   0.00 bits/sec     0   1.41 KBytes
 [  4]   3.00-4.00   sec  0.00 Bytes   0.00 bits/sec     0   1.41 KBytes
 [  4]   4.00-5.00   sec  0.00 Bytes   0.00 bits/sec     1   28.3 KBytes
 [  4]   5.00-6.00   sec   423 KBytes  3.47 Mbits/sec    3   41.0 KBytes
 [  4]   6.00-7.00   sec   840 KBytes  6.88 Mbits/sec    0   58.0 KBytes
 [  4]   7.00-8.00   sec   830 KBytes  6.79 Mbits/sec    1   1.41 KBytes
 [  4]   

Re: RTL8723BE performance regression

2018-04-04 Thread Steve deRosier
On Tue, Apr 3, 2018 at 6:51 PM, João Paulo Rechi Vita  wrote:
>
> These are the results (testing with speedtest.net) I got at some key points:
>
> Version       Commit   Ping (ms)  Down (Mbps)  Up (Mbps)
>
> v4.11         a351e9b        122         5.44       5.99
> v4.11         a351e9b        131        17.02       5.89
>
> v4.13         569dbb8        174        14.08       0.00
> v4.13         569dbb8        261         8.41       0.00
>
> v4.15+revert  d8a5b80        192         3.86       1.41
> v4.15+revert  d8a5b80        189        18.69       1.39
>

I recommend doing throughput testing in a closed system using iperf.
speedtest.net is potentially useful for testing your ISP's bandwidth
at some particular point in time, but little else as it exposes you to
too many variables. I wouldn't take those numbers to mean much, and the
inconclusive results you're getting could be explained by external
network loading and have little to do with your bisect effort. I can
get that spread in numbers from speedtest.net without making any
changes other than the time of day I do the test.

Here's how to do it. Install iperf2 (you could use iperf3, personal
choice) on two machines, one being your device under test (DUT). Set up
a network configuration that looks similar to this:

server <==hardwire==> AP <--wireless link--> DUT

Be sure your hardwired link has more bandwidth than your wireless link
is capable of, or set it up so that the server is the AP. What you're
looking for here is environmental consistency, not maximum throughput
numbers.

On the computer hardwired to the network, start the server (we'll
assume it has an IP of 192.168.33.18):

iperf -s

On your DUT:

iperf -c 192.168.33.18

That's the most basic setup, check the man page for more options.

You will get best results if you can exclude other computers from your
test network and other wireless devices from your airspace.
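
Since the goal is consistency rather than peak numbers, one way to judge a setup is to repeat the same run several times and look at the spread before comparing kernels. A rough sketch in Python follows; the function name is made up, and the throughput values are invented purely for illustration, not measurements from this thread.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def summarize(mbps: Sequence[float]) -> Dict[str, float]:
    """Mean, sample standard deviation and relative spread of repeated runs."""
    avg = mean(mbps)
    sd = stdev(mbps) if len(mbps) > 1 else 0.0
    spread = sd / avg if avg else float("inf")
    return {"mean": avg, "stdev": sd, "spread": spread}


# Made-up throughput of five identical runs, in Mbps.
runs = [40.0, 42.0, 38.0, 41.0, 39.0]
print(summarize(runs))
```

If the relative spread of identical runs is already as large as the difference between two kernels, the comparison says little about the driver.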

- Steve

--
Steve deRosier
Cal-Sierra Consulting LLC
https://www.cal-sierra.com/


Re: RTL8723BE performance regression

2018-04-03 Thread Larry Finger

On 04/03/2018 09:37 PM, João Paulo Rechi Vita wrote:
> On Tue, Apr 3, 2018 at 7:28 PM, Larry Finger  wrote:
>
> (...)
>
>> As the antenna selection code changes affected your first bisection, do you
>> have one of those HP laptops with only one antenna and the incorrect coding
>> in the FUSE?
>
> Yes, that is why I've been passing ant_sel=1 during my tests -- this
> was needed to achieve a good performance in the past, before this
> regression. I've also opened the laptop chassis and confirmed the
> antenna cable is plugged to the connector labeled with "1" on the
> card.
>
>> If so, please make sure that you still have the same signal
>> strength for good and bad cases. I have tried to keep the driver and the
>> btcoex code in sync, but there may be some combinations of antenna
>> configuration and FUSE contents that cause the code to fail.
>
> What is the recommended way to monitor the signal strength?

The btcoex code is developed for multiple platforms by a different group than
the Linux driver. I think they made a change that caused ant_sel to switch from
1 to 2. At least numerous comments at github.com/lwfinger/rtlwifi_new claimed
they needed to make that change.

My recommended method is to verify the wifi device name with "iw dev". Then
using that device

sudo iw dev  scan | egrep "SSID|signal"
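
When readings need to be logged across several boots, the same filtering can be done in a short script. The sketch below is in Python and assumes the usual scan layout, in which each `BSS` block lists a `signal:` line before its `SSID:` line. iw itself warns that its output is not a stable interface, so treat this as illustrative. The function name, sample text and MAC address are made up; the interface name goes between `dev` and `scan` on the command line.

```python
import re
from typing import List, Optional, Tuple


def parse_scan(text: str) -> List[Tuple[str, float]]:
    """Return (SSID, signal in dBm) pairs from iw scan output."""
    results: List[Tuple[str, float]] = []
    signal: Optional[float] = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("BSS "):
            signal = None  # a new access point block starts here
        match = re.match(r"signal:\s*(-?\d+(?:\.\d+)?)\s*dBm", line)
        if match:
            signal = float(match.group(1))
        elif line.startswith("SSID:") and signal is not None:
            results.append((line[len("SSID:"):].strip(), signal))
            signal = None
    return results


# Hand-written sample in the assumed layout, not captured output.
sample = """\
BSS 00:11:22:33:44:55(on wlp2s0)
\tfreq: 2437
\tsignal: -42.00 dBm
\tlast seen: 120 ms ago
\tSSID: JJ
"""
print(parse_scan(sample))  # [('JJ', -42.0)]
```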

Larry





Re: RTL8723BE performance regression

2018-04-03 Thread João Paulo Rechi Vita
On Tue, Apr 3, 2018 at 7:28 PM, Larry Finger  wrote:

(...)

> As the antenna selection code changes affected your first bisection, do you
> have one of those HP laptops with only one antenna and the incorrect coding
> in the FUSE?

Yes, that is why I've been passing ant_sel=1 during my tests -- this
was needed to achieve a good performance in the past, before this
regression. I've also opened the laptop chassis and confirmed the
antenna cable is plugged to the connector labeled with "1" on the
card.

> If so, please make sure that you still have the same signal
> strength for good and bad cases. I have tried to keep the driver and the
> btcoex code in sync, but there may be some combinations of antenna
> configuration and FUSE contents that cause the code to fail.
>

What is the recommended way to monitor the signal strength?

Thanks for such a quick reply,

--
João Paulo Rechi Vita
http://about.me/jprvita


Re: RTL8723BE performance regression

2018-04-03 Thread Larry Finger

On 04/03/2018 08:51 PM, João Paulo Rechi Vita wrote:
> Hello,
>
> I've been trying to track a performance regression on the RTL8723BE
> WiFi adapter, which mainly affects the upload bandwidth (although we
> can see a decreased download performance as well, the effect on upload
> is more drastic). This was first reported by users after upgrading
> from our 4.11-based kernel to our 4.13-based kernel, but also
> confirmed to affect our development branch (4.15-based kernel) and
> wireless-drivers-next at the
> wireless-drivers-next-for-davem-2018-03-29 tag. This is happening on
> an HP laptop that needs rtl8723be.ant_sel=1 (and all the following
> tests have been made with that param).
>
> My first bisect attempt pointed me to the following commit:
>
> bcd37f4a0831 rtlwifi: btcoex: 23b 2ant: let bt transmit when hw
> initialisation done
>
> Which I later found to be already fixed by
>
> a33fcba6ec01 rtlwifi: btcoexist: Fix breakage of ant_sel for rtl8723be.
>
> That fix is already included in v4.15 though (and our dev branch as
> well), so I did a second bisect, now cherry-picking a33fcba6ec01 at
> every step, and it pointed me to the following commit:
>
> 7937f02d1953 rtlwifi: btcoex: hook external functions for newer chips
>
> Reverting that commit on top of our development branch fixes the
> problem, but on top of v4.15 I get mixed results: a few times getting
> a good upload performance (~5-6 Mbps) but most of the time just getting
> ~1-1.5 Mbps (which is still better than the 0.0 followed by a test
> failure I've gotten on most bad points of the bisect).
>
> Bisecting the downstream patches we carry on top of v4.15 (we base our
> kernel on Ubuntu's, so there are quite a few downstream changes) did
> not bring any clarity, as at all bisect points (plus reverting
> 7937f02d1953) the performance was good, so probably there was some
> other difference in the resulting kernels from my initial revert of
> that patch on top of v4.15 and each step during the bisect. I've
> experimented a bit with fwlps=0, but it did not bring any conclusive
> results either. I'll try to look at other things that may have changed
> (configuration perhaps?), but I don't have a clear plan yet.
>
> Have you seen anything similar, or have any other ideas or suggestions
> to track this problem? Even without crystal clear results, it looks
> like 7937f02d1953 is having a negative impact on the RTL8723BE
> performance, so perhaps it is worth reverting it and reworking it at a
> later point?
>
> These are the results (testing with speedtest.net) I got at some key points:
>
> Version       Commit   Ping (ms)  Down (Mbps)  Up (Mbps)
>
> v4.11         a351e9b        122         5.44       5.99
> v4.11         a351e9b        131        17.02       5.89
>
> v4.13         569dbb8        174        14.08       0.00
> v4.13         569dbb8        261         8.41       0.00
>
> v4.15+revert  d8a5b80        192         3.86       1.41
> v4.15+revert  d8a5b80        189        18.69       1.39



As the antenna selection code changes affected your first bisection, do you have 
one of those HP laptops with only one antenna and the incorrect coding in the 
FUSE? If so, please make sure that you still have the same signal strength for 
good and bad cases. I have tried to keep the driver and the btcoex code in sync, 
but there may be some combinations of antenna configuration and FUSE contents 
that cause the code to fail.


Larry