Re: [Bloat] BBR implementations, knobs to turn?
> The CPE side has met willingness to investigate these issues from early
> on, but it seems that buffer handling is much harder on CPE chipsets
> than on base station chipsets. In particular on 5G. We have had some
> very good results on 4G, but they do not translate to 5G.

My own experience with various CPE work (working with ODMs to get hardware built) is that those building CPE are stuck with what the silicon vendors will support, much like where we currently are with home routers. The AP firmware and drivers have bloated buffers, and there's very little that can be done to change that. The large OEMs (the very large, well-known retail brands) have the volume to put pressure on the silicon vendors to fix this. But there's not much incentive for a silicon vendor to address this issue for a smaller customer.

___
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat
Re: [Bloat] BBR implementations, knobs to turn?
From: Toke Høiland-Jørgensen

> > I really appreciate that you are reaching out to the bufferbloat community
> > for this real-life 5G mobile testing. Let's all help out Erik.
>
> Yes! FYI, I've been communicating off-list with Erik for quite some
> time, he's doing great work but fighting the usual uphill battle to get
> others to recognise the issues; so +1, let's give him all the help we
> can :)

Thanks! I'll need all the patting on the back which can be provided. :)

> > From your graphs, it does look like you are measuring latency
> > under-load, e.g. while the curl download/upload is running. This is
> > great as this is the first rule of bufferbloat measuring :-) (and Luca
> > hinted to this)

I've been using Flent since day one of the Fixed Wireless project. In fact, it was a part of the CPE RFQ process that all participating vendors must deliver test results from Flent as a part of the technical response. Not so much that we should save ourselves the work of testing, but to force the vendors to see how their equipment handles latency, and to try to establish Flent as a standard part of their test suite. Toke and Dave have been of great help, both in interpreting Flent results and as moral support, as it can be an uphill battle to achieve awareness of the bufferbloat issues.

> > The Huawei policer/shaper sounds scary. And a 1000-packet-deep queue
> > also sounds like a recipe for bufferbloat. I would of course like to
> > re-write the Huawei policer/shaper with the knowledge and techniques we
> > know from our bufferbloat work in the Linux kernel. (If only I knew
> > someone who coded on 5G solutions who could implement this on their
> > hardware solution, and provide a better product. Cc. Carlo)

I have tried on several occasions to get the vendors to subscribe to this mailing list. And individuals here have been willing to do consulting work for vendors in Telenor's RFQ, but as far as I know they have not been contacted.
> Your other points about bloated queues etc. are spot on. Ideally, we
> could get operators to fix their gear, but working around the issues
> like Erik is doing can work in the meantime. And it's great to see that
> it seems like Telenor is starting to roll this out; as far as I can tell
> that has taken quite a bit of advocacy from Erik's side to get there! :)

As you say, Toke, I have had some success. It is on the Huawei base station side in particular that I believe we will have the largest impact. The CPE side has met willingness to investigate these issues from early on, but it seems that buffer handling is much harder on CPE chipsets than on base station chipsets, in particular on 5G. We have had some very good results on 4G, but they do not translate to 5G.

-Erik
Re: [Bloat] BBR implementations, knobs to turn?
On Tue, Nov 17, 2020 at 10:08 AM Jesper Dangaard Brouer wrote:
>
> On Tue, 17 Nov 2020 10:05:24 + wrote:
>
> > Thank you for the response Neal
>
> Yes. And it is impressive how many highly qualified people are on the
> bufferbloat list.
>
> > old_hw # uname -r
> > 5.3.0-64-generic
> > (Ubuntu 19.10 on Xeon workstation, integrated network card, 1Gbit
> > GPON access. Used as proof of concept from the lab at work)
> >
> > new_hw # uname -r
> > 4.18.0-193.19.1.el8_2.x86_64
> > (CentOS 8.2 on Xeon rack server, discrete 10Gbit network card,
> > 40Gbit server farm link (low utilization on link), intended as a fully
> > supported and run service. Not possible to have a newer kernel and
> > still get a service agreement in my organization)
>
> Let me help out here. The CentOS/RHEL8 kernels have a huge amount of
> backports. I've attached a patch/diff of net/ipv4/tcp_bbr.c changes
> missing in RHEL8.
>
> It looks like these patches are missing in CentOS/RHEL8:
> [1] https://git.kernel.org/torvalds/c/78dc70ebaa38aa3
> [2] https://git.kernel.org/torvalds/c/a87c83d5ee25cf7
>
> Could missing patch [1] result in the issue Erik is seeing?
> (It explicitly mentions improvements for WiFi...)

Thanks, Erik, for the detailed information. This is super-useful. And thanks, Jesper, for the patch analysis.

Yes, I agree that missing patch [1] is likely the cause of the lower BBR throughput in the "new_hw" case. Since the "new_hw" is running an older kernel that's missing this important patch, it would be expected to have lower throughput in a workload like this.

It's unfortunate that it's not possible to have a newer kernel on the newer hardware; it does seem in this case that this would probably do the trick.

best,
neal
Re: [Bloat] BBR implementations, knobs to turn?
Jesper Dangaard Brouer writes:

> Hi Erik,
>
> I really appreciate that you are reaching out to the bufferbloat community
> for this real-life 5G mobile testing. Let's all help out Erik.

Yes! FYI, I've been communicating off-list with Erik for quite some time, he's doing great work but fighting the usual uphill battle to get others to recognise the issues; so +1, let's give him all the help we can :)

> From your graphs, it does look like you are measuring latency
> under-load, e.g. while the curl download/upload is running. This is
> great as this is the first rule of bufferbloat measuring :-) (and Luca
> hinted to this)
>
> The Huawei policer/shaper sounds scary. And a 1000-packet-deep queue
> also sounds like a recipe for bufferbloat. I would of course like to
> re-write the Huawei policer/shaper with the knowledge and techniques we
> know from our bufferbloat work in the Linux kernel. (If only I knew
> someone who coded on 5G solutions who could implement this on their
> hardware solution, and provide a better product. Cc. Carlo)
>
> Are you familiar with Toke's (cc) work/PhD on handling bufferbloat on
> wireless networks? (Hint: airtime fairness)
>
> Solving bufferbloat in wireless networks requires more than applying
> fq_codel on the bottleneck queue; it requires airtime fairness: doing
> scheduling based on clients' use of radio time and transmit opportunities
> (TXOPs), instead of shaping based on bytes. (This is why it can, if you
> are very careful, make sense to hold back packets a bit to
> generate a packet aggregate that only consumes one TXOP.)
>
> The culprit is that each client/mobile phone will be sending at a
> different rate, and scheduling based on bytes will cause a client with
> a low rate to consume too large a part of the shared radio airtime.
> That basically sums up Toke's PhD ;-)

Much as I of course appreciate the call-out, airtime fairness itself is not actually much of an issue with mobile networks (LTE/5G/etc)...
:) The reason being that they use TDMA scheduling enforced by the base station; so there's a central controller that enforces airtime usage built into the protocol, which ensures fairness (unless the operator explicitly configures it to be unfair for policy reasons). So the new insight in my PhD is not so much "airtime fairness is good for wireless links" as it is "we can achieve airtime fairness in CSMA/CA-scheduled networks like WiFi".

Your other points about bloated queues etc. are spot on. Ideally, we could get operators to fix their gear, but working around the issues like Erik is doing can work in the meantime. And it's great to see that it seems like Telenor is starting to roll this out; as far as I can tell that has taken quite a bit of advocacy from Erik's side to get there! :)

-Toke
Re: [Bloat] BBR implementations, knobs to turn?
Hi Erik,

I really appreciate that you are reaching out to the bufferbloat community for this real-life 5G mobile testing. Let's all help out Erik.

From your graphs, it does look like you are measuring latency under-load, e.g. while the curl download/upload is running. This is great, as this is the first rule of bufferbloat measuring :-) (and Luca hinted to this)

The Huawei policer/shaper sounds scary. And a 1000-packet-deep queue also sounds like a recipe for bufferbloat. I would of course like to re-write the Huawei policer/shaper with the knowledge and techniques we know from our bufferbloat work in the Linux kernel. (If only I knew someone who coded on 5G solutions who could implement this on their hardware solution, and provide a better product. Cc. Carlo)

Are you familiar with Toke's (cc) work/PhD on handling bufferbloat on wireless networks? (Hint: airtime fairness)

Solving bufferbloat in wireless networks requires more than applying fq_codel on the bottleneck queue; it requires airtime fairness: doing scheduling based on clients' use of radio time and transmit opportunities (TXOPs), instead of shaping based on bytes. (This is why it can, if you are very careful, make sense to hold back packets a bit to generate a packet aggregate that only consumes one TXOP.)

The culprit is that each client/mobile phone will be sending at a different rate, and scheduling based on bytes will cause a client with a low rate to consume too large a part of the shared radio airtime. That basically sums up Toke's PhD ;-)

--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer

Cc. Marek due to his twitter post[1] and link to 5G-BBR blogpost[2]:
[1] https://twitter.com/majek04/status/1329708548297732097
[2] https://blog.acolyer.org/2020/10/05/understanding-operational-5g/

On Thu, 19 Nov 2020 14:35:27 + wrote:

> Hello Luca
>
> The current PGW is a policer. What the next version will be, I'm not sure.
> However, on parts of the Huawei RAN the policing rate is set to be a
> shaper speed on the eNodeB (radio antenna), 1000 packets deep. And
> it not only shapes down to 30Mbit, but tries to aggregate packets to
> keep a level speed whenever using the radio interface. Meaning
> holding back packets a bit to try and get to 30Mbit when sending in
> bulk in case of less than 30Mbit of user traffic, 30Mbit being an
> example subscription speed.
>
> We are rolling out a fix to turn off that Huawei shaper, but it is not
> done nationwide yet.
>
> The test device is in a lab area, using close to, but not entirely,
> the same as the production 5G setup from Ericsson. There should not
> be any shapers involved in the downstream path here. There is,
> however, a bloated buffer on the upstream path which we are working on
> correcting.
>
> The curl graphs are "time to complete a curl download of x file
> size", using an Apache webserver running BBR.
>
> -Erik
>
> From: Luca Muscariello
> Sent: 19. november 2020 14:32
> To: Taraldsen Erik
> Cc: Jesper Dangaard Brouer; priyar...@google.com; bloat; Luca Muscariello
> Subject: Re: [Bloat] BBR implementations, knobs to turn?
>
> Hi Erik,
>
> one question about the PGW: is it a policer or a shaper that you have
> installed?
> Also, have you tried to run a ping session before and in parallel to the curl
> sessions?
>
> Luca
>
> On Thu, Nov 19, 2020 at 2:15 PM <erik.tarald...@telenor.com> wrote:
>
> Update:
> The 5G router was connected to a new base station. Now the limiting factor
> of throughput is the policer on the PGW in the mobile core, not the radio link
> itself. The SIM card used is limited to 30Mbit/s. This scenario favours the
> new server.
> I have attached graphs comparing radio-link-limited vs PGW
> policer results, and a zoomed-in graph of the policer.
>
> We have Huawei RAN and Ericsson RAN, rate-limited and not rate-limited
> subscriptions, 4G and 5G access, and we are migrating to a new core with a new
> PGW (policer). Starting to be a bit of a matrix to set up tests for.
>
> -Erik
>
> From: Jesper Dangaard Brouer <bro...@redhat.com>
> Sent: 17. november 2020 16:07
> To: Taraldsen Erik; Priyaranjan Jha
> Cc: bro...@redhat.com; ncardw...@google.com; bloat@lists.bufferbloat.net
> Subject: Re: [Bloat] BBR implementations, knobs to turn?
>
> On Tue, 17 Nov 2020 10:05:24 + <erik.tarald...@telenor.com> wrote:
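[Editor's sketch] Jesper's point above, that wireless scheduling should be charged in radio time rather than bytes, can be illustrated with a toy deficit-round-robin scheduler. This is a hypothetical sketch, not Toke's actual ath9k/ath10k implementation; the station names, PHY rates, and quantum are made up:

```python
# Airtime-based deficit round robin (toy model). Each station is charged for
# the *time* its frames occupy the medium (bytes * 8 / rate), so a slow
# station cannot crowd out a fast one the way byte-based scheduling allows.
from collections import deque

class Station:
    def __init__(self, name, rate_bps):
        self.name = name
        self.rate_bps = rate_bps   # current PHY rate for this client
        self.deficit_us = 0.0      # airtime credit, in microseconds
        self.queue = deque()       # pending frame sizes, in bytes

    def airtime_us(self, frame_bytes):
        return frame_bytes * 8 / self.rate_bps * 1e6

def airtime_drr(stations, quantum_us=1000, rounds=100):
    """Return total airtime (us) transmitted per station after `rounds`."""
    used = {s.name: 0.0 for s in stations}
    for _ in range(rounds):
        for s in stations:
            if not s.queue:
                continue
            s.deficit_us += quantum_us
            # Send frames while this station has enough airtime credit.
            while s.queue and s.airtime_us(s.queue[0]) <= s.deficit_us:
                cost = s.airtime_us(s.queue.popleft())
                s.deficit_us -= cost
                used[s.name] += cost
    return used

# A fast and a slow station, both backlogged with 1500-byte frames:
fast = Station("fast", 600e6)
slow = Station("slow", 6e6)
for s in (fast, slow):
    s.queue.extend([1500] * 10000)

share = airtime_drr([fast, slow])
# Airtime ends up split roughly evenly, so the fast station moves ~100x
# more bytes -- rather than the slow station eating most of the medium.
```

Scheduling on bytes instead (equal byte quanta) would give both stations the same byte count, letting the 6 Mbit/s station consume ~100x the airtime of the fast one.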
Re: [Bloat] BBR implementations, knobs to turn?
Hi Erik,

one question about the PGW: is it a policer or a shaper that you have installed?
Also, have you tried to run a ping session before and in parallel to the curl sessions?

Luca

On Thu, Nov 19, 2020 at 2:15 PM wrote:

> Update:
> The 5G router was connected to a new base station. Now the limiting
> factor of throughput is the policer on the PGW in the mobile core, not the
> radio link itself. The SIM card used is limited to 30Mbit/s. This
> scenario favours the new server. I have attached graphs comparing radio-
> link-limited vs PGW policer results, and a zoomed-in graph of the policer.
>
> We have Huawei RAN and Ericsson RAN, rate-limited and not rate-limited
> subscriptions, 4G and 5G access, and we are migrating to a new core with
> a new PGW (policer). Starting to be a bit of a matrix to set up tests for.
>
> -Erik
>
> From: Jesper Dangaard Brouer
> Sent: 17. november 2020 16:07
> To: Taraldsen Erik; Priyaranjan Jha
> Cc: bro...@redhat.com; ncardw...@google.com; bloat@lists.bufferbloat.net
> Subject: Re: [Bloat] BBR implementations, knobs to turn?
>
> On Tue, 17 Nov 2020 10:05:24 + wrote:
>
> > Thank you for the response Neal
>
> Yes. And it is impressive how many highly qualified people are on the
> bufferbloat list.
>
> > old_hw # uname -r
> > 5.3.0-64-generic
> > (Ubuntu 19.10 on Xeon workstation, integrated network card, 1Gbit
> > GPON access. Used as proof of concept from the lab at work)
> >
> > new_hw # uname -r
> > 4.18.0-193.19.1.el8_2.x86_64
> > (CentOS 8.2 on Xeon rack server, discrete 10Gbit network card,
> > 40Gbit server farm link (low utilization on link), intended as a fully
> > supported and run service. Not possible to have a newer kernel and
> > still get a service agreement in my organization)
>
> Let me help out here. The CentOS/RHEL8 kernels have a huge amount of
> backports. I've attached a patch/diff of net/ipv4/tcp_bbr.c changes
> missing in RHEL8.
> It looks like these patches are missing in CentOS/RHEL8:
> [1] https://git.kernel.org/torvalds/c/78dc70ebaa38aa3
> [2] https://git.kernel.org/torvalds/c/a87c83d5ee25cf7
>
> Could missing patch [1] result in the issue Erik is seeing?
> (It explicitly mentions improvements for WiFi...)
>
> --
> Best regards,
> Jesper Dangaard Brouer
> MSc.CS, Principal Kernel Engineer at Red Hat
> LinkedIn: http://www.linkedin.com/in/brouer
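[Editor's sketch] The policer-vs-shaper distinction Luca asks about comes down to what happens to traffic exceeding the token-bucket rate: a policer drops it, a shaper queues it (adding delay, which is the bufferbloat risk of a 1000-packet-deep shaper). A toy model with illustrative numbers, not Telenor's actual PGW configuration:

```python
def token_bucket(arrivals_bytes, rate_Bps, burst_bytes, mode, dt=0.001):
    """arrivals_bytes[i] = bytes arriving in slot i (each dt seconds long).
    Returns (delivered_bytes, dropped_bytes, max_queue_bytes)."""
    tokens = burst_bytes
    queue = 0                      # only used by the shaper
    delivered = dropped = max_queue = 0
    for arriving in arrivals_bytes:
        tokens = min(burst_bytes, tokens + rate_Bps * dt)  # refill
        if mode == "shaper":
            queue += arriving
            send = min(queue, tokens)  # excess waits in the queue
            queue -= send
            tokens -= send
            delivered += send
            max_queue = max(max_queue, queue)
        else:  # policer: no queue, excess is dropped immediately
            send = min(arriving, tokens)
            tokens -= send
            delivered += send
            dropped += arriving - send
    return delivered, dropped, max_queue

# A 10 ms burst at twice the enforced rate of 30 Mbit/s (3.75 MB/s):
burst = [7500] * 10  # 7500 bytes/ms = 60 Mbit/s
pol = token_bucket(burst, 3_750_000, 3750, "policer")
shp = token_bucket(burst, 3_750_000, 3750, "shaper")
# The policer drops roughly half the burst (loss that BBR must cope with);
# the shaper delivers everything but builds a standing queue (latency).
```

This is why a deep shaper looks "clean" in a loss graph while ruining latency-under-load, and why the two cases confuse congestion controls in different ways.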
Re: [Bloat] BBR implementations, knobs to turn?
On Tue, 17 Nov 2020 10:05:24 + wrote:

> Thank you for the response Neal

Yes. And it is impressive how many highly qualified people are on the bufferbloat list.

> old_hw # uname -r
> 5.3.0-64-generic
> (Ubuntu 19.10 on Xeon workstation, integrated network card, 1Gbit
> GPON access. Used as proof of concept from the lab at work)
>
> new_hw # uname -r
> 4.18.0-193.19.1.el8_2.x86_64
> (CentOS 8.2 on Xeon rack server, discrete 10Gbit network card,
> 40Gbit server farm link (low utilization on link), intended as a fully
> supported and run service. Not possible to have a newer kernel and
> still get a service agreement in my organization)

Let me help out here. The CentOS/RHEL8 kernels have a huge amount of backports. I've attached a patch/diff of net/ipv4/tcp_bbr.c changes missing in RHEL8.

It looks like these patches are missing in CentOS/RHEL8:
[1] https://git.kernel.org/torvalds/c/78dc70ebaa38aa3
[2] https://git.kernel.org/torvalds/c/a87c83d5ee25cf7

Could missing patch [1] result in the issue Erik is seeing?
(It explicitly mentions improvements for WiFi...)
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer

--- /home/hawk/git/redhat/kernel-rhel8/net/ipv4/tcp_bbr.c	2020-01-30 17:38:20.832726582 +0100
+++ /home/hawk/git/kernel/net-next/net/ipv4/tcp_bbr.c	2020-11-17 15:38:22.665729797 +0100
@@ -115,6 +115,14 @@ struct bbr {
 		unused_b:5;
 	u32	prior_cwnd;	/* prior cwnd upon entering loss recovery */
 	u32	full_bw;	/* recent bw, to estimate if pipe is full */
+
+	/* For tracking ACK aggregation: */
+	u64	ack_epoch_mstamp;	/* start of ACK sampling epoch */
+	u16	extra_acked[2];		/* max excess data ACKed in epoch */
+	u32	ack_epoch_acked:20,	/* packets (S)ACKed in sampling epoch */
+		extra_acked_win_rtts:5,	/* age of extra_acked, in round trips */
+		extra_acked_win_idx:1,	/* current index in extra_acked array */
+		unused_c:6;
 };

 #define CYCLE_LEN	8	/* number of phases in a pacing gain cycle */
@@ -128,6 +136,14 @@ static const u32 bbr_probe_rtt_mode_ms =
 /* Skip TSO below the following bandwidth (bits/sec): */
 static const int bbr_min_tso_rate = 120;

+/* Pace at ~1% below estimated bw, on average, to reduce queue at bottleneck.
+ * In order to help drive the network toward lower queues and low latency while
+ * maintaining high utilization, the average pacing rate aims to be slightly
+ * lower than the estimated bandwidth. This is an important aspect of the
+ * design.
+ */
+static const int bbr_pacing_margin_percent = 1;
+
 /* We use a high_gain value of 2/ln(2) because it's the smallest pacing gain
  * that will allow a smoothly increasing pacing rate that will double each RTT
  * and send the same number of packets per RTT that an un-paced, slow-starting
@@ -174,6 +190,15 @@ static const u32 bbr_lt_bw_diff = 4000 /
 /* If we estimate we're policed, use lt_bw for this many round trips: */
 static const u32 bbr_lt_bw_max_rtts = 48;

+/* Gain factor for adding extra_acked to target cwnd: */
+static const int bbr_extra_acked_gain = BBR_UNIT;
+/* Window length of extra_acked window. */
+static const u32 bbr_extra_acked_win_rtts = 5;
+/* Max allowed val for ack_epoch_acked, after which sampling epoch is reset */
+static const u32 bbr_ack_epoch_acked_reset_thresh = 1U << 20;
+/* Time period for clamping cwnd increment due to ack aggregation */
+static const u32 bbr_extra_acked_max_us = 100 * 1000;
+
 static void bbr_check_probe_rtt_done(struct sock *sk);

 /* Do we estimate that STARTUP filled the pipe? */
@@ -200,21 +225,33 @@ static u32 bbr_bw(const struct sock *sk)
 	return bbr->lt_use_bw ? bbr->lt_bw : bbr_max_bw(sk);
 }

+/* Return maximum extra acked in past k-2k round trips,
+ * where k = bbr_extra_acked_win_rtts.
+ */
+static u16 bbr_extra_acked(const struct sock *sk)
+{
+	struct bbr *bbr = inet_csk_ca(sk);
+
+	return max(bbr->extra_acked[0], bbr->extra_acked[1]);
+}
+
 /* Return rate in bytes per second, optionally with a gain.
  * The order here is chosen carefully to avoid overflow of u64. This should
  * work for input rates of up to 2.9Tbit/sec and gain of 2.89x.
  */
 static u64 bbr_rate_bytes_per_sec(struct sock *sk, u64 rate, int gain)
 {
-	rate *= tcp_mss_to_mtu(sk, tcp_sk(sk)->mss_cache);
+	unsigned int mss = tcp_sk(sk)->mss_cache;
+
+	rate *= mss;
 	rate *= gain;
 	rate >>= BBR_SCALE;
-	rate *= USEC_PER_SEC;
+	rate *= USEC_PER_SEC / 100 * (100 - bbr_pacing_margin_percent);
 	return rate >> BW_SCALE;
 }

 /* Convert a BBR bw and gain factor to a pacing rate in bytes per second. */
-static u32 bbr_bw_to_pacing_rate(struct sock *sk, u32 bw, int gain)
+static unsigned long bbr_bw_to_pacing_rate(struct sock *sk, u32 bw, int gain)
 {
 	u64 rate = bw;

@@ -242,18 +279,12 @@ static void bbr_init_pacing_rate_from_rt
 	sk->sk_pacing_rate = bbr_bw_to_pacing_rate(sk, bw, bbr_high_gain);
 }

-/* Pace using current bw estimate and a gain factor. In order to help drive the
- * network toward
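[Editor's sketch] To see concretely what the bbr_pacing_margin_percent change in the diff above does, here is that fixed-point arithmetic transcribed to Python. The bandwidth value and MSS below are made-up examples chosen to mimic Erik's 30 Mbit/s-limited SIM:

```python
# Constants mirroring the kernel's fixed-point units: bw is carried as
# packets-per-usec left-shifted by BW_SCALE; gains are scaled by BBR_UNIT.
BW_SCALE = 24
BBR_SCALE = 8
BBR_UNIT = 1 << BBR_SCALE
USEC_PER_SEC = 1_000_000

def rate_bytes_per_sec(bw, mss, gain, margin_percent):
    """Transcription of bbr_rate_bytes_per_sec() after the backported patch:
    scale the bandwidth estimate by the gain, then pace margin_percent
    below it so the bottleneck queue drains rather than grows."""
    rate = bw * mss
    rate = (rate * gain) >> BBR_SCALE
    rate *= USEC_PER_SEC // 100 * (100 - margin_percent)
    return rate >> BW_SCALE

# Example: a ~30 Mbit/s path (3.75e6 bytes/sec) with a 1448-byte MSS.
mss = 1448
bw = int(3.75e6 / mss / USEC_PER_SEC * (1 << BW_SCALE))  # pkts/usec << BW_SCALE

full  = rate_bytes_per_sec(bw, mss, BBR_UNIT, 0)  # no margin (pre-patch)
paced = rate_bytes_per_sec(bw, mss, BBR_UNIT, 1)  # ~1% margin (post-patch)
# paced comes out ~1% below full: BBR deliberately paces slightly under its
# bandwidth estimate instead of exactly at it.
```

Note the other behavioural change in the hunk: the patched kernel paces on mss_cache directly instead of tcp_mss_to_mtu(), i.e. it no longer counts header overhead against the pacing budget.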
Re: [Bloat] BBR implementations, knobs to turn?
A couple questions:

- I guess this is Linux TCP BBRv1 ("bbr" module)? What's the OS distribution and exact kernel version ("uname -r")?

- What do you mean when you say "The old server allows for more re-transmits"?

- If BBRv1 is suffering throughput problems due to high retransmit rates, then usually the retransmit rate is around 15% or higher. If the retransmit rate is that high on a radio link that is being tested, then that radio link may be having issues that should be investigated separately?

- Would you be able to take a tcpdump trace of the well-behaved and problematic traffic and share the pcap or a plot?
  https://github.com/google/bbr/blob/master/Documentation/bbr-faq.md#how-can-i-visualize-the-behavior-of-linux-tcp-bbr-connections

- Would you be able to share the output of "ss -tin" from a recently built "ss" binary, near the end of a long-lived test flow, for the well-behaved and problematic cases?
  https://github.com/google/bbr/blob/master/Documentation/bbr-faq.md#how-can-i-monitor-linux-tcp-bbr-connections

best,
neal

On Mon, Nov 16, 2020 at 10:25 AM wrote:

> I'm in the process of replacing a throughput test server. The old server
> is running a 1Gbit Ethernet card on a 1Gbit link and Ubuntu; the new, a
> 10Gbit card on a 40Gbit link and CentOS. Both have low load and Xeon
> processors.
>
> The purpose is for field installers to verify the bandwidth sold to the
> customers using known clients against known servers (4G and 5G fixed
> installations mainly).
>
> What I'm finding is that the new server is consistently delivering
> slightly lower throughput than the old server. The old server allows for
> more re-transmits and has a slightly higher congestion window than the new
> server.
>
> Is there any way to tune BBR to allow for more re-transmits (which seems
> to be the limiting factor)? Or other suggestions?
>
> (Frankly I think the old server is too aggressive for general-purpose use.
> It seems to starve out other TCP sessions more than the new server. So for
> delivering regular content to users the new implementation seems more
> balanced, but that is not the target here. We want to stress-test the
> radio link.)
>
> Regards Erik
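[Editor's sketch] The `ss -tin` output Neal asks for packs the BBR state into one line per socket. A small parser like the one below makes it easy to diff the well-behaved and problematic cases side by side. The sample line is fabricated for illustration (it is not from Erik's servers), though it follows the field layout `ss` actually prints for BBR sockets:

```python
import re

# Illustrative sample of an `ss -tin` info line for a BBR socket:
SAMPLE = ("bbr wscale:7,7 rto:204 rtt:3.25/0.25 mss:1448 cwnd:100 "
          "bbr:(bw:231.7Mbps,mrtt:2.9,pacing_gain:1.25,cwnd_gain:2) "
          "pacing_rate 286.4Mbps delivery_rate 228.3Mbps retrans:0/42")

def parse_ss_line(line):
    """Pull the fields most relevant to this thread out of one ss -tin line:
    BBR's bandwidth estimate, its min RTT, cwnd, and the retransmit counts."""
    out = {}
    m = re.search(r"bbr:\(bw:([\d.]+)Mbps,mrtt:([\d.]+)", line)
    if m:
        out["bbr_bw_mbps"] = float(m.group(1))
        out["min_rtt_ms"] = float(m.group(2))
    m = re.search(r"retrans:(\d+)/(\d+)", line)
    if m:
        out["retrans_now"] = int(m.group(1))
        out["retrans_total"] = int(m.group(2))
    m = re.search(r"\bcwnd:(\d+)", line)
    if m:
        out["cwnd_pkts"] = int(m.group(1))
    return out

stats = parse_ss_line(SAMPLE)
```

Comparing `retrans_total`, `cwnd_pkts`, and `bbr_bw_mbps` between the old and new server near the end of a long flow would directly quantify Erik's observation that "the old server allows for more re-transmits and has a slightly higher congestion window".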
[Bloat] BBR implementations, knobs to turn?
I'm in the process of replacing a throughput test server. The old server is running a 1Gbit Ethernet card on a 1Gbit link and Ubuntu; the new, a 10Gbit card on a 40Gbit link and CentOS. Both have low load and Xeon processors.

The purpose is for field installers to verify the bandwidth sold to the customers using known clients against known servers (4G and 5G fixed installations mainly).

What I'm finding is that the new server is consistently delivering slightly lower throughput than the old server. The old server allows for more re-transmits and has a slightly higher congestion window than the new server.

Is there any way to tune BBR to allow for more re-transmits (which seems to be the limiting factor)? Or other suggestions?

(Frankly, I think the old server is too aggressive for general-purpose use. It seems to starve out other TCP sessions more than the new server. So for delivering regular content to users the new implementation seems more balanced, but that is not the target here. We want to stress-test the radio link.)

Regards Erik