[Bloat] finally got the proper cake experience (was: Re: openwrt e1000e (was: Re: cake + ipv6))

2020-11-20 Thread Daniel Sterling
On Tue, Nov 17, 2020 at 7:00 PM Daniel Sterling wrote:
> Wanted to give an update on this. All my issues (odd latency, slow
> throughput, etc) went away when I switched from the in-kernel e1000e
> driver to Intel's NAPI driver.

OK y'all, after switching to this driver, I configured my openwrt
snapshot setup as follows:

** e1000e loaded with InterruptThrottleRate=0:
rmmod r8169 ; modprobe r8169
rmmod e1000e ; insmod e1000e InterruptThrottleRate=0

** All ethtool -K offloading turned off:
rx tx sg tso gso gro all off

** flow control / pause frames turned off:
autoneg off rx off tx off

** ring buffers set to smallest setting:
rx 64 tx 64

** byte queue limits set to 3000
echo 3000 > /sys/devices/pci:00/:00:1c.0/:02:00.0/net/eth1/queues/tx-0/byte_queue_limits/limit_max
echo 3000 > /sys/devices/pci:00/:00:01.0/:01:00.0/net/eth0/queues/tx-0/byte_queue_limits/limit_max

** cake configured with besteffort and no bandwidth limit:
/usr/sbin/tc qdisc add dev $WAN handle 1: root cake besteffort internet nat egress ack-filter dual-srchost ethernet
/usr/sbin/tc qdisc add dev $LAN handle 1: root cake besteffort internet ingress dual-dsthost ethernet
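
The offload, pause-frame, and ring-buffer settings above are listed only as ethtool arguments. A minimal sketch of how they might be applied per interface follows. The interface names eth0 and eth1 come from the sysfs paths above; the helper name nic_tuning_cmds is made up for illustration. The script only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Sketch only: print the ethtool commands for one interface instead of
# running them. nic_tuning_cmds is a hypothetical helper name.
nic_tuning_cmds() {
    iface="$1"
    # Turn off the listed offloads (ethtool -K).
    echo "ethtool -K $iface rx off tx off sg off tso off gso off gro off"
    # Turn off flow control / pause frames (ethtool -A).
    echo "ethtool -A $iface autoneg off rx off tx off"
    # Shrink the RX/TX rings to 64 entries (ethtool -G).
    echo "ethtool -G $iface rx 64 tx 64"
}

# eth0 and eth1 are assumed from the byte_queue_limits paths.
for iface in eth0 eth1; do
    nic_tuning_cmds "$iface"
done
```

Piping the output into sh on the router would apply the settings. Whether a given NIC accepts a 64-entry ring depends on the driver.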


AND ALL MY LATENCY IS GONE.

I've finally experienced the Zen of low latency and high throughput of
the Promised Cake.

This is amazing.

Previously, I was able to get low latency by setting the cake
"bandwidth" setting to about 40-50mbps out of the LAN interface.
Higher than that and I'd see latency.

But now, I can run as fast as the wifi / wired connections allow with
no latency anywhere on the network. This is truly wild.

Why did no one tell me I didn't need that bandwidth setting?? I
finally pushed and pushed and messed with things until I got it
working this way. If this is the intended final configuration, it
needs to be proselytized! Truly wire-line speed and no latency. This
is just, there are no words. The dream is real

Enormous thank you to everyone who worked on cake!

Thanks,
Dan
___
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat


Re: [Bloat] What is the ifb4 interface?

2020-11-20 Thread Sam
While on the subject... I'm more interested in the packet flow. Like,
where does iptables fit in between the ifb and eth0?


--Sam


On 11/18/20 7:53 AM, Kenneth Porter wrote:
I'd never understood what the ifb4 interface was for. Here's a nice 
diagram from an sqm-scripts GitHub issue.


Ken


-------- Forwarded Message --------
Subject: Re: [tohojo/sqm-scripts] diagram needed (#125)
Date: Tue, 17 Nov 2020 19:36:47 -0800
From: taggart
Reply-To: tohojo/sqm-scripts
To: tohojo/sqm-scripts
CC: Subscribed

Well, it's not great, but maybe something like this:
(image: example)

Here's the dia file (gzip'd because of GitHub):
example.dia.gz



I was trying to wrap my head around why ifb4 existed and how it worked.
Maybe it could somehow explain that qdiscs can only work on packets
leaving a device, and that's the reason for the ifb4.
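
As a rough illustration of that point: a shaping qdisc such as cake acts on packets leaving a device, so inbound traffic is redirected to an intermediate functional block (ifb) device and shaped as it leaves that device. The sketch below prints one plausible command sequence. The device names eth0 and ifb4eth0 and the helper name ifb_setup_cmds are placeholders, and the exact filter that sqm-scripts installs may differ.

```shell
#!/bin/sh
# Sketch only: print the commands that redirect inbound traffic on a real
# interface to an ifb device. ifb_setup_cmds is a hypothetical helper name.
ifb_setup_cmds() {
    iface="$1"
    ifb="$2"
    # Create the ifb device and bring it up.
    echo "ip link add name $ifb type ifb"
    echo "ip link set dev $ifb up"
    # Attach the special ingress qdisc to the real interface.
    echo "tc qdisc add dev $iface handle ffff: ingress"
    # Match every inbound packet and redirect it to the ifb device.
    echo "tc filter add dev $iface parent ffff: protocol all prio 10 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev $ifb"
    # Packets now leave the ifb device, so cake can queue them there.
    echo "tc qdisc add dev $ifb root cake besteffort ingress"
}

ifb_setup_cmds eth0 ifb4eth0
```

The redirect happens at the tc ingress hook, which runs before netfilter, so in a setup like this iptables rules would see inbound packets only after they have passed through the ifb device.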




Re: [Bloat] BBR implementations, knobs to turn?

2020-11-20 Thread Toke Høiland-Jørgensen via Bloat
Jesper Dangaard Brouer  writes:

> Hi Erik,
>
> I really appreciate that you are reaching out to the bufferbloat community
> for this real-life 5G mobile testing.  Let's all help out Erik.

Yes! FYI, I've been communicating off-list with Erik for quite some
time; he's doing great work but fighting the usual uphill battle to get
others to recognise the issues. So +1, let's give him all the help we
can :)

> From your graphs, it does look like you are measuring latency
> under load, e.g. while the curl download/upload is running.  This is
> great, as this is the first rule of bufferbloat measuring :-)  (and Luca
> hinted at this)
>
> The Huawei policer/shaper sounds scary.  And a 1000-packet-deep queue
> also sounds like a recipe for bufferbloat.  I would of course like to
> rewrite the Huawei policer/shaper with the knowledge and techniques we
> know from our bufferbloat work in the Linux kernel.  (If only I knew
> someone who coded 5G solutions and could implement this on their
> hardware, and provide a better product.  Cc. Carlo)
>
> Are you familiar with Toke's (cc) work/PhD on handling bufferbloat on
> wireless networks?  (Hint: Airtime fairness)
>
> Solving bufferbloat in wireless networks requires more than applying
> fq_codel on the bottleneck queue; it requires airtime fairness.  That
> means scheduling based on clients' use of radio time and transmit
> opportunities (TXOPs), instead of shaping based on bytes.  (This is why
> it can, if you are very careful, make sense to be "holding back packets
> a bit" to generate a packet aggregate that consumes only one TXOP.)
>
> The culprit is that each client/mobile phone will be sending at a
> different rate, and scheduling based on bytes will cause a client with
> a low rate to consume too large a part of the shared radio airtime.
> That basically sums up Toke's PhD ;-)

Much as I of course appreciate the call-out, airtime fairness itself is
not actually much of an issue with mobile networks (LTE/5G/etc)... :)

The reason being that they use TDMA scheduling enforced by the base
station; so there's a central controller that enforces airtime usage
built into the protocol, which ensures fairness (unless the operator
explicitly configures it to be unfair for policy reasons). So the new
insight in my PhD is not so much "airtime fairness is good for wireless
links" as it is "we can achieve airtime fairness in CSMA/CA-scheduled
networks like WiFi".

Your other points about bloated queues etc. are spot on. Ideally, we
could get operators to fix their gear, but working around the issues
like Erik is doing can work in the meantime. And it's great to see that
it seems like Telenor is starting to roll this out; as far as I can tell
that has taken quite a bit of advocacy from Erik's side to get there! :)

-Toke


Re: [Bloat] BBR implementations, knobs to turn?

2020-11-20 Thread Jesper Dangaard Brouer
Hi Erik,

I really appreciate that you are reaching out to the bufferbloat community
for this real-life 5G mobile testing.  Let's all help out Erik.

From your graphs, it does look like you are measuring latency
under load, e.g. while the curl download/upload is running.  This is
great, as this is the first rule of bufferbloat measuring :-)  (and Luca
hinted at this)
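
A minimal sketch of what measuring latency under load can look like: ping a host while idle, then again while a curl download is running. The function name latency_under_load, the target host, and the download URL are all placeholders, and the download has to be large enough to outlast the ping run for the comparison to mean anything.

```shell
#!/bin/sh
# Sketch only: compare ping round-trip times while idle and while a curl
# download is running. latency_under_load is a hypothetical helper name.
latency_under_load() {
    host="$1"   # host to ping (placeholder)
    url="$2"    # large file to download (placeholder)

    echo "== idle =="
    # The last two lines of ping output hold the loss and RTT summary.
    ping -c 10 "$host" | tail -n 2

    echo "== under load =="
    # Start the download in the background to load the link.
    curl -s -o /dev/null "$url" &
    load_pid=$!
    ping -c 10 "$host" | tail -n 2

    # Stop the download if it is still running.
    kill "$load_pid" 2>/dev/null
    wait "$load_pid" 2>/dev/null
    return 0
}

# Example invocation (not run automatically):
#   latency_under_load 192.0.2.1 http://192.0.2.1/bigfile
```

A large gap between the two RTT summaries is the usual sign of a bloated queue somewhere on the path.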

The Huawei policer/shaper sounds scary.  And a 1000-packet-deep queue
also sounds like a recipe for bufferbloat.  I would of course like to
rewrite the Huawei policer/shaper with the knowledge and techniques we
know from our bufferbloat work in the Linux kernel.  (If only I knew
someone who coded 5G solutions and could implement this on their
hardware, and provide a better product.  Cc. Carlo)

Are you familiar with Toke's (cc) work/PhD on handling bufferbloat on
wireless networks?  (Hint: Airtime fairness)

Solving bufferbloat in wireless networks requires more than applying
fq_codel on the bottleneck queue; it requires airtime fairness.  That
means scheduling based on clients' use of radio time and transmit
opportunities (TXOPs), instead of shaping based on bytes.  (This is why
it can, if you are very careful, make sense to be "holding back packets
a bit" to generate a packet aggregate that consumes only one TXOP.)

The culprit is that each client/mobile phone will be sending at a
different rate, and scheduling based on bytes will cause a client with
a low rate to consume too large a part of the shared radio airtime.
That basically sums up Toke's PhD ;-)
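
The arithmetic behind that claim can be sketched with two clients sharing one radio channel. The rates (100 and 10 Mbit/s) are illustrative values, not measurements; both clients are assumed to always have data queued; protocol overhead is ignored; and the function names are made up for this example. Shell integer arithmetic truncates the results.

```shell
#!/bin/sh
# Sketch only: byte-based fairness vs airtime fairness for two clients.
fast=100   # Mbit/s, illustrative
slow=10    # Mbit/s, illustrative

# Byte-based fairness: both clients move the same number of bytes, so the
# time each spends on air is inversely proportional to its rate.
# Share of airtime used by the slow client, in percent (args: fast slow).
slow_airtime_pct() { echo $(( 100 * $1 / ($1 + $2) )); }
# Throughput each client ends up with, in Mbit/s (args: fast slow).
byte_fair_rate() { echo $(( $1 * $2 / ($1 + $2) )); }

# Airtime fairness: each of the two clients gets half of the airtime, so it
# gets half of its own link rate (arg: link rate).
airtime_fair_rate() { echo $(( $1 / 2 )); }

echo "byte-fair:    slow client uses $(slow_airtime_pct $fast $slow)% of airtime"
echo "byte-fair:    each client gets about $(byte_fair_rate $fast $slow) Mbit/s"
echo "airtime-fair: fast client gets $(airtime_fair_rate $fast) Mbit/s"
echo "airtime-fair: slow client gets $(airtime_fair_rate $slow) Mbit/s"
```

With these numbers the slow client occupies about 90% of the airtime under byte-based scheduling and both clients end up near 9 Mbit/s. With equal airtime the fast client gets 50 Mbit/s and the slow one 5 Mbit/s.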

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

Cc. Marek due to his twitter post[1] and link to 5G-BBR blogpost[2]:
 [1] https://twitter.com/majek04/status/1329708548297732097
 [2] https://blog.acolyer.org/2020/10/05/understanding-operational-5g/


On Thu, 19 Nov 2020 14:35:27 +  wrote:

> Hello Luca
> 
> 
> The current PGW is a policer.   What the next version will be, I'm not sure.
> 
> 
> However on parts of the Huawei RAN the policing rate is set to be a
> shaper speed on the eNodeB (radio antenna).  1000 packets deep. And
> it not only shapes down to 30Mb, but tries to aggregate packets to
> keep a level speed whenever using the radio interface.  Meaning
> holding back packets a bit to try and get to 30Mbit when sending in
> bulk in case of less than 30Mbit user traffic.  30Mbit being an
> example subscription speed.
> 
> 
> We are rolling out a fix to turn off that Huawei shaper, but it is not
> done nationwide yet.
> 
> The test device is in a lab area, using a setup close to, but not
> entirely the same as, the production 5G setup from Ericsson.  There
> should not be any shapers involved in the downstream path here.  There
> is, however, a bloated buffer on the upstream path which we are working
> on correcting.
> 
> 
> The curl graphs are "time to complete a curl download of x file
> size", using an Apache web server running BBR.
> 
> 
> -Erik
> 
> 
> 
> From: Luca Muscariello
> Sent: 19 November 2020 14:32
> To: Taraldsen Erik
> Cc: Jesper Dangaard Brouer; priyar...@google.com; bloat; Luca Muscariello
> Subject: Re: [Bloat] BBR implementations, knobs to turn?
> 
> Hi Erick,
> 
> one question about the PGW: is it a policer or a shaper that you have
> installed?
> Also, have you tried to run a ping session before and in parallel to the
> curl sessions?
> 
> Luca
> 
> 
> 
> On Thu, Nov 19, 2020 at 2:15 PM <erik.tarald...@telenor.com> wrote:
> Update:
> The 5G router was connected to a new base station.  Now the limiting factor
> for throughput is the policer on the PGW in the mobile core, not the radio
> link itself.  The SIM card used is limited to 30Mbit/s.  This scenario favours
> the new server.  I have attached graphs comparing radio-link-limited vs PGW
> policer results, and a zoomed-in graph of the policer.
> 
> 
> We have Huawei RAN and Ericsson RAN, rate-limited and non-rate-limited
> subscriptions, 4G and 5G access, and we are migrating to a new core with a new
> PGW (policer).  It is starting to be a bit of a matrix to set up tests for.
> 
> 
> -Erik
> 
> 
> 
> From: Jesper Dangaard Brouer <bro...@redhat.com>
> Sent: 17 November 2020 16:07
> To: Taraldsen Erik; Priyaranjan Jha
> Cc: bro...@redhat.com; ncardw...@google.com; bloat@lists.bufferbloat.net
> Subject: Re: [Bloat] BBR implementations, knobs to turn?
> 
> On Tue, 17 Nov 2020 10:05:24 + <erik.tarald...@telenor.com> wrote:
> 
> > Thank you for the response Neal  
> 
> Yes. And it is impressive how many highly qualified people are on the
> bufferbloat list.
> 
> > old_hw # uname -r
> > 5.3.0-64-generic
> > (Ubuntu 19.10 on Xeon workstation, integrated network card, 1Gbit
> > GPON access.  Used as proof of concept from the lab at work)
> >
> >
> >