Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Sebastian Moeller moell...@gmx.de wrote: No idea? How would you test this (any command line to try)? The good thing with the ping is that often even the DSLAM responds, keeping external sources (i.e. hops further away in the network) of variability out of the measurement... With various third-party-internet-access (TPIA in Canada), the DSLAM is operated by the incumbent (monopoly) telco, and the layer-3 first hop is connected via PPPoE-VLAN or PPP/L2TP. So they “own” the copper lines connecting each customer to the DSLAM? And everybody else just rents their DSL service and resells them? Do they really connect to the DSLAM or to the BRAS? correct, the copper continues to be regulated; the incumbent was given a guaranteed 11-14% profit on that service for the past 75 years... Third parties get an NNI to the incumbent in a data centre. 1) for bridged ethernet DSL service (HSA in Bell Canada land), each customer shows up to the ISP in a VLAN tag. 2) for PPPoE DSL service, the traffic comes in a specific VLAN, over IP (RFC1918) via L2TP. Other parties can put copper in the ground, and in some parts of Canada, this has occurred. Also worth mentioning that AlbertaGovernmentTelephone/EdmontonTel/BCTel became TELUS, and then left the Stentor/Bell-Canada alliance, so Bell can be the third party in the west, while Telus is the third party in the centre, and Island/Aliant/NBTel/Sasktel remain government owned... and they actually do different things as a result. I think in Germany the incumbent has to either rent out the copper lines to competitors (who can put their own line cards in DSLAMs backed by their own back-bone) or rent “bit-stream” access, that is, the incumbent handles the DSL part on both ends and passes the traffic on either in the next central office or at specific transit points. I always assumed competitors renting these services would get much better guarantees than end-customers, but it seems in Canada the incumbent has found more ways to evade efficient regulation. This option exists, but the number of CLECs is large, and the move towards VDSL2 / Fiber-To-The-Neighbourhood (with much shorter copper loops!!) means that this is impractical. my incumbent telco's commercial LAN extension salesperson proudly told me how they never drop packets, even when their links are congested!!! I really hope this is the opinion of a sales person and not of the network operators who really operate the gear in the “field”. On the other hand, having sufficient buffering in the DSLAM to never have to drop a packet sounds quite manly (and a terrible waste of otherwise fine DRAM chips) ;) I think much of the buffer is the legacy Nortel Passport 15K that ties much of the system together... The Third Party ISP has a large incentive to deploy equipment that supports whatever bandwidth measurement service we might cook up. As much as I would like to think otherwise, the only way to get a BMS in the field is if all national regulators require it by law (well, maybe if the ITU would bake it into the next xDSL standard that the DSLAM has to report current line speeds via SNMP back to all downstream devices asking for it). But I am not holding my breath… My position is that if there isn't a technical specification, no regulation could possibly follow...
--
]   Never tell me the odds!                      | ipv6 mesh networks [
]   Michael Richardson, Sandelman Software Works | network architect  [
]   m...@sandelman.ca  http://www.sandelman.ca/  |   ruby on rails    [
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Sebastian Moeller moell...@gmx.de wrote: No idea? How would you test this (any command line to try)? The good thing with the ping is that often even the DSLAM responds, keeping external sources (i.e. hops further away in the network) of variability out of the measurement... With various third-party-internet-access (TPIA in Canada), the DSLAM is operated by the incumbent (monopoly) telco, and the layer-3 first hop is connected via PPPoE-VLAN or PPP/L2TP. The incumbent telco has significant incentive to make the backhaul network as congested and bufferbloated as possible, and to mis-crimp cables so that the DSL resyncs at different speeds regularly... my incumbent telco's commercial LAN extension salesperson proudly told me how they never drop packets, even when their links are congested!!! The Third Party ISP has a large incentive to deploy equipment that supports whatever bandwidth measurement service we might cook up. -- Michael Richardson -on the road-
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Sebastian Moeller moell...@gmx.de wrote: The trouble is that to measure bandwidth, you have to be able to send and receive a lot of traffic. Well that is what you typically do, but you can get away with less measurement traffic: in an ideal quiescent network sending two packets back to back should give you the bandwidth (packet size / incoming time difference of both packets), or send two packets of different size (needs synchronized clocks, then difference of packet sizes / difference of transfer times). Apparently common 802.1ah libraries in most routers can do speed tests at layer-2 for ethernet doing exactly this. (Apparently, one vendor's code is in 90% of the equipment out there, because some of this stuff involves intimate knowledge of PHYs and MII buses, and it's not worth anyone's time to write the code over again vs licensing it...) But this still requires some service on the other side. You could try to use ICMP packets, but these will only allow measuring RTT, not one-way delays (if you do this on ADSL you will find the RTT dominated by the typically much slower uplink path). If network equipment would … And correct me if I'm wrong: if you naively divide by two, you wind up overestimating the uplink speed. you can't just test that link, you have to connect to something beyond that. So it would be sweet if we could use services that are running on the machines anyway, like ping. That way the “load” of all the leaf nodes of the internet continuously measuring their bandwidth could be handled in a distributed fashion, avoiding melt-downs by synchronized measurement streams… sadly, ICMP responses are rate limited, even when they are implemented in the fast path. PPP's LCP is not, AFAIK. -- Michael Richardson -on the road-
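To make the packet-pair arithmetic above concrete, here is a minimal sketch of the receiver-side calculation in Python, assuming a hypothetical list of pre-captured (gap, size) samples for back-to-back packets; taking the minimum observed gap (i.e. the maximum estimate) is one simple way to reject cross-traffic on a not-quite-quiescent link:

    # Packet-pair bandwidth estimate: a sketch, not a robust measurement tool.
    # `pairs` holds (gap_seconds, second_packet_size_bytes) tuples observed
    # for back-to-back packets on an otherwise mostly idle link.

    def packet_pair_bandwidth(pairs):
        """Estimate bottleneck bandwidth in bits/s from packet-pair gaps.

        bandwidth = packet size / inter-arrival gap; cross-traffic only ever
        widens the gap, so the minimum gap (maximum estimate) is the best one.
        """
        estimates = [size * 8 / gap for gap, size in pairs if gap > 0]
        if not estimates:
            raise ValueError("no usable packet pairs")
        return max(estimates)

    # Three 1500-byte pairs arriving 0.75 ms, 0.80 ms and 1.90 ms apart
    # (the last widened by cross-traffic) suggest roughly 16 Mbit/s:
    print(packet_pair_bandwidth([(0.00075, 1500), (0.0008, 1500), (0.0019, 1500)]))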
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Hi Michael, On Aug 1, 2014, at 06:51 , Michael Richardson m...@sandelman.ca wrote: Sebastian Moeller moell...@gmx.de wrote: No idea? How would you test this (any command line to try)? The good thing with the ping is that often even the DSLAM responds, keeping external sources (i.e. hops further away in the network) of variability out of the measurement... With various third-party-internet-access (TPIA in Canada), the DSLAM is operated by the incumbent (monopoly) telco, and the layer-3 first hop is connected via PPPoE-VLAN or PPP/L2TP. So they “own” the copper lines connecting each customer to the DSLAM? And everybody else just rents their DSL service and resells them? Do they really connect to the DSLAM or to the BRAS? The incumbent telco has significant incentive to make the backhaul network as congested and bufferbloated as possible, and to mis-crimp cables so that the DSL resyncs at different speeds regularly… I think in Germany the incumbent has to either rent out the copper lines to competitors (who can put their own line cards in DSLAMs backed by their own back-bone) or rent “bit-stream” access, that is, the incumbent handles the DSL part on both ends and passes the traffic on either in the next central office or at specific transit points. I always assumed competitors renting these services would get much better guarantees than end-customers, but it seems in Canada the incumbent has found more ways to evade efficient regulation. my incumbent telco's commercial LAN extension salesperson proudly told me how they never drop packets, even when their links are congested!!! I really hope this is the opinion of a sales person and not of the network operators who really operate the gear in the “field”. On the other hand, having sufficient buffering in the DSLAM to never have to drop a packet sounds quite manly (and a terrible waste of otherwise fine DRAM chips) ;) The Third Party ISP has a large incentive to deploy equipment that supports whatever bandwidth measurement service we might cook up. As much as I would like to think otherwise, the only way to get a BMS in the field is if all national regulators require it by law (well, maybe if the ITU would bake it into the next xDSL standard that the DSLAM has to report current line speeds via SNMP back to all downstream devices asking for it). But I am not holding my breath… Best Regards Sebastian -- Michael Richardson -on the road-
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Hi Michael, On Aug 1, 2014, at 06:21 , Michael Richardson m...@sandelman.ca wrote: On symmetric links, particularly PPP ones, one can use the LCP layer to do echo requests to the first layer-3 device. This can be used to measure RTT and, through some math, the bandwidth. Sure. On asymmetric links, my instinct is that if you can measure the downlink speed through another mechanism, then one might be able to subtract, but I can't think exactly how right now. I'm thinking that one can observe the downlink speed by observing packet arrival times/sizes for a while --- the calculation might be too low if the sender is congested otherwise, but the average should go up slowly. If you go this route, I would rather look at the minimum delay between incoming packets as a function of the size of the second packet. At first, this means that subtracting the downlink bandwidth from the uplink bandwidth will, I think, result in too high an uplink speed, which will result in rate limiting to a too high value, which is bad. But given all the uncertainties right now, finding the proper shaping bandwidths is an iterative process anyway, though one that is best started with a decent initial guess. My thinking is that with binary search I would want to definitely see decent latency under load after the first reduction... But, is there something wrong with my notion? My other notion is that the LCP packets could be time stamped by the PPP(oE) gateway, and this would solve the asymmetry. If both devices are time synchronized to a close enough delta that would be great. Initial testing with icmp timestamp requests makes me doubt the quality of synchronization (at least right now). This would take an IETF action to make standard and a decade to get deployed, but it might be a clearly measurable marketing win for ISPs. But if the “grown ups” can be made to act, wouldn’t we rather see nice end-user query-able SNMP information about the current up and downlink rates (and at what protocol level, e.g. 2400Mbps down, 1103Kbps up ATM carrier)? (For all I know the DSLAMs/BRASes might already support this) Best Regards Sebastian -- Michael Richardson -on the road-
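Sebastian's minimum-delay idea in the message above can be turned into a small calculation: if the minimum inter-arrival delay grows linearly with the size of the second packet, the slope is the serialization time per byte, i.e. the inverse of the downlink bandwidth. A sketch under that assumption (the sample data is invented for illustration):

    # Fit min_delay(size) = size/bandwidth + c by ordinary least squares.
    # `samples` maps second-packet size in bytes to the minimum observed
    # inter-arrival delay in seconds (hypothetical, pre-collected data).

    def downlink_from_min_delays(samples):
        sizes = list(samples)
        delays = [samples[s] for s in sizes]
        n = len(sizes)
        mean_x = sum(sizes) / n
        mean_y = sum(delays) / n
        slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, delays)) \
            / sum((x - mean_x) ** 2 for x in sizes)
        return 8 / slope  # seconds/byte -> bits/second

    # On a 16 Mbit/s link, 1500 bytes serialize in 0.75 ms, 300 bytes in 0.15 ms:
    print(downlink_from_min_delays({300: 0.00015, 750: 0.000375, 1500: 0.00075}))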
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Hi David, On Jul 27, 2014, at 00:24 , David Lang da...@lang.hm wrote: On Sat, 26 Jul 2014, David Lang wrote: On Sat, 26 Jul 2014, Sebastian Moeller wrote: On Jul 26, 2014, at 22:39 , David Lang da...@lang.hm wrote: by how much tuning is required, I wasn't meaning how frequently to tune, but how close default settings can come to the performance of an expertly tuned setup. Good question. Ideally the tuning takes into account the characteristics of the hardware of the link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN, VLAN tagging, ethernet with jumbo packet support for example), then you have overhead from the encapsulation that you would ideally take into account when tuning things. the question I'm talking about below is how much do you lose compared to the ideal if you ignore this sort of thing and just assume that the wire is dumb and puts the bits on them as you send them? By dumb I mean don't even allow for inter-packet gaps, don't measure the bandwidth, don't try to pace inbound connections by the timing of your acks, etc. Just run BQL and fq_codel and start the BQL sizes based on the wire speed of your link (Gig-E on the 3800) and shrink them based on long-term passive observation of the sender. As data talks I just did a quick experiment with my ADSL2+ line at home. The solid lines in the attached plot show the results for proper shaping with SQM (shaping to 95% of the link rates of downstream and upstream while taking the link layer properties, that is ATM encapsulation and per packet overhead, into account); the broken lines show the same system with the link layer adjustments and per packet overhead adjustments disabled, but still shaping to 95% of link rate (this is roughly equivalent to 15% underestimation of the packet size). The actual test is netperf-wrapper's RRUL (4 tcp streams up, 4 tcp streams down while measuring latency with ping and UDP probes). As you can see from the plot just getting the link layer encapsulation wrong destroys latency under load badly. The host is ~52ms RTT away, and with fq_codel the ping time per leg is just increased by one codel target of 5ms each, resulting in a modest latency increase of ~10ms with proper shaping for a total of ~65ms; with improper shaping RTTs increase to ~95ms (they almost double), so RTT increases by ~43ms. Also note how the extremes for the broken lines are much worse than for the solid lines. In short I would estimate that a slight misjudgment (15%) results in almost 80% increase of latency under load. In other words getting the rates right matters a lot. (I should also note that in my setup there is a secondary router that limits RTT to max 300ms, otherwise the broken lines might look even worse...) is this with BQL/fq_codel in both directions or only in one direction? So by shaping to below line rate the bottleneck is actually happening inside cerowrt, and there I run BQL (which does not matter since due to shaping the NIC's buffer does not fill up anyway) and fq_codel in both directions. Best Regards Sebastian David Lang what is the latency like without BQL and codel? the pre-bufferbloat version?
(without any traffic shaping) I agree that going from 65ms to 95ms seems significant, but if the stock version goes up above 1000ms, then I think we are talking about things that are 'close'. Assuming that latency under load without the improvements goes >1000ms: fast-slow (in ms): ideal=10, untuned=43, bloated >1000. fast/slow: ideal=1.25, untuned=1.83, bloated ~19. slow/fast: ideal=0.8, untuned=0.55, bloated=0.05. Rather than looking at how much worse it is than the ideal, look at how much closer it is to the ideal than to the bloated version. David Lang
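For readers wondering why ignoring the ATM framing misjudges the effective rate by roughly 15%: ADSL carries each IP packet in 53-byte ATM cells holding 48 payload bytes each, plus per-packet encapsulation overhead, so the on-wire size is quantized upward. A sketch of the calculation (the 40-byte overhead is only an illustrative value; the real figure depends on the encapsulation, e.g. PPPoA vs PPPoE/LLC):

    import math

    def atm_wire_size(ip_packet_bytes, per_packet_overhead=40):
        """Bytes actually sent on an ADSL link for one IP packet.

        Payload plus encapsulation overhead is padded up to a whole number
        of 48-byte ATM cell payloads; every cell costs 53 bytes on the wire.
        """
        cells = math.ceil((ip_packet_bytes + per_packet_overhead) / 48)
        return cells * 53

    for size in (64, 576, 1500):
        wire = atm_wire_size(size)
        print(f"{size:5d} B IP packet -> {wire:5d} B on the wire "
              f"({100 * (wire - size) / size:.0f}% overhead)")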
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Hi David, On Jul 27, 2014, at 02:49 , David Lang da...@lang.hm wrote: On Sun, 27 Jul 2014, Sebastian Moeller wrote: On Jul 27, 2014, at 00:53 , David Lang da...@lang.hm wrote: On Sun, 27 Jul 2014, Sebastian Moeller wrote: Hi David, On Jul 26, 2014, at 23:45 , David Lang da...@lang.hm wrote: On Sat, 26 Jul 2014, Sebastian Moeller wrote: On Jul 26, 2014, at 22:39 , David Lang da...@lang.hm wrote: by how much tuning is required, I wasn't meaning how frequently to tune, but how close default settings can come to the performance of an expertly tuned setup. Good question. Ideally the tuning takes into account the characteristics of the hardware of the link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN, VLAN tagging, ethernet with jumbo packet support for example), then you have overhead from the encapsulation that you would ideally take into account when tuning things. the question I'm talking about below is how much do you lose compared to the ideal if you ignore this sort of thing and just assume that the wire is dumb and puts the bits on them as you send them? By dumb I mean don't even allow for inter-packet gaps, don't measure the bandwidth, don't try to pace inbound connections by the timing of your acks, etc. Just run BQL and fq_codel and start the BQL sizes based on the wire speed of your link (Gig-E on the 3800) and shrink them based on long-term passive observation of the sender. As data talks I just did a quick experiment with my ADSL2+ line at home. The solid lines in the attached plot show the results for proper shaping with SQM (shaping to 95% of the link rates of downstream and upstream while taking the link layer properties, that is ATM encapsulation and per packet overhead, into account); the broken lines show the same system with the link layer adjustments and per packet overhead adjustments disabled, but still shaping to 95% of link rate (this is roughly equivalent to 15% underestimation of the packet size). The actual test is netperf-wrapper's RRUL (4 tcp streams up, 4 tcp streams down while measuring latency with ping and UDP probes). As you can see from the plot just getting the link layer encapsulation wrong destroys latency under load badly. The host is ~52ms RTT away, and with fq_codel the ping time per leg is just increased by one codel target of 5ms each, resulting in a modest latency increase of ~10ms with proper shaping for a total of ~65ms; with improper shaping RTTs increase to ~95ms (they almost double), so RTT increases by ~43ms. Also note how the extremes for the broken lines are much worse than for the solid lines. In short I would estimate that a slight misjudgment (15%) results in almost 80% increase of latency under load. In other words getting the rates right matters a lot. (I should also note that in my setup there is a secondary router that limits RTT to max 300ms, otherwise the broken lines might look even worse...) what is the latency like without BQL and codel? the pre-bufferbloat version? (without any traffic shaping) So I just disabled SQM and the plot looks almost exactly like the broken line plot I sent before (~95ms RTT up from 55ms unloaded, with single pings delayed for >1000ms, just as with the broken line; with proper shaping even extreme pings stay <100ms). But as I said before I need to run through my ISP supplied primary router (not just a dumb modem) that also tries to bound the latencies under load to some degree.
Actually I just repeated the test connected directly to the primary router and get the same ~95ms average ping time with frequent extremes >1000ms, so it looks like just getting the shaping wrong by 15% eradicates the buffer de-bloating efforts completely... just so I understand this completely, you have debloated box - ISP router - ADSL - Internet - debloated server? Well more like: Macbook with dubious bloat-state - wifi to de-bloated cerowrt box that shapes the traffic - ISP router - ADSL - internet - server I assume that Dave de-bloated these servers well, but it should not really matter as the problem are the buffers on both ends of the bottleneck ADSL link. right, I was forgetting that unless you are the bottleneck, you aren't buffering anything and so debloating makes no difference. In a case like yours where you can't debloat the actual bottleneck, the best that you can do is to artificially become the bottleneck by shaping the traffic. but on the download side it's much harder. Actually, all RRUL plots that Dave collected show that ingress shaping does work quite well on average. It will fail with a severe DOS, but let’s face it, these can only be mitigated by the ISP anyways… What are we aiming for? something that will show the problem clearly so that fixes can be put in the right place? or a work-around to use in the meantime?
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
On Jul 27, 2014, at 03:04 , David Lang da...@lang.hm wrote: On Sun, 27 Jul 2014, Sebastian Moeller wrote: [...] Think of the router ASICs that handle the 'normal' traffic in the ASIC in the card, but 'unusual' traffic needs to be sent to the core CPU to be processed and is therefore MUCH slower. Except for my ICMP RTT measurements I still saw quantization steps in accordance with the expected best case RTT for a packet, showing that the slow processing at least is constant and hence easy to get rid of in measurements... yeah, I have to remind myself of the perfect is the enemy of good enough frequently as well. I tend to fall into that trap pretty easily, as this discussion has shown :-) ping is easy to test. As a thought, is the response time of NTP queries any more or less stable? No idea? How would you test this (any command line to try)? The good thing with the ping is that often even the DSLAM responds, keeping external sources (i.e. hops further away in the network) of variability out of the measurement... One thought I have is to require a high TTL on the packets for the services to respond to them. That way any abuse of the service would have to take place from very close on the network. Ideally these services would only respond to senders that are directly connected, but until these services are deployed and enabled by default, there is going to need to be the ability to 'jump over' old equipment. This need will probably never go away completely. But if we need to modify DSLAMs and CMTSs it would be much nicer if we could just ask nicely what the current negotiated bandwidths are ;) negotiated bandwidth and effective bandwidth are not the same. what if you can't talk to the devices directly connected to the DSL line, but only to a router one hop on either side? In my limited experience the typical bottleneck is the DSL line, so if we shape for that we are fine… Assume for a moment the DSLAM uplink is so congested because of oversubscription of the DSLAM that now this constitutes the bottleneck. Now the available bandwidth for each user depends on the combined traffic of all users, not a situation we can reasonably shape for anyway (I would hope that ISPs monitor this situation and would remedy it by adding uplink capacity, so this hopefully is just a transient event). for DSL you are correct, it's a point-to-point connection (star network topology), but we have other technologies used in homes that are shared-media bus topology networks. This includes cablemodems and wireless links. Well, yes I understand, but you again would assume that the cable ISP tries to provision the system so that most users are happy, so congestion is not the rule? Even then I think cable guarantees some minimum rates per user, no? With wireless it is worse in that RF events outside of the ISP's and end user's control can ruin the day. guarantee is too strong a word. It depends on how much competition there is. 15 years or so ago I moved from a 3Mb cablemodem to a 128K IDSL line and saw my performance increase significantly. I used to think exactly the same, but currently I tend to think that the difference is about how well managed a node is, not so much the access technology; with DSL the shared medium is the link connecting the DSLAM to the backbone, and if this is congested it is similar to a busy cable node. In both cases the ISP needs to make sure the shared segment's congestion is well managed.
It might be that DSLAMs are typically better managed, as telcos always dealt with interactive (bi-directional) traffic while cable traditionally was a one-directional transport. So I assume both have different traditions about provisioning. I could be off my rocker here ;) for example, I can't buy (at least not for anything close to a reasonable price) a router to run at home that has a DSL port on it, so I will always have some device between me and the DSL. http://wiki.openwrt.org/toh/tp-link/td-w8970 or no 5GHz wireless? Could be, but definitely reasonably priced, probably cheap enough to use as a smart de-bloated DSL modem, so your main router does not need HTB traffic shaping on uplink anymore. I might actually go that route since I really dislike my ISP primary router, but I digress... http://www.traverse.com.au/products ? I couldn't figure out where to buy one through their site. Maybe they only sell in AU. I guess I just wanted to be helpful; if you had the DSL modem in the router under cerowrt's control you would not need to use a traffic shaper for your uplink, as you could apply the BQL ideas to the ADSL driver. If you have a shared media (cable, wireless, etc), the negotiated speed is meaningless. Not exactly meaningless, it gives you an upper bound... true, but is an upper bound good enough? How close
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Hi Wes, On Jul 25, 2014, at 22:48 , Wes Felter w...@felter.org wrote: The Netgear stock firmware measures bandwidth on every boot or link up (not sure which) and I would suggest doing the same for CeroWRT. Do you need to measure Internet bandwidth or last mile bandwidth? I think you want the bandwidth of the usual bottleneck; on DSL that typically is the actual DSL-link to the DSLAM (even though the DSLAM is oversubscribed, typically its upstream link is not congested…). I think with DOCSIS it is the same. Realistically bandwidth measurements are going to be sporadic, so this will only help with pretty constant bottlenecks anyway, no use in trying to track, say, the DSLAM congestion that transiently happens during peak use time... For link bandwidth it seems like you can solve a lot of problems by measuring to the first hop router. And that would be sweet, but with DT’s network the first hop does not respond to ICMP probes, nor anything else under end user control; also the bottleneck might actually be in the BRAS, which can be upstream of the DSLAM. What would be great is if all CPE would return the current link rates per SNMP or so… Or if DSLAMs and CMTSs would supply data sinks and sources for easy testing of good-put. Does the packet pair technique work on TDMA link layers like DOCSIS? Toke and Dave dug up a paper showing that packet pair is not a reliable estimator for link bandwidth. So one could send independent packets of differing size, but then one needs to synchronize the clocks somehow… Best Regards Sebastian -- Wes Felter
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Hi David, On Jul 25, 2014, at 22:57 , David Lang da...@lang.hm wrote: On Fri, 25 Jul 2014, Wes Felter wrote: The Netgear stock firmware measures bandwidth on every boot or link up (not sure which) and I would suggest doing the same for CeroWRT. Do you need to measure Internet bandwidth or last mile bandwidth? For link bandwidth it seems like you can solve a lot of problems by measuring to the first hop router. Does the packet pair technique work on TDMA link layers like DOCSIS? The trouble is that to measure bandwidth, you have to be able to send and receive a lot of traffic. Well that is what you typically do, but you can get away with less measurement traffic: in an ideal quiescent network sending two packets back to back should give you the bandwidth (packet size / incoming time difference of both packets), or send two packets of different size (needs synchronized clocks, then difference of packet sizes / difference of transfer times). unless the router you are connecting to is running some sort of service to support that, But this still requires some service on the other side. You could try to use ICMP packets, but these will only allow measuring RTT, not one-way delays (if you do this on ADSL you will find the RTT dominated by the typically much slower uplink path). If network equipment would be guaranteed to use NTP for decent clock synchronization and would respond to timestamp ICMP messages with timestamp reply, measuring bandwidth might be “cheap” enough to keep running in the background, though. Since this looks too simple there must be a simple reason why this would fail. (It would be nice if ping packets with timestamps would have required the echo server to also store its incoming timestamp in the echo, but I digress) I note that gargoyle uses a sparse stream of ping packets to a close host and uses increases in RTT as a proxy for congestion and a signal to throttle the downstream link… you can't just test that link, you have to connect to something beyond that. So it would be sweet if we could use services that are running on the machines anyway, like ping. That way the “load” of all the leaf nodes of the internet continuously measuring their bandwidth could be handled in a distributed fashion, avoiding melt-downs by synchronized measurement streams… Best Regards Sebastian David Lang
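The gargoyle approach Sebastian mentions is easy to prototype. A hedged sketch, assuming a Linux-style ping binary, an invented target address, and a shaper hook left as a print statement: it probes a nearby host once a second, learns a baseline RTT, and treats a sustained rise as the upstream buffer filling:

    import re
    import subprocess
    import time

    TARGET = "192.0.2.1"   # hypothetical first pingable hop
    MARGIN_MS = 20         # RTT rise treated as congestion (illustrative)

    def one_ping_ms(host):
        """Return one RTT sample in ms, or None if the probe was lost."""
        out = subprocess.run(["ping", "-c", "1", "-W", "1", host],
                             capture_output=True, text=True).stdout
        m = re.search(r"time=([\d.]+) ms", out)
        return float(m.group(1)) if m else None

    baseline = None
    while True:
        rtt = one_ping_ms(TARGET)
        if rtt is not None:
            baseline = rtt if baseline is None else min(baseline, rtt)
            if rtt > baseline + MARGIN_MS:
                print(f"RTT {rtt:.1f} ms vs baseline {baseline:.1f} ms: "
                      "buffer filling, throttle the downstream shaper here")
        time.sleep(1)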
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Hi David, On Jul 25, 2014, at 23:03 , David Lang da...@lang.hm wrote: On Fri, 25 Jul 2014 14:37:34 -0400, valdis.kletni...@vt.edu wrote: On Sat, 24 May 2014 10:02:53 -0400, R. said: Further, this function could be auto-scheduled or made enabled on router boot up. Yeah, if such a thing worked, it would be good. (Note in the following that a big part of my *JOB* is doing What could possibly go wrong? analysis on mission-critical systems, which tends to color my viewpoint on projects. I still think the basic concept is good, just difficult to do, and am listing the obvious challenges for anybody brave enough to tackle it... :) I must be missing something important which prevents this. What is it? There's a few biggies. The first is what the linux-kernel calls -ENOPATCH - nobody's written the code. The second is you need an upstream target someplace to test against. You need to deal with both the server is unavailable due to a backhoe incident 2 time zones away problem (which isn't *that* hard, just default to Something Not Obviously Bad(TM)), and server is slashdotted (which is a bit harder to deal with). Remember that there's some really odd corner cases to worry about - for instance, if there's a power failure in a town, then when the electric company restores power you're going to have every cerowrt box hit the server within a few seconds - all over the same uplink most likely. No good data can result from that... (Holy crap, it's been almost 3 decades since I first saw a Sun 3/280 server tank because 12 Sun 3/50s all rebooted over the network at once when building power was restored). And if you're in Uzbekistan and the closest server netwise is at 60 Hudson, the analysis to compute the correct values becomes interesting. Dealing with non-obvious error conditions is also a challenge - a router may only boot once every few months. And if you happen to be booting just as a BGP routing flap is causing your traffic to take a vastly suboptimal path, you may end up encoding a vastly inaccurate setting and have it stuck there, causing suckage for non-obvious reasons for the non-technical, so you really don't want to enable auto-tuning unless you also have a good plan for auto-*RE*tuning. have the router record its finding, and then repeat the test periodically, recording its finding as well. If the new finding is substantially different from the prior ones, schedule a retest 'soon' (or default to the prior setting if it's bad enough); otherwise, if there aren't many samples, schedule a test 'soon'; if there are a lot of samples, schedule a test in a while. Yeah, keeping some history to “predict” when to measure next sounds clever. However, I think the big question is how much tuning is required. I assume in most cases you need to measure the home-router's bandwidth rarely (say on DSL only after a re-sync with the DSLAM), but you need to measure the bandwidth early, as only then can you properly shape the downlink. And we need to know the link’s capacity to use traffic shaping so that BQL and fq_codel in the router have control over the bottleneck queue… An equivalent of BQL and fq_codel running in the DSLAM/CMTS and CPE obviously would be what we need, because then BQL and fq_codel on the router would be all that is required. But that does not seem like it is happening anytime soon, so we still need to work around the limitations in the equipment for a long time to come, I fear.
If a connection with BQL and fq_codel is 90% as good as a tuned setup, default to untuned unless the user explicitly hits a button to measure (and then a second button to accept the measurement). If BQL and fq_codel by default are ~70% as good as a tuned setup, there's more space to argue that all setups must be tuned, but then the question is how do they fare against an old, non-BQL, non-fq_codel setup? if they are considerably better, it may still be worthwhile. Best Regards Sebastian David Lang
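David's record-and-retest policy from the previous message fits in a few lines. A sketch under stated assumptions (the 'soon'/'later' delays, the 20% 'substantially different' threshold, and the sample count are all invented for illustration):

    SOON, LATER = 3600, 7 * 24 * 3600   # retest delays in seconds

    def next_test_delay(history, new_rate, threshold=0.2, enough_samples=10):
        """Decide when to re-measure, given past findings and a fresh one."""
        history.append(new_rate)
        prior = sorted(history[:-1])
        if prior:
            median = prior[len(prior) // 2]
            if abs(new_rate - median) / median > threshold:
                return SOON      # suspicious result, confirm it quickly
        if len(history) < enough_samples:
            return SOON          # few samples, keep building confidence
        return LATER             # stable link, test rarely

    history = [16.0, 15.8, 16.1]           # Mbit/s findings so far
    print(next_test_delay(history, 8.2))   # big drop -> retest in an hour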
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
On Sat, 26 Jul 2014, Sebastian Moeller wrote: Hi David, On Jul 25, 2014, at 22:57 , David Lang da...@lang.hm wrote: On Fri, 25 Jul 2014, Wes Felter wrote: The Netgear stock firmware measures bandwidth on every boot or link up (not sure which) and I would suggest doing the same for CeroWRT. Do you need to measure Internet bandwidth or last mile bandwidth? For link bandwidth it seems like you can solve a lot of problems by measuring to the first hop router. Does the packet pair technique work on TDMA link layers like DOCSIS? The trouble is that to measure bandwidth, you have to be able to send and receive a lot of traffic. Well that is what you typically do, but you can get away with less measurement traffic: in an ideal quiescent network sending two packets back to back should give you the bandwidth (packet size / incoming time difference of both packets), or send two packets of different size (needs synchronized clocks, then difference of packet sizes / difference of transfer times). Except that your ideal network doesn't exist in the real world. You are never going to have the entire network quiescent, and the router you are going to be talking to is always going to have other things going on, which can affect its timing. unless the router you are connecting to is running some sort of service to support that, But this still requires some service on the other side. You could try to use ICMP packets, but these will only allow measuring RTT, not one-way delays (if you do this on ADSL you will find the RTT dominated by the typically much slower uplink path). If network equipment would be guaranteed to use NTP for decent clock synchronization and would respond to timestamp ICMP messages with timestamp reply, measuring bandwidth might be “cheap” enough to keep running in the background, though. Since this looks too simple there must be a simple reason why this would fail. (It would be nice if ping packets with timestamps would have required the echo server to also store its incoming timestamp in the echo, but I digress) I note that gargoyle uses a sparse stream of ping packets to a close host and uses increases in RTT as a proxy for congestion and a signal to throttle the downstream link… As you say, anything that requires symmetrical traffic (like ICMP) isn't going to work, and routers do not currently offer any service that will. you also can't count on time being synced properly. Top Tier companies have trouble doing that in their dedicated datacenters; depending on it for this sort of testing is a non-starter. you can't just test that link, you have to connect to something beyond that. So it would be sweet if we could use services that are running on the machines anyway, like ping. That way the “load” of all the leaf nodes of the internet continuously measuring their bandwidth could be handled in a distributed fashion, avoiding melt-downs by synchronized measurement streams… Well, let's talk about what we would like to have on the router. As I see it, we want to have two services: 1. a service you send a small amount of data to and it responds by sending you a large amount of data (preferably with the most accurate timestamps it has and the TTL of the packets it received) 2. a service you send a large amount of data to and it responds by sending you small responses, telling you how much data it has received (with a timestamp and what the TTL of the packets it received were) questions: A. Protocol: should these be UDP/TCP/SCTP/raw IP packets/???
TCP has the problem of slow start, so it would need substantially more traffic to flow to reach steady-state. anything else has the possibility of taking a different path through the router/switch software and so the performance may not be the same. B. How much data is needed to be statistically accurate? Too many things can happen for 1-2 packets to tell you the answer. The systems on both ends are multi-tasking, and at high speeds, scheduling jitter will throw off your calculations with too few packets. C. How can this be prevented from being used for DoS attacks, either against the thing running the service or against someone else via a reflected attack if it's a forgeable protocol (i.e. UDP)? One thought I have is to require a high TTL on the packets for the services to respond to them. That way any abuse of the service would have to take place from very close on the network. Ideally these services would only respond to senders that are directly connected, but until these services are deployed and enabled by default, there is going to need to be the ability to 'jump over' old equipment. This need will probably never go away completely. Other requirements or restrictions? David Lang
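As a thought experiment, service 1 combined with David's high-TTL requirement might look like the following sketch (Linux-specific ancillary-data handling; the port, reply size, and TTL cutoff are invented for illustration):

    import socket
    import struct
    import time

    IP_RECVTTL = 12   # Linux value; not exported by the socket module everywhere

    # Service 1 sketch: small request in, large timestamped reply out, but
    # only for senders whose packets still carry a high TTL, i.e. senders
    # close by on the network, which limits reflected-DoS abuse.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, IP_RECVTTL, 1)
    sock.bind(("0.0.0.0", 9999))          # hypothetical measurement port

    while True:
        data, ancdata, _, addr = sock.recvmsg(2048, socket.CMSG_SPACE(4))
        ttl = None
        for level, ctype, cdata in ancdata:
            if level == socket.IPPROTO_IP and ctype == socket.IP_TTL:
                ttl = cdata[0]             # TTL fits in the first byte on Linux
        if ttl is not None and ttl >= 63:  # e.g. started at 64, took one hop
            reply = struct.pack("!d", time.time()).ljust(1400, b"x")
            sock.sendto(reply, addr)       # large reply for downlink timing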
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
On Sat, 26 Jul 2014, Sebastian Moeller wrote: Hi David, On Jul 25, 2014, at 23:03 , David Lang da...@lang.hm wrote: On Fri, 25 Jul 2014 14:37:34 -0400, valdis.kletni...@vt.edu wrote: On Sat, 24 May 2014 10:02:53 -0400, R. said: Further, this function could be auto-scheduled or made enabled on router boot up. Yeah, if such a thing worked, it would be good. (Note in the following that a big part of my *JOB* is doing What could possibly go wrong? analysis on mission-critical systems, which tends to color my viewpoint on projects. I still think the basic concept is good, just difficult to do, and am listing the obvious challenges for anybody brave enough to tackle it... :) I must be missing something important which prevents this. What is it? There's a few biggies. The first is what the linux-kernel calls -ENOPATCH - nobody's written the code. The second is you need an upstream target someplace to test against. You need to deal with both the server is unavailable due to a backhoe incident 2 time zones away problem (which isn't *that* hard, just default to Something Not Obviously Bad(TM)), and server is slashdotted (which is a bit harder to deal with). Remember that there's some really odd corner cases to worry about - for instance, if there's a power failure in a town, then when the electric company restores power you're going to have every cerowrt box hit the server within a few seconds - all over the same uplink most likely. No good data can result from that... (Holy crap, it's been almost 3 decades since I first saw a Sun 3/280 server tank because 12 Sun 3/50s all rebooted over the network at once when building power was restored). And if you're in Uzbekistan and the closest server netwise is at 60 Hudson, the analysis to compute the correct values becomes interesting. Dealing with non-obvious error conditions is also a challenge - a router may only boot once every few months. And if you happen to be booting just as a BGP routing flap is causing your traffic to take a vastly suboptimal path, you may end up encoding a vastly inaccurate setting and have it stuck there, causing suckage for non-obvious reasons for the non-technical, so you really don't want to enable auto-tuning unless you also have a good plan for auto-*RE*tuning. have the router record its finding, and then repeat the test periodically, recording its finding as well. If the new finding is substantially different from the prior ones, schedule a retest 'soon' (or default to the prior setting if it's bad enough); otherwise, if there aren't many samples, schedule a test 'soon'; if there are a lot of samples, schedule a test in a while. Yeah, keeping some history to “predict” when to measure next sounds clever. However, I think the big question is how much tuning is required. I assume in most cases you need to measure the home-router's bandwidth rarely (say on DSL only after a re-sync with the DSLAM), but you need to measure the bandwidth early, as only then can you properly shape the downlink. And we need to know the link’s capacity to use traffic shaping so that BQL and fq_codel in the router have control over the bottleneck queue… An equivalent of BQL and fq_codel running in the DSLAM/CMTS and CPE obviously would be what we need, because then BQL and fq_codel on the router would be all that is required. But that does not seem like it is happening anytime soon, so we still need to work around the limitations in the equipment for a long time to come, I fear.
by how much tuning is required, I wasn't meaning how frequently to tune, but how close default settings can come to the performance of an expertly tuned setup. Ideally the tuning takes into account the characteristics of the hardware of the link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN, VLAN tagging, ethernet with jumbo packet support for example), then you have overhead from the encapsulation that you would ideally take into account when tuning things. the question I'm talking about below is how much do you lose compared to the ideal if you ignore this sort of thing and just assume that the wire is dumb and puts the bits on them as you send them? By dumb I mean don't even allow for inter-packet gaps, don't measure the bandwidth, don't try to pace inbound connections by the timing of your acks, etc. Just run BQL and fq_codel and start the BQL sizes based on the wire speed of your link (Gig-E on the 3800) and shrink them based on long-term passive observation of the sender. If you end up only losing 5-10% of your overall network performance by ignoring the details of the wire, then we should ignore them by default. If however, not measuring anything first results in significantly worse performance than a tuned setup, then we need to figure out how to do the measurements needed for tuning. Some people seem to have fallen into the perfect is the enemy of good enough trap on this
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
On Sat, 26 Jul 2014, Sebastian Moeller wrote: On Jul 26, 2014, at 22:21 , David Lang da...@lang.hm wrote: On Sat, 26 Jul 2014, Sebastian Moeller wrote: On Jul 25, 2014, at 22:57 , David Lang da...@lang.hm wrote: The trouble is that to measure bandwidth, you have to be able to send and receive a lot of traffic. Well that is what you typically do, but you can get away with less measurement traffic: in an ideal quiescent network sending two packets back to back should give you the bandwidth (packet size / incoming time difference of both packets), or send two packets of different size (needs synchronized clocks, then difference of packet sizes / difference of transfer times). Except that your ideal network doesn't exist in the real world. You are never going to have the entire network quiescent, and the router you are going to be talking to is always going to have other things going on, which can affect its timing. Sure, the two packets are required per measurement; I guess I would calculate the average and confidence interval over several of these (potentially by a moving window) to get a handle on the variability. I have done some RTT measurements on an ADSL link and can say that realistically one needs in the hundreds of data points per packet size. This sounds awful, but at least it does not require saturating the link and hence works without dedicated receivers on the other end... unless the router you are connecting to is running some sort of service to support that, But this still requires some service on the other side. You could try to use ICMP packets, but these will only allow measuring RTT, not one-way delays (if you do this on ADSL you will find the RTT dominated by the typically much slower uplink path). If network equipment would be guaranteed to use NTP for decent clock synchronization and would respond to timestamp ICMP messages with timestamp reply, measuring bandwidth might be “cheap” enough to keep running in the background, though. Since this looks too simple there must be a simple reason why this would fail. (It would be nice if ping packets with timestamps would have required the echo server to also store its incoming timestamp in the echo, but I digress) I note that gargoyle uses a sparse stream of ping packets to a close host and uses increases in RTT as a proxy for congestion and a signal to throttle the downstream link… As you say, anything that requires symmetrical traffic (like ICMP) isn't going to work, and routers do not currently offer any service that will. Well I think the gargoyle idea is feasible given that there is a reference implementation out in the wild ;). I'm not worried about an implementation existing as much as the question of if it's on the routers/switches by default, and if it isn't, is the service simple enough to be able to avoid causing load on these devices and to avoid having any security vulnerabilities (or DDoS potential)? you can't just test that link, you have to connect to something beyond that. So it would be sweet if we could use services that are running on the machines anyway, like ping. That way the “load” of all the leaf nodes of the internet continuously measuring their bandwidth could be handled in a distributed fashion, avoiding melt-downs by synchronized measurement streams… Well, let's talk about what we would like to have on the router. As I see it, we want to have two services: 1.
a service you send a small amount of data to and it responds by sending you a large amount of data (preferably with the most accurate timestamps it has and the TTL of the packets it received) 2. a service you send a large amount of data to and it responds by sending you small responses, telling you how much data it has received (with a timestamp and what the TTL of the packets it received were) questions: A. Protocol: should these be UDP/TCP/SCTP/raw IP packets/??? TCP has the problem of slow start, so it would need substantially more traffic to flow to reach steady-state. anything else has the possibility of taking a different path through the router/switch software and so the performance may not be the same. You think UDP would not work out? I don't trust that UDP would go through the same codepaths and delays as TCP; even fq_codel handles TCP differently, so if we measure with UDP, does it really reflect the 'real world' of TCP? B. How much data is needed to be statistically accurate? Too many things can happen for 1-2 packets to tell you the answer. The systems on both ends are multi-tasking, and at high speeds, scheduling jitter will throw off your calculations with too few packets. Yeah, but you can (to steal an idea from Rick Jones' netperf) just keep measuring until the confidence interval around the mean of the data falls below a set magnitude. But for the purpose of traffic shaping you do not need the exact link bandwidth anyway just
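The netperf-style stopping rule Sebastian describes is compact enough to sketch: keep adding samples until the 95% confidence interval around the running mean is narrower than some fraction of the mean (the probe function and the 5% threshold are illustrative):

    import math
    import random
    import statistics

    def measure_until_confident(sample, max_n=200, rel_width=0.05):
        """Call sample() until the 95% CI half-width is < rel_width * mean."""
        data = [sample(), sample()]
        while True:
            mean = statistics.mean(data)
            half = 1.96 * statistics.stdev(data) / math.sqrt(len(data))
            if half < rel_width * mean or len(data) >= max_n:
                return mean, half
            data.append(sample())

    # Example with a noisy fake bandwidth probe centred on 16 Mbit/s:
    mean, half = measure_until_confident(lambda: random.gauss(16.0, 1.0))
    print(f"{mean:.2f} +/- {half:.2f} Mbit/s")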
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Hi David, On Jul 26, 2014, at 23:14 , David Lang da...@lang.hm wrote: On Sat, 26 Jul 2014, Sebastian Moeller wrote: On Jul 26, 2014, at 22:21 , David Lang da...@lang.hm wrote: On Sat, 26 Jul 2014, Sebastian Moeller wrote: On Jul 25, 2014, at 22:57 , David Lang da...@lang.hm wrote: The trouble is that to measure bandwidth, you have to be able to send and receive a lot of traffic. Well that is what you typically do, but you can get away with less measurement traffic: in an ideal quiescent network sending two packets back to back should give you the bandwidth (packet size / incoming time difference of both packets), or send two packets of different size (needs synchronized clocks, then difference of packet sizes / difference of transfer times). Except that your ideal network doesn't exist in the real world. You are never going to have the entire network quiescent, and the router you are going to be talking to is always going to have other things going on, which can affect its timing. Sure, the two packets are required per measurement; I guess I would calculate the average and confidence interval over several of these (potentially by a moving window) to get a handle on the variability. I have done some RTT measurements on an ADSL link and can say that realistically one needs in the hundreds of data points per packet size. This sounds awful, but at least it does not require saturating the link and hence works without dedicated receivers on the other end... unless the router you are connecting to is running some sort of service to support that, But this still requires some service on the other side. You could try to use ICMP packets, but these will only allow measuring RTT, not one-way delays (if you do this on ADSL you will find the RTT dominated by the typically much slower uplink path). If network equipment would be guaranteed to use NTP for decent clock synchronization and would respond to timestamp ICMP messages with timestamp reply, measuring bandwidth might be “cheap” enough to keep running in the background, though. Since this looks too simple there must be a simple reason why this would fail. (It would be nice if ping packets with timestamps would have required the echo server to also store its incoming timestamp in the echo, but I digress) I note that gargoyle uses a sparse stream of ping packets to a close host and uses increases in RTT as a proxy for congestion and a signal to throttle the downstream link… As you say, anything that requires symmetrical traffic (like ICMP) isn't going to work, and routers do not currently offer any service that will. Well I think the gargoyle idea is feasible given that there is a reference implementation out in the wild ;). I'm not worried about an implementation existing as much as the question of if it's on the routers/switches by default, and if it isn't, is the service simple enough to be able to avoid causing load on these devices and to avoid having any security vulnerabilities (or DDoS potential)? But with gargoyle the idea is to monitor a sparse ping stream to the closest host responding and interpret a sudden increase in RTT as a sign that the upstream's buffers are filling up, and use this as a signal to throttle on the home router. My limited experience shows that quite often close hosts will respond to pings... you can't just test that link, you have to connect to something beyond that. So it would be sweet if we could use services that are running on the machines anyway, like ping.
That way the “load” of all the leaf nodes of the internet continuously measuring their bandwidth could be handled in a distributed fashion, avoiding melt-downs by synchronized measurement streams… Well, let's talk about what we would like to have on the router. As I see it, we want to have two services: 1. a service you send a small amount of data to and it responds by sending you a large amount of data (preferably with the most accurate timestamps it has and the TTL of the packets it received) 2. a service you send a large amount of data to and it responds by sending you small responses, telling you how much data it has received (with a timestamp and what the TTL of the packets it received were) questions: A. Protocol: should these be UDP/TCP/SCTP/raw IP packets/??? TCP has the problem of slow start, so it would need substantially more traffic to flow to reach steady-state. anything else has the possibility of taking a different path through the router/switch software and so the performance may not be the same. You think UDP would not work out? I don't trust that UDP would go through the same codepaths and delays as TCP. Why should a router care? even fq_codel handles TCP differently. Does it? I thought UDP typically reacts differently to fq_codel's dropping
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
On Sat, 26 Jul 2014, David Lang wrote: On Sat, 26 Jul 2014, Sebastian Moeller wrote: On Jul 26, 2014, at 22:39 , David Lang da...@lang.hm wrote: by how much tuning is required, I wasn't meaning how frequently to tune, but how close default settings can come to the performance of an expertly tuned setup. Good question. Ideally the tuning takes into account the characteristics of the hardware of the link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN, VLAN tagging, ethernet with jumbo packet support for example), then you have overhead from the encapsulation that you would ideally take into account when tuning things. the question I'm talking about below is how much do you lose compared to the ideal if you ignore this sort of thing and just assume that the wire is dumb and puts the bits on them as you send them? By dumb I mean don't even allow for inter-packet gaps, don't measure the bandwidth, don't try to pace inbound connections by the timing of your acks, etc. Just run BQL and fq_codel and start the BQL sizes based on the wire speed of your link (Gig-E on the 3800) and shrink them based on long-term passive observation of the sender. As data talks I just did a quick experiment with my ADSL2+ line at home. The solid lines in the attached plot show the results for proper shaping with SQM (shaping to 95% of the link rates of downstream and upstream while taking the link layer properties, that is ATM encapsulation and per packet overhead, into account); the broken lines show the same system with the link layer adjustments and per packet overhead adjustments disabled, but still shaping to 95% of link rate (this is roughly equivalent to 15% underestimation of the packet size). The actual test is netperf-wrapper's RRUL (4 tcp streams up, 4 tcp streams down while measuring latency with ping and UDP probes). As you can see from the plot just getting the link layer encapsulation wrong destroys latency under load badly. The host is ~52ms RTT away, and with fq_codel the ping time per leg is just increased by one codel target of 5ms each, resulting in a modest latency increase of ~10ms with proper shaping for a total of ~65ms; with improper shaping RTTs increase to ~95ms (they almost double), so RTT increases by ~43ms. Also note how the extremes for the broken lines are much worse than for the solid lines. In short I would estimate that a slight misjudgment (15%) results in almost 80% increase of latency under load. In other words getting the rates right matters a lot. (I should also note that in my setup there is a secondary router that limits RTT to max 300ms, otherwise the broken lines might look even worse...) is this with BQL/fq_codel in both directions or only in one direction? David Lang what is the latency like without BQL and codel? the pre-bufferbloat version? (without any traffic shaping) I agree that going from 65ms to 95ms seems significant, but if the stock version goes up above 1000ms, then I think we are talking about things that are 'close'. Assuming that latency under load without the improvements goes >1000ms: fast-slow (in ms): ideal=10, untuned=43, bloated >1000. fast/slow: ideal=1.25, untuned=1.83, bloated ~19. slow/fast: ideal=0.8, untuned=0.55, bloated=0.05. rather than looking at how much worse it is than the ideal, look at how much closer it is to the ideal than to the bloated version.
David Lang
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Hi David, On Jul 26, 2014, at 23:45 , David Lang da...@lang.hm wrote: [...] what is the latency like without BQL and codel? the pre-bufferbloat version? (without any traffic shaping) So I just disabled SQM and the plot looks almost exactly like the broken-line plot I sent before (~95ms RTT up from 55ms unloaded, with single pings delayed for > 1000ms, just as with the broken line; with proper shaping even extreme pings stay < 100ms). But as I said before, I need to run through my ISP-supplied primary router (not just a dumb modem) that also tries to bound the latencies under load to some degree. Actually I just repeated the test connected directly to the primary router and get the same ~95ms average ping time with frequent extremes > 1000ms, so it looks like just getting the shaping wrong by 15% eradicates the buffer de-bloating efforts completely...
I agree that going from 65ms to 95ms seems significant, but if the stock version goes up above 1000ms, then I think we are talking about things that are 'close'. Well, if we include outliers (and we should, as enough outliers will degrade the FPS and VoIP suitability of an otherwise responsive system quickly), stock and improper shaping are in the 1000ms worst-case range, while proper SQM bounds this to < 100ms. Assuming that latency under load without the improvements got to 1000ms:
fast-slow (in ms): ideal = 10, untuned = 43, bloated = 1000
The sign seems off, as fast < slow? I like this best ;)
fast/slow: ideal = 1.25, untuned = 1.83, bloated = 19
But fast < slow, and hence this ratio should be < 1?
slow/fast: ideal = 0.8, untuned = 0.55, bloated = 0.05
and this > 1?
rather than looking at how much worse it is than the ideal, look at how much closer it is to the ideal than to the bloated version. David Lang
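For reference, the three figures of merit being debated reproduce directly from the RTT numbers earlier in the thread; a quick sketch, reading "fast" as the ~52ms unloaded RTT and "slow" as the 65/95/1000ms loaded values quoted above, makes the arithmetic explicit:

    # reproduce the three comparisons from the thread's numbers
    fast = 52.0                                                  # ms, unloaded RTT
    slow = {"ideal": 65.0, "untuned": 95.0, "bloated": 1000.0}   # ms, under load

    for name, rtt in slow.items():
        print("%-8s slow-fast = %4.0f ms  slow/fast = %5.2f  fast/slow = %4.2f"
              % (name, rtt - fast, rtt / fast, fast / rtt))

This prints ~13/43/948 ms, 1.25/1.83/19.2, and 0.80/0.55/0.05, matching the table modulo rounding and modulo the fast/slow labeling being questioned.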
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
On Sun, 27 Jul 2014, Sebastian Moeller wrote: [...] Actually I just repeated the test connected directly to the primary router and get the same ~95ms average ping time with frequent extremes > 1000ms, so it looks like just getting the shaping wrong by 15% eradicates the buffer de-bloating efforts completely...
just so I understand this completely, you have: debloated box - ISP router - ADSL - Internet - debloated server? and are you measuring the latency impact when uploading or downloading? I think a lot of people would be happy with 95ms average pings on a loaded connection, even with occasional outliers. It's far better than the sustained multi-second ping times I've seen with stock setups. but if no estimate is this bad, how bad is it if you use as your estimate the 'rated' speed of your DSL (i.e. what the ISP claims they are providing you) instead of the fully accurate speed that includes accounting for ATM encapsulation? It's also worth figuring out if this problem would remain in place if you didn't have to go through the ISP router and were running fq_codel on that router. As long as fixing bufferbloat involves esoteric measurements and tuning, it's not going to be solved, but if it could be solved by people flashing openwrt onto their DSL router and then using the defaults, it could gain traction fairly quickly. I agree that going from 65ms to 95ms seems significant, but if the stock version goes up above 1000ms, then I think we are talking about things that are 'close' [...]
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Hi David, On Jul 27, 2014, at 00:23 , David Lang da...@lang.hm wrote: [...] I'm not worried about an implementation existing as much as the question of if it's on the routers/switches by default, and if it isn't, is the service simple enough to be able to avoid causing load on these devices and to avoid having any security vulnerabilities (or DDoS potential). But with gargoyle the idea is to monitor a sparse ping stream to the closest host responding and interpret a sudden increase in RTT as a sign that the upstream's buffers are filling up, and use this as a signal to throttle on the home router. My limited experience shows that quite often close hosts will respond to pings... that measures latency, but how does it tell you bandwidth unless you are the only possible thing on the network and you measure what you are receiving? So the idea would be to start the ping probe with no traffic and increase the traffic until the ping RTT increases; the useable bandwidth is around where the RTTs increase. [...] even fq_codel handles TCP differently Does it? I thought UDP typically reacts differently to fq_codel's dropping strategy, but fq_codel does not differentiate between protocols (last time I looked at the code I came to that conclusion, but I am not very fluent in C so I might simply be wrong here) with TCP, the system can tell the difference between different connections to the same system; with UDP it needs to infer this from port numbers, this isn't as accurate and so the systems (fq_codel and routers) handle them in a slightly different way. This does affect the numbers. But that only affects the hashing into fq_codel bins? From http://lxr.free-electrons.com/source/net/sched/sch_fq_codel.c:

    static unsigned int fq_codel_hash(const struct fq_codel_sched_data *q,
                                      const struct sk_buff *skb)
    {
            struct flow_keys keys;
            unsigned int hash;

            skb_flow_dissect(skb, &keys);
            hash = jhash_3words((__force u32)keys.dst,
                                (__force u32)keys.src ^ keys.ip_proto,
                                (__force u32)keys.ports, q->perturbation);
            return ((u64)hash * q->flows_cnt) >> 32;
    }

The way I read this is that it just uses source and destination IP and the ports; all the protocol does is make sure that different-protocol connections with the same src, dst, ports tuple end up in different bins, no? My C is bad so I would not be amazed if my interpretation were wrong, but please show me where. so if we measure with UDP, does it really reflect the 'real world' of TCP? But we care for UDP as well, no? Yes, but the reality is that the vast majority of traffic is TCP, and that's what the devices are optimized to handle, so if we measure with UDP we may not get the same results as if we measure with TCP. Measuring with ICMP is different yet again. Yes, I have heard stories like that when I set out on my little detect-ATM-quantization-from-ping-RTTs project, but to my joy it looks like ICMP still gives reasonable measurements! Based on that data I would assume UDP to be even less exotic and hence handled even less specially and hence more like TCP? Think of the router ASICs that handle the 'normal' traffic in the ASIC in the card, but 'unusual' traffic needs to be sent to the core CPU to be processed and is therefore MUCH slower Except for my ICMP RTT measurements I still saw quantization steps in accordance with the expected best-case RTT for a packet, showing that the slow processing at least is constant and hence easy to get rid of in measurements... One thought I have is to require a high TTL on the packets for the services to respond to them.
That way any abuse of the service would have to take place from very close on the network. Ideally these services would only respond to senders that are directly connected, but until these services are deployed and enabled by default, there is going to need to be the ability to 'jump over' old equipment. This need will probably never go away completely. But if we need to modify DSLAMs and CMTSs it would be much nicer if we could just ask nicely what the current negotiated bandwidths are ;) negotiated bandwidth and effective bandwidth are not the same what if you can't talk to the devices directly connected to the DSL line, but only to a router one hop on either side? In my limited experience the typical bottleneck is the DSL line, so if we shape for that we are fine… Assume for a moment the DSLAM uplink is so congested because of oversubscription of the DSLAM that now this constitutes the bottleneck. Now the available bandwidth for each user depends on the combined traffic of all users, not a situation we can reasonably shape for anyway (I would hope that ISPs monitor this situation and would
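A sketch of that high-TTL check, assuming a Linux host; the port and threshold are illustrative. The responder reads the received TTL via IP_RECVTTL ancillary data and answers only senders whose packets crossed at most a hop or two:

    import socket, struct

    IP_RECVTTL = getattr(socket, "IP_RECVTTL", 12)   # 12 on Linux

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", 9999))
    sock.setsockopt(socket.IPPROTO_IP, IP_RECVTTL, 1)

    MIN_TTL = 62    # assume senders start at 64; allow ~2 hops

    while True:
        data, ancdata, _, addr = sock.recvmsg(2048, socket.CMSG_SPACE(4))
        for level, ctype, cdata in ancdata:
            if level == socket.IPPROTO_IP and ctype == socket.IP_TTL:
                ttl = struct.unpack("i", cdata[:4])[0]   # kernel delivers an int
                if ttl >= MIN_TTL:
                    sock.sendto(b"ok", addr)   # only nearby senders get an answer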
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Hi David, On Jul 27, 2014, at 00:53 , David Lang da...@lang.hm wrote: On Sun, 27 Jul 2014, Sebastian Moeller wrote: [...]
Actually I just repeated the test connected directly to the primary router and get the same ~95ms average ping time with frequent extremes > 1000ms, so it looks like just getting the shaping wrong by 15% eradicates the buffer de-bloating efforts completely... just so I understand this completely you have debloated box - ISP router - ADSL - Internet - debloated server? Well, more like: Macbook with dubious bloat-state - wifi to de-bloated cerowrt box that shapes the traffic - ISP router - ADSL - internet - server. I assume that Dave debloated these servers well, but it should not really matter, as the problem is the buffers on both ends of the bottleneck ADSL link. and are you measuring the latency impact when uploading or downloading? No, I measure the latency impact of saturating both up- and downlink, pretty much the worst-case scenario. I think a lot of people would be happy with 95ms average pings on a loaded connection, even with occasional outliers. No, that is too low an aim; this still is not useable for real-time applications, we should aim for base RTT plus 10ms. (For very slow links we need to cut some slack, but for 3Mbps 10ms should be achievable.) It's far better than sustained multi-second ping times which is what I've seen with stock setups. True, but compared to multi-seconds even 1000ms would be a really great improvement, but also not enough. but if no estimate is this bad, how bad is it if you use as
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
On Sun, 27 Jul 2014, Sebastian Moeller wrote: [...]
Actually I just repeated the test connected directly to the primary router and get the same ~95ms average ping time with frequent extremes > 1000ms [...] Well more like: Macbook with dubious bloat-state - wifi to de-bloated cerowrt box that shapes the traffic - ISP router - ADSL - internet - server. I assume that Dave debloated these servers well, but it should not really matter, as the problem is the buffers on both ends of the bottleneck ADSL link. right, I was forgetting that unless you are the bottleneck, you aren't buffering anything and so debloating makes no difference. In a case like yours where you can't debloat the actual bottleneck, the best that you can do is to artificially become the bottleneck by shaping the traffic. but on the download side it's much harder. What are we aiming for? something that will show the problem clearly so that fixes can be put in the right place? or a work-around to use in the meantime? I think both need to be pursued, but we need to be clear on what is being done for each one. If having BQL+fq_codel with defaults would solve the problem if it was on the right routers, we need to show that. Then, because we can't get the fixes on the right routers and need to work around the problem by artificially becoming the bottleneck, we need
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
On Sun, 27 Jul 2014, Sebastian Moeller wrote: [...]
yeah, I have to remind myself of the perfect is the enemy of good enough frequently as well. I tend to fall into that trap pretty easily, as this discussion has shown :-) ping is easy to test. As a thought, is the response time of NTP queries any more or less stable? One thought I have is to require a high TTL on the packets for the services to respond to them. [...]
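The NTP question is easy to test empirically; a throwaway sketch timing minimal SNTP (RFC 4330) queries, with the server choice only an example:

    import socket, time

    def sntp_rtt(server="pool.ntp.org", timeout=2.0):
        pkt = b"\x1b" + 47 * b"\0"        # LI=0 VN=3 Mode=3 (client request)
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.settimeout(timeout)
        t0 = time.monotonic()
        s.sendto(pkt, (server, 123))
        s.recvfrom(48)                    # reply is 48 bytes
        s.close()
        return (time.monotonic() - t0) * 1000.0   # RTT in ms

    rtts = sorted(sntp_rtt() for _ in range(10))
    print("min/median/max: %.1f/%.1f/%.1f ms" % (rtts[0], rtts[5], rtts[-1]))

Comparing the spread of these against a comparable ping sample would show whether NTP responders are processed any less consistently than ICMP.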
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
On Sat, 24 May 2014 10:02:53 -0400, R. said: Further, this function could be auto-scheduled or made enabled on router boot up. Yeah, if such a thing worked, it would be good. (Note in the following that a big part of my *JOB* is doing What could possibly go wrong? analysis on mission-critical systems, which tends to color my viewpoint on projects. I still think the basic concept is good, just difficult to do, and am listing the obvious challenges for anybody brave enough to tackle it... :) I must be missing something important which prevents this. What is it? There's a few biggies. The first is what the linux-kernel calls -ENOPATCH - nobody's written the code. The second is you need an upstream target someplace to test against. You need to deal with both the server is unavailable due to a backhoe incident 2 time zones away problem (which isn't *that* hard, just default to Something Not Obviously Bad(TM)), and server is slashdotted (which is a bit harder to deal with). Remember that there's some really odd corner cases to worry about - for instance, if there's a power failure in a town, then when the electric company restores power you're going to have every cerowrt box hit the server within a few seconds - all over the same uplink most likely. No good data can result from that... (Holy crap, it's been almost 3 decades since I first saw a Sun 3/280 server tank because 12 Sun 3/50s all rebooted over the network at once when building power was restored). And if you're in Uzbekistan and the closest server netwise is at 60 Hudson, the analysis to compute the correct values becomes interesting. Dealing with non-obvious error conditions is also a challenge - a router may only boot once every few months. And if you happen to be booting just as a BGP routing flap is causing your traffic to take a vastly suboptimal path, you may end up encoding a vastly inaccurate setting and have it stuck there, causing suckage for non-obvious reasons for the non-technical, so you really don't want to enable auto-tuning unless you also have a good plan for auto-*RE*tuning.
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
The Netgear stock firmware measures bandwidth on every boot or link up (not sure which) and I would suggest doing the same for CeroWRT. Do you need to measure Internet bandwidth or last-mile bandwidth? For link bandwidth it seems like you can solve a lot of problems by measuring to the first-hop router. Does the packet pair technique work on TDMA link layers like DOCSIS? -- Wes Felter
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
On Fri, 25 Jul 2014, Wes Felter wrote: The Netgear stock firmware measures bandwidth on every boot or link up (not sure which) and I would suggest doing the same for CeroWRT. Do you need to measure Internet bandwidth or last-mile bandwidth? For link bandwidth it seems like you can solve a lot of problems by measuring to the first-hop router. Does the packet pair technique work on TDMA link layers like DOCSIS? The trouble is that to measure bandwidth, you have to be able to send and receive a lot of traffic. Unless the router you are connecting to is running some sort of service to support that, you can't just test that link, you have to connect to something beyond that. David Lang
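To make that concrete: even measuring just the upload needs a cooperating sink somewhere beyond the link. A rough sketch, where the host and port are placeholders and the server is assumed to simply discard input; note the last socket buffer's worth of data may still be in flight when sendall returns, so this only approximates the rate:

    import socket, time

    def measure_upload(host, port=5001, total=20_000_000):
        buf = b"\0" * 65536
        sent = 0
        s = socket.create_connection((host, port))
        t0 = time.monotonic()
        while sent < total:
            s.sendall(buf)
            sent += len(buf)
        s.close()
        return sent * 8 / (time.monotonic() - t0) / 1e6   # Mbit/s, approximate

    print("%.1f Mbit/s" % measure_upload("netperf.example.net"))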
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
On Fri, 25 Jul 2014 14:37:34 -0400, valdis.kletni...@vt.edu wrote: [...] you may end up encoding a vastly inaccurate setting and have it stuck there, causing suckage for non-obvious reasons for the non-technical, so you really don't want to enable auto-tuning unless you also have a good plan for auto-*RE*tuning have the router record its finding, and then repeat the test periodically, recording its findings as well. If the new finding is substantially different from the prior ones, schedule a retest 'soon' (or default to the prior setting if it's bad enough); otherwise, if there aren't many samples, schedule a test 'soon'; if there are a lot of samples, schedule a test in a while. However, I think the big question is how much the tuning is required. If a connection with BQL and fq_codel is 90% as good as a tuned setup, default to untuned unless the user explicitly hits a button to measure (and then a second button to accept the measurement). If BQL and fq_codel by default are <70% as good as a tuned setup, there's more space to argue that all setups must be tuned, but then the question is how do they fare against an old, non-BQL, non-fq_codel setup? if they are considerably better, it may still be worthwhile. David Lang
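That retest policy is simple enough to sketch; all the thresholds and intervals below are invented for illustration:

    def next_test_delay(history, new_sample,
                        disagree=0.2, thin=5, soon=600, later=86400):
        """Seconds until the next bandwidth test, per the policy above."""
        history.append(new_sample)
        consensus = sorted(history)[len(history) // 2]   # median of findings
        if abs(new_sample - consensus) > disagree * consensus:
            return soon      # substantially different -> retest soon
        if len(history) < thin:
            return soon      # few samples -> keep testing soon
        return later         # stable and well-sampled -> test in a while

    hist = [9.8, 10.1, 10.0]           # Mbit/s findings so far
    print(next_test_delay(hist, 6.5))  # big drop -> 600 (retest soon)

Falling back to the prior consensus value when a new sample looks bad would slot in where the "substantially different" branch fires.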
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
I should point out that another issue with deploying fq_codel widely is that it requires an accurate measurement (currently) of the providers bandwidth. Pardon my noobiness, but is there a technical obstacle that prevents the creation of a user-triggered function on the router side that measures the provider's bandwidth? Function, when (luci-gui?) triggered, would: 1. Ensure that internet connectivity is present. 2. Disconnect all clients. 3. Engage in DL and UL on a dedicated web server, measure stats and straight up use them in fq_codel -- or suggest them in appropriate QoS-gui user-boxes. Further, this function could be auto-scheduled or made enabled on router boot up. I must be missing something important which prevents this. What is it?
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Hi R, hi List, On May 24, 2014, at 16:12 , R. red...@gmail.com wrote: I should point out that another issue with deploying fq_codel widely is that it requires an accurate measurement (currently) of the providers bandwidth. Pardon my noobiness, but is there a technical obstacle that prevents the creation of a user-triggered function on the router side that measures the provider's bandwidth? Function, when (luci-gui?) triggered, would: 1. Ensure that internet connectivity is present. 2. Disconnect all clients. 3. Engage in DL and UL on a dedicated web server, measure stats and straight up use them in fq_codel -- or suggest them in appropriate QoS-gui user-boxes. Further, this function could be auto-scheduled or made enabled on router boot up. I must be missing something important which prevents this. What is it? Well, I see a couple of challenges that need to be overcome before this could work. In your step 3 you touch on the issue of measuring the current stats, and that is trickier than one would think:

1) What to measure precisely? A dedicated web server sounds like a great idea, but who is dedicating it, and where is it located relative to the link under test? Rich Brown has made a nice script to measure current throughput and give an estimate of the effect of link saturation on latency (see betterspeedtest.sh from https://github.com/richb-hanover/CeroWrtScripts), but using this from Germany gives: 2014-05-24 15:44:47 Testing against demo.tohojo.dk with 5 simultaneous sessions while pinging gstatic.com (60 seconds in each direction) Download: 12.06 Mbps Upload: 1.99 Mbps against a server in Europe, but: Download: 10.42 Mbps Upload: 1.85 Mbps against a server on the east side of the USA. So the router would need to select a close-by server. Sites like speedtest.net offer this kind of server selection by proximity but do not have a very reliable way to load the link and do not measure the effect of link saturation on the latency… but the whole idea is to find the highest bandwidth that does not cause an indecent increase of latency under load. (Also speed tests are quite stereotypic in observable behavior and length, so some ISPs special-case these to look good; but that is a different kettle of fish…) Note that there is also the question of where one would like to measure the link speed; for example for DSL there is the link to the DSLAM, the link from the DSLAM to the next network node, and sometimes a PPP link to a remote BRAS system (that might throttle the traffic). All of these can be the bottleneck of the ISP connection (depending on circumstances). My take is that one would like to look at the link between modem and DSLAM as the bottleneck, but opinions differ (and then there is cable with its shared first segment...).

2) Some links have quite peculiar properties that are hard to deduce from quick speed tests. For example, ATM-based ADSL links (this includes all ADSL1, ADSL2 and to my knowledge all existing ADSL2+ links) will show a packet-size-dependent link speed. In short, ATM uses an integer number of 48-byte cells to transport each packet, so worst case it adds 47 bytes to the payload; for a small packet that can effectively double the size of the packet on the wire, or stated differently, halve the link speed for packets of that size. (Note: thanks to the work of Jesper Brouer and Russel Stuart, the linux kernel can take care of that issue for you, but you need to tell the kernel explicitly.)

3) Many links actually do not have a constant wire speed available.
For DOCSIS (basically cable) the local segment is shared between many users and transmit timeslots are shared between requestors, giving effectively slower links during peak hours. For DSL a resync between DSLAM and modem can (significantly) change the negotiated speed; something cerowrt does not get any notice of… I guess bufferbloat mitigation needs to move into the modems and DSLAMs to get rid of the bandwidth guessing game. For cable at least the modems are getting better (thanks to PIE being part of the docsis 3.1? standard), but for DSL I do not think there is any generic solution on the horizon… Best Regards Sebastian
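To illustrate the ATM cell quantization from point 2, a little arithmetic helps; the 8-byte overhead below stands in for the AAL5 trailer and is illustrative, since real links add more or less depending on the encapsulation (PPPoE, LLC/SNAP, etc.):

    import math

    def atm_wire_bytes(ip_bytes, overhead=8):   # 8 = AAL5 trailer, illustrative
        cells = math.ceil((ip_bytes + overhead) / 48)
        return cells * 53                       # 48 payload + 5 header per cell

    for size in (40, 64, 576, 1500):
        wire = atm_wire_bytes(size)
        print("%4d byte packet -> %4d bytes on the wire (x%.2f)"
              % (size, wire, wire / size))

A 40-byte packet costs 53 bytes on the wire (x1.33) while a 64-byte packet costs 106 (x1.66), which is why a shaper that ignores the cell quantization misjudges small-packet traffic so badly.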
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Depends on the type of the provider. Most providers now have shared paths to the backbone among users and give a peak rate up and down for brief periods that they will not sustain... In fact they usually penalize use of the peak rate by reducing the rate after that. So at what point they create bloat in their access net is hard to determine. And it depends on your neighbors' behavior as well. The number you want is the bloatedness of your path through the access provider. This is measurable by sending small probes back and forth to a measurement server, measuring instantaneous latency in each direction and combining that information with one's recent history in a non-trivial calculation. Note that that measurement does not directly produce provider speeds that can be input to the shapers used with codel. But it does produce a queue size that can. So it's a plausible way to proceed as long as the operators refuse to fix their gear to manage the actual link that is problematic. Personally I'd suggest that the gear makers' feet be held to the fire... by not fixing it with an inferior fix at the home router. Keep the pressure on them at IETF and among their customers. On May 24, 2014, R. red...@gmail.com wrote: [...]
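A sketch of turning such probe data into a queue figure, under the simplifying assumption that all RTT inflation is queueing at a single bottleneck of known rate:

    def queue_backlog_bytes(rtt_loaded_ms, rtt_base_ms, rate_mbps):
        # extra delay (s) times bottleneck rate (bytes/s) ~= standing queue
        dt = max(rtt_loaded_ms - rtt_base_ms, 0) / 1000.0
        return dt * rate_mbps * 1e6 / 8

    # e.g. with figures from this thread: 95ms vs 52ms at ~2 Mbit/s up
    print("%.0f bytes queued" % queue_backlog_bytes(95, 52, 2.0))  # ~10750

The real calculation would need one-way delays and recent history, as described above, but the core conversion from delay inflation to queue occupancy is just this product.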
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Besides deployment in cerowrt and openwrt, what would really have high leverage is that the techniques developed in cerowrt's exploration (including fq_codel) get deployed where they should be deployed: in the access network systems: CMTS's, DSLAM's, enterprise boundary gear, etc. from the major players. Cerowrt's fundamental focus has been proving that the techniques really, really work at scale. However, the fundamental bloat-induced experiences are actually occurring due to bloat at points where fast meets slow. Cerowrt can't really fix the problem in the download direction (currently not so bad because of high download speeds relative to upload speeds in the US) - that's in the CMTS's and DSLAM's. What's depressing to me is that the IETF community spends more time trying to convince themselves that bloat is only a theoretical problem, never encountered in the field. In fact, every lab I've worked at (including the startup accelerator where some of my current company work) has had the network managers complaining to me that a single heavy FTP I'm running causes all of the other users in the site to experience terrible web performance. But when they call Cisco or F5 or whomever, they get told there's nothing to do but buy complicated flow-based traffic management boxes to stick in line with the traffic (so they can slow me down). Bloat is the most common invisible elephant on the Internet. Just fixing a few access points is a start, but even if we fix all the access points so that uploads interfere less, there's still more impact this one thing can have. So, by all means get this stuff into mainstream, but it's time to start pushing on the access network technology companies (and there are now open switches from Cumulus and even Arista to hack). On Wednesday, May 21, 2014 7:42am, Frits Riep r...@riepnet.com said: Thanks Dave for your responses. Based on this, it is very good that qos-scripts is available now through openwrt, and as I experienced, it provides a huge advantage for most users. I would agree prioritizing ping is in and of itself not the key goal, but based on what I've read so far, fq-codel provides dramatically better responsiveness for any interactive application such as web browsing, voip, or gaming, so qos-scripts would be advantageous for users like your mom if she were in an environment where she had a slow and shared internet connection. Is that a valid interpretation? I am interested in further understanding the differences based on the brief descriptions you provide. It is true that few devices provide DSCP marking, but if the latency is controlled for all traffic, latency-sensitive traffic benefits tremendously even without prioritizing by l7 (layer 7?). Is this interpretation also valid? Yes, your mom wouldn't be a candidate for setting up ceroWRT herself, but if it were set up for her, or if it could be incorporated into a consumer router with automatically determined speed parameters, she would benefit totally from the performance improvement. So the technology ultimately needs to be taken mainstream, and yes that is a huge task. Frits -Original Message- From: Dave Taht [mailto:dave.t...@gmail.com] Sent: Tuesday, May 20, 2014 7:14 PM To: Frits Riep Cc: cerowrt-devel@lists.bufferbloat.net Subject: Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration. On Tue, May 20, 2014 at 3:11 PM, Frits Riep r...@riepnet.com wrote: The concept of eliminating bufferbloat on many more routers is quite appealing.
Reading some of the recent posts makes it clear there is a desire to get to stable code, and also to find a new platform beyond the current Netgear. However, as good as some of the proposed platforms may be for developing and for doing all of the new capabilities of CeroWRT, I would also like to propose that there be some focus on reaching a wider and less sophisticated audience, to help broaden awareness and make control of bufferbloat more available and easier to attain for more users. I agree that reaching more users is important. I disagree we need to reach them with cerowrt. More below: · It appears there is a desire to merge the code into an upcoming OpenWRT Barrier Breaker release, which is excellent as it will make it easier to fight bufferbloat on a wide range of platforms and provide users with a much easier to install firmware release. I’d like to be able to download luci-qos-scripts and sqm-scripts and have basic bufferbloat control on a much greater variety of devices and to many more users. From an awareness perspective this would be a huge win. Is the above scenario what is being planned, and is it likely to happen in the reasonable future? Yes, I'd submitted sqm for review upstream, got back a few comments. Intend to resubmit again when I get a
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
On Wed, May 21, 2014 at 7:51 AM, dpr...@reed.com wrote: Besides deployment in cerowrt and openwrt, what would really have high leverage is that the techniques developed in cerowrt's exploration (including fq_codel) get deployed where they should be deployed: in the access network systems: CMTS's, DSLAM's, enterprise boundary gear, etc. from the major players. +10. Cerowrt's fundamental focus has been proving that the techniques really, really work at scale. That they even work on a processor designed in 1990! :) I also have hoped that along the way we've shown what techniques don't work... However, the fundamental bloat-induced experiences are actually occurring due to bloat at points where fast meets slow. Cerowrt can't really fix the problem in the download direction (currently not so bad because of high download speeds relative to upload speeds in the US) - that's in the CMTS's and DSLAM's. Well, I disagree somewhat. The downstream shaper we use works quite well, until we run out of cpu at 50mbits. Testing on the ubnt edgerouter has had the inbound shaper work up a little past 100mbits. So there is no need (theoretically) to upgrade the big fat head ends if your cpe is powerful enough to do the job. It would be better if the head ends did it, of course. What's depressing to me is that the IETF community spends more time trying to convince themselves that bloat is only a theoretical problem, never encountered in the field. It isn't all the IETF. Certainly google gets it and has made huge strides. reduced RTT = money. My own frustration comes from papers that are testing this stuff at 4mbit or lower and not seeing the results we get above that, on everything. https://plus.google.com/u/0/107942175615993706558/posts/AbeHRY9vzLR ns2 and ns3 could use some improvements... In fact, every lab I've worked at (including the startup accelerator where some of my current company work) has had the network managers complaining to me that a single heavy FTP I'm running causes all of the other users in the site to experience terrible web performance. But when they call Cisco or F5 or whomever, they get told there's nothing to do but buy complicated flow-based traffic management boxes to stick in line with the traffic (so they can slow me down). It is sad that F5, in particular, doesn't have a sane solution. Their whole approach is to have a load-balancer, and fq_codel is a load-balancer to end all load balancers. I do note nobody I know has ported BQL or fq_codel to bsd (codel is in bsd now). Bloat is the most common invisible elephant on the Internet. +10. Just fixing a few access points is a start, but even if we fix all the access points so that uploads interfere less, there's still more impact this one thing can have. I was scared silly at the implications 2 years back, I am more sanguine now. So, by all means get this stuff into mainstream, but it's time to start pushing on the access network technology companies (and there are now open switches from Cumulus and even Arista to hack) Oh, cool! I keep waiting for my parallella to show up so I could start fiddling with ethernet in the fpga. On Wednesday, May 21, 2014 7:42am, Frits Riep r...@riepnet.com said: [...]
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
On Wed, May 21, 2014 at 9:03 AM, dpr...@reed.com wrote: In reality we don't disagree on this: On Wednesday, May 21, 2014 11:19am, Dave Taht dave.t...@gmail.com said: Well, I disagree somewhat. The downstream shaper we use works quite well, until we run out of cpu at 50mbits. Testing on the ubnt edgerouter has had the inbound shaper work up a little past 100mbits. So there is no need (theoretically) to upgrade the big fat head ends if your cpe is powerful enough to do the job. It would be better if the head ends did it, of course. There is an advantage to the head-ends doing it, to the extent that each edge device has no clarity about what is happening with all the other cpe that are sharing that head-end. When there is bloat in the head-end, even if all cpe's sharing an upward path are shaping themselves to the "up to" speed the provider sells, they can go into serious congestion if the head-end queues can grow to 1 second or more of sustained queueing delay. My understanding is that head-end queues have more than that. They certainly do in LTE access networks. Compelling argument! I agree it would be best for the devices that have the most information about the network to manage themselves better. It is deeply ironic to me that I'm arguing for an e2e approach on fixing the problem in the field, with you! http://en.wikipedia.org/wiki/End-to-end_principle -- Dave Täht NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
Dave Taht dave.t...@gmail.com wrote: I should point out that another issue with deploying fq_codel widely is that it requires an accurate measurement (currently) of the providers bandwidth. I've been thinking about ways to do this over PPP(oE) links if one controls both ends --- many third-party internet access ISPs terminate the PPP on their equipment, rather than the telco's, so it should be possible to avoid all the L2 issues. My ISP now offers fiber-to-the-neighbourhood, 50Mb/s down, 10 up (vs the 7/640 that I have now). They are offering me an http://smartrg.com/products/products/sr505n/ which they suggest I run in bridge (layer-2) mode. I'm trying to figure out what is inside, as it has the DSL interface right on it. I didn't know of this device before. My hope/expectation is that more ISPs that provide CPE will ship something that is configured correctly by default, following in free.fr's footsteps, and trying to beat the cable industry to the punch, now that the core code is debugged and documented, creating an out-of-box win. Agreed. -- ] Never tell me the odds! | ipv6 mesh networks [ ] Michael Richardson, Sandelman Software Works| network architect [ ] m...@sandelman.ca http://www.sandelman.ca/| ruby on rails[
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
On Wed, May 21, 2014 at 12:03 PM, dpr...@reed.com wrote: In reality we don't disagree on this: On Wednesday, May 21, 2014 11:19am, Dave Taht dave.t...@gmail.com said: Well, I disagree somewhat. The downstream shaper we use works quite well, until we run out of CPU at 50 Mbit/s. Testing on the Ubiquiti EdgeRouter has had the inbound shaper work up to a little past 100 Mbit/s. So there is (theoretically) no need to upgrade the big fat head-ends if your CPE is powerful enough to do the job. It would be better if the head-ends did it, of course. There is an advantage for the head-ends doing it, to the extent that each edge device has no clarity about what is happening with all the other CPE that are sharing that head-end. When there is bloat in the head-end, even if all CPEs sharing an upward path are shaping themselves to the "up to" speed the provider sells, they can go into serious congestion if the head-end queues can grow to 1 second or more of sustained queueing delay. My understanding is that head-end queues have more than that. They certainly do in LTE access networks. I have measured 200 ms on a 28 Mbit/s LTE quadrant to a single station. This was using the simplest possible test on an idle cell. It is easy to see how that can grow into the one-second range. Similarly, Dave Taht and I recently took data that showed a large downstream buffer at the CMTS end (line card?); IIRC, it was something like 0.25 megabytes, using a UDP flooding tool. As always, there may be multiple different buffers lurking in these complex devices, which may only come into play when different parts of them bottleneck, just as we found many different buffering locations inside of Linux. In fact, some of these devices include Linux boxes (though I do not know whether they are on the packet-forwarding path or not). Bandwidth shaping downstream of those bottlenecks can help, but only to a degree, and I believe primarily for well-behaved, long-lived elephant flows. Offload engines on servers, and ACK coalescing in various equipment, limit the degree of help; transient behavior, such as opening a bunch of TCP connections simultaneously and downloading the elements of a web page, is I believe likely to put large bursts of packets into these queues, causing transient poor latency. I think we'll get a bit of help out of the packet pacing code that recently went into Linux (for well-behaved servers) as it deploys. Thanks to Eric Dumazet for that work! Ironically, servers get updated much more frequently than these middle boxes, as far as I can tell. Somehow we gotta get the bottlenecks in these devices (broadband and cellular) to behave better. - Jim
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
The end-to-end argument against putting functionality in the network is a modularity principle, as you know. The exception is when there is a function you want to provide that is not strictly end-to-end. Congestion is one of them: congestion is fundamentally a problem that happens because of collective actions among independent actors. So if you want to achieve the goals of the modularity principle, you need either a) the minimal sensing and response you can put in the network that allows the independent actors to cooperate, or b) to require the independent actors to discover and communicate amongst each other individually.

Any solution that tries to satisfy the modularity principle has the property that it provides sufficient information, in a sufficiently timely manner, for the independent actors to respond cooperatively to resolve the issue (by reducing their transmission volume in some, presumably approximately fair, way). "Sufficiently timely" is bounded by the draining time of a switch's outbound link's queue. For practical applications on the Internet today, the draining time should never exceed about 30-50 msec at the outbound link's rate. However, the optimal normal depth of the queue should be no larger than the size needed to keep the outbound link continuously busy at its peak rate, whatever that is (for a shared WiFi access point the peak rate is highly variable, as you know).

This suggests that the minimal function the network must provide to the endpoints is the packet's instantaneous contribution to the draining time of the most degraded link on the path. Given this information, a pair of endpoints knows what to do. If it is a receiver-managed windowed protocol like TCP, the window needs to be adjusted to minimize the contribution to the draining time of the currently bottlenecked node, to stop pipelined flows from its sender as quickly as possible. In that case, cooperative behavior is implicit. The bottleneck switch needs only to inform all independent flows of their contribution, and with an appropriate control loop on each node, approximate fairness can result. And this is the most general approach: switches have no idea of the meaning of the flows, so beyond timely and accurate reporting, they can't make useful decisions about fixing congestion.

Note that this is all an argument about architectural principles and the essence of the congestion problem. I could quibble about whether fq_codel is the simplest or best choice for the minimal functionality an internetwork could provide. But it's pretty nice and simple. It is not clear it works for a decentralized protocol like WiFi as a link, but something like it would seem to be the right thing.

On Wednesday, May 21, 2014 12:30pm, Dave Taht dave.t...@gmail.com said: On Wed, May 21, 2014 at 9:03 AM, dpr...@reed.com wrote: In reality we don't disagree on this: On Wednesday, May 21, 2014 11:19am, Dave Taht dave.t...@gmail.com said: Well, I disagree somewhat. The downstream shaper we use works quite well, until we run out of CPU at 50 Mbit/s. Testing on the Ubiquiti EdgeRouter has had the inbound shaper work up to a little past 100 Mbit/s. So there is (theoretically) no need to upgrade the big fat head-ends if your CPE is powerful enough to do the job. It would be better if the head-ends did it, of course. There is an advantage for the head-ends doing it, to the extent that each edge device has no clarity about what is happening with all the other CPE that are sharing that head-end. When there is bloat in the head-end, even if all CPEs sharing an upward path are shaping themselves to the "up to" speed the provider sells, they can go into serious congestion if the head-end queues can grow to 1 second or more of sustained queueing delay. My understanding is that head-end queues have more than that. They certainly do in LTE access networks. Compelling argument! I agree it would be best for the devices that have the most information about the network to manage themselves better. It is deeply ironic to me that I'm arguing for an e2e approach to fixing the problem in the field, with you! http://en.wikipedia.org/wiki/End-to-end_principle -- Dave Täht NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
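To put a number on the draining-time bound above, a rough worked example (the 50 ms figure is from the argument; the 10 Mbit/s link rate is an illustrative assumption):

    max standing queue = rate x max draining time
                       = 10 Mbit/s x 0.05 s
                       = 500 kbit, or about 62.5 kilobytes

By the same arithmetic, a head-end queue holding one second of traffic at 10 Mbit/s is about 1.25 megabytes, twenty times the bound, which is the serious-congestion case described in the quoted exchange.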
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
On Wed, May 21, 2014 at 10:47 AM, Jim Gettys j...@freedesktop.org wrote: On Wed, May 21, 2014 at 12:03 PM, dpr...@reed.com wrote: In reality we don't disagree on this: On Wednesday, May 21, 2014 11:19am, Dave Taht dave.t...@gmail.com said: Well, I disagree somewhat. The downstream shaper we use works quite well, until we run out of CPU at 50 Mbit/s. Testing on the Ubiquiti EdgeRouter has had the inbound shaper work up to a little past 100 Mbit/s. So there is (theoretically) no need to upgrade the big fat head-ends if your CPE is powerful enough to do the job. It would be better if the head-ends did it, of course. There is an advantage for the head-ends doing it, to the extent that each edge device has no clarity about what is happening with all the other CPE that are sharing that head-end. When there is bloat in the head-end, even if all CPEs sharing an upward path are shaping themselves to the "up to" speed the provider sells, they can go into serious congestion if the head-end queues can grow to 1 second or more of sustained queueing delay. My understanding is that head-end queues have more than that. They certainly do in LTE access networks. I have measured 200 ms on a 28 Mbit/s LTE quadrant to a single station. This was using the simplest possible test on an idle cell. It is easy to see how that can grow into the one-second range. Similarly, Dave Taht and I recently took data that showed a large downstream buffer at the CMTS end (line card?); IIRC, it was something like 0.25 megabytes, using a UDP flooding tool. No, it was twice that. The udpburst tool is coming along nicely, but it still needs some analytics against the departure rate to get it right. As always, there may be multiple different buffers lurking in these complex devices, which may only come into play when different parts of them bottleneck, just as we found many different buffering locations inside of Linux. In fact, some of these devices include Linux boxes (though I do not know whether they are on the packet-forwarding path or not). Bandwidth shaping downstream of those bottlenecks can help, but only to a degree, and I believe primarily for well-behaved, long-lived elephant flows. Offload engines on servers, and ACK coalescing in various equipment, limit the degree of help; transient behavior, such as opening a bunch of TCP connections simultaneously and downloading the elements of a web page, is I believe likely to put large bursts of packets into these queues, causing transient poor latency. I think we'll get a bit of help out of the packet pacing code that recently went into Linux (for well-behaved servers) as it deploys. Thanks to Eric Dumazet for that work! Ironically, servers get updated much more frequently than these middle boxes, as far as I can tell. Somehow we gotta get the bottlenecks in these devices (broadband and cellular) to behave better. Or we can take a break, and write books about how we learned to relax and stop worrying about the bloat. - Jim -- Dave Täht NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
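Since udpburst isn't generally available, the same class of measurement can be approximated with stock tools: saturate the downlink with unresponsive UDP while watching how much latency it induces, then estimate buffer depth as rate times added delay. A sketch assuming iperf2 on both ends; the address and rates are placeholders, and this should only be run on links you own:

    # terminal 1: watch latency before and during the flood
    ping -i 0.2 192.0.2.1
    # terminal 2: send unresponsive UDP at just over the link rate
    iperf -u -c 192.0.2.1 -b 30M -t 30
    # back-of-envelope: a 28 Mbit/s link showing ~200 ms of added delay
    # is holding 28e6 * 0.2 / 8 = 700 kilobytes in its buffer

The departure-rate analytics mentioned above matter because the flood itself perturbs the measurement; this sketch only bounds the buffer size, it does not characterize the buffer.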
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
On Wednesday, May 21, 2014 1:53pm, Dave Taht dave.t...@gmail.com said: Or we can take a break, and write books about how we learned to relax and stop worrying about the bloat. Leading to waistline bloat?
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
On Wed, May 21, 2014 at 1:56 PM, dpr...@reed.com wrote: On Wednesday, May 21, 2014 1:53pm, Dave Taht dave.t...@gmail.com said: Or we can take a break, and write books about how we learned to relax and stop worrying about the bloat. Leading to waistline bloat? We resemble that remark already
Re: [Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
On Wed, May 21, 2014 at 10:57 AM, Jim Gettys j...@freedesktop.org wrote: On Wed, May 21, 2014 at 1:56 PM, dpr...@reed.com wrote: On Wednesday, May 21, 2014 1:53pm, Dave Taht dave.t...@gmail.com said: Or we can take a break, and write books about how we learned to relax and stop worrying about the bloat. Leading to waistline bloat? We resemble that remark already I put on 35 pounds since starting to work on this. -- Dave Täht NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
[Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
The concept of eliminating bufferbloat on many more routers is quite appealing. Reading some of the recent posts makes it clear there is a desire to get to stable code, and also to find a new platform beyond the current Netgear. However, as good as some of the proposed platforms may be for development and for all of the new capabilities of CeroWrt, I would also like to propose some focus on reaching a wider and less sophisticated audience, to help broaden awareness and make control of bufferbloat easier to attain for more users.

It appears there is a desire to merge the code into an upcoming OpenWRT Barrier Breaker release, which is excellent, as it will make it easier to fight bufferbloat on a wide range of platforms and provide users with a much easier-to-install firmware release. I'd like to be able to download luci-qos-scripts and sqm-scripts and have basic bufferbloat control on a much greater variety of devices and for many more users. From an awareness perspective this would be a huge win. Is the above scenario what is being planned, and is it likely to happen in the reasonable future?

From my perspective, it would be ideal to have this available to many users on a more affordable platform, something like an 8 MB-flash router such as the TP-Link WDR-4300, which is otherwise a very capable router with dual bands and good performance. (I've managed to set up such a WDR-4300 with OpenWRT trunk, figured out how to telnet in and install LuCI, then luci-app-qos and qos-scripts, and I thought the bufferbloat control was remarkable.) How much better would it be if I were able to use luci-qos-scripts and sqm-scripts instead?

For these target users, simplicity, good performance, ease of setup, and affordability are what matter. They are not able to deal with routing between subnets on wireless, IPv6 setup, or any complexities introduced by DNSSEC. Marketing the advantages of bufferbloat control alone requires lots of education and publicity (and, as we have seen, there are many misleading posts by seemingly persuasive nay-sayers claiming it is all smoke and mirrors).

Would it be possible to have a simplified release of CeroWrt (in terms of setup and features), make it available for a reliable and affordable platform, and publicize it so it reaches a much wider audience? This could potentially be through the OpenWRT channels.

Part of the reason Tomato was so popular is that the firmware upgrade, install, configuration, and management were well within the capabilities of the average weekend hacker, and there were compelling features and reliability advantages over the factory firmware of the time.

Even installing OpenWRT, especially trunk, and finding, downloading, and enabling packages, while very powerful and flexible, is still quite complex for someone who does not spend a lot of time reading wikis and release notes. I'd be interested in feedback on these thoughts. Frits Riep
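As a concrete version of the install path described above, the steps on a stock OpenWRT trunk image of that era looked roughly like the following; the package names are the ones mentioned in this thread, but availability varies by build, so treat this as a sketch rather than a recipe:

    # first login on a fresh image is via telnet; setting a root
    # password switches the router over to ssh
    telnet 192.168.1.1
    passwd
    # install the LuCI web UI and the simple QoS packages
    opkg update
    opkg install luci luci-app-qos qos-scripts
    # or, where available, the newer smart queue management packages
    opkg install sqm-scripts luci-app-sqm

After that, entering the measured up/down rates (minus a margin) in the QoS/SQM page is the only configuration the target user would need, and that is exactly the step that still wants automating.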