Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android
On 3/1/13 10:39 PM, Ketan Kulkarni wrote:
> Consider it from the end-user perspective: getting a voice call while surfing/downloading on 2G/3G interrupts the whole download, and it is annoying.

Ummm, this isn't entirely accurate. When Karn and I designed CDMA IS-99 circa '93-'94, with data in the control channel, data never stopped during voice calls. Maybe some versions of 2G/3G couldn't do that, but better versions could ;-)

> Similarly, going ahead we might very well have handoff from wifi to LTE - why not?

Agreed. But again, we've known how to do soft hand-off for a long time.

On Fri, Mar 1, 2013 at 9:57 PM, dpr...@reed.com wrote:
> This is why "more buffering is better" continues to be the law of the land - the spec is defined to be "no lost packets under load." [...]

We've had this problem with bell-heads forever. Even back in the days of heavy packet loss at MAE-East, bell-heads would continue to insist that any packet loss was an alarm condition. Even after PPP LQM showed they mangled bits and bytes on even their most prized T3 links (and had been lying to the FCC about uptime for decades), we never could shake off the syndrome. It's the "every bit is sacred" mentality.
Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android
overbuffered) in that direction. I'm looking forward to some measurements of actual buffering at the device driver/device levels. I don't know how inbound to the handset is managed via LTE. Still quite a few assumptions left to smash in the above.

... in the home router case ... when there are artificial rate limits in play (in, for example, a cable modem/CMTS hooked up via GigE yet rate limiting to 24Mbit down/4Mbit up), a rate limiter (tbf, htb, hfsc) needs to be applied locally, to move that rate limiter/queue management into the local device so we can manage it better.

I'd like to be rid of the need to use htb and come up with a rate limiter that could be adjusted dynamically from a daemon in userspace, probing for short-term bandwidth fluctuations while monitoring the load. It needn't send much data very often to come up with a stable result. You've described one soft-rate sensing scheme (piggybacking on TCP), and I've thought up a few others, that could feed back samples from a daemon into a soft(er) rate limiter that would keep control of the queues in the home router. I am thinking it's going to take way too long to fix the CPE, and it's far easier to fix the home router via this method; certainly it's too painful and inaccurate to merely measure the bandwidth once and then set a hard rate. So far as I know the Gargoyle project was experimenting with this approach.

A problem is in places that connect more than one device to the cable modem... then you end up with those devices needing to communicate their perception of the actual bandwidth beyond the link.

> Where will it get that from the 4G or 3G uplink?

-Original Message-
From: Maciej Soltysiak mac...@soltysiak.com
Sent: Thursday, February 28, 2013 1:03pm
To: cerowrt-devel@lists.bufferbloat.net
Subject: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

[quoted text snipped]

--
Dave Täht
Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
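[Editor's note: as a concrete illustration of the kind of daemon Dave describes, here is a minimal sketch in Python. It periodically probes path delay and nudges an HTB class rate up or down, with fq_codel attached below the shaper. Everything specific in it - the interface name ge00, the probe host, the rate floor/ceiling, the thresholds, and the use of ping as the delay probe - is an assumption for illustration, not anything from the original post; a real implementation would want a far smarter estimator than this.]

#!/usr/bin/env python3
"""Sketch of a userspace daemon that softly adjusts an HTB rate limiter.

All parameters below are illustrative assumptions, not from the thread.
Shapes egress of the named interface only; controlling the inbound
direction needs an IFB redirect or shaping on the LAN-side interface.
"""
import re
import subprocess
import time

IFACE = "ge00"       # hypothetical WAN-facing interface name
FLOOR_KBIT = 1000    # never shape below this
CEIL_KBIT = 24000    # nominal provisioned rate
TARGET_MS = 30.0     # acceptable extra delay under load (assumed)

def sh(cmd):
    subprocess.run(cmd.split(), check=True)

def set_rate(kbit):
    # Replace the HTB class rate in place; fq_codel stays attached below it.
    sh(f"tc class replace dev {IFACE} parent 1: classid 1:10 "
       f"htb rate {kbit}kbit ceil {kbit}kbit")

def probe_rtt_ms(host="8.8.8.8"):
    # Parse the avg field from ping's "min/avg/max/mdev" summary line.
    out = subprocess.run(["ping", "-c", "3", "-i", "0.2", host],
                         capture_output=True, text=True).stdout
    m = re.search(r"= [\d.]+/([\d.]+)/", out)
    return float(m.group(1)) if m else None

def main():
    # One-time setup: HTB root with fq_codel as the leaf qdisc.
    sh(f"tc qdisc replace dev {IFACE} root handle 1: htb default 10")
    rate = CEIL_KBIT
    set_rate(rate)
    sh(f"tc qdisc replace dev {IFACE} parent 1:10 fq_codel")
    base = probe_rtt_ms() or 20.0   # naive unloaded-delay baseline
    while True:
        rtt = probe_rtt_ms()
        if rtt is not None:
            if rtt - base > TARGET_MS:   # queue building: back off
                rate = max(FLOOR_KBIT, int(rate * 0.9))
            else:                        # headroom: creep back up
                rate = min(CEIL_KBIT, int(rate * 1.02))
            set_rate(rate)
        time.sleep(5)

if __name__ == "__main__":
    main()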
Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android
One wonders why all this complexity is necessary, and how likely it is to be well tuned by operators and their contract installers. I'm willing to bet $1000 that all the testing that is done is "Can you hear me now" and a speed test. Not even something as simple and effective as RRUL.

-Original Message-
From: Ketan Kulkarni ketku...@gmail.com
Sent: Friday, March 1, 2013 3:00am
To: Jim Gettys j...@freedesktop.org
Cc: cerowrt-devel@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

On Fri, Mar 1, 2013 at 1:33 AM, Jim Gettys j...@freedesktop.org wrote:
> I've got a bit more insight into LTE than I did in the past, courtesy of the last couple days. To begin with, LTE runs with several classes of service (they call them bearers). Your VOIP traffic goes into one of them. And I think there is another as well that is for guaranteed bit rate traffic. One transmit opportunity may have a bunch of chunks of data, and that data may be destined for more than one device (IIRC). It's substantially different than WiFi.

Just thought to shed more light on the bearer stuff. There are two ways bearers are set up:
1. UE initiated - where the User Equipment sets up the parameters for the bearer
2. Network initiated - where a node like the PCRF or PGW sets up the parameters.

Parameters include the guaranteed bit rates (GBR) and maximum bit rates (MBR). Something called a QCI is associated with each bearer. The QCI parameters are authorized at the PCRF (Policy and Charging Rules Function), and there is a certain mapping, maintained at either the PCRF or the PGW, between QCI values and DSCP and MBRs (an illustrative sketch of such a mapping appears below, after the quoted message). Enforcement of these parameters is done at the PGW (in this role it is termed the PCEF - Policy and Charging Enforcement Function). So PGWs, depending on bearers, can certainly modify DSCP bits, though these can also be modified by other nodes in the network.

There are two types of bearers:
1. Dedicated bearers - to carry traffic which needs special treatment
2. Default or general purpose bearers - to carry all general purpose data.

So generally VoIP and streaming video are passed over dedicated bearers, which (generally) get higher GBRs, MBRs and the correct DSCP markings, and other, non-latency-sensitive traffic follows the default bearer. The theoretical limit on bearers is 11, though practically most deployments use up to 3 bearers max. Note that these parameters may very well vary based on subscriber profiles: premium/corporate subscribers can have higher GBRs and MBRs. ISPs are generally very sensitive to correct markings at the gateways, for obvious reasons.

> But most of what we think of as Internet stuff (web surfing, dns, etc) all gets dumped into a single best effort (BE) class. The BE class is definitely badly bloated; I can't say how much, because I don't really know yet (the test my colleague ran wasn't run long enough to be confident it filled the buffers). But I will say: worse than most cable modems I've seen. I expect this will be true to different degrees on different hardware. The other traffic classes haven't been tested yet for bufferbloat, though I suspect they will have it too. I was told that those classes have much shorter queues, and when they grow, they dump the whole queue (because delivering late real-time traffic is useless). But trust *and* verify. Verification hasn't been done for anything but BE traffic, and that hasn't been quantified.
> But each device gets a fair shot at bandwidth in the cell (or sector of a cell; they run 3 radios in each cell), where "fair" is basically time based; if you are at the edge of a cell, you'll get a lot less bandwidth than someone near a tower; and this fairness is guaranteed by a scheduler that runs in the base station (called an eNodeB, IIRC). So the base station guarantees some sort of fairness between devices (a place where Linux's wifi stack today fails utterly, since there is a single queue per device, rather than one per station). Whether there are bloat problems at the link level in LTE due to error correction I don't know yet, but it wouldn't surprise me; I know there were in 3G. The people I talked to this morning aren't familiar with the HARQ layer in the system. The base stations are complicated beasts; they have both a Linux system in them as well as a real-time-operating-system-based device inside. We don't know where the bottleneck(s) are yet. I spent lunch upping their paranoia and getting them through some conceptual hurdles (e.g. multiple bottlenecks that may move, and the like). They will try to get me some of the data so I can help them figure it out. I don't know if the data flow goes through the Linux system in the eNodeB or not, for example. Most carriers are now trying to ensure that their backhauls from the base station are never congested, though that is another known source of problems.
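[Editor's note: to make Ketan's QCI/DSCP point above concrete, here is a toy lookup of the kind of mapping a PGW acting as PCEF might apply. The QCI service classes are the standardized ones from 3GPP TS 23.203, but the DSCP column is purely an assumption - in practice that mapping is operator policy configured at the PCRF/PGW, not fixed by the standard.]

# Illustrative only: QCI characteristics per 3GPP TS 23.203; the
# QCI -> DSCP column is an assumed operator policy, not a standard.
QCI_TABLE = {
    # qci: (resource type, example service,          assumed DSCP)
    1:  ("GBR",     "conversational voice",          "EF"),
    2:  ("GBR",     "conversational video",          "AF41"),
    5:  ("non-GBR", "IMS signalling",                "CS5"),
    9:  ("non-GBR", "default bearer / best effort",  "BE"),
}

def dscp_for_qci(qci):
    """What a PGW acting as PCEF might stamp on downlink packets."""
    return QCI_TABLE.get(qci, ("non-GBR", "unknown", "BE"))[2]

print(dscp_for_qci(1))   # EF  (dedicated voice bearer)
print(dscp_for_qci(9))   # BE  (everything Jim calls "Internet stuff")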
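[Editor's note: the time-based "fair shot at bandwidth" Jim describes is characteristic of the proportional-fair family of schedulers commonly used in cellular base stations. The real eNodeB schedulers are proprietary and vendor-specific; the toy below only illustrates the PF metric (instantaneous achievable rate divided by long-run served average), which is why a cell-edge device keeps getting airtime slots but far less total throughput than one near the tower.]

"""Toy proportional-fair scheduler - an illustration of time/throughput
fairness, not real base-station code. Channel fading is faked with a
uniform random factor."""
import random

class UE:
    def __init__(self, name, peak_rate):
        self.name = name
        self.peak = peak_rate   # best rate this UE's channel allows
        self.avg = 1e-6         # exponentially averaged served throughput

def schedule(ues, slots=10000, alpha=0.01):
    served = {u.name: 0.0 for u in ues}
    for _ in range(slots):
        # Instantaneous achievable rate fluctuates with fading.
        inst = {u: u.peak * random.uniform(0.3, 1.0) for u in ues}
        # PF metric: instantaneous rate / long-run average served rate.
        winner = max(ues, key=lambda u: inst[u] / u.avg)
        for u in ues:
            got = inst[u] if u is winner else 0.0
            u.avg = (1 - alpha) * u.avg + alpha * got
            served[u.name] += got
    return served

# The cell-edge UE still gets scheduled regularly (its average stays low,
# boosting its metric), but ends up with much less data overall.
ues = [UE("near-tower", peak_rate=50.0), UE("cell-edge", peak_rate=5.0)]
print(schedule(ues))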
Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android
On Fri, Mar 1, 2013 at 10:40 AM, dpr...@reed.com wrote:
> One wonders why all this complexity is necessary, and how likely it is to be well tuned by operators and their contract installers. I'm willing to bet $1000 that all the testing that is done is "Can you hear me now" and a speed test.

Actually, at least some of the carriers do much more extensive testing, though not with the test tools we would like to see used (yet). An example is AT&T, where, in research, KK Ramakrishnan has a van with 20 or so laptops so he can go driving around, load up a cell in the middle of the night, and get data. And he's research; the operations guys do lots of testing I gather, but more at the radio level. Next up is to educate KK to run RRUL. And in my own company, I've seen data, but it is too high level: e.g. performance of web video: Silverlight, Flash, YouTube, etc.

A common disease that has complicated all this is the propensity for companies to use Windows XP internally for everything: since window scaling is turned off, you can't saturate an LTE link the way you might like to do with a single TCP connection.
- Jim

-Original Message-
From: Ketan Kulkarni ketku...@gmail.com
Sent: Friday, March 1, 2013 3:00am
To: Jim Gettys j...@freedesktop.org
Cc: cerowrt-devel@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

[quoted text snipped]
Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android
to spend, and time's a wasting. They don't even work for Verizon or AT&T. They follow specs handed down, and charge more if you tell them that the specs have changed. This is not how brand-new systems get tuned. It's a Clown Circus out there, and more parameters don't help.

This is why "more buffering is better" continues to be the law of the land - the spec is defined to be "no lost packets under load." I'm sure that the primary measure under load for RRUL will be "no lost packets" by the time it gets to field engineers in the form of specs - because that's what they've *always* been told, and they will disregard any changes as typos.

A system with more than two control parameters that interact in complex ways is ungovernable - and no control parameters in LTE are orthogonal, much less linear, in their interaction.

-Original Message-
From: Jim Gettys j...@freedesktop.org
Sent: Friday, March 1, 2013 11:09am
To: David P Reed dpr...@reed.com
Cc: Ketan Kulkarni ketku...@gmail.com, cerowrt-devel@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

[quoted text snipped]
Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android
Doesn't fq_codel need an estimate of link capacity? Where will it get that from the 4G or 3G uplink?

-Original Message-
From: Maciej Soltysiak mac...@soltysiak.com
Sent: Thursday, February 28, 2013 1:03pm
To: cerowrt-devel@lists.bufferbloat.net
Subject: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

Hiya,

Looks like Google's experimenting with 3.8 for Android: https://android.googlesource.com/kernel/common/+/experimental/android-3.8

Sounds great if this means they will utilize fq_codel, TFO, BQL, etc. Anyway, my Nexus 7 says it has 3.1.10, and this 3.8 will probably go to Android 5.0, so I hope the Nexus 7 will get it too some day, or at least 3.3+.

Phoronix coverage: http://www.phoronix.com/scan.php?page=news_item&px=MTMxMzc

Their 3.8 changelog: https://android.googlesource.com/kernel/common/+log/experimental/android-3.8

Regards,
Maciej
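[Editor's note: for what it's worth, fq_codel itself does not need a link-capacity estimate - CoDel keys off how long packets sit in the queue, not off a configured rate, which is exactly why it suits variable-rate 3G/4G links where any hard rate setting goes stale. A capacity estimate is only needed when the queue to be controlled lives in a device below you, which is the rate-limiter discussion elsewhere in this thread. Below is a minimal sketch of the control law, after Nichols and Jacobson's 2012 CoDel paper; timestamps and the dequeue path are simplified and this is not the kernel implementation.]

"""Minimal sketch of CoDel's control law: drop based on packet sojourn
time, with no knowledge of the link rate."""
import math

TARGET = 0.005     # 5 ms acceptable standing-queue delay
INTERVAL = 0.100   # 100 ms window, roughly a worst-case RTT

class CoDelState:
    def __init__(self):
        self.first_above = None   # when sojourn first exceeded TARGET
        self.dropping = False
        self.count = 0            # drops in the current dropping state
        self.drop_next = 0.0

def should_drop(state, sojourn, now):
    """Return True if the packet just dequeued should be dropped."""
    if sojourn < TARGET:
        # Delay is acceptable: leave (or never enter) the dropping state.
        state.first_above = None
        state.dropping = False
        return False
    if state.first_above is None:
        state.first_above = now + INTERVAL
        return False
    if not state.dropping and now >= state.first_above:
        # Delay stayed above TARGET for a full INTERVAL: start dropping.
        state.dropping = True
        state.count = 1
        state.drop_next = now + INTERVAL
        return True
    if state.dropping and now >= state.drop_next:
        # Drop again, at intervals shrinking as INTERVAL / sqrt(count).
        state.count += 1
        state.drop_next = now + INTERVAL / math.sqrt(state.count)
        return True
    return False

# Usage: feed per-packet (sojourn, now) samples from the dequeue path, e.g.
#   st = CoDelState(); drop = should_drop(st, sojourn_s, time.monotonic())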
Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android
In short, people who build hardware devices, or device drivers, don't understand TCP. There is a first-class education failure in all this. We have yet to find almost any device that isn't bloated; the only question is how badly.
- Jim

On Thu, Feb 28, 2013 at 3:58 PM, dpr...@reed.com wrote:

At least someone actually saw what I've been seeing for years now in metro-area HSPA and LTE deployments. As you know, when I first reported this on the e2e list I was told it could not possibly be happening and that I didn't know what I was talking about. No one in the phone companies was even interested in replicating my experiments, just dismissing them. It was sad.

However, I had the same experience on the original Honeywell 6180 dual-CPU Multics deployment in about 1973. One day all my benchmarks were running about 5 times slower every other time I ran the code. I suggested that one of the CPUs was running 5x slower, and that it was probably due to the CPU cache being turned off. The hardware engineer on site said that that was *impossible*. After 4 more hours of testing, I was sure I was right. That evening, I got him to take the system down, and we hauled out an oscilloscope. Sure enough, the gate that received the "cache hit" signal had died in one of the processors. The machine continued to run, since all that caused was for memory to be fetched every time, rather than using the cache.

Besides the value of finding the root cause of anomalies, the story points out that you really need to understand software and hardware sometimes. The hardware engineer didn't understand the role of a cache, even though he fully understood timing margins, TTL logic, core memory (yes, this machine used core memory), etc. We both understood oscilloscopes, fortunately. In some ways this is like the LTE designers understanding TCP. They don't. But sometimes you need to know about both in some depth.

Congratulations, Jim. More Internet Plumbing Merit Badges for you.

-Original Message-
From: Jim Gettys j...@freedesktop.org
Sent: Thursday, February 28, 2013 3:03pm
To: Dave Taht dave.t...@gmail.com
Cc: David P Reed dpr...@reed.com, cerowrt-devel@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

[quoted text snipped]
Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android
It all started when CS departments decided they didn't need EE courses or affiliation with EE departments, and continued with the idea that digital communications had nothing to do with the folks who design the gear, so all you needed to know was the bit layouts of packets in memory to be a network expert. You can see this in the curricula at all levels. Cisco certifies network people who have never studied control theory, queueing theory, ..., and the phone companies certify communications engineers who have never run traceroute or ping, much less debugged the performance of a web-based UI.

Modularity is great. But it comes at a cost. Besides this kind of failure, it's the primary cause of security vulnerabilities.

-Original Message-
From: Jim Gettys j...@freedesktop.org
Sent: Thursday, February 28, 2013 4:02pm
To: David P Reed dpr...@reed.com
Cc: Dave Taht dave.t...@gmail.com, cerowrt-devel@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

[quoted text snipped]