Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

2013-02-28 Thread dpreed

Doesn't fq_codel need an estimate of link capacity?  Where will it get that 
from the 4G or 3G uplink?
 
-Original Message-
From: "Maciej Soltysiak" 
Sent: Thursday, February 28, 2013 1:03pm
To: cerowrt-devel@lists.bufferbloat.net
Subject: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for 
Android



Hiya,

Looks like Google's experimenting with 3.8 for Android: 
https://android.googlesource.com/kernel/common/+/experimental/android-3.8
Sounds great if this means they will utilize fq_codel, TFO, BQL, etc.

Anyway, my Nexus 7 says it has 3.1.10, and this 3.8 will probably go into
Android 5.0, so I hope the Nexus 7 will get it too some day, or at least 3.3+.

Phoronix coverage: http://www.phoronix.com/scan.php?page=news_item&px=MTMxMzc
Their 3.8 changelog:
https://android.googlesource.com/kernel/common/+log/experimental/android-3.8

Regards,
Maciej
___
Cerowrt-devel mailing list
Cerowrt-devel@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cerowrt-devel


Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

2013-02-28 Thread Dave Taht
On Thu, Feb 28, 2013 at 1:57 PM,  wrote:

> Doesn't fq_codel need an estimate of link capacity?
>

No, it just measures delay. Since, so far as I know, the outgoing portion of
LTE is not soft-rate limited but is sensitive to the actual available link
bandwidth, fq_codel should work pretty well in that direction (if the
underlying interfaces weren't so horribly overbuffered).
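
For reference, the heart of CoDel's control law can be sketched in a few
lines of Python. This is a simplified illustration, not the kernel's
sch_fq_codel, but it shows that the only input is per-packet sojourn time
in the queue - there is no rate or capacity parameter anywhere:

import math

TARGET = 0.005      # 5 ms: acceptable standing-queue delay
INTERVAL = 0.100    # 100 ms: how long delay must persist before acting

class CoDelSketch:
    """Simplified sketch of CoDel's control law (illustrative only)."""
    def __init__(self):
        self.first_above = None   # deadline set when sojourn first exceeds TARGET
        self.dropping = False
        self.count = 0            # drops made in the current dropping state
        self.next_drop = 0.0

    def should_drop(self, sojourn, now):
        """sojourn = dequeue time minus enqueue time for this packet."""
        if sojourn < TARGET:
            # Queue delay is fine; leave the dropping state.
            self.first_above = None
            self.dropping = False
            return False
        if self.first_above is None:
            # Delay just went bad; give it one INTERVAL to recover.
            self.first_above = now + INTERVAL
            return False
        if not self.dropping and now >= self.first_above:
            # Bad delay persisted for a full INTERVAL: start dropping.
            self.dropping = True
            self.count = 1
            self.next_drop = now + INTERVAL / math.sqrt(self.count)
            return True
        if self.dropping and now >= self.next_drop:
            # Keep dropping, a little more often each time, until delay falls.
            self.count += 1
            self.next_drop = now + INTERVAL / math.sqrt(self.count)
            return True
        return False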

I'm looking forward to some measurements of actual buffering at the device
driver/device levels.

I don't know how inbound to the handset is managed via LTE.

Still quite a few assumptions left to smash in the above.

...

in the home router case

...

When there are artificial rate limits in play (in, for example, a cable
modem/CMTS hooked up via gigE yet rate limited to 24Mbit down/4Mbit up), then
a rate limiter (tbf, htb, hfsc) needs to be applied locally to move that rate
limiter/queue management into the local device, so we can manage it better.
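
Concretely, the "move the queue into the router" setup looks roughly like the
sketch below. It is illustrative only: the interface name and rate are
placeholders, and the tc invocations are the generic htb + fq_codel pattern
rather than any particular script.

import subprocess

WAN_IF = "ge00"        # placeholder WAN interface name
UPLINK_KBIT = 4000     # placeholder provisioned uplink rate

def tc(cmd):
    """Run one tc command, echoing it for inspection."""
    print("+ tc " + cmd)
    subprocess.run("tc " + cmd, shell=True, check=True)

def shape_egress(dev, rate_kbit):
    # Rate-limit slightly below the modem's rate so the queue builds here,
    # where fq_codel can manage it, instead of in the modem's dumb FIFO.
    tc(f"qdisc del dev {dev} root 2>/dev/null || true")
    tc(f"qdisc add dev {dev} root handle 1: htb default 11")
    tc(f"class add dev {dev} parent 1: classid 1:11 htb "
       f"rate {int(rate_kbit * 0.95)}kbit")
    tc(f"qdisc add dev {dev} parent 1:11 fq_codel")

if __name__ == "__main__":
    shape_egress(WAN_IF, UPLINK_KBIT)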

I'd like to be rid of the need to use htb and come up with a rate limiter
that could be adjusted dynamically from a daemon in userspace, probing for
short-term bandwidth fluctuations while monitoring the load. It needn't send
much data very often to come up with a stable result.
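
As a rough illustration of that idea (everything here is a placeholder: the
interface, the thresholds, and the crude ping probe standing in for a smarter
sensing scheme), such a daemon could be little more than:

import statistics
import subprocess
import time

WAN_IF = "ge00"          # placeholder WAN interface (assumes the htb class above)
BASE_RATE_KBIT = 4000    # last known good estimate of the uplink rate
PROBE_HOST = "8.8.8.8"   # any stable, nearby reflector will do

def probe_rtt_ms(host, count=5):
    """Crude load probe: median ping RTT in milliseconds (None on failure)."""
    out = subprocess.run(["ping", "-c", str(count), "-i", "0.2", host],
                         capture_output=True, text=True).stdout
    rtts = [float(line.split("time=")[1].split()[0])
            for line in out.splitlines() if "time=" in line]
    return statistics.median(rtts) if rtts else None

def set_rate(dev, kbit):
    # Adjust the existing htb class set up earlier.
    subprocess.run(f"tc class change dev {dev} parent 1: classid 1:11 "
                   f"htb rate {kbit}kbit", shell=True, check=True)

def run(idle_rtt_ms=20.0, bloat_margin_ms=30.0, period_s=10):
    """Tiny control loop: back the shaper off when latency under load rises
    well above idle, and creep back toward the base rate when it stays low."""
    rate = BASE_RATE_KBIT
    while True:
        rtt = probe_rtt_ms(PROBE_HOST)
        if rtt is not None:
            if rtt > idle_rtt_ms + bloat_margin_ms:
                rate = max(int(rate * 0.9), BASE_RATE_KBIT // 4)   # shrink
            else:
                rate = min(int(rate * 1.02), BASE_RATE_KBIT)       # recover
            set_rate(WAN_IF, rate)
        time.sleep(period_s)

if __name__ == "__main__":
    run()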

You've described one soft-rate sensing scheme (piggybacking on TCP), and
I've thought up a few others that could feed samples back from a daemon into
a soft(er) rate limiter that would keep control of the queues in the home
router. I am thinking it's going to take way too long to fix the CPE and far
easier to fix the home router via this method, and certainly it's too painful
and inaccurate to merely measure the bandwidth once and then set a hard rate.

So far as I know, the Gargoyle project was experimenting with this approach.

A problem arises in places that connect more than one device to the cable
modem... then you end up with those devices needing to communicate their
perception of the actual bandwidth beyond the link.

Where will it get that from the 4G or 3G uplink?
>
>
>
> -Original Message-
> From: "Maciej Soltysiak" 
> Sent: Thursday, February 28, 2013 1:03pm
> To: cerowrt-devel@lists.bufferbloat.net
> Subject: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel
> for Android
>
>  Hiya,
>  Looks like Google's experimenting with 3.8 for Android:
> https://android.googlesource.com/kernel/common/+/experimental/android-3.8
> Sounds great if this means they will utilize fq_codel, TFO, BQL, etc.
>  Anyway my nexus 7 says it has 3.1.10 and this 3.8 will probably go to
> Android 5.0 so I hope Nexus 7 will get it too some day or at least 3.3+
>  Phoronix coverage:
> http://www.phoronix.com/scan.php?page=news_item&px=MTMxMzc
> Their 3.8 changelog:
> https://android.googlesource.com/kernel/common/+log/experimental/android-3.8
>  Regards,
> Maciej
>
> ___
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>
>


-- 
Dave Täht

Fixing bufferbloat with cerowrt:
http://www.teklibre.com/cerowrt/subscribe.html
___
Cerowrt-devel mailing list
Cerowrt-devel@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cerowrt-devel


Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

2013-02-28 Thread Jim Gettys
I've got a bit more insight into LTE than I did in the past, courtesy of
the last couple days.

To begin with, LTE runs with several classes of service (they call them
bearers).  Your VoIP traffic goes into one of them.
And I think there is another as well that is for guaranteed bit rate
traffic.  One transmit opportunity may have a bunch of chunks of data, and
that data may be destined for more than one device (IIRC).  It's
substantially different from WiFi.

But most of what we think of as Internet stuff (web surfing, DNS, etc.) all
gets dumped into a single best-effort ("BE") class.

The BE class is definitely badly bloated; I can't say how much because I
don't really know yet (the test my colleague ran wasn't run long enough to
be confident it filled the buffers).  But I will say worse than most cable
modems I've seen.  I expect this will be true to different degrees on
different hardware.  The other traffic classes haven't been tested yet for
bufferbloat, though I suspect they will have it too.  I was told that those
classes have much shorter queues, and when they grow, they dump the whole
queue (because delivering late real-time traffic is useless).  But trust
*and* verify...  Verification hasn't been done for anything but BE
traffic, and that hasn't been quantified.

But each device gets a "fair" shot at bandwidth in the cell (or sector of a
cell; they run 3 radios in each cell), where fair is basically time based;
if you are at the edge of a cell, you'll get a lot less bandwidth than
someone near a tower; and this fairness is guaranteed by a scheduler than
runs in the base station (called a b-nodeb, IIIRC).  So the base station
guarantees some sort of "fairness" between devices (a place where Linux's
wifi stack today fails utterly, since there is a single queue per device,
rather than one per station).
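
To illustrate what "fair is basically time based" means, here is a toy model
(not the actual eNodeB scheduler; the rates are made up): if the scheduler
hands out equal airtime, a device with a poor radio rate gets proportionally
less throughput.

# Toy airtime-fair split: equal transmit time per device, so throughput
# scales with each device's achievable radio rate. Purely illustrative.
devices = {
    "near_tower": 50e6,   # achievable PHY rate in bit/s (made up)
    "cell_edge":   5e6,
}

airtime_share = 1.0 / len(devices)          # equal time slice for everyone
for name, phy_rate in sorted(devices.items()):
    print(f"{name}: {phy_rate * airtime_share / 1e6:.1f} Mbit/s")
# near_tower gets ~25 Mbit/s, cell_edge ~2.5 Mbit/s: "fair" in airtime but
# very unequal in bytes, which keeps one slow device from consuming the cell.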

Whether there are bloat problems at the link level in LTE due to error
correction I don't know yet, but it wouldn't surprise me; I know there were
in 3G.  The people I talked to this morning aren't familiar with the HARQ
layer in the system.

The base stations are complicated beasts; they have both a Linux system in
them as well as a real-time-operating-system-based device inside.  We don't
know where the bottleneck(s) are yet.  I spent lunch upping their paranoia
and getting them through some conceptual hurdles (e.g. multiple bottlenecks
that may move, and the like).  They will try to get me some of the data so
I can help them figure it out.  I don't know if the data flow goes through
the Linux system in the eNodeB or not, for example.

Most carriers are now trying to ensure that their backhauls from the base
station are never congested, though that is another known source of
problems.  And then there is the lack of AQM at peering point routers...
You'd think they might run WRED there, but many/most do not.
 - Jim





On Thu, Feb 28, 2013 at 2:08 PM, Dave Taht  wrote:

>
>
> On Thu, Feb 28, 2013 at 1:57 PM,  wrote:
>
>> Doesn't fq_codel need an estimate of link capacity?
>>
>
> No, it just measures delay. Since so far as I know the outgoing portion of
> LTE is not soft-rate limited, but sensitive to the actual available link
> bandwidth, fq_codel should work pretty good (if the underlying interfaces
> weren't horribly overbuffired) in that direction.
>
> I'm looking forward to some measurements of actual buffering at the device
> driver/device levels.
>
> I don't know how inbound to the handset is managed via LTE.
>
> Still quite a few assumptions left to smash in the above.
>
> ...
>
> in the home router case
>
> ...
>
> When there are artificial rate limits in play (in, for example, a cable
> modem/CMTS, hooked up via gigE yet rate limiting to 24up/4mbit down), then
> a rate limiter (tbf,htb,hfsc) needs to be applied locally to move that rate
> limiter/queue management into the local device, se we can manage it better.
>
> I'd like to be rid of the need to use htb and come up with a rate limiter
> that could be adjusted dynamically from a daemon in userspace, probing for
> short all bandwidth fluctuations while monitoring the load. It needent send
> that much data very often, to come up with a stable result
>
> You've described one soft-rate sensing scheme (piggybacking on TCP), and
> I've thought up a few others, that could feed back from a daemon some
> samples into a a soft(er) rate limiter that would keep control of the
> queues in the home router. I am thinking it's going to take way too long to
> fix the CPE and far easier to fix the home router via this method, and
> certainly it's too painful and inaccurate to merely measure the bandwidth
> once, then set a hard rate, when
>
> So far as I know the gargoyle project was experimenting with this
> approach.
>
> A problem is in places that connect more than one device to the cable
> modem... then you end up with those needing to communicate their perception
> of the actual b

Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

2013-02-28 Thread dpreed

At least someone actually saw what I've been seeing for years now in Metro area 
HSPA and LTE deployments.
 
As you know, when I first reported this on the e2e list I was told it could not 
possibly be happening and that I didn't know what I was talking about.  No one 
in the phone companies was even interested in replicating my experiments, just 
dismissing them.  It was sad.
 
However, I had the same experience on the original Honeywell 6180 dual CPU 
Multics deployment in about 1973.  One day all my benchmarks were running about 
5 times slower every other time I ran the code.  I suggested that one of the 
CPUs was running 5x slower, and it was probably due to the CPU cache being 
turned off.   The hardware engineer on site said that that was *impossible*.  
After 4 more hours of testing, I was sure I was right.  That evening, I got him 
to take the system down, and we hauled out an oscilloscope.  Sure enough, the 
gate that received the "cache hit" signal had died in one of the processors.   
The machine continued to run, since all that caused was for memory to be 
fetched every time, rather than using the cache.
 
Besides the value of finding the "root cause" of anomalies, the story points 
out that you really need to understand software and hardware sometimes.  The 
hardware engineer didn't understand the role of a cache, even though he fully 
understood timing margins, TTL logic, core memory (yes, this machine used core 
memory), etc.
 
We both understood oscilloscopes, fortunately.
 
In some ways this is like the LTE designers understanding TCP.   They don't.  
But sometimes you need to know about both in some depth.
 
Congratulations, Jim.  More Internet Plumbing Merit Badges for you.
 
-Original Message-
From: "Jim Gettys" 
Sent: Thursday, February 28, 2013 3:03pm
To: "Dave Taht" 
Cc: "David P Reed" , "cerowrt-devel@lists.bufferbloat.net" 

Subject: Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel 
for Android



I've got a bit more insight into LTE than I did in the past, courtesy of the 
last couple days.
To begin with, LTE runs with several classes of service (the call them 
bearers).  Your VOIP traffic goes into one of them.
And I think there is another as well that is for guaranteed bit rate traffic.  
One transmit opportunity may have a bunch of chunks of data, and that data may 
be destined for more than one device (IIRC).  It's substantially different than 
WiFi.
But most of what we think of as Internet stuff (web surfing, dns, etc) all gets 
dumped into a single best effort ("BE"), class.
The BE class is definitely badly bloated; I can't say how much because I don't 
really know yet; the test my colleague ran wasn't run long enough to be 
confident it filled the buffers).  But I will say worse than most cable modems 
I've seen.  I expect this will be true to different degrees on different 
hardware.  The other traffic classes haven't been tested yet for bufferbloat, 
though I suspect they will have it too.  I was told that those classes have 
much shorter queues, and when the grow, they dump the whole queues (because 
delivering late real time traffic is useless).  But trust *and* verify  
Verification hasn't been done for anything but BE traffic, and that hasn't been 
quantified.
But each device gets a "fair" shot at bandwidth in the cell (or sector of a 
cell; they run 3 radios in each cell), where fair is basically time based; if 
you are at the edge of a cell, you'll get a lot less bandwidth than someone 
near a tower; and this fairness is guaranteed by a scheduler than runs in the 
base station (called a b-nodeb, IIIRC).  So the base station guarantees some 
sort of "fairness" between devices (a place where Linux's wifi stack today 
fails utterly, since there is a single queue per device, rather than one per 
station).
Whether there are bloat problems at the link level in LTE due to error 
correction I don't know yet; but it wouldn't surprise me; I know there was in 
3g.  The people I talked to this morning aren't familiar with the HARQ layer in 
the system.
The base stations are complicated beasts; they have both a linux system in them 
as well as a real time operating system based device inside  We don't know 
where the bottle neck(s) are yet.  I spent lunch upping their paranoia and 
getting them through some conceptual hurdles (e.g. multiple bottlenecks that 
may move, and the like).  They will try to get me some of the data so I can 
help them figure it out.  I don't know if the data flow goes through the linux 
system in the bnodeb or not, for example.
Most carriers are now trying to ensure that their backhauls from the base 
station are never congested, though that is another known source of problems.  
And then there is the lack of AQM at peering point ro

Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

2013-02-28 Thread Jim Gettys
In short, people who build hardware devices, or device drivers, don't
understand TCP.

There is a first class education failure in all this.

We have yet to find almost any device that isn't bloated; the only question
is how badly.
 - Jim



On Thu, Feb 28, 2013 at 3:58 PM,  wrote:

> At least someone actually saw what I've been seeing for years now in Metro
> area HSPA and LTE deployments.
>
>
>
> As you know, when I first reported this on the e2e list I was told it
> could not possibly be happening and that I didn't know what I was talking
> about.  No one in the phone companies was even interested in replicating my
> experiments, just dismissing them.  It was sad.
>
>
>
> However, I had the same experience on the original Honeywell 6180 dual CPU
> Multics deployment in about 1973.  One day all my benchmarks were running
> about 5 times slower every other time I ran the code.  I suggested that one
> of the CPUs was running 5x slower, and it was probably due to the CPU cache
> being turned off.   The hardware engineer on site said that that was
> *impossible*.  After 4 more hours of testing, I was sure I was right.  That
> evening, I got him to take the system down, and we hauled out an
> oscilloscope.  Sure enough, the gate that received the "cache hit" signal
> had died in one of the processors.   The machine continued to run, since
> all that caused was for memory to be fetched every time, rather than using
> the cache.
>
>
>
> Besides the value of finding the "root cause" of anomalies, the story
> points out that you really need to understand software and hardware
> sometimes.  The hardware engineer didn't understand the role of a cache,
> even though he fully understood timing margins, TTL logic, core memory
> (yes, this machine used core memory), etc.
>
>
>
> We both understood oscilloscopes, fortunately.
>
>
>
> In some ways this is like the LTE designers understanding TCP.   They
> don't.  But sometimes you need to know about both in some depth.
>
>
>
> Congratulations, Jim.  More Internet Plumbing Merit Badges for you.
>
>
>
> -Original Message-
> From: "Jim Gettys" 
> Sent: Thursday, February 28, 2013 3:03pm
> To: "Dave Taht" 
> Cc: "David P Reed" , "cerowrt-devel@lists.bufferbloat.net"
> 
> Subject: Re: [Cerowrt-devel] Google working on experimental 3.8 Linux
> kernel for Android
>
>  I've got a bit more insight into LTE than I did in the past, courtesy of
> the last couple days.
> To begin with, LTE runs with several classes of service (the call them
> bearers).  Your VOIP traffic goes into one of them.
> And I think there is another as well that is for guaranteed bit rate
> traffic.  One transmit opportunity may have a bunch of chunks of data, and
> that data may be destined for more than one device (IIRC).  It's
> substantially different than WiFi.
> But most of what we think of as Internet stuff (web surfing, dns, etc) all
> gets dumped into a single best effort ("BE"), class.
> The BE class is definitely badly bloated; I can't say how much because I
> don't really know yet; the test my colleague ran wasn't run long enough to
> be confident it filled the buffers).  But I will say worse than most cable
> modems I've seen.  I expect this will be true to different degrees on
> different hardware.  The other traffic classes haven't been tested yet for
> bufferbloat, though I suspect they will have it too.  I was told that those
> classes have much shorter queues, and when the grow, they dump the whole
> queues (because delivering late real time traffic is useless).  But trust
> *and* verify  Verification hasn't been done for anything but BE
> traffic, and that hasn't been quantified.
> But each device gets a "fair" shot at bandwidth in the cell (or sector of
> a cell; they run 3 radios in each cell), where fair is basically time
> based; if you are at the edge of a cell, you'll get a lot less bandwidth
> than someone near a tower; and this fairness is guaranteed by a scheduler
> than runs in the base station (called a b-nodeb, IIIRC).  So the base
> station guarantees some sort of "fairness" between devices (a place where
> Linux's wifi stack today fails utterly, since there is a single queue per
> device, rather than one per station).
> Whether there are bloat problems at the link level in LTE due to error
> correction I don't know yet; but it wouldn't surprise me; I know there was
> in 3g.  The people I talked to this morning aren't familiar with the HARQ
> layer in the system.
> The base s

Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

2013-02-28 Thread dpreed

It all started when CS departments decided they didn't need EE courses or 
affiliation with EE depts., and continued with the idea that digital 
communications had nothing to do with the folks who design the gear, so all you 
needed to know was the bit layouts of packets in memory to be a "network 
expert".
 
You can see this in the curricula at all levels. Cisco certifies network people 
who have never studied control theory, queueing theory, ..., and the phone 
companies certify communications engineers who have never run traceroute or 
ping, much less debugged the performance of a web-based UI.
 
Modularity is great.  But it comes at a cost.  Besides this kind of failure, 
it's the primary cause of security vulnerabilities.
 
-Original Message-
From: "Jim Gettys" 
Sent: Thursday, February 28, 2013 4:02pm
To: "David P Reed" 
Cc: "Dave Taht" , "cerowrt-devel@lists.bufferbloat.net" 

Subject: Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel 
for Android



In short, people who build hardware devices, or device drivers, don't 
understand TCP.
There is a first class education failure in all this.

We have yet to find almost any device that isn't bloated; the only question is 
how badly.
- Jim



On Thu, Feb 28, 2013 at 3:58 PM, <dpr...@reed.com> wrote:

At least someone actually saw what I've been seeing for years now in Metro area 
HSPA and LTE deployments.
 
As you know, when I first reported this on the e2e list I was told it could not 
possibly be happening and that I didn't know what I was talking about.  No one 
in the phone companies was even interested in replicating my experiments, just 
dismissing them.  It was sad.
 
However, I had the same experience on the original Honeywell 6180 dual CPU 
Multics deployment in about 1973.  One day all my benchmarks were running about 
5 times slower every other time I ran the code.  I suggested that one of the 
CPUs was running 5x slower, and it was probably due to the CPU cache being 
turned off.   The hardware engineer on site said that that was *impossible*.  
After 4 more hours of testing, I was sure I was right.  That evening, I got him 
to take the system down, and we hauled out an oscilloscope.  Sure enough, the 
gate that received the "cache hit" signal had died in one of the processors.   
The machine continued to run, since all that caused was for memory to be 
fetched every time, rather than using the cache.
 
Besides the value of finding the "root cause" of anomalies, the story points 
out that you really need to understand software and hardware sometimes.  The 
hardware engineer didn't understand the role of a cache, even though he fully 
understood timing margins, TTL logic, core memory (yes, this machine used core 
memory), etc.
 
We both understood oscilloscopes, fortunately.
 
In some ways this is like the LTE designers understanding TCP.   They don't.  
But sometimes you need to know about both in some depth.
 
Congratulations, Jim.  More Internet Plumbing Merit Badges for you.


 
-Original Message-
From: "Jim Gettys" <[mailto:j...@freedesktop.org] j...@freedesktop.org>
Sent: Thursday, February 28, 2013 3:03pm
To: "Dave Taht" <[mailto:dave.t...@gmail.com] dave.t...@gmail.com>
 Cc: "David P Reed" <[mailto:dpr...@reed.com] dpr...@reed.com>, 
"[mailto:cerowrt-devel@lists.bufferbloat.net] 
cerowrt-devel@lists.bufferbloat.net" 
<[mailto:cerowrt-devel@lists.bufferbloat.net] 
cerowrt-devel@lists.bufferbloat.net>
 Subject: Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel 
for Android



I've got a bit more insight into LTE than I did in the past, courtesy of the 
last couple days.
To begin with, LTE runs with several classes of service (the call them 
bearers).  Your VOIP traffic goes into one of them.
And I think there is another as well that is for guaranteed bit rate traffic.  
One transmit opportunity may have a bunch of chunks of data, and that data may 
be destined for more than one device (IIRC).  It's substantially different than 
WiFi.
But most of what we think of as Internet stuff (web surfing, dns, etc) all gets 
dumped into a single best effort ("BE"), class.
The BE class is definitely badly bloated; I can't say how much because I don't 
really know yet; the test my colleague ran wasn't run long enough to be 
confident it filled the buffers).  But I will say worse than most cable modems 
I've seen.  I expect this will be true to different degrees on different 
hardware.  The other traffic classes haven't been tested yet for bufferbloat, 
though I suspect they will have it too.  I was told that those classes have 
much shorter queues, and when the grow, they dump the whole queues (because 
delivering late real time traffic is useless).  

Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

2013-03-01 Thread Ketan Kulkarni
On Fri, Mar 1, 2013 at 1:33 AM, Jim Gettys  wrote:

> I've got a bit more insight into LTE than I did in the past, courtesy of
> the last couple days.
>
> To begin with, LTE runs with several classes of service (the call them
> bearers).  Your VOIP traffic goes into one of them.
> And I think there is another as well that is for guaranteed bit rate
> traffic.  One transmit opportunity may have a bunch of chunks of data, and
> that data may be destined for more than one device (IIRC).  It's
> substantially different than WiFi.
>

Just thought I'd shed some more light on the bearer stuff:

There are two ways bearers are set up:
1. UE initiated - where the User Equipment sets up the "parameters" for the
bearer
2. Network initiated - where nodes like the PCRF and PGW set up the
"parameters".
Parameters include the guaranteed bit rates and maximum bit rates. Something
called a QCI (QoS Class Identifier) is associated with each bearer. The QCI
parameters are authorized at the PCRF (Policy and Charging Rules Function),
and there is a certain mapping maintained at either the PCRF or the PGW
between QCI values and DSCP and MBRs.
Enforcement of these parameters is done at the PGW (in that role it is termed
the PCEF - Policy and Charging Enforcement Function). So PGWs, depending on
the bearer, can certainly modify DSCP bits, though these can also be modified
by other nodes in the network.

There are two types of bearers:
1. Dedicated bearers - to carry traffic which needs "special" treatment
2. Default or general purpose bearers - to carry all general purpose data.
So generally VoIP and streaming video are passed over dedicated bearers,
which (generally) get higher GBRs, MBRs and the correct DSCP markings, and
other non-latency-sensitive traffic follows the default bearer.

The theoretical limit on the number of bearers is 11, though in practice most
deployments use at most 3.

Note that these parameters may very well vary based on subscriber profiles;
premium/corporate subscribers can well have higher GBRs and MBRs.
ISPs are generally very sensitive to correct markings at the gateways, for
obvious reasons.
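
To make the QCI-to-DSCP idea concrete, a toy mapping might look like the
sketch below. The QCI semantics roughly follow the usual 3GPP table, but the
DSCP values and the helper function are purely illustrative - real mappings
are operator policy.

from dataclasses import dataclass

@dataclass
class QciProfile:
    resource_type: str    # "GBR" or "Non-GBR"
    example_service: str
    dscp: int             # illustrative DSCP; real values are operator policy

QCI_TABLE = {
    1: QciProfile("GBR", "conversational voice (VoIP)", 46),      # EF
    2: QciProfile("GBR", "conversational video", 34),             # AF41
    5: QciProfile("Non-GBR", "IMS signalling", 40),               # CS5
    8: QciProfile("Non-GBR", "default bearer / best effort", 0),  # BE
    9: QciProfile("Non-GBR", "default bearer / best effort", 0),  # BE
}

def gateway_dscp(qci: int) -> int:
    """DSCP a PGW/PCEF-like gateway might set for a bearer's traffic."""
    return QCI_TABLE.get(qci, QCI_TABLE[9]).dscp   # unknown QCI -> best effort

if __name__ == "__main__":
    for qci, prof in sorted(QCI_TABLE.items()):
        print(f"QCI {qci}: {prof.resource_type:7} {prof.example_service:32} "
              f"-> DSCP {gateway_dscp(qci)}")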


> But most of what we think of as Internet stuff (web surfing, dns, etc) all
> gets dumped into a single best effort ("BE"), class.
>
> The BE class is definitely badly bloated; I can't say how much because I
> don't really know yet; the test my colleague ran wasn't run long enough to
> be confident it filled the buffers).  But I will say worse than most cable
> modems I've seen.  I expect this will be true to different degrees on
> different hardware.  The other traffic classes haven't been tested yet for
> bufferbloat, though I suspect they will have it too.  I was told that those
> classes have much shorter queues, and when the grow, they dump the whole
> queues (because delivering late real time traffic is useless).  But trust
> *and* verify  Verification hasn't been done for anything but BE
> traffic, and that hasn't been quantified.
>
> But each device gets a "fair" shot at bandwidth in the cell (or sector of
> a cell; they run 3 radios in each cell), where fair is basically time
> based; if you are at the edge of a cell, you'll get a lot less bandwidth
> than someone near a tower; and this fairness is guaranteed by a scheduler
> than runs in the base station (called a b-nodeb, IIIRC).  So the base
> station guarantees some sort of "fairness" between devices (a place where
> Linux's wifi stack today fails utterly, since there is a single queue per
> device, rather than one per station).
>
> Whether there are bloat problems at the link level in LTE due to error
> correction I don't know yet; but it wouldn't surprise me; I know there was
> in 3g.  The people I talked to this morning aren't familiar with the HARQ
> layer in the system.
>
> The base stations are complicated beasts; they have both a linux system in
> them as well as a real time operating system based device inside  We don't
> know where the bottle neck(s) are yet.  I spent lunch upping their paranoia
> and getting them through some conceptual hurdles (e.g. multiple bottlenecks
> that may move, and the like).  They will try to get me some of the data so
> I can help them figure it out.  I don't know if the data flow goes through
> the linux system in the bnodeb or not, for example.
>
> Most carriers are now trying to ensure that their backhauls from the base
> station are never congested, though that is another known source of
> problems.  And then there is the lack of AQM at peering point routers
>  You'd think they might run WRED there, but many/most do not.
>  - Jim
>
>
>
>
>
> On Thu, Feb 28, 2013 at 2:08 PM, Dave Taht  wrote:
>
>>
>>
>> On Thu, Feb 28, 2013 at 1:57 PM,  wrote:
>>
>>> Doesn't fq_codel need an estimate of link capacity?
>>>
>>
>> No, it just measures delay. Since so far as I know the outgoing portion
>> of LTE is not soft-rate limited, but sensitive to the actual available link
>> bandwidth, fq_codel should work pretty good (if the underlying

Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

2013-03-01 Thread dpreed

One wonders why all this complexity is necessary, and how likely it is to be 
"well tuned" by operators and their contract installers.
 
I'm willing to bet $1000 that all the testing that is done is "Can you hear me 
now" and a "speed test".  Not even something as simple and effective as RRUL.
 
-Original Message-
From: "Ketan Kulkarni" 
Sent: Friday, March 1, 2013 3:00am
To: "Jim Gettys" 
Cc: "cerowrt-devel@lists.bufferbloat.net" 
Subject: Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel 
for Android





On Fri, Mar 1, 2013 at 1:33 AM, Jim Gettys <j...@freedesktop.org> wrote:

I've got a bit more insight into LTE than I did in the past, courtesy of the 
last couple days.
To begin with, LTE runs with several classes of service (the call them 
bearers).  Your VOIP traffic goes into one of them.
And I think there is another as well that is for guaranteed bit rate traffic.  
One transmit opportunity may have a bunch of chunks of data, and that data may 
be destined for more than one device (IIRC).  It's substantially different than 
WiFi.

Just thought to put more light on bearer stuff:

There are two ways bearers are setup: 
1. UE initiated - where User Equipment sets-up the "parameters" for bearer 
 2. Network initiated - where node like PCRF and PGW sets-up the "parameters". 
Parameters include the Guaranteed bit-rates, maximum bit-rates. Something 
called QCI is associated with bearers. The QCI parameters are authorized at 
PCRF (policy control rule function) and there is certain mapping maintained at 
either PCRF or PGW between QCI values and DSCP and MBRs.
 These parameters enforcing is done at PGW (in such case it is termed as PCEF - 
policy and rule enforcement function). So PGWs depending on bearers can 
certainly modify dscp bits. Though these can be modified by other nodes in the 
network. 

There are two types of bearers: 1. Dedicated bearers - to carry traffic which 
need "special" treatment 2. Default or general pupose bearers - to carry all 
general purpose data.
So generally the voip, streaming videos are passed over dedicated bearers and 
apply (generally) higher GBRs, MBRs and correct dscp markings.
 And other non-latency sensitive traffic generally follows the default bearer.

Theoretical limit on maximum bearers is 11 though practically most of the 
deployments use upto 3 bearers max.

Note that these parameters may very well very based on the subscriber profiles. 
Premium/Corporate subscribers can well have more GBRs and MBRs.
 ISPs are generally very much sensitive to the correct markings at gateways for 
obvious reasons.



But most of what we think of as Internet stuff (web surfing, dns, etc) all gets 
dumped into a single best effort ("BE"), class.
The BE class is definitely badly bloated; I can't say how much because I don't 
really know yet; the test my colleague ran wasn't run long enough to be 
confident it filled the buffers).  But I will say worse than most cable modems 
I've seen.  I expect this will be true to different degrees on different 
hardware.  The other traffic classes haven't been tested yet for bufferbloat, 
though I suspect they will have it too.  I was told that those classes have 
much shorter queues, and when the grow, they dump the whole queues (because 
delivering late real time traffic is useless).  But trust *and* verify  
Verification hasn't been done for anything but BE traffic, and that hasn't been 
quantified.
But each device gets a "fair" shot at bandwidth in the cell (or sector of a 
cell; they run 3 radios in each cell), where fair is basically time based; if 
you are at the edge of a cell, you'll get a lot less bandwidth than someone 
near a tower; and this fairness is guaranteed by a scheduler than runs in the 
base station (called a b-nodeb, IIIRC).  So the base station guarantees some 
sort of "fairness" between devices (a place where Linux's wifi stack today 
fails utterly, since there is a single queue per device, rather than one per 
station).
Whether there are bloat problems at the link level in LTE due to error 
correction I don't know yet; but it wouldn't surprise me; I know there was in 
3g.  The people I talked to this morning aren't familiar with the HARQ layer in 
the system.
The base stations are complicated beasts; they have both a linux system in them 
as well as a real time operating system based device inside  We don't know 
where the bottle neck(s) are yet.  I spent lunch upping their paranoia and 
getting them through some conceptual hurdles (e.g. multiple bottlenecks that 
may move, and the like).  They will try to get me some of the data so I can 
help them figure it out.  I don't know if the data flow goes through the linux 
system in the bnodeb or not, f

Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

2013-03-01 Thread Jim Gettys
On Fri, Mar 1, 2013 at 10:40 AM,  wrote:

> One wonders why all this complexity is necessary, and how likely it is to
> be "well tuned" by operators and their contract installers.
>
>
>
> I'm willing to bet $1000 that all the testing that is done is "Can you
> hear me now" and a "speed test".  Not even something as simple and
> effective as RRUL.
>

Actually, at least some of the carriers do much more extensive testing,
but not with the test tools we would like to see used (yet).

An example is AT&T, where, in research, KK Ramakrishnan has a van with 20 or
so laptops so he can go driving around and load up a cell in the middle of
the night and get data.  And he's in research; the operations guys do lots of
testing, I gather, but more at the radio level.

Next up, is to educate KK to run RRUL.

And in my own company, I've seen data, but it is too high level: e.g.
performance of "web" video: e.g. Silverlight, Flash, YouTube, etc.

A common disease that has complicated all this is the propensity for
companies to use Windows XP internally for everything: since window scaling
is turned off, you can't saturate an LTE link the way you might like to
with a single TCP connection.
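
The arithmetic behind that: without window scaling the TCP window tops out
at 64 KB, so a single flow is capped at window/RTT regardless of link speed
(the RTT below is just an example value):

# Single TCP flow, no window scaling: throughput <= window / RTT.
MAX_WINDOW_BYTES = 65535    # 16-bit window, window scaling (RFC 1323) off
RTT_SECONDS = 0.060         # assumed 60 ms round trip (example value)

max_bps = MAX_WINDOW_BYTES * 8 / RTT_SECONDS
print(f"~{max_bps / 1e6:.1f} Mbit/s max per connection")   # ~8.7 Mbit/s
# Well under LTE capacity, so such a test never fills the buffers and the
# bloat goes unnoticed.
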
  - Jim




>
>
> -Original Message-
> From: "Ketan Kulkarni" 
> Sent: Friday, March 1, 2013 3:00am
> To: "Jim Gettys" 
> Cc: "cerowrt-devel@lists.bufferbloat.net" <
> cerowrt-devel@lists.bufferbloat.net>
> Subject: Re: [Cerowrt-devel] Google working on experimental 3.8 Linux
> kernel for Android
>
>
>
> On Fri, Mar 1, 2013 at 1:33 AM, Jim Gettys  wrote:
>
>> I've got a bit more insight into LTE than I did in the past, courtesy of
>> the last couple days.
>> To begin with, LTE runs with several classes of service (the call them
>> bearers).  Your VOIP traffic goes into one of them.
>> And I think there is another as well that is for guaranteed bit rate
>> traffic.  One transmit opportunity may have a bunch of chunks of data, and
>> that data may be destined for more than one device (IIRC).  It's
>> substantially different than WiFi.
>>
>  Just thought to put more light on bearer stuff:
>
> There are two ways bearers are setup:
> 1. UE initiated - where User Equipment sets-up the "parameters" for bearer
> 2. Network initiated - where node like PCRF and PGW sets-up the
> "parameters".
> Parameters include the Guaranteed bit-rates, maximum bit-rates. Something
> called QCI is associated with bearers. The QCI parameters are authorized at
> PCRF (policy control rule function) and there is certain mapping maintained
> at either PCRF or PGW between QCI values and DSCP and MBRs.
> These parameters enforcing is done at PGW (in such case it is termed as
> PCEF - policy and rule enforcement function). So PGWs depending on bearers
> can certainly modify dscp bits. Though these can be modified by other nodes
> in the network.
>
> There are two types of bearers: 1. Dedicated bearers - to carry traffic
> which need "special" treatment 2. Default or general pupose bearers - to
> carry all general purpose data.
> So generally the voip, streaming videos are passed over dedicated bearers
> and apply (generally) higher GBRs, MBRs and correct dscp markings.
> And other non-latency sensitive traffic generally follows the default
> bearer.
>
> Theoretical limit on maximum bearers is 11 though practically most of the
> deployments use upto 3 bearers max.
>
> Note that these parameters may very well very based on the subscriber
> profiles. Premium/Corporate subscribers can well have more GBRs and MBRs.
> ISPs are generally very much sensitive to the correct markings at gateways
> for obvious reasons.
>
>  But most of what we think of as Internet stuff (web surfing, dns, etc)
>> all gets dumped into a single best effort ("BE"), class.
>> The BE class is definitely badly bloated; I can't say how much because I
>> don't really know yet; the test my colleague ran wasn't run long enough to
>> be confident it filled the buffers).  But I will say worse than most cable
>> modems I've seen.  I expect this will be true to different degrees on
>> different hardware.  The other traffic classes haven't been tested yet for
>> bufferbloat, though I suspect they will have it too.  I was told that those
>> classes have much shorter queues, and when the grow, they dump the whole
>> queues (because delivering late real time traffic is useless).  But trust
>> *and* verify  Verification hasn't been done for anything but BE
>> traffic, and that hasn't been quantif

Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

2013-03-01 Thread dpreed

I don't doubt that they test.  My point was different - there are too many
knobs and too big a parameter space to test effectively.  And that's the point.
 
I realize that it's extremely fun to invent parameters in "standards 
organizations" like 3GPP.  Everybody has their own favorite knob, and a great 
rationale for some unusual, but critically "important" customer requirement 
that might come up some day.  Hell, Linux has a gazillion (yes, that's a 
technical term in mathematics!) parameters, almost none of which are touched.  
This reflects the fact that nothing ever gets removed once added.  LTE is now 
going into release 12, and it's completely ramified into "solutions" to 
problems that will never be fixed in the field with those solutions.  It's 
great for European Publicly Funded Academic-Industry research - lots for
those "Professors" to claim they invented.
 
I've worked with telco contractors in the field.   They don't read manuals, and 
they don't read specs.  They have a job to do, and so much money to spend, and 
time's a wasting.  They don't even work for Verizon or ATT.  They follow 
"specs" handed down, and charge more if you tell them that the specs have 
changed.
 
This is not how brand-new systems get tuned.
 
It's a Clown Circus out there, and more parameters don't help.
 
This is why "more buffering is better" continues to be the law of the land - 
the spec is defined to be "no lost packets under load".   I'm sure that the 
primary measure under load for RRUL will be "no lost packets" by the time it 
gets to field engineers in the form of "specs" - because that's what they've 
*always* been told, and they will disregard any changes as "typos".
 
A system with more than two control parameters that interact in complex ways is 
ungovernable - and no control parameters in LTE are "orthogonal", much less 
"linear" in their interaction.
 
 
 
-----Original Message-
From: "Jim Gettys" 
Sent: Friday, March 1, 2013 11:09am
To: "David P Reed" 
Cc: "Ketan Kulkarni" , 
"cerowrt-devel@lists.bufferbloat.net" 
Subject: Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel 
for Android








On Fri, Mar 1, 2013 at 10:40 AM, <dpr...@reed.com> wrote:

One wonders why all this complexity is necessary, and how likely it is to be 
"well tuned" by operators and their contract installers.
 
I'm willing to bet $1000 that all the testing that is done is "Can you hear me 
now" and a "speed test".  Not even something as simple and effective as RRUL.
Actually, at least some the the carriers do much more extensive testing; but 
not with the test tools we would like to see used (yet).
An example is AT&T, where in research, KK Ramakrishnan has a van with 20 or so 
laptops so he can go driving around and load up a cell in the middle of the 
night and get data.   And he's research; the operations guys do lots of testing 
I gather, but more at the radio level.
Next up, is to educate KK to run RRUL.
And in my own company, I've seen data, but it is too high level: e.g. 
performance of "web" video: e.g. siverlight, flash, youtube, etc.
A common disease that has complicated all this is the propensity for companies 
to use Windows XP internally for everything: since window scaling is turned 
off, you can't saturate a LTE link the way you might like to do with a single 
TCP connection.
- Jim



 
-Original Message-
From: "Ketan Kulkarni" <[mailto:ketku...@gmail.com] ketku...@gmail.com>
Sent: Friday, March 1, 2013 3:00am
To: "Jim Gettys" <[mailto:j...@freedesktop.org] j...@freedesktop.org>
 Cc: "[mailto:cerowrt-devel@lists.bufferbloat.net] 
cerowrt-devel@lists.bufferbloat.net" 
<[mailto:cerowrt-devel@lists.bufferbloat.net] 
cerowrt-devel@lists.bufferbloat.net>
 Subject: Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel 
for Android





On Fri, Mar 1, 2013 at 1:33 AM, Jim Gettys <j...@freedesktop.org> wrote:

I've got a bit more insight into LTE than I did in the past, courtesy of the 
last couple days.
To begin with, LTE runs with several classes of service (the call them 
bearers).  Your VOIP traffic goes into one of them.
And I think there is another as well that is for guaranteed bit rate traffic.  
One transmit opportunity may have a bunch of chunks of data, and that data may 
be destined for more than one device (IIRC).  It's substantially different than 
WiFi.
Just thought to put more light on bearer stuff:

There are two ways bearers are setup: 
1. UE initiated - where User Equipment sets-up the "parameters" for bearer 
 2.

Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

2013-03-01 Thread Ketan Kulkarni
hey invented.
>
>
>
> I've worked with telco contractors in the field.   They don't read
> manuals, and they don't read specs.  They have a job to do, and so much
> money to spend, and time's a wasting.  They don't even work for Verizon or
> ATT.  They follow "specs" handed down, and charge more if you tell them
> that the specs have changed.
>
>
>
> This is not how brand-new systems get tuned.
>
>
>
> It's a Clown Circus out there, and more parameters don't help.
>
>
>
> This is why "more buffering is better" continues to be the law of the land
> - the spec is defined to be "no lost packets under load".   I'm sure that
> the primary measure under load for RRUL will be "no lost packets" by the
> time it gets to field engineers in the form of "specs" - because that's
> what they've *always* been told, and they will disregard any changes as
> "typos".
>
>
>
> A system with more than two control parameters that interact in complex
> ways is ungovernable - and no control parameters in LTE are "orthogonal",
> much less "linear" in their interaction.
>
>
>
>
>
>
>
> -Original Message-
> From: "Jim Gettys" 
> Sent: Friday, March 1, 2013 11:09am
> To: "David P Reed" 
> Cc: "Ketan Kulkarni" , "
> cerowrt-devel@lists.bufferbloat.net" 
> Subject: Re: [Cerowrt-devel] Google working on experimental 3.8 Linux
> kernel for Android
>
>
>
>
> On Fri, Mar 1, 2013 at 10:40 AM,  wrote:
>
>> One wonders why all this complexity is necessary, and how likely it is to
>> be "well tuned" by operators and their contract installers.
>>
>>
>>
>> I'm willing to bet $1000 that all the testing that is done is "Can you
>> hear me now" and a "speed test".  Not even something as simple and
>> effective as RRUL.
>>
> Actually, at least some the the carriers do much more extensive testing;
> but not with the test tools we would like to see used (yet).
> An example is AT&T, where in research, KK Ramakrishnan has a van with 20
> or so laptops so he can go driving around and load up a cell in the middle
> of the night and get data.   And he's research; the operations guys do lots
> of testing I gather, but more at the radio level.
> Next up, is to educate KK to run RRUL.
> And in my own company, I've seen data, but it is too high level: e.g.
> performance of "web" video: e.g. siverlight, flash, youtube, etc.
> A common disease that has complicated all this is the propensity for
> companies to use Windows XP internally for everything: since window scaling
> is turned off, you can't saturate a LTE link the way you might like to do
> with a single TCP connection.
> - Jim
>
>>
>>
>> -Original Message-
>> From: "Ketan Kulkarni" 
>> Sent: Friday, March 1, 2013 3:00am
>> To: "Jim Gettys" 
>> Cc: "cerowrt-devel@lists.bufferbloat.net" <
>> cerowrt-devel@lists.bufferbloat.net>
>> Subject: Re: [Cerowrt-devel] Google working on experimental 3.8 Linux
>> kernel for Android
>>
>>
>>
>> On Fri, Mar 1, 2013 at 1:33 AM, Jim Gettys  wrote:
>>
>>> I've got a bit more insight into LTE than I did in the past, courtesy of
>>> the last couple days.
>>> To begin with, LTE runs with several classes of service (the call them
>>> bearers).  Your VOIP traffic goes into one of them.
>>> And I think there is another as well that is for guaranteed bit rate
>>> traffic.  One transmit opportunity may have a bunch of chunks of data, and
>>> that data may be destined for more than one device (IIRC).  It's
>>> substantially different than WiFi.
>>>
>> Just thought to put more light on bearer stuff:
>>
>> There are two ways bearers are setup:
>> 1. UE initiated - where User Equipment sets-up the "parameters" for
>> bearer
>> 2. Network initiated - where node like PCRF and PGW sets-up the
>> "parameters".
>> Parameters include the Guaranteed bit-rates, maximum bit-rates. Something
>> called QCI is associated with bearers. The QCI parameters are authorized at
>> PCRF (policy control rule function) and there is certain mapping maintained
>> at either PCRF or PGW between QCI values and DSCP and MBRs.
>> These parameters enforcing is done at PGW (in such case it is termed as
>> PCEF - policy and rule enforcement function). So PGWs depending on bearers
>> can certainly modify

Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

2013-03-02 Thread William Allen Simpson

On 3/1/13 10:39 PM, Ketan Kulkarni wrote:

Consider from end-user perspective, getting a voice call while 
surfing/downloading on 2G/3G interrupts all the download and it is annoying.


Ummm, this isn't entirely accurate.  When Karn and I designed CDMA IS-99
circa '93 -'94 with data in the control channel, data never stopped
during voice calls.

Maybe some versions of 2G/3G couldn't do that, but better versions ;-)



Similarly going ahead we might very well have handoff from wifi to LTE - why 
not?


Agreed.  But again, we've known how to do soft hand-off for a long time.



On Fri, Mar 1, 2013 at 9:57 PM, <dpr...@reed.com> wrote:
This is why "more buffering is better" continues to be the law of the land - the spec is 
defined to be "no lost packets under load".   I'm sure that the primary measure under load for RRUL 
will be "no lost packets" by the time it gets to field
engineers in the form of "specs" - because that's what they've *always* been told, 
and they will disregard any changes as "typos".


We've had this problem with bell-heads forever.  Even back in the days
with heavy packet loss at MAE-East, bell-heads would continue to
insist that any packet loss was an alarm condition.  Even after PPP
LQM showed they mangled bits and bytes on even their most prized T3
links (and had been lying to the FCC about uptime for decades), we
never could shake off the syndrome.

It's the "every bit is sacred" mentality.

___
Cerowrt-devel mailing list
Cerowrt-devel@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cerowrt-devel


Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel for Android

2013-03-02 Thread dpreed

Hi Ketan -
 
It is possible for good architects to simplify rather than to ramify.  It takes 
clear understanding of the system as a whole, a unifying perspective, and a 
goal to make the system work extremely well and simply.
 
One of the key insights into how to do this was the choice of features included 
in the IP "layer" of the Internet stack.  That is - almost none.  And if you 
read history, the IP layer got simpler as features (like TOS) that had no 
sensible definition were de facto deprecated, by failure to be utilized for any 
useful purpose.
 
This process works for well-architected abstractions.
 
And it is why the original Internet team included people with expertise in 
radio networks, shared-medium LANs, etc., end-to-end cryptographic security and 
authentication, as well as people who understood the properties of voice 
conversation codecs, etc.  "Features" for those were not "omitted" - they were, 
instead, carefully thought through.
 
Since the goal of IP was to operate a completely technologically heterogeneous 
and application heterogeneous universal network, one where new technologies 
could be introduced without change,  those were the most serious issues.
 
Yet relative simplicity was achieved.
 
This contrasts with the process of 3GPP and the wireless industry.  Bad designs 
that specifically focus on one application (voip) without abstraction or 
generalization, or specifically bind-in properties of one specific "bearer" 
technology (circuits, scheduling algorithms), fail utterly to even fit new 
situations that come up as a matter of course.
 
This is why LTE (just deployed) is on version 12, and IP is on version 4 (where 
the first 3 never were deployed...), and moving to 6, which is almost exactly 
the same as 4.
 
And I claim (it's no longer a radical claim - we see the same successes in 
other areas of architecture), that the principle of well-thought-through 
simplicity is the *only* quality architectural approach.
 
However, there are opposing views.  The latest of these is the absolute 
obsession with finding "cross layer" optimizations, and making a religion of 
protocol cross-layer "features".  Which of course makes such architectures 
"optimal" in a sense, for one instant of time, one particular technology, and 
one Professor's "career".
 
However, nothing is more likely than change.  And "cross layer" ideas 
essentially blow up any potential for change.
 
So it's easy to predict that LTE is a bloody disaster in the making.  It's a 
dysfunctional system designed by a committee, driven by egos and equipment 
vendors' desire to "differentiate" merely to partition the market. (why else do 
no cellphones that support LTE work on any other operators' LTE network?  
That's on purpose - it's a marketing requirement!
 
So a protocol that started out to use a very nice innovation (OFDM) is now a 
turkey.
 
It has a life, but not because of all those "features".  In all likelihood, 
all those "features" will eventually make it so toxic that it will be replaced 
quickly (not by another cellular operator-centric protocol but by something 
quite different that unifies fiber and wireless in "local" deployments, if I 
were to bet).
 
-Original Message-
From: "Ketan Kulkarni" 
Sent: Friday, March 1, 2013 10:39pm
To: dpr...@reed.com
Cc: "Jim Gettys" , "cerowrt-devel@lists.bufferbloat.net" 
, "bloat" 
Subject: Re: [Cerowrt-devel] Google working on experimental 3.8 Linux kernel 
for Android



Hi David,

While I tend to agree with most of this, the complexity and the many knobs in 
mobile networks do come with the added technology.

Consider it from the end-user perspective: getting a voice call while 
surfing/downloading on 2G/3G interrupts the whole download, and that is 
annoying. So when LTE provides a spec to handle VoIP + internet 
simultaneously, it's a great benefit to the end user.
While roaming in LTE and moving to a 2G/3G network or vice versa, the handoff 
occurs seamlessly and the internet traffic is not interrupted. This was not 
the case in previous mobile generations. The end user is more satisfied, as it 
relates to the daily usage of mobile phones.
 Similarly going ahead we might very well have handoff from wifi to LTE - why 
not? 

Now, for (non-technical) mobile users, these are good and simple-to-have 
features. But from the network's perspective, where and how will this 
complexity be handled? Definitely some nodes in the network will have to worry 
about LTE, UMTS, CDMA, eHRPD and what not.
This gives some idea of how complex the network really looks:
http://www.trilliumposter.com/lte.php

From the mobile ISPs' perspective, they invest heavy amounts in getting 
channel licenses from governments. It takes years to co