Hi All,
 
With the recent release of the Linux 3.5 kernel[1], the sch_fq_codel packet 
scheduler has been pulled into the mainline kernel. For those unfamiliar, 
this is the "Fair Queue Controlled Delay Active Queue Management" algorithm 
(hereafter referred to as "fq_codel"). It is based on the CoDel algorithm 
described in an IETF draft[2]; Eric Dumazet authored the Linux kernel 
implementation and added the "Fair Queue" component.
 
The purpose of fq_codel is to schedule packets in an "intelligent" way to 
enhance the user experience in common use cases: the goals are to maximize 
bandwidth utilization; minimize buffer bloat; allow small requests such as 
small HTTP downloads, VoIP packets, etc. to quickly get across the network 
while still allowing large requests such as streaming video to use a lot of 
bandwidth and use the buffering to compensate for jitter; and to 
efficiently address problems introduced by link asymmetry (e.g. faster 
downstream than upstream).
 
On a common Android device connected to a 3G/4G cellular network, there are 
many sources of buffer bloat, and many different areas where controlling 
buffer sizes would be useful. The Linux kernel fq_codel implementation 
cannot address all of these areas: the cellular baseband processor, for 
example, tends not to run Linux, yet it undoubtedly does some significant 
buffering of its own. But a very significant part of that buffer bloat 
comes from the kernel IP stack's queues, which live in the operating system 
kernel rather than on the baseband.
 
The observed result, which most people on a cellular network should be able 
to reproduce, is that there is excessive buffering, leading to large delays 
in sending/receiving packets, when the network is 
saturated. A user-friendly way to observe this result is to run the ICSI 
Netalyzr test[3], but you'll either need a Java Applet environment on your 
Android device, or you will need to run the applet from another device that 
is networked with your phone. If you choose the latter, it is recommended 
that you enable fq_codel on the tethered device, and use a wired connection 
such as Ethernet or USB, to minimize the impact of adding a hop to the 
connection. If for some reason you can't run the Netalyzr test, you can 
also try the codel HOWTO test[4], but it is not necessary to run the 
ethtool command if you are able to easily saturate your link -- seeing how 
most cellular broadband links are quite slow, just sending random data over 
ssh to a dedicated server should give you a sufficiently saturated link for 
several seconds, long enough to run the test, which is basically a 
continuous ping while sending as much data as possible to another machine 
over ssh.
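
To compare the idle and saturated cases from the ping test described above, 
something like the following Python sketch could be used to summarize the 
round-trip times. It assumes the common Linux `ping` output format with a 
"time=... ms" field on each reply line; the function names and the sample 
lines are made up for illustration and are not real measurements.

```python
import re
import statistics

# Matches the "time=62.1 ms" field printed by common ping implementations.
_RTT_RE = re.compile(r"time[=<]([\d.]+)\s*ms")


def parse_rtts(ping_output):
    """Return every round-trip time (in ms) found in raw ping output."""
    return [float(m.group(1)) for m in _RTT_RE.finditer(ping_output)]


def summarize(rtts):
    """Return min / median / max of a non-empty list of RTTs."""
    if not rtts:
        raise ValueError("no RTT samples found")
    return {
        "min": min(rtts),
        "median": statistics.median(rtts),
        "max": max(rtts),
    }


if __name__ == "__main__":
    # Hypothetical sample lines, only to show the expected input shape.
    sample = (
        "64 bytes from 192.0.2.1: icmp_seq=1 ttl=54 time=62.1 ms\n"
        "64 bytes from 192.0.2.1: icmp_seq=2 ttl=54 time=2950 ms\n"
    )
    print(summarize(parse_rtts(sample)))
```

Capture one ping run while the link is idle and one while the ssh transfer 
is running, then compare the two summaries.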
 
The problem is that, with the current scheduling algorithm of pfifo_fast, 
it is very easy to introduce unreasonable latencies on the cellular 
connection (or really, any other connection you can saturate, e.g. slow 
wifi links), and it is often out of the user's control, because Android 
apps routinely start network transfers in the background without asking.
 
For example, you could be playing a real time multiplayer game on your 
phone, frequently sending and receiving tiny bits of data over UDP. Then 
your Play Store randomly decides to check for updates and finds that a 20 
MB program (Google Chrome, say) needs to be updated. It starts downloading. 
Your latency within the game will skyrocket from typical 
unsaturated network latencies of ~50 to 200ms (depending on the cellular 
tech used, e.g. EvDO vs LTE) to something on the order of 3000 ms, ruining 
your experience. This is an extremely repeatable test and it transcends all 
barriers of device manufacturer, hardware, signal strength, carrier, and so 
on.
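
To put the ~3000 ms figure in perspective, here is a back-of-the-envelope 
Python sketch relating queueing delay to the amount of data sitting in 
buffers ahead of a packet. The 1 Mbit/s link rate is an assumed value 
chosen for illustration only; it is not a measurement, and the helper names 
are hypothetical.

```python
def queue_delay_ms(buffered_bytes, link_bits_per_sec):
    """Time (ms) a newly queued packet waits behind `buffered_bytes`."""
    return buffered_bytes * 8 / link_bits_per_sec * 1000


def buffered_bytes_for_delay(delay_ms, link_bits_per_sec):
    """Amount of queued data (bytes) that explains a given delay."""
    return delay_ms / 1000 * link_bits_per_sec / 8


if __name__ == "__main__":
    link = 1_000_000  # assumed 1 Mbit/s bottleneck, illustration only
    print(buffered_bytes_for_delay(3000, link), "bytes queued")
    print(queue_delay_ms(375_000, link), "ms of delay")
```

Under that assumption, a 3000 ms delay corresponds to roughly 375 KB of 
data queued in front of each game packet.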
 
The reason for this is that the "pfifo_fast" scheduling algorithm -- like 
most others in the Linux kernel other than fq_codel -- either makes NO 
attempt to reduce/control buffer sizes, makes attempts that are ultimately 
ineffective, or requires a lot of manual tuning to be even remotely 
successful at doing so.
 
Another example scenario: the user is on a VoIP call (Google Voice, Skype, 
etc) using only a modest amount of bandwidth on the network, say, 10 or 20% 
(the compression on these services is amazing). During the call the user 
has the impulse to download a large file, say a Word document or PDF, over 
the internet using the Browser app. A few seconds after the download 
starts, the queues explode in size, and the latencies make it impossible 
for the time-sensitive VoIP packets to reach their destination in time. As 
a result, many of the packets that are received have to be discarded, and 
you end up with a lot of jitter and re-sends, which just further compounds 
the problem by further saturating the network and making the buffers even 
larger, which creates a positive feedback loop. Within a minute, 9 times 
out of 10, the VoIP call will completely drop because one or both ends will 
have timed out from being unable to get a packet across the wire within the 
allowed time slot.
 
I would like to propose a two-stage approach to combatting buffer bloat on 
cellular networks. One of the stages directly involves the Android kernel; 
the other stage does not.
 
Stage 1: The Android kernel should either backport the fq_codel code to 
current kernel trees, or else upgrade to at least the 3.5 kernel. Once the 
fq_codel code is integrated into the Android kernel trees, one way or 
another, it then has to be enabled by default. It is recommended to compile 
it built-in to the kernel, since, once it's enabled, it'd be used all of 
the time. This can be accomplished by setting CONFIG_NET_SCH_CODEL=y and 
CONFIG_NET_SCH_FQ_CODEL=y. Then, it has to be enabled at runtime for the 
relevant network interfaces using the `tc qdisc` command (see [4] for 
details).
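
As a rough sketch of the two steps above, the Python below checks a kernel 
config for the two options and builds the `tc qdisc` command line for one 
interface. The helper names are hypothetical, and "rmnet0" is only an 
assumed example of a cellular interface name; the real name depends on the 
device. The sketch only builds the command and does not run it.

```python
import shlex

# The two options named above; both should be built in (=y).
REQUIRED_OPTIONS = ("CONFIG_NET_SCH_CODEL", "CONFIG_NET_SCH_FQ_CODEL")


def missing_builtin_options(config_text):
    """Return the required options that are not set to 'y' in a .config."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if line.endswith("=y"):
            enabled.add(line[:-2])
    return [opt for opt in REQUIRED_OPTIONS if opt not in enabled]


def fq_codel_command(interface):
    """Build the tc argv that makes fq_codel the root qdisc of `interface`."""
    return ["tc", "qdisc", "replace", "dev", interface, "root", "fq_codel"]


if __name__ == "__main__":
    config = "CONFIG_NET_SCH_CODEL=y\n# CONFIG_NET_SCH_FQ_CODEL is not set\n"
    print("missing:", missing_builtin_options(config))
    # "rmnet0" is an assumed interface name, for illustration only.
    print(shlex.join(fq_codel_command("rmnet0")))
```

Running the resulting command needs root, and `tc qdisc show dev <iface>` 
can be used afterwards to confirm which qdisc is active.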
 
Once all of this is accomplished, there should be a measurable reduction in 
latency over saturated links. Note that saturation occurs very quickly for 
protocols that are not very "chatty"; for instance, HTTP downloads. Even 
for a small download on the order of 1 - 2 MB, the network is saturated for 
a fraction of a second, and that can be enough to cause buffers to start 
bloating. Also, the slower the connection, the easier it is to saturate, 
because it is more likely that the requested bandwidth is available on the 
remote server. So this is less of a problem for upcoming networks like LTE 
Advanced, where speeds of up to 100 Mbps are expected -- it is not very 
common to be able to max out a 100 Mbps connection when downloading from an 
arbitrary server over the public Internet. But a 3G or 2G connection is 
basically always saturated whenever the device is doing almost anything 
with the network.
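
The link between connection speed and how long a download can keep the 
link saturated is simple arithmetic. The Python sketch below uses assumed 
link rates (1 Mbit/s and 100 Mbit/s) for illustration and ignores TCP 
ramp-up and protocol overhead, so it gives a lower bound on transfer time 
rather than a prediction.

```python
def saturation_seconds(size_bytes, link_bits_per_sec):
    """Lower bound on how long a transfer keeps the link busy at full rate."""
    return size_bytes * 8 / link_bits_per_sec


if __name__ == "__main__":
    size = 2_000_000  # a 2 MB download, as in the example above
    for label, rate in (("1 Mbit/s", 1_000_000), ("100 Mbit/s", 100_000_000)):
        print(label, saturation_seconds(size, rate), "s")
```

With these assumed rates, the same 2 MB download occupies the slow link for 
about 16 seconds but the fast link for only a fraction of a second.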
 
Stage 2: Once the OS-level buffers are under control by fq_codel, it's time 
to lean on hardware manufacturers who make cellular basebands to add the 
codel algorithm to their products. It can be efficiently implemented either 
in software or directly in silicon, according to [2]. Although the local 
computation is obviously more complex than simpler / trivial scheduling 
algorithms -- such as pfifo_fast -- Moore's Law should ensure that spare 
cycles are available for scheduling on modern devices, even for 
tiny baseband processors running the scheduling algorithm in software.
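
To give a feel for how little state and computation the CoDel drop decision 
needs, here is a heavily simplified Python sketch of it. It is not the 
kernel code and not a complete rendering of [2]: it leaves out the check 
for a nearly empty queue and the reuse of the drop count when dropping 
resumes shortly after it stopped. The 5 ms target and 100 ms interval are 
the usual CoDel defaults; all names are my own.

```python
import math

TARGET_MS = 5.0      # acceptable standing queue delay (usual CoDel default)
INTERVAL_MS = 100.0  # how long delay must stay high before dropping starts


class CodelState:
    """The few variables the drop decision has to remember."""

    def __init__(self):
        self.first_above_time = None  # time at which delay counts as persistent
        self.dropping = False
        self.count = 0                # drops since entering the dropping state
        self.drop_next = 0.0          # time of the next scheduled drop


def control_law(t_ms, count):
    """Space successive drops by interval / sqrt(count)."""
    return t_ms + INTERVAL_MS / math.sqrt(count)


def should_drop(state, sojourn_ms, now_ms):
    """Decide whether the packet dequeued at `now_ms` should be dropped."""
    if sojourn_ms < TARGET_MS:
        # Delay is fine again: leave the dropping state.
        state.first_above_time = None
        state.dropping = False
        return False
    if state.first_above_time is None:
        # First packet above target: start timing one full interval.
        state.first_above_time = now_ms + INTERVAL_MS
        return False
    if now_ms < state.first_above_time:
        return False
    if not state.dropping:
        state.dropping = True
        state.count = 1
        state.drop_next = control_law(now_ms, state.count)
        return True
    if now_ms >= state.drop_next:
        state.count += 1
        state.drop_next = control_law(state.drop_next, state.count)
        return True
    return False


def simulate(sojourn_ms, step_ms=10, steps=100):
    """Dequeue one packet every `step_ms`; return the times of the drops."""
    state = CodelState()
    return [i * step_ms for i in range(steps)
            if should_drop(state, sojourn_ms, i * step_ms)]


if __name__ == "__main__":
    print(simulate(50)[:5])  # persistent 50 ms delay: drops get more frequent
    print(simulate(2))       # delay under target: nothing is dropped
```

The per-packet work is a few comparisons and, when dropping, one square 
root, which is the kind of load a small processor can plausibly handle.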
 
The end result will be that users can simultaneously enjoy low latency 
services such as VoIP and real-time gaming, while saturating the connection 
with high-utilization protocols such as large HTTP uploads/downloads. With 
the current implementation, this is not really possible.
 
[1]: http://kernel.org
[2]: http://tools.ietf.org/html/draft-nichols-tsvwg-codel-00
[3]: http://netalyzr.icsi.berkeley.edu 
[4]: http://www.bufferbloat.net/projects/codel/wiki/HOWTO
 
Thoughts? Comments?
 
Thanks,
 
Sean McNamara
