Hi All,

With the recent release of the Linux 3.5 kernel[1], the sch_fq_codel packet scheduler (qdisc) has been pulled into the mainline kernel. For those unfamiliar, this is the "Fair Queue Controlled Delay Active Queue Management" algorithm (hereafter referred to as "fq_codel"). It is based on an IETF draft[2]; Eric Dumazet authored the Linux kernel implementation and enhanced it by adding the "Fair Queue" component.

The purpose of fq_codel is to schedule packets in an "intelligent" way to enhance the user experience in common use cases. The goals are to maximize bandwidth utilization; minimize buffer bloat; allow small requests such as small HTTP downloads, VoIP packets, etc. to get across the network quickly, while still allowing large requests such as streaming video to use a lot of bandwidth and use buffering to compensate for jitter; and efficiently address problems introduced by link asymmetry (e.g. faster downstream than upstream).

On a common Android device connected to a 3G/4G cellular network, there are many sources of buffer bloat, and many different areas where controlling buffer sizes would be useful. The Linux kernel fq_codel implementation cannot address all of these areas. For example, the cellular baseband processor tends not to run Linux, but it undoubtedly does some significant buffering. Still, a very significant part of that buffer bloat comes from the kernel IP stack's queues, which live in the operating system kernel rather than on the baseband.

The observed result, which most people on a cellular network should be able to reproduce, is excessive buffering that leads to large delays in sending/receiving packets when the network is saturated. A user-friendly way to observe this result is to run the ICSI Netalyzr test[3], but you'll either need a Java Applet environment on your Android device, or you will need to run the applet from another device that is networked with your phone.
If you choose the latter, it is recommended that you enable fq_codel on the tethered device and use a wired connection such as Ethernet or USB, to minimize the impact of adding a hop to the connection.

If for some reason you can't run the Netalyzr test, you can also try the codel HOWTO test[4], which is basically a continuous ping while sending as much data as possible to another machine over ssh. It is not necessary to run the ethtool command if you are able to easily saturate your link. Since most cellular broadband links are quite slow, just sending random data over ssh to a dedicated server should keep the link saturated for several seconds, which is long enough to run the test.

The problem is that, with the current scheduling algorithm, pfifo_fast, it is very easy to introduce unreasonable latencies on the cellular connection (or really, any other connection you can saturate, e.g. slow wifi links). This is often out of the user's control, because programs on Android like to start network transfers without asking the user.

For example, you could be playing a real-time multiplayer game on your phone, frequently sending and receiving tiny bits of data over UDP. Then the Play Store decides to check for updates and finds that a 20 MB program (Google Chrome, say) needs to be updated. It starts downloading. Your latency within the game will skyrocket from typical unsaturated network latencies of ~50 to 200 ms (depending on the cellular tech used, e.g. EvDO vs LTE) to something on the order of 3000 ms, ruining your experience. This is an extremely repeatable test, and the result holds regardless of device manufacturer, hardware, signal strength, carrier, and so on.
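
To put that ~3000 ms figure in perspective, here is a rough back-of-the-envelope sketch in Python. It estimates how much data must be sitting in queues to produce a given latency increase, using the relation: queued bytes = added delay x link rate / 8. The link rates below (1 Mbit/s and 5 Mbit/s) and the function name are my own illustrative assumptions, not measurements; only the 200 ms and 3000 ms latencies come from the example above.

```python
def queued_bytes(loaded_rtt_ms, idle_rtt_ms, link_rate_bps):
    """Estimate the bytes queued ahead of a packet.

    Assumes all of the extra latency under load is queueing delay on a
    single bottleneck link draining at link_rate_bps (bits per second).
    """
    queue_delay_ms = loaded_rtt_ms - idle_rtt_ms
    # bits = ms * (bits/s) / 1000; bytes = bits / 8
    return queue_delay_ms * link_rate_bps / 8000.0


if __name__ == "__main__":
    # Hypothetical bottleneck rates, chosen only for illustration.
    for rate_bps in (1_000_000, 5_000_000):
        backlog = queued_bytes(3000, 200, rate_bps)
        print(f"{rate_bps / 1e6:.0f} Mbit/s link: ~{backlog / 1000:.0f} kB queued")
```

Under these assumptions, a 1 Mbit/s bottleneck would need roughly 350 kB of backlog to add 2800 ms of delay. The point is only that the queues involved are large compared to the tiny game or VoIP packets stuck behind them.
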
The reason for these latency spikes is that the "pfifo_fast" scheduling algorithm, like most others in the Linux kernel other than fq_codel, either makes no attempt to reduce/control buffer sizes, makes attempts that are ultimately ineffective, or requires a lot of manual tuning to be even remotely successful at doing so.

Another example scenario: the user is on a VoIP call (Google Voice, Skype, etc.) using only a modest amount of the network's bandwidth, say 10 or 20% (the compression on these services is amazing). During the call, the user has the impulse to download a large file, say a Word document or PDF, over the Internet using the Browser app. A few seconds after the download starts, the queues explode in size, and the resulting latencies make it impossible for the time-sensitive VoIP packets to reach their destination in time. As a result, many of the packets that do arrive have to be discarded, and you end up with a lot of jitter and re-sends. These further saturate the network and make the buffers even larger, creating a positive feedback loop. Within a minute, 9 times out of 10, the VoIP call will drop completely because one or both ends will have timed out from being unable to get a packet across the wire within the allowed time slot.

I would like to propose a two-stage approach to combating buffer bloat on cellular networks. One of the stages directly involves the Android kernel; the other stage does not.

Stage 1: The Android kernel should either backport the fq_codel code to current kernel trees, or else upgrade to at least the 3.5 kernel. Once the fq_codel code is integrated into the Android kernel trees, one way or another, it then has to be enabled by default. It is recommended to compile it built-in to the kernel rather than as a module, since, once it's enabled, it would be used all of the time. This can be accomplished by setting CONFIG_NET_SCH_CODEL=y and CONFIG_NET_SCH_FQ_CODEL=y.
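
As a quick sanity check for that build step, something like the following Python sketch could scan a kernel config and report which of the two options are not built in. The function name and the sample config text are hypothetical; in practice you would feed it the contents of your tree's .config file.

```python
REQUIRED_OPTIONS = ("CONFIG_NET_SCH_CODEL", "CONFIG_NET_SCH_FQ_CODEL")


def missing_builtin_options(config_text, required=REQUIRED_OPTIONS):
    """Return the required options that are not set to 'y' in config_text.

    Options built as modules (=m) or marked "is not set" count as missing,
    since the recommendation is to build them into the kernel.
    """
    builtin = set()
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and "# CONFIG_FOO is not set" comments
        name, sep, value = line.partition("=")
        if sep and value.strip() == "y":
            builtin.add(name.strip())
    return [opt for opt in required if opt not in builtin]


if __name__ == "__main__":
    # Hypothetical config fragment, for illustration only.
    sample = "CONFIG_NET_SCH_CODEL=y\nCONFIG_NET_SCH_FQ_CODEL=m\n"
    print("not built in:", missing_builtin_options(sample))
```

An empty list means both options are built in. Anything else names the options that still need to be switched to =y.
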
Once built in, fq_codel has to be enabled at runtime for the relevant network interfaces using the `tc qdisc` command (see [4] for details). Once all of this is accomplished, there should be a measurable reduction in latency over saturated links.

Note that saturation occurs very quickly for protocols that are not very "chatty", such as HTTP downloads. Even for a small download on the order of 1 - 2 MB, the network is saturated for a fraction of a second, and that can be enough to cause buffers to start bloating. Also, the slower the connection, the easier it is to saturate, because it is more likely that the remote server can supply data at the link's full rate. So this is less of a problem for upcoming networks like LTE Advanced, where speeds of up to 100 Mbps are expected; it is not very common to be able to max out a 100 Mbps connection when downloading from an arbitrary server over the public Internet. But a 3G or 2G connection is basically always saturated whenever the device is doing almost anything with the network.

Stage 2: Once the OS-level buffers are under control by fq_codel, it's time to lean on hardware manufacturers who make cellular basebands to add the codel algorithm to their products. According to [2], it can be efficiently implemented either in software or directly in silicon. Although the local computation is obviously more complex than simpler / trivial scheduling algorithms such as pfifo_fast, Moore's Law should ensure that spare cycles are available for scheduling on modern devices, even for tiny baseband processors running the scheduling algorithm in software.

The end result will be that users can simultaneously enjoy low-latency services such as VoIP and real-time gaming while saturating the connection with high-utilization protocols such as large HTTP uploads/downloads. With the current implementation, this is not really possible.
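
For anyone who wants to try the Stage 1 runtime step on a test device, here is a minimal Python sketch that builds the `tc` invocations for one interface. The interface name "rmnet0" and the helper names are assumptions for illustration; check [4] and your device's actual interface names before running anything. By default it only prints the commands (dry run), and actually applying them requires root and a kernel with fq_codel available.

```python
import subprocess


def fq_codel_commands(iface):
    """Build tc commands to install fq_codel as the root qdisc on iface
    and then show qdisc statistics for that interface."""
    return [
        ["tc", "qdisc", "replace", "dev", iface, "root", "fq_codel"],
        ["tc", "-s", "qdisc", "show", "dev", iface],
    ]


def apply_fq_codel(iface, dry_run=True):
    """Print the commands (dry run) or execute them, stopping on failure."""
    for argv in fq_codel_commands(iface):
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # "rmnet0" is a hypothetical cellular interface name.
    apply_fq_codel("rmnet0", dry_run=True)
```

Repeating the continuous-ping test from [4] before and after applying the qdisc should show whether latency under load actually improves on a given device.
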

[1]: http://kernel.org
[2]: http://tools.ietf.org/html/draft-nichols-tsvwg-codel-00
[3]: http://netalyzr.icsi.berkeley.edu
[4]: http://www.bufferbloat.net/projects/codel/wiki/HOWTO

Thoughts? Comments?

Thanks,
Sean McNamara
--
website: http://groups.google.com/group/android-kernel
