Hello Erik,

On 08/15/21 - 15:39, Erik Auerswald wrote:
> Hi,
>
> I'd like to thank you for working on a nice I-D describing an interesting
> and IMHO useful network measurement metric.
>
> Since feedback was asked for, I'd like to try and provide constructive
> feedback.

thanks a lot for your detailed feedback! Please see further inline.

> In general, I like the idea of "Round-trips per Minute" (RPM) as a
> metric used to characterize (one aspect of) a network. I do think that
> introducing this would improve the status quo. Since this RPM definition
> comprises a specific way of adding load to the network and measuring a
> complex metric, I think it is useful to "standardize" it.
>
> I do not think RPM can replace all other metrics. This is, in a way,
> mentioned in the introduction, where it is suggested to add RPM to
> existing measurement platforms. As such I just want to point this out
> more explicitely, but do not intend to diminish the RPM idea by this.
> In short, I'd say it's complicated.

Yes, I fully agree that RPM is not the only metric. It is one among many.
If there is a sentiment in our document that sounds like "RPM is the only
metric that matters", please let me know where so we can reword the text.

> Bandwidth matters for bulk data transfer, e.g., downloading a huge update
> required for playing a multiplayer game online.
>
> Minimum latency matters for the feasibility of interactive applications,
> e.g., controlling a toy car in your room vs. a robotic arm on the ISS
> from Earth vs. orbital insertion around Mars from Earth. For a more
> mundane use case consider a voice conference. (A good decade ago I
> experienced a voice conferencing system running over IP that introduced
> over one second of (minimum) latency and therefore was awkward to use.)

Wrt minimum latency: to some extent it is a subset of "RPM". But admittedly,
measuring minimum latency on its own is good for debugging purposes and to
know what one can get on a network that is not in persistent working
conditions.

> Expressing 'bufferbloat as a measure of "Round-trips per Minute" (RPM)'
> exhibits (at least) two problems:
>
> 1. A high RPM value is associated with little bufferbloat problems.
>
> 2. A low RPM value may be caused by high minimum delay instead of
> bufferbloat.
>
> I think that RPM (i.e., under working conditions) measures a network's
> usefulness for interactive applications, but not necessarily bufferbloat.

You are right and we are definitely misrepresenting this in the text. I filed
https://github.com/network-quality/draft-cpaasch-ippm-responsiveness/issues/8.
If you want, feel free to submit a pull-request; otherwise, we will get to the
issue in the coming weeks.

> I do think that RPM is in itself more generally useful than minimum
> latency or bandwidth.
>
> A combination of low minimum latency with low RPM value strongly hints
> at bufferbloat. Other combinations are less easily characterized.
>
> Bufferbloat can still lie in hiding, e.g., when a link with bufferbloat
> is not yet the bottleneck, or if the communications end-points are not
> yet able to saturate the network inbetween. Thus high bandwidth can
> result in high RPM values despite (hidden) bufferbloat.
>
> The "Measuring is Hard" section mentions additional complications.
>
> All in all, I do think that "measuring bufferbloat" and "measuring RPM"
> should not be used synonymously. The I-D title clearly shows this:
> RPM is measuring "Responsiveness under Working Conditions" which may be
> affected by bufferbloat, among other potential factors, but is not in
> itself bufferbloat.
>
> Under the assumption that only a single value (performance score) is
> considered, I do think that RPM is more generally useful than bandwidth
> or idle latency.
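
To make the relation to your point 2 above more concrete, here is a rough
sketch of the conversion we have in mind (simplified: it ignores the
aggregation across the different latency components described in
section 4.2.1, and the numbers are made up purely for illustration):

    # Rough sketch: convert a measured round-trip time (in seconds) into
    # "Round-trips per Minute", i.e. how many round-trips fit into 60 seconds.
    def rpm(rtt_seconds: float) -> float:
        return 60.0 / rtt_seconds

    print(rpm(0.100))  # 100 ms under load -> 600 RPM
    print(rpm(1.000))  # 1 s of queuing delay -> 60 RPM
    print(rpm(0.600))  # 600 ms idle RTT (e.g., a GEO satellite path)
                       # -> at most 100 RPM, even with zero queuing

So a path with a high minimum delay indeed caps the achievable RPM even when
there is no bufferbloat at all, which is exactly your point 2.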
>
> On a meta-level, I think that the word "bufferbloat" is not used according
> to a single self-consistent definition in the I-D.

I fully agree with all your points above on how we misrepresented the
relation between RPM and bufferbloat.

> Additionally, I think that the I-D should reference DNS, HTTP/2, and
> TLS 1.3, since these protocols are required for implementing the RPM
> measurement. The same for JSON, I think. Possibly URL.

Yes, we have not given the references & citations enough care.
(https://github.com/network-quality/draft-cpaasch-ippm-responsiveness/issues/2)

> Using "rpm.example" instead of "example.apple.com" would result in shorter
> lines for the example JSON.
>
> "host123.cdn.example" instead of "hostname123.cdnprovider.com" might be
> a more appropriate example DNS name.

Oops, we forgot to adjust these to a more generic hostname...
(https://github.com/network-quality/draft-cpaasch-ippm-responsiveness/issues/9)

> Adding an informative reference to RFC 2606 / BCP 32 might raise awareness
> of the existence of a BCP on example DNS names.
>
> Please find both a unified diff against the text rendering of the I-D,
> and a word diff produced from the unified diff, attached to this email
> in order to suggest editorial changes that are intended to improve the
> reading experience. They are intended for reading and (possibly partial)
> manual application, since the text rendering of an I-D is usually not
> the preferred form of editing it.

Thanks a lot for these!
(https://github.com/network-quality/draft-cpaasch-ippm-responsiveness/issues/10)

Regards,
Christoph

>
> Thanks,
> Erik
> --
> Always use the right tool for the job.
> -- Rob Pike
>
>
> On Fri, Aug 13, 2021 at 02:41:05PM -0700, Christoph Paasch via Bloat wrote:
> > I already posted this to the RPM-list, but the audience here on bloat should
> > be interested as well.
> >
> >
> > This is the specification of Apple's responsiveness/RPM test. We believe
> > that it
> > would be good for the bufferbloat-effort to have a specification of how to
> > quantify the extend of bufferbloat from a user's perspective. Our
> > Internet-draft is a first step in that direction and we hope that it will
> > kick off some collaboration.
> >
> >
> > Feedback is very welcome!
> >
> >
> > Cheers,
> > Christoph
> >
> >
> > ----- Forwarded message from internet-dra...@ietf.org -----
> >
> > From: internet-dra...@ietf.org
> > To: Christoph Paasch <cpaa...@apple.com>, Omer Shapira <o...@apple.com>,
> > Randall Meyer <r...@apple.com>, Stuart Cheshire
> > <chesh...@apple.com>
> > Date: Fri, 13 Aug 2021 09:43:40 -0700
> > Subject: New Version Notification for
> > draft-cpaasch-ippm-responsiveness-00.txt
> >
> >
> > A new version of I-D, draft-cpaasch-ippm-responsiveness-00.txt
> > has been successfully submitted by Christoph Paasch and posted to the
> > IETF repository.
> >
> > Name: draft-cpaasch-ippm-responsiveness
> > Revision: 00
> > Title: Responsiveness under Working Conditions
> > Document date: 2021-08-13
> > Group: Individual Submission
> > Pages: 12
> > URL:
> > https://www.ietf.org/archive/id/draft-cpaasch-ippm-responsiveness-00.txt
> > Status:
> > https://datatracker.ietf.org/doc/draft-cpaasch-ippm-responsiveness/
> > Htmlized:
> > https://datatracker.ietf.org/doc/html/draft-cpaasch-ippm-responsiveness
> >
> >
> > Abstract:
> > Bufferbloat has been a long-standing problem on the Internet with
> > more than a decade of work on standardizing technical solutions,
> > implementations and testing.
However, to this date, bufferbloat is > > still a very common problem for the end-users. Everyone "knows" that > > it is "normal" for a video conference to have problems when somebody > > else on the same home-network is watching a 4K movie. > > > > The reason for this problem is not the lack of technical solutions, > > but rather a lack of awareness of the problem-space, and a lack of > > tooling to accurately measure the problem. We believe that exposing > > the problem of bufferbloat to the end-user by measuring the end- > > users' experience at a high level will help to create the necessary > > awareness. > > > > This document is a first attempt at specifying a measurement > > methodology to evaluate bufferbloat the way common users are > > experiencing it today, using today's most frequently used protocols > > and mechanisms to accurately measure the user-experience. We also > > provide a way to express the bufferbloat as a measure of "Round-trips > > per minute" (RPM) to have a more intuitive way for the users to > > understand the notion of bufferbloat. > > > > > > > > > > > > The IETF Secretariat > > > > > > > > ----- End forwarded message ----- > > _______________________________________________ > > Bloat mailing list > > Bloat@lists.bufferbloat.net > > https://lists.bufferbloat.net/listinfo/bloat > --- draft-cpaasch-ippm-responsiveness-00.txt 2021-08-15 12:01:01.213813125 > +0200 > +++ draft-cpaasch-ippm-responsiveness-00-ea.txt 2021-08-15 > 15:08:08.013416074 +0200 > @@ -17,7 +17,7 @@ > > Bufferbloat has been a long-standing problem on the Internet with > more than a decade of work on standardizing technical solutions, > - implementations and testing. However, to this date, bufferbloat is > + implementations, and testing. However, to this date, bufferbloat is > still a very common problem for the end-users. Everyone "knows" that > it is "normal" for a video conference to have problems when somebody > else on the same home-network is watching a 4K movie. > @@ -33,8 +33,8 @@ > methodology to evaluate bufferbloat the way common users are > experiencing it today, using today's most frequently used protocols > and mechanisms to accurately measure the user-experience. We also > - provide a way to express the bufferbloat as a measure of "Round-trips > - per minute" (RPM) to have a more intuitive way for the users to > + provide a way to express bufferbloat as a measure of "Round-trips > + per Minute" (RPM) to have a more intuitive way for the users to > understand the notion of bufferbloat. > > Status of This Memo > @@ -81,14 +81,14 @@ > Table of Contents > > 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 > - 2. Measuring is hard . . . . . . . . . . . . . . . . . . . . . . 3 > + 2. Measuring is Hard . . . . . . . . . . . . . . . . . . . . . . 3 > 3. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 > 4. Measuring Responsiveness . . . . . . . . . . . . . . . . . . 5 > 4.1. Working Conditions . . . . . . . . . . . . . . . . . . . 5 > 4.1.1. Parallel vs Sequential Uplink and Downlink . . . . . 6 > - 4.1.2. From single-flow to multi-flow . . . . . . . . . . . 7 > - 4.1.3. Reaching saturation . . . . . . . . . . . . . . . . . 7 > - 4.1.4. Final algorithm . . . . . . . . . . . . . . . . . . . 7 > + 4.1.2. From Single-flow to Multi-flow . . . . . . . . . . . 7 > + 4.1.3. Reaching Saturation . . . . . . . . . . . . . . . . . 7 > + 4.1.4. Final Algorithm . . . . . . . . . . . . . . . . . . . 7 > 4.2. Measuring Responsiveness . . . . . . . . . . . . . . . . 
8 > 4.2.1. Aggregating Round-trips per Minute . . . . . . . . . 9 > 4.2.2. Statistical Confidence . . . . . . . . . . . . . . . 10 > @@ -103,8 +103,8 @@ > > For many years, bufferbloat has been known as an unfortunately common > issue in todays networks [Bufferbloat]. Solutions like FQ-codel > - [RFC8289] or PIE [RFC8033] have been standardized and are to some > - extend widely implemented. Nevertheless, users still suffer from > + [RFC8290] or PIE [RFC8033] have been standardized and are to some > + extent widely implemented. Nevertheless, users still suffer from > bufferbloat. > > > @@ -129,7 +129,7 @@ > bufferbloat problem. > > We believe that it is necessary to create a standardized way for > - measuring the extend of bufferbloat in a network and express it to > + measuring the extent of bufferbloat in a network and express it to > the user in a user-friendly way. This should help existing > measurement tools to add a bufferbloat measurement to their set of > metrics. It will also allow to raise the awareness to the problem > @@ -144,10 +144,10 @@ > classification for those protocols is very common. It is thus very > important to use those protocols for the measurements to avoid > focusing on use-cases that are not actually affecting the end-user. > - Finally, we propose to use "round-trips per minute" as a metric to > - express the extend of bufferbloat. > + Finally, we propose to use "Round-trips per Minute" as a metric to > + express the extent of bufferbloat. > > -2. Measuring is hard > +2. Measuring is Hard > > There are several challenges around measuring bufferbloat accurately > on the Internet. These challenges are due to different factors. > @@ -155,7 +155,7 @@ > problem space, and the reproducibility of the measurement. > > It is well-known that transparent TCP proxies are widely deployed on > - port 443 and/or port 80, while less common on other ports. Thus, > + port 443 and/or port 80, while less commonly on other ports. Thus, > choice of the port-number to measure bufferbloat has a significant > influence on the result. Other factors are the protocols being used. > TCP and UDP traffic may take a largely different path on the Internet > @@ -186,17 +186,17 @@ > measurement. It seems that it's best to avoid extending the duration > of the test beyond what's needed. > > - The problem space around the bufferbloat is huge. Traditionally, one > + The problem space around bufferbloat is huge. Traditionally, one > thinks of bufferbloat happening on the routers and switches of the > Internet. Thus, simply measuring bufferbloat at the transport layer > would be sufficient. However, the networking stacks of the clients > and servers can also experience huge amounts of bufferbloat. Data > sitting in TCP sockets or waiting in the application to be scheduled > for sending causes artificial latency, which affects user-experience > - the same way the "traditional" bufferbloat does. > + the same way "traditional" bufferbloat does. > > Finally, measuring bufferbloat requires us to fill the buffers of the > - bottleneck and when buffer occupancy is at its peak, the latency > + bottleneck, and when buffer occupancy is at its peak, the latency > measurement needs to be done. Achieving this in a reliable and > reproducible way is not easy. First, one needs to ensure that > buffers are actually full for a sustained period of time to allow for > @@ -250,15 +250,15 @@ > bufferbloat. > > 4. 
Finally, in order for this measurement to be user-friendly to a > - wide audience it is important that such a measurement finishes > - within a short time-frame and short being anything below 20 > + wide audience, it is important that such a measurement finishes > + within a short time-frame with short being anything below 20 > seconds. > > 4. Measuring Responsiveness > > The ability to reliably measure the responsiveness under typical > working conditions is predicated by the ability to reliably put the > - network in a state representative of the said conditions. Once the > + network in a state representative of said conditions. Once the > network has reached the required state, its responsiveness can be > measured. The following explains how the former and the latter are > achieved. > @@ -270,7 +270,7 @@ > experiencing ingress and egress flows that are similar to those when > used by humans in the typical day-to-day pattern. > > - While any network can be put momentarily into working condition by > + While any network can be put momentarily into working conditions by > the means of a single HTTP transaction, taking measurements requires > maintaining such conditions over sufficient time. Thus, measuring > the network responsiveness in a consistent way depends on our ability > @@ -286,7 +286,7 @@ > way to achieve this is by creating multiple large bulk data-transfers > in either downstream or upstream direction. Similar to conventional > speed-test applications that also create a varying number of streams > - to measure throughput. Working-conditions does the same. It also > + to measure throughput. Working conditions does the same. It also > requires a way to detect when the network is in a persistent working > condition, called "saturation". This can be achieved by monitoring > the instantaneous goodput over time. When the goodput stops > @@ -298,7 +298,7 @@ > o Should not waste traffic, since the user may be paying for it > > o Should finish within a short time-frame to avoid impacting other > - users on the same network and/or experience varying conditions > + users on the same network and/or experiencing varying conditions > > 4.1.1. Parallel vs Sequential Uplink and Downlink > > @@ -308,8 +308,8 @@ > upstream) or the routing in the ISPs. Users sending data to an > Internet service will fill the bottleneck on the upstream path to the > server and thus expose a potential for bufferbloat to happen at this > - bottleneck. On the downlink direction any download from an Internet > - service will encounter a bottleneck and thus exposes another > + bottleneck. In the downlink direction any download from an Internet > + service will encounter a bottleneck and thus expose another > potential for bufferbloat. Thus, when measuring responsiveness under > working conditions it is important to consider both, the upstream and > the downstream bufferbloat. This opens the door to measure both > @@ -322,13 +322,16 @@ > seconds of test per direction, while parallel measurement will allow > for 20 seconds of testing in both directions. > > - However, a number caveats come with measuring in parallel: - Half- > - duplex links may not expose uplink and downlink bufferbloat: A half- > - duplex link may not allow during parallel measurement to saturate > - both the uplink and the downlink direction. Thus, bufferbloat in > - either of the directions may not be exposed during parallel > - measurement. 
- Debuggability of the results becomes more obscure: > - During parallel measurement it is impossible to differentiate on > + However, a number caveats come with measuring in parallel: > + > + - Half-duplex links may not expose uplink and downlink bufferbloat: > + A half-duplex link may not allow to saturate both the uplink > + and the downlink direction during parallel measurement. Thus, > + bufferbloat in either of the directions may not be exposed during > + parallel measurement. > + > + - Debuggability of the results becomes more obscure: > + During parallel measurement it is impossible to differentiate on > > > > @@ -338,26 +341,26 @@ > Internet-Draft Responsiveness under Working Conditions August 2021 > > > - whether the bufferbloat happens in the uplink or the downlink > - direction. > + whether the bufferbloat happens in the uplink or the downlink > + direction. > > -4.1.2. From single-flow to multi-flow > +4.1.2. From Single-flow to Multi-flow > > - As described in RFC 6349, a single TCP connection may not be > + As described in [RFC6349], a single TCP connection may not be > sufficient to saturate a path between a client and a server. On a > high-BDP network, traditional TCP window-size constraints of 4MB are > often not sufficient to fill the pipe. Additionally, traditional > - loss-based TCP congestion control algorithms aggressively reacts to > + loss-based TCP congestion control algorithms aggressively react to > packet-loss by reducing the congestion window. This reaction will > - reduce the queuing in the network, and thus "artificially" make the > - bufferbloat appear lesser. > + reduce the queuing in the network, and thus "artificially" make > + bufferbloat appear less of a problem. > > - The goal of the measurement is to keep the network as busy as > - possible in a sustained and persistent way. Thus, using multiple TCP > + The goal is to keep the network as busy as possible in a sustained > + and persistent way during the measurement. Thus, using multiple TCP > connections is needed for a sustained bufferbloat by gradually adding > - TCP flows until saturation is needed. > + TCP flows until saturation is reached. > > -4.1.3. Reaching saturation > +4.1.3. Reaching Saturation > > It is best to detect when saturation has been reached so that the > measurement of responsiveness can start with the confidence that the > @@ -367,8 +370,8 @@ > buffers are completely filled. Thus, this depends highly on the > congestion control that is being deployed on the sender-side. > Congestion control algorithms like BBR may reach high throughput > - without causing bufferbloat. (because the bandwidth-detection portion > - of BBR is effectively seeking the bottleneck capacity) > + without causing bufferbloat (because the bandwidth-detection portion > + of BBR is effectively seeking the bottleneck capacity). > > It is advised to rather use loss-based congestion controls like Cubic > to "reliably" ensure that the buffers are filled. > @@ -379,7 +382,7 @@ > packet-loss or ECN-marks signaling a congestion or even a full buffer > of the bottleneck link. > > -4.1.4. Final algorithm > +4.1.4. Final Algorithm > > The following is a proposal for an algorithm to reach saturation of a > network by using HTTP/2 upload (POST) or download (GET) requests of > @@ -404,7 +407,7 @@ > throughput will remain stable. In the latter case, this means that > saturation has been reached and - more importantly - is stable. 
> > - In detail, the steps of the algorithm are the following > + In detail, the steps of the algorithm are the following: > > o Create 4 load-bearing connections > > @@ -453,7 +456,7 @@ > the different stages of a separate network transaction as well as > measuring on the load-bearing connections themselves. > > - Two aspects are being measured with this approach : > + Two aspects are being measured with this approach: > > 1. How the network handles new connections and their different > stages (DNS-request, TCP-handshake, TLS-handshake, HTTP/2 > @@ -463,19 +466,19 @@ > > 2. How the network and the client/server networking stack handles > the latency on the load-bearing connections themselves. E.g., > - Smart queuing techniques on the bottleneck will allow to keep the > + smart queuing techniques on the bottleneck will allow to keep the > latency within a reasonable limit in the network and buffer- > - reducing techniques like TCP_NOTSENT_LOWAT makes sure the client > + reducing techniques like TCP_NOTSENT_LOWAT make sure the client > and server TCP-stack is not a source of significant latency. > > To measure the former, we send a DNS-request, establish a TCP- > connection on port 443, establish a TLS-context using TLS1.3 and send > - an HTTP2 GET request for an object of a single byte large. This > + an HTTP/2 GET request for an object the size of a single byte. This > measurement will be repeated multiple times for accuracy. Each of > these stages allows to collect a single latency measurement that can > then be factored into the responsiveness computation. > > - To measure the latter, on the load-bearing connections (that uses > + To measure the latter, on the load-bearing connections (that use > HTTP/2) a GET request is multiplexed. This GET request is for a > 1-byte object. This allows to measure the end-to-end latency on the > connections that are using the network at full speed. > @@ -492,10 +495,10 @@ > an equal weight to each of these measurements. > > Finally, the resulting latency needs to be exposed to the users. > - Users have been trained to accept metrics that have a notion of "The > + Users have been trained to accept metrics that have a notion of "the > higher the better". Latency measuring in units of seconds however is > "the lower the better". Thus, converting the latency measurement to > - a frequency allows using the familiar notion of "The higher the > + a frequency allows using the familiar notion of "the higher the > better". The term frequency has a very technical connotation. What > we are effectively measuring is the number of round-trips from the > > @@ -513,7 +516,7 @@ > which is a wink to the "revolutions per minute" that we are used to > in cars. > > - Thus, our unit of measure is "Round-trip per Minute" (RPM) that > + Thus, our unit of measure is "Round-trips per Minute" (RPM) that > expresses responsiveness under working conditions. > > 4.2.2. Statistical Confidence > @@ -527,13 +530,13 @@ > 5. Protocol Specification > > By using standard protocols that are most commonly used by end-users, > - no new protocol needs to be specified. However, both client and > + no new protocol needs to be specified. However, both clients and > servers need capabilities to execute this kind of measurement as well > - as a standard to flow to provision the client with the necessary > + as a standard to follow to provision the client with the necessary > information. 
> > First, the capabilities of both the client and the server: It is > - expected that both hosts support HTTP/2 over TLS 1.3. That the > + expected that both hosts support HTTP/2 over TLS 1.3, and that the > client is able to send a GET-request and a POST. The server needs > the ability to serve both of these HTTP commands. Further, the > server endpoint is accessible through a hostname that can be resolved > @@ -546,13 +549,13 @@ > 1. A config URL/response: This is the configuration file/format used > by the client. It's a simple JSON file format that points the > client at the various URLs mentioned below. All of the fields > - are required except "test_endpoint". If the service-procier can > + are required except "test_endpoint". If the service-provider can > pin all of the requests for a test run to a specific node in the > service (for a particular run), they can specify that node's name > in the "test_endpoint" field. It's preferred that pinning of > some sort is available. This is to ensure the measurement is > against the same paths and not switching hosts during a test run > - (ie moving from near POP A to near POP B) Sample content of this > + (i.e., moving from near POP A to near POP B). Sample content of this > JSON would be: > > > @@ -577,7 +580,7 @@ > > 3. A "large" URL/response: This needs to serve a status code of 200 > and a body size of at least 8GB. The body can be bigger, and > - will need to grow as network speeds increases over time. The > + will need to grow as network speeds increase over time. The > actual body content is irrelevant. The client will probably > never completely download the object. > > @@ -618,16 +621,19 @@ > Internet-Draft Responsiveness under Working Conditions August 2021 > > > + [RFC6349] ... > + > [RFC8033] Pan, R., Natarajan, P., Baker, F., and G. White, > "Proportional Integral Controller Enhanced (PIE): A > Lightweight Control Scheme to Address the Bufferbloat > Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017, > <https://www.rfc-editor.org/info/rfc8033>. > > - [RFC8289] Nichols, K., Jacobson, V., McGregor, A., Ed., and J. > - Iyengar, Ed., "Controlled Delay Active Queue Management", > - RFC 8289, DOI 10.17487/RFC8289, January 2018, > - <https://www.rfc-editor.org/info/rfc8289>. > + [RFC8290] Hoeiland-Joergensen, T., McKenney, P., Taht, D., Ed., and > + Gettys, J., "The Flow Queue CoDel Packet Scheduler and > + Active Queue Management Algorithm", RFC 8290, > + DOI 10.17487/RFC8290, January 2018, > + <https://www.rfc-editor.org/info/rfc8290>. > > Authors' Addresses > > [--- draft-cpaasch-ippm-responsiveness-00.txt-]{+++ > draft-cpaasch-ippm-responsiveness-00-ea.txt+} 2021-08-15 > [-12:01:01.213813125-] {+15:08:08.013416074+} +0200 > @@ -17,7 +17,7 @@ > > Bufferbloat has been a long-standing problem on the Internet with > more than a decade of work on standardizing technical solutions, > [-implementations-] > {+implementations,+} and testing. However, to this date, bufferbloat is > still a very common problem for the end-users. Everyone "knows" that > it is "normal" for a video conference to have problems when somebody > else on the same home-network is watching a 4K movie. > @@ -33,8 +33,8 @@ > methodology to evaluate bufferbloat the way common users are > experiencing it today, using today's most frequently used protocols > and mechanisms to accurately measure the user-experience. 
We also > provide a way to express [-the-] bufferbloat as a measure of "Round-trips > per [-minute"-] {+Minute"+} (RPM) to have a more intuitive way for the > users to > understand the notion of bufferbloat. > > Status of This Memo > @@ -81,14 +81,14 @@ > Table of Contents > > 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 > 2. Measuring is [-hard-] {+Hard+} . . . . . . . . . . . . . . . . . . . . > . . 3 > 3. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 > 4. Measuring Responsiveness . . . . . . . . . . . . . . . . . . 5 > 4.1. Working Conditions . . . . . . . . . . . . . . . . . . . 5 > 4.1.1. Parallel vs Sequential Uplink and Downlink . . . . . 6 > 4.1.2. From [-single-flow-] {+Single-flow+} to [-multi-flow-] > {+Multi-flow+} . . . . . . . . . . . 7 > 4.1.3. Reaching [-saturation-] {+Saturation+} . . . . . . . . . . . . > . . . . . 7 > 4.1.4. Final [-algorithm-] {+Algorithm+} . . . . . . . . . . . . . . > . . . . . 7 > 4.2. Measuring Responsiveness . . . . . . . . . . . . . . . . 8 > 4.2.1. Aggregating Round-trips per Minute . . . . . . . . . 9 > 4.2.2. Statistical Confidence . . . . . . . . . . . . . . . 10 > @@ -103,8 +103,8 @@ > > For many years, bufferbloat has been known as an unfortunately common > issue in todays networks [Bufferbloat]. Solutions like FQ-codel > [-[RFC8289]-] > {+[RFC8290]+} or PIE [RFC8033] have been standardized and are to some > [-extend-] > {+extent+} widely implemented. Nevertheless, users still suffer from > bufferbloat. > > > @@ -129,7 +129,7 @@ > bufferbloat problem. > > We believe that it is necessary to create a standardized way for > measuring the [-extend-] {+extent+} of bufferbloat in a network and > express it to > the user in a user-friendly way. This should help existing > measurement tools to add a bufferbloat measurement to their set of > metrics. It will also allow to raise the awareness to the problem > @@ -144,10 +144,10 @@ > classification for those protocols is very common. It is thus very > important to use those protocols for the measurements to avoid > focusing on use-cases that are not actually affecting the end-user. > Finally, we propose to use [-"round-trips-] {+"Round-trips+} per > [-minute"-] {+Minute"+} as a metric to > express the [-extend-] {+extent+} of bufferbloat. > > 2. Measuring is [-hard-] {+Hard+} > > There are several challenges around measuring bufferbloat accurately > on the Internet. These challenges are due to different factors. > @@ -155,7 +155,7 @@ > problem space, and the reproducibility of the measurement. > > It is well-known that transparent TCP proxies are widely deployed on > port 443 and/or port 80, while less [-common-] {+commonly+} on other > ports. Thus, > choice of the port-number to measure bufferbloat has a significant > influence on the result. Other factors are the protocols being used. > TCP and UDP traffic may take a largely different path on the Internet > @@ -186,17 +186,17 @@ > measurement. It seems that it's best to avoid extending the duration > of the test beyond what's needed. > > The problem space around [-the-] bufferbloat is huge. Traditionally, one > thinks of bufferbloat happening on the routers and switches of the > Internet. Thus, simply measuring bufferbloat at the transport layer > would be sufficient. However, the networking stacks of the clients > and servers can also experience huge amounts of bufferbloat. 
Data > sitting in TCP sockets or waiting in the application to be scheduled > for sending causes artificial latency, which affects user-experience > the same way [-the-] "traditional" bufferbloat does. > > Finally, measuring bufferbloat requires us to fill the buffers of the > [-bottleneck-] > {+bottleneck,+} and when buffer occupancy is at its peak, the latency > measurement needs to be done. Achieving this in a reliable and > reproducible way is not easy. First, one needs to ensure that > buffers are actually full for a sustained period of time to allow for > @@ -250,15 +250,15 @@ > bufferbloat. > > 4. Finally, in order for this measurement to be user-friendly to a > wide [-audience-] {+audience,+} it is important that such a > measurement finishes > within a short time-frame [-and-] {+with+} short being anything below > 20 > seconds. > > 4. Measuring Responsiveness > > The ability to reliably measure the responsiveness under typical > working conditions is predicated by the ability to reliably put the > network in a state representative of [-the-] said conditions. Once the > network has reached the required state, its responsiveness can be > measured. The following explains how the former and the latter are > achieved. > @@ -270,7 +270,7 @@ > experiencing ingress and egress flows that are similar to those when > used by humans in the typical day-to-day pattern. > > While any network can be put momentarily into working [-condition-] > {+conditions+} by > the means of a single HTTP transaction, taking measurements requires > maintaining such conditions over sufficient time. Thus, measuring > the network responsiveness in a consistent way depends on our ability > @@ -286,7 +286,7 @@ > way to achieve this is by creating multiple large bulk data-transfers > in either downstream or upstream direction. Similar to conventional > speed-test applications that also create a varying number of streams > to measure throughput. [-Working-conditions-] {+Working conditions+} > does the same. It also > requires a way to detect when the network is in a persistent working > condition, called "saturation". This can be achieved by monitoring > the instantaneous goodput over time. When the goodput stops > @@ -298,7 +298,7 @@ > o Should not waste traffic, since the user may be paying for it > > o Should finish within a short time-frame to avoid impacting other > users on the same network and/or [-experience-] {+experiencing+} > varying conditions > > 4.1.1. Parallel vs Sequential Uplink and Downlink > > @@ -308,8 +308,8 @@ > upstream) or the routing in the ISPs. Users sending data to an > Internet service will fill the bottleneck on the upstream path to the > server and thus expose a potential for bufferbloat to happen at this > bottleneck. [-On-] {+In+} the downlink direction any download from an > Internet > service will encounter a bottleneck and thus [-exposes-] {+expose+} another > potential for bufferbloat. Thus, when measuring responsiveness under > working conditions it is important to consider both, the upstream and > the downstream bufferbloat. This opens the door to measure both > @@ -322,13 +322,16 @@ > seconds of test per direction, while parallel measurement will allow > for 20 seconds of testing in both directions. 
> > However, a number caveats come with measuring in parallel: > > - [-Half- > duplex-] {+Half-duplex+} links may not expose uplink and downlink > bufferbloat: > A [-half- > duplex-] {+half-duplex+} link may not allow [-during parallel > measurement-] to saturate both the uplink > and the downlink [-direction.-] {+direction during parallel > measurement.+} Thus, > bufferbloat in either of the directions may not be exposed during > parallel measurement. > > - Debuggability of the results becomes more obscure: > During parallel measurement it is impossible to differentiate on > > > > @@ -338,26 +341,26 @@ > Internet-Draft Responsiveness under Working Conditions August 2021 > > > whether the bufferbloat happens in the uplink or the downlink > direction. > > 4.1.2. From [-single-flow-] {+Single-flow+} to [-multi-flow-] {+Multi-flow+} > > As described in [-RFC 6349,-] {+[RFC6349],+} a single TCP connection may > not be > sufficient to saturate a path between a client and a server. On a > high-BDP network, traditional TCP window-size constraints of 4MB are > often not sufficient to fill the pipe. Additionally, traditional > loss-based TCP congestion control algorithms aggressively [-reacts-] > {+react+} to > packet-loss by reducing the congestion window. This reaction will > reduce the queuing in the network, and thus "artificially" make [-the-] > bufferbloat appear [-lesser.-] {+less of a problem.+} > > The goal [-of the measurement-] is to keep the network as busy as possible > in a sustained > and persistent [-way.-] {+way during the measurement.+} Thus, using > multiple TCP > connections is needed for a sustained bufferbloat by gradually adding > TCP flows until saturation is [-needed.-] {+reached.+} > > 4.1.3. Reaching [-saturation-] {+Saturation+} > > It is best to detect when saturation has been reached so that the > measurement of responsiveness can start with the confidence that the > @@ -367,8 +370,8 @@ > buffers are completely filled. Thus, this depends highly on the > congestion control that is being deployed on the sender-side. > Congestion control algorithms like BBR may reach high throughput > without causing [-bufferbloat.-] {+bufferbloat+} (because the > bandwidth-detection portion > of BBR is effectively seeking the bottleneck [-capacity)-] {+capacity).+} > > It is advised to rather use loss-based congestion controls like Cubic > to "reliably" ensure that the buffers are filled. > @@ -379,7 +382,7 @@ > packet-loss or ECN-marks signaling a congestion or even a full buffer > of the bottleneck link. > > 4.1.4. Final [-algorithm-] {+Algorithm+} > > The following is a proposal for an algorithm to reach saturation of a > network by using HTTP/2 upload (POST) or download (GET) requests of > @@ -404,7 +407,7 @@ > throughput will remain stable. In the latter case, this means that > saturation has been reached and - more importantly - is stable. > > In detail, the steps of the algorithm are the [-following-] {+following:+} > > o Create 4 load-bearing connections > > @@ -453,7 +456,7 @@ > the different stages of a separate network transaction as well as > measuring on the load-bearing connections themselves. > > Two aspects are being measured with this [-approach :-] {+approach:+} > > 1. How the network handles new connections and their different > stages (DNS-request, TCP-handshake, TLS-handshake, HTTP/2 > @@ -463,19 +466,19 @@ > > 2. How the network and the client/server networking stack handles > the latency on the load-bearing connections themselves. 
E.g., > [-Smart-] > {+smart+} queuing techniques on the bottleneck will allow to keep the > latency within a reasonable limit in the network and buffer- > reducing techniques like TCP_NOTSENT_LOWAT [-makes-] {+make+} sure the > client > and server TCP-stack is not a source of significant latency. > > To measure the former, we send a DNS-request, establish a TCP- > connection on port 443, establish a TLS-context using TLS1.3 and send > an [-HTTP2-] {+HTTP/2+} GET request for an object {+the size+} of a single > [-byte large.-] {+byte.+} This > measurement will be repeated multiple times for accuracy. Each of > these stages allows to collect a single latency measurement that can > then be factored into the responsiveness computation. > > To measure the latter, on the load-bearing connections (that [-uses-] > {+use+} > HTTP/2) a GET request is multiplexed. This GET request is for a > 1-byte object. This allows to measure the end-to-end latency on the > connections that are using the network at full speed. > @@ -492,10 +495,10 @@ > an equal weight to each of these measurements. > > Finally, the resulting latency needs to be exposed to the users. > Users have been trained to accept metrics that have a notion of [-"The-] > {+"the+} > higher the better". Latency measuring in units of seconds however is > "the lower the better". Thus, converting the latency measurement to > a frequency allows using the familiar notion of [-"The-] {+"the+} higher > the > better". The term frequency has a very technical connotation. What > we are effectively measuring is the number of round-trips from the > > @@ -513,7 +516,7 @@ > which is a wink to the "revolutions per minute" that we are used to > in cars. > > Thus, our unit of measure is [-"Round-trip-] {+"Round-trips+} per Minute" > (RPM) that > expresses responsiveness under working conditions. > > 4.2.2. Statistical Confidence > @@ -527,13 +530,13 @@ > 5. Protocol Specification > > By using standard protocols that are most commonly used by end-users, > no new protocol needs to be specified. However, both [-client-] > {+clients+} and > servers need capabilities to execute this kind of measurement as well > as a standard to [-flow-] {+follow+} to provision the client with the > necessary > information. > > First, the capabilities of both the client and the server: It is > expected that both hosts support HTTP/2 over TLS [-1.3. That-] {+1.3, and > that+} the > client is able to send a GET-request and a POST. The server needs > the ability to serve both of these HTTP commands. Further, the > server endpoint is accessible through a hostname that can be resolved > @@ -546,13 +549,13 @@ > 1. A config URL/response: This is the configuration file/format used > by the client. It's a simple JSON file format that points the > client at the various URLs mentioned below. All of the fields > are required except "test_endpoint". If the [-service-procier-] > {+service-provider+} can > pin all of the requests for a test run to a specific node in the > service (for a particular run), they can specify that node's name > in the "test_endpoint" field. It's preferred that pinning of > some sort is available. This is to ensure the measurement is > against the same paths and not switching hosts during a test run > [-(ie-] > {+(i.e.,+} moving from near POP A to near POP [-B)-] {+B).+} Sample > content of this > JSON would be: > > > @@ -577,7 +580,7 @@ > > 3. A "large" URL/response: This needs to serve a status code of 200 > and a body size of at least 8GB. 
The body can be bigger, and > will need to grow as network speeds [-increases-] {+increase+} over > time. The > actual body content is irrelevant. The client will probably > never completely download the object. > > @@ -618,16 +621,19 @@ > Internet-Draft Responsiveness under Working Conditions August 2021 > > > {+[RFC6349] ...+} > > [RFC8033] Pan, R., Natarajan, P., Baker, F., and G. White, > "Proportional Integral Controller Enhanced (PIE): A > Lightweight Control Scheme to Address the Bufferbloat > Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017, > <https://www.rfc-editor.org/info/rfc8033>. > > [-[RFC8289] Nichols, K., Jacobson, V., McGregor, A.,-] > > {+[RFC8290] Hoeiland-Joergensen, T., McKenney, P., Taht, D.,+} Ed., and > [-J. > Iyengar, Ed., "Controlled Delay-] > {+Gettys, J., "The Flow Queue CoDel Packet Scheduler and+} > Active Queue [-Management",-] {+Management Algorithm",+} RFC > [-8289,-] {+8290,+} > DOI [-10.17487/RFC8289,-] {+10.17487/RFC8290,+} January 2018, > [-<https://www.rfc-editor.org/info/rfc8289>.-] > {+<https://www.rfc-editor.org/info/rfc8290>.+} > > Authors' Addresses > _______________________________________________ Bloat mailing list Bloat@lists.bufferbloat.net https://lists.bufferbloat.net/listinfo/bloat