[Inline]

On 01/23/2014 01:18 AM, Dave Taht wrote:
> On Thu, Jan 23, 2014 at 1:15 AM, Dave Taht <dave.t...@gmail.com> wrote:
>> On Thu, Jan 23, 2014 at 1:10 AM, Fred Baker (fred) <f...@cisco.com> wrote:
>>> No, you're not blowing smoke. I'm not sure I would compare the behavior to
>>> PMTUD, as in that the endpoint is given a magic number and manages to it,
>>> where in this case, it is given the results of its behavior, and it manages
>>> to improve that.
>>>
>>> But this is what I have rambled on about in threads relating to the size of
>>> a buffer. Folks would really like to have a magic way to calculate the
>>> buffer size (an amount of memory) they need to install in a router or
>>> switch, and it isn't that easy, because it has a lot to do with where in the
>>> network a system is located and how it is used by the applications that use
>>> it. But AQM, in the end, isn't about buffer size. It is about buffer
>>> occupancy. In the ideal case, if there are N sessions active on the
>>> bottleneck link in a path, we would like each to obtain 1/N of the
>>> bottleneck's capacity, which is to say that it should be able to maximize
>>> its throughput, while keeping an average of zero packets standing in the
>>> queue (minimizing both latency and variation in latency). If you know your
>>> math, you know that the ideal goal isn't actually achievable. But that
>>> doesn't stop us from trying to asymptotically approach it.
>> I prefer to think of the goal as to keep a minimum of 1 packet in the queue,
>> not as an average of 0.
> And that's not strictly true either. In the case of wifi, and other bundling
> technologies like those used in cable, you want to keep a minimum of a "good"
> aggregate of packets in the queue for that technology.

From my pointy queue-head, occupancy is something that averages over
time, and has a good value if it's something less than 1.  However,
that's not necessarily one packet.

In cases where packets are spaced out in time, an average of
0.<something> packets is what I expect.  Let's say 0.7, arbitrarily.
When they come in trains, "bullet trains", or more likely, "thundering
herds", then 0.7 trains are what I expect to see.

The size of trains can be measured, but I expect the real thing we care
about is whether the queue drains back to zero during the time period
you're using for your measurements.  If it doesn't, you're about to be
in a world of hurt (:-))  With bad algorithms, it's only the physical
size limit of the buffer that forces an eventual drop and lets
congestion control happen at all.  In pure queuing terms, we'd expect
the queue to grow "without bound", and the latency to head for
infinity.

<start going off-topic>
Sun used to measure queue occupancy as a Riemann sum, which allowed one
to graph it like this:

 1.0            ||
                ||
 0.66  ______   ||
      |      |  ||
      |      |  ||
 0.0 _|      |__||__

The first one is 0.66 for 5 time units, the second 1.0 for one unit.
Both fall to zero, so the system catches up. They have different effects
on other traffic, which is why I find them interesting: as CPU queues,
the first hurts interactivity more than the second.

I *think* this will be of interest to us, but probably later.  I've
pasted in a chunk of the kstat.h comment at the very bottom for later
reference...
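
Roughly, the bookkeeping that comment describes looks like this.  This
is my own sketch with invented names, not the Solaris code: on every
enqueue or dequeue, add elapsed_time * qlen to a running sum, and add
the elapsed time to a "busy" counter whenever the queue was non-empty.

#include <stdint.h>

/*
 * Sketch of kstat-style queue accounting (invented names, not the
 * Solaris code): keep a running Riemann sum of queue length integrated
 * over time, plus the total time the queue was non-empty.
 */
struct qstats {
    uint64_t last_update;   /* timestamp of last state change (ns) */
    uint64_t qlen;          /* current queue length */
    uint64_t lentime_sum;   /* running sum of qlen * elapsed time */
    uint64_t busy_time;     /* total time with qlen > 0 */
};

/* Call on every enqueue or dequeue, before qlen changes. */
static void qstats_update(struct qstats *q, uint64_t now_ns)
{
    uint64_t elapsed = now_ns - q->last_update;

    q->lentime_sum += q->qlen * elapsed;   /* length*time Riemann sum */
    if (q->qlen > 0)
        q->busy_time += elapsed;           /* "active" time */
    q->last_update = now_ns;
}

/* Average occupancy over an observation interval of length period_ns. */
static double qstats_avg_occupancy(const struct qstats *q,
                                   uint64_t period_ns)
{
    return (double)q->lentime_sum / (double)period_ns;
}

Divide lentime_sum by the length of the interval and you get exactly
the 0.66 and 1.0 figures in the little graph above.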
</off-topic>

Because occupancy can be measured in units of time, we don't have to
know buffer and packet *sizes*:

  * A well-performing queue/buffer system will have an average queue
    occupancy of < 1.
  * A buffer is of sufficient size if the queue occupancy falls to zero
    repeatedly during the time period.  Another way of saying the same
    thing is that the queue empties.
  * An operable queue management system sends congestion notifications
    if the queue/buffer does not empty.
  * A good queue management system sends congestion notifications in
    time to keep the queue/buffer emptying regularly.

The parameter here is the time period, which has to do with the maximum
rate at which packets (work) can arrive.  In other words, it's a
parameter based on the speed of the link.
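
Purely as a strawman (the function names and the "100 MTUs" constant
below are invented for illustration), the whole scheme boils down to
something like this:

#include <stdbool.h>
#include <stdint.h>

/*
 * Strawman only: names and the "100 MTUs" constant are invented for
 * illustration.  Pick a measurement period from the link speed, then
 * signal congestion whenever the queue fails to empty within it.
 */
static uint64_t measurement_period_ns(uint64_t link_bits_per_sec,
                                      uint64_t mtu_bytes)
{
    /* Arbitrary choice: time to transmit ~100 MTU-sized packets. */
    return (100ULL * mtu_bytes * 8ULL * 1000000000ULL) / link_bits_per_sec;
}

static bool should_signal_congestion(bool queue_emptied_this_period)
{
    /* If the queue never drained, the senders are outrunning the link. */
    return !queue_emptied_this_period;
}

The only thing that scales with the hardware is the period; everything
else is in units of time or "did it empty", not bytes.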

--dave
[By the way, there are lots of better queue-heads than I.  The serious
mathies have way better intuition about what you're seeing than I do.]

>
>>> On Jan 17, 2014, at 3:51 PM, David Collier-Brown <dave...@rogers.com> wrote:
>>>
>>> I've been reading through the internet-drafts, and one paragraph struck me
>>> as very illuminating.  This is therefor a sanity-check before I go full-hog
>>> down a particular path...
>>>
>>> The comment is from Baker and Fairhurst,
>>> https://datatracker.ietf.org/doc/draft-ietf-aqm-recommendation/ and the
>>> paragraph is [emphases added]
>>>
>>> The point of buffering in the network is to absorb data bursts and to
>>> transmit them during the (hopefully) ensuing bursts of silence. This is
>>> essential to permit the transmission of bursty data. Normally small queues
>>> are preferred in network devices, with sufficient queue capacity to absorb
>>> the bursts. The counter-intuitive result is that maintaining normally-small
>>> queues can result in higher throughput as well as lower end-to- end delay.
>>> In summary, queue limits should not reflect the steady state queues we want
>>> to be maintained in the network; instead, they should reflect the size of
>>> bursts that a network device needs to absorb.
>>>
>>> All of a sudden we're talking about the kinds of queues I know a little
>>> about (:-))
>>> ---
>>>
>>> I'm going to suggest that these are queues and associated physical buffers
>>> that do two things:
>>>
>>> hold packets that arrive at a bottleneck for a long as it takes to send them
>>> out a slower link that they came in on, and
>>> hold bursts of packets that arrive adjacent to each other until they can be
>>> sent out in a normal spacing, with some small amount of time between them
>>>
>>>
>>> In an illustration of Dave Taht's, the first looks something like this
>>>
>>> -------------------------+
>>>     |X|            |X|      +-------------------
>>>     |X|            |X|         |XXX|    |XXX|
>>>     |X|            |X|      +-------------------
>>> -------------------------+
>>>
>>> At the choke-point there is a buffer at least big enough to give the packet
>>> a chance to wheel from line into column (:-)) and start down the smaller
>>> pipe.
>>>
>>> The speed at which the acks come back,  the frequency of drops, and any
>>> explicit congestion notifications slows the sender until they don't overload
>>> the skinnier pipe, thus spacing the packets in the fatter pipe out.
>>>
>>> Various causes [Leland] can slow or speed the packets in the fat pipe,
>>> making it possible for several to arrive adjacent to each other, followed by
>>> a gap.  The second purpose of a buffer is to hold  these bursts while things
>>> space themselves back out.
>>>
>>> They need to be big enough at minimum to do the speed matching,  and at
>>> maximum,  big enough to spread a burst back into a normal progression,
>>> always assuming that acks, drops and explicit congestion notifications are
>>> slowing the sender to the speed of the slowest part of the network.
>>> ---
>>>
>>> If I'm right about this, we can draw some helpful conclusions
>>>
>>> buffer sizes can be set based on measurements:
>>>
>>> speed differences, which are pretty static, plus
>>> observed burstyness
>>>
>>> drops and ECN can be done to match the slowest speed in the path
>>>
>>> The latter suddenly sounds a bit like path MTU discovery, except it's a bit
>>> more dynamic, and varies with both path and what's happening in various
>>> parts of it.
>>>
>>> To me, as a capacity/performance nerd, this sounds a lot more familiar and
>>> manageable. My question to you, before I start madly scribbling on the
>>> internet drafts is:
>>>
>>>     Am I blowing smoke?
>>>
>>> --dave
>>>
>>> --
>>> David Collier-Brown,         | Always do right. This will gratify
>>> System Programmer and Author | some people and astonish the rest
>>> dav...@spamcop.net           |                      -- Mark Twain
>>> (416) 223-8968
>>>
>>> _______________________________________________
>>>
>>> aqm mailing list
>>> aqm@ietf.org
>>> https://www.ietf.org/mailman/listinfo/aqm
>>>
>>>
>>>
>>
>>
>> --
>> Dave Täht
>>
>> Fixing bufferbloat with cerowrt: 
>> http://www.teklibre.com/cerowrt/subscribe.html
>
>

       /*
         * Accumulated time and queue length statistics.
         *
         * Accumulated time statistics are kept as a running sum
         * of "active" time.  Queue length statistics are kept as a
         * running sum of the product of queue length and elapsed time
         * at that length -- i.e., a Riemann sum for queue length
         * integrated against time.  (You can also think of the active time
         * as a Riemann sum, for the boolean function (queue_length > 0)
         * integrated against time, or you can think of it as the
         * Lebesgue measure of the set on which queue_length > 0.)
         *
         *              ^
         *              |                       _________
         *              8                       | i4    |
         *              |                       |       |
         *      Queue   6                       |       |
         *      Length  |       _________       |       |
         *              4       | i2    |_______|       |
         *              |       |           i3          |
         *              2_______|                       |
         *              |    i1                         |
         *              |_______________________________|
         *              Time->  t1      t2      t3      t4
         * At each change of state (entry or exit from the queue),
         * we add the elapsed time (since the previous state change)
         * to the active time if the queue length was non-zero during
         * that interval; and we add the product of the elapsed time
         * times the queue length to the running length*time sum.
         *
         * This method is generalizable to measuring residency
         * in any defined system: instead of queue lengths, think
         * of "outstanding RPC calls to server X".
         *
         * A large number of I/O subsystems have at least two basic
         * "lists" of transactions they manage: one for transactions
         * that have been accepted for processing but for which processing
         * has yet to begin, and one for transactions which are actively
         * being processed (but not done). For this reason, two cumulative
         * time statistics are defined here: wait (pre-service) time,
         * and run (service) time.
         *
         * All times are 64-bit nanoseconds (hrtime_t), as returned by
         * gethrtime().
         *
         * The units of cumulative busy time are accumulated nanoseconds.
         * The units of cumulative length*time products are elapsed time
         * times queue length.        
         */


The ASCII art above is an excerpt from /usr/include/sys/kstat.h on Solaris 10



-- 
David Collier-Brown,         | Always do right. This will gratify
System Programmer and Author | some people and astonish the rest
dav...@spamcop.net           |                      -- Mark Twain
(416) 223-8968

_______________________________________________
aqm mailing list
aqm@ietf.org
https://www.ietf.org/mailman/listinfo/aqm
