Re: [Flent-users] irtt packet-cluster mode for estimating queueing delay

2018-05-07 Thread Pete Heist

> On May 6, 2018, at 9:17 PM, Sebastian Moeller  wrote:
> 
> you could also try different packet sizes for your probes. The size ratio
> should give you an idea of how much of the RTT difference is explained by
> simple transit times, and anything on top of that might be related to
> queueing. I am probably severely underestimating the scope of your question,
> but I believe this also has the advantage that, unlike packet tuples, it
> will not be as sensitive to bonded path segments (which probably do not
> exist in your environment in the first place).

I was also considering adding an option to irtt for different packet sizes within 
each cluster. I’d have to change the IPDV calculation so that it’s only computed 
between like-sized packets, which is possible.
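
Roughly what I mean for the IPDV change, as a sketch (the probe record and
function below are made up for illustration, not irtt’s actual internals):

package main

import (
	"fmt"
	"time"
)

// probe is a hypothetical record of one reply: payload length and measured delay.
type probe struct {
	length int
	delay  time.Duration
}

// ipdvByLength computes successive delay differences (IPDV) only between
// like-sized probes, so mixed probe sizes within a cluster don't pollute
// the delay-variation statistics.
func ipdvByLength(probes []probe) map[int][]time.Duration {
	last := map[int]time.Duration{}
	seen := map[int]bool{}
	out := map[int][]time.Duration{}
	for _, p := range probes {
		if seen[p.length] {
			out[p.length] = append(out[p.length], p.delay-last[p.length])
		}
		last[p.length] = p.delay
		seen[p.length] = true
	}
	return out
}

func main() {
	samples := []probe{
		{160, 9 * time.Millisecond}, {1400, 12 * time.Millisecond},
		{160, 10 * time.Millisecond}, {1400, 15 * time.Millisecond},
	}
	for length, dv := range ipdvByLength(samples) {
		fmt.Println(length, dv)
	}
}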

I see how using the RTTs for different packet lengths might help estimate the 
transmission time, but not, on its own, how I could use that to differentiate 
between channel contention and queueing delays(?) Maybe what you’re saying is 
that in addition to varying the DSCP value, I can also vary the length to 
estimate the transmission time, and then subtract that from the estimate I get 
for “contention”, for a better approximation of how much contention time there 
actually is. That sounds like a good idea... :)
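
As a back-of-the-envelope sketch of that size-based estimate (the sizes and
minimum RTTs below are made-up numbers; the idea is just that whatever sits
above the size-fitted baseline is a candidate for queueing or contention delay):

package main

import (
	"fmt"
	"time"
)

func main() {
	// Hypothetical minimum RTTs observed for two probe sizes:
	smallLen, largeLen := 160, 1400 // payload bytes
	minSmall := 9 * time.Millisecond
	minLarge := 12 * time.Millisecond

	// Per-byte transit (serialization) time across the whole path, and the
	// size-independent floor (propagation, forwarding, etc.):
	perByte := (minLarge - minSmall) / time.Duration(largeLen-smallLen)
	floor := minSmall - time.Duration(smallLen)*perByte

	// For any individual sample, whatever sits above the baseline for its
	// size is a candidate for queueing and/or contention delay:
	sampleLen, sampleRTT := 1400, 19*time.Millisecond
	baseline := floor + time.Duration(sampleLen)*perByte
	fmt.Println("per-byte:", perByte, "floor:", floor)
	fmt.Println("delay above baseline:", sampleRTT-baseline)
}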

I don’t know of any bonded segments in this case…




Re: [Flent-users] irtt packet-cluster mode for estimating queueing delay

2018-05-06 Thread Pete Heist

> On May 6, 2018, at 10:51 PM, Pete Heist  wrote:
>> 
>> Ah, so this is a wired connection? And you are only targeting a
>> particular setup?
> 
> The backhaul is a mixture of wired and wireless links, each with an ALIX or 
> APU router. A 5 GHz wireless node might contain:
> 
> NanoStation M5   <—Ethernet—>   APU   <—Ethernet—>   NanoStation M5

BTW, I know the perils of soft rate limiting for WiFi connections, but there’s 
no other practical option here (I don’t think they’ll want to run LEDE on their 
UBNT hardware), and that’s the way it had been working for years before I saw it. 
:)

I see a rate limit of 80 Mbit set on a point-to-point connection between two 
NSM5s, for example, whose real-world throughput might be 90 Mbit. It wouldn’t 
surprise me if control of the queue is sometimes lost...


Re: [Flent-users] irtt packet-cluster mode for estimating queueing delay

2018-05-06 Thread Pete Heist

> On May 6, 2018, at 9:27 PM, Toke Høiland-Jørgensen  wrote:
>> 
>> The backhaul I’d like to test it on uses mostly NSM5s as the wireless
>> devices and APUs for routing, QoS, etc. The QoS scripts use the htb,
>> sfq and prio qdiscs. I’m hoping I can just add a prio qdisc / tc
>> filter somewhere in the existing rules.
> 
> Ah, so this is a wired connection? And you are only targeting a
> particular setup?

The backhaul is a mixture of wired and wireless links, each with an ALIX or APU 
router. A 5 GHz wireless node might contain:

NanoStation M5   <—Ethernet—>   APU   <—Ethernet—>   NanoStation M5

Meaning, the links to other nodes are (usually) wireless, but the connections 
to the APU routers are wired, obviously. Some links are fiber, 10 GHz, 60 GHz, 
etc...

http://mapa.czfree.net/#lat=50.75798855854454=15.045905113220215=13=1=satellite=98%7C114%7C111%7C117%7C109%7C111%7C118%7C115%7C107%7C97=19444=0=0=1=1=1=1;
 


> If you are running sfq you are probably not going to
> measure a lot of queueing delay, though, as the UDP flow will just get
> its own queue…

Yeah, great point. :) I guess there will be some inter-flow latency due to fair 
queueing though, depending on the number of flows. In the lab, Cake and 
fq_codel can improve upon sfq’s inter-flow latency.
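
Back-of-the-envelope for that, assuming round-robin service of full-size
packets (the flow count and rate below are made up):

package main

import (
	"fmt"
	"time"
)

func main() {
	const (
		backloggedFlows = 20   // hypothetical number of bulk flows
		pktBytes        = 1514 // full-size Ethernet frame
		rateMbit        = 80.0 // hypothetical shaped rate
	)
	// Serialization time of one full-size packet at the shaped rate:
	perPkt := time.Duration(float64(pktBytes*8) / (rateMbit * 1e6) * float64(time.Second))
	// Under round-robin fair queueing, a sparse flow's packet can wait for
	// roughly one full-size packet from each backlogged flow:
	fmt.Println("per-packet:", perPkt, " worst-case inter-flow wait:",
		time.Duration(backloggedFlows)*perPkt)
}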

So far, what I’m proposing was only supposed to determine the relative 
contribution to latency in the backhaul from channel contention vs queueing, 
but my ultimate goal is to demonstrate how Cake or fq_codel improves (or does 
not improve) upon sfq in the backhaul. I could run flent through various paths 
in the backhaul, if I’m allowed. It would also be nice if, in addition to that, 
I could say here’s how much Cake or fq_codel improves intra-flow and inter-flow 
latency for real-world traffic. If some improvement can be shown, it may 
justify the development of the ISP-flavored Cake that has been discussed.

>> I now added SmokePing ICMP and IRTT targets for the same LAN host, and
>> can look at that distribution after a day to judge the overhead.

I’m already getting the picture (attached), which probably confirms what we 
already know: irtt shows slightly higher and more variable response times, but 
not tens of milliseconds higher. That should hold true under higher network 
load as well, assuming the irtt devices themselves are not loaded.

> Yeah, ICMP is definitely treated differently in many places... Another
> example is routers and switches that have a data plane implemented in
> hardware will reply to ICMP from the software control plane, which is
> way *slower* than regular forwarding... Also, ICMP is often subject to
> rate limiting... etc, etc...

Ok, that’s also interesting. That makes it harder to determine what ping is 
actually measuring. :)



Re: [Flent-users] irtt packet-cluster mode for estimating queueing delay

2018-05-06 Thread Sebastian Moeller


> On May 6, 2018, at 21:32, Toke Høiland-Jørgensen  wrote:
> 
> Sebastian Moeller  writes:
>>> [...]
>>> Yeah, ICMP is definitely treated differently in many places... Another
>>> example is routers and switches that have a data plane implemented in
>>> hardware will reply to ICMP from the software control plane, which is
>>> way *slower* than regular forwarding...
>> 
>>  But that "should" only affect the ICMP reflector, no? I always
>>  assumed that routers will treat ICMP packets that they only pass
>>  though just like any other IP packet...
> 
> Sure, it depends on what you are pinging...

Ah, okay, I am relieved ;)

> 
>>> Also, ICMP is often subject to
>>> rate limiting... etc, etc...
>>> 
>> 
>>  If I understand posts on the NANOG list correctly, there is also a
>>  trend towards limiting UDP as well.
> 
> Eh? That would mean limiting half the internet (QUIC is UDP, for
> instance)...

Exactly my sentiment when I read it, but I might have misunderstood the 
amount of limiting...

Best Regards
Sebastian


> -Toke




Re: [Flent-users] irtt packet-cluster mode for estimating queueing delay

2018-05-06 Thread Toke Høiland-Jørgensen
Sebastian Moeller  writes:

>> On May 6, 2018, at 21:27, Toke Høiland-Jørgensen  wrote:
>> 
>> Pete Heist  writes:
>> 
>>>> On May 6, 2018, at 5:54 PM, Toke Høiland-Jørgensen  wrote:
>>>> 
>>>> Pete Heist  writes:
>>>> 
>>>>> Channel contention delay may be estimated by the difference between
>>>>> the round-trip times for the strict priority VO and BE packets (0x01
>>>>> and 0xb9), and queueing delay between the regular vs strict priority
>>>>> VO packets (0xb8 and 0xb9), right?
>>>> 
>>>> I like the idea, but is there any equipment that actually implements
>>>> strict priority queueing within a single QoS level? Or how are you
>>>> planning to elicit this behavior?
>>> 
>>> The backhaul I’d like to test it on uses mostly NSM5s as the wireless
>>> devices and APUs for routing, QoS, etc. The QoS scripts use the htb,
>>> sfq and prio qdiscs. I’m hoping I can just add a prio qdisc / tc
>>> filter somewhere in the existing rules.
>> 
>> Ah, so this is a wired connection? And you are only targeting a
>> particular setup? If you are running sfq you are probably not going to
>> measure a lot of queueing delay, though, as the UDP flow will just get
>> its own queue...
>> 
>>>>> Lastly, I’ll need to find out for sure how much impact the use of UDP
>>>>> with a userspace client/server (in Go) is having vs ICMP. I find it hard 
>>>>> to believe that I’m seeing tens of
>>>>> milliseconds going into userspace.
>>>> 
>>>> That does seem a bit much. Hard to tell what is the cause without a more
>>>> controlled experiment, though...
>>> 
>>> Actually, I think it’s impossible that userspace overhead is the
>>> problem here. The irtt client and server devices are completely
>>> independent of the network routing/firewalling hardware, so the CPU
>>> load on them is identical at times of low and high network load.
>>> 
>>> I now added SmokePing ICMP and IRTT targets for the same LAN host, and
>>> can look at that distribution after a day to judge the overhead.
>>> 
>>> I guess it’s possible that ICMP may route faster over the Internet
>>> than UDP even if it isn’t being prioritized, but I would be surprised
>>> if that much faster. Not quite related, but I also find this
>>> interesting:
>>> 
>>> https://perso.uclouvain.be/olivier.bonaventure/blog/html/2013/05/22/don_t_use_ping_for_delay_measurements.html
>> 
>> Yeah, ICMP is definitely treated differently in many places... Another
>> example is routers and switches that have a data plane implemented in
>> hardware will reply to ICMP from the software control plane, which is
>> way *slower* than regular forwarding...
>
>   But that "should" only affect the ICMP reflector, no? I always
>   assumed that routers will treat ICMP packets that they only pass
>   though just like any other IP packet...

Sure, it depends on what you are pinging...

>> Also, ICMP is often subject to
>> rate limiting... etc, etc...
>> 
>
>   If I understand posts on the NANOG list correctly, there is also a
>   trend towards limiting UDP as well.

Eh? That would mean limiting half the internet (QUIC is UDP, for
instance)...

-Toke



Re: [Flent-users] irtt packet-cluster mode for estimating queueing delay

2018-05-06 Thread Toke Høiland-Jørgensen
Pete Heist  writes:

>> On May 6, 2018, at 5:54 PM, Toke Høiland-Jørgensen  wrote:
>> 
>> Pete Heist  writes:
>> 
>>> Channel contention delay may be estimated by the difference between
>>> the round-trip times for the strict priority VO and BE packets (0x01
>>> and 0xb9), and queueing delay between the regular vs strict priority
>>> VO packets (0xb8 and 0xb9), right?
>> 
>> I like the idea, but is there any equipment that actually implements
>> strict priority queueing within a single QoS level? Or how are you
>> planning to elicit this behavior?
>
> The backhaul I’d like to test it on uses mostly NSM5s as the wireless
> devices and APUs for routing, QoS, etc. The QoS scripts use the htb,
> sfq and prio qdiscs. I’m hoping I can just add a prio qdisc / tc
> filter somewhere in the existing rules.

Ah, so this is a wired connection? And you are only targeting a
particular setup? If you are running sfq you are probably not going to
measure a lot of queueing delay, though, as the UDP flow will just get
its own queue...

>>> Lastly, I’ll need to find out for sure how much impact the use of UDP
>>> with a userspace client/server (in Go) is having vs ICMP. I find it hard to 
>>> believe that I’m seeing tens of
>>> milliseconds going into userspace.
>> 
>> That does seem a bit much. Hard to tell what is the cause without a more
>> controlled experiment, though...
>
> Actually, I think it’s impossible that userspace overhead is the
> problem here. The irtt client and server devices are completely
> independent of the network routing/firewalling hardware, so the CPU
> load on them is identical at times of low and high network load.
>
> I now added SmokePing ICMP and IRTT targets for the same LAN host, and
> can look at that distribution after a day to judge the overhead.
>
> I guess it’s possible that ICMP may route faster over the Internet
>than UDP even if it isn’t being prioritized, but I would be surprised
>if that much faster. Not quite related, but I also find this
>interesting:
>
> https://perso.uclouvain.be/olivier.bonaventure/blog/html/2013/05/22/don_t_use_ping_for_delay_measurements.html

Yeah, ICMP is definitely treated differently in many places... Another
example is routers and switches that have a data plane implemented in
hardware will reply to ICMP from the software control plane, which is
way *slower* than regular forwarding... Also, ICMP is often subject to
rate limiting... etc, etc...

-Toke



Re: [Flent-users] irtt packet-cluster mode for estimating queueing delay

2018-05-06 Thread Sebastian Moeller
Hi Pete,

you could also try different packet sizes for your probes. The size ratio should 
give you an idea of how much of the RTT difference is explained by simple transit 
times, and anything on top of that might be related to queueing. I am probably 
severely underestimating the scope of your question, but I believe this also has 
the advantage that, unlike packet tuples, it will not be as sensitive to bonded 
path segments (which probably do not exist in your environment in the first 
place).

Best Regards
Sebastian

> On May 6, 2018, at 20:28, Pete Heist  wrote:
> 
> 
>> On May 6, 2018, at 5:54 PM, Toke Høiland-Jørgensen  wrote:
>> 
>> Pete Heist  writes:
>> 
>>> Channel contention delay may be estimated by the difference between
>>> the round-trip times for the strict priority VO and BE packets (0x01
>>> and 0xb9), and queueing delay between the regular vs strict priority
>>> VO packets (0xb8 and 0xb9), right?
>> 
>> I like the idea, but is there any equipment that actually implements
>> strict priority queueing within a single QoS level? Or how are you
>> planning to elicit this behavior?
> 
> The backhaul I’d like to test it on uses mostly NSM5s as the wireless devices 
> and APUs for routing, QoS, etc. The QoS scripts use the htb, sfq and prio 
> qdiscs. I’m hoping I can just add a prio qdisc / tc filter somewhere in the 
> existing rules.
> 
>>> Lastly, I’ll need to find out for sure how much impact the use of UDP
>>> with a userspace client/server (in Go) is having vs ICMP. I find it hard to 
>>> believe that I’m seeing tens of
>>> milliseconds going into userspace.
>> 
>> That does seem a bit much. Hard to tell what is the cause without a more
>> controlled experiment, though...
> 
> Actually, I think it’s impossible that userspace overhead is the problem 
> here. The irtt client and server devices are completely independent of the 
> network routing/firewalling hardware, so the CPU load on them is identical at 
> times of low and high network load.
> 
> I now added SmokePing ICMP and IRTT targets for the same LAN host, and can 
> look at that distribution after a day to judge the overhead.
> 
> I guess it’s possible that ICMP may route faster over the Internet than UDP 
> even if it isn’t being prioritized, but I would be surprised if that much 
> faster. Not quite related, but I also find this interesting:
> 
> https://perso.uclouvain.be/olivier.bonaventure/blog/html/2013/05/22/don_t_use_ping_for_delay_measurements.html
> 
> 




Re: [Flent-users] irtt packet-cluster mode for estimating queueing delay

2018-05-06 Thread Pete Heist

> On May 6, 2018, at 5:54 PM, Toke Høiland-Jørgensen  wrote:
> 
> Pete Heist  writes:
> 
>> Channel contention delay may be estimated by the difference between
>> the round-trip times for the strict priority VO and BE packets (0x01
>> and 0xb9), and queueing delay between the regular vs strict priority
>> VO packets (0xb8 and 0xb9), right?
> 
> I like the idea, but is there any equipment that actually implements
> strict priority queueing within a single QoS level? Or how are you
> planning to elicit this behavior?

The backhaul I’d like to test it on uses mostly NSM5s as the wireless devices 
and APUs for routing, QoS, etc. The QoS scripts use the htb, sfq and prio 
qdiscs. I’m hoping I can just add a prio qdisc / tc filter somewhere in the 
existing rules.

>> Lastly, I’ll need to find out for sure how much impact the use of UDP
>> with a userspace client/server (in Go) is having vs ICMP. I find it hard to 
>> believe that I’m seeing tens of
>> milliseconds going into userspace.
> 
> That does seem a bit much. Hard to tell what is the cause without a more
> controlled experiment, though...

Actually, I think it’s impossible that userspace overhead is the problem here. 
The irtt client and server devices are completely independent of the network 
routing/firewalling hardware, so the CPU load on them is identical at times of 
low and high network load.

I now added SmokePing ICMP and IRTT targets for the same LAN host, and can look 
at that distribution after a day to judge the overhead.

I guess it’s possible that ICMP may route faster over the Internet than UDP 
even if it isn’t being prioritized, but I would be surprised if that much 
faster. Not quite related, but I also find this interesting:

https://perso.uclouvain.be/olivier.bonaventure/blog/html/2013/05/22/don_t_use_ping_for_delay_measurements.html




Re: [Flent-users] irtt packet-cluster mode for estimating queueing delay

2018-05-06 Thread Toke Høiland-Jørgensen
Pete Heist  writes:

> I’d like to add the ability for https://github.com/peteheist/irtt
> to send N request packets per interval, and to optionally vary, at a
> minimum, the DSCP value. I’d then like to use this to try to estimate
> queueing delay (technique discussed below). Would anyone else have a use
> for this, or have any requests/suggestions/warnings about it?
>
> This was initially inspired by a link Toke posted earlier to a paper
> about Skype’s ping-pair technique:
> https://www.microsoft.com/en-us/research/wp-content/uploads/2017/09/PingPair-CoNEXT2017.pdf
> The technique estimates WiFi congestion by sending back-to-back ICMP
> echo requests to the AP, one with DSCP 0x00 and another 0xb8. The
> difference in round-trip times is used to quantify the contribution to
> observed delay from either self-congestion or cross-traffic, then this
> is fed back into the congestion control mechanism in Skype so as not
> to back off as aggressively in the cross-traffic case. There’s more
> historical literature on the "packet-pair” technique in general, and
> even “packet-triplets” or related techniques for bandwidth estimation.
>
> As for me, I’d like to try to estimate queueing delay in a WiFi
> backhaul, and to determine what proportion of the overall delay is due
> to queueing vs transmission time or channel contention. Before coming
> up with a technique to do this, I’m imagining an experiment that sends
> a "packet-quadruplet” :) with four different DSCP values, two of which
> will be classified into WMM’s BE queue, and two into the VO queue, and
> two of which are given strict priority, while two are not. For sake of
> argument, let’s use these values:
>
> 0x00: regular priority in QoS, WMM’s BE queue
> 0x01: strict priority in QoS, WMM’s BE queue
> 0xb8: regular priority in QoS, WMM’s VO queue
> 0xb9: strict priority in QoS, WMM’s VO queue
>
> Channel contention delay may be estimated by the difference between
> the round-trip times for the strict priority VO and BE packets (0x01
> and 0xb9), and queueing delay between the regular vs strict priority
> VO packets (0xb8 and 0xb9), right? Theoretically, the differences
> between 0x00 and 0xb8, then between 0x00 and 0x01 could be used to
> cross-check these numbers, but I’m not sure exactly what to expect
> (thus the experiment).

I like the idea, but is there any equipment that actually implements
strict priority queueing within a single QoS level? Or how are you
planning to elicit this behaviour?

> Lastly, I’ll need to find out for sure how much impact the use of UDP
> with a userspace client/server (in Go) is having vs ICMP. Attached is
> a SmokePing graph of ICMP vs IRTT on the same Internet route. The
> minimums are quite similar, but the deviations under load are rather
> different. I find it hard to believe that I’m seeing tens of
> milliseconds going into userspace. I think there may be some
> prioritization of ICMP somewhere on the route, but I need to somehow
> prove definitively whether this is the case or not!

That does seem a bit much. Hard to tell what is the cause without a more
controlled experiment, though...

-Toke
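
For reference, the delay-difference arithmetic in the quoted proposal could be
sketched like this (the TOS values are the ones from the proposal; the type and
function names are only illustrative, not irtt’s actual API):

package main

import (
	"fmt"
	"time"
)

const (
	tosBE       = 0x00 // regular priority in QoS, WMM BE queue
	tosBEStrict = 0x01 // strict priority in QoS, WMM BE queue
	tosVO       = 0xb8 // regular priority in QoS, WMM VO queue (EF)
	tosVOStrict = 0xb9 // strict priority in QoS, WMM VO queue
)

// cluster holds the RTTs of one back-to-back probe quadruplet, keyed by TOS.
type cluster map[byte]time.Duration

// contention estimates channel-contention delay as the difference between the
// strict-priority BE and strict-priority VO round trips (0x01 vs 0xb9).
func contention(c cluster) time.Duration { return c[tosBEStrict] - c[tosVOStrict] }

// queueing estimates queueing delay as the difference between the regular and
// strict-priority VO round trips (0xb8 vs 0xb9).
func queueing(c cluster) time.Duration { return c[tosVO] - c[tosVOStrict] }

func main() {
	c := cluster{ // made-up RTTs for one cluster
		tosBE: 21 * time.Millisecond, tosBEStrict: 14 * time.Millisecond,
		tosVO: 11 * time.Millisecond, tosVOStrict: 8 * time.Millisecond,
	}
	fmt.Println("contention ≈", contention(c), " queueing ≈", queueing(c))
	// Cross-checks from the proposal: (0x00 vs 0xb8) and (0x00 vs 0x01).
	fmt.Println("cross-checks:", c[tosBE]-c[tosVO], c[tosBE]-c[tosBEStrict])
}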
