Re: [LARTC] HTB and PRIO qdiscs introducing extra latency when output interface is saturated

2005-10-19 Thread Andy Furniss

Andy Furniss wrote:

Andy Furniss wrote:

I haven't checked those figures or tested close to limits though; the 
12k burst would need increasing a bit as well or that will slightly 
over-limit the rate at HZ=1000.



It seems that htb still uses ip level for burst so 12k is enough.

With the overhead at 38 I can ceil at 99mbit OK.



Jonathan spotted that on eth skb->len is the ip len + 14, so the overhead 
should be 24 rather than 38.
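
So if anyone is copying this, the top class line presumably wants the smaller 
figure, something like this (untested sketch - I haven't re-worked out what the 
right mpu would be with that accounting, so I've left it off):

tc class add dev eth0 parent 2:0 classid 2:1 htb rate 99Mbit ceil 99Mbit \
    quantum 1500 burst 12k cburst 12k overhead 24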


Andy.


___
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc


Re: [LARTC] HTB and PRIO qdiscs introducing extra latency when output interface is saturated

2005-08-20 Thread Jonathan Lynch

I did a number of tests and there doesn't appear to be any noticeable
difference between using CPU and JIFFIES (HZ=1000) as the packet scheduler
clock source. 

I didn't mention that the outgoing interface on the core router has two ip
addresses. One is vlan tagged for the test network I'm running and the
other is not tagged, so the vast majority of outgoing traffic will be
tagged on the outgoing interface. 


Question 1. When sending a packet on a tagged interface, the 4-byte vlan
header is added between the source address and the ethertype field. As
far as I know the tag is added and removed in the device driver, so it
does not show up when using ethereal or tcpdump. The driver in use is
the e1000, which supports VLANs. As a result the minimum packet size for
tagged frames is 68 bytes and the maximum is 1522 bytes.

When using mpu and overhead in htb, does that mean that the overhead
should be increased to 42 (38 + 4 for the vlan header) and the mpu to 88
(42 + 46)? There are both tagged and untagged frames on this
interface.
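
i.e. on each class line that has a rate, something like this (just a sketch
reusing the class settings from my existing config):

tc class add dev $1 parent 2:1 classid 2:10 htb rate 1500kbit ceil 90Mbit \
    quantum 1500 burst 12k cburst 12k overhead 42 mpu 88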


I have run tests using an mpu of 84 with overhead 38, and an mpu of 88 with
overhead 42, while increasing the ceil to 90, 95, 97 and 99Mbit. There
are spikes in the delay, and the higher I increase the ceil, the higher
the spikes are. 

I have graphs of the tests I did. The filename of each graph should
explain the settings that were used in the test: hysteresis set to 0,
CPU as the packet scheduler clock source, mpu and overhead (84 and 38
respectively) or vlan mpu and overhead (88 and 42 respectively), and then
the ceil at either 90, 95 or 99 mbit. 


21957_hysteresis_CPU_mpu_overhead_95_ceil.png
22762_hysteresis_CPU_vlan_mpu_overhead_90_ceil.png 
22875_hysteresis_CPU_vlan_mpu_overhead_99_ceil.png 
24135_hysteresis_CPU_mpu_overhead_90_ceil.png  
24143_hysteresis_CPU_vlan_mpu_overhead_95_ceil.png 
24262_hysteresis_CPU_mpu_overhead_99_ceil.png   

They can be found at

http://www.compsoc.nuigalway.ie/~jlynch/htb/

   

MPU and overhead seem to be used mainly in places where the frame size
is fixed. Does it make a difference using them with Ethernet, where the
frame size is variable?

When you said you could ceil at 99mbit OK, did you look at the max
delay? Did you notice spikes like those in my graphs? Do you have any
idea what could be causing these spikes?



Other observations: 

I was using values from /proc/net/dev to measure the throughput going
through the core router and I noticed that different network drivers
increment the byte counters differently. For example, for a packet with
the maximum Ethernet payload of 1500 bytes, e1000 increments the byte
counter by 1518 bytes, which includes the eth header and trailer, while
3c59x increments by 1514 bytes, which does not include the eth trailer.
For vlan tagged packets e1000 increments the byte counter by 1522 bytes
and 3c59x by 1518 bytes. Have you come across this before?
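
For anyone wanting to reproduce the measurement, the tx byte counter can be
pulled out of /proc/net/dev with something like this (sketch; it assumes the
usual layout where tx bytes is the 9th counter after the interface name):

awk -F: '/eth0:/ { split($2, f, " "); print f[9] }' /proc/net/dev   # tx bytes for eth0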

Also, the source of tc, specifically tc_util.c, refers to this page
http://physics.nist.gov/cuu/Units/binary.html for how rates are specified,
so I presume tc uses these suffixes for rate, ceil etc. This doesn't seem
to be documented anywhere. 

static const struct rate_suffix {
        const char *name;
        double scale;
} suffixes[] = {
        { "bit",    1. },
        { "Kibit",  1024. },
        { "kbit",   1000. },
        { "mibit",  1024.*1024. },
        { "mbit",   1000000. },
        { "gibit",  1024.*1024.*1024. },
        { "gbit",   1000000000. },
        { "tibit",  1024.*1024.*1024.*1024. },
        { "tbit",   1000000000000. },
        { "Bps",    8. },
        { "KiBps",  8.*1024. },
        { "KBps",   8000. },
        { "MiBps",  8.*1024*1024. },
        { "MBps",   8000000. },
        { "GiBps",  8.*1024.*1024.*1024. },
        { "GBps",   8000000000. },
        { "TiBps",  8.*1024.*1024.*1024.*1024. },
        { "TBps",   8000000000000. },
        { NULL }
};
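
If I read that table right, it means for example:

tc class add dev eth0 parent 2:1 classid 2:20 htb rate 90mbit    # 90 * 1,000,000 bit/s
tc class add dev eth0 parent 2:1 classid 2:21 htb rate 90mibit   # 90 * 1,048,576 bit/s
tc class add dev eth0 parent 2:1 classid 2:22 htb rate 10MBps    # 10 * 8,000,000 bit/s

(classids made up, just to show the suffixes), so the 90Mbit and 100Mbit values
we have been using presumably mean decimal megabits - the suffix match in
get_rate() looks case-insensitive.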


Jonathan



On Thu, 2005-08-11 at 17:36 +0100, Andy Furniss wrote: 
 Jonathan Lynch wrote:
 
  
  I don't understand what you mean when you say that if you could sample truly
  randomly you would get a proper distribution. 

  Also, having the timers synchronized will allow for more accurate
  measurements of the delay. I can't see how this would have an impact on
  the pattern.
 
 I mean it's possibly just to do with the test: if a 0ms - 1ms delay is 
 expected then you could see patterns arising depending on how you 
 measure delay/clock drift or something.
 
 Now I have two pings that do intervals < 1 sec - the inetutils GNU ping 
 guys implemented it for me :-), and I also have the iputils one, so I can 
 simulate a stream better.
 
 While doing this I noticed that iputils ping actually gives lower 
 latency readings when sending many pps. Using tcpdump deltas I can see 
 the network latency is the same however many pps I do - it's just that 
 when measuring 1ms delays and doing many pps it seems that some code 
 gets cached (guess) and the reported delay changes as a result.
Re: [LARTC] HTB and PRIO qdiscs introducing extra latency when output interface is saturated

2005-08-11 Thread Andy Furniss

Jonathan Lynch wrote:



I don't understand what you mean when you say that if you could sample truly
randomly you would get a proper distribution. 


Also, having the timers synchronized will allow for more accurate
measurements of the delay. I can't see how this would have an impact on
the pattern.


I mean it's possibly just to do with the test: if a 0ms - 1ms delay is 
expected then you could see patterns arising depending on how you 
measure delay/clock drift or something.


Now I have two pings that do intervals < 1 sec - the inetutils GNU ping 
guys implemented it for me :-), and I also have the iputils one, so I can 
simulate a stream better.


While doing this I noticed that iputils ping actually gives lower 
latency readings when sending many pps. Using tcpdump deltas I can see 
the network latency is the same however many pps I do - it's just that 
when measuring 1ms delays and doing many pps it seems that some code 
gets cached (guess) and the reported delay changes as a result.


I mention that just to illustrate that measuring small delays can be 
misleading and influenced by the exact nature of your setup.






I have the default setting, which is jiffies. There is a comment in the
kernel config for the packet scheduler clock source that mentions that the
resolution of jiffies is too low for accurate shaping except at very low
speeds. I will recompile the kernel and try the CPU option tomorrow to see
if there is any change.


Maybe not in the case of htb - I use CPU and see similar results, the 
comment about accurate shaping was probably written when HZ=100, but I 
suppose it will be better for something :-)






I didn't realise such options existed for htb (mpu + overhead). These parameters are not mentioned in the man pages or in the htb manual. 
I presume I have to patch tc to get these features? 



There is mention on the htb page - it was added as a patch so was not 
designed in, which explains why burst doesn't use it.


You don't need to patch recent iproute2; it's already in there.

Andy.
___
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc


Re: [LARTC] HTB and PRIO qdiscs introducing extra latency when output interface is saturated

2005-08-10 Thread Jonathan Lynch
Andy, thanks for all the feedback. I was away on holidays for the last
week and am only back today. I have a few more questions which are
listed below.



On Wed, 2005-08-03 at 15:04 +0100, Andy Furniss wrote:
 Jonathan Lynch wrote:
  I did the same tests that I outlined earlier, but this time by setting
  hysteresis to 0. The config for the core router is included at the
  bottom. The graphs for the delay of the voip stream and the traffic
  going through the core router can be found at the following addresses.
  
  http://140.203.56.30/~jlynch/htb/core_router_hysteresis.png
  http://140.203.56.30/~jlynch/htb/voip_stream_24761_hysteresis.png
  
  
  The max delay of the stream has dropped to 1.8ms. Again the jitter seems
  to be around 1ms. There seems to be a pattern whereby the delay
  reaches about 1.6ms then drops back to 0.4ms, jumps back to 1.6ms and
  then back to 0.4ms repeatedly, and then it rises from 0.5ms gradually and
  repeats this behaviour. Is there any explanation for this pattern?
  
  Would it have anything to do with burst being 1ms?
 
 Yes I suppose if you could sample truly randomly you would get a proper 
 distribution - I guess the pattern arises because your timers are 
 synchronised for the test.



I don't understand what you mean when you say that if you could sample truly
randomly you would get a proper distribution. 

Also, having the timers synchronized will allow for more accurate
measurements of the delay. I can't see how this would have an impact on
the pattern.


  
  When the ceil is specified as being 90mbit, is this at IP level? 
  What does this correspond to when a Mbit = 1,000,000 bits? I'm a bit
  confused by the way tc interprets this rate.
 
 Yes, htb uses the ip level length (but you can specify overhead and min size); 
 the rate calculations use a lookup table which is likely to have a 
 granularity of 8 bytes (you can see this with tc -s -d class ls .. look 
 for /8 after the burst/cburst).
 
 There is a choice in 2.6 configs about using CPU/jiffies/gettimeofday - 
 I use CPU and now that I've got a ping that does < 1 sec intervals I get 
 the same results as you.
 

I have the default setting, which is jiffies. There is a comment in the
kernel config for the packet scheduler clock source that mentions that the
resolution of jiffies is too low for accurate shaping except at very low
speeds. I will recompile the kernel and try the CPU option tomorrow to see
if there is any change.

  
  If the ceil is based at IP level then the max ceil is going to be a
  value between 54 Mbit and 97 Mbit (not the tc values) for a 100 Mbit
  interface depending on the size of the packets passing through, right ?
  
  Minimum Ethernet frame
  148,809 * (46 * 8) = 148,809 * 368 = 54,761,712 bps
  
  Maximum Ethernet frame
  8,127 * (1500 * 8) = 8,127 * 12,000 = 97,524,000 bps
 
 If you use the overhead option I think you will be able to overcome this 
 limitation and push the rates closer to 100mbit.
 
 
  About the red settings, I don't understand properly how to configure them.
  I was using the configuration that came with the examples.
 
 I don't use red, it was just something I noticed - maybe making it longer 
 would help, maybe my test wasn't representative.
 
 FWIW I had a play around with HFSC (not that I know what I am doing 
 really) and at 92mbit managed to get -
 
 rtt min/avg/max/mdev = 0.330/0.414/0.493/0.051 ms loaded
 from
 rtt min/avg/max/mdev = 0.114/0.133/0.187/0.028 ms idle
 
 and that was through a really cheap switch.
 
 Andy.


 I looked up ethernet overheads and found the figure of 38 bytes per 
 frame (the 46 is the min eth payload size?) and, looking at the way mpu is 
 handled by the tc rate table generator, I think you would need to use
 46 + 38 as the mpu.
 
 So on every htb line that has a rate, put ... overhead 38 mpu 84
 
 I haven't checked those figures or tested close to limits though; the 
 12k burst would need increasing a bit as well or that will slightly 
 over-limit the rate at HZ=1000.
 
 
 
 
 I haven't checked those figures or tested close to limits though; the 
 12k burst would need increasing a bit as well or that will slightly 
 over-limit the rate at HZ=1000.
 
 It seems that htb still uses ip level for burst so 12k is enough.
 
 With the overhead at 38 I can ceil at 99mbit OK.


I didn't realise such options existed for htb (mpu + overhead). These parameters 
are not mentioned in the man pages or in the htb manual. 
I presume I have to patch tc to get these features? 


Yep 46 is the minimum eth payload size and 38 is the min overhead for ethernet 
frames.

interframe gap    96 bits   12 bytes
+ preamble        56 bits    7 bytes
+ sfd              8 bits    1 byte
+ eth header                14 bytes
+ crc                        4 bytes
------------------------------------
                            38 bytes overhead per ethernet frame.



Jonathan



___
LARTC mailing list
LARTC@mailman.ds9a.nl

Re: [LARTC] HTB and PRIO qdiscs introducing extra latency when output interface is saturated

2005-08-04 Thread Andy Furniss

Andy Furniss wrote:

I haven't checked those figures or tested close to limits though; the 
12k burst would need increasing a bit as well or that will slightly 
over-limit the rate at HZ=1000.


It seems that htb still uses ip level for burst so 12k is enough.

With the overhead at 38 I can ceil at 99mbit OK.

Andy.
___
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc


Re: [LARTC] HTB and PRIO qdiscs introducing extra latency when output interface is saturated

2005-08-03 Thread Andy Furniss

Jonathan Lynch wrote:

I did the same tests that I outlined earlier, but this time by setting
hysteresis to 0. The config for the core router is included at the
bottom. The graphs for the delay of the voip stream and the traffic
going through the core router can be found at the following addresses.

http://140.203.56.30/~jlynch/htb/core_router_hysteresis.png
http://140.203.56.30/~jlynch/htb/voip_stream_24761_hysteresis.png


The max delay of the stream has dropped to 1.8ms. Again the jitter seems
to be around 1ms. There seems to be a pattern whereby the delay
reaches about 1.6ms then drops back to 0.4ms, jumps back to 1.6ms and
then back to 0.4ms repeatedly, and then it rises from 0.5ms gradually and
repeats this behaviour. Is there any explanation for this pattern?

Would it have anything to do with burst being 1ms?


Yes I suppose if you could sample truly randomly you would get a proper 
distribution - I guess the pattern arises because your timers are 
synchronised for the test.




When the ceil is specified as being 90mbit, is this at IP level? 
What does this correspond to when a Mbit = 1,000,000 bits? I'm a bit
confused by the way tc interprets this rate.


Yes, htb uses the ip level length (but you can specify overhead and min size); 
the rate calculations use a lookup table which is likely to have a 
granularity of 8 bytes (you can see this with tc -s -d class ls .. look 
for /8 after the burst/cburst).


There is a choice in 2.6 configs about using CPU/jiffies/gettimeofday - 
I use CPU and now that I've got a ping that does < 1 sec intervals I get 
the same results as you.




If the ceil is based at IP level then the max ceil is going to be a
value between 54 Mbit and 97 Mbit (not the tc values) for a 100 Mbit
interface depending on the size of the packets passing through, right ?

Minimum Ethernet frame
148,809 * (46 * 8) = 148,809 * 368 = 54,761,712 bps

Maximum Ethernet frame
8,127 * (1500 * 8) = 8,127 * 12,000 = 97,524,000 bps

(148,809 and 8,127 being the frames per second a 100 Mbit wire can carry with
minimum- and maximum-size frames, once the 38 bytes of per-frame overhead are
counted.)


If you use the overhead option I think you will be able to overcome this 
limitation and push the rates closer to 100mbit.




About the red settings, I don't understand properly how to configure them.
I was using the configuration that came with the examples.


I don't use red, it was just something I noticed - maybe making it longer 
would help, maybe my test wasn't representative.


FWIW I had a play around with HFSC (not that I know what I am doing 
really) and at 92mbit managed to get -


rtt min/avg/max/mdev = 0.330/0.414/0.493/0.051 ms loaded
from
rtt min/avg/max/mdev = 0.114/0.133/0.187/0.028 ms idle

and that was through a really cheap switch.

Andy.

___
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc


Re: [LARTC] HTB and PRIO qdiscs introducing extra latency when output interface is saturated

2005-08-03 Thread Andy Furniss

Andy Furniss wrote:

Jonathan Lynch wrote:



If the ceil is based at IP level then the max ceil is going to be a
value between 54 Mbit and 97 Mbit (not the tc values) for a 100 Mbit
interface depending on the size of the packets passing through, right ?

Minimum Ethernet frame
148,809 * (46 * 8) = 148,809 * 368 = 54,761,712 bps

Maximum Ethernet frame
8,127 * (1500 * 8) = 8,127 * 12,000 = 97,524,000 bps



If you use the overhead option I think you will be able to overcome this 
limitation and push the rates closer to 100mbit.


I looked up ethernet overheads and found the figure of 38 bytes per 
frame (the 46 is the min eth payload size?) and, looking at the way mpu is 
handled by the tc rate table generator, I think you would need to use
46 + 38 as the mpu.


So on every htb line that has a rate, put ... overhead 38 mpu 84
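
e.g. for the 5Mbit class, something like (untested):

tc class add dev $1 parent 2:1 classid 2:20 htb rate 5Mbit ceil 90Mbit \
    quantum 1500 burst 12k cburst 12k overhead 38 mpu 84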

I haven't checked those figures or tested close to limits though; the 
12k burst would need increasing a bit as well or that will slightly 
over-limit the rate at HZ=1000.


Andy.
___
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc


Re: [LARTC] HTB and PRIO qdiscs introducing extra latency when output interface is saturated

2005-07-28 Thread Jonathan Lynch
Andy, thanks again for your help. Yes, HZ is still 1000 in 2.6.12. I
tried your suggestions and here are the results.

ASCII diagram

(network A) -- (eth1) core router (eth0) -- (network C)
                      (eth2)
                        ^
                        |
                        |
                   (network B)

Looking at the following graphics

http://140.203.56.30/~jlynch/htb/core_router.png
http://140.203.56.30/~jlynch/htb/voip_stream_23691.png

voip_stream_23691.png is a graph of the delay of the voice stream
travelling from network A to network C in test 2. Notice from the core
router graph that there is only voip traffic passing through the core
router until time 07:55, and the delay in the voip stream is 0.25 ms until
then. After this time tcp traffic is introduced, saturating the outgoing
interface of the core router (eth0). The delay increases to a maximum of
2.75 ms, which is a considerable improvement on the 30ms I was seeing with
a ceil of 100mbit. But there is a lot of jitter. 


With the ceil at 90Mbit, the outgoing bit rate of eth0 has gone from
98mbit to approx 90Mbit, as can be seen from the core router graph for
eth0 bytes out. Note that the tcp traffic is all http downloads, so
most Ethernet frames will be of maximum size, 1518 bytes, so about 98mbit
(100Mbit * 1518/1538) is the maximum throughput possible on a 100mbit card,
taking into account the overheads of ethernet such as the interframe gap,
preamble and start frame delimiter.

I'm not sure how to configure some of the htb parameters. The following
is my understanding of them and a few questions I have as well.

How exactly does the HZ value have a bearing on the ceil value ? How can
I calculate a maximum for the ceil value ?

12kb is the minimum burst size for a 100 mbit NIC with a timer
resolution of 1ms (1000hz), and tc calculates the smallest possible burst
when it is not specified, right? 

cburst is the number of bytes that can be burst as fast as the interface
can transmit them. It is smaller than burst and is ideally one packet
size, right?
 
quantum determines the ratio at which the classes share their parent's
bandwidth. Each class is given quantum number of bytes before serving
the next class, right?

Is there any way I can limit the jitter of the VoIP traffic passing
through the htb class ?



Jonathan


On Wed, 2005-07-27 at 22:53 +0100, Andy Furniss wrote:
 Jonathan Lynch wrote:
  Andy, Many thanks for your reply. Below is some output from the queueing
  disciplines to show that the filters are working correctly and they are
  going to the right classes.
 
 OK classification looks good then.
 
 
  
  pass_on means if no class id equal to the result of the filter is found
  then try next filter, which is the BE class in this case.
 
 Ahh I'll have to play with this dsmark stuff one day :-)
 
  
  So back to the main question, could anyone tell me why the delay is so
  high (30ms) for VoIP packets which are treated with the EF phb when the
  outgoing interface of core  router to network c is saturated ?
 
 I would test next with htb setup like (assuming you are HZ=1000 - you 
 will be under rate if not) -
 
 ...
 
 tc class add dev $1 parent 2:0 classid 2:1 htb rate 90Mbit ceil 90Mbit 
 quantum 1500 burst 12k cburst 12k
 
 tc class add dev $1 parent 2:1 classid 2:10 htb rate 1500kbit ceil 
 90Mbit quantum 1500 burst 12k cburst 12k
 ...
 
 tc class add dev $1 parent 2:1 classid 2:20 htb rate 5Mbit ceil 90Mbit 
 quantum 1500 burst 12k cburst 12k
 
 ...
 
 
 If that doesn't make things any better then you could try giving the 
 2:10 class a rate a lot higher than it needs and see if that helps.
 
 
 Andy.

___
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc


Re: [LARTC] HTB and PRIO qdiscs introducing extra latency when output interface is saturated

2005-07-28 Thread Andy Furniss

Jonathan Lynch wrote:

Andy, thanks again for your help. Yes, HZ is still 1000 in 2.6.12. I
tried your suggestions and here are the results.

ASCII diagram

(network A) -- (eth1) core router (eth0) -- (network C)
                      (eth2)
                        ^
                        |
                        |
                   (network B)

Looking at the following graphics

http://140.203.56.30/~jlynch/htb/core_router.png
http://140.203.56.30/~jlynch/htb/voip_stream_23691.png

voip_stream_23691.png is a graph of the delay of the voice stream
travelling from network A to network C in test 2. Notice from the core
router graph that there is only voip traffic passing through the core
router until time 07:55 and the delay in voip stream is 0.25 ms until
then. After this time tcp traffic is introduced saturating the outgoing
interface of the core router (eth0). The delay increases to a maximum of
2.75 ms , which is a considerable improvement on 30ms when I was using
the ceil value of 100mbit. But there is a lot of jitter.


I suppose you could hope for a bit less jitter; 12k burst is about 1ms at 
100mbit.


There is a tweak you can do for htb which may help - in 
net/sched/sch_htb.c there is a #define HYSTERESIS 1 - changing it to 0 
and recompiling the kernel (or the module) makes things more accurate.
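
Something like this from the top of the kernel tree should do it (sketch - I
think the define is actually spelled HTB_HYSTERESIS in mainline, so check your
sch_htb.c first):

sed -i 's/HYSTERESIS 1/HYSTERESIS 0/' net/sched/sch_htb.c
make modules && make modules_install   # then reload sch_htb, or rebuild/reboot if built in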





With the ceil at 90Mbit, the outgoing bit rate of eth0 has gone from
98mbit to approx 90Mbit as can be seen from the core router graph for
eth0 bytes out. Note that with the tcp traffic is all http downloads, so
most Ethernet frames will be of maximum size, 1518 bytes, so 98mbits is
the maximum throughput possible on a 100mbit card, taking into account
the overheads of ethernet such as the interframe gap, preamble and start
frame delimiter.

Im not sure how to configure some of the htb parameters. The following
is my understanding of them and a few questions I have as well.

How exactly does the HZ value have a bearing on the ceil value ? How can
I calculate a maximum for the ceil value ?


It's more to do with burst/cburst than ceil.



12kb is the minimum burst size for a 100 mbit NIC with a timer
resolution of 1ms (1000hz) and tc calculates the smallest possible burst
when it is not specified, right ?. 


It seems not, I think hysteresis may be involved again here (but then 
one of my tcs is hacked about a bit).


You can see what htb is using as defaults by doing tc -s -d class ls ..

If I do that on similar kernels, one with hysteresis 0 and one with 1, I 
see quite different values.


I chose 12k as big enough for the 90mbit test (12000*8*1000 = 96mbit at ip 
level) and it seemed like a nice multiple of the 1500 mtu :-)
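
As a rule of thumb the burst needs to cover at least one timer tick's worth of
data at the rate, e.g.:

echo $(( 90000000 / 8 / 1000 ))   # rate in bits / 8 / HZ = 11250 bytes per tick at 90mbit

so 12k leaves a little headroom.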





cburst is the number of bytes that can be burst as fast as the interface
can transmit them. It is smaller than burst and is ideally one packet
size, right?


Ideally 1 packet but not achievable with htb at lan speed and hz 1000, 
also AIUI the way htb does drr means with mixed packet sizes things 
aren't packet perfect even at low rates.


Saying that I use htb at low rates and can apparently get packet perfect 
with my traffic mix.


I think hfsc can do it perfectly on both counts.

 
quantum determines the ratio at which the classes share their parent's

bandwidth. Each class is given quantum number of bytes before serving
the next class, right?


Yeah, setting 1500 probably makes no difference for this test.



Is there any way I can limit the jitter of the VoIP traffic passing
through the htb class ?


Try the hysteresis and/or setting the rate for interactive way higher 
than its traffic rate.


I did a quick test to see how things were for me at 100mbit. Because my 
other pcs are slow I needed to use two as receivers for netperf.


I noticed something I didn't expect with red or the settings you use - 
one of the pcs is slower and has less memory, thus a smaller tcp socket 
size. Using 4 streams, two to each, unshaped they get about the same, 
though with txqueuelen = 1000 there are no drops (with window scaling off 
there is a difference). With red and wscale on, the red really favoured 
the fast pc - I tried a 40k bfifo so that I got drops, expecting to see 
the same, but it was still far more even than the red.


I couldn't really simulate the voip traffic; in theory I should be able 
to use ping with -i < 1 sec, but using the latest inetutils you get a 
flood ping if you do that. I reported this about 18 months ago and it's 
supposedly fixed in the cvs (though I don't know if fixed means it just 
says invalid argument rather than actually doing what's asked, because I 
have failed to build it so far).


So if anyone reading this has an i386 ping that -i 0.5 works on, please 
mail me the binary :-)


Andy.
___
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc


Re: [LARTC] HTB and PRIO qdiscs introducing extra latency when output interface is saturated

2005-07-27 Thread Andy Furniss

Jonathan Lynch wrote:


Could anyone tell me why the delay is so high (30ms) for VoIP packets
which are treated with the EF phb when the outgoing interface of core
router to network c is saturated ?



I have never used dsmark so am not sure about the classification parts 
of your rules. You need to check where the packets are going with 
tc -s qdisc ls dev ...


The other parts have some issues, see below.


---
The config I used for each setup is included below. These are slight
modifications that are supplied with iproute2 source code.

Config 1 using htb
---
tc qdisc add dev $1 handle 1:0 root dsmark indices 64 set_tc_index
tc filter add dev $1 parent 1:0 protocol ip prio 1 tcindex mask 0xfc
shift 2


flowid/classid here maybe, to get packets to 2:0, though it may work - 
check.




Main htb qdisc  class
tc qdisc add dev $1 parent 1:0 handle 2:0 htb
tc class add dev $1 parent 2:0 classid 2:1 htb rate 100Mbit ceil 100Mbit


100mbit will be too high if it's a 100mbit nic.



EF Class (2:10)
tc class add dev $1 parent 2:1 classid 2:10 htb rate 1500Kbit ceil
100Mbit
tc qdisc add dev $1 parent 2:10 pfifo limit 5
tc filter add dev $1 parent 2:0 protocol ip prio 1 handle 0x2e tcindex
classid 2:10 pass_on


Don't know what pass_on will mean here.



BE Class (2:20)
tc class add dev $1 parent 2:1 classid 2:20 htb rate 5Mbit ceil 100Mbit
tc qdisc add dev $1 parent 2:20 red limit 60KB min 15KB max 45KB burst
20 avpkt 1000 bandwidth 100Mbit probability 0.4
tc filter add dev $1 parent 2:0 protocol ip prio 2 handle 0 tcindex mask
0 classid 2:20 pass_on

Config 2 using PRIO
---
Main dsmark  classifier
tc qdisc add dev $1 handle 1:0 root dsmark indices 64 set_tc_index
tc filter add dev $1 parent 1:0 protocol ip prio 1 tcindex mask 0xfc
shift 2

Main prio queue
tc qdisc add dev $1 parent 1:0 handle 2:0 prio
tc qdisc add dev $1 parent 2:1 tbf rate 1.5Mbit burst 1.5kB limit 1.6kB


Won't hurt if the packets are small voip, but TBF has a nasty habit of 
taking 1 from the burst/mtu you specify, so your burst setting may result 
in packets > 1499B getting dropped - tc -s -d qdisc ls dev ... should 
show what it's using.
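
If you ever put full-size packets through it, something like this in place of
the tbf line (sketch, sizes picked arbitrarily) avoids the off-by-one:

tc qdisc add dev $1 parent 2:1 tbf rate 1.5Mbit burst 3kB limit 6kB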



tc filter add dev $1 parent 2:0 protocol ip prio 1 handle 0x2e tcindex
classid 2:1 pass_on

BE class(2:2)
tc qdisc add dev $1 parent 2:2 red limit 60KB min 15KB max 45KB burst 20
avpkt 1000 bandwidth 100Mbit probability 0.4
tc filter add dev $1 parent 2:0 protocol ip prio 2 handle 0 tcindex mask
0 classid 2:2 pass_on


Without wrapping it in something like htb, red won't shape traffic.
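
i.e. roughly what your first config already does for BE - an htb class caps the
rate and red just sits underneath it (sketch):

tc class add dev $1 parent 2:1 classid 2:20 htb rate 5Mbit ceil 90Mbit
tc qdisc add dev $1 parent 2:20 red limit 60KB min 15KB max 45KB burst 20 \
    avpkt 1000 bandwidth 100Mbit probability 0.4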

Andy.
___
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc


Re: [LARTC] HTB and PRIO qdiscs introducing extra latency when output interface is saturated

2005-07-27 Thread Jonathan Lynch

Andy, Many thanks for your reply. Below is some output from the queueing
disciplines to show that the filters are working correctly and they are
going to the right classes.

NOTE: The root qdisc of each interface is deleted before I run the
tests. This resets the statistics for the qdisc. The following is the
output after the tests.


Output of tc -s qdisc show on the core router for the 3 network
interfaces.


qdisc dsmark 1: dev eth0 indices 0x0040 set_tc_index
 Sent 2183574289 bytes 1496372 pkts (dropped 60982, overlimits 0
requeues 0)

qdisc htb 2: dev eth0 parent 1: r2q 10 default 0 direct_packets_stat 22
 Sent 2183574289 bytes 1496372 pkts (dropped 60982, overlimits 140759
requeues 0)

qdisc pfifo 8007: dev eth0 parent 2:10 limit 5p
 Sent 7265998 bytes 51169 pkts (dropped 0, overlimits 0 requeues 0)

qdisc red 8008: dev eth0 parent 2:20 limit 60Kb min 15Kb max 45Kb
 Sent 2176307367 bytes 1445181 pkts (dropped 60982, overlimits 60982
requeues 0)
  marked 0 early 60982 pdrop 0 other 0

qdisc pfifo 8009: dev eth1 limit 1000p
 Sent 4496 bytes 477176 pkts (dropped 0, overlimits 0 requeues 0)

qdisc pfifo 800a: dev eth2 limit 1000p
 Sent 40637134 bytes 585931 pkts (dropped 0, overlimits 0 requeues 0)


Again here is the ASCII diagram

(network A) -- (eth1) core router (eth0) -- (network C)
                      (eth2)
                        ^
                        |
                        |
                   (network B)



From network A to C (from 2 pcs used for the purpose of traffic
generation)

TCP traffic - pc 1
Sent 994762580 bytes 658704 pkts (dropped 0, overlimits 0 requeues 0)

VoIP traffic - pc 2
Sent 7286487 bytes 51298 pkts (dropped 0, overlimits 0 requeues 0)

From network B to C
TCP traffic
Sent 1271745729 bytes 841217 pkts (dropped 27, overlimits 0 requeues 0)


So total amount of packets transmitted to incoming interface on the core
router is 658704 + 51298 + (841217 - 27) = 1,551,192 packets.

The total sent by the dsmark and htb qdisc on the core router is
1,496,372 packets and 60,982 are dropped. The total received is therefore
1,557,354. There is also some more traffic received from other nodes in
network A, and some from the core router itself, but this is minimal.
This should account for the difference.

VoIP traffic sent from a machine in network A = 51298 packets. It is
practically the same as the number of packets that pass through the
pfifo (51169), which is attached to the EF class 2:10.


TCP traffic that should be passing through class 2:20, the BE
class, is 658,704 packets from network A and 841,217 packets from B,
which totals 1,499,921.

Traffic sent from the BE class is 1,445,181 packets, plus 60,982 which were
dropped, so 1,506,163 packets were received by the BE class 2:20.

The traffic sent from the output interface of eth1 and eth2 is mainly
acks back to network A and network B respectively. 


100mbit will be too high if it's a 100mbit nic.

What value would you recommend setting as the ceil for a 100 mbit NIC?


Don't know what pass_on will mean here.

pass_on means if no class id equal to the result of the filter is found
then try next filter, which is the BE class in this case.



So back to the main question, could anyone tell me why the delay is so
high (30ms) for VoIP packets which are treated with the EF phb when the
outgoing interface of core  router to network c is saturated ?



Jonathan




 Won't hurt if the packets are small voip, but TBF has a nasty habit of 
 taking 1 from the burst/mtu you specify, so your burst setting may result 
 in packets > 1499B getting dropped - tc -s -d qdisc ls dev ... should 
 show what it's using.

 Without wrapping it with something like htb red won't shape traffic.

I am not too concerned about the PRIO + TBF setup. My priority is with
the htb setup, but I will look into this and see if I notice that.


On Wed, 2005-07-27 at 14:25 +0100, Andy Furniss wrote:
 Jonathan Lynch wrote:
 
  Could anyone tell me why the delay is so high (30ms) for VoIP packets
  which are treated with the EF phb when the outgoing interface of core
  router to network c is saturated ?
  
 
 I have never used dsmark so am not sure about the classification parts 
 of your rules. You need to check where the packets are going with 
 tc -s qdisc ls dev ...
 
 The other parts have some issues see below.
 
  ---
  The config I used for each setup is included below. These are slight
  modifications that are supplied with iproute2 source code.
  
  Config 1 using htb
  ---
  tc qdisc add dev $1 handle 1:0 root dsmark indices 64 set_tc_index
  tc filter add dev $1 parent 1:0 protocol ip prio 1 tcindex mask 0xfc
  shift 2
 
 flowid/classid here maybe, to get packets to 2:0, though it may work - 
 check.
 
  
  Main htb qdisc  class
  tc qdisc add dev $1 parent 1:0 handle 2:0 htb
  tc class add dev $1 parent 2:0 

Re: [LARTC] HTB and PRIO qdiscs introducing extra latency when output interface is saturated

2005-07-27 Thread Andy Furniss

Jonathan Lynch wrote:

Andy, Many thanks for your reply. Below is some output from the queueing
disciplines to show that the filters are working correctly and they are
going to the right classes.


OK classification looks good then.




pass_on means if no class id equal to the result of the filter is found
then try next filter, which is the BE class in this case.


Ahh I'll have to play with this dsmark stuff one day :-)



So back to the main question, could anyone tell me why the delay is so
high (30ms) for VoIP packets which are treated with the EF phb when the
outgoing interface of core  router to network c is saturated ?


I would test next with htb setup like (assuming you are HZ=1000 - you 
will be under rate if not) -


...

tc class add dev $1 parent 2:0 classid 2:1 htb rate 90Mbit ceil 90Mbit 
quantum 1500 burst 12k cburst 12k


tc class add dev $1 parent 2:1 classid 2:10 htb rate 1500kbit ceil 
90Mbit quantum 1500 burst 12k cburst 12k

...

tc class add dev $1 parent 2:1 classid 2:20 htb rate 5Mbit ceil 90Mbit 
quantum 1500 burst 12k cburst 12k


...


If that doesn't make things any better then you could try giving the 
2:10 class a rate a lot higher than it needs and see if that helps.
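
e.g. (sketch - 20Mbit is just an arbitrary figure well above what the voip
stream needs):

tc class add dev $1 parent 2:1 classid 2:10 htb rate 20Mbit ceil 90Mbit \
    quantum 1500 burst 12k cburst 12k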



Andy.

___
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc