Re: [LARTC] The effects of queueing on delay...(TX Ring Buffer the problem)

2005-12-05 Thread Andy Furniss

Jonathan Lynch wrote:

Quoting Andy Furniss:



Jonathan Lynch wrote:


This was down to the tx buffer size on the network card I was using. It
was an Intel 82547EI gigabit card using the e1000 driver and operating
at 100 Mbit. The tx buffer was set to 256, which caused this huge delay.
The minimum the driver lets me reduce the tx buffer size to using ethtool
is 80. By reducing the tx ring buffer to 80, the delay when there is
full link utilisation and a maximum queue of 10 packets was reduced from
30 ms to 10 ms.

The 3com 3c59x vortex driver uses a tx buffer of 16. I reduced the tx to
16 on the e1000 driver, but the max throughput I could achieve on the
interface went down.

Has anyone experimented with reducing the size of the tx buffer on this
card to get a good balance between delay and throughput?
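
As a rough sketch of the commands involved - the interface name eth0 is only
an example, and the minimum ring size varies by driver:

  # show the current and maximum rx/tx ring sizes reported by the NIC
  ethtool -g eth0
  # shrink the tx ring to 80 descriptors, the e1000 minimum mentioned above
  ethtool -G eth0 tx 80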


Strange - I thought that as long as you are under rate for the link then
the most htb should burst per tick is the burst size specified.

That assumes one bulk class - more will make it worse.

Andy.




Just noticed your reply there - I've been very busy lately and haven't
checked LARTC in a while.

Say, for example, with an htb qdisc configured with a ceil of 100 Mbit
(overhead 24 mpu 84 mtu 1600 burst 12k cburst 12k quantum 1500), or with a
queue discipline that doesn't rate limit such as prio or red, there was a
delay of 30 ms imposed when the outgoing interface was saturated and the tx
ring size was 256. When the tx ring size was reduced to 80 the delay was
around 9 ms.


Ahh I see what you mean - reducing the buffer beyond htb, but I don't 
really see why you need to, rather than reducing htb rate so you only 
have one htb burst of packets in it at a time (that assumes you only 
have two classes like in your other tests - more bulk classes would be 
worse).
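
For a sense of scale (assuming tc's 12k means 12 * 1024 bytes): one 12k burst
is 12288 bytes, roughly 98 kbit, which at 100 Mbit/s is only about 1 ms of
wire time - so a single htb burst on its own is nowhere near enough to
account for a 30 ms delay.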




The tx ring is a FIFO structure. The NIC driver uses DMA to transmit packets
from the tx ring. These are worst-case delays, when the tx ring is full of
maximum-size FTP packets with the VoIP packet at the end. The VoIP packet
has to wait for all the FTP packets to be transmitted.

When the rate was reduced to 99 Mbit the maximum delay imposed is about 2 ms.
It seems that with the reduced rate there is time to clear more packets from
the TX ring... there are fewer packets in the ring, resulting in a lower
delay. But the delay increases linearly.
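
The reported numbers line up with a simple drain-time estimate for a full tx
ring. A sketch, assuming full-size frames of 1514 bytes plus the 24 bytes of
per-frame on-wire overhead from the htb config above (1538 bytes on the wire)
and a 100 Mbit/s line rate:

  # worst-case wait for a packet stuck behind a full tx ring
  awk 'BEGIN {
      wire_bits = (1514 + 24) * 8        # one full frame on the wire
      rate      = 100e6                  # 100 Mbit/s
      printf "ring 256: %.1f ms\n", 256 * wire_bits / rate * 1000
      printf "ring  80: %.1f ms\n",  80 * wire_bits / rate * 1000
  }'
  # prints roughly 31.5 ms and 9.8 ms - close to the 30 ms and ~9-10 ms
  # measured above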


I agree from our previous discussion and tests that even with overheads
added you need to back off a bit more than expected, but I assumed this
was down to one of the following:


The 8/16 byte granularity of the lookup tables.
The fact that overhead was not designed into htb, but added later.
Timers maybe being a bit out.
Me not knowing the overhead of ethernet properly.

Making the tx buffer smaller is just a workaround for htb being over rate
for whatever reason.




Also, a question about defining the following parameters (overhead 24 mpu 84
mtu 1600 burst 12k cburst 12k quantum 1500)...


I suppose quantum should be 1514 - as you pointed out to me previously,
since it's eth - maybe more if you are still testing on vlans. I don't
think it will make any difference in this test - I see the overhead
allows for it already (I am just stating this for the list, as it was down
during our previous discussions about overheads).

mtu 1600 - it's a bit late to check now, but there is a size that makes the
table go from 8 to 16, and 1600 seems familiar - but 2048 comes to mind
as well :-) I would check with tc -s -d class ls and reduce it a bit if
you see /16 after burst.
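
To make that concrete, a hypothetical example of the kind of class
definition being discussed, with quantum bumped to 1514 as suggested (the
device name, handles and rates are made up for illustration):

  tc class add dev eth0 parent 1: classid 1:10 htb \
      rate 99mbit ceil 99mbit burst 12k cburst 12k \
      quantum 1514 overhead 24 mpu 84 mtu 1600
  # then inspect the computed rate tables - a /16 after burst means the
  # 16-byte-granularity table is in use
  tc -s -d class ls dev eth0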


...I have them defined on all classes and on the htb qdisc itself. Is there
a minimal place where they can be specified, i.e. just on the htb qdisc
itself, or do they have to be specified on all classes?


I would think so, htb has a lookup table for each rate and ceil.

Andy.



Re: [LARTC] The effects of queueing on delay...(TX Ring Buffer the problem)

2005-10-13 Thread Jonathan Lynch

This was down to the tx buffer size on the network card I was using. It
was an Intel 82547EI gigabit card using the e1000 driver and operating
at 100 Mbit. The tx buffer was set to 256, which caused this huge delay.
The minimum the driver lets me reduce the tx buffer size to using ethtool
is 80. By reducing the tx ring buffer to 80, the delay when there is
full link utilisation and a maximum queue of 10 packets was reduced from
30 ms to 10 ms.

The 3com 3c59x vortex driver uses a tx buffer of 16. I reduced the tx to
16 on the e1000 driver, but the max throughput I could achieve on the
interface went down.

Has anyone experimented with reducing the size of the tx buffer on this
card to get a good balance between delay and throughput?

Jonathan







On Tue, 2005-10-11 at 22:04 +0100, Jonathan Lynch wrote: 
 I have a router with 3 network interfaces, as in the ASCII diagram
 below. All interfaces are 100 Mbit. There is TCP traffic being
 sent from net1 to net3 and from net2 to net3, and the TCP connections
 consume as much bandwidth as possible. There is a pfifo queue on the
 egress interface eth0 of the core router with a limit of 10 packets.
 
 
 net1 --(eth1) router (eth0)-- net3
               (eth2)
                 ^
                 |
               net2
 
 I police traffic on the edge of net1 to 48.4375 Mbit and shape the
 traffic on exit of net2 to 48.4375 Mbit. There are no packets in the
 queue of the egress interface eth0 of the router at any stage. (Every
 packet is enqueued by pfifo_enqueue() to an empty queue; I have
 confirmed this by adding a counter in sch_fifo.c that is incremented
 every time there is already a packet in the queue when a new packet is
 enqueued.) The delay is at a maximum of 2 ms.
 
 When I increase the policing and shaping rates to 48.4687 Mbit - an
 increase of just 31.2 kbit in each rate, which is very small - some
 packets get queued for a short period and some are dropped, which clears
 the queue. The maximum number of packets dropped was 20 per second, but
 the delay goes up to 30 ms.
 
 Check out the graphs at
 http://frink.nuigalway.ie/~jlynch/queue/
 
 
 I can't seem to explain this. Even if the queue were full all the time and
 each packet were of maximum size, the delay imposed by queueing should be
 at most 10 * 1500 * 8 / 100,000,000 seconds, which is about 1.2 ms.
 
 How can so much delay be added by such a small increase in the
 throughput coming from net1 and net2?
 
 I would appreciate if someone could explain it to me.
 
 Btw, I'm using a stratum 1 NTP server on the same LAN to ensure
 measurement accuracy.
 
 
 Jonathan
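
For reference, a minimal sketch of the egress queue described above - a
10-packet pfifo on the router's outgoing interface (the device name is
illustrative):

  # replace the default egress qdisc with a 10-packet FIFO
  tc qdisc add dev eth0 root pfifo limit 10
  # watch the backlog and drop counters while the test traffic runs
  tc -s qdisc show dev eth0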
 