Re: [LARTC] SFQ buckets/extensions

2002-06-08 Thread Gregory Maxwell

On Fri, Jun 07, 2002 at 04:37:07PM +0200, Jan Coppens wrote:
 Why not implement SFB (Stochastic Fair Blue)? From the papers I read,
 the algorithm with the double-buffered moving-hash extension seemed like a
 better alternative to FRED, Stochastic Fair RED, and SFQ.

I'd like to toss in a request that anyone who implements this please make
the bloom filter size adjustable and provide a mask to remove information
from the hash (e.g. sport/dport) to allow for per-host fairness rather than
per-flow.
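Gregory's request amounts to parameterizing the classifier. A minimal sketch of the idea, with hypothetical mask names and a plain tuple hash standing in for whatever the kernel would actually use:

```python
# Illustrative only: the mask bits and field names are invented, not from
# any SFQ patch, and a real qdisc would use a proper packet-header hash.
HASH_SADDR = 1 << 0   # include source address
HASH_DADDR = 1 << 1   # include destination address
HASH_SPORT = 1 << 2   # include source port
HASH_DPORT = 1 << 3   # include destination port

def flow_hash(saddr, daddr, sport, dport, mask, buckets):
    """Fold only the masked-in header fields into the bucket index.
    Masking out the ports collapses all flows between two hosts into
    one bucket: per-host fairness instead of per-flow."""
    fields = (saddr if mask & HASH_SADDR else 0,
              daddr if mask & HASH_DADDR else 0,
              sport if mask & HASH_SPORT else 0,
              dport if mask & HASH_DPORT else 0)
    return hash(fields) % buckets

PER_HOST = HASH_SADDR | HASH_DADDR
PER_FLOW = PER_HOST | HASH_SPORT | HASH_DPORT
```

With PER_HOST, two flows between the same pair of hosts land in the same bucket no matter which ports they use.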
___
LARTC mailing list / [EMAIL PROTECTED]
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://lartc.org/



RE: [LARTC] SFQ buckets/extensions

2002-06-07 Thread Michael T. Babcock

   In the long term, always dropping from the largest subqueue
 gives you equal subqueues.

And, of course, one could have it drop them using a RED-like algorithm
to make the sessions stabilize themselves better.

 what they have) is better. Maybe doing it like CBQ, with average
 packet size, could give us better results.

Calculating average packet size on the fly isn't that hard either.

   All started with the so-called Download Accelerators,
 which do parallel GETs and make TCP behave badly.
 In the normal case TCP adjusts fast and does not create backlogs.
 But when you have to change its bandwidth, you have to create
 a backlog to slow it down. It then clears fast.

I actually usually download pieces of the same file from multiple
mirrors if it's truly large and I want it fast :-).  Ranged downloads make
that quite possible ... The only way to equalize bandwidth fairly in
these scenarios still seems to be to implement the hierarchical approach
of hashing against destination IP (the user receiving the packets) and
then having sub-queues to those queues based on ports and to drop
packets (based on RED?) in each of these sub-queues (because they're
closer to the actual TCP sessions involved).
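The hierarchy Michael describes can be sketched as a toy two-level round-robin structure (this is an illustration of the idea, not the kernel implementation; a real qdisc would add the RED-style dropping he mentions):

```python
from collections import OrderedDict, deque

class TwoLevelFairQueue:
    """Outer level keyed by destination IP, inner level keyed by flow
    (port pair).  Dequeue round-robins over destinations first, then over
    that destination's flows, so each user gets an equal share before the
    flows within a user do."""
    def __init__(self):
        self.users = OrderedDict()   # dst_ip -> OrderedDict(flow -> deque)

    def enqueue(self, dst_ip, flow, pkt):
        flows = self.users.setdefault(dst_ip, OrderedDict())
        flows.setdefault(flow, deque()).append(pkt)

    def dequeue(self):
        if not self.users:
            return None
        dst_ip, flows = self.users.popitem(last=False)  # next user, RR order
        flow, q = flows.popitem(last=False)             # next flow of that user
        pkt = q.popleft()
        if q:
            flows[flow] = q              # re-link non-empty flow at the tail
        if flows:
            self.users[dst_ip] = flows   # re-link non-empty user at the tail
        return pkt
```

A user with two busy flows and a user with one still alternate packet-for-packet at the outer level, which is exactly the per-user fairness being argued for.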

   People wanted options to tune hashsize/limits/depths and
 the most wanted (needed) was the option to select hash type.
 SRC/DST Port hash is in TODO now.

And I'm glad it exists for testing at least.
-- 
Michael T. Babcock
CTO, FibreSpeed Ltd.




Re: [LARTC] SFQ buckets/extensions

2002-06-07 Thread Jan Coppens


- Original Message -
From: Michael T. Babcock [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Friday, June 07, 2002 4:18 PM
Subject: RE: [LARTC] SFQ buckets/extensions


  In the long term, always dropping from the largest subqueue
  gives you equal subqueues.

 And, of course, one could have it drop them using a RED-like algorithm
 to make the sessions stabilize themselves better.

If you are dealing with a responsive session (like TCP) that is...

  what they have) is better. Maybe doing it like CBQ, with average
  packet size, could give us better results.

 Calculating average packet size on the fly isn't that hard either.

  All started with the so-called Download Accelerators,
  which do parallel GETs and make TCP behave badly.
  In the normal case TCP adjusts fast and does not create backlogs.
  But when you have to change its bandwidth, you have to create
  a backlog to slow it down. It then clears fast.

 I actually usually download pieces of the same file from multiple
 mirrors if it's truly large and I want it fast :-).  Ranged downloads make
 that quite possible ... The only way to equalize bandwidth fairly in
 these scenarios still seems to be to implement the hierarchical approach
 of hashing against destination IP (the user receiving the packets) and
 then having sub-queues to those queues based on ports and to drop
 packets (based on RED?) in each of these sub-queues (because they're
 closer to the actual TCP sessions involved).


Why not implement SFB (Stochastic Fair Blue)? From the papers I read,
the algorithm with the double-buffered moving-hash extension seemed like a
better alternative to FRED, Stochastic Fair RED, and SFQ.

  People wanted options to tune hashsize/limits/depths and
  the most wanted (needed) was the option to select hash type.
  SRC/DST Port hash is in TODO now.

 And I'm glad it exists for testing at least.
 --
 Michael T. Babcock
 CTO, FibreSpeed Ltd.







Re: [LARTC] SFQ buckets/extensions

2002-06-07 Thread Don Cohen

Alexander Atanasov writes:

  SFQ classifies connections by ports; esfq can classify them just by IP, so
  we can have flows:
   SRC IP+proto+DST IP+PORT - one flow
   just DST IP - one flow
   another we can think of - one flow
  So I think just about packets of some size in bytes, without the protocols
  they carry, and TCP with its features to tune becomes a kind of exception.

I don't think this is true.  First, of course, almost all packets are TCP.
Second, I'd expect that multiple TCP streams along the same path tend
to equalize, assuming of course that they are not differentiated along
the way.  E.g., with just fifo queuing I'd expect that two scp's along
the same path would tend toward equal bandwidth.  If this is true then 
multiple TCP streams along the same path would tend to act like one.

Perhaps someone else out there can either confirm or debunk that
expectation?
Of course, things get more complicated when the streams have the same
source but different destinations.  In that case I'd expect them to
again adjust to the differences in bandwidth to the different
destinations.  So maybe sub-subqueues below the source IP subqueues
are not so important. 

   No, the errors are accumulated.  It's always within one packet of the
   ideal in terms of what's sent.  The advantage of measuring queue
   length in bytes would be more fairness in dropping, i.e., you should
   drop from a queue with 10 packets each of length 1000 bytes instead of
   a queue with 20 packets each of length 100 bytes.
  
   I didn't get it? What do you mean - have different queues
  against packet sizes?

I was suggesting that when you add a packet to a subqueue you not just
record the fact that there are now 16 packets in that subqueue, but
1600 bytes.  Then the limit on total queue size is measured in bytes.
When you enqueue, you check to see whether this packet would cause the
limit to be exceeded, and if so you drop from the subqueue with the
most bytes.

That's closer to the spirit of SFQ, but it's probably more expensive
in run time.  The current implementation has a very fast way to
determine which subqueue has the most packets.  The object is to make
enqueue/dequeue small constant time operations.  I don't see (so far)
how to do that with queue lengths measured in bytes.
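A sketch of the byte-accounted variant being discussed (illustrative names, not the kernel code), including the run-time concern: finding the biggest subqueue by bytes is a scan here, where the packet-counting SFQ does it in constant time:

```python
from collections import deque

class ByteSFQ:
    """Track bytes per subqueue; when the configured byte limit would be
    exceeded, drop from the tail of the subqueue holding the most bytes."""
    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        self.total = 0
        self.queues = {}   # bucket -> deque of (length, packet)
        self.bytes = {}    # bucket -> queued bytes in that subqueue

    def enqueue(self, bucket, pkt, length):
        # Make room first: the limit is measured in bytes, not packets.
        while self.total + length > self.limit and self.total > 0:
            # O(n) scan; the packet-counting SFQ finds its victim in O(1).
            victim = max(self.bytes, key=self.bytes.get)
            vlen, _ = self.queues[victim].pop()   # drop from the tail
            self.bytes[victim] -= vlen
            self.total -= vlen
        self.queues.setdefault(bucket, deque()).append((length, pkt))
        self.bytes[bucket] = self.bytes.get(bucket, 0) + length
        self.total += length
```

With a 2500-byte limit, a fourth 1000-byte packet evicts a packet from the 2000-byte subqueue rather than touching the 100-byte one, which is the fairness-in-dropping property described above.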



Re: [LARTC] SFQ buckets/extensions

2002-06-07 Thread Martin Devera

   No, I do not think like this. If we think just for TCP connections
 we may end up with ideas like changing window size. TCP tunes to send this
 way, but it also tries to send more every 5 seconds if it has data.

please not this. :-) Packetizer software goes this way and it is
- not allowed by the RFCs
- problematic with some TCP stacks

devik





RE: [LARTC] SFQ buckets/extensions

2002-06-07 Thread Martin Devera

 that quite possible ... The only way to equalize bandwidth fairly in
 these scenarios still seems to be to implement the hierarchical approach
 of hashing against destination IP (the user receiving the packets) and

exactly. The discussion should be about how to implement it
efficiently. What about having N IP buckets and N IP/PORT
buckets? When the IP/PORT hash resolves to a bucket, we could (re)link the
IP/PORT bucket to the IP bucket's DRR list.
During a collision (two different IPs but the same IP/PORT hash) the
whole bucket would be stolen from the original IP and moved to the new one.
It is bad, but that is why we call it Stochastic ;)

It would be fast with bounded memory usage ...
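A toy rendering of that relinking scheme (all names invented here), showing both the normal case and the bucket-stealing collision:

```python
class StochasticDRR:
    """N flow buckets hashed on the full 4-tuple, each (re)linked onto the
    DRR list of the destination IP that currently owns it.  On a collision
    (same flow hash, different IP) the whole bucket is stolen and relinked
    to the new IP - unfair for a while, but memory stays bounded, which is
    the stochastic trade-off described above."""
    def __init__(self, nbuckets):
        self.buckets = [{"owner": None, "pkts": []} for _ in range(nbuckets)]
        self.ip_lists = {}   # dst ip -> DRR list of the buckets it owns

    def classify(self, src, dst, sport, dport):
        b = self.buckets[hash((src, dst, sport, dport)) % len(self.buckets)]
        if b["owner"] != dst:                        # fresh bucket or collision
            if b["owner"] is not None:
                self.ip_lists[b["owner"]].remove(b)  # steal from the old owner
            b["owner"] = dst
            self.ip_lists.setdefault(dst, []).append(b)
        return b
```

With a single bucket every flow collides, so classifying a flow to a second destination visibly steals the bucket from the first.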

Just my opinion ..
devik

  People wanted options to tune hashsize/limits/depths and
  the most wanted (needed) was the option to select hash type.
  SRC/DST Port hash is in TODO now.

 And I'm glad it exists for testing at least.
 --
 Michael T. Babcock
 CTO, FibreSpeed Ltd.







RE: [LARTC] SFQ buckets/extensions

2002-06-07 Thread Michael T. Babcock

 exactly. The discussion should be about how to implement it
 efficiently. What about having N IP buckets and N IP/PORT
 buckets? When the IP/PORT hash resolves to a bucket, we could
 (re)link the

Consider a new classful queueing discipline SFC that behaves exactly
as SFQ does and can have only one sub-queue attached to it.

... SFC (8 bit hash against dest ip)
 |
 \-- SFC (11 bit hash against src ip, port, dest ip, port)
  |
  \-- dequeue ...

Is this even possible with how the classes are currently built in the
kernel?
-- 
Michael T. Babcock
CTO, FibreSpeed Ltd.




RE: [LARTC] SFQ buckets/extensions

2002-06-07 Thread Michael T. Babcock

For the sake of a play-by-play (and why it wouldn't quite work right
initially):

1) we need to dequeue a packet
2) we ask the 11 bit lower SFC for a packet.
3) it asks the upper SFC
4) the upper SFC takes the next bucket, based on IP, and gives us a
packet.
5) the lower SFC takes that packet and has nothing else to work with, so
it passes it on.

... So you're right :-)

The two have to intercommunicate more ... So they have to be one.

 Hmm I can't imagine that :) Probably it should be tested
 
 On Fri, 7 Jun 2002, Michael T. Babcock wrote:
 
   It can't be done. Outer SFC could do nothing with the packet.
   It could only give it to inner one
 
  ... Ah, but it can choose which packet to give the inner 
 one.  It would
  reorder the packets into a semi-fair order based on IP.
 
... SFC (8 bit hash against dest ip)
 |
 \-- SFC (11 bit hash against src ip, port, dest ip, port)
  |
  \-- dequeue ...




Re: [LARTC] SFQ buckets/extensions

2002-06-06 Thread Michael T. Babcock

 I disagree that the goal is to make the subqueues the same length.
 The goal is to serve them with the same bandwidth (as long as they
 don't become empty.)

Not that you need backing-up, but I agree with you.  SFQ is there to provide
near-fair queuing on a per-session basis.  As modified, it could also be
used to provide near-fair queuing on a per-IP basis instead, but having
different-depth sub-queues simply indicates why fairness is needed (one
sub-queue would otherwise have dominated the available bandwidth).

A very long sub-queue, however, indicates that perhaps fairness is not being
achieved (although that's why it's referred to as stochastic).  This is
why I suggested at least being able to tune the size of the hash / number of
hash buckets so as to redistribute the streams through more sub-queues.
Dealing with bursts properly at Linux timer resolutions is another issue
:-).
--
Michael T. Babcock
CTO, FibreSpeed Ltd.




Re: [LARTC] SFQ buckets/extensions

2002-06-06 Thread Alexander Atanasov

On Thu, 6 Jun 2002, Don Cohen wrote:

 Alexander Atanasov writes:
 
I don't see why that should be the case.  And I don't recall ever
observing it.  This adaptation time should be practically zero.
There's no work in making the queues equal.
   
  When queues get full, sfq has to drop packets, and it must
   send less (delay) or drop on the faster flows, which get bigger queues, to
   slow them down. That's the work I mean. And it really depends on the
   subflow depth.
 
 If the queue is full (currently measured in number of packets
 allowed) then a packet is dropped from the longest subqueue.
 That does not necessarily equalize the subqueue lengths.  You
 can still have queues of wildly differing lengths.

In the long term, always dropping from the largest subqueue gives you
equal subqueues.

 
(Let's use the word queue to mean the whole SFQ and subqueue for the
part sharing a hash index.)
If you have, say, 100 packets in one subqueue and 10 in another they're
already sharing the bandwidth 50-50. 
   
  No, that means that the flow with 100 packets is using about
   10 times more bandwidth. To be fair both must have length around 50.
 
 Again, no.  It's perfectly possible for one queue to always have
 length in range 99-101 and another to always have length 9-11 and
 still be serving them both at the same rate.  Just imagine that one
 client sends 100 packets at once and the other sends 10 at once and
 then they each send one per second and we forward one per second.
 Same rate (suppose all packets are same length), different queue
 lengths.  In fact SFQ serves them at the same rate no matter how long
 they are.  Note - serving in this case means sending, not receiving.

Okay, we said that sfq drops from the longest subqueue.
We start with:
q1 - 100
q2 - 10

We enqueue (receive) 2 packets/sec
and dequeue (send) 1 packet/sec (round-robin from q1 and q2).
q1 will grow to the limit first, so sfq will start to drop there;
it will be the longest queue until q2 gets equal to q1.
In time things should go like this:
After 56 seconds (if the limit is 128 packets) q1 hits the limit
and sfq starts to drop.
Now q2 has just 38 packets and grows until it gets equal to q1,
then they start to drop round-robin.
I agree that the rate is the same, 0.5 packets/sec. But the initial
100 packets are a burst which sfq delays (buffers), since it fits
fine in the queue limit. If q1 receives another 100 packets, most of them
will get dropped.
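The arithmetic in this worked example can be replayed directly. This sketch treats the limit as a per-subqueue depth (as in esfq) and assumes arrivals to a full subqueue are the ones dropped, which holds here because that subqueue is also the longest:

```python
def simulate(seconds, depth=128):
    """q1 starts with a 100-packet burst, q2 with 10; each flow then
    receives 1 packet/sec, the queue sends 1 packet/sec round-robin, and
    an arrival to a full subqueue is dropped."""
    q = [100, 10]
    drops = 0
    turn = 0
    for _ in range(seconds):
        for i in (0, 1):          # 2 packets/sec arrive, one per flow
            if q[i] + 1 > depth:
                drops += 1        # subqueue at its depth: drop
            else:
                q[i] += 1
        q[turn] -= 1              # 1 packet/sec sent, round-robin
        turn ^= 1
    return q, drops
```

After 56 simulated seconds q1 has just reached 128 and q2 holds 38 packets with nothing dropped yet, matching the figures above.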

 
  I agree that the goal is to serve the same bandwidth, but
 
 Not only is this the goal, it's what SFQ does.
 
   bandwidth is served with packets which carry data of X bytes up to the MTU,
   so I consider packets as the base unit.
 
 Actually, SFQ serves the same rate in bytes/sec, not in packets/sec.
 If one flow uses only 1000byte packets and another only 100 byte
 packets, the second will get 10 times as many packets/sec as the first.
 It's only for measuring the length of the queue and subqueues that SFQ
 counts packets.  It would be more accurate in some sense to measure
 this in bytes as well.  But if you're using a lot of packets then you
 really ought to use large ones.

I've not said that it serves packets/sec. But I said that
the packets I speak of are equal in size, so rate = packets/sec = bytes/sec.
It powers a queue to send allotment bytes, not packets. But
when it has allotment = 1 and a packet with len = 800, the packet is
sent and the queue is powered again, so it sends more than it should.
Measurement in bytes can make this more incorrect, since you have enqueued
packets with different lengths which you cannot split apart and dequeue
in bytes (sigh, wouldn't it be nice?), so counting queues in packets (exactly
what they have) is better. Maybe doing it like CBQ, with average
packet size, could give us better results.
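The allotment behaviour Alexander describes can be shown with a toy round-server (illustrative names, not the kernel code): a packet is sent whenever the allotment is still positive, so an 800-byte packet goes out on an allotment of 1, and the overshoot is carried forward as a negative allotment, bounding the error to one packet per round.

```python
from collections import deque

def serve_round(queues, quantum):
    """One service round: top up each subqueue's allotment by `quantum`,
    then send packets while the allotment remains positive.  A large
    packet on a tiny allotment is still sent; the allotment just goes
    (possibly far) below zero and is charged against later rounds."""
    sent = []
    for q in queues:
        q["allot"] += quantum
        while q["pkts"] and q["allot"] > 0:
            pkt_len = q["pkts"].popleft()
            q["allot"] -= pkt_len     # may go well below zero
            sent.append(pkt_len)
    return sent
```

With allotment 0, quantum 1, and a queued 800-byte packet, the packet is sent and the allotment lands at -799, exactly the overspend being complained about.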

[cut]
 
Queue length depends on how many packets each download is willing to
send without an ack.  If one is willing to send 100 and the other is
willing to send 10, then the subqueues will likely be length 100 and
10, but each will still get the same bandwidth.  Without window
   
  That's true if you can send 110 packets and not exceed your
   bandwidth; trouble comes when you can send 100 - how to deal with
   the 10 that exceed.
 
 SFQ is not trying to control incoming bandwidth, just outgoing.
 The sending 100 packets above is some other machine, X, sending to the
 machine with SFQ, Y.  For Y these packets are incoming.  We don't
 control that.  We only control the rate at which Y forwards them.
 So X sends 100 packets.  Y gets 100 packets.  If they all go into
 our SFQ and that is limited to 10 packets/sec then those packets are
 going to be in the queue for a while.  In the mean while, Z sends 10
 packets to Y.  Now we have two subqueues.  In the next second, each 
 will get to send 5 packets.  So now we have queues of length 95 and 5.

Sending from the queue, not controlling incoming bandwidth.
Yes, look

[LARTC] SFQ buckets/extensions

2002-06-05 Thread Don Cohen


  ... What if SFQ were to start with a minimal number of buckets, and
  track how 'deep' each bucket was, then go to a larger number of bits
  (2/4 at a time?) if the buckets hit a certain depth?  Theoretically,
  this would mean that 'fairness' would be achieved more often in current
  collision situations but that a smaller number of buckets would be
  necessary to achieve fairness in currently low-collision situations.
  
  I haven't looked at the SFQ code in a while, so I don't know how much
  benefit this would be in terms of processing time, or even how expensive
  it would be to change hash sizes on the fly, but at a certain level of
  resolution (+/- 2-4 bits), the changes wouldn't be terribly frequent
  anyway.

A few reactions:
- The only runtime cost of lots of buckets is a small amount of
storage for each bucket.  Allocating buckets at runtime also
introduces the problem that you could run out of space.
- There's no advantage to having many more buckets than the number 
of packets you're willing to queue, which is typically only on the
order of a few hundred.

  And all the discussions tend to lead to the conclusion that there should
  be an sfq option (when the queue is created) for:
   a) how big the hash is
   b) whether to take into account source ports or not
   c) whether to take into account destination ports or not
   d) etc. :)
  
  Maybe someone who's written a qdisc would feel up to this?

I've been hoping to get to it, since I have other stuff I'd like to
incorporate into a new sfq version.  

  From: Alexander Atanasov [EMAIL PROTECTED]
   I've done some work in this direction; it probably needs more work, and
  it's poorly tested - expect b00ms ;)
   
   This adds a new qdisc for now - esfq, which is a 100% clone of the
  original sfq.
   - You can set all sfq parameters: hash table size, queue depths,
  queue limits.
   - You can choose from 3 hash types: original (classic), dst ip, src
  ip.
   Things to consider: perturbation with dst and src hashes is not
  good IMHO; you can try perturb 0 if it causes trouble.
  
   Please, see the attached files.
  
   Playing with it gives interesting results:  
   higher depth - makes flows equal slower
   small depth  - makes flows equal faster
   limit kills big delays when set at about 75-85% of depth.

I don't understand what these last three lines mean.  Could you
explain?

  
   Needs testing and measurements - that's why I made it a
  separate qdisc and not a patch over sfq; I wanted to compare both.
  
   Any feedback good or bad is welcome. 

I'll send you my current module, also a variant of SFQ.  It contains
doc that I think is worth including, also changes some of the code to
be more understandable, separates the number of packets allowed in the
queue from the number of buckets, supports the time limit (discussed
in earlier messages), controls these things via /proc, and maybe a few
other things I'm forgetting.  This version does not support hashing on
different properties of the packet, because it uses a totally different
criterion for identifying subclasses of traffic.  You can discard
that and restore the sfq hash with your modifications.  I think (hope)
these changes are pretty much independent.



Re: [LARTC] SFQ buckets/extensions

2002-06-05 Thread Alexander Atanasov

On Wed, 5 Jun 2002, Don Cohen wrote:

   And all the discussions tend to lead to the conclusion that there should
   be an sfq option (when the queue is created) for:
  a) how big the hash is
  b) whether to take into account source ports or not
  c) whether to take into account destination ports or not
  d) etc. :)
   
   Maybe someone who's written a qdisc would feel up to this?
 
 I've been hoping to get to it, since I have other stuff I'd like to
 incorporate into a new sfq version.  

At first look - I think I'll have to incorporate my changes into
your work. I haven't done much, just added hashes and unlocked what
Alexey Kuznetsov did.

  Playing with it gives interesting results:  
  higher depth - makes flows equal slower
  small depth  - makes flows equal faster
  limit kills big delays when set at about 75-85% of depth.
 
 I don't understand what these last three lines mean.  Could you
 explain?

depth is how many packets can be queued on a row of the
hash table. If you have large queues (higher depth) sfq reacts slower when
a new flow appears (it has to do more work to make queue lengths
equal). When you have short queues it reacts faster, so adjusting depth
to your bandwidth and traffic type can make it do a better job.
I set a bounded cbq class to 320kbit and esfq with dst hash:
Start an upload - it gets 40KB.
Start a second one - it should get 20KB ASAP to be fair.
With depth 128 it would take, let's say, 6 sec. to make both 20KB; with
depth 64, about 3 sec - packets are dropped earlier with a shorter queue.
(I have to make some exact measurements, since this is just an example
and may not be correct.)

limit sets a threshold on queued packets - if a packet exceeds
it, it's dropped, so delay is smaller; but when sfq tries to make flows
equal it counts depth, not limit. With the above example, depth 128 and
limit 100:
When the first upload enqueues 100 packets sfq starts to drop,
but the goal to make flows equal is 64 packets in the queue. The flow
doesn't get the 28 packets which are to be enqueued and delayed for a long
time and probably dropped when received.
It's similar to RED - you have a queue which you try to keep at about X
percent full but can go over it; limit keeps it at most X percent full.
What about making it like RED - when more packets go over the limit, more
are dropped early (limit goes down); when fewer are dropped, limit goes up?
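That adaptive-limit idea could be sketched as a tiny controller run once per measurement interval. Everything here (the bounds, the step, the interval) is an invented illustration of the proposal, not an existing implementation:

```python
def adapt_limit(limit, drops_last_interval, lo=64, hi=128, step=4):
    """RED-like adjustment of the drop threshold: while drops are
    occurring, lower the limit so senders are signalled earlier; when the
    queue behaves, raise it back toward the full depth.  `lo`/`hi`/`step`
    are illustrative tuning knobs."""
    if drops_last_interval > 0:
        return max(lo, limit - step)
    return min(hi, limit + step)
```

Repeated calls walk the limit down under sustained dropping and back up once the queue drains, which is the feedback loop being suggested.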

 
   
  Needs testing and measurements - that's why I made it a
   separate qdisc and not a patch over sfq; I wanted to compare both.
   
  Any feedback good or bad is welcome. 
 
 I'll send you my current module, also a variant of SFQ.  It contains
 doc that I think is worth including, also changes some of the code to
 be more understandable, separates the number of packets allowed in the
 queue from the number of buckets, supports the time limit (discussed
 in earlier messages), controls these things via /proc, maybe a few
 other things I'm forgetting.  This version does not support hashing on
 different properties of the packet, because it uses a totally different
 criterion for identifying subclasses of traffic.  You can discard
 that and restore the sfq hash with your modifications.  I think (hope)
 these changes are pretty much independent.
 

I've taken a quick look over it and I like the ideas there;
give me some time to get the details. I hope we can make something good of
both.

-- 
have fun,
alex




Re: [LARTC] SFQ buckets/extensions

2002-06-05 Thread Don Cohen

Alexander Atanasov writes:
   At first look - I think I'll have to incorporate my changes into
  your work. I haven't done much, just added hashes and unlocked what
  Alexey Kuznetsov did.
Not quite that simple.  You have to throw out about half of my file,
mostly the last third or so, which replaces the hash function, plus 
most of the /proc stuff, and probably a lot of other little pieces
scattered here and there.
I now recall a few other things - I support configuration of service
weights for different subqueues, which makes no sense for sfq; also
I record the amount of service (bytes and packets) per subqueue and
report these to the tc -s -d stuff, which also makes no sense for sfq.
After removing all that stuff you then have to restore the hash.

 Playing with it gives interesting results:  
 higher depth - makes flows equal slower
 small depth  - makes flows equal faster
 limit kills big delays when set at about 75-85% of depth.
   
   I don't understand what these last three lines mean.  Could you
   explain?
  
   depth is how many packets can be queued on a row of the
  hash table. If you have large queues (higher depth) sfq reacts slower when
  a new flow appears (it has to do more work to make queue lengths
  equal). When you have short queues it reacts faster, so adjusting depth
  to your bandwidth and traffic type can make it do a better job.
  I set a bounded cbq class to 320kbit and esfq with dst hash:
   Start an upload - it gets 40KB.
   Start a second one - it should get 20KB ASAP to be fair.
  With depth 128 it would take, let's say, 6 sec. to make both 20KB; with
  depth 64, about 3 sec - packets are dropped earlier with a shorter queue.
  (I have to make some exact measurements, since this is just an example
  and may not be correct.)
I don't see why that should be the case.  And I don't recall ever
observing it.  This adaptation time should be practically zero.
There's no work in making the queues equal.
(Let's use the word queue to mean the whole SFQ and subqueue for the
part sharing a hash index.)
If you have, say, 100 packets in one subqueue and 10 in another they're
already sharing the bandwidth 50-50. 

   limit sets a threshold on queued packets - if a packet exceeds
  it, it's dropped, so delay is smaller; but when sfq tries to make flows
  equal it counts depth, not limit. With the above example, depth 128 and
  limit 100:
   When the first upload enqueues 100 packets sfq starts to drop,
  but the goal to make flows equal is 64 packets in the queue. The flow
  doesn't get the 28 packets which are to be enqueued and delayed for a
  long time and probably dropped when received.
I disagree that the goal is to make the subqueues the same length.
The goal is to serve them with the same bandwidth (as long as they
don't become empty.)
Queue length depends on how many packets each download is willing to
send without an ack.  If one is willing to send 100 and the other is
willing to send 10, then the subqueues will likely be length 100 and
10, but each will still get the same bandwidth.  Without window
scaling the max window size is 64K which is only about 45 packets,
so it's not really normal to have 100 packets in a subqueue.
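The "about 45 packets" figure is just the unscaled window cap over a typical segment size:

```python
# Largest TCP window without window scaling, over a typical Ethernet MSS.
max_window = 65535        # cap of the 16-bit TCP window field ("64K")
mss = 1460                # 1500-byte MTU minus 40 bytes of IP+TCP headers
print(max_window // mss)  # -> 44 full-size segments, i.e. roughly 45
```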
