On Sat, Jun 17, 2006 at 09:36:53PM -0400, Ed Tomlinson wrote:
> On Saturday 17 June 2006 17:56, Matthew Toseland wrote:
> > I'm still not entirely sure how this is going to work...
> > 
> > As far as I can see, this is the algorithm proposed:
> > 
> > OUTGOING REQUESTS: (normal load balancing)
> > For each node, we keep an AIMD. This determines the rate at which we can
> > send requests to this node. When a timeout or a RejectedOverload occurs,
> > we reduce the rate at which we can send to the node. It also has a small
> > token bucket, simply to smooth things out a bit; every time we would
> > have an opportunity to send a request, we add a token.
> 
> What do you mean by "every time we would have an opportunity to send a 
> request"?  
>   
> The existing AIMD logic works at the packet level and effectively controls
> bandwidth usage.  Am I understanding correctly that you are suggesting a
> second AIMD at the message/peer level (which is what my patch does)?

Yes.
> 
> If you are using tokens to control sends then why the high level AIMD? We 
> already have a packet level AIMD to control bandwidth so the token system 
> should suffice to balance load.

No. We need to balance load on a packet level in order to be
TCP-compatible, and at a request level in order to limit load. Packet
level AIMD is only sensitive to packet loss on that specific link.
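
To make the request-level half concrete, here is a minimal sketch of the
sort of per-peer controller I mean. The class name, constants and the
millisecond bookkeeping are all illustrative, not from any actual patch:

// Illustrative per-peer, request-level AIMD with a small smoothing
// token bucket. All constants are placeholders.
public class RequestRateLimiter {

    private double requestsPerSecond = 1.0; // current allowed send rate
    private double tokens = 0.0;            // small bucket, only for smoothing
    private static final double MAX_TOKENS = 5.0;
    private long lastUpdate = System.currentTimeMillis();

    /** A request to this peer completed normally: additive increase. */
    public synchronized void onSuccess() {
        requestsPerSecond += 0.1;
    }

    /** Timeout or RejectedOverload from this peer: multiplicative decrease. */
    public synchronized void onRejectedOrTimeout() {
        requestsPerSecond = Math.max(0.1, requestsPerSecond / 2.0);
    }

    /** True if we may send a request to this peer right now. */
    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Every opportunity to send earns a (fractional) token.
        tokens = Math.min(MAX_TOKENS,
                tokens + requestsPerSecond * (now - lastUpdate) / 1000.0);
        lastUpdate = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}

The packet-level AIMD stays where it is; this sits above it and only
gates whole requests.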
>  
> > INCOMING REQUESTS: (load limiting)
> > For each node (and also for the local node), we keep another token
> > bucket. If this node sends us a request, we remove a token from its
> > bucket; if there are no tokens left, we reject the request with
> > RejectedOverload. 
> 
> clear enough
> 
> > When a token is created, it is added to the fullest non-full bucket. The
> > result of this is that if there are enough request slots to go around,
> > the node will accept every request, but if not, then it will
> > preferentially reject those coming from nodes which send the most
> > requests. This is how we ensure that load propagates back to the
> > original flooder / source, and interferes with other requests and routes
> > to the absolute minimum degree necessary.
> 
> Base assumption is that fair scheduling is what we really want.  Is it
> really?

The basic requirement is that load be propagated back to the source
without relying solely on the source to reduce its request rate (which
would allow flooding), and without requiring any difference between the
behaviour of the original request source (which might give away that it
is the source) and that of any other node.
> 
> > All the buckets start full. Whenever a request is forwarded or
> > answered locally (obviously only once per request), we create a token.
> > Even if that request resulted in a timeout, or every node rejected it
> > due to overload. Thus the rate at which requests are sent is constrained
> > by the AIMD on each link, and the rate at which requests are sent itself
> > constrains the rate at which requests can be accepted.
> > 
> > Hence, when we answer a request from the store, we can accept another
> > request. When we forward a request, we can accept another request. And
> > if we don't manage to forward it, then we reject that request and we
> > return the token - to the fullest non-full bucket, not normally to
> > the bucket we just took it from; this is so that nodes which flood
> > deplete their buckets quickly, even when their requests are not actually
> > forwarded.
> 
> How do we control the number of tokens?  I see how we create and use tokens
> but what decides how many we need start with, how do we know if we are
> using too few or too many?

The size of the buckets is arbitrary. We also start with a certain
number of tokens in each bucket, which is another parameter. :|
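
To make the mechanism concrete, here is a sketch of what I mean. The
Map-based structure, the null handling and all names are illustrative;
capacity and the initial fill are the arbitrary parameters just
mentioned:

import java.util.HashMap;
import java.util.Map;

// One bucket per peer (plus one for the local node). All buckets start
// full; a token is taken on accept and created on forward/local answer.
public class IncomingLoadLimiter {

    private final int capacity; // arbitrary parameter
    private final Map<String, Integer> buckets = new HashMap<String, Integer>();

    public IncomingLoadLimiter(int capacity) {
        this.capacity = capacity;
    }

    public synchronized void addPeer(String peer) {
        buckets.put(peer, capacity); // all buckets start full
    }

    /** Take a token for an incoming request; false means RejectedOverload. */
    public synchronized boolean acceptRequest(String peer) {
        Integer t = buckets.get(peer);
        if (t == null || t <= 0) return false;
        buckets.put(peer, t - 1);
        return true;
    }

    /**
     * Called once per request when it is forwarded or answered locally,
     * and when a token is returned after we fail to forward. The token
     * goes to the fullest non-full bucket, so a flooder's bucket is
     * refilled last and other peers keep their request slots.
     */
    public synchronized void createToken() {
        String fullest = null;
        int fullestTokens = -1;
        for (Map.Entry<String, Integer> e : buckets.entrySet()) {
            int t = e.getValue();
            if (t < capacity && t > fullestTokens) {
                fullestTokens = t;
                fullest = e.getKey();
            }
        }
        if (fullest != null) buckets.put(fullest, fullestTokens + 1);
        // If every bucket is full, the token is simply not needed.
    }
}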
> 
> > So:
> > - We won't run out of tokens.
> > - The rejection will normally cause the previous node to reduce the rate
> >   at which it sends requests, because of its outgoing AIMD.
> > - If a node floods, then it will quickly deplete its bucket. Other nodes
> >   will get all the tokens and the flooder will be left with the few
> >   request slots that were really not needed by anyone else.
> > - If there is enough capacity to forward every incoming request, then no
> >   requests will be rejected.
> > - Load is propagated back to the source. If node A is connected to node
> >   B to node C, and A floods, then part of the flood will go through B to
> >   C. C will then reject some requests pre-emptively and may have some
> >   timeouts. This will result in B reducing its send rate to C. That will
> >   result in more requests to B being rejected, which should result in A
> >   reducing its rate of sending requests. But even if A does not, B will
> >   refuse almost all of A's requests.
> > 
> > MISROUTING:
> > The above should propagate load, and thereby limit it; however, the
> > level to which it limits load may still be too high, resulting in misrouting.
> > 
> > The main levers we have over this:
> > - We could route a request to any available node (best choice first
> >   obviously). This is what we do now (modulo backoff). The question is,
> >   would this result in significant misrouting? In the case of really
> >   slow (modem) nodes, this may be acceptable, but we don't want too much
> >   misrouting in general. Also, this strategy may promote ubernodes; one
> >   node is fast, the rest are slow, so the ubernode gets all the
> >   requests.
> 
> > - We could only route a request to its best choice (unless it's already
> >   been there). We would probably have to make RejectedOverloads fatal,
> >   otherwise this would be pointless as it would be tried elsewhere,
> >   causing more hops.
> 
> - We could try to route a request to its best choice; if it succeeds within
> a short, configurable time, great; otherwise we treat the request as if we
> got a DNF (but do not play with the HTL).

You mean queue it? I was trying to avoid delays...
>  
> > - We could compromise; only route the request to "roughly" its best
> >   choice, for some definition of roughly...
> > 
> > In any case this needs to be simulated. It may be that the load limiting
> > algorithm described will keep misrouting within reasonable bounds...
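
As a strawman for that simulation, "roughly" could mean: any peer whose
distance to the key is within some factor of the best peer's distance.
Something like the following; TOLERANCE and everything else here is
invented:

import java.util.List;

// Strawman for "roughly the best choice": accept the first peer with
// capacity whose circle-distance to the key is within TOLERANCE times
// the best peer's distance. Purely for simulation.
public class RoughRouting {

    static final double TOLERANCE = 1.5; // invented parameter

    /** Distance between two locations on the [0,1) circle. */
    static double distance(double a, double b) {
        double d = Math.abs(a - b);
        return Math.min(d, 1.0 - d);
    }

    /** peerLocs must be sorted by distance to key, best first. */
    static Double pickPeer(List<Double> peerLocs, double key) {
        double bestDist = distance(peerLocs.get(0), key);
        for (double loc : peerLocs) {
            if (distance(loc, key) > bestDist * TOLERANCE)
                break;                    // too far from the best choice
            if (hasCapacity(loc))
                return loc;               // first acceptable peer wins
        }
        return null; // reject rather than misroute badly
    }

    // Placeholder: would consult the per-peer AIMD / token buckets.
    static boolean hasCapacity(double peerLoc) {
        return true;
    }
}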
> > 
> > OTHER ISSUES:
> > This essentially is the proposal I think.
> > 
> > Some have asked whether we should have some sort of tit-for-tat system
> > to reward nodes for faithfully serving our requests. This is a matter
> > for the future - after we have a new load limiting algorithm which
> > removes the problems with the current one (security, inserts...)
> 
> Reading through this, it seems to me we need a few metrics.  For instance,
> what percent of the time are nodes backed off (both per-node and global
> numbers would be interesting)?  With the above scheme a node is in effect
> backed off when its outgoing token bucket is empty.
> 
> We also need to know how often non-optimal routing is happening due to
> backoff, and whether introducing a possible short delay would improve this
> figure.

Absolutely. But I doubt delays would improve routing. Queueing can be
simulated - with token buckets for example.
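
For the backed-off-time figure, a per-peer counter along these lines
would be cheap (all names invented):

// Tracks what percentage of wall-clock time a peer has been backed off.
public class BackoffMetrics {

    private long totalBackedOffMillis = 0;
    private long backoffStart = -1; // -1 = not currently backed off
    private final long created = System.currentTimeMillis();

    public synchronized void enterBackoff() {
        if (backoffStart < 0) backoffStart = System.currentTimeMillis();
    }

    public synchronized void exitBackoff() {
        if (backoffStart >= 0) {
            totalBackedOffMillis += System.currentTimeMillis() - backoffStart;
            backoffStart = -1;
        }
    }

    /** Percentage of time this peer has been backed off since startup. */
    public synchronized double backedOffPercent() {
        long now = System.currentTimeMillis();
        long backed = totalBackedOffMillis
                + (backoffStart >= 0 ? now - backoffStart : 0);
        return 100.0 * backed / Math.max(1, now - created);
    }
}

The global figure would just be the average over all peers.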
> 
> Another observation.  Quite a few message types ignore the backoff conditions
> (eg Swap* messages).  We should look closely at these and decide if they
> really should be bypassing the backoff controls.

Yes they should. Backoff and load balancing only apply to *requests*.
> 
> The patch I sent to toad_ and mrogers implements the high-level (G)AIMD
> (on FNPSSKDataRequest, FNPCHKDataRequest & FNPInsertRequest messages).
> It sends until a window is exceeded, then waits for requests to complete
> before opening the next transmit window.  It also implements part of the
> metrics mentioned above.

Suggest you commit the metrics, if they're not too disruptive.
> 
> Thanks,
> Ed
-- 
Matthew J Toseland - toad at amphibian.dyndns.org
Freenet Project Official Codemonkey - http://freenetproject.org/
ICTHUS - Nothing is impossible. Our Boss says so.