On Thu, Jun 08, 2006 at 12:23:17PM +0100, Michael Rogers wrote:
> >> AIMD for congestion control, token buckets for flow control and load
> >> balancing, and load propagated back to the sender by (1) giving each
> >> peer a fair share of your resources, (2) calculating how many
> >> requests you can accept and filling the token buckets at appropriate
> >> rates, and (3) telling each peer how many tokens are in its bucket,
> >> so it can send at an appropriate rate?
> >
> > I don't understand (3), that's straying into MRI territory... I
> > thought we were going to simply let it work it out based on how many
> > are rejected? This fits better with a minimal- or no-misrouting
> > model...
>
> Even if you're not going to route around overloaded nodes, you can save
> some bandwidth by not sending a request if you know it's going to be
> rejected.

Hmmm, I suppose so, but the model isn't really "this node can accept 10
packets per second, so let's give it 10 packets per second and move the
rest to others"; won't that just result in more overload? Well, will it
result in more overload? Normally misrouting will result in more
overload... If a node is permanently really slow, we should back off from
it and tell the owner... unless we can somehow give it a smaller slice of
the keyspace through swapping.

I suppose the question is global versus local behaviour. If the node is
slow, and so are all the others, then it is imperative that we propagate
load back to the source and slow down all senders. However, if most nodes
are much faster than this node, and load propagation is working (the
overall load is well matched to the other nodes), then a little
misrouting may be acceptable. Any comments on this theory? Ian? It needs
to be simulated anyway. We have a certain amount of misrouting with
backoff; in theory it is just for short-term problems...
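(For concreteness, here is a minimal sketch of the per-peer token bucket
in (1)-(3) above. The class and method names are illustrative assumptions,
not actual Freenet code:)

    // Minimal per-peer token bucket sketch; illustrative names only.
    public class TokenBucket {
        private final double capacity; // most tokens the bucket can hold
        private final double fillRate; // tokens per second -- step (2)
        private double tokens;         // tokens currently available
        private long lastRefill;       // last refill time, milliseconds

        public TokenBucket(double capacity, double fillRate) {
            this.capacity = capacity;
            this.fillRate = fillRate;
            this.tokens = capacity;
            this.lastRefill = System.currentTimeMillis();
        }

        // Add tokens for the time elapsed since the last refill.
        private void refill() {
            long now = System.currentTimeMillis();
            tokens = Math.min(capacity,
                    tokens + fillRate * (now - lastRefill) / 1000.0);
            lastRefill = now;
        }

        // Take one token if available; otherwise the request is rejected.
        public synchronized boolean acceptRequest() {
            refill();
            if (tokens < 1.0) return false;
            tokens -= 1.0;
            return true;
        }

        // Step (3): tell the peer how many tokens it has left.
        public synchronized int tokensAvailable() {
            refill();
            return (int) tokens;
        }
    }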
> Also, knowing how many requests each of your peers is willing to accept
> allows you to calculate how many requests /you/ can accept, so
> information about load is propagated upstream without wasting any
> bandwidth on rejected requests.

Hmmm, that sounds rather dangerous. How do you ensure that loops don't
cause the whole network to lock down? It seems like premature
optimization to me; let's get the basics right first. Obviously we can
add up our receiver windows, but some requests will be answered locally,
and more seriously, we can't just route any request to any node; we have
to route it to the right node for its particular key.

> > (2) is determined by AIMD, yes?
>
> Partly - the minimum of the AIMD congestion window and the receiver
> window tells you how many requests you can forward on each link. You
> also need to work out the approximate relationship between the number
> of incoming requests and the number of outgoing requests - some
> incoming requests will be answered locally, and some outgoing requests
> will be generated locally. If you keep a running average
> requestsInOverOut then you can calculate (2) as follows:
>
>     double totalOutgoing = 0.0;
>     for (int i = 0; i < peers.length; i++)
>         totalOutgoing += Math.min(peers[i].cwindow, peers[i].rwindow);
>     double totalIncoming = totalOutgoing * requestsInOverOut;

What's this for, exactly? You seem to be leaping ahead beyond anything we
have seriously discussed into realms of speculation which may or may not
work, and which are worth simulating but not a priority...

> > You seemed to be leaving out a chunk... weren't we using two separate
> > token buckets last time?
>
> At one stage we were using a token bucket for each outgoing link, but
> that function's now performed by the receiver window (which just tells
> you the number of tokens in the bucket at the other end of the link,
> removing the need for a duplicate bucket at your end).

Hmmm. I think I see... Does this work without propagating request
windows? I think propagating request windows will be difficult to get
right, though I can see possible strategies... When we tried to propagate
MRIs on 0.5, we had serious problems...

> I did forget the bandwidth limiter, though... I'll have a think about
> how to incorporate it into the calculation above, so we can ask our
> peers to slow down before we run into the bandwidth limiter and have to
> reject requests.
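(One rough way the bandwidth limiter might be folded into the calculation
quoted above: cap the acceptable incoming rate by what the limiter can
sustain as well as by the windows. This is a sketch under assumed names -
none of it is existing code:)

    // Sketch: combine window-based capacity with a bandwidth limiter.
    public class CapacityEstimator {
        private final double bandwidthLimit;     // output limit, bytes/sec
        private final double avgBytesPerRequest; // measured running average

        public CapacityEstimator(double bandwidthLimit,
                                 double avgBytesPerRequest) {
            this.bandwidthLimit = bandwidthLimit;
            this.avgBytesPerRequest = avgBytesPerRequest;
        }

        // Incoming requests/sec we can accept: the smaller of what the
        // links allow (totalOutgoing scaled by the in/out ratio) and what
        // the bandwidth limiter allows. Filling the token buckets at this
        // rate asks peers to slow down before the limiter forces
        // rejections.
        public double acceptableIncoming(double totalOutgoing,
                                         double requestsInOverOut) {
            double fromWindows = totalOutgoing * requestsInOverOut;
            double fromBandwidth = bandwidthLimit / avgBytesPerRequest;
            return Math.min(fromWindows, fromBandwidth);
        }
    }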
> > We'd need to drop the HTL by something like 5 to make that safe. I
> > accept that temporary overload should be routed around, but generally
> > misrouting is a REALLY REALLY BAD THING, because it *causes more
> > load*.
>
> OK, but rejecting requests that have already travelled several hops
> also causes more load.

Right. We have to find some compromise. I think maybe we can do a little
misrouting, but only as long as we are *absolutely* sure that load is
propagated properly, so that the demand will be reduced.

> Here's how I look at it: having a slow peer on a dialup link is
> somewhere between having a fast peer and having no peer at all.
> Sometimes you can take advantage of the slow peer, but most of the time
> you have to act as if it isn't there. If it wasn't there and you got a
> request for its portion of the keyspace, would you reject the request?
> No, you'd route to the best available peer.

Right, I can see the argument. I do think this should be taken into
account by swapping, but it would obviously be preferable not to have to
exclude really slow nodes from routing altogether!

> Essentially all I'm saying is that "available" should mean "connected
> and able to accept requests" rather than just "connected". Apart from
> that I'm not suggesting any changes in routing: we still always route
> to the best available peer.

Ok. This does need extensive simulations, but it would be best if we
could use dial-up connections, and thereby give them some plausible
deniability.

> Here's another way of looking at it: a friend with a dialup connection
> wants to become your peer. Your friend's connection is only fast enough
> to accept one in every ten requests: the rest will be rejected. With
> the current behaviour, nine out of ten requests that fall into your
> friend's portion of the keyspace will be rejected: from your point of
> view, the slow peer will have created a hole in the keyspace. You're
> better off saying no. On the other hand, if you send one out of ten
> requests to the slow peer and the other nine to the best available peer
> (i.e. whichever of your current peers is closest to the key in
> question), the slow peer is doing useful work by reducing the load on
> your other peers, and you're better off saying yes.

Right. As long as the swapping algorithm doesn't create a topology in
which rejecting nine out of ten requests to the modem peer is a serious
problem - which it shouldn't, as long as the topology is indeed
small-world. Needs simulation. Also, we need measures to ensure that
ubernodes don't get more than their fair share of requests; just because
they can handle more does not mean that they should drag the rest of the
network up to their level. Fully utilizing ubernodes is a *BAD* thing.

> > Our current algorithm propagates load back to the source.
>
> Does it? I thought backoff was only triggered by local rejections,
> meaning that only the neighbour of the overloaded node changes its
> behaviour. The only way for this to propagate load back to the source
> would be for every node along the path to become overloaded. Or are you
> talking about well-behaved senders slowing down when their requests get
> rejected?

I'm talking about well-behaved senders slowing down when their requests
get rejected.

> Cheers,
> Michael
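(For concreteness, here is what routing under that definition of
"available" might look like - a sketch only, with illustrative interface
and method names, not actual Freenet code:)

    import java.util.List;

    // "Available" = connected AND able to accept requests right now.
    interface PeerNode {
        boolean isConnected();
        boolean canAcceptRequest();    // e.g. tokens left in its bucket
        double distanceTo(double key); // circular keyspace distance
    }

    class Router {
        // Route to the closest peer that is connected and able to
        // accept; a peer that would reject is treated as absent.
        static PeerNode selectPeer(List<PeerNode> peers, double key) {
            PeerNode best = null;
            double bestDistance = Double.MAX_VALUE;
            for (PeerNode p : peers) {
                if (!p.isConnected() || !p.canAcceptRequest())
                    continue;
                double d = p.distanceTo(key);
                if (d < bestDistance) {
                    bestDistance = d;
                    best = p;
                }
            }
            return best; // null: nobody can accept, reject upstream
        }
    }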
-- 
Matthew J Toseland - toad at amphibian.dyndns.org
Freenet Project Official Codemonkey - http://freenetproject.org/
ICTHUS - Nothing is impossible. Our Boss says so.
