>> 1. Congestion control
...
> Why bother? Hasn't this been simulated a thousand times already? Isn't
> there extensive literature?
Not very extensive. But I'm happy to pick values from the literature
instead of running simulations if you think the other jobs should take
priority.

>> 2. Load control
...
> I thought we'd agreed on an AIMD-based and
> token-bucket based system that propagated load back to the sender?

So did I - isn't that what I described? AIMD for congestion control,
token buckets for flow control and load balancing, and load propagated
back to the sender by (1) giving each peer a fair share of your
resources, (2) calculating how many requests you can accept and filling
the token buckets at appropriate rates, and (3) telling each peer how
many tokens are in its bucket, so it can send at an appropriate rate?
(There's a rough sketch of this in code below.)

>> 3. Load balancing
...
> NO! We need to give a greater share to those nodes which are sending
> fewer requests. Unless we do this, load will not propagate back to the
> sender.

I agree in principle, but we need a local mechanism that achieves that
global aim without harming 'innocent bystanders' (requests that happen
to be traversing a link that gets throttled because of a flood). I'm
not sure that your proposed mechanism (redistributing excess tokens to
the fullest bucket rather than the emptiest bucket) achieves that aim,
but I'm going to simulate it, so let's save the arguments for later. :-)

>> * Give local clients priority over peers
...
> Sounds like a bad idea to me. It might have security implications, and
> in any case we have effectively a single local client in
> ClientRequestScheduler

OK cool, let's treat all clients as a single peer.

>> Here's a mechanism that implements the policy above:
...
> Hmmm. This is significantly different from your previous proposal. And
> it does not address overload.

How is it different? (Apart from the tokens being handed out randomly
rather than in round-robin order, which amounts to the same thing in
the long run.) I'm not sure what you mean by overload - if you mean
that it doesn't propagate load back to the sender, that's only true as
long as there's excess capacity. Once load reaches capacity, the
heaviest users will be throttled first. On the other hand, if you mean
that it doesn't detect overloaded peers, that's what the receiver
window does.

> We need you to simulate the original proposal firstly. That proposal
> didn't address #4, but it did propagate load back to source and deal
> with all the other problems.

Sorry, could you give me a clear definition of what you mean by the
original proposal? I thought I'd just given a more detailed description
of the proposal we discussed on the list and in my SoC proposal.

> Routing to the next-best peer will result in requests always being routed
> to the fastest node, and the whole network collapsing. At least, that
> was our experience on 0.5.

Yup, it definitely needs to be approached with caution. My intuition is
that when a peer is overloaded, we should act (from the point of view
of routing) as if it were temporarily disconnected - that implies
routing to the next-best peer and incrementing the backtracking counter
if necessary. The backtracking counter will limit the number of
non-optimal hops a single request can take. But that's just my
intuition - I'll see what the simulations throw up.

>> 4. Resource contribution
...
> I think we should defer work on #4 until we have something solid to
> deal with the other problems.

Good, I agree.
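Before going on - here's the rough sketch of (1)-(3) promised above.
It's plain Java with invented class and method names (not the real node
code), just to pin down the mechanism we're discussing:

import java.util.HashMap;
import java.util.Map;

// Rough sketch of per-peer token buckets with fair-share fill rates.
// All names are invented for illustration.
class LoadController {

    // Tokens currently available to each peer.
    private final Map<String, Double> buckets = new HashMap<String, Double>();

    // Maximum tokens a single bucket can hold.
    private final double bucketCapacity;

    // Total requests per second this node is willing to accept.
    private final double acceptRate;

    LoadController(double acceptRate, double bucketCapacity) {
        this.acceptRate = acceptRate;
        this.bucketCapacity = bucketCapacity;
    }

    void addPeer(String peer) {
        buckets.put(peer, 0.0);
    }

    // (1) + (2): every peer gets an equal share of the accept rate,
    // and its bucket fills at that rate, capped at the bucket size.
    void fillBuckets(double elapsedSeconds) {
        if (buckets.isEmpty()) return;
        double fairShare = acceptRate / buckets.size();
        for (Map.Entry<String, Double> e : buckets.entrySet()) {
            double tokens = e.getValue() + fairShare * elapsedSeconds;
            e.setValue(Math.min(tokens, bucketCapacity));
        }
    }

    // A request costs one token; an empty bucket means rejection.
    boolean acceptRequest(String peer) {
        Double tokens = buckets.get(peer);
        if (tokens == null || tokens < 1.0) return false;
        buckets.put(peer, tokens - 1.0);
        return true;
    }

    // (3): report a peer's remaining tokens so it can adjust its send
    // rate - this is how load propagates back to the sender.
    double tokensFor(String peer) {
        Double tokens = buckets.get(peer);
        return tokens == null ? 0.0 : tokens;
    }
}

The acceptRequest() check is where "the heaviest users get throttled
first" comes from: a peer that sends faster than its fair share drains
its bucket and starts getting rejections, while lighter peers are
untouched.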
> We need a new load balancing algorithm that:
> - Treats all nodes equally, and in particular doesn't give away any
> information about which requests are local (our current one does give
> away some timing info on this).

OK, now I see what you meant about giving priority to local clients.

> - Limits load; if nodes are consistently rejecting too many requests on
> load, we should reduce everyone's send rate.

Agreed - hopefully the receiver window will allow us to detect this
before requests start getting rejected.

> - Propagates load back to its source.

Agreed, unless doing so harms other flows (in which case we are just
creating another avenue for attack).

> - Limits floods.

Definitely agreed. The f2f topology is an asset here - no matter how
fast the attacker's node, the size of the flood is limited by the speed
of the attacker's neighbours. If the attacker's neighbours allocate
resources fairly, the size of the flood is even more limited. (A
back-of-the-envelope example follows at the end of this message.)

> - Works with inserts. (our current one doesn't work very well with
> inserts).

Agreed - let's defer the question of separate token buckets for inserts
and requests until we've seen how the current proposal performs.
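Here's the back-of-the-envelope flood example promised above. The
numbers are invented, and I'm assuming the fair-share behaviour from
the token bucket sketch:

// Back-of-the-envelope flood bound: with fair per-peer shares, the
// flood is capped by the attacker's neighbours, however fast the
// attacker's own node is. Numbers are invented for illustration.
class FloodBound {
    public static void main(String[] args) {
        int attackerNeighbours = 5;        // attacker's direct f2f peers
        double neighbourAcceptRate = 20.0; // requests/sec each neighbour accepts in total
        int peersPerNeighbour = 10;        // peers sharing each neighbour's capacity

        // Fair share: each neighbour accepts at most this rate from the attacker.
        double sharePerNeighbour = neighbourAcceptRate / peersPerNeighbour;

        // The whole flood, regardless of the attacker's own bandwidth.
        double maxFloodRate = attackerNeighbours * sharePerNeighbour;
        System.out.println("Max flood rate: " + maxFloodRate + " req/s"); // 10.0 req/s
    }
}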
Cheers,
Michael