After trying out New Load Management on the network and seeing rather bad 
results, we need to reconsider load management. IMHO Old Load Management (the 
current system) is still not an acceptable answer.

Ideal load management would:
- PERFORMANCE: Maximise throughput. Throughput is the number of requests 
running in parallel divided by the average time taken per request, so latency 
should only increase if we increase the number of requests running in parallel 
by a similar factor (see the worked example after this list). We want to 
achieve close to the ideal for *both* successful and unsuccessful requests. 
Our capacity for unsuccessful requests is huge, but we don't know whether a 
request is going to succeed when we send it, which creates some problems.
- LATENCY: Not increase latency significantly for realtime requests. These must 
respond quickly, even at the expense of somewhat poorer routing.
- REACTIVITY: React quickly to variation in available resources (e.g. some 
requests completing quickly), but not overshoot.
- ACCURACY: Route accurately, severely limiting misrouting, while still 
avoiding excessively slow nodes.
- INCENTIVES: Be incentive-compatible and secure: ideally, the originator 
should not be special.
- DOS: Make DoS attacks hard, with their cost dependent on the number of 
connections the attacker has into the network (hence very hard on darknet, 
and hopefully comparable in difficulty to global surveillance on opennet).
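
To make the PERFORMANCE target concrete, here is the relationship as a worked 
equation (notation mine; this is just Little's law):

    \text{throughput} = \frac{N}{\bar{T}}

where N is the number of requests in flight and \bar{T} is the average time 
per request. For example, 100 requests in flight averaging 20 seconds each 
gives 100/20 = 5 requests per second; if queueing pushes the average to 40 
seconds, we need 200 requests in flight to sustain the same 5 requests per 
second.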

NEW LOAD MANAGEMENT:

So far, NLM seems to have problems. Ian argues these are largely due to 
queueing making a bad situation worse: queueing causes requests to take 
longer; the only way to counteract this is to run more requests in parallel, 
but that ALSO causes queueing to take longer... (a toy model of this feedback 
loop follows the ratings below).

Performance: So far poor
Latency: Very poor
Reactivity: Should be reasonable, not clear
Accuracy: Very good
Incentives: Good, the originator is not special
DoS: Good, thanks mainly to fair sharing (see below)
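
To illustrate Ian's argument, here is a toy model in Java, under the invented 
assumption that average queueing delay grows linearly with the number of 
requests in flight (all constants are made up for illustration):

    /** Toy model of the NLM queueing feedback loop. */
    public class QueueFeedback {
        public static void main(String[] args) {
            double baseLatency = 10.0;   // seconds per request with no queueing (assumed)
            double delayPerReq = 0.05;   // extra queueing delay per in-flight request (assumed)
            double targetThroughput = 5; // requests per second we try to sustain
            double inFlight = 50;        // initial parallelism

            for (int round = 0; round < 10; round++) {
                // Queueing makes each request slower as parallelism rises...
                double latency = baseLatency + delayPerReq * inFlight;
                // ...so to hold throughput we must run more in parallel...
                inFlight = targetThroughput * latency;
                // ...which feeds back into more queueing on the next round.
                System.out.printf("round %d: latency=%.1fs inFlight=%.0f%n",
                        round, latency, inFlight);
            }
        }
    }

With these constants the loop converges; once delayPerReq * targetThroughput 
reaches 1 it diverges and the queues never drain, which is the "bad situation 
worse" case.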

OLD LOAD MANAGEMENT:

AIMDs on the originator, driven by RejectedOverloads, which are generated when 
a request is rejected and passed all the way back to the originator. When the 
RejectedOverload is originally generated, the rejecting peer gets backed off. 
(A sketch follows the ratings below.)

Performance: Moderate
Latency: Good
Reactivity: Poor
Accuracy: Poor (all the backoffs)
Incentives: Poor (the originator is special; sending loads of requests can 
improve performance at the cost of the network)
DoS: Poor (nothing to stop you ignoring AIMDs, sending loads of requests and 
causing lots of backoffs)
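
A minimal sketch of the mechanism described above - an AIMD rate at the 
originator driven by RejectedOverloads, plus backoff of the rejecting peer. 
All names are invented for illustration; this is not the actual fred code:

    public class OldLoadManagement {
        interface Peer { void backOff(); }

        private double requestsPerSecond = 1.0; // AIMD-controlled send rate

        /** A request completed without any RejectedOverload. */
        void onSuccess() {
            requestsPerSecond += 0.1; // additive increase
        }

        /** A RejectedOverload was passed back to us along the request path. */
        void onRejectedOverload(Peer nextHop, boolean generatedByNextHop) {
            requestsPerSecond = Math.max(0.1, requestsPerSecond / 2); // multiplicative decrease
            // Backoff happens where the RejectedOverload was originally
            // generated: the node adjacent to the rejecting peer stops
            // routing to it for a while, which is what hurts accuracy.
            if (generatedByNextHop)
                nextHop.backOff();
        }
    }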

OLD LOAD MANAGEMENT + FAIR SHARING:

Fair sharing between peers greatly reduces our vulnerability to DoS, and 
improves performance on nodes with relatively few peers. However, the current 
fair sharing includes an abrupt transition which can cause backoffs, and which 
makes the next proposal below harder to implement. This should be fixed soon.

Incentives: Better but the originator is still special
DoS: Moderate
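
A minimal sketch of fair sharing with the abrupt transition complained about 
above (the hard per-peer cap; names and numbers are invented):

    public class FairSharing {
        private final int totalCapacity; // max requests we can run at once
        private final int peerCount;

        FairSharing(int totalCapacity, int peerCount) {
            this.totalCapacity = totalCapacity;
            this.peerCount = peerCount;
        }

        /** Should we accept another request from this peer? */
        boolean accept(int runningForPeer, int runningTotal) {
            if (runningTotal >= totalCapacity) return false;
            int fairShare = totalCapacity / peerCount;
            // Abrupt transition: one request over the share flips straight
            // from "always accept" to "always reject", which the sender sees
            // as rejections and turns into backoffs.
            return runningForPeer < fairShare;
        }
    }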

IMPROVED AIMDS:

We can make the AIMDs maintain a true window of requests in flight rather than 
a rate estimator. This should respond faster to variations in retrieval times 
(e.g. getting a bunch of offered keys). We should probably not multiply it by 
the number of peers as we do now, and we should consider how sensitive we want 
the AIMDs to be (I'm not sure how we would easily calibrate that).

Performance: Should be improved a bit.
Reactivity: Definitely improved a bit.
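
A sketch of the windowed variant: the AIMD value caps requests in flight 
rather than estimating a rate (names invented; not the actual implementation):

    public class WindowedAimd {
        private double window = 8.0; // max requests in flight, AIMD-controlled
        private int inFlight = 0;

        synchronized boolean tryStartRequest() {
            if (inFlight >= (int) window) return false;
            inFlight++;
            return true;
        }

        synchronized void onCompleted(boolean rejectedOverload) {
            inFlight--;
            if (rejectedOverload)
                window = Math.max(1.0, window / 2); // multiplicative decrease
            else
                window += 1.0 / window; // additive increase, one per window-full
        }
    }

A request finishing quickly frees a slot immediately, so this reacts to faster 
retrievals (e.g. a batch of offered keys) without waiting for a rate estimate 
to catch up.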

EARLY "SLOW DOWN" MESSAGES:

Ian has proposed that we send some sort of "slow down" message when we are 
over some load threshold but still able to accept requests. This could be 
implemented by sending a non-local RejectedOverload (i.e. pretending we are 
relaying it), but ideally we'd like two distinct messages so we can tell what 
proportion of requests receive each. One problem: since there is a time lag 
involved, we could get oscillations and still see too many rejections.

Performance: Should be improved
Accuracy: Significantly better
DoS: Unaffected, but we need to figure out how to deal with peers that 
consistently send slow-down messages.
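
A sketch of the two-threshold acceptance logic this implies (the thresholds 
and message names are invented; the real messages don't exist yet):

    enum LoadDecision { ACCEPT, ACCEPT_WITH_SLOW_DOWN, REJECT }

    class SlowDownPolicy {
        private final double slowDownThreshold = 0.6; // fraction of capacity (assumed)
        private final double rejectThreshold = 0.9;   // fraction of capacity (assumed)

        /** load = requests currently running / capacity. */
        LoadDecision decide(double load) {
            if (load >= rejectThreshold) return LoadDecision.REJECT;
            if (load >= slowDownThreshold)
                // A distinct message, not a faked relayed RejectedOverload,
                // so senders can tell what proportion of requests hit each state.
                return LoadDecision.ACCEPT_WITH_SLOW_DOWN;
            return LoadDecision.ACCEPT;
        }
    }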

AIMDS ON EACH NODE, INCLUDING REMOTE REQUESTS:

I propose that we keep a rate estimate on each node, and use it not only for 
our own requests but for all requests, based on whether requests complete with 
or without a slow-down message. This would determine an upper and a lower 
bound: above the upper bound we would reject requests; between the bounds we 
would accept requests but send slow-down messages; below the lower bound we 
would simply accept. The same estimate would also determine when we start 
local requests. The main risk is some sort of feedback loop; this would 
probably need to be simulated, and we might need a better algorithm, or to 
tune the existing one, for calculating the rate. Also, as with fair sharing, 
there is a time lag involved in telling peers to slow down.

Performance: Unclear, should be similar if not better
Reactivity: Should be reasonable, possibly better than NLM as it should be able 
to deal with bottlenecks better
Incentives: Good, the originator is not special
DoS: Good, assuming we solve the problem mentioned in the previous strategy
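
A sketch of the proposal: a per-peer rate estimate, applied to local and 
remote requests alike so the originator is not special, adjusted by whether 
requests complete with or without a slow-down message (all names and constants 
are mine):

    class PeerRateEstimator {
        enum Decision { ACCEPT, ACCEPT_WITH_SLOW_DOWN, REJECT }

        private double upperBound = 10.0; // requests/sec; adjusted AIMD-style

        private double lowerBound() { return upperBound / 2; } // assumed ratio

        void onRequestCompleted(boolean sawSlowDown) {
            if (sawSlowDown)
                upperBound = Math.max(0.5, upperBound * 0.9); // back the estimate off
            else
                upperBound += 0.05; // creep upwards while the peer keeps up
        }

        /** The same rule gates local and forwarded requests. */
        Decision decide(double currentRate) {
            if (currentRate >= upperBound) return Decision.REJECT;
            if (currentRate >= lowerBound()) return Decision.ACCEPT_WITH_SLOW_DOWN;
            return Decision.ACCEPT;
        }
    }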