If "retransmits" - retries after an overload message - are going to
happen, then plainly they need to be included in the round trip time.

Ideally (to make the algorithm simplest) we would fail the insert on a
RejectedOverload anywhere on the request. But this would suck. It would
be inefficient, and require a network reset.

Can we avoid this? I think we can. We include the "retransmit" time in
the RTT, but take retransmits into account when determining the window
size. This may result in a slightly lower rate of sending requests than
we should ideally have; however, it should not be much lower.
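
To make that concrete, here is a minimal sketch of the bookkeeping, in
Python for brevity. The class and method names are made up for
illustration and are not the node's actual API; timestamps are passed in
explicitly so the arithmetic is easy to follow.

```python
class RequestTimer:
    """Tracks one request from first send to final completion (hypothetical helper)."""

    def __init__(self, sent_at):
        self.sent_at = sent_at   # time the request was first sent
        self.overloads = 0       # RejectedOverloads seen so far on this request

    def on_rejected_overload(self):
        # The request carries on (the "retransmit"). The clock is NOT reset,
        # but the overload is remembered so the window can take it into account.
        self.overloads += 1

    def on_success(self, now):
        # The RTT covers the whole request, retransmit time included.
        return now - self.sent_at, self.overloads


timer = RequestTimer(sent_at=10.0)
timer.on_rejected_overload()
timer.on_rejected_overload()
rtt, overloads = timer.on_success(now=14.5)
```

Here the RTT sample is the full 4.5 time units, and the two overloads are
handed to the window logic separately rather than shortening the sample.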

Any other options? I suppose we could make the insert fail on a single
RejectedOverload, but I'm not keen on this; it would be very
inefficient.

Also it is clear that the main thing we need to limit is the individual
RejectedOverloads or timeouts, because they produce backoffs.
Therefore, one way or another, any single RejectedOverload must cause a
reduction in the window size. If these were always fatal to a request,
we would just measure the time taken by a successful request and call
that our round trip time. Since they are not fatal, the time can quite
reasonably be that for a successful request, even if it includes some
failures.
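
A rough sketch of a window that behaves this way, in Python. The
halve-on-failure and add-on-success rules are borrowed from TCP as an
assumption; the actual constants and the class name are illustrative
only.

```python
class Window:
    """TCP-style AIMD window; the constants here are illustrative assumptions."""

    def __init__(self, size=4.0, min_size=1.0):
        self.size = size
        self.min_size = min_size

    def on_rejected_overload(self):
        # Every single RejectedOverload shrinks the window
        # (multiplicative decrease), whether or not the request later succeeds.
        self.size = max(self.min_size, self.size / 2)

    def on_success(self):
        # A completed request grows the window slowly
        # (additive increase, roughly one per window's worth of successes).
        self.size += 1.0 / self.size


w = Window(size=4.0)
w.on_rejected_overload()   # 4.0 -> 2.0
w.on_rejected_overload()   # 2.0 -> 1.0
w.on_success()             # 1.0 -> 2.0, even though this request saw overloads
```

The point of the sketch is only that failures and the eventual success
are reported independently, so a request that hit two overloads and then
completed still counts once as a success.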

Does this seem reasonable, or will we have to do a network reset and
make a single RejectedOverload always cause a backoff?

One other thing: A timeout during the search phase (a timeout waiting
for Accepted is harmless as it's not relayed), as opposed to a
RejectedOverload, is really bad. It will cause backoffs on a whole chain
of nodes. We could simply count it as several RejectedOverloads, but
we'd have to make up an arbitrary number... or wait for the series of
RejectedOverloads that come in after we have timed out.

The other option is to "fix" the timeouts so that all the nodes on the
chain don't time out all at once. Doing this *safely* is surprisingly
difficult. It can be done unsafely with relative ease, of course.

For now I will just count a timeout as a RejectedOverload... nodes
which time out will tend to cause a lot of timeouts and get backed off.
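
Put together, the reporting rule for now might look something like the
sketch below. The event names are hypothetical (the node's real message
types may differ), and treating a timeout waiting for Accepted as "report
nothing" is my reading of "harmless", not something already decided.

```python
# Hypothetical event names; the real node's message types may differ.
def window_reports(event):
    """Return (failures, successes) to report to the window for one event."""
    if event in ("RejectedOverload", "SearchTimeout"):
        # For now a search-phase timeout counts as exactly one RejectedOverload.
        return (1, 0)
    if event == "AcceptedTimeout":
        # Not relayed, so assumed harmless: nothing is reported.
        return (0, 0)
    if event == "Completed":
        # Success, even if overloads were seen earlier on the same request.
        return (0, 1)
    raise ValueError(f"unknown event: {event}")


events = ["RejectedOverload", "AcceptedTimeout", "SearchTimeout", "Completed"]
failures = sum(window_reports(e)[0] for e in events)
successes = sum(window_reports(e)[1] for e in events)
```

If we later decide a timeout should weigh more than one RejectedOverload,
only the first branch needs to change.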

On Wed, Apr 12, 2006 at 02:46:16PM +0100, Matthew Toseland wrote:
> I would appreciate somebody checking my logic here...
> 
> 
> The problem with the shared-window solution is that a failed CHK insert
> has the same impact as a failed SSK request. However the former will
> often have caused many nodes to be backed off, whereas the latter will
> usually have caused only one node to be backed off.
> 
> The basic currency here is backed off nodes: The load limiting
> algorithm's purpose is to adjust the rate at which requests are sent in
> order to stabilize the amount of backoff happening.
> 
> Separate windows for each type of request may be sensible, but ONLY if
> they also have separate backoff for each type of request.
> 
> If we have a unified window, then we need to have proportional impact of
> different failure types.
> 
> What we could do is report each RejectedOverload separately, even if a
> request generates several of them. But then what would success be? If we
> only report success once for each successful request, is this fair? Well
> yes it is. It's only analogous to resending a packet in TCP. *Provided
> that* we count the resend as well. In other words, every time we get a
> RejectedOverload we should report that as a failure, and every time we
> get a completion other than a timeout or a local RejectedOverload we
> should report that as a success. EVEN IF we have already received a
> non-local RejectedOverload on that request. This most closely tracks the
> TCP metaphor: it corresponds to packet retransmission.
> 
> So much for window size; what about round trip time? Strictly speaking
> we would use the time for the packet retransmit; the time from the local
> RejectedOverload to the success. However this is not appropriate because
> we do not resend the request from the source; therefore strict adherence
> would be stretching the metaphor too far. So the solution? Since a
> request may trigger a retransmit (meaning the request continuing after
> the first RejectedOverload), include the time taken by all requests
> which don't actually time out on the round trip time counter.
> -- 
> Matthew J Toseland - toad at amphibian.dyndns.org
> Freenet Project Official Codemonkey - http://freenetproject.org/
> ICTHUS - Nothing is impossible. Our Boss says so.



