----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.06.18 - 15:46:45GMT -----

One socket per peer is unrealistic given that 90%+ of nodes are NATed, often 
behind port-restricted cone NATs etc. Different behaviour for data we requested 
is bad in general because it might indicate to an attacker whether we 
requested it. And I'm not sure what you use token buckets for - why not just 
use the congestion window?

For the second part of the mail, you've just re-invented NGR, well done. We 
had it in 0.5, it didn't work terribly well. Given that we have locations, 
misrouting is almost always bad.

On Wednesday 08 August 2007 15:49, Matthew Toseland wrote:
> ----- Anonymous at o9_0DTuZniSf_+oDmRsonByWxsI ----- 2007.06.17 - 13:53:13GMT -----
>
> Sounds like the proposal should be expressed in a more formal way, to ensure
> that we are not all talking about different things. Comparing it with the
> current implementation is probably a good starting point, so the proposed
> differences are:
>
>
> - instead of using a single system UDP socket, each connection should be
> given a distinct UDP socket bound to a specific remote address. The UDP
> socket not bound to a remote address is still used to notice new connections
> (and connections where the peer changed its IP address) - once a newly seen
> address is authenticated, a new bound UDP socket is created to handle the
> connection (see the sketch at the end of this item).
>
> Potentially, a specific peer might be sending UDP packets from different IP
> addresses, where the address changes not on a daily/hourly basis (which is
> quite common) but because a remote NAT assigns a random address from a pool
> of allotted addresses for [virtually] every packet. I haven't seen such badly
> broken internet connections yet; most likely such a node would prefer using
> TCP or HTTP over any UDP-based transport solution anyway.
>
> The unbound UDP socket is given a low weight as far as the bandwidth-sharing
> algorithm goes.
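>
> A minimal sketch of the per-peer socket idea in Java (illustration only;
> class and method names are made up, this is not existing Freenet code).
> After connect() the OS silently drops datagrams from any other source, and
> read()/write() need no explicit destination address:
>
>     import java.net.InetSocketAddress;
>     import java.nio.ByteBuffer;
>     import java.nio.channels.DatagramChannel;
>
>     class PeerSocketSketch {
>         // Open a UDP channel dedicated to one authenticated peer.
>         static DatagramChannel openFor(String peerHost, int peerPort) throws Exception {
>             DatagramChannel ch = DatagramChannel.open();
>             ch.connect(new InetSocketAddress(peerHost, peerPort));
>             return ch;
>         }
>
>         // Blocking read of one packet; only traffic from this peer arrives here.
>         static int readOnePacket(DatagramChannel ch) throws Exception {
>             ByteBuffer buf = ByteBuffer.allocate(1400); // roughly one MTU-sized packet
>             int n = ch.read(buf); // blocks until a packet from this peer arrives
>             // ... decrypt, authenticate and dispatch to the message handler ...
>             return n;
>         }
>     }
>
> (How the per-peer socket shares the node's advertised local port is left
> open here; that detail matters for NAT traversal.)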
>
>
> - Aside from the unbound UDP socket, the per-connection UDP sockets are given
> equal bandwidth-sharing weight (with unused bandwidth somehow (not
> necessarily evenly) distributed among peers that could handle more
> traffic). Each peer/socket is handled by a separate thread, so different
> peers never block each other on network I/O.
>
> In other words, every peer is guaranteed to get at least a 1/(P+1) share of
> incoming bandwidth - never long periods of no reads, which might lead to
> oscillations (which, as mroger pointed out, could be fought with very
> sluggish averages, but there is no need to provoke them in the first place
> either). But as all the momentarily unused bandwidth is made available to
> other peers/sockets, the token bucket is very small, and most likely its
> level should never be higher than zero (using the following semantics: upon
> request processing the bucket size becomes negative, and the next read from
> the socket may occur only once the bucket has been refilled back to the zero
> level, at which point the continuous background refilling stops).
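>
> A rough sketch of that never-above-zero bucket (illustration only; class and
> method names are made up): charge() is called when work is accepted for the
> peer, awaitRefill() is called by the peer's reader thread before the next
> socket read, and credit() is the bonus mechanism described further down,
> the only way the bucket can rise above zero.
>
>     class PeerTokenBucket {
>         private long tokens = 0;            // bytes; <= 0 except after credit()
>         private final long refillPerSecond; // this peer's fair share, ~bandwidth/(P+1); assumed > 0
>
>         PeerTokenBucket(long refillPerSecond) { this.refillPerSecond = refillPerSecond; }
>
>         synchronized void charge(long bytes) { tokens -= bytes; }
>
>         synchronized void credit(long bytes) { tokens += bytes; notifyAll(); }
>
>         // Block the reader thread until the background refill brings the
>         // bucket back up to zero (refilling never pushes it above zero).
>         synchronized void awaitRefill() throws InterruptedException {
>             while (tokens < 0) {
>                 long waitMs = (-tokens) * 1000 / refillPerSecond + 1;
>                 long start = System.currentTimeMillis();
>                 wait(waitMs);
>                 long elapsed = System.currentTimeMillis() - start;
>                 if (tokens < 0) {
>                     tokens = Math.min(0, tokens + elapsed * refillPerSecond / 1000);
>                 }
>             }
>         }
>     }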
>
> Using NIO the extra threads could be avoided, but that is not really needed,
> seeing that a node is discouraged from having more than a dozen connected
> peers anyway.
>
> The threads will of course be synchronizing on local token/bucket handling -
> which is expected to take less than a microsecond - and on tasks like db
> access - which is very good, as it keeps the node from spiralling down due
> to CPU/disk performance starvation. A simple implementation would also limit
> each peer to at most one simultaneous db access - and that is good, as it
> throttles peers that abuse our disk resources (by flooding requests with
> htl=1, for example).
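>
> For illustration (hypothetical class, not Freenet code), the one-db-access-
> per-peer limit could be as simple as a fair binary semaphore per peer:
>
>     import java.util.concurrent.Semaphore;
>
>     class PerPeerStoreGate {
>         // At most one outstanding datastore lookup per peer; a peer flooding
>         // cheap requests (e.g. htl=1) mostly starves itself, not other peers.
>         private final Semaphore oneAtATime = new Semaphore(1, true);
>
>         byte[] fetch(Object key) throws InterruptedException {
>             oneAtATime.acquire();
>             try {
>                 return lookupInStore(key); // the (slow) disk access
>             } finally {
>                 oneAtATime.release();
>             }
>         }
>
>         private byte[] lookupInStore(Object key) { return null; /* placeholder */ }
>     }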
>
>
> - the current I/O bandwidth liability limiting mechanism is not deprecated
> at all; as toad has pointed out multiple times, it is essential for handling
> tiny requests that, often enough, result in large responses.
>
> Once a packet is read from a bound UDP socket:
>
> a) an insert or request: it is verified against the bandwidth liability
> limiters; if the checks pass, the insert/request is accepted as it currently
> is. Otherwise the insert/request is explicitly rejected (FNPRejectOverload),
> as it currently is. Nothing really changes here.
>
> At the moment the insert/request is accepted, the peer's token bucket might
> be charged, in order to discourage floods. The charged tokens are later
> credited to the peer we received the requested data from.
>
> b) data we are waiting for, either for our own node or for a forwarded
> request.
> c) auxiliary traffic such as acks, location swaps or keepalives.
>
> These undergo no liability limiting, as the transfer has already happened -
> exactly the same as now. Notice that if the received traffic is 'valuable'
> (explicitly requested earlier) we credit the peer with the tokens charged
> from the original requestor's bucket (the only mechanism that can raise the
> bucket size above zero), so this peer is not penalized (at the very least
> not as heavily as now) - and so this peer's requests are not postponed for
> long.
>
> As the bonus tokens are not just conjured up, but taken from other peers who
> are interested in the service, there should be no bursts happening. At the
> very least, charging the buckets might be done at the moment the data
> response is sent rather than at the moment of the request (or the charge can
> somehow be split between those two points in time).
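>
> Continuing the PeerTokenBucket sketch above (again purely illustrative), the
> charge/credit flow would look roughly like this:
>
>     class TokenFlowSketch {
>         // Called when we accept a request from 'requestor'; charge its bucket
>         // (here entirely at accept time; part of this could instead be charged
>         // when the response is sent).
>         static void onRequestAccepted(PeerTokenBucket requestor, long expectedBytes) {
>             requestor.charge(expectedBytes);
>         }
>
>         // Called when 'source' delivers data we asked for; hand it the tokens
>         // taken from the original requestor - the only way a bucket can go
>         // above zero.
>         static void onDataReceived(PeerTokenBucket source, long bytesCharged) {
>             source.credit(bytesCharged);
>         }
>     }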
>
>
>
> Now, a question should be raised - if all the liability limiter stuff is
> expected to work as it already does, why bother? The trick is that delayed
> socket reads will allow the remote side to notice in advance that this peer
> has become slower. The remote will not know whether that happened due to
> exceeding the 1/(P+1) share of bandwidth, or due to a CPU/disk/swap/whatever
> performance shortage: all these situations require slowing down to avoid
> rejects and/or timeouts. So the slowing down happens at two levels: first,
> the delayed socket reads smooth the load over a longer time period, and
> second, the requestor starts sending/queueing fewer requests here, making
> more use of alternative paths and less busy peers. And that's the whole
> point.
>
>
> Additional note: this algorithm requires no modification if UDP sockets are
> mixed with TCP ones (provided that the TCP window size is comparable to the
> MTU, not the default 32/64/128KB).
>
>
>
> Until last week I was under the impression that a timing-based mechanism for
> routing around overloaded nodes was already implemented. After taking the
> time to study the code more closely, that turns out to be false: AFAIS the
> routing around happens only upon an explicit FNPRejectOverload - the
> freenet.node.PeerManager._closerPeer() method does not take the peer's
> round-trip average into account at all. Here is the [rather simple]
> improvement I propose: instead of making the routing decision based solely
> on location distance while skipping 'unsuitable' locations, search for a
> peer by choosing the best (numerically highest) request advance speed,
> calculated specifically for each node:
>
> - a disconnected peer is skipped (as it is now);
> - the peers we have already tried to search for the key at are skipped (as
> it is now);
> - the peer the original request came from is skipped (as it is now);
> - once passive requests are implemented, other requestors of the key are to
> be skipped too (if they gain the key, the subscription will time out soon
> and then we will be able to try there);
> - [the maxDiff and the old-version logic is not too clear to me];
>
> Now the real salt:
> - the speed is calculated as signedLocationDiff(our_location, peer_location,
> key_location)/RTT(peer), where signedLocationDiff is positive if
> locationDiff(our_location, key_location) > locationDiff(peer_location,
> key_location), and negative otherwise (assuming
> peer_location != our_location, which would give a zero location difference
> but should never happen). That can be multiplied by nodeAveragePingTime if
> values in a predictable range (say -3..3) are preferred (but there is no
> need to provide guarantees).
> - a backed-off node's speed is reduced by 3 (or thereabouts; but maybe a
> smaller value, like 1 or even 0.5).
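>
> A sketch of that metric in Java (illustration only; helper names are made
> up, this is not PeerManager code). Locations live on a circle of
> circumference 1.0, as in Freenet's keyspace:
>
>     class AdvanceSpeedSketch {
>         static double circularDistance(double a, double b) {
>             double d = Math.abs(a - b);
>             return Math.min(d, 1.0 - d);
>         }
>
>         // Positive when the peer is closer to the key than we are; dividing
>         // by the peer's RTT lets a fast peer with a slightly worse location
>         // beat a slow peer with a slightly better one.
>         static double advanceSpeed(double ourLoc, double peerLoc, double keyLoc,
>                                    double peerRttMs, double nodeAveragePingMs,
>                                    boolean backedOff) {
>             double signedDiff = circularDistance(ourLoc, keyLoc)
>                               - circularDistance(peerLoc, keyLoc);
>             double speed = (signedDiff / peerRttMs) * nodeAveragePingMs; // optional normalisation
>             if (backedOff) speed -= 3.0; // penalty constant is a tuning knob (3, 1, 0.5, ...)
>             return speed;
>         }
>     }
>
> Routing then simply picks the connected, not-yet-tried peer with the highest
> advanceSpeed for the key.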
>
> Using the speed for sorting connections is very natural: it basically lets
> us answer the question 'which peer should the request be sent to in order
> to receive a response ASAP - preferring to use slightly more bandwidth on
> underloaded nodes (where bandwidth is idle, i.e. virtually free) rather than
> waiting for slow/overloaded nodes to do the job, and provided that the
> solicited key already exists where it should'.
>
> And so it gives the following advantages:
> - (major) we start routing around busy nodes well before they start
> generating FNPRejectOverload packets.
> - (minor) there is no longer a need to scan the peer list twice, first
> skipping backed-off nodes and then including them.
> - (noticeable) requests that result in a successful data retrieval will
> deliver the answer to the original requestor faster (on average).
> - (average) the routes will be more fluid and thus harder for an adversary
> to monitor or censor.
> - (minor) very fast nodes, even with a small store/cache of their own,
> provide an effective caching service (for popular keys) to their immediate
> peers that are slow but have a huge, well-specialized store.
>
> And note that this algorithm will be useful even with the current
> code/protocol, without delayed socket reads; delaying the socket reads will
> undoubtedly have a positive impact by smoothing the choice over time (and
> efficiently fighting any oscillations) and by making the routing around busy
> nodes start much earlier (starting with 'boundary' keys, and then, if the
> node gets increasingly busy anyway, by further narrowing down the part of
> the location space we try at the busy node).