Still thinking about this - but is there any place here for the  
"minimum request interval" mechanism we used in 0.5?

Ian.

On 20 Jun 2006, at 17:24, Matthew Toseland wrote:

> On Wed, Jun 21, 2006 at 12:02:45AM +0100, Michael Rogers wrote:
>> Matthew Toseland wrote:
>>> I don't understand what you mean when you talk about token schemes.
>>> We can certainly tell our peers how many requests they can send -
>>> but so what? How does this limit load?
>>
>> We tell a peer when its bucket is empty, and it stops sending
>> requests. This is more efficient than waiting for a request we know
>> we'll reject anyway, then rejecting it. (But we can fall back to that
>> if the peer isn't well behaved or the messages cross on the wire.)
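>>
>> A minimal sketch of what I mean, in Java (all names here are
>> illustrative, not actual Freenet code):
>>
>>     /** Token bucket we maintain for one peer's incoming requests. */
>>     class PeerBucket {
>>         private int tokens;
>>         private boolean toldEmpty;
>>
>>         /** Called when the peer sends us a request. */
>>         synchronized boolean tryAccept(Peer peer) {
>>             if (tokens > 0) {
>>                 tokens--;
>>                 return true;
>>             }
>>             if (!toldEmpty) {
>>                 peer.send(new BucketEmptyMessage()); // "stop sending"
>>                 toldEmpty = true;
>>             }
>>             return false; // the fallback: reject as before
>>         }
>>
>>         /** Called when we decide the peer may send another request. */
>>         synchronized void addToken(Peer peer) {
>>             tokens++;
>>             if (toldEmpty) {
>>                 peer.send(new BucketRefilledMessage(tokens));
>>                 toldEmpty = false;
>>             }
>>         }
>>     }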
>>
>>> I don't see how you are propagating them either.
>>
>> There are two answers to this, depending on what you mean by
>> propagating. I've got the impression from earlier discussions that
>> you mean asking the source to slow down. If that's the case then
>> here's how we propagate load:
>>
>> * A peer would like to send us a request
>> * There are no tokens left in its bucket (and we've told it so)
>> * Being unable to forward the request, it has to reject it
>> * The RejectedOverload travels back to the source
>
> So let me get this straight: RejectedOverloads kill the request?
>
>> * The source slows down
>
> Okay, so the token mechanism will limit flooding; a node which spams
> requests will not receive any more tokens than any other node, and
> therefore will not be able to send more requests than any other.
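>
> A sketch of why that holds (hypothetical names, building on the
> bucket sketch above): tokens are granted to every peer at the same
> rate, independent of how many requests the peer actually sends, so a
> flooder simply runs dry:
>
>     // Periodic refill: every peer gets the same allowance regardless
>     // of demand, so sending faster buys a flooder nothing.
>     void refillAll(java.util.Map<Peer, PeerBucket> buckets, int perPeer) {
>         for (java.util.Map.Entry<Peer, PeerBucket> e : buckets.entrySet())
>             for (int i = 0; i < perPeer; i++)
>                 e.getValue().addToken(e.getKey());
>     }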
>
> However, you don't solve the fundamental problem with the currently
> implemented mechanism in Freenet 0.7, which is that we require the
> original sender to slow down in response to RejectedOverloads - and
> we do not require any similar reaction from any other node. This
> means that at a network level it may be possible to distinguish
> between the original sender and any other node - which is BAD! 0.5
> did not have this vulnerability.
>>
>> On the other hand, if by 'propagating' you mean calculating how many
>> tokens to give our peers on the basis of how many tokens our peers
>> have given us, here's what I'd suggest:
>>
>> * Calculate the ratio of incoming to outgoing requests (running average)
>> * Work out how many requests our peers are willing to accept, in total
>> * This tells us how many requests we can accept, in total
>> * Divide these requests among our peers (eg give each bucket the same
>>   number of tokens, with full buckets overflowing into the fullest
>>   non-full bucket - see the sketch below)
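>>
>> A sketch of the division step in Java (illustrative only - the array
>> 'level' holds each bucket's current tokens, 'capacity' its maximum):
>>
>>     /** Divide 'total' tokens among buckets; overflow from full
>>         buckets goes to the fullest non-full bucket. */
>>     static void divideTokens(int total, int[] level, int[] capacity) {
>>         int n = level.length;
>>         int share = total / n;
>>         int spare = total % n;
>>         for (int i = 0; i < n; i++) {
>>             int room = capacity[i] - level[i];
>>             int given = Math.min(share, room);
>>             level[i] += given;
>>             spare += share - given; // overflow from full buckets
>>         }
>>         // Pour the overflow into the fullest non-full bucket first.
>>         while (spare > 0) {
>>             int best = -1;
>>             for (int i = 0; i < n; i++)
>>                 if (level[i] < capacity[i] &&
>>                     (best == -1 || level[i] > level[best])) best = i;
>>             if (best == -1) break; // every bucket is full
>>             int given = Math.min(spare, capacity[best] - level[best]);
>>             level[best] += given;
>>             spare -= given;
>>         }
>>     }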
>
> This will produce misrouting. Our nodes' locations are not
> necessarily distributed evenly across the keyspace at any given
> moment. We have to take this fact into account. And we can't assume
> that we receive requests distributed randomly across the keyspace
> either.
>
> So we have to track what proportion of incoming (or local) requests go
> to each node. How we produce an overall number then depends on how we
> handle misrouting.
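>
> E.g. a decaying per-node count (a sketch, hypothetical names):
>
>     // Exponentially decaying estimate of the fraction of our
>     // requests routed to each peer; the fractions sum towards 1.
>     class RoutingFraction {
>         private final java.util.Map<Peer, Double> share =
>             new java.util.HashMap<Peer, Double>();
>         private static final double DECAY = 0.99;
>
>         void routed(Peer to) {
>             for (java.util.Map.Entry<Peer, Double> e : share.entrySet())
>                 e.setValue(e.getValue() * DECAY);
>             Double old = share.get(to);
>             share.put(to, (old == null ? 0.0 : old) + (1 - DECAY));
>         }
>
>         double fraction(Peer p) {
>             Double f = share.get(p);
>             return f == null ? 0.0 : f;
>         }
>     }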
>
> We could just add up the raw numbers for possible queries relayed to
> each node, but this is unacceptable: If we have an ubernode, we would
> end up accepting far more requests than we can handle without serious
> misrouting (i.e. forwarding them all to the ubernode).
>
> An option that would minimize misrouting:
>
> total = min over all nodes
>               ( node capacity / fraction of requests routed to that node )
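>
> (E.g. if some node can handle 10 requests/sec but would receive 50%
> of our routed requests, total = 10 / 0.5 = 20 requests/sec accepted
> overall; that node then sees at most its capacity.)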
>
> Obviously the problem here is that while it would eliminate
> misrouting, at least in theory, it would also be extremely vulnerable
> to slow nodes. So perhaps excluding the slowest few nodes would help.
>
> However, there is another option. We don't have to do this sort of
> calculation; we can simply pass the tokens on directly:
>
> Propagate the tokens, and queue.
>
> Each node has, for incoming requests:
> - A token bucket.
> - A bunch of queued or active requests, size not exceeding the current
>   value of the token bucket.
>
> Every time the token bucket size changes for a node, we send a message
> to the node indicating the new size, probably piggybacked on some
> related message.
>
> When we answer a request from the store, or a request otherwise
> terminates locally, we create a token and give it to a deserving
> node's token bucket. If the chosen node is not the same as the node
> whose request just completed, its token bucket is reduced by 1 at the
> same time as the running request is removed from its queue.
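>
> A sketch of that bookkeeping (hypothetical names):
>
>     /** A request from peer 'from' has just terminated locally. */
>     void onRequestCompleted(Peer from) {
>         removeFromQueue(from);          // its running request is done
>         Peer deserving = pickDeservingPeer();
>         if (deserving != from) {
>             // Move the freed token: shrink the completing node's
>             // bucket, grow the deserving node's bucket.
>             changeBucket(from, -1);
>             changeBucket(deserving, +1);
>         }
>         // If deserving == from, the bucket size is unchanged: the
>         // token simply becomes available to 'from' again.
>     }
>
>     void changeBucket(Peer p, int delta) {
>         int size = bucketSize(p) + delta;
>         setBucketSize(p, size);
>         // Tell the peer its new size, piggybacked if possible.
>         p.send(new BucketSizeMessage(size));
>     }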
>
> We match queued requests with output nodes: When we receive a token
> from a potential output node, we check whether there are any queued
> requests that would like to be sent to that node. If there are, we
> pick one and forward it. Likewise if we have a surplus of tokens for
> a node and we receive a request which would like to be sent to that
> node, we forward it to that node.
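>
> In sketch form (hypothetical names):
>
>     /** A token arrived from, or a surplus exists for, output node n. */
>     void tryForwardQueued(Peer n) {
>         while (surplusTokens(n) > 0) {
>             Request req = queue.pickRequestWanting(n);
>             if (req == null) return;  // nothing queued wants this node
>             consumeToken(n);
>             forward(req, n);
>         }
>     }
>
>     /** A new request arrived: forward at once if its node has tokens. */
>     void onIncomingRequest(Request req) {
>         Peer n = req.firstChoice();
>         if (surplusTokens(n) > 0) {
>             consumeToken(n);
>             forward(req, n);
>         } else {
>             queue.add(req);
>         }
>     }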
>
> Requests are normally forwarded to their first choice. However, the
> first choice may change as nodes go offline etc. Misrouting can be
> introduced at this point: If the first choice node is significantly
> slower than the median, we can allow most requests that would go to
> that node to be sent to the second choice - perhaps by some simple
> probabilistic scheme. Alternatively we can simply exclude really slow
> nodes from routing, but that would suck.
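>
> One possible probabilistic scheme (a sketch; the 1/2 and 3/4 here are
> made-up numbers, not a tuned proposal):
>
>     /** Occasionally divert requests away from a very slow first choice. */
>     Peer chooseHop(Peer first, Peer second, double firstSpeed,
>                    double medianSpeed, java.util.Random random) {
>         if (firstSpeed < medianSpeed / 2.0) {
>             // First choice is much slower than the median: send most
>             // requests to the second choice instead.
>             if (random.nextDouble() < 0.75)
>                 return second;
>         }
>         return first;
>     }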
>
> Note that the first choice is dependent on the request history as
> well as the location; the node it comes from, the nodes it has been
> to, and so on, affect the choice of next hop.
>
>
> Now, how does this limit load?
>
> Firstly, we will never start requests faster than we can physically
> handle them, except during the early bootstrapping phase. Nodes
> should limit themselves rather nicely on bandwidth. Timeouts would
> generally be due to bugs, transient network failures and so on. Since
> timeouts take a long time, by definition, if timeouts are happening
> due to load then very few requests are being started; so the system
> should balance itself.
> -- 
> Matthew J Toseland - toad at amphibian.dyndns.org
> Freenet Project Official Codemonkey - http://freenetproject.org/
> ICTHUS - Nothing is impossible. Our Boss says so.
