On Thu, Aug 14, 2003 at 09:59:53AM -0400, Zooko wrote: > Hm. The longer I think about it, the more ways I come up with for two > different NGrouting schemes to develop divergent views of which peers handle > which keyspace. That would be bad.
I am not so sure, with NGrouting there isn't really such a clearly
defined idea of "peer Y is the right place to route a request for key
X", but rather "peer Y will take time Z to get key X", where we try to
minimize Z. Provided that the algorithm's ability to estimate Z is good
enough - it should work.
> Perhaps the convergence property should be written down in the NGrouting
> doc -- it is very important for the behavior of a large-scale network.
I am still not sure that what you refer to as "convergence" isn't
implicit in the specification already. If nodes are making radically
different estimates to each-other about routing times, then those
estimates can't be very good. If nodes are making good estimates, then
those estimates will be similar from node to node.
> There is definitely some information lost by this though. Suppose
> peer A and peer B have identical chance of returning data versus
> returning DNF, and it takes them an identical amount of time to return
> the data if they return it, but peer A returns DNFs twice as fast as
> peer B. Well, peer A is definitely better, and the more frequent DNFs
> are the better peer A becomes, but the Mnet measure gives peer A and
> peer B identical scores.
>
> Does the Freenet measure distinguish between peer A and peer B? I'm
> not sure.
Yes, Freenet tracks and tries to predict the time required for a node to
return a DNF for a given key - since this is the time a node must wait
before re-requesting the data if a DNF occurs.
> Yep. I wish there were some really good simulations of NGrouting -- it seems
> hard to answer these questions analytically.
Perhaps, although I think it is more amenable to analysis than the
current approach.
> That is: if people inserting data
> into the network are choosing different nodes to store the data than
> the people requesting the data from the network choose to query, then
> the network as a whole will suffer even though from any individual
> node's perspective his own strategy is locally optimal. A similar
> argument goes for two different queriers using different routing,
> because if their routes are divergent they lose caching, and maybe
> they even get DNFs.
I think if a situation where nodes weren't making similar estimates to
each other about routing times occurs, then it basically means that the
estimation mechanism isn't good enough.
There should be a positive feedback loop, as the better nodes get at
routing requests, the easier it will be to predict their performance,
allowing nodes to get even better at routing.
> > In the case where a node received most of its requests within 0.004
> > of its keyspace - then isn't it desirable that most of the points be
> > within that segment to more accurately represent the variations in
> > request times within that segment of keyspace for which the rest of
> > the network obviously considers this node to be an expert?
>
> Yes, but wouldn't it take an infinitely long time for the points to
> get dragged into a small space? I guess I misunderstood how the
> running average is computed.
I don't think so - not unless the implementation is buggy - the relevant
code is in the report() method in:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/freenet/freenet/src/freenet/node/rt/ResponseTimeEstimator.java?rev=1.4.2.3&content-type=text/vnd.viewcvs-markup
Perhaps Mark J Roberts, who implemented this class, can comment
(although he has been quiet lately).
Ian.
--
Ian Clarke [EMAIL PROTECTED]
Coordinator, The Freenet Project http://freenetproject.org/
Weblog http://slashdot.org/~sanity/journal
pgp00000.pgp
Description: PGP signature
