On Thu, Apr 20, 2000 at 01:43:39PM +0100, Theodore Hong wrote:
> Michael Wiktowy <spam at mindless.com> wrote:
> > > FROM: finney.org
> > > I want to reiterate a comment I made earlier, with regard to storing
> > > things into the Freenet under a "searchkey" like mp3.  This is not
> > > going to work, because too many documents will use that keyword, and
> > > they will all try to go onto that one node (even if the "documents" are
> > > just index or metadata entries there are too many).
> > 
> > I read your concerns before and can totally see where you are coming
> > from.  There certainly will be an increased load on IPs that the smart
> > routing thinks should be the home for hashes of popular keywords. There
> > are other things to consider though. I don't know how the routing
> > algorithm works exactly but it seems to me that its focus can be
> > adjusted. What I mean is the "best" IP for a particular hash may not
> > strictly be one single IP but rather a group of IPs. By adjusting the
> > fuzziness of the targeting you might reduce the efficiency of the routing
> > mechanism by a hop or two but the load on the targeted server will be
> > dropped by a lot more.
> 
> I don't think this is really a problem.  The thing is, the routing is not
> absolute -- it's not the case that globally, 123.56.78.* might have a
> really big affinity for the hash of the keyword mp3.  Each node decides for
> itself which target it thinks might have an affinity for mp3, based on
> "past experience."  If we draw all those associations as a graph, it's
> possible all the arrows would go towards a single node, but more likely
> they would swirl into a stream that loops around and doesn't go anywhere in
> particular.
> 
> theo
>

After thinking about it, I have to agree. A 'tag', or a 'tag' with some
meta-data attached, is going to behave very much like ordinary popular
data. As a thought experiment, I pictured a network where one node had
some -very- popular data in it, as an initial condition. What would happen
as requests were made? The data would disperse.
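
To make the thought experiment concrete, here is a rough sketch in Python.
Everything in it is my own assumption and not Freenet code: the ring-plus-
shortcuts topology (build_ring), the breadth-first search standing in for
the real routing (path_to_nearest_holder), and the rule that every node on
the return path keeps a copy. It only illustrates the dispersal idea.

```python
import random
from collections import deque

def build_ring(n, extra_links, rng):
    """Hypothetical topology: a ring plus a few random shortcut links."""
    neighbors = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
    for _ in range(extra_links):
        a, b = rng.sample(range(n), 2)
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors

def path_to_nearest_holder(neighbors, start, holders):
    """Breadth-first search from the requester to the closest holder.

    This is a stand-in for real routing, which works from each node's
    past experience rather than from global knowledge.
    """
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node in holders:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path  # holder first, requester last
        for nxt in neighbors[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return []

def simulate(n=100, requests=50, seed=0):
    """Start with the popular data on node 0 only, then issue requests."""
    rng = random.Random(seed)
    neighbors = build_ring(n, extra_links=20, rng=rng)
    holders = {0}
    served = {0: 0}  # how many requests each holder answered
    for _ in range(requests):
        requester = rng.randrange(n)
        path = path_to_nearest_holder(neighbors, requester, holders)
        served[path[0]] = served.get(path[0], 0) + 1
        holders.update(path)  # assumed: every node on the way back caches it
    return holders, served

if __name__ == "__main__":
    holders, served = simulate()
    print("nodes holding the data:", len(holders))
    print("requests answered by the original node:", served[0])
```

In this toy model the set of holders grows with each request, so the
original node answers a shrinking share of them. How closely that matches
the real network depends on the caching and routing rules, which the
sketch only guesses at.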

If there is a problem with a tag / meta-data pair routed by tag, it would
be that searches for a particular pair would have to go deeper than for
normal data, because you would be trying to route to a specific instance
of popular data.

As to key clustering in general, there is a randomizing factor involved: the
user's interaction with the node. An 'undisturbed' node might have a tendency
to attract close keys, ending up with a datastore of closely packed key values,
but any amount of user interaction involving inserts and requests is going to
stir the pot. This mixing also has to be considered from a network view: even
if a single node is undisturbed, others that reference it will be disturbed,
so the pattern of requests reaching it will change.
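
A toy illustration of the "stir the pot" effect, again in Python and again
built on assumptions of mine: a 16-bit key space (KEYSPACE), a fixed-size
datastore with least-recently-used eviction (CAPACITY), routed traffic
modelled as keys drawn close to one "specialty" value, and local user
traffic modelled as uniformly random keys. None of these are taken from
the actual node code.

```python
import random
import statistics
from collections import OrderedDict

KEYSPACE = 2 ** 16   # assumed toy key space
CAPACITY = 200       # assumed datastore size, least-recently-used eviction

def key_spread(user_fraction, steps=5000, seed=1):
    """Return the standard deviation of the keys left in the datastore.

    user_fraction is the share of traffic that comes from the local user
    (uniformly random keys); the rest is routed traffic near one value.
    """
    rng = random.Random(seed)
    centre = KEYSPACE // 2
    store = OrderedDict()
    for _ in range(steps):
        if rng.random() < user_fraction:
            key = rng.randrange(KEYSPACE)  # local user: any key at all
        else:
            # routed traffic: keys close to the node's assumed specialty
            key = int(rng.gauss(centre, KEYSPACE / 100)) % KEYSPACE
        store[key] = True
        store.move_to_end(key)             # mark as most recently used
        if len(store) > CAPACITY:
            store.popitem(last=False)      # evict the least recently used
    return statistics.pstdev(list(store))

if __name__ == "__main__":
    for fraction in (0.0, 0.1, 0.3):
        print(fraction, round(key_spread(fraction)))
```

With no user traffic the stored keys stay tightly packed around the
specialty value; even a modest share of user traffic widens the spread
considerably. This covers a single node only and says nothing about the
network-wide effect.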

Unfortunately, I don't have the math to demonstrate this conclusively. In fact,
I don't think -anyone- has the math to model the network as a whole,
which makes things pretty tricky.

I'm going to revise the meta-data spiel I wrote up; I think it's worth it
even if it's just for future reference. I'll make a note on the chat list
when it's available, if anyone is interested.


David Schutt

_______________________________________________
Freenet-dev mailing list
Freenet-dev at lists.sourceforge.net
http://lists.sourceforge.net/mailman/listinfo/freenet-dev
