[freenet-devl] Fwd: Some musings on Freenet

Oskar Sandberg Sat, 26 May 2001 20:52:33 +0200

On Sat, May 26, 2001 at 07:15:05PM +0100, Adam Langley wrote:
<> 
> Maybe satisfied requests could pass on a YouShouldHave message. This
> would cause the node to start up a request if it doesn't have the
> data, or forward a YouShouldHave if it does.


No abuse possible there....

> An since with an announce protocol the node knows it's center of
> keyspace (COKS) it can tell when it has hit a local epicenter and
> maybe forward to a random node.

The fact that we initialize it as something does not mean that we want to
it be static.

<>


-- 
'DeCSS would be fine. Where is it?'
'Here,' Montag touched his head.
'Ah,' Granger smiled and nodded.

Oskar Sandberg
oskar at freenetproject.org

_______________________________________________
Devl mailing list
Devl at freenetproject.org
http://lists.freenetproject.org/mailman/listinfo/devl

>From - Tue May 29 05:42:47 2001
X-UIDL: 3adbdd6c00000669
X-Mozilla-Status: 0001
X-Mozilla-Status2: 00000000
Return-Path: <devl-admin at freenetproject.org>
Received: from hawk.freenetproject.org (postfix@[4.18.42.11])
        by funky.danky.com (8.9.3/8.8.7) with ESMTP id OAA22334
        for <danello at danky.com>; Sat, 26 May 2001 14:47:17 -0400
Received: from hawk.freenetproject.org (localhost [127.0.0.1])
        by hawk.freenetproject.org (Postfix) with ESMTP
        id E8ECC5801B; Sat, 26 May 2001 12:16:22 -0700 (PDT)
Delivered-To: devl at freenetproject.org
Received: from finney.org (226-132.adsl2.netlojix.net [207.71.226.132])
        by hawk.freenetproject.org (Postfix) with ESMTP id 070C157B37
        for <devl at freenetproject.org>; Sat, 26 May 2001 12:15:12 -0700 (PDT)
Received: (from hal at localhost)
        by finney.org (8.9.3/8.9.3) id LAA08550;
        Sat, 26 May 2001 11:48:07 -0700
From: h...@finney.org
Message-Id: <200105261848.LAA08550 at finney.org>
To: devl at freenetproject.org
Cc: nurf at wuzzle.org
Subject: [freenet-devl] Re: Fwd: Some musings on Freenet
Sender: devl-admin at freenetproject.org
Errors-To: devl-admin at freenetproject.org
X-BeenThere: devl at freenetproject.org
X-Mailman-Version: 2.0.3
Precedence: bulk
Reply-To: devl at freenetproject.org
List-Help: <mailto:devl-request at freenetproject.org?subject=help>
List-Post: <mailto:devl at freenetproject.org>
List-Subscribe: <http://lists.freenetproject.org/mailman/listinfo/devl>,
        <mailto:devl-request at freenetproject.org?subject=subscribe>
List-Id: Discussion of information related to Freenet development 
<devl.freenetproject.org>
List-Unsubscribe: <http://lists.freenetproject.org/mailman/listinfo/devl>,
        <mailto:devl-request at freenetproject.org?subject=unsubscribe>
List-Archive: <http://lists.freenetproject.org/pipermail/devl/>
Date: Sat, 26 May 2001 11:48:07 -0700
Status: O
Content-Length: 8328
Lines: 175

Ray Heasman wrote:
> 3. Insertion of data
>
> My understanding is that FreeNet performs a depth-first backtracking graph
> traversal of nodes on the network.

Technically, yes, but keep in mind that backtracking happens only in
two circumstances: if you hit a dead node (hopefully rare), or if you
hit a node which is already working on this search request (common).

> Insertion has two stages: first, a search
> to check that the data to be inserted does not already exist, second, an
> insertion phase that tunnels data to the last point at which the search
> failed.

...and caches it along the way.

> During the request phase, each node forwards a request to nodes it thinks
> most likely of having the data being inserted. This is done by making an
> estimate of how close the key describing the search is to the keys usually
> handled by the nodes it knows about.

Isn't it still just a matter of choosing the node which holds the
key closest to the target one?  This description makes it sound like
a more elaborate algorithm is used.

> Unique IDs are used so that requests
> are never processed twice by the same node (i.e. They are used by the search
> algorithm to change the graph of nodes into a dynamically generated tree
> that contains no loops). This eliminates redundant searching.

Right, this is the main source of the "backtracking" above.

> Search packets also contain a "hops to live" parameter (HTL), which is
> decremented every time the search packet is forwarded to a node.
>
> Assuming the data to be inserted is not already on the network, eventually
> HTL drops to 0, and the lucky node thus found signals the search failed. At
> this point, the originating node starts streaming the data back along the
> path searched (sans backtracks) to the lucky node.

Keeping in mind that the main cause of backtracks is that we hit the path
already, the "sans backtracks" is not really meaningful here in that you
are considering the same set of nodes whether you count backtracks or not.

> What I find interesting about this is that we are storing data in the last
> place the network looked (rather than the first).

This is not true.  The data is stored in the first place the network
looked, namely the best neighbor of the initiating node.  It is stored in
the second place the network looked, namely the best neighbor of this next
node.  And so on, all the way up to the last place the network looked.
This is entirely because of caching during insertion.

Reducing caching, as proposed later, will make it less true that we
store data in the first place we looked.  That may or may not be a good
idea but it is important to understand the role of caching.

> 5. Search keys
>
> Keys are 160 bit hashs of the data to be stored. This means that data with
> similar keys may not be associated in terms of content at all. This is a
> good thing as data are not clustered on servers by subject, making the loss
> of a node less damaging. However for searching to work, there has to be a
> concept of "closeness" of keys. Adam Langley provides a rough description of
> how this is currently done in [AL1]. This combines the test of whether one
> key is "less" than another with a binary tree to find the closest key in a
> group. Closeness is defined as the distance between leaves on this tree.

I thought the tree based search was just a speed optimization.  The
basic heuristic is still to define relative closeness, right?  We can
tell if A is closer to B or to C.  This should be irrespective of what
other keys D, E, F, ... are present on the node.

> 5.1 Reservations
>
> One thing that worries me about this definition of closeness is that it is
> entirely dependent on what keys a node currently holds. An outside observer
> with knowledge of only a subset of keys on the node would not be able to
> predict the closeness of two keys calculated by that node. It also means
> that as each node prunes and updates its node routing list it will change
> the behaviour of future routing decisions in unpredictable ways.

I believe that the only real issue is which key on the node is the closest
to the target one.  Any self-organizing network is going to change its
routing decisions over time based on the information present at each node,
so this is not really a cause for worry.

> 6. Potential problems
>
> The model mentioned in section 2 suggests a potential problem. In order to
> be a good self-organising network that routes requests to nodes containing
> their associated data, it is necessary for nodes in the vicinity of the
> insertion point to have many requests routed through them.

I don't understand this, and it may be based on the false assumption that
most requests "should" be satisfied by a node near the original insertion
point of the data.  But Freenet is not designed to follow this principle.
It is entirely possible that most requests will be satisfied by a node
far from the original insertion point.  In that case there is no reason
for nodes near the insertion point to see many requests for that data.

> In the current
> implementation, caching is very aggressive: a node caches data the first

[freenet-devl] Fwd: Some musings on Freenet

Reply via email to