While we've got the Freenet and I2P people together on one list, I'd like 
to put forward an idea that's been growing in my brain for some time, as 
the taxi driver says. The idea is hypertext with location-independent, 
verifiable identifiers, which I call the free-floating web. It's not a 
new idea by any means, but it would be useful if we could agree on a 
standard so that files can easily migrate between different anonymous 
and censorship-resistant networks.

Goals:
pseudonymous and anonymous publishing
incremental verification
plausible deniability
mutable files which publishers can update
immutable files which publishers can't be forced to update

Tools:
one-way hashes
block ciphers
public-key signatures

1. Immutable files

Freenet's CHKs offer immutability and plausible deniability: a file is 
encrypted using its own hash as the key, the hash of the encrypted file 
identifies it, 
and the hash of the unencrypted file (which isn't revealed to relays) 
can be used to decrypt it.
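
To pin the idea down, here's a rough Python sketch of the CHK 
construction, with SHA-256 and AES-CTR standing in for whatever hash 
function and block cipher the standard eventually names (the zero nonce 
keeps encryption deterministic, so identical files converge on 
identical identifiers). It's only meant to show the shape of the 
scheme, not Freenet's exact wire format:

    import hashlib
    from cryptography.hazmat.primitives.ciphers import (
        Cipher, algorithms, modes)

    def chk_encode(plaintext):
        # The decryption key is the hash of the unencrypted file; it
        # goes into the hyperlink but is never shown to relays.
        decrypt_key = hashlib.sha256(plaintext).digest()
        enc = Cipher(algorithms.AES(decrypt_key),
                     modes.CTR(b"\x00" * 16)).encryptor()
        ciphertext = enc.update(plaintext) + enc.finalize()
        # The identifier is the hash of the encrypted file; relays can
        # verify it without being able to read the plaintext.
        identifier = hashlib.sha256(ciphertext).digest()
        return identifier, decrypt_key, ciphertext

    def chk_decode(ciphertext, decrypt_key):
        dec = Cipher(algorithms.AES(decrypt_key),
                     modes.CTR(b"\x00" * 16)).decryptor()
        return dec.update(ciphertext) + dec.finalize()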

CHKs can be combined with hash trees to allow incremental verification 
and parallel downloads. The file is encrypted with its hash, and the 
encrypted blocks form the bottom layer of the hash tree. The network 
representation of the file contains all the blocks of the hash tree in 
depth-first order, so that each subtree occupies a contiguous range. For 
example if the hash tree looks like this:

            A1
     B1            B2
  C1     C2     C3     C4
D1 D2  D3 D4  D5 D6  D7 D8

then the network representation looks like this:

A1 B1 C1 D1 D2 C2 D3 D4 B2 C3 D5 D6 C4 D7 D8
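
Here's a minimal sketch of building that representation, assuming a 
binary tree, SHA-256, and internal "blocks" that are simply the 
concatenated hashes of their children (all choices the standard would 
have to pin down):

    import hashlib

    def leaf(block):
        # Bottom layer: the encrypted data blocks themselves.
        return {"hash": hashlib.sha256(block).digest(),
                "payload": block, "children": []}

    def parent(children):
        # An internal node's block is the concatenation of its
        # children's hashes; its own hash is the hash of that block.
        payload = b"".join(c["hash"] for c in children)
        return {"hash": hashlib.sha256(payload).digest(),
                "payload": payload, "children": children}

    def build_tree(blocks, fanout=2):
        nodes = [leaf(b) for b in blocks]
        while len(nodes) > 1:
            nodes = [parent(nodes[i:i + fanout])
                     for i in range(0, len(nodes), fanout)]
        return nodes[0]

    def serialise(node):
        # Depth-first (pre-order), so each subtree is contiguous:
        # A1 B1 C1 D1 D2 C2 D3 D4 B2 C3 D5 D6 C4 D7 D8 for the tree
        # above.
        out = [node["payload"]]
        for child in node["children"]:
            out.extend(serialise(child))
        return out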

The on-disk representation might be different - for example, the file 
might be stored unencrypted in a shared folder, with the encryption key 
and the rest of the hash tree stored in a separate metadata file 
(convenience may be more important than plausible deniability for some 
users).

The network representation is designed to allow parallel downloads - 
each subtree can be requested and verified independently. Given the root 
hash of the tree, the root hash of the subtree, and the starting and 
ending offsets, a server can quickly find the requested blocks and if 
necessary encrypt them as it reads them from disk. The client and relays 
can verify each block as soon as it's received, using a hash from the 
request message or from a previous block in the subtree.
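
Verification on the receiving side is then a single pass over the 
stream. The sketch below assumes the layout sketched earlier and that 
the client knows the tree depth (derivable from the file length, 
however that ends up being communicated):

    import hashlib

    HASH_LEN = 32  # SHA-256

    def verify(expected_hash, stream, levels):
        # 'stream' yields the blocks of one subtree in depth-first
        # order. Each block is checked the moment it arrives against a
        # hash we already trust: the root or subtree hash from the
        # request, or a hash read from its parent's block earlier in
        # the stream. 'levels' is the number of tree levels below this
        # node (0 for a leaf).
        block = next(stream)
        if hashlib.sha256(block).digest() != expected_hash:
            raise ValueError("block failed verification")
        if levels > 0:
            child_hashes = [block[i:i + HASH_LEN]
                            for i in range(0, len(block), HASH_LEN)]
            for child_hash in child_hashes:
                verify(child_hash, stream, levels - 1)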

A hyperlink to an immutable document contains the root hash of the tree, 
the hash of the unencrypted file, the hash function and the block 
cipher. A request message contains the root hash, the hash function, and 
optionally the subtree hash and the starting and ending offsets. Relays 
and caches can verify the file but they can't decrypt it without the 
hash of the unencrypted file.
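
In struct form that might look something like this (the field names 
are mine, just to make the shape concrete):

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class ImmutableLink:
        root_hash: bytes        # root of the hash tree
        decrypt_key: bytes      # hash of the unencrypted file
        hash_function: str      # e.g. "sha-256"
        block_cipher: str       # e.g. "aes-256-ctr"

    @dataclass
    class ImmutableRequest:
        root_hash: bytes
        hash_function: str
        subtree_hash: Optional[bytes] = None
        start_offset: Optional[int] = None
        end_offset: Optional[int] = None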

2. Mutable files

Mutable files can be implemented using public-key signatures. The 
publisher creates a redirect block which contains:

1. A file name, chosen by the publisher (each public key defines a 
separate namespace)
2. A hyperlink to the latest (immutable) version of the file

Fields 1 and 2 are encrypted with a unique symmetric key to hide them 
from relays. The following fields are unencrypted:

3. A monotonically-increasing version number, which could be a timestamp
4. The publisher's public key
5. The signature function
6. A signature of fields 1-2 (encrypted) and 3-5 using the publisher's 
private key
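
Here's a rough sketch of building and checking such a redirect block, 
with Ed25519 and AES-CTR as placeholder primitives and the exact field 
framing hand-waved; the version number doubles as the CTR nonce so the 
keystream isn't reused across updates:

    import time
    from cryptography.hazmat.primitives import serialization
    from cryptography.hazmat.primitives.asymmetric.ed25519 import (
        Ed25519PrivateKey, Ed25519PublicKey)
    from cryptography.hazmat.primitives.ciphers import (
        Cipher, algorithms, modes)

    def make_redirect(signing_key, symmetric_key, name, link):
        # Field 3: a monotonically increasing version (a timestamp).
        version = int(time.time())
        # Fields 1-2, encrypted so relays can't read them (toy framing).
        enc = Cipher(algorithms.AES(symmetric_key),
                     modes.CTR(version.to_bytes(16, "big"))).encryptor()
        hidden = enc.update(name + b"\n" + link) + enc.finalize()
        # Field 4: the publisher's public key, raw-encoded.
        public_key = signing_key.public_key().public_bytes(
            serialization.Encoding.Raw, serialization.PublicFormat.Raw)
        redirect = {"hidden": hidden, "version": version,
                    "public_key": public_key, "sig_function": "ed25519"}
        # Field 6: signature over fields 1-2 (encrypted) and 3-5.
        signed = (hidden + version.to_bytes(8, "big") + public_key
                  + redirect["sig_function"].encode())
        redirect["signature"] = signing_key.sign(signed)
        return redirect

    def relay_verify(redirect):
        # Relays and caches check the signature and version number
        # only; without the symmetric key they can't read the name or
        # the link.
        signed = (redirect["hidden"]
                  + redirect["version"].to_bytes(8, "big")
                  + redirect["public_key"]
                  + redirect["sig_function"].encode())
        key = Ed25519PublicKey.from_public_bytes(redirect["public_key"])
        key.verify(redirect["signature"], signed)  # raises if invalid

    signing_key = Ed25519PrivateKey.generate()
    redirect = make_redirect(signing_key, b"\x00" * 32,  # demo key only
                             b"index.html", b"chk:<hard link>")
    relay_verify(redirect)

A client that holds the symmetric key from the hyperlink decrypts 
'hidden' the same way (same key, version number as nonce) to recover 
the file name and the link to the latest immutable version.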

A hyperlink to a mutable document contains the hash of the public key, 
the symmetric key, the hash function, the signature function, and 
optionally the version number. A request message contains the hash of 
the public key, the hash function, the signature function, and 
optionally the minimum and maximum acceptable version numbers. Relays 
and caches can verify the signature and version number, but they can't 
read the file name or the hyperlink without the symmetric key.
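
Again in struct form, with field names of my own choosing:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class MutableLink:
        public_key_hash: bytes   # selects the publisher's namespace
        symmetric_key: bytes     # decrypts the name and the hyperlink
        hash_function: str
        signature_function: str
        version: Optional[int] = None

    @dataclass
    class MutableRequest:
        public_key_hash: bytes
        hash_function: str
        signature_function: str
        min_version: Optional[int] = None
        max_version: Optional[int] = None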

3. Bundles

When linking to a file, authors can choose between "hard linking" to a 
specific (immutable) version and "soft linking" to a mutable redirect 
block. To solve the problem of reference cycles, files that link to one 
another can be collected into a "bundle", which contains a directory (or 
manifest) that maps names onto hard links. Links between files in the 
bundle use names instead of hard links, and the entire bundle can be 
published as a single immutable file. (FIXME: hard links into bundles 
must include a name.)
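
To make that concrete, here's a toy manifest and resolver; the "name:" 
scheme and the layout are my own invention, purely for illustration:

    # Files inside the bundle link to each other by name, never by
    # hash, which is what breaks the reference cycle; the manifest
    # maps those names onto hard links once the contents are final.
    bundle = {
        "manifest": {
            "index.html":   "chk:<hard link to index.html>",
            "article.html": "chk:<hard link to article.html>",
        },
    }

    def resolve(bundle, link):
        # In-bundle links look like "name:article.html"; anything else
        # is passed through as an ordinary hard or soft link.
        if link.startswith("name:"):
            return bundle["manifest"][link[len("name:"):]]
        return link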

4. Spidering

From a single entry point, the entire web can be browsed without 
needing to contact any specific server. This could be an advantage for 
anonymity, because it prevents long-term intersection attacks. The 
free-floating web can be spidered by search engines just like the world 
wide web, which should help to address the problem of finding content.


Any thoughts? I realise that Freenet does most of this already, but if 
possible I'd like to come up with a standard that can be used by 
multiple networks, and the transition from 0.5 to 0.7 seems like the 
right time to break compatibility if necessary. Do we also need a 
standard format for authenticated streams?

Cheers,
Michael
