I and others have recently come to the conclusion that premix routing is not 
the next-generation security enhancement that we had hoped it would be.

The objective here is to make various attacks harder. The two attacks we are 
most interested in are:
- Correlation attacks: What is my peer doing? This relies on recognising a 
splitfile (or a series of FMS posts for example) and doing statistics on it 
to see how likely it is that the peer is the originator.
- Adaptive search: The attacker picks up a few requests from a splitfile (e.g. 
a big insert). Each one gives him a predecessor sample, and the precise key 
being fetched gives him a vague idea of where the originator is. On opennet, 
and to some degree on darknet, he can use this information to move closer to 
the target. If he is successful, the number and accuracy of samples increases 
as he moves closer to the target, leading to exponential growth in the 
effectiveness of the attack. Hence for a big insert or long-lived identity 
there's a good chance of finding the target in a relatively short period.

A partial solution to the latter is the recent obfuscated inserts proposal. 
This prevents the attacker from moving towards the target until after the key 
for the insert has been announced (or the top block inserted in the case of a 
well-known SSK), even if he can guess the content. But he may still be able
to get a good idea of the originator's location if he has logged enough 
blocks.

Encrypted tunnels (e.g. premix routing) would also solve the datastore probing 
attack mentioned in The Register a while ago, and may enable some significant 
optimisations such as Bloom filters (which would likely save quite a few hops 
on many requests).
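
Roughly, the Bloom filter idea: each peer shares a filter of the keys in its 
store, so before routing a request several hops away we can check whether a 
direct peer probably already has the block, and a hit turns a multi-hop search 
into a single hop. A minimal sketch of that check, with made-up names and 
parameters, purely to illustrate:

import java.util.BitSet;

// Sketch only: a peer-published Bloom filter of datastore keys.
class StoreBloomFilter {
    private final BitSet bits;
    private final int numBits;    // size of the filter in bits
    private final int numHashes;  // number of hash functions

    StoreBloomFilter(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Derive the i-th bit index from the (already pseudo-random) routing key.
    private int index(byte[] key, int i) {
        int h = i * 0x9E3779B9;
        for (byte b : key) h = h * 31 + b;
        return Math.abs(h % numBits);
    }

    void add(byte[] key) {
        for (int i = 0; i < numHashes; i++) bits.set(index(key, i));
    }

    // False positives are possible, false negatives are not: a hit means
    // "this peer probably has the block", so try it first and save the hops.
    boolean mightContain(byte[] key) {
        for (int i = 0; i < numHashes; i++)
            if (!bits.get(index(key, i))) return false;
        return true;
    }
}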

Basically, premix routing as we have envisaged it until recently would divide 
the (darknet) network into semi-permanent cells. Within each cell, all nodes 
are reasonably densely/directly connected, and all nodes are kept up to date 
on the status of the other nodes in the cell. When we make a new request, we 
choose two nodes from the cell at random and create an encrypted tunnel from 
ourselves (the start point) to the first node and then to the second node. 
Then the requests exit the tunnel and begin their journey. 
Hence, a request from one node in a cell is indistinguishable (we hope) from 
a request from any other node, and you have a definite anonymity set: the 
cell.
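
For concreteness, the tunnel selection step is trivial; something like this 
(sketch only, all names made up, not actual code):

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch only: picking the two premix hops from a cell.
class PremixCell {
    private final List<String> members; // node identities within the cell

    PremixCell(List<String> members) {
        this.members = members;
    }

    // Choose two distinct cell members (other than ourselves) at random.
    // The encrypted tunnel runs originator -> first -> second, and only
    // after that does normal request routing start.
    List<String> chooseTunnel(String self) {
        List<String> candidates = new ArrayList<>(members);
        candidates.remove(self);
        Collections.shuffle(candidates);
        return candidates.subList(0, 2); // assumes the cell has at least 3 members
    }
}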

The main disadvantages:
- It will cost around twice the diameter of the cell, in hops, at the 
beginning of a request. Note that this is an encrypted tunnel so caches 
cannot be checked during these hops.
- It only protects the beginning of the request, *not* the main request 
routing phase. You can hide within the cell, but the attacker will eventually 
be able to trace your requests back to the cell. If the cell is small enough, 
and/or the attacker powerful enough, he may simply seize every node in the 
cell...
- We can enlarge the cell, but only at the expense of increasing the number of 
hops at the beginning of the request, as well as the required co-ordination 
traffic (see the rough numbers below).
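
Rough numbers, purely illustrative (the diameter and coordination models here 
are guesses, not measurements):

// Back-of-the-envelope only; real cells would differ.
class CellCostSketch {
    public static void main(String[] args) {
        int[] cellSizes = { 10, 20, 40 };
        for (int n : cellSizes) {
            // Guess: diameter grows roughly logarithmically in a densely
            // connected cell.
            int diameter = (int) Math.ceil(Math.log(n) / Math.log(2));
            int tunnelHops = 2 * diameter;   // hops where we cannot check caches
            int statusMsgs = n * (n - 1);    // everyone keeps everyone up to date
            System.out.printf("cell=%d  extra hops ~%d  status msgs/round ~%d%n",
                    n, tunnelHops, statusMsgs);
        }
    }
}

In other words, under these assumptions enlarging the cell buys a linearly 
larger anonymity set while the coordination traffic grows quadratically.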

Hence a better approach is needed. Michael Rogers and I discussed this fairly 
extensively at the anonymity summit on Monday. One paper presented there 
showed fairly convincingly that no current structured peer-to-peer system is 
likely to achieve better than c/n security, so that may be an acceptable goal. 
By this I mean a roughly c/n chance of a tunnel being compromised, given that 
c nodes out of n have been compromised. Tor (not a structured network; every 
node knows about all nodes) achieves roughly c^2/n^2 in theory, but there are 
attacks that may reduce that to c/n.
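
To put numbers on that: with, say, n = 10,000 nodes of which c = 100 are 
compromised (figures picked purely for illustration), c/n is a 1% chance that 
any given tunnel is compromised, whereas c^2/n^2 is 0.01%, since the attacker 
has to control both ends of the circuit rather than any one critical node.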

Unencrypted tunnels:

We would simply bundle a load of requests together and route them together 
(randomly, or to a random location) for a few hops. After that, they go on 
their way as normal. If the attacker doesn't manage to see the requests during 
the tunnel phase, then instead of getting one predecessor sample for every 
request (4G insert = 270,000 requests), he gets one predecessor sample per 
tunnel, and each tunnel would hopefully carry a great many requests. However, 
if he controls any of the nodes involved in the tunnel setup (probability 
around [tunnel length] * c / n), he gains a much stronger predecessor sample. 
One advantage of unencrypted tunnels, apart from simplicity, is that we can 
check the datastore on each node on the tunnel; on the other hand, we can't 
use Bloom filters, and it doesn't solve The Register's datastore probing 
attack, as we will still have to cache local requests locally.
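
Back-of-the-envelope, with made-up figures for everything except the 270,000:

// Illustrative numbers only; the model is deliberately crude (it treats each
// observation point as compromised independently with probability c/n).
class TunnelSampleSketch {
    public static void main(String[] args) {
        double n = 10000, c = 100;    // network size, compromised nodes (assumed)
        double requests = 270000;     // ~4G insert, as above
        double perTunnel = 1000;      // requests bundled into each tunnel (assumed)
        double tunnelLen = 5;         // hops of bundled routing per tunnel (assumed)

        double tunnels = requests / perTunnel;
        double samplesWithout = requests * c / n;           // one potential sample per request
        double samplesWith = tunnels * c / n;               // one potential sample per tunnel
        double strongSamples = tunnels * tunnelLen * c / n; // tunnels with a compromised setup node

        System.out.printf("no tunnels:   ~%.0f predecessor samples%n", samplesWithout);
        System.out.printf("with tunnels: ~%.0f weak samples, ~%.0f strong (setup compromised)%n",
                samplesWith, strongSamples);
    }
}

Even with crude assumptions like these, the difference between per-request and 
per-tunnel sampling is a couple of orders of magnitude.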

Encrypted tunnels via rendezvous at a location: 

We would create a key, split it into 2 or 3 parts (which can be XORed 
together or concatenated to recover the original), and choose a location and a 
unique ID. The first part would be routed directly to the target location. The 
second and third (or more) parts would be random-routed for some hops and then 
routed to the target location. Once they exit the random routing phase they 
look exactly the same as the first part. Each "anchor" continues until it 
rendezvouses with the others, or it runs out of hops (randomly, or with an 
HTL, or some hybrid). If all parts meet, a tunnel completion message 
(including the hash of the complete key) is sent back to the originating node 
along the shortest path (either explicitly identify the first one, though we 
lose some data; or have a random number which starts the same for all three 
parts but is incremented on each hop; or even just go by the first one to 
arrive). Then we use this to set up an encrypted tunnel to send requests 
through. We should have an anonymity set roughly equal to the whole network. 
One problem is that we may have to go a significant number of hops. In terms 
of attacks, we give the attacker more location samples (on the direct-routed 
path) and more predecessor samples (on the random-routed prefixes). That's the 
main disadvantage: the search will visit lots of nodes, giving the attacker 
many opportunities to intercept it. If he manages to get all the pieces, he 
can decrypt the requests.
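
The key splitting itself is the easy part; for the XOR variant it's just 
(sketch only, names made up): all parts but the last are random, and the last 
is the key XORed with them, so any subset short of the full set reveals 
nothing about the key.

import java.security.SecureRandom;

// Sketch of the XOR split mentioned above; illustrative, not real code.
class KeySplit {

    // All shares but the last are random; the last is the key XORed with them.
    static byte[][] split(byte[] key, int parts) {
        SecureRandom rng = new SecureRandom();
        byte[][] shares = new byte[parts][key.length];
        byte[] last = key.clone();
        for (int i = 0; i < parts - 1; i++) {
            rng.nextBytes(shares[i]);
            for (int j = 0; j < key.length; j++)
                last[j] ^= shares[i][j];
        }
        shares[parts - 1] = last;
        return shares;
    }

    // XOR of all the shares recovers the original key.
    static byte[] combine(byte[][] shares) {
        byte[] key = new byte[shares[0].length];
        for (byte[] share : shares)
            for (int j = 0; j < key.length; j++)
                key[j] ^= share[j];
        return key;
    }
}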

Encrypted tunnels via random routing and rendezvous:

Partly as a result of reading the Rumour Riding paper, Michael has proposed 
that we simply rely on random walks, instead of routing towards a specific 
location. The advantage is that it doesn't provide location samples, and we 
can make the paths shorter than in the previous option (so we can tune the 
tradeoff between the part of the network that could have originated the 
request and the cost). However, it is likely to be less reliable, and 
therefore to require many random walks, so many nodes have an opportunity to 
get a predecessor sample and/or the keys. Also, imho, it will frequently 
produce tunnels that are *too* short. I'll let Michael elaborate on it.
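
One way to see the too-short problem (toy model, not Michael's actual 
proposal): if each hop of a random walk stops with some fixed probability, the 
walk length is geometric, so a large fraction of tunnels end within a hop or 
two of the originator.

// Toy model only: walk length when each hop stops with probability p.
class RandomWalkSketch {
    public static void main(String[] args) {
        double p = 0.2; // per-hop stop probability (assumed)
        // P(length = k) = (1-p)^(k-1) * p, for k >= 1
        for (int k = 1; k <= 5; k++) {
            double prob = Math.pow(1 - p, k - 1) * p;
            System.out.printf("walk ends after %d hop(s): %4.1f%%%n", k, prob * 100);
        }
    }
}

With p = 0.2, a fifth of the walks end at the very first hop, i.e. the tunnel 
exit is a direct peer of the originator.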