PROBLEM:
If an attacker can identify that each block belongs to the same stream of 
requests or inserts, he can move towards the originator increasingly rapidly: 
each request gives him a bearing on the direction (in the keyspace) of the 
originator, and he can use this information to connect to nodes closer to the 
originator. Each time he does this, the amount of the stream that he sees 
increases, and therefore his search accelerates. This attack is easiest on 
opennet, but it may also be possible on darknet for some attackers (but much 
slower!).

PARTIAL SOLUTION:
We have a partial solution in that we don't insert the top block, or generate 
the key, until after we have inserted all the data below it. However, in many 
instances the data will be guessable (or guessable to within some computible 
entropy e.g. all but a few bytes) and therefore the attacker can still 
identify the blocks.

BAD SOLUTION:
One proposed solution (for inserts) is to insert each splitfile encrypted with 
a random key. This would help a lot with this attack (for inserts), but it 
would cause much more data duplication, entirely losing the benefits of 
convergent encryption (CHKs). To what degree convergent encryption is useful 
is an open question, but there is a better way...

NEW PROPOSED SOLUTION:
When Alice inserts a file, her node runs FEC encoding as usual. Then a random 
obfuscation key is chosen, and each block is encoded, both with the random 
obfuscation key, and with the normal CHK convergent encryption. Alice's node 
inserts only the obfuscated blocks, but it computes the CHKs for the 
non-obfuscated blocks if they were inserted.

The top block includes pointers to the top level of the splitfile for both the 
obfuscated and non-obfuscated versions, but only the obfuscated version has 
been inserted *by Alice*.

Alice announces the key on a public forum. Bob (hopefully there will be many 
Bob's) starts to download it. For each block in the splitfile, Bob's node 
tries the non-obfuscated block first, maybe giving it a 3-try head start. 
Then when this fails it tries the obfuscated block. When each obfuscated 
block is fetched, there is a chance that the non-obfuscated version will be 
inserted by Bob's node.

Thus, Alice is protected by Bob. Most likely Alice is in the greater danger: 
generally you want to go for the source of the data. The attacker will then 
only gain a small amount of information from a splitfile insert: even though 
he can identify the blocks in retrospect, he can't move towards the insertor 
during the insert, so he gathers much less information. It will probably take 
more than one splitfile insert to trace Alice. Of course, splitfile inserts 
aren't the only thing that gives him bearings on Alice's location, her FMS 
posts etc (e.g. announcing her files) will also betray her in sufficient 
quantity. It would be good to have some sort of guesstimate that you have to 
change identity every X messages or inserts to be reasonably safe...

Sadly protecting requestors in this way is impossible afaics (although in the 
long term, premix routing and/or tunnels will help). And in the long run we 
don't fetch the obfuscated data, only the non-obfuscated data, which will 
collide for multiple inserts of the same data, so we don't waste any space. 
Obviously this whole mechanism would be optional so that you can turn it off 
for a slightly faster / less demanding insert (but with less security).

Problems:

** Whether to include the non-obfuscated blocks in the splitfile metadata. **

Inserting the non-obfuscated blocks when we get the obfuscated ones, and 
trying the non-obfuscated ones before the obfuscated ones, is intended to 
result in the non-obfuscated blocks being preferred, and the obfuscated ones 
falling out of the network. 

We have two options:

1. We could have a parallel metadata structure, with only the obfuscated keys 
in the obfuscated metadata, and only the non-obfuscated keys in the 
non-obfuscated metadata, and the top block referring to both, but Alice only 
inserting the obfuscated metadata.
CON:
- A requestor will not know the obfuscated blocks' keys until he has decoded 
them. Thus, he cannot request them, until some requestor manages to download 
the entire (obfuscated) file and insert the non-obfuscated metadata.

2. We could include the keys of both the obfuscated and non-obfuscated blocks 
in the obfuscated metadata.
PRO:
- We can immediately fetch the non-obfuscated blocks from day one. So we can 
further propagate the blocks inserted by our fellow requestors, and not 
propagate the obfuscated blocks, maximising efficiency and resulting in 
faster downloads when a file is new.
- Also, if the same file has been inserted previously, we can immediately pick 
up the blocks inserted for the other file.
CON:
- The obfuscated metadata will be twice the size because twice the number of 
keys. And it will not collide, because it is obfuscated.
MITIGATION:
- However, we can have non-obfuscated metadata in parallel. This could be 
inserted by any requestor, and would be the same size as splitfile metadata 
is now. Bob's node would request both the obfuscated and the non-obfuscated 
metadata initially, but if the non-obfuscated metadata was found, there would 
be no need to fetch the lower layers of the obfuscated metadata unless they 
were needed i.e. problems were encountered fetching the data.

IMHO we should implement the second option, with the mitigation so that 
long-term very few blocks are fetched from the obfuscated data, and it falls 
off the network.

** Possible optimisation: Timestamps **

We could include a couple of timestamps in the top level metadata:
- Before time T1, fetch the non-obfuscated data as often as the obfuscated 
data.
- After time T2, don't fetch the non-obfuscated data at all (or only if really 
desperate).

Obviously this would have to be optional, and the original time of insertion 
would have to be obfuscated. And this will only work if we are sure that our 
data will be downloaded by several others shortly after we announce it.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: 
<https://emu.freenetproject.org/pipermail/devl/attachments/20080607/004a80d9/attachment.pgp>

Reply via email to