[freenet-dev] Temporary insert obfuscation: The best of both worlds

Matthew Toseland Thu, 12 Jun 2008 15:59:43 +0100

On Monday 09 June 2008 04:48, Daniel Cheng wrote:
> On Mon, Jun 9, 2008 at 11:09 AM, Florent Daignière
> <nextgens at freenetproject.org> wrote:
> > * Daniel Cheng <j16sdiz+freenet at gmail.com> [2008-06-09 10:58:02]:
> >
> >> On Sat, Jun 7, 2008 at 6:24 PM, Matthew Toseland
> >> <toad at amphibian.dyndns.org> wrote:
> >> > On Saturday 07 June 2008 07:36, Daniel Cheng wrote:
> >> >> On Sat, Jun 7, 2008 at 7:46 AM, Matthew Toseland
> >> >> <toad at amphibian.dyndns.org> wrote:
> >> >> > On Saturday 07 June 2008 00:06, Matthew Toseland wrote:
> >> >> >> PROBLEM:
> >> >> >> If an attacker can identify that each block belongs to the same
> >> >> >> stream of requests or inserts, he can move towards the originator
> >> >> >> increasingly rapidly: each request gives him a bearing on the
> >> >> >> direction (in the keyspace) of the originator, and he can use this
> >> >> >> information to connect to nodes closer to the originator. Each time
> >> >> >> he does this, the amount of the stream that he sees increases, and
> >> >> >> therefore his search accelerates. This attack is easiest on opennet,
> >> >> >> but it may also be possible on darknet for some attackers (but much
> >> >> >> slower!).
> >> >> >>
> >> >> >> PARTIAL SOLUTION:
> >> >> >> We have a partial solution in that we don't insert the top block, or
> >> >> >> generate the key, until after we have inserted all the data below
> >> >> >> it. However, in many instances the data will be guessable (or
> >> >> >> guessable to within some computable entropy, e.g. all but a few
> >> >> >> bytes) and therefore the attacker can still identify the blocks.
> >> >> >>
> >> >> >> BAD SOLUTION:
> >> >> >> One proposed solution (for inserts) is to insert each splitfile
> >> >> >> encrypted with a random key. This would help a lot with this attack
> >> >> >> (for inserts), but it would cause much more data duplication,
> >> >> >> entirely losing the benefits of convergent encryption (CHKs). To
> >> >> >> what degree convergent encryption is useful is an open question,
> >> >> >> but there is a better way...
> >> >> >>
> >> >> >> NEW PROPOSED SOLUTION:
> >> >> >> When Alice inserts a file, her node runs FEC encoding as usual. Then
> >> >> >> a random obfuscation key is chosen, and each block is encoded both
> >> >> >> with the random obfuscation key and with the normal CHK convergent
> >> >> >> encryption. Alice's node inserts only the obfuscated blocks, but it
> >> >> >> also computes the CHKs that the non-obfuscated blocks would have if
> >> >> >> they were inserted.
> >> >> >>
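The insert side of the proposal above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Freenet's implementation: the SHA-256 keystream is a toy stand-in for the real block cipher, the obfuscated form is assumed to be the plaintext encrypted directly under a per-block key derived from the random obfuscation key, and the names `keystream_xor`, `convergent_chk` and `prepare_insert` are hypothetical.

```python
import hashlib
import os


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (stand-in only): XOR data with SHA-256(key || offset)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def convergent_chk(block: bytes) -> str:
    """Convergent encoding: the key is derived from the content itself,
    so identical blocks always yield the same ciphertext and routing key."""
    content_key = hashlib.sha256(block).digest()
    ciphertext = keystream_xor(content_key, block)
    return hashlib.sha256(ciphertext).hexdigest()


def prepare_insert(blocks):
    """Return (blocks to insert, top block) for one splitfile insert.

    Only the obfuscated blocks are inserted; the plain CHKs are merely
    computed and recorded in the top block.
    """
    obfuscation_key = os.urandom(32)  # fresh random key per insert
    to_insert = {}
    pointers = []
    for index, block in enumerate(blocks):
        plain_chk = convergent_chk(block)
        # Assumption: derive a per-block key so the toy keystream is not reused.
        block_key = hashlib.sha256(obfuscation_key + index.to_bytes(8, "big")).digest()
        obfuscated = keystream_xor(block_key, block)
        obfuscated_chk = hashlib.sha256(obfuscated).hexdigest()
        to_insert[obfuscated_chk] = obfuscated
        pointers.append((plain_chk, obfuscated_chk))
    top_block = {"obfuscation_key": obfuscation_key, "blocks": pointers}
    return to_insert, top_block
```

Inserting the same file twice gives the same plain CHKs but different obfuscated keys, which is the property the proposal relies on: the blocks in flight are not guessable from the content, while the plain keys stay convergent.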
> >> >> >> The top block includes pointers to the top level of the splitfile
> >> >> >> for both the obfuscated and non-obfuscated versions, but only the
> >> >> >> obfuscated version has been inserted *by Alice*.
> >> >> >>
> >> >> >> Alice announces the key on a public forum. Bob (hopefully there will
> >> >> >> be many Bobs) starts to download it. For each block in the
> >> >> >> splitfile, Bob's node tries the non-obfuscated block first, maybe
> >> >> >> giving it a 3-try head start. Then when this fails it tries the
> >> >> >> obfuscated block. When each obfuscated block is fetched, there is a
> >> >> >> chance that the non-obfuscated version will be inserted by Bob's
> >> >> >> node.
> >> >> >>
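Bob's side of the scheme might look something like the sketch below. It is only a model under assumptions: `lookup` and `insert` are hypothetical callables standing in for network fetch and insert, `to_plain_form` is a hypothetical helper that turns a fetched obfuscated block into its convergent (plain CHK) form, and the `reinsert_chance` value is made up. The 3-try head start is the only figure taken from the proposal.

```python
import random


def fetch_block(lookup, insert, plain_key, obfuscated_key, to_plain_form,
                plain_tries=3, reinsert_chance=0.5, rng=None):
    """Fetch one splitfile block, preferring the non-obfuscated version.

    Returns (data, source) where source is "plain", "obfuscated" or "missing".
    """
    rng = rng or random.Random()

    # Head start: try the plain (convergent) key a few times first.
    for _ in range(plain_tries):
        data = lookup(plain_key)
        if data is not None:
            return data, "plain"

    # Fall back to the obfuscated block that Alice actually inserted.
    data = lookup(obfuscated_key)
    if data is None:
        return None, "missing"

    # With some probability, reinsert the plain version so that later
    # fetchers find it under the convergent key.
    if rng.random() < reinsert_chance:
        insert(plain_key, to_plain_form(data))
    return data, "obfuscated"
```

With a plain dictionary as the store (`lookup=store.get`, `insert=store.__setitem__`), the first fetch of a freshly inserted file comes back as "obfuscated", and once some fetcher has reinserted the plain block, later fetches resolve as "plain".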
> >> >> >> Thus, Alice is protected by Bob. Most likely Alice is in the greater
> >> >> >> danger: generally you want to go for the source of the data. The
> >> >> >> attacker will then only gain a small amount of information from a
> >> >> >> splitfile insert: even though he can identify the blocks in
> >> >> >> retrospect, he can't move towards the inserter during the insert,
> >> >> >> so he gathers much less information. It will probably take more
> >> >> >> than one splitfile insert to trace Alice. Of course, splitfile
> >> >> >> inserts aren't the only thing that gives him bearings on Alice's
> >> >> >> location: her FMS posts etc (e.g. announcing her files) will also
> >> >> >> betray her in sufficient quantity. It would be good to have some
> >> >> >> sort of guesstimate that you have to change identity every X
> >> >> >> messages or inserts to be reasonably safe...
> >> >> >>
> >> >> >> Sadly protecting requestors in this way is impossible afaics
> >> >> >> (although in the long term, premix routing and/or tunnels will
> >> >> >> help). And in the long run we
> >> >>
> >> >> Does premix routing solve this problem? Or just make this a little bit
> >> >> more difficult?
> >> >>
> >> >> If it can solve this problem, then we may want to defer this until we
> >> >> have some final decision on premix routing. There is no urgent need
> >> >> to duplicate this effort.
> >> >
> >> > Premix routing does not fully solve this problem. It will help in that
> >> > it will only be possible to trace the insert to a premix cell, but that
> >> > cell will have to be a small subset of the overall network. Even if a
> >> > cell is 10,000 nodes, if you combine that with some other data (times
> >> > of inserts maybe), you may be able to get it down to a number of nodes
> >> > which it is practicable to send the blackshirts to go bust!
> >> >
> >> > The two mechanisms operate on different levels. On a darknet, premix
> >> > routing lets you hide in a (possibly fairly large) cell, constructed
> >> > from darknet, leveraging the trust network to select non-evil peers; on
> >> > opennet, premix routing may be more like a traditional onion router
> >> > such as Tor (more easily subject to Sybil attacks, and possibly
> >> > observation of tunnel construction depending on how we implement it).
> >> > Certainly darknet premix routing does not solve the problem. So no, the
> >> > two systems are complementary.
> >> >
> >> > Also, it is highly unlikely that we will implement premix routing this
> >> > year. It would be a large task, and would not immediately improve
> >> > performance or usability, so it is out of scope for 0.7.1, given the
> >> > severe time and money constraints we are facing.
> >>
> >> Something I worry about:
> >> (1) CHK@ key change on re-insert
> >>
> >> Currently, if a file falls out of the store, people post on the
> >> "unsuccess" frost/fms board and the inserter just re-inserts the file.
> >
> > The encrypted key will be different but not the "plain" version so it's
> > not a big deal.
> 
> The head block (is this what it's called?), which contains metadata and
> pointers to the encrypted blocks, would change.


Yes, however as long as enough people fetch the newly inserted version, the 
plain version will be reinserted and thus will be fetchable even by those 
fetching different versions of the key.

Would it help to have a plain version with a separate key?
> 
> >> If we employ this, does this mean he has to re-announce the key to
> >> everybody? This means some indexes or freesites have to be updated. And
> >> there is no easy way for the downloader to confirm the new key is the
> >> same as the old one.
> >
> > > That's a long-standing issue: we should provide a *content hash* of the
> > > content in the metadata (so that only the metadata would have to be
> > > fetched to figure out if it's the same file or not)...
> > I was willing to implement it myself at some point but got distracted...
> > and I'm not willing to do it now because toad is working on some heavy
> > refactoring of the client-layer.

Which will eventually complete. :)
> >
> >> (2) Doubling the store usage, increase traffic on miss.
> >>
> >> If a file has fallen out (or become hard to retrieve due to
> >> location chunk), the request takes double the effort to get the file.
> >> How do we determine when to try the obfuscated blocks?
> >
> > Alchemy, as often...

Explicit timestamps perhaps. But beyond that, yes, alchemy - some factor 
between attempts on the one and attempts on the other. Assuming the file 
isn't findable, the load will be the same, except that there will be twice as 
many keys, so twice as many entries in the failure table.
> 
> >> (3) Backoff scheme
> >>
> >> What worries me most is whether this turns out to be harmful to the
> >> network as a whole. Can we easily remove this sometime later? Will the
> >> 0.8 network be backward compatible with 0.7.1?
> >
> > Don't worry about that; as we don't have good metrics to evaluate how
> > it performs we won't ever go back on the basis that it harms anything :)
> 
> ugh.
> Maybe a premix routing scheme crossing the darknet/opennet border would help?
> (no, I don't have any idea how to implement that efficiently yet securely.)

No, afaics premix routing within the darknet will have to be completely 
different to premix routing on opennet.

However, I am increasingly of the view that our previous models of premix 
routing are wrong and expensive. I will post more soon...
