On Sunday, Oskar, Ian, Theo, Michael, Vilhelm (vive), a couple of newbies and I met up in a pub in London and had a long, productive conversation about the current status of Freenet and the near-future priorities. Vive has some notes; proper minutes will not be produced, but I include the important points and action items below, most of which inevitably fall to me! :) One important side benefit was explaining various current issues to people who haven't been very involved for a while... hopefully they will contribute more now.
FINANCIAL STATUS: Freenet Project Inc has $16,000 in the bank and PayPal combined. This is enough to keep me paid until the end of the year. Google contributed $18K some months ago, and IIRC we had a $1500 donation more recently, but we had expected much of that to be used by now; the difference may be private (PayPal) donations.

VERSIONING: Given the significant security and performance gains available from some of the items below, we should IMHO skip 0.7.1 and release 0.8-alpha1 in approximately December. Hopefully we will be able to raise enough money on the back of this to continue into next year. Many features will need to be postponed to 0.9; hopefully 1.0 will be largely stabilisation work.

ACTION ITEMS/TODO LIST ADDITIONS:

Somewhat theoretical stuff: should work, but needs simulations to be absolutely certain. Mostly relatively easy to implement once the simulations confirm that it works, but actual implementation should probably be postponed to 0.9, as it is mostly darknet specific and potentially destabilising. Responsible: Vive (after he comes back from holiday, assisted by Oskar).

1. Selective randomization. It is necessary to randomize node locations from time to time in order to prevent the network from degenerating, with some parts of the keyspace becoming increasingly crowded and other parts becoming huge voids with no nodes in them at all. This can be caused either by join-leave churn (an unhelpful form of "evolution", with locations in less dense areas being shifted to new nodes and then dropped), or by a deliberate attack on the swapping algorithm, again with the goal of destroying locations and concentrating all nodes' locations over an increasingly small area of the keyspace. For more detail read the Pitch Black paper (is it linked from the website??). Currently we randomize on a regular basis. This is bad because it is not sensitive to the network size or to whether an active attack is happening, hence it does either too much randomization or too little. Oskar has proposed a new way to decide when to randomize: we send out 5 probe requests (not the same as the current probe requests) to 5 random locations, and each reports how close the reached node is to its peers' locations. We then take the overall median, and if we are much closer (by a tuning factor) to our peers than the probed nodes typically are, we reset our location to one of the random locations we probed for. This requires simulation to: a) determine a reasonable tuning factor; b) prove that it works; c) prove that it is less vulnerable to attack than the current mechanism. An attacker would have to compromise 3 of the 5 paths, but for each path he only needs to be somewhere on the path to fake it... the paths might be limited to 5 hops, but even so, that's a lot of opportunities. On the other hand, it is unlikely that regular randomization will beat location spamming attacks, unless there is so much randomization that it has a significant impact on performance when there is no attack.
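To make the proposal concrete, here is a rough Java sketch of the decision rule. Everything in it is illustrative rather than real node code: the class and method names are made up, the tuning factor of 0.5 is a placeholder for whatever the simulations settle on, and it assumes each probe simply reports the probed node's median distance to its own peers.

    import java.util.Arrays;
    import java.util.Random;

    // Hypothetical sketch of the proposed randomization trigger; none of these
    // class or method names exist in the real node.
    public class SelectiveRandomization {

        // Placeholder tuning factor: reset if we are more than twice as close to
        // our peers as a typical probed node is to its peers. Finding a sensible
        // value is exactly what the simulations are for.
        static final double TUNING_FACTOR = 0.5;

        static final Random random = new Random();

        // Distance on the circular [0,1) keyspace.
        static double distance(double a, double b) {
            double d = Math.abs(a - b);
            return Math.min(d, 1.0 - d);
        }

        // Median distance from a location to a set of peer locations.
        static double closenessToPeers(double location, double[] peerLocations) {
            double[] dists = new double[peerLocations.length];
            for (int i = 0; i < peerLocations.length; i++)
                dists[i] = distance(location, peerLocations[i]);
            Arrays.sort(dists);
            return dists[dists.length / 2];
        }

        // probeResults[i] is the closeness-to-peers reported by the node reached
        // when probing randomTargets[i]. Returns the location we should now use:
        // either our current one, or one of the probed random locations.
        static double maybeRandomize(double myLocation, double[] myPeerLocations,
                                     double[] randomTargets, double[] probeResults) {
            double ours = closenessToPeers(myLocation, myPeerLocations);
            double[] sorted = probeResults.clone();
            Arrays.sort(sorted);
            double typical = sorted[sorted.length / 2]; // overall median
            // Much closer to our peers than a typical node => our part of the
            // keyspace has been squeezed (churn or a Pitch Black style attack),
            // so reset our location rather than randomizing on a fixed schedule.
            if (ours < typical * TUNING_FACTOR)
                return randomTargets[random.nextInt(randomTargets.length)];
            return myLocation;
        }
    }

The point of the rule is that it only fires when our neighbourhood is measurably more squashed than the rest of the network, so a healthy network does almost no randomization.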
2. Targeted swapping. Freenet initially assigns random locations to nodes on the darknet, and constantly attempts to swap locations between pairs of nodes to improve the routability of the network, using the Metropolis-Hastings algorithm. At the moment, these swaps are proposed completely randomly, from 10-hop random walks. Vive proposed in a paper last year that some proportion of swaps should be "directed", i.e. the node figures out where would be optimal for its peers, then routes to that location and proposes a swap with the "ideal" node. Not all of the swaps can be directed, in order to avoid long-lived false minima. This would speed up location swapping: the network would stabilise faster, achieve more efficient routing, and hopefully suffer less location churn. But mostly this is about making routing work well on darknet: see the next item.

3. Opennet nodes and swapping. Opennet nodes use path folding, which Oskar's work shows is a very fast, reliable algorithm for making a navigable network by changing the links rather than the locations, and therefore they do not need swapping. Swapping may in fact cause significant location churn for opennet nodes, leading to poor data storage. But simply turning it off for opennet nodes may have unexpected side-effects on a hybrid network. One proposal is that opennet nodes should not initiate swaps, but should relay them for darknet nodes; or even accept swaps, but not initiate them. The question is whether the differences in swap pressure will cause problems... We need some detailed simulations. There is the potential to improve performance (location stability, hence data reachability) significantly for opennet (i.e. the majority of the network at present), without jeopardising our long term strategy of moving everyone progressively to darknet in order to reduce our attackability (especially as ISPs may begin blocking p2p networks in many countries in the next few years).

More straightforward todo items (mostly me), for 0.8-beta1:

1. Finish the db4o branch. I have been working on the db4o branch for some time, around a month. This is an attempt to move the client layer (for persistent requests) into a lightweight object database. This should allow fast resuming of requests, and let users queue larger numbers of requests than will fit in RAM, with performance degrading gracefully. Right now it is essentially working, but has major performance problems, resulting in the in-RAM queue of database jobs continually expanding when a big request is running (even on my fairly heavy hardware, admittedly with heavy logging enabled). There is room for significant optimisation, the immediate thing being to not use pendingKeys (the big persistent map of which key is wanted by which request) when we merely retry a block. Hopefully it will be possible to achieve reasonable performance, but some work will also need to be done to ensure that on a slow system it still works, even if slowly (i.e. that it slows down rather than the job queue expanding indefinitely). There is a nonzero chance that we will have to abandon the whole project. Simple snapshot-based resuming would be possible in that case (and IMHO essential for 0.8), but would of course take time on startup and memory proportional to the queue size.

2. Friend-of-a-friend routing. (Toad) Published work shows that taking into account the locations of our neighbours' neighbours when routing can dramatically improve performance. Oskar estimates that we can reduce the path length by an average factor of two, and this may be even better in practice on a darknet. Apparently beyond two hops we rapidly run into diminishing returns. The security impact is minimal since this data is already exposed, albeit somewhat less reliably, by swapping. In theory the performance gain should be around a factor of log log N, assuming N nodes and log N peers per node, but no more than 5.
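To illustrate the FOAF idea (this is not the actual routing code; the Peer class, the field holding the advertised peer locations and the distance helper are all hypothetical), here is a sketch contrasting the current greedy choice with a friend-of-a-friend choice:

    import java.util.List;

    // Hypothetical sketch of friend-of-a-friend routing: when choosing the next
    // hop for a key, consider not just each peer's own location but also the
    // locations it advertises for its peers.
    public class FoafRouting {

        // Distance on the circular [0,1) keyspace.
        static double distance(double a, double b) {
            double d = Math.abs(a - b);
            return Math.min(d, 1.0 - d);
        }

        // A directly connected peer plus the locations it advertises for its own peers.
        static class Peer {
            final double location;
            final double[] foafLocations;
            Peer(double location, double[] foafLocations) {
                this.location = location;
                this.foafLocations = foafLocations;
            }
        }

        // Current greedy routing: pick the peer whose own location is closest to
        // the target key's location.
        static Peer routeGreedy(double target, List<Peer> peers) {
            Peer best = null;
            double bestDist = Double.MAX_VALUE;
            for (Peer p : peers) {
                double d = distance(p.location, target);
                if (d < bestDist) { bestDist = d; best = p; }
            }
            return best;
        }

        // FOAF routing: score each peer by the closest location reachable through
        // it in one more hop (its own location or any of its peers' advertised
        // locations), then pick the best-scoring peer.
        static Peer routeFoaf(double target, List<Peer> peers) {
            Peer best = null;
            double bestDist = Double.MAX_VALUE;
            for (Peer p : peers) {
                double d = distance(p.location, target);
                for (double foaf : p.foafLocations)
                    d = Math.min(d, distance(foaf, target));
                if (d < bestDist) { bestDist = d; best = p; }
            }
            return best;
        }
    }

The only change is that each peer is scored by the best location reachable through it in one more hop, information our peers already effectively expose via swapping.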
3. Eliminate all checkboxes in the installer, move them to the wizard, and add an advanced options button in the installer. (Toad) This would streamline installation significantly: paranoid users could click a box to turn UPnP etc on/off, and everyone else doesn't need to worry about it.

4. Non-convergent splitfile encryption. Encrypting splitfiles using a random key, independent of the content of the splitfile, would dramatically reduce the rate at which an attacker can approach the inserter in the case where he can guess the content. Implementing this will be reasonably easy; it will however cost some redundancy: if two people insert the same file, they won't share blocks. In the future there are a couple of (relatively complex) schemes to reduce this cost, which we may implement if they appear to be necessary. This should probably be on by default, but easy to turn off if the user is sure the file content can't be predicted by the attacker, or just doesn't care.
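For illustration only, here is roughly what the difference looks like in terms of the standard JCE. This is not Freenet's actual CHK code, and the real key derivation, cipher and block handling differ; the class and method names are invented.

    import java.security.MessageDigest;
    import java.security.SecureRandom;
    import javax.crypto.Cipher;
    import javax.crypto.spec.IvParameterSpec;
    import javax.crypto.spec.SecretKeySpec;

    // Convergent vs non-convergent block keys, sketched with the standard JCE.
    public class SplitfileKeys {

        // Convergent: the key is derived from the plaintext, so identical data
        // always encrypts to identical blocks. Good for sharing blocks between
        // inserters, bad if the attacker can guess the content (he can then
        // derive the key and recognise the blocks).
        static byte[] convergentKey(byte[] plaintext) throws Exception {
            return MessageDigest.getInstance("SHA-256").digest(plaintext);
        }

        // Non-convergent: a fresh random key per insert. Two inserts of the same
        // file no longer share blocks (the redundancy cost mentioned above), but
        // guessing the content no longer lets an attacker reproduce the blocks.
        static byte[] randomKey() {
            byte[] key = new byte[32];
            new SecureRandom().nextBytes(key);
            return key;
        }

        // Encrypt one block with whichever key was chosen above.
        static byte[] encryptBlock(byte[] key, byte[] iv, byte[] data) throws Exception {
            Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                   new IvParameterSpec(iv));
            return c.doFinal(data);
        }
    }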
5. THROTTLE EVERYTHING!! (Toad) Not all traffic is throttled at the moment. We should subject all traffic, not only data transfer packets, to both congestion control and bandwidth limiting. This is related to the low-level streams proposal, but can be done without it, and should be easier. The current situation is especially problematic for online gaming. See bug 2312. Note that a lot of Freenet runs on altruism, thus it is *essential* that it not severely disrupt a user's internet connection!

6. Reinstate the uninstall survey. (Toad) Nextgens had an uninstall survey on emu for a while. Some useful data came from it, and since 90%+ of our users don't stick around, it would be a good idea to reinstate it and try to optimise the questions.

7. Automatic bandwidth limit calibration. (Toad) Several other p2p apps implement this; we should too. Bandwidth is *the* scarce resource most of the time, and we want to use as much of it as we can without significantly slowing down the user's internet connection (see above).

Vive's students' possible projects:

1. Facebook application. A Facebook app for (semi-?)automatic swapping of noderefs with people already marked as your friends on Facebook. Obviously this has security implications; it would be turned off by default but would be a useful option. We could explain this in the post-install wizard.

TO IMPLEMENT IN 0.9:

1. Streams (low level optimisation). At the moment, Freenet is quite inefficient in bandwidth usage: the payload percentage, especially on nodes with low bandwidth limits but also on nodes with high bandwidth limits, can be quite low, 70%, maybe 50%... this is unacceptable. A lot of this is due to message padding. If we split block transfers (and large messages) into streams which can be sliced byte-wise into whatever size we want, rather than into 1K blocks, then we will need much less padding and so achieve a higher payload percentage. This also has significant side benefits for steganography and for connections with low MTU, since we can (on a link with live streams) construct useful packets of almost any size.

2. Tunnels. There have been several proposals for creating encrypted tunnels to send requests through. The oldest proposal is cellular premix routing, which we have now abandoned, because it only offers an anonymity set of the cell and costs twice the cell's diameter in hops. The recent proposals create tunnels by doing a rendezvous of two or more "anchors" sent out by a node: either we rendezvous at a location, with the first anchor sent direct and the second randomly routed for a few hops, or we send two or more randomly routed anchors. Either way, when the anchors meet, we create a tunnel. There can be just two anchors, or it can be an M of N secret sharing scheme. Once a tunnel is created, it will be fairly long lived, and related requests will be sent down the same tunnel(s). Related meaning firstly part of the same splitfile, and secondly marked as belonging to the same pseudonymous identity by FCP. (This is a GOOD thing, to reduce the number of samples, as long as separate pseudonymous identities can be kept separate.) Tunnels will help with each of the three main attacks on requests which most concern me at present:
- Datastore seizure / remote probing.
- Local correlation attacks.
- Key-based search. (The splitfile changes above also help with this, but only for inserts; put both together and we have a radical improvement in security for inserters.)
Because tunnels mean that requests are started many hops away, we won't have to cache local requests in the local datastore. We will have a separate client cache, which may be ephemeral (key in RAM only, not saved). This means that we can publish Bloom filters for our datastore to our direct neighbours. This will likely save us several hops on average, so the overall cost of tunneling will not be huge. The only real problem is what to do if the user turns off tunneling... turn off Bloom filters too?
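Since Bloom filters come up again below, here is a toy example of the sort of filter we could build over the datastore and hand to direct neighbours. The bit size, number of hashes, SHA-256-based index derivation and all names are arbitrary choices for the sketch, not proposed parameters.

    import java.security.MessageDigest;
    import java.util.BitSet;

    // Toy Bloom filter over datastore keys, of the kind we could publish to our
    // direct neighbours so they can tell, with some false positives, whether we
    // probably have a block before routing a request onwards.
    public class StoreBloomFilter {

        private final BitSet bits;
        private final int sizeBits;
        private final int numHashes; // at most 8 with this index derivation

        public StoreBloomFilter(int sizeBits, int numHashes) {
            if (numHashes > 8) throw new IllegalArgumentException("max 8 hashes here");
            this.sizeBits = sizeBits;
            this.numHashes = numHashes;
            this.bits = new BitSet(sizeBits);
        }

        // Derive numHashes bit indices from a single SHA-256 of the routing key.
        private int[] indices(byte[] routingKey) {
            try {
                byte[] h = MessageDigest.getInstance("SHA-256").digest(routingKey);
                int[] idx = new int[numHashes];
                for (int i = 0; i < numHashes; i++) {
                    int v = ((h[4 * i] & 0xff) << 24) | ((h[4 * i + 1] & 0xff) << 16)
                          | ((h[4 * i + 2] & 0xff) << 8) | (h[4 * i + 3] & 0xff);
                    idx[i] = Math.floorMod(v, sizeBits);
                }
                return idx;
            } catch (java.security.NoSuchAlgorithmException e) {
                throw new RuntimeException(e);
            }
        }

        // Called for every key in our datastore when (re)building the filter.
        public void add(byte[] routingKey) {
            for (int i : indices(routingKey)) bits.set(i);
        }

        // A neighbour calls this on the filter we sent it: false means we
        // definitely don't have the block, true means we probably do.
        public boolean mightContain(byte[] routingKey) {
            for (int i : indices(routingKey))
                if (!bits.get(i)) return false;
            return true;
        }

        // Serialised form to send to a peer.
        public byte[] toBytes() { return bits.toByteArray(); }
    }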
3. Passive requests. This feature has been planned for some time; I explained the current vision to Oskar et al, and the main justification for it (FMS!). There seemed to be general support in the group, in the medium term. Passive requests are a way of subscribing to a key by doing a request for it, with the nodes visited remembering the request and forwarding the data if it is found. As long as routing works, this should work adequately. Currently we have a simple form of this, ULPRs (Ultra-Lightweight Passive/Persistent Requests), but this is inherently unreliable and only lasts an hour. Full passive requests would be reliable as long as routing works. This means that when a connection is broken, or there are significant changes to the network topology/locations, the node involved would need to reroute the request. This is somewhat similar to the publish/subscribe system I tried to get working early in the 0.7 cycle, but IMHO we have a much better idea of what we are doing now. Note that we may have a large number of passive request subscriptions on any one node, big enough that they will need to be stored in a database. Also note that if a lot of nodes are subscribed to a key, they form a tree/web structure, so that the data will be efficiently and near instantly propagated to all of them when it is found (in a place where it shouldn't have been, or because it was inserted). The objective here is primarily to reduce the cost of polling, particularly for FMS, but also for requests in general, and to increase the speed at which data can be found if people subscribe to it in advance. High-bandwidth / medium-latency publish/subscribe (e.g. audio streams) can be built on top of this; whether we want a separate system for low-bandwidth, low-latency streams (IRC etc) is an open question.

DATA COLLECTION:

1. Push/pull tests. We should do push/pull tests regularly on the real network. On darknet this won't work well because of the difficulty of getting somebody you're not near to fetch your data, although this is IMHO possible with FMS etc volunteers. On opennet we can simply select a different pair of peers every time, to avoid the two nodes converging on each other as they are reused.

OPEN QUESTIONS:

1. CPU profiling. Is there any more we can do to improve the node's CPU usage? It can still be high on some systems. What I found last time I tried was that mostly this is due to low memory conditions and FEC encoding/decoding... but it may be worth re-investigating.

2. Oskar's NGR-like-routing paper. Is there an opportunity for significant performance gain here? Oskar proved in a paper that it is possible to take link costs into account in routing and still achieve good routing... it's not clear how close this is to something we could use.

3. Darknet link trust levels. Should we have a trust level dropdown box for each darknet peer? Oskar had originally assumed that we would solve the datastore problems by simply not caching local requests, on the grounds that we trust our (darknet) peers not to attempt to determine whether our requests are local from whether we cached them. This logic doesn't work for most links, even most darknet links; but for some connections it may be useful, since it would significantly reduce our vulnerability if our store were seized (only a fraction of the file would be cached in our store), and likewise with remote store probing. Also, we may not want to send Bloom filters that are potentially incriminating to all our peers, especially if we turn off tunneling for some requests??? (A rough sketch of what such a trust setting might control follows after this list.)

4. Migration of data. After we swap locations, should we try to migrate data to match the new location?
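Returning to question 3: purely as a sketch of what such a per-peer trust setting might control. None of this exists in the node; the enum and method names are invented.

    import java.util.Collection;

    // Illustrative only: what a per-darknet-peer trust setting might decide.
    public class DarknetPeerTrust {

        // The hypothetical dropdown value for each darknet peer.
        public enum TrustLevel { LOW, NORMAL, HIGH }

        // We only dare skip caching local requests (Oskar's original assumption)
        // if every connected peer is trusted not to probe our store to work out
        // which requests were local. Otherwise, keep caching them as we do now.
        public static boolean shouldCacheLocalRequests(Collection<TrustLevel> peers) {
            for (TrustLevel t : peers)
                if (t != TrustLevel.HIGH) return true;
            return false;
        }

        // Only share potentially incriminating datastore Bloom filters with
        // highly trusted peers.
        public static boolean shouldSendBloomFilter(TrustLevel peer) {
            return peer == TrustLevel.HIGH;
        }
    }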
