The following is 0.9 stuff. 0.7.5 is a stabilisation release which should be 
out within weeks. 0.8's main features will be Freetalk, MHKs, Bloom filter 
sharing and related changes, resulting in significant gains to usability 
(Freetalk), data retention and speed. Also, bursting is generally more of an 
issue with faster connections, which are slowly being rolled out across the 
world's major cities and are already common in a few countries. Nevertheless, 
IMHO these issues are worth considering now.

The issue of bursts has come up a few times. Our current load management 
generally avoids sustained bursts because it is based on measuring the total 
load on the network and guesstimating a safe speed at which to send requests, 
using the average time taken to send a request and the probability of a request 
being rejected or timing out. However, the network is heterogeneous, conditions 
in one place are not the same as in another, and users have differing views on 
security. So perhaps we could improve performance with a new load management 
scheme which adapts better to local conditions - most likely based on token 
passing and queueing (for bulk requests; realtime-flagged requests would be 
queued minimally or not at all). This would likely be more varied from place to 
place, *and* more bursty from time to time.
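
To make the current scheme concrete, here is a minimal sketch (hypothetical 
names, not the actual Freenet code) of that style of estimator: a single global 
guess at a safe request rate, derived from the average time per request and the 
observed reject/timeout probability. The token-passing alternative is sketched 
under point 1 of the implementation changes below.

    // Minimal sketch (hypothetical, not the actual Freenet code) of the current
    // approach: one global safe request rate, estimated from the average time a
    // request takes and the observed probability of rejection or timeout.
    class GlobalRateEstimator {
        private double avgRequestSeconds = 5.0;   // running average time per request
        private double rejectOrTimeoutProb = 0.0; // running reject/timeout probability
        private static final double DECAY = 0.05; // weight given to each new sample

        void report(double seconds, boolean rejectedOrTimedOut) {
            avgRequestSeconds += DECAY * (seconds - avgRequestSeconds);
            rejectOrTimeoutProb +=
                DECAY * ((rejectedOrTimedOut ? 1.0 : 0.0) - rejectOrTimeoutProb);
        }

        // Requests per second we guess the network can absorb from us; backing off
        // as rejections rise is what keeps this scheme from sustaining bursts.
        double safeRequestsPerSecond() {
            return (1.0 / avgRequestSeconds) * (1.0 - rejectOrTimeoutProb);
        }
    }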

Recently there has been a consensus that Freenet cannot safely burst (with the 
exception of purely local traffic, CBR-padded links, and invisible links such 
as LAN connections, private wifi or sneakernet), because a burst is visible at 
the network level - it's an explosion in traffic levels that fans out from the
originator, becoming less severe on each hop. If an attacker can see the 
traffic levels, and also has a node close enough to identify the keys involved, 
he can tell who originated the burst and what they are fetching (or inserting). 
Thus we can never come close to Perfect Dark's speed, for example, because it 
relies heavily on bursting (as well as having severe lower limits for bandwidth 
and disk usage).
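
To illustrate why this is considered dangerous, here is a toy sketch (entirely 
hypothetical, not an implemented attack) of what such an observer could do: 
record the size of the traffic spike on every link it can see during the burst 
window; since the spike decays on each hop away from the originator, the 
largest spikes point towards the source.

    import java.util.*;

    // Toy sketch (hypothetical): rank observed links by the size of their traffic
    // spike during a burst window. The fan-out argument says the spike is largest
    // at the originator and decays on each hop outwards.
    class BurstObserver {
        // bytes seen on each observed link during the burst window, keyed by link id
        private final Map<String, Long> burstBytes = new HashMap<>();

        void record(String linkId, long bytes) {
            burstBytes.merge(linkId, bytes, Long::sum);
        }

        // Links sorted by spike size, biggest first: the top of this list is where
        // the observer would start looking for the originator.
        List<Map.Entry<String, Long>> suspectsByBurstSize() {
            List<Map.Entry<String, Long>> links = new ArrayList<>(burstBytes.entrySet());
            links.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
            return links;
        }
    }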

However, I am not convinced. I think we are trying to fly before we can walk 
here, and in any case there is a fundamental flaw in the argument.

THE FUNDAMENTAL FLAW:
A powerful passive attacker (which is required for the above attacks) can see 
the traffic flows. They cannot be easily disguised, even if we manage to 
obscure individual requests (which is obviously essential!). We can make 
downloads faster, or we can make them take longer. If they are faster, they 
show up more obviously in the traffic graph. If they are really slow, it is 
conceivable that noise in the number of requests succeeding on the node may 
cover for them. On the other hand, stretching the burst over a longer period 
gives the attacker more time to try to move towards the originator (on 
opennet), may leave the whole network smaller and overall usage lower because 
of behavioural effects, gives the attacker the same number of key-based 
samples, and increases the chance of downtime during the period (which can be 
seen unambiguously from traffic analysis).

Intersection attacks are possible: compare the times when a specific node is 
up and receiving data with the times when a specific request is on the 
network. If many nodes are continually receiving data (not just requesting 
stuff that has fallen out), this attack is significantly harder - hence the 
traditional view that every node will have a huge queue and be constantly 
downloading. The more nodes which appear, on traffic flows, to be request 
sources, the more anonymity a requester has; and the more nodes *nearby* which 
appear to be traffic sources, the more anonymity he has against an attacker 
relatively nearby.

On opennet, mobile-attacker adaptive search is so powerful that bursts 
probably make very little difference. On darknet, bursts may still be a 
genuine concern, if link traffic levels are observable. Tunnels obviously will 
help, but are likely to have a huge performance impact, and there will be big 
issues over securing them against traffic analysis.
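
The intersection attack boils down to a very simple comparison; a toy sketch 
(hypothetical, illustrative only):

    // Toy sketch of the intersection attack described above (hypothetical,
    // illustrative only): intersect the periods during which a suspect node was up
    // and receiving data with the periods during which the targeted request stream
    // was visible on the network.
    class IntersectionAttack {
        // windows are arrays of {startMillis, endMillis}
        static long overlapMillis(long[][] nodeReceiving, long[][] requestActive) {
            long overlap = 0;
            for (long[] n : nodeReceiving) {
                for (long[] r : requestActive) {
                    long start = Math.max(n[0], r[0]);
                    long end = Math.min(n[1], r[1]);
                    if (end > start) overlap += end - start;
                }
            }
            return overlap;
        }

        // Fraction of the request's active time during which the suspect was up and
        // receiving data. A score near 1.0, repeated over many observations, singles
        // out the originator - unless many nodes are constantly receiving data, in
        // which case many nodes score high and the attack is much harder.
        static double score(long[][] nodeReceiving, long[][] requestActive) {
            long total = 0;
            for (long[] r : requestActive) total += r[1] - r[0];
            if (total == 0) return 0.0;
            return (double) overlapMillis(nodeReceiving, requestActive) / total;
        }
    }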

OPENNET:
Assume we are trying to trace a request, or an insert with predictable 
content; in the case of an insert whose content is not identifiable until 
after the data has been announced, bursting may be something of a concern.

If an attacker is close enough to identify the data, whether or not he has 
traffic data, he can get a rough bearing from the locations of the keys, and he 
can move towards that location, in the standard adaptive search attack. He can 
do this without being particularly powerful; this attack basically breaks 
opennet IMHO. The attacker can be quite a long way away. Data from traffic 
bursts does help, but the adaptive search attack is so much easier that it may 
not be relevant. A purely passive attacker will find nothing out in any case; 
nodes are needed on the network for all of these attacks. And limits on path 
folding are set in terms of time as well as in terms of requests, so there may 
be some security benefit to a transfer being faster and taking less time.
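
The last point deserves a concrete illustration: because path-folding 
opportunities are limited by elapsed time as well as by request count, a 
faster transfer exposes the originator to fewer folds overall. A toy sketch 
(hypothetical names and constants, not the real opennet code) of such a dual 
limit:

    // Toy sketch (hypothetical, illustrative only): allow a path-folding
    // opportunity only after BOTH a minimum number of successful requests and a
    // minimum amount of elapsed time. With the time-based floor in place, a faster
    // transfer necessarily yields fewer folding opportunities overall.
    class PathFoldingLimiter {
        private static final int MIN_REQUESTS_BETWEEN_FOLDS = 10;
        private static final long MIN_MILLIS_BETWEEN_FOLDS = 60_000;

        private int requestsSinceLastFold = 0;
        private long lastFoldTime = System.currentTimeMillis();

        void onSuccessfulRequest() { requestsSinceLastFold++; }

        boolean mayFoldNow() {
            long now = System.currentTimeMillis();
            if (requestsSinceLastFold < MIN_REQUESTS_BETWEEN_FOLDS) return false;
            if (now - lastFoldTime < MIN_MILLIS_BETWEEN_FOLDS) return false;
            requestsSinceLastFold = 0;
            lastFoldTime = now;
            return true;
        }
    }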

DARKNET:
Some darknet connections will be inherently invisible (private wifi links, LAN 
connections, sneakernet) or unobservable (constant bitrate padding, hard stego 
with rates determined purely by the transport e.g. faking a VoIP stream or 
gaming session). On these links we can happily send as much traffic, local and 
remote, as is physically possible, subject to concerns about nodes getting too 
great a proportion of our traffic (which are less of a worry on darknet 
anyway). For the rest, it is conceivable that severely limiting the incoming 
bandwidth used by outgoing requests, so that it is obscured by variations in 
request success, may help, but I am not convinced: overall, the traffic levels 
will still "point to" the originator - it will be a weaker signal, but over a 
longer period. Also, we do not want darknet to be radically slower than
opennet, or nobody will use it: darknet must be at least as fast as opennet by 
default (with enough connections), and we must allow users to add security at a 
cost in performance if they need it, through steganographic and CBR transports, 
and maybe tunnels.
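
A CBR-padded link of the kind mentioned above is conceptually simple; here is 
a minimal sketch (hypothetical, not a real Freenet transport) of the sending 
side:

    import java.security.SecureRandom;
    import java.util.concurrent.*;

    // Minimal sketch (hypothetical, not a real Freenet transport) of the sending
    // side of a constant-bitrate padded link: every tick we send exactly one
    // fixed-size frame, filled with real data if any is queued and with random
    // padding otherwise, so an observer sees the same traffic level whether or not
    // Freenet has anything to say. A real transport would add a length header and
    // encrypt the whole frame so padding is indistinguishable from data.
    class ConstantBitrateSender {
        interface Link { void send(byte[] wire); }

        private static final int FRAME_BYTES = 1024;
        private static final long TICK_MILLIS = 100;  // 1024 bytes / 100ms ~= 10KB/s
        private final BlockingQueue<byte[]> outgoing = new LinkedBlockingQueue<>();
        private final SecureRandom random = new SecureRandom();

        void queue(byte[] frame) { outgoing.add(frame); }  // frame <= FRAME_BYTES

        void run(Link link) throws InterruptedException {
            while (true) {
                byte[] frame = outgoing.poll();
                byte[] wire = new byte[FRAME_BYTES];
                if (frame != null) System.arraycopy(frame, 0, wire, 0, frame.length);
                else random.nextBytes(wire);  // pure padding
                link.send(wire);
                Thread.sleep(TICK_MILLIS);
            }
        }
    }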

TUNNELS:
I don't think there is much point turning tunnels on on opennet, for example 
(Sybil is just way too easy), but really paranoid users can set MAXIMUM 
security level and use tunnels (provided that it is possible to prevent tunnels 
being obvious on traffic analysis!); of course this assumes that we actually 
implement tunnels, which may take quite some time if they carry a major 
performance cost.

IMPLEMENTATION AND DOCUMENTATION CHANGES:
1. We need a new load management scheme, most likely some variant on token 
passing: For bulk requests, we calculate our capacity for running requests, and 
when we are able to accept some, we ask some of our peers to send us requests. 
If they send too many we reject some. We queue requests waiting for an 
opportunity to forward them to a good node (hence optimising routing), with an 
eventual timeout if we don't manage to do so. Hence load propagates backwards, 
because if a request doesn't move forward, its slot is not made available for a 
new request. There are lots of parameters we can tune, for example protection 
against ubernodes (how much of our traffic we are happy for any individual 
connection to carry), burstiness (how much we allow nodes which haven't sent 
many requests recently to send now), and reciprocity (favouring nodes that 
have been useful to us). See the sketch after this list.

2. In general, bursting is not a big problem for security, certainly not on 
opennet. Load management should not specifically try to limit it unless we are 
on darknet and the user has a high security level and/or has indicated that 
they are prepared to sacrifice a significant amount of performance to improve 
security slightly.

3. Darknet users must (eventually) have the option for fully padded constant 
bitrate connections, and connections using steganography in such a way that the 
traffic flow levels are determined by the steganography (as in faking a VoIP 
call), and of course by network conditions, but not by the level of traffic 
that is actually available. Eventually we will need to devise means to use some 
of the surplus bandwidth for exchanging data pre-emptively etc.

4. It must be easy for darknet users to indicate to the node that a connection 
is unobservable, and to configure how much ubernode protection they need, 
probably via per-peer trust levels.

5. We should encourage users to have data constantly downloading, but we need 
to be realistic about this. More security comes mainly from more total 
downloaders at any given time, which IMHO results from a bigger and faster 
network.

6. Responses to offered keys, and data fetched from Bloom filters, when they 
are purely local, should generally be exempt from all kinds of limiting, 
because the only node able to see them is the one they are being fetched from. 
On opennet, this may make it possible for the peer to guess that the data is 
needed locally - but fetching the data over a longer period of time, or 
fetching it from the broader network while ignoring the node which already has 
it, will just increase our exposure; the latter in particular means we are 
potentially vulnerable to far more nodes.

7. Persistent passive requests are a good thing, and can introduce some 
uncertainty into such calculations as #6.

8. Tunnels are a good thing, but given the likely performance cost not an 
immediate priority.

9. Inserts of data which is not predictable by an attacker should be 
encouraged, where this is reasonably possible. There are strong arguments that 
healing splitfiles (which means reinserting predictable data) is critical to 
good performance; hopefully the need for healing will be reduced when we have 
Bloom filter sharing. If an insert is
indistinguishable, there is a good chance of its remaining reasonably 
anonymous, even on opennet; bursts are a threat, and can perhaps be covered 
relatively easily. In traffic terms, an insert is similar to an answered 
request, unless you can trace it across the network and show that it is longer 
than a request. So at NORMAL security level or higher, unless the user 
overrides a specific setting, we should seriously consider severely limiting 
the bandwidth usage of
inserts. In practice, we already do this, by treating our inserts just like any 
other requests from any other peer.

10. Favouritism: How much can we prefer our own requests to others, in terms of 
what is feasible for the network as a whole, and in terms of what is safe? At 
the moment, we don't favour our requests at all, except that we generally take 
into account the fact that they won't generate any output usage (which can be 
a big factor, since output bandwidth is usually far scarcer than input 
bandwidth).
IMHO this is an open question. If we favour our own requests too much, opennet 
peers will drop us, darknet peers will refuse to answer our requests out of 
reciprocity because we are not answering theirs (assuming we implement 
reciprocity, which IMHO is a good idea eventually), and the peers we do have 
will know that our requests are local. IMHO the above reasoning does allow us 
to use excess capacity for our own requests - this is bursting. In other 
words, as a starting point, if there are both requests from other peers and 
local requests to send, we treat them equally (implying, for example, that we 
don't consider far more of our queued requests than theirs); but if there are 
no remote requests to send and there is capacity to send more, we can send our 
local requests - unless we are on darknet, doing a non-identifiable insert, 
and are very worried about our security.
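
As promised under point 1, here is a minimal sketch (hypothetical names and 
constants, not a design) of the token-passing idea, incorporating point 10's 
rule that spare capacity may be spent on local requests:

    import java.util.*;

    // Minimal sketch (hypothetical, not a design) of token-passing load management
    // as in point 1, plus point 10's rule: remote and local requests compete
    // equally for slots, but spare capacity may be spent on local requests.
    class TokenPassingLoadManager {
        private final int capacity;  // how many requests we can run at once
        private int running = 0;
        private final Map<String, Integer> grantedTokens = new HashMap<>();
        private final Deque<Runnable> queuedRemote = new ArrayDeque<>();
        private final Deque<Runnable> queuedLocal = new ArrayDeque<>();

        TokenPassingLoadManager(int capacity) { this.capacity = capacity; }

        // When we have free slots, grant tokens to peers, capped per peer
        // (crude ubernode protection).
        synchronized Map<String, Integer> grantTokens(List<String> peers, int maxPerPeer) {
            Map<String, Integer> grants = new HashMap<>();
            int free = capacity - running;
            for (String peer : peers) {
                if (free <= 0) break;
                int grant = Math.min(maxPerPeer, free);
                grants.put(peer, grant);
                grantedTokens.merge(peer, grant, Integer::sum);
                free -= grant;
            }
            return grants;
        }

        // A request from a peer with no remaining tokens is rejected ("if they send
        // too many we reject some").
        synchronized boolean acceptRemote(String peer, Runnable request) {
            Integer tokens = grantedTokens.get(peer);
            if (tokens == null || tokens == 0) return false;
            grantedTokens.put(peer, tokens - 1);
            queuedRemote.add(request);
            return true;
        }

        synchronized void acceptLocal(Runnable request) { queuedLocal.add(request); }

        // Start queued requests: roughly alternate between remote and local while
        // both are waiting; once no remote requests remain, spare capacity goes to
        // local requests (this is the bursting of point 10).
        synchronized void startQueued() {
            while (running < capacity && !(queuedRemote.isEmpty() && queuedLocal.isEmpty())) {
                Runnable next;
                if (queuedRemote.isEmpty()) next = queuedLocal.poll();
                else if (queuedLocal.isEmpty()) next = queuedRemote.poll();
                else next = (running % 2 == 0) ? queuedRemote.poll() : queuedLocal.poll();
                running++;
                next.run();  // in reality: forward to the best peer, with an eventual timeout
            }
        }

        synchronized void requestFinished() { running--; }
    }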

DOCUMENTATION/ATTITUDE/GOALS CHANGES:

1. Opennet security sucks. Really, it does. It may be better than some of the 
alternatives, but it isn't vastly better. But IMHO we have the potential to 
achieve fairly interesting performance on opennet - not instant results, but 
fairly good throughput and reachability for rarer content.
2. Unpredictable inserts can be reasonably secure even on opennet, provided 
precautions are taken and the attacker isn't too powerful.
3. It is a good idea to have stuff downloading constantly, although downloading 
stuff you don't need will slow down the network at large. Bloom filter changes 
mean it won't be cached on your node, but it will be cached by reasonably 
nearby nodes.
4. Invisible connections and unobservable connections are a good thing.
5. If an attacker is nearby, there is very little you can do either on darknet 
or on opennet, unless we have tunnels. Freenet is really designed to protect 
against a distant attacker; it is unlikely to work well if an attacker has 
compromised a uniform 10% of the network - but that is rather difficult to 
achieve on darknet.
6. The more/bigger content you request, the harder it is to protect you.
7. If an attacker is not nearby, you are on darknet, and the data to be traced 
is identifiable (e.g. a large splitfile request or a reinsert), he will 
eventually be able to find you, but that will involve compromising a long chain 
of nodes either by electronic or social means.