On Fri, Jun 5, 2009 at 10:20 AM, Matthew
Toseland <toad at amphibian.dyndns.org> wrote:
> The following is 0.9 stuff. 0.7.5 is a stabilisation release which should be 
> out within weeks. 0.8's main features will be Freetalk, MHKs, Bloom filter 
> sharing and related changes, resulting in significant gains to usability 
> (Freetalk), data retention and speed. Also, bursting is generally more of an 
> issue with faster connections, which are slowly being rolled out across the 
> world's major cities, and are already common in a few countries. However, 
> IMHO these issues are worth considering.
>
> The issue of bursts has come up a few times. Our current load management 
> generally avoids sustained bursts because it is based on measuring the total 
> load on the network and guesstimating a safe speed at which to send requests, 
> using the average time taken to send a request and the probability of a 
> request being rejected or timing out. However, the network is heterogeneous, 
> conditions in one place are not the same as in another, and users have 
> differing views on security. So perhaps we could improve performance with a 
> new load management scheme which adapts better to local conditions - most 
> likely based on token passing and queueing (for bulk requests, real time 
> flagged requests would be queued minimally or not at all). This would likely 
> be more varied from place to place, *and* more bursty from time to time.
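A rough sketch of the current scheme described above (hypothetical names and numbers, not the actual Freenet code): pick a send rate from the average time per request, and back off when the observed reject/timeout probability climbs past some target.

```python
# Sketch of rate-based load management: guess a safe request rate from the
# average time per request and the observed rejection probability.
# Hypothetical simplification, not the actual Freenet implementation.

def safe_request_rate(avg_request_seconds, reject_probability,
                      target_reject=0.05):
    """Requests/second we guess the network can absorb from us."""
    base_rate = 1.0 / avg_request_seconds   # roughly one outstanding request
    # Back off proportionally when rejects exceed the target level.
    if reject_probability > target_reject:
        base_rate *= target_reject / reject_probability
    return base_rate

rate = safe_request_rate(avg_request_seconds=2.0, reject_probability=0.20)
print(round(rate, 3))  # 0.125: rejects at 4x target, so a quarter of base rate
```

The point of the sketch is the weakness the email goes on to discuss: a single global estimate like this adapts poorly when conditions vary from place to place.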
>
> Recently there has been a consensus that Freenet cannot safely burst (with 
> the exception of purely local traffic, CBR padded links, and invisible links 
> such as LAN connections, private wifi or sneakernet), because a burst is 
> visible on the network level - it's an explosion in traffic levels that fans 
> out from the originator, becoming less severe on each hop. If an attacker can 
> see the traffic levels, and also has a node close enough to identify the keys 
> involved, he can tell who originated the burst and what they are fetching (or 
> inserting). Thus we can never come close to Perfect Dark's speed, for 
> example, because it relies heavily on bursting (as well as having severe 
> lower limits for bandwidth and disk usage).
>
> However, I am not convinced. I think we are trying to fly before we can walk 
> here, and in any case there is a fundamental flaw in the argument.
>
> THE FUNDAMENTAL FLAW:
> A powerful passive attacker (which is required for the above attacks) can see 
> the traffic flows. They cannot be easily disguised, even if we manage to 
> obscure individual requests (which is obviously essential!). We can make 
> downloads faster or we can make them take longer. If they are faster, then 
> they show up more obviously on the graph. If they are really slow then it is 
> conceivable that noise in the number of requests succeeding on the node may 
> cover for it. But on the other hand, it means the burst is over a longer 
> period, allows the attacker more time to try to move towards the originator 
> (on opennet), the whole network may be smaller and overall usage lower 
> because of behavioural effects, the attacker will see the same number of 
> key-based samples, and there is a greater chance of downtime during the 
> period (which can be unambiguously seen from traffic analysis). Intersection 
> attacks are possible: compare the time when a specific node is up and is 
> receiving data with the time when a specific request is on the network. If 
> many nodes are continually receiving data (not just requesting stuff that has 
> fallen out), then this attack is significantly harder, hence the traditional 
> view that every node will have a huge queue and be constantly downloading. 
> The more nodes which appear on traffic flows to be request sources, the more 
> anonymity a requester has, and the more nodes *nearby* which appear to be 
> traffic sources, the more anonymity he has against an attacker relatively 
> nearby. On opennet, mobile-attacker adaptive search is so powerful that 
> bursts probably make very little difference. On darknet, bursts may still be 
> a genuine concern, if link traffic levels are observable. Tunnels obviously 
> will help, but are likely to have a huge performance impact, and there will 
> be big issues over securing them against traffic analysis.
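The intersection attack mentioned above can be sketched in a few lines: across many observation windows, the attacker keeps only the nodes that were up and receiving data in every window in which the target request was visible. Purely illustrative; real observations are far noisier, which is exactly why many nodes constantly downloading makes this harder.

```python
# Sketch of an intersection attack: the suspect set shrinks to nodes that
# were up (and receiving data) whenever the target request was seen.
# Illustrative only; names and data are invented.

def intersect_suspects(observations):
    """observations: list of (request_seen, nodes_up_and_receiving) pairs."""
    suspects = None
    for request_seen, nodes_up in observations:
        if not request_seen:
            continue  # windows without the request tell us nothing here
        suspects = set(nodes_up) if suspects is None else suspects & set(nodes_up)
    return suspects or set()

obs = [
    (True, {"A", "B", "C"}),
    (True, {"A", "C", "D"}),
    (True, {"A", "E"}),
]
print(intersect_suspects(obs))  # {'A'}: only A was up in every window
```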
>
> OPENNET:
> Assume we are trying to trace a request or an insert with predictable 
> content; in the case of an insert with content not identifiable until after 
> the data has been announced, bursting may be something of a concern.
>
> If an attacker is close enough to identify the data, whether or not he has 
> traffic data, he can get a rough bearing from the locations of the keys, and 
> he can move towards that location, in the standard adaptive search attack. He 
> can do this without being particularly powerful; this attack basically breaks 
> opennet IMHO. The attacker can be quite a long way away. Data from traffic 
> bursts does help, but the adaptive search attack is so much easier that it 
> may not be relevant. A purely passive attacker will find nothing out in any 
> case; nodes are needed on the network for all of these attacks. And limits on 
> path folding are set in terms of time as well as in terms of requests, so 
> there may be some security benefit to a transfer being faster and taking less 
> time.
>
> DARKNET:
> Some darknet connections will be inherently invisible (private wifi links, 
> LAN connections, sneakernet) or unobservable (constant bitrate padding, hard 
> stego with rates determined purely by the transport e.g. faking a VoIP stream 
> or gaming session). On these links we can happily send as much traffic, local 
> and remote, as is physically possible, subject to concerns about nodes 
> getting too great a proportion of our traffic (which are less of a worry on 
> darknet anyway). For the rest, it is conceivable that severely limiting the 
> incoming bandwidth used by outgoing requests, so that it is obscured by 
> variations in request success, may help, but I am not convinced: the traffic 
> levels will still "point to" the originator; overall it will be a weaker 
> signal, but over a longer period. Also, we do not want darknet to be radically 
> slower than opennet, or nobody will use it: darknet must be at least as fast 
> as opennet by default (with enough connections), and we must allow users to 
> add security at a cost in performance if they need it, through steganographic 
> and CBR transports, and maybe tunnels.
>
> TUNNELS:
> I don't think there is much point turning tunnels on on opennet, for example 
> (Sybil is just way too easy), but really paranoid users can set MAXIMUM 
> security level and use tunnels (provided that it is possible to prevent 
> tunnels being obvious on traffic analysis!); of course this assumes that we 
> actually implement tunnels, which may not happen for quite some time if they 
> carry a major performance cost.
>
> IMPLEMENTATION AND DOCUMENTATION CHANGES:
> 1. We need a new load management scheme, most likely some variant on token 
> passing: For bulk requests, we calculate our capacity for running requests, 
> and when we are able to accept some, we ask some of our peers to send us 
> requests. If they send too many we reject some. We queue requests waiting for 
> an opportunity to forward them to a good node (hence optimising routing), 
> with an eventual timeout if we don't manage to do so. Hence load propagates 
> backwards, because if a request doesn't move forward, its slot is not made 
> available for a new request. There are lots of parameters we can tune, for 
> example: protection against ubernodes (how much of our traffic are we happy 
> with any individual connection having), burstiness (how much do we allow 
> nodes which haven't sent many requests recently to send now), and 
> reciprocity (favouring nodes that have been useful to us).
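A minimal sketch of such a token-passing scheme, with made-up parameter names (none of this is existing Freenet code): capacity is computed locally, peers are only allowed to send while we have slots, queued requests wait for a good peer and eventually time out, and load propagates backwards because a request that does not move forward keeps its slot occupied.

```python
# Sketch of token-passing load management. Completion handling (freeing
# `running` slots when a request finishes) is omitted for brevity.
import time
from collections import deque

class TokenPassingNode:
    def __init__(self, capacity, queue_timeout=30.0):
        self.capacity = capacity      # requests we can run at once
        self.running = 0
        self.queue = deque()          # (request, arrival_time)
        self.queue_timeout = queue_timeout

    def tokens_to_offer(self):
        """Tokens we can hand to peers asking to send us requests."""
        return max(0, self.capacity - self.running - len(self.queue))

    def accept(self, request):
        """Peer sent a request; reject if we have no free tokens."""
        if self.tokens_to_offer() <= 0:
            return False              # reject: load propagates backwards
        self.queue.append((request, time.monotonic()))
        return True

    def forward_ready(self, peer_has_capacity):
        """Forward queued requests to good peers; drop timed-out ones."""
        forwarded, now = [], time.monotonic()
        while self.queue:
            request, arrived = self.queue[0]
            if now - arrived > self.queue_timeout:
                self.queue.popleft()  # eventual timeout
            elif peer_has_capacity(request):
                self.queue.popleft()
                self.running += 1
                forwarded.append(request)
            else:
                break                 # wait for a slot; don't free ours
        return forwarded
```

For example, a node with `capacity=2` accepts two requests and rejects a third until one of the first two moves forward or times out.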
>
> 2. In general, bursting is not a big problem for security, certainly not on 
> opennet. Load management should not specifically try to limit it unless on 
> darknet and the user has a high security level and/or has configured that 
> they are prepared to sacrifice significant amounts of performance to improve 
> security slightly.
>
> 3. Darknet users must (eventually) have the option for fully padded constant 
> bitrate connections, and connections using steganography in such a way that 
> the traffic flow levels are determined by the steganography (as in faking a 
> VoIP call), and of course by network conditions, but not by the level of 
> traffic that is actually available. Eventually we will need to devise means 
> to use some of the surplus bandwidth for exchanging data pre-emptively etc.
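The CBR idea in item 3 can be illustrated with a toy framing scheme (the frame format here is invented for the example): every tick we emit a fixed-size frame whether or not real data is queued, so observed link traffic is determined by configuration, not by load.

```python
# Sketch of constant-bitrate padding: fixed-size frames every tick, real
# bytes first, zero padding for the rest. 2-byte length header, both
# invented for illustration; a real link would also encrypt the frames.

FRAME_SIZE = 1024  # bytes per tick, fixed by configuration

def next_frame(send_buffer: bytearray) -> bytes:
    """Take up to FRAME_SIZE-2 real bytes from the buffer; pad the rest."""
    payload = bytes(send_buffer[:FRAME_SIZE - 2])
    del send_buffer[:len(payload)]
    padding = b"\x00" * (FRAME_SIZE - 2 - len(payload))
    return len(payload).to_bytes(2, "big") + payload + padding

buf = bytearray(b"hello")
frame = next_frame(buf)
print(len(frame), int.from_bytes(frame[:2], "big"))  # 1024 5
frame = next_frame(buf)   # buffer now empty: a frame of pure padding
print(len(frame), int.from_bytes(frame[:2], "big"))  # 1024 0
```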
>
> 4. It must be easy for darknet users to indicate to the node that a 
> connection is unobservable, and to configure how much ubernode protection 
> they need, probably via per-peer trust levels.
>
> 5. We should encourage users to have data constantly downloading, but we need 
> to be realistic about this. More security comes mainly from more total 
> downloaders at any given time, which IMHO results from a bigger and faster 
> network.
>
> 6. Responses to offered keys, and data fetched from Bloom filters, when they 
> are purely local, should generally be exempt from all kinds of limiting, 
> because the only node able to see them is the one they are being fetched 
> from. On opennet, this may make it possible for the peer to guess that the 
> data is needed locally - but fetching the data over a longer period of time, 
> or fetching it from the broader network and ignoring the node which already 
> has it, will just increase our exposure - especially the latter means we are 
> potentially vulnerable to far more nodes.
>
> 7. Persistent passive requests are a good thing, and can introduce some 
> uncertainty into such calculations as #6.
>
> 8. Tunnels are a good thing, but given the likely performance cost not an 
> immediate priority.
>
> 9. Inserts of data which is not predictable by an attacker should be 
> encouraged, where this is reasonably possible. There are strong arguments 
> that the need to heal splitfiles is critical to good performance. Hopefully 
> this will be reduced when we have Bloom filter sharing. If an insert is 
> indistinguishable, there is a good chance of its remaining reasonably 
> anonymous, even on opennet; bursts are a threat, and can perhaps be covered 
> relatively easily. In traffic terms, an insert is similar to an answered 
> request, unless you can trace it across the network and show that it is 
> longer than a request. So at normal security level or higher, unless the user 
> overrides a 
> specific setting, we should seriously consider severely limiting the 
> bandwidth usage of inserts. In practice, we already do this, by treating our 
> inserts just like any other requests from any other peer.

I think MHKs will greatly improve splitfile reliability. Any chance
this can make it into 0.7.5?

>
> 10. Favouritism: How much can we prefer our own requests to others, in terms 
> of what is feasible for the network as a whole, and in terms of what is safe? 
> At the moment, we don't favour our requests at all, except that we generally 
> take into account the fact that they won't generate any output usage (which 
> can be a big factor, since usually input bandwidth is way less than output 
> bandwidth). IMHO this is an open question. If we favour our own requests too 
> much, opennet peers will drop us, darknet peers will refuse to answer our 
> requests by reciprocity for us not answering theirs (assuming we implement 
> this, which imho is a good idea eventually), and peers we do have will know 
> our requests are local. IMHO the above reasoning does allow for us to use 
> excess capacity for our own requests - this is bursting. In other words, as a 
> starting point, if there are requests to send from other peers and local 
> requests to send we treat them equally (implying for example that we don't 
> consider far more of our queued requests than theirs), but if there are no 
> requests to send and there is the capacity to send some more, we can send our 
> local requests, unless we are on darknet, doing a non-identifiable insert, 
> and are very worried about our security.
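The starting point described in item 10 might look like the following (illustrative names; the equal-treatment and spare-capacity rules come from the text above, everything else is assumption):

```python
# Sketch of no-favouritism scheduling: while both queues have work, pick
# between them evenly; use spare capacity for local bursts unless the user
# is in a cautious darknet configuration.
import random

def pick_next(remote_queue, local_queue, spare_capacity, cautious=False):
    """Return ('remote'|'local', request), or None if nothing should be sent."""
    if remote_queue and local_queue:
        # No favouritism: choose between the two queues evenly.
        queue, tag = random.choice([(remote_queue, "remote"),
                                    (local_queue, "local")])
        return tag, queue.pop(0)
    if remote_queue:
        return "remote", remote_queue.pop(0)
    if local_queue and spare_capacity > 0 and not cautious:
        return "local", local_queue.pop(0)  # burst with excess capacity
    return None
```

With `cautious=True` (the darknet, security-conscious case in the text), idle capacity is simply left unused rather than spent on local bursts.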
>
> DOCUMENTATION/ATTITUDE/GOALS CHANGES:
>
> 1. Opennet security sucks. Really, it does. It may be better than some of the 
> alternatives, but it isn't vastly better. But IMHO we have the potential to 
> achieve fairly interesting performance on opennet - not instant results, but 
> fairly good throughput and reachability for rarer content.
> 2. Unpredictable inserts can be reasonably secure even on opennet, provided 
> precautions are taken and the attacker isn't too powerful.
> 3. It is a good idea to have stuff downloading constantly, although 
> downloading stuff you don't need will slow down the network at large.

Would a list of keys we've transferred (either requested or forwarded
for a peer) in the past "x" hours and randomly re-requesting some
during periods of low downloads help to disguise traffic?
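One way that "x hours" idea could be sketched (entirely hypothetical, not an existing Freenet mechanism): remember keys recently seen, and re-request a random sample of them while local demand is low.

```python
# Sketch of cover traffic via re-requests: track keys transferred in the
# last window_hours and sample some to re-request during idle periods.
# All names are invented for illustration.
import random
import time

class CoverTraffic:
    def __init__(self, window_hours=24):
        self.window = window_hours * 3600
        self.seen = {}                    # key -> last-seen timestamp

    def record(self, key, now=None):
        self.seen[key] = now if now is not None else time.time()

    def sample(self, count, now=None):
        """Random keys still inside the window, for idle-time re-requests."""
        now = now if now is not None else time.time()
        fresh = [k for k, t in self.seen.items() if now - t <= self.window]
        return random.sample(fresh, min(count, len(fresh)))
```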

For that matter, perhaps a queue for splitfile healing that inserts
random healing CHKs, or do we already do this?

Bloom filter changes mean it won't be cached on your node, but it
will be cached by reasonably nearby nodes.
> 4. Invisible connections and unobservable connections are a good thing.
> 5. If an attacker is nearby, there is very little you can do either on 
> darknet or on opennet, unless we have tunnels. Freenet is really designed to 
> protect against a distant attacker, it is unlikely to work well if an 
> attacker has compromised a uniform 10% of the network - but that is rather 
> difficult on darknet.
> 6. The more/bigger content you request, the harder it is to protect you.
> 7. If an attacker is not nearby, you are on darknet, and the data to be 
> traced is identifiable (e.g. a large splitfile request or a reinsert), he 
> will eventually be able to find you, but that will involve compromising a 
> long chain of nodes either by electronic or social means.
>
> _______________________________________________
> Tech mailing list
> Tech at freenetproject.org
> http://emu.freenetproject.org/cgi-bin/mailman/listinfo/tech
>
