On Sunday, Oskar, Ian, Theo, Michael, Vilhelm (vive), a couple of newbies and I met up in a pub in London and had a long, productive conversation about the current status of Freenet and the near-future priorities. Vive has some notes; proper minutes will not be produced, but I include the important points and action items below, most of which inevitably fall to me! :) One important side benefit was explaining various current issues to people who haven't been very involved for a while... hopefully they will contribute more now.
FINANCIAL STATUS: Freenet Project Inc has $16,000 in the bank and PayPal combined. This is enough to keep me paid until the end of the year. Google contributed $18K some months ago, and IIRC we had a $1500 donation more recently, but we had expected much of that to be used by now; the difference may be private (PayPal) donations.

VERSIONING: Given the significant security and performance gains available from some of the items below, we should IMHO skip 0.7.1 and release 0.8-alpha1 in approximately December. Hopefully we will be able to raise enough money on the back of this to continue into next year. Many features will need to be postponed to 0.9; hopefully 1.0 will be largely stabilisation work.

ACTION ITEMS/TODO LIST ADDITIONS:

Somewhat theoretical stuff: should work, but needs simulations to be absolutely certain. Mostly relatively easy to implement once the simulations confirm that it works, but actual implementation should probably be postponed to 0.9, as it is mostly darknet specific and potentially destabilising. Responsible: Vive (after he comes back from holiday, assisted by Oskar).

1. Selective randomization. It is necessary to randomize node locations from time to time in order to prevent the network from degenerating, with some parts of the keyspace becoming increasingly crowded and other parts becoming huge voids with no nodes in them at all. This can be caused either by join-leave churn (an unhelpful form of "evolution", with locations in less dense areas being shifted to new nodes and then dropped), or by a deliberate attack on the swapping algorithm, again with the goal of destroying locations and concentrating all nodes' locations over an increasingly small area of the keyspace. For more detail read the Pitch Black paper (is it linked from the website??). Currently we randomize on a regular basis. This is bad because it is not sensitive to the network size or to whether an active attack is happening, hence it does either too much randomization or too little. Oskar has proposed a new way to decide when to randomize: we send out 5 probe requests (not the same as the current probe requests) to 5 random locations, and each reports how close the reached node is to its peers' locations. We then take the overall median, and if we are much closer (by a tuning factor) to our peers than the probed nodes typically are, we reset our location to one of the random locations we probed for. This requires simulation to: a) determine a reasonable tuning factor; b) prove that it works; c) prove that it is less vulnerable to attack than the current mechanism. An attacker would have to compromise 3 of the 5 paths, but for each path he only needs to be somewhere on the path to fake it... the paths might be limited to 5 hops, but even so, that's a lot of opportunities. On the other hand, it is unlikely that regular randomization will beat location spamming attacks, unless there is so much randomization that it has a significant impact on performance when there is no attack.
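To make the proposal concrete, here is a rough Java sketch of the decision rule. Everything in it is illustrative rather than real node code: the class and method names are made up, the tuning factor of 0.5 is a placeholder for whatever the simulations settle on, and it assumes each probe simply reports the probed node's median distance to its own peers.

    import java.util.Arrays;
    import java.util.Random;

    // Hypothetical sketch of the proposed randomization trigger; none of these
    // class or method names exist in the real node.
    public class SelectiveRandomization {

        // Placeholder tuning factor: reset if we are more than twice as close to
        // our peers as a typical probed node is to its peers. Finding a sensible
        // value is exactly what the simulations are for.
        static final double TUNING_FACTOR = 0.5;

        static final Random random = new Random();

        // Distance on the circular [0,1) keyspace.
        static double distance(double a, double b) {
            double d = Math.abs(a - b);
            return Math.min(d, 1.0 - d);
        }

        // Median distance from a location to a set of peer locations.
        static double closenessToPeers(double location, double[] peerLocations) {
            double[] dists = new double[peerLocations.length];
            for (int i = 0; i < peerLocations.length; i++)
                dists[i] = distance(location, peerLocations[i]);
            Arrays.sort(dists);
            return dists[dists.length / 2];
        }

        // probeResults[i] is the closeness-to-peers reported by the node reached
        // when probing randomTargets[i]. Returns the location we should now use:
        // either our current one, or one of the probed random locations.
        static double maybeRandomize(double myLocation, double[] myPeerLocations,
                                     double[] randomTargets, double[] probeResults) {
            double ours = closenessToPeers(myLocation, myPeerLocations);
            double[] sorted = probeResults.clone();
            Arrays.sort(sorted);
            double typical = sorted[sorted.length / 2]; // overall median
            // Much closer to our peers than a typical node => our part of the
            // keyspace has been squeezed (churn or a Pitch Black style attack),
            // so reset our location rather than randomizing on a fixed schedule.
            if (ours < typical * TUNING_FACTOR)
                return randomTargets[random.nextInt(randomTargets.length)];
            return myLocation;
        }
    }

The point of the rule is that it only fires when our neighbourhood is measurably more squashed than the rest of the network, so a healthy network does almost no randomization.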
2. Targeted swapping. Freenet initially assigns random locations to nodes on the darknet, and constantly attempts to swap locations between pairs of nodes to improve the routability of the network, using the Metropolis-Hastings algorithm. At the moment, these swaps are proposed completely randomly, from 10-hop random walks. Vive proposed in a paper last year that some proportion of swaps should be "directed", i.e. the node figures out where would be optimal for its peers, then routes to that location and proposes a swap with the "ideal" node. Not all of the swaps can be directed, in order to avoid long-lived false minima. This would speed up location swapping: the network would stabilise faster, achieve more efficient routing, and hopefully suffer less location churn. But mostly this is about making routing work well on darknet: see the next item.

3. Opennet nodes and swapping. Opennet nodes use path folding, which Oskar's work shows is a very fast, reliable algorithm for making a navigable network by changing the links rather than the locations, and therefore they do not need swapping. Swapping may in fact cause significant location churn for opennet nodes, leading to poor data storage. But simply turning it off for opennet nodes may have unexpected side-effects on a hybrid network. One proposal is that opennet nodes should not initiate swaps, but should relay them for darknet nodes; or even accept swaps, but not initiate them. The question is whether the differences in swap pressure will cause problems... We need some detailed simulations. There is the potential to improve performance (location stability, hence data reachability) significantly for opennet (i.e. the majority of the network at present), without jeopardising our long term strategy of moving everyone progressively to darknet in order to reduce our attackability (especially as ISPs may begin blocking p2p networks in many countries in the next few years).

More straightforward todo items (mostly me), for 0.8-beta1:

1. Finish the db4o branch. I have been working on the db4o branch for some time, around a month. This is an attempt to move the client layer (for persistent requests) into a lightweight object database. This should allow fast resuming of requests, and let users queue larger numbers of requests than will fit in RAM, with performance degrading gracefully. Right now it is essentially working, but has major performance problems, resulting in the in-RAM queue of database jobs continually expanding when a big request is running (even on my fairly heavy hardware, admittedly with heavy logging enabled). There is room for significant optimisation, the immediate thing being to not use pendingKeys (the big persistent map of which key is wanted by which request) when we merely retry a block. Hopefully it will be possible to achieve reasonable performance, but some work will also need to be done to ensure that on a slow system it still works, even if slowly (i.e. that it slows down rather than the job queue expanding indefinitely). There is a nonzero chance that we will have to abandon the whole project. Simple snapshot-based resuming would be possible in that case (and IMHO essential for 0.8), but would of course take time on startup and memory proportional to the queue size.

2. Friend-of-a-friend routing. (Toad) Published work shows that taking into account the locations of our neighbours' neighbours when routing can dramatically improve performance. Oskar estimates that we can reduce the path length by an average factor of two, and this may be even better in practice on a darknet. Apparently beyond two hops we rapidly run into diminishing returns. The security impact is minimal since this data is already exposed, albeit somewhat less reliably, by swapping. In theory the performance gain should be around a factor of log log N, assuming N nodes and log N peers per node, but no more than 5.
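To illustrate the FOAF idea (this is not the actual routing code; the Peer class, the field holding the advertised peer locations and the distance helper are all hypothetical), here is a sketch contrasting the current greedy choice with a friend-of-a-friend choice:

    import java.util.List;

    // Hypothetical sketch of friend-of-a-friend routing: when choosing the next
    // hop for a key, consider not just each peer's own location but also the
    // locations it advertises for its peers.
    public class FoafRouting {

        // Distance on the circular [0,1) keyspace.
        static double distance(double a, double b) {
            double d = Math.abs(a - b);
            return Math.min(d, 1.0 - d);
        }

        // A directly connected peer plus the locations it advertises for its own peers.
        static class Peer {
            final double location;
            final double[] foafLocations;
            Peer(double location, double[] foafLocations) {
                this.location = location;
                this.foafLocations = foafLocations;
            }
        }

        // Current greedy routing: pick the peer whose own location is closest to
        // the target key's location.
        static Peer routeGreedy(double target, List<Peer> peers) {
            Peer best = null;
            double bestDist = Double.MAX_VALUE;
            for (Peer p : peers) {
                double d = distance(p.location, target);
                if (d < bestDist) { bestDist = d; best = p; }
            }
            return best;
        }

        // FOAF routing: score each peer by the closest location reachable through
        // it in one more hop (its own location or any of its peers' advertised
        // locations), then pick the best-scoring peer.
        static Peer routeFoaf(double target, List<Peer> peers) {
            Peer best = null;
            double bestDist = Double.MAX_VALUE;
            for (Peer p : peers) {
                double d = distance(p.location, target);
                for (double foaf : p.foafLocations)
                    d = Math.min(d, distance(foaf, target));
                if (d < bestDist) { bestDist = d; best = p; }
            }
            return best;
        }
    }

The only change is that each peer is scored by the best location reachable through it in one more hop, information our peers already effectively expose via swapping.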
3. Eliminate all checkboxes in the installer, move them to the wizard, and add an advanced options button in the installer. (Toad) This would streamline installation significantly: paranoid users could click a box to turn UPnP etc on/off, and everyone else doesn't need to worry about it.

4. Non-convergent splitfile encryption. Encrypting splitfiles using a random key, independent of the content of the splitfile, would dramatically reduce the rate at which an attacker can approach the inserter in the case where he can guess the content. Implementing this will be reasonably easy; it will however cost some redundancy: if two people insert the same file, they won't share blocks. In the future there are a couple of (relatively complex) schemes to reduce this cost, which we may implement if they appear to be necessary. This should probably be on by default, but easy to turn off if the user is sure the file content can't be predicted by the attacker, or just doesn't care.
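For illustration only, here is roughly what the difference looks like in terms of the standard JCE. This is not Freenet's actual CHK code, and the real key derivation, cipher and block handling differ; the class and method names are invented.

    import java.security.MessageDigest;
    import java.security.SecureRandom;
    import javax.crypto.Cipher;
    import javax.crypto.spec.IvParameterSpec;
    import javax.crypto.spec.SecretKeySpec;

    // Convergent vs non-convergent block keys, sketched with the standard JCE.
    public class SplitfileKeys {

        // Convergent: the key is derived from the plaintext, so identical data
        // always encrypts to identical blocks. Good for sharing blocks between
        // inserters, bad if the attacker can guess the content (he can then
        // derive the key and recognise the blocks).
        static byte[] convergentKey(byte[] plaintext) throws Exception {
            return MessageDigest.getInstance("SHA-256").digest(plaintext);
        }

        // Non-convergent: a fresh random key per insert. Two inserts of the same
        // file no longer share blocks (the redundancy cost mentioned above), but
        // guessing the content no longer lets an attacker reproduce the blocks.
        static byte[] randomKey() {
            byte[] key = new byte[32];
            new SecureRandom().nextBytes(key);
            return key;
        }

        // Encrypt one block with whichever key was chosen above.
        static byte[] encryptBlock(byte[] key, byte[] iv, byte[] data) throws Exception {
            Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                   new IvParameterSpec(iv));
            return c.doFinal(data);
        }
    }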
5. THROTTLE EVERYTHING!! (Toad) Not all traffic is throttled at the moment. We should subject all traffic, not only data transfer packets, to both congestion control and bandwidth limiting. This is related to the low-level streams proposal, but can be done without it, and should be easier. The current situation is especially problematic for online gaming. See bug 2312. Note that a lot of Freenet runs on altruism, thus it is *essential* that it not severely disrupt a user's internet connection!

6. Reinstate the uninstall survey. (Toad) Nextgens had an uninstall survey on emu for a while. Some useful data came from it, and since 90%+ of our users don't stick around, it would be a good idea to reinstate it and try to optimise the questions.

7. Automatic bandwidth limit calibration. (Toad) Several other p2p apps implement this; we should too. Bandwidth is *the* scarce resource most of the time, and we want to use as much of it as we can without significantly slowing down the user's internet connection (see above).

Vive's students' possible projects:

1. Facebook application. A Facebook app for (semi-?)automatic swapping of noderefs with people already marked as your friends on Facebook. Obviously this has security implications; it would be turned off by default but would be a useful option. We could explain this in the post-install wizard.

TO IMPLEMENT IN 0.9:

1. Streams (low level optimisation). At the moment, Freenet is quite inefficient in bandwidth usage: the payload percentage, especially on nodes with low bandwidth limits but also on nodes with high bandwidth limits, can be quite low, 70%, maybe 50%... this is unacceptable. A lot of this is due to message padding. If we split block transfers (and large messages) into streams which can be sliced byte-wise into whatever size we want, rather than into 1K blocks, then we will need much less padding and so achieve a higher payload percentage. This also has significant side benefits for steganography and for connections with low MTU, since we can (on a link with live streams) construct useful packets of almost any size.

2. Tunnels. There have been several proposals for creating encrypted tunnels to send requests through. The oldest proposal is cellular premix routing, which we have now abandoned, because it only offers an anonymity set of the cell and costs twice the cell's diameter in hops. The recent proposals create tunnels by doing a rendezvous of two or more "anchors" sent out by a node: either we rendezvous at a location, with the first anchor sent direct and the second randomly routed for a few hops, or we send two or more randomly routed anchors. Either way, when the anchors meet, we create a tunnel. There can be just two anchors, or it can be an M of N secret sharing scheme. Once a tunnel is created, it will be fairly long lived, and related requests will be sent down the same tunnel(s). Related meaning firstly part of the same splitfile, and secondly marked as belonging to the same pseudonymous identity by FCP. (This is a GOOD thing, to reduce the number of samples, as long as separate pseudonymous identities can be kept separate.) Tunnels will help with each of the three main attacks on requests which most concern me at present:
- Datastore seizure / remote probing.
- Local correlation attacks.
- Key-based search. (The splitfile changes above also help with this, but only for inserts; put both together and we have a radical improvement in security for inserters.)
Because tunnels mean that requests are started many hops away, we won't have to cache local requests in the local datastore. We will have a separate client cache, which may be ephemeral (key in RAM only, not saved). This means that we can publish Bloom filters for our datastore to our direct neighbours. This will likely save us several hops on average, so the overall cost of tunneling will not be huge. The only real problem is what to do if the user turns off tunneling... turn off Bloom filters too?
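Since Bloom filters come up again below, here is a toy example of the sort of filter we could build over the datastore and hand to direct neighbours. The bit size, number of hashes, SHA-256-based index derivation and all names are arbitrary choices for the sketch, not proposed parameters.

    import java.security.MessageDigest;
    import java.util.BitSet;

    // Toy Bloom filter over datastore keys, of the kind we could publish to our
    // direct neighbours so they can tell, with some false positives, whether we
    // probably have a block before routing a request onwards.
    public class StoreBloomFilter {

        private final BitSet bits;
        private final int sizeBits;
        private final int numHashes; // at most 8 with this index derivation

        public StoreBloomFilter(int sizeBits, int numHashes) {
            if (numHashes > 8) throw new IllegalArgumentException("max 8 hashes here");
            this.sizeBits = sizeBits;
            this.numHashes = numHashes;
            this.bits = new BitSet(sizeBits);
        }

        // Derive numHashes bit indices from a single SHA-256 of the routing key.
        private int[] indices(byte[] routingKey) {
            try {
                byte[] h = MessageDigest.getInstance("SHA-256").digest(routingKey);
                int[] idx = new int[numHashes];
                for (int i = 0; i < numHashes; i++) {
                    int v = ((h[4 * i] & 0xff) << 24) | ((h[4 * i + 1] & 0xff) << 16)
                          | ((h[4 * i + 2] & 0xff) << 8) | (h[4 * i + 3] & 0xff);
                    idx[i] = Math.floorMod(v, sizeBits);
                }
                return idx;
            } catch (java.security.NoSuchAlgorithmException e) {
                throw new RuntimeException(e);
            }
        }

        // Called for every key in our datastore when (re)building the filter.
        public void add(byte[] routingKey) {
            for (int i : indices(routingKey)) bits.set(i);
        }

        // A neighbour calls this on the filter we sent it: false means we
        // definitely don't have the block, true means we probably do.
        public boolean mightContain(byte[] routingKey) {
            for (int i : indices(routingKey))
                if (!bits.get(i)) return false;
            return true;
        }

        // Serialised form to send to a peer.
        public byte[] toBytes() { return bits.toByteArray(); }
    }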
3. Passive requests. This feature has been planned for some time; I explained the current vision to Oskar et al, and the main justification for it (FMS!). There seemed to be general support in the group, in the medium term. Passive requests are a way of subscribing to a key by doing a request for it, with the nodes visited remembering the request and forwarding the data if it is found. As long as routing works, this should work adequately. Currently we have a simple form of this, ULPRs (Ultra-Lightweight Passive/Persistent Requests), but this is inherently unreliable and only lasts an hour. Full passive requests would be reliable as long as routing works. This means that when a connection is broken, or there are significant changes to the network topology/locations, the node involved would need to reroute the request. This is somewhat similar to the publish/subscribe system I tried to get working early in the 0.7 cycle, but IMHO we have a much better idea of what we are doing now. Note that we may have a large number of passive request subscriptions on any one node, big enough that they will need to be stored in a database. Also note that if a lot of nodes are subscribed to a key, they form a tree/web structure, so that the data will be efficiently and near instantly propagated to all of them when it is found (in a place where it shouldn't have been, or because it was inserted). The objective here is primarily to reduce the cost of polling, particularly for FMS, but also for requests in general, and to increase the speed at which data can be found if people subscribe to it in advance. High-bandwidth / medium-latency publish/subscribe (e.g. audio streams) can be built on top of this; whether we want a separate system for low-bandwidth, low-latency streams (IRC etc) is an open question.

DATA COLLECTION:

1. Push/pull tests. We should do push/pull tests regularly on the real network. On darknet this won't work well because of the difficulty of getting somebody you're not near to fetch your data, although this is IMHO possible with FMS etc volunteers. On opennet we can simply select a different pair of peers every time, to avoid the two nodes converging on each other as they are reused.

OPEN QUESTIONS:

1. CPU profiling. Is there any more we can do to improve the node's CPU usage? It can still be high on some systems. What I found last time I tried was that mostly this is due to low memory conditions and FEC encoding/decoding... but it may be worth re-investigating.

2. Oskar's NGR-like-routing paper. Is there an opportunity for significant performance gain here? Oskar proved in a paper that it is possible to take link costs into account in routing and still achieve good routing... it's not clear how close this is to something we could use.

3. Darknet link trust levels. Should we have a trust level dropdown box for each darknet peer? Oskar had originally assumed that we would solve the datastore problems by simply not caching local requests, on the grounds that we trust our (darknet) peers not to attempt to determine whether our requests are local from whether we cached them. This logic doesn't work for most links, even most darknet links; but for some connections it may be useful, since it would significantly reduce our vulnerability if our store were seized (only a fraction of the file would be cached in our store), and likewise with remote store probing. Also, we may not want to send Bloom filters that are potentially incriminating to all our peers, especially if we turn off tunneling for some requests??? (A rough sketch of what such a trust setting might control follows after this list.)

4. Migration of data. After we swap locations, should we try to migrate data to match the new location?
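Returning to question 3: purely as a sketch of what such a per-peer trust setting might control. None of this exists in the node; the enum and method names are invented.

    import java.util.Collection;

    // Illustrative only: what a per-darknet-peer trust setting might decide.
    public class DarknetPeerTrust {

        // The hypothetical dropdown value for each darknet peer.
        public enum TrustLevel { LOW, NORMAL, HIGH }

        // We only dare skip caching local requests (Oskar's original assumption)
        // if every connected peer is trusted not to probe our store to work out
        // which requests were local. Otherwise, keep caching them as we do now.
        public static boolean shouldCacheLocalRequests(Collection<TrustLevel> peers) {
            for (TrustLevel t : peers)
                if (t != TrustLevel.HIGH) return true;
            return false;
        }

        // Only share potentially incriminating datastore Bloom filters with
        // highly trusted peers.
        public static boolean shouldSendBloomFilter(TrustLevel peer) {
            return peer == TrustLevel.HIGH;
        }
    }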
