On 31/03/14 17:54, Robert Hailey wrote:
> I've only been skimming this thread, so excuse me if this is a bit off-base.
>
> (1) I have noticed a reference to "not routing inserts to new nodes" (wrt
> MAST). I have recently required such a decision (as to whether a node is
> "new"), so I think a core idea of node veterancy is a good idea... and if it
> does not affect the routing of "GET" requests (which lets them earn a veteran
> status), I don't think that modifying the routing for INSERTs would have any
> negative effect. It's basically just saying that INSERTs must go into the
> "known good" network.

Requests and inserts need to go down the same routes if data is to be findable quickly.

Although conceivably we could do tricks like side-broadcasts: we can't share Bloom filters with nodes that aren't high-uptime, but we can simply send them our requests, in an efficient (= slowish) way, at the same time as forwarding the request. If one of those peers has the data, we return it both forwards and backwards. This would be reasonably safe if we have tunnels, or for low-HTL requests; and requests could be represented fairly compactly if they're not time-sensitive. TODO: file various bugs...
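Very roughly, a side-broadcast could look something like this (a sketch only; the class and message names are invented, not existing Freenet code, and the batching details would obviously need thought):

// Illustrative sketch only -- class, method and message names are invented here,
// not existing Freenet code. Idea: while a request is forwarded along the normal
// route, remember its key, and later push the accumulated keys to low-uptime
// peers in one compact, low-priority batch. A peer that happens to hold the data
// replies separately, and the node can then return it both forwards and backwards.
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class SideBroadcast {

    /** Stand-in for whatever connection object the node keeps per peer. */
    interface Peer {
        void sendKeyBatch(byte[][] keys);
    }

    /** Keys we still owe to low-uptime peers; flushed lazily so the overhead stays small. */
    private final Queue<byte[]> pendingKeys = new ConcurrentLinkedQueue<>();

    /** Called when a request is forwarded along the normal route. */
    public void onForwardRequest(byte[] routingKey) {
        pendingKeys.add(routingKey);
    }

    /** Called from a slow timer: send everything accumulated so far as one batch. */
    public void flush(List<Peer> lowUptimePeers) {
        List<byte[]> drained = new ArrayList<>();
        for (byte[] key; (key = pendingKeys.poll()) != null; ) {
            drained.add(key);
        }
        if (drained.isEmpty()) return;
        byte[][] batch = drained.toArray(new byte[0][]);
        for (Peer p : lowUptimePeers) {
            p.sendKeyBatch(batch);
        }
    }
}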
The proposal is *not to route sensitive data to new nodes*.

"Sensitive data" is:
- High HTL requests (ideally, "high HTL" meaning "prior to reaching the closest node to the target location", though we may only restrict the first few hops). Note that requests will complete before or very soon after this, but inserts will go many more hops.
- Tunnels. (These need high uptime and good performance anyway, or we'll be constantly creating new ones.)

"New nodes" means (a rough eligibility check along these lines is sketched below):
- Nodes which are actually new:
  -- Creation time too recent.
  -- Connected time (to a specific node) too short.
- Low uptime nodes.
  -- Right now uptime is self-reported.
  -- We can measure it on a per-peer level.
  -- We could verify it on a global level, e.g. using shadow nodes assigned by the seednodes.
- Nodes which don't have a proven track record of usefully serving requests (i.e. actual performance).
  -- E.g. if a node is still connected after some threshold time period, it's not "new".
- Nodes with less than a certain amount of bandwidth.
  -- This should be actually measured by a peer.
  -- There may be a minimum requirement per peer regardless of traffic; this makes it easy to measure, helps with traffic analysis, and sometimes we will be able to use the bandwidth for useful things, so it's not as bad as it sounds.
  -- On a global level, we need to publish a list of peers periodically for tunneling. We could include signed bandwidth stats. However, these can be faked if the node is "on ice" and only connected to the attacker's own nodes, so we need the per-node limits too.

Much of the above assumes new infrastructure: when creating an opennet node, the seednodes give you a certificate. Ideally this bootstrapping process would be costly; at the very least, it's limited by the capacity of the seednodes and limited per IP address per unit time. Also, the seednodes might allocate one or more "shadow nodes", as in ShadowWalker, to help keep peers lists honest, or possibly to verify claims about uptime.

Hence for any mobile attack (e.g. MAST, but also statistical attacks on random routing) where the attacker (Mallory) needs to be able to "move", i.e. create new locations (possibly abandoning old ones), he will need to wait until his newly activated identities obtain a sufficient reputation, and spend resources (bandwidth, and probably answering requests). He can't just announce (or path fold) to the target nodes and use them immediately. This could slow down MAST enormously, but it may require that the network's evolution be relatively slow - so fast growth could be a problem, especially during a slashdotting (maybe we'd reduce the limits temporarily?). This helps even if creation of identities is cheap, or if Mallory has a stash he created earlier (we want the node to have been useful *recently*, not just in the distant past, before we trust it). Requiring that each core node have a unique IP address would also help, but might mean some centralisation, and probably doesn't work with IPv6.

It is obviously going to be easier to bypass this sort of thing if creating identities is too cheap; we still need to think about that. If the main limiting factor is per-IP-per-unit-time, then the attacker can simply change his IP address (e.g. by resetting a domestic DSL connection), and e.g. keep creating new identities until he gets the location he wants and/or the assigned shadow node is one of his, etc. Limits at the IP block level are problematic and need manual supervision.
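The per-peer side of this could look something like the following (Java, but purely a sketch: the class names, fields and thresholds are all made up for illustration, not existing code, and the real numbers would have to come from measurement and simulation). The point is just that each criterion above maps to a cheap, locally measurable test:

// Illustrative sketch only -- names (SensitiveRoutingPolicy, PeerStats) and all
// thresholds are invented for this example, not existing Freenet code or tuned values.
public class SensitiveRoutingPolicy {

    // Assumed thresholds; real values would need measurement and simulation.
    private static final long   MIN_CONNECTED_MILLIS    = 24L * 60 * 60 * 1000; // connected for at least a day
    private static final double MIN_MEASURED_UPTIME     = 0.8;                  // fraction of time observed online
    private static final long   MIN_SUCCESSFUL_REQUESTS = 100;                  // proven track record
    private static final long   MIN_BYTES_PER_SECOND    = 8 * 1024;             // minimum sustained bandwidth

    /** May this peer carry "sensitive" traffic, i.e. high-HTL requests/inserts and tunnel hops? */
    public boolean eligibleForSensitiveTraffic(PeerStats peer, long now) {
        if (now - peer.connectedSince < MIN_CONNECTED_MILLIS) return false;   // connected too recently
        if (peer.measuredUptime() < MIN_MEASURED_UPTIME) return false;        // low (locally measured) uptime
        if (peer.successfulRequests < MIN_SUCCESSFUL_REQUESTS) return false;  // no useful track record yet
        if (peer.measuredBandwidth() < MIN_BYTES_PER_SECOND) return false;    // too little measured bandwidth
        return true;
    }

    /** Minimal stand-in for the per-peer bookkeeping a node would keep anyway. */
    public static class PeerStats {
        long connectedSince;     // when this connection was established (millis)
        long onlineMillis;       // time we have observed the peer online
        long observedMillis;     // total observation window
        long successfulRequests; // requests this peer served usefully
        long bytesTransferred;   // payload moved over the observation window

        double measuredUptime() {
            return observedMillis == 0 ? 0.0 : (double) onlineMillis / observedMillis;
        }

        double measuredBandwidth() {
            return observedMillis == 0 ? 0.0 : bytesTransferred / (observedMillis / 1000.0);
        }
    }
}

The same predicate could gate both the high-HTL routing decision and tunnel-hop selection, so the two notions of "sensitive data" stay consistent. The global (seednode / shadow-node) checks would sit on top of this, since purely local measurements can be gamed by a node that only ever talks to the attacker's own nodes.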
On the other hand, if IP-block-level limits only apply to a higher tier of nodes, perhaps they can be more aggressive. We could make opennet identities go away (or become ineligible to be core nodes) after some period of downtime, further limiting the "here's one I created earlier" trick, but this would mean old stores lose some of their value. However, any such mechanism can likely be worked around if creating new identities is too cheap. For example, the attacker could keep on creating nodes until he gets one that has the desired location, and/or is verified by one of his own nodes.

One possibly serious problem: how do we ensure that newbie nodes have some "old" peers? Especially during a slashdot, where we want the newbies to take as much of the load as possible. Hence my original proposal to have every node that isn't core opennet or darknet be transient... But we could rearrange things so we have e.g. separate limits for core and non-core peers...

> (2) I'm no longer so quick to presume that "paying for opennet will make it
> faster". Unless, that is, we have evidence to indicate the network *IS
> CURRENTLY* under a Sybil attack. Common-sensically, wouldn't presenting a
> barrier to entry make things slower?

Performance requirements for opennet *might* make things faster; slow nodes can slow things down globally, and the reason Freenet is unable to saturate a fast link isn't CPU usage! Ideally load management would be perfectly able to deal with slow nodes... in practice I doubt this is true. Paying to join opennet would certainly not improve performance directly, unless we are being attacked in subtle ways intended to slow down development by keeping us chasing our tails after e.g. opennet bootstrapping problems caused by subtle announcement DoS'es. That is possible, but I have no evidence for it. Some of these mechanisms might make such dirty tricks easier to detect, and thus deter them.

> Whereas I recall the consensus (or... uncontested statement from Matthew)
> being that the primary cause for network slowness being the inability to find
> data, it may be good to consider:

Yes, individual requests for unpopular data tend to fail to make progress because the data is simply not available...

> (3) refusing (or heavily metering?) new connections from any FOAF. If done
> well, this might help MAST, but I see this being primarily a network
> acceleration... because I theorize that (apart from data being inserted into
> transient nodes), the primary cause for request failure would be
> "over-connectedness"... that the current pattern generates a huge number of
> three-node-triangles (A-B-C-A), that kill requests as they approach their
> destination (via HTL exhaustion).

I disagree. That's exactly what we want for routing to work: lots of short links and a smaller number of long links. We use the long links while we are far away, but when we are close to the target we need to be able to follow short links.

Of course, if we're getting loops we need to deal with that. We still reject loops; the change we made after the traceback paper was that we don't remember requests after they have finished locally, so it is possible to traverse the same dead-end pocket twice. This should only happen on poorly connected darknets, so I doubt it's a serious problem on the current network.

Perhaps you think we have too many short links? I won't take that seriously unless you can prove it with some sort of data. Nodes that do a lot of filesharing tend to have too many long links, hence the proposal not to path fold at high HTL.
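To illustrate why we need the short links near the target: routing is greedy on the location circle, so once we are close, only a peer even closer can make progress, and those are by definition short links. A toy sketch (invented names, not real routing code, and ignoring backoff, load, recently-failed and so on):

// Toy illustration of greedy routing on the [0,1) location circle.
// Not Freenet code; the peer representation and selection rule are simplified.
import java.util.List;

public class GreedyRouting {

    /** Circular distance between two locations in [0,1). */
    static double ringDistance(double a, double b) {
        double d = Math.abs(a - b);
        return Math.min(d, 1.0 - d);
    }

    /**
     * Pick the peer whose location is closest to the target.
     * Far from the target, the winner is usually a long link; close to the
     * target, only short links can still reduce the distance -- which is why
     * we need plenty of them, even if that produces A-B-C-A triangles.
     */
    static int closestPeer(double target, List<Double> peerLocations) {
        int best = -1;
        double bestDist = Double.MAX_VALUE;
        for (int i = 0; i < peerLocations.size(); i++) {
            double d = ringDistance(peerLocations.get(i), target);
            if (d < bestDist) {
                bestDist = d;
                best = i;
            }
        }
        return best;
    }
}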
> (4) protecting (that is, not replacing) the "nearest veteran node" on either
> side of our location. This is effectively what the simulators do, as a ring
> topology is somewhat ideal. As I see it, this would assure that the network
> has at least one valid and findable route to every part of the address space.

This is the opposite of what you just said. :) We need the short links precisely so that we have a ring, IIRC.

> I'd love to hear any thoughts on these.
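PS: for what it's worth, "the nearest node on either side of our location" is just the peer with the smallest clockwise gap and the peer with the smallest anti-clockwise gap from us; a protection rule would simply refuse to drop those two links. A tiny sketch (illustrative only, invented names, not real code):

// Illustrative only: find the nearest peer on each side of our location on the
// [0,1) ring -- the two links a "protect the nearest veteran" rule would keep.
import java.util.List;

public class RingNeighbours {

    /** Clockwise gap from a to b on the [0,1) circle. */
    static double clockwiseGap(double a, double b) {
        double d = b - a;
        return d >= 0 ? d : d + 1.0;
    }

    /** Returns indices {clockwiseNeighbour, antiClockwiseNeighbour}, or -1 if no peers. */
    static int[] nearestOnEitherSide(double myLocation, List<Double> peerLocations) {
        int cw = -1, ccw = -1;
        double bestCw = Double.MAX_VALUE, bestCcw = Double.MAX_VALUE;
        for (int i = 0; i < peerLocations.size(); i++) {
            double gapCw = clockwiseGap(myLocation, peerLocations.get(i));
            double gapCcw = clockwiseGap(peerLocations.get(i), myLocation);
            if (gapCw < bestCw) { bestCw = gapCw; cw = i; }
            if (gapCcw < bestCcw) { bestCcw = gapCcw; ccw = i; }
        }
        return new int[] { cw, ccw };
    }
}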
