On 31/03/14 17:54, Robert Hailey wrote:
> I've only been skimming this thread, so excuse me if this is a bit off-base.
>
> (1) I have noticed a reference to "not routing inserts to new nodes" (wrt
> MAST). I have recently required such a decision (as to whether a node is
> "new"), so I think a core idea of node veterancy is a good idea... and if it
> does not affect the routing of "GET" requests (which lets them earn a veteran
> status), I don't think that modifying the routing for INSERTs would have any
> negative effect. It's basically just saying that INSERTs must go into the
> "known good" network.

Requests and inserts need to go down the same routes if data is to be findable quickly.

Although conceivably we could do tricks like side-broadcasts: we can't share Bloom filters with nodes that aren't high-uptime, but we can simply send them our requests, in an efficient (= slowish) way, at the same time as forwarding the request. If one of those peers has the data, we return it both forwards and backwards. This would be reasonably safe if we have tunnels, or for low-HTL requests; and requests could be represented fairly compactly if they're not time-sensitive. TODO: file various bugs...
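Very roughly, a side-broadcast could look something like this (a sketch only; the class and message names are invented, not existing Freenet code, and the batching details would obviously need thought):

// Illustrative sketch only -- class, method and message names are invented here,
// not existing Freenet code. Idea: while a request is forwarded along the normal
// route, remember its key, and later push the accumulated keys to low-uptime
// peers in one compact, low-priority batch. A peer that happens to hold the data
// replies separately, and the node can then return it both forwards and backwards.
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class SideBroadcast {

    /** Stand-in for whatever connection object the node keeps per peer. */
    interface Peer {
        void sendKeyBatch(byte[][] keys);
    }

    /** Keys we still owe to low-uptime peers; flushed lazily so the overhead stays small. */
    private final Queue<byte[]> pendingKeys = new ConcurrentLinkedQueue<>();

    /** Called when a request is forwarded along the normal route. */
    public void onForwardRequest(byte[] routingKey) {
        pendingKeys.add(routingKey);
    }

    /** Called from a slow timer: send everything accumulated so far as one batch. */
    public void flush(List<Peer> lowUptimePeers) {
        List<byte[]> drained = new ArrayList<>();
        for (byte[] key; (key = pendingKeys.poll()) != null; ) {
            drained.add(key);
        }
        if (drained.isEmpty()) return;
        byte[][] batch = drained.toArray(new byte[0][]);
        for (Peer p : lowUptimePeers) {
            p.sendKeyBatch(batch);
        }
    }
}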
The proposal is *not to route sensitive data to new nodes*.

"Sensitive data" is:
- High HTL requests (ideally, "high HTL" meaning "prior to reaching the closest node to the target location", though we may only restrict the first few hops). Note that requests will complete before or very soon after this, but inserts will go many more hops.
- Tunnels. (These need high uptime and good performance anyway, or we'll be constantly creating new ones.)

"New nodes" means (a rough eligibility check along these lines is sketched below):
- Nodes which are actually new:
  -- Creation time too recent.
  -- Connected time (to a specific node) too short.
- Low uptime nodes.
  -- Right now uptime is self-reported.
  -- We can measure it on a per-peer level.
  -- We could verify it on a global level, e.g. using shadow nodes assigned by the seednodes.
- Nodes which don't have a proven track record of usefully serving requests (i.e. actual performance).
  -- E.g. if a node is still connected after some threshold time period, it's not "new".
- Nodes with less than a certain amount of bandwidth.
  -- This should be actually measured by a peer.
  -- There may be a minimum requirement per peer regardless of traffic; this makes it easy to measure, helps with traffic analysis, and sometimes we will be able to use the bandwidth for useful things, so it's not as bad as it sounds.
  -- On a global level, we need to publish a list of peers periodically for tunneling. We could include signed bandwidth stats. However, these can be faked if the node is "on ice" and only connected to the attacker's own nodes, so we need the per-node limits too.

Much of the above assumes new infrastructure: when creating an opennet node, the seednodes give you a certificate. Ideally this bootstrapping process would be costly; at the very least, it's limited by the capacity of the seednodes and limited per IP address per unit time. Also, the seednodes might allocate one or more "shadow nodes", as in ShadowWalker, to help keep peers lists honest, or possibly to verify claims about uptime.

Hence for any mobile attack (e.g. MAST, but also statistical attacks on random routing) where the attacker (Mallory) needs to be able to "move", i.e. create new locations (possibly abandoning old ones), he will need to wait until his newly activated identities obtain a sufficient reputation, and spend resources (bandwidth, and probably answering requests). He can't just announce (or path fold) to the target nodes and use them immediately. This could slow down MAST enormously, but it may require that the network's evolution be relatively slow - so fast growth could be a problem, especially during a slashdotting (maybe we'd reduce the limits temporarily?). This helps even if creation of identities is cheap, or if Mallory has a stash he created earlier (we want the node to have been useful *recently*, not just in the distant past, before we trust it). Requiring that each core node have a unique IP address would also help, but might mean some centralisation, and probably doesn't work with IPv6.

It is obviously going to be easier to bypass this sort of thing if creating identities is too cheap; we still need to think about that. If the main limiting factor is per-IP-per-unit-time, then the attacker can simply change his IP address (e.g. by resetting a domestic DSL connection), and e.g. keep creating new identities until he gets the location he wants and/or the assigned shadow node is one of his, etc. Limits at the IP block level are problematic and need manual supervision.
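The per-peer side of this could look something like the following (Java, but purely a sketch: the class names, fields and thresholds are all made up for illustration, not existing code, and the real numbers would have to come from measurement and simulation). The point is just that each criterion above maps to a cheap, locally measurable test:

// Illustrative sketch only -- names (SensitiveRoutingPolicy, PeerStats) and all
// thresholds are invented for this example, not existing Freenet code or tuned values.
public class SensitiveRoutingPolicy {

    // Assumed thresholds; real values would need measurement and simulation.
    private static final long   MIN_CONNECTED_MILLIS    = 24L * 60 * 60 * 1000; // connected for at least a day
    private static final double MIN_MEASURED_UPTIME     = 0.8;                  // fraction of time observed online
    private static final long   MIN_SUCCESSFUL_REQUESTS = 100;                  // proven track record
    private static final long   MIN_BYTES_PER_SECOND    = 8 * 1024;             // minimum sustained bandwidth

    /** May this peer carry "sensitive" traffic, i.e. high-HTL requests/inserts and tunnel hops? */
    public boolean eligibleForSensitiveTraffic(PeerStats peer, long now) {
        if (now - peer.connectedSince < MIN_CONNECTED_MILLIS) return false;   // connected too recently
        if (peer.measuredUptime() < MIN_MEASURED_UPTIME) return false;        // low (locally measured) uptime
        if (peer.successfulRequests < MIN_SUCCESSFUL_REQUESTS) return false;  // no useful track record yet
        if (peer.measuredBandwidth() < MIN_BYTES_PER_SECOND) return false;    // too little measured bandwidth
        return true;
    }

    /** Minimal stand-in for the per-peer bookkeeping a node would keep anyway. */
    public static class PeerStats {
        long connectedSince;     // when this connection was established (millis)
        long onlineMillis;       // time we have observed the peer online
        long observedMillis;     // total observation window
        long successfulRequests; // requests this peer served usefully
        long bytesTransferred;   // payload moved over the observation window

        double measuredUptime() {
            return observedMillis == 0 ? 0.0 : (double) onlineMillis / observedMillis;
        }

        double measuredBandwidth() {
            return observedMillis == 0 ? 0.0 : bytesTransferred / (observedMillis / 1000.0);
        }
    }
}

The same predicate could gate both the high-HTL routing decision and tunnel-hop selection, so the two notions of "sensitive data" stay consistent. The global (seednode / shadow-node) checks would sit on top of this, since purely local measurements can be gamed by a node that only ever talks to the attacker's own nodes.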
On the other hand, if IP-block-level limits only apply to a higher tier of nodes, perhaps they can be more aggressive. We could make opennet identities go away (or become ineligible to be core nodes) after some period of downtime, further limiting the "here's one I created earlier" trick, but this would mean old stores lose some of their value. However, any such mechanism can likely be worked around if creating new identities is too cheap. For example, the attacker could keep on creating nodes until he gets one that has the desired location, and/or is verified by one of his own nodes.

One possibly serious problem: how do we ensure that newbie nodes have some "old" peers? Especially during a slashdot, where we want the newbies to take as much of the load as possible. Hence my original proposal to have every node that isn't core opennet or darknet be transient... But we could rearrange things so we have e.g. separate limits for core and non-core peers...

> (2) I'm no longer so quick to presume that "paying for opennet will make it
> faster". Unless, that is, we have evidence to indicate the network *IS
> CURRENTLY* under a Sybil attack. Common-sensically, wouldn't presenting a
> barrier to entry make things slower?

Performance requirements for opennet *might* make things faster; slow nodes can slow things down globally, and the reason Freenet is unable to saturate a fast link isn't CPU usage! Ideally load management would be perfectly able to deal with slow nodes... in practice I doubt this is true. Paying to join opennet would certainly not improve performance directly, unless we are being attacked in subtle ways intended to slow down development by keeping us chasing our tails after e.g. opennet bootstrapping problems caused by subtle announcement DoS'es. That is possible, but I have no evidence for it. Some of these mechanisms might make such dirty tricks easier to detect, and thus deter them.

> Whereas I recall the consensus (or... uncontested statement from Matthew)
> being that the primary cause for network slowness being the inability to find
> data, it may be good to consider:

Yes, individual requests for unpopular data tend to fail to make progress because the data is simply not available...

> (3) refusing (or heavily metering?) new connections from any FOAF. If done
> well, this might help MAST, but I see this being primarily a network
> acceleration... because I theorize that (apart from data being inserted into
> transient nodes), the primary cause for request failure would be
> "over-connectedness"... that the current pattern generates a huge number of
> three-node-triangles (A-B-C-A), that kill requests as they approach their
> destination (via HTL exhaustion).

I disagree. That's exactly what we want for routing to work: lots of short links and a smaller number of long links. We use the long links while we are far away, but when we are close to the target we need to be able to follow short links.

Of course, if we're getting loops we need to deal with that. We still reject loops; the change we made after the traceback paper was that we don't remember requests after they have finished locally, so it is possible to traverse the same dead-end pocket twice. This should only happen on poorly connected darknets, so I doubt it's a serious problem on the current network.

Perhaps you think we have too many short links? I won't take that seriously unless you can prove it with some sort of data. Nodes that do a lot of filesharing tend to have too many long links, hence the proposal not to path fold at high HTL.
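To illustrate why we need the short links near the target: routing is greedy on the location circle, so once we are close, only a peer even closer can make progress, and those are by definition short links. A toy sketch (invented names, not real routing code, and ignoring backoff, load, recently-failed and so on):

// Toy illustration of greedy routing on the [0,1) location circle.
// Not Freenet code; the peer representation and selection rule are simplified.
import java.util.List;

public class GreedyRouting {

    /** Circular distance between two locations in [0,1). */
    static double ringDistance(double a, double b) {
        double d = Math.abs(a - b);
        return Math.min(d, 1.0 - d);
    }

    /**
     * Pick the peer whose location is closest to the target.
     * Far from the target, the winner is usually a long link; close to the
     * target, only short links can still reduce the distance -- which is why
     * we need plenty of them, even if that produces A-B-C-A triangles.
     */
    static int closestPeer(double target, List<Double> peerLocations) {
        int best = -1;
        double bestDist = Double.MAX_VALUE;
        for (int i = 0; i < peerLocations.size(); i++) {
            double d = ringDistance(peerLocations.get(i), target);
            if (d < bestDist) {
                bestDist = d;
                best = i;
            }
        }
        return best;
    }
}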
> (4) protecting (that is, not replacing) the "nearest veteran node" on either
> side of our location. This is effectively what the simulators do, as a ring
> topology is somewhat ideal. As I see it, this would assure that the network
> has at least one valid and findable route to every part of the address space.

This is the opposite of what you just said. :) We need the short links precisely so that we have a ring, IIRC.

> I'd love to hear any thoughts on these.
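PS: for what it's worth, "the nearest node on either side of our location" is just the peer with the smallest clockwise gap and the peer with the smallest anti-clockwise gap from us; a protection rule would simply refuse to drop those two links. A tiny sketch (illustrative only, invented names, not real code):

// Illustrative only: find the nearest peer on each side of our location on the
// [0,1) ring -- the two links a "protect the nearest veteran" rule would keep.
import java.util.List;

public class RingNeighbours {

    /** Clockwise gap from a to b on the [0,1) circle. */
    static double clockwiseGap(double a, double b) {
        double d = b - a;
        return d >= 0 ? d : d + 1.0;
    }

    /** Returns indices {clockwiseNeighbour, antiClockwiseNeighbour}, or -1 if no peers. */
    static int[] nearestOnEitherSide(double myLocation, List<Double> peerLocations) {
        int cw = -1, ccw = -1;
        double bestCw = Double.MAX_VALUE, bestCcw = Double.MAX_VALUE;
        for (int i = 0; i < peerLocations.size(); i++) {
            double gapCw = clockwiseGap(myLocation, peerLocations.get(i));
            double gapCcw = clockwiseGap(peerLocations.get(i), myLocation);
            if (gapCw < bestCw) { bestCw = gapCw; cw = i; }
            if (gapCcw < bestCcw) { bestCcw = gapCcw; ccw = i; }
        }
        return new int[] { cw, ccw };
    }
}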
