On Thursday 28 Feb 2013 14:34:58 Michael Grube wrote:
> I haven't had too much time to think about this. How would centralized
> reporting work? Seems like a malicious person could have a bunch of nodes
> join and simply report bad stats.

Right, sorry. What I meant was that we might have the "canary" nodes - nodes 
run by people we trust - report aggregated stats, or just ask individual 
users. Obviously anything like this would need to be hard to spam.

There was a proposal on FMS to upload stats with a CAPTCHA...
> 
> Just my feedback. I'll try to have some kind of decent response in the next
> 24 hours.
> 
> On Thu, Feb 28, 2013 at 9:31 AM, Matthew Toseland <t...@amphibian.dyndns.org
> > wrote:
> 
> > On Wednesday 27 Feb 2013 19:40:49 Matthew Toseland wrote:
> > > On Wednesday 27 Feb 2013 18:54:34 Matthew Toseland wrote:
> > > > operhiem1's graphs of probed total datastore size have been attacked
> > recently by nodes returning bogus store sizes (in the multi-petabyte
> > range). This caused a sudden jump in store sizes on the total store size
> > graph. He excluded outliers, and the spike went away, but now it's come
> > back.
> > > >
> > > > The simplest explanation is that the person whose nodes are returning
> > the bogus stats has hacked their node to report a bogus datastore size even
> > when it is merely relaying a probe request. Given that we use fairly high
> > HTLs (30?) for probes, a few such nodes can affect enough traffic to have a
> > big impact on the stats.
> > > >
> > > > Total store size stats don't matter that much, but we need to use
> > probe stats for a couple of things that do:
> > > > 1. Pitch Black prevention will require probing for the typical
> > distance between a node and its peers. Granted, on darknet it's harder for
> > an attacker to have a significant number of edges / nodes distributed
> > across the keyspace.
> > > > 2. I would like to be able to test empirically whether a given change
> > works. Overall performance fluctuates too wildly based on too many factors,
> > so probing random nodes for a single statistic (e.g. the proportion of
> > requests rejected) seems the best way to sanity check a network-level
> > change. If the stats can be perverted this easily then we can't rely on
> > them, so empiricism doesn't work.
> > > >
> > > > So how can we deal with this problem?
> > > >
> > > > We can safely get stats from a randomly chosen target location, by
> > routing several parts of a probe request randomly and then towards that
> > location. The main problems with this are:
> > > > - It gives too much control. Probes are supposed to be random.
> > > > - A random location may not be a random node, e.g. for Pitch Black
> > countermeasures when we are being attacked.
> > > >
> > > > For empiricism I guess we probably want to just have a relatively
> > small number of trusted nodes which insert their stats regularly - "canary"
> > nodes?
> > > >
> > > Preliminary conclusions, talking to digger3:
> > >
> > > There are 3 use cases.
> > >
> > > 1) Empirical confirmation when we do a build that changes something.
> > Measure something to see if it worked. *NOT* overall performance, low level
> > stuff that should show a big change.
> > > => We can use "canary" nodes for this, run by people we trust. Some will
> > need to run artificial configs, and they're probably not representative of
> > the network as a whole.
> > > => TODO: We should try to organise this explicitly, preferably before
> > trying the planned AIMD changes...
> > > 2) Pitch Black location distance detection.
> > > => Probably OK, because it's hard to get a lot of nodes in random places
> > on the keyspace on darknet.
> > > 3) General stats: Datastore, bandwidth, link length distributions, etc.
> > This stuff can and should affect development.
> > > => This is much harder. *Maybe* fetch from a random location, but even
> > there it's problematic?
> > > => We can however improve this significantly by discarding a larger
> > number of outliers.
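
The outlier-discarding idea could look something like a trimmed mean - a 
minimal sketch, not the actual probe-aggregation code; the trim fraction and 
the sample values below are made up for illustration:

```python
def trimmed_mean(samples, trim=0.1):
    """Discard the top and bottom `trim` fraction of samples, then average.
    A bounded number of attacker-controlled answers can only shift the
    result if they outnumber the trimmed tail."""
    s = sorted(samples)
    k = int(len(s) * trim)
    kept = s[k:len(s) - k] if k else s
    return sum(kept) / len(kept)

# 95 honest nodes reporting ~50 GiB stores, 5 hacked nodes reporting
# multi-petabyte sizes (in GiB): the bogus answers land in the trimmed tail.
print(trimmed_mean([50.0] * 95 + [4e6] * 5, trim=0.1))  # stays at 50.0
```

The catch is choosing the trim fraction: trim too little and a determined 
attacker still gets through; trim too much and we throw away real variation.
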
> > > Given that probes have HTL 30, and assuming opennet so nodes are
> > randomly distributed:
> > > 10 nodes could corrupt 5% of probes
> > > 21 nodes could corrupt 10% of probes
> > > 44 nodes could corrupt 20% of probes.
> > >
> > > Also note that it depends on what the stat is - the probe request stats
> > are a percentage from 0 to 100, so much less vulnerable than datastore
> > size, which can be *big*.
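
For reference, those figures match a simple model: with k attacker nodes 
among N randomly distributed opennet nodes and ~30 independently chosen 
relays per probe, the chance that a probe passes through at least one 
attacker node is 1 - (1 - k/N)^30. The N of ~6000 below is an assumed 
network size, chosen because it reproduces the quoted percentages:

```python
def corrupted_fraction(malicious, network_size, htl=30):
    """Chance that at least one of `htl` uniformly random relays on a
    probe's path is attacker-controlled."""
    return 1 - (1 - malicious / network_size) ** htl

# assumed network size of ~6000 opennet nodes
for k in (10, 21, 44):
    print(f"{k} nodes -> {100 * corrupted_fraction(k, 6000):.0f}% of probes")
```
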
> > >
> > One proposal: use low-HTL probes from each node (possibly combined with
> > central reporting, possibly not):
> >
> > https://bugs.freenetproject.org/view.php?id=5643
> >
> 


_______________________________________________
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
