operhiem1's graphs of probed total datastore size have been attacked recently 
by nodes returning bogus store sizes (in the multi-petabyte range). This caused 
a sudden jump in store sizes on the total store size graph. He excluded 
outliers, and the spike went away, but now it's come back.

The simplest explanation is that the person whose nodes are returning the bogus 
stats has modified their node to answer with bogus datastore stats even when it 
is only relaying a probe request, not just when the probe ends there. Given we 
use fairly high HTLs (30?) for probes, this can affect enough traffic to have a 
big impact on stats.
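
To get a rough feel for the scale, here is a back-of-the-envelope sketch. It 
assumes (my simplification, not how probes are actually routed) that each hop 
lands on an independent, uniformly random node, and that a hacked node answers 
every probe it relays. The function name and the example fractions are made up 
for illustration, and the HTL of 30 is only the guess from above.

```python
def hijack_probability(malicious_fraction, htl):
    """Chance that a probe meets at least one hacked node within `htl` hops.

    Assumes each hop is an independent uniform draw over all nodes, which is
    a simplification of real probe routing.
    """
    if not 0.0 <= malicious_fraction <= 1.0:
        raise ValueError("malicious_fraction must be between 0 and 1")
    if htl < 0:
        raise ValueError("htl must be non-negative")
    # Probability that every one of the htl hops is honest, then complement.
    return 1.0 - (1.0 - malicious_fraction) ** htl


if __name__ == "__main__":
    for fraction in (0.001, 0.01, 0.05):
        p = hijack_probability(fraction, htl=30)
        print(f"{fraction:.1%} hacked nodes -> {p:.1%} of probes answered by them")
```

Under those assumptions, 1% hacked nodes would answer roughly a quarter of all 
probes, which is far more than an outlier filter can absorb.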

Total store size stats don't matter that much, but we need to use probe stats 
for a couple of things that do:
1. Pitch Black prevention will require probing for the typical distance between 
a node and its peers. Granted, on darknet it's harder for an attacker to have a 
significant number of edges or nodes distributed across the keyspace.
2. I would like to be able to test empirically whether a given change works. 
Overall performance fluctuates too wildly based on too many factors, so probing 
random nodes for a single statistic (e.g. the proportion of requests rejected) 
seems the best way to sanity check a network-level change. If the stats can be 
perverted this easily then we can't rely on them, so empiricism doesn't work.

So how can we deal with this problem?

We can safely get stats from a randomly chosen target location, by routing 
several parts of a probe request randomly and then towards that location. The 
main problems with this are:
- It gives too much control. Probes are supposed to be random.
- A random location may not be a random node, e.g. for Pitch Black 
countermeasures when we are being attacked.

For empiricism I guess we probably want to just have a relatively small number 
of trusted nodes which insert their stats regularly - "canary" nodes?
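
A minimal sketch of what reading canary stats could look like, assuming each 
canary publishes a (node id, value) pair for one statistic and that we keep a 
hand-maintained set of trusted ids. Both the report format and the names here 
are hypothetical; nothing like this exists in the code today.

```python
from statistics import median


def canary_estimate(reports, trusted_ids):
    """Median of one statistic, counting only reports from trusted canaries.

    `reports` is an iterable of (node_id, value) pairs; `trusted_ids` is the
    set of canary node ids we are willing to believe. Both are hypothetical.
    """
    values = [value for node_id, value in reports if node_id in trusted_ids]
    if not values:
        raise ValueError("no reports from trusted canary nodes")
    # The median tolerates a minority of misbehaving or broken canaries.
    return median(values)


if __name__ == "__main__":
    # Example: proportion of requests rejected, as reported by each node.
    reports = [("a", 0.10), ("b", 0.12), ("x", 0.99), ("c", 0.11)]
    print(canary_estimate(reports, trusted_ids={"a", "b", "c"}))
```

Reports from ids outside the trusted set are ignored entirely, so a bogus 
value from an unknown node cannot move the result.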
