Re: [freenet-dev] Google Summer of Code 2013
On 14/02/13 12:42, Matthew Toseland wrote:
> Organisations can apply from 18-29 of March. We have at least one very enthusiastic student (voxsim) who's been working with us with a view to participating. However, he will need a mentor. I don't have time to be primary mentor to anyone this year. I guess I can do the organisation admin, provided that I don't have to mentor as well. Who wants to be involved in GSoC this year?

I'm up for mentoring again.

--
GPG: 4096R/5FBBDBCE
https://github.com/infinity0
https://bitbucket.org/infinity0
https://launchpad.net/~infinity0

___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] The current store size stats attack and Pitch Black
On Wednesday 27 Feb 2013 19:40:49 Matthew Toseland wrote:
> On Wednesday 27 Feb 2013 18:54:34 Matthew Toseland wrote:
>> operhiem1's graphs of probed total datastore size have recently been attacked by nodes returning bogus store sizes (in the multi-petabyte range). This caused a sudden jump on the total store size graph. He excluded outliers and the spike went away, but now it has come back. The simplest explanation is that the person whose nodes are returning the bogus stats has hacked their node to return them even when it is only relaying a probe request. Given that we use fairly high HTLs (30?) for probes, this can affect enough traffic to have a big impact on the stats.
>>
>> Total store size stats don't matter that much, but we need to use probe stats for a couple of things that do:
>>
>> 1. Pitch Black prevention will require probing for the typical distance between a node and its peers. Granted, on darknet it's harder for an attacker to have a significant number of edges/nodes distributed across the keyspace.
>>
>> 2. I would like to be able to test empirically whether a given change works. Overall performance fluctuates too wildly, based on too many factors, so probing random nodes for a single statistic (e.g. the proportion of requests rejected) seems the best way to sanity-check a network-level change. If the stats can be perverted this easily then we can't rely on them, and empiricism doesn't work.
>>
>> So how can we deal with this problem? We can safely get stats from a randomly chosen target location, by routing several parts of a probe request randomly and then towards that location. The main problems with this are:
>> - It gives too much control. Probes are supposed to be random.
>> - A random location may not be a random node, e.g. for Pitch Black countermeasures when we are being attacked.
>>
>> For empiricism I guess we probably want to just have a relatively small number of trusted nodes which insert their stats regularly - canary nodes?
>
> Preliminary conclusions, talking to digger3: there are 3 use cases.
>
> 1) Empirical confirmation when we do a build that changes something. Measure something to see whether it worked - *NOT* overall performance, but low-level stuff that should show a big change.
> = We can use canary nodes for this, run by people we trust. Some will need to run artificial configs, and they're probably not representative of the network as a whole.
> = TODO: We should try to organise this explicitly, preferably before trying the planned AIMD changes...
>
> 2) Pitch Black location distance detection.
> = Probably OK, because it's hard to get a lot of nodes in random places on the keyspace on darknet.
>
> 3) General stats: datastore, bandwidth, link length distributions, etc. This stuff can and should affect development.
> = This is much harder. *Maybe* fetch from a random location, but even that is problematic.
> = We can however improve this significantly by discarding a larger number of outliers. Given that probes have HTL 30, and assuming opennet, so nodes are randomly distributed:
>   10 nodes could corrupt 5% of probes
>   21 nodes could corrupt 10% of probes
>   44 nodes could corrupt 20% of probes
> Also note that it depends on what the stat is - the probe request stats are a percentage from 0 to 100, so they are much less vulnerable than datastore size, which can be *big*.

One proposal: use low-HTL probes from each node (possibly combined with central reporting, possibly not): https://bugs.freenetproject.org/view.php?id=5643
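The 10/21/44-node figures above can be reproduced with a simple model: a probe takes HTL independent, roughly uniform hops, and is corrupted if any hop lands on a malicious node. The network size N below is my assumption (the thread never states one); N of roughly 6000 happens to make the quoted numbers line up.

```python
# Sketch, not Freenet code: expected fraction of probes corrupted by k
# malicious nodes, assuming each of HTL hops is an independent uniform
# draw from an opennet of N nodes (both N and the independence are
# simplifying assumptions on my part).

HTL = 30    # hops per probe, as stated in the thread
N = 6000    # hypothetical network size, chosen to fit the quoted figures

def corrupted_fraction(k, n=N, htl=HTL):
    """P(at least one of htl random hops hits one of k bad nodes)."""
    return 1.0 - (1.0 - k / n) ** htl

for k in (10, 21, 44):
    print(f"{k:2d} malicious nodes -> {corrupted_fraction(k):.1%} of probes")
```

Under these assumptions the formula gives roughly 4.9%, 10.0% and 19.8%, matching the thread's 5%/10%/20% to within rounding. It also shows why unbounded stats are the worst case: each corrupted probe can inject an arbitrarily large store size, not merely a wrong value in a bounded 0-100% range.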
Re: [freenet-dev] The current store size stats attack and Pitch Black
I haven't had too much time to think about this. How would centralized reporting work? Seems like a malicious person could have a bunch of nodes join and simply report bad stats.

Just my feedback. I'll try to have some kind of decent response in the next 24 hours.

On Thu, Feb 28, 2013 at 9:31 AM, Matthew Toseland t...@amphibian.dyndns.org wrote:
> [snip]
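The outlier-discarding idea that comes up repeatedly in this thread (operhiem1 "excluded outliers"; later, "discarding a larger number of outliers") can be sketched as a trimmed mean: sort the samples and drop a fixed fraction at each end before averaging. The sample values and trim fraction below are illustrative assumptions, not measurements from the network.

```python
def trimmed_mean(samples, trim=0.1):
    """Mean after discarding the top and bottom `trim` fraction of samples."""
    s = sorted(samples)
    k = int(len(s) * trim)              # how many samples to drop at each end
    kept = s[k:len(s) - k] if k else s
    return sum(kept) / len(kept)

# 95 honest nodes reporting ~50 GiB stores vs 5 attackers claiming ~5 PiB
samples = [50.0] * 95 + [5_000_000.0] * 5   # GiB, made-up numbers

print(f"raw mean:     {sum(samples) / len(samples):,.1f} GiB")  # dragged up by attackers
print(f"trimmed mean: {trimmed_mean(samples):,.1f} GiB")        # close to the honest value
```

A median is the limiting case of this and is more robust still; either way, note that trimming only helps while attackers corrupt less than the trimmed fraction of samples, which is exactly why the thread worries about how cheaply a few high-HTL nodes can taint a large share of probes.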
Re: [freenet-dev] The current store size stats attack and Pitch Black
On Thursday 28 Feb 2013 14:34:58 Michael Grube wrote:
> I haven't had too much time to think about this. How would centralized reporting work? Seems like a malicious person could have a bunch of nodes join and simply report bad stats.

Right, sorry. What I meant was we might have the canary nodes - nodes run by people we trust - report aggregated stats, or just ask individual users. Obviously anything would need to be hard to spam. There was a proposal on FMS to upload stats with a CAPTCHA...

> Just my feedback. I'll try to have some kind of decent response in the next 24 hours.
>
> [snip]

One proposal: use low-HTL probes from each node (possibly combined with central reporting, possibly not): https://bugs.freenetproject.org/view.php?id=5643
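One way to make the canary-node reporting above "hard to spam" is for the aggregator to accept a stats report only when it is authenticated by a key belonging to a known canary, so random nodes joining the network cannot inject bad stats. The sketch below uses an HMAC with per-canary shared secrets purely as an illustration; none of these names exist in Freenet, and a real design would more likely reuse the nodes' existing signing keys.

```python
import hashlib
import hmac
import json

# Hypothetical: the aggregator holds one shared secret per trusted canary node.
TRUSTED_CANARIES = {"canary-01": b"secret-key-01", "canary-02": b"secret-key-02"}

def sign_report(node_id, stats, key):
    """Canary side: serialise the stats and attach an HMAC tag."""
    payload = json.dumps({"node": node_id, "stats": stats}, sort_keys=True).encode()
    return payload, hmac.new(key, payload, hashlib.sha256).hexdigest()

def accept_report(node_id, payload, tag):
    """Aggregator side: only count stats from known canaries with a valid MAC."""
    key = TRUSTED_CANARIES.get(node_id)
    if key is None:
        return False  # unknown node: a spammer's report is simply ignored
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(tag, expected)

payload, tag = sign_report("canary-01", {"store_gib": 50}, b"secret-key-01")
assert accept_report("canary-01", payload, tag)       # trusted, valid MAC
assert not accept_report("mallory", payload, tag)     # unknown node: rejected
```

This blocks the "bunch of nodes join and report bad stats" attack Michael describes, at the cost that the canary set is centrally chosen - which is exactly the trust trade-off the CAPTCHA/FMS proposal tries to soften.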
Re: [freenet-dev] Maven revisited
I meant that more as a question than a statement. But if the horse is starting to die, I'll avoid beating it.

On Sat, 02 Feb 2013 07:23:42 + Ximin Luo infini...@gmx.com wrote:
> Agreed; maven is better than our current build scripts in every other respect, but we MUST NOT use it until secure downloads are implemented.
>
> On 02/02/13 05:12, Travis Wellman wrote:
>> How is maven different than ruby gems? http://venturebeat.com/2013/01/30/rubygems-org-hacked-interrupting-heroku-services-and-putting-millions-of-sites-using-rails-at-risk/

http://traviswellman.com/