Re: [freenet-dev] Statistics Project Update #1
On Thursday 10 May 2012 01:08:45 Evan Daniel wrote:
On Tue, May 1, 2012 at 7:48 AM, Zlatin Balevsky zlat...@gmail.com wrote:
On 04/28/2012 06:56 PM, Zlatin Balevsky wrote:

In Gnutella we observed that long-lived nodes tend to be better connected and that they also cluster with other high-uptime nodes. If the same is true for Freenet it's a good idea to keep an eye out for side effects as you tweak the behavior.

Good to know - I'll look for that. Are there any particular effects you had in mind? The Metropolis-Hastings correction in the new probes should produce a fairly uniform distribution of endpoints despite clustering and well-connected nodes, but explicitly simulating the effects of high uptime could be helpful.

There was a study showing that higher uptime correlated with the probability of further uptime, so if you shift bias towards low-uptime nodes you could end up with lower overall reliability. It was done on a different network with different usage patterns but imho you should definitely treat node uptime as a parameter in any simulations.

MH should produce a good simple random sample from all nodes currently online, provided that the walk is of sufficient length, regardless of clustering effects. If there are partitioning effects, those will make the required walk length to get good dispersion longer, in a way that might be somewhat difficult to measure, but as long as the network is not completely partitioned, a sufficient walk length will produce a good sample. The fact that a large sample must be taken over an extended period means that low-uptime nodes will have a somewhat disproportionately lower chance of being in the sample (I think... need to do math here), but that isn't a huge problem.

Can we tell whether the network is partitioned? I guess that's one of the key outcomes? I.e. are there barriers between different parts of the network, are there large areas with similar locations but few connections, etc?
___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
Re: [freenet-dev] Statistics Project Update #1
On Tuesday 01 May 2012 00:27:26 Michael Grube wrote:
On Mon, Apr 30, 2012 at 7:06 PM, Steve Dougherty st...@asksteved.com wrote:
On 04/28/2012 06:56 PM, Zlatin Balevsky wrote:

In Gnutella we observed that long-lived nodes tend to be better connected and that they also cluster with other high-uptime nodes. If the same is true for Freenet it's a good idea to keep an eye out for side effects as you tweak the behavior.

Good to know - I'll look for that. Are there any particular effects you had in mind? The Metropolis-Hastings correction in the new probes should produce a fairly uniform distribution of endpoints despite clustering and well-connected nodes, but explicitly simulating the effects of high uptime could be helpful.

It occurs to me that the probe requests I hope to deprecate allow reconstructing the actual network topology - perhaps we could run simulations on top of it? The new probe requests are currently planned to not report degrees or link length distributions, [1] which as far as I can tell would mean no way to reconstruct the network as measured in simulation. Does it seem reasonable to omit the ability to gather such information?

I don't know, in the short run. In the long run we'd prefer not to expose the full topology for security reasons.

On 04/28/2012 03:51 PM, Michael Grube wrote:

Are you assuming opennet, darknet, a mix?

The simulation generates the graph in a way which effectively sort of assumes darknet: it assigns locations, then iterates over the network and connects nodes based on link length. [2] [3] I'm working under the assumption that when using the same degree distribution as the network is measured to have, this is an accurate enough approximation. I will include these and similar plots in this week's progress report.

Ah, ok. You're making a Kleinberg Small World graph. Nice.

My understanding is that a more thorough simulation of darknet would randomly assign locations and reassign them with the location-swapping algorithm used in Fred.

Right. Funnily enough, the swapping algorithm is most concisely described in the Pitch Black paper. I'd suggest that as a reference for quick implementation.

I think simulating opennet would be more work both to implement and to run, as it would mean implementing path-folding and randomly assigning locations and connections.

No doubt about that. Just curious. I did some work involving some very simple probes, but it was more of a theoretical simulation. We'll see how close theory matches reality I guess ;)

Obviously, opennet will have uptime issues (but hopefully they will settle after a reasonable period), and darknet presumably won't...
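For anyone implementing the swapping step mentioned above, here is a minimal sketch of the swap decision as I understand it from the published descriptions: two nodes compare the product of the distances to their peers before and after a hypothetical exchange of locations, swap if the product does not grow, and otherwise swap with probability before/after. The function names and data layout are made up for illustration, and the request and commitment messages that Fred exchanges around this decision are left out entirely.

```python
import random


def circular_distance(a, b):
    """Distance between two locations on the [0, 1) circle."""
    d = abs(a - b)
    return min(d, 1.0 - d)


def swap_products(a, b, locations, peers):
    """Return the peer-distance product before and after swapping a and b."""
    def product(location_of):
        result = 1.0
        for node in (a, b):
            for peer in peers[node]:
                result *= circular_distance(location_of(node), location_of(peer))
        return result

    before = product(lambda node: locations[node])
    # Hypothetical layout in which a and b have exchanged locations.
    swapped = {a: locations[b], b: locations[a]}
    after = product(lambda node: swapped.get(node, locations[node]))
    return before, after


def maybe_swap(a, b, locations, peers, rng):
    """Swap the locations of a and b if the acceptance rule allows it."""
    before, after = swap_products(a, b, locations, peers)
    if after <= before or rng.random() < before / after:
        locations[a], locations[b] = locations[b], locations[a]
        return True
    return False


if __name__ == "__main__":
    # Ring 0-1-2-3-0 where nodes 1 and 2 hold each other's ideal location.
    ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    locs = {0: 0.0, 1: 0.5, 2: 0.25, 3: 0.75}
    print(maybe_swap(1, 2, locs, ring, random.Random(0)), locs)
```

The random acceptance of worsening swaps is what lets the process escape poor local arrangements; a simulator would call something like maybe_swap repeatedly on randomly chosen pairs.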
Re: [freenet-dev] Statistics Project Update #1
On Tue, May 1, 2012 at 7:48 AM, Zlatin Balevsky zlat...@gmail.com wrote:
On 04/28/2012 06:56 PM, Zlatin Balevsky wrote:

In Gnutella we observed that long-lived nodes tend to be better connected and that they also cluster with other high-uptime nodes. If the same is true for Freenet it's a good idea to keep an eye out for side effects as you tweak the behavior.

Good to know - I'll look for that. Are there any particular effects you had in mind? The Metropolis-Hastings correction in the new probes should produce a fairly uniform distribution of endpoints despite clustering and well-connected nodes, but explicitly simulating the effects of high uptime could be helpful.

There was a study showing that higher uptime correlated with the probability of further uptime, so if you shift bias towards low-uptime nodes you could end up with lower overall reliability. It was done on a different network with different usage patterns but imho you should definitely treat node uptime as a parameter in any simulations.

MH should produce a good simple random sample from all nodes currently online, provided that the walk is of sufficient length, regardless of clustering effects. If there are partitioning effects, those will make the required walk length to get good dispersion longer, in a way that might be somewhat difficult to measure, but as long as the network is not completely partitioned, a sufficient walk length will produce a good sample. The fact that a large sample must be taken over an extended period means that low-uptime nodes will have a somewhat disproportionately lower chance of being in the sample (I think... need to do math here), but that isn't a huge problem.

Evan Daniel
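To make the point about walk length and partitioning concrete, here is a rough sketch, not taken from the simulator. It builds two fully connected clusters joined by a configurable number of bridge links, computes the exact endpoint distribution of an MH-corrected walk after a given number of hops, and reports its total variation distance from uniform. The graph shape, the function names and the standard acceptance rule min(1, deg(current)/deg(candidate)) are all assumptions made for illustration.

```python
def two_clusters(size, bridges):
    """Two cliques of `size` nodes joined by `bridges` one-to-one links."""
    graph = {}
    for offset in (0, size):
        for i in range(size):
            graph[offset + i] = [offset + j for j in range(size) if j != i]
    for i in range(bridges):
        graph[i].append(size + i)
        graph[size + i].append(i)
    return graph


def mh_transition(graph):
    """Transition matrix of the MH-corrected walk; nodes must be 0..n-1."""
    size = len(graph)
    matrix = [[0.0] * size for _ in range(size)]
    for node, peers in graph.items():
        for peer in peers:
            accept = min(1.0, len(peers) / len(graph[peer]))
            matrix[node][peer] += accept / len(peers)
        # Rejected proposals leave the walk where it is.
        matrix[node][node] = 1.0 - sum(matrix[node])
    return matrix


def distance_from_uniform(graph, start, steps):
    """Total variation distance from uniform after `steps` hops from `start`."""
    matrix = mh_transition(graph)
    size = len(matrix)
    dist = [0.0] * size
    dist[start] = 1.0
    for _ in range(steps):
        dist = [sum(dist[i] * matrix[i][j] for i in range(size))
                for j in range(size)]
    return 0.5 * sum(abs(p - 1.0 / size) for p in dist)


if __name__ == "__main__":
    for bridges in (1, 8):
        graph = two_clusters(8, bridges)
        for steps in (20, 400):
            print(bridges, steps, distance_from_uniform(graph, 0, steps))
```

With a single bridge the walk needs many more hops to spread evenly over both clusters than with eight bridges, but it still gets there eventually, which is the behaviour described above for a network that is poorly connected without being completely partitioned.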
Re: [freenet-dev] Statistics Project Update #1
On 04/30/2012 07:27 PM, Michael Grube wrote:

Right. Funnily enough, the swapping algorithm is most concisely described in the Pitch Black paper. I'd suggest that as a reference for quick implementation.

Thanks for the suggestion; will do.

On 05/01/2012 07:48 AM, Zlatin Balevsky wrote:

There was a study showing that higher uptime correlated with the probability of further uptime, so if you shift bias towards low-uptime nodes you could end up with lower overall reliability. It was done on a different network with different usage patterns but imho you should definitely treat node uptime as a parameter in any simulations.

So simulating this would mean an initial connection, then disconnection and reconnection cycles as low-uptime nodes be low-uptime?

As far as bias towards low-uptime nodes affecting reliability, it's worth noting that all this talk of Metropolis-Hastings is only relevant to the new probe routing and doesn't affect the existing routing behaviors. If this routing includes low-uptime nodes that's good! It would mean a more uniform selection of endpoints which allows more accurate and comprehensive measurement of the network.
Re: [freenet-dev] Statistics Project Update #1
On 04/28/2012 06:56 PM, Zlatin Balevsky wrote:

In Gnutella we observed that long-lived nodes tend to be better connected and that they also cluster with other high-uptime nodes. If the same is true for Freenet it's a good idea to keep an eye out for side effects as you tweak the behavior.

Good to know - I'll look for that. Are there any particular effects you had in mind? The Metropolis-Hastings correction in the new probes should produce a fairly uniform distribution of endpoints despite clustering and well-connected nodes, but explicitly simulating the effects of high uptime could be helpful.

There was a study showing that higher uptime correlated with the probability of further uptime, so if you shift bias towards low-uptime nodes you could end up with lower overall reliability. It was done on a different network with different usage patterns but imho you should definitely treat node uptime as a parameter in any simulations.
Re: [freenet-dev] Statistics Project Update #1
On 04/28/2012 06:56 PM, Zlatin Balevsky wrote:

In Gnutella we observed that long-lived nodes tend to be better connected and that they also cluster with other high-uptime nodes. If the same is true for Freenet it's a good idea to keep an eye out for side effects as you tweak the behavior.

Good to know - I'll look for that. Are there any particular effects you had in mind? The Metropolis-Hastings correction in the new probes should produce a fairly uniform distribution of endpoints despite clustering and well-connected nodes, but explicitly simulating the effects of high uptime could be helpful.

It occurs to me that the probe requests I hope to deprecate allow reconstructing the actual network topology - perhaps we could run simulations on top of it? The new probe requests are currently planned to not report degrees or link length distributions, [1] which as far as I can tell would mean no way to reconstruct the network as measured in simulation. Does it seem reasonable to omit the ability to gather such information?

On 04/28/2012 03:51 PM, Michael Grube wrote:

Are you assuming opennet, darknet, a mix?

The simulation generates the graph in a way which effectively sort of assumes darknet: it assigns locations, then iterates over the network and connects nodes based on link length. [2] [3] I'm working under the assumption that when using the same degree distribution as the network is measured to have, this is an accurate enough approximation. I will include these and similar plots in this week's progress report.

My understanding is that a more thorough simulation of darknet would randomly assign locations and reassign them with the location-swapping algorithm used in Fred. I think simulating opennet would be more work both to implement and to run, as it would mean implementing path-folding and randomly assigning locations and connections.
[1] https://bugs.freenetproject.org/view.php?id=3568
[2] https://github.com/Thynix/routing-simulator/blob/73dfd6c94156cef35815ac7de2fcfa934385ccae/Graph.java#L139
[3] https://github.com/Thynix/routing-simulator/blob/73dfd6c94156cef35815ac7de2fcfa934385ccae/Graph.java#L52
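As a rough illustration of "assign locations, then connect nodes based on link length", here is a small Kleinberg-style generator. It is only a sketch under stated assumptions and is not a port of the linked Graph.java: locations are evenly spaced on the [0, 1) circle, every node links to its successor on the circle, and each node then adds a few long links chosen with probability proportional to 1/distance. The names kleinberg_graph and circular_distance and the long_links parameter are made up for this example.

```python
import random


def circular_distance(a, b):
    """Distance between two locations on the [0, 1) circle."""
    d = abs(a - b)
    return min(d, 1.0 - d)


def kleinberg_graph(size, long_links, rng):
    """Return (locations, peers) for a one-dimensional small-world graph."""
    locations = [i / size for i in range(size)]
    peers = {i: set() for i in range(size)}
    # Short-range links: each node connects to its successor on the circle.
    for i in range(size):
        successor = (i + 1) % size
        peers[i].add(successor)
        peers[successor].add(i)
    # Long-range links: endpoint chosen with weight 1 / link length.
    for i in range(size):
        others = [j for j in range(size) if j != i]
        weights = [1.0 / circular_distance(locations[i], locations[j])
                   for j in others]
        for j in rng.choices(others, weights=weights, k=long_links):
            peers[i].add(j)
            peers[j].add(i)
    return locations, peers


if __name__ == "__main__":
    locs, graph = kleinberg_graph(200, 3, random.Random(7))
    print(sum(len(p) for p in graph.values()) / len(graph))  # mean degree
```

This draws link lengths from the theoretical 1/d distribution only. Matching a measured degree distribution, as described above, would need a per-node target degree instead of the fixed long_links count.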
Re: [freenet-dev] Statistics Project Update #1
On Mon, Apr 30, 2012 at 7:06 PM, Steve Dougherty st...@asksteved.com wrote:
On 04/28/2012 06:56 PM, Zlatin Balevsky wrote:

In Gnutella we observed that long-lived nodes tend to be better connected and that they also cluster with other high-uptime nodes. If the same is true for Freenet it's a good idea to keep an eye out for side effects as you tweak the behavior.

Good to know - I'll look for that. Are there any particular effects you had in mind? The Metropolis-Hastings correction in the new probes should produce a fairly uniform distribution of endpoints despite clustering and well-connected nodes, but explicitly simulating the effects of high uptime could be helpful.

It occurs to me that the probe requests I hope to deprecate allow reconstructing the actual network topology - perhaps we could run simulations on top of it? The new probe requests are currently planned to not report degrees or link length distributions, [1] which as far as I can tell would mean no way to reconstruct the network as measured in simulation. Does it seem reasonable to omit the ability to gather such information?

On 04/28/2012 03:51 PM, Michael Grube wrote:

Are you assuming opennet, darknet, a mix?

The simulation generates the graph in a way which effectively sort of assumes darknet: it assigns locations, then iterates over the network and connects nodes based on link length. [2] [3] I'm working under the assumption that when using the same degree distribution as the network is measured to have, this is an accurate enough approximation. I will include these and similar plots in this week's progress report.

Ah, ok. You're making a Kleinberg Small World graph. Nice.

My understanding is that a more thorough simulation of darknet would randomly assign locations and reassign them with the location-swapping algorithm used in Fred.

Right. Funnily enough, the swapping algorithm is most concisely described in the Pitch Black paper. I'd suggest that as a reference for quick implementation.

I think simulating opennet would be more work both to implement and to run, as it would mean implementing path-folding and randomly assigning locations and connections.

No doubt about that. Just curious. I did some work involving some very simple probes, but it was more of a theoretical simulation. We'll see how close theory matches reality I guess ;)

Thanks for answering.

[1] https://bugs.freenetproject.org/view.php?id=3568
[2] https://github.com/Thynix/routing-simulator/blob/73dfd6c94156cef35815ac7de2fcfa934385ccae/Graph.java#L139
[3] https://github.com/Thynix/routing-simulator/blob/73dfd6c94156cef35815ac7de2fcfa934385ccae/Graph.java#L52
Re: [freenet-dev] Statistics Project Update #1
Are you assuming opennet, darknet, a mix?

On Fri, Apr 27, 2012 at 10:34 PM, Steve Dougherty st...@asksteved.com wrote:

Here's an update on my progress on the statistics project for the first week:

The current probes are biased towards better-connected nodes: at each hop they choose a random peer to pass the request to - this is a random walk. However, because better-connected nodes by definition have more connections, the requests will be passed to them more often and they will be over-represented in results. To address this, the new probes I will implement will use Metropolis-Hastings correction: unlike the uniform random walk, which always uses the random peer it picks, it is less likely to pick a well-connected node, and more likely to pick a poorly-connected node. As there are more chances to pick a well-connected node than a poorly-connected one, this balances out to a uniform probability to pick any given node.

I'm starting from evanbd's network simulator,[1] which is able to generate networks based on theoretical models and perform some routing simulations. It can now also reproduce a given degree (number of peers) distribution, which allows simulating the network as it is measured to be in addition to purely theoretical models.

Currently I'm working on plotting the distribution of this routing strategy with different limits on hops before returning information: Hops To Live (HTL). This is to get a good number to start with in the implementation of these probes.

I will also refactor the simulator and make it easier to configure: currently values are hard-coded and changing the simulation parameters means recompiling.

My goal is to have this area of simulation done and have begun planning if not implementing the new probe requests by the end of next week.
Thanks, operhiem1

[1] USK@gjw6StjZOZ4OAG-pqOxIp5Nk11udQZOrozD4jld42Ac,BYyqgAtc9p0JGbJ~18XU6mtO9ChnBZdf~ttCn48FV7s,AQACAAE/flog/29/200911.xhtml
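For readers who want to see the bias and the correction side by side, here is a small sketch. It is an assumption-laden illustration and not the probe implementation: it uses the textbook Metropolis-Hastings acceptance rule for a uniform target, min(1, deg(current)/deg(candidate)), on a toy graph of one hub connected to every node of a five-node ring. The update does not spell out the exact rule the new probes will use, and all function names here are invented for the example.

```python
import random
from collections import Counter


def hub_and_ring(ring_size):
    """Node 0 is a hub linked to every node of a ring 1..ring_size."""
    graph = {0: list(range(1, ring_size + 1))}
    for i in range(1, ring_size + 1):
        previous = ring_size if i == 1 else i - 1
        following = 1 if i == ring_size else i + 1
        graph[i] = [0, previous, following]
    return graph


def plain_walk(graph, start, htl, rng):
    """Uniform random walk: always move to the randomly chosen peer."""
    current = start
    for _ in range(htl):
        current = rng.choice(graph[current])
    return current


def mh_walk(graph, start, htl, rng):
    """Random walk with Metropolis-Hastings correction (uniform target)."""
    current = start
    for _ in range(htl):
        candidate = rng.choice(graph[current])
        accept = min(1.0, len(graph[current]) / len(graph[candidate]))
        if rng.random() < accept:
            current = candidate
        # On rejection the walk stays put, but the hop is still used up.
    return current


def endpoint_frequencies(walk, graph, htl, samples, seed):
    """Fraction of walks ending at each node, from random start nodes."""
    rng = random.Random(seed)
    nodes = list(graph)
    counts = Counter()
    for _ in range(samples):
        counts[walk(graph, rng.choice(nodes), htl, rng)] += 1
    return {node: counts[node] / samples for node in nodes}


if __name__ == "__main__":
    graph = hub_and_ring(5)
    print(endpoint_frequencies(plain_walk, graph, 20, 20000, 1)[0])
    print(endpoint_frequencies(mh_walk, graph, 20, 20000, 1)[0])
```

On this graph the hub has degree 5 out of a total degree of 20, so the plain walk should end at it about a quarter of the time, while the corrected walk should end at it about one time in six, the same as any other node.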
Re: [freenet-dev] Statistics Project Update #1
On Fri, Apr 27, 2012 at 10:34 PM, Steve Dougherty st...@asksteved.com wrote:

Here's an update on my progress on the statistics project for the first week:

The current probes are biased towards better-connected nodes: at each hop they choose a random peer to pass the request to - this is a random walk. However, because better-connected nodes by definition have more connections, the requests will be passed to them more often and they will be over-represented in results. To address this, the new probes I will implement will use Metropolis-Hastings correction: unlike the uniform random walk, which always uses the random peer it picks, it is less likely to pick a well-connected node, and more likely to pick a poorly-connected node. As there are more chances to pick a well-connected node than a poorly-connected one, this balances out to a uniform probability to pick any given node.

In Gnutella we observed that long-lived nodes tend to be better connected and that they also cluster with other high-uptime nodes. If the same is true for Freenet it's a good idea to keep an eye out for side effects as you tweak the behavior.