Re: [freenet-dev] Statistics Project Update #1

2012-06-20 Thread Matthew Toseland
On Thursday 10 May 2012 01:08:45 Evan Daniel wrote:
 On Tue, May 1, 2012 at 7:48 AM, Zlatin Balevsky zlat...@gmail.com wrote:
  On 04/28/2012 06:56 PM, Zlatin Balevsky wrote:
  In Gnutella we observed that long-lived nodes tend to be better
  connected and that they also cluster with other high-uptime nodes.
  If the same is true for Freenet it's a good idea to keep an eye out for
  side effects as you tweak the behavior.
 
  Good to know - I'll look for that. Are there any particular effects
  you had in mind? The Metropolis-Hastings correction in the new probes
  should produce a fairly uniform distribution of endpoints despite
  clustering and well-connected nodes, but explicitly simulating the
  effects of high uptime could be helpful.
 
  There was a study that higher uptime correlated with the probability
  of further uptime so if you shift bias towards low-uptime nodes you
  could end up with lower overall reliability.  It was done on a different
  network with different usage patterns but imho you should definitely
  treat node uptime as a parameter in any simulations.
 
 MH should produce a good simple random sample from all nodes currently
 online, provided that the walk is of sufficient length, regardless of
 clustering effects. If there are partitioning effects, those will make
 the required walk length to get good dispersion longer, in a way that
 might be somewhat difficult to measure, but as long as the network is
 not completely partitioned, a sufficient walk length will produce a
 good sample. The fact that a large sample must be taken over an
 extended period means that low-uptime nodes will have a somewhat
 disproportionately lower chance of being in the sample (I think...
 need to do math here), but that isn't a huge problem.

Can we tell whether the network is partitioned? I guess that's one of the key 
outcomes?

I.e. are there barriers between different parts of the network, are there large 
areas with similar locations but few connections, etc?
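
One way to approach the first question, assuming a full adjacency map is
available (for example from a simulated network, or from topology
reconstructed with the old probes), is a plain connected-components pass.
This is only a sketch: the `neighbours` map and the function name are made
up for illustration, and it only detects a complete partition. Regions
joined by just a few links would need a cut or bottleneck measure instead.

```python
from collections import deque


def connected_components(neighbours):
    """Group nodes into components using breadth-first search.

    `neighbours` maps each node to an iterable of its peers and is
    assumed to be symmetric (if a lists b, then b lists a).
    """
    seen = set()
    components = []
    for start in neighbours:
        if start in seen:
            continue
        seen.add(start)
        component = []
        queue = deque([start])
        while queue:
            node = queue.popleft()
            component.append(node)
            for peer in neighbours[node]:
                if peer not in seen:
                    seen.add(peer)
                    queue.append(peer)
        components.append(component)
    return components


# Hypothetical five-node network that has split into two parts.
network = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "x": ["y"], "y": ["x"]}
print(len(connected_components(network)))  # prints 2
```

More than one component means the network is completely partitioned; a
single component says nothing about how well the parts are joined.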


___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Re: [freenet-dev] Statistics Project Update #1

2012-06-20 Thread Matthew Toseland
On Tuesday 01 May 2012 00:27:26 Michael Grube wrote:
 On Mon, Apr 30, 2012 at 7:06 PM, Steve Dougherty st...@asksteved.com wrote:
 
  On 04/28/2012 06:56 PM, Zlatin Balevsky wrote:
   In Gnutella we observed that long-lived nodes tend to be better
   connected and that they also cluster with other high-uptime nodes.
   If the same is true for Freenet it's a good idea to keep an eye out for
   side effects as you tweak the behavior.
 
  Good to know - I'll look for that. Are there any particular effects
  you had in mind? The Metropolis-Hastings correction in the new probes
  should produce a fairly uniform distribution of endpoints despite
  clustering and well-connected nodes, but explicitly simulating the
  effects of high uptime could be helpful.
 
  It occurs to me that the probe requests I hope to deprecate allow
  reconstructing the actual network topology - perhaps we could run
  simulations on top of it? The new probe requests are currently planned
  to not report degrees or link length distributions, [1] which as far
  as I can tell would mean no way to reconstruct the network as measured
  in simulation. Does it seem reasonable to omit the ability to gather
  such information?

I don't know, in the short run. In the long run we'd prefer not to expose the 
full topology for security reasons.
 
 
  On 04/28/2012 03:51 PM, Michael Grube wrote:
   Are you assuming opennet, darknet, a mix?
 
  The simulation generates the graph in a way which effectively assumes
  darknet: it assigns locations, then iterates over the network and
  connects nodes based on link length. [2] [3] I'm working under the
  assumption that, given the same degree distribution as the network is
  measured to have, this is an accurate enough approximation. I will
  include these and similar plots in this week's progress report.
 
 
 Ah, ok. You're making a Kleinberg Small World graph. Nice.
 
 
 
  My understanding is that a more thorough simulation of darknet would
  randomly assign locations and reassign them with the location-swapping
  algorithm used in Fred.
 
 
 Right. Funnily enough, the swapping algorithm is most concisely described
 in the Pitch Black paper. I'd suggest that as a reference for quick
 implementation.
 
 
  I think simulating opennet would be more work
  both to implement and to run, as it would mean implementing
  path-folding and randomly assigning locations and connections.
 
 
 No doubt about that. Just curious. I did some work involving some very
 simple probes, but it was more of a theoretical simulation. We'll see how
 closely theory matches reality, I guess ;)

Obviously, opennet will have uptime issues (but hopefully they will settle 
after a reasonable period), and darknet presumably won't...



Re: [freenet-dev] Statistics Project Update #1

2012-05-09 Thread Evan Daniel
On Tue, May 1, 2012 at 7:48 AM, Zlatin Balevsky zlat...@gmail.com wrote:
 On 04/28/2012 06:56 PM, Zlatin Balevsky wrote:
 In Gnutella we observed that long-lived nodes tend to be better
 connected and that they also cluster with other high-uptime nodes.
 If the same is true for Freenet it's a good idea to keep an eye out for
 side effects as you tweak the behavior.

 Good to know - I'll look for that. Are there any particular effects
 you had in mind? The Metropolis-Hastings correction in the new probes
 should produce a fairly uniform distribution of endpoints despite
 clustering and well-connected nodes, but explicitly simulating the
 effects of high uptime could be helpful.

 There was a study that higher uptime correlated with the probability
 of further uptime so if you shift bias towards low-uptime nodes you
 could end up with lower overall reliability.  It was done on a different
 network with different usage patterns but imho you should definitely
 treat node uptime as a parameter in any simulations.

MH should produce a good simple random sample from all nodes currently
online, provided that the walk is of sufficient length, regardless of
clustering effects. If there are partitioning effects, those will make
the required walk length to get good dispersion longer, in a way that
might be somewhat difficult to measure, but as long as the network is
not completely partitioned, a sufficient walk length will produce a
good sample. The fact that a large sample must be taken over an
extended period means that low-uptime nodes will have a somewhat
disproportionately lower chance of being in the sample (I think...
need to do math here), but that isn't a huge problem.
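
As a rough check on that intuition, here is a small simulation sketch. It
assumes each node is online independently with a fixed probability (its
uptime fraction) whenever a sample is drawn, and that each draw is a
perfect uniform pick among the nodes online at that moment, which is what
an ideal MH walk would give. The node names and numbers are invented.

```python
import random
from collections import Counter


def sample_over_time(uptime, rounds, rng):
    """Draw one uniform sample per round from the nodes online in that round.

    `uptime` maps node -> fraction of time online (an assumed, fixed value).
    Returns a Counter of how often each node ended up in the sample.
    """
    counts = Counter()
    for _ in range(rounds):
        # Each node is online this round with probability equal to its uptime.
        online = [node for node, up in uptime.items() if rng.random() < up]
        if online:
            counts[rng.choice(online)] += 1
    return counts


if __name__ == "__main__":
    uptime = {"always_on": 0.9, "sometimes": 0.5, "rarely": 0.1}
    print(sample_over_time(uptime, 10000, random.Random(0)))
```

Under those assumptions a node's share of the sample comes out roughly in
proportion to its uptime fraction, so the sample describes "nodes weighted
by time online" rather than "all nodes that ever connect".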

Evan Daniel


Re: [freenet-dev] Statistics Project Update #1

2012-05-04 Thread Steve Dougherty
On 04/30/2012 07:27 PM, Michael Grube wrote:
 Right. Funnily enough, the swapping algorithm is most concisely 
 described in the Pitch Black paper. I'd suggest that as a 
 reference for quick implementation.

Thanks for the suggestion; will do.

On 05/01/2012 07:48 AM, Zlatin Balevsky wrote:
 There was a study that higher uptime correlated with the 
 probability of further uptime so if you shift bias towards 
 low-uptime nodes you could end up with lower overall reliability.  It
 was done on a different network with different usage patterns but 
 imho you should definitely treat node uptime as a parameter in any 
 simulations.

So simulating this would mean an initial connection, then cycles of
disconnection and reconnection as low-uptime nodes drop off and
rejoin? As far as bias towards low-uptime nodes affecting
reliability, it's worth noting that all this talk of
Metropolis-Hastings is only relevant to the new probe routing and
doesn't affect the existing routing behaviors. If this routing
includes low-uptime nodes that's good! It would mean a more uniform
selection of endpoints which allows more accurate and comprehensive
measurement of the network.


Re: [freenet-dev] Statistics Project Update #1

2012-05-01 Thread Zlatin Balevsky
 On 04/28/2012 06:56 PM, Zlatin Balevsky wrote:
 In Gnutella we observed that long-lived nodes tend to be better
 connected and that they also cluster with other high-uptime nodes.
 If the same is true for Freenet it's a good idea to keep an eye out for
 side effects as you tweak the behavior.

 Good to know - I'll look for that. Are there any particular effects
 you had in mind? The Metropolis-Hastings correction in the new probes
 should produce a fairly uniform distribution of endpoints despite
 clustering and well-connected nodes, but explicitly simulating the
 effects of high uptime could be helpful.

There was a study that higher uptime correlated with the probability
of further uptime so if you shift bias towards low-uptime nodes you
could end up with lower overall reliability.  It was done on a different
network with different usage patterns but imho you should definitely
treat node uptime as a parameter in any simulations.


Re: [freenet-dev] Statistics Project Update #1

2012-04-30 Thread Steve Dougherty
On 04/28/2012 06:56 PM, Zlatin Balevsky wrote:
 In Gnutella we observed that long-lived nodes tend to be better 
 connected and that they also cluster with other high-uptime nodes.
 If the same is true for Freenet it's a good idea to keep an eye out for
 side effects as you tweak the behavior.

Good to know - I'll look for that. Are there any particular effects
you had in mind? The Metropolis-Hastings correction in the new probes
should produce a fairly uniform distribution of endpoints despite
clustering and well-connected nodes, but explicitly simulating the
effects of high uptime could be helpful.

It occurs to me that the probe requests I hope to deprecate allow
reconstructing the actual network topology - perhaps we could run
simulations on top of it? The new probe requests are currently planned
to not report degrees or link length distributions, [1] which as far
as I can tell would mean no way to reconstruct the network as measured
in simulation. Does it seem reasonable to omit the ability to gather
such information?


On 04/28/2012 03:51 PM, Michael Grube wrote:
 Are you assuming opennet, darknet, a mix?

The simulation generates the graph in a way which effectively assumes
darknet: it assigns locations, then iterates over the network and
connects nodes based on link length. [2] [3] I'm working under the
assumption that, given the same degree distribution as the network is
measured to have, this is an accurate enough approximation. I will
include these and similar plots in this week's progress report.
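
For readers who have not followed links [2] and [3], here is a minimal
sketch of that kind of construction. It is not the code in Graph.java: it
is the textbook one-dimensional Kleinberg-style build, where a node at
distance d is chosen as a peer with probability proportional to 1/d. The
function names and the fixed `links_per_node` are assumptions; the real
simulator reproduces a measured degree distribution instead.

```python
import random


def circular_distance(a, b):
    """Distance between two locations on the [0, 1) circle."""
    d = abs(a - b)
    return min(d, 1.0 - d)


def kleinberg_graph(n, links_per_node, rng):
    """Assign random locations, then link nodes with probability ~ 1/distance."""
    locations = [rng.random() for _ in range(n)]
    neighbours = {i: set() for i in range(n)}
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Guard against a zero distance if two locations happen to coincide.
        weights = [
            1.0 / max(circular_distance(locations[i], locations[j]), 1e-12)
            for j in others
        ]
        # choices() samples with replacement; repeats collapse in the set.
        for j in rng.choices(others, weights=weights, k=links_per_node):
            neighbours[i].add(j)
            neighbours[j].add(i)  # connections are bidirectional
    return locations, neighbours


if __name__ == "__main__":
    locations, neighbours = kleinberg_graph(1000, 5, random.Random(1))
    degrees = sorted(len(peers) for peers in neighbours.values())
    print("min/median/max degree:", degrees[0], degrees[500], degrees[-1])
```

Most links come out short with a few long ones, which is what makes greedy
routing on locations work in this model.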

My understanding is that a more thorough simulation of darknet would
randomly assign locations and reassign them with the location-swapping
algorithm used in Fred. I think simulating opennet would be more work
both to implement and to run, as it would mean implementing
path-folding and randomly assigning locations and connections.

[1] https://bugs.freenetproject.org/view.php?id=3568
[2]
https://github.com/Thynix/routing-simulator/blob/73dfd6c94156cef35815ac7de2fcfa934385ccae/Graph.java#L139
[3]
https://github.com/Thynix/routing-simulator/blob/73dfd6c94156cef35815ac7de2fcfa934385ccae/Graph.java#L52


Re: [freenet-dev] Statistics Project Update #1

2012-04-30 Thread Michael Grube
On Mon, Apr 30, 2012 at 7:06 PM, Steve Dougherty st...@asksteved.com wrote:

 On 04/28/2012 06:56 PM, Zlatin Balevsky wrote:
  In Gnutella we observed that long-lived nodes tend to be better
  connected and that they also cluster with other high-uptime nodes.
  If the same is true for Freenet it's a good idea to keep an eye out for
  side effects as you tweak the behavior.

 Good to know - I'll look for that. Are there any particular effects
 you had in mind? The Metropolis-Hastings correction in the new probes
 should produce a fairly uniform distribution of endpoints despite
 clustering and well-connected nodes, but explicitly simulating the
 effects of high uptime could be helpful.

 It occurs to me that the probe requests I hope to deprecate allow
 reconstructing the actual network topology - perhaps we could run
 simulations on top of it? The new probe requests are currently planned
 to not report degrees or link length distributions, [1] which as far
 as I can tell would mean no way to reconstruct the network as measured
 in simulation. Does it seem reasonable to omit the ability to gather
 such information?


 On 04/28/2012 03:51 PM, Michael Grube wrote:
  Are you assuming opennet, darknet, a mix?

 The simulation generates the graph in a way which effectively assumes
 darknet: it assigns locations, then iterates over the network and
 connects nodes based on link length. [2] [3] I'm working under the
 assumption that, given the same degree distribution as the network is
 measured to have, this is an accurate enough approximation. I will
 include these and similar plots in this week's progress report.


Ah, ok. You're making a Kleinberg Small World graph. Nice.



 My understanding is that a more thorough simulation of darknet would
 randomly assign locations and reassign them with the location-swapping
 algorithm used in Fred.


Right. Funnily enough, the swapping algorithm is most concisely described
in the Pitch Black paper. I'd suggest that as a reference for quick
implementation.
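
For quick reference, a sketch of the swap decision as I understand it from
that description: compare the product of link distances for the two nodes
before and after a hypothetical swap, always swap if the product does not
grow, and otherwise swap with probability before/after. Treat this as an
assumption to check against the paper and Fred; the names are invented,
and how the two nodes find each other is left out. It needs Python 3.8 or
later for `math.prod`.

```python
import math
import random


def circular_distance(a, b):
    """Distance between two locations on the [0, 1) circle."""
    d = abs(a - b)
    return min(d, 1.0 - d)


def link_product(node, loc, neighbours):
    """Product of the distances from `node` to each of its neighbours."""
    return math.prod(circular_distance(loc[node], loc[n]) for n in neighbours[node])


def maybe_swap(a, b, loc, neighbours, rng):
    """Swap the locations of a and b if the rule accepts; return True if swapped."""
    before = link_product(a, loc, neighbours) * link_product(b, loc, neighbours)
    loc[a], loc[b] = loc[b], loc[a]  # try the swap
    after = link_product(a, loc, neighbours) * link_product(b, loc, neighbours)
    # Always keep a swap that does not lengthen the links; otherwise keep
    # it only with probability before / after.
    if after <= before or rng.random() < before / after:
        return True
    loc[a], loc[b] = loc[b], loc[a]  # undo the swap
    return False


if __name__ == "__main__":
    # Path 0-1-2-3 where nodes 1 and 2 hold each other's "natural" location.
    neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    loc = {0: 0.0, 1: 0.2, 2: 0.1, 3: 0.3}
    print(maybe_swap(1, 2, loc, neighbours, random.Random(0)), loc)
```

Evaluating the "after" product on the tentatively swapped map also keeps
the case where a and b are themselves neighbours from producing a zero
distance.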


 I think simulating opennet would be more work
 both to implement and to run, as it would mean implementing
 path-folding and randomly assigning locations and connections.


No doubt about that. Just curious. I did some work involving some very
simple probes, but it was more of a theoretical simulation. We'll see how
closely theory matches reality, I guess ;)

Thanks for answering.



 [1] https://bugs.freenetproject.org/view.php?id=3568
 [2]
 https://github.com/Thynix/routing-simulator/blob/73dfd6c94156cef35815ac7de2fcfa934385ccae/Graph.java#L139
 [3]
 https://github.com/Thynix/routing-simulator/blob/73dfd6c94156cef35815ac7de2fcfa934385ccae/Graph.java#L52

Re: [freenet-dev] Statistics Project Update #1

2012-04-28 Thread Michael Grube
Are you assuming opennet, darknet, a mix?

On Fri, Apr 27, 2012 at 10:34 PM, Steve Dougherty st...@asksteved.com wrote:

 Here's an update on my progress on the statistics project for the
 first week:

 The current probes are biased towards better-connected nodes: at each
 hop they choose a random peer to pass the request to - this is a random
 walk. However, because better-connected nodes by definition have more
 connections, the requests will be passed to them more often and they
 will be over-represented in results. To address this, the new probes I
 will implement will use Metropolis-Hastings correction: unlike the
 uniform random walk, which always uses the random peer it picks, the
 corrected walk is less likely to accept a well-connected peer and more
 likely to accept a poorly-connected one. As there are more chances to
 pick a well-connected node than a poorly-connected one, this balances
 out to a uniform probability of picking any given node.
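
To make the correction concrete, here is a minimal sketch of both walks on
a toy graph. It uses the standard Metropolis-Hastings rule for a uniform
target: propose a random peer and accept it with probability
min(1, own degree / peer degree). The toy topology, the function names,
and the choice that a rejected proposal still uses up a hop are
assumptions for illustration, not the planned probe implementation.

```python
import random

# Toy topology: node 0 is a hub linked to everyone; nodes 1..5 form a ring.
TOY_GRAPH = {
    0: [1, 2, 3, 4, 5],
    1: [0, 2, 5],
    2: [0, 1, 3],
    3: [0, 2, 4],
    4: [0, 3, 5],
    5: [0, 4, 1],
}


def plain_walk(graph, start, htl, rng):
    """Uniform random walk: always move to the randomly chosen peer."""
    node = start
    for _ in range(htl):
        node = rng.choice(graph[node])
    return node


def mh_walk(graph, start, htl, rng):
    """Random walk with Metropolis-Hastings correction for a uniform target."""
    node = start
    for _ in range(htl):
        candidate = rng.choice(graph[node])
        # Accept with probability min(1, deg(node) / deg(candidate)).
        # On rejection the walk stays put but the hop is still spent.
        if rng.random() < len(graph[node]) / len(graph[candidate]):
            node = candidate
    return node


def endpoint_share(walk, target, walks, htl, rng):
    """Fraction of walks started at node 1 that end at `target`."""
    hits = sum(walk(TOY_GRAPH, 1, htl, rng) == target for _ in range(walks))
    return hits / walks


if __name__ == "__main__":
    rng = random.Random(0)
    print("plain walk, share ending at hub:", endpoint_share(plain_walk, 0, 20000, 20, rng))
    print("MH walk, share ending at hub:   ", endpoint_share(mh_walk, 0, 20000, 20, rng))
```

On this graph the plain walk should end at the hub about 5/20 of the time
(in proportion to degree), while the corrected walk should be close to the
uniform 1/6.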

 I'm starting from evanbd's network simulator,[1] which is able to
 generate networks based off theoretical models and perform some routing
 simulations. It can now also reproduce a given degree (number of peers)
 distribution which allows simulating the network as it is measured to be
 in addition to purely theoretical models.

 Currently I'm working on plotting the distribution of this routing
 strategy with different limits on hops before returning information -
 Hops To Live - HTL. This is to get a good number to start with in the
 implementation of these probes. I will also refactor the simulator and
 make it easier to configure: currently values are hard-coded and
 changing the simulation parameters means recompiling.
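
A sketch of the kind of measurement this is after: for a small graph the
endpoint distribution after a given HTL can be computed exactly by pushing
a probability vector through the MH transition rule, and then compared
with uniform. The toy graph and function names are invented, and the
total variation distance is just one reasonable choice of "how far from
uniform"; a real network would need sampling rather than exact
propagation.

```python
# Toy topology: node 0 is a hub linked to everyone; nodes 1..5 form a ring.
TOY_GRAPH = {
    0: [1, 2, 3, 4, 5],
    1: [0, 2, 5],
    2: [0, 1, 3],
    3: [0, 2, 4],
    4: [0, 3, 5],
    5: [0, 4, 1],
}


def mh_step(dist, graph):
    """Advance a probability distribution over nodes by one MH hop."""
    nxt = {node: 0.0 for node in graph}
    for node, p in dist.items():
        if p == 0.0:
            continue
        degree = len(graph[node])
        for peer in graph[node]:
            accept = min(1.0, degree / len(graph[peer]))
            nxt[peer] += p * accept / degree          # proposal accepted
            nxt[node] += p * (1.0 - accept) / degree  # rejected: stay put
    return nxt


def distance_from_uniform(graph, start, htl):
    """Total variation distance between the HTL-hop endpoint
    distribution (starting at `start`) and the uniform distribution."""
    dist = {node: 0.0 for node in graph}
    dist[start] = 1.0
    for _ in range(htl):
        dist = mh_step(dist, graph)
    uniform = 1.0 / len(graph)
    return 0.5 * sum(abs(p - uniform) for p in dist.values())


if __name__ == "__main__":
    for htl in (0, 1, 2, 5, 10, 20):
        print(htl, round(distance_from_uniform(TOY_GRAPH, 1, htl), 6))
```

The smallest HTL at which the distance drops below a chosen threshold is a
candidate starting value; taking `htl` as a parameter rather than a
constant is also the sort of change the refactoring would make.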




 My goal is to have this area of simulation done and have begun planning
 if not implementing the new probe requests by the end of next week.

 Thanks,
 operhiem1

 [1]
 USK@gjw6StjZOZ4OAG-pqOxIp5Nk11udQZOrozD4jld42Ac,BYyqgAtc9p0JGbJ~18XU6mtO9ChnBZdf~ttCn48FV7s,AQACAAE/flog/29/200911.xhtml

Re: [freenet-dev] Statistics Project Update #1

2012-04-28 Thread Zlatin Balevsky
On Fri, Apr 27, 2012 at 10:34 PM, Steve Dougherty st...@asksteved.com wrote:

 Here's an update on my progress on the statistics project for the
 first week:

 The current probes are biased towards better-connected nodes: at each
 hop they choose a random peer to pass the request to - this is a random
 walk. However, because better-connected nodes by definition have more
 connections, the requests will be passed to them more often and they
 will be over-represented in results. To address this, the new probes I
 will implement will use Metropolis-Hastings correction: unlike the
 uniform random walk, which always uses the random peer it picks, the
 corrected walk is less likely to accept a well-connected peer and more
 likely to accept a poorly-connected one. As there are more chances to
 pick a well-connected node than a poorly-connected one, this balances
 out to a uniform probability of picking any given node.

In Gnutella we observed that long-lived nodes tend to be better
connected and that they also cluster with other high-uptime nodes.  If
the same is true for Freenet it's a good idea to keep an eye out for
side effects as you tweak the behavior.