[freenet-dev] Statistics Project Update #2

2012-05-04 Thread Steve Dougherty

I've completed an initial run of simulation work on probes. The code is
available, [1] as are the simulation results from which the plots were
generated. [2] The point of immediate interest, though, is the plots
themselves, [3] which show the predicted network coverage of different
probe routing techniques on networks with an ideal degree distribution
(more on this later) and on networks following the degree distribution
of Freenet as measured. [4] Link lengths and locations do not factor
into this simulation because probes take only degree into account and
are not seeking any given destination; their only goal is for endpoints
to be distributed uniformly throughout the network on average.

The ideal degree distribution used here comes from having each node add
a fixed number of remote connections without regard for the number of
connections it or the nodes it's connecting to already have. I don't
know whether this, or having each node have the same total number of
connections, is the actual ideal. The results of the simulation did not
appear to change greatly with network size, as shown by the consistent
behavior between the 12,000 and 45,000 node versions of the
MH-corrected degree-conforming simulation. [5]
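
To make the "ideal" construction above concrete, here is a minimal
Python sketch of it. It is not taken from the simulator in [1]; the
function name, the parameters, and the choice of picking peers uniformly
at random are assumptions made for illustration only.

```python
import random

def build_fixed_degree_graph(n_nodes, links_per_node, seed=0):
    """Hypothetical helper: every node adds `links_per_node` new
    undirected links to uniformly chosen peers, ignoring how many
    connections either end already has."""
    rng = random.Random(seed)
    neighbors = [set() for _ in range(n_nodes)]
    for node in range(n_nodes):
        added = 0
        # Stop early if the node is already connected to everyone else.
        while added < links_per_node and len(neighbors[node]) < n_nodes - 1:
            peer = rng.randrange(n_nodes)
            if peer == node or peer in neighbors[node]:
                continue  # no self-links, no duplicate links
            neighbors[node].add(peer)
            neighbors[peer].add(node)
            added += 1
    return neighbors
```

Because each node also receives links that other nodes add, the
resulting degrees vary around twice `links_per_node` rather than being
identical, which is the difference from the "same number of total
connections" alternative mentioned above.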

As expected, the plots suggest that using Metropolis-Hastings correction
will be an immense improvement in endpoint uniformity over the current
uniform random routing. More specifically, they suggest that an HTL of
around 20 hops is close enough to a baseline uniform endpoint
probability to be a good starting point. I've noticed that these CDFs
aren't a very good format for demonstrating closeness of distributions
because the lines overlap, but I don't understand the Kolmogorov–Smirnov
test yet, so I'm planning to just use these results as a guideline and
begin implementing the new probes next week.
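
For readers unfamiliar with the correction, the difference between the
two routing techniques can be sketched in a few lines of Python. This is
the textbook Metropolis-Hastings random walk on a graph, not code from
the simulator in [1]; the function names are hypothetical, and counting
a rejected step as a used hop is an assumption.

```python
import random
from collections import Counter

def uniform_walk(neighbors, start, htl, rng):
    """Uniform random routing: always move to a random neighbor.
    Endpoints end up biased toward high-degree nodes."""
    current = start
    for _ in range(htl):
        current = rng.choice(neighbors[current])
    return current

def mh_walk(neighbors, start, htl, rng):
    """Metropolis-Hastings corrected walk: propose a random neighbor and
    accept with probability min(1, deg(current) / deg(candidate)).
    Assumption: a rejected proposal still uses up one hop."""
    current = start
    for _ in range(htl):
        candidate = rng.choice(neighbors[current])
        accept = min(1.0, len(neighbors[current]) / len(neighbors[candidate]))
        if rng.random() < accept:
            current = candidate
    return current

def endpoint_counts(walk, neighbors, start, htl, trials, rng):
    """Run `trials` walks and count how often each node is the endpoint."""
    return Counter(walk(neighbors, start, htl, rng) for _ in range(trials))
```

Here `neighbors` is a list of neighbor lists indexed by node. On a graph
with one well-connected node, the uniform walk ends at that node
disproportionately often, while the corrected walk rejects most moves
toward it and so approaches a uniform endpoint distribution.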

The gnuplot scripts to generate the degree and link length distribution
plots are part of pyProbe, [6] and GNU parallel [7] is used in test.sh
to run simulations in parallel. In the simulator source there are
scripts from an earlier effort to plot coverage as percentages, but that
was even less clear than the CDFs.

Comments and suggestions are very welcome!

Thanks,
operhiem1

[1] https://github.com/Thynix/routing-simulator/tree/dev
[2] http://asksteved.com/plot-source.tar.xz
[3] http://imgur.com/a/Z8SBS#2
[4] http://i.imgur.com/ehfBP.png
[5] http://i.imgur.com/rtRIB.png
[6] https://github.com/Thynix/pyProbe
[7] http://www.gnu.org/software/parallel/
___
Devl mailing list
Devl@freenetproject.org
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl


Re: [freenet-dev] Statistics Project Update #1

2012-05-04 Thread Steve Dougherty

On 04/30/2012 07:27 PM, Michael Grube wrote:
> Right. Funnily enough, the swapping algorithm is most concisely
> described in the Pitch Black paper. I'd suggest that as a
> reference for quick implementation.

Thanks for the suggestion; will do.

On 05/01/2012 07:48 AM, Zlatin Balevsky wrote:
> There was a study that higher uptime correlated with the
> probability of further uptime so if you shift bias towards
> low-uptime nodes you could end up with lower overall reliability.
> It was done on a different network with different usage patterns
> but imho you should definitely treat node uptime as a parameter in
> any simulations.

So simulating this would mean an initial connection, then cycles of
disconnection and reconnection, with low-uptime nodes dropping out more
often? As far as bias towards low-uptime nodes affecting reliability,
it's worth noting that all this talk of Metropolis-Hastings is only
relevant to the new probe routing and doesn't affect the existing
routing behaviors. If this routing includes low-uptime nodes, that's
good! It would mean a more uniform selection of endpoints, which allows
more accurate and comprehensive measurement of the network.