Hi, 

This is the second part of the proposed WoT algorithm improvement:


## Improving the rank2+ update detection delay to
   (less than) O(N), with N the *active* IDs


It is also available in the bugtracker at
https://bugs.freenetproject.org/view.php?id=3816#c12219

------ ------ ------ ------ ------ ------ 

The process to check IDs with rank >= 2 can be improved from essentially
checking them at random (with the real risk of missing IDs — there is no
guarantee to ever check them all, not even networkwide), to having each
active ID check all IDs in O(N) (with N the number of of IDs).

Process: When removing a random subscription to an ID with rank2 or
higher, with 50% probability add the ID+current_version to a blocklist
which avoids processing this same ID with this or a lower version again
and prune it from the WoT.

When receiving a version hint from another ID with a higher version than
the one which is blocked, the ID is removed from the blocklist.

The total cost in memory is on the order of the number of old IDs
already checked, bounded to O(N), the number of Identities.


Expected effect: Assume that 9k of the 10k IDs in WoT are stale (a
reasonable assumption, because only about 300 IDs are inserted from an
up to date version of WoT right now).

When replacing one rank2 and one rank3+ subscription per hour, that
means about 16k subscription replacements per year, or, to simplify,
about two replacements per ID in the WoT.


## Looking at only a single ID:

For the first replacement there is a 90% probability that the ID in
question is stale, and a 50% probability that it will be put on the
blocklist if it is stale, combined a 45% probability that the number of
stale IDs decreases by one. In other words, it takes on average 2.2
steps to remove the first stale ID from the IDs to check.

As a rough estimate, for 10 IDs it would take 15 steps to prune out 5 of
the 9 stale IDs. Scaling this up should give an estimation of the time
required for 9k IDs. So after about 15k steps (one year) half the stale
IDs should be on the blocklist.


## Looking at the whole network

For a given stale ID, after one year there is roughly a 50% chance that
it is on the blocklist of a given active ID. But the probability that it
is on the blocklist of every active ID is just about 0.5^k, with k the
number of active IDs. So when there is an update to this previously
stale ID, it is almost certain that some ID will see it and remove it
From the blocklists of most other IDs within O(N) steps (this will
accelerate as more stale IDs are blocked).


## This is an approximation

I am sure that there is a beautiful formula to calculate exactly the
proportion of subscriptions to stale IDs we’ll have with this algorithm,
and the average discovery time for a previously stale ID to be seen
networkwide again.

I only have this approximation right now, though.

------ ------ ------ ------ ------ ------ 


It would be great if you could check whether the logic holds!

Best wishes,
Arne
-- 
Unpolitisch sein
heißt politisch sein
ohne es zu merken

Attachment: signature.asc
Description: PGP signature

_______________________________________________
Devl mailing list
[email protected]
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl

Reply via email to