Our MHK tests have reported back. We insert one key 3 times, and insert each 
of 3 other keys once. We then fetch all of them after a week. If the one key 
succeeds, that counts as a success for repeatedly inserting a key. If any of 
the 3 keys succeeds, that counts as a success for MHKs. (MHKs are a proposed 
feature: encrypt the top block in 3 different ways and insert all 3 blocks, so 
that if we get any one of them we can regenerate the others and deliver the 
data.) We can also calculate the single-block success ratio and compare it to 
our other data on weekly success stats.
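
To make the idea concrete, here is a minimal sketch in Java. This is not the 
actual implementation: the class name and the key/IV derivation (SHA-256 over 
a document key plus a variant index) are purely illustrative. The only 
property it is meant to show is that all 3 encryptions are deterministic given 
the plaintext and a single document key, so whoever decrypts any one variant 
can regenerate the other two.

    import java.security.MessageDigest;
    import java.util.Arrays;
    import javax.crypto.Cipher;
    import javax.crypto.spec.IvParameterSpec;
    import javax.crypto.spec.SecretKeySpec;

    // Illustrative only: produce 3 deterministic encryptions of the same top block.
    public class MhkSketch {

        // Derive a per-variant key or IV from the document key and the variant index.
        private static byte[] derive(byte[] docKey, int variant, byte label, int len)
                throws Exception {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(docKey);
            md.update((byte) variant);
            md.update(label);
            return Arrays.copyOf(md.digest(), len);
        }

        private static byte[] crypt(int mode, byte[] data, byte[] docKey, int variant)
                throws Exception {
            SecretKeySpec key = new SecretKeySpec(derive(docKey, variant, (byte) 'k', 16), "AES");
            IvParameterSpec iv = new IvParameterSpec(derive(docKey, variant, (byte) 'i', 16));
            Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
            c.init(mode, key, iv);
            return c.doFinal(data);
        }

        // Encrypt variant i (i = 0, 1, 2) of the top block.
        public static byte[] encryptVariant(byte[] topBlock, byte[] docKey, int i) throws Exception {
            return crypt(Cipher.ENCRYPT_MODE, topBlock, docKey, i);
        }

        // Decrypt whichever variant we fetched; the plaintext then lets us call
        // encryptVariant() again to regenerate and re-insert the other two.
        public static byte[] decryptVariant(byte[] block, byte[] docKey, int i) throws Exception {
            return crypt(Cipher.DECRYPT_MODE, block, docKey, i);
        }
    }

Since a CHK is derived from the encrypted data, three distinct ciphertexts 
give three distinct keys on the network, which is what we want: three 
independent chances of the top block surviving.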

Next, our long-term push/pull stats. These involve pushing a 64KB splitfile 
headed by a KSK, and fetching it after 2^N - 1 days (the intervals below). We 
insert N+1 blocks each day, one per interval, so each test insert is only 
fetched once, after its given period.

Success rate for 0 days: 1.0 (131 samples)
Success rate for 1 days: 0.8837209302325582 (129 samples)
Success rate for 3 days: 0.8203125 (128 samples)
Success rate for 7 days: 0.784 (125 samples)
Success rate for 15 days: 0.6752136752136753 (117 samples)
Success rate for 31 days: 0.4752475247524752 (101 samples)
Success rate for 63 days: 0.10126582278481013 (79 samples)
Success rate for 127 days: 0.0 (24 samples)
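
The fetch intervals above follow the 2^N - 1 pattern. As a trivial sketch of 
the daily schedule (the exponent range 0..7 is my reading of the data above):

    // Each day the long-term tester inserts one fresh 64KB splitfile per interval
    // and fetches it exactly once, 2^i - 1 days later.
    public class LongTermSchedule {
        public static void main(String[] args) {
            int maxExponent = 7; // assumed from the 0..127 day intervals above
            for (int i = 0; i <= maxExponent; i++) {
                int fetchAfterDays = (1 << i) - 1; // 0, 1, 3, 7, 15, 31, 63, 127
                System.out.println("insert " + i + ": fetch after " + fetchAfterDays + " days");
            }
        }
    }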

We also have more accurate stats, based on single blocks fetched with no 
retries (rather than the usual 3), which are re-requested after 0, 1, 3, 7, 
15, ... days:

1 days: Total fetches: 1920 total successes 1471 = 76.61458333333333% in 32896.885791978246ms
3 days: Total fetches: 1856 total successes 1238 = 66.70258620689656% in 32386.7310177706ms
7 days: Total fetches: 1728 total successes 1040 = 60.18518518518518% in 36267.77115384615ms
15 days: Total fetches: 1504 total successes 708 = 47.07446808510638% in 40448.608757062146ms
31 days: Total fetches: 992 total successes 365 = 36.79435483870968% in 44799.78082191781ms

So generally, we have a serious problem with data retention. But now let's 
look at the MHK data. This also involves 64KB random splitfiles, but they are 
headed by CHKs. Everything I know suggests that KSKs will survive better than 
CHKs, but we can test this theory if it proves necessary.

Total attempts where insert succeeded and fetch executed: 54
Single keys succeeded: 52
MHKs succeeded: 50
Single key individual fetches: 162
Single key individual fetches succeeded: 124
Success rate for individual keys (from MHK inserts): 0.7654320987654321
Success rate for the single key triple inserted: 0.9629629629629629
Success rate for the MHK (success = any of the 3 different keys worked): 0.9259259259259259

So the 1-week success rate for a single block is 76.5% (124/162), which is 
close to the 78% reported by the long-term test with more blocks. The success 
rate for an MHK is 92.6% (50/54), which is plausible given the single-block 
success rate. But look at the success rate for the single key triple-inserted: 
96.2% (52/54). Statistically there is probably no difference between this and 
the MHK figure, but the interesting thing is that it is so high!
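
As a quick back-of-envelope check of the "probably no difference" claim (my 
own, treating each of the 54 attempts as an independent Bernoulli trial), the 
95% Wilson score intervals for 52/54 and 50/54 overlap heavily:

    // Rough 95% Wilson score intervals for the two proportions above.
    public class ProportionCheck {
        static void wilson(String label, int successes, int n) {
            double z = 1.96; // ~95% confidence
            double p = (double) successes / n;
            double denom = 1 + z * z / n;
            double center = (p + z * z / (2 * n)) / denom;
            double half = z * Math.sqrt(p * (1 - p) / n + z * z / (4.0 * n * n)) / denom;
            System.out.printf("%s: %.3f, 95%% CI [%.3f, %.3f]%n",
                    label, p, center - half, center + half);
        }

        public static void main(String[] args) {
            wilson("single key inserted 3 times (52/54)", 52, 54);
            wilson("MHK, any of 3 keys (50/54)", 50, 54);
        }
    }

This gives roughly [0.87, 0.99] for the triple-inserted key and [0.82, 0.97] 
for the MHK, so with 54 samples we genuinely cannot separate the two.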

***Why does a single insert repeated 3 times persist so much better than a 
single insert?***

Theories:

1. It gets routed differently because of per-node failure tables. This is 
WRONG: inserts don't go into per-node failure tables if they succeed; in fact, 
inserts don't update the failure table at all, although the failure table is 
taken into account when routing them.

2. The problem is 3637/3639: we don't store if the HTL is still too high to 
cache (a security measure introduced some months back), and sometimes, because 
of random HTL decrementing, this is still the case when we reach the nodes 
which would be sinks for the data and would store it in the store (not the 
cache). This is also WRONG, at least in its unmodified form, because if we 
follow the same route we will make the same caching decisions. However, if for 
some reason we follow a different route, it may well be important.

3. It didn't go the same route because of temporary backoff or preemptive 
rejection, so the 3 inserts have a much better chance of being stored at the 
sink nodes, because at least one of the three paths will reach them with an 
HTL low enough to store the data (3637/3639 again); see the toy sketch after 
this list.

4. It didn't go the same route because of temporary backoff or preemptive 
rejection, so the 3 inserts have a much better chance of hitting the sink 
nodes, because we have severe misrouting and frequently miss the sink nodes 
altogether.

5. It didn't go the same route because of temporary backoff or preemptive 
rejection, and the 3 routes are different enough to cover a lot more nodes. 
Either we cache on more nodes without storing on many more, and later fetch it 
from the cache thanks to further misrouting, or we store on more nodes which 
are somehow misplaced; either way the network is rather chaotic.
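
To illustrate the mechanism behind #3, here is a toy simulation. It is not 
Freenet code: the HTL maximum, the decrement probability, the "too high to 
store" threshold and the route length are all made-up constants. It only shows 
that if storing at the sink depends on arriving there with a low enough HTL, 
and HTL is decremented randomly near the maximum, then three inserts on three 
different routes beat one insert handily:

    import java.util.Random;

    // Toy model of theory #3, NOT Freenet code: all constants are illustrative.
    public class TripleInsertToy {
        static final int MAX_HTL = 18;                  // illustrative
        static final double DECREMENT_PROB = 0.5;       // illustrative
        static final int STORE_THRESHOLD = MAX_HTL - 2; // illustrative "too high to store" cutoff
        static final int HOPS_TO_SINK = 2;              // illustrative route length

        // Does one insert, on one independently chosen route, arrive storable?
        static boolean singleInsertStored(Random r) {
            int htl = MAX_HTL;
            for (int hop = 0; hop < HOPS_TO_SINK; hop++) {
                // While at the maximum, HTL is only decremented probabilistically.
                if (htl < MAX_HTL || r.nextDouble() < DECREMENT_PROB) htl--;
            }
            return htl <= STORE_THRESHOLD;
        }

        public static void main(String[] args) {
            Random r = new Random();
            int runs = 200000, single = 0, triple = 0;
            for (int i = 0; i < runs; i++) {
                if (singleInsertStored(r)) single++;
                // Three inserts assumed to take three *different* routes.
                if (singleInsertStored(r) || singleInsertStored(r) || singleInsertStored(r)) triple++;
            }
            System.out.printf("stored at sink: single insert %.3f, three inserts %.3f%n",
                    (double) single / runs, (double) triple / runs);
        }
    }

With these made-up numbers a single insert arrives storable about half the 
time, while at least one of three independently routed inserts does so nearly 
90% of the time; the exact figures mean nothing, the shape of the effect is 
the point.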

So we need to determine which theory is correct. Clearly not the first two. If 
routing is working well and misrouting is not particularly important, then #3. 
If misrouting is a common problem that prevents us from reaching the sink 
nodes, then #4 or #5.

Either way we need to fix 3637/3639 as a matter of urgency, and see how this 
affects the test results. If there remains a dramatic gap between performance 
of an insert done once and one done 3 times, then we should consider some more 
radical options:
- Increasing the max HTL on inserts but not on requests.
- Queueing inserts until they can get a reasonably optimal next hop.

If the decay is consistent (which it may not be exactly, but it should be 
close), and if the actual 1-week survival rate is 90%, which IMHO is plausible 
although we don't have many samples here, then this makes a huge difference to 
fetchability:

Block reachability after 1 week    Half-life
0.765                              2-3 weeks
0.9                                6-7 weeks
0.92                               8-9 weeks
0.96                               17 weeks
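
(For reference, these half-lives are what you get if you assume a constant 
weekly survival probability p, so that reachability after t weeks is p^t; 
solving p^t = 0.5 gives a half-life of ln(0.5)/ln(p) weeks. A two-line check:)

    // Half-life in weeks for a constant weekly survival probability p.
    public class HalfLife {
        public static void main(String[] args) {
            for (double p : new double[] { 0.765, 0.9, 0.92, 0.96 })
                System.out.printf("p = %.3f -> half-life %.1f weeks%n",
                        p, Math.log(0.5) / Math.log(p));
        }
    }

This prints 2.6, 6.6, 8.3 and 17.0 weeks, matching the table.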

We will need to increase our rate of gathering data in order to be sure 
reasonably quickly, e.g. by making the MHK tester insert many more blocks per 
day in the same pattern. We may also need to back out some of the other 
routing/storage-related changes that are already in trunk, so that we are not 
measuring too many changes at once.

In the longer term, inserting top or lone blocks multiple times to improve 
persistence is definitely a good idea. But IMHO if we investigate this properly 
we may be able to improve persistence dramatically for *all* blocks, and 
therefore for content in general.