On Mon, Nov 03, 2003 at 04:03:34PM -0600, Tom Kaitchuck wrote:
> On Monday 03 November 2003 02:10 am, Tracy R Reed wrote:
> > You are implying that frost is the cause of this? If that's the case I
> > think the frost project has to die because it is killing the rest of the
> > network. But I'm not sure that it is. It would be nice if there were a
> > psuccess measurement which did not include KSKs. With an 8% psuccess rate
> > I would have to insert a splitfile with around 1000% redundancy to be able
> > to get it on the first try. Is that what you propose that we do? Or do we
> > just click retry a whole lot of times and cross our fingers? I haven't
> > been able to receive TFE for a couple days and my node has 20G of data in
> > the store and a DS-3 and has been up for months. Others can retrieve it so
> > it was definitely inserted today. I recall someone on the channel once
> > reported a 25% psuccess and you were impressed. Doesn't it seem odd to you
> > that you would buy that as a possible realistic number and now you ask us
> > to consider that perhaps 92% of requests are for data that isn't in the
> > network?
> .......
> > I have a challenge for you: Name one other computer program that works yet
> > nobody can explain how. And you intend to put Freenet on this list? I just
> > don't buy the "nonobvious" routing theory.
> 
> I for one DO buy that theory, because I have a very good idea of why the 
> network is not working. As I stated in my post on Tuesday in the thread 
> "Looking good, but still a big problem...", the reason the network is not 
> working at this point is likely not due to bugs in the NGrouting code itself.
> 
> I suspect that it is not doing what we want because it is interacting with 
> probabilistic caching. Meaning that as one gets closer and closer to the 
> destination of the data, the more likely the data is to be found on any 
> random node as compared to the probability that it is on the closest match. 
> Because we cache probabilistically, the closer the request gets to its 
> destination, the better fast non-specialized nodes look. This behavior is also 
> self-reinforcing, as the fast nodes continue to look good and wind up storing 
> all the popular content. This is somewhat unavoidable, because it is 
> impossible to make NGrouting sensitive enough to pick up on the 
> popularity/unpopularity of a particular key.
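
To make the mechanism concrete, here is a minimal sketch of probabilistic
caching as described above; the probability value and names are illustrative,
not the node's actual code:

    import java.util.Random;

    // Sketch: every node on the reply path rolls a die, independent of how
    // well the key matches its specialization. Over many requests, popular
    // data therefore spreads onto fast generalist nodes instead of
    // concentrating on the closest-match node.
    class ProbabilisticCache {
        private static final double P_CACHE = 0.25; // illustrative value
        private final Random rng = new Random();

        boolean shouldStore(byte[] key) {
            return rng.nextDouble() < P_CACHE; // key closeness is ignored
        }
    }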
> 
> A quick way to test this is to see if your successful incoming request 
> histogram looks more specialized than your data store. For mine it does.
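
One way to make "looks more specialized" measurable (my own construction,
purely illustrative): bucket both distributions over the keyspace and compare
their entropies; the lower-entropy histogram is the more specialized one.

    // Shannon entropy, in bits, of a keyspace histogram. Compare
    // entropy(successfulIncomingRequests) against entropy(dataStore):
    // a lower value means that distribution is more specialized.
    final class Specialization {
        static double entropy(int[] histogram) {
            long total = 0;
            for (int c : histogram) total += c;
            double h = 0.0;
            for (int c : histogram) {
                if (c == 0) continue;
                double p = (double) c / total;
                h -= p * (Math.log(p) / Math.log(2));
            }
            return h;
        }
    }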
> 
> So, a request may not succeed because it gets routed to a node that was likely 
> to fail, but NGrouting does not consider this bad, because the node was fast, 
> and would have been the best choice overall had the request been retried FROM 
> THAT POINT. However, it does not get retried from that point; instead it goes 
> back to the original requester and retries along a similar or perhaps the 
> same path. The network does not learn to route a specific key differently; 
> rather it looks at the overall pattern, and there the faster nodes look 
> better.

This does not make sense, because NGRouting always considers DNF to take
the time to DNF, plus the global time for success - which is generally
pretty huge, and does not depend on the HTL, only on the key.
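
In other words (a sketch of the cost calculation as just described; names
are illustrative):

    // Expected cost NGRouting assigns to sending a request to a node.
    // tGlobalSuccess is the network-wide time-to-success for the key, so a
    // likely-DNF node is penalized heavily no matter how fast its DNFs are.
    final class NGRoutingCost {
        static double expectedCost(double pSuccess, double tSuccess,
                                   double tDNF, double tGlobalSuccess) {
            double pDNF = 1.0 - pSuccess; // ignoring transfer failures etc.
            return pSuccess * tSuccess + pDNF * (tDNF + tGlobalSuccess);
        }
    }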
> 
> I don't mean to paint a bleak picture here; you just have to bear in mind 
> that NGrouting optimizes for the retrieval time, not the probability of 
> success.
> 
> So, what needs to be done:
> First, prevent flooding. That is what is exacerbating the current problems. To 
> do that we should try to smooth out QRs rather than just rejecting everything 
> for a set time. Second, it might be better if we did not do query rejects at 

Perhaps. We tried it before... I'm not sure how well it worked.

> all, but rather have a query accept message. That way, under load, we would not 
> waste any more bandwidth to reject incoming requests. It would also make 
> predicting the time of a QR simple, as it is always a fixed timeout, and QRs 
> would look comparatively bad because you have to wait for the timeout. This 
> would eliminate the need for query rejects "with a punch".
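
A sketch of what that accept-only handshake could look like (hypothetical
types, not the real message API): silence within a fixed window is the
reject, so an overloaded node spends no bandwidth saying no, and the cost of
a reject is exactly the timeout.

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.TimeUnit;

    class AcceptOnlyRequester {
        static final long ACCEPT_TIMEOUT_MS = 5000; // fixed, hence predictable

        enum Msg { QUERY_ACCEPTED, DATA_REPLY }

        // incoming stands in for the peer's message stream (hypothetical).
        // A null poll result is the implicit reject: we waited the full
        // timeout, which is what makes rejecting nodes look comparatively bad.
        boolean queryAccepted(BlockingQueue<Msg> incoming)
                throws InterruptedException {
            Msg m = incoming.poll(ACCEPT_TIMEOUT_MS, TimeUnit.MILLISECONDS);
            return m == Msg.QUERY_ACCEPTED;
        }
    }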

We have an Accepted message, but after the query is accepted it can be
rejected at a later time for various reasons.
> 
> Then implement TUKs; everyone agrees that KSKs and SSKs in Frost waste a lot 
> of bandwidth and are at least partly responsible for the all-around low 
> success rates. 

I'm not sure that world-writable TUKs make sense. I suppose if you can write
to the board, you can already flood it with messages - but Frost TUKs would
have to carry a payload of an index number for the most recent message, and
Frost would then try to fetch messages up to that point, so it would be far
easier to just insert a TUK with an absurd index number than to insert
hundreds of spam messages.
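
To illustrate the attack (hypothetical API, not Frost's actual code): the
fetch loop below is driven entirely by the index the TUK claims, so one
forged TUK triggers an enormous range of fetch attempts, which is far
cheaper for the attacker than inserting that many spam messages.

    interface Board { void tryFetchMessage(int index); } // hypothetical

    final class TukFetcher {
        // Fetch everything between the last message we have and the index
        // the TUK advertises. Without some sanity bound, a claimedNewest of
        // 1000000 makes every reader attempt a million fetches.
        static void fetchNewMessages(Board board, int lastSeen,
                                     int claimedNewest) {
            for (int i = lastSeen + 1; i <= claimedNewest; i++) {
                board.tryFetchMessage(i); // DNFs are simply skipped
            }
        }
    }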
> 
> Third (and this is the big one) eliminate probabilistic caching. I'm not 
> proposing going to random caching, but rather inventing NGcaching. That is, 
> each passing successful request should be added to the list of requests we 
> have seen and referenced to the node that provided it, as it does now. But 
> rather than storing the data in the cache if it is within a certain distance 
> of the origin, evaluate it based on the probability that it will be requested 
> in the future. 
> 
> Because it would be fairly difficult to come up with a very good algorithm for 
> accurately estimating this, taking into account the distance from the 
> target, the request HTL as it passed, the number of times we have seen this 
> data before, the key value (and how long it took to get it?), this would 
> probably be best left to an SVM. I must admit I know nothing about SVMs or 
> how they work. But I recall that someone who did understand them wrote one 
> that was actually able to predict routing times better than our current 
> algorithm; it was decided against because it required too much CPU. However, 
> for the data store this would not really be a problem, because we would only 
> need to run one, and not hundreds. 

Why not use an estimator? If they aren't working, NGRouting is stuffed.
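
For illustration, an estimator-style admission decision over the features
Tom lists (weights and names invented; an SVM would learn them rather than
have them hand-tuned):

    // Score a passing successful reply on how likely it is to be requested
    // again, and only cache it if it beats the weakest block already stored.
    class NGCacheAdmission {
        double score(double closenessToOurKeys, int htl,
                     int timesSeenBefore, long fetchMillis) {
            // Hand-tuned linear estimator standing in for the proposed SVM.
            return 2.0 * closenessToOurKeys
                 + 0.1 * htl
                 + 1.0 * Math.log1p(timesSeenBefore)
                 + 0.5 * Math.log1p(fetchMillis);
        }

        boolean shouldCache(double newScore, double weakestStoredScore) {
            return newScore > weakestStoredScore;
        }
    }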
> 
> Fourth (and this is long term) if we have reasonable assurances that the data 
> is actually in the network, it would be more efficient to simply retry the 
> request from the last node that failed. However, this opens the door to 
> flooding and anti-specialization attacks. So for this to be possible we 
> should have both Hash-Cash and a routing system biased by positive trust. 
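
Hash-Cash itself is simple enough to sketch (parameters illustrative): the
requester must find a nonce whose hash over the request has a given number
of leading zero bits, which is cheap once per request but expensive for a
flooder issuing thousands.

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;

    final class HashCash {
        // Find a nonce such that SHA-1(payload + nonce) has >= bits leading
        // zero bits. The verifier recomputes a single hash to check it.
        static long mint(byte[] payload, int bits)
                throws NoSuchAlgorithmException {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            for (long nonce = 0; ; nonce++) {
                md.reset();
                md.update(payload);
                md.update(Long.toString(nonce)
                              .getBytes(StandardCharsets.US_ASCII));
                if (leadingZeroBits(md.digest()) >= bits) return nonce;
            }
        }

        static int leadingZeroBits(byte[] hash) {
            int n = 0;
            for (byte b : hash) {
                if (b == 0) { n += 8; continue; }
                return n + Integer.numberOfLeadingZeros(b & 0xff) - 24;
            }
            return n;
        }
    }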
> 
> And of course it is always good to have estimators of network health to 
> help debug things. So in case anyone had any delusions about how much work is 
> left to be done, there you are. The good news is that if all of that were done, 
> along with a few other things, like pre-mixrouting, we could be near 1.0. So 
> everyone: donate money and write code. My check will be in the mail soon.

-- 
Matthew J Toseland - [EMAIL PROTECTED]
Freenet Project Official Codemonkey - http://freenetproject.org/
ICTHUS - Nothing is impossible. Our Boss says so.
