On Thursday 18 October 2001 15:24, you wrote:
> On Thu, Oct 18, 2001 at 03:20:42PM -0400, Gianni Johansson wrote:
> > On Thursday 18 October 2001 08:04, you wrote:
>
> < >
>
> > > As I said, I don't see the point in this. The CP value already gives a
> > > "rest"; it just does so randomly rather than absolutely, but it
> > > averages out.
> >
> > Averaging out isn't good enough.  Transient routing failures -- caused by
> > target nodes hitting their thread/connection limit -- are very common.
> > You need to recover from them as soon as possible.  If you have one node
> > ref that just stopped working and the rest haven't worked for days, it
> > makes sense to retry the one that was working as soon as reasonably
> > possible (i.e. after the rest interval expires).  Otherwise the network
> > will collapse under heavy stress a la 0.3.
>
> But that is what the CP does. 

> The one that just failed recently will
> have a greater CP than those that have been broken for a long time -
> therefore it will get checked sooner. 
This assertion is false; see below.

> Perhaps the decay needs to be
> tuned, but the math here is already correct.
>
> > Just to clarify, my change increases the timeout interval with each
> > successive failure, so if the refs have all been retried and failed and
> > none of their timeout intervals has expired, the request will RNF.
>
> Which is what decaying the CP already does.
>
> <>
>
> > The perfect is not the enemy of the good.
> >
> > Remember, this code should almost never run.  Its purpose is to allow the
> > node to bootstrap into the network with a small set of potentially crappy
> > noderefs and/or to recover after it has been overloaded -- i.e. it has
> > routed so many requests that all of its node refs are no longer responding.
> >
> > As you have pointed out the previous implementation was just wrong.  I
> > think my modifications are an improvement.
>
> The correct thing to do is keep it simple and consistent. Use the CP and
> only the CP. If you run out of refs always go to RouteNotFound.
> Unnecessary complication IS the enemy of the good.
>

The fundamental problem with the current CP approach is that it doesn't 
take time into account in the way it models contact reliability.  No amount
of tuning will fix this.

A noderef that was responding to 100% of requests until 20 minutes ago but has
failed to respond to the last 10 requests is qualitatively different from a
noderef which has failed all 10 requests that were made to it since the node
was started a week ago.  The former is much more likely to respond than the
latter.

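To make this concrete, here is a rough sketch of the sort of per-ref backoff I
described above.  Every name in it (NodeRefBackoff, BASE_REST_MILLIS, and so
on) is made up for illustration and none of it is meant to match the actual
code; the point is just that the retry decision depends on when the last
failure happened and on a rest interval that grows with consecutive failures
and is reset by any success.  The recently healthy ref becomes retryable again
quickly, the ref that has been dead for a week stays rested, and if nothing is
retryable the request goes to RouteNotFound.

public class NodeRefBackoff {

    private static final long BASE_REST_MILLIS = 30 * 1000;      // first rest: 30 seconds (arbitrary)
    private static final long MAX_REST_MILLIS  = 60 * 60 * 1000; // never rest a ref longer than an hour

    private int  consecutiveFailures = 0;
    private long lastFailureMillis   = 0;

    /** Call when a request routed through this ref succeeds. */
    public synchronized void reportSuccess() {
        consecutiveFailures = 0;
    }

    /** Call when a request routed through this ref fails or times out. */
    public synchronized void reportFailure() {
        consecutiveFailures++;
        lastFailureMillis = System.currentTimeMillis();
    }

    /**
     * True once the current rest interval has expired.  The interval doubles
     * with each consecutive failure, so a ref that just stopped working is
     * retried again quickly, while one that keeps failing rests longer and
     * longer.  If no ref in the table is retryable, the request should go to
     * RouteNotFound rather than hammering dead refs.
     */
    public synchronized boolean isRetryable() {
        if (consecutiveFailures == 0)
            return true;
        long rest = BASE_REST_MILLIS << Math.min(consecutiveFailures - 1, 20);
        if (rest > MAX_REST_MILLIS)
            rest = MAX_REST_MILLIS;
        return System.currentTimeMillis() - lastFailureMillis >= rest;
    }
}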

> > > I agree, that seems wrong. I think that it should rather be
> > >
> > > tm = new ThreadPool(tg, node.maximumThreads >> 1, node.maximumThreads,
> > > 5);
> > >
> > > though "5" seems awfully arbitrary. I'm quite sure the intention was
> > > to have maximumThreads maximum, and half as many in the pool by
> > > default, so this is probably simply a mistake.
> >
> > I'm not sure I am following you.
>
> Well, I'm not entirely sure how the ThreadPool works, but I thought that
> the pool number was the number of threads that were always kept alive
> (though I guess "minPool" would be a better name for it), the
> maxThreads number was the maximum that could ever be alive, and I don't
> see the need to enqueue any jobs at all (there is no sense in
> leaving jobs hanging that we don't have threads for; all new Threads except
> connections come from the Ticker).
>
> It seems logical to me that we keep an active pool of about half the
> allowable threads - why would we only keep 5?
Back in the 0.3 days people expressed dismay that all of those "unused"
threads were being kept around.

I think we are finally on the same page.  No queued jobs.
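For the record, here is a rough sketch of the semantics I think we are agreeing
on, written against java.util.concurrent's ThreadPoolExecutor purely for
brevity.  The class name, the 60-second idle timeout, and the rest are made up,
and this is not meant to be the real ThreadPool: roughly half of maximumThreads
are kept alive as the standing pool, up to maximumThreads may run at once, and
a job is rejected outright instead of being queued when everything is busy.

import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class NoQueuePool {

    private final ThreadPoolExecutor pool;

    public NoQueuePool(int maximumThreads) {
        pool = new ThreadPoolExecutor(
                maximumThreads >> 1,              // "minPool": standing threads, never reaped
                maximumThreads,                   // hard ceiling on live threads
                60L, TimeUnit.SECONDS,            // surplus threads die after 60s idle
                new SynchronousQueue<Runnable>(), // direct hand-off: nothing is ever queued
                new ThreadPoolExecutor.AbortPolicy()); // reject instead of queueing
        pool.prestartAllCoreThreads();            // bring the standing pool up immediately
    }

    /** Runs the job on a pool thread, or returns false if all maximumThreads are busy. */
    public boolean run(Runnable job) {
        try {
            pool.execute(job);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }
}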

--gj

-- 
Freesites
(0.3) freenet:MSK@SSK@enI8YFo3gj8UVh-Au0HpKMftf6QQAgE/homepage//
(0.4) freenet:SSK@npfV5XQijFkF6sXZvuO0o~kG4wEPAgM/homepage//

_______________________________________________
Devl mailing list
Devl@freenetproject.org
http://lists.freenetproject.org/mailman/listinfo/devl
