On Tuesday 08 January 2008 22:57, Robert Hailey wrote:
>
> On Jan 8, 2008, at 3:33 PM, Matthew Toseland wrote:
>
> > On Saturday 05 January 2008 00:50, Robert Hailey wrote:
> >>
> >> Interestingly (now that I have got the simulator running), this
> >> 'general timeout' appears even in simulations between nodes on the
> >> same machine. Unless I coded something wrong, perhaps there is an
> >> added delay or missing response somewhere which is not obvious?
> >
> > Entirely possible. Fixing it would be better than an arbitrary
> > cutoff when we are still able to potentially find the data, and
> > still have enough HTL to do so.
>
> On Jan 8, 2008, at 2:27 PM, Matthew Toseland wrote:
> > On Friday 04 January 2008 18:32, Robert Hailey wrote:
> >>
> >> Apparently, until revision 16886, (so long as no single node times
> >> out) a node will take as long as necessary to exhaust its routable
> >> peers, even long after the original requestor has given up on that
> >> node.
> >
> > Is there any evidence that this happens in practice? Surely the HTL
> > should prevent excessive searching in most cases?
>
> There is, in fact. The timeout itself (which I have been running on my
> node for a while) is evidence of the behavior (which to me seems
> incorrect).
>
> Jan 08, 2008 20:03:41:146 (freenet.node.RequestSender, RequestSender
> for UID -3998139406700477577, ERROR): discontinuing non-local request
> search, general timeout (6 attempts, 3 overloads)
> ...
> Jan 08, 2008 20:12:21:226 (freenet.node.RequestSender, RequestSender
> for UID 60170596711015291, ERROR): discontinuing non-local request
> search, general timeout (1 attempts, 3 overloads)
Ouch. How common is this?
>
> You see... in the first log statement the node tried six peers before
> running out of time. In the second case (which occurs quite
> frequently), the node spent the entire 2 minutes waiting on a response
> from a single node (FETCH_TIMEOUT); if it were allowed to continue to
> the next node, it could (perhaps 65% of the time) spend another 2
> minutes on just that node.
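For concreteness, here is roughly the shape of the guard being discussed
(a sketch only: FETCH_TIMEOUT and the 20-peer figure come from this
thread; SEARCH_TIMEOUT and everything else is invented for illustration):

    public class GeneralTimeoutSketch {
        static final long FETCH_TIMEOUT = 120 * 1000;  // existing 2-minute per-peer wait
        static final long SEARCH_TIMEOUT = 120 * 1000; // assumed overall cap per request

        public static void main(String[] args) throws InterruptedException {
            long start = System.currentTimeMillis();
            int attempts = 0;
            for (int peer = 0; peer < 20; peer++) { // "most nodes have 20" peers
                if (System.currentTimeMillis() - start > SEARCH_TIMEOUT) {
                    // The branch that produces the "general timeout" log line:
                    // routable peers remain, but the originator has given up.
                    System.out.println("general timeout (" + attempts + " attempts)");
                    return;
                }
                attempts++;
                // Stand-in for waiting on one peer; the real code can block
                // here for the full FETCH_TIMEOUT before trying the next peer.
                Thread.sleep(10);
            }
            System.out.println("exhausted routable peers after " + attempts + " attempts");
        }
    }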
>
> >> To the best of my knowledge, none of the upstream nodes will
> >> respond with the LOOP rejection before then. And even well before
> >> the worst case, this effect can accrue across many nodes in the
> >> path.
> >
> > If the same request is routed to a node which is already running it,
> > it rejects it with RejectedLoop. If it's routed to a node which has
> > recently run it, it again rejects it. If it is a different request
> > for the same key, it may be coalesced.
>
> If you mean that the RECENTLY_FAILED mechanism would keep this in
> check... I see this idea in many places, but I cannot see where it is
> actually implemented. The only place I see that creates an
> FNPRecentlyFailed message is in RequestHandler (upon its RequestSender
> having received one).
It's part of the unfinished ULPRs system. It will be implemented after I have
opennet fully sorted out.
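Roughly, the check works like this (names and structures invented for
the sketch; this is not the actual handler code):

    import java.util.HashMap;
    import java.util.HashSet;

    public class LoopCheckSketch {
        private final HashSet<Long> running = new HashSet<Long>();           // UIDs in flight
        private final HashSet<Long> recentlyCompleted = new HashSet<Long>(); // last ~10000 UIDs
        private final HashMap<String, Long> byKey = new HashMap<String, Long>();

        String handle(long uid, String key) {
            if (running.contains(uid) || recentlyCompleted.contains(uid))
                return "RejectedLoop"; // same request seen (or recently seen) again
            if (byKey.containsKey(key))
                return "coalesced";    // different request, same key: piggy-back
                                       // on the fetch already in flight
            running.add(uid);
            byKey.put(key, uid);
            return "accepted";
        }

        public static void main(String[] args) {
            LoopCheckSketch node = new LoopCheckSketch();
            System.out.println(node.handle(42L, "KSK@example")); // accepted
            System.out.println(node.handle(42L, "KSK@example")); // RejectedLoop
            System.out.println(node.handle(99L, "KSK@example")); // coalesced
        }
    }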
>
> Presently the node will "RejectLoop" only if the request is one of the
> last 10000 completed requests. My node runs through that many requests
> in about 16 minutes. The logging above already shows that a request
> can last longer than 2 minutes on a single peer (and most nodes have
> 20 peers). If you assume that a request takes 4 minutes per node (two
> peers, VERY optimistic), then it would take only 4 nodes ('near' each
> other) to generate a request live-lock: each node tries two of its
> other peers and then the next node in the 4-chain, so by the time the
> request loops back its entry has already rotated out of the
> 10000-request memory. Absent the HTL, the request would never drop
> from the network.
Okay, this is a problem.
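Working through your numbers (same assumptions as in your paragraph
above):

    public class LiveLockArithmetic {
        public static void main(String[] args) {
            double cacheEntries = 10000;  // completed requests remembered
            double cacheLifetimeMin = 16; // observed turnover on your node
            System.out.println("throughput: ~"
                    + (int) (cacheEntries / cacheLifetimeMin) + " requests/min");

            double minutesPerNode = 4;    // two peers at 2 minutes each
            int nodesInLoop = 4;
            double loopPeriodMin = minutesPerNode * nodesInLoop; // = 16 minutes
            // The loop takes as long as the memory lasts, so by the time the
            // request comes back around its UID has rotated out and
            // RejectedLoop never fires; absent HTL it circulates indefinitely.
            System.out.println("loop period " + loopPeriodMin
                    + " min >= memory lifetime " + cacheLifetimeMin + " min");
        }
    }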
>
> I do not think that the timeout I added is arbitrary. As I understand
> Ian's original networking theory, a request is no longer valid after
> the originator has timed out. In much the same way that a single node
> fatally timing out collapses the request chain, so too should a node
> 'taking too long' collapse it (as that node *IS* the one fatally
> timing out the chain).
Well, we can't easily inform downstream nodes of a request timing out. And we
can't include the updated timeout on each hop either (for security reasons).
>
> But on the other hand, I do understand your point about the HTL, and
> that it would keep the request from continuing indefinitely; even so,
> it seems like it could be quite a waste of network resources.
> Certainly beyond that point in time (once the requester has fatally
> timed out) no response should be sent back to the source (such replies
> could account for many of the unclaimed FIFO packets); or maybe a
> response should go back only if the data is finally found (you
> mentioned ULPRs).
Maybe there is another reason for this behaviour.
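If I follow, the suppression you're suggesting would look something like
this; purely a sketch, and every name in it (offerViaULPR especially) is
hypothetical:

    public class LateReplySketch {
        static final long SOURCE_TIMEOUT = 120 * 1000; // assumed originator patience

        void onUpstreamReply(Message msg, long requestStarted) {
            long elapsed = System.currentTimeMillis() - requestStarted;
            if (elapsed > SOURCE_TIMEOUT) {
                if (msg.isData())
                    offerViaULPR(msg); // the one reply still worth propagating
                // Anything else is dropped: the source has given up, and the
                // reply would only pile up among the unclaimed FIFO packets.
                return;
            }
            forwardToSource(msg);
        }

        // Stubs so the sketch stands alone:
        static class Message { boolean isData() { return false; } }
        void offerViaULPR(Message m) { }
        void forwardToSource(Message m) { }
    }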