On Jan 8, 2008, at 3:33 PM, Matthew Toseland wrote:

> On Saturday 05 January 2008 00:50, Robert Hailey wrote:
>>
>> Interestingly (now that I have got the simulator running), this
>> 'general timeout' appears even in simulations between nodes on the
>> same machine. Unless I coded something wrong, perhaps there is an
>> added delay or missing response somewhere which is not obvious?
>
> Entirely possible. Fixing it would be better than an arbitrary cutoff when we are still able to potentially find the data, and still have enough HTL to do so.

On Jan 8, 2008, at 2:27 PM, Matthew Toseland wrote:
> On Friday 04 January 2008 18:32, Robert Hailey wrote:
>>
>> Apparently until revision 16886, (so long as no single node times out) a node will take as long as necessary to exhaust routable peers, even long after the original requestor has given up on that node.
>
> Is there any evidence that this happens in practice? Surely the HTL should prevent excessive searching in most cases?

There is, in fact. The timeout itself (which I have been running on my node for a while) is evidence of the behavior (which to me seems incorrect).

Jan 08, 2008 20:03:41:146 (freenet.node.RequestSender, RequestSender for UID -3998139406700477577, ERROR): discontinuing non-local request search, general timeout (6 attempts, 3 overloads)
...
Jan 08, 2008 20:12:21:226 (freenet.node.RequestSender, RequestSender for UID 60170596711015291, ERROR): discontinuing non-local request search, general timeout (1 attempts, 3 overloads)

You see, in the first log statement the node tried six peers before running out of time. In the second case (which occurs quite frequently), the node spent the entire 2 minutes (FETCH_TIMEOUT) waiting on a response from a single peer; if it were allowed to continue to the next peer, it could (65%) spend another 2 minutes on just that node.

>> To the best of my knowledge, all of the upstream nodes will not
>> respond with the LOOP rejection before then. And even well before the
>> worst case, this effect can accrue across many nodes in the path.
>
> If the same request is routed to a node which is already running it, it rejects it with RejectedLoop. If it's routed to a node which has recently run it, it again rejects it. If it is a different request for the same key, it may be coalesced.


If you mean that the RECENTLY_FAILED mechanism would keep this in check... I see this idea in many places, but I cannot see where it is actually implemented. The only place I see that constructs an FNPRecentlyFailed message is in RequestHandler (upon its RequestSender having received one).

Presently a node will "RejectLoop" a request only if it is among the last 10000 completed requests, and my node runs through that many requests in about 16 minutes. The logging above already shows that a request can wait longer than 2 minutes on a single peer (and most nodes have 20 peers). If you assume that a request takes 4 minutes (two peers, VERY optimistic), then it would take only 4 nodes ('near' each other) to generate a request live-lock (absent the HTL, the request would never drop from the network): each node tries two of its other peers, then hands off to the next node in the 4-chain.
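A rough sketch of that arithmetic (all constants are the estimates above, not values from the actual codebase):

```java
// Hypothetical live-lock arithmetic: if the loop-detection window (the last
// 10000 completed requests) spans ~16 minutes of traffic, and a request
// spends ~4 minutes per hop (two peers at a 2-minute FETCH_TIMEOUT each),
// then a cycle of 4 nodes takes ~16 minutes per lap -- long enough for the
// request to age out of each node's window before it comes back around,
// so RejectedLoop never fires.
public class LiveLockSketch {
    static final int COMPLETED_WINDOW = 10000;  // completed requests remembered
    static final double WINDOW_MINUTES = 16.0;  // observed on the author's node
    static final double MINUTES_PER_HOP = 4.0;  // two peers * 2-minute timeout

    /** Minutes for a request to make one lap around a cycle of nodes. */
    static double lapMinutes(int cycleNodes) {
        return cycleNodes * MINUTES_PER_HOP;
    }

    public static void main(String[] args) {
        // A 4-node cycle laps in exactly the window length, so loop
        // detection can no longer stop it.
        System.out.println(lapMinutes(4) >= WINDOW_MINUTES);
    }
}
```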

I do not think that the timeout I added is arbitrary. As I understand Ian's original networking theory, a request is not valid after the originator has timed out. In much the same way that a single node fatally timing out collapses the request chain, so too should a node 'taking too long' (as that node *is* the one fatally timing out the chain).

But on the other hand, I do understand your point about the HTL, and that it would keep the request from continuing indefinitely; still, it seems like quite a waste of network resources. Certainly, beyond the point in time where the requester has fatally timed out, no response should be sent back to the source (that could account for many of the unclaimed FIFO packets); or perhaps a response should be sent only if the data is finally found (you mentioned ULPRs).

--
Robert Hailey
