On Jan 11, 2008, at 1:53 PM, Robert Hailey wrote:

> On Jan 11, 2008, at 12:42 PM, Matthew Toseland wrote:
>> On Wednesday 09 January 2008 17:14, Robert Hailey wrote:
>>>
>>> I have reverted r16886, as it appears to be based on a misunderstand
>>> of how requests work against the topology of the network (r16980).
>>
>> I was rather hoping to be talked into accepting 16886 ... we should
>> at least
>> have some logging in such cases IMHO.
>>
>> Requests really shouldn't be taking that long - maybe it's related
>> to the HTL
>> problem, maybe we have such a perverse network topology that we are
>> resetting
>> HTL time after time after time, I dunno, simulations would be
>> interesting.
>
> I'm thinking that message queue priorities will obsolete this problem,
> as the path for responses to requests will solidify nearly
> immediately. Which is to say, we will then see mostly fetch-timeouts,
> not accepted/fatal-timeouts.

Well, I do think that this problem *generally* has gone away. A large  
part of the timeouts may have been request coalescing deadlocks. In my  
logs, I no longer see that "requestsender took to long to respond to  
requestor (+2m)", but when I do see that log statement fire, it is huge!

Jan 24, 2008 17:05:11:767 (freenet.node.RequestHandler, RequestSender  
for UID 5637402349040790252, ERROR):
requestsender took too long to respond to requestor (16m10s/3)
Jan 24, 2008 17:05:14:446 (freenet.node.RequestHandler, RequestSender  
for UID 98827504771122964, ERROR):
requestsender took too long to respond to requestor (16m8s/3)
Jan 24, 2008 17:05:14:447 (freenet.node.RequestHandler, RequestSender  
for UID 774454676209630, ERROR):
requestsender took too long to respond to requestor (16m8s/3)
Jan 24, 2008 17:23:00:203 (freenet.node.RequestHandler, RequestSender  
for UID 7341907878853950087, ERROR):
requestsender took too long to respond to requestor (34m33s/4)

Half an hour for one request? Good night!

Relatedly, I have a patch which scans the running uids every 20  
minutes, logging and removing those found to still be running (longer  
than period between checks; 20 minutes). I intended this to make sure  
that no UIDs were leaked from the recent thread optimization (r17190).  
I'm hesitant to commit it as it is a bit of a kludge rather than a fix  
(not leaking the uids), and if requests are taking that long it would  
interfere with loop detection.

Interestingly, of the four uid trackers {CHK/SSK}{request/insert},  
only SSK-inserts have NOT come up in the stale-uid scan logging; so  
this is not confined to requestSender/r16886.

--
Robert Hailey

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
<https://emu.freenetproject.org/pipermail/devl/attachments/20080124/a4fdebd5/attachment.html>

Reply via email to