On Jan 24, 2008, at 11:45 AM, Robert Hailey wrote:

>
> On Jan 11, 2008, at 1:53 PM, Robert Hailey wrote:
>
>> On Jan 11, 2008, at 12:42 PM, Matthew Toseland wrote:
>>> On Wednesday 09 January 2008 17:14, Robert Hailey wrote:
>>>>
>>>> I have reverted r16886, as it appears to be based on a  
>>>> misunderstand
>>>> of how requests work against the topology of the network (r16980).
>>>
>>> I was rather hoping to be talked into accepting 16886 ... we should
>>> at least
>>> have some logging in such cases IMHO.
>>>
>>> Requests really shouldn't be taking that long - maybe it's related
>>> to the HTL
>>> problem, maybe we have such a perverse network topology that we are
>>> resetting
>>> HTL time after time after time, I dunno, simulations would be
>>> interesting.
>>
>> I'm thinking that message queue priorities will obsolete this  
>> problem,
>> as the path for responses to requests will solidify nearly
>> immediately. Which is to say, we will then see mostly fetch-timeouts,
>> not accepted/fatal-timeouts.
>
> Well, I do think that this problem *generally* has gone away. A  
> large part of the timeouts may have been request coalescing  
> deadlocks. In my logs, I no longer see that "requestsender took to  
> long to respond to requestor (+2m)", but when I do see that log  
> statement fire, it is huge!
>
> Jan 24, 2008 17:05:11:767 (freenet.node.RequestHandler,  
> RequestSender for UID 5637402349040790252, ERROR):
> requestsender took too long to respond to requestor (16m10s/3)
> Jan 24, 2008 17:05:14:446 (freenet.node.RequestHandler,  
> RequestSender for UID 98827504771122964, ERROR):
> requestsender took too long to respond to requestor (16m8s/3)
> Jan 24, 2008 17:05:14:447 (freenet.node.RequestHandler,  
> RequestSender for UID 774454676209630, ERROR):
> requestsender took too long to respond to requestor (16m8s/3)
> Jan 24, 2008 17:23:00:203 (freenet.node.RequestHandler,  
> RequestSender for UID 7341907878853950087, ERROR):
> requestsender took too long to respond to requestor (34m33s/4)
>
> Half an hour for one request? Good night!

This may be an effect of having an incredibly slow peer (although even  
sendSync() should not block nearly that long... so I can't imagine  
why), I'll keep watching it.

--
Robert Hailey

>
> Relatedly, I have a patch which scans the running uids every 20  
> minutes, logging and removing those found to still be running  
> (longer than period between checks; 20 minutes). I intended this to  
> make sure that no UIDs were leaked from the recent thread  
> optimization (r17190). I'm hesitant to commit it as it is a bit of a  
> kludge rather than a fix (not leaking the uids), and if requests are  
> taking that long it would interfere with loop detection.
>
> Interestingly, of the four uid trackers {CHK/SSK}{request/insert},  
> only SSK-inserts have NOT come up in the stale-uid scan logging; so  
> this is not confined to requestSender/r16886.
>
> --
> Robert Hailey
>



-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
<https://emu.freenetproject.org/pipermail/devl/attachments/20080124/db0519aa/attachment.html>

Reply via email to