On Jan 24, 2008, at 11:45 AM, Robert Hailey wrote:
>
> On Jan 11, 2008, at 1:53 PM, Robert Hailey wrote:
>
>> On Jan 11, 2008, at 12:42 PM, Matthew Toseland wrote:
>>> On Wednesday 09 January 2008 17:14, Robert Hailey wrote:
>>>>
>>>> I have reverted r16886, as it appears to be based on a
>>>> misunderstanding of how requests work against the topology of the
>>>> network (r16980).
>>>
>>> I was rather hoping to be talked into accepting 16886 ... we should
>>> at least have some logging in such cases IMHO.
>>>
>>> Requests really shouldn't be taking that long - maybe it's related
>>> to the HTL problem, maybe we have such a perverse network topology
>>> that we are resetting HTL time after time after time, I dunno,
>>> simulations would be interesting.
>>
>> I'm thinking that message queue priorities will obsolete this
>> problem, as the path for responses to requests will solidify nearly
>> immediately. Which is to say, we will then see mostly fetch-timeouts,
>> not accepted/fatal-timeouts.
>
> Well, I do think that this problem *generally* has gone away. A
> large part of the timeouts may have been request coalescing
> deadlocks. In my logs, I no longer see that "requestsender took too
> long to respond to requestor (+2m)", but when I do see that log
> statement fire, it is huge!
>
> Jan 24, 2008 17:05:11:767 (freenet.node.RequestHandler,
> RequestSender for UID 5637402349040790252, ERROR):
> requestsender took too long to respond to requestor (16m10s/3)
> Jan 24, 2008 17:05:14:446 (freenet.node.RequestHandler,
> RequestSender for UID 98827504771122964, ERROR):
> requestsender took too long to respond to requestor (16m8s/3)
> Jan 24, 2008 17:05:14:447 (freenet.node.RequestHandler,
> RequestSender for UID 774454676209630, ERROR):
> requestsender took too long to respond to requestor (16m8s/3)
> Jan 24, 2008 17:23:00:203 (freenet.node.RequestHandler,
> RequestSender for UID 7341907878853950087, ERROR):
> requestsender took too long to respond to requestor (34m33s/4)
>
> Half an hour for one request? Good night!
This may be an effect of having an incredibly slow peer, although even
sendSync() should not block nearly that long, so I can't imagine why.
I'll keep watching it.
--
Robert Hailey
>
> Relatedly, I have a patch which scans the running uids every 20
> minutes, logging and removing those that have been running for
> longer than the period between checks (20 minutes). I intended this to
> make sure that no UIDs were leaked from the recent thread
> optimization (r17190). I'm hesitant to commit it as it is a bit of a
> kludge rather than a fix (not leaking the uids), and if requests are
> taking that long it would interfere with loop detection.
>
> Interestingly, of the four uid trackers {CHK/SSK}{request/insert},
> only SSK-inserts have NOT come up in the stale-uid scan logging; so
> this is not confined to requestSender/r16886.
>
> --
> Robert Hailey
>
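For reference, the stale-uid scan described above could be sketched
roughly as follows. This is only an illustration of the idea, not the
actual patch: the class StaleUidScanner and its methods are
hypothetical names and do not correspond to anything in the Freenet
tree, and timestamps are passed in explicitly so the logic can be
exercised without waiting 20 minutes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch of a stale-uid scan. It maps each running UID to
 * the time it was registered and, on each scan, logs and removes any
 * UID that has been running longer than the period between checks.
 */
class StaleUidScanner {
    // uid -> registration time in milliseconds
    private final Map<Long, Long> startTimes = new ConcurrentHashMap<>();
    private final long maxAgeMillis;

    StaleUidScanner(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    /** Record that a request or insert with this uid has started. */
    void register(long uid, long nowMillis) {
        startTimes.put(uid, nowMillis);
    }

    /** Normal completion path: the uid is released by its owner. */
    void complete(long uid) {
        startTimes.remove(uid);
    }

    boolean isRunning(long uid) {
        return startTimes.containsKey(uid);
    }

    /** Log and remove every uid older than maxAgeMillis; return them. */
    List<Long> scan(long nowMillis) {
        List<Long> stale = new ArrayList<>();
        Iterator<Map.Entry<Long, Long>> it = startTimes.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Long> entry = it.next();
            long age = nowMillis - entry.getValue();
            if (age > maxAgeMillis) {
                System.err.println("stale uid " + entry.getKey()
                        + " still running after " + age + "ms, removing");
                it.remove();
                stale.add(entry.getKey());
            }
        }
        return stale;
    }
}
```

In a running node, scan() would presumably be driven by a periodic
timer with the period equal to maxAgeMillis. Note that removing a uid
that is merely slow rather than leaked means a later request carrying
the same uid would no longer be recognised as a loop, which is the
interference with loop detection mentioned above.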