On Thursday 24 January 2008 17:45, Robert Hailey wrote:
> 
> On Jan 11, 2008, at 1:53 PM, Robert Hailey wrote:
> 
> > On Jan 11, 2008, at 12:42 PM, Matthew Toseland wrote:
> >> On Wednesday 09 January 2008 17:14, Robert Hailey wrote:
> >>>
> >>> I have reverted r16886, as it appears to be based on a misunderstand
> >>> of how requests work against the topology of the network (r16980).
> >>
> >> I was rather hoping to be talked into accepting 16886 ... we should
> >> at least
> >> have some logging in such cases IMHO.
> >>
> >> Requests really shouldn't be taking that long - maybe it's related
> >> to the HTL
> >> problem, maybe we have such a perverse network topology that we are
> >> resetting
> >> HTL time after time after time, I dunno, simulations would be
> >> interesting.
> >
> > I'm thinking that message queue priorities will obsolete this problem,
> > as the path for responses to requests will solidify nearly
> > immediately. Which is to say, we will then see mostly fetch-timeouts,
> > not accepted/fatal-timeouts.
> 
> Well, I do think that this problem *generally* has gone away. A large  
> part of the timeouts may have been request coalescing deadlocks. In my  
> logs, I no longer see that "requestsender took to long to respond to  
> requestor (+2m)", but when I do see that log statement fire, it is huge!
> 
> Jan 24, 2008 17:05:11:767 (freenet.node.RequestHandler, RequestSender  
> for UID 5637402349040790252, ERROR):
> requestsender took too long to respond to requestor (16m10s/3)
> Jan 24, 2008 17:05:14:446 (freenet.node.RequestHandler, RequestSender  
> for UID 98827504771122964, ERROR):
> requestsender took too long to respond to requestor (16m8s/3)
> Jan 24, 2008 17:05:14:447 (freenet.node.RequestHandler, RequestSender  
> for UID 774454676209630, ERROR):
> requestsender took too long to respond to requestor (16m8s/3)
> Jan 24, 2008 17:23:00:203 (freenet.node.RequestHandler, RequestSender  
> for UID 7341907878853950087, ERROR):
> requestsender took too long to respond to requestor (34m33s/4)
> 
> Half an hour for one request? Good night!

This is suspicious, they are all roughly the same period except the last. I 
suggest you set log level minor and investigate what happened by searching 
for the UID.
> 
> Relatedly, I have a patch which scans the running uids every 20  
> minutes, logging and removing those found to still be running (longer  
> than period between checks; 20 minutes). I intended this to make sure  
> that no UIDs were leaked from the recent thread optimization (r17190).  
> I'm hesitant to commit it as it is a bit of a kludge rather than a fix  
> (not leaking the uids), and if requests are taking that long it would  
> interfere with loop detection.
> 
> Interestingly, of the four uid trackers {CHK/SSK}{request/insert},  
> only SSK-inserts have NOT come up in the stale-uid scan logging; so  
> this is not confined to requestSender/r16886.

The other question here is not decrementing the HTL on retries if we have 
decrementAtMax = false. I've just committed a patch for that.

I'll probably release alpha 2 from 1103...
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: 
<https://emu.freenetproject.org/pipermail/devl/attachments/20080124/f5009f4f/attachment.pgp>

Reply via email to