On 12/01/2013 1:39 p.m., Alex Rousskov wrote:
Hello,

     I think there is an off-by-one error in the external_acl.cc code
that kills Squid:

2013/01/11 15:13:18.037| external_acl.cc(843) aclMatchExternal: 
"http://example-3.com/req=debug,uniq4,abort_at=AfterReqHs/resp=debug/": 
queueing a call.
2013/01/11 15:13:18.037| external_acl.cc(845) aclMatchExternal: 
"http://example-3.com/req=debug,uniq4,abort_at=AfterReqHs/resp=debug/": return 
-1.
2013/01/11 15:13:18.037| Acl.cc(321) checklistMatches: ACL::ChecklistMatches: 
result for 'dummyfilter' is -1
2013/01/11 15:13:18.037| Acl.cc(339) matches: ACLList::matches: result is false
2013/01/11 15:13:18.037| Checklist.cc(275) matchNode: 0x99fe8a8 matched=0 
async=1 finished=0
2013/01/11 15:13:18.037| Checklist.cc(312) matchNode: 0x99fe8a8 going async
2013/01/11 15:13:18.037| Acl.cc(61) FindByName: ACL::FindByName 'dummyfilter'
2013/01/11 15:13:18.037| Checklist.cc(131) asyncInProgress: 
ACLChecklist::asyncInProgress: 0x99fe8a8 async set to 1
2013/01/11 15:13:18.037| external_acl.cc(1407) Start: fg lookup in 
'dummyfilter' for 
'http://example-3.com/req=debug,uniq4,abort_at=AfterReqHs/resp=debug/'
2013/01/11 15:13:18.037| external_acl.cc(1450) Start: 'dummyfilter' queue is 
too long
2013/01/11 15:13:18.037| assertion failed: external_acl.cc:1451: "inBackground"

The enqueue check for external ACL lookups is inconsistent with the
final queue length check in ExternalACLLookup::Start(). The former
allows adding to the already full (but not yet overflowing) queue while
the latter rightfully(?) asserts that the queue should not overflow.

Should the queue_size check be relaxed to use 2*n_running logic, used by
some other seemingly similar checks?

And should we use n_running or n_active for these checks?

I think we should get away from the 2*N logic entirely and do something a bit safer. As it stands n_active AFAIK is the intended measure. n_running includes helpers being shutdown, so when they disappear the queue magically overflows. However if any one of the n_active helpers has a problem they get removed from n_active and the queue again magically overflows.

Perhapse something like 2*n_max would be better, at least that is a fixed lengh of queue independent of what is happening with the helpers.

TODO: There are approximately ten places where we check queue sizes
against n_running or n_active. Many of those checks are slightly
different. This inconsistency should be removed, probably by adding a
few standard methods for a few types of checks that are actually needed.
Patches welcome!

+1 on that idea.

Amos

Reply via email to