Hello,
I think there is an off-by-one error in the external_acl.cc code
that kills Squid:
> 2013/01/11 15:13:18.037| external_acl.cc(843) aclMatchExternal:
> "http://example-3.com/req=debug,uniq4,abort_at=AfterReqHs/resp=debug/":
> queueing a call.
> 2013/01/11 15:13:18.037| external_acl.cc(845) aclMatchExternal:
> "http://example-3.com/req=debug,uniq4,abort_at=AfterReqHs/resp=debug/":
> return -1.
> 2013/01/11 15:13:18.037| Acl.cc(321) checklistMatches: ACL::ChecklistMatches:
> result for 'dummyfilter' is -1
> 2013/01/11 15:13:18.037| Acl.cc(339) matches: ACLList::matches: result is
> false
> 2013/01/11 15:13:18.037| Checklist.cc(275) matchNode: 0x99fe8a8 matched=0
> async=1 finished=0
> 2013/01/11 15:13:18.037| Checklist.cc(312) matchNode: 0x99fe8a8 going async
> 2013/01/11 15:13:18.037| Acl.cc(61) FindByName: ACL::FindByName 'dummyfilter'
> 2013/01/11 15:13:18.037| Checklist.cc(131) asyncInProgress:
> ACLChecklist::asyncInProgress: 0x99fe8a8 async set to 1
> 2013/01/11 15:13:18.037| external_acl.cc(1407) Start: fg lookup in
> 'dummyfilter' for
> 'http://example-3.com/req=debug,uniq4,abort_at=AfterReqHs/resp=debug/'
> 2013/01/11 15:13:18.037| external_acl.cc(1450) Start: 'dummyfilter' queue is
> too long
> 2013/01/11 15:13:18.037| assertion failed: external_acl.cc:1451:
> "inBackground"
The enqueue check for external ACL lookups is inconsistent with the
final queue length check in ExternalACLLookup::Start(). The former
allows adding to the already full (but not yet overflowing) queue while
the latter rightfully(?) asserts that the queue should not overflow.
Should the queue_size check be relaxed to use 2*n_running logic, used by
some other seemingly similar checks?
And should we use n_running or n_active for these checks?
TODO: There are approximately ten places where we check queue sizes
against n_running or n_active. Many of those checks are slightly
different. This inconsistency should be removed, probably by adding a
few standard methods for a few types of checks that are actually needed.
Patches welcome!
Thank you,
Alex.
Prevent external_acl.cc "inBackground" assertion on queue overloads.
The enqueue check for external ACL lookups was inconsistent with the final
queue length check in ExternalACLLookup::Start(). The former allowed adding to
the already full (but not yet overflowing) queue while the latter
rightfully(?) asserted that the queue should not overflow.
TODO: Should the fatal queue_size check be relaxed to use 2*n_running logic,
used by some other seemingly similar checks?
TODO: There are approximately ten places where we check queue sizes against
n_running or n_active. Many of those checks are slightly different. This
inconsistency should be removed, probably by adding a few standard methods
for a few types of checks that are actually needed.
=== modified file 'src/external_acl.cc'
--- src/external_acl.cc 2012-09-04 09:10:20 +0000
+++ src/external_acl.cc 2013-01-11 23:19:07 +0000
@@ -822,41 +822,41 @@
}
entry = static_cast<external_acl_entry *>(hash_lookup(acl->def->cache, key));
external_acl_entry *staleEntry = entry;
if (entry && external_acl_entry_expired(acl->def, entry))
entry = NULL;
if (entry && external_acl_grace_expired(acl->def, entry)) {
// refresh in the background
ExternalACLLookup::Start(ch, acl, true);
debugs(82, 4, HERE << "no need to wait for the refresh of '" <<
key << "' in '" << acl->def->name << "' (ch=" << ch << ").");
}
if (!entry) {
debugs(82, 2, HERE << acl->def->name << "(\"" << key << "\") = lookup needed");
debugs(82, 2, HERE << "\"" << key << "\": entry=@" <<
entry << ", age=" << (entry ? (long int) squid_curtime - entry->date : 0));
- if (acl->def->theHelper->stats.queue_size <= (int)acl->def->theHelper->childs.n_active) {
+ if (acl->def->theHelper->stats.queue_size < (int)acl->def->theHelper->childs.n_active) {
debugs(82, 2, HERE << "\"" << key << "\": queueing a call.");
ch->changeState(ExternalACLLookup::Instance());
debugs(82, 2, HERE << "\"" << key << "\": return -1.");
return ACCESS_DUNNO; // expired cached or simply absent entry
} else {
if (!staleEntry) {
debugs(82, DBG_IMPORTANT, "WARNING: external ACL '" << acl->def->name <<
"' queue overload. Request rejected '" << key << "'.");
external_acl_message = "SYSTEM TOO BUSY, TRY AGAIN LATER";
return ACCESS_DUNNO;
} else {
debugs(82, DBG_IMPORTANT, "WARNING: external ACL '" << acl->def->name <<
"' queue overload. Using stale result. '" << key << "'.");
entry = staleEntry;
/* Fall thru to processing below */
}
}
}
}