Attached is a patch for optimisation of REs.
This is the second submission of the patch and the comments from
Amos' review are addressed.

This patch is inspired by the work that I did for ufdbGuard and a few emails 
with Amos.

The new code optimises lists of regular expressions.
The optimisations are:
* initial .* is stripped
* RE-1 RE-2 ... RE-n are joined into one large RE: (RE-1)|(RE-2)|...|(RE-n)
* -i ... -i options are optimised: the second one is ignored, same for +i
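
For illustration only, the three optimisations listed above can be sketched
roughly as follows. This is not the code from the patch: JoinedRE,
stripLeadingDotStar and optimiseList are hypothetical names, and the sketch
assumes that -i switches to case-insensitive matching and +i switches back.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type: one joined pattern plus its case mode.
struct JoinedRE {
    std::string pattern;
    bool icase;
};

// Drop any leading ".*"; with unanchored matching it is redundant.
// A pattern consisting only of ".*" is returned unchanged.
std::string stripLeadingDotStar(const std::string& re) {
    std::size_t pos = 0;
    while (re.compare(pos, 2, ".*") == 0)
        pos += 2;
    if (pos == re.size())
        return re;
    return re.substr(pos);
}

// Join consecutive REs that share a case mode into (RE-1)|(RE-2)|...
// A repeated -i or +i does not change the mode and is ignored.
std::vector<JoinedRE> optimiseList(const std::vector<std::string>& tokens) {
    std::vector<JoinedRE> out;
    std::string joined;
    bool icase = false;  // assumed default: case-sensitive

    auto flush = [&]() {
        if (!joined.empty()) {
            out.push_back({joined, icase});
            joined.clear();
        }
    };

    for (const auto& t : tokens) {
        if (t == "-i" || t == "+i") {
            const bool want = (t == "-i");
            if (want != icase) {  // only a real mode change starts a new RE
                flush();
                icase = want;
            }
            continue;
        }
        if (!joined.empty())
            joined += '|';
        joined += '(' + stripLeadingDotStar(t) + ')';
    }
    flush();
    return out;
}
```

Given the tokens .*foo, bar, -i, -i, baz, +i, qux this sketch yields three
joined REs: (foo)|(bar) case-sensitive, (baz) case-insensitive and (qux)
case-sensitive. It ignores details a real implementation has to handle, such
as the maximum RE length and patterns like ".*?".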

The only modified file is src/acl/RegexData.cc

Attached are the patch (RegexData.cc.patch) and files for a unit test:
squidtest.conf
re.4lines    - used in squidtest.conf; contains REs
re.200lines     - used in squidtest.conf; contains REs
unittest_re_optim_wget - script with wget commands to trigger squid to evaluate 
REs

unittest_re_optim_wget contains instructions on how to set up and perform a unit 
test.

I tried to become a member of the squid-dev mailing list but am not one yet,
so comments should also go to my email address directly.

Marcus Kool



Marcus Kool wrote:

Amos Jeffries wrote:
 > Amos Jeffries wrote:
 >> Hi Marcus,
 >> Did my audit feedback on this make it to you? I've just noticed my
 >> mailer has not marked the thread as responded.
 >>

On 01/07/11 00:52, Marcus Kool wrote:
No, it did not.

Okay. My mailer seems to have screwed up badly. There were a few minor bits.

* the patch being reversed. Just order the files the other way around on next patch.

* compileOptimisedREs/compileUnoptimisedREs have duplicate code checking for the (RElen > BUFSIZ+1) case on the wordlist key. The keys are already checked for that criterion by aclParseRegexList before adding.

* debugs() calls that give a WARNING to the user should have DBG_IMPORTANT as the second parameter.

* debugs() calls that report a major problem need DBG_CRITICAL in parameter #2 and "ERROR:" instead of the function name.

* The >100 messages only need to be shown when checking the config for problems, i.e.
  debugs(28, (opt_parse_cfg_only?DBG_IMPORTANT:2), ....

Thanks for the feedback, I will make a new patch. I was not able to
finish it in time for the next releases, but it will follow soon.


No one else has mentioned anything, so with these style tweaks it can go in. The next releases are planned to happen tomorrow. If you want to submit a new patch in the next 12hrs I'll use that.


I tried to subscribe to the squid-dev mailing list the other day
but got no reply yet. But in the list archives I did not see any
response/feedback either.

I saw that arrive. So whoever was moderating this week appears to have okayed you for posting. If you went through the regular ezmail subscription process (mail to squid-dev-subscr...@squid-cache.org) you should have been receiving list mail for a few days?

I have not yet received emails from squid-dev.  Should I resend
the application ?

Amos

Marcus

Attachment: patch-RE-optimisation-squid-3-1-14.tar.gz
Description: GNU Zip compressed data
