On Mon, Nov 14, 2016 at 5:59 AM, Amos Jeffries <squ...@treenet.co.nz> wrote: > On 14/11/2016 6:30 p.m., Alex Rousskov wrote: >> On 11/13/2016 05:11 PM, Kinkie wrote: >> >>> the attached patch moves away from hand-rolling a c-string onto >>> joining a SBufList for optimizing regexes in RegexData.cc. >> >>> You can find attached as a test case the output of squidclient >>> mgr:config taken on trunk and on the submitted code. It is slightly >>> different in that the new code allows for a longer optimized regex. >> >> I see the following before/after difference: >> >> -acl bigacl dstdom_regex ... (pattern84) (pattern86) ... >> +acl bigacl dstdom_regex ... (pattern84)|(pattern85) ... >> >> Does changing space into "|" result in one internal regex instead of >> two? > > It does / should. > >> And that is the optimization you have implemented? > > No, that was implemented long ago by Marcus. We got a ~20% speed > increase from that. More complexity in the individual regex test seemed > to counteract the fewer tests done. > > >> How is that >> related to SBufs (i.e., why could not old c-strings accommodate longer >> regexes)? > > FYI (to kinkie): The 100 patterns per aggregation was an arbitrary > limit, but the total string length is limited by the ConfigParser buffers. > > The plain-text length of the aggregated regex string needs to fit within > one squid.conf line length. With space for the directive name, label, > type and flags being included in the limit.
that's BUFSIZ, or 4k. We currently have 1K + separator, name and flags. If anything, we could be more aggressive in having longer regexes, modulo OOM in regcomp. >> And why is there no pattern85 in the "before" test?! >> >> >>> The code is expected to cause a small performance regression on >>> parse and reconfigure due to extra data copies. The regression is >>> negligible: the attached test cases perform "squid -k parse" in 60 >>> msec in trunk and 62 on the branch on a warm cache on my i5 macbook >>> air. >> >> A 3% performance regression is not necessarily negligible. Have you >> tested with more ACL lines (not larger individual ACL lines) that take a >> few seconds to load (as opposed to 60ms) ? If not, please do unless that >> test would be irrelevant somehow. >> >> And what is the expected performance improvement from having fewer >> longer regexes? > > I assume you mean performance of normal request processing through the > regex ACL. Agreed, that need to be checked/compared to current speeds. Time needed for squid -k parse on my Mac 1.3GHz i5 macbook air, for a 1M-entry random ACL, 13Mb-big: new: 20.8s trunk: 18.3s Hopefully in time we'll move the config parser to SBuf.. -- Francesco _______________________________________________ squid-dev mailing list squid-dev@lists.squid-cache.org http://lists.squid-cache.org/listinfo/squid-dev