https://bugs.exim.org/show_bug.cgi?id=1841
Philip Hazel <p...@hermes.cam.ac.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|NEW                         |RESOLVED

--- Comment #1 from Philip Hazel <p...@hermes.cam.ac.uk> ---
You appear to be using a regex to match a list of fixed strings. This is
not the best way of doing that because there are fast algorithms for
doing literal string searches, for example:

https://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm

Regex searches come into their own when the search patterns are not all
literals. However, if you do use one for this kind of search, there are
ways to speed things up.

For each character in your log file PCRE is going to laboriously check
each initial character of your substrings in turn. Assuming that your
log files contain characters other than letters, one way of speeding
this up would be to check for a letter before testing all the individual
ones. A lookahead such as (?=[a-z]) at the start might speed things up.
And/or you could group your strings by initial letter and use lookaheads
so that you search only those that begin with a given letter.

The pcretest program has facilities for timing matches, and can
therefore be used to compare the performance of different regex.

Jeffrey Friedl's book "Mastering Regular Expressions" has useful
information about optimizing patterns, though I can't remember if it
says much about literal strings.

Finally, are you using JIT? That can speed up PCRE matches by quite a
lot. Oh, and as you are using PCRE1, are you calling pcre_study()?

I am going to close this item, because I do not think it is a bug.
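
For illustration only (this is not part of the original comment): a minimal sketch of the "group by initial letter and guard with lookaheads" idea. It uses Python's re module rather than PCRE, purely to show the shape of the pattern. The word list, the sample log line, and the helper names build_naive and build_grouped are all hypothetical. Whether the grouped form is actually faster depends on the engine and the data, so it would need to be timed (for PCRE, with pcretest).

```python
import re
from collections import defaultdict

def build_naive(words):
    # Plain alternation: at each position the engine tries every
    # alternative in turn until one matches.
    return "(?:" + "|".join(re.escape(w) for w in words) + ")"

def build_grouped(words):
    # Group the (non-empty) literals by their first character, keeping
    # the original relative order inside each group.
    groups = defaultdict(list)
    for w in words:
        groups[w[0]].append(w)

    # Outer lookahead: reject positions whose next character cannot
    # start any of the literals.
    first_chars = "".join(re.escape(c) for c in groups)

    # One branch per initial character, each guarded by its own lookahead,
    # so only literals starting with that character are tried.
    branches = []
    for c, ws in groups.items():
        alts = "|".join(re.escape(w) for w in ws)
        branches.append("(?=" + re.escape(c) + ")(?:" + alts + ")")

    return "(?=[" + first_chars + "])(?:" + "|".join(branches) + ")"

# Hypothetical literals and log line, for demonstration only.
WORDS = ["error", "exception", "fatal", "failed", "timeout"]
LINE = "2016-05-31 12:00:01 job 42 failed: timeout after fatal exception"

naive_hits = re.findall(build_naive(WORDS), LINE)
grouped_hits = re.findall(build_grouped(WORDS), LINE)
```

Both patterns are meant to find the same matches; only the amount of work done at non-matching positions should differ. For a purely literal word list, a dedicated string-search algorithm remains the better tool, as the comment says.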