https://bugs.exim.org/show_bug.cgi?id=2283
--- Comment #12 from Philip Hazel <[email protected]> ---

OK, I've done some further testing and studied the code for the first time; I haven't ever looked at this C++ code before. The original code always compiles the pattern twice: once as is, and then "anchored" at each end. This seems inefficient, because you don't know which of the two versions is actually needed, but for small patterns I guess it's cheap. But see below...

I see there is no provision for using the JIT accelerator from the C++ wrapper. The addition of JIT happened after the C++ wrapper was contributed; the (then) maintainer either didn't see the possibility or didn't need the feature.

Anyway, I'm now sure your patch is wrong, because it does not confine its search for (*UTF8) etc to the very start of the pattern. As it happens, you can get away with this because the first, unanchored, compile will throw the error before this wrapping code is obeyed. However, it could be confused by putting (*UTF8) in a comment, for example. Also, for a very long pattern, it's a waste of time searching right along it. (Some people's patterns are thousands of bytes long.)

Other comments: My code exploited the fact that the list of (*UTF8) etc is in reverse alphabetic order (I reckoned (*UTF8) was likely to be most common).

LONG TERM: I would suggest that you think about moving to PCRE2, for several reasons. The 10.xx releases do not have the C++ wrapper, but you could copy the PCRE1 version and update it; personally I'd be tempted to re-implement it. If performance is an issue for you, a way of using JIT would help. The messy code we are currently discussing could be dispensed with, because PCRE2 supports an ENDANCHORED option. Both ANCHORED and ENDANCHORED can be specified at match time, so in theory there is no need for the double compilation that is currently used, though there is a caveat: if they are specified at match time, JIT cannot be used, so there's a compile vs match performance trade-off.
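To illustrate the point about confining the search to the very start of the pattern, here is a minimal sketch. It is not the code from the patch or from the wrapper. The names kStartOptions, LeadingOptionsLength and AnchorBothEnds are made up for this example, the option list is only an illustrative subset (the real list depends on the PCRE release), and the "(?:" ... ")\z" wrapping is an assumed form of the end-anchored pattern.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Illustrative subset only (an assumption, not the wrapper's actual list).
// Kept in reverse alphabetic order so that "(*UTF8)" is tried first.
// An entry ending in '=' takes a decimal argument followed by ')'.
static const char* const kStartOptions[] = {
  "(*UTF8)", "(*UTF)", "(*UCP)", "(*NO_START_OPT)",
  "(*LIMIT_MATCH=", "(*CRLF)", "(*CR)", "(*ANYCRLF)", "(*ANY)"
};

// Hypothetical helper: returns the length of the run of start-of-pattern
// options at the very beginning of `pattern` (0 if there are none).
// It stops at the first item that is not a known option, so the rest of
// the pattern, however long, is never scanned.
std::size_t LeadingOptionsLength(const std::string& pattern) {
  std::size_t pos = 0;
  for (;;) {
    // Cheap exit: every option starts with "(*".
    if (pattern.compare(pos, 2, "(*") != 0) return pos;

    bool matched = false;
    for (const char* opt : kStartOptions) {
      const std::size_t len = std::strlen(opt);
      if (pattern.compare(pos, len, opt) != 0) continue;

      std::size_t end = pos + len;
      if (opt[len - 1] == '=') {
        // Require at least one digit and then a closing parenthesis.
        const std::size_t first_digit = end;
        while (end < pattern.size() &&
               pattern[end] >= '0' && pattern[end] <= '9') {
          ++end;
        }
        if (end == first_digit || end >= pattern.size() ||
            pattern[end] != ')') {
          break;  // Malformed argument: treat as "not an option".
        }
        ++end;  // Step over ')'.
      }
      pos = end;
      matched = true;
      break;
    }
    if (!matched) return pos;
  }
}

// Hypothetical helper: builds the end-anchored variant of a pattern while
// leaving any leading options in front of the added group.
std::string AnchorBothEnds(const std::string& pattern) {
  const std::size_t n = LeadingOptionsLength(pattern);
  return pattern.substr(0, n) + "(?:" + pattern.substr(n) + ")\\z";
}
```

With this approach a pattern such as "(?#(*UTF8))abc" is left alone, because the scan gives up as soon as the text at the current position is not one of the listed options; an option that appears later in the pattern or inside a comment is never examined.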
--
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
