https://bugs.exim.org/show_bug.cgi?id=1786

Philip Hazel <p...@hermes.cam.ac.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--- Comment #1 from Philip Hazel <p...@hermes.cam.ac.uk> ---
Your pattern uses \C in UTF-8 mode, which the documentation advises against.
This is what the documentation (man pcre2pattern) says:

Because \C breaks up characters into individual code units, matching one unit
with \C in UTF-8 or UTF-16 mode means that the rest of the string may start
with a malformed UTF character. This has undefined results, because PCRE2 
assumes that it is matching character by character in a valid UTF string (by
default it checks the subject string's validity at the start of processing
unless the PCRE2_NO_UTF_CHECK option is used).

An application can lock out the use of \C by setting the
PCRE2_NEVER_BACKSLASH_C option when compiling a pattern. It is also possible to
build PCRE2 with the use of \C permanently disabled.

I suggest that you use PCRE2_NEVER_BACKSLASH_C when generating random patterns
using a fuzzer.
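
To illustrate the documentation's point about the rest of the string, here is a
minimal sketch in Python. It does not call PCRE2 at all; it only mimics what
happens to the subject when a single code unit is consumed from the front of a
UTF-8 string, which is the effect the man page attributes to \C. The helper
names is_valid_utf8 and remainder_after_one_code_unit are made up for this
example.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def remainder_after_one_code_unit(text: str) -> bytes:
    """Encode text as UTF-8 and drop the first byte (one 8-bit code unit)."""
    return text.encode("utf-8")[1:]


# U+00E9 (e with acute accent) is encoded in UTF-8 as the two bytes C3 A9.
subject = "\u00e9!"
rest = remainder_after_one_code_unit(subject)

# Consuming only the lead byte C3 leaves a lone continuation byte A9 at the
# start of the remainder, so the remainder is no longer valid UTF-8.
print(rest)
print(is_valid_utf8(rest))

# For a pure ASCII subject every character is one code unit, so the
# remainder stays valid.
print(is_valid_utf8(remainder_after_one_code_unit("a!")))
```

The remainder after a multi-byte character starts in the middle of that
character, which is the malformed input the documentation warns about. For
ASCII-only subjects the problem does not arise, which may be why randomly
generated patterns only trip over it some of the time.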

-- 
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev 
