https://bugs.exim.org/show_bug.cgi?id=1697

            Bug ID: 1697
           Summary: Incorrect compilation of classes containing ucase
                    mnemonics and properties
           Product: PCRE
           Version: 8.37
          Hardware: All
                OS: All
            Status: NEW
          Severity: bug
          Priority: medium
         Component: Code
          Assignee: p...@hermes.cam.ac.uk
          Reporter: justin.vii...@intel.com
                CC: pcre-dev@exim.org

Some fuzzer testing of PCRE and Intel's Hyperscan pattern matching library
produced this pattern:

  /[\W\p{Any}]/

We would expect this pattern, compiled without the PCRE_UCP flag, to match
any character. Instead, we found that it behaves the same way as the class
[\W] on its own. Running 'pcretest -d' shows:

----
$ bin/pcretest -d
PCRE version 8.37 2015-04-28

  re> /[\W\p{Any}]/
------------------------------------------------------------------
  0  36 Bra
  3     [\x00-/:-@[-^`{-\xff] (neg)
 36  36 Ket
 39     End
------------------------------------------------------------------
Capturing subpattern count = 0
No options
No first char
No need char
data> -
 0: -
data> a
No match
----
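
To make the gap between expected and observed behaviour concrete, here is a
minimal standalone C sketch. It does not call PCRE at all; it only models the
two character classes as predicates over single bytes, assuming the ASCII
meaning of \w that applies without PCRE_UCP and the default tables. All
function names are hypothetical and chosen for illustration only.

```c
#include <stdbool.h>

/* ASCII-only word test: what \w means without PCRE_UCP
   (assumption: default C-locale character tables). */
static bool is_ascii_word(unsigned char c)
{
    return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z') || c == '_';
}

/* Expected meaning of [\W\p{Any}]: the union of \W and \p{Any}. */
static bool expected_class(unsigned char c)
{
    bool in_nonword = !is_ascii_word(c);
    bool in_any = true;              /* \p{Any} matches every character */
    return in_nonword || in_any;
}

/* Behaviour seen in the pcretest dump: the class acts like [\W] alone. */
static bool observed_class(unsigned char c)
{
    return !is_ascii_word(c);
}

/* Count the byte values 0..255 on which the two predicates disagree. */
static int count_disagreements(void)
{
    int n = 0;
    for (int c = 0; c <= 255; c++) {
        if (expected_class((unsigned char)c) != observed_class((unsigned char)c))
            n++;
    }
    return n;
}
```

In this model the two predicates differ exactly on the ASCII word characters
(digits, letters and underscore), which is consistent with 'a' failing to
match in the session above while '-' matches.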

My suspicion is that this is caused by the interaction of a negated class
mnemonic (like \W, \D, \S) and the property xclass; perhaps the handling of
the should_flip_negation bool in pcre_compile.c? With the PCRE_UCP flag set,
the pattern above is interpreted as /[\P{Xwd}\p{Any}]/, which looks right, so
the issue only occurs without the flag.

I checked against PCRE2 10.20 as well, and it exhibits the same behaviour.
