http://bugs.exim.org/show_bug.cgi?id=1472

           Summary: Regexp "(|ab)" not handled in accordance with documentation
           Product: PCRE
           Version: N/A
          Platform: Other
        OS/Version: All
            Status: NEW
          Severity: bug
          Priority: low
         Component: Code
        AssignedTo: [email protected]
        ReportedBy: [email protected]
                CC: [email protected]

I suspect that this is a documentation issue rather than an implementation or algorithm issue, but it was surprising and it seems worth clarifying.

Given that '|' in PCRE specifies ordered choice, one would expect that the regular expression "(|ab)" would match anything at all, because there is a zero-length path to a match. It does not. Running

    echo "ab" | pcregrep --color '(|ab)'

indicates that the matched string is "ab", which is an unexpected outcome. Conversely:

    echo "xab" | pcregrep --color 'x(|ab)'

indicates that the matched string is 'x' (as expected).

I suspect (without strong confidence) that pcregrep is rejecting matches of zero length. I note that

    echo "ab" | pcregrep --color '(^|ab)'

also purports to match "ab", which seems to support the notion that zero-length matches are rejected.

While this behavior is arguably sensible from the human perspective, it does not conform to the specification of PCRE regular expressions. Either the specification or the implementation should be corrected to bring them into agreement.

Please note that I have not dug deeper into this. In particular, it seems vaguely possible that the PCRE library is returning a result consistent with the specification, which is in turn being filtered by pcregrep on the basis of length.
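
The hypothesis in the report (the library returns the zero-length match, and a front end then discards it on the basis of length) can be sketched in Python. This is an illustration only, under stated assumptions: Python's re module is not PCRE, although it also treats '|' as ordered choice; the helper names first_match and first_nonempty_match are hypothetical; and nothing here establishes what pcregrep actually does. Python 3.7 or later is assumed, since the second helper relies on a non-empty match being allowed to start at the same position as a preceding empty one.

```python
import re


def first_match(pattern, text):
    """Return the span of the first match the engine reports, empty or not."""
    m = re.search(pattern, text)
    return m.span() if m else None


def first_nonempty_match(pattern, text):
    """Return the span of the first match with non-zero length.

    Hypothetical model of a tool that filters out zero-length matches.
    Assumes Python 3.7+, where re.finditer may yield a non-empty match
    starting at the same position as the empty match before it.
    """
    for m in re.finditer(pattern, text):
        if m.end() > m.start():
            return m.span()
    return None
```

Under these assumptions, first_match(r"(|ab)", "ab") gives the empty span (0, 0), which is the outcome ordered choice predicts, while first_nonempty_match(r"(|ab)", "ab") gives (0, 2), that is, "ab", the string the report says pcregrep highlights. For r"x(|ab)" on "xab" both helpers give (0, 1), so the filter has no visible effect there, which would be consistent with the observations above.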
