Allow complemented character class escapes within regex brackets.

The complement-class escapes \D, \S, \W are now allowed within
bracket expressions.  There is no semantic difficulty with doing
that, but the rather hokey macro-expansion-based implementation
previously used here couldn't cope.

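As a rough illustration of why there is no semantic difficulty, a
complemented escape inside brackets simply contributes its set of
characters to the union that the bracket expression denotes.  The
sketch below uses Python's "re" module as a stand-in engine (an
assumption made purely for illustration; it is not PostgreSQL's regex
code, and matched_chars is a hypothetical helper name).

```python
import re

def matched_chars(pattern: str, text: str) -> list[str]:
    """Return every single-character match of pattern in text."""
    return re.findall(pattern, text)

# [\D5] reads as "any non-digit, or the digit 5":
# the complement class is just one more member of the union.
print(matched_chars(r"[\D5]", "a15b"))   # ['a', '5', 'b']
```
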
Also, invent "word" as an allowed character class name, thus "\w"
is now equivalent to "[[:word:]]" outside brackets, or "[:word:]"
within brackets.  POSIX allows such implementation-specific
extensions, and the same name is used in e.g. bash.

One surprising compatibility issue this raises is that constructs
such as "[\w-_]" are now disallowed, as our documentation has always
said they should be: character classes can't be endpoints of a range.
Previously, because \w was just a macro for "[:alnum:]_", such a
construct was read as "[[:alnum:]_-_]", so it was accepted so long as
the character after "-" was numerically greater than or equal to "_".

Some implementation cleanup along the way:

* Remove the lexnest() hack, and in consequence clean up wordchrs()
to not interact with the lexer.

* Fix colorcomplement() to not be O(N^2) in the number of colors
involved.

* Get rid of useless-as-far-as-I-can-see calls of element()
on single-character character element names in brackpart().
element() always maps these to the character itself, and things
would be quite broken if it didn't --- should "[a]" match something
different than "a" does?  Besides, the shortcut path in brackpart()
wasn't doing this anyway, making it even more inconsistent.

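The range-endpoint issue described above can be sketched with a toy
model.  Everything here is hypothetical (the function names, and the
simplifying assumption that the bracket body contains exactly one "-"
with a character on each side); it only mimics the textual macro
expansion the message describes and is not the actual lexer code.

```python
# Toy model only; none of these names exist in PostgreSQL.
OLD_W_MACRO = "[:alnum:]_"  # what \w expanded to inside brackets, per the message

def old_expand(body: str) -> str:
    """Old behavior: replace \\w textually inside the bracket body."""
    return body.replace("\\w", OLD_W_MACRO)

def old_accepts(body: str) -> bool:
    """After expansion, 'X-Y' is a range; accept when X <= Y numerically."""
    expanded = old_expand(body)
    dash = expanded.index("-")
    return ord(expanded[dash - 1]) <= ord(expanded[dash + 1])

def new_accepts(body: str) -> bool:
    """New behavior (simplified): a class escape can't start a range."""
    dash = body.index("-")
    return not body[:dash].endswith("\\w")

for body in ("\\w-_", "\\w-a", "\\w-A"):
    print(body, "->", old_expand(body), old_accepts(body), new_accepts(body))
```

In this model the old reading accepts "\w-_" and "\w-a", since the
range effectively starts at "_" (code 95), and rejects "\w-A" because
"A" (code 65) sorts below "_".  The new reading rejects all three.
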
Discussion: https://postgr.es/m/[email protected]
Discussion: https://postgr.es/m/[email protected]

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/2a0af7fe460eb46f9af996075972bf7c2e3f211d

Modified Files
--------------
doc/src/sgml/func.sgml                             |  25 +-
src/backend/regex/re_syntax.n                      |  13 +-
src/backend/regex/regc_color.c                     |  34 ++-
src/backend/regex/regc_lex.c                       | 166 ++----------
src/backend/regex/regc_locale.c                    |  97 +++----
src/backend/regex/regc_pg_locale.c                 |   9 +
src/backend/regex/regcomp.c                        | 285 +++++++++++++++++----
src/include/regex/regguts.h                        |  20 +-
.../modules/test_regex/expected/test_regex.out     | 250 ++++++++++++++++++
src/test/modules/test_regex/sql/test_regex.sql     |  44 ++++
10 files changed, 672 insertions(+), 271 deletions(-)
