2018-05-03 15:54:58 +0000, Austin Group Bug Tracker:
[...]
> On page 2492 line 80144 section awk, add two new rows to the table:<pre>
> \., \[, \(, \*, \+, | A <backslash> character followed by a character | In the lexical token <b>ERE</b>, the sequence
> \?, \{, \|, \^, \$  | that has a special meaning in EREs (see         | shall represent itself.  Otherwise undefined.
>                     | [xref to XBD 9.4.3]), other than <backslash>.   |
> --------------------+-------------------------------------------------+-----------------------------------------------------
> \\                  | Two <backslash> characters.                     | In the lexical token <b>ERE</b>, the sequence shall
>                     |                                                 | represent itself. In the lexical token <b>STRING</b>,
>                     |                                                 | it shall represent a single <backslash>.
> </pre>
[...]

Thanks.

Does that mean that:

awk '/[\]]/'

is to match on "\]" and not on "]" (like for grep -E '[\]]')?

In practice, not many implementations do. The only ones I found
that did were busybox awk and Solaris /usr/xpg4/bin/awk.

mawk, gawk, bwk's awk, Solaris nawk, FreeBSD awk all output:

$ printf '%s\n' '\]' ']' | awk '/[\]]/'
\]
]
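
For comparison, the grep -E reading mentioned above can be checked
with the same two input lines. This is a minimal sketch, assuming a
POSIX-conforming grep; the variable name ere_out is only illustrative.
In a POSIX ERE bracket expression a <backslash> is an ordinary
character, so '[\]]' is the bracket expression [\] followed by a
literal ], and only the first line should be selected:

```shell
# Same input as the awk test, but matched as a POSIX ERE by grep -E.
# '[\]]' parses as: bracket expression [\] (a backslash), then a literal ].
ere_out=$(printf '%s\n' '\]' ']' | grep -E '[\]]')

# Show which lines were selected.
printf '%s\n' "$ere_out"
```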

In practice, one does need to double the \ inside bracket
expressions for it not to be treated specially in many regexp
and wildcard engines. That's the case of several shells
(including certified ones) for
var='[\\]foo]'; case x in $var); esac
and of many REs including the modern de-facto standard that PCRE
has become.
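
As a rough illustration of why the doubled form is the portable
spelling, here is a small sketch. The test string '\foo]' and the
variable name result are my own choices, not taken from any
specification. Whether the shell treats \ inside a bracket expression
as an escape or as a literal character, '[\\]' should denote a set
containing only a backslash, so the pattern is expected to match under
either reading:

```shell
# Pattern taken from the message: a bracket expression with a doubled
# backslash, followed by the literal text foo].
var='[\\]foo]'

# Match a backslash followed by foo] against the unquoted pattern.
case '\foo]' in
  $var) result=match ;;
  *)    result=nomatch ;;
esac

printf '%s\n' "$result"
```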

I think POSIX should acknowledge that generally in the regular
expression syntax.

-- 
Stephane
