2011-10-02, 21:51(-04), Chet Ramey:
> On 10/2/11 3:43 PM, Stephane CHAZELAS wrote:
>
>> [*] actually, bash does some (undocumented) preprocessing on the
>> regexps, so even the regex(3) reference is misleading here.
>
> Not really.  The words are documented to undergo quote removal, so
> they undergo quote removal.  That turns \1 into 1, for instance.
[...]
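
The quote removal described in the quoted text can be checked with a few one-liners. This is only a sketch, assuming bash 3.2 or later with default compatibility settings; `try` is a hypothetical helper made up for this illustration, not part of bash or of this thread.

```shell
# Hypothetical helper: run one [[ ... =~ ... ]] test in a fresh bash
# and report whether it succeeded.
try() {
    if bash -c "$1"; then echo "match"; else echo "no match"; fi
}

# A backslash typed in the pattern is quoting as far as bash is concerned,
# so \. stands for a literal dot.
try '[[ a.c =~ a\.c ]]'     # match
try '[[ abc =~ a\.c ]]'     # no match

# An unquoted dot is left to regex(3) and matches any character.
try '[[ abc =~ a.c ]]'      # match

# Double quotes have the same quoting effect as the backslash.
try '[[ abc =~ "a.c" ]]'    # no match

# The case from the quoted text: \1 is a quoted 1. If bash hands a plain 1
# to regex(3) as described, this matches; the result is not asserted here.
try '[[ 1 =~ \1 ]]'
```

Each test runs in its own `bash -c`, in the same style as the one-liners further down, so the outer shell's quoting stays out of the picture.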

The problem and the confusion here come from the fact that "\" is
overloaded and used by two different pieces of software (bash and
the system regex). It is used:

- by bash for quoting
- by regex(3) to escape regexp characters in some circumstances
  (for instance when not inside [...], but that may vary per
  implementation (think of the "(?{...})" type of extensions))
- by some regex(3) implementations to introduce new regexp
  operators (\w, \b, \<...)

BTW, another bug:

$ bash -c '[[ "\\" =~ ["."] ]]' && echo yes
yes

And what one could consider a bug:

~$ bash -c 'chars="a]"; [[ "a" =~ ["$chars"] ]]' && echo yes
~$ bash -c 'chars="a]"; [[ "a]" =~ ["$chars"] ]]' && echo yes
yes

I was wrong in saying that the bash documentation should refer to
POSIX regexps because bash disables extensions. It only disables
the extensions introduced by "\", not the ones introduced by
sequences that would otherwise be invalid in POSIX EREs like "(?",
{{, **...

It should still refer to POSIX regexps, as they are the only ones
guaranteed to work. Any extension provided by the system's regex(3)
API may not work with bash.

--
Stephane