Ezio Melotti <ezio.melo...@gmail.com> added the comment:

[\w] should definitely work, but [\B] doesn't seem to match anything useful, and it just fails silently because it's neither equivalent to \B nor to [B]:

>>> re.match(r'foo\B', 'foobar') # on a non-word-boundary -- matches fine
<_sre.SRE_Match object at 0xb76dd3a0>
>>> re.match(r'foo[B]', 'fooBar') # same as r'fooB'
<_sre.SRE_Match object at 0xb76dd1e0>
>>> re.match(r'foo[\B]', 'foobar') # not equivalent to \B
>>> re.match(r'foo[\B]', 'fooBar') # not equivalent to [B]

The same is true for \Z and \A:

>>> re.match(r'foo\Z', 'foo') # end of the string -- matches fine
<_sre.SRE_Match object at 0xb76dd3a0>
>>> re.match(r'foo[Z]', 'fooZ') # same as r'fooZ'
<_sre.SRE_Match object at 0xb76dd1e0>
>>> re.match(r'foo[\Z]', 'foo') # not equivalent to \Z
>>> re.match(r'foo[\Z]', 'fooZ') # not equivalent to [Z]
>>>
>>> re.match(r'\Afoo', 'foo') # beginning of the string -- matches fine
<_sre.SRE_Match object at 0xb76dd1e0>
>>> re.match(r'[A]foo', 'Afoo') # same as r'Afoo'
<_sre.SRE_Match object at 0xb76dd3a0>
>>> re.match(r'[\A]foo', 'foo') # not equivalent to \A
>>> re.match(r'[\A]foo', 'Afoo') # not equivalent to [A]

Inside [], \b switches from word boundary to backspace:

>>> re.match(r'foo\b', 'foobar') # not on a word boundary -- no matches
>>> re.match(r'foo\b', 'foo bar') # on a word boundary -- matches fine
<_sre.SRE_Match object at 0xb74a4ec8>
>>> re.match(r'foo[\b]', 'foo bar') # not equivalent to \b
>>> re.match(r'foo[\b]', 'foo\bbar') # matches backspace
<_sre.SRE_Match object at 0xb76dd3d8>
>>> re.match(r'foo([\b])', 'foo\bbar').group(1)
'\x08'

Given that \b doesn't keep its word boundary meaning inside the [], \B (and \A and \Z) shouldn't keep it either (also because I can't see how having these inside [] would be of any use).

On the other hand I'm not sure they should be equivalent to B, A, Z either. There are several escape sequences in the form \X (where X is an upper- or lower-case letter) that are not equivalent to X (\a\b\d\f\s\x\w\D\S\W...). Raising an error that says something like "I don't think [\A] does what you think it does, use [A] instead." might be a better option (and in case anyone is wondering about re.escape, I just checked and it doesn't escape letters).

Even if this is technically backward incompatible, any string that has \A, \B, \Z inside [] can be considered buggy IMHO (unless someone can come up with a valid use case where they do something useful).

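To make the "raise an error" option concrete, here is a rough sketch of what such a check could look like if done outside the re module. The names find_anchor_escapes_in_classes and check_pattern are hypothetical, not anything in re, and the scanner is deliberately simplified: it only tracks [...] nesting and backslash escapes, and ignores re.VERBOSE comments and other parser details.

```python
def find_anchor_escapes_in_classes(pattern):
    r"""Return (index, escape) pairs for \A, \B and \Z found inside [...].

    Hypothetical helper for illustration only; it is a simplified scan,
    not the parser used by the re module.
    """
    hits = []
    in_class = False
    i = 0
    n = len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == '\\' and i + 1 < n:
            # An escape always consumes the following character.
            if in_class and pattern[i + 1] in 'ABZ':
                hits.append((i, pattern[i:i + 2]))
            i += 2
            continue
        if not in_class and ch == '[':
            in_class = True
            # A leading ^ negates the class; a ] right after the
            # opening (or after ^) is a literal, not the terminator.
            if i + 1 < n and pattern[i + 1] == '^':
                i += 1
            if i + 1 < n and pattern[i + 1] == ']':
                i += 1
        elif in_class and ch == ']':
            in_class = False
        i += 1
    return hits


def check_pattern(pattern):
    """Raise ValueError if an anchor escape appears inside a character class."""
    hits = find_anchor_escapes_in_classes(pattern)
    if hits:
        index, escape = hits[0]
        raise ValueError(
            "%s inside [] at position %d is not an anchor; "
            "use [%s] to match the letter" % (escape, index, escape[1]))
    return pattern
```

With this, check_pattern(r'foo\B') and check_pattern(r'foo[\b]') pass through unchanged, while check_pattern(r'foo[\B]') raises ValueError pointing at position 4.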
----------
assignee: docs@python -> 

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue13899>
_______________________________________