New submission from Robert Meerman <robert.meer...@gmail.com>:

Regular expressions that are written to match literal underscores ("_", ASCII ordinal 95) and that specify `re.IGNORECASE` during compilation do not consistently match underscores: some occurrences are matched, but others are not.
The following session log shows the problem:

Python 2.6.5 (r265:79063, Apr 16 2010, 13:57:41)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> subject = "[Conclave-Mendoi]_ef_-_a_tale_of_memories_00-12_H264"
>>> print subject.encode("base64")  # In case my environment encoding is to blame
W0NvbmNsYXZlLU1lbmRvaV1fZWZfLV9hX3RhbGVfb2ZfbWVtb3JpZXNfMDAtMTJfSDI2NA==

>>> re.sub("_", "X", subject)  # No flags, does what I expect
'[Conclave-Mendoi]XefX-XaXtaleXofXmemoriesX00-12XH264'
>>>
>>> re.sub("_", "X", subject, re.IGNORECASE)  # Misses some matches
'[Conclave-Mendoi]XefX-_a_tale_of_memories_00-12_H264'
>>>
>>> re.sub("_", "X", subject, re.IGNORECASE | re.LOCALE)  # Misses fewer matches
'[Conclave-Mendoi]XefX-XaXtaleXofXmemories_00-12_H264'
>>>
>>> re.sub("_", "X", subject, re.IGNORECASE | re.LOCALE | re.UNICODE)  # Works OK
'[Conclave-Mendoi]XefX-XaXtaleXofXmemoriesX00-12XH264'
>>>
>>> re.sub("_", "X", subject, re.IGNORECASE | re.UNICODE)  # Works OK
'[Conclave-Mendoi]XefX-XaXtaleXofXmemoriesX00-12XH264'
>>>
>>> type(subject)  # Don't think this is a unicode string
<type 'str'>
>>>

Since my `subject` variable is of type `str` and only contains ASCII characters, I do not believe that the `re.UNICODE` flag should be required.

----------
components: Regular Expressions
messages: 134700
nosy: RobM, effbot, ezio.melotti, pitrou
priority: normal
severity: normal
status: open
title: re.IGNORECASE does not match literal "_" (underscore)
type: behavior
versions: Python 2.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue11947>
_______________________________________
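
A cross-check that may help when triaging this report: the documented signature is `re.sub(pattern, repl, string, count=0)`, so the fourth positional argument in the session above is `count`, and the flag constants are plain integers. The sketch below tests the assumption that the flags were being taken as replacement limits, by passing the same integers explicitly as `count` and then supplying the flag through `re.compile` instead. The variable names are illustrative only, and this is one possible explanation rather than a confirmed diagnosis.

```python
import re

subject = "[Conclave-Mendoi]_ef_-_a_tale_of_memories_00-12_H264"

# The flag constants are integers (bit masks).
ignorecase_value = int(re.IGNORECASE)               # 2
with_locale_value = int(re.IGNORECASE | re.LOCALE)  # 2 | 4 == 6

# Passing those integers explicitly as `count` limits the number of
# replacements, which can be compared against the partial results in the log.
limited_to_2 = re.sub("_", "X", subject, count=ignorecase_value)
limited_to_6 = re.sub("_", "X", subject, count=with_locale_value)

# Supplying the flag at compile time leaves `count` at its default of 0,
# meaning "replace every match".
pattern = re.compile("_", re.IGNORECASE)
replaced_all = pattern.sub("X", subject)
```

If `limited_to_2` and `limited_to_6` equal the "misses some matches" and "misses fewer matches" results from the session, the underscores are being matched correctly and only the replacement count is being capped. The `re.UNICODE` combinations would then appear to work only because their integer values (34 and 38) exceed the eight underscores in `subject`.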