On 2015-08-18 22:42, Laurent Pointal wrote:
> Hello,
>
> I want to make a replacement in a string, to ensure that ellipses are
> surrounded by spaces (this is not a typographical problem, but a
> preparation for later text chunking).
>
> I tried with regular expressions and the SRE_Pattern.sub() method, but
> I get an unexpected duplication of the replacement pattern.
>
> The code:
>
>     ellipfind_re = re.compile(r"((?=\.\.\.)|…)", re.IGNORECASE|re.VERBOSE)
>     ellipfind_re.sub(' ... ', "C'est un essai... avec différents caractères… pour voir.")
>
> And I retrieve:
>
>     "C'est un essai ... ... avec différents caractères ... pour voir."
>                         ^^^
>
> I tested with/without group capture, same result.
>
> My Python version:
>
>     Python 3.4.3 (default, Mar 26 2015, 22:03:40)
>     [GCC 4.9.2] on linux
>
> Any idea?
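(?=...) is a lookahead, not a non-capturing group; a non-capturing group is written (?:...). A lookahead matches the empty string at the position just before the three dots and consumes nothing, so sub() inserts ' ... ' there and leaves the original dots in place. That is the duplication you see.

The regex should be r"((?:\.\.\.)|…)", which can be simplified to just r"\.\.\.|…" for your use case. (You don't need re.IGNORECASE|re.VERBOSE either: the pattern contains no letters, whitespace, or comments.)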
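
To make the difference concrete, here is a minimal sketch comparing the two patterns on the sentence from the question. The variable names are only illustrative, and the last pattern (which also swallows whitespace around the ellipsis) is an optional variant I am assuming may be useful, not something the original code asked for.

```python
import re

text = "C'est un essai... avec différents caractères… pour voir."

# Lookahead: matches the empty string just before "...", consuming nothing,
# so the replacement is inserted and the original dots are left in place.
lookahead_re = re.compile(r"(?=\.\.\.)|…")

# Plain alternation: consumes the three dots (or the single "…" character).
consuming_re = re.compile(r"\.\.\.|…")

# Optional variant (an assumption, not part of the original question):
# also consume surrounding whitespace so no double spaces are produced.
tidy_re = re.compile(r"\s*(?:\.\.\.|…)\s*")

with_lookahead = lookahead_re.sub(" ... ", text)
with_consuming = consuming_re.sub(" ... ", text)
with_tidy = tidy_re.sub(" ... ", text)

print(repr(with_lookahead))  # the "..." shows up twice
print(repr(with_consuming))  # one " ... " per ellipsis, plus the original space
print(repr(with_tidy))       # one " ... " per ellipsis, single spaces
```

Note that the plain alternation keeps the space that already followed each ellipsis, so you get two spaces after the inserted ' ... '. If that matters for the later chunking step, the whitespace-consuming variant avoids it.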
