[issue29977] re.sub stalls forever on an unmatched non-greedy case

2017-04-04 Thread Matthew Barnett
Matthew Barnett added the comment: A slightly shorter form: /\*(?:(?!\*/).)*\*/ Basically it's:

    match start
    while not match end:
        consume character
    match end

If the "match end" is a single character, you can use a negated character set, for example: [^\n]* othe
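The pattern quoted above can be tried directly with the re module. This is only a minimal sketch: the helper name strip_c_comments and the sample strings are made up for illustration, and re.DOTALL is assumed so that "." also consumes newlines inside a comment.

```python
import re

# Pattern from the comment above: match "/*", then consume one character at a
# time as long as "*/" does not start at the current position, then match "*/".
# re.DOTALL is an assumption here so that "." also matches "\n".
C_COMMENT = re.compile(r"/\*(?:(?!\*/).)*\*/", re.DOTALL)


def strip_c_comments(text):
    """Hypothetical helper: remove every terminated /* ... */ comment."""
    return C_COMMENT.sub("", text)


# Made-up sample input: one terminated comment, one unterminated comment.
sample = "a /* one\n\ntwo */ b /* unterminated\n\n\n"
result = strip_c_comments(sample)
```

Each character between the delimiters can be consumed in only one way, so an unterminated comment fails to match after a single pass over the rest of the text instead of triggering a long backtracking search.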

[issue29977] re.sub stalls forever on an unmatched non-greedy case

2017-04-04 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: This is a well-known issue called catastrophic backtracking. It can't be solved with the current implementation of the regular expression engine. The best you can do is rewrite your regular expression. Even replacing "(.|\s)" with just "." can help.
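A rough sketch of that rewrite, assuming the same flags as in the original report. The name C_COMMENT_SIMPLE and both sample strings are invented for this example.

```python
import re

FLAGS = re.MULTILINE | re.DOTALL | re.UNICODE

# With re.DOTALL, "." already matches "\n", so the "\s" branch of "(.|\s)"
# adds nothing except extra ways to backtrack. Dropping it leaves one way
# to consume each character.
C_COMMENT_SIMPLE = re.compile(r"/\*.*?\*/", FLAGS)

# Made-up inputs: one terminated comment, and one that is never closed and
# is followed by a long run of newlines.
closed = "Special section /* valves:\n\n\nsilicone */ end"
unclosed = "Special section /* valves:" + "\n" * 50 + "never closed"

stripped = C_COMMENT_SIMPLE.sub("", closed)      # comment removed
untouched = C_COMMENT_SIMPLE.sub("", unclosed)   # no match, text unchanged
```

The unterminated input is the case that matters: the simplified pattern gives up after scanning to the end of the text once, and re.sub returns the string unchanged.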

[issue29977] re.sub stalls forever on an unmatched non-greedy case

2017-04-04 Thread Gareth Rees
Gareth Rees added the comment: See also issue28690, issue212521, issue753711, issue1515829, etc.

[issue29977] re.sub stalls forever on an unmatched non-greedy case

2017-04-04 Thread Gareth Rees
Gareth Rees added the comment: The problem here is that both "." and "\s" match a whitespace character, and because you have the re.DOTALL flag turned on this includes "\n", and so the number of different ways in which (.|\s)* can be matched against a string is exponential in the number of whi
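To make the growth concrete, the following sketch counts how many different branch choices (.|\s)* has for consuming a given string under re.DOTALL. It is a counting illustration only, not a measurement of what the engine actually does; the function name is hypothetical, and str.isspace() is used as an approximation of what "\s" matches.

```python
def count_alternation_paths(s):
    """Count the ways (.|\\s)* can consume s when re.DOTALL is set.

    Every character can be taken by the "." branch; whitespace characters
    can also be taken by the "\\s" branch, which doubles the count.
    str.isspace() is an approximation of "\\s" (assumption).
    """
    ways = 1
    for ch in s:
        ways *= 2 if ch.isspace() else 1
    return ways


# Ten newlines already allow 2**10 distinct ways to consume the same text.
ten_newlines = count_alternation_paths("\n" * 10)
# Non-whitespace characters contribute only one choice each.
mixed = count_alternation_paths("silicone\n\n\n")
```

When the closing "*/" is missing, a backtracking engine may have to try all of these combinations before it can report a failure, which is why the run time explodes as the number of whitespace characters grows.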

[issue29977] re.sub stalls forever on an unmatched non-greedy case

2017-04-04 Thread Robert Lujo
New submission from Robert Lujo: Hello, I assume I have hit a bug/misbehaviour in the re module. Here is a "working" example: import re RE_C_COMMENTS= re.compile(r"/\*(.|\s)*?\*/", re.MULTILINE|re.DOTALL|re.UNICODE) text = "Special section /* valves:\n\n\nsilicone\n\n\n\n\n\n\nH