[issue44699] Simple regex appears to take exponential time in length of input

2021-07-21 Thread János Brezniczky
János Brezniczky added the comment: Hello! Thanks for the reply - you are probably right, and I accept that it's normal. However, there may be numerous frameworks getting built on the concept of collaboratively contributed regexes. (Python likes prototyping, prototyping likes portable

[issue44699] Simple regex appears to take exponential time in length of input

2021-07-21 Thread Matthew Barnett
Matthew Barnett added the comment: It's called "catastrophic backtracking". Think of the number of ways it could match, say, 4 characters: 4, 3+1, 2+2, 2+1+1, 1+3, 1+2+1, 1+1+2, 1+1+1+1. Now try 5 characters... -- ___ Python tracker

[issue44699] Simple regex appears to take exponential time in length of input

2021-07-21 Thread János Brezniczky
János Brezniczky added the comment: I'd also raise for consideration the introduction a (default?) timeout on regexes, similarly to how such a feature seems available in .NET. Given the DOS vector vs. occasionally non-trivially complex expressions, this could draw developer attention to

[issue44699] Simple regex appears to take exponential time in length of input

2021-07-21 Thread János Brezniczky
János Brezniczky added the comment: Might be related to https://bugs.python.org/issue43075, as in would trigger ReDOSability had my syntax highlighting-related changes (which eventually broke golden master tests, fortunately :) ) hit the public. (Meaning my reduced - possibly minimal -

[issue44699] Simple regex appears to take exponential time in length of input

2021-07-21 Thread János Brezniczky
New submission from János Brezniczky : I don't know if it's normal, but it's impractical. I seem to have come by an expression consuming o(c^n) CPU cycles with c around 2. Regex: \*([^*]+)*\* Resulted in times (in seconds) of 0.17 (* is there anybody out?) 0.34 1 0.69 12 1.36 123 2.73