János Brezniczky added the comment:
Hello! Thanks for the reply - you are probably right, and I accept that it's
normal. However, there may be numerous frameworks being built on the concept
of collaboratively contributed regexes.
(Python likes prototyping, prototyping likes portable
Matthew Barnett added the comment:
It's called "catastrophic backtracking". Think of the number of ways it could
match, say, 4 characters: 4, 3+1, 2+2, 2+1+1, 1+3, 1+2+1, 1+1+2, 1+1+1+1. Now
try 5 characters...
--
___
Python tracker
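The doubling Matthew describes is easy to reproduce with a short timing sketch (a minimal illustration using the pattern from this report; the input strings and helper name are made up here). On a failing input, each extra character roughly doubles the search time, because the engine tries every way of splitting the run between the inner and outer quantifier:

```python
import re
import time

# The pattern from the report: a nested quantifier over the same
# character class forces the engine to try every composition of the
# run of non-'*' characters between the inner and outer repeats.
PATTERN = re.compile(r"\*([^*]+)*\*")

def time_search(n):
    """Time a failing search on '*' followed by n letters (no closing '*')."""
    s = "*" + "a" * n
    start = time.perf_counter()
    assert PATTERN.search(s) is None  # cannot match: there is no closing '*'
    return time.perf_counter() - start

if __name__ == "__main__":
    for n in range(16, 25, 2):
        # Times roughly double with each extra character.
        print(n, round(time_search(n), 3))
```

A well-formed input such as `"*abc*"` still matches instantly; the exponential cost only shows up when the overall match fails and the engine backtracks through all the splits.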
János Brezniczky added the comment:
I'd also raise for consideration the introduction of a (default?) timeout on
regexes, similar to how such a feature appears to be available in .NET.
Given the DoS vector vs. occasionally non-trivially complex expressions, this
could draw developer attention to
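One way to approximate such a timeout today with only the standard library (a sketch; the helper name and timeout values are illustrative) is to run the match in a child process and terminate it if it exceeds the budget. A `signal.alarm` handler would not work here, since CPython only checks signals between bytecode instructions while the regex engine loops in C:

```python
import multiprocessing
import re

def _worker(pattern, text, conn):
    # Runs in a child process so the parent can kill it on timeout.
    m = re.search(pattern, text)
    conn.send(m.span() if m else None)

def search_with_timeout(pattern, text, timeout):
    """Return the match span, None for no match, or raise TimeoutError."""
    parent, child = multiprocessing.Pipe()
    p = multiprocessing.Process(target=_worker, args=(pattern, text, child))
    p.start()
    if parent.poll(timeout):       # result arrived within the budget
        result = parent.recv()
        p.join()
        return result
    p.terminate()                  # still backtracking: kill the child
    p.join()
    raise TimeoutError(f"regex exceeded {timeout}s")

if __name__ == "__main__":
    print(search_with_timeout(r"\*([^*]+)*\*", "*hello*", 1.0))
    try:
        search_with_timeout(r"\*([^*]+)*\*", "*" + "a" * 40, 1.0)
    except TimeoutError as e:
        print(e)
```

The process boundary is heavyweight, but unlike a thread, a child process can actually be stopped mid-match, which is what a DoS mitigation needs.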
János Brezniczky added the comment:
Might be related to https://bugs.python.org/issue43075, as in it would have
exposed a ReDoS vector had my syntax-highlighting-related changes (which
eventually broke golden master tests, fortunately :) ) hit the public.
(Meaning my reduced - possibly minimal -
New submission from János Brezniczky :
I don't know if it's normal, but it's impractical.
I seem to have come across an expression consuming O(c^n) CPU cycles with c
around 2.
Regex:
\*([^*]+)*\*
Resulted in times (in seconds) of:
0.17  "* is there anybody out?"
0.34  with "1" appended
0.69  with "12" appended
1.36  with "123" appended
2.73
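For what it's worth, the blow-up in the reported pattern can also be avoided by removing the nested quantifier: `([^*]+)*` matches exactly the same text as `[^*]*`, so an equivalent single-quantifier pattern fails in linear time (a sketch; the names are mine, and note the capture-group semantics differ slightly - the slow pattern's group holds only the last repeated chunk):

```python
import re
import time

SLOW = re.compile(r"\*([^*]+)*\*")  # nested quantifier: exponential on failure
FAST = re.compile(r"\*([^*]*)\*")   # same language, one quantifier: linear

def timed(pat, s):
    """Return (match, elapsed seconds) for pat.search(s)."""
    start = time.perf_counter()
    m = pat.search(s)
    return m, time.perf_counter() - start

if __name__ == "__main__":
    ok = "* is there anybody out? *"
    bad = "*" + "x" * 25            # no closing '*': the failing case
    print(timed(FAST, ok)[0])       # matches
    print(timed(FAST, bad))         # fails, and does so quickly
```

Both patterns accept and reject the same strings; only the failure cost differs.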