On 2020-12-05 23:42:11 +0100, [email protected] wrote:
> Timeout: no idea. But check out re.compile and re.iterfind as they might
> speed things up.
I doubt that compiling regular expressions helps the OP much. Compiled
regular expressions are cached, but more importantly, if a match takes
long enough that specifying a timeout is useful, the time is almost
certainly not spent compiling, but matching - most likely backtracking
from lots of promising but ultimately unsuccessful partial matches.
> regex = r'data-stid="section-room-list"[\s\S]*?>\s*([\s\S]*?)\s*' \
>         r'(?:class\s*=\s*"\s*sticky-book-now\s*"|</ul>\s*</section>|id\s*=\s*"Location")'
> rooms_blocks_to_be_replace = re.findall(regex, html_template)
This part:
\s*([\s\S]*?)\s*'
looks dangerous from a performance point of view. If that can be
rewritten with less potential for backtracking, it might help.
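
One possible way to reduce the backtracking, as a rough sketch only: find
the start marker and the following ">" with plain string searches, and
use a regular expression only for the three end markers. The names
START, END and find_room_blocks are made up for this illustration, and I
am assuming the intent of the original pattern is "everything between
the first '>' after the marker and the nearest end marker, with
surrounding whitespace stripped".

```python
import re

# Hypothetical names for illustration; the marker strings are taken
# from the quoted pattern.
START = 'data-stid="section-room-list"'
END = re.compile(
    r'class\s*=\s*"\s*sticky-book-now\s*"'
    r'|</ul>\s*</section>'
    r'|id\s*=\s*"Location"'
)


def find_room_blocks(html):
    """Collect the text between each start marker and the nearest end marker."""
    blocks = []
    pos = 0
    while True:
        start = html.find(START, pos)
        if start == -1:
            break
        # First '>' after the marker, i.e. the end of the opening tag.
        gt = html.find('>', start + len(START))
        if gt == -1:
            break
        # Only the end markers are searched with a regex, so there is no
        # lazy "match anything" group left to backtrack through.
        end = END.search(html, gt + 1)
        if end is None:
            break
        blocks.append(html[gt + 1:end.start()].strip())
        pos = end.end()
    return blocks
```

Each step only moves forward through the string, so a document without
any end marker is rejected after a single scan instead of after a retry
for every ">" that follows the start marker.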
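Generally, it should be possible to implement a timeout for any
operation by either scheduling an alarm with signal.alarm (Unix only,
and only in the main thread) or by executing the operation in a
separate process and terminating that process if it takes too long.
(Python offers no supported way to kill a thread from outside, so a
process is the safer choice.)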
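
A minimal sketch of the alarm variant, under a few assumptions: Unix,
called from the main thread, and a CPython whose regex engine checks for
pending signals while matching (as far as I know current versions do).
The helper names time_limit and findall_with_timeout are invented for
this example.

```python
import re
import signal
from contextlib import contextmanager


@contextmanager
def time_limit(seconds):
    """Raise TimeoutError if the with-block runs longer than `seconds`.

    Hypothetical helper. Relies on SIGALRM, so it works only on Unix and
    only in the main thread. signal.alarm takes whole seconds.
    """
    def on_alarm(signum, frame):
        raise TimeoutError(f"operation exceeded {seconds} s")

    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.alarm(seconds)
    try:
        yield
    finally:
        signal.alarm(0)                          # cancel a pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


def findall_with_timeout(pattern, text, seconds):
    """Run re.findall, giving up after `seconds` seconds."""
    with time_limit(seconds):
        return re.findall(pattern, text)
```

Usage would be something like findall_with_timeout(regex, html_template, 5)
inside a try/except TimeoutError block.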
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | [email protected] | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
