New submission from Sergio Rael <sr...@onestic.com>:

I have found a deadlock in Python 3.6.10 that seems to have been fixed in
3.7.x. It is probably related to capture groups. To reproduce the deadlock,
run something like this:

import re

# Pattern and subject are split into adjacent string literals only to keep
# the lines short; Python joins them into one string each.
re.findall(
    r'\[et_pb_image(?:\w|=|"|\d|\.| |_|\/)*'
    r'src="(https?:\/\/(?:www\.)?\w*\.\w*(?:\/|\w|\d|\.|-)*\.(?:png|jpg|jpeg|gif))"'
    r'(?:\w|=|"|\d|\.| |_|\/|%|\|)*(?:\/?\])(?:\[\/et_pb_image\])?',
    '[et_pb_image _builder_version="3.27.2" '
    'src="https://www.somewhere.com/wp-content/uploads/2019/08/stabilizers.jpg" '
    'box_shadow_horizontal_tablet="0px" box_shadow_vertical_tablet="0px" '
    'box_shadow_blur_tablet="40px" box_shadow_spread_tablet="0px" '
    'z_index_tablet="500" url="https://youtu.be/fTrC5gkyYBM" url_new_window="on" /]',
)

I noticed that the problem is related to having two URLs in the content.
The regex is meant to look only for the one starting with "src=", so the one
starting with "url=" should be ignored. If "url=\"XXX\"" is removed from the
tag, it works fine.
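
For comparison, here is a minimal sketch of the same pattern with each group of single-character alternatives collapsed into one character class. It rests on an assumption that the report does not confirm: that the hang comes from alternatives that overlap (\w already matches digits and "_"), so the engine has many equivalent ways to consume the same text before it gives up. The names PATTERN, TAG_WITH_URL and TAG_WITHOUT_URL are made up for illustration, and this says nothing about what actually changed between 3.6 and 3.7.

```python
import re

# Hypothetical rewrite: every "(?:a|b|c)*" group of single-character
# alternatives becomes one character class matching the same characters.
PATTERN = re.compile(
    r'\[et_pb_image[\w=". /]*'
    r'src="(https?://(?:www\.)?\w*\.\w*[/\w.-]*\.(?:png|jpg|jpeg|gif))"'
    r'[\w=". /%|]*/?\](?:\[/et_pb_image\])?'
)

# Subject string from the report, joined back into a single line.
TAG_WITH_URL = (
    '[et_pb_image _builder_version="3.27.2" '
    'src="https://www.somewhere.com/wp-content/uploads/2019/08/stabilizers.jpg" '
    'box_shadow_horizontal_tablet="0px" box_shadow_vertical_tablet="0px" '
    'box_shadow_blur_tablet="40px" box_shadow_spread_tablet="0px" '
    'z_index_tablet="500" url="https://youtu.be/fTrC5gkyYBM" url_new_window="on" /]'
)

# Same tag with the url="..." attribute removed, as described in the report.
TAG_WITHOUT_URL = TAG_WITH_URL.replace('url="https://youtu.be/fTrC5gkyYBM" ', '')

with_url = PATTERN.findall(TAG_WITH_URL)
without_url = PATTERN.findall(TAG_WITHOUT_URL)
print(with_url)
print(without_url)
```

With this sketch both calls should return promptly. The first finds nothing, because ":" is not among the characters allowed after the src attribute, so the pattern cannot get past url="https:..." to reach the closing bracket. The second returns the src URL.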

----------
components: Regular Expressions
messages: 368026
nosy: ezio.melotti, mrabarnett, srael
priority: normal
severity: normal
status: open
title: re.findall() deadlock on Python 3.6.10
type: behavior
versions: Python 3.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue40496>
_______________________________________