[issue28690] Loop in re (regular expression) processing

2016-11-15 Thread STINNER Victor
STINNER Victor added the comment: It's not really a bug, but rather a trap of regular expressions. It seems you have fixed your issue, so I'm closing it.
--
nosy: +haypo
resolution: -> not a bug
status: open -> closed

[issue28690] Loop in re (regular expression) processing

2016-11-14 Thread Walter Farrell
Walter Farrell added the comment: Thanks, Gareth. That does work. Interestingly, the regex module still seems to run in linear time with the original version, but your version seems cleaner.
On Mon, Nov 14, 2016 at 3:55 PM, Gareth Rees wrote:
> Gareth Rees added the comment:
> This is a well-known

[issue28690] Loop in re (regular expression) processing

2016-11-14 Thread Gareth Rees
Gareth Rees added the comment: This is a well-known gotcha with backtracking regexp implementations. The problem is that in the alternation "( +|'[^']*'|\"[^\"]*\"|[^>]+)" there are some characters (space, apostrophe, double quotes) that match multiple alternatives (for example a space matches
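To illustrate the overlap described above, here is a minimal sketch of one way to make the alternatives mutually exclusive. The `DISJOINT` pattern, the `find_tag` helper, and the sample input are all assumptions made for illustration; this is not necessarily the rewrite proposed in the thread, and the input is not the string from the report. Each alternative consumes either one quoted string or a single character that no other alternative can start with, so the engine has only one way to split the text.

```python
import re

# Pattern from the original report: a space, apostrophe, or double quote
# can be consumed by more than one alternative inside the repeated group.
ORIGINAL = r"(^|[^\\])<(pm [^ ]+( +|'[^']*'|\"[^\"]*\"|[^>]+)+)>"

# Hypothetical rewrite (an assumption, not quoted from the thread):
# the inner "+" quantifiers are dropped and the last alternative excludes
# space and both quote characters, so no character fits two alternatives.
DISJOINT = r"(^|[^\\])<(pm [^ ]+( |'[^']*'|\"[^\"]*\"|[^>'\" ])+)>"

# Made-up input for illustration only.
UNCLOSED = "Bain, F. W. <pm id=1 " + "a b " * 50
CLOSED = UNCLOSED + ">"


def find_tag(pattern, text):
    """Return the tag body (group 2) if the pattern matches, else None."""
    m = re.search(pattern, text)
    return m.group(2) if m else None


# With the closing ">" present, both patterns find the same tag body.
print(find_tag(DISJOINT, CLOSED) == find_tag(ORIGINAL, CLOSED))

# Without the closing ">", the rewrite fails without exponential backtracking.
print(find_tag(DISJOINT, UNCLOSED))

# ORIGINAL is deliberately not run on UNCLOSED: with re it may take
# impractically long on an input of this size.
```

Note that this rewrite is stricter than the original: an unquoted apostrophe or double quote inside the tag must now begin a properly closed quoted string, whereas the original `[^>]+` alternative would accept a stray quote.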

[issue28690] Loop in re (regular expression) processing

2016-11-14 Thread Walter Farrell
New submission from Walter Farrell:
Given:
pattern = r"(^|[^\\])<(pm [^ ]+( +|'[^']*'|\"[^\"]*\"|[^>]+)+)>"
s = "Bain, F. W. " added to the end of s, it returns quickly with a match. Without the ">" it should fail, but instead seems to loop. (If I use the regex module instead of re, it fail