New submission from Almer Tigelaar:

According to the documentation, ^ should restrict the matching of re.search
to the beginning of the string, as mentioned here:
https://docs.python.org/3.4/library/re.html#search-vs-match

However, this doesn't always seem to work as the following example shows:

re.search("^([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9]:[0-5][0-9]\\.[0-9]+)|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9]:[0-5][0-9])|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9])|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9])|([0-9]{4}-[01][0-9]-[0-3][0-9])|([0-9]{4}-[01][0-9])|([0-9]{4})$",
 "2015-AE-02T10:16:08.450904")

This should not match, since the expression uses or-ed patterns between the
anchors ^ and $. Because of the "AE" in the input, no match should be returned,
yet re.search returns one spanning positions 22 to 26, produced by the last
pattern in the or-ed sequence of patterns: ([0-9]{4})
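
The same result can be reproduced with a much shorter expression. The sketch
below is illustrative only: the reduced pattern and the variable names are not
part of the report above. It relies on the re documentation's description of
A|B as a choice between two whole expressions, so the ^ written at the front
belongs to the first alternative only and the $ written at the end belongs to
the last alternative only.

```python
import re

text = "2015-AE-02T10:16:08.450904"

# Reduced form of the reported pattern: two alternatives, with ^ written
# before the first one and $ written after the last one.
reduced = r"^([0-9]{4}-[01][0-9])|([0-9]{4})$"

m = re.search(reduced, text)

# The second alternative matches the last four digits of the fractional
# seconds; the first alternative did not take part in the match.
print(m.span())    # (22, 26)
print(m.group(2))  # 0904
print(m.group(1))  # None
```

Read this way, the pattern is "(^A)|(B$)" rather than "^(A|B)$", which would
account for the match at the end of the string.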

This can be worked around by explicitly including the anchor markers in the 
last pattern as follows:

re.search("^([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9]:[0-5][0-9]\\.[0-9]+)|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9]:[0-5][0-9])|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9])|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9])|([0-9]{4}-[01][0-9]-[0-3][0-9])|([0-9]{4}-[01][0-9])|(^[0-9]{4}$)$",
 "2015-AE-02T10:16:08.450904")

Notice: the last pattern now explicitly includes the anchors: (^[0-9]{4}$),
which effectively duplicates the anchors that already exist at the
beginning and end of the entire regular expression!

This workaround correctly produces no match (which is the behaviour I expected
from the first pattern).
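
Another possible workaround, sketched here under the assumption that the
intent is for every alternative to be anchored at both ends, is to wrap the
whole alternation in a non-capturing group (?:...), so that ^ and $ sit outside
the alternation. The name "grouped" and the sample inputs are hypothetical and
chosen only for illustration.

```python
import re

# The seven alternatives from the report, wrapped in one non-capturing group
# so that ^ and $ apply to whichever alternative is tried.
grouped = re.compile(
    r"^(?:"
    r"([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9]:[0-5][0-9]\.[0-9]+)"
    r"|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9]:[0-5][0-9])"
    r"|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9])"
    r"|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9])"
    r"|([0-9]{4}-[01][0-9]-[0-3][0-9])"
    r"|([0-9]{4}-[01][0-9])"
    r"|([0-9]{4})"
    r")$"
)

# The input from the report: "AE" is not a valid month, so nothing matches.
print(grouped.search("2015-AE-02T10:16:08.450904"))  # None

# Well-formed inputs still match, each through its own alternative.
print(grouped.search("2015-07-02T10:16:08.450904").group(1))
print(grouped.search("2015").group(7))  # 2015
```

The capturing groups keep the numbering of the original expression, so code
that inspects group 1 to group 7 should not need to change. re.fullmatch,
available since Python 3.4, may be an alternative when the whole string has to
match.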

----------
components: Regular Expressions
messages: 246756
nosy: Almer Tigelaar, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: re.search not respecting anchor markers in or-ed construction
type: behavior
versions: Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue24636>
_______________________________________