New submission from Almer Tigelaar:
According to the documentation, ^ should restrict the matching of re.search to
the beginning of the string, as mentioned here:
https://docs.python.org/3.4/library/re.html#search-vs-match
However, this doesn't always seem to work, as the following example shows:
re.search("^([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9]:[0-5][0-9]\\.[0-9]+)|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9]:[0-5][0-9])|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9])|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9])|([0-9]{4}-[01][0-9]-[0-3][0-9])|([0-9]{4}-[01][0-9])|([0-9]{4})$",
"2015-AE-02T10:16:08.450904")
I expected no match here, since the or-ed patterns sit between the anchors ^
and $, and the "AE" in the month position cannot match any of them. Yet
re.search returns a match spanning positions 22 to 26, produced by the last
pattern in the or-ed sequence of patterns: ([0-9]{4})
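
The same behaviour can be reproduced with a much shorter pattern. The sketch
below is a reduction, not part of the original example: the pattern ^(a)|(b)$
and the two test strings are made up for illustration. It relies on | having
the lowest precedence among the regular expression operators, so ^ attaches
only to the first alternative and $ only to the last one.

```python
import re

# Hypothetical reduced pattern: two alternatives between ^ and $.
# Because | has the lowest precedence, this parses as (^(a)) | ((b)$).
pattern = r"^(a)|(b)$"

# The second alternative only has to end at the end of the string,
# so it matches even though the string does not start with "b".
m = re.search(pattern, "xxb")
print(m.span(), m.group(2))

# The first alternative only has to start at the beginning of the string.
m2 = re.search(pattern, "axx")
print(m2.span(), m2.group(1))
```

If that reading is right, the match at positions 22 to 26 in the long example
comes from the last alternative being anchored by $ alone.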
This can be worked around by explicitly including the anchor markers in the
last pattern as follows:
re.search("^([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9]:[0-5][0-9]\\.[0-9]+)|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9]:[0-5][0-9])|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9])|([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9])|([0-9]{4}-[01][0-9]-[0-3][0-9])|([0-9]{4}-[01][0-9])|(^[0-9]{4}$)$",
"2015-AE-02T10:16:08.450904")
Note that the last pattern now explicitly includes the anchors, (^[0-9]{4}$),
which effectively duplicates the anchors that already exist at the beginning
and end of the entire regular expression!
This workaround correctly produces no match (which is the behaviour I expected
from the first pattern).
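
A possible alternative to repeating the anchors inside the last pattern is to
wrap the whole alternation in a non-capturing group, so that ^ and $ apply to
every alternative. This is only a sketch: the names alternatives, grouped,
invalid and valid are made up here, and the valid timestamp is an assumed
counterpart to the string used above.

```python
import re

# The seven alternatives from the example above, unchanged, longest first.
alternatives = [
    r"([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9]:[0-5][0-9]\.[0-9]+)",
    r"([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9]:[0-5][0-9])",
    r"([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9]:[0-5][0-9])",
    r"([0-9]{4}-[01][0-9]-[0-3][0-9]T[0-2][0-9])",
    r"([0-9]{4}-[01][0-9]-[0-3][0-9])",
    r"([0-9]{4}-[01][0-9])",
    r"([0-9]{4})",
]

# Wrap the whole alternation in a non-capturing group so that ^ and $
# apply to every alternative, not just the first and the last.
grouped = "^(?:" + "|".join(alternatives) + ")$"

invalid = "2015-AE-02T10:16:08.450904"  # string from the example above
valid = "2015-07-02T10:16:08.450904"    # hypothetical valid counterpart

print(re.search(grouped, invalid))          # no match object expected
print(re.search(grouped, valid).group(0))   # the full timestamp
```

With the group in place, the string containing "AE" is rejected while the
assumed valid timestamp is matched in full.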
----------
components: Regular Expressions
messages: 246756
nosy: Almer Tigelaar, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: re.search not respecting anchor markers in or-ed construction
type: behavior
versions: Python 3.4
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue24636>
_______________________________________