brent s. <brent.sa...@gmail.com> added the comment:

"'\.' is an invalid escape sequence. Could you try it with a raw string?"

Well, it is a valid regex escape, but right, point taken. My impression, 
however, is that since the value in ptrn (in example.py) is already a string, 
re.compile() should interpret it as it would a raw string, no? Otherwise it'd 
be a dickens of a time getting a regex pattern that's assigned to a name 
dynamically/programmatically, since there's no raw(), str.raw(), or 
str.encode('raw').

Both spellings evaluate to the same string, for what it's worth:

>>> repr('\.+$')
"'\\\\.+$'"
>>> repr(r'\.+$')
"'\\\\.+$'"
>>> ptrn = '\.+$'
>>> repr(ptrn)
"'\\\\.+$'"

So.
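
To spell that comparison out: the r prefix only affects how the source literal 
is parsed, so both spellings yield the same str object and re.compile() has no 
way to tell them apart. A minimal sketch, with illustrative variable names that 
are not from example.py. The non-raw form uses an explicit '\\' because newer 
Python 3 releases warn about the unrecognised '\.' escape in a normal literal, 
and re.escape() is shown as the stdlib way to build a literal pattern from 
dynamic text:

```python
import re

# Two spellings of the same four-character pattern:
# backslash, dot, plus, dollar.
escaped = '\\.+$'   # explicit double backslash in a normal literal
raw = r'\.+$'       # raw literal, backslash kept as written

print(escaped == raw)   # True
print(len(raw))         # 4

# A pattern assembled at runtime is just a str; no "raw" conversion exists
# or is needed. 'suffix' is a hypothetical dynamic value.
suffix = '.'
dynamic = re.escape(suffix) + '+$'
print(dynamic == raw)   # True
```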

"Also, it's not really clear to me what you're seeing, vs. what you expect to 
see. For one example that you think is incorrect, could you show what you get 
vs. what you expect to get? And, if that's different on different python 
versions, could you show what each version does?"

Serhiy's comment clarifies that this behaviour was indeed changed. You can see 
the difference pretty easily by running example.py under both python2 and 
python3.

--

"This change was intentional and documented. It fixed an old bug in the Python 
implementation of RE and removed the discrepancy with other RE engines."

Okay, so I'm not going insane. That's good. Do you have the ID of the bug it 
fixes, and a pointer to where it's documented? Do you know which other RE 
engines were doing this? GNU sed, for instance, does not behave like this; it 
behaves the way Python did before the bugfix:

$ echo 'a.b.' | sed -e 's/\.*$/./g'
a.b.
$ echo 'a.b...' | sed -e 's/\.*$/./g'
a.b.
$ echo 'a.b' | sed -e 's/\.*$/./g'
a.b.
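
For comparison, here is a sketch of the same substitution done with re.sub(), 
assuming Python 3.7 or later (where an empty match adjacent to a previous 
non-empty match is also replaced). The inputs mirror the sed runs above; the 
names are only for illustration:

```python
import re

# Same inputs as the sed runs above.
samples = ['a.b.', 'a.b...', 'a.b']

# On Python 3.7+, r'\.*$' matches the run of trailing dots AND the empty
# string at the very end, so inputs that already end in dots gain an
# extra one.
results = {s: re.sub(r'\.*$', '.', s) for s in samples}

for original, replaced in results.items():
    print(repr(original), '->', repr(replaced))
```

Under that assumption, 'a.b.' and 'a.b...' both come out as 'a.b..', while 
'a.b' comes out as 'a.b.'.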

"The pattern r'\.*$' matches not only a sequence of dots at the end of the line, 
but also an empty string at the end of line. If this is not what you want, use 
r'\.+$'."

Right; the intent is to guarantee exactly one period at the end of a line, 
whether the original string has no period, one period, or many periods (think, 
e.g., enforcing RFC 1035-compatible FQDNs).
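
One way to get exactly one trailing period regardless of Python version is 
sketched below. The helper names are hypothetical and not taken from 
example.py; both approaches sidestep the second, empty match at the end of the 
string:

```python
import re

def one_trailing_dot_re(name):
    # count=1 stops after the first match, so the empty match that
    # follows a run of dots at end of string is never substituted.
    return re.sub(r'\.*$', '.', name, count=1)

def one_trailing_dot_strip(name):
    # Regex-free alternative: drop all trailing dots, then add one back.
    return name.rstrip('.') + '.'

for s in ['a.b', 'a.b.', 'a.b...']:
    print(one_trailing_dot_re(s), one_trailing_dot_strip(s))
```

Each of the three inputs should come out as 'a.b.' from either helper.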

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue37594>
_______________________________________