[issue42353] Proposal: re.prefixmatch method (alias for re.match)

2022-02-05 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

I am not convinced. What are examples of using re.match() instead of 
re.search()? How common is this type of error?

There are perhaps many millions of scripts which use re.match(). Deprecating 
re.match() at any time in the future would be very destructive, and keeping an 
alias indefinitely would only add more confusion.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42353] Proposal: re.prefixmatch method (alias for re.match)

2022-02-04 Thread Gregory P. Smith


Gregory P. Smith  added the comment:

What do other APIs in widely used languages do with regex terminology?  We 
appear to be the only popular language that anchors to the start of a string 
with an API named "match".

libpcre C: uses "match" to mean what we call "search" - 
https://www.pcre.org/current/doc/html/pcre2_match.html

Go: Uses "Match" to mean what we call "search" - https://pkg.go.dev/regexp#Match

JavaScript: Uses "match" to mean what we call "search" - 
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/match

Java: Uses "matches" (I think meaning what we call fullmatch?) - 
https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html

C++ RE2: explicit "FullMatch" and "PartialMatch" APIs - 
https://github.com/google/re2 

Java re2j: uses "matches" like Java regex.Pattern - 
https://github.com/google/re2j 

Ruby: Uses "match" as we do "search" - 
https://ruby-doc.org/core-2.4.0/Regexp.html

Rust: Uses match as we do "search" - https://docs.rs/regex/latest/regex/
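
For comparison, here is a minimal sketch of the three anchoring behaviours 
Python's re module offers, run against one input. The sample string and 
variable names are made up for illustration.

```python
import re

text = "error: disk full"
pattern = re.compile(r"disk")

# search(): finds the pattern anywhere in the string
# (what most of the APIs listed above call "match").
found_anywhere = pattern.search(text)

# match(): succeeds only if the pattern matches starting at position 0.
found_at_start = pattern.match(text)

# fullmatch(): succeeds only if the pattern spans the entire string.
found_whole = pattern.fullmatch(text)

print(found_anywhere is not None)  # True
print(found_at_start is not None)  # False
print(found_whole is not None)     # False
```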

--




[issue42353] Proposal: re.prefixmatch method (alias for re.match)

2022-02-04 Thread Gregory P. Smith


Change by Gregory P. Smith :


--
keywords: +patch
pull_requests: +29314
stage: needs patch -> patch review
pull_request: https://github.com/python/cpython/pull/31137




[issue42353] Proposal: re.prefixmatch method (alias for re.match)

2022-02-04 Thread Gregory P. Smith


Change by Gregory P. Smith :


--
assignee:  -> gregory.p.smith
versions: +Python 3.11 -Python 3.10




[issue42353] Proposal: re.prefixmatch method (alias for re.match)

2020-11-14 Thread Matthew Suozzo


Matthew Suozzo  added the comment:

> It just won't work unless you add explicit ".*" or ".*?" at the start of the 
> pattern

But think of when regexes are used for validating input. Getting it to "just 
work" may be over-permissive validation that only actually checks the beginning 
of the input. They're one missed test case away from a crash or, worse, a 
security issue.

This proposed name change would help make the function behavior obvious at the 
callsite. In the validator example, calling "prefixmatch" would stand out as 
wrong to even the most oblivious, documentation-averse user.
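
A rough sketch of the validator failure mode described here, assuming a 
hypothetical username check (the pattern, function names and payload are 
invented for illustration, not taken from real code):

```python
import re

# Hypothetical rule: a username is one or more lowercase letters or digits.
USERNAME = re.compile(r"[a-z0-9]+")


def is_valid_prefix_only(name):
    # Bug: match() anchors only at the start, so trailing junk is accepted.
    return USERNAME.match(name) is not None


def is_valid(name):
    # fullmatch() requires the pattern to cover the whole input.
    return USERNAME.fullmatch(name) is not None


payload = "alice; rm -rf /"
print(is_valid_prefix_only(payload))  # True
print(is_valid(payload))              # False
print(is_valid("alice42"))            # True
```

Both functions agree on well-formed input such as "alice42", which is why a 
test suite without a malformed case would not notice the difference.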

> My point is that re.match is a common bug when people really want re.search.

While I think better distinguishing the interfaces is a nice-to-have for 
usability, I think it has more absolute benefit to correctness. Again, 
confusion around the semantics of "match" was the motivation for adding 
"fullmatch" in the first place, but that change only went so far to address the 
problem: it's still too easy to misuse the existing "match" interface and it's 
not realistic to remove it from the language. A new name would eliminate this 
class of error at a very low cost.

--
nosy: +matthew.suozzo




[issue42353] Proposal: re.prefixmatch method (alias for re.match)

2020-11-14 Thread Gregory P. Smith


Gregory P. Smith  added the comment:

My point is that re.match is a common bug when people really want re.search.

re.prefixmatch makes it explicit and non-confusing and thus unlikely to be used 
wrong or misunderstood when read or reviewed.

The term "match" when talking about regular expressions is not normally meant 
to imply any anchoring as anchors can be expressed within the regex.  Python is 
relatively unique in bothering to have different methods for a prefix match and 
an anywhere match.  (We'd have been better off without a match method entirely, 
only having search - too late now)

--




[issue42353] Proposal: re.prefixmatch method (alias for re.match)

2020-11-14 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

I have seen code that uses re.search() with the anchor ^ instead of re.match(), 
but I have never seen code that uses re.match() instead of re.search(). It just 
won't work unless you add an explicit ".*" or ".*?" at the start of the pattern, 
and that is a clear indication that re.match() matches the start of the string.
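
To make the behaviour under discussion concrete, a small sketch with made-up 
strings follows. It also shows one case where re.search() with ^ and 
re.match() are not interchangeable, namely under re.MULTILINE.

```python
import re

text = "say hello"

# match() is anchored at the start, so a pattern found later in the string fails.
print(re.match(r"hello", text))            # None

# search() scans the whole string.
print(re.search(r"hello", text).group())   # hello

# Prefixing ".*" lets match() reach it, but the match then includes the lead-in.
print(re.match(r".*hello", text).group())  # say hello

# By default, search() with "^" behaves like match().
print(re.search(r"^hello", text))          # None

# Under re.MULTILINE, "^" also matches right after each newline,
# while match() still only tries position 0.
lines = "first\nhello"
print(re.search(r"^hello", lines, re.MULTILINE) is not None)  # True
print(re.match(r"hello", lines, re.MULTILINE) is not None)    # False
```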

--
nosy: +serhiy.storchaka




[issue42353] Proposal: re.prefixmatch method (alias for re.match)

2020-11-14 Thread Karthikeyan Singaravelan


Change by Karthikeyan Singaravelan :


--
components: +Regular Expressions
nosy: +ezio.melotti, mrabarnett




[issue42353] Proposal: re.prefixmatch method (alias for re.match)

2020-11-13 Thread Gregory P. Smith


New submission from Gregory P. Smith :

A well-known anti-pattern in Python is the use of re.match when you meant to 
use re.search.

re.fullmatch was added in 3.4 via https://bugs.python.org/issue16203 for 
similar reasons.

re.prefixmatch would be similar: we want the re.match behavior, but want the 
code to be obvious about its intent.  This documents the implicit ^ in the name.

The goal would be to allow linters to ultimately flag re.match as the 
anti-pattern when in 3.10+ mode, asking people to use re.prefixmatch or 
re.search instead.

This would help avoid bugs where people mean re.search but write re.match.
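
As an illustration of the kind of linter check meant here, a minimal sketch 
using the standard ast module. The function name is hypothetical, and the 
check only catches calls spelled literally as re.match(...); it would miss 
compiled patterns and aliased imports.

```python
import ast


def find_re_match_calls(source):
    """Return the line numbers of calls written literally as re.match(...)."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "match"
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "re"
        ):
            hits.append(node.lineno)
    return sorted(hits)


# The sample is only parsed, never executed, so undefined names are harmless.
sample = (
    "import re\n"
    "a = re.match('x', s)\n"
    "b = re.search('x', s)\n"
    "c = re.match('y', t)\n"
)
print(find_re_match_calls(sample))  # [2, 4]
```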

The implementation is trivial.
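
A sketch of what such an alias amounts to at module level. The name is bound 
locally here as a stand-in for the proposed re.prefixmatch; this is an 
assumption for illustration and not the contents of any actual patch.

```python
import re

# Hypothetical stand-in for the proposed re.prefixmatch: the same function
# object as re.match, under a name that states the anchoring.
prefixmatch = re.match

print(prefixmatch is re.match)                   # True
print(prefixmatch(r"\d+", "42 apples").group())  # 42
print(prefixmatch(r"\d+", "apples 42"))          # None
```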

This is **not** a decision to deprecate the re.match name, which is widely used 
in 25 years' worth of code.  That'd be painful and is unlikely to be worth doing 
without spreading it over an 8+ year timeline.

--
components: Library (Lib)
messages: 380928
nosy: gregory.p.smith
priority: normal
severity: normal
stage: needs patch
status: open
title: Proposal: re.prefixmatch method (alias for re.match)
type: enhancement
versions: Python 3.10
