[Python-ideas] Re: Fwd: Re: Fwd: re.findfirst()

Serhiy Storchaka Fri, 06 Dec 2019 23:57:25 -0800

06.12.19 23:20, Kyle Stanley wrote:

Serhiy Storchaka wrote:
 > It seems that in most cases the author just does not know about
 > re.search(). Adding re.findfirst() will not fix this.
That's definitely possible, but it might be just as likely that they saw re.findall() as being more simple to use compared to re.search(). Although it has worse performance by a substantial amount when parsing decent amounts of text (assuming the first match isn't at the end), ``re.findall()[0]`` /consistently/ returns the first string that was matched, as long as no subgroups were used. This allows them to circumvent the usage of match objects entirely, which makes it a bit easier to learn. Especially for those who are less familiar with OOP, or are already familiar with other popular flavors of regex (such as JS).
I'll admit this is mostly speculation, but I think there's an especially large number of re users (compared to other modules) that aren't necessarily developers, and might just be someone who wants to write a script to quickly parse some documents. These types of users are the ones who would likely benefit the most from the proposed re.findfirst(), particularly if it directly returns a string as Guido is suggesting.
I think at the end of the day, the critical question to answer is this:
*Do we want to add a new helper function that's easy to use, consistent, and provides good performance for finding the first match, even if the functionality already exists within the module?*

My concern is that this will add complexity to the module documentation, which is already too complex. re.findfirst() has more complex semantics (with no capture groups it returns one thing, with one capture group another, and in other cases a value of yet another type) than re.search(), which just returns a match object or None. This will increase the chance that users miss the appropriate function and use suboptimal constructs like findall()[0].
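
To make the difference in return types concrete, here is a small sketch using only the existing re.findall() and re.search(). The sample text and patterns are made up for illustration; the proposed re.findfirst() is not shown, since its exact behavior was still under discussion.

```python
import re

text = "alpha=1, beta=2"  # made-up sample input

# No capture groups: findall() returns a list of whole-match strings.
no_groups = re.findall(r"\w+=\d+", text)

# One capture group: a list of that group's strings.
one_group = re.findall(r"(\w+)=\d+", text)

# Two or more capture groups: a list of tuples of strings.
two_groups = re.findall(r"(\w+)=(\d+)", text)

# re.search() always returns a match object or None,
# whatever the number of groups in the pattern.
m = re.search(r"(\w+)=(\d+)", text)
first_whole = m.group(0) if m else None
first_groups = m.groups() if m else None

# findall()[0] has no "no match" value: it raises IndexError instead.
try:
    re.findall(r"gamma=\d+", text)[0]
    missing_raises = False
except IndexError:
    missing_raises = True
```

With re.search() the caller handles one kind of result (match object or None) and picks the pieces explicitly with group() or groups().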

re.finditer() is a more modern and powerful function than re.findall(). The latter may even be deprecated in the future.
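
As a rough illustration of why finditer() is the more flexible of the two: it yields match objects lazily, so the first match can be taken with next() without scanning the rest of the string, and each match still carries positions and groups. The input string and pattern below are invented for the example.

```python
import re

text = "x=10 y=20 z=30"  # made-up sample input
pattern = re.compile(r"(\w)=(\d+)")

# Take only the first match; the iterator stops scanning there.
# The second argument to next() is the default when nothing matches.
first = next(pattern.finditer(text), None)
first_value = first.group(2) if first else None

# Match objects expose groups and positions, which findall() discards.
names_and_offsets = [(m.group(1), m.start()) for m in pattern.finditer(text)]
```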

In the future we may add a few more functions/methods: re.rmatch() (like re.match(), but matches at the end of the string instead of the start), re.rsearch() (searches from the end), re.rfinditer() (iterates in reverse order). Unlike findfirst(), they will implement features that cannot be easily expressed using existing functions.
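
A small sketch of why searching from the end is hard to express today. None of re.rmatch(), re.rsearch() or re.rfinditer() exists; the code below only uses the existing finditer() and Pattern.match(), and it assumes that a search from the end would report the rightmost position at which the pattern can match. The obvious workaround, keeping the last match of a left-to-right scan, can miss that position because finditer() only returns non-overlapping matches.

```python
import re

pattern = re.compile(r"aba")
text = "ababa"  # made-up input with two overlapping occurrences

# Workaround: scan left to right and keep the last match seen.
last = None
for m in pattern.finditer(text):
    last = m
last_start = last.start() if last else None

# The pattern also matches starting at index 2, but finditer() skips it
# because it overlaps the match found at index 0.
at_two = pattern.match(text, 2)
rightmost_span = at_two.span() if at_two else None
```

Here the workaround reports position 0 although an occurrence starts at position 2, and it has to walk the whole string to get there.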

Another option to consider might be adding a boolean parameter to re.search() that changes the behavior to directly return a string instead of a match object, similar to re.findall() when there are not multiple subgroups.


Oh, no, this is the worst idea!
