06.12.19 23:20, Kyle Stanley wrote:
Serhiy Storchaka wrote:
 > It seems that in most cases the authors just do not know about
 > re.search(). Adding re.findfirst() will not fix this.

That's definitely possible, but it might be just as likely that they saw re.findall() as being simpler to use than re.search(). Although it performs substantially worse when parsing decent amounts of text (assuming the first match isn't at the end), ``re.findall()[0]`` /consistently/ returns the first string that was matched, as long as no subgroups were used. This lets them avoid match objects entirely, which makes it a bit easier to learn, especially for those who are less familiar with OOP or are already familiar with other popular flavors of regex (such as JS).
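
For reference, a minimal sketch of the two approaches being compared here. The sample text and pattern are made up for illustration; only re.findall() and re.search() are real module functions.

```python
import re

# Illustrative input and pattern (assumptions, not from the thread).
text = "order 42 shipped, order 77 pending"
pattern = r"\d+"

# findall() scans the whole string and builds a list of every match;
# indexing [0] then keeps only the first one.
first_via_findall = re.findall(pattern, text)[0]

# search() stops at the first match and returns a match object (or None).
m = re.search(pattern, text)
first_via_search = m.group() if m else None

# When nothing matches, findall()[0] raises IndexError,
# while search() simply returns None.
no_match = re.search(pattern, "no digits here")
```

Both variables end up holding the same string; the difference is that the findall() form does work proportional to the whole input and fails with an exception on no match.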

I'll admit this is mostly speculation, but I think there's an especially large number of re users (compared to other modules) who aren't necessarily developers and might just want to write a script to quickly parse some documents. These types of users are the ones who would likely benefit the most from the proposed re.findfirst(), particularly if it directly returns a string as Guido is suggesting.

I think at the end of the day, the critical question to answer is this:

*Do we want to add a new helper function that's easy to use, consistent, and provides good performance for finding the first match, even if the functionality already exists within the module?*

My concern is that this will add complexity to the module documentation, which is already too complex. re.findfirst() has more complex semantics (if there are no capture groups it returns one thing, if there is one capture group it returns another, and in other cases it returns something of yet another type) than re.search(), which just returns a match object or None. This will increase the chance that users miss the appropriate function and use suboptimal ones like findall()[0].
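
To make the three cases concrete, here is a rough sketch of what such a helper might look like if written on top of re.search(). The name findfirst, its signature, and the default parameter are all hypothetical; nothing like this exists in the re module, and the actual proposal may differ in details.

```python
import re

def findfirst(pattern, string, flags=0, default=None):
    """Hypothetical helper, NOT part of the re module.

    The return type depends on the number of capture groups,
    roughly following the conventions of re.findall().
    """
    m = re.search(pattern, string, flags)
    if m is None:
        return default          # no match at all
    ngroups = m.re.groups       # number of capture groups in the pattern
    if ngroups == 0:
        return m.group()        # whole match, a str
    if ngroups == 1:
        return m.group(1)       # the single group, a str (or None)
    return m.groups()           # several groups, a tuple
```

The same call can therefore yield a string, a tuple, or the default, depending only on how the pattern is written, whereas re.search() always yields a match object or None.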

re.finditer() is a more modern and powerful function than re.findall(). The latter may even be deprecated in the future.
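
A small illustration of the difference, using a made-up pattern and input: findall() returns plain strings or tuples, while finditer() lazily yields full match objects, so the first match can be taken without scanning the rest of the text.

```python
import re

# Illustrative input and pattern (assumptions, not from the thread).
text = "x=1 y=22 z=333"
pattern = re.compile(r"(\w)=(\d+)")

# findall() eagerly builds a list of tuples, one per match.
tuples = pattern.findall(text)

# finditer() is lazy and yields match objects, which also carry
# positions; next() with a default takes just the first match.
first = next(pattern.finditer(text), None)
span = first.span() if first else None
```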

In the future we may add a few more functions/methods: re.rmatch() (like re.match(), but matches at the end of the string instead of the start), re.rsearch() (searches from the end), re.rfinditer() (iterates in reverse order). Unlike findfirst(), they will implement features that cannot be easily expressed using existing functions.

Another option to consider might be adding a boolean parameter to re.search() that changes the behavior to directly return a string instead of a match object, similar to re.findall() when there are not multiple subgroups.

Oh, no, this is the worst idea!
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/C4VUEDFVLRJ5G7KTDI5G5RNC3MMP7X6V/
Code of Conduct: http://python.org/psf/codeofconduct/
