Re: [Python-Dev] Remove str.find in 3.0?
I wrote:

[Andrew Durdin:]
> > IOW, I expected "www.python.org".partition("python") to return exactly
> > the same as "www.python.org".rpartition("python")
>
> Yow. Me too, and indeed I've been skimming this thread without
> it ever occurring to me that it would be otherwise.

And, on re-skimming the thread, I think that was always the plan.
So that's OK, then. :-)

--
g

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Remove str.find in 3.0?
> Just to put my spoke in the wheel, I find the difference in the
> ordering of return values for partition() and rpartition() confusing:
>
> head, sep, remainder = partition(s)
> remainder, sep, head = rpartition(s)
>
> My first expectation for rpartition() was that it would return exactly
> the same values as partition(), but just work from the end of the
> string.
>
> IOW, I expected "www.python.org".partition("python") to return exactly
> the same as "www.python.org".rpartition("python")

Yow. Me too, and indeed I've been skimming this thread without
it ever occurring to me that it would be otherwise.

> Anyway, I'm definitely +1 on partition(), but -1 on rpartition()
> returning in "reverse order".

+1.

--
g
Re: [Python-Dev] Remove str.find in 3.0?
Steve Holden <[EMAIL PROTECTED]> wrote:
>
> Guido van Rossum wrote:
> > On 8/30/05, Andrew Durdin <[EMAIL PROTECTED]> wrote:
> [confusion]
> >
> >
> > Hm. The example is poorly chosen because it's an end case. The
> > invariant for both is (I'd hope!)
> >
> > "".join(s.partition()) == s == "".join(s.rpartition())
> >
> > Thus,
> >
> > "a/b/c".partition("/") returns ("a", "/", "b/c")
> >
> > "a/b/c".rpartition("/") returns ("a/b", "/", "c")
> >
> > That can't be confusing can it?
> >
> > (Just think of it as rpartition() stopping at the last occurrence,
> > rather than searching from the right. :-)
> >
> So we can check that a substring x appears precisely once in the string
> s using
>
> s.partition(x) == s.rpartition(x)
>
> Oops, it fails if s == "". I can usually find some way to go wrong ...

There was an example in the standard library that used
"s.find(y) == s.rfind(y)" as a test for zero or 1 instances of the
searched-for item.

Generally though, s.count(x) == 1 is a better test.

 - Josiah
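The three occurrence tests mentioned in this message can be compared side by side. The following is a minimal sketch, assuming a modern Python 3 where str.partition() and str.rpartition() are built-in string methods; the function names are made up for illustration and do not come from the thread or the standard library.

```python
def find_test(s, x):
    """The stdlib-style check: true for zero or one occurrence of x."""
    return s.find(x) == s.rfind(x)


def partition_test(s, x):
    """The check proposed in the thread (x must be a non-empty string)."""
    return s.partition(x) == s.rpartition(x)


def count_test(s, x):
    """The suggested alternative: true only for exactly one occurrence."""
    return s.count(x) == 1


# Print the three answers for one, two and zero occurrences,
# and for the empty string.
for s in ("a/b", "a/b/c", "abc", ""):
    print(repr(s), find_test(s, "/"), partition_test(s, "/"), count_test(s, "/"))
```

Only the count() version answers "exactly once" for every input here: the find() comparison also accepts strings without the separator, and the partition() comparison accepts the empty string, which is the case Steve tripped over.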
Re: [Python-Dev] Remove str.find in 3.0?
Guido van Rossum wrote:
> On 8/30/05, Andrew Durdin <[EMAIL PROTECTED]> wrote:
[confusion]
>
>
> Hm. The example is poorly chosen because it's an end case. The
> invariant for both is (I'd hope!)
>
> "".join(s.partition()) == s == "".join(s.rpartition())
>
> Thus,
>
> "a/b/c".partition("/") returns ("a", "/", "b/c")
>
> "a/b/c".rpartition("/") returns ("a/b", "/", "c")
>
> That can't be confusing can it?
>
> (Just think of it as rpartition() stopping at the last occurrence,
> rather than searching from the right. :-)
>
So we can check that a substring x appears precisely once in the string
s using

s.partition(x) == s.rpartition(x)

Oops, it fails if s == "". I can usually find some way to go wrong ...

tongue-in-cheek-ly y'rs - steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/
Re: [Python-Dev] Remove str.find in 3.0?
"Delaney, Timothy (Tim)" <[EMAIL PROTECTED]> wrote in message
> before, sep, after = s.partition('?')
> ('http://www.python.org', '', '')
>
> before, sep, after = s.rpartition('?')
> ('', '', 'http://www.python.org')

I can also see this as left, sep, right, with the sep-not-found case
putting everything in left or right depending on whether one scanned to
the right or to the left. In other words, when the scanner runs out of
chars to scan, everything is 'behind' the scan, where 'behind' depends
on the direction of scanning. That seems nicely symmetric.

Terry J. Reedy
Re: [Python-Dev] Remove str.find in 3.0?
"Martin v. Löwis" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> Terry Reedy wrote:
>> One (1a) is to give an inband signal that is like a normal
>> response except that it is not (str.find returning -1).
>>
>> Python as distributed usually chooses 1b or 2.
>> I believe str.find and
>> .rfind are unique in the choice of 1a.
>
> That is not true. str.find's choice is not 1a,

It is the paradigm example of 1a as I meant my definition.

> -1 does *not* look like a normal response,
> since a normal response is non-negative.

Actually, the current doc does not clearly specify to some people that
the response is a count. That is what led to the 'str.find is buggy'
thread on c.l.p, and something I will clarify when I propose a doc
patch. In any case, Python does not have a count type, though I
sometimes wish it did. The return type is int and -1 is an int, though
it is not meant to be used as an int and it is a bug to do so.

>It is *not* the only method with choice 1a):
> dict.get returns None if the key is not found,

None is only the default default, and whatever the default is, it is
not necessarily an error return. A dict accessed via .get can be
regarded as an infinite association matching all but a finite set of
keys with the default. Example: a doubly infinite array of numbers with
only a finite number of non-zero entries, implemented as a dict. This is
the view actually used if one does normal calculations with that default
return. There is no need, at least for that access method, for any key
to be explicitly associated with the default.

If the default *is* regarded as an error indicator, and is only used to
guard normal processing of the value returned, then that default must
not be associated with any key. There is the problem that the domain of
dict values is normally considered to be any Python object, and
functions can only return Python objects and not any non-Python-object
error return. So the effective value domain for the particular dict
must be the set 'Python objects' minus the error indicator. With
discipline, None often works. Or, to guarantee 1b-ness, one can create a
new type that cannot be in the dict.

> For another example, file.read() returns an empty string at EOF.

If the request is 'give me the rest of the file as a string', then '' is
the answer, not a 'cannot answer' indicator. Similarly, if the request
is 'how many bytes are left to read', then zero is a numerical answer,
not a non-numerical 'cannot answer' indicator.

Terry J. Reedy
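The distinction drawn in this message, an in-band signal versus a marker that cannot be confused with a real answer, can be illustrated in a few lines. This is only a sketch in modern Python 3: the sample data, the `_MISSING` sentinel and the `lookup` helper are hypothetical names chosen for the example, not anything proposed in the thread.

```python
s = "key=value"

# In-band signal: find() answers -1, which is still a perfectly usable int.
i = s.find(":")
truncated = s[:i]          # silently slices off the last character

# index() raises instead, so the "not found" case cannot be used as an index.
try:
    s.index(":")
    index_raised = False
except ValueError:
    index_raised = True

# dict.get with a default that is an ordinary value, not an error marker:
# a sparse grid in which every unlisted cell counts as 0.0.
sparse = {(0, 0): 1.5, (2, -3): 4.0}
total = sparse.get((0, 0), 0.0) + sparse.get((7, 7), 0.0)

# A private marker object (hypothetical name) that is never stored as a value,
# so it stays distinguishable even from a stored None.
_MISSING = object()


def lookup(d, key):
    """Return d[key], treating only a genuinely absent key as an error."""
    value = d.get(key, _MISSING)
    if value is _MISSING:
        raise KeyError(key)
    return value
```

With the sentinel, `lookup({"a": None}, "a")` returns None as a legitimate value, whereas a plain `d.get("a")` could not tell that case apart from a missing key.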
Re: [Python-Dev] Remove str.find in 3.0?
On 8/31/05, Guido van Rossum <[EMAIL PROTECTED]> wrote:
>
> Hm. The example is poorly chosen because it's an end case. The
> invariant for both is (I'd hope!)
>
> "".join(s.partition()) == s == "".join(s.rpartition())

> (Just think of it as rpartition() stopping at the last occurrence,
> rather than searching from the right. :-)

Ah, that makes a difference. I could see that there was a different way
of looking at the function, I just couldn't see what it was...
Now I understand the way it's been done.

Cheers,

Andrew.
Re: [Python-Dev] Remove str.find in 3.0?
On 8/30/05, Andrew Durdin <[EMAIL PROTECTED]> wrote:
> On 8/31/05, Delaney, Timothy (Tim) <[EMAIL PROTECTED]> wrote:
> > Andrew Durdin wrote:
> >
> > > Just to put my spoke in the wheel, I find the difference in the
> > > ordering of return values for partition() and rpartition() confusing:
> > >
> > > head, sep, remainder = partition(s)
> > > remainder, sep, head = rpartition(s)
> >
> > This is the confusion - you've got the terminology wrong.
> >
> > before, sep, after = s.partition('?')
> > ('http://www.python.org', '', '')
> >
> > before, sep, after = s.rpartition('?')
> > ('', '', 'http://www.python.org')
>
> That's still confusing (to me), though -- when the string is being
> processed, what comes before the separator is the stuff at the end of
> the string, and what comes after is the bit at the beginning of the
> string. It's not the terminology that's confusing me, though I find
> it hard to describe exactly what is. Maybe it's just me -- does anyone
> else have the same confusion?

Hm. The example is poorly chosen because it's an end case. The
invariant for both is (I'd hope!)

  "".join(s.partition()) == s == "".join(s.rpartition())

Thus,

  "a/b/c".partition("/") returns ("a", "/", "b/c")

  "a/b/c".rpartition("/") returns ("a/b", "/", "c")

That can't be confusing can it?

(Just think of it as rpartition() stopping at the last occurrence,
rather than searching from the right. :-)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
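The invariant stated in this message can be checked directly. A minimal sketch, assuming a modern Python 3 where both methods exist on str; the helper name `check_invariant` and the sample strings are made up for the example.

```python
def check_invariant(s, sep):
    """Both methods must return three pieces that rejoin to the original."""
    assert "".join(s.partition(sep)) == s == "".join(s.rpartition(sep))


# Cover several shapes: repeated separator, none, empty string, and so on.
for sample in ("a/b/c", "abc", "", "/", "a//b"):
    check_invariant(sample, "/")

# partition() stops at the first occurrence, rpartition() at the last.
first = "a/b/c".partition("/")
last = "a/b/c".rpartition("/")
assert first == ("a", "/", "b/c")
assert last == ("a/b", "/", "c")
```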
Re: [Python-Dev] Remove str.find in 3.0?
On 8/31/05, Delaney, Timothy (Tim) <[EMAIL PROTECTED]> wrote:
> Andrew Durdin wrote:
>
> > Just to put my spoke in the wheel, I find the difference in the
> > ordering of return values for partition() and rpartition() confusing:
> >
> > head, sep, remainder = partition(s)
> > remainder, sep, head = rpartition(s)
>
> This is the confusion - you've got the terminology wrong.
>
> before, sep, after = s.partition('?')
> ('http://www.python.org', '', '')
>
> before, sep, after = s.rpartition('?')
> ('', '', 'http://www.python.org')

That's still confusing (to me), though -- when the string is being
processed, what comes before the separator is the stuff at the end of
the string, and what comes after is the bit at the beginning of the
string. It's not the terminology that's confusing me, though I find
it hard to describe exactly what is. Maybe it's just me -- does anyone
else have the same confusion?
Re: [Python-Dev] Remove str.find in 3.0?
Andrew Durdin wrote:

> Just to put my spoke in the wheel, I find the difference in the
> ordering of return values for partition() and rpartition() confusing:
>
> head, sep, remainder = partition(s)
> remainder, sep, head = rpartition(s)

This is the confusion - you've got the terminology wrong.

before, sep, after = s.partition('?')
('http://www.python.org', '', '')

before, sep, after = s.rpartition('?')
('', '', 'http://www.python.org')

Tim Delaney
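The not-found case described here is easy to try out. A short sketch, assuming a modern Python 3 with the built-in methods; the variable names are only illustrative.

```python
s = "http://www.python.org"

# Separator absent: partition() leaves the whole string in the first slot...
left_result = s.partition("?")
assert left_result == (s, "", "")

# ...while rpartition() leaves the whole string in the last slot.
right_result = s.rpartition("?")
assert right_result == ("", "", s)

# In both cases the middle element is empty, so it doubles as a "found" flag.
assert not left_result[1] and not right_result[1]
```

In both directions the names before, sep, after describe positions in the string, not the order in which the string was scanned.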
Re: [Python-Dev] Remove str.find in 3.0?
On 8/31/05, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
> [Hye-Shik Chang]
> > What would be a result for rpartition(s, '?') ?
> > ('', '', 'http://www.python.org')
> > or
> > ('http://www.python.org', '', '')
>
> The former. The invariants for rpartition() are a mirror image of those
> for partition().

Just to put my spoke in the wheel, I find the difference in the
ordering of return values for partition() and rpartition() confusing:

head, sep, remainder = partition(s)
remainder, sep, head = rpartition(s)

My first expectation for rpartition() was that it would return exactly
the same values as partition(), but just work from the end of the
string.

IOW, I expected "www.python.org".partition("python") to return exactly
the same as "www.python.org".rpartition("python")

To try out partition(), I wrote a quick version of split() using
partition, and using partition() was obvious and easy:

def mysplit(s, sep):
    l = []
    while s:
        part, _, s = s.partition(sep)
        l.append(part)
    return l

I tripped up when trying to make an rsplit() (I'm using Python 2.3),
because the return values were in "reverse order"; I had expected the
only change to be using rpartition() instead of partition().

For a second example: one of the "fixed stdlib" examples that Raymond
posted actually uses rpartition and partition in two consecutive lines
-- I found this example not immediately obvious for the above reason:

    def run_cgi(self):
        """Execute a CGI script."""
        dir, rest = self.cgi_info
        rest, _, query = rest.rpartition('?')
        script, _, rest = rest.partition('/')
        scriptname = dir + '/' + script
        scriptfile = self.translate_path(scriptname)
        if not os.path.exists(scriptfile):

Anyway, I'm definitely +1 on partition(), but -1 on rpartition()
returning in "reverse order".

Andrew
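For comparison, here is one way the rsplit() experiment described in this message could look with the semantics that were eventually discussed, where rpartition() returns (before, sep, after) in string order. It is a sketch in modern Python 3, not Andrew's actual code; `myrsplit` is a hypothetical name, and like `mysplit` it drops empty pieces at the end where the loop stops, so it is not a full replacement for str.rsplit().

```python
def mysplit(s, sep):
    """Split s on sep by repeatedly peeling off the piece before sep."""
    parts = []
    while s:
        part, _, s = s.partition(sep)
        parts.append(part)
    return parts


def myrsplit(s, sep):
    """Same idea from the right: peel off the piece after the last sep."""
    parts = []
    while s:
        # When sep is absent, rpartition() returns ('', '', s), so s becomes
        # empty and the loop ends after collecting the final piece.
        s, _, part = s.rpartition(sep)
        parts.append(part)
    parts.reverse()
    return parts
```

The only changes from `mysplit` are the unpacking order and the final reverse, which is where the "mirror image" behaviour of the not-found case pays off.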
Re: [Python-Dev] Remove str.find in 3.0?
[Hye-Shik Chang]
> What would be a result for rpartition(s, '?') ?
> ('', '', 'http://www.python.org')
> or
> ('http://www.python.org', '', '')

The former. The invariants for rpartition() are a mirror image of those
for partition().

> BTW, I wrote a somewhat preliminary patch for this functionality
> to let you save little of your time. :-)
>
> http://people.freebsd.org/~perky/partition-r1.diff

Thanks. I've got one running already, but it is nice to have another
for comparison.

Raymond
Re: [Python-Dev] Remove str.find in 3.0?
On 8/28/05, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
> >>> s = 'http://www.python.org'
> >>> partition(s, '://')
> ('http', '://', 'www.python.org')
> >>> partition(s, '?')
> ('http://www.python.org', '', '')
> >>> partition(s, 'http://')
> ('', 'http://', 'www.python.org')
> >>> partition(s, 'org')
> ('http://www.python.', 'org', '')
>

What would be a result for rpartition(s, '?') ?
('', '', 'http://www.python.org')
or
('http://www.python.org', '', '')

BTW, I wrote a somewhat preliminary patch for this functionality
to save you a little of your time. :-)

http://people.freebsd.org/~perky/partition-r1.diff

Hye-Shik
Re: [Python-Dev] Remove str.find in 3.0?
"Raymond Hettinger" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> [M.-A. Lemburg]
>> Also, as I understand Terry's request, .find() should be removed
>> in favor of just leaving .index() (which is the identical method
>> without the funny -1 return code logic).

My proposal is to use the 3.0 opportunity to improve the language in
this particular area. I considered and ranked five alternatives more or
less as follows.

1. Keep .index and delete .find.
2. Keep .index and repair .find to return None instead of -1.
3.5 Delete .index and repair .find.
3.5 Keep .index and .find as is.
5. Delete .index and keep .find as is.

> It is new and separate, but it is also related.

I see it as a 6th option: keep .index, delete .find, and replace it with
.partition. I rank this at least second and maybe first. It is
separable in that the replacement can be done now, while the deletion
has to wait.

> The core of Terry's request is the assertion that str.find()
> is bug-prone and should not be used.

That and the redundancy, both of which have bothered me a bit since I
first learned the string module functions.

Terry J. Reedy
Re: [Python-Dev] Remove str.find in 3.0?
[Kay Schluehr]
>> The discourse about Python3000 has shrunken from the expectation
>> of the "next big thing" into a depressive rhetorics of feature
>> elimination. The language doesn't seem to become deeper, smaller
>> and more powerful but just smaller.

[Guido]
> There is much focus on removing things, because we want to be able
> to add new stuff but we don't want the language to grow.

ISTM that a major reason that the Python 3.0 discussion seems focused
more on removal than addition is that a lot of addition can be (and is
being) done in Python 2.x. This is a huge benefit, of course, since
people can start doing things the "new and improved" way in 2.x, even
though it's not until 3.0 that the "old and evil" ;) way is actually
removed.

Removal of map/filter/reduce is an example - there isn't discussion
about addition of new features, because list comps/gen expressions are
already here...

=Tony.Meyer
Re: [Python-Dev] Remove str.find in 3.0?
Raymond writes:
> That suggests that we need a variant of split() that has been customized
> for typical find/index use cases. Perhaps introduce a new pair of
> methods, partition() and rpartition()

+1

My only suggestion is that when you're about to make a truly
inspired suggestion like this one, you use a new subject header.
It will make it easier for the Python-Dev summary authors and for the
people who look back in 20 years to ask "That str.partition() function
is really swiggy! It's everywhere now, but I wonder what language had
it first and who came up with it?"

-- Michael Chermside

[PS: To explain what "swiggy" means I'd probably have to borrow the
time machine.]
Re: [Python-Dev] Remove str.find in 3.0?
[M.-A. Lemburg]
> Also, as I understand Terry's request, .find() should be removed
> in favor of just leaving .index() (which is the identical method
> without the funny -1 return code logic).
>
> So your proposal really doesn't have all that much to do
> with Terry's request, but is a new and separate proposal
> (which does have some value in few cases, but not enough
> to warrant a new method).

It is new and separate, but it is also related. The core of Terry's
request is the assertion that str.find() is bug-prone and should not be
used. The principal arguments against accepting his request (advanced
by Tim) are that the str.index() alternative is slightly more awkward
to code, more likely to result in try-suites that catch more than
intended, and that the resulting code is slower. Those arguments fall
by the wayside if str.partition() becomes available as a superior
alternative. IOW, it makes Terry's request much more palatable.

> > def run_cgi(self):
> >     """Execute a CGI script."""
> >     dir, rest = self.cgi_info
> >     rest, _, query = rest.rpartition('?')
> >     script, _, rest = rest.partition('/')

[MAL]
> Wouldn't this do the same ?! ...
>
> rest, query = rest.rsplit('?', maxsplit=1)
> script, rest = rest.split('/', maxsplit=1)

No. The split() versions are buggy. They fail catastrophically when
the original string does not contain '?' or does not contain '/':

>>> rest = 'http://www.example.org/subdir'
>>> rest, query = rest.rsplit('?', 1)

Traceback (most recent call last):
  File "", line 1, in -toplevel-
    rest, query = rest.rsplit('?', 1)
ValueError: need more than 1 value to unpack

The whole point of str.partition() is to repackage str.split() in a way
that is conducive to fulfilling many of the existing use cases for
str.find() and str.index().

In going through the library examples, I've not found a single case
where a direct use of str.split() would improve code that currently
uses str.find().

Raymond
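The failure mode shown in the traceback can be reproduced next to the partition() form. This sketch assumes a modern Python 3, where the exception is still a ValueError although its message wording differs from the 2005 session; the variable names are only illustrative.

```python
rest = "http://www.example.org/subdir"

# Unpacking the result of rsplit() breaks when the separator is absent,
# because the returned list then has only one element.
try:
    head, query = rest.rsplit("?", 1)
    unpack_failed = False
except ValueError:
    unpack_failed = True

# partition() always returns exactly three elements, so unpacking is safe.
head, sep, query = rest.partition("?")
```

After the partition() call `head` is the whole string, while `sep` and `query` are both empty, so no try-suite is needed for the not-found case.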
Re: [Python-Dev] Remove str.find in 3.0?
Raymond Hettinger wrote:

> Looking at sample code transformations shows that the high-power
> mxTextTools and re approaches do not simplify code that currently uses
> s.find(). In contrast, the proposed partition() method is a joy to use
> and has no surprises. The following code transformation shows
> unbeatable simplicity and clarity.

+1

This doesn't cause any backward-compatibility issues either!

> --- From CGIHTTPServer.py ---
>
> def run_cgi(self):
>     """Execute a CGI script."""
>     dir, rest = self.cgi_info
>     i = rest.rfind('?')
>     if i >= 0:
>         rest, query = rest[:i], rest[i+1:]
>     else:
>         query = ''
>     i = rest.find('/')
>     if i >= 0:
>         script, rest = rest[:i], rest[i:]
>     else:
>         script, rest = rest, ''
>     . . .
>
>
> def run_cgi(self):
>     """Execute a CGI script."""
>     dir, rest = self.cgi_info
>     rest, _, query = rest.rpartition('?')
>     script, _, rest = rest.partition('/')
>     . . .

+1

Much easier to read and understand!

Cheers,
Ron
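The two run_cgi() fragments quoted above can be pulled out into standalone functions and compared. This is a sketch in modern Python 3 using the rpartition() semantics settled elsewhere in the thread (a missing separator yields ('', '', s)); the function names and sample paths are hypothetical, and `parse_adjusted` is just one possible adjustment, not what the standard library necessarily adopted.

```python
def parse_with_find(rest):
    """The original find()/rfind() logic from the quoted code."""
    i = rest.rfind("?")
    if i >= 0:
        rest, query = rest[:i], rest[i + 1:]
    else:
        query = ""
    i = rest.find("/")
    if i >= 0:
        script, rest = rest[:i], rest[i:]
    else:
        script, rest = rest, ""
    return script, rest, query


def parse_with_partition(rest):
    """The two-line rewrite exactly as quoted."""
    rest, _, query = rest.rpartition("?")
    script, _, rest = rest.partition("/")
    return script, rest, query


def parse_adjusted(rest):
    """One possible adjustment (an assumption, not from the thread)."""
    rest, _, query = rest.partition("?")        # no '?': rest is unchanged
    script, slash, tail = rest.partition("/")   # keep the '/' in front of tail
    return script, slash + tail, query
```

Under these semantics the quoted rewrite is not a drop-in replacement: it drops the leading '/' from the remaining path, and when there is no '?' the whole string ends up in `query`. The adjusted version matches the original for the inputs tried here, but it splits at the first '?' where rfind() used the last one.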
Re: [Python-Dev] Remove str.find in 3.0?
Raymond Hettinger wrote:
> [Marc-Andre Lemburg]
>
>>I may be missing something, but why invent yet another parsing
>>method - we already have the re module. I'd suggest to
>>use it :-)
>>
>>If re is not fast enough or you want more control over the
>>parsing process, you could also have a look at mxTextTools:
>>
>>http://www.egenix.com/files/python/mxTextTools.html
>
>
> Both are excellent tools. Neither is as lightweight, as trivial to
> learn, or as transparently obvious as the proposed s.partition(sep).
> The idea is to find a viable replacement for s.find().

Your partition idea could be had with an additional argument
to .split() (e.g. keepsep=1); no need to learn a new method.

Also, as I understand Terry's request, .find() should be removed
in favor of just leaving .index() (which is the identical method
without the funny -1 return code logic).

So your proposal really doesn't have all that much to do
with Terry's request, but is a new and separate proposal
(which does have some value in few cases, but not enough
to warrant a new method).

> Looking at sample code transformations shows that the high-power
> mxTextTools and re approaches do not simplify code that currently uses
> s.find(). In contrast, the proposed partition() method is a joy to use
> and has no surprises. The following code transformation shows
> unbeatable simplicity and clarity.
>
>
> --- From CGIHTTPServer.py ---
>
> def run_cgi(self):
>     """Execute a CGI script."""
>     dir, rest = self.cgi_info
>     i = rest.rfind('?')
>     if i >= 0:
>         rest, query = rest[:i], rest[i+1:]
>     else:
>         query = ''
>     i = rest.find('/')
>     if i >= 0:
>         script, rest = rest[:i], rest[i:]
>     else:
>         script, rest = rest, ''
>     . . .
>
>
> def run_cgi(self):
>     """Execute a CGI script."""
>     dir, rest = self.cgi_info
>     rest, _, query = rest.rpartition('?')
>     script, _, rest = rest.partition('/')

Wouldn't this do the same ?! ...

    rest, query = rest.rsplit('?', maxsplit=1)
    script, rest = rest.split('/', maxsplit=1)

>     . . .
>
>
> The new proposal does not help every use case though. In
> ConfigParser.py, the problem description reads, "a semi-colon is a
> comment delimiter only if it follows a spacing character". This cries
> out for a regular expression. In StringIO.py, since the task at hand IS
> calculating an index, an indexless higher level construct doesn't help.
> However, many of the other s.find() use cases in the library simplify as
> readily and directly as the above cgi server example.
>
>
>
> Raymond
>
>
> ---
>
> P.S. FWIW, if you want to experiment with it, here a concrete
> implementation of partition() expressed as a function:
>
> def partition(s, t):
>     """ Returns a three element tuple, (head, sep, tail) where:
>
>         head + sep + tail == s
>         t not in head
>         sep == '' or sep is t
>         bool(sep) == (t in s)       # sep indicates if the string was found
>
>     >>> s = 'http://www.python.org'
>     >>> partition(s, '://')
>     ('http', '://', 'www.python.org')
>     >>> partition(s, '?')
>     ('http://www.python.org', '', '')
>     >>> partition(s, 'http://')
>     ('', 'http://', 'www.python.org')
>     >>> partition(s, 'org')
>     ('http://www.python.', 'org', '')
>
>     """
>     if not isinstance(t, basestring) or not t:
>         raise ValueError('partition argument must be a non-empty string')
>     parts = s.split(t, 1)
>     if len(parts) == 1:
>         result = (s, '', '')
>     else:
>         result = (parts[0], t, parts[1])
>     assert len(result) == 3
>     assert ''.join(result) == s
>     assert result[1] == '' or result[1] is t
>     assert t not in result[0]
>     return result
>
>
> import doctest
> print doctest.testmod()

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Aug 28 2005)
>>> Python/Zope Consulting and Support ...http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free !
Re: [Python-Dev] Remove str.find in 3.0?
"Raymond Hettinger" <[EMAIL PROTECTED]> wrote:
> [Guido]
> > Another observation: despite the derogatory remarks about regular
> > expressions, they have one thing going for them: they provide a higher
> > level of abstraction for string parsing, which this is all about.
> > (They are higher level in that you don't have to be counting
> > characters, which is about the lowest-level activity in programming --
> > only counting bytes is lower!)
> >
> > Maybe if we had a *good* way of specifying string parsing we wouldn't
> > be needing to call find() or index() so much at all! (A good example
> > is the code that Raymond lifted from ConfigParser: a semicolon
> > preceded by whitespace starts a comment, other semicolons don't.
> > Surely there ought to be a better way to write that.)
>
> A higher level abstraction is surely the way to go.

Perhaps...

> Of course, if this idea survives the day, then I'll meet my own
> requirements and write a context diff on the standard library. That
> ought to give a good indication of how well the new methods meet
> existing needs and whether the resulting code is better, cleaner,
> clearer, faster, etc.

My first thought when reading the proposal was "that's just
str.split/str.rsplit with maxsplit=1, returning the thing you split on,
with 3 items always returned, what's the big deal?" Two seconds later it
hit me: that is the big deal.

Right now it is a bit of a pain to get string.split to return a
consistent number of return values; I myself have used:

    l,r = (x.split(y, 1)+[''])[:2]

...around 10 times - 10 times more than I really should have. Taking a
wander through my code, this improves the look and flow in almost every
case (the exceptions being where I should have rewritten to use 'substr
in str' after I started using Python 2.3).

Taking a walk through examples of str.rfind at koders.com leads me to
believe that .partition/.rpartition would generally improve the flow,
correctness, and beauty of code which had previously been using
.find/.rfind.

I hope the idea survives the day.

 - Josiah
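The padding idiom from this message and its partition() equivalent can be put side by side. A minimal sketch in modern Python 3; the two function names are invented for the comparison.

```python
def split_pad(x, y):
    """The padding idiom quoted in the message."""
    l, r = (x.split(y, 1) + [""])[:2]
    return l, r


def split_partition(x, y):
    """The same two values via partition(), ignoring the middle element."""
    l, _, r = x.partition(y)
    return l, r


# Both forms agree on the two halves for all of these inputs.
for text in ("a=b", "a", "a=b=c", "a=", ""):
    assert split_pad(text, "=") == split_partition(text, "=")
```

The two forms return the same pair, but only partition() also reports whether the separator was present: "a" and "a=" both give ("a", "") from the padding idiom, whereas the middle element of partition() is "" for the first and "=" for the second.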
Re: [Python-Dev] Remove str.find in 3.0?
[Marc-Andre Lemburg]
> I may be missing something, but why invent yet another parsing
> method - we already have the re module. I'd suggest to
> use it :-)
>
> If re is not fast enough or you want more control over the
> parsing process, you could also have a look at mxTextTools:
>
> http://www.egenix.com/files/python/mxTextTools.html

Both are excellent tools. Neither is as lightweight, as trivial to
learn, or as transparently obvious as the proposed s.partition(sep).
The idea is to find a viable replacement for s.find().

Looking at sample code transformations shows that the high-power
mxTextTools and re approaches do not simplify code that currently uses
s.find(). In contrast, the proposed partition() method is a joy to use
and has no surprises. The following code transformation shows
unbeatable simplicity and clarity.


--- From CGIHTTPServer.py ---

def run_cgi(self):
    """Execute a CGI script."""
    dir, rest = self.cgi_info
    i = rest.rfind('?')
    if i >= 0:
        rest, query = rest[:i], rest[i+1:]
    else:
        query = ''
    i = rest.find('/')
    if i >= 0:
        script, rest = rest[:i], rest[i:]
    else:
        script, rest = rest, ''
    . . .


def run_cgi(self):
    """Execute a CGI script."""
    dir, rest = self.cgi_info
    rest, _, query = rest.rpartition('?')
    script, _, rest = rest.partition('/')
    . . .


The new proposal does not help every use case though. In
ConfigParser.py, the problem description reads, "a semi-colon is a
comment delimiter only if it follows a spacing character". This cries
out for a regular expression. In StringIO.py, since the task at hand IS
calculating an index, an indexless higher level construct doesn't help.
However, many of the other s.find() use cases in the library simplify as
readily and directly as the above cgi server example.


Raymond


---

P.S. FWIW, if you want to experiment with it, here is a concrete
implementation of partition() expressed as a function:

def partition(s, t):
    """ Returns a three element tuple, (head, sep, tail) where:

        head + sep + tail == s
        t not in head
        sep == '' or sep is t
        bool(sep) == (t in s)       # sep indicates if the string was found

    >>> s = 'http://www.python.org'
    >>> partition(s, '://')
    ('http', '://', 'www.python.org')
    >>> partition(s, '?')
    ('http://www.python.org', '', '')
    >>> partition(s, 'http://')
    ('', 'http://', 'www.python.org')
    >>> partition(s, 'org')
    ('http://www.python.', 'org', '')

    """
    if not isinstance(t, basestring) or not t:
        raise ValueError('partition argument must be a non-empty string')
    parts = s.split(t, 1)
    if len(parts) == 1:
        result = (s, '', '')
    else:
        result = (parts[0], t, parts[1])
    assert len(result) == 3
    assert ''.join(result) == s
    assert result[1] == '' or result[1] is t
    assert t not in result[0]
    return result


import doctest
print doctest.testmod()
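The function above is written for the Python 2 of the time (basestring, the print statement). For readers who want to run it today, here is a rough Python 3 port. It is a sketch: the only intended changes are `str` in place of `basestring`, equality instead of the identity check in the third invariant, and the removal of the doctest driver.

```python
def partition(s, t):
    """Return (head, sep, tail) such that head + sep + tail == s.

    sep is '' when t does not occur in s, and equal to t otherwise.
    """
    if not isinstance(t, str) or not t:
        raise ValueError("partition argument must be a non-empty string")
    parts = s.split(t, 1)
    if len(parts) == 1:
        result = (s, "", "")
    else:
        result = (parts[0], t, parts[1])
    # The invariants from the original docstring, checked at run time.
    assert "".join(result) == s
    assert result[1] == "" or result[1] == t
    assert t not in result[0]
    return result
```

For string arguments it should agree with the built-in method, for example `partition(s, '://') == s.partition('://')`.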
Re: [Python-Dev] Remove str.find in 3.0?
Raymond Hettinger wrote:
> [Guido]
>
>>Another observation: despite the derogatory remarks about regular
>>expressions, they have one thing going for them: they provide a higher
>>level of abstraction for string parsing, which this is all about.
>>(They are higher level in that you don't have to be counting
>>characters, which is about the lowest-level activity in programming --
>>only counting bytes is lower!)
>>
>>Maybe if we had a *good* way of specifying string parsing we wouldn't
>>be needing to call find() or index() so much at all! (A good example
>>is the code that Raymond lifted from ConfigParser: a semicolon
>>preceded by whitespace starts a comment, other semicolons don't.
>>Surely there ought to be a better way to write that.)
>
> A higher level abstraction is surely the way to go.

I may be missing something, but why invent yet another parsing
method - we already have the re module. I'd suggest to
use it :-)

If re is not fast enough or you want more control over the
parsing process, you could also have a look at mxTextTools:

http://www.egenix.com/files/python/mxTextTools.html

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Aug 28 2005)
>>> Python/Zope Consulting and Support ...http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free !
Re: [Python-Dev] Remove str.find in 3.0?
[Guido]
> Another observation: despite the derogatory remarks about regular
> expressions, they have one thing going for them: they provide a higher
> level of abstraction for string parsing, which this is all about.
> (They are higher level in that you don't have to be counting
> characters, which is about the lowest-level activity in programming --
> only counting bytes is lower!)
>
> Maybe if we had a *good* way of specifying string parsing we wouldn't
> be needing to call find() or index() so much at all! (A good example
> is the code that Raymond lifted from ConfigParser: a semicolon
> preceded by whitespace starts a comment, other semicolons don't.
> Surely there ought to be a better way to write that.)

A higher level abstraction is surely the way to go.

I looked over the use cases for find and index. Aside from cases which are now covered by the "in" operator, it looks like you almost always want the index to support a subsequent partition of the string.

That suggests that we need a variant of split() that has been customized for typical find/index use cases. Perhaps introduce a new pair of methods, partition() and rpartition(), which work like this:

    >>> s = 'http://www.python.org'
    >>> s.partition('://')
    ('http', '://', 'www.python.org')
    >>> s.rpartition('.')
    ('http://www.python', '.', 'org')
    >>> s.partition('?')
    ('http://www.python.org', '', '')

The idea is still preliminary and I have only applied it to a handful of the library's find() and index() examples.

Here are some of the design considerations:

* The function always succeeds unless the separator argument is not a string type or is an empty string. So, a typical call doesn't have to be wrapped in a try-suite for normal usage.

* The split invariant is:

      s == ''.join(s.partition(t))

* The result of the partition is always a three element tuple. This allows the results to be unpacked directly:

      head, sep, tail = s.partition(t)

* The use cases for find() indicate a need both to test for the presence of the split element and then to make a slice at that point. If we used a contains test for the first step, we could end up having to search the string twice (once for detection and once for splitting). However, by providing the middle element of the result tuple, we can determine found or not-found without an additional search. Accordingly, the middle element has a nice Boolean interpretation with '' for not-found and a non-empty string meaning found. Given (a,b,c)=s.partition(p), the following invariant holds:

      b == '' or b is p

* Returning the left, center, and right portions of the split supports a simple programming pattern for repeated partitions:

      while s:
          head, part, s = s.partition(t)
          . . .

Of course, if this idea survives the day, then I'll meet my own requirements and write a context diff on the standard library. That ought to give a good indication of how well the new methods meet existing needs and whether the resulting code is better, cleaner, clearer, faster, etc.

Raymond
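The repeated-partition pattern from the last design point can be tried out as follows. This is only a sketch: it assumes an interpreter where `str.partition()` and `str.rpartition()` are available as methods (they were added to the language after this thread), and the helper name `split_all` is hypothetical.

```python
def split_all(s, sep):
    """Collect successive heads by partitioning until nothing is left."""
    pieces = []
    while s:
        # 'found' is '' when sep is absent; s then becomes '' and the loop ends.
        head, found, s = s.partition(sep)
        pieces.append(head)
    return pieces


url = 'http://www.python.org'

# Direct unpacking of the three-element result.
head, sep, tail = url.partition('://')

# Repeated partitioning of the remainder.
host_parts = split_all(tail, '.')

# The split invariant holds whether or not the separator is found.
invariant_ok = (''.join(url.partition('?')) == url
                and ''.join(url.rpartition('.')) == url)
```

One design consequence of the `while s:` form is that a trailing separator does not produce an empty final piece, which differs from `str.split()`.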
Re: [Python-Dev] Remove str.find in 3.0?
On 2005-08-26, Terry Reedy <[EMAIL PROTECTED]> wrote:
> Can str.find be listed in PEP 3000 (under builtins) for removal?
> Would anyone really object?
>

With all the discussion, I think you guys should realize that the find/index methods are really convenience functions which do two things in one call:

1) Test whether the key exists.
2) If the key exists, locate it.

But whether you use find or index, in the end you *have to* break it into two steps in order to write bug-free code.

Without find, you can do:

    if s in txt:
        i = txt.index(s)
        ...
    else:
        pass

or:

    try:
        i = txt.index(s)
        ...
    except ValueError:
        pass

With find:

    i = txt.find(s)
    if i >= 0:
        ...
    else:
        pass

The code is about the same, except that with the exception the failure test is pushed far away from the call instead of following it immediately. Not much coding is saved.
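The three idioms compared in this message can be put side by side as runnable functions. This is a sketch for comparison only; the function names (`tail_with_in`, `tail_with_index`, `tail_with_find`) and the choice to return the text from the match onward are made up for illustration.

```python
def tail_with_in(txt, s):
    # Two searches: one for the membership test, one for the position.
    if s in txt:
        i = txt.index(s)
        return txt[i:]
    return None


def tail_with_index(txt, s):
    # One search; failure is signalled by ValueError.
    try:
        i = txt.index(s)
    except ValueError:
        return None
    return txt[i:]


def tail_with_find(txt, s):
    # One search; failure is signalled by -1, which must be tested explicitly.
    i = txt.find(s)
    if i >= 0:
        return txt[i:]
    return None


variants = (tail_with_in, tail_with_index, tail_with_find)
hits = [f('hello world', 'world') for f in variants]
misses = [f('hello world', 'xyz') for f in variants]
```

All three give the same answers; they differ only in how many searches are made and where the failure case is handled.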
Re: [Python-Dev] Remove str.find in 3.0?
Steve Holden <[EMAIL PROTECTED]> wrote: > > Josiah Carlson wrote: > > Donovan Baarda <[EMAIL PROTECTED]> wrote: > [...] > > > > One thing that has gotten my underwear in a twist is that no one has > > really offered up a transition mechanism from "str.find working like now" > > and some future "str.find or lack of" other than "use str.index". > > Obviously, I personally find the removal of str.find to be a nonstarter > > (don't make me catch exceptions or use regular expressions when both are > > unnecessary, please), but a proper transition of str.find from -1 to > > None on failure would be beneficial (can which one be chosen at runtime > > via __future__ import?). > > > > During a transition which uses __future__, it would encourage the > > /proper/ use of str.find in all modules and extensions in which use it... > > > > x = y.find(z) > > if x >= 0: > > #... > > > It does seem rather fragile to rely on the continuation of the current > behavior > > >>> None >= 0 > False Please see this previous post on None comparisons and why it is unlikely to change: http://mail.python.org/pipermail/python-dev/2003-December/041374.html > for the correctness of "proper usage". Is this guaranteed in future > implementations? Especially when: > > >>> type(None) >= 0 > True That is an interesting, but subjectively useless comparison: >>> type(0) >= 0 True >>> type(int) >= 0 True When do you ever compare the type of an object with the value of another object? > > Forcing people to use the proper semantic in their modules so as to be > > compatible with other modules which may or may not use str.find returns > > None, would (I believe) result in an overall reduction (if not > > elimination) of bugs stemming from str.find, and would prevent former > > str.find users from stumbling down the try/except/else misuse that Tim > > Peters highlighted. 
> >
> Once "str.find() returns None to fail" becomes the norm then surely the
> correct usage would be
>
>     x = y.find(z)
>     if x is not None:
>         #...
>
> which is still a rather ugly paradigm, but acceptable. So the transition
> is bound to be troubling.

Perhaps, which is why I offered "x >= 0".

> > Heck, if you can get the __future__ import working for choosing which
> > str.find to use (on a global, not per-module basis), I say toss it into
> > 2.6, or even 2.5 if there is really a push for this prior to 3.0 .
>
> The real problem is surely that one of find()'s legitimate return values
> evaluates false in a Boolean context. It's especially troubling that the
> value that does so doesn't indicate search failure. I'd prefer to live
> with the wart until 3.0 introduces something more satisfactory, or
> simply removes find() altogether. Otherwise the resulting code breakage
> when the future arrives just causes unnecessary pain.

Here's a current (horrible but common) solution:

    x = string.find(substring) + 1
    if x:
        x -= 1
        ...

...I'm up way too late.

- Josiah
Re: [Python-Dev] Remove str.find in 3.0?
Josiah Carlson wrote: > Donovan Baarda <[EMAIL PROTECTED]> wrote: [...] > > One thing that has gotten my underwear in a twist is that no one has > really offered up a transition mechanism from "str.find working like now" > and some future "str.find or lack of" other than "use str.index". > Obviously, I personally find the removal of str.find to be a nonstarter > (don't make me catch exceptions or use regular expressions when both are > unnecessary, please), but a proper transition of str.find from -1 to > None on failure would be beneficial (can which one be chosen at runtime > via __future__ import?). > > During a transition which uses __future__, it would encourage the > /proper/ use of str.find in all modules and extensions in which use it... > > x = y.find(z) > if x >= 0: > #... > It does seem rather fragile to rely on the continuation of the current behavior >>> None >= 0 False for the correctness of "proper usage". Is this guaranteed in future implementations? Especially when: >>> type(None) >= 0 True > Forcing people to use the proper semantic in their modules so as to be > compatible with other modules which may or may not use str.find returns > None, would (I believe) result in an overall reduction (if not > elimination) of bugs stemming from str.find, and would prevent former > str.find users from stumbling down the try/except/else misuse that Tim > Peters highlighted. > Once "str.find() returns None to fail" becomes the norm then surely the correct usage would be x = y.find(z) if x is not None: #... which is still a rather ugly paradigm, but acceptable. So the transition is bound to be troubling. > Heck, if you can get the __future__ import working for choosing which > str.find to use (on a global, not per-module basis), I say toss it into > 2.6, or even 2.5 if there is really a push for this prior to 3.0 . The real problem is surely that one of find()'s legitimate return values evaluates false in a Boolean context. 
It's especially troubling that the value that does so doesn't indicate search failure. I'd prefer to live with the wart until 3.0 introduces something more satisfactory, or simply removes find() altogether. Otherwise the resulting code breakage when the future arrives just causes unnecessary pain. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC http://www.holdenweb.com/ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
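To make the two candidate tests concrete, here is a sketch of a None-returning search written as a hypothetical wrapper, `find_or_none` (it is not a real str method). It assumes a Python 3 interpreter, where ordering comparisons between None and an int raise TypeError instead of returning False, so only the `is not None` test works there.

```python
def find_or_none(s, sub):
    """Hypothetical variant of str.find() that returns None on failure."""
    i = s.find(sub)
    return None if i < 0 else i


hit = find_or_none('abc', 'a')    # found at position 0
miss = find_or_none('abc', 'z')   # not found

# The identity test distinguishes position 0 from failure.
hit_found = hit is not None
miss_found = miss is not None

# On Python 3 the "x >= 0" spelling cannot be applied to None at all.
try:
    miss >= 0
    ordering_allowed = True
except TypeError:
    ordering_allowed = False
```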
Re: [Python-Dev] Remove str.find in 3.0?
Donovan Baarda <[EMAIL PROTECTED]> wrote: > > On Sat, 2005-08-27 at 10:16 -0700, Josiah Carlson wrote: > > Guido van Rossum <[EMAIL PROTECTED]> wrote: > [...] > > Oh, there's a good thing to bring up; regular expressions! re.search > > returns a match object on success, None on failure. With this "failure > > -> Exception" idea, shouldn't they raise exceptions instead? And > > goodness, defining a good regular expression can be quite hard, possibly > > leading to not insignificant "my regular expression doesn't do what I > > want it to do" bugs. Just look at all of those escape sequences and the > > syntax! It's enough to make a new user of Python gasp. > > I think re.match() returning None is an example of 1b (as categorised by > Terry Reedy). In this particular case a 1b style response is OK. Why; My tongue was firmly planted in my cheek during my discussion of regular expressions. I was using it as an example of when one starts applying some arbitrary rule to one example, and not noticing other examples that do very similar, if not the same thing. [snip discussion of re.match, re.search, str.find] If you are really going to compare re.match, re.search and str.find, you need to point out that neither re.match nor re.search raise an exception when something isn't found (only when you try to work with None). This puts str.index as the odd-man-out in this discussion of searching a string - so the proposal of tossing str.find as the 'weird one' is a little strange. One thing that has gotten my underwear in a twist is that no one has really offered up a transition mechanism from "str.find working like now" and some future "str.find or lack of" other than "use str.index". 
Obviously, I personally find the removal of str.find to be a nonstarter (don't make me catch exceptions or use regular expressions when both are unnecessary, please), but a proper transition of str.find from -1 to None on failure would be beneficial (can which one be chosen at runtime via __future__ import?). During a transition which uses __future__, it would encourage the /proper/ use of str.find in all modules and extensions in which use it... x = y.find(z) if x >= 0: #... Forcing people to use the proper semantic in their modules so as to be compatible with other modules which may or may not use str.find returns None, would (I believe) result in an overall reduction (if not elimination) of bugs stemming from str.find, and would prevent former str.find users from stumbling down the try/except/else misuse that Tim Peters highlighted. Heck, if you can get the __future__ import working for choosing which str.find to use (on a global, not per-module basis), I say toss it into 2.6, or even 2.5 if there is really a push for this prior to 3.0 . - Josiah ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Remove str.find in 3.0?
On Sat, 2005-08-27 at 10:16 -0700, Josiah Carlson wrote: > Guido van Rossum <[EMAIL PROTECTED]> wrote: [...] > Oh, there's a good thing to bring up; regular expressions! re.search > returns a match object on success, None on failure. With this "failure > -> Exception" idea, shouldn't they raise exceptions instead? And > goodness, defining a good regular expression can be quite hard, possibly > leading to not insignificant "my regular expression doesn't do what I > want it to do" bugs. Just look at all of those escape sequences and the > syntax! It's enough to make a new user of Python gasp. I think re.match() returning None is an example of 1b (as categorised by Terry Reedy). In this particular case a 1b style response is OK. Why; 1) any successful match evaluates to "True", and None evaluates to "False". This allows simple code like; if myreg.match(s): do something. Note you can't do this for find, as 0 is a successful "find" and evaluates to False, whereas other results including -1 evaluate to True. Even worse, -1 is a valid index. 2) exceptions are for unexpected events, where unexpected means "much less likely than other possibilities". The re.match() operation asks "does this match this", which implies you have an about even chance of not matching... ie a failure to match is not unexpected. The result None makes sense... "what match did we get? None, OK". For str.index() you are asking "give me the index of this inside this", which implies you expect it to be in there... ie not finding it _is_ unexpected, and should raise an exception. Note that re.match() returning None will raise exceptions if the rest of your code doesn't expect it; index = myreg.match(s).start() tail = s[index:] This will raise an exception if there was no match. Unlike str.find(); index = s.find(r) tail = s[index:] Which will happily return the last character if there was no match. This is why find() should return None instead of -1. 
> With the existence of literally thousands of uses of .find and .rfind in
> the wild, any removal consideration should be weighed heavily - which
> honestly doesn't seem to be the case here with the ~15 minute reply time
> yesterday (just my observation and opinion). If you had been ruminating
> over this previously, great, but that did not seem clear to me in your
> original reply to Terry Reedy.

Bear in mind they are talking about Python 3.0... I think :-)

--
Donovan Baarda <[EMAIL PROTECTED]>
http://minkirri.apana.org.au/~abo/
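The contrast drawn in this message between a failed find() and a failed match can be reproduced directly. This is a sketch with made-up sample data; the string `s` and the pattern in `myreg` are assumptions chosen so that neither search succeeds.

```python
import re

s = 'key=value'

# str.find(): failure is -1, which is also a valid index.
index = s.find(';')          # no ';' in s
tail = s[index:]             # silently slices off the last character

# A successful find at position 0 is falsy, so "if s.find(x):" is wrong.
first = 'abc'.find('a')

# re: failure is None, which blows up as soon as it is used.
myreg = re.compile(r';.*')   # hypothetical pattern
try:
    start = myreg.match(s).start()
    match_failed_loudly = False
except AttributeError:
    match_failed_loudly = True
```

The find() failure passes through the slice unnoticed, while the failed match raises at the point of use.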
Re: [Python-Dev] Remove str.find in 3.0?
On 8/27/05, Josiah Carlson <[EMAIL PROTECTED]> wrote:
> With the existence of literally thousands of uses of .find and .rfind in
> the wild, any removal consideration should be weighed heavily - which
> honestly doesn't seem to be the case here with the ~15 minute reply time
> yesterday (just my observation and opinion). If you had been ruminating
> over this previously, great, but that did not seem clear to me in your
> original reply to Terry Reedy.

I hadn't been ruminating about deleting it previously, but I was well aware of the likelihood of writing buggy tests for find()'s return value. I believe that str.find() is not just something that can be used to write buggy code, but something that *causes* bugs over and over again. (However, see below.)

The argument that there are thousands of usages in the wild doesn't carry much weight when we're talking about Python 3.0. There are at least a similar number of modules that expect dict.keys(), zip() and range() to return lists, or that depend on the distinction between Unicode strings and 8-bit strings, or on bare except:, or on any other feature that is slated for deletion in Python 3.0 for which the replacement requires careful rethinking of the code rather than a mechanical translation.

The *premise* of Python 3.0 is that it drops backwards compatibility in order to make the language better in the long term. Surely you believe that the majority of all Python programs have yet to be written?

The only argument in this thread in favor of find() that made sense to me was Tim Peters' observation that the requirement to use a try/except clause leads to another kind of sloppy code. It's hard to judge which is worse -- the buggy find() calls or the buggy/cumbersome try/except code.

Note that all code (unless it needs to be backwards compatible to Python 2.2 and before) which is using find() to merely detect whether a given substring is present should be using 's1 in s2' instead.
Another observation: despite the derogatory remarks about regular expressions, they have one thing going for them: they provide a higher level of abstraction for string parsing, which this is all about. (They are higher level in that you don't have to be counting characters, which is about the lowest-level activity in programming -- only counting bytes is lower!) Maybe if we had a *good* way of specifying string parsing we wouldn't be needing to call find() or index() so much at all! (A good example is the code that Raymond lifted from ConfigParser: a semicolon preceded by whitespace starts a comment, other semicolons don't. Surely there ought to be a better way to write that.) All in all, I'm still happy to see find() go in Python 3.0, but I'm leaving the door ajar: if you read this post carefully, you'll know what arguments can be used to persuade me. -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Remove str.find in 3.0?
On 8/26/05, Guido van Rossum <[EMAIL PROTECTED]> wrote: > On 8/26/05, Terry Reedy <[EMAIL PROTECTED]> wrote: > > Can str.find be listed in PEP 3000 (under builtins) for removal? > > Yes please. (Except it's not technically a builtin but a string method.) > Done. Added an "Atomic Types" section to the PEP as well. -Brett ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Remove str.find in 3.0?
Guido van Rossum <[EMAIL PROTECTED]> wrote: > > On 8/26/05, Josiah Carlson <[EMAIL PROTECTED]> wrote: > > Taking a look at the commits that Guido did way back in 1993, he doesn't > > mention why he added .find, only that he did. Maybe it was another of > > the 'functional language additions' that he now regrets, I don't know. > > There's nothing functional about it. I remember adding it after > finding it cumbersome to write code using index/rindex. However, that > was long before we added startswith(), endswith(), and 's in t' for > multichar s. Clearly all sorts of varieties of substring matching are > important, or we wouldn't have so many methods devoted to it! (Not to > mention the 're' module.) > > However, after 12 years, I believe that the small benefit of having > find() is outweighed by the frequent occurrence of bugs in its use. Oh, there's a good thing to bring up; regular expressions! re.search returns a match object on success, None on failure. With this "failure -> Exception" idea, shouldn't they raise exceptions instead? And goodness, defining a good regular expression can be quite hard, possibly leading to not insignificant "my regular expression doesn't do what I want it to do" bugs. Just look at all of those escape sequences and the syntax! It's enough to make a new user of Python gasp. Most of us are consenting adults here. If someone writes buggy code with str.find, that is unfortunate, maybe they should have used regular expressions and tested for None, maybe they should have used str.startswith (which is sometimes slower than m == n[:len(m)], but I digress), maybe they should have used str.index. But just because buggy code can be written with it, doesn't mean that it should be removed. Buggy code can, will, and has been written with every Python mechanism that has ever existed or will ever exist. 
With the existence of literally thousands of uses of .find and .rfind in the wild, any removal consideration should be weighed heavily - which honestly doesn't seem to be the case here with the ~15 minute reply time yesterday (just my observation and opinion). If you had been ruminating over this previously, great, but that did not seem clear to me in your original reply to Terry Reedy.

- Josiah
Re: [Python-Dev] Remove str.find in 3.0?
[Tim] > You probably want "except ValueError:" in all these, not "except > ValueError():". Right. I was misremembering the new edict to write: raise ValueError() Raymond ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Remove str.find in 3.0?
[Raymond Hettinger, rewrites some code] > ... > --- StringIO.py --- > > i = self.buf.find('\n', self.pos) > if i < 0: >newpos = self.len > else: >newpos = i+1 > . . . > > > try: >i = self.buf.find('\n', self.pos) > except ValueError(): >newpos = self.len > else: >newpos = i+1 > . . . You probably want "except ValueError:" in all these, not "except ValueError():". Leaving that alone, the last example particularly shows one thing I dislike about try/except here: in a language with properties, how is the code reader supposed to guess that it's specifically and only the .find() call that _can_ raise ValueError in i = self.buf.find('\n', self.pos) ? I agree it's clear enough here from context, but there's no confusion possible on this point in the original spelling: it's immediately obvious that the result of find() is the only thing being tested. There's also strong temptation to slam everything into the 'try' block, and reduce nesting: newpos = self.len try: newpos = self.buf.find('\n', self.pos) + 1 except ValueError: pass I've often seen code in the wild with, say, two-three dozen lines in a ``try`` block, with an "except AttributeError:" that was _intended_ to catch an expected AttributeError only in the second of those lines. Of course that hides legitimate bugs too. Like ``object.attr``, the result of ``string.find()`` is normally used in further computation, so the temptation is to slam the computation inside the ``try`` block too. .find() is a little delicate to use, but IME sloppy try/except practice (putting much more in the ``try`` block than the specific little operation where an exception is expected) is common, and harder to get people to change because it requires thought instead of just reading the manual to see that -1 means "not there" <0.5 wink>. Another consideration is code that needs to use .find() a _lot_. In my programs of that sort, try/except is a lot more expensive than letting -1 signal "not there". 
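The point about keeping the ``try`` block small can be illustrated with the StringIO example. This is a sketch, not the library's code: the function names are made up, the buffer is passed in as a plain string, and `str.index()` is used for the exception-raising variant.

```python
def next_pos_find(buf, pos):
    # Original spelling: only the result of find() is tested.
    i = buf.find('\n', pos)
    if i < 0:
        return len(buf)
    return i + 1


def next_pos_narrow_try(buf, pos):
    # Only the call that is expected to raise sits inside the try block;
    # the follow-up computation lives in the else clause.
    try:
        i = buf.index('\n', pos)
    except ValueError:
        return len(buf)
    else:
        return i + 1


def next_pos_wide_try(buf, pos):
    # The tempting shortcut: computation is folded into the try block, so a
    # ValueError from anything added there later would be swallowed too.
    newpos = len(buf)
    try:
        newpos = buf.index('\n', pos) + 1
    except ValueError:
        pass
    return newpos


buf = 'ab\ncd'
results = [(f(buf, 0), f(buf, 3))
           for f in (next_pos_find, next_pos_narrow_try, next_pos_wide_try)]
```

All three agree on this input; the difference is in how much code an unexpected exception could be hidden in.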
[Python-Dev] Remove str.find in 3.0?
[Guido] > However, after 12 years, I believe that the small benefit of having > find() is outweighed by the frequent occurrence of bugs in its use. My little code transformation exercise is bearing that out. Two of the first four cases in the standard library were buggy :-( Raymond ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Remove str.find in 3.0?
On Sat, 27 Aug 2005 16:46:07 +0200, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> On 8/27/05, Wolfgang Lipp <[EMAIL PROTECTED]> wrote:
>> i never expected .get()
>> to work that way (return an unsolicited None) -- i do consider this
>> behavior harmful and suggest it be removed.
>
> That's a bizarre attitude. You don't read the docs and hence you want
> a feature you weren't aware of to be removed?

i do read the docs, and i believe i do keep a lot of detail in my head. every now and then, tho, you piece sth together using a logic that is not 100% the way it was intended, or the way it came about. let me say that for someone who did development for python for a while it is natural to know that ~.get() is there for avoidance of exceptions, and default values are an afterthought, but for someone who did development *with* python (and lacks experience of the other side) this ain't necessarily so. that said, i believe it to be more expressive and safer to demand ~.get('x',None) to be written to achieve the present behavior, and let ~.get('x') raise an exception. personally, i can live with either way, and am happier the second. just my thoughts.

> I'm glad you're not on *my* team. (Emphasis mine. :-)

i wonder what that would be like.

_wolf
Re: [Python-Dev] Remove str.find in 3.0?
On 8/26/05, Josiah Carlson <[EMAIL PROTECTED]> wrote: > Taking a look at the commits that Guido did way back in 1993, he doesn't > mention why he added .find, only that he did. Maybe it was another of > the 'functional language additions' that he now regrets, I don't know. There's nothing functional about it. I remember adding it after finding it cumbersome to write code using index/rindex. However, that was long before we added startswith(), endswith(), and 's in t' for multichar s. Clearly all sorts of varieties of substring matching are important, or we wouldn't have so many methods devoted to it! (Not to mention the 're' module.) However, after 12 years, I believe that the small benefit of having find() is outweighed by the frequent occurrence of bugs in its use. -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Remove str.find in 3.0?
On 8/27/05, Wolfgang Lipp <[EMAIL PROTECTED]> wrote: > i never expected .get() > to work that way (return an unsolicited None) -- i do consider this > behavior harmful and suggest it be removed. That's a bizarre attitude. You don't read the docs and hence you want a feature you weren't aware of to be removed? I'm glad you're not on *my* team. (Emphasis mine. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Remove str.find in 3.0?
On 8/27/05, Kay Schluehr <[EMAIL PROTECTED]> wrote: > The discourse about Python3000 has shrunken from the expectation of the > "next big thing" into a depressive rhetorics of feature elimination. > The language doesn't seem to become deeper, smaller and more powerfull > but just smaller. I understand how your perception reading python-dev would make you think that, but it's not true. There is much focus on removing things, because we want to be able to add new stuff but we don't want the language to grow. Python-dev is (correctly) very focused on the status quo and the near future, so discussions on what can be removed without hurting are valuable here. Discussions on what to add should probably happen elsewhere, since the proposals tend to range from genius to insane (sometimes within one proposal :-) and the discussion tends to become even more rampant than the discussions about changes in 2.5. -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Remove str.find in 3.0?
Raymond Hettinger wrote: > [Martin] >> For another example, file.read() returns an empty string at EOF. > > When my turn comes for making 3.0 proposals, I'm going to recommend > nixing the "empty string at EOF" API. That is a carry-over from C that > made some sense before there were iterators. Now, we have the option of > introducing much cleaner iterator versions of these methods that use > compact, fast, and readable for-loops instead of multi-statement > while-loop boilerplate. I think for char in iter(lambda: f.read(1), ''): pass is not bad, too. Reinhold -- Mail address is perfectly valid! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
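The two-argument form of iter() used in this message calls the function repeatedly until it returns the sentinel. A small sketch, assuming Python 3 and an in-memory `io.StringIO` standing in for the file `f`:

```python
import io

f = io.StringIO('abc')   # stand-in for an open text file

chars = []
# iter(callable, sentinel): stop as soon as f.read(1) returns '' (EOF).
for char in iter(lambda: f.read(1), ''):
    chars.append(char)

at_eof = f.read(1)       # further reads keep returning the empty string
```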
Re: [Python-Dev] Remove str.find in 3.0?
On 8/27/05, Raymond Hettinger <[EMAIL PROTECTED]> wrote: > --- From ConfigParser.py --- > > optname, vi, optval = mo.group('option', 'vi', 'value') > if vi in ('=', ':') and ';' in optval: > # ';' is a comment delimiter only if it follows > # a spacing character > pos = optval.find(';') > if pos != -1 and optval[pos-1].isspace(): > optval = optval[:pos] > optval = optval.strip() > . . . > > > optname, vi, optval = mo.group('option', 'vi', 'value') > if vi in ('=', ':') and ';' in optval: > # ';' is a comment delimiter only if it follows > # a spacing character > try: > pos = optval.index(';') > except ValueError(): I'm sure you meant "except ValueError:" > pass > else: > if optval[pos-1].isspace(): > optval = optval[:pos] > optval = optval.strip() > . . . That code is buggy before and after the transformation -- consider what happens if optval *starts* with a semicolon. Also, the code is searching optval for ';' twice. Suggestion: if vi in ('=',':'): try: pos = optval.index(';') except ValueError: pass else: if pos > 0 and optval[pos-1].isspace(): optval = optval[:pos] -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
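The suggested fix can be wrapped in a function to try it out. This is a sketch only: the function name `strip_inline_comment` is hypothetical, the surrounding ConfigParser logic is omitted, and, like the suggestion above, it looks only at the first semicolon in the value.

```python
def strip_inline_comment(optval):
    """Drop a ';' comment only when the ';' follows a spacing character."""
    try:
        pos = optval.index(';')
    except ValueError:
        pass                      # no semicolon at all
    else:
        # pos > 0 guards against optval[-1] when the value starts with ';'.
        if pos > 0 and optval[pos - 1].isspace():
            optval = optval[:pos]
    return optval.strip()


with_comment = strip_inline_comment('value ; a comment')
no_space = strip_inline_comment('a;b')
leading = strip_inline_comment(';x ')
```

Without the `pos > 0` guard, a value starting with ';' would be tested against its own last character, which is the bug pointed out above.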
Re: [Python-Dev] Remove str.find in 3.0?
On 8/27/05, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
> [Martin]
> > For another example, file.read() returns an empty string at EOF.
>
> When my turn comes for making 3.0 proposals, I'm going to recommend
> nixing the "empty string at EOF" API. That is a carry-over from C that
> made some sense before there were iterators. Now, we have the option of
> introducing much cleaner iterator versions of these methods that use
> compact, fast, and readable for-loops instead of multi-statement
> while-loop boilerplate.

-1.

For reading lines we already have that in the status quo.

For reading bytes, I *know* that a lot of code would become uglier if the API changed to raise EOFError exceptions. It's not a coincidence that raw_input() raises EOFError but readline() doesn't -- the readline API was designed after extensive experience with raw_input().

The situation is different than for find():

- there aren't two APIs that only differ in their handling of the exceptional case

- the error return value tests false and all non-error return values test true

- in many cases processing the error return value the same as non-error return values works just fine (as long as you have another way to test for termination)

Also, even if read() raised EOFError instead of returning '', code that expects certain data wouldn't be simplified -- after attempting to read e.g. 4 bytes, you'd still have to check that you got exactly 4, so there'd be three cases to handle (EOFError, short, good) instead of two (short, good).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
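The "two cases instead of three" argument can be sketched with fixed-size records. The helper name `read_record`, the 4-byte record size, and the sample data are assumptions for illustration; `io.BytesIO` stands in for a real file.

```python
import io


def read_record(f, size=4):
    """Return exactly `size` bytes, or None for a short (or empty) read."""
    data = f.read(size)
    if len(data) != size:
        # Covers both plain EOF (b'') and a truncated record in one test.
        return None
    return data


stream = io.BytesIO(b'ABCDEFGHIJ')   # two full records plus 2 stray bytes

records = []
while True:
    rec = read_record(stream)
    if rec is None:
        break
    records.append(rec)
```

Because an empty read is just the shortest possible short read, one length check handles end of file and truncation alike.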
Re: [Python-Dev] Remove str.find in 3.0?
kay, your suggestion makes perfect sense for me, i haven't actually tried the examples tho. guess there could be a find() or index() or indices() or iterIndices() ??? function 'f' roughly with these arguments:

    def f( x, element, start = 0, stop = None, default = _Misfit,
           maxcount = None, reverse = False )

that iterates over the indices of x where element (a substring, key, or value in a sequence or iterator) is found, raising sth. like IndexError when nothing at all is found except when default is not '_Misfit' (meta-None), and starts looking from the right end when reverse is True (this *may* imply that reversed(x) is done on x where no better implementation is available). not quite sure whether it makes sense to me to always return default as the last value of the iteration -- i tend to say rather not. ah yes, only up to maxcount indices are yielded. be it said that passing an iterator for x would mean that the iterator is gone up to where the last index was yielded; passing an iterator is not acceptable for reverse = True.

MHO,

_wolf

On Sat, 27 Aug 2005 14:57:08 +0200, Kay Schluehr <[EMAIL PROTECTED]> wrote:
>
> def keep(iter, default=None):
>     try:
>         return iter.next()
>     except StopIteration:
>         return default
>
> Together with an index iterator the user can mimic the behaviour he
> wants. Instead of a ValueError a StopIteration exception can hold as
> an "external" information ( other than a default value ):
>
> >>> keep( "abcdabc".index("bc"), default=-1) # current behaviour of the
>                                              # find() function
> >>> (idx for idx in "abcdabc".rindex("bc"))  # generator expression
>
>
> Since the find() method acts on a string literal it is not easy to
> replace it syntactically. But why not add functions that can be hooked
> into classes whose objects are represented by literals?
>
> def find( string, substring):
>     return keep( string.index( substring), default=-1)
>
> str.register(find)
>
> >>> "abcdabc".find("bc")
> 1
>
> Now find() can be stored in a pure Python module without maintaining it
> on interpreter level ( same as with reduce, map and filter ).
>
> Kay
Re: [Python-Dev] Remove str.find in 3.0?
Wolfgang Lipp wrote:

> > Just because you don't read the documentation and guessed wrong
> > d.get() needs to be removed?!?
>
> no, not removed... never said that.

Fair enough, you proposed to remove the behavior. Not sure how that's all that much less bad, though...

> implied). the reason of being for d.get() -- to me -- is simply so you
> get a chance to pass a default value, which is syntactically well-nigh
> impossible with d['x'].

Close, but the main reason to add d.get() was to avoid the exception. The need to specify a default value followed from that.

Just
Re: [Python-Dev] Remove str.find in 3.0?
FWIW, here are three more comparative code fragments. They are presented without judgment as an evaluation tool to let everyone form their own opinion about the merits of each:

--- From CGIHTTPServer.py ---

    def run_cgi(self):
        """Execute a CGI script."""
        dir, rest = self.cgi_info
        i = rest.rfind('?')
        if i >= 0:
            rest, query = rest[:i], rest[i+1:]
        else:
            query = ''
        i = rest.find('/')
        if i >= 0:
            script, rest = rest[:i], rest[i:]
        else:
            script, rest = rest, ''
        . . .


    def run_cgi(self):
        """Execute a CGI script."""
        dir, rest = self.cgi_info
        try:
            i = rest.rindex('?')
        except ValueError():
            query = ''
        else:
            rest, query = rest[:i], rest[i+1:]
        try:
            i = rest.index('/')
        except ValueError():
            script, rest = rest, ''
        else:
            script, rest = rest[:i], rest[i:]
        . . .


--- From ConfigParser.py ---

    optname, vi, optval = mo.group('option', 'vi', 'value')
    if vi in ('=', ':') and ';' in optval:
        # ';' is a comment delimiter only if it follows
        # a spacing character
        pos = optval.find(';')
        if pos != -1 and optval[pos-1].isspace():
            optval = optval[:pos]
    optval = optval.strip()
    . . .


    optname, vi, optval = mo.group('option', 'vi', 'value')
    if vi in ('=', ':') and ';' in optval:
        # ';' is a comment delimiter only if it follows
        # a spacing character
        try:
            pos = optval.index(';')
        except ValueError():
            pass
        else:
            if optval[pos-1].isspace():
                optval = optval[:pos]
    optval = optval.strip()
    . . .


--- StringIO.py ---

        i = self.buf.find('\n', self.pos)
        if i < 0:
            newpos = self.len
        else:
            newpos = i+1
        . . .


        try:
            i = self.buf.index('\n', self.pos)
        except ValueError():
            newpos = self.len
        else:
            newpos = i+1
        . . .


My notes so far weren't meant to judge the proposal. I'm just suggesting that examining fragments like the ones above will help inform the design process.

Peace,

Raymond
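[Editor's sketch] To compare the two CGIHTTPServer styles on actual inputs, the path-splitting logic can be lifted into two standalone functions. The function names and the sample path are hypothetical; the bodies follow the fragments above, with the except clause written as `except ValueError:` (the form Guido's reply in this thread points out was intended).

```python
def split_with_find(rest):
    """Return-code style: test the result of rfind()/find()."""
    i = rest.rfind('?')
    if i >= 0:
        rest, query = rest[:i], rest[i + 1:]
    else:
        query = ''
    i = rest.find('/')
    if i >= 0:
        script, rest = rest[:i], rest[i:]
    else:
        script, rest = rest, ''
    return script, rest, query


def split_with_index(rest):
    """Exception style: rindex()/index() with try/except/else."""
    try:
        i = rest.rindex('?')
    except ValueError:
        query = ''
    else:
        rest, query = rest[:i], rest[i + 1:]
    try:
        i = rest.index('/')
    except ValueError:
        script, rest = rest, ''
    else:
        script, rest = rest[:i], rest[i:]
    return script, rest, query


sample = "script.py/extra/path?a=1&b=2"
print(split_with_find(sample))
print(split_with_index(sample))
```

Both versions should agree on every input; the difference under discussion is purely one of readability.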
Re: [Python-Dev] Remove str.find in 3.0?
Terry Reedy wrote:

>>I would object to the removal of str.find().
>
>
> So, I wonder, what is your favored alternative?
>
> A. Status quo: ignore the opportunity to streamline the language.

I actually don't see much benefit from the user perspective. The discourse about Python3000 has shrunk from the expectation of the "next big thing" into a depressing rhetoric of feature elimination. The language doesn't seem to become deeper, smaller and more powerful but just smaller.

> B. Change the return type of .find to None.
>
> C. Remove .(r)index instead.
>
> D. Add more redundancy for those who do not like exceptions.

Why not turn index() into an iterator that yields indices successively? From this generalized perspective we can try to reconstruct behaviour of Python 2.X.

Sometimes I use a custom keep() function if I want to prevent defining a block for catching StopIteration. The keep() function takes an iterator and returns a default value in case of StopIteration:

    def keep(iter, default=None):
        try:
            return iter.next()
        except StopIteration:
            return default

Together with an index iterator the user can mimic the behaviour he wants. Instead of a ValueError a StopIteration exception can hold as an "external" information ( other than a default value ):

    >>> keep( "abcdabc".index("bc"), default=-1) # current behaviour of the
                                                 # find() function
    >>> (idx for idx in "abcdabc".rindex("bc"))  # generator expression

Since the find() method acts on a string literal it is not easy to replace it syntactically. But why not add functions that can be hooked into classes whose objects are represented by literals?

    def find( string, substring):
        return keep( string.index( substring), default=-1)

    str.register(find)

    >>> "abcdabc".find("bc")
    1

Now find() can be stored in a pure Python module without maintaining it on interpreter level ( same as with reduce, map and filter ).

Kay
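[Editor's sketch] Kay's idea relies on an index() that returns an iterator, which str does not provide. A rough approximation, assuming a hypothetical free function `iter_indices` in its place and using the built-in next() with a default instead of `iter.next()`, could look like this. The `str.register` hook from the message is not an existing API and is left out.

```python
def iter_indices(string, substring):
    """Yield every start offset of substring in string, left to right."""
    start = 0
    while True:
        try:
            pos = string.index(substring, start)
        except ValueError:
            return
        yield pos
        start = pos + 1  # advance by one so overlapping matches are reported


def keep(iterator, default=None):
    """Return the iterator's next item, or default once it is exhausted."""
    return next(iterator, default)


def find(string, substring):
    """find()-style lookup built from the two pieces above."""
    return keep(iter_indices(string, substring), default=-1)


print(list(iter_indices("abcdabc", "bc")))
print(find("abcdabc", "bc"), find("abcdabc", "x"))
```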
Re: [Python-Dev] Remove str.find in 3.0?
On Sat, 27 Aug 2005 13:01:02 +0200, Just van Rossum <[EMAIL PROTECTED]> wrote:

> Just because you don't read the documentation and guessed wrong d.get()
> needs to be removed?!?

no, not removed... never said that.

> It's a *feature* of d.get(k) to never raise KeyError. If you need an
> exception, why not just use d[k]?

i agree i misread the specs, but then, i read the specs a lot, and i guess everyone here agrees that if it's in the specs doesn't mean it's automatically what we want or expect -- else there's nothing to discuss. i say

    d.get('x') == None <== { ( 'x' not in d ) OR ( d['x'] == None ) }

is not what i expect (even tho the specs say so) especially since d.pop('x') *does* throw a KeyError when 'x' is not a key in mydict. ok, pop is not get and so on but still i perceive this a problematic behavior (to the point i call it a 'bug' in a jocular way, no offense implied). the reason of being for d.get() -- to me -- is simply so you get a chance to pass a default value, which is syntactically well-nigh impossible with d['x'].

_wolf
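[Editor's sketch] The ambiguity described here, and the contrast with pop(), can be shown in a few lines. `_MISSING` and `describe` are hypothetical names introduced only for this illustration; passing a private sentinel as the default is one common way to tell "absent" from "present with value None".

```python
_MISSING = object()  # hypothetical sentinel, distinct from any stored value


def describe(d, key):
    """Report whether key is absent, or present with which value."""
    value = d.get(key, _MISSING)
    if value is _MISSING:
        return "absent"
    return "present: %r" % (value,)


d = {'x': None}

# Without an explicit default, both lookups give None.
print(d.get('x'), d.get('y'))

# pop() without a default raises for a missing key, unlike get().
try:
    d.pop('y')
except KeyError:
    print("pop raised KeyError")

print(describe(d, 'x'), "/", describe(d, 'y'))
```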
Re: [Python-Dev] Remove str.find in 3.0?
Wolfgang Lipp wrote:

> On Sat, 27 Aug 2005 12:35:30 +0200, Martin v. Löwis <[EMAIL PROTECTED]>
> wrote:
> > P.S. Emphasis mine :-)
>
> no, emphasis all **mine** :-) just to reflect i never expected .get()
> to work that way (return an unsolicited None) -- i do consider this
> behavior harmful and suggest it be removed.

Just because you don't read the documentation and guessed wrong d.get() needs to be removed?!?

It's a *feature* of d.get(k) to never raise KeyError. If you need an exception, why not just use d[k]?

Just
Re: [Python-Dev] Remove str.find in 3.0?
On Sat, 27 Aug 2005 12:35:30 +0200, Martin v. Löwis <[EMAIL PROTECTED]> wrote:

> P.S. Emphasis mine :-)

no, emphasis all **mine** :-) just to reflect i never expected .get() to work that way (return an unsolicited None) -- i do consider this behavior harmful and suggest it be removed.

_wolf
Re: [Python-Dev] Remove str.find in 3.0?
Wolfgang Lipp wrote:

> that's a bug! i had to *test* it to find out it's true! i've been writing
> code for *years* all in the understanding that dict.get(x) acts precisely
> like dict['x'] *except* you get a chance to define a default value.

Clearly, your understanding *all* these years *was* wrong. If you don't specify *a* default value, *it* defaults to None.

Regards,
Martin

P.S. Emphasis mine :-)
Re: [Python-Dev] Remove str.find in 3.0?
On Sat, 27 Aug 2005 08:54:12 +0200, Martin v. Löwis <[EMAIL PROTECTED]> wrote:

> with choice 1a): dict.get returns None if the key is not found, even
> though None could also be the value for the key.

that's a bug! i had to *test* it to find out it's true! i've been writing code for *years* all in the understanding that dict.get(x) acts precisely like dict['x'] *except* you get a chance to define a default value. which, for me, has become sort of a standard solution to the problem the last ten or so postings were all about: when i write a function and realize that's one of the cases where python philosophy strongly favors raising an exception because something e.g. could not be found where expected, i make it so that a reasonable exception is raised *and* where meaningful i give consumers a chance to pass in a default value to eschew exceptions. i believe this is the way to go to resolve this .index/.find conflict. and, no, returning -1 when a substring is not found and None when a key is not found is *highly* problematic. i'd sure like to see cases like that to go.

i'm not sure why .rindex() should go (correct?), and how to do what it does (reverse the string before doing .index()? is that what is done internally?)

and of course, like always, there is the question why these are methods at all and why there is a function len(str) but a method str.index(); one could just as well have *either* str.length and str.index() *or* length(str) and, say, a builtin

    locate( x, element, start = 0 , stop = None, reversed = False,
            default = Misfit )

(where Misfit indicates a 'meta-None', so None is still a valid default value; i also like to indicate 'up to the end' with stop=None) that does on iterables (or only on sequences) what the methods do now, but with this strange pattern:

--
          .index()   .find()   .get()   .pop()
list         +                  ?(3)      +
tuple                           ?(3)     ??(1)
str          +          +       ?(3)     ??(1)
dict        x(2)       x(2)      +        +

(1) one could argue this should return a copy of a tuple or str, but doubtful.
(2) index/find meaningless for dicts.
(3) there is no .get() for list, tuple, str, although it would make sense: return the indexed element, or raise IndexError where not found if no default return value given.
--

what bites me here is especially that we have both index and find for str *but a gaping hole* for tuples. assuming tuples are not slated for removal, i suggest to move in a direction that makes things look more like this:

--
          .index()   .get()   .pop()
list         +         +        +
tuple        +         +
str          +         +
dict                   +        +
--

where .index() looks like locate, above:

--
{list,tuple,str}.index(
    element,            # element in the collection
    start = 0,          # where to start searching; default is zero
    stop = None,        # where to end; the default, None, indicates
                        # 'to the end'
    reversed = False,   # should we search from the back? *may* cause
                        # reversion of sequence, depending on impl.
    default = _Misfit,  # default value, when given, prevents
                        # IndexError from being raised
    )
--

hope i didn't miss out crucial points here.

_wolf
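[Editor's sketch] A minimal free-function version of the proposed locate() might look as follows. Assumptions beyond the message: the flag is spelled `reverse` rather than `reversed` to avoid shadowing the builtin, only single elements are matched (no substring search), negative start/stop values are not handled, and `_Misfit` is a plain object() sentinel.

```python
_Misfit = object()  # the 'meta-None' sentinel, so None stays a valid default


def locate(seq, element, start=0, stop=None, reverse=False, default=_Misfit):
    """Return the position of element in seq[start:stop].

    Raise IndexError if it is not found, unless a default was given.
    """
    stop = len(seq) if stop is None else min(stop, len(seq))
    positions = range(start, stop)
    if reverse:
        positions = reversed(positions)  # search from the back
    for i in positions:
        if seq[i] == element:
            return i
    if default is _Misfit:
        raise IndexError("%r not found" % (element,))
    return default


print(locate([1, 2, 3, 2], 2), locate([1, 2, 3, 2], 2, reverse=True))
print(locate((1, 2), 9, default=None))
```

With this shape, the caller chooses per call between the exception and a return value, which is the compromise the message argues for.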
Re: [Python-Dev] Remove str.find in 3.0?
[Martin]
> For another example, file.read() returns an empty string at EOF.

When my turn comes for making 3.0 proposals, I'm going to recommend nixing the "empty string at EOF" API. That is a carry-over from C that made some sense before there were iterators. Now, we have the option of introducing much cleaner iterator versions of these methods that use compact, fast, and readable for-loops instead of multi-statement while-loop boilerplate.

Raymond
Re: [Python-Dev] Remove str.find in 3.0?
> > The most important reason for the patch is that looking at the context
> > diff will provide an objective look at how real code will look before
> > and after the change. This would make subsequent discussions
> > substantially more informed and less anecdotal.
>
> No, you're just artificially trying to raise the bar for Python 3.0
> proposals to an unreasonable height.

Not really. I'm mostly for the proposal (+0), but am certain the conversation about the proposal would be substantially more informed if we had a side-by-side comparison of what real-world code looks like before and after the change. There are not too many instances of str.find() in the library and it is an easy patch to make. I'm just asking for a basic, objective investigative tool.

Unlike more complex proposals, this one doesn't rely on any new functionality. It just says don't use X anymore. That makes it particularly easy to investigate in an objective way.

BTW, this isn't unprecedented. We've already done it once when backticks got slated for removal in 3.0. All instances of it got changed in the standard library. As a result of the patch, we were able to 1) get an idea of how much work it took, 2) determine every category of use case, 3) learn that the resulting code was more beautiful, readable, and only microscopically slower, 4) learn about a handful of cases that were unexpectedly difficult to convert, and 5) update the library to be an example of what we think modern code looks like. That patch painlessly informed the decision making and validated that we were doing the right thing.

The premise of Terry's proposal is that Python code is better when str.find() is not used. This is a testable proposition. Why not use the wealth of data at our fingertips to augment a priori reasoning and anecdotes? I'm not at all arguing against the proposal; I'm just asking for a thoughtful design process.

Raymond

P.S. Josiah was not alone. The comp.lang.python discussion had other posts expressing distaste for raising exceptions instead of using return codes. While I don't feel the same way, I don't think the respondents should be ignored.

"Those people who love sausage and respect the law should not watch either one being made."
Re: [Python-Dev] Remove str.find in 3.0?
Bill Janssen wrote:

>> There are basically two ways for a system, such as a
>> Python function, to indicate 'I cannot give a normal response.' One (1a)
>> is to give an inband signal that is like a normal response except that it
>> is not (str.find returning -1). A variation (1b) is to give an inband
>> response that is more obviously not a real response (many None returns).
>> The other (2) is to not respond (never return normally) but to give an
>> out-of-band signal of some sort (str.index raising ValueError).
>>
>> Python as distributed usually chooses 1b or 2. I believe str.find and
>> .rfind are unique in the choice of 1a.
>
> Doubt it. The problem with returning None is that it tests as False,
> but so does 0, which is a valid string index position.

Heh. You know what the Perl6 folks would suggest in this case?

    return 0 but true; # literally!

> Might add a boolean "str.contains()" to cover this test case.

There's already __contains__.

Reinhold

--
Mail address is perfectly valid!
Re: [Python-Dev] Remove str.find in 3.0?
Terry Reedy wrote:

> One (1a)
> is to give an inband signal that is like a normal response except that it
> is not (str.find returning -1).
>
> Python as distributed usually chooses 1b or 2. I believe str.find and
> .rfind are unique in the choice of 1a.

That is not true. str.find's choice is not 1a, and there are other functions which chose 1a):

-1 does *not* look like a normal response, since a normal response is non-negative. It is *not* the only method with choice 1a): dict.get returns None if the key is not found, even though None could also be the value for the key.

For another example, file.read() returns an empty string at EOF.

> I am pretty sure that the choice
> of -1 as error return, instead of, for instance, None, goes back to the
> need in static languages such as C to return something of the declared
> return type. But Python is not C, etcetera. I believe that this pair is
> also unique in having exact counterparts of type 2.

dict.__getitem__ is a counterpart of type 2 of dict.get.

> So, I wonder, what is your favored alternative?
>
> A. Status quo: ignore the opportunity to streamline the language.

My favourite choice is the status quo. I probably don't fully understand the word "to streamline", but I don't see this as rationalizing. Instead, some applications will be more tedious to write.

> So are you advocating D above or claiming that substring indexing is
> uniquely deserving of having two versions? If the latter, why so special?

Because it is no exception that a string is not part of another string, and because the question I'm asking is "is the string in the other string, and if so, where?". This is similar to the question "does the dictionary have a value for that key, and if so, which?"

> If we only had str.index, would you actually suggest adding this particular
> duplication?

That is what happened to dict.get: it was not originally there (I believe), but added later.

Regards,
Martin
Re: [Python-Dev] Remove str.find in 3.0?
"Terry Reedy" <[EMAIL PROTECTED]> wrote: > > "Josiah Carlson" <[EMAIL PROTECTED]> wrote in message > news:[EMAIL PROTECTED] > > > > "Terry Reedy" <[EMAIL PROTECTED]> wrote: > >> > >> Can str.find be listed in PEP 3000 (under builtins) for removal? > > Guido has already approved, I noticed, but he approved before anyone could say anything. I understand it is a dictatorship, but he seems to take advisment and reverse (or not) his decisions on occasion based on additional information. Whether this will lead to such, I don't know. > but I will try to explain my reasoning a bit > better for you. There are basically two ways for a system, such as a > Python function, to indicate 'I cannot give a normal response." One (1a) > is to give an inband signal that is like a normal response except that it > is not (str.find returing -1). A variation (1b) is to give an inband > response that is more obviously not a real response (many None returns). > The other (2) is to not respond (never return normally) but to give an > out-of-band signal of some sort (str.index raising ValueError). > > Python as distributed usually chooses 1b or 2. I believe str.find and > .rfind are unique in the choice of 1a. I am pretty sure that the choice > of -1 as error return, instead of, for instance, None, goes back the the > need in static languages such as C to return something of the declared > return type. But Python is not C, etcetera. I believe that this pair is > also unique in having exact counterparts of type 2. (But maybe I forgot > something.) Taking a look at the commits that Guido did way back in 1993, he doesn't mention why he added .find, only that he did. Maybe it was another of the 'functional language additions' that he now regrets, I don't know. > >> Would anyone really object? > > > I would object to the removal of str.find(). > > So, I wonder, what is your favored alternative? > > A. Status quo: ignore the opportunity to streamline the language. 
str.find is not a language construct. It is a method on a built-in type that many people use. This is my vote. > B. Change the return type of .find to None. Again, this would break potentially thousands of lines of user code that is in the wild. Are we talking about changes for 2.5 here, or 3.0? > C. Remove .(r)index instead. see below * > D. Add more redundancy for those who do not like exceptions. In 99% of the cases, such implementations would be minimal. While I understand that "There should be one-- and preferably only one --obvious way to do it.", please see below *. > > Further, forcing users to use try/except when they are looking for the > > offset of a substring seems at least a little strange (if not a lot > > braindead, no offense to those who prefer their code to spew exceptions > > at every turn). > > So are you advocating D above or claiming that substring indexing is > uniquely deserving of having two versions? If the latter, why so special? > If we only has str.index, would you actually suggest adding this particular > duplication? Apparently everyone has forgotten the dozens of threads on similar topics over the years. I'll attempt to summarize. Adding functionality that isn't used is harmful, but not nearly as harmful as removing functionality that people use. If you take just two seconds and do a search on '.find(' vs '.index(' in the standard library, you will notice that '.find(' is used more often than '.index(' regardless of type (I don't have the time this evening to pick out which ones are string only, but I doubt the standard library uses mmap.find, DocTestFinder.find, or gettext.find). This example seems to show that people find str.find to be more intuitive and/or useful than str.index, even though you spent two large paragraphs explaining that Python 'doesn't do it that way very often so it isn't Pythonic'. Apparently the majority of people who have been working on the standard library for the last decade disagree. 
> > Considering the apparent dislike/hatred for str.find. > > I don't hate str.find. I simply (a) recognize that a function designed for > static typing constraints is out of place in Python, which does not have > those constraints and (b) believe that there is no reason other than > history for the duplication and (c) believe that dropping .find is > definitely better than dropping .index and changing .find. * I don't see why it is necessary to drop or change either one. We've got list() and [] for construcing a list. Heck, we've even got list(iterable) and [i for i in iterable] for making a list copy of any arbitrary iterable. This goes against TSBOOWTDI, so why don't we toss list comprehensions now that we have list(generator expression)? Or did I miss something and this was already going to happen? > > Would you further request that .rfind be removed from strings? > > Of course. Thanks for reminding me. No problem, but again, do a search in the standard libr
Re: [Python-Dev] Remove str.find in 3.0?
Don't know *what* I wasn't thinking :-).

Bill

> On 8/26/05, Bill Janssen <[EMAIL PROTECTED]> wrote:
> > Doubt it. The problem with returning None is that it tests as False,
> > but so does 0, which is a valid string index position. The reason
> > string.find() returns -1 is probably to allow a test:
> >
> >     if line.find("\f"):
> >         ... do something
>
> This has a bug; it is equivalent to "if not line.startswith("\f"):".
>
> This mistake (which I have made more than once myself and have seen
> many times in code by others) is one of the main reasons to want to
> get rid of this style of return value.
>
> > Might add a boolean "str.contains()" to cover this test case.
>
> We already got that: "\f" in line.
Re: [Python-Dev] Remove str.find in 3.0?
On 8/26/05, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
> I had one further thought. In addition to your excellent list of
> reasons, it would be great if these kind of requests were accompanied by
> a patch that removed the offending construct from the standard library.

Um? Are we now requiring patches for PYTHON THREE DOT OH proposals? Raymond, we all know and agree that Python 3.0 will be incompatible in many ways. range() and keys() becoming iterators, int/int returning float, and so on; we can safely say that it will break nearly every module under the sun, and no amount of defensive coding in Python 2.x will save us.

> The most important reason for the patch is that looking at the context
> diff will provide an objective look at how real code will look before
> and after the change. This would make subsequent discussions
> substantially more informed and less anecdotal.

No, you're just artificially trying to raise the bar for Python 3.0 proposals to an unreasonable height.

> The second reason is that the revised library code becomes more likely
> to survive the transition to 3.0. Further, it can continue to serve as
> example code which highlights current best practices.

But we don't *want* all of the library code to survive. Much of it is 10-15 years old and in dire need of a total rewrite. See Anthony Baxter's lightning talk at OSCON (I'm sure Google can find it for you).

> This patch wouldn't take long. I've tried about a half dozen cases
> since you first posted. Each provided a new insight (zipfile was not
> improved, webbrowser was improved, and urlparse was about the same).

So it's neutral in terms of code readability. Great. Given all the other advantages for the proposal (an eminent member of this group just posted a buggy example :-) I'm now doubly convinced that we should do it.

Also remember, the standard library is rather atypical -- while some of it makes great example code, other parts of it are highly contorted in order to either maintain backwards compatibility or provide an unusually high level of defensiveness.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] Remove str.find in 3.0?
On 8/26/05, Bill Janssen <[EMAIL PROTECTED]> wrote:
> Doubt it. The problem with returning None is that it tests as False,
> but so does 0, which is a valid string index position. The reason
> string.find() returns -1 is probably to allow a test:
>
>     if line.find("\f"):
>         ... do something

This has a bug; it is equivalent to "if not line.startswith("\f"):".

This mistake (which I have made more than once myself and have seen many times in code by others) is one of the main reasons to want to get rid of this style of return value.

> Might add a boolean "str.contains()" to cover this test case.

We already got that: "\f" in line.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
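[Editor's sketch] The truthiness trap is quick to verify. The two function names below are hypothetical; the first reproduces the quoted test, the second uses the `in` operator mentioned in the reply.

```python
def has_formfeed_buggy(line):
    """The quoted test: treats the find() result as a boolean."""
    return bool(line.find("\f"))


def has_formfeed(line):
    """Containment test using the 'in' operator."""
    return "\f" in line


for line in ["\fabc", "abc", "a\fb", ""]:
    print(repr(line), has_formfeed_buggy(line), has_formfeed(line))
```

A match at offset 0 gives 0 (false) and a miss gives -1 (true), so the buggy version is wrong in exactly the two cases that matter most.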
Re: [Python-Dev] Remove str.find in 3.0?
> There are basically two ways for a system, such as a
> Python function, to indicate 'I cannot give a normal response.' One (1a)
> is to give an inband signal that is like a normal response except that it
> is not (str.find returning -1). A variation (1b) is to give an inband
> response that is more obviously not a real response (many None returns).
> The other (2) is to not respond (never return normally) but to give an
> out-of-band signal of some sort (str.index raising ValueError).
>
> Python as distributed usually chooses 1b or 2. I believe str.find and
> .rfind are unique in the choice of 1a.

Doubt it. The problem with returning None is that it tests as False, but so does 0, which is a valid string index position. The reason string.find() returns -1 is probably to allow a test:

    if line.find("\f"):
        ... do something

Might add a boolean "str.contains()" to cover this test case.

Bill
Re: [Python-Dev] Remove str.find in 3.0?
"Josiah Carlson" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
>
> "Terry Reedy" <[EMAIL PROTECTED]> wrote:
>>
>> Can str.find be listed in PEP 3000 (under builtins) for removal?

Guido has already approved, but I will try to explain my reasoning a bit better for you. There are basically two ways for a system, such as a Python function, to indicate 'I cannot give a normal response.' One (1a) is to give an in-band signal that is like a normal response except that it is not (str.find returning -1). A variation (1b) is to give an in-band response that is more obviously not a real response (many None returns). The other (2) is to not respond (never return normally) but to give an out-of-band signal of some sort (str.index raising ValueError).

Python as distributed usually chooses 1b or 2. I believe str.find and .rfind are unique in the choice of 1a. I am pretty sure that the choice of -1 as error return, instead of, for instance, None, goes back to the need in static languages such as C to return something of the declared return type. But Python is not C, etcetera. I believe that this pair is also unique in having exact counterparts of type 2. (But maybe I forgot something.)

>> Would anyone really object?

> I would object to the removal of str.find().

So, I wonder, what is your favored alternative?

A. Status quo: ignore the opportunity to streamline the language.
B. Change the return type of .find to None.
C. Remove .(r)index instead.
D. Add more redundancy for those who do not like exceptions.

> Further, forcing users to use try/except when they are looking for the
> offset of a substring seems at least a little strange (if not a lot
> braindead, no offense to those who prefer their code to spew exceptions
> at every turn).

So are you advocating D above or claiming that substring indexing is uniquely deserving of having two versions? If the latter, why so special? If we only had str.index, would you actually suggest adding this particular duplication?

> Considering the apparent dislike/hatred for str.find.

I don't hate str.find. I simply (a) recognize that a function designed for static typing constraints is out of place in Python, which does not have those constraints and (b) believe that there is no reason other than history for the duplication and (c) believe that dropping .find is definitely better than dropping .index and changing .find.

> Would you further request that .rfind be removed from strings?

Of course. Thanks for reminding me.

> The inclusion of .rindex?

Yes, the continued inclusion of .rindex, which we already have.

Terry J. Reedy
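
As a rough illustration of the three signalling styles described in the message above, here is a minimal sketch. The helper name `find_or_none` is hypothetical and not from the thread; only `str.find` and `str.index` are the real methods under discussion.

```python
def find_or_none(s, sub):
    """Style 1b: an in-band value that cannot be mistaken for an offset."""
    pos = s.find(sub)
    return None if pos == -1 else pos


s = "www.python.org"

# Style 1a: in-band signal that looks like a normal (integer) result.
in_band = s.find("perl")

# Style 1b: in-band signal that is obviously not an offset.
obvious = find_or_none(s, "perl")

# Style 2: out-of-band signal; no value is returned at all.
try:
    s.index("perl")
    raised = False
except ValueError:
    raised = True

print(in_band, obvious, raised)
```

The point of the sketch is only that the caller of style 1a receives an int either way, so nothing forces the "not found" case to be handled.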
Re: [Python-Dev] Remove str.find in 3.0?
"Raymond Hettinger" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
>> Can str.find be listed in PEP 3000 (under builtins) for removal?
>
> FWIW, here is a sample code transformation (extracted from zipfile.py).
> Judge for yourself whether the index version is better:

I am sure that we both could write similar code that would be smoother if the math module also had a 'powhalf' function that was the same as sqrt except for returning -1 instead of raising an error on negative or non-numerical input. I'll continue in response to Josiah...

Terry J. Reedy
Re: [Python-Dev] Remove str.find in 3.0?
"Guido van Rossum" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> On 8/26/05, Terry Reedy <[EMAIL PROTECTED]> wrote:
>> Can str.find be listed in PEP 3000 (under builtins) for removal?
>
> Yes please. (Except it's not technically a builtin but a string method.)

To avoid suggesting a new header, I interpreted Built-ins broadly to include builtin types. The header could be expanded to "Built-in Constants, Functions, and Types" or "Built-ins and Built-in Types", but I leave such details to the PEP authors.

Terry J. Reedy
Re: [Python-Dev] Remove str.find in 3.0?
"Terry Reedy" <[EMAIL PROTECTED]> wrote:
>
> Can str.find be listed in PEP 3000 (under builtins) for removal?
> Would anyone really object?

I would object to the removal of str.find(). In fact, older versions of Python which only allowed for single-character 'x in str' containment tests offered 'str.find(...) != -1' as a suitable replacement option, which is found in the standard library more than a few times...

Further, forcing users to use try/except when they are looking for the offset of a substring seems at least a little strange (if not a lot braindead, no offense to those who prefer their code to spew exceptions at every turn).

I've been thinking for years that .find should be part of the set of operations offered to most, if not all, sequences (lists, buffers, tuples, ...). Considering the apparent dislike/hatred for str.find, it seems I was wise in not requesting it in the past.

>
> Reasons:
>
> 1. Str.find is essentially redundant with str.index. The only difference
> is that str.index Pythonically indicates 'not found' by raising an
> exception while str.find does the same by anomalously returning -1. As
> best as I can remember, this is common for Unix system calls but unique
> among Python builtin functions. Learning and remembering both is a
> nuisance.

So pick one and forget the other. I think of .index as a list method (because lists don't offer .find), not a string method, even though it is.

> 2. As is being discussed in a current c.l.p thread, -1 is a legal indexing
> subscript. If one uses the return value as a subscript without checking,
> the bug is not caught. None would be a better return value should find not
> be deleted.

And would break potentially thousands of lines of code in the wild which expect -1 right now. Look in the standard library for starting examples, and google around for others.

> 3. Anyone who prefers to test return values instead of catch exceptions can
> write (simplified, without start,end params):
>
> def sfind(string, target):
>     try:
>         return string.index(target)
>     except ValueError:
>         return None # or -1 for back compatibility, but None better
>
> This can of course be done for any function/method that indicates input
> errors with exceptions instead of a special return value. I see no reason
> other than history that this particular method should be doubled.

I prefer my methods to stay on my instances, and I could have sworn that the string module's functions were generally deprecated in favor of string methods. Now you are (implicitly) advocating the reversal of such for one method which doesn't raise an exception under a very normal circumstance.

Would you further request that .rfind be removed from strings? The inclusion of .rindex?

 - Josiah
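
For readers wondering what a sequence-wide .find might look like, here is a minimal sketch under the assumption that the sequence already offers .index. The name `seq_find` is hypothetical; nothing like it was proposed in concrete form in this message.

```python
def seq_find(seq, item, start=0):
    """Hypothetical find() for any sequence with .index(): -1 if absent."""
    try:
        return seq.index(item, start)
    except ValueError:
        return -1


print(seq_find([10, 20, 30], 20))
print(seq_find((10, 20, 30), 99))
print(seq_find("abc", "c"))
```

Note that this is the sfind wrapper from point 3 run in the other direction: whichever of the two conventions a type provides, the other can be layered on top in a few lines.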
Re: [Python-Dev] Remove str.find in 3.0?
> Can str.find be listed in PEP 3000 (under builtins) for removal?
> Would anyone really object?
>
> Reasons: . . .

I had one further thought. In addition to your excellent list of reasons, it would be great if these kinds of requests were accompanied by a patch that removed the offending construct from the standard library.

The most important reason for the patch is that looking at the context diff will provide an objective look at how real code will look before and after the change. This would make subsequent discussions substantially more informed and less anecdotal.

The second reason is that the revised library code becomes more likely to survive the transition to 3.0. Further, it can continue to serve as example code which highlights current best practices.

This patch wouldn't take long. I've tried about a half dozen cases since you first posted. Each provided a new insight (zipfile was not improved, webbrowser was improved, and urlparse was about the same).

Raymond
Re: [Python-Dev] Remove str.find in 3.0?
> Can str.find be listed in PEP 3000 (under builtins) for removal?

FWIW, here is a sample code transformation (extracted from zipfile.py). Judge for yourself whether the index version is better:

Existing code:
--------------
    END_BLOCK = min(filesize, 1024 * 4)
    fpin.seek(filesize - END_BLOCK, 0)
    data = fpin.read()
    start = data.rfind(stringEndArchive)
    if start >= 0:     # Correct signature string was found
        endrec = struct.unpack(structEndArchive, data[start:start+22])
        endrec = list(endrec)
        comment = data[start+22:]
        if endrec[7] == len(comment):     # Comment length checks out
            # Append the archive comment and start offset
            endrec.append(comment)
            endrec.append(filesize - END_BLOCK + start)
            return endrec
    return      # Error, return None

Revised code:
-------------
    END_BLOCK = min(filesize, 1024 * 4)
    fpin.seek(filesize - END_BLOCK, 0)
    data = fpin.read()
    try:
        start = data.rindex(stringEndArchive)
    except ValueError:
        pass
    else:
        # Correct signature string was found
        endrec = struct.unpack(structEndArchive, data[start:start+22])
        endrec = list(endrec)
        comment = data[start+22:]
        if endrec[7] == len(comment):     # Comment length checks out
            # Append the archive comment and start offset
            endrec.append(comment)
            endrec.append(filesize - END_BLOCK + start)
            return endrec
    return      # Error, return None
Re: [Python-Dev] Remove str.find in 3.0?
On 8/26/05, Terry Reedy <[EMAIL PROTECTED]> wrote:
> Can str.find be listed in PEP 3000 (under builtins) for removal?

Yes please. (Except it's not technically a builtin but a string method.)

> Would anyone really object?

Not me.

> Reasons:
>
> 1. Str.find is essentially redundant with str.index. The only difference
> is that str.index Pythonically indicates 'not found' by raising an
> exception while str.find does the same by anomalously returning -1. As
> best as I can remember, this is common for Unix system calls but unique
> among Python builtin functions. Learning and remembering both is a
> nuisance.
>
> 2. As is being discussed in a current c.l.p thread, -1 is a legal indexing
> subscript. If one uses the return value as a subscript without checking,
> the bug is not caught. None would be a better return value should find not
> be deleted.
>
> 3. Anyone who prefers to test return values instead of catch exceptions can
> write (simplified, without start,end params):
>
> def sfind(string, target):
>     try:
>         return string.index(target)
>     except ValueError:
>         return None # or -1 for back compatibility, but None better
>
> This can of course be done for any function/method that indicates input
> errors with exceptions instead of a special return value. I see no reason
> other than history that this particular method should be doubled.

I'd like to add:

4. The no. 1 use case for str.find() used to be testing whether a substring was present or not; "if s.find(sub) >= 0" can now be written as "if sub in s". This avoids the nasty bug in "if s.find(sub)".

> If .find is scheduled for the dustbin of history, I would be willing to
> suggest doc and docstring changes. (str.index.__doc__ currently refers to
> str.find.__doc__. This should be reversed.)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
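
A minimal sketch of the bug mentioned in point 4 above. The three function names are hypothetical, chosen only for this illustration.

```python
def buggy_contains(s, sub):
    # Bug: find() returns 0 (falsy) for a match at the very start and
    # -1 (truthy) for no match, so both of those cases come out wrong.
    return bool(s.find(sub))


def old_contains(s, sub):
    # The older idiom: compare against the sentinel explicitly.
    return s.find(sub) >= 0


def new_contains(s, sub):
    # The replacement suggested in point 4.
    return sub in s


s = "spam and eggs"
for sub in ("spam", "eggs", "ham"):
    print(sub, buggy_contains(s, sub), old_contains(s, sub), new_contains(s, sub))
```

Only the substring in the middle or at the end of the string ("eggs") gives the right answer with the buggy form, which is what makes the mistake easy to miss in casual testing.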
[Python-Dev] Remove str.find in 3.0?
Can str.find be listed in PEP 3000 (under builtins) for removal?
Would anyone really object?

Reasons:

1. Str.find is essentially redundant with str.index. The only difference is that str.index Pythonically indicates 'not found' by raising an exception while str.find does the same by anomalously returning -1. As best as I can remember, this is common for Unix system calls but unique among Python builtin functions. Learning and remembering both is a nuisance.

2. As is being discussed in a current c.l.p thread, -1 is a legal indexing subscript. If one uses the return value as a subscript without checking, the bug is not caught. None would be a better return value should find not be deleted.

3. Anyone who prefers to test return values instead of catch exceptions can write (simplified, without start,end params):

    def sfind(string, target):
        try:
            return string.index(target)
        except ValueError:
            return None # or -1 for back compatibility, but None better

This can of course be done for any function/method that indicates input errors with exceptions instead of a special return value. I see no reason other than history that this particular method should be doubled.

If .find is scheduled for the dustbin of history, I would be willing to suggest doc and docstring changes. (str.index.__doc__ currently refers to str.find.__doc__. This should be reversed.)

Terry J. Reedy
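
A minimal sketch of the uncaught-subscript problem in point 2, reusing the sfind wrapper from point 3. The helper names `char_via_find` and `char_via_sfind` and the sample strings are hypothetical.

```python
def sfind(string, target):
    try:
        return string.index(target)
    except ValueError:
        return None


def char_via_find(s, target):
    # Unchecked use of find(): a miss yields -1, which indexes the last char.
    return s[s.find(target)]


def char_via_sfind(s, target):
    # Unchecked use of sfind(): a miss yields None, which str indexing rejects.
    return s[sfind(s, target)]


print(char_via_find("key=value", "="))
print(char_via_find("novalue", "="))
try:
    char_via_sfind("novalue", "=")
except TypeError as exc:
    print("TypeError:", exc)
```

One caveat worth keeping in mind: None is accepted as a slice bound, so "novalue"[None:] returns the whole string without complaint. A None return therefore catches unchecked plain indexing but not unchecked slicing.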