Re: [Python-Dev] Remove str.find in 3.0?

2005-08-31 Thread Gareth McCaughan
I wrote:

[Andrew Durdin:]
> > IOW, I expected "www.python.org".partition("python") to return exactly
> > the same as "www.python.org".rpartition("python")
> 
> Yow. Me too, and indeed I've been skimming this thread without
> it ever occurring to me that it would be otherwise.

And, on re-skimming the thread, I think that was always the plan.
So that's OK, then. :-)

-- 
g

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-31 Thread Gareth McCaughan
> Just to put my spoke in the wheel, I find the difference in the
> ordering of return values for partition() and rpartition() confusing:
> 
> head, sep, remainder = partition(s)
> remainder, sep, head = rpartition(s)
> 
> My first expectation for rpartition() was that it would return exactly
> the same values as partition(), but just work from the end of the
> string.
> 
> IOW, I expected "www.python.org".partition("python") to return exactly
> the same as "www.python.org".rpartition("python")

Yow. Me too, and indeed I've been skimming this thread without
it ever occurring to me that it would be otherwise.

> Anyway, I'm definitely +1 on partition(), but -1 on rpartition()
> returning in "reverse order".

+1.

-- 
g



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-30 Thread Josiah Carlson

Steve Holden <[EMAIL PROTECTED]> wrote:
> 
> Guido van Rossum wrote:
> > On 8/30/05, Andrew Durdin <[EMAIL PROTECTED]> wrote:
> [confusion]
> > 
> > 
> > Hm. The example is poorly chosen because it's an end case. The
> > invariant for both is (I'd hope!)
> > 
> >   "".join(s.partition()) == s == "".join(s.rpartition())
> > 
> > Thus,
> > 
> >   "a/b/c".partition("/") returns ("a", "/", "b/c")
> > 
> >   "a/b/c".rpartition("/") returns ("a/b", "/", "c")
> > 
> > That can't be confusing can it?
> > 
> > (Just think of it as rpartition() stopping at the last occurrence,
> > rather than searching from the right. :-)
> > 
> So we can check that a substring x appears precisely once in the string 
> s using
> 
> s.partition(x) == s.rpartition(x)
> 
> Oops, it fails if s == "". I can usually find some way to go wrong ...

There was an example in the standard library that used "s.find(y) ==
s.rfind(y)" as a test for zero or one instances of the searched-for item.

Generally though, s.count(x)==1 is a better test.
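
A minimal sketch comparing the three tests mentioned in this subthread. The helper names are made up for illustration; the string methods are the ones available in released Python.

```python
def once_by_partition(s, x):
    # Steve's test: the two splits agree when the first and the last
    # occurrence of x are the same one. It also agrees when s == "".
    return s.partition(x) == s.rpartition(x)


def at_most_once_by_find(s, x):
    # The stdlib-style test: true for zero occurrences (both return -1)
    # as well as for exactly one.
    return s.find(x) == s.rfind(x)


def exactly_once(s, x):
    # Josiah's suggestion: count non-overlapping occurrences directly.
    return s.count(x) == 1
```

Only the last helper says "exactly one" without a special case for the empty string or for a missing substring.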

 - Josiah



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-30 Thread Steve Holden
Guido van Rossum wrote:
> On 8/30/05, Andrew Durdin <[EMAIL PROTECTED]> wrote:
[confusion]
> 
> 
> Hm. The example is poorly chosen because it's an end case. The
> invariant for both is (I'd hope!)
> 
>   "".join(s.partition()) == s == "".join(s.rpartition())
> 
> Thus,
> 
>   "a/b/c".partition("/") returns ("a", "/", "b/c")
> 
>   "a/b/c".rpartition("/") returns ("a/b", "/", "c")
> 
> That can't be confusing can it?
> 
> (Just think of it as rpartition() stopping at the last occurrence,
> rather than searching from the right. :-)
> 
So we can check that a substring x appears precisely once in the string 
s using

s.partition(x) == s.rpartition(x)

Oops, it fails if s == "". I can usually find some way to go wrong ...

tongue-in-cheek-ly y'rs  - steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-30 Thread Terry Reedy

"Delaney, Timothy (Tim)" <[EMAIL PROTECTED]> wrote in message
> before, sep, after = s.partition('?')
> ('http://www.python.org', '', '')
>
> before, sep, after = s.rpartition('?')
> ('', '', 'http://www.python.org')

I can also see this as left, sep, right, with the sep not found case 
putting all in left or right depending on whether one scanned to the right 
or left.  In other words, when the scanner runs out of chars to scan, 
everything is 'behind' the scan, where 'behind' depends on the direction of 
scanning.  That seems nicely symmetric.
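
The "everything ends up behind the scan" reading can be checked against the methods as they exist in released Python. This is only an illustration of the not-found case described above; the variable names are arbitrary.

```python
s = "http://www.python.org"

# Scanning left to right and never finding '?': the whole string is
# "behind" the scan, so it lands in the first slot.
left_scan = s.partition("?")

# Scanning right to left and never finding '?': the whole string is
# again "behind" the scan, which now means the last slot.
right_scan = s.rpartition("?")

# When the separator is present, the middle slot is non-empty.
found = s.partition("://")
```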

Terry J. Reedy






Re: [Python-Dev] Remove str.find in 3.0?

2005-08-30 Thread Terry Reedy

"Martin v. Löwis" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Terry Reedy wrote:
>> One (1a) is to give an inband signal that is like a normal
>> response except that it is not (str.find returning -1).
>>
>> Python as distributed usually chooses 1b or 2.
>>  I believe str.find and
>> .rfind are unique in the choice of 1a.
>
> That is not true. str.find's choice is not 1a,

It is the paradigm example of 1a as I meant my definition.

> -1 does *not* look like a normal response,
> since a normal response is non-negative.

Actually, the current doc does not clearly specify to some people that the 
response is a count.  That is what led to the 'str.find is buggy' thread 
on c.l.p, and something I will clarify when I propose a doc patch.

In any case, Python does not have a count type, though I sometimes wish it 
did.  The return type is int and -1 is int, though it is not meant to be 
used as an int and it is a bug to do so.

>It is *not* the only method with choice 1a):
> dict.get returns None if the key is not found,

None is only the default default, and whatever the default is, it is not 
necessarily an error return.  A dict accessed via .get can be regarded as 
an infinite association matching all but a finite set of keys with the 
default.  Example: a doubly infinite array of numbers with only a finite 
number of non-zero entries, implemented as a dict.  This is the view 
actually used if one does normal calculations with that default return. 
There is no need, at least for that access method, for any key to be 
explicitly associated with the default.

If the default *is* regarded as an error indicator, and is only used to 
guard normal processing of the value returned, then that default must not 
be associated with any key.  There is the problem that the domain of dict 
values is normally considered to be any Python object and functions can 
only return Python objects and not any non-Python-object error return.
So the effective value domain for the particular dict must be the set 
'Python objects' minus the error indicator.  With discipline, None often 
works.  Or, to guarantee 1b-ness, one can create a new type that cannot be 
in the dict.
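
A minimal sketch of the two views of dict.get() described above. The names `_MISSING`, `lookup` and `sparse` are hypothetical, chosen only for this illustration.

```python
# A private sentinel object: nothing stored in the dict can be this
# exact object, so an identity test cannot be fooled by a stored value.
_MISSING = object()


def lookup(d, key):
    value = d.get(key, _MISSING)
    if value is _MISSING:          # identity test, not equality
        return "no entry"
    return "found %r" % (value,)


# The other view: the default is a normal value, not an error signal.
# A sparse array of numbers where every unlisted index is zero.
sparse = {2: 5, 7: -1}
total = sum(sparse.get(i, 0) for i in range(10))
```

With None as the default, a stored None and a missing key would be indistinguishable; the sentinel keeps them apart.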

> For another example, file.read() returns an empty string at EOF.

If the request is 'give me the rest of the file as a string', then '' is 
the answer, not a 'cannot answer' indicator.  Similarly, if the request is 
'how many bytes are left to read', then zero is a numerical answer, not a 
non-numerical 'cannot answer' indicator.

Terry J. Reedy







Re: [Python-Dev] Remove str.find in 3.0?

2005-08-30 Thread Andrew Durdin
On 8/31/05, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> 
> Hm. The example is poorly chosen because it's an end case. The
> invariant for both is (I'd hope!)
> 
>   "".join(s.partition()) == s == "".join(s.rpartition())


 
> (Just think of it as rpartition() stopping at the last occurrence,
> rather than searching from the right. :-)

Ah, that makes a difference.  I could see that there was a different
way of looking at the function, I just couldn't see what it was... 
Now I understand the way it's been done.

Cheers,

Andrew.


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-30 Thread Guido van Rossum
On 8/30/05, Andrew Durdin <[EMAIL PROTECTED]> wrote:
> On 8/31/05, Delaney, Timothy (Tim) <[EMAIL PROTECTED]> wrote:
> > Andrew Durdin wrote:
> >
> > > Just to put my spoke in the wheel, I find the difference in the
> > > ordering of return values for partition() and rpartition() confusing:
> > >
> > > head, sep, remainder = partition(s)
> > > remainder, sep, head = rpartition(s)
> >
> > This is the confusion - you've got the terminology wrong.
> >
> > before, sep, after = s.partition('?')
> > ('http://www.python.org', '', '')
> >
> > before, sep, after = s.rpartition('?')
> > ('', '', 'http://www.python.org')
> 
> That's still confusing (to me), though -- when the string is being
> processed, what comes before the separator is the stuff at the end of
> the string, and what comes after is the bit at the beginning of the
> string.  It's not the terminology that's confusing me, though I find
> it hard to describe exactly what is. Maybe it's just me -- does anyone
> else have the same confusion?

Hm. The example is poorly chosen because it's an end case. The
invariant for both is (I'd hope!)

  "".join(s.partition()) == s == "".join(s.rpartition())

Thus,

  "a/b/c".partition("/") returns ("a", "/", "b/c")

  "a/b/c".rpartition("/") returns ("a/b", "/", "c")

That can't be confusing can it?

(Just think of it as rpartition() stopping at the last occurrence,
rather than searching from the right. :-)
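
The invariant and the two examples above can be written as a small check. This sketch uses the str methods as they exist in released Python; `check_invariant` is a made-up helper name.

```python
def check_invariant(s, sep):
    """Return True if both methods reassemble to the original string."""
    return "".join(s.partition(sep)) == s == "".join(s.rpartition(sep))


first = "a/b/c".partition("/")    # stops at the first '/'
last = "a/b/c".rpartition("/")    # stops at the last '/'
```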

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-30 Thread Andrew Durdin
On 8/31/05, Delaney, Timothy (Tim) <[EMAIL PROTECTED]> wrote:
> Andrew Durdin wrote:
> 
> > Just to put my spoke in the wheel, I find the difference in the
> > ordering of return values for partition() and rpartition() confusing:
> >
> > head, sep, remainder = partition(s)
> > remainder, sep, head = rpartition(s)
> 
> This is the confusion - you've got the terminology wrong.
> 
> before, sep, after = s.partition('?')
> ('http://www.python.org', '', '')
> 
> before, sep, after = s.rpartition('?')
> ('', '', 'http://www.python.org')

That's still confusing (to me), though -- when the string is being
processed, what comes before the separator is the stuff at the end of
the string, and what comes after is the bit at the beginning of the
string.  It's not the terminology that's confusing me, though I find
it hard to describe exactly what is. Maybe it's just me -- does anyone
else have the same confusion?


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-30 Thread Delaney, Timothy (Tim)
Andrew Durdin wrote:

> Just to put my spoke in the wheel, I find the difference in the
> ordering of return values for partition() and rpartition() confusing:
> 
> head, sep, remainder = partition(s)
> remainder, sep, head = rpartition(s)

This is the confusion - you've got the terminology wrong.

before, sep, after = s.partition('?')
('http://www.python.org', '', '')

before, sep, after = s.rpartition('?')
('', '', 'http://www.python.org')

Tim Delaney


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-30 Thread Andrew Durdin
On 8/31/05, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
> [Hye-Shik Chang]
> > What would be a result for rpartition(s, '?') ?
> > ('', '', 'http://www.python.org')
> > or
> > ('http://www.python.org', '', '')
> 
> The former.  The invariants for rpartition() are a mirror image of those
> for partition().

Just to put my spoke in the wheel, I find the difference in the
ordering of return values for partition() and rpartition() confusing:

head, sep, remainder = partition(s)
remainder, sep, head = rpartition(s)

My first expectation for rpartition() was that it would return exactly
the same values as partition(), but just work from the end of the
string.

IOW, I expected "www.python.org".partition("python") to return exactly
the same as "www.python.org".rpartition("python")

To try out partition(), I wrote a quick version of split() using
partition, and using partition() was obvious and easy:

def mysplit(s, sep):
    l = []
    while s:
        part, _, s = s.partition(sep)
        l.append(part)
    return l

I tripped up when trying to make an rsplit() (I'm using Python 2.3),
because the return values were in "reverse order"; I had expected the
only change to be using rpartition() instead of partition().
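
For comparison, here is a sketch of the pair under the (before, sep, after) ordering that str.rpartition() has in released Python. The function bodies are illustrative only; like the quick version above, they drop an empty field at the end being consumed last, so they are not exact replacements for str.split()/str.rsplit().

```python
def mysplit(s, sep):
    parts = []
    while s:
        # Take the piece before the first separator, keep the rest.
        part, _, s = s.partition(sep)
        parts.append(part)
    return parts


def myrsplit(s, sep):
    parts = []
    while s:
        # Take the piece after the last separator, keep what precedes it.
        s, _, part = s.rpartition(sep)
        parts.append(part)
    parts.reverse()   # pieces were collected from right to left
    return parts
```

The only differences are the order of the unpacking targets and the final reverse().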

For a second example: one of the "fixed stdlib" examples that Raymond
posted actually uses rpartition and partition in two consecutive lines
-- I found this example not immediately obvious for the above reason:

  def run_cgi(self):
      """Execute a CGI script."""
      dir, rest = self.cgi_info
      rest, _, query = rest.rpartition('?')
      script, _, rest = rest.partition('/')
      scriptname = dir + '/' + script
      scriptfile = self.translate_path(scriptname)
      if not os.path.exists(scriptfile):

Anyway, I'm definitely +1 on partition(), but -1 on rpartition()
returning in "reverse order".

Andrew


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-30 Thread Raymond Hettinger
[Hye-Shik Chang]
> What would be a result for rpartition(s, '?') ?
> ('', '', 'http://www.python.org')
> or
> ('http://www.python.org', '', '')

The former.  The invariants for rpartition() are a mirror image of those
for partition().



 
> BTW, I wrote a somewhat preliminary patch for this functionality
> to let you save a little of your time. :-)
> 
> http://people.freebsd.org/~perky/partition-r1.diff

Thanks.  I've got one running already, but it is nice to have another
for comparison.



Raymond



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-30 Thread Hye-Shik Chang
On 8/28/05, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
> >>> s = 'http://www.python.org'
> >>> partition(s, '://')
> ('http', '://', 'www.python.org')
> >>> partition(s, '?')
> ('http://www.python.org', '', '')
> >>> partition(s, 'http://')
> ('', 'http://', 'www.python.org')
> >>> partition(s, 'org')
> ('http://www.python.', 'org', '')
> 

What would be a result for rpartition(s, '?') ?
('', '', 'http://www.python.org')
or
('http://www.python.org', '', '')

BTW, I wrote a somewhat preliminary patch for this functionality
to let you save a little of your time. :-)

http://people.freebsd.org/~perky/partition-r1.diff


Hye-Shik


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-29 Thread Terry Reedy

"Raymond Hettinger" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> [M.-A. Lemburg]
>> Also, as I understand Terry's request, .find() should be removed
>> in favor of just leaving .index() (which is the identical method
>> without the funny -1 return code logic).

My proposal is to use the 3.0 opportunity to improve the language in this 
particular area.  I considered and ranked five alternatives more or less as 
follows.

1. Keep .index and delete .find.
2. Keep .index and repair .find to return None instead of -1.
3.5 Delete .index and repair .find.
3.5 Keep .index and .find as is.
5. Delete .index and keep .find as is.

> It is new and separate, but it is also related.

I see it as a 6th option: keep .index, delete .find, and replace with 
.partition.  I rank this at least second and maybe first.  It is separable 
in that the replacement can be done now, while the deletion has to wait.

> The core of Terry's request is the assertion that str.find()
> is bug-prone and should not be used.

That and the redundancy, both of which bothered me a bit since I first 
learned the string module functions.
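
A sketch of how the -1 return can go wrong in practice, next to the two alternatives discussed in this thread. The helper names are hypothetical; each is meant to return the part of s before sep, or all of s when sep is absent.

```python
def head_with_find(s, sep):
    i = s.find(sep)
    # Bug: when sep is absent, i is -1 and the slice silently
    # drops the last character instead of signalling anything.
    return s[:i]


def head_with_index(s, sep):
    # Correct, but needs a try-suite around the call.
    try:
        return s[:s.index(sep)]
    except ValueError:
        return s


def head_with_partition(s, sep):
    # No index arithmetic and no exception handling.
    return s.partition(sep)[0]


# A second trap: a match at position 0 is falsy, so "if s.find(x):"
# is False exactly when s starts with x.
starts_with_key = bool("key=value".find("key"))
```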

Terry J. Reedy





Re: [Python-Dev] Remove str.find in 3.0?

2005-08-29 Thread Tony Meyer
[Kay Schluehr]
>> The discourse about Python3000 has shrunk from the expectation
>> of the "next big thing" into a depressing rhetoric of feature
>> elimination. The language doesn't seem to become deeper, smaller
>> and more powerful but just smaller.
 
[Guido]
> There is much focus on removing things, because we want to be able 
> to add new stuff but we don't want the language to grow.

ISTM that a major reason that the Python 3.0 discussion seems 
focused more on removal than addition is that a lot of 
addition can be (and is being) done in Python 2.x.  This is a 
huge benefit, of course, since people can start doing things 
the "new and improved" way in 2.x, even though it's not until 
3.0 that the "old and evil" ;) way is actually removed.

Removal of map/filter/reduce is an example - there isn't 
discussion about addition of new features, because list 
comps/gen expressions are already here...
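
As an illustration of the point, here are the older functional spellings next to comprehension-style equivalents. This is a sketch written for Python 3, where reduce is imported from functools; the sample data is arbitrary.

```python
from functools import reduce
import operator

nums = [1, 2, 3, 4]

# map() versus a list comprehension
squares_map = list(map(lambda n: n * n, nums))
squares_comp = [n * n for n in nums]

# filter() versus a list comprehension with a condition
evens_filter = list(filter(lambda n: n % 2 == 0, nums))
evens_comp = [n for n in nums if n % 2 == 0]

# reduce() versus a builtin fed by a generator expression
total_reduce = reduce(operator.add, nums, 0)
total_sum = sum(n for n in nums)
```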

=Tony.Meyer



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-29 Thread Michael Chermside
Raymond writes:
> That suggests that we need a variant of split() that has been customized
> for typical find/index use cases.  Perhaps introduce a new pair of
> methods, partition() and rpartition()

+1

My only suggestion is that when you're about to make a truly
inspired suggestion like this one, that you use a new subject
header. It will make it easier for the Python-Dev summary
authors and for the people who look back in 20 years to ask
"That str.partition() function is really swiggy! It's everywhere
now, but I wonder what language had it first and who came up with
it?"

-- Michael Chermside

[PS: To explain what "swiggy" means I'd probably have to borrow
  the time machine.]



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-28 Thread Raymond Hettinger
[M.-A. Lemburg]
> Also, as I understand Terry's request, .find() should be removed
> in favor of just leaving .index() (which is the identical method
> without the funny -1 return code logic).
> 
> So your proposal really doesn't have all that much to do
> with Terry's request, but is a new and separate proposal
> (which does have some value in few cases, but not enough
> to warrant a new method).

It is new and separate, but it is also related.  The core of Terry's
request is the assertion that str.find() is bug-prone and should not be
used.  The principal arguments against accepting his request (advanced
by Tim) are that the str.index() alternative is slightly more awkward to
code, more likely to result in try-suites that catch more than intended,
and that the resulting code is slower.  Those arguments fall to the
wayside if str.partition() becomes available as a superior alternative.
IOW, it makes Terry's request much more palatable.




> > def run_cgi(self):
> >     """Execute a CGI script."""
> >     dir, rest = self.cgi_info
> >     rest, _, query = rest.rpartition('?')
> >     script, _, rest = rest.partition('/')

[MAL]
> Wouldn't this do the same ?! ...
> 
> rest, query = rest.rsplit('?', maxsplit=1)
> script, rest = rest.split('/', maxsplit=1)

No.  The split() versions are buggy.  They fail catastrophically when
the original string does not contain '?' or does not contain '/':

>>> rest = 'http://www.example.org/subdir'
>>> rest, query = rest.rsplit('?', 1)

Traceback (most recent call last):
  File "", line 1, in -toplevel-
    rest, query = rest.rsplit('?', 1)
ValueError: need more than 1 value to unpack


The whole point of str.partition() is to repackage str.split() in a way
that is conducive to fulfilling many of the existing use cases for
str.find() and str.index().

In going through the library examples, I've not found a single case
where a direct use of str.split() would improve code that currently uses
str.find().
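
A small sketch of the difference in failure behaviour. The function names are hypothetical, and this sketch uses partition() rather than rpartition() for the '?' so that a missing separator leaves the whole string in the first slot; that choice is an assumption of the example, not taken from the CGI code above.

```python
def split_query_with_rsplit(rest):
    # Unpacking into two names raises ValueError when '?' is absent,
    # because rsplit() then returns a one-element list.
    rest, query = rest.rsplit("?", 1)
    return rest, query


def split_query_with_partition(rest):
    # partition() always returns exactly three items, so the
    # unpacking cannot fail; sep is '' when nothing was found.
    before, sep, after = rest.partition("?")
    return before, after
```

Both agree when the separator is present; only the split-based version fails when it is missing.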



Raymond



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-28 Thread Ron Adam
Raymond Hettinger wrote:

> Looking at sample code transformations shows that the high-power
> mxTextTools and re approaches do not simplify code that currently uses
> s.find().  In contrast, the proposed partition() method is a joy to use
> and has no surprises.  The following code transformation shows
> unbeatable simplicity and clarity.

+1

This doesn't cause any backward-compatibility issues either!

> --- From CGIHTTPServer.py ---
> 
> def run_cgi(self):
>     """Execute a CGI script."""
>     dir, rest = self.cgi_info
>     i = rest.rfind('?')
>     if i >= 0:
>         rest, query = rest[:i], rest[i+1:]
>     else:
>         query = ''
>     i = rest.find('/')
>     if i >= 0:
>         script, rest = rest[:i], rest[i:]
>     else:
>         script, rest = rest, ''
>     . . .
> 
> 
> def run_cgi(self):
>     """Execute a CGI script."""
>     dir, rest = self.cgi_info
>     rest, _, query = rest.rpartition('?')
>     script, _, rest = rest.partition('/')
>     . . .

+1

Much easier to read and understand!

Cheers,
Ron





Re: [Python-Dev] Remove str.find in 3.0?

2005-08-28 Thread M.-A. Lemburg
Raymond Hettinger wrote:
> [Marc-Andre Lemburg]
> 
>>I may be missing something, but why invent yet another parsing
>>method - we already have the re module. I'd suggest to
>>use it :-)
>>
>>If re is not fast enough or you want more control over the
>>parsing process, you could also have a look at mxTextTools:
>>
>>http://www.egenix.com/files/python/mxTextTools.html
> 
> 
> Both are excellent tools.  Neither is as lightweight, as trivial to
> learn, or as transparently obvious as the proposed s.partition(sep).
> The idea is to find a viable replacement for s.find().

Your partition idea could be had with an additional argument
to .split() (e.g. keepsep=1); no need to learn a new method.

Also, as I understand Terry's request, .find() should be removed
in favor of just leaving .index() (which is the identical method
without the funny -1 return code logic).

So your proposal really doesn't have all that much to do
with Terry's request, but is a new and separate proposal
(which does have some value in few cases, but not enough
to warrant a new method).

> Looking at sample code transformations shows that the high-power
> mxTextTools and re approaches do not simplify code that currently uses
> s.find().  In contrast, the proposed partition() method is a joy to use
> and has no surprises.  The following code transformation shows
> unbeatable simplicity and clarity.
> 
> 
> --- From CGIHTTPServer.py ---
> 
> def run_cgi(self):
>     """Execute a CGI script."""
>     dir, rest = self.cgi_info
>     i = rest.rfind('?')
>     if i >= 0:
>         rest, query = rest[:i], rest[i+1:]
>     else:
>         query = ''
>     i = rest.find('/')
>     if i >= 0:
>         script, rest = rest[:i], rest[i:]
>     else:
>         script, rest = rest, ''
>     . . .
> 
> 
> def run_cgi(self):
>     """Execute a CGI script."""
>     dir, rest = self.cgi_info
>     rest, _, query = rest.rpartition('?')
>     script, _, rest = rest.partition('/')

Wouldn't this do the same ?! ...

rest, query = rest.rsplit('?', maxsplit=1)
script, rest = rest.split('/', maxsplit=1)

> . . .
> 
> 
> The new proposal does not help every use case though.  In
> ConfigParser.py, the problem description reads, "a semi-colon is a
> comment delimiter only if it follows a spacing character".  This cries
> out for a regular expression.  In StringIO.py, since the task at hand IS
> calculating an index, an indexless higher level construct doesn't help.
> However, many of the other s.find() use cases in the library simplify as
> readily and directly as the above cgi server example.
> 
> 
> 
> Raymond
> 
> 
> ---
> 
> P.S.  FWIW, if you want to experiment with it, here is a concrete
> implementation of partition() expressed as a function:
> 
> def partition(s, t):
>     """ Returns a three element tuple, (head, sep, tail) where:
> 
>         head + sep + tail == s
>         t not in head
>         sep == '' or sep is t
>         bool(sep) == (t in s)   # sep indicates if the string was found
> 
>     >>> s = 'http://www.python.org'
>     >>> partition(s, '://')
>     ('http', '://', 'www.python.org')
>     >>> partition(s, '?')
>     ('http://www.python.org', '', '')
>     >>> partition(s, 'http://')
>     ('', 'http://', 'www.python.org')
>     >>> partition(s, 'org')
>     ('http://www.python.', 'org', '')
> 
>     """
>     if not isinstance(t, basestring) or not t:
>         raise ValueError('partition argument must be a non-empty string')
>     parts = s.split(t, 1)
>     if len(parts) == 1:
>         result = (s, '', '')
>     else:
>         result = (parts[0], t, parts[1])
>     assert len(result) == 3
>     assert ''.join(result) == s
>     assert result[1] == '' or result[1] is t
>     assert t not in result[0]
>     return result
> 
> 
> import doctest
> print doctest.testmod()

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 28 2005)
>>> Python/Zope Consulting and Support ...http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/




Re: [Python-Dev] Remove str.find in 3.0?

2005-08-28 Thread Josiah Carlson

"Raymond Hettinger" <[EMAIL PROTECTED]> wrote:
> [Guido]
> > Another observation: despite the derogatory remarks about regular
> > expressions, they have one thing going for them: they provide a higher
> > level of abstraction for string parsing, which this is all about.
> > (They are higher level in that you don't have to be counting
> > characters, which is about the lowest-level activity in programming --
> > only counting bytes is lower!)
> > 
> > Maybe if we had a *good* way of specifying string parsing we wouldn't
> > be needing to call find() or index() so much at all! (A good example
> > is the code that Raymond lifted from ConfigParser: a semicolon
> > preceded by whitespace starts a comment, other semicolons don't.
> > Surely there ought to be a better way to write that.)
> 
> A higher level abstraction is surely the way to go.

Perhaps...

> Of course, if this idea survives the day, then I'll meet my own
> requirements and write a context diff on the standard library.  That
> ought to give a good indication of how well the new methods meet
> existing needs and whether the resulting code is better, cleaner,
> clearer, faster, etc.


My first thought when reading the proposal was "that's just
str.split/str.rsplit with maxsplit=1, returning the thing you split on,
with 3 items always returned, what's the big deal?"  Two seconds later it
hit me: that is the big deal.

Right now it is a bit of a pain to get string.split to return a consistent
number of return values; I myself have used:
  l,r = (x.split(y, 1)+[''])[:2]
...around 10 times - 10 times more than I really should have.

Taking a wander through my code, this improves the look and flow in
almost every case (the exceptions being where I should have rewritten to
use 'substr in str' after I started using Python 2.3). Taking a walk
through examples of str.rfind at koders.com leads me to believe that
.partition/.rpartition would generally improve the flow, correctness,
and beauty of code which had previously been using .find/.rfind.
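
The padding idiom quoted above and its partition() counterpart, side by side as a sketch. The function names are made up for illustration.

```python
def pad_split(x, y):
    # Pad the split result with '' so that two values always come back,
    # then trim to two in case the padding was not needed.
    l, r = (x.split(y, 1) + [''])[:2]
    return l, r


def via_partition(x, y):
    # Three values always come back; the middle one is the separator,
    # or '' if it was not found.
    l, sep, r = x.partition(y)
    return l, r
```

The partition() version also reports whether the separator was present, which the padded split cannot distinguish from a trailing separator.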


I hope the idea survives the day.
 - Josiah



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-28 Thread Raymond Hettinger
[Marc-Andre Lemburg]
> I may be missing something, but why invent yet another parsing
> method - we already have the re module. I'd suggest to
> use it :-)
> 
> If re is not fast enough or you want more control over the
> parsing process, you could also have a look at mxTextTools:
> 
> http://www.egenix.com/files/python/mxTextTools.html

Both are excellent tools.  Neither is as lightweight, as trivial to
learn, or as transparently obvious as the proposed s.partition(sep).
The idea is to find a viable replacement for s.find().

Looking at sample code transformations shows that the high-power
mxTextTools and re approaches do not simplify code that currently uses
s.find().  In contrast, the proposed partition() method is a joy to use
and has no surprises.  The following code transformation shows
unbeatable simplicity and clarity.


--- From CGIHTTPServer.py ---

def run_cgi(self):
    """Execute a CGI script."""
    dir, rest = self.cgi_info
    i = rest.rfind('?')
    if i >= 0:
        rest, query = rest[:i], rest[i+1:]
    else:
        query = ''
    i = rest.find('/')
    if i >= 0:
        script, rest = rest[:i], rest[i:]
    else:
        script, rest = rest, ''
    . . .


def run_cgi(self):
    """Execute a CGI script."""
    dir, rest = self.cgi_info
    rest, _, query = rest.rpartition('?')
    script, _, rest = rest.partition('/')
    . . .


The new proposal does not help every use case though.  In
ConfigParser.py, the problem description reads, "a semi-colon is a
comment delimiter only if it follows a spacing character".  This cries
out for a regular expression.  In StringIO.py, since the task at hand IS
calculating an index, an indexless higher level construct doesn't help.
However, many of the other s.find() use cases in the library simplify as
readily and directly as the above cgi server example.
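
For the ConfigParser rule quoted above, a regular expression might look like the following sketch. It is not ConfigParser's actual code; the pattern, the function name and the choice to strip trailing whitespace are all assumptions made for illustration.

```python
import re

# Assumed rule: the first ';' that directly follows a whitespace
# character starts a comment; any other ';' is ordinary data.
_INLINE_COMMENT = re.compile(r"\s;")


def strip_inline_comment(value):
    match = _INLINE_COMMENT.search(value)
    if match is None:
        return value
    # Cut at the whitespace character and drop any spacing before it.
    return value[:match.start()].rstrip()
```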



Raymond


---

P.S.  FWIW, if you want to experiment with it, here is a concrete
implementation of partition() expressed as a function:

def partition(s, t):
    """ Returns a three element tuple, (head, sep, tail) where:

        head + sep + tail == s
        t not in head
        sep == '' or sep is t
        bool(sep) == (t in s)   # sep indicates if the string was found

    >>> s = 'http://www.python.org'
    >>> partition(s, '://')
    ('http', '://', 'www.python.org')
    >>> partition(s, '?')
    ('http://www.python.org', '', '')
    >>> partition(s, 'http://')
    ('', 'http://', 'www.python.org')
    >>> partition(s, 'org')
    ('http://www.python.', 'org', '')

    """
    if not isinstance(t, basestring) or not t:
        raise ValueError('partition argument must be a non-empty string')
    parts = s.split(t, 1)
    if len(parts) == 1:
        result = (s, '', '')
    else:
        result = (parts[0], t, parts[1])
    assert len(result) == 3
    assert ''.join(result) == s
    assert result[1] == '' or result[1] is t
    assert t not in result[0]
    return result


import doctest
print doctest.testmod()
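
The function form can also be compared with the str.partition() method
found in later Python releases.  This is only a sketch and assumes
Python 3, so str stands in for basestring; the loop checks the
invariants from the docstring and that the function and the method agree
on a few separators.

```python
def partition(s, t):
    """Function form of the proposal: return (head, sep, tail)."""
    if not isinstance(t, str) or not t:
        raise ValueError('partition argument must be a non-empty string')
    parts = s.split(t, 1)
    if len(parts) == 1:
        return (s, '', '')
    return (parts[0], t, parts[1])


s = 'http://www.python.org'
for sep in ('://', '?', 'http://', 'org', '.'):
    head, mid, tail = partition(s, sep)
    assert head + mid + tail == s            # join invariant
    assert bool(mid) == (sep in s)           # middle element means "found"
    assert (head, mid, tail) == s.partition(sep)
```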



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-28 Thread M.-A. Lemburg
Raymond Hettinger wrote:
> [Guido]
> 
>>Another observation: despite the derogatory remarks about regular
>>expressions, they have one thing going for them: they provide a higher
>>level of abstraction for string parsing, which this is all about.
>>(They are higher level in that you don't have to be counting
>>characters, which is about the lowest-level activity in programming --
>>only counting bytes is lower!)
>>
>>Maybe if we had a *good* way of specifying string parsing we wouldn't
>>be needing to call find() or index() so much at all! (A good example
>>is the code that Raymond lifted from ConfigParser: a semicolon
>>preceded by whitespace starts a comment, other semicolons don't.
>>Surely there ought to be a better way to write that.)
>  
> A higher level abstraction is surely the way to go.

I may be missing something, but why invent yet another parsing
method - we already have the re module. I'd suggest using
it :-)

If re is not fast enough or you want more control over the
parsing process, you could also have a look at mxTextTools:

http://www.egenix.com/files/python/mxTextTools.html
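
As a rough sketch of what the re spelling of the ConfigParser rule
might look like (strip_comment and _COMMENT are made-up names, and this
is not the library's code): it cuts at the first ';' that directly
follows a whitespace character.  Note that this is slightly more than
the find()-based code does, since that code only inspects the first
semicolon in the value.

```python
import re

# A ';' starts a comment only when whitespace comes right before it.
_COMMENT = re.compile(r'\s;')


def strip_comment(optval):
    """Drop a trailing '; comment' from an option value (illustrative)."""
    m = _COMMENT.search(optval)
    if m:
        optval = optval[:m.start()]
    return optval.strip()
```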

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-28 Thread Raymond Hettinger
[Guido]
> Another observation: despite the derogatory remarks about regular
> expressions, they have one thing going for them: they provide a higher
> level of abstraction for string parsing, which this is all about.
> (They are higher level in that you don't have to be counting
> characters, which is about the lowest-level activity in programming --
> only counting bytes is lower!)
> 
> Maybe if we had a *good* way of specifying string parsing we wouldn't
> be needing to call find() or index() so much at all! (A good example
> is the code that Raymond lifted from ConfigParser: a semicolon
> preceded by whitespace starts a comment, other semicolons don't.
> Surely there ought to be a better way to write that.)

A higher level abstraction is surely the way to go.

I looked over the use cases for find and index.  Aside from cases which
are now covered by the "in" operator, it looks like you almost always
want the index to support a subsequent partition of the string.

That suggests that we need a variant of split() that has been customized
for typical find/index use cases.  Perhaps introduce a new pair of
methods, partition() and rpartition() which work like this:

>>> s = 'http://www.python.org'
>>> s.partition('://')
('http', '://', 'www.python.org')
>>> s.rpartition('.')
('http://www.python', '.', 'org')
>>> s.partition('?')
('http://www.python.org', '', '')

The idea is still preliminary and I have only applied it to a handful of
the library's find() and index() examples.  Here are some of the design
considerations:

* The function always succeeds unless the separator argument is not a
string type or is an empty string.  So, a typical call doesn't have to
be wrapped in a try-suite for normal usage.

* The split invariant is:   s == ''.join(s.partition(t))

* The result of the partition is always a three element tuple.  This
allows the results to be unpacked directly:

   head, sep, tail = s.partition(t)

* The use cases for find() indicate a need both to test for the
presence of the split element and then to make a slice at that point.
If we used a contains test for the first step, we could end up having to
search the string twice (once for detection and once for splitting).
However, by providing the middle element of the result tuple, we can
determine found or not-found without an additional search.  Accordingly,
the middle element has a nice Boolean interpretation with '' for
not-found and a non-empty string meaning found.  Given
(a,b,c)=s.partition(p), the following invariant holds:

   b == '' or b is p
   
* Returning the left, center, and right portions of the split supports a
simple programming pattern for repeated partitions:

   while s:
       head, part, s = s.partition(t)
       . . .

Of course, if this idea survives the day, then I'll meet my own
requirements and write a context diff on the standard library.  That
ought to give a good indication of how well the new methods meet
existing needs and whether the resulting code is better, cleaner,
clearer, faster, etc.
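
To make the last two design points concrete, here is a small sketch
written against the str.partition() method of later Python releases.
The helpers fields() and tail_after() are hypothetical names used only
for illustration.

```python
def fields(s, sep):
    """Yield the pieces of s between occurrences of sep."""
    while s:
        head, found, s = s.partition(sep)
        yield head


def tail_after(s, sep):
    """Return what follows sep, or None when sep is absent."""
    head, found, tail = s.partition(sep)
    # The middle element doubles as the found/not-found flag,
    # so no second search of the string is needed.
    return tail if found else None
```

One thing to watch with the loop form: a trailing separator does not
produce a final empty field, because the loop stops as soon as the
remainder is empty.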



Raymond



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-28 Thread JustFillBug
On 2005-08-26, Terry Reedy <[EMAIL PROTECTED]> wrote:
> Can str.find be listed in PEP 3000 (under builtins) for removal?
> Would anyone really object?
>

With all the discussion, I think you guys should realize that the
find/index methods are actually convenience functions which do 2 things
in one call:
1) Check whether the key exists.
2) If the key exists, find where it is.

But whether you use find or index, in the end you *have to* break it into
2 steps in order to write bug-free code. Without find, you can
do:

if s in txt:
   i = txt.index(s)
   ...
else:
   pass

or:
try:
   i = txt.index(s)
   ...
except ValueError:
   pass

With find:
i = txt.find(s)
if i >= 0:
  ...
else:
  pass

The code is about the same, except that with the exception the test for
failure is pushed far away from the call instead of happening
immediately. Not much coding is saved.
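
For comparison, the three patterns written out as small functions that
each return the text after the key, or None if the key is missing.  The
function names are invented for this sketch.

```python
def after_key_in(txt, s):
    """'in' test first, then index(); searches the string twice."""
    if s in txt:
        i = txt.index(s)
        return txt[i + len(s):]
    return None


def after_key_try(txt, s):
    """index() guarded by try/except."""
    try:
        i = txt.index(s)
    except ValueError:
        return None
    return txt[i + len(s):]


def after_key_find(txt, s):
    """find() with an explicit test against -1."""
    i = txt.find(s)
    if i >= 0:
        return txt[i + len(s):]
    return None
```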




Re: [Python-Dev] Remove str.find in 3.0?

2005-08-28 Thread Josiah Carlson

Steve Holden <[EMAIL PROTECTED]> wrote:
> 
> Josiah Carlson wrote:
> > Donovan Baarda <[EMAIL PROTECTED]> wrote:
> [...]
> > 
> > One thing that has gotten my underwear in a twist is that no one has
> > really offered up a transition mechanism from "str.find working like now"
> > and some future "str.find or lack of" other than "use str.index". 
> > Obviously, I personally find the removal of str.find to be a nonstarter
> > (don't make me catch exceptions or use regular expressions when both are
> > unnecessary, please), but a proper transition of str.find from -1 to
> > None on failure would be beneficial (can which one be chosen at runtime
> > via __future__ import?).
> > 
> > During a transition which uses __future__, it would encourage the
> > /proper/ use of str.find in all modules and extensions which use it...
> > 
> > x = y.find(z)
> > if x >= 0:
> >     #...
> > 
> It does seem rather fragile to rely on the continuation of the current 
> behavior
> 
>   >>> None >= 0
> False

Please see this previous post on None comparisons and why it is unlikely
to change:
http://mail.python.org/pipermail/python-dev/2003-December/041374.html


> for the correctness of "proper usage". Is this guaranteed in future 
> implementations? Especially when:
> 
>   >>> type(None) >= 0
> True

That is an interesting, but subjectively useless comparison:

>>> type(0) >= 0
True
>>> type(int) >= 0
True

When do you ever compare the type of an object with the value of another
object?
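
One data point on the "None >= 0" question: Python 3 refuses to order
None against an integer and raises TypeError instead of returning False.
A small sketch follows; find_or_none is a hypothetical stand-in for a
find() that returns None on failure, not an existing method.

```python
def find_or_none(s, sub):
    """Stand-in for a find() that reports failure with None."""
    i = s.find(sub)
    return None if i < 0 else i


x = find_or_none('spam', 'z')
try:
    found = x >= 0          # Python 2 evaluated this to False
except TypeError:
    found = False           # Python 3 will not order None and int

# The spelling that does not depend on comparison rules:
found_explicit = x is not None
```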



> > Forcing people to use the proper semantic in their modules so as to be
> > compatible with other modules which may or may not use str.find returns
> > None, would (I believe) result in an overall reduction (if not
> > elimination) of bugs stemming from str.find, and would prevent former
> > str.find users from stumbling down the try/except/else misuse that Tim
> > Peters highlighted.
> > 
> Once "str.find() returns None to fail" becomes the norm then surely the 
> correct usage would be
> 
>  x = y.find(z)
>  if x is not None:
>      #...
> 
> which is still a rather ugly paradigm, but acceptable. So the transition 
> is bound to be troubling.

Perhaps, which is why I offered "x >= 0".


> > Heck, if you can get the __future__ import working for choosing which
> > str.find to use (on a global, not per-module basis), I say toss it into
> > 2.6, or even 2.5 if there is really a push for this prior to 3.0 .
> 
> The real problem is surely that one of find()'s legitimate return values 
> evaluates false in a Boolean context. It's especially troubling that the 
> value that does so doesn't indicate search failure. I'd prefer to live 
> with the wart until 3.0 introduces something more satisfactory, or 
> simply removes find() altogether. Otherwise the resulting code breakage 
> when the future arrives just causes unnecessary pain.

Here's a current (horrible but common) solution:

x = string.find(substring) + 1
if x:
    x -= 1
    ...


...I'm up way too late.
 - Josiah



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-28 Thread Steve Holden
Josiah Carlson wrote:
> Donovan Baarda <[EMAIL PROTECTED]> wrote:
[...]
> 
> One thing that has gotten my underwear in a twist is that no one has
> really offered up a transition mechanism from "str.find working like now"
> and some future "str.find or lack of" other than "use str.index". 
> Obviously, I personally find the removal of str.find to be a nonstarter
> (don't make me catch exceptions or use regular expressions when both are
> unnecessary, please), but a proper transition of str.find from -1 to
> None on failure would be beneficial (can which one be chosen at runtime
> via __future__ import?).
> 
> During a transition which uses __future__, it would encourage the
> /proper/ use of str.find in all modules and extensions which use it...
> 
> x = y.find(z)
> if x >= 0:
>     #...
> 
It does seem rather fragile to rely on the continuation of the current 
behavior

  >>> None >= 0
False

for the correctness of "proper usage". Is this guaranteed in future 
implementations? Especially when:

  >>> type(None) >= 0
True

> Forcing people to use the proper semantic in their modules so as to be
> compatible with other modules which may or may not use str.find returns
> None, would (I believe) result in an overall reduction (if not
> elimination) of bugs stemming from str.find, and would prevent former
> str.find users from stumbling down the try/except/else misuse that Tim
> Peters highlighted.
> 
Once "str.find() returns None to fail" becomes the norm then surely the 
correct usage would be

 x = y.find(z)
 if x is not None:
     #...

which is still a rather ugly paradigm, but acceptable. So the transition 
is bound to be troubling.

> Heck, if you can get the __future__ import working for choosing which
> str.find to use (on a global, not per-module basis), I say toss it into
> 2.6, or even 2.5 if there is really a push for this prior to 3.0 .

The real problem is surely that one of find()'s legitimate return values 
evaluates false in a Boolean context. It's especially troubling that the 
value that does so doesn't indicate search failure. I'd prefer to live 
with the wart until 3.0 introduces something more satisfactory, or 
simply removes find() altogether. Otherwise the resulting code breakage 
when the future arrives just causes unnecessary pain.
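
The wart in a few lines, as a sketch (both function names are invented
for the example): a match at position 0 is falsy, while the -1 failure
value is truthy.

```python
def contains_buggy(s, sub):
    # Bug: find() returns 0 for a match at the very start (falsy)
    # and -1 for no match at all (truthy).
    return bool(s.find(sub))


def contains(s, sub):
    # Correct test against the failure value.
    return s.find(sub) >= 0
```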

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Josiah Carlson

Donovan Baarda <[EMAIL PROTECTED]> wrote:
> 
> On Sat, 2005-08-27 at 10:16 -0700, Josiah Carlson wrote:
> > Guido van Rossum <[EMAIL PROTECTED]> wrote:
> [...]
> > Oh, there's a good thing to bring up; regular expressions!  re.search
> > returns a match object on success, None on failure.  With this "failure
> > -> Exception" idea, shouldn't they raise exceptions instead?  And
> > goodness, defining a good regular expression can be quite hard, possibly
> > leading to not insignificant "my regular expression doesn't do what I
> > want it to do" bugs.  Just look at all of those escape sequences and the
> > syntax! It's enough to make a new user of Python gasp.
> 
> I think re.match() returning None is an example of 1b (as categorised by
> Terry Reedy). In this particular case a 1b style response is OK. Why;

My tongue was firmly planted in my cheek during my discussion of regular
expressions.  I was using it as an example of when one starts applying
some arbitrary rule to one example, and not noticing other examples that
do very similar, if not the same thing.

[snip discussion of re.match, re.search, str.find]

If you are really going to compare re.match, re.search and str.find, you
need to point out that neither re.match nor re.search raise an exception
when something isn't found (only when you try to work with None).  This
puts str.index as the odd-man-out in this discussion of searching a
string - so the proposal of tossing str.find as the 'weird one' is a
little strange.



One thing that has gotten my underwear in a twist is that no one has
really offered up a transition mechanism from "str.find working like now"
and some future "str.find or lack of" other than "use str.index". 
Obviously, I personally find the removal of str.find to be a nonstarter
(don't make me catch exceptions or use regular expressions when both are
unnecessary, please), but a proper transition of str.find from -1 to
None on failure would be beneficial (can which one be chosen at runtime
via __future__ import?).

During a transition which uses __future__, it would encourage the
/proper/ use of str.find in all modules and extensions which use it...

x = y.find(z)
if x >= 0:
    #...

Forcing people to use the proper semantic in their modules so as to be
compatible with other modules which may or may not use str.find returns
None, would (I believe) result in an overall reduction (if not
elimination) of bugs stemming from str.find, and would prevent former
str.find users from stumbling down the try/except/else misuse that Tim
Peters highlighted.

Heck, if you can get the __future__ import working for choosing which
str.find to use (on a global, not per-module basis), I say toss it into
2.6, or even 2.5 if there is really a push for this prior to 3.0 .

 - Josiah



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Donovan Baarda
On Sat, 2005-08-27 at 10:16 -0700, Josiah Carlson wrote:
> Guido van Rossum <[EMAIL PROTECTED]> wrote:
[...]
> Oh, there's a good thing to bring up; regular expressions!  re.search
> returns a match object on success, None on failure.  With this "failure
> -> Exception" idea, shouldn't they raise exceptions instead?  And
> goodness, defining a good regular expression can be quite hard, possibly
> leading to not insignificant "my regular expression doesn't do what I
> want it to do" bugs.  Just look at all of those escape sequences and the
> syntax! It's enough to make a new user of Python gasp.

I think re.match() returning None is an example of 1b (as categorised by
Terry Reedy). In this particular case a 1b style response is OK. Why;

1) any successful match evaluates to "True", and None evaluates to
"False". This allows simple code like;

  if myreg.match(s):
      do something.

Note you can't do this for find, as 0 is a successful "find" and
evaluates to False, whereas other results including -1 evaluate to True.
Even worse, -1 is a valid index.

2) exceptions are for unexpected events, where unexpected means "much
less likely than other possibilities". The re.match() operation asks
"does this match this", which implies you have an about even chance of
not matching... ie a failure to match is not unexpected. The result None
makes sense... "what match did we get? None, OK".

For str.index() you are asking "give me the index of this inside this",
which implies you expect it to be in there... ie not finding it _is_
unexpected, and should raise an exception.

Note that re.match() returning None will raise exceptions if the rest of
your code doesn't expect it;

index = myreg.match(s).start()
tail = s[index:]

This will raise an exception if there was no match.

Unlike str.find();

index = s.find(r)
tail = s[index:]

Which will happily return the last character if there was no match. This
is why find() should return None instead of -1.
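
The two failure modes side by side, as a sketch.  One caveat, assuming
ordinary slice semantics: a None index does not make the slice itself
fail, since s[None:] is simply the whole string.  So a None-returning
find() would only fail loudly where the index is used in arithmetic or
plain indexing.

```python
import re

s = 'hello world'

# find() failure: -1 is a legal index, so the slice silently succeeds.
index = s.find('xyz')
tail = s[index:]            # the last character of s

# re failure: search() returns None, so the attribute access fails loudly.
m = re.search('xyz', s)
try:
    tail_re = s[m.start():]
except AttributeError:
    tail_re = None

# A None start index is accepted by slicing and means "from the beginning".
whole = s[None:]
```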

> With the existence of literally thousands of uses of .find and .rfind in
> the wild, any removal consideration should be weighed heavily - which
> honestly doesn't seem to be the case here with the ~15 minute reply time
> yesterday (just my observation and opinion).  If you had been ruminating
> over this previously, great, but that did not seem clear to me in your
> original reply to Terry Reedy.

Bear in mind they are talking about Python 3.0... I think :-)

-- 
Donovan Baarda <[EMAIL PROTECTED]>
http://minkirri.apana.org.au/~abo/



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Guido van Rossum
On 8/27/05, Josiah Carlson <[EMAIL PROTECTED]> wrote:
> With the existence of literally thousands of uses of .find and .rfind in
> the wild, any removal consideration should be weighed heavily - which
> honestly doesn't seem to be the case here with the ~15 minute reply time
> yesterday (just my observation and opinion).  If you had been ruminating
> over this previously, great, but that did not seem clear to me in your
> original reply to Terry Reedy.

I hadn't been ruminating about deleting it previously, but I was well
aware of the likelihood of writing buggy tests for find()'s return
value. I believe that str.find() is not just something that can be
used to write buggy code, but something that *causes* bugs over and
over again. (However, see below.)

The argument that there are thousands of usages in the wild doesn't
carry much weight when we're talking about Python 3.0.

There are at least a similar number of modules that expect
dict.keys(), zip() and range() to return lists, or that depend on the
distinction between Unicode strings and 8-bit strings, or on bare
except:, or on any other feature that is slated for deletion in Python
3.0 for which the replacement requires careful rethinking of the code
rather than a mechanical translation.

The *premise* of Python 3.0 is that it drops backwards compatibility
in order to make the language better in the long term. Surely you
believe that the majority of all Python programs have yet to be
written?

The only argument in this thread in favor of find() that made sense to
me was Tim Peters' observation that the requirement to use a
try/except clause leads to another kind of sloppy code. It's hard to
judge which is worse -- the buggy find() calls or the buggy/cumbersome
try/except code.

Note that all code (unless it needs to be backwards compatible to
Python 2.2 and before) which is using find() to merely detect whether
a given substring is present should be using 's1 in s2' instead.

Another observation: despite the derogatory remarks about regular
expressions, they have one thing going for them: they provide a higher
level of abstraction for string parsing, which this is all about.
(They are higher level in that you don't have to be counting
characters, which is about the lowest-level activity in programming --
only counting bytes is lower!)

Maybe if we had a *good* way of specifying string parsing we wouldn't
be needing to call find() or index() so much at all! (A good example
is the code that Raymond lifted from ConfigParser: a semicolon
preceded by whitespace starts a comment, other semicolons don't.
Surely there ought to be a better way to write that.)

All in all, I'm still happy to see find() go in Python 3.0, but I'm
leaving the door ajar: if you read this post carefully, you'll know
what arguments can be used to persuade me.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Brett Cannon
On 8/26/05, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> On 8/26/05, Terry Reedy <[EMAIL PROTECTED]> wrote:
> > Can str.find be listed in PEP 3000 (under builtins) for removal?
> 
> Yes please. (Except it's not technically a builtin but a string method.)
> 

Done.  Added an "Atomic Types" section to the PEP as well.

-Brett


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Josiah Carlson

Guido van Rossum <[EMAIL PROTECTED]> wrote:
> 
> On 8/26/05, Josiah Carlson <[EMAIL PROTECTED]> wrote:
> > Taking a look at the commits that Guido did way back in 1993, he doesn't
> > mention why he added .find, only that he did.  Maybe it was another of
> > the 'functional language additions' that he now regrets, I don't know.
> 
> There's nothing functional about it. I remember adding it after
> finding it cumbersome to write code using index/rindex. However, that
> was long before we added startswith(), endswith(), and 's in t' for
> multichar s. Clearly all sorts of varieties of substring matching are
> important, or we wouldn't have so many methods devoted to it! (Not to
> mention the 're' module.)
> 
> However, after 12 years, I believe that the small benefit of having
> find() is outweighed by the frequent occurrence of bugs in its use.

Oh, there's a good thing to bring up; regular expressions!  re.search
returns a match object on success, None on failure.  With this "failure
-> Exception" idea, shouldn't they raise exceptions instead?  And
goodness, defining a good regular expression can be quite hard, possibly
leading to not insignificant "my regular expression doesn't do what I
want it to do" bugs.  Just look at all of those escape sequences and the
syntax! It's enough to make a new user of Python gasp.

Most of us are consenting adults here.  If someone writes buggy code
with str.find, that is unfortunate, maybe they should have used regular
expressions and tested for None, maybe they should have used
str.startswith (which is sometimes slower than m == n[:len(m)], but I
digress), maybe they should have used str.index. But just because buggy
code can be written with it, doesn't mean that it should be removed. 
Buggy code can, will, and has been written with every Python mechanism
that has ever existed or will ever exist.

With the existence of literally thousands of uses of .find and .rfind in
the wild, any removal consideration should be weighed heavily - which
honestly doesn't seem to be the case here with the ~15 minute reply time
yesterday (just my observation and opinion).  If you had been ruminating
over this previously, great, but that did not seem clear to me in your
original reply to Terry Reedy.

 - Josiah



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Raymond Hettinger
[Tim]
> You probably want "except ValueError:" in all these, not "except
> ValueError():".

Right.  I was misremembering the new edict to write:

   raise ValueError()


Raymond



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Tim Peters
[Raymond Hettinger, rewrites some code]
> ...
> --- StringIO.py ---
> 
> i = self.buf.find('\n', self.pos)
> if i < 0:
>     newpos = self.len
> else:
>     newpos = i+1
> . . .
> 
> 
> try:
>     i = self.buf.find('\n', self.pos)
> except ValueError():
>     newpos = self.len
> else:
>     newpos = i+1
> . . .

You probably want "except ValueError:" in all these, not "except ValueError():".

Leaving that alone, the last example particularly shows one thing I
dislike about try/except here:  in a language with properties, how is
the code reader supposed to guess that it's specifically and only the
.find() call that _can_ raise ValueError in

i = self.buf.find('\n', self.pos)

?  I agree it's clear enough here from context, but there's no
confusion possible on this point in the original spelling:  it's
immediately obvious that the result of find() is the only thing being
tested.  There's also strong temptation to slam everything into the
'try' block, and reduce nesting:

newpos = self.len
try:
    newpos = self.buf.find('\n', self.pos) + 1
except ValueError:
    pass

I've often seen code in the wild with, say, two-three dozen lines in a
``try`` block, with an "except AttributeError:" that was _intended_ to
catch an expected AttributeError only in the second of those lines. 
Of course that hides legitimate bugs too.  Like ``object.attr``, the
result of ``string.find()`` is normally used in further computation,
so the temptation is to slam the computation inside the ``try`` block
too.

.find() is a little delicate to use, but IME sloppy try/except
practice (putting much more in the ``try`` block than the specific
little operation where an exception is expected) is common, and harder
to get people to change because it requires thought instead of just
reading the manual to see that -1 means "not there" <0.5 wink>.

Another consideration is code that needs to use .find() a _lot_.  In
my programs of that sort, try/except is a lot more expensive than
letting -1 signal "not there".
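
A sketch of the three spellings as standalone functions, for anyone who
wants to compare them.  The function names are invented, and the
try/except versions use index() since that is the call that raises
ValueError.

```python
def next_line_end_broad(buf, pos):
    """Everything inside the try: any ValueError at all is swallowed."""
    newpos = len(buf)
    try:
        newpos = buf.index('\n', pos) + 1
    except ValueError:
        pass
    return newpos


def next_line_end_narrow(buf, pos):
    """Only the lookup is guarded; the follow-up work goes in else."""
    try:
        i = buf.index('\n', pos)
    except ValueError:
        newpos = len(buf)
    else:
        newpos = i + 1
    return newpos


def next_line_end_find(buf, pos):
    """The original spelling: test the result of find() directly."""
    i = buf.find('\n', pos)
    return len(buf) if i < 0 else i + 1
```

All three give the same answers here; the difference is in how much
code sits where a stray ValueError could be caught by accident.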


[Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Raymond Hettinger
[Guido]
> However, after 12 years, I believe that the small benefit of having
> find() is outweighed by the frequent occurrence of bugs in its use.

My little code transformation exercise is bearing that out.  Two of the
first four cases in the standard library were buggy :-(


Raymond



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Wolfgang Lipp
On Sat, 27 Aug 2005 16:46:07 +0200, Guido van Rossum  
<[EMAIL PROTECTED]> wrote:

> On 8/27/05, Wolfgang Lipp <[EMAIL PROTECTED]> wrote:
>> i never expected .get()
>> to work that way (return an unsolicited None) -- i do consider this
>> behavior harmful and suggest it be removed.
>
> That's a bizarre attitude. You don't read the docs and hence you want
> a feature you weren't aware of to be removed?

i do read the docs, and i believe i do keep a lot of detail in my
head. every now and then, tho, you piece sth together using a logic
that is not 100% the way it was intended, or the way it came about.
let me say that for someone who did development for python for
a while it is natural to know that ~.get() is there for avoidance
of exceptions, and default values are an afterthought, but for someone
who did development *with* python (and lacks experience of the other
side) this ain't necessarily so. that said, i believe it to be
more expressive and safer to demand ~.get('x',None) to be written
to achieve the present behavior, and let ~.get('x') raise an
exception. personally, i can live with either way, and am happier
the second. just my thoughts.

> I'm glad you're not on *my* team. (Emphasis mine. :-)

i wonder what that would be like.

_wolf




Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Guido van Rossum
On 8/26/05, Josiah Carlson <[EMAIL PROTECTED]> wrote:
> Taking a look at the commits that Guido did way back in 1993, he doesn't
> mention why he added .find, only that he did.  Maybe it was another of
> the 'functional language additions' that he now regrets, I don't know.

There's nothing functional about it. I remember adding it after
finding it cumbersome to write code using index/rindex. However, that
was long before we added startswith(), endswith(), and 's in t' for
multichar s. Clearly all sorts of varieties of substring matching are
important, or we wouldn't have so many methods devoted to it! (Not to
mention the 're' module.)

However, after 12 years, I believe that the small benefit of having
find() is outweighed by the frequent occurrence of bugs in its use.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Guido van Rossum
On 8/27/05, Wolfgang Lipp <[EMAIL PROTECTED]> wrote:
> i never expected .get()
> to work that way (return an unsolicited None) -- i do consider this
> behavior harmful and suggest it be removed.

That's a bizarre attitude. You don't read the docs and hence you want
a feature you weren't aware of to be removed? I'm glad you're not on
*my* team. (Emphasis mine. :-)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Guido van Rossum
On 8/27/05, Kay Schluehr <[EMAIL PROTECTED]> wrote:
> The discourse about Python3000 has shrunken from the expectation of the
> "next big thing" into a depressive rhetorics of feature elimination.
> The language doesn't seem to become deeper, smaller and more powerful
> but just smaller.

I understand how your perception reading python-dev would make you
think that, but it's not true.

There is much focus on removing things, because we want to be able to
add new stuff but we don't want the language to grow. Python-dev is
(correctly) very focused on the status quo and the near future, so
discussions on what can be removed without hurting are valuable here.

Discussions on what to add should probably happen elsewhere, since the
proposals tend to range from genius to insane (sometimes within one
proposal :-) and the discussion tends to become even more rampant than
the discussions about changes in 2.5.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Reinhold Birkenfeld
Raymond Hettinger wrote:
> [Martin]
>> For another example, file.read() returns an empty string at EOF.
> 
> When my turn comes for making 3.0 proposals, I'm going to recommend
> nixing the "empty string at EOF" API.  That is a carry-over from C that
> made some sense before there were iterators.  Now, we have the option of
> introducing much cleaner iterator versions of these methods that use
> compact, fast, and readable for-loops instead of multi-statement
> while-loop boilerplate.

I think

for char in iter(lambda: f.read(1), ''):
    pass

is not bad, too.
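
A small sketch of the two-argument iter() idiom above, using io.StringIO as a stand-in for the file object f (an assumption made only so the example is self-contained). iter(callable, sentinel) calls the callable repeatedly and stops as soon as it returns the sentinel, here the empty string that read() gives at EOF.

```python
import io

# io.StringIO stands in for an open text file (assumption for illustration).
f = io.StringIO("abc")

chars = []
# The loop ends when f.read(1) returns '' at EOF; '' itself is never yielded.
for char in iter(lambda: f.read(1), ''):
    chars.append(char)
```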

Reinhold

-- 
Mail address is perfectly valid!



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Guido van Rossum
On 8/27/05, Raymond Hettinger <[EMAIL PROTECTED]> wrote:

> --- From ConfigParser.py ---
> 
> optname, vi, optval = mo.group('option', 'vi', 'value')
> if vi in ('=', ':') and ';' in optval:
>     # ';' is a comment delimiter only if it follows
>     # a spacing character
>     pos = optval.find(';')
>     if pos != -1 and optval[pos-1].isspace():
>         optval = optval[:pos]
> optval = optval.strip()
> . . .
> 
> 
> optname, vi, optval = mo.group('option', 'vi', 'value')
> if vi in ('=', ':') and ';' in optval:
>     # ';' is a comment delimiter only if it follows
>     # a spacing character
>     try:
>         pos = optval.index(';')
>     except ValueError():

I'm sure you meant "except ValueError:"

>         pass
>     else:
>         if optval[pos-1].isspace():
>             optval = optval[:pos]
> optval = optval.strip()
> . . .

That code is buggy before and after the transformation -- consider
what happens if optval *starts* with a semicolon. Also, the code is
searching optval for ';' twice. Suggestion:

if vi in ('=',':'):
  try: pos = optval.index(';')
  except ValueError: pass
  else:
    if pos > 0 and optval[pos-1].isspace():
      optval = optval[:pos]
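
To make the bug concrete, here is a hedged sketch. strip_comment_buggy and strip_comment_fixed are hypothetical helper names, not ConfigParser functions, and the check on vi is left out for brevity. When optval starts with ';', pos is 0 and optval[pos-1] is optval[-1], the last character, so trailing whitespace wrongly triggers the comment rule.

```python
def strip_comment_buggy(optval):
    # Follows the find()-based fragment: no guard against pos == 0,
    # so optval[pos-1] wraps around to the last character.
    pos = optval.find(';')
    if pos != -1 and optval[pos-1].isspace():
        optval = optval[:pos]
    return optval.strip()


def strip_comment_fixed(optval):
    # Follows the suggestion above: index() plus an explicit pos > 0 check.
    try:
        pos = optval.index(';')
    except ValueError:
        pass
    else:
        if pos > 0 and optval[pos-1].isspace():
            optval = optval[:pos]
    return optval.strip()
```

With the input ";value " the buggy version discards the whole value, while the fixed version keeps it.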

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Guido van Rossum
On 8/27/05, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
> [Martin]
> > For another example, file.read() returns an empty string at EOF.
> 
> When my turn comes for making 3.0 proposals, I'm going to recommend
> nixing the "empty string at EOF" API.  That is a carry-over from C that
> made some sense before there were iterators.  Now, we have the option of
> introducing much cleaner iterator versions of these methods that use
> compact, fast, and readable for-loops instead of multi-statement
> while-loop boilerplate.

-1.

For reading lines we already have that in the status quo.

For reading bytes, I *know* that a lot of code would become uglier if
the API changed to raise EOFError exceptions. It's not a coincidence
that raw_input() raises EOFError but readline() doesn't -- the
readline API was designed after extensive experience with
raw_input().

The situation is different than for find():

- there aren't two APIs that only differ in their handling of the
exceptional case

- the error return value tests false and all non-error return values test true

- in many cases processing the error return value the same as
non-error return values works just fine (as long as you have another
way to test for termination)

Also, even if read() raised EOFError instead of returning '', code
that expects certain data wouldn't be simplified -- after attempting
to read e.g. 4 bytes, you'd still have to check that you got exactly
4, so there'd be three cases to handle (EOFError, short, good) instead
of two (short, good).
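
A sketch of the two-case handling described above, using io.BytesIO as a stand-in for a file. read_exactly is a hypothetical helper written for illustration only.

```python
import io


def read_exactly(f, n):
    """Return n bytes from f, or None if fewer than n were available."""
    data = f.read(n)
    # EOF shows up as a short (possibly empty) result, so a single length
    # check covers both "already at EOF" and "data was truncated".
    if len(data) != n:
        return None
    return data


good = read_exactly(io.BytesIO(b"abcdef"), 4)
short = read_exactly(io.BytesIO(b"ab"), 4)
at_eof = read_exactly(io.BytesIO(b""), 4)
```

Because the empty string is just the shortest possible short read, no separate EOF branch is needed.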

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Wolfgang Lipp

kay,

your suggestion makes perfect sense for me, i haven't actually tried
the examples tho. guess there could be a find() or index() or
indices() or iterIndices() ??? function 'f' roughly with these arguments:

def f( x, element, start = 0, stop = None, default = _Misfit,
       maxcount = None, reverse = False )

that iterates over the indices of x where element (a substring, key, or
value in a sequence or iterator) is found, raising sth. like IndexError
when nothing at all is found except when default is not '_Misfit'
(meta-None), and starts looking from the right end when reverse is True
(this *may* imply that reversed(x) is done on x where no better
implementation is available). not quite sure whether it makes sense to me
to always return default as the last value of the iteration -- i tend to
say rather not. ah yes, only up to maxcount indices are yielded.

be it said that passing an iterator for x would mean that the iterator is
gone up to where the last index was yielded; passing an iterator is not
acceptable for reverse = True.

MHO,

_wolf



On Sat, 27 Aug 2005 14:57:08 +0200, Kay Schluehr <[EMAIL PROTECTED]> wrote:
>
> def keep(iter, default=None):
>     try:
>         return iter.next()
>     except StopIteration:
>         return default
>
> Together with an index iterator the user can mimic the behaviour he
> wants. Instead of a ValueError a StopIteration exception can hold as
> an "external" information ( other than a default value ):
>
>  >>> keep( "abcdabc".index("bc"), default=-1)  # current behaviour of the
>                                                # find() function
>  >>> (idx for idx in "abcdabc".rindex("bc"))   # generator expression
>
>
> Since the find() method acts on a string literal it is not easy to
> replace it syntactically. But why not add functions that can be hooked
> into classes whose objects are represented by literals?
>
> def find( string, substring):
>  return keep( string.index( substring), default=-1)
>
> str.register(find)
>
>  >>> "abcdabc".find("bc")
> 1
>
> Now find() can be stored in a pure Python module without maintaining it
> on interpreter level ( same as with reduce, map and filter ).
>
> Kay







Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Just van Rossum
Wolfgang Lipp wrote:

> > Just because you don't read the documentation and guessed wrong
> > d.get() needs to be removed?!?
> 
> no, not removed... never said that.

Fair enough, you proposed to remove the behavior. Not sure how that's
all that much less bad, though...

> implied). the reason of being for d.get() -- to me -- is simply so you
> get a chance to pass a default value, which is syntactically well-nigh
> impossible with d['x'].

Close, but the main reason to add d.get() was to avoid the exception.
The need to specify a default value followed from that.

Just


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Raymond Hettinger
FWIW, here are three more comparative code fragments.  They are
presented without judgment as an evaluation tool to let everyone form
their own opinion about the merits of each:


--- From CGIHTTPServer.py ---

def run_cgi(self):
    """Execute a CGI script."""
    dir, rest = self.cgi_info
    i = rest.rfind('?')
    if i >= 0:
        rest, query = rest[:i], rest[i+1:]
    else:
        query = ''
    i = rest.find('/')
    if i >= 0:
        script, rest = rest[:i], rest[i:]
    else:
        script, rest = rest, ''
    . . .


def run_cgi(self):
    """Execute a CGI script."""
    dir, rest = self.cgi_info
    try:
        i = rest.rindex('?')
    except ValueError():
        query = ''
    else:
        rest, query = rest[:i], rest[i+1:]
    try:
        i = rest.index('/')
    except ValueError():
        script, rest = rest, ''
    else:
        script, rest = rest[:i], rest[i:]
    . . .


--- From ConfigParser.py ---

optname, vi, optval = mo.group('option', 'vi', 'value')
if vi in ('=', ':') and ';' in optval:
    # ';' is a comment delimiter only if it follows
    # a spacing character
    pos = optval.find(';')
    if pos != -1 and optval[pos-1].isspace():
        optval = optval[:pos]
optval = optval.strip()
. . .


optname, vi, optval = mo.group('option', 'vi', 'value')
if vi in ('=', ':') and ';' in optval:
    # ';' is a comment delimiter only if it follows
    # a spacing character
    try:
        pos = optval.index(';')
    except ValueError():
        pass
    else:
        if optval[pos-1].isspace():
            optval = optval[:pos]
optval = optval.strip()
. . .


--- StringIO.py ---

i = self.buf.find('\n', self.pos)
if i < 0:
    newpos = self.len
else:
    newpos = i+1
. . .


try:
    i = self.buf.index('\n', self.pos)
except ValueError():
    newpos = self.len
else:
    newpos = i+1
. . .


My notes so far weren't meant to judge the proposal.  I'm just
suggesting that examining fragments like the ones above will help inform
the design process.

Peace,


Raymond



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Kay Schluehr
Terry Reedy wrote:

>>I would object to the removal of str.find().
> 
> 
> So, I wonder, what is your favored alternative?
> 
> A. Status quo: ignore the opportunity to streamline the language.

I actually don't see much benefits from the user perspective. The 
discourse about Python3000 has shrunken from the expectation of the 
"next big thing" into a depressive rhetorics of feature elimination.
The language doesn't seem to become deeper, smaller and more powerful 
but just smaller.


> B. Change the return type of .find to None.
> 
> C. Remove .(r)index instead.
> 
> D. Add more redundancy for those who do not like exceptions.

Why not turn index() into an iterator that yields indices 
successively? From this generalized perspective we can try to reconstruct 
the behaviour of Python 2.X.

Sometimes I use a custom keep() function if I want to prevent defining a 
block for catching StopIteration. The keep() function takes an iterator 
and returns a default value in case of StopIteration:

def keep(iter, default=None):
    try:
        return iter.next()
    except StopIteration:
        return default

Together with an index iterator the user can mimic the behaviour he 
wants. Instead of a ValueError a StopIteration exception can hold as
an "external" information ( other than a default value ):

 >>> keep( "abcdabc".index("bc"), default=-1)  # current behaviour of the
   # find() function
 >>> (idx for idx in "abcdabc".rindex("bc"))   # generator expression


Since the find() method acts on a string literal it is not easy to
replace it syntactically. But why not add functions that can be hooked 
into classes whose objects are represented by literals?

def find( string, substring):
 return keep( string.index( substring), default=-1)

str.register(find)

 >>> "abcdabc".find("bc")
1

Now find() can be stored in a pure Python module without maintaining it 
on interpreter level ( same as with reduce, map and filter ).
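
A runnable sketch of this idea in present-day Python, where the iterator protocol is spelled next(it) rather than it.next(). iter_indices is a hypothetical generator standing in for the proposed iterator-returning index(). Since str.register does not exist, find is shown as a plain function rather than a method hooked onto str.

```python
def iter_indices(string, substring):
    """Yield each index at which substring occurs in string (hypothetical)."""
    start = 0
    while True:
        try:
            pos = string.index(substring, start)
        except ValueError:
            return  # no further match: the generator simply ends
        yield pos
        start = pos + 1


def keep(iterator, default=None):
    # Return the first item, or default if the iterator is exhausted.
    try:
        return next(iterator)
    except StopIteration:
        return default


def find(string, substring):
    # Rebuilds the -1 convention on top of the index iterator.
    return keep(iter_indices(string, substring), default=-1)
```

Under these assumptions find("abcdabc", "bc") gives 1, matching the interactive example above, and a missing substring gives -1.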

Kay



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Wolfgang Lipp
On Sat, 27 Aug 2005 13:01:02 +0200, Just van Rossum <[EMAIL PROTECTED]>  
wrote:

> Just because you don't read the documentation and guessed wrong d.get()
> needs to be removed?!?

no, not removed... never said that.

> It's a *feature* of d.get(k) to never raise KeyError. If you need an
> exception, why not just use d[k]?

i agree i misread the specs, but then, i read the specs a lot, and
i guess everyone here agrees that if it's in the specs doesn't mean
it's automatically what we want or expect -- else there's nothing to
discuss. i say

d.get('x') == None
<==
{ ( 'x' not in d ) OR ( d['x'] == None ) }

is not what i expect (even tho the specs say so) especially since
d.pop('x') *does* throw a KeyError when 'x' is not a key in mydict.
ok, pop is not get and so on but still i perceive this a problematic
behavior (to the point i call it a 'bug' in a jocular way, no offense
implied). the reason of being for d.get() -- to me -- is simply so you
get a chance to pass a default value, which is syntactically well-nigh
impossible with d['x'].

_wolf


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Just van Rossum
Wolfgang Lipp wrote:

> On Sat, 27 Aug 2005 12:35:30 +0200, Martin v. Löwis <[EMAIL PROTECTED]>
> wrote:
> > P.S. Emphasis mine :-)
> 
> no, emphasis all **mine** :-) just to reflect i never expected .get()
> to work that way (return an unsolicited None) -- i do consider this
> behavior harmful and suggest it be removed.

Just because you don't read the documentation and guessed wrong d.get()
needs to be removed?!?

It's a *feature* of d.get(k) to never raise KeyError. If you need an
exception, why not just use d[k]?

Just


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Wolfgang Lipp
On Sat, 27 Aug 2005 12:35:30 +0200, Martin v. Löwis <[EMAIL PROTECTED]>  
wrote:
> P.S. Emphasis mine :-)

no, emphasis all **mine** :-) just to reflect i never expected .get()
to work that way (return an unsolicited None) -- i do consider this
behavior harmful and suggest it be removed.

_wolf






Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Martin v. Löwis
Wolfgang Lipp wrote:
> that's a bug! i had to *test* it to find out it's true! i've been writing
> code for *years* all in the understanding that dict.get(x) acts precisely
> like dict['x'] *except* you get a chance to define a default value.

Clearly, your understanding *all* these years *was* wrong. If you don't
specify *a* default value, *it* defaults to None.
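
A short illustration of the behavior under discussion, plus one common way to tell a missing key from a stored None. The _missing sentinel is an illustrative name, not something from the standard library.

```python
d = {'x': None}

# Without an explicit default, get() falls back to None, so a missing key
# and a key that maps to None look the same.
present = d.get('x')
absent = d.get('y')

# A unique sentinel object as the default distinguishes the two cases.
_missing = object()
x_is_missing = d.get('x', _missing) is _missing
y_is_missing = d.get('y', _missing) is _missing
```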

Regards,
Martin

P.S. Emphasis mine :-)


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Wolfgang Lipp
On Sat, 27 Aug 2005 08:54:12 +0200, Martin v. Löwis <[EMAIL PROTECTED]>  
wrote:
> with choice 1a): dict.get returns None if the key is not found, even
> though None could also be the value for the key.

that's a bug! i had to *test* it to find out it's true! i've been writing
code for *years* all in the understanding that dict.get(x) acts precisely
like dict['x'] *except* you get a chance to define a default value. which,
for me, has become sort of a standard solution to the problem the last ten
or so postings were all about: when i write a function and realize that's
one of the cases where python philosophy strongly favors raising an
exception because something e.g. could not be found where expected, i make
it so that a reasonable exception is raised *and* where meaningful i give
consumers a chance to pass in a default value to eschew exceptions. i
believe this is the way to go to resolve this .index/.find conflict. and,
no, returning -1 when a substring is not found and None when a key is not
found is *highly* problematic. i'd sure like to see cases like that to go.

i'm not sure why .rindex() should go (correct?), and how to do what it does
(reverse the string before doing .index()? is that what is done
internally?)

and of course, like always, there is the question why these are methods
at all and why there is a function len(str) but a method str.index(); one
could just as well have *either* str.length and str.index() *or*
length(str) and, say, a builtin

  locate( x, element, start = 0, stop = None, reversed = False,
          default = Misfit )

(where Misfit indicates a 'meta-None', so None is still a valid default
value; i also like to indicate 'up to the end' with stop=None) that does on
iterables (or only on sequences) what the methods do now, but with this
strange pattern:
--
         .index()  .find()  .get()  .pop()
list        +                ?(3)     +
tuple                        ?(3)    ??(1)
str         +         +      ?(3)    ??(1)
dict       x(2)     x(2)      +       +

(1) one could argue this should return a copy of a tuple or str,
but doubtful. (2) index/find meaningless for dicts. (3) there
is no .get() for list, tuple, str, although it would make sense:
return the indexed element, or raise IndexError where not found
if no default return value given.
--

what bites me here is especially that we have both index and find
for str *but a gaping hole* for tuples. assuming tuples are not slated
for removal, i suggest to move in a direction that makes things look
more like this:

--
         .index()  .get()  .pop()
list        +        +       +
tuple       +        +
str         +        +
dict                 +       +
--

where .index() looks like locate, above:

--
{list,tuple,str}.index(
    element,            # element in the collection
    start = 0,          # where to start searching; default is zero
    stop = None,        # where to end; the default, None, indicates
                        # 'to the end'
    reversed = False,   # should we search from the back? *may* cause
                        # reversion of sequence, depending on impl.
    default = _Misfit,  # default value, when given, prevents
                        # IndexError from being raised
    )
--
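
One possible reading of this signature, sketched as a free function for lists and tuples only. The semantics are assumed rather than specified in the message; in particular this sketch raises ValueError, which is what the existing index() methods raise, where the message speaks of IndexError.

```python
_Misfit = object()  # 'meta-None' sentinel, so None stays a valid default


def locate(x, element, start=0, stop=None, reversed=False, default=_Misfit):
    """Return the index of element in sequence x (illustrative sketch)."""
    if stop is None:
        stop = len(x)  # None means 'to the end'
    positions = range(start, stop)
    if reversed:
        # Walk the index range backwards instead of copying or reversing x.
        positions = positions[::-1]
    for i in positions:
        if x[i] == element:
            return i
    if default is _Misfit:
        raise ValueError("element not found")
    return default
```

Passing default=None returns None on a miss, while omitting default keeps the exception.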

hope i didn't miss out crucial points here.

_wolf





Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Raymond Hettinger
[Martin]
> For another example, file.read() returns an empty string at EOF.

When my turn comes for making 3.0 proposals, I'm going to recommend
nixing the "empty string at EOF" API.  That is a carry-over from C that
made some sense before there were iterators.  Now, we have the option of
introducing much cleaner iterator versions of these methods that use
compact, fast, and readable for-loops instead of multi-statement
while-loop boilerplate.


Raymond



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Raymond Hettinger
> > The most important reason for the patch is that looking at the context
> > diff will provide an objective look at how real code will look before
> > and after the change.  This would make subsequent discussions
> > substantially more informed and less anecdotal.
> 
> No, you're just artificially trying to raise the bar for Python 3.0
> proposals to an unreasonable height.

Not really.  I'm mostly for the proposal (+0), but am certain the
conversation about the proposal would be substantially more informed if
we had a side-by-side comparison of what real-world code looks like
before and after the change.  There are not too many instances of
str.find() in the library and it is an easy patch to make.  I'm just
asking for a basic, objective investigative tool.

Unlike more complex proposals, this one doesn't rely on any new
functionality.  It just says don't use X anymore.  That makes it
particularly easy to investigate in an objective way.

BTW, this isn't unprecedented.  We've already done it once when
backticks got slated for removal in 3.0.  All instances of it got
changed in the standard library.  As a result of the patch, we were able
to 1) get an idea of how much work it took, 2) determine every category
of use case, 3) learn that the resulting code was more beautiful,
readable, and only microscopically slower, 4) learn about a handful of
cases that were unexpectedly difficult to convert, and 5) update the
library to be an example of what we think modern code looks like.  That
patch painlessly informed the decision making and validated that we were
doing the right thing.

The premise of Terry's proposal is that Python code is better when
str.find() is not used.  This is a testable proposition.  Why not use
the wealth of data at our fingertips to augment a priori reasoning and
anecdotes.  I'm not at all arguing against the proposal; I'm just asking
for a thoughtful design process.

 

Raymond


P.S.  Josiah was not alone.  The comp.lang.python discussion had other
posts expressing distaste for raising exceptions instead of using return
codes.  While I don't feel the same way, I don't think the respondents
should be ignored.



"Those people who love sausage and respect the law should not watch
either one being made."



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-27 Thread Reinhold Birkenfeld
Bill Janssen wrote:
>> There are basically two ways for a system, such as a 
>> Python function, to indicate 'I cannot give a normal response."  One (1a) 
>> is to give an inband signal that is like a normal response except that it 
>> is not (str.find returning -1).  A variation (1b) is to give an inband 
>> response that is more obviously not a real response (many None returns). 
>> The other (2) is to not respond (never return normally) but to give an 
>> out-of-band signal of some sort (str.index raising ValueError).
>> 
>> Python as distributed usually chooses 1b or 2.  I believe str.find and 
>> .rfind are unique in the choice of 1a.
> 
> Doubt it.  The problem with returning None is that it tests as False,
> but so does 0, which is a valid string index position.

Heh. You know what the Perl6 folks would suggest in this case?

return 0 but true; # literally!

> Might add a boolean "str.contains()" to cover this test case.

There's already __contains__.

Reinhold

-- 
Mail address is perfectly valid!



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-26 Thread Martin v. Löwis
Terry Reedy wrote:
> One (1a) 
> is to give an inband signal that is like a normal response except that it 
> is not (str.find returning -1).
> 
> Python as distributed usually chooses 1b or 2.  I believe str.find and 
> .rfind are unique in the choice of 1a.

That is not true. str.find's choice is not 1a, and there are other
functions which chose 1a): -1 does *not* look like a normal response,
since a normal response is non-negative. It is *not* the only method
with choice 1a): dict.get returns None if the key is not found, even
though None could also be the value for the key.

For another example, file.read() returns an empty string at EOF.


> I am pretty sure that the choice 
> of -1 as error return, instead of, for instance, None, goes back to the 
> need in static languages such as C to return something of the declared 
> return type.  But Python is not C, etcetera.  I believe that this pair is 
> also unique in having exact counterparts of type 2.

dict.__getitem__ is a counterpart of type 2 of dict.get.

> So, I wonder, what is your favored alternative?
> 
> A. Status quo: ignore the opportunity to streamline the language.

My favourite choice is the status quo. I probably don't fully
understand the word "to streamline", but I don't see this as
rationalizing. Instead, some applications will be more tedious
to write.

> So are you advocating D above or claiming that substring indexing is 
> uniquely deserving of having two versions?  If the latter, why so special? 

Because it is no exception that a string is not part of another string,
and because the question I'm asking "is the string in the other string,
and if so, where?". This is similar to the question "does the dictionary
have a value for that key, and if so, which?"

> If we only had str.index, would you actually suggest adding this particular 
> duplication?

That is what happened to dict.get: it was not originally there (I
believe), but added later.

Regards,
Martin


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-26 Thread Josiah Carlson

"Terry Reedy" <[EMAIL PROTECTED]> wrote:
> 
> "Josiah Carlson" <[EMAIL PROTECTED]> wrote in message 
> news:[EMAIL PROTECTED]
> >
> > "Terry Reedy" <[EMAIL PROTECTED]> wrote:
> >>
> >> Can str.find be listed in PEP 3000 (under builtins) for removal?
> 
> Guido has already approved,

I noticed, but he approved before anyone could say anything.  I
understand it is a dictatorship, but he seems to take advisement and
reverse (or not) his decisions on occasion based on additional
information. Whether this will lead to such, I don't know.


> but I will try to explain my reasoning a bit 
> better for you.  There are basically two ways for a system, such as a 
> Python function, to indicate 'I cannot give a normal response."  One (1a) 
> is to give an inband signal that is like a normal response except that it 
> is not (str.find returning -1).  A variation (1b) is to give an inband 
> response that is more obviously not a real response (many None returns). 
> The other (2) is to not respond (never return normally) but to give an 
> out-of-band signal of some sort (str.index raising ValueError).
> 
> Python as distributed usually chooses 1b or 2.  I believe str.find and 
> .rfind are unique in the choice of 1a.  I am pretty sure that the choice 
> of -1 as error return, instead of, for instance, None, goes back to the 
> need in static languages such as C to return something of the declared 
> return type.  But Python is not C, etcetera.  I believe that this pair is 
> also unique in having exact counterparts of type 2.  (But maybe I forgot 
> something.)

Taking a look at the commits that Guido did way back in 1993, he doesn't
mention why he added .find, only that he did.  Maybe it was another of
the 'functional language additions' that he now regrets, I don't know.


> >> Would anyone really object?
> 
> > I would object to the removal of str.find().
> 
> So, I wonder, what is your favored alternative?
> 
> A. Status quo: ignore the opportunity to streamline the language.

str.find is not a language construct.  It is a method on a built-in type
that many people use.  This is my vote.


> B. Change the return type of .find to None.

Again, this would break potentially thousands of lines of user code that
is in the wild.  Are we talking about changes for 2.5 here, or 3.0?

> C. Remove .(r)index instead.

see below *

> D. Add more redundancy for those who do not like exceptions.

In 99% of the cases, such implementations would be minimal.  While I
understand that "There should be one-- and preferably only one --obvious
way to do it.", please see below *.


> > Further, forcing users to use try/except when they are looking for the
> > offset of a substring seems at least a little strange (if not a lot
> > braindead, no offense to those who prefer their code to spew exceptions
> > at every turn).
> 
> So are you advocating D above or claiming that substring indexing is 
> uniquely deserving of having two versions?  If the latter, why so special? 
> If we only had str.index, would you actually suggest adding this particular 
> duplication?

Apparently everyone has forgotten the dozens of threads on similar
topics over the years.  I'll attempt to summarize.

Adding functionality that isn't used is harmful, but not nearly as
harmful as removing functionality that people use.

If you take just two seconds and do a search on '.find(' vs '.index(' in
the standard library, you will notice that '.find(' is used more often
than '.index(' regardless of type (I don't have the time this evening to
pick out which ones are string only, but I doubt the standard library
uses mmap.find, DocTestFinder.find, or gettext.find).  This example
seems to show that people find str.find to be more intuitive and/or
useful than str.index, even though you spent two large paragraphs
explaining that Python 'doesn't do it that way very often so it isn't
Pythonic'. Apparently the majority of people who have been working on
the standard library for the last decade disagree.


> > Considering the apparent dislike/hatred for str.find.
> 
> I don't hate str.find.  I simply (a) recognize that a function designed for 
> static typing constraints is out of place in Python, which does not have 
> those constraints and (b) believe that there is no reason other than 
> history for the duplication and (c) believe that dropping .find is 
> definitely better than dropping .index and changing .find.

* I don't see why it is necessary to drop or change either one.  We've
got list() and [] for constructing a list.  Heck, we've even got
list(iterable) and [i for i in iterable] for making a list copy of any
arbitrary iterable.  This goes against TSBOOWTDI, so why don't we toss
list comprehensions now that we have list(generator expression)?  Or did
I miss something and this was already going to happen?


> > Would you further request that .rfind be removed from strings?
> 
> Of course.  Thanks for reminding me.

No problem, but again, do a search in the standard libr

Re: [Python-Dev] Remove str.find in 3.0?

2005-08-26 Thread Bill Janssen
Don't know *what* I wasn't thinking :-).

Bill

> On 8/26/05, Bill Janssen <[EMAIL PROTECTED]> wrote:
> > Doubt it.  The problem with returning None is that it tests as False,
> > but so does 0, which is a valid string index position.  The reason
> > string.find() returns -1 is probably to allow a test:
> > 
> >   if line.find("\f"):
> >  ... do something
> 
> This has a bug; it is equivalent to "if not line.startswith("\f"):".
> 
> This mistake (which I have made more than once myself and have seen
> many times in code by others) is one of the main reasons to want to
> get rid of this style of return value.
> 
> > Might add a boolean "str.contains()" to cover this test case.
> 
> We already got that: "\f" in line.
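
The pitfall described in the quoted reply can be checked directly. find() returns 0 (false) for a match at the very start and -1 (true) for no match, so the bare truth test gives the wrong answer in exactly those two cases. The sample strings are made up for illustration.

```python
line_with_ff_first = "\fpage two"
line_without_ff = "no form feed"

# Buggy truth test: 0 is false, -1 is true.
buggy_first = bool(line_with_ff_first.find("\f"))   # match at index 0
buggy_missing = bool(line_without_ff.find("\f"))    # no match, returns -1

# The membership test says what was actually meant.
good_first = "\f" in line_with_ff_first
good_missing = "\f" in line_without_ff
```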


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-26 Thread Guido van Rossum
On 8/26/05, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
> I had one further thought.  In addition to your excellent list of
> reasons, it would be great if these kinds of requests were accompanied by
> a patch that removed the offending construct from the standard library.

Um? Are we now requiring patches for PYTHON THREE DOT OH proposals?

Raymond, we all know and agree that Python 3.0 will be incompatible in
many ways. range() and keys() becoming iterators, int/int returning
float, and so on; we can safely say that it will break nearly every
module under the sun, and no amount of defensive coding in Python 2.x
will save us.

> The most important reason for the patch is that looking at the context
> diff will provide an objective look at how real code will look before
> and after the change.  This would make subsequent discussions
> substantially more informed and less anecdotal.

No, you're just artificially trying to raise the bar for Python 3.0
proposals to an unreasonable height.

> The second reason is that the revised library code becomes more likely
> to survive the transition to 3.0.  Further, it can continue to serve as
> example code which highlights current best practices.

But we don't *want* all of the library code to survive. Much of it is
10-15 years old and in dire need of a total rewrite. See Anthony
Baxter's lightning talk at OSCON (I'm sure Google can find it for
you).

> This patch wouldn't take long.  I've tried about a half dozen cases
> since you first posted.  Each provided a new insight (zipfile was not
> improved, webbrowser was improved, and urlparse was about the same).

So it's neutral in terms of code readability. Great. Given all the
other advantages for the proposal (an eminent member of this group
just posted a buggy example :-) I'm now doubly convinced that we
should do it.

Also remember, the standard library is rather atypical -- while some
of it makes great example code, other parts of it are highly contorted
in order to either maintain backwards compatibility or provide an
unusually high level of defensiveness.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-26 Thread Guido van Rossum
On 8/26/05, Bill Janssen <[EMAIL PROTECTED]> wrote:
> Doubt it.  The problem with returning None is that it tests as False,
> but so does 0, which is a valid string index position.  The reason
> string.find() returns -1 is probably to allow a test:
> 
>   if line.find("\f"):
>  ... do something

This has a bug; it is equivalent to "if not line.startswith("\f"):".

This mistake (which I have made more than once myself and have seen
many times in code by others) is one of the main reasons to want to
get rid of this style of return value.

> Might add a boolean "str.contains()" to cover this test case.

We already got that: "\f" in line.
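
To make the pitfall concrete, here is a minimal sketch. The sample
strings are made up for illustration, and the print() calls assume a
current Python:

```python
# Hypothetical sample lines; "\f" is the form-feed character.
lines = ["\fpage break first", "text\fthen break", "no break at all"]

for line in lines:
    pos = line.find("\f")      # 0, 4 and -1 for the three samples
    buggy = bool(pos)          # False only when pos == 0
    correct = "\f" in line     # True whenever the character is present
    print(repr(line), pos, buggy, correct)

# The buggy test tracks "does not start with \f", not containment.
assert all(bool(l.find("\f")) == (not l.startswith("\f")) for l in lines)
```

The truth test gets the first and last samples wrong: a match at
offset 0 is falsy, and the -1 for "not found" is truthy.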

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-26 Thread Bill Janssen
> There are basically two ways for a system, such as a 
> Python function, to indicate 'I cannot give a normal response.'  One (1a) 
> is to give an inband signal that is like a normal response except that it 
> is not (str.find returning -1).  A variation (1b) is to give an inband 
> response that is more obviously not a real response (many None returns). 
> The other (2) is to not respond (never return normally) but to give an 
> out-of-band signal of some sort (str.index raising ValueError).
> 
> Python as distributed usually chooses 1b or 2.  I believe str.find and 
> .rfind are unique in the choice of 1a.

Doubt it.  The problem with returning None is that it tests as False,
but so does 0, which is a valid string index position.  The reason
string.find() returns -1 is probably to allow a test:

  if line.find("\f"):
 ... do something

Might add a boolean "str.contains()" to cover this test case.

Bill


Re: [Python-Dev] Remove str.find in 3.0?

2005-08-26 Thread Terry Reedy

"Josiah Carlson" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
>
> "Terry Reedy" <[EMAIL PROTECTED]> wrote:
>>
>> Can str.find be listed in PEP 3000 (under builtins) for removal?

Guido has already approved, but I will try to explain my reasoning a bit 
better for you.  There are basically two ways for a system, such as a 
Python function, to indicate 'I cannot give a normal response.'  One (1a) 
is to give an inband signal that is like a normal response except that it 
is not (str.find returning -1).  A variation (1b) is to give an inband 
response that is more obviously not a real response (many None returns). 
The other (2) is to not respond (never return normally) but to give an 
out-of-band signal of some sort (str.index raising ValueError).

Python as distributed usually chooses 1b or 2.  I believe str.find and 
.rfind are unique in the choice of 1a.  I am pretty sure that the choice 
of -1 as error return, instead of, for instance, None, goes back to the 
need in static languages such as C to return something of the declared 
return type.  But Python is not C, etcetera.  I believe that this pair is 
also unique in having exact counterparts of type 2.  (But maybe I forgot 
something.)
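
The three signalling styles can be put side by side in a rough sketch.
The wrapper names find_inband, find_none and find_raise are
hypothetical, introduced only for this illustration; str.find and
str.index are the real methods under discussion:

```python
def find_inband(s, sub):
    """Style 1a: the failure value looks like a real index."""
    return s.find(sub)              # -1 when sub is absent

def find_none(s, sub):
    """Style 1b: the failure value cannot be mistaken for an index."""
    pos = s.find(sub)
    return None if pos == -1 else pos

def find_raise(s, sub):
    """Style 2: no normal return at all when sub is absent."""
    return s.index(sub)             # raises ValueError when sub is absent

s = "python-dev"
print(find_inband(s, "x"))          # -1
print(find_none(s, "x"))            # None
try:
    find_raise(s, "x")
except ValueError as exc:
    print("ValueError:", exc)
```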

>> Would anyone really object?

> I would object to the removal of str.find().

So, I wonder, what is your favored alternative?

A. Status quo: ignore the opportunity to streamline the language.

B. Change the return type of .find to None.

C. Remove .(r)index instead.

D. Add more redundancy for those who do not like exceptions.

> Further, forcing users to use try/except when they are looking for the
> offset of a substring seems at least a little strange (if not a lot
> braindead, no offense to those who prefer their code to spew exceptions
> at every turn).

So are you advocating D above or claiming that substring indexing is 
uniquely deserving of having two versions?  If the latter, why so special? 
If we only had str.index, would you actually suggest adding this particular 
duplication?

> Considering the apparent dislike/hatred for str.find.

I don't hate str.find.  I simply (a) recognize that a function designed for 
static typing constraints is out of place in Python, which does not have 
those constraints and (b) believe that there is no reason other than 
history for the duplication and (c) believe that dropping .find is 
definitely better than dropping .index and changing .find.

> Would you further request that .rfind be removed from strings?

Of course.  Thanks for reminding me.

>  The inclusion of .rindex?

Yes, the continued inclusion of .rindex, which we already have.

Terry J. Reedy





Re: [Python-Dev] Remove str.find in 3.0?

2005-08-26 Thread Terry Reedy

"Raymond Hettinger" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
>> Can str.find be listed in PEP 3000 (under builtins) for removal?
>
> FWIW, here is a sample code transformation (extracted from zipfile.py).
> Judge for yourself whether the index version is better:

I am sure that we both could write similar code that would be smoother if 
the math module also had a 'powhalf' function that was the same as sqrt 
except for returning -1 instead of raising an error on negative or 
non-numerical input.
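
For the sake of the analogy, such a function might look like the
sketch below. No 'powhalf' exists in the math module; the name and
behaviour are purely hypothetical:

```python
import math

def powhalf(x):
    """Hypothetical: like math.sqrt, but returns -1 instead of raising."""
    try:
        return math.sqrt(x)
    except (ValueError, TypeError):
        # ValueError for negative input, TypeError for non-numerical input.
        return -1

print(powhalf(9))      # 3.0
print(powhalf(-9))     # -1
print(powhalf("9"))    # -1
```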

I'll continue in response to Josiah...

Terry J. Reedy





Re: [Python-Dev] Remove str.find in 3.0?

2005-08-26 Thread Terry Reedy

"Guido van Rossum" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> On 8/26/05, Terry Reedy <[EMAIL PROTECTED]> wrote:
>> Can str.find be listed in PEP 3000 (under builtins) for removal?
>
> Yes please. (Except it's not technically a builtin but a string method.)

To avoid suggesting a new header, I interpreted Built-ins broadly to 
include builtin types.  The header could be expanded to Built-in Constants, 
Functions, and Types or Built-ins and Built-in Types but I leave such 
details to the PEP authors.

Terry J. Reedy





Re: [Python-Dev] Remove str.find in 3.0?

2005-08-26 Thread Josiah Carlson

"Terry Reedy" <[EMAIL PROTECTED]> wrote:
> 
> Can str.find be listed in PEP 3000 (under builtins) for removal?
> Would anyone really object?

I would object to the removal of str.find() .  In fact, older versions
of Python which only allowed for single-character 'x in str' containment
tests offered 'str.find(...) != -1' as a suitable replacement option,
which is found in the standard library more than a few times...

Further, forcing users to use try/except when they are looking for the
offset of a substring seems at least a little strange (if not a lot
braindead, no offense to those who prefer their code to spew exceptions
at every turn).

I've been thinking for years that .find should be part of the set of
operations offered to most, if not all sequences (lists, buffers, tuples, ...). 
Considering the apparent dislike/hatred for str.find, it seems I was
wise in not requesting it in the past.
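
As a rough idea of what a sequence-wide .find might do, here is a
free-function sketch. The name seq_find and its signature are
hypothetical and not an existing or proposed API:

```python
def seq_find(seq, item, start=0):
    """Hypothetical helper: index of item in seq at or after start, else -1."""
    for i in range(start, len(seq)):
        if seq[i] == item:
            return i
    return -1

print(seq_find([10, 20, 30], 20))     # 1
print(seq_find(("a", "b"), "z"))      # -1
```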

> 
> Reasons:
> 
> 1. Str.find is essentially redundant with str.index.  The only difference 
> is that str.index Pythonically indicates 'not found' by raising an 
> exception while str.find does the same by anomalously returning -1.  As 
> best as I can remember, this is common for Unix system calls but unique 
> among Python builtin functions.  Learning and remembering both is a 
> nuisance.

So pick one and forget the other.  I think of .index as a list method 
(because it doesn't offer .find), not a string method, even though it is.

> 2. As is being discussed in a current c.l.p thread, -1 is a legal indexing 
> subscript.  If one uses the return value as a subscript without checking, 
> the bug is not caught.  None would be a better return value should find not 
> be deleted.

And would break potentially thousands of lines of code in the wild which
expect -1 right now.  Look in the standard library for starting examples,
and google around for others.

> 3. Anyone who prefers to test return values instead of catch exceptions can 
> write (simplified, without start,end params):
> 
> def sfind(string, target):
>   try:
>     return string.index(target)
>   except ValueError:
>     return None # or -1 for back compatibility, but None better
> 
> This can of course be done for any function/method that indicates input 
> errors with exceptions instead of a special return value.  I see no reason 
> other than history that this particular method should be doubled.

I prefer my methods to stay on my instances, and I could have sworn that
the string module's functions were generally deprecated in favor of
string methods.  Now you are (implicitly) advocating the reversal of
such for one method which doesn't raise an exception under a very
normal circumstance.

Would you further request that .rfind be removed from strings?  The
inclusion of .rindex?

 - Josiah



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-26 Thread Raymond Hettinger
> Can str.find be listed in PEP 3000 (under builtins) for removal?
> Would anyone really object?
> 
> Reasons:
  . . .


I had one further thought.  In addition to your excellent list of
reasons, it would be great if these kinds of requests were accompanied by
a patch that removed the offending construct from the standard library.

The most important reason for the patch is that looking at the context
diff will provide an objective look at how real code will look before
and after the change.  This would make subsequent discussions
substantially more informed and less anecdotal.

The second reason is that the revised library code becomes more likely
to survive the transition to 3.0.  Further, it can continue to serve as
example code which highlights current best practices.

This patch wouldn't take long.  I've tried about a half dozen cases
since you first posted.  Each provided a new insight (zipfile was not
improved, webbrowser was improved, and urlparse was about the same).



Raymond



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-26 Thread Raymond Hettinger
> Can str.find be listed in PEP 3000 (under builtins) for removal?


FWIW, here is a sample code transformation (extracted from zipfile.py).
Judge for yourself whether the index version is better:


Existing code:
--------------
END_BLOCK = min(filesize, 1024 * 4)
fpin.seek(filesize - END_BLOCK, 0)
data = fpin.read()
start = data.rfind(stringEndArchive)
if start >= 0:     # Correct signature string was found
    endrec = struct.unpack(structEndArchive, data[start:start+22])
    endrec = list(endrec)
    comment = data[start+22:]
    if endrec[7] == len(comment):     # Comment length checks out
        # Append the archive comment and start offset
        endrec.append(comment)
        endrec.append(filesize - END_BLOCK + start)
        return endrec
return      # Error, return None


Revised code:
-------------
END_BLOCK = min(filesize, 1024 * 4)
fpin.seek(filesize - END_BLOCK, 0)
data = fpin.read()
try:
    start = data.rindex(stringEndArchive)
except ValueError:
    pass
else:
    # Correct signature string was found
    endrec = struct.unpack(structEndArchive, data[start:start+22])
    endrec = list(endrec)
    comment = data[start+22:]
    if endrec[7] == len(comment):     # Comment length checks out
        # Append the archive comment and start offset
        endrec.append(comment)
        endrec.append(filesize - END_BLOCK + start)
        return endrec
return      # Error, return None



Re: [Python-Dev] Remove str.find in 3.0?

2005-08-26 Thread Guido van Rossum
On 8/26/05, Terry Reedy <[EMAIL PROTECTED]> wrote:
> Can str.find be listed in PEP 3000 (under builtins) for removal?

Yes please. (Except it's not technically a builtin but a string method.)

> Would anyone really object?

Not me.

> Reasons:
> 
> 1. Str.find is essentially redundant with str.index.  The only difference
> is that str.index Pythonically indicates 'not found' by raising an
> exception while str.find does the same by anomalously returning -1.  As
> best as I can remember, this is common for Unix system calls but unique
> among Python builtin functions.  Learning and remembering both is a
> nuisance.
> 
> 2. As is being discussed in a current c.l.p thread, -1 is a legal indexing
> subscript.  If one uses the return value as a subscript without checking,
> the bug is not caught.  None would be a better return value should find not
> be deleted.
> 
> 3. Anyone who prefers to test return values instead of catch exceptions can
> write (simplified, without start,end params):
> 
> def sfind(string, target):
>   try:
>     return string.index(target)
>   except ValueError:
>     return None # or -1 for back compatibility, but None better
> 
> This can of course be done for any function/method that indicates input
> errors with exceptions instead of a special return value.  I see no reason
> other than history that this particular method should be doubled.

I'd like to add:

4. The no. 1 use case for str.find() used to be testing whether a
substring was present or not; "if s.find(sub) >= 0" can now be written
as "if sub in s". This avoids the nasty bug in "if s.find(sub)".

> If .find is scheduled for the dustbin of history, I would be willing to
> suggest doc and docstring changes.  (str.index.__doc__ currently refers to
> str.find.__doc__.  This should be reversed.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


[Python-Dev] Remove str.find in 3.0?

2005-08-26 Thread Terry Reedy
Can str.find be listed in PEP 3000 (under builtins) for removal?
Would anyone really object?

Reasons:

1. Str.find is essentially redundant with str.index.  The only difference 
is that str.index Pythonically indicates 'not found' by raising an 
exception while str.find does the same by anomalously returning -1.  As 
best as I can remember, this is common for Unix system calls but unique 
among Python builtin functions.  Learning and remembering both is a 
nuisance.

2. As is being discussed in a current c.l.p thread, -1 is a legal indexing 
subscript.  If one uses the return value as a subscript without checking, 
the bug is not caught.  None would be a better return value should find not 
be deleted.

3. Anyone who prefers to test return values instead of catch exceptions can 
write (simplified, without start,end params):

def sfind(string, target):
  try:
    return string.index(target)
  except ValueError:
    return None # or -1 for back compatibility, but None better

This can of course be done for any function/method that indicates input 
errors with exceptions instead of a special return value.  I see no reason 
other than history that this particular method should be doubled.
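
As a small illustration of reason 2, the sketch below uses the return
value as a subscript without checking it. The sample string is made
up, and sfind is repeated from reason 3 so the snippet stands alone:

```python
def sfind(string, target):
    try:
        return string.index(target)
    except ValueError:
        return None

s = "abc"

pos = s.find("x")          # -1: "x" is not in s
print(s[pos])              # prints "c": wrong character, no error raised

pos = sfind(s, "x")        # None
try:
    s[pos]
except TypeError:
    print("unchecked use of None fails loudly")
```

With -1 the mistake goes unnoticed, while None as a subscript raises
TypeError at the point of misuse.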

If .find is scheduled for the dustbin of history, I would be willing to 
suggest doc and docstring changes.  (str.index.__doc__ currently refers to 
str.find.__doc__.  This should be reversed.)

Terry J. Reedy


