Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-09-03 Thread Scott David Daniels
Bengt Richter wrote:
 On Wed, 31 Aug 2005 14:16:28 GMT, Ron Adam [EMAIL PROTECTED] wrote:
 [...]
 
The problem with negative indexes is that positive indexes are
zero-based, but negative indexes are one-based, which leads to
asymmetric situations.
Although it is _way_ too late to try something like this, once upon
a time you could have done all of this using the one's complement
operator:
 ~0  does exist and is distinct from 0.
So you could talk about a slice:
 str[4 : ~2]
and so on.
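
A quick sketch of the symmetry being described, in present-day Python (an illustration added here, not code from the thread; the variable names are made up). Because ~i equals -i - 1, seq[~i] is already the i-th item from the right, counted from zero. In a slice, though, ~2 is simply -3 and still follows the ordinary negative-index rule:

```python
# ~i is -(i + 1), so seq[~i] is the i-th item from the right, zero-based.
seq = list("abcdefghij")

from_left = [seq[i] for i in range(3)]    # items 0, 1, 2 counted from the left
from_right = [seq[~i] for i in range(3)]  # items 0, 1, 2 counted from the right

print(from_left)    # ['a', 'b', 'c']
print(from_right)   # ['j', 'i', 'h']
print(~0 == -1)     # True

# In a slice, ~2 is just -3, so this is the same as seq[4:-3].
print(seq[4:~2])    # ['e', 'f', 'g']
```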

--Scott David Daniels
[EMAIL PROTECTED]
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-09-03 Thread Ron Adam
Bengt Richter wrote:

 IMO the problem is that the index sign is doing two jobs, which for zero-based
 reverse indexing have to be separate: i.e., to show direction _and_ a _signed_
 offset which needs to be relative to the direction and base position.

Yes, that's definitely part of it.


 A list-like class, and an option to use a zero-based reverse index will 
 illustrate:
 
  >>> class Zbrx(object):
  ...     def __init__(self, value=0):
  ...         self.value = value
  ...     def __repr__(self): return 'Zbrx(%r)'%self.value
  ...     def __sub__(self, other): return Zbrx(self.value - other)
  ...     def __add__(self, other): return Zbrx(self.value + other)
  ...
  >>> class Zbrxlist(object):
  ...     def normslc(self, slc):
  ...         sss = [slc.start, slc.stop, slc.step]
  ...         for i,s in enumerate(sss):
  ...             if isinstance(s, Zbrx): sss[i] = len(self.value)-1-s.value
  ...         return tuple(sss), slice(*sss)
  ...     def __init__(self, value):
  ...         self.value = value
  ...     def __getitem__(self, i):
  ...         if isinstance(i, int):
  ...             return '[%r]: %r'%(i, self.value[i])
  ...         elif isinstance(i, Zbrx):
  ...             return '[%r]: %r'%(i, self.value[len(self.value)-1-i.value])
  ...         elif isinstance(i, slice):
  ...             sss, slc = self.normslc(i)
  ...             return '[%r:%r:%r]: %r'%(sss+ (list.__getitem__(self.value, slc),))
  ...     def __setitem__(self, i, v):
  ...         if isinstance(i, int):
  ...             list.__setitem__(self.value, i, v)
  ...         elif isinstance(i, slice):
  ...             sss, slc = self.normslc(i)
  ...             list.__setitem__(self.value, slc, v)
  ...     def __repr__(self): return 'Zbrxlist(%r)'%self.value
  ...
  >>> zlast = Zbrx(0)
  >>> zbr10 = Zbrxlist(range(10))
  >>> zbr10[zlast]
  '[Zbrx(0)]: 9'
  >>> zbr10[zlast:]
  '[9:None:None]: [9]'
  >>> zbr10[zlast:zlast] = ['end']
  >>> zbr10
  Zbrxlist([0, 1, 2, 3, 4, 5, 6, 7, 8, 'end', 9])
  >>> ztop = Zbrx(-1)
  >>> zbr10[ztop:ztop] = ['final']
  >>> zbr10
  Zbrxlist([0, 1, 2, 3, 4, 5, 6, 7, 8, 'end', 9, 'final'])
  >>> zbr10[zlast:]
  "[11:None:None]: ['final']"
  >>> zbr10[zlast]
  "[Zbrx(0)]: 'final'"
  >>> zbr10[zlast+1]
  '[Zbrx(1)]: 9'
  >>> zbr10[zlast+2]
  "[Zbrx(2)]: 'end'"
 
  >>> a = Zbrxlist(list('abcde'))
  >>> a
  Zbrxlist(['a', 'b', 'c', 'd', 'e'])
 
 Forgot to provide a __len__ method ;-)
  >>> a[len(a.value):len(a.value)] = ['end']
  >>> a
  Zbrxlist(['a', 'b', 'c', 'd', 'e', 'end'])
 
 lastx refers to the last items by zero-based reverse indexing
  >>> a[lastx]
  "[Zbrx(0)]: 'end'"
  >>> a[lastx:lastx] = ['last']
  >>> a
  Zbrxlist(['a', 'b', 'c', 'd', 'e', 'last', 'end'])
 
 As expected, or do you want to define different semantics?
 You still need to spell len(a) in the slice somehow to indicate
 beyond the top. E.g.,
 
  >>> a[lastx-1:lastx-1] = ['final']
  >>> a
  Zbrxlist(['a', 'b', 'c', 'd', 'e', 'last', 'end', 'final'])
 
 Perhaps you can take the above toy and make something that works
 the way you had in mind? Nothing like implementation to give
 your ideas reality ;-)

Thanks, I'll play around with it.  ;-)

As you stated before, the index is doing two jobs, so limiting it in some 
way may be what is needed.  Here are a few possible (or impossible) options.

(Some of these aren't pretty.)


* Disallow *all* negative values, use values of start/stop to determine 
direction. Indexing from the far end needs to be explicit (len(n)-x).

a[len(a):0]    reverse order
a[len(a):0:2]  reverse order, even items

(I was wondering why lists couldn't have len, min, and max attributes 
that are updated whenever the list is modified, in place of using the 
len, min, and max functions? Would the overhead be that much?)

   a[len.a:0]


* Disallow negative indexes, use negative steps to determine indexing 
direction. Order of indexes to determine output order.

a[len(a):0:-1] forward order, zero based indexing from end.
a[0:len(a):-1] reverse order, zero based from end.
a[0:1:-1]      last item

It works, but single a[-1] is used extremely often.  I don't think having 
to do a[0:1:-1] would be very popular.


* A reverse index symbol/operator could be used ...

a[~0]  -   last item,  This works BTW. :-)  ~0 == -1
a[~1]  -   next to last item

(Could this be related to the original intended use?)


a[~0:~0]   slice after end ?.  Doesn't work correctly.

What is needed here is to index from the left instead of the right.

a[~0] - item to left of end gap.

*IF* this could be done (I'm sure there's some reason why this won't 
work ;-), then all indexing operations with '~' could be symmetric with 
all positive indexing operations. Then in Python 3k true negative 
indexes could cause an exception... fewer bugs I bet.  And then negative 
steps could reverse lists with a lot less confusion, keeping that 
functionality as well.

Maybe a[~:~] == a[-len(a):-0]
   a[~:]  == a[-len(a):len(a)]
   a[:~]  == a[0:-0]

   a[~:~] == a[:]

This 

Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-09-02 Thread Bengt Richter
On Wed, 31 Aug 2005 14:16:28 GMT, Ron Adam [EMAIL PROTECTED] wrote:
[...]

The problem with negative indexes is that positive indexes are
zero-based, but negative indexes are one-based, which leads to
asymmetric situations.

Note that you can insert an item before the first item using slices. But 
not after the last item without using len(list) or some value larger 
than len(list).

IMO the problem is that the index sign is doing two jobs, which for zero-based
reverse indexing have to be separate: i.e., to show direction _and_ a _signed_
offset which needs to be relative to the direction and base position.

A list-like class, and an option to use a zero-based reverse index will 
illustrate:

 >>> class Zbrx(object):
 ...     def __init__(self, value=0):
 ...         self.value = value
 ...     def __repr__(self): return 'Zbrx(%r)'%self.value
 ...     def __sub__(self, other): return Zbrx(self.value - other)
 ...     def __add__(self, other): return Zbrx(self.value + other)
 ...
 >>> class Zbrxlist(object):
 ...     def normslc(self, slc):
 ...         sss = [slc.start, slc.stop, slc.step]
 ...         for i,s in enumerate(sss):
 ...             if isinstance(s, Zbrx): sss[i] = len(self.value)-1-s.value
 ...         return tuple(sss), slice(*sss)
 ...     def __init__(self, value):
 ...         self.value = value
 ...     def __getitem__(self, i):
 ...         if isinstance(i, int):
 ...             return '[%r]: %r'%(i, self.value[i])
 ...         elif isinstance(i, Zbrx):
 ...             return '[%r]: %r'%(i, self.value[len(self.value)-1-i.value])
 ...         elif isinstance(i, slice):
 ...             sss, slc = self.normslc(i)
 ...             return '[%r:%r:%r]: %r'%(sss+ (list.__getitem__(self.value, slc),))
 ...     def __setitem__(self, i, v):
 ...         if isinstance(i, int):
 ...             list.__setitem__(self.value, i, v)
 ...         elif isinstance(i, slice):
 ...             sss, slc = self.normslc(i)
 ...             list.__setitem__(self.value, slc, v)
 ...     def __repr__(self): return 'Zbrxlist(%r)'%self.value
 ...
 >>> zlast = Zbrx(0)
 >>> zbr10 = Zbrxlist(range(10))
 >>> zbr10[zlast]
 '[Zbrx(0)]: 9'
 >>> zbr10[zlast:]
 '[9:None:None]: [9]'
 >>> zbr10[zlast:zlast] = ['end']
 >>> zbr10
 Zbrxlist([0, 1, 2, 3, 4, 5, 6, 7, 8, 'end', 9])
 >>> ztop = Zbrx(-1)
 >>> zbr10[ztop:ztop] = ['final']
 >>> zbr10
 Zbrxlist([0, 1, 2, 3, 4, 5, 6, 7, 8, 'end', 9, 'final'])
 >>> zbr10[zlast:]
 "[11:None:None]: ['final']"
 >>> zbr10[zlast]
 "[Zbrx(0)]: 'final'"
 >>> zbr10[zlast+1]
 '[Zbrx(1)]: 9'
 >>> zbr10[zlast+2]
 "[Zbrx(2)]: 'end'"


>>> a = list('abcde')
>>> a[len(a):len(a)] = ['end']
>>> a
['a', 'b', 'c', 'd', 'e', 'end']

>>> a[-1:-1] = ['last']
>>> a
['a', 'b', 'c', 'd', 'e', 'last', 'end'] # Second to last.

>>> a[100:100] = ['final']
>>> a
['a', 'b', 'c', 'd', 'e', 'last', 'end', 'final']


 >>> a = Zbrxlist(list('abcde'))
 >>> a
 Zbrxlist(['a', 'b', 'c', 'd', 'e'])

Forgot to provide a __len__ method ;-)
 >>> a[len(a.value):len(a.value)] = ['end']
 >>> a
 Zbrxlist(['a', 'b', 'c', 'd', 'e', 'end'])

lastx refers to the last items by zero-based reverse indexing
 >>> a[lastx]
 "[Zbrx(0)]: 'end'"
 >>> a[lastx:lastx] = ['last']
 >>> a
 Zbrxlist(['a', 'b', 'c', 'd', 'e', 'last', 'end'])

As expected, or do you want to define different semantics?
You still need to spell len(a) in the slice somehow to indicate
beyond the top. E.g.,

 >>> a[lastx-1:lastx-1] = ['final']
 >>> a
 Zbrxlist(['a', 'b', 'c', 'd', 'e', 'last', 'end', 'final'])

Perhaps you can take the above toy and make something that works
the way you had in mind? Nothing like implementation to give
your ideas reality ;-)

Regards,
Bengt Richter


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-09-01 Thread Stefan Rank
 [snipped a lot from others about indexing, slicing problems,
  and the inadequacy of -1 as Not Found indicator]

on 31.08.2005 16:16 Ron Adam said the following:
 The problem with negative indexes is that positive indexes are
 zero-based, but negative indexes are one-based, which leads to
 asymmetric situations.

Hear, hear.

This is, for me, the root of the problem.

But changing the whole of Python to the (more natural and consistent) 
one-based indexing style, for indexing from left and right, is...
difficult.



Re: Bug in string.find; was: Re: Proposed PEP: New style indexing, was Re: Bug in slice type

2005-08-31 Thread Antoon Pardon
Op 2005-08-30, Steve Holden schreef [EMAIL PROTECTED]:
 Antoon Pardon wrote:
 Op 2005-08-29, Steve Holden schreef [EMAIL PROTECTED]:
 
Antoon Pardon wrote:

Op 2005-08-27, Steve Holden schreef [EMAIL PROTECTED]:


If you want an exception from your code when 'w' isn't in the string you 
should consider using index() rather than find.


Sometimes it is convenient to have the exception thrown at a later
time.



Otherwise, whatever find() returns you will have to have an if in 
there to handle the not-found case.


And maybe the more convenient place for this if is in a whole different
part of your program, a part where using -1 as an invalid index isn't
at all obvious.



This just sounds like whining to me. If you want to catch errors, use a 
function that will raise an exception rather than relying on the 
invalidity of the result.


You always seem to look at such things in a very narrow scope. You never
seem to consider that various parts of a program have to work together.


Or perhaps it's just that I try not to mix parts inappropriately.
 
 
 I didn't know it was inappropriate to mix certain parts. Can you
 give a list of modules in the standard lib I shouldn't mix?
 
 
So what happens if you have a module that is collecting string-index
pairs, collected from various other parts. In one part you
want to select the last letter, so you pythonically choose -1 as
index. In another part you get a result of find and are happy
with -1 as an indication for an invalid index. Then these
data meet.


That's when debugging has to start. Mixing data of such types is 
somewhat inadvisable, don't you agree?
 
 
 The type of both data is the same, it is a string-index pair in
 both cases. The problem is that a module from the standard lib
 uses a certain value to indicate an illegal index, that has
 a very legal value in python in general.
 
 Since you are clearly feeling pedantic enough to beat this one to death 
 with a 2 x 4 please let me substitute usages for types.

But it's not my usage but python's usage.

 In the case of a find() result -1 *isn't* a string index, it's a failure 
 flag. Which is precisely why it should be filtered out of any set of 
 indexes. once it's been inserted it can no longer be distinguished as a 
 failure indication.

Which is precisely why it was such a bad choice in the first place.

If I need to write code like this:

  var = str.find('.')
  if var == -1:
      var = None

each time I want to store an index for later use, then surely '-1'
shouldn't have been used here.
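
As an illustration of the boilerplate being complained about, the conversion can be wrapped once in a small helper. This is only a sketch in present-day Python; find_or_none is a hypothetical name, not anything in the standard library:

```python
def find_or_none(text, sub):
    """Hypothetical wrapper: like str.find, but returns None when sub is absent."""
    pos = text.find(sub)
    return None if pos == -1 else pos

# Store indexes for later use; a miss is now None rather than a valid index.
names = ["a.txt", "README"]
stored = [find_or_none(name, ".") for name in names]
print(stored)  # [1, None]

# Unlike -1, None cannot be used as an index by accident.
try:
    names[1][stored[1]]
    misuse_detected = False
except TypeError:
    misuse_detected = True
print(misuse_detected)  # True
```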


I suppose I can't deny that people do things like that, myself included, 
 
 
 It is not about what people do. If this was about someone implementing
 find himself and using -1 as an illegal index, I would certainly agree
 that it was inadvisable to do so. Yet when this is what python with
 its library offers the programmer, you seem reluctant to find fault with
 it.

 I've already admitted that the choice of -1 as a return value wasn't 
 smart. However you appear to be saying that it's sensible to mix return 
 values from find() with general-case index values.

I'm saying it should be possible without a problem. It is poor design
to return a legal value as an indication for an error flag.

 I'm saying that you 
 should do so only with caution. The fact that the naive user will often 
 not have the wisdom to apply such caution is what makes a change desirable.

I don't think it is naive, if you expect that no legal value will be
returned as an error flag. 

but mixing data sets where -1 is variously an error flag and a valid 
index is only going to lead to trouble when the combined data is used.
 
 
 Yet this is what python does. Using -1 variously as an error flag and
 a valid index and when  people complain about that, you say it sounds like
 whining.
 
 What I am trying to say is that this doesn't make sense: if you want to 
 combine find() results with general-case indexes (i.e. both positive and 
 negative index values) it behooves you to strip out the -1's before you 
 do so. Any other behaviour is asking for trouble.

I would say that choosing this particular return value as an error flag
was asking for trouble. My impression is that you are putting more
blame on the programmer who fails to take corrective action, instead
of on the design of find, which makes that corrective action needed
in the first place.

-- 
Antoon Pardon


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-31 Thread Antoon Pardon
Op 2005-08-30, Bengt Richter schreef [EMAIL PROTECTED]:
 On 30 Aug 2005 10:07:06 GMT, Antoon Pardon [EMAIL PROTECTED] wrote:

Op 2005-08-30, Terry Reedy schreef [EMAIL PROTECTED]:

 Paul Rubin http://phr.cx@NOSPAM.invalid wrote in message 
 news:[EMAIL PROTECTED]

 Really it's x[-1]'s behavior that should go, not find/rfind.

 I completely disagree, x[-1] as an abbreviation of x[len(x)-1] is extremely 
 useful, especially when 'x' is an expression instead of a name.

I don't think the ability to easily index sequences from the right is
in dispute. Just the fact that negative numbers on their own provide
this functionality.

Because I sometimes find it useful to have a sequence start and
end at arbitrary indexes, I have written a table class. So I
can have a table that is indexed from e.g. -4 to +6. So how am
I supposed to easily get at that last value?
 Give it a handy property? E.g.,

 table.as_python_list[-1]

You're missing the point, I probably didn't make it clear.

It is not about the possibility of doing such a thing. It is
about python providing a frame for such things that work
in general without the need of extra properties in 'special'
cases.

-- 
Antoon Pardon


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-31 Thread Bengt Richter
On 31 Aug 2005 07:26:48 GMT, Antoon Pardon [EMAIL PROTECTED] wrote:

Op 2005-08-30, Bengt Richter schreef [EMAIL PROTECTED]:
 On 30 Aug 2005 10:07:06 GMT, Antoon Pardon [EMAIL PROTECTED] wrote:

Op 2005-08-30, Terry Reedy schreef [EMAIL PROTECTED]:

 Paul Rubin http://phr.cx@NOSPAM.invalid wrote in message 
 news:[EMAIL PROTECTED]

 Really it's x[-1]'s behavior that should go, not find/rfind.

 I completely disagree, x[-1] as an abbreviation of x[len(x)-1] is extremely 
 useful, especially when 'x' is an expression instead of a name.

I don't think the ability to easily index sequences from the right is
in dispute. Just the fact that negative numbers on their own provide
this functionality.

Because I sometimes find it useful to have a sequence start and
end at arbitrary indexes, I have written a table class. So I
can have a table that is indexed from e.g. -4 to +6. So how am
I supposed to easily get at that last value?
 Give it a handy property? E.g.,

 table.as_python_list[-1]

You're missing the point, I probably didn't make it clear.

It is not about the possibility of doing such a thing. It is
about python providing a frame for such things that work
in general without the need of extra properties in 'special'
cases.

How about interpreting seq[i] as an abbreviation of seq[i%len(seq)] ?
That would give a consistent interpretation of seq[-1] and no errors
for any value ;-)
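
To make the suggestion concrete, here is a small sketch in present-day Python. wrap_index is a hypothetical helper that stands in for the proposed meaning of seq[i]; it is not an existing feature:

```python
def wrap_index(seq, i):
    """Hypothetical helper: read seq[i] as seq[i % len(seq)]."""
    return seq[i % len(seq)]

seq = list("abcde")
print(wrap_index(seq, -1))  # 'e', same as today's seq[-1]
print(wrap_index(seq, 5))   # 'a', wraps around where seq[5] raises IndexError
print(wrap_index(seq, -6))  # 'e', wraps around where seq[-6] raises IndexError
```

One caveat on "no errors for any value": with an empty sequence the modulo itself raises ZeroDivisionError.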

Regards,
Bengt Richter


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-31 Thread Antoon Pardon
Op 2005-08-31, Bengt Richter schreef [EMAIL PROTECTED]:
 On 31 Aug 2005 07:26:48 GMT, Antoon Pardon [EMAIL PROTECTED] wrote:

Op 2005-08-30, Bengt Richter schreef [EMAIL PROTECTED]:
 On 30 Aug 2005 10:07:06 GMT, Antoon Pardon [EMAIL PROTECTED] wrote:

Op 2005-08-30, Terry Reedy schreef [EMAIL PROTECTED]:

 Paul Rubin http://phr.cx@NOSPAM.invalid wrote in message 
 news:[EMAIL PROTECTED]

 Really it's x[-1]'s behavior that should go, not find/rfind.

 I completely disagree, x[-1] as an abbreviation of x[len(x)-1] is extremely 
 useful, especially when 'x' is an expression instead of a name.

I don't think the ability to easily index sequences from the right is
in dispute. Just the fact that negative numbers on their own provide
this functionality.

Because I sometimes find it useful to have a sequence start and
end at arbitrary indexes, I have written a table class. So I
can have a table that is indexed from e.g. -4 to +6. So how am
I supposed to easily get at that last value?
 Give it a handy property? E.g.,

 table.as_python_list[-1]

You're missing the point, I probably didn't make it clear.

It is not about the possibility of doing such a thing. It is
about python providing a frame for such things that work
in general without the need of extra properties in 'special'
cases.

 How about interpreting seq[i] as an abbreviation of seq[i%len(seq)] ?
 That would give a consistent interpretation of seq[-1] and no errors
 for any value ;-)

But the question was not about having a consistent interpretation for
-1, but about an easy way to get the last value.

But I like your idea. I just think there should be two different ways
to index. Maybe use braces in one case.

  seq{i} would be pure indexing, that throws exceptions if you
  are out of bound

  seq[i] would then be seq{i%len(seq)}

-- 
Antoon Pardon


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-31 Thread Bryan Olson
Paul Rubin wrote:
  Not every sequence needs __len__; for example, infinite sequences, or
  sequences that implement slicing and subscripts by doing lazy
  evaluation of iterators:
 
digits_of_pi = memoize(generate_pi_digits())  # 3,1,4,1,5,9,2,...
print digits_of_pi[5]   # computes 6 digits and prints '9'
print digits_of_pi($-5)  # raises exception

Good point. I like the memoize thing, so here is one:


class memoize (object):
    """ Build a sequence from an iterable, evaluating as needed.
    """

    def __init__(self, iterable):
        self.it = iterable
        self.known = []

    def extend_(self, stop):
        # Pull items from the iterator until `stop` of them are cached.
        while len(self.known) < stop:
            self.known.append(self.it.next())

    def __getitem__(self, key):
        if isinstance(key, (int, long)):
            self.extend_(key + 1)
            return self.known[key]
        elif isinstance(key, slice):
            # Assumes explicit, non-negative start, stop and step.
            start, stop, step = key.start, key.stop, key.step
            stop = start + 1 + (stop - start - 1) // step * step
            self.extend_(stop)
            return self.known[start : stop : step]
        else:
            raise TypeError("Bad subscript type")


-- 
--Bryan


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-31 Thread Ron Adam
Antoon Pardon wrote:

 Op 2005-08-31, Bengt Richter schreef [EMAIL PROTECTED]:
 
On 31 Aug 2005 07:26:48 GMT, Antoon Pardon [EMAIL PROTECTED] wrote:


Op 2005-08-30, Bengt Richter schreef [EMAIL PROTECTED]:

On 30 Aug 2005 10:07:06 GMT, Antoon Pardon [EMAIL PROTECTED] wrote:


Op 2005-08-30, Terry Reedy schreef [EMAIL PROTECTED]:

Paul Rubin http://phr.cx@NOSPAM.invalid wrote in message 
news:[EMAIL PROTECTED]


Really it's x[-1]'s behavior that should go, not find/rfind.

I completely disagree, x[-1] as an abbreviation of x[len(x)-1] is extremely 
useful, especially when 'x' is an expression instead of a name.

I don't think the ability to easily index sequences from the right is
in dispute. Just the fact that negative numbers on their own provide
this functionality.

Because I sometimes find it useful to have a sequence start and
end at arbitrary indexes, I have written a table class. So I
can have a table that is indexed from e.g. -4 to +6. So how am
I supposed to easily get at that last value?

Give it a handy property? E.g.,

table.as_python_list[-1]

You're missing the point, I probably didn't make it clear.

It is not about the possibility of doing such a thing. It is
about python providing a frame for such things that work
in general without the need of extra properties in 'special'
cases.


How about interpreting seq[i] as an abbreviation of seq[i%len(seq)] ?
That would give a consistent interpretation of seq[-1] and no errors
for any value ;-)
 
 
 But the question was not about having a consistent interpretation for
 -1, but about an easy way to get the last value.
 
 But I like your idea. I just think there should be two different ways
 to index. Maybe use braces in one case.
 
   seq{i} would be pure indexing, that throws exceptions if you
   are out of bound
 
   seq[i] would then be seq{i%len(seq)}

The problem with negative indexes is that positive indexes are
zero-based, but negative indexes are one-based, which leads to
asymmetric situations.

Note that you can insert an item before the first item using slices. But 
not after the last item without using len(list) or some value larger 
than len(list).

>>> a = list('abcde')
>>> a[len(a):len(a)] = ['end']
>>> a
['a', 'b', 'c', 'd', 'e', 'end']

>>> a[-1:-1] = ['last']
>>> a
['a', 'b', 'c', 'd', 'e', 'last', 'end'] # Second to last.

>>> a[100:100] = ['final']
>>> a
['a', 'b', 'c', 'd', 'e', 'last', 'end', 'final']


Cheers,
Ron


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-30 Thread Paul Rubin
Terry Reedy [EMAIL PROTECTED] writes:
 The fact that the -1 return *has* led to bugs in actual code is the 
 primary reason Guido has currently decided that find and rfind should go. 
 A careful review of current usages in the standard library revealed at 
 least a couple bugs even there.

Really it's x[-1]'s behavior that should go, not find/rfind.

Will socket.connect_ex also go?  How about dict.get?  Being able to
return some reasonable value for failure is a good thing, if failure
is expected.  Exceptions are for unexpected, i.e., exceptional failures.


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing, was Re: Bug in slice type

2005-08-30 Thread Antoon Pardon
Op 2005-08-29, Steve Holden schreef [EMAIL PROTECTED]:
 Antoon Pardon wrote:
 Op 2005-08-27, Steve Holden schreef [EMAIL PROTECTED]:
 

If you want an exception from your code when 'w' isn't in the string you 
should consider using index() rather than find.
 
 
 Sometimes it is convenient to have the exception thrown at a later
 time.
 
 
Otherwise, whatever find() returns you will have to have an if in 
there to handle the not-found case.
 
 
 And maybe the more convenient place for this if is in a whole different
 part of your program, a part where using -1 as an invalid index isn't
 at all obvious.
 
 
This just sounds like whining to me. If you want to catch errors, use a 
function that will raise an exception rather than relying on the 
invalidity of the result.
 
 
 You always seem to look at such things in a very narrow scope. You never
 seem to consider that various parts of a program have to work together.
 
 Or perhaps it's just that I try not to mix parts inappropriately.

I didn't know it was inappropriate to mix certain parts. Can you
give a list of modules in the standard lib I shouldn't mix?

 So what happens if you have a module that is collecting string-index
 pairs, collected from various other parts. In one part you
 want to select the last letter, so you pythonically choose -1 as
 index. In another part you get a result of find and are happy
 with -1 as an indication for an invalid index. Then these
 data meet.
 
 That's when debugging has to start. Mixing data of such types is 
 somewhat inadvisable, don't you agree?

The type of both data is the same, it is a string-index pair in
both cases. The problem is that a module from the standard lib
uses a certain value to indicate an illegal index, that has
a very legal value in python in general.
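
The clash is easy to reproduce. A minimal illustration in present-day Python (the string is arbitrary): a failed find() result is accepted as an index without complaint, whereas index() raises at the point of failure.

```python
text = "hello"

pos = text.find("w")   # 'w' is not in the string, so find() returns -1
print(pos)             # -1
print(text[pos])       # 'o': -1 is a perfectly legal index, so no error appears

# index() reports the failure where it happens instead of returning a flag.
try:
    text.index("w")
    raised = False
except ValueError:
    raised = True
print(raised)          # True
```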

 I suppose I can't deny that people do things like that, myself included, 

It is not about what people do. If this was about someone implementing
find himself and using -1 as an illegal index, I would certainly agree
that it was inadvisable to do so. Yet when this is what python with
its library offers the programmer, you seem reluctant to find fault with
it.

 but mixing data sets where -1 is variously an error flag and a valid 
 index is only going to lead to trouble when the combined data is used.

Yet this is what python does. Using -1 variously as an error flag and
a valid index and when  people complain about that, you say it sounds like
whining.

-- 
Antoon Pardon


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-30 Thread Bryan Olson
Steve Holden wrote:
  I'm all in favor of discussions to make 3.0 a better
  language.

This one should definitely be two-phase. First, the non-code-
breaking change that replaces-and-deprecates the warty handling
of negative indexes, and later the removal of the old style. For
the former, there's no need to wait for an X.0 release; for the
latter, 3.0 may be too early.

The draft PEP went to the PEP editors a couple days ago. Haven't
heard back yet.


-- 
--Bryan


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-30 Thread Antoon Pardon
Op 2005-08-29, Steven Bethard schreef [EMAIL PROTECTED]:
 Antoon Pardon wrote:
 I think a properly implemented find is better than an index.

 See the current thread in python-dev[1], which proposes a new method, 
 str.partition().  I believe that Raymond Hettinger has shown that almost 
 all uses of str.find() can be more clearly represented with his 
 proposed function.

Do we really need this? As far as I understand most of this
functionality is already provided by str.split and str.rsplit

I think adding an optional third parameter 'full=False' to these
methods, would be all that is useful here. If full was set
to True, split and rsplit would enforce that a list with
maxsplit + 1 elements was returned, filling up the list with
None's if necessary.


  head, sep, tail = str.partition(sep)

would then almost be equivalent to

  head, tail = str.split(sep, 1, True)


Code like the following:

    head, found, tail = result.partition(' ')
    if not found:
        break
    result = head + tail


Could be replaced by:

    head, tail = result.split(' ', 1, full = True)
    if tail is None:
        break
    result = head + tail


I also think that code like this:

  while tail:
      head, _, tail = tail.partition('.')
      mname = "%s.%s" % (m.__name__, head)
      m = self.import_it(head, mname, m)
      ...


Would probably better be written as follows:

  for head in tail.split('.'):
      mname = "%s.%s" % (m.__name__, head)
      m = self.import_it(head, mname, m)
      ...


Unless I'm missing something.
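
The proposed 'full' behaviour can be approximated today with a small function. This is only a sketch: split_full is a hypothetical name, and str.split has no such parameter. It pads the result with None up to maxsplit + 1 elements, as described above:

```python
def split_full(text, sep, maxsplit):
    """Hypothetical stand-in for the proposed str.split(sep, maxsplit, full=True)."""
    parts = text.split(sep, maxsplit)
    # Pad with None so the result always has maxsplit + 1 elements.
    return parts + [None] * (maxsplit + 1 - len(parts))

print(split_full("key=value", "=", 1))  # ['key', 'value']
print(split_full("novalue", "=", 1))    # ['novalue', None]

# For comparison, partition signals "not found" with empty strings.
print("novalue".partition("="))         # ('novalue', '', '')
```

The difference in the not-found case (None versus an empty separator and tail) is presumably what "almost equivalent" refers to.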


-- 
Antoon Pardon


[1]http://mail.python.org/pipermail/python-dev/2005-August/055781.html


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-30 Thread Terry Reedy

Paul Rubin http://phr.cx@NOSPAM.invalid wrote in message 
news:[EMAIL PROTECTED]

 Really it's x[-1]'s behavior that should go, not find/rfind.

I completely disagree, x[-1] as an abbreviation of x[len(x)-1] is extremely 
useful, especially when 'x' is an expression instead of a name.  But even 
if -1 were not a legal subscript, I would still consider it a design error 
for Python to mistype a non-numeric singleton indicator as an int.  Such 
mistyping is  only necessary in a language like C that requires all return 
values to be of the same type, even when the 'return value' is not really a 
return value but an error signal.  Python does not have that typing 
restriction and should not act as if it does by copying C.

 Will socket.connect_ex also go?

Not familiar with it.

  How about dict.get?

A default value is not necessarily an error indicator.  One can regard a 
dict that is 'getted' as an infinite dict matching all keys with the 
default except for a finite subset of keys, as recorded in the dict.

If the default is to be regarded a 'Nothing to return' indicator, then that 
indicator *must not* be in the dict.  A recommended idiom is to then create 
a new, custom subset of object which *cannot* be a value in the dict. 
Return values can then safely be compared with that indicator by using the 
'is' operator.
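
A short sketch of that idiom in present-day Python. The names _MISSING, settings and lookup are invented for the example:

```python
# A fresh object() is identical only to itself, so it can never be
# mistaken for a value that was actually stored in the dict.
_MISSING = object()

settings = {"timeout": None, "retries": 3}

def lookup(d, key):
    value = d.get(key, _MISSING)
    if value is _MISSING:          # identity test, not equality
        return "no such key"
    return "found %r" % (value,)

print(lookup(settings, "timeout"))  # found None
print(lookup(settings, "colour"))   # no such key
```

Note that a stored None is reported as found, which a default of None could not distinguish from a missing key.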

In either case, .get is significantly different from .find.

Terry J. Reedy





Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-30 Thread Paul Rubin
Terry Reedy [EMAIL PROTECTED] writes:
  Really it's x[-1]'s behavior that should go, not find/rfind.
 
 I completely disagree, x[-1] as an abbreviation of x[len(x)-1] is extremely 
 useful, especially when 'x' is an expression instead of a name.

There are other abbreviations possible, for example the one in the
proposed PEP at the beginning of this thread.

 But even 
 if -1 were not a legal subscript, I would still consider it a design error 
 for Python to mistype a non-numeric singleton indicator as an int.

OK, .find should return None if the string is not found.


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-30 Thread Bryan Olson
Terry Reedy wrote:
  Paul Rubin wrote:
 
 Really it's x[-1]'s behavior that should go, not find/rfind.
 
  I completely disagree, x[-1] as an abbreviation of x[len(x)-1] is
  extremely useful, especially when 'x' is an expression instead of a name.

Hear us out; your disagreement might not be so complete as you
think. From-the-far-end indexing is too useful a feature to
trash. If you look back several posts, you'll see that the
suggestion here is that the index expression should explicitly
call for it, rather than treat negative integers as a special
case.

I wrote up and sent off my proposal, and once the PEP-Editors
respond, I'll be pitching it on the python-dev list. Below is
the version I sent (not yet a listed PEP).


--
--Bryan


PEP: -1
Title: Improved from-the-end indexing and slicing
Version: $Revision: 1.00 $
Last-Modified: $Date: 2005/08/26 00:00:00 $
Author: Bryan G. Olson [EMAIL PROTECTED]
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 26 Aug 2005
Post-History:


Abstract

 To index or slice a sequence from the far end, we propose
 using a symbol, '$', to stand for the length, instead of
 Python's current special-case interpretation of negative
 subscripts. Where Python currently uses:

 sequence[-i]

 We propose:

 sequence[$ - i]

 Python's treatment of negative indexes as offsets from the
 high end of a sequence causes minor obvious problems and
 major subtle ones. This PEP proposes a consistent meaning
 for indexes, yet still supports from-the-far-end
 indexing. Use of new syntax avoids breaking existing code.


Specification

 We propose a new style of slicing and indexing for Python
 sequences. Instead of:

 sequence[start : stop : step]

 new-style slicing uses the syntax:

 sequence[start ; stop ; step]

 It works like current slicing, except that negative start or
 stop values do not trigger from-the-high-end interpretation.
 Omissions and 'None' work the same as in old-style slicing.

 Within the square-brackets, the '$' symbol stands for the
 length of the sequence. One can index from the high end by
 subtracting the index from '$'. Instead of:

 seq[3 : -4]

 we write:

 seq[3 ; $ - 4]

 When square-brackets appear within other square-brackets,
 the inner-most bracket-pair determines which sequence '$'
 describes. The length of the next-outer sequence is denoted
 by '$1', and the next-outer after that by '$2', and so on. The
 symbol '$0' behaves identically to '$'. Resolution of $x is
 syntactic; a callable object invoked within square brackets
 cannot use the symbol to examine the context of the call.

 The '$' notation also works in simple (non-slice) indexing.
 Instead of:

 seq[-2]

 we write:

 seq[$ - 2]

 If we did not care about backward compatibility, new-style
 slicing would define seq[-2] to be out-of-bounds. Of course
 we do care about backward compatibility, and rejecting
 negative indexes would break way too much code. For now,
 simple indexing with a negative subscript (and no '$') must
 continue to index from the high end, as a deprecated
 feature. The presence of '$' always indicates new-style
 indexing, so a programmer who needs a negative index to
 trigger a range error can write:

 seq[($ - $) + index]


Motivation

 From-the-far-end indexing is such a useful feature that we
 cannot reasonably propose its removal; nevertheless Python's
 current method, which is to treat a range of negative
 indexes as special cases, is warty. The wart bites novice or
 imperfect Pythoners by not raising an exception when they
 need to know about a bug. For example, the following code
 prints 'y' with no sign of error:

 s = 'buggy'
 print s[s.find('w')]
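
A minimal sketch of the same pitfall in current Python 3 syntax, with str.index() shown for contrast (the variable names silent and loud are only for this example):

```python
s = 'buggy'

# find() signals "not found" with -1, which is also a valid index.
silent = s[s.find('w')]        # evaluates s[-1], the last character

# index() raises ValueError instead, so the bug surfaces at the lookup.
try:
    s[s.index('w')]
    loud = None
except ValueError as exc:
    loud = exc

print(repr(silent))
print(type(loud).__name__)
```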

 The wart becomes an even bigger problem with more
 sophisticated use of Python sequences. What is the 'stop'
 value for a slice when the step is negative and the slice
 includes the zero index? An instance of Python's slice type
 will report that the stop value is -1, but if we use this
 stop value to slice, it gets misinterpreted as the last
 index in the sequence. Here's an example:

 class BuggerAll:

     def __init__(self, somelist):
         self.sequence = somelist[:]

     def __getitem__(self, key):
         if isinstance(key, slice):
             start, stop, step = key.indices(len(self.sequence))
             # print 'Slice says start, stop, step are:', start, stop, step
             return self.sequence[start : stop : step]


 print           range(10) [None : None : -2]
 print BuggerAll(range(10))[None : None : -2]

 The above prints:

 [9, 7, 5, 3, 1]
 []

 Un-commenting the print statement in 
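
The BuggerAll behaviour can be reproduced in current Python 3 with the sketch below. It also shows one common workaround, which is an assumption of this example and not part of the proposal: feed the triple from slice.indices() to range() rather than back into a slice. The class name RangeBased is made up here.

```python
class BuggerAll:
    """Re-slices with the triple from slice.indices(), as in the example above."""

    def __init__(self, somelist):
        self.sequence = list(somelist)

    def __getitem__(self, key):
        if isinstance(key, slice):
            start, stop, step = key.indices(len(self.sequence))
            # stop == -1 is read again as "the last index", so nothing is selected
            return self.sequence[start:stop:step]
        return self.sequence[key]


class RangeBased(BuggerAll):
    """Hypothetical variant: range() takes the triple literally."""

    def __getitem__(self, key):
        if isinstance(key, slice):
            positions = range(*key.indices(len(self.sequence)))
            return [self.sequence[i] for i in positions]
        return self.sequence[key]


data = list(range(10))
triple = slice(None, None, -2).indices(len(data))
print(triple)
print(BuggerAll(data)[::-2])
print(RangeBased(data)[::-2])
```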

Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-30 Thread Paul Rubin
Bryan Olson [EMAIL PROTECTED] writes:
  Specifically, to support new-style slicing, a class that
  accepts index or slice arguments to any of:
 
  __getitem__
  __setitem__
  __delitem__
  __getslice__
  __setslice__
  __delslice__
 
  must also consistently implement:
 
  __len__
 
  Sane programmers already follow this rule.

It should be ok to use new-style slicing without implementing __len__
as long as you don't use $ in any slices.  Using $ in a slice without
__len__ would throw a runtime error.  I expect using negative
subscripts in old-style slices on objects with no __len__ also throws
an error.

Not every sequence needs __len__; for example, infinite sequences, or
sequences that implement slicing and subscripts by doing lazy
evaluation of iterators:

  digits_of_pi = memoize(generate_pi_digits())  # 3,1,4,1,5,9,2,...
  print digits_of_pi[5]   # computes 6 digits and prints '9'
  print digits_of_pi[$-5]  # raises exception
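
A minimal sketch of such a length-less sequence in today's Python, assuming a small memoizing wrapper (the class LazySeq is made up for this example, and squares stand in for the digits of pi):

```python
import itertools


class LazySeq:
    """Index into an iterator on demand; deliberately defines no __len__."""

    def __init__(self, iterable):
        self._source = iter(iterable)
        self._cache = []

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError('only integer indexes are supported in this sketch')
        if index < 0:
            raise IndexError('no length, so no from-the-end indexing')
        while len(self._cache) <= index:
            try:
                self._cache.append(next(self._source))
            except StopIteration:
                raise IndexError('index out of range') from None
        return self._cache[index]


squares = LazySeq(n * n for n in itertools.count())
print(squares[5])             # computes six items, then returns the sixth
print(len(squares._cache))
```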


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-30 Thread Antoon Pardon
Op 2005-08-30, Terry Reedy schreef [EMAIL PROTECTED]:

 Paul Rubin http://phr.cx@NOSPAM.invalid wrote in message 
 news:[EMAIL PROTECTED]

 Really it's x[-1]'s behavior that should go, not find/rfind.

 I completely disagree, x[-1] as an abbreviation of x[len(x)-1] is extremely 
 useful, especially when 'x' is an expression instead of a name.

I don't think the ability to easily index sequences from the right is
in dispute. Just the fact that negative numbers on their own provide
this functionality.

Because I sometimes find it useful to have a sequence start and
end at arbitrary indexes, I have written a table class. So I
can have a table that is indexed from e.g. -4 to +6. So how am
I supposed to easily get at that last value?

-- 
Antoon Pardon


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-30 Thread Robert Kern
Bryan Olson wrote:

  Currently, user-defined classes can implement Python
  subscripting and slicing without implementing Python's len()
  function. In our proposal, the '$' symbol stands for the
  sequence's length, so classes must be able to report their
  length in order for $ to work within their slices and
  indexes.
 
  Specifically, to support new-style slicing, a class that
  accepts index or slice arguments to any of:
 
  __getitem__
  __setitem__
  __delitem__
  __getslice__
  __setslice__
  __delslice__
 
  must also consistently implement:
 
  __len__
 
  Sane programmers already follow this rule.

Incorrect. Some sane programmers have multiple dimensions they need to
index.

  from Numeric import *
  A = array([[0, 1], [2, 3], [4, 5]])
  A[$-1, $-1]

The result of len(A) has nothing to do with the second $.
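
To make the point concrete without Numeric, here is a minimal pure-Python sketch. The Grid class and its shape attribute are assumptions of this example. Each axis needs its own length, and len() can only report one of them.

```python
def _resolve(index, length):
    """Map a possibly negative index onto 0..length-1 for one axis."""
    if index < 0:
        index += length
    if not 0 <= index < length:
        raise IndexError('index out of range')
    return index


class Grid:
    """A tiny two-dimensional container indexed as grid[i, j]."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]
        self.shape = (len(self.rows), len(self.rows[0]))

    def __len__(self):
        return self.shape[0]          # only the first dimension

    def __getitem__(self, key):
        i, j = key
        n_rows, n_cols = self.shape
        return self.rows[_resolve(i, n_rows)][_resolve(j, n_cols)]


A = Grid([[0, 1], [2, 3], [4, 5]])
print(len(A), A.shape)
print(A[-1, -1])    # the second -1 is resolved against 2 columns, not len(A)
```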

-- 
Robert Kern
[EMAIL PROTECTED]

In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die.
  -- Richard Harter



Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-30 Thread Antoon Pardon
Op 2005-08-30, Robert Kern schreef [EMAIL PROTECTED]:
 Bryan Olson wrote:

  Currently, user-defined classes can implement Python
  subscripting and slicing without implementing Python's len()
  function. In our proposal, the '$' symbol stands for the
  sequence's length, so classes must be able to report their
  length in order for $ to work within their slices and
  indexes.
 
  Specifically, to support new-style slicing, a class that
  accepts index or slice arguments to any of:
 
  __getitem__
  __setitem__
  __delitem__
  __getslice__
  __setslice__
  __delslice__
 
  must also consistently implement:
 
  __len__
 
  Sane programmers already follow this rule.

 Incorrect. Some sane programmers have multiple dimensions they need to
 index.

I don't see how that contradicts Bryan's statement.

   from Numeric import *
   A = array([[0, 1], [2, 3], [4, 5]])
   A[$-1, $-1]

 The result of len(A) has nothing to do with the second $.

But that is irrelevant to the question of whether or not sane
programmers follow Bryan's stated rule. That the second
$ has nothing to do with len(A) doesn't contradict that
__len__ has to be implemented, nor that sane programmers
already do so.

-- 
Antoon Pardon


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-30 Thread Robert Kern
Antoon Pardon wrote:
 Op 2005-08-30, Robert Kern schreef [EMAIL PROTECTED]:
 
Bryan Olson wrote:

 Currently, user-defined classes can implement Python
 subscripting and slicing without implementing Python's len()
 function. In our proposal, the '$' symbol stands for the
 sequence's length, so classes must be able to report their
 length in order for $ to work within their slices and
 indexes.

 Specifically, to support new-style slicing, a class that
 accepts index or slice arguments to any of:

 __getitem__
 __setitem__
 __delitem__
 __getslice__
 __setslice__
 __delslice__

 must also consistently implement:

 __len__

 Sane programmers already follow this rule.

Incorrect. Some sane programmers have multiple dimensions they need to
index.
 
 I don't see how that contradicts Bryan's statement.
 
  from Numeric import *
  A = array([[0, 1], [2, 3], [4, 5]])
  A[$-1, $-1]

The result of len(A) has nothing to do with the second $.
 
 But that is irrelevant to the question of whether or not sane
 programmers follow Bryan's stated rule. That the second
 $ has nothing to do with len(A) doesn't contradict that
 __len__ has to be implemented, nor that sane programmers
 already do so.

Except that the *consistent* implementation is supposed to support the
interpretation of $. It clearly can't for multiple dimensions.

-- 
Robert Kern
[EMAIL PROTECTED]

In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die.
  -- Richard Harter



Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-30 Thread Bryan Olson
Robert Kern wrote:
  Bryan Olson wrote:
 
 
  Currently, user-defined classes can implement Python
  subscripting and slicing without implementing Python's len()
  function. In our proposal, the '$' symbol stands for the
  sequence's length, so classes must be able to report their
  length in order for $ to work within their slices and
  indexes.
 
  Specifically, to support new-style slicing, a class that
  accepts index or slice arguments to any of:
 
  __getitem__
  __setitem__
  __delitem__
  __getslice__
  __setslice__
  __delslice__
 
  must also consistently implement:
 
  __len__
 
  Sane programmers already follow this rule.
 
 
  Incorrect. Some sane programmers have multiple dimensions they need to
  index.
 
from Numeric import *
A = array([[0, 1], [2, 3], [4, 5]])
A[$-1, $-1]
 
  The result of len(A) has nothing to do with the second $.

I think you have a good observation there, but I'll stand by my
correctness.

My initial post considered re-interpreting tuple arguments, but
I abandoned that alternative after Steven Bethard pointed out
how much code it would break. Modules/classes would remain free
to interpret tuple arguments in any way they wish. I don't think
my proposal breaks any sane existing code.

Going forward, I would advocate that user classes which
implement their own kind of subscripting adopt the '$' syntax,
and interpret it as consistently as possible. For example, they
could respond to __len__() by returning a type that supports the
"Emulating numeric types" methods from the Python Language
Reference 3.3.7, and also allows the class's methods to tell
that it stands for the length of the dimension in question.
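
One way to picture an object that "stands for the length" is sketched below in current Python. This is only an illustration under stated assumptions: FromEnd, END and StrictSeq are made-up names, END is an ordinary object used where the proposal writes '$', and nothing here has __len__() return a non-integer.

```python
class FromEnd:
    """Placeholder for 'length + offset', resolved later by the container."""

    def __init__(self, offset=0):
        self.offset = offset

    def __sub__(self, amount):
        return FromEnd(self.offset - amount)

    def __add__(self, amount):
        return FromEnd(self.offset + amount)

    def resolve(self, length):
        return length + self.offset


END = FromEnd()   # plays the role of '$' in this sketch


class StrictSeq:
    """Sequence in which a bare negative integer is simply out of range."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)

    def __getitem__(self, key):
        if isinstance(key, FromEnd):
            key = key.resolve(len(self.items))
        if not 0 <= key < len(self.items):
            raise IndexError('index out of range')
        return self.items[key]


word = StrictSeq('buggy')
print(word[END - 1])          # last item, requested explicitly
```

In this sketch word[-1] raises IndexError, which is the behaviour the draft wants for new-style indexing.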


-- 
--Bryan


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-30 Thread Robert Kern
Bryan Olson wrote:
 Robert Kern wrote:

 from Numeric import *
 A = array([[0, 1], [2, 3], [4, 5]])
 A[$-1, $-1]
  
   The result of len(A) has nothing to do with the second $.
 
 I think you have a good observation there, but I'll stand by my
 correctness.

len() cannot be used to determine the value of $ in the context of
multiple dimensions.

 My initial post considered re-interpreting tuple arguments, but
 I abandoned that alternative after Steven Bethard pointed out
 how much code it would break. Modules/classes would remain free
 to interpret tuple arguments in any way they wish. I don't think
 my proposal breaks any sane existing code.

What it does do is provide a second way to do indexing from the end that
can't be extended to multiple dimensions.

 Going forward, I would advocate that user classes which
 implement their own kind of subscripting adopt the '$' syntax,
 and interpret it as consistently as possible.

How? You haven't proposed how an object gets the information that
$-syntax is being used. You've proposed a syntax and some semantics; you
also need to flesh out the pragmatics.

 For example, they
 could respond to __len__() by returning a type that supports the
 "Emulating numeric types" methods from the Python Language
 Reference 3.3.7, and also allows the class's methods to tell
 that it stands for the length of the dimension in question.

I have serious doubts about __len__() returning anything but a bona-fide
integer. We shouldn't need to use incredible hacks like that to support
a core language feature.

-- 
Robert Kern
[EMAIL PROTECTED]

In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die.
  -- Richard Harter



Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-30 Thread phil hunt
On Tue, 30 Aug 2005 08:53:27 GMT, Bryan Olson [EMAIL PROTECTED] wrote:
 Specifically, to support new-style slicing, a class that
 accepts index or slice arguments to any of:

 __getitem__
 __setitem__
 __delitem__
 __getslice__
 __setslice__
 __delslice__

 must also consistently implement:

 __len__

 Sane programmers already follow this rule.


Wouldn't it be more sensible to have an abstract IndexedCollection 
superclass, which implements all the slicing stuff, then when someone 
writes their own collection class they just have to implement 
__len__ and __getitem__ and slicing works automatically?
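
A minimal sketch of that idea in current Python. The superclass name comes from the suggestion above; everything else is an assumption of this example. In particular the subclass hook is called get_one here, not __getitem__, so that the base class can own __getitem__ and handle slices itself.

```python
class IndexedCollection:
    """Base class: subclasses supply __len__ and get_one(index)."""

    def __len__(self):
        raise NotImplementedError

    def get_one(self, index):
        raise NotImplementedError

    def __getitem__(self, key):
        length = len(self)
        if isinstance(key, slice):
            # range() takes the normalized triple literally, even stop == -1
            return [self.get_one(i) for i in range(*key.indices(length))]
        if key < 0:
            key += length
        if not 0 <= key < length:
            raise IndexError('index out of range')
        return self.get_one(key)


class Squares(IndexedCollection):
    """Example subclass: the first n square numbers, computed on demand."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def get_one(self, index):
        return index * index


sq = Squares(6)
print(sq[1:5:2])
print(sq[::-2])
print(sq[-1])
```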


-- 
Email: zen19725 at zen dot co dot uk




Re: Bug in string.find; was: Re: Proposed PEP: New style indexing, was Re: Bug in slice type

2005-08-30 Thread Steve Holden
Antoon Pardon wrote:
 Op 2005-08-29, Steve Holden schreef [EMAIL PROTECTED]:
 
Antoon Pardon wrote:

Op 2005-08-27, Steve Holden schreef [EMAIL PROTECTED]:


If you want an exception from your code when 'w' isn't in the string you 
should consider using index() rather than find.


Sometimes it is convenient to have the exception thrown at a later
time.



Otherwise, whatever find() returns you will have to have an if in 
there to handle the not-found case.


And maybe the more convenient place for this if is in a whole different
part of your program, a part where using -1 as an invalid index isn't
at all obvious.



This just sounds like whining to me. If you want to catch errors, use a 
function that will raise an exception rather than relying on the 
invalidity of the result.


You always seem to look at such things in a very narrow scope. You never
seem to consider that various parts of a program have to work together.


Or perhaps it's just that I try not to mix parts inappropriately.
 
 
 I didn't know it was inappropriate to mix certain parts. Can you
 give a list of modules in the standard library I shouldn't mix?
 
 
So what happens if you have a module that is collecting string-index
pairs, collected from various other parts. In one part you
want to select the last letter, so you pythonically choose -1 as
index. In another part you get a result of find and are happy
with -1 as an indication of an invalid index. Then these
data meet.


That's when debugging has to start. Mixing data of such types is 
somewhat inadvisable, don't you agree?
 
 
 The type of both data is the same, it is a string-index pair in
 both cases. The problem is that a module from the standard lib
 uses a certain value to indicate an illegal index, that has
 a very legal value in python in general.
 
Since you are clearly feeling pedantic enough to beat this one to death 
with a 2 x 4 please let me substitute usages for types.

In the case of a find() result -1 *isn't* a string index, it's a failure 
flag. Which is precisely why it should be filtered out of any set of 
indexes. Once it's been inserted it can no longer be distinguished as a 
failure indication.

 
I suppose I can't deny that people do things like that, myself included, 
 
 
 It is not about what people do. If this was about someone implementing
 find himself and using -1 as an illegal index, I would certainly agree
 that it was inadvisable to do so. Yet when this is what python with
 its library offers the programmer, you seem reluctant to find fault with
 it.
 
I've already admitted that the choice of -1 as a return value wasn't 
smart. However you appear to be saying that it's sensible to mix return 
values from find() with general-case index values. I'm saying that you 
should do so only with caution. The fact that the naive user will often 
not have the wisdom to apply such caution is what makes a change desirable.

 
but mixing data sets where -1 is variously an error flag and a valid 
index is only going to lead to trouble when the combined data is used.
 
 
 Yet this is what python does. Using -1 variously as an error flag and
 a valid index and when  people complain about that, you say it sounds like
 whining.
 
What I am trying to say is that this doesn't make sense: if you want to 
combine find() results with general-case indexes (i.e. both positive and 
negative index values) it behooves you to strip out the -1's before you 
do so. Any other behaviour is asking for trouble.
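
A minimal sketch of stripping out the failure flag at the point where find() is called, so that only genuine indexes travel onward (the helper name find_positions is made up for this example):

```python
def find_positions(text, needles):
    """Return (needle, index) pairs, leaving out needles that were not found."""
    positions = []
    for needle in needles:
        pos = text.find(needle)
        if pos != -1:              # -1 is a failure flag here, not an index
            positions.append((needle, pos))
    return positions


pairs = find_positions('buggy', ['g', 'w', 'y'])
print(pairs)
```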

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/



Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-30 Thread Bengt Richter
On Tue, 30 Aug 2005 08:53:27 GMT, Bryan Olson [EMAIL PROTECTED] wrote:
[...]
Specification

 We propose a new style of slicing and indexing for Python
 sequences. Instead of:

 sequence[start : stop : step]

 new-style slicing uses the syntax:

 sequence[start ; stop ; step]

I don't mind the semantics, but I don't like the semicolons ;-)

What about if when brackets trail as if attributes, it means
your-style slicing written with colons instead of semicolons?

  sequence.[start : stop : step]

I think that would just be a tweak on the trailer syntax.
I just really dislike the semicolons ;-)

Regards,
Bengt Richter


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-30 Thread Bengt Richter
On Tue, 30 Aug 2005 11:56:24 GMT, Bryan Olson [EMAIL PROTECTED] wrote:

Robert Kern wrote:
  Bryan Olson wrote:
 
 
  Currently, user-defined classes can implement Python
  subscripting and slicing without implementing Python's len()
  function. In our proposal, the '$' symbol stands for the
  sequence's length, so classes must be able to report their
  length in order for $ to work within their slices and
  indexes.
 
  Specifically, to support new-style slicing, a class that
  accepts index or slice arguments to any of:
 
  __getitem__
  __setitem__
  __delitem__
  __getslice__
  __setslice__
  __delslice__
 
  must also consistently implement:
 
  __len__
 
  Sane programmers already follow this rule.
 
 
  Incorrect. Some sane programmers have multiple dimensions they need to
  index.
 
from Numeric import *
A = array([[0, 1], [2, 3], [4, 5]])
A[$-1, $-1]
 
  The result of len(A) has nothing to do with the second $.

I think you have a good observation there, but I'll stand by my
correctness.

My initial post considered re-interpreting tuple arguments, but
I abandoned that alternative after Steven Bethard pointed out
how much code it would break. Modules/classes would remain free
to interpret tuple arguments in any way they wish. I don't think
my proposal breaks any sane existing code.

Going forward, I would advocate that user classes which
implement their own kind of subscripting adopt the '$' syntax,
and interpret it as consistently as possible. For example, they
could respond to __len__() by returning a type that supports the
"Emulating numeric types" methods from the Python Language
Reference 3.3.7, and also allows the class's methods to tell
that it stands for the length of the dimension in question.


(OTTOMH ;-)
Perhaps the slice triple could be extended with a flag indicating
which of the other elements should have $ added to it, and $ would
take meaning from the subarray being indexed, not the whole. E.g.,

arr.[1:$-1, $-5:$-2]

would call arr.__getitem__((slice(1,-1,None,STOP), slice(-5,-2,None,START|STOP))

(Hypothesizing bitmask constants START and STOP)
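
A rough sketch of how such a flagged triple might be resolved, in current Python. The built-in slice takes no fourth argument, so a stand-in class is used; FlaggedSlice and resolve are made-up names, and out-of-range results are not handled here.

```python
START, STOP = 1, 2   # hypothetical bitmask constants, as above


class FlaggedSlice:
    """A slice triple plus a flag saying which bounds had '$' added."""

    def __init__(self, start, stop, step=None, from_end=0):
        self.start, self.stop, self.step = start, stop, step
        self.from_end = from_end

    def resolve(self, length):
        """Return an ordinary slice with '$' replaced by `length`."""
        start, stop = self.start, self.stop
        if self.from_end & START and start is not None:
            start += length
        if self.from_end & STOP and stop is not None:
            stop += length
        return slice(start, stop, self.step)


row = list(range(10))
first = FlaggedSlice(1, -1, None, STOP)             # stands for 1:$-1
second = FlaggedSlice(-5, -2, None, START | STOP)   # stands for $-5:$-2
print(row[first.resolve(len(row))])
print(row[second.resolve(len(row))])
```

For a multi-dimensional array, each FlaggedSlice would be resolved against the length of its own axis, which is the point of taking the meaning of $ from the subarray being indexed.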

Regards,
Bengt Richter


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-30 Thread Bengt Richter
On 30 Aug 2005 10:07:06 GMT, Antoon Pardon [EMAIL PROTECTED] wrote:

Op 2005-08-30, Terry Reedy schreef [EMAIL PROTECTED]:

 Paul Rubin http://phr.cx@NOSPAM.invalid wrote in message 
 news:[EMAIL PROTECTED]

 Really it's x[-1]'s behavior that should go, not find/rfind.

 I completely disagree, x[-1] as an abbreviation of x[len(x)-1] is extremely 
 useful, especially when 'x' is an expression instead of a name.

I don't think the ability to easily index sequences from the right is
in dispute. Just the fact that negative numbers on their own provide
this functionality.

Because I sometimes find it useful to have a sequence start and
end at arbitrary indexes, I have written a table class. So I
can have a table that is indexed from e.g. -4 to +6. So how am
I supposed to easily get at that last value?
Give it a handy property? E.g.,

table.as_python_list[-1]
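
A minimal sketch of a table with such a property, in current Python. The Table class below is a guess at the kind of class being described, not the actual one; only the property name as_python_list comes from the line above.

```python
class Table:
    """A sequence whose lowest index is `low` rather than 0 (hypothetical)."""

    def __init__(self, low, values):
        self.low = low
        self._values = list(values)

    @property
    def high(self):
        return self.low + len(self._values) - 1

    def __getitem__(self, index):
        # Negative numbers are ordinary indexes here, not offsets from the end.
        if not self.low <= index <= self.high:
            raise IndexError('index %d outside %d..%d'
                             % (index, self.low, self.high))
        return self._values[index - self.low]

    @property
    def as_python_list(self):
        # A copy, so list-style negative indexing is available on request.
        return list(self._values)


table = Table(-4, 'abcdefghijk')      # indexes -4 .. +6
print(table[-1])                      # the item at index -1, not the last one
print(table.as_python_list[-1])       # the last item
```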


Regards,
Bengt Richter


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-29 Thread Magnus Lycka
Robert Kern wrote:
 If I may digress for a bit, my advisor is currently working on a project
 that is processing seafloor depth datasets starting from a few decades
 ago. A lot of this data was originally to be processed using FORTRAN
 software, so in the idiom of much FORTRAN software from those days, 
 is often used to mark missing data. Unfortunately,  is a perfectly
 valid datum in most of the unit systems used by the various datasets.
 
 Now he has to find a grad student to trawl through the datasets and
 clean up the really invalid 's (as well as other such fun tasks like
 deciding if a dataset that says it's using feet is actually using meters).

I'm afraid this didn't end with FORTRAN. It's not that long ago
that I wrote a program for my wife that combined a data editor
with a graph display, so that she could clean up time lines with
length and weight data for children (from an international research
project performed during the 90's). 99cm is not unreasonable as a
length, but if you see it in a graph with other length measurements,
it's easy to spot most of the false ones, just as mistyped year part
in a date (common in the beginning of a new year).

Perhaps graphics can help this grad student too? It's certainly much
easier to spot deviations in curves than in an endless line of
numbers if the curves would normally be reasonably smooth.


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-29 Thread Antoon Pardon
Op 2005-08-27, Terry Reedy schreef [EMAIL PROTECTED]:

 Paul Rubin http://phr.cx@NOSPAM.invalid wrote in message 
 news:[EMAIL PROTECTED]
 Terry Reedy [EMAIL PROTECTED] writes:
 The try/except pattern is a pretty basic part of Python's design.  One
 could say the same about clutter for *every* function or method that 
 raises
 an exception on invalid input.  Should more or even all be duplicated? 
 Why
 just this one?

 Someone must have thought str.find was worth having, or else it
 wouldn't be in the library.

 Well, Guido no longer thinks it worth having and emphatically agreed that 
 it should be added to one of the 'To be removed' sections of PEP 3000.

I think a properly implemented find is better than an index.

If we only have index, then asking for permission is no longer a
possibility. If we have a find that returns None, we can either
ask permission before we index or be forgiven by the exception
that is raised.
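
A minimal sketch of both styles, assuming a find that returns None (the wrapper find_or_none is made up for this example; str.find itself still returns -1):

```python
def find_or_none(text, sub):
    """Like str.find, but report a miss as None and never as -1."""
    pos = text.find(sub)
    return None if pos == -1 else pos


s = 'buggy'
pos = find_or_none(s, 'w')

# Asking permission: test the result before indexing.
asked = s[pos] if pos is not None else '<not found>'

# Being forgiven: None is not a valid index, so the misuse raises TypeError.
try:
    forgiven = s[pos]
except TypeError:
    forgiven = '<not found>'

print(asked, forgiven)
```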

-- 
Antoon Pardon


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing, was Re: Bug in slice type

2005-08-29 Thread Antoon Pardon
Op 2005-08-27, Steve Holden schreef [EMAIL PROTECTED]:
 
 
 If you want an exception from your code when 'w' isn't in the string you 
 should consider using index() rather than find.

Sometimes it is convenient to have the exception thrown at a later
time.

 Otherwise, whatever find() returns you will have to have an if in 
 there to handle the not-found case.

And maybe the more convenient place for this if is in a whole different
part of your program, a part where using -1 as an invalid index isn't
at all obvious.

 This just sounds like whining to me. If you want to catch errors, use a 
 function that will raise an exception rather than relying on the 
 invalidity of the result.

You always seem to look at such things in a very narrow scope. You never
seem to consider that various parts of a program have to work together.

So what happens if you have a module that is collecting string-index
pairs, collected from various other parts. In one part you
want to select the last letter, so you pythonically choose -1 as
index. In another part you get a result of find and are happy
with -1 as an indication of an invalid index. Then these
data meet.

-- 
Antoon Pardon


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-29 Thread Robert Kern
Magnus Lycka wrote:
 Robert Kern wrote:
 
If I may digress for a bit, my advisor is currently working on a project
that is processing seafloor depth datasets starting from a few decades
ago. A lot of this data was originally to be processed using FORTRAN
software, so in the idiom of much FORTRAN software from those days, 
is often used to mark missing data. Unfortunately,  is a perfectly
valid datum in most of the unit systems used by the various datasets.

Now he has to find a grad student to trawl through the datasets and
clean up the really invalid 's (as well as other such fun tasks like
deciding if a dataset that says it's using feet is actually using meters).
 
 I'm afraid this didn't end with FORTRAN. It's not that long ago
 that I wrote a program for my wife that combined a data editor
 with a graph display, so that she could clean up time lines with
 length and weight data for children (from an international research
 project performed during the 90's). 99cm is not unreasonable as a
 length, but if you see it in a graph with other length measurements,
 it's easy to spot most of the false ones, just as mistyped year part
 in a date (common in the beginning of a new year).
 
 Perhaps graphics can help this grad student too? It's certainly much
 easier to spot deviations in curves than in an endless line of
 numbers if the curves would normally be reasonably smooth.

Yes! In fact, that was the context of the discussion when my advisor
told me about this project. Another student had written an interactive
GUI for exploring bathymetry maps. My advisor: That kind of thing would
be really great for this new project, etc. etc.

-- 
Robert Kern
[EMAIL PROTECTED]

In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die.
  -- Richard Harter



Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-29 Thread Steven Bethard
Antoon Pardon wrote:
 I think a properly implemented find is better than an index.

See the current thread in python-dev[1], which proposes a new method, 
str.partition().  I believe that Raymond Hettinger has shown that almost 
all uses of str.find() can be more clearly represented with his 
proposed function.
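
For illustration, a minimal sketch comparing the two styles on a simple "split at the first colon" task. It uses str.partition() as it exists in later Python releases; the two helper names are made up for this example.

```python
def split_with_find(line):
    """Split at the first ':' using find(); the -1 case needs its own branch."""
    pos = line.find(':')
    if pos == -1:
        return line, ''
    return line[:pos], line[pos + 1:]


def split_with_partition(line):
    """Same task with partition(); a missing separator yields empty strings."""
    head, sep, tail = line.partition(':')
    return head, tail


for text in ('Subject: hello', 'no separator here'):
    print(split_with_find(text), split_with_partition(text))
```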

STeVe

[1]http://mail.python.org/pipermail/python-dev/2005-August/055781.html


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing, was Re: Bug in slice type

2005-08-29 Thread Steve Holden
Antoon Pardon wrote:
 Op 2005-08-27, Steve Holden schreef [EMAIL PROTECTED]:
 

If you want an exception from your code when 'w' isn't in the string you 
should consider using index() rather than find.
 
 
 Sometimes it is convenient to have the exception thrown at a later
 time.
 
 
Otherwise, whatever find() returns you will have to have an if in 
there to handle the not-found case.
 
 
 And maybe the more convenient place for this if is in a whole different
 part of your program, a part where using -1 as an invalid index isn't
 at all obvious.
 
 
This just sounds like whining to me. If you want to catch errors, use a 
function that will raise an exception rather than relying on the 
invalidity of the result.
 
 
 You always seem to look at such things in a very narrow scope. You never
 seem to consider that various parts of a program have to work together.
 
Or perhaps it's just that I try not to mix parts inappropriately.

 So what happens if you have a module that is collecting string-index
 pairs, collected from various other parts. In one part you
 want to select the last letter, so you pythonically choose -1 as
 index. In another part you get a result of find and are happy
 with -1 as an indication of an invalid index. Then these
 data meet.
 
That's when debugging has to start. Mixing data of such types is 
somewhat inadvisable, don't you agree?

I suppose I can't deny that people do things like that, myself included, 
but mixing data sets where -1 is variously an error flag and a valid 
index is only going to lead to trouble when the combined data is used.

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/



Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-29 Thread Terry Reedy

Steve Holden [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Antoon Pardon wrote:
 So what happens if you have a module that is collecting string-index
 pairs, collected from various other parts. In one part you
 want to select the last letter, so you pythonically choose -1 as
 index. In another part you get a result of find and are happy
 with -1 as an indication of an invalid index. Then these
 data meet.

 That's when debugging has to start. Mixing data of such types is
 somewhat inadvisable, don't you agree?

 I suppose I can't deny that people do things like that, myself included,
 but mixing data sets where -1 is variously an error flag and a valid
 index is only going to lead to trouble when the combined data is used.

The fact that the -1 return *has* led to bugs in actual code is the 
primary reason Guido has currently decided that find and rfind should go. 
A careful review of current usages in the standard library revealed at 
least a couple bugs even there.

Terry J. Reedy





Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-28 Thread Bryan Olson
Steve Holden wrote:
  Paul Rubin wrote:
  We are arguing about trivialities here. Let's stop before it gets
  interesting :-)

Some of us are looking beyond the trivia of what string.find()
should return, at an unfortunate interaction of Python features,
brought on by the special-casing of negative indexes. The wart
bites novice or imperfect Python programmers in simple cases
such as string.find(), or when their subscripts accidentally
fall off the low end. It bites programmers who want to fully
implement Python slicing, because of the double-and-
contradictory- interpretation of -1, as both an exclusive ending
bound and the index of the last element. It bites documentation
authors who naturally think of the non-negative subscript as
*the* index of a sequence item.


-- 
--Bryan


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-28 Thread Steve Holden
Bryan Olson wrote:
 Steve Holden wrote:
   Paul Rubin wrote:
   We are arguing about trivialities here. Let's stop before it gets
   interesting :-)
 
 Some of us are looking beyond the trivia of what string.find()
 should return, at an unfortunate interaction of Python features,
 brought on by the special-casing of negative indexes. The wart
 bites novice or imperfect Python programmers in simple cases
 such as string.find(), or when their subscripts accidentally
 fall off the low end. It bites programmers who want to fully
 implement Python slicing, because of the double-and-
 contradictory- interpretation of -1, as both an exclusive ending
 bound and the index of the last element. It bites documentation
 authors who naturally think of the non-negative subscript as
 *the* index of a sequence item.
 
 
Sure. I wrote two days ago:

 We might agree, before further discussion, that this isn't the most 
 elegant part of Python's design, and it's down to history that this tiny 
 little wart remains.

While I agree it's a trap for the unwary I still don't regard it as a 
major wart. But I'm all in favor of discussions to make 3.0 a better 
language.

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/



Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-27 Thread Terry Reedy

Paul Rubin http://phr.cx@NOSPAM.invalid wrote in message 
news:[EMAIL PROTECTED]
 Steve Holden [EMAIL PROTECTED] writes:
 Of course. But once you (sensibly) decide to use an if then there
 really isn't much difference between -1, None, () and sys.maxint as
 a sentinel value, is there?

 Of course there is.  -1 is (under Python's perverse semantics) a valid
 subscript.  sys.maxint is an artifact of Python's fixed-size int
 datatype, which is fading away under int/long unification, so it's
 something that soon won't exist and shouldn't be used.  None and ()
 are invalid subscripts so would be reasonable return values, unlike -1
 and sys.maxint.  Of those, None is preferable to () because of its
 semantic connotations.

I agree here that None is importantly different from -1 for the reason 
stated.  The use of -1 is, I am sure, a holdover from statically typed 
languages (C, in particular) that require all return values to be of the 
same type, even if the 'return value' is actually meant to indicate that 
there is no valid return value.

Terry J. Reedy





Re: Bug in string.find; was: Re: Proposed PEP: New style indexing, was Re: Bug in slice type

2005-08-27 Thread Bryan Olson
Steve Holden wrote:
  Bryan Olson wrote:
  [...] I see no good reason for the following
  to happily print 'y'.
 
   s = 'buggy'
   print s[s.find('w')]
 
  Before using the result you always have to perform
  a test to discriminate between the found and not found cases. So I don't
  really see why this wart has put such a bug up your ass.
 
  The bug that got me was what a slice object reports as the
  'stop' bound when the step is negative and the slice includes
  index 0. Took me hours to figure out why my code was failing.
 
  The double-meaning of -1, as both an exclusive stopping bound
  and an alias for the highest valid index, is just plain whacked.
  Unfortunately, as negative indexes are currently handled, there
  is no it-just-works value that slice could return.
 
 
  If you want an exception from your code when 'w' isn't in the string you
  should consider using index() rather than find.

That misses the point. The code is a hypothetical example of
what a novice or imperfect Pythoner might have to deal with.
The exception isn't really wanted; it's just vastly superior to
silently returning a nonsensical value.


  Otherwise, whatever find() returns you will have to have an if in
  there to handle the not-found case.
 
  This just sounds like whining to me. If you want to catch errors, use a
  function that will raise an exception rather than relying on the
  invalidity of the result.

I suppose if you ignore the real problems and the proposed
solution, it might sound a lot like whining.


-- 
--Bryan


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing, was Re: Bug in slice type

2005-08-27 Thread Steve Holden
Paul Rubin wrote:
 Steve Holden [EMAIL PROTECTED] writes:
 
If you want an exception from your code when 'w' isn't in the string
you should consider using index() rather than find.
 
 
 The idea is you expect w to be in the string.  If w isn't in the
 string, your code has a bug, and programs with bugs should fail as
 early as possible so you can locate the bugs quickly and easily.  That
 is why, for example, 
 
   x = 'buggy'[None]
 
 raises an exception instead of doing something stupid like returning 'g'.

You did read the sentence you were replying to, didn't you?

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/



Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-27 Thread Steve Holden
Terry Reedy wrote:
 Paul Rubin http://phr.cx@NOSPAM.invalid wrote in message 
 news:[EMAIL PROTECTED]
 
Steve Holden [EMAIL PROTECTED] writes:

Of course. But once you (sensibly) decide to use an if then there
really isn't much difference between -1, None, () and sys.maxint as
a sentinel value, is there?

Of course there is.  -1 is (under Python's perverse semantics) a valid
subscript.  sys.maxint is an artifact of Python's fixed-size int
datatype, which is fading away under int/long unification, so it's
something that soon won't exist and shouldn't be used.  None and ()
are invalid subscripts so would be reasonable return values, unlike -1
and sys.maxint.  Of those, None is preferable to () because of its
semantic connotations.
 
 
 I agree here that None is importantly different from -1 for the reason 
 stated.  The use of -1 is, I am sure, a holdover from statically typed 
 languages (C, in particular) that require all return values to be of the 
 same type, even if the 'return value' is actually meant to indicate that 
 there is no valid return value.

While I agree that it would have been more sensible to choose None in 
find()'s original design, there's really no reason to go breaking 
existing code just to fix it.

Guido has already agreed that find() can change (or even disappear) in 
Python 3.0, so please let's just leave things as they are for now.

A corrected find() that returns None on failure is a five-liner.
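
The five-liner itself was not posted, so the following is only one possible reading of it, written for Python 3. The name `find_or_none` and its signature are assumptions made for illustration.

```python
def find_or_none(s, sub, start=0, end=None):
    """Like str.find, but return None instead of -1 when sub is absent."""
    if end is None:
        end = len(s)
    pos = s.find(sub, start, end)
    return None if pos == -1 else pos


print(find_or_none("buggy", "g"))  # position of the first 'g'
print(find_or_none("buggy", "w"))  # None, which cannot be used as a subscript
```

Because None is not a valid subscript, a forgotten not-found check then fails with a TypeError instead of silently selecting the last character.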

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/



Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-27 Thread Paul Rubin
Steve Holden [EMAIL PROTECTED] writes:
 A corrected find() that returns None on failure is a five-liner.

If I wanted to write five lines instead of one everywhere in a Python
program, I'd use Java.


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-27 Thread skip

Paul Steve Holden [EMAIL PROTECTED] writes:
 A corrected find() that returns None on failure is a five-liner.

Paul If I wanted to write five lines instead of one everywhere in a
Paul Python program, I'd use Java.

+1 for QOTW.

Skip



Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-27 Thread Steve Holden
Paul Rubin wrote:
 Steve Holden [EMAIL PROTECTED] writes:
 
A corrected find() that returns None on failure is a five-liner.
 
 
 If I wanted to write five lines instead of one everywhere in a Python
 program, I'd use Java.

We are arguing about trivialities here. Let's stop before it gets 
interesting :-)

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/



Re: Bug in string.find; was: Re: Proposed PEP: New style indexing, was Re: Bug in slice type

2005-08-26 Thread Antoon Pardon
Op 2005-08-25, Bryan Olson schreef [EMAIL PROTECTED]:
 Steve Holden asked:
  Do you just go round looking for trouble?

 In the course of programming, yes, absolutely.

  As far as position reporting goes, it seems pretty clear that find()
  will always report positive index values. In a five-character string
  then -1 and 4 are effectively equivalent.
 
  What on earth makes you call this a bug?

 What you just said, versus what the doc says.

  And what are you proposing that
  find() should return if the substring isn't found at all? please don't
  suggest it should raise an exception, as index() exists to provide that
  functionality.

 There are a number of good options. A legal index is not one of
 them.

IMO, with find a number of features of Python come together
that create an awkward situation.

1) 0 is a false value, but indexes start at 0 so you can't
   return 0 to indicate nothing was found.

2) -1 is returned, which is both a true value and a legal
   index.


It probably is too late now, but I always felt, find should
have returned None when the substring isn't found.
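
The two points above can be sketched in a few lines of Python 3. The variable names are only illustrative.

```python
s = "hello"

at_zero = bool(s.find("h"))   # match at index 0, which is a false value
missing = bool(s.find("z"))   # no match gives -1, which is a true value
print(at_zero, missing)

# Tests that do not depend on truthiness:
found = s.find("z") != -1
contained = "z" in s
print(found, contained)
```

So `if s.find(x):` gets both cases backwards for a match at the start of the string and for a miss.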

-- 
Antoon Pardon


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing, was Re: Bug in slice type

2005-08-26 Thread Bryan Olson
Antoon Pardon wrote:
  Bryan Olson schreef:
 
 Steve Holden asked:
 And what are you proposing that
 find() should return if the substring isn't found at all? please don't
 suggest it should raise an exception, as index() exists to provide that
 functionality.
 
 There are a number of good options. A legal index is not one of
 them.
 
  IMO, with find a number of features of Python come together
  that create an awkward situation.
 
  1) 0 is a false value, but indexes start at 0 so you can't
 return 0 to indicate nothing was found.
 
  2) -1 is returned, which is both a true value and a legal
 index.
 
  It probably is too late now, but I always felt, find should
  have returned None when the substring isn't found.

None is certainly a reasonable candidate. The one-past-the-end
value, len(sequence), would be fine, and follows the preferred
idiom of C/C++. I don't see any elegant way to arrange for
successful finds always to return a true value and unsuccessful
calls to return a false value.

The really broken part is that unsuccessful searches return a
legal index.
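
A rough Python 3 sketch of the one-past-the-end option mentioned above. The helper `find_end` is hypothetical, not an existing string method.

```python
def find_end(s, sub):
    """Hypothetical variant of find: return len(s) when sub is absent."""
    pos = s.find(sub)
    return len(s) if pos == -1 else pos


s = "buggy"
i = find_end(s, "w")

try:
    s[i]
    caught = False
except IndexError:
    caught = True        # len(s) is not a legal index, so misuse is loud

head = s[:i]             # as a slice bound it still behaves sensibly
print(i, caught, head)
```

Used as a subscript the failure value raises immediately, while as a slice bound it simply means "up to the end".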

My suggestion doesn't change what find() returns, and doesn't
break code. Negative one is a reasonable choice to represent an
unsuccessful search -- provided it is not a legal index. Instead
of changing what find() returns, we should heal the
special-case-when-index-is-negative-in-a-certain-range wart.


-- 
--Bryan


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing, was Re: Bug in slice type

2005-08-26 Thread Rick Wotnaz
Bryan Olson [EMAIL PROTECTED] wrote in
news:[EMAIL PROTECTED]: 

 Steve Holden asked:
  Do you just go round looking for trouble?
 
 In the course of programming, yes, absolutely.
 
  As far as position reporting goes, it seems pretty clear that
  find() will always report positive index values. In a
  five-character string then -1 and 4 are effectively
  equivalent. 
 
  What on earth makes you call this a bug?
 
 What you just said, versus what the doc says.
 
  And what are you proposing that
  find() should return if the substring isn't found at all?
  please don't suggest it should raise an exception, as index()
  exists to provide that functionality.
 
 There are a number of good options. A legal index is not one of
 them.
 
 

Practically speaking, what difference would it make? Supposing find 
returned None for not-found. How would you use it in your code that 
would make it superior to what happens now? In either case you 
would have to test for the not-found state before relying on the 
index returned, wouldn't you? Or do you have a use that would 
eliminate that step?

-- 
rzed


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing, was Re: Bug in slice type

2005-08-26 Thread Steve Holden
Bryan Olson wrote:
 Antoon Pardon wrote:
   Bryan Olson schreef:
  
  Steve Holden asked:
  And what are you proposing that
  find() should return if the substring isn't found at all? please don't
  suggest it should raise an exception, as index() exists to provide that
  functionality.
  
  There are a number of good options. A legal index is not one of
  them.
  
   IMO, with find a number of features of Python come together
   that create an awkward situation.
  
   1) 0 is a false value, but indexes start at 0 so you can't
  return 0 to indicate nothing was found.
  
   2) -1 is returned, which is both a true value and a legal
  index.
  
   It probably is too late now, but I always felt, find should
   have returned None when the substring isn't found.
 
 None is certainly a reasonable candidate. The one-past-the-end
 value, len(sequence), would be fine, and follows the preferred
 idiom of C/C++. I don't see any elegant way to arrange for
 successful finds always to return a true value and unsuccessful
 calls to return a false value.
 
 The really broken part is that unsuccessful searches return a
 legal index.
 
We might agree, before further discussion, that this isn't the most 
elegant part of Python's design, and it's down to history that this tiny 
little wart remains.

 My suggestion doesn't change what find() returns, and doesn't
 break code. Negative one is a reasonable choice to represent an
 unsuccessful search -- provided it is not a legal index. Instead
 of changing what find() returns, we should heal the
 special-case-when-index-is-negative-in-a-certain-range wart.
 
 
What I don't understand is why you want it to return something that 
isn't a legal index. Before using the result you always have to perform 
a test to discriminate between the found and not found cases. So I don't 
really see why this wart has put such a bug up your ass.

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/



Re: Bug in string.find; was: Re: Proposed PEP: New style indexing, was Re: Bug in slice type

2005-08-26 Thread Bryan Olson
Steve Holden wrote:
  Bryan Olson wrote:
  Antoon Pardon wrote:

It probably is too late now, but I always felt, find should
have returned None when the substring isn't found.
 
  None is certainly a reasonable candidate.
[...]
  The really broken part is that unsuccessful searches return a
  legal index.
 
  We might agree, before further discussion, that this isn't the most
  elegant part of Python's design, and it's down to history that this tiny
  little wart remains.

I don't think my proposal breaks historic Python code, and I
don't think it has the same kind of unfortunate subtle
consequences as the current indexing scheme. You may think the
wart is tiny, but the duct-tape* is available so let's cure it.

[*] http://www.google.com/search?as_q=warts+%22duct+tape%22


  My suggestion doesn't change what find() returns, and doesn't
  break code. Negative one is a reasonable choice to represent an
  unsuccessful search -- provided it is not a legal index. Instead
  of changing what find() returns, we should heal the
  special-case-when-index-is-negative-in-a-certain-range wart.
 
 
  What I don't understand is why you want it to return something that
  isn't a legal index.

In this case, so that errors are caught as close to their
occurrence as possible. I see no good reason for the following
to happily print 'y'.

 s = 'buggy'
 print s[s.find('w')]
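
The same example as a Python 3 sketch, set next to index() for comparison. The names `silent` and `raised` are only there to make the two outcomes visible.

```python
s = "buggy"
silent = s[s.find("w")]     # find() returns -1, so this picks the last character

try:
    s[s.index("w")]
    raised = False
except ValueError:
    raised = True           # index() refuses to produce an index at all

print(silent, raised)
```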

  Before using the result you always have to perform
  a test to discriminate between the found and not found cases. So I don't
  really see why this wart has put such a bug up your ass.

The bug that got me was what a slice object reports as the
'stop' bound when the step is negative and the slice includes
index 0. Took me hours to figure out why my code was failing.

The double-meaning of -1, as both an exclusive stopping bound
and an alias for the highest valid index, is just plain whacked.
Unfortunately, as negative indexes are currently handled, there
is no it-just-works value that slice could return.
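
A small Python 3 sketch of what appears to be the slice behaviour described here, using the real `slice.indices` method. The sample list and variable names are assumptions for illustration.

```python
seq = list("abcde")
rev = slice(None, None, -1)                  # the slice behind seq[::-1]

start, stop, step = rev.indices(len(seq))    # stop is reported as -1
via_range = [seq[i] for i in range(start, stop, step)]
via_slice = seq[start:stop:step]             # here -1 is read as "last element"

print((start, stop, step))
print(via_range)
print(via_slice)
```

As a range bound the reported stop means "just before index 0", but fed back into a subscript the same -1 means the last element, so the second form comes out empty.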


-- 
--Bryan


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing, was Re: Bug in slice type

2005-08-26 Thread Reinhold Birkenfeld
Bryan Olson wrote:
 Steve Holden wrote:
   Bryan Olson wrote:
   Antoon Pardon wrote:
 
 It probably is too late now, but I always felt, find should
 have returned None when the substring isn't found.
  
   None is certainly a reasonable candidate.
 [...]
   The really broken part is that unsuccessful searches return a
   legal index.
  
   We might agree, before further discussion, that this isn't the most
   elegant part of Python's design, and it's down to history that this tiny
   little wart remains.
 
 I don't think my proposal breaks historic Python code, and I
 don't think it has the same kind of unfortunate subtle
 consequences as the current indexing scheme. You may think the
 wart is tiny, but the duct-tape* is available so let's cure it.
 
 [*] http://www.google.com/search?as_q=warts+%22duct+tape%22

Well, nobody stops you from posting this on python-dev and being screamed
at by Guido...

just-kidding-ly
Reinhold


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-26 Thread Terry Reedy

Bryan Olson [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 The double-meaning of -1, as both an exclusive stopping bound
 and an alias for the highest valid index, is just plain whacked.

I agree in this sense: the use of any int as an error return is an 
unPythonic *nix-Cism, which I believe was copied therefrom.  Str.find is 
redundant with the Pythonic exception-raising str.index and I think it 
should be removed in Py3.

Therefore, I think changing it now is untimely and changing the language 
because of it backwards.

Terry J. Reedy





Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-26 Thread Paul Rubin
Terry Reedy [EMAIL PROTECTED] writes:
 I agree in this sense: the use of any int as an error return is an 
 unPythonic *nix-Cism, which I believe was copied therefrom.  Str.find is 
 redundant with the Pythonic exception-raising str.index and I think it 
 should be removed in Py3.

I like having it available so you don't have to clutter your code with
try/except if the substring isn't there.  But it should not return a
valid integer index.  


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-26 Thread Terry Reedy

Paul Rubin http://phr.cx@NOSPAM.invalid wrote in message 
news:[EMAIL PROTECTED]
 Terry Reedy [EMAIL PROTECTED] writes:
Str.find is
 redundant with the Pythonic exception-raising str.index
 and I think it should be removed in Py3.

 I like having it available so you don't have to clutter your code with
 try/except if the substring isn't there.  But it should not return a
 valid integer index.

The try/except pattern is a pretty basic part of Python's design.  One 
could say the same about clutter for *every* function or method that raises 
an exception on invalid input.  Should more or even all be duplicated?  Why 
just this one?




Terry J. Reedy





Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-26 Thread Torsten Bronger
Hallöchen!

Terry Reedy [EMAIL PROTECTED] writes:

 Paul Rubin http://phr.cx@NOSPAM.invalid wrote in message 
 news:[EMAIL PROTECTED]

 Terry Reedy [EMAIL PROTECTED] writes:

 Str.find is redundant with the Pythonic exception-raising
 str.index and I think it should be removed in Py3.

 I like having it available so you don't have to clutter your code
 with try/except if the substring isn't there.  But it should not
 return a valid integer index.

 The try/except pattern is a pretty basic part of Python's design.
 One could say the same about clutter for *every* function or
 method that raises an exception on invalid input.  Should more or
 even all be duplicated?  Why just this one?

Granted, try/except can be used for deliberate case discrimination
(which may even happen in the standard library in many places),
however, it is only the second most elegant method -- the most
elegant being if.  Where if does the job, it should be preferred
in my opinion.
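
To make the comparison concrete, here is a Python 3 sketch of the same task written both ways. The function names and the colon-splitting task are made up for illustration.

```python
def after_colon_if(line):
    """Case discrimination with if, using find()."""
    pos = line.find(":")
    if pos != -1:
        return line[pos + 1:]
    return None


def after_colon_try(line):
    """The same logic with try/except, using index()."""
    try:
        return line[line.index(":") + 1:]
    except ValueError:
        return None


for sample in ("key:value", "no separator"):
    print(after_colon_if(sample), after_colon_try(sample))
```

Both versions behave identically; the difference is only which form reads better at the call site.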

Tschö,
Torsten.

-- 
Torsten Bronger, aquisgrana, europa vetusICQ 264-296-646

Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-26 Thread Paul Rubin
Terry Reedy [EMAIL PROTECTED] writes:
 The try/except pattern is a pretty basic part of Python's design.  One 
 could say the same about clutter for *every* function or method that raises 
 an exception on invalid input.  Should more or even all be duplicated?  Why 
 just this one?

Someone must have thought str.find was worth having, or else it
wouldn't be in the library.


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-26 Thread Terry Reedy

Paul Rubin http://phr.cx@NOSPAM.invalid wrote in message 
news:[EMAIL PROTECTED]
 Terry Reedy [EMAIL PROTECTED] writes:
 The try/except pattern is a pretty basic part of Python's design.  One
 could say the same about clutter for *every* function or method that raises
 an exception on invalid input.  Should more or even all be duplicated?  Why
 just this one?

 Someone must have thought str.find was worth having, or else it
 wouldn't be in the library.

Well, Guido no longer thinks it worth having and emphatically agreed that 
it should be added to one of the 'To be removed' sections of PEP 3000.

Terry J. Reedy





Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-26 Thread Steve Holden
Torsten Bronger wrote:
 Hallöchen!
 
 Terry Reedy [EMAIL PROTECTED] writes:
 
 
Paul Rubin http://phr.cx@NOSPAM.invalid wrote in message 
news:[EMAIL PROTECTED]


Terry Reedy [EMAIL PROTECTED] writes:


Str.find is redundant with the Pythonic exception-raising
str.index and I think it should be removed in Py3.

I like having it available so you don't have to clutter your code
with try/except if the substring isn't there.  But it should not
return a valid integer index.

The try/except pattern is a pretty basic part of Python's design.
One could say the same about clutter for *every* function or
method that raises an exception on invalid input.  Should more or
even all be duplicated?  Why just this one?
 
 
 Granted, try/except can be used for deliberate case discrimination
 (which may even happen in the standard library in many places),
 however, it is only the second most elegant method -- the most
 elegant being if.  Where if does the job, it should be preferred
 in my opinion.
 
Of course. But once you (sensibly) decide to use an if then there 
really isn't much difference between -1, None, () and sys.maxint as
a sentinel value, is there?

Which is what I've been trying to say all along.

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing, was Re: Bug in slice type

2005-08-26 Thread Steve Holden
Bryan Olson wrote:
 Steve Holden wrote:
   Bryan Olson wrote:
   Antoon Pardon wrote:
 
 It probably is too late now, but I always felt, find should
 have returned None when the substring isn't found.
  
   None is certainly a reasonable candidate.
 [...]
   The really broken part is that unsuccessful searches return a
   legal index.
  
   We might agree, before further discussion, that this isn't the most
   elegant part of Python's design, and it's down to history that this tiny
   little wart remains.
 
 I don't think my proposal breaks historic Python code, and I
 don't think it has the same kind of unfortunate subtle
 consequences as the current indexing scheme. You may think the
 wart is tiny, but the duct-tape* is available so let's cure it.
 
 [*] http://www.google.com/search?as_q=warts+%22duct+tape%22
 
 
   My suggestion doesn't change what find() returns, and doesn't
   break code. Negative one is a reasonable choice to represent an
   unsuccessful search -- provided it is not a legal index. Instead
   of changing what find() returns, we should heal the
   special-case-when-index-is-negative-in-a-certain-range wart.
  
  
   What I don't understand is why you want it to return something that
   isn't a legal index.
 
 In this case, so that errors are caught as close to their
 occurrence as possible. I see no good reason for the following
 to happily print 'y'.
 
  s = 'buggy'
  print s[s.find('w')]
 
   Before using the result you always have to perform
   a test to discriminate between the found and not found cases. So I don't
   really see why this wart has put such a bug up your ass.
 
 The bug that got me was what a slice object reports as the
 'stop' bound when the step is negative and the slice includes
 index 0. Took me hours to figure out why my code was failing.
 
 The double-meaning of -1, as both an exclusive stopping bound
 and an alias for the highest valid index, is just plain whacked.
 Unfortunately, as negative indexes are currently handled, there
 is no it-just-works value that slice could return.
 
 
If you want an exception from your code when 'w' isn't in the string you 
should consider using index() rather than find.

Otherwise, whatever find() returns you will have to have an if in 
there to handle the not-found case.

This just sounds like whining to me. If you want to catch errors, use a 
function that will raise an exception rather than relying on the 
invalidity of the result.

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/



Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-26 Thread Robert Kern
Steve Holden wrote:

 Of course. But once you (sensibly) decide to use an if then there 
 really isn't much difference between -1, None, () and sys.maxint as
 a sentinel value, is there?

Sure there is. -1 is a valid index; None is not. -1 as a sentinel is
specific to str.find(); None is used all over Python as a sentinel.

If I may digress for a bit, my advisor is currently working on a project
that is processing seafloor depth datasets starting from a few decades
ago. A lot of this data was originally to be processed using FORTRAN
software, so in the idiom of much FORTRAN software from those days, 
is often used to mark missing data. Unfortunately,  is a perfectly
valid datum in most of the unit systems used by the various datasets.

Now he has to find a grad student to trawl through the datasets and
clean up the really invalid 's (as well as other such fun tasks like
deciding if a dataset that says it's using feet is actually using meters).

I have already called Not It.

-- 
Robert Kern
[EMAIL PROTECTED]

In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die.
  -- Richard Harter



Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-26 Thread Paul Rubin
Steve Holden [EMAIL PROTECTED] writes:
 Of course. But once you (sensibly) decide to use an if then there
 really isn't much difference between -1, None, () and sys.maxint as
 a sentinel value, is there?

Of course there is.  -1 is (under Python's perverse semantics) a valid
subscript.  sys.maxint is an artifact of Python's fixed-size int
datatype, which is fading away under int/long unification, so it's
something that soon won't exist and shouldn't be used.  None and ()
are invalid subscripts so would be reasonable return values, unlike -1
and sys.maxint.  Of those, None is preferable to () because of its
semantic connotations.  


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing, was Re: Bug in slice type

2005-08-26 Thread Paul Rubin
Steve Holden [EMAIL PROTECTED] writes:
 If you want an exception from your code when 'w' isn't in the string
 you should consider using index() rather than find.

The idea is you expect w to be in the string.  If w isn't in the
string, your code has a bug, and programs with bugs should fail as
early as possible so you can locate the bugs quickly and easily.  That
is why, for example, 

  x = 'buggy'[None]

raises an exception instead of doing something stupid like returning 'g'.
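
A short Python 3 sketch of the contrast being drawn here; the variable names are illustrative only.

```python
word = "buggy"
last = word[-1]          # -1 is accepted and means the final character

try:
    word[None]
    rejected = False
except TypeError:
    rejected = True      # None is never a valid subscript for a str

print(last, rejected)
```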


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing,was Re: Bug in slice type

2005-08-25 Thread en.karpachov
On Thu, 25 Aug 2005 00:05:18 -0400
Steve Holden wrote:

 What on earth makes you call this a bug? And what are you proposing that 
 find() should return if the substring isn't found at all? please don't 
 suggest it should raise an exception, as index() exists to provide that 
 functionality.

Returning -1 looks like a C-ism to me. It would be better to return None when
nothing is found.

index = "Hello".find("z")
if index is not None:
    # ...

Now it's too late for it, I know.

-- 
jk


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing, was Re: Bug in slice type

2005-08-25 Thread Paul Rubin
Steve Holden [EMAIL PROTECTED] writes:
 As far as position reporting goes, it seems pretty clear that find()
 will always report positive index values. In a five-character string
 then -1 and 4 are effectively equivalent.
 
 What on earth makes you call this a bug? And what are you proposing
 that find() should return if the substring isn't found at all? please
 don't suggest it should raise an exception, as index() exists to
 provide that functionality.

Bryan is making the case that Python's use of negative subscripts to
measure from the end of sequences is bogus, and that it should be done
some other way instead.  I've certainly had bugs in my own programs
related to that feature.  
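
The post does not say which bugs these were. One common kind, sketched here
with hypothetical function names and Python 3 syntax, is index arithmetic
that steps off the front of a sequence and wraps around to the end instead
of failing.

```python
def char_before(text, sub):
    """Return the character just before the first occurrence of sub.

    Hypothetical example of the bug: when sub starts at position 0,
    pos - 1 is -1 and the last character is returned silently.
    """
    pos = text.index(sub)
    return text[pos - 1]


def char_before_checked(text, sub):
    """Same lookup, but refuse to wrap around to the end of the string."""
    pos = text.index(sub)
    if pos == 0:
        raise IndexError("no character before position 0")
    return text[pos - 1]


print(char_before("Hello", "e"))  # the intended case
print(char_before("Hello", "H"))  # wraps around without any error
```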


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing, was Re: Bug in slice type

2005-08-25 Thread Bryan Olson
Steve Holden asked:
  Do you just go round looking for trouble?

In the course of programming, yes, absolutely.

  As far as position reporting goes, it seems pretty clear that find()
  will always report positive index values. In a five-character string
  then -1 and 4 are effectively equivalent.
 
  What on earth makes you call this a bug?

What you just said, versus what the doc says.

  And what are you proposing that
  find() should return if the substring isn't found at all? please don't
  suggest it should raise an exception, as index() exists to provide that
  functionality.

There are a number of good options. A legal index is not one of
them.


-- 
--Bryan


Re: Bug in string.find; was: Re: Proposed PEP: New style indexing, was Re: Bug in slice type

2005-08-24 Thread Steve Holden
Bryan Olson wrote:
 The doc for the find() method of string objects, which is
 essentially the same as the string.find() function, states:
 
  find(sub[, start[, end]])
Return the lowest index in the string where substring sub
is found, such that sub is contained in the range [start,
end). Optional arguments start and end are interpreted as
in slice notation. Return -1 if sub is not found.
 
 Consider:
 
  print 'Hello'.find('o')
 
 or:
 
  import string
  print string.find('Hello', 'o')
 
 The substring 'o' is found in 'Hello' at the index -1, and at
 the index 4, and it is not found at any other index. Both the
 locations found are in the range [start, end), and obviously -1
 is less than 4, so according to the documentation, find() should
 return -1.
 
 What either of the above actually prints is:
 
  4
 
 which shows yet another bug resulting from Python's handling of
 negative indexes. This one is clearly a documentation error, but
 the real fix is to cure the wart so that Python's behavior is
 consistent enough that we'll be able to describe it correctly.
 
 
Do you just go round looking for trouble?

As far as position reporting goes, it seems pretty clear that find() 
will always report positive index values. In a five-character string 
then -1 and 4 are effectively equivalent.
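
A short check of the behaviour under discussion, using Python 3 syntax (an
assumption; the quoted examples use the Python 2 print statement). The
variable names are illustrative only.

```python
s = "Hello"

pos = s.find("o")             # position reported for a match
same_char = s[-1] == s[4]     # both subscripts name the same character
normalized = -1 + len(s)      # how a negative index maps onto a non-negative one

# Negative start/end arguments are interpreted as in slice notation,
# but the reported position is still non-negative.
pos_from_end = s.find("o", -1)

miss = s.find("z")            # -1 here means "not found", not "last character"

print(pos, same_char, normalized, pos_from_end, miss)
```

So -1 carries two meanings: as a subscript it names the last character, and
as a return value from find() it means no match.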

What on earth makes you call this a bug? And what are you proposing that 
find() should return if the substring isn't found at all? please don't 
suggest it should raise an exception, as index() exists to provide that 
functionality.

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/



Re: Bug in string.find; was: Re: Proposed PEP: New style indexing, was Re: Bug in slice type

2005-08-24 Thread Casey Hawthorne
contained in the range [start, end)

Does range(start, end) generate negative integers in Python if start
= 0 and end = start?
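
Reading the question as being about a non-negative start and an end no
smaller than start (an assumption; the comparison operators in the post as
archived are ambiguous), this can be tried out directly. The sketch uses
Python 3, where range() returns a lazy sequence, and has_negative is a
hypothetical helper.

```python
def has_negative(start, end):
    """Report whether range(start, end) yields any negative integer."""
    return any(i < 0 for i in range(start, end))


# An empty half-open interval produces nothing at all.
empty = list(range(0, 0))

# With 0 <= start <= end, none of the generated values is negative.
samples = [(0, 0), (0, 5), (3, 3), (2, 7)]
any_negative = any(has_negative(a, b) for a, b in samples)

# Negative values appear only when start itself is negative.
negative_start = list(range(-2, 1))

print(empty, any_negative, negative_start)
```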
--
Regards,
Casey