On 4/14/06, Nick Coghlan wrote:
> Ian Bicking wrote:
> > I propose that strings (unicode/text) shouldn't be iterable. Seeing this:
> >
> > <ul>
> > <li> i
> > <li> t
> > <li> e
> > <li> m
> > <li>
> > <li> 1
> > </ul>
> >
> > a few too many times... it's annoying. Instead, I propose that strings
> > get a list-like view on their characters. Oh synergy!
>
> Another +1 here.
And a moderate +0.1 here (we need to research the consequences more).
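
For readers who have not hit the trap Ian describes, here is a minimal sketch of how it tends to arise. The render_list() helper is hypothetical (it is not from the thread); the point is only that a string passed where a list was expected iterates character by character without any error.

```python
def render_list(items):
    """Render each element of `items` as an HTML list entry."""
    lines = ["<ul>"]
    for item in items:
        lines.append("<li> " + item)
    lines.append("</ul>")
    return "\n".join(lines)


# Intended call: a list containing one string.
intended = render_list(["item 1"])

# Accidental call: the bare string. Iteration silently succeeds and
# produces one <li> per character, as in Ian's output.
accidental = render_list("item 1")
```

If strings were not iterable, the second call would presumably fail with a TypeError instead of producing six list entries.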
> Some other details:
>
> __contains__ would still be there, so "substr in string" would still work
> __getitem__ would still be there, so slicing would work
Right.
> To remove the iterable behaviour either iter() would have to change so that
> the default "all sequences are iterable" behaviour goes away (which Guido has
> suggested previously) or else the __iter__ method of strings would need to
> explicitly raise TypeError.
I'm for simplifying iter() so that it *only* looks for __iter__().
(There's another problem with falling back on __getitem__(): if
someone implements a dict-ish type from scratch without defining
__iter__, iteration tries to use keys 0, 1, ... instead of cleanly
failing or iterating over the actual keys.)
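
The __getitem__() fallback problem can be sketched as follows. Registry is a hypothetical dict-ish class invented for this illustration, and collect() is just a helper to record what iteration does; neither comes from the thread.

```python
class Registry:
    """Hypothetical dict-ish type: defines __getitem__ but not __iter__."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]


def collect(obj):
    """Iterate over obj; return (values seen, key that raised KeyError)."""
    seen = []
    try:
        for value in obj:
            seen.append(value)
    except KeyError as exc:
        return seen, exc.args[0]
    return seen, None


# iter() falls back on __getitem__ and asks for keys 0, 1, 2, ...
# With string keys, the very first lookup fails with KeyError.
by_name = collect(Registry({"a": 1, "b": 2}))

# With some integer keys present, iteration yields *values* for keys
# 0 and 1 and then fails on key 2, never touching the key "z".
mixed = collect(Registry({0: "x", 1: "y", "z": "w"}))
```

In neither case does the loop fail cleanly with "not iterable", and in neither case does it iterate over the actual keys.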
> My preference is the latter (as it's less likely to break other classes), but
> the former could also work if python3warn flagged classes which defined
> __getitem__ without also defining __iter__.
The breakage argument isn't very strong for Python 3000.
[...]
> > Should bytes be iterable as well? Because bytes (the container) and
> > integers are not interchangeable, the problems that occur with strings
> > seem much less likely, and the container-like nature of bytes is
> > clearer. So I don't propose this affect bytes in any way.
Agreed; I doubt that bytes will cause users to fall in the same trap
often. But we should review this after bytes is implemented and has
been used some.
> > Questions:
> >
> > * .chars() doesn't return characters; should it be named something else?
>
> Why do you say it doesn't return characters? Python's chars are just strings
> of length 1, and that's what this view will contain.
I'm not too keen on the name chars(), but I can't think of an
alternative just now. The argument above doesn't hold IMO; it *does*
return characters. Perhaps it should be named characters()? (Some
folks don't like abbrevs.)
I'm not sure of the functionality of chars()/characters() -- perhaps
it should just return an iterator? I don't think that s.count('a') is
a trap.
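
As a rough sketch of what is being discussed, the str subclass below takes the "__iter__ explicitly raises TypeError" option and has chars() return a plain iterator. The class name Text, the error message, and the iterator-returning chars() are all assumptions made for illustration; the thread does not settle any of them.

```python
class Text(str):
    """Hypothetical string type that refuses direct iteration."""

    def __iter__(self):
        # Assumed behaviour: make iteration fail loudly.
        raise TypeError("'Text' object is not iterable; use .chars()")

    def chars(self):
        """Return an iterator over the length-1 strings of this text."""
        # Reuse the built-in string iterator, bypassing our __iter__.
        return str.__iter__(self)


s = Text("item 1")

characters = list(s.chars())   # explicit request for characters
has_substr = "tem" in s        # __contains__ still works
prefix = s[0:4]                # __getitem__ / slicing still works
e_count = s.count("e")         # ordinary string methods are unaffected

try:
    list(s)
    direct_iteration = "allowed"
except TypeError:
    direct_iteration = "refused"
```

Whether chars() should instead return a list-like view, as Ian proposed, is exactly the open question above.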
> > * Should it be a method that is called? dict.keys() has a legacy, but
> > this does not. There is presumably very little overhead to getting this
> > view. However, symmetry with the only other views we are considering
> > (dictionary views) would indicate it should be a method. Also, there
> > are no attributes on strings currently.
>
> Using methods for view creation is fine by me. The various ideas for turning
> them into attributes instead are cute, but not particularly compelling.
-1 on using attributes to return views.
> > * Are there other views on strings? Can string->byte encoding be
> > usefully seen as a view in some cases?
That's not a view but an operation that returns a new object!
> Given that all Py3k strings will be Unicode, I think providing a view that
> exposed the code points of the characters would be good. Being able to write
> mystr.codes() instead of [ord(c) for c in mystr.chars()] would be a good
> thing.
Feels like YAGNI territory to me.
> Interestingly, ord(c) would then be little more than an abbreviation of
> c.codes()[0].
That doesn't make it a good idea.
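
For reference, what a codes() view would replace can be written today as a one-line comprehension. The codes() function below is a hypothetical stand-in for the proposed method, written as a free function over an ordinary str.

```python
def codes(text):
    """Hypothetical stand-in for the proposed mystr.codes() view."""
    return [ord(c) for c in text]


points = codes("Py3k")

# The relationship Nick points out: ord(c) equals codes(c)[0]
# for a single-character string c.
same = ord("P") == codes("P")[0]
```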
--
--Guido van Rossum (home page: http://www.python.org/~guido/)