On 4/14/06, Nick Coghlan <[EMAIL PROTECTED]> wrote:
> Ian Bicking wrote:
> > I propose that strings (unicode/text) shouldn't be iterable.  Seeing this:
> >
> > <ul>
> >   <li> i
> >   <li> t
> >   <li> e
> >   <li> m
> >   <li>
> >   <li> 1
> > </ul>
> >
> > a few too many times... it's annoying.  Instead, I propose that strings
> > get a list-like view on their characters.  Oh synergy!
>
> Another +1 here.

And a moderate +0.1 here (we need to research the consequences more).
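
For reference, the trap in the quoted HTML can be reproduced with a few lines. This is only a sketch: render_list is a hypothetical helper invented for illustration, not code from any real templating library.

```python
def render_list(items):
    """Hypothetical helper: render each element of `items` as a list entry."""
    lines = ["<ul>"]
    for item in items:
        lines.append(f"  <li> {item}")
    lines.append("</ul>")
    return "\n".join(lines)

# Intended call: a list containing one string.
intended = render_list(["item 1"])

# Accidental call: the bare string is iterated character by character,
# so no error is raised and every character becomes its own entry.
accident = render_list("item 1")
print(accident)
```

The accidental call fails silently, which is why the bug tends to show up in the output rather than in a traceback.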

> Some other details:
>
>    __contains__ would still be there, so "substr in string" would still work
>    __getitem__ would still be there, so slicing would work

Right.

> To remove the iterable behaviour either iter() would have to change so that
> the default "all sequences are iterable" behaviour goes away (which Guido has
> suggested previously) or else the __iter__ method of strings would need to
> explicitly raise TypeError.

I'm for simplifying iter() so that it *only* looks for __iter__().
(There's another problem with falling back on __getitem__(): if
someone implements a dict-ish type from scratch without defining
__iter__, iteration tries to use keys 0, 1, ... instead of cleanly
failing or iterating over the actual keys.)
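
A minimal sketch of that second problem, assuming a hypothetical Registry class written from scratch (the name and its methods are made up for illustration). It defines __getitem__ but not __iter__, so iter() falls back to indexing with 0, 1, ...

```python
class Registry:
    """Hypothetical dict-ish type with __getitem__ but no __iter__."""

    def __init__(self, **items):
        self._items = dict(items)

    def __getitem__(self, key):
        return self._items[key]

    def keys(self):
        return list(self._items)


reg = Registry(a=1, b=2)

try:
    # With no __iter__, the sequence fallback calls reg[0], reg[1], ...
    for key in reg:
        print(key)
    outcome = "iterated"
except KeyError as exc:
    # The loop neither fails cleanly with TypeError nor yields the real keys.
    outcome = f"KeyError: {exc}"

print(outcome)
```

If iter() only looked for __iter__, the loop would raise TypeError immediately instead of probing integer keys.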

> My preference is the latter (as it's less likely to break other classes), but
> the former could also work if python3warn flagged classes which defined
> __getitem__ without also defining __iter__.

The breakage argument isn't very strong for Python 3000.

[...]
> > Should bytes be iterable as well?  Because bytes (the container) and
> > integers are not interchangeable, the problems that occur with strings
> > seem much less likely, and the container-like nature of bytes is
> > clearer.  So I don't propose this affect bytes in any way.

Agreed; I doubt that bytes will cause users to fall in the same trap
often. But we should review this after bytes is implemented and has
been used some.

> > Questions:
> >
> > * .chars() doesn't return characters; should it be named something else?
>
> Why do you say it doesn't return characters? Python's chars are just strings
> of length 1, and that's what this view will contain.

I'm not too keen on the name chars(), but I can't think of an
alternative just now. The argument above doesn't hold IMO; it *does*
return characters. Perhaps it should be named characters()? (Some
folks don't like abbrevs.)

I'm not sure of the functionality of chars()/characters() -- perhaps
it should just return an iterator? I don't think that s.count('a') is
a trap.
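
To make the discussion concrete, here is a rough sketch of the iterator-returning variant. Text is a hypothetical str subclass used purely for illustration; nothing here is an agreed design, and the method name chars() is simply the one floated in this thread.

```python
class Text(str):
    """Hypothetical non-iterable string; a sketch, not a proposed implementation."""

    def __iter__(self):
        # Explicitly refuse iteration, as one of the quoted options suggests.
        raise TypeError("Text is not iterable; use .chars()")

    def chars(self):
        # Index-based so that it does not rely on the disabled __iter__.
        return (self[i] for i in range(len(self)))


t = Text("item 1")

try:
    list(t)
    iterable = True
except TypeError:
    iterable = False

characters = list(t.chars())   # explicit iteration over characters
has_substr = "tem" in t        # __contains__ still works
head = t[:4]                   # slicing via __getitem__ still works
a_count = t.count("e")         # ordinary str methods are unaffected
```

Whether chars() should return a plain iterator like this or a richer list-like view is exactly the open question above.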

> > * Should it be a method that is called?  dict.keys() has a legacy, but
> > this does not.  There is presumably very little overhead to getting this
> > view.  However, symmetry with the only other views we are considering
> > (dictionary views) would indicate it should be a method.  Also, there
> > are no attributes on strings currently.
>
> Using methods for view creation is fine by me. The various ideas for turning
> them into attributes instead are cute, but not particularly compelling.

-1 on using attributes to return views.

> > * Are there other views on strings?  Can string->byte encoding be
> > usefully seen as a view in some cases?

That's not a view but an operation that returns a new object!

> Given that all Py3k strings will be Unicode, I think providing a view that
> exposed the code points of the characters would be good. Being able to write
> mystr.codes() instead of [ord(c) for c in mystr.chars()] would be a good 
> thing.

Feels like YAGNI territory to me.
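
For comparison, the existing spellings are already short. In this sketch, codes() is a hypothetical free function standing in for the suggested method, and ordinary iteration stands in for .chars(); neither name exists on str.

```python
def codes(s):
    """Hypothetical stand-in for the suggested mystr.codes() view."""
    return [ord(c) for c in s]


mystr = "héllo"

via_comprehension = [ord(c) for c in mystr]
via_map = list(map(ord, mystr))
via_helper = codes(mystr)
```

All three produce the same list of code points, so the dedicated view would mostly save a few characters of typing.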

> Interestingly, ord(c) would then be little more than an abbreviation of
> c.codes()[0].

That doesn't make it a good idea.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
_______________________________________________
Python-3000 mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-3000