On 28-May-08, at 5:44 PM, Greg Ewing wrote:

Mike Klaas wrote:

In my perfect world, strings would be indexable and sliceable, but not iterable.

An object that was indexable but not iterable would
be a very strange thing. If it has __len__ and __getitem__,
there's nothing to stop you iterating over it by hand
anyway, so disallowing __iter__ would just seem perverse.

Python has a beautiful abstraction in iteration: iter() is a generic function that allows you to lazily consume a sequence of objects, whether it be lists, tuples, custom iterators, generators, or what have you. It is trivial to write your code to be agnostic to the type of iterable passed in. Almost anything else a consumer of your code passes in will result in an immediate exception.
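A minimal sketch of that agnosticism, written in Python 3 syntax. The function name total_length is hypothetical and exists only for illustration:

```python
def total_length(items):
    """Sum len() of each item in any iterable, consuming it lazily."""
    total = 0
    # iter() raises TypeError right away if items is not iterable.
    for item in iter(items):
        total += len(item)
    return total


# The same function works for a list, a tuple, and a generator.
list_result = total_length([[1, 2], [3]])
tuple_result = total_length(([1, 2], [3]))
generator_result = total_length(x for x in ([1, 2], [3]))

# A non-iterable argument fails immediately instead of misbehaving later.
try:
    total_length(42)
    int_failed = False
except TypeError:
    int_failed = True
```

All three calls give the same total, and the integer argument is rejected with a TypeError before any work is done.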

Unfortunately, Python has two extremely common data types which do not fail when this generic function is applied to them, and which instead almost always return a result that is not desired: iteration over the characters of the string, a behaviour which is rarely needed in practice due to the wealth of string methods available.
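To illustrate the silent failure, here is a small sketch in Python 3 syntax. The helper first_items is a made-up name, not anything from the standard library:

```python
def first_items(iterable, n=3):
    """Return up to the first n items of any iterable (hypothetical helper)."""
    result = []
    for item in iterable:
        if len(result) == n:
            break
        result.append(item)
    return result


# Intended use: an iterable of strings.
from_list = first_items(["spam", "eggs"])

# Accidental use: a bare string. No exception is raised; the caller
# silently gets individual characters back.
from_string = first_items("spam")
```

Here from_list holds the two original strings, while from_string holds the single characters 's', 'p', 'a'.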

I agree that it would be perverse to disallow iterating over a string. I just wish that the way to do that wasn't glommed on to the object-iteration abstraction.

As it stands, any consumer of iterables has to keep strings in mind. It is particularly irksome when the target input is an iterable of strings. I recall a function that accepts a list/iterable of item keys, hashes them, and then retrieves values based on the item hashes (usually over the network, so it is necessary to batch requests). This function is often used in the interactive interpreter, and it is thus very prone to being passed a string rather than a list. There was no good way to prevent the (frequent) mysterious "not found" errors save adding an explicit type check for basestring.
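A rough sketch of the kind of guard described above, under several assumptions: it is written for Python 3 (where the check is against str and bytes rather than basestring), a plain dict stands in for the network store, MD5 stands in for whatever hash was actually used, and every name here (hash_key, fetch_values, _lookup) is hypothetical:

```python
import hashlib


def hash_key(key):
    """Hash an item key (MD5 is an assumption, chosen only for the sketch)."""
    return hashlib.md5(key.encode("utf-8")).hexdigest()


def _lookup(batch, store):
    """Stand-in for one batched network request against the store."""
    return {k: store[hash_key(k)] for k in batch if hash_key(k) in store}


def fetch_values(keys, store, batch_size=2):
    """Fetch values for an iterable of keys, batching the lookups."""
    # Explicit type check: a bare string would otherwise be iterated
    # character by character and every lookup would come back "not found".
    if isinstance(keys, (str, bytes)):
        raise TypeError("expected an iterable of keys, got a single string")
    found = {}
    batch = []
    for key in keys:
        batch.append(key)
        if len(batch) == batch_size:
            found.update(_lookup(batch, store))
            batch = []
    if batch:  # flush the final partial batch
        found.update(_lookup(batch, store))
    return found


store = {hash_key("alpha"): 1, hash_key("beta"): 2, hash_key("gamma"): 3}
values = fetch_values(["alpha", "beta", "gamma"], store)

try:
    fetch_values("alpha", store)
    rejected = False
except TypeError:
    rejected = True
```

Without the isinstance check, the call with "alpha" would look up 'a', 'l', 'p', 'h', 'a' and quietly return nothing.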

Strings already behave slightly differently from other sequences: a string is the only sequence for which 'seq in seq' is true, and the only sequence for which 'x in seq' can be true but 'any(x==item for item in seq)' is false. Abstractions are sometimes imperfect: this is why there is an explicit typecheck for strings in the sum() builtin.
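These differences are easy to check in the interpreter. A short sketch in Python 3 syntax, with variable names chosen only for illustration:

```python
s = "abc"
lst = [1, 2, 3]

# 'seq in seq': substring test for strings, element test for lists.
string_contains_itself = s in s
list_contains_itself = lst in lst

# 'x in seq' can be true while no single item equals x.
substring_found = "ab" in s
elementwise_found = any("ab" == ch for ch in s)

# sum() refuses a string start value outright.
try:
    sum(["a", "b"], "")
    sum_rejected = False
except TypeError:
    sum_rejected = True
```

The string contains itself and the substring 'ab', yet no single character equals 'ab'; the list does not contain itself; and sum() raises TypeError for the string case.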

I'll stop here as I realize that the likelihood that this will be accepted is terribly small, especially considering the late stage of the process. But I would be willing to develop a patch that implements this behaviour on the off chance it is.

-Mike
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev