[Python-ideas] Re: Python 4000: Have stringlike objects provide sequence views rather than being sequences

Andrew Barnert via Python-ideas Wed, 23 Oct 2019 17:36:57 -0700

On Oct 23, 2019, at 16:00, Christopher Barker <python...@gmail.com> wrote:
> 
>> On Sun, Oct 13, 2019 at 12:52 PM Andrew Barnert via Python-ideas 
>> <python-ideas@python.org> wrote:
> 
>> The main problem is that a str is a sequence of single-character str, each 
>> of which is a one-element sequence of itself, etc. forever. If you wanted to 
>> change this, I think it would make more sense to go the opposite way: leave 
>> str a sequence, but make it a sequence of char objects. (And likewise, bytes 
>> and bytearray could be sequences of byte objects—or just go all the way to 
>> making them sequences of ints.) And then maybe add a c prefix for defining 
>> char constants, and you’ve solved all the problems without having to add new 
>> confusing methods or properties.
> 
> I've thought for a long time that this would be a "good thing". The "string 
> or sequence of strings" issue is pretty much the only hidden-bug-triggering 
> type error I've gotten since "true division".
> 
> The only way we really live with it fairly easily is that strings are pretty 
> much never duck typed -- so I can check if I got a string, and then I know I 
> didn't get a sequence of strings. But I've always wondered how disruptive it 
> would be to add a char type -- it doesn't seem like it would be very 
> disruptive, but I have not thought it through at all.


Well, just adding a char type (and presumably a way of defining char literals) 
wouldn’t be too disruptive. 

But changing str to iterate chars instead of strs, that probably would be.
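
For reference, here is a minimal sketch of the current behavior that such a change would break, together with the usual isinstance workaround mentioned above. The flatten helper is a hypothetical illustration, not something from the stdlib:

```python
s = "abc"

# Indexing a str yields another str, which can itself be indexed, forever.
first = s[0]
assert isinstance(first, str) and first[0][0][0] == "a"


def flatten(obj):
    """Recursively flatten nested iterables, treating str and bytes as leaves."""
    # Without the str guard, a one-character str would recurse until
    # RecursionError, because iterating "a" yields "a" again.
    if isinstance(obj, (str, bytes)):
        yield obj
        return
    try:
        it = iter(obj)
    except TypeError:
        # Not iterable at all (e.g. an int): it is a leaf too.
        yield obj
        return
    for item in it:
        yield from flatten(item)


print(list(flatten(["ab", ["cd", ("e", 1)]])))
```

Any code that currently relies on iterating a str and getting strs back (including guards like the one above) would need review if iteration started producing a different type.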

Also, you’d have to go through a lot of functions and decide what types they 
should take. For example, does str.join still accept a string instead of an 
iterable of strings? Does it accept other iterables of char too? (I have used 
' '.join on a string in real life production code, even if I did feel guilty 
about it…) Can you pass a char to str.__contains__ or str.endswith? What about 
a tuple of chars? Or should we take the backward-compat breaking opportunity to 
eliminate the “str or tuple of str” thing and instead use *args, or at least 
change it to “str or iterable of str (which no longer includes str itself)”?
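
As a reminder of what these functions accept today, here is a small sketch of the current (Python 3) behavior that the questions above are about. Nothing here is proposed API; it only records the status quo:

```python
# A str is itself an iterable of one-character strs, so join accepts it.
assert " ".join("abc") == "a b c"
assert " ".join(["abc"]) == "abc"

# Substring test: the left operand of `in` must be a str.
assert "b" in "abc"

# endswith takes a str or a tuple of str, but not other iterables.
assert "abc".endswith("c")
assert "abc".endswith(("x", "c"))
try:
    "abc".endswith(["x", "c"])
except TypeError:
    list_rejected = True
else:
    list_rejected = False
assert list_rejected
```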

> And I'm not sure how much string functionality a char should have -- probably 
> next to none, as the point is that it would be easy to distinguish from a 
> string that happened to have one character.

Surely you’d want to be able to do things like isdigit or swapcase. Even C has 
functions to do most of that kind of stuff on chars.

But I think that, other than join and maybe encode and translate, there’s an 
obvious right answer for every str method and operator, so this isn’t too much 
of a problem.

Speaking of operators, should char+int and char-int and char-char be legal? 
(What about char%int? A thousand students doing the rot13 assignment would 
rejoice, but allowing % without * and // is kind of weird, and allowing * and 
// even weirder—as well as potentially confusing with str*int being legal but 
meaning something very different.)
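
To make the operator question concrete, here is a rough sketch of one possible answer. The Char class, its c'...' repr, and the rot13 helper are all hypothetical names invented for illustration; nothing like this exists in Python, and the choice of which operators to allow is exactly the open question:

```python
class Char:
    """Hypothetical single-code-point type; not part of Python."""

    __slots__ = ("_cp",)

    def __init__(self, value):
        if isinstance(value, str):
            if len(value) != 1:
                raise ValueError("Char needs exactly one character")
            value = ord(value)
        self._cp = value

    def __str__(self):
        return chr(self._cp)

    def __repr__(self):
        # Mimics the suggested c prefix for char constants.
        return f"c{chr(self._cp)!r}"

    def __eq__(self, other):
        return isinstance(other, Char) and self._cp == other._cp

    def __hash__(self):
        return hash(("Char", self._cp))

    # Predicates and case mapping delegate to the existing str methods.
    def isdigit(self):
        return str(self).isdigit()

    def swapcase(self):
        # Caveat: some characters case-map to more than one character
        # (e.g. "ß"), which this sketch rejects with ValueError.
        return Char(str(self).swapcase())

    # char + int -> char, char - int -> char, char - char -> int
    def __add__(self, n):
        if isinstance(n, int):
            return Char(self._cp + n)
        return NotImplemented

    def __sub__(self, other):
        if isinstance(other, Char):
            return self._cp - other._cp
        if isinstance(other, int):
            return Char(self._cp - other)
        return NotImplemented


def rot13(ch):
    """Rotate an ASCII lowercase Char by 13 without needing char % int."""
    offset = (ch - Char("a") + 13) % 26  # char - char is a plain int
    return Char("a") + offset
```

With only +, - and char-char defined, rot13 still works because the modulo is applied to the int difference, so the sketch suggests % on chars may not be needed.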

> By the way, the bytes and bytearray types already do this -- index into or 
> loop through a bytes object, you get an int.

Sure, but b'abc'.find(66) is -1, and b'abc'.replace(66, 70) is a TypeError, and 
so on.
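
A short sketch of the current bytes behavior, for anyone who wants to check it. Note that 66 is ord("B"), so the -1 above is a plain "not found"; find does accept an int and returns an index for a value that is present, while replace rejects ints outright:

```python
data = b"abc"

# Indexing and iteration yield ints.
assert data[0] == 97
assert list(data) == [97, 98, 99]

# find and `in` accept an int in range(256) as well as a bytes-like object.
assert data.find(98) == 1      # 98 == ord("b")
assert data.find(66) == -1     # 66 == ord("B"), which is not present
assert 98 in data

# replace only accepts bytes-like arguments.
try:
    data.replace(98, 70)
except TypeError:
    replace_rejects_int = True
else:
    replace_rejects_int = False
assert replace_rejects_int
assert data.replace(b"b", b"F") == b"aFc"
```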

Fixing those inconsistencies is what I meant by “go all the way to making them 
sequences of ints”. But it might be friendlier to undo the changes and instead 
add a byte type like the char type for bytes to be a sequence of. I’m not sure 
which is better.

But anyway, I think all of these questions are questions for a new language. If 
making str not iterate str was too big a change even for 3.0, how could it be 
reasonable for any future version?

_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/WRVOKGHNK7JKR66WG7MG73FUFZODLC4R/
Code of Conduct: http://python.org/psf/codeofconduct/
