On Oct 23, 2019, at 16:00, Christopher Barker <python...@gmail.com> wrote:
>
>> On Sun, Oct 13, 2019 at 12:52 PM Andrew Barnert via Python-ideas
>> <python-ideas@python.org> wrote:
>
>> The main problem is that a str is a sequence of single-character str, each
>> of which is a one-element sequence of itself, etc. forever. If you wanted to
>> change this, I think it would make more sense to go the opposite way: leave
>> str a sequence, but make it a sequence of char objects. (And likewise, bytes
>> and bytearray could be sequences of byte objects -- or just go all the way to
>> making them sequences of ints.) And then maybe add a c prefix for defining
>> char constants, and you’ve solved all the problems without having to add new
>> confusing methods or properties.
>
> I've thought for a long time that this would be a "good thing". The "string
> or sequence of strings" issue is pretty much the only hidden-bug-triggering
> type error I've gotten since "true division".
>
> The only way we really live with it fairly easily is that strings are pretty
> much never duck typed -- so I can check if I got a string, and then I know I
> didn't get a sequence of strings. But I've always wondered how disruptive it
> would be to add a char type -- it doesn't seem like it would be very
> disruptive, but I have not thought it through at all.

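To make the quoted problem concrete, here is a minimal sketch of the usual workaround: a recursive flatten that has to special-case str (and, by the same convention, bytes), because a one-character str is an iterable whose only element is itself. The flatten helper is purely illustrative, not something from the stdlib.

```python
from collections.abc import Iterable


def flatten(items):
    """Recursively flatten nested iterables, treating str and bytes as leaves."""
    for item in items:
        # Without the str check, flatten("a") would recurse until
        # RecursionError, because "a"[0] is again "a".
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            yield from flatten(item)
        else:
            yield item


s = "abc"
# Indexing a str yields a str, however many times you repeat it.
assert s[0] == s[0][0] == s[0][0][0] == "a"

flat = list(flatten(["ab", ["cd", ("ef",)]]))
print(flat)
```

This is exactly the "check if I got a string" pattern described above: it only works because str is almost never duck typed.
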
Well, just adding a char type (and presumably a way of defining char
literals) wouldn’t be too disruptive. But changing str to iterate chars
instead of strs, that probably would be.

Also, you’d have to go through a lot of functions and decide what types they
should take. For example, does str.join still accept a string instead of an
iterable of strings? Does it accept other iterables of char too? (I have used
' '.join on a string in real life production code, even if I did feel guilty
about it…) Can you pass a char to str.__contains__ or str.endswith? What
about a tuple of chars? Or should we take the backward-compat breaking
opportunity to eliminate the “str or tuple of str” thing and instead use
*args, or at least change it to “str or iterable of str (which no longer
includes str itself)”?

> And I'm not sure how much string functionality a char should have -- probably
> next to none, as the point is that it would be easy to distinguish from a
> string that happened to have one character.

Surely you’d want to be able to do things like isdigit or swapcase. Even C
has functions to do most of that kind of stuff on chars. But I think that,
other than join and maybe encode and translate, there’s an obvious right
answer for every str method and operator, so this isn’t too much of a
problem.

Speaking of operators, should char+int and char-int and char-char be legal?
(What about char%int? A thousand students doing the rot13 assignment would
rejoice, but allowing % without * and // is kind of weird, and allowing * and
// even weirder, as well as potentially confusing with str*int being legal
but meaning something very different.)

> By the way, the bytes and bytearray types already do this -- index into or
> loop through a bytes object, you get an int.

Sure, but b'abc'.find(66) is -1, and b'abc'.replace(66, 70) is a TypeError,
and so on. Fixing those inconsistencies is what I meant by “go all the way to
making them sequences of ints”.

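For reference, a short sketch of how current Python 3 behaves here. Note that find does accept an int, so the -1 above comes from 66 (ord('B')) not occurring in b'abc'; replace, on the other hand, rejects ints outright. The variable names are just for illustration.

```python
data = b"abc"

first = data[1]          # indexing yields an int (98), not a length-1 bytes
as_ints = list(data)     # iterating yields ints: [97, 98, 99]

hit = data.find(98)      # find accepts an int in range(256); 98 is ord("b")
miss = data.find(66)     # 66 is ord("B"), which is not in b"abc"

try:
    data.replace(98, 70)  # replace only accepts bytes-like arguments
    replace_error = None
except TypeError as exc:
    replace_error = exc

fixed = data.replace(b"b", b"F")  # the bytes-like spelling works

print(first, as_ints, hit, miss)
print(type(replace_error).__name__, fixed)
```

So some methods treat an int as an element and others don't, which is the kind of inconsistency a consistent "sequence of ints" design would have to resolve one way or the other.
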
But it might be friendlier to undo the changes and instead add a byte type
like the char type for bytes to be a sequence of. I’m not sure which is
better.

But anyway, I think all of these questions are questions for a new language.
If making str not iterate str was too big a change even for 3.0, how could it
be reasonable for any future version?
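
For anyone who wants to play with the operator questions raised earlier, here is a rough, purely hypothetical sketch of a char type. Nothing like this exists in Python; the Char class, its methods, and the choices made (char+int and char-int give a char, char-char gives an int, no % or *) are all assumptions for illustration.

```python
class Char:
    """Hypothetical single-code-point type; deliberately not iterable."""

    __slots__ = ("_cp",)

    def __init__(self, value):
        if isinstance(value, str):
            if len(value) != 1:
                raise ValueError("Char needs exactly one character")
            value = ord(value)
        self._cp = value

    def __str__(self):
        return chr(self._cp)

    def __eq__(self, other):
        return isinstance(other, Char) and self._cp == other._cp

    def __hash__(self):
        return hash(self._cp)

    def __add__(self, offset):
        # char + int -> char
        if isinstance(offset, int):
            return Char(self._cp + offset)
        return NotImplemented

    def __sub__(self, other):
        # char - char -> int, char - int -> char
        if isinstance(other, Char):
            return self._cp - other._cp
        if isinstance(other, int):
            return Char(self._cp - other)
        return NotImplemented

    def isdigit(self):
        return chr(self._cp).isdigit()


def rot13(text):
    """rot13 over lowercase ASCII using only char-char and char+int."""
    a = Char("a")
    out = []
    for ch in map(Char, text):
        offset = ch - a
        if 0 <= offset < 26:
            ch = a + (offset + 13) % 26
        out.append(str(ch))
    return "".join(out)


print(rot13("hello"))
```

Even in this toy version the rot13 assignment works without char%int, since the modulo is applied to the int offset rather than to the char.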