Ah. That makes a lot of sense, actually. Anyway, so then Latin1 strings are
memcmp-able, and others are not. That's fine; I'll just add a check for
that (I think there are already helper functions for this) and then have
two special-case string functions. Thanks!
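
A rough Python sketch of the "memcmp-able" point above, not the actual C change. It assumes a little-endian machine for the 16-bit case and simulates it with an explicit "utf-16-le" encoding. latin1_key is a hypothetical helper name used only for illustration.

```python
def latin1_key(s: str) -> bytes:
    # Hypothetical helper: one byte per code point.
    # Raises UnicodeEncodeError if s has a code point >= 256.
    return s.encode("latin-1")

# For Latin-1 strings, byte order and code point order agree,
# so a bytewise (memcmp-style) comparison sorts them correctly.
words = ["zebra", "\u00e9clair", "apple", "app", "\u00ffz", "Z"]
assert sorted(words) == sorted(words, key=latin1_key)

# With 16-bit storage on a little-endian layout, the low byte comes first,
# so a bytewise comparison can disagree with code point order.
a, b = "\u0100\u00ff", "\u0100\u0100"
assert a < b                                          # code point order
assert a.encode("utf-16-le") > b.encode("utf-16-le")  # bytewise order
```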

On Wed, Oct 12, 2016 at 4:08 PM Alexander Belopolsky <
alexander.belopol...@gmail.com> wrote:

> On Wed, Oct 12, 2016 at 5:57 PM, Elliot Gorokhovsky <
> elliot.gorokhov...@gmail.com> wrote:
>
> > On Wed, Oct 12, 2016 at 3:51 PM Nathaniel Smith <n...@pobox.com> wrote:
> >
> > > But this isn't relevant to Python's str, because Python's str never uses
> > > UTF-8.
> >
> > Really? I thought in python 3, strings are all unicode... so what encoding
> > do they use, then?
>
> No encoding is used.  The actual code points are stored as integers of the
> same size.  If all code points are less than 256, they are stored as 8-bit
> integers (bytes).  If some code points are greater or equal to 256 but less
> than 65536, they are stored as 16-bit integers and so on.
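
The storage widths described above can be observed from Python. This is a minimal sketch that assumes CPython 3.3 or later, where this layout is an implementation detail and not a language guarantee. bytes_per_code_point is a hypothetical helper name, and it infers the width from the growth in sys.getsizeof when one more character is added.

```python
import sys

def bytes_per_code_point(ch: str) -> int:
    """Estimate the bytes CPython uses per code point in a string made of `ch`."""
    # Hypothetical helper. The fixed object header cancels out in the
    # difference, leaving the per-character width.
    return sys.getsizeof(ch * 1001) - sys.getsizeof(ch * 1000)

# ASCII, Latin-1, a code point below 65536, and one above it.
for ch in ("a", "\u00e9", "\u20ac", "\U0001f600"):
    print(f"U+{ord(ch):04X}: {bytes_per_code_point(ch)} byte(s) per code point")
```

The widest code point in a string sets the width for the whole string, so a single character at or above 65536 makes every character in that string take 4 bytes.
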
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
