On 30/06/2020 13:43, Emily Bowman wrote:
> I completely agree with this, that UTF-8 has become the One True
> Encoding(tm), and UCS-2 and UTF-16 are hardly found anywhere outside of the
> Win32 API. Nearly all basic emoji can't be represented in UCS-2 wchar_t,
> let alone composite emoji.

You say that as if it's a bad thing :-)

> So how to make that C-compatible? Make everything a void* and it just comes
> back with as many bytes as it gets?

I'd be inclined to do something like that. You really don't want people trying to roll their own UTF-8 handling if you can help it. That does imply the C API will need to be pretty comprehensive, though.
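
To make the point concrete, here is a rough sketch in Python (not C, and not any proposed API) of why an opaque buffer of bytes can't safely be picked apart by hand. The sample string and all variable names are invented for illustration. It compares code points, UTF-8 bytes, and UTF-16 code units for the same text, then shows what a naive byte slice does.

```python
# Illustrative only: the sample text and all names here are assumptions,
# not part of any proposed C API.
text = "na\u00efve \U0001F600"  # "naive" with a diaeresis, a space, one emoji

utf8 = text.encode("utf-8")       # the kind of opaque buffer a caller might get
utf16 = text.encode("utf-16-le")  # two bytes per UTF-16 code unit, no BOM

code_points = len(text)        # what a user thinks of as the "length"
utf8_bytes = len(utf8)         # what a void* plus a byte count would report
utf16_units = len(utf16) // 2  # the emoji needs a surrogate pair here

print(code_points, utf8_bytes, utf16_units)

# Slicing the bytes without decoding can cut a multi-byte sequence in half.
try:
    utf8[:3].decode("utf-8")
    split_ok = True
except UnicodeDecodeError:
    split_ok = False

print("naive 3-byte slice decodes cleanly:", split_ok)
```

The three counts differ, and the slice lands in the middle of a two-byte sequence, which is the sort of mistake a comprehensive API would keep callers away from.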

(If you want nightmares, take a look at the parsing code in Expat. Multiple layers of macros and function tables make it a horror to comprehend.)

--
Rhodri James *-* Kynesim Ltd
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7HPGNVZ46ROP3HMRUJXJXX2WI4LI4JAL/
Code of Conduct: http://python.org/psf/codeofconduct/