On 6/1/23 2:06 PM, David Mertz, Ph.D. wrote:
> I'm not sure why U+FEFF isn't included, but that seems to match the current standards, so all good.
I think it's because ZERO WIDTH NO-BREAK SPACE (aka the byte order mark, or BOM) doesn't act like a "space" character.
When used as the BOM, it is meant to be stripped out on reading, and the UTF-16/UTF-32 data that follows it is then simply read with its byte order corrected as the mark indicates.
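As a rough illustration of that stripping, here is a small sketch using Python's standard codecs. The variable names are just mine for the example; the behaviour shown is what I'd expect from the generic "utf-16" codec versus an explicit-endian one:

```python
import codecs

text = "hello"

# Build big-endian UTF-16 data with a leading BOM by hand.
be = codecs.BOM_UTF16_BE + text.encode("utf-16-be")

# The generic "utf-16" codec reads the BOM, picks the byte order from it,
# and does not hand the BOM back as part of the text.
decoded_be = be.decode("utf-16")

# Round trip through the generic codec: encode writes a BOM, decode consumes it.
decoded = text.encode("utf-16").decode("utf-16")

# An explicit-endian codec does not treat the BOM specially, so U+FEFF
# survives as the first character of the result.
kept = be.decode("utf-16-be")

print(repr(decoded_be), repr(decoded), repr(kept))
```

So whether the mark shows up in your string depends on which codec did the reading, not on the data itself.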
When used elsewhere as the ZWNBSP (a use that has been deprecated in favor of U+2060), it is intentionally "no-break", so it is not a space to separate on.
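A quick way to check this against what Python currently does (a sketch only, with names I made up for the example):

```python
import unicodedata

ZWNBSP = "\ufeff"       # ZERO WIDTH NO-BREAK SPACE / BOM
WORD_JOINER = "\u2060"  # its replacement for the "no-break" use

# Neither character is classified as whitespace by str.isspace().
zwnbsp_is_space = ZWNBSP.isspace()
joiner_is_space = WORD_JOINER.isspace()

# Both are in the Unicode general category "Cf" (format), not a space category.
zwnbsp_category = unicodedata.category(ZWNBSP)
joiner_category = unicodedata.category(WORD_JOINER)

# str.split() with no arguments splits on whitespace only, so the ZWNBSP
# stays inside the first word and only the real space separates.
parts = f"foo{ZWNBSP}bar baz".split()

print(zwnbsp_is_space, joiner_is_space, zwnbsp_category, joiner_category, parts)
```

Here `parts` comes out as two items, with U+FEFF still embedded in the first one.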
-- Richard Damon