Re: [Python-Dev] New Py_UNICODE doc

James Y Knight Fri, 06 May 2005 12:42:25 -0700

On May 6, 2005, at 2:49 PM, Nicholas Bastin wrote:
> If this is the case, then we're clearly misleading users.  If the
> configure script says UCS-2, then as a user I would assume that
> surrogate pairs would *not* be encoded, because I chose UCS-2, and it
> doesn't support that.  I would assume that any UTF-16 string I would
> read would be transcoded into the internal type (UCS-2), and
> information would be lost.  If this is not the case, then what does the
> configure option mean?


It means all the string operations treat strings as if they were UCS-2,
but in actuality they are UTF-16. The same is true of the Windows
APIs and Java. That is, all string operations are essentially broken,
because they operate on UTF-16 code units, not characters, while
claiming to operate on characters.
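
To make that concrete, here is a rough sketch of the mismatch. It does not
use a UCS-2 build at all: it assumes a current Python 3, where str is
indexed by code point, and simulates what a 16-bit build sees by encoding
to UTF-16 by hand. The helper name utf16_code_units is made up for this
example.

```python
import struct

def utf16_code_units(text):
    """Return the 16-bit code units that encode `text` in UTF-16.

    Hypothetical helper for illustration: it mimics what a string
    implementation with a 16-bit internal type would actually store.
    """
    raw = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))

# 'a', U+1D11E MUSICAL SYMBOL G CLEF (outside the BMP), 'b'
s = "a\U0001D11Eb"
units = utf16_code_units(s)

# Counting characters (code points) and counting code units disagree.
code_point_count = len(s)
code_unit_count = len(units)

# "Slicing off the first two characters" at the code-unit level cuts the
# surrogate pair in half and leaves a lone high surrogate at the end.
truncated = units[:2]
ends_in_lone_surrogate = 0xD800 <= truncated[-1] <= 0xDBFF

print(code_point_count, code_unit_count, [hex(u) for u in units])
```

The length and the slice are both computed per code unit in the simulated
case, which is the sense in which the operations claim to work on
characters but do not.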

James
