Nicholas Bastin wrote:
> If this is the case, then we're clearly misleading users.  If the
> configure script says UCS-2, then as a user I would assume that
> surrogate pairs would *not* be encoded, because I chose UCS-2, and it
> doesn't support that.

What do you mean by that? That the interpreter crashes if you try
to store a low surrogate into a Py_UNICODE?

> I would assume that any UTF-16 string I would
> read would be transcoded into the internal type (UCS-2), and information
> would be lost.  If this is not the case, then what does the configure
> option mean?

It tells you whether you have the two-octet form of the Universal
Character Set, or the four-octet form.

Regards,
Martin
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to