Nicholas Bastin wrote:
> If this is the case, then we're clearly misleading users. If the
> configure script says UCS-2, then as a user I would assume that
> surrogate pairs would *not* be encoded, because I chose UCS-2, and it
> doesn't support that.

What do you mean by that? That the interpreter crashes if you try to
store a low surrogate into a Py_UNICODE?

> I would assume that any UTF-16 string I would
> read would be transcoded into the internal type (UCS-2), and information
> would be lost. If this is not the case, then what does the configure
> option mean?

It tells you whether you have the two-octet form of the Universal
Character Set, or the four-octet form.

Regards,
Martin
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
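
To make the two-octet versus four-octet distinction concrete, here is a rough sketch, not anything taken from the CPython source. It assumes a current Python 3 interpreter, and the helper names utf16_code_units and is_surrogate are made up for illustration. It shows that a character outside the Basic Multilingual Plane takes two 16-bit code units (a surrogate pair) in UTF-16, and that sys.maxunicode reports the largest code point a single string element can hold.

```python
import sys


def utf16_code_units(text):
    """Return the 16-bit code units of text when encoded as big-endian UTF-16."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def is_surrogate(unit):
    """True if a 16-bit code unit lies in the surrogate range D800..DFFF."""
    return 0xD800 <= unit <= 0xDFFF


# U+20AC (EURO SIGN) is inside the BMP, so it fits in one two-octet unit.
bmp_units = utf16_code_units("\u20ac")

# U+10400 is outside the BMP, so UTF-16 needs a high and a low surrogate.
astral_units = utf16_code_units("\U00010400")

# sys.maxunicode is 0xFFFF where one element is two octets wide and
# 0x10FFFF where one element can hold any code point directly.
four_octet_elements = sys.maxunicode > 0xFFFF

print([hex(u) for u in bmp_units])
print([hex(u) for u in astral_units])
print("four-octet elements:", four_octet_elements)
```

The sketch only says how many code units a character needs. Whether an interpreter with two-octet elements keeps such a pair as two separate elements is exactly the question being discussed in this thread.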