> What about things like the surrogateescape codec that
> deliberately use code units in non-standard ways? Will
> tricks like that still be possible if the code-unit
> level is hidden from the programmer?

Most certainly. In the PEP-393 representation, the surrogate
characters can readily be represented (and would imply at least
the two-byte form), but they will never take their UTF-16
function (i.e. the UTF-8 codec won't try to combine surrogate
pairs), so they can be used for surrogateescape and other
functions. Of course, in strict error mode, codecs will
refuse to encode them (notice that surrogateescape is an error
handler, not a codec).
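
As a small illustration of the points above, here is a sketch that assumes
Python 3.3 or later (i.e. an interpreter where PEP 393 is implemented). The
variable names are made up for the example; only the built-in str/bytes
methods and the standard error handler names are relied upon.

```python
# An undecodable byte is smuggled into the string as a lone surrogate
# by the surrogateescape error handler.
raw = b"abc\xff"
text = raw.decode("utf-8", errors="surrogateescape")
lone = text[-1]  # the escaped byte 0xff, stored as code point U+DCFF

# Encoding with the same error handler restores the original bytes.
roundtrip = text.encode("utf-8", errors="surrogateescape")

# In strict error mode the UTF-8 codec refuses to encode the surrogate.
try:
    text.encode("utf-8")
    strict_refused = False
except UnicodeEncodeError:
    strict_refused = True

# Two surrogate code points in a row stay two separate characters;
# they are not combined into one non-BMP character.
pair = "\ud83d\ude00"
pair_length = len(pair)
pair_is_combined = pair == "\U0001F600"

print(hex(ord(lone)), roundtrip == raw, strict_refused,
      pair_length, pair_is_combined)
```

The last two values show that surrogates are ordinary code points in the
string and never take on their UTF-16 pairing role.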

Regards,
Martin
