> If the bytes are mapped to single half surrogate codes instead of the > normal pairs (low+high), then I can see that decoding could never be > ambiguous and encoding could produce the original bytes.
I was confused by Markus Kuhn's original UTF-8b specification. I have now changed the PEP to avoid using PUA characters at all. Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list