J Decker wrote on 2016-01-31 03:28:
I've reconsidered and think that, for ease of implementation, it is
best to just mask every UTF-16 code unit (not code point) with a
10-bit value. This will result in no character changing from BMP
space to a surrogate pair or vice versa.
Thanks for the feedback.
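
To make the quoted idea concrete, here is a minimal sketch. It assumes that "mask" means XOR with a repeating key of 10-bit values; the quoted message does not say so, and the helper names (to_units, from_units, unit_class, mask_units) are made up for this illustration. The surrogate ranges D800..DBFF and DC00..DFFF each span exactly 1024 values, so changing only the low 10 bits cannot move a code unit into or out of either range.

```python
import struct

def to_units(text):
    """Return the UTF-16 code units of text as a list of ints."""
    data = text.encode("utf-16-le", "surrogatepass")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

def from_units(units):
    """Rebuild a str from a list of UTF-16 code units."""
    data = struct.pack("<%dH" % len(units), *units)
    return data.decode("utf-16-le", "surrogatepass")

def unit_class(unit):
    """Classify a code unit as 'high' or 'low' surrogate, or 'bmp'."""
    if 0xD800 <= unit <= 0xDBFF:
        return "high"
    if 0xDC00 <= unit <= 0xDFFF:
        return "low"
    return "bmp"

def mask_units(units, key):
    """XOR each code unit with a 10-bit key value (assumed operation)."""
    return [u ^ (key[i % len(key)] & 0x3FF) for i, u in enumerate(units)]

text = "A\u00e9\U0001F600"        # two BMP characters and one surrogate pair
key = [0x155, 0x2AA, 0x3FF]       # arbitrary 10-bit values, illustration only
units = to_units(text)
masked = mask_units(units, key)

# Each code unit keeps its class, so pairs stay pairs and BMP stays BMP.
print([unit_class(u) for u in units])
print([unit_class(u) for u in masked])
# XOR is its own inverse, so masking twice restores the text.
print(from_units(mask_units(masked, key)) == text)
```

Note that this only shows that surrogate structure is preserved. It says nothing about whether the masked code units are safe to treat as plain text.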
So you are still trying to handle the unarmed output as plaintext.
Do you realize that if a string in the output is replaced by a
canonically equivalent one, this may mess things up, because the
originals are not canonically equivalent?
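
A small sketch of the problem, under the same assumption that the masking is an XOR with a 10-bit value. The key 0x0A8 and the mask helper are made up for the example, and the text is kept BMP-only so that code points and UTF-16 code units coincide. An intermediate process that normalizes the masked text to NFD stands in for "replaced by a canonically equivalent one".

```python
import unicodedata

def mask(text, key):
    """XOR every character of BMP-only text with a 10-bit key (assumed)."""
    return "".join(chr(ord(c) ^ (key & 0x3FF)) for c in text)

original = "A"                    # U+0041
key = 0x0A8                       # hypothetical key, chosen for the demo

masked = mask(original, key)      # U+00E9, LATIN SMALL LETTER E WITH ACUTE
# Some process in between treats the output as plain text and normalizes it.
normalized = unicodedata.normalize("NFD", masked)   # U+0065 U+0301
recovered = mask(normalized, key)

print([hex(ord(c)) for c in masked])
print([hex(ord(c)) for c in normalized])
print([hex(ord(c)) for c in recovered])
print(recovered == original)
```

The masked and normalized strings are canonically equivalent, but unmasking them yields "A" in one case and a two-character string in the other, and those two results are not equivalent to each other.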