Stephen J. Turnbull added the comment:

Please do not add the "rehandle" functions to codecs.  They do not change the 
(duck-typed) representation of data while maintaining the semantics, as a codec 
does; they change the semantics of data while retaining the representation.

I suggest a "validation" submodule of the unicodedata package, or perhaps a new 
"unicodeutils" package, for these functions, as well as those that just detect 
the surrogates, etc.

Because they change the semantics of data, they should be documented as 
potentially dangerous: their output can't be inverted back to the original 
bytes without knowledge of the history of transformations they perform (and not 
even then in the case of the "replace" error handler).  This matters in 
applications where the input bytes may have been digitally signed, for example.
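
To illustrate the inversion problem, here is a minimal sketch.  The helper 
replace_surrogates is hypothetical, a stand-in for a "replace"-style rehandle 
function and not the API proposed in the issue; the choice of "?" as the 
replacement character is likewise an assumption.

```python
raw = b"caf\xe9"  # Latin-1 bytes; not valid UTF-8

# surrogateescape smuggles the undecodable byte through as a lone surrogate,
# so the original bytes can be recovered exactly.
text = raw.decode("utf-8", errors="surrogateescape")
assert text == "caf\udce9"
assert text.encode("utf-8", errors="surrogateescape") == raw


def replace_surrogates(s):
    """Hypothetical 'rehandle' step: replace every surrogate with '?'."""
    return "".join("?" if "\ud800" <= ch <= "\udfff" else ch for ch in s)


# The result is still a str (same representation), but the information needed
# to rebuild the original bytes is gone (different semantics).
cleaned = replace_surrogates(text)
restored = cleaned.encode("utf-8", errors="surrogateescape")
assert cleaned == "caf?"
assert restored != raw  # a signature computed over raw would no longer verify
```

Both text and cleaned are ordinary str objects, so nothing in the type tells a 
caller which of the two can still be turned back into the signed input.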

----------
nosy: +sjt

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18814>
_______________________________________