STINNER Victor added the comment:

Serhiy wrote: "All other error handlers lose information and can't be used per 
se for transcoding bytes as string or string as bytes."

Well, replace and ignore were very simple to implement in the decoders, and I 
believe that these error handlers are commonly used.
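
To make the difference between the handlers concrete, here is a minimal 
sketch using only the public bytes.decode() / str.encode() API. The sample 
input is made up for illustration; it is not taken from the patch.

```python
# Illustrative input: 0xFF can never appear in valid UTF-8.
data = b"abc\xffdef"

# "replace" and "ignore" are lossy: the original byte cannot be recovered.
replaced = data.decode("utf-8", "replace")   # U+FFFD replaces the bad byte
ignored = data.decode("utf-8", "ignore")     # the bad byte is dropped

# "surrogateescape" maps the bad byte to a lone surrogate (U+DCFF),
# so encoding with the same handler restores the original bytes.
escaped = data.decode("utf-8", "surrogateescape")
roundtrip = escaped.encode("utf-8", "surrogateescape")

print(ascii(replaced), ascii(ignored), ascii(escaped), roundtrip == data)
```

Only the surrogateescape result round-trips, which is Serhiy's point. The 
lossy handlers are still useful when the goal is just to get readable text.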

"(...) adding it can slow down common case (no errors). That is why I limit my 
patch for "surrogateescape" and "surrogatepass" only."

We can start with benchmarks to see whether modifying Objects/stringlib/ has a 
real impact on performance, or whether modifying the "slower" decoder in 
Objects/unicodeobject.c is enough. IMHO it's fine to implement many error 
handlers in Objects/unicodeobject.c: it's the "slow" path, taken only when at 
least one error occurred, so it doesn't impact the path that decodes valid 
UTF-8 strings.
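
A rough sketch of such a benchmark, written at the Python level with timeit. 
The input sizes, the handler list and the helper names (bench, run) are 
assumptions chosen for illustration. A real comparison would run this on 
builds with and without the patch.

```python
import timeit

# Hypothetical inputs: a valid UTF-8 buffer, and the same buffer with one
# invalid byte appended so that decoding has to call the error handler.
valid = ("abc\u20ac" * 1000).encode("utf-8")
invalid = valid + b"\xff"

HANDLERS = ("strict", "replace", "ignore", "surrogateescape")


def bench(data, errors, number=200):
    """Return the best observed time, in seconds, for one decode call."""
    timer = timeit.Timer(lambda: data.decode("utf-8", errors))
    return min(timer.repeat(repeat=3, number=number)) / number


def run():
    results = {}
    # Common case: no decoding error, so the handler should not matter.
    for errors in HANDLERS:
        results[("valid", errors)] = bench(valid, errors)
    # Error case: "strict" is skipped because it raises UnicodeDecodeError.
    for errors in HANDLERS[1:]:
        results[("invalid", errors)] = bench(invalid, errors)
    return results


if __name__ == "__main__":
    for (kind, errors), seconds in run().items():
        print(f"{kind:8} {errors:16} {seconds * 1e6:8.2f} us")
```

If the "valid" timings are the same before and after a change, the fast 
path is unaffected and only the "invalid" rows need a closer look.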

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue24870>
_______________________________________