Michael Urman wrote:
> On Wed, May 6, 2009 at 15:42, "Martin v. Löwis" <[email protected]> wrote:
>> Despite there being also an error handler called "surrogates".
>
> Not that I have to be, but I'm not sold on the previous UTF-8 codec
> behavior becoming an error handler of the name "surrogates" for two
> reasons (I do respect the obvious PBP argument for the implementation,
> and have no better name - "lenient"?).
PBP?
> First, unless there's a way to stack error handlers, there's no way to
> access the old behavior combined with the "replace" handler.
Well, there is a way to stack error handlers, although it's not pretty:
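import codecs

# "surrogates" is the handler name under discussion in this thread.
_surrogates = codecs.lookup_error("surrogates")
_replace = codecs.lookup_error("replace")

def surrogates_then_replace(exc):
    # Try the "surrogates" handler first; fall back to "replace".
    try:
        return _surrogates(exc)
    except UnicodeError:
        return _replace(exc)

codecs.register_error("surrogates_then_replace",
                      surrogates_then_replace)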
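
The same stacking idea can be written generically. The following is only a sketch: the helper name register_chained_handler is made up for illustration, and it assumes the built-in "surrogatepass" handler as a stand-in for the first handler in the chain, since that is a name codecs.lookup_error() is known to accept. Each handler is tried in order, and the next one is consulted only when the previous one raises UnicodeError.

```python
import codecs


def register_chained_handler(name, *handler_names):
    """Register an error handler that tries each named handler in turn.

    Hypothetical helper for illustration; not part of the codecs module.
    """
    if not handler_names:
        raise ValueError("at least one handler name is required")
    handlers = [codecs.lookup_error(h) for h in handler_names]

    def chained(exc):
        # Every handler except the last may decline by raising UnicodeError.
        for handler in handlers[:-1]:
            try:
                return handler(exc)
            except UnicodeError:
                continue
        # The last handler's result (or exception) is final.
        return handlers[-1](exc)

    codecs.register_error(name, chained)
    return chained


# Assumption: "surrogatepass" stands in for the first handler in the chain.
register_chained_handler("surrogatepass_then_replace",
                         "surrogatepass", "replace")

# An encoded lone surrogate is handled by the first handler ...
lenient = b"a\xed\xb2\x80b".decode("utf-8", "surrogatepass_then_replace")
# ... while any other invalid byte falls through to "replace".
replaced = b"a\xffb".decode("utf-8", "surrogatepass_then_replace")
```

The order of the names matters: putting "replace" first would hide every error from the handlers after it, because "replace" never raises.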
> The stacking argument also applies to the new utf8b behavior on encode
> (only, as it handles all errors on decode). This may be a YAGNI
Indeed - in particular, as, in the primary application of this error
handler (i.e. file IO operations), there is no way of specifying
an additional error handler anyway.
Regards,
Martin
_______________________________________________
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev