Re: [Python-Dev] PEP 383 update: utf8b is now the error handler

Martin v. Löwis Wed, 06 May 2009 22:43:58 -0700

Michael Urman wrote:
> On Wed, May 6, 2009 at 15:42, "Martin v. Löwis" <[email protected]> wrote:
>> Despite there being also an error handler called "surrogates".
> 
> Not that I have to be, but I'm not sold on the previous UTF-8 codec
> behavior becoming an error handler of the name "surrogates" for two
> reasons (I do respect the obvious PBP argument for the implementation,
> and have no better name - "lenient"?).


PBP?

> First, unless there's a way to stack error handlers, there's no way to
> access the old behavior combined with the "replace" handler.

Well, there is a way to stack error handlers, although it's not pretty:

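import codecs

# codecs.lookup_error returns the callable registered under a handler name.
_surrogates = codecs.lookup_error("surrogates")
_replace = codecs.lookup_error("replace")

def surrogates_then_replace(exc):
    # Try "surrogates" first; fall back to "replace" if it rejects the error.
    try:
        return _surrogates(exc)
    except UnicodeError:
        return _replace(exc)

codecs.register_error("surrogates_then_replace",
                      surrogates_then_replace)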
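
For illustration, here is a minimal sketch of the same stacking pattern as it might run on a released Python 3. It assumes the handler name "surrogateescape" (the name under which the PEP 383 handler eventually shipped) in place of "surrogates"; the combined handler name "escape_then_replace" is hypothetical and chosen only for this example.

```python
import codecs

# Look up the two built-in handlers that will be chained.
_escape = codecs.lookup_error("surrogateescape")
_replace = codecs.lookup_error("replace")

def escape_then_replace(exc):
    # First let "surrogateescape" try to handle the error; it re-raises
    # the exception when the offending characters are not escaped bytes
    # (U+DC80..U+DCFF). In that case, fall back to "replace".
    try:
        return _escape(exc)
    except UnicodeError:
        return _replace(exc)

# "escape_then_replace" is a made-up name for this sketch.
codecs.register_error("escape_then_replace", escape_then_replace)

# U+DC80 is an escaped byte; U+D800 is a lone surrogate that
# "surrogateescape" cannot handle on encode.
encoded = "a\udc80b\ud800c".encode("utf-8", "escape_then_replace")
print(encoded)
```

The two problem characters are separated by ASCII letters on purpose: the codec reports a run of adjacent unencodable characters as a single error, so the fallback applies to the whole run rather than to each character.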

> The stacking argument also applies to the new utf8b behavior on encode
> (only, as it handles all errors on decode). This may be a YAGNI

Indeed. In particular, in the primary application of this error
handler (i.e. file IO operations), there is no way to specify
an additional error handler anyway.

Regards,
Martin
