[issue11303] b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

Marc-Andre Lemburg Thu, 24 Feb 2011 08:42:05 -0800

Marc-Andre Lemburg <[email protected]> added the comment:

Alexander Belopolsky wrote:
> 
> Alexander Belopolsky <[email protected]> added the comment:
> 
> On Thu, Feb 24, 2011 at 11:01 AM, Marc-Andre Lemburg
> <[email protected]> wrote:
> ..
>> On this ticket, we're discussing just one application area: that
>> of the builtin shortcuts.
>>
> Fair enough.  I was hoping to close this ticket by simply committing
> the posted patch, but it looks like people want to do more.  I don't
> think we'll get measurable performance gains but may improve code
> understandability.
> 
>> To have more encoding name variants benefit from the optimization,
>> we might want to enhance that particular normalization function
>> to avoid having to compare against "utf8" and "utf-8" in the
>> encode/decode functions.
> 
> Which function are you talking about?
> 
> 1. normalize_encoding() in unicodeobject.c
> 2. normalizestring() in codecs.c


The first one, since that's being used by the shortcuts.

> The first is s.lower().replace('-', '_') and the second is

It does this: s.lower().replace('_', '-')

> s.lower().replace(' ', '_'). (Note space vs. dash difference.)
> 
> Why do we need both?  And why should they be different?

Because the first is specifically used for the shortcuts
(which can do more without breaking anything, since it's
only used internally) and the second prepares the encoding
names for lookup in the codec registry (which has a PEP 100
defined behavior we cannot easily change).
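
To make the difference concrete, here is a minimal Python sketch of the two
normalizations, based only on the one-line descriptions given in this thread.
The function names are hypothetical, and the real functions are C code whose
details may differ.

```python
def normalize_for_shortcut(name: str) -> str:
    """Hypothetical stand-in for normalize_encoding() in unicodeobject.c.

    Follows the description in this thread: lower-case the name and
    turn underscores into dashes.
    """
    return name.lower().replace('_', '-')


def normalize_for_registry(name: str) -> str:
    """Hypothetical stand-in for normalizestring() in codecs.c.

    Follows the description in this thread: lower-case the name and
    turn spaces into underscores.
    """
    return name.lower().replace(' ', '_')


if __name__ == "__main__":
    # Compare both normalizations on a few spelling variants.
    for name in ("UTF_8", "Latin-1", "latin1", "ISO 8859-1"):
        print(repr(name),
              repr(normalize_for_shortcut(name)),
              repr(normalize_for_registry(name)))
```

Under these assumptions, 'latin1' is left unchanged by the first
normalization, so it does not compare equal to 'latin-1' and would miss a
shortcut that checks only for the dashed spelling.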

----------
title: b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')
-> b'x'.decode('latin1') is much slower than b'x'.decode('latin-1')

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue11303>
_______________________________________
