New submission from Josh Rosenberg:
String translation functions, at least in other languages, are typically a
performance optimization for when you *really* need the speed. In exchange for
the memory cost and the one-time work of building a translation table, you get
much faster 1-to-1 character mapping and character deletion.
In Python 3, str.translate is usually a performance pessimization, not an
optimization. It does do a few things similar functions (e.g. str.replace)
can't: it can perform a translation where the sources and targets overlap
('abc' -> 'bca' can't be done with chained str.replace calls), and it handles
the full Unicode range ('\u0f01\u0f02' -> '\u0f03\u0f04' can't be done by
encoding, using bytes.translate, then decoding again). But whenever another
approach is available, str.translate is almost always the slowest.
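To spell out those two cases, a quick illustration (not code from the attached
benchmark):

# Overlapping sources and targets: chained str.replace corrupts the result.
table = str.maketrans('abc', 'bca')
assert 'abc'.translate(table) == 'bca'
assert 'abc'.replace('a', 'b').replace('b', 'c').replace('c', 'a') == 'aaa'

# Code points above 255: the encode -> bytes.translate -> decode route is
# unavailable, since bytes.translate only handles byte values 0-255.
table = str.maketrans('\u0f01\u0f02', '\u0f03\u0f04')
assert '\u0f01\u0f02'.translate(table) == '\u0f03\u0f04'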
While it may not be practical to make the general case for str.translate match
the performance of bytes.translate, it should at least be possible to improve
upon it in many common cases.
My timing data (see attached file for how it was produced):
$ python.exe translate_timing.py
Testing 1-1 translation
bytes.translate 0.05638575777521804
str.translate 2.9639258423152874
str.translate from bytes trans 1.6307581351818357
str.encode->bytes.translate->bytes.decode 0.09472254504689914
str.replace 0.2507745649580393
Testing deletion
bytes.translate 0.05367317344084199
str.translate 2.360142494240577
str.encode->bytes.translate->bytes.decode 0.0629345277274993
str.replace 0.31173443413690904
Testing enlarging translations
str.translate 2.33631417640283
str.replace 0.41248187325160757
Obviously, this is somewhat contrived, but it illustrates my point that the
common cases (ASCII or latin-1 strings) are far too slow. In every test,
str.translate loses to every competitor, including itself when passed a
translation table produced by bytes.maketrans, and including the case where a
pure Python function calls str.replace over and over (see the sketch below).
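For anyone reading without the attached file, the comparison looks roughly
like the sketch below. The test string, table contents, and iteration count
are my own illustrative choices, not the ones that produced the numbers above,
and replace_all is my paraphrase of the "str.replace over and over" competitor:

import timeit

def replace_all(s, table):
    # Apply a {from_char: to_char} mapping via repeated str.replace.
    # Only safe when the sources and targets don't overlap.
    for frm, to in table.items():
        s = s.replace(frm, to)
    return s

setup = """
from __main__ import replace_all
import string
s = string.ascii_letters * 1000                    # ASCII-only test string
b = s.encode('ascii')
str_table = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)
bytes_table = bytes.maketrans(string.ascii_lowercase.encode(),
                              string.ascii_uppercase.encode())
replace_table = dict(zip(string.ascii_lowercase, string.ascii_uppercase))
"""

for label, stmt in [
    ('bytes.translate', 'b.translate(bytes_table)'),
    ('str.translate', 's.translate(str_table)'),
    ('str.translate from bytes trans', 's.translate(bytes_table)'),
    ('str.encode->bytes.translate->bytes.decode',
     "s.encode('ascii').translate(bytes_table).decode('ascii')"),
    ('str.replace', 'replace_all(s, replace_table)'),
]:
    print(label, timeit.timeit(stmt, setup, number=1000))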
A large part of the problem is clearly the use of a dict to perform lookups; as
seen in the 1-1 test case, str.translate using the result of str.maketrans
takes nearly twice as long as when it uses the result of a similar
bytes.maketrans. While I have not profiled the code, I suspect the remainder is
a combination of the overhead of converting each Py_UCS4 code point to a PyLong
for the general-purpose lookup, the handling of the multiple ways a translation
target can be expressed (a Unicode ordinal, a str object, or None), and the
code that checks for and resizes the output buffer on each character.
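To make that comparison concrete, the two table forms look like this (an
illustration; the comments reflect my reading of the numbers above, not
profiler output):

import string

# str.maketrans returns a plain dict mapping int ordinal -> int ordinal
# (or -> str / None), so every character looked up during translation
# goes through dict hashing and lookup.
str_table = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)
# e.g. {97: 65, 98: 66, ..., 122: 90}

# bytes.maketrans returns a flat 256-byte table; indexing it skips the dict
# machinery (though it only covers ordinals 0-255, and the code point is
# still boxed as a PyLong for the __getitem__ call).
bytes_table = bytes.maketrans(string.ascii_lowercase.encode(),
                              string.ascii_uppercase.encode())

assert 'abc'.translate(str_table) == 'abc'.translate(bytes_table) == 'ABC'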
I have ideas for a solution involving a special-purpose translation object
produced by str.maketrans for cases where the translation can only yield an
equal-length or shorter string (every target value is None or a str of length
1 or less) and every source ordinal falls in a small enough range (say, below
1024) to avoid excessive memory usage from large translation table objects.
But I'd like to know whether such a change (including modifying str.maketrans
to return an unspecified mapping object instead of the current dict) is
desired, acceptable, what have you.
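To sketch what I mean, here is a pure-Python mock-up of the idea only; a real
implementation would be a C-level type returned by str.maketrans, and the name
and internal encoding here are invented:

class CompactTranslationTable:
    # Hypothetical flat-array table for shrinking/1-1 translations.
    # Covers only the restricted case described above: every target is None
    # or a str of length <= 1, and every source ordinal is below a small bound.
    MAX_ORDINAL = 1024

    def __init__(self, mapping):
        # -1 means "not translated", -2 means "delete this character"
        self._table = [-1] * self.MAX_ORDINAL
        for src, dst in mapping.items():
            if dst is None or dst == '':
                self._table[src] = -2
            else:
                self._table[src] = ord(dst) if isinstance(dst, str) else dst

    def __getitem__(self, ordinal):
        # str.translate leaves a character alone when the table raises LookupError
        if ordinal >= self.MAX_ORDINAL:
            raise LookupError(ordinal)
        value = self._table[ordinal]
        if value == -1:
            raise LookupError(ordinal)
        return None if value == -2 else value

# Behaves like the dict str.maketrans currently returns:
table = CompactTranslationTable(str.maketrans('abc', 'xyz', 'q'))
assert 'aqbqc'.translate(table) == 'xyz'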
----------
components: Unicode
files: translate_timing.py
messages: 215283
nosy: ezio.melotti, haypo, josh.rosenberg
priority: normal
severity: normal
status: open
title: str.translate is absurdly slow in majority of use cases (takes 3x to 10x
longer than similar functions)
type: performance
versions: Python 3.4, Python 3.5
Added file: http://bugs.python.org/file34688/translate_timing.py
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue21118>
_______________________________________