STINNER Victor <[email protected]> added the comment:
> Fill the ascii buffer and then copying can be cheaper than using
> _PyUnicodeWriter with general non-ascii string.
Here is a new patch using _PyUnicodeWriter directly in longobject.c.
According to my benchmark (see below), formating a small number (5 decimal
digits) is 17% faster with my patch version 2 compared to tip, and 38% faster
compared to Python 3.3 before my optimizations on str%tuples or str.format().
Creating a temporary PyUnicode is not cheap, at least for short strings.
str%tuple and str.format() allocates len(format_string)+100 ASCII characters at
the beginning, which is enough for "x={}".format(12345) for example. So only a
resize is needed, and it looks like resizing is cheap.
I'm not completly satisfied of the usage of Py_LOCAL_INLINE in unicodeobject.c
for _PyUnicodeWriter methods. The same "hacks" (?) should be used in
formatter_unicode.c.
Shell script (bench.sh) used to benchmark:
--------
echo -n "{0}.{1}.{2}: "; ./python -m timeit -r 10 -s 'fmt="{0}.{1}.{2}"'
'fmt.format("http", "client", "HTTPConnection")'
echo -n " [line {0:2d}] : "; ./python -m timeit -r 10 -s 'fmt=" [line {0:2d}]
"' 'fmt.format(5)'
echo -n "str: "; ./python -m timeit -r 10 -s 'fmt="{0}"*100'
'fmt.format("ABCDEF")'
echo -n "str conv: "; ./python -m timeit -r 10 -s 'fmt="{0:s}"*100'
'fmt.format("ABCDEF")'
echo -n "long x 3: "; ./python -m timeit -r 10 -s 'fmt="x={0} x={0} x={0}"'
'fmt.format(12345)'
echo -n "float x 3: "; ./python -m timeit -r 10 -s 'fmt="x={0} x={0} x={0}"'
'fmt.format(12.345)'
echo -n "complex x 3: "; ./python -m timeit -r 10 -s 'fmt="x={0} x={0} x={0}"'
'fmt.format(12.345+2j)'
echo -n "long, float, complex: "; ./python -m timeit -r 10 -s 'fmt="x={} y={}
z={}"' 'fmt.format(12345, 12.345, 12.345+2j)'
echo -n "huge long: "; ./python -m timeit -r 10 -s 'import math;
huge=math.factorial(2000); fmt="x={}"' 'fmt.format(huge)'
--------
Results:
--------
3.3:
{0}.{1}.{2}: 1000000 loops, best of 10: 0.394 usec per loop
[line {0:2d}] : 1000000 loops, best of 10: 0.519 usec per loop
str: 100000 loops, best of 10: 7.01 usec per loop
str conv: 100000 loops, best of 10: 13.3 usec per loop
long x 3: 1000000 loops, best of 10: 0.569 usec per loop
float x 3: 1000000 loops, best of 10: 1.62 usec per loop
complex x 3: 100000 loops, best of 10: 3.34 usec per loop
long, float, complex: 100000 loops, best of 10: 2.08 usec per loop
huge long: 1000 loops, best of 10: 666 usec per loop
3.3 + format_writer.patch :
{0}.{1}.{2}: 1000000 loops, best of 10: 0.412 usec per loop (+5%)
[line {0:2d}] : 1000000 loops, best of 10: 0.461 usec per loop (-11%)
str: 100000 loops, best of 10: 6.85 usec per loop (-2%)
str conv: 100000 loops, best of 10: 11.1 usec per loop (-17%)
long x 3: 1000000 loops, best of 10: 0.605 usec per loop (+6%)
float x 3: 1000000 loops, best of 10: 1.57 usec per loop (-3%)
complex x 3: 100000 loops, best of 10: 3.54 usec per loop (+6%)
long, float, complex: 100000 loops, best of 10: 2.19 usec per loop (+5%)
huge long: 1000 loops, best of 10: 665 usec per loop (0%)
3.3 + format_writer-2.patch :
{0}.{1}.{2}: 1000000 loops, best of 10: 0.378 usec per loop (-4%)
[line {0:2d}] : 1000000 loops, best of 10: 0.454 usec per loop (-13%)
str: 100000 loops, best of 10: 6.18 usec per loop (-12%)
str conv: 100000 loops, best of 10: 10.9 usec per loop (-18%)
long x 3: 1000000 loops, best of 10: 0.471 usec per loop (-17%)
float x 3: 1000000 loops, best of 10: 1.37 usec per loop (-15%)
complex x 3: 100000 loops, best of 10: 3.4 usec per loop (+2%)
long, float, complex: 1000000 loops, best of 10: 1.93 usec per loop (-7%)
huge long: 1000 loops, best of 10: 665 usec per loop (0%)
--------
----------
Added file: http://bugs.python.org/file25502/format_writer-2.patch
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue14744>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe:
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com