New submission from Eric V. Smith:

Currently, the f-string f'a{3!r:10}' compiles to bytecode that does the same 
thing as:

''.join(['a', format(repr(3), '10')])

That is, it literally calls the functions format() and repr(). The same holds 
true for str() and ascii() with !s and !a, respectively.
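
As a rough illustration of that equivalence (a sketch of the behaviour, not the generated bytecode itself), the f-string can be compared against the hand-written expansion. The helper name expand_by_hand is made up for this example.

```python
def expand_by_hand(value):
    # Hand-written equivalent of f'a{value!r:10}': repr() for the !r
    # conversion, format() with the spec '10', then join the pieces.
    return ''.join(['a', format(repr(value), '10')])

value = 3
from_fstring = f'a{value!r:10}'
from_expansion = expand_by_hand(value)

print(repr(from_fstring))
print(from_fstring == from_expansion)
```

Both forms produce the same string: 'a' followed by '3' padded on the right to a width of 10.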

By redefining format, str, repr, and ascii, you can break or pervert the 
computation of the f-string's value:

>>> def format(v, fmt=None): return '42'
...
>>> f'{3}'
'42'

It's always been my intention to fix this. This patch adds an opcode 
FORMAT_VALUE, which instead of looking up format, etc., directly calls 
PyObject_Format, PyObject_Str, PyObject_Repr, and PyObject_ASCII. Thus, you can 
no longer modify what an f-string produces merely by overriding the named 
functions.
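
One way to check the intended behaviour is sketched below. It assumes an interpreter that already includes this change (CPython 3.6 or later is assumed here). The function name render_with_shadowed_builtins and the private namespace are invented for the example.

```python
def render_with_shadowed_builtins():
    # Shadow all four names in a private global namespace, then evaluate
    # several f-strings inside it. If the f-string machinery looked these
    # names up, every result would be '42'.
    namespace = {
        'format': lambda v, fmt=None: '42',
        'str': lambda v: '42',
        'repr': lambda v: '42',
        'ascii': lambda v: '42',
    }
    source = "[f'{3}', f'{3!s}', f'{3!r}', f'{3!a}', f'a{3!r:10}']"
    return eval(source, namespace)

results = render_with_shadowed_builtins()
print(results)
```

With the opcode-based implementation none of the results should be '42', because the shadowed names are never consulted.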


In addition, because I'm now saving the name lookups and function calls, 
performance is improved.

Here are the times without this patch:

$ ./python -m timeit -s 'x="test"' 'f"{x}"'
1000000 loops, best of 3: 0.3 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x!s}"'
1000000 loops, best of 3: 0.511 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x!r}"'
1000000 loops, best of 3: 0.497 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x!a}"'
1000000 loops, best of 3: 0.461 usec per loop


And with this patch:

$ ./python -m timeit -s 'x="test"' 'f"{x}"'
10000000 loops, best of 3: 0.02 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x!s}"'
100000000 loops, best of 3: 0.02 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x!r}"'
10000000 loops, best of 3: 0.0896 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x!a}"'
10000000 loops, best of 3: 0.0923 usec per loop


So the time per loop drops by roughly 80% to 96% for these simple cases.

Also, now f-strings are faster than %-formatting, at least for some types:

$ ./python -m timeit -s 'x="test"' '"%s"%x'
10000000 loops, best of 3: 0.0755 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x}"'
10000000 loops, best of 3: 0.02 usec per loop
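
The same comparison can be run from inside Python with the timeit module. This is only a sketch: the helper time_statement and the loop count are choices made for this example, and the absolute numbers will differ by machine and build.

```python
import timeit

def time_statement(stmt, setup='x = "test"', number=100_000, repeat=3):
    # Best-of-N time per loop in microseconds, similar in spirit to
    # what `python -m timeit` reports.
    best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))
    return best / number * 1e6

statements = ('"%s" % x', 'f"{x}"')
timings = {stmt: time_statement(stmt) for stmt in statements}

for stmt, usec in timings.items():
    print(f'{stmt:12} {usec:.4f} usec per loop')
```

Using a variable set up in the setup string keeps the operand from being a compile-time constant in either statement.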


Note that people often "benchmark" %-formatting with code like the following. 
But the optimizer converts this to a constant string, so it's not a fair 
comparison:

$ ./python -m timeit '"%s"%"test"'
100000000 loops, best of 3: 0.0161 usec per loop


These microbenchmarks aren't the end of the story, since the string 
concatenation also takes some time. That's another optimization I might 
implement in the future.

Thanks to Mark and Larry for some advice on this.

----------
assignee: eric.smith
components: Interpreter Core
files: format-opcode.diff
keywords: patch
messages: 253476
nosy: Mark.Shannon, eric.smith, larry
priority: normal
severity: normal
stage: patch review
status: open
title: Improve f-string implementation: FORMAT_VALUE opcode
type: enhancement
versions: Python 3.6
Added file: http://bugs.python.org/file40863/format-opcode.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue25483>
_______________________________________