Hello everyone,

the docs on the upcoming 3.11 release state:

> This [specializing adaptive interpreter] also brings in another concept called inline caching, where Python caches the results of expensive operations directly in the bytecode.

I wonder how this caching works, given that Python's dynamic nature means that virtually every operation could have side effects, which would make cached results wrong. The only mitigation I can imagine is that caching occurs only for basic operations defined in the standard library that are known to be free of side effects, i.e. "pure".
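(For what it's worth, the reserved inline cache slots are at least visible from Python via `dis`; a quick way to look, guarded so it also runs on versions before 3.11 where the `show_caches` parameter does not exist:)

```python
import dis
import io
import sys

def add(a, b):
    return a + b

# Render the bytecode to a string. On 3.11+ the specializing
# interpreter reserves inline CACHE entries directly after
# instructions such as BINARY_OP; that is where specialization
# state is stored.
buf = io.StringIO()
if sys.version_info >= (3, 11):
    dis.dis(add, file=buf, show_caches=True)
else:
    dis.dis(add, file=buf)
print(buf.getvalue())
```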

A web search did turn up a discussion [1] and a module [2] dealing with pure functions, but, as far as I can see, neither is related to optimization.

As an example, consider a code like this:

```py
import cmath
import math
from typing import List, Tuple

@pure
def rot_factor(angle_deg: float) -> complex:
    # This could also be a much more expensive calculation.
    return cmath.exp(angle_deg / 180 * cmath.pi * 1j)

# ...

res: List[Tuple[complex, complex, complex, float]] = []
for x in many:
    res.append((
        x * rot_factor(90),
        x * rot_factor(45),
        x * rot_factor(-45),
        x * math.sin(math.pi / 8),
    ))
```

The problem with this code is obvious: every loop iteration calls `rot_factor()` with 90, 45 and -45 and gets back exactly the same set of results. The last factor might already be inline-cached by the interpreter, if it knows that `math.pi` is a constant and `math.sin()` is a pure function. Optimizing this by hand (leaving aside a list comprehension or other more sophisticated improvements) is easy, but not very pretty:

```py
f_p90 = rot_factor(90)
f_p45 = rot_factor(45)
f_m45 = rot_factor(-45)
f_sin = math.sin(math.pi / 8)
res: List[Tuple[complex, complex, complex, float]] = []
for x in many:
    res.append((
        x * f_p90,
        x * f_p45,
        x * f_m45,
        x * f_sin,
    ))
```

I actually find myself often factoring such data out of loops in Python, whereas in C I would just leave that to the optimizer/compiler.

An available option would be to use `@lru_cache` on `rot_factor()`, but that still incurs the same dictionary lookup in every iteration, and it does not work at all if the function's argument(s) are not hashable.
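For illustration, the `@lru_cache` variant (this is the real `functools` API) would look like this; it avoids recomputation, but every call still pays for hashing the arguments and looking them up in the cache dict:

```python
import cmath
from functools import lru_cache

@lru_cache(maxsize=None)
def rot_factor(angle_deg: float) -> complex:
    # Computed once per distinct argument, then served from the cache.
    return cmath.exp(angle_deg / 180 * cmath.pi * 1j)

a = rot_factor(90)  # miss: computes the value
b = rot_factor(90)  # hit: returns the same cached object
print(rot_factor.cache_info())  # CacheInfo(hits=1, misses=1, ...)
```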

Now, if the interpreter understood the `@pure` decorator indicated above, `rot_factor()` would give it the same opportunity to cache the three results for the duration of the loop, basically producing the manually optimized code above. For completely static arguments like these, it could even precompute the results and embed them in the bytecode.
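To make the intended semantics concrete: `@pure` as I imagine it is only a promise that the function is deterministic and side-effect free. A user-level stand-in can do no better than memoize (it cannot hoist or fold call sites, which is exactly what only the interpreter could do); a minimal sketch of such a hypothetical decorator, positional-only and falling back on unhashable arguments:

```python
import functools

def pure(func):
    """Hypothetical marker: func promises to be deterministic and
    side-effect free, so its results may be cached or precomputed."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        # Memoization is the best a user-level version can do; the
        # proposal is for the interpreter itself to exploit purity.
        try:
            return cache[args]
        except KeyError:
            result = cache[args] = func(*args)
            return result
        except TypeError:
            # Unhashable argument: just call through, uncached.
            return func(*args)

    return wrapper

calls = []

@pure
def double(x):
    calls.append(x)  # side effect only to observe caching here
    return x * 2
```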


Has anything like this been considered already, or is the interpreter itself capable of performing such optimizations?


Thanks and best regards,
Philipp



[1] 'pure' type annotation for mypy: https://github.com/python/mypy/issues/4468

[2] pure-func module: https://pypi.org/project/pure-func/
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XDYZRC6L7LBPE3T6RO6S5IVY3J6IMRSJ/
Code of Conduct: http://python.org/psf/codeofconduct/