[issue30509] Optimize calling type slots

2019-05-17 Thread Cheryl Sabella
Change by Cheryl Sabella: versions: +Python 3.8 -Python 3.7

[issue30509] Optimize calling type slots

2017-06-13 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Without Py_LOCAL_INLINE all microbenchmarks become about 20% slower. I'm not sure that all these changes are needed. Maybe the same effect can be achieved with smaller changes, but I have tried and so far failed to reach the same performance with a smaller patch.

[issue30509] Optimize calling type slots

2017-06-13 Thread STINNER Victor
STINNER Victor added the comment: I'm not sure about adding Py_LOCAL_INLINE() (static inline). I'm not sure that it's needed when you use PGO compilation. Would it be possible to run your benchmark again without the added Py_LOCAL_INLINE(), please? It's hard to say no to a change makes so many

[issue30509] Optimize calling type slots

2017-06-13 Thread Serhiy Storchaka
Serhiy Storchaka added the comment:
$ ./python -m perf timeit -q --compare-to=./python-orig -s 'class A:' -s ' def __add__(s, o): return s' -s 'a = A(); b = A()' --duplicate=100 'a.__add__(b)'
Mean +- std dev: [python-orig] 229 ns +- 9 ns -> [python] 235 ns +- 13 ns: 1.02x slower (+2%)
$

[issue30509] Optimize calling type slots

2017-06-02 Thread Terry J. Reedy
Terry J. Reedy added the comment: I believe Rietveld does not work with git-format patches. I don't know if git can produce the format hg did. -- nosy: +terry.reedy

[issue30509] Optimize calling type slots

2017-05-31 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: New changeset 4e624ca50a665d7e4d527ab98932347ff43a19b0 by Serhiy Storchaka in branch 'master': bpo-30509: Clean up calling type slots. (#1883) https://github.com/python/cpython/commit/4e624ca50a665d7e4d527ab98932347ff43a19b0

[issue30509] Optimize calling type slots

2017-05-31 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Thank you, I know about this, but it takes twice as long, so I don't use it regularly. And it doesn't allow comparing three versions. :-(

[issue30509] Optimize calling type slots

2017-05-31 Thread STINNER Victor
STINNER Victor added the comment: FYI you can use "./python -m perf timeit --compare-to=./python-ref" if you keep the "reference" (unpatched) Python, so that perf computes the "?.??x slower/faster" factor for you ;-)

[issue30509] Optimize calling type slots

2017-05-31 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: Removed message: http://bugs.python.org/msg294837

[issue30509] Optimize calling type slots

2017-05-31 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Sorry, wrong data. PR 1883 makes indexing 1.2 times faster, PR 1861 makes it 1.7 times faster.
$ ./python -m perf timeit -s 'class A:' -s ' def __getitem__(s, i): return t[i]' -s 'a = A(); t = tuple(range(1000))' --duplicate 100 'list(a)'
Unpatched: Mean
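
For readers unfamiliar with why `list(a)` serves as an indexing benchmark here: a class that defines only `__getitem__` is iterated through the legacy sequence protocol, so every element costs one `__getitem__` slot call. The following is a minimal Python sketch of that behaviour, not part of the original message; the `calls` counter is added purely for illustration.

```python
# Data taken from the benchmark setup: a tuple of 1000 integers.
t = tuple(range(1000))

class A:
    calls = 0  # illustrative counter, not present in the original benchmark

    def __getitem__(self, i):
        A.calls += 1
        return t[i]  # raises IndexError once i reaches len(t)

a = A()

# A defines no __iter__, so list() falls back to calling a[0], a[1], ...
# until __getitem__ raises IndexError.
result = list(a)

print(len(result), A.calls)
```

The counter ends up one higher than the number of elements, because the final call is the one that raises `IndexError` and stops the iteration.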

[issue30509] Optimize calling type slots

2017-05-31 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: PR 1883 doesn't seem to affect indexing, PR 1861 makes it 1.7 times faster.
$ ./python -m perf timeit -s 'class A:' -s ' def __getitem__(s, i): return t[i]' -s 'a = A(); t = tuple(range(1000))' --duplicate 100 'list(a)'
Unpatched: Mean +- std dev: 498 us +-

[issue30509] Optimize calling type slots

2017-05-31 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: PR 1883 cleans up the code related to calling type slots.
* Use _PyObject_LookupSpecial() instead of lookup_maybe() in set_names(). This isn't performance critical.
* Inline lookup_maybe() in _PyObject_LookupSpecial(). This was the only remaining use of
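
As background for the lookup helpers named above: special methods invoked implicitly (for example by the `+` operator) are looked up on the type, not on the instance, whereas an explicit `a.__add__(b)` goes through ordinary attribute access. The sketch below shows only this Python-level behaviour; it makes no claim about how the C functions in the PR are implemented, and the return strings are made up for illustration.

```python
class A:
    def __add__(self, other):
        return "type-level __add__"

a, b = A(), A()

# Shadow __add__ in the instance dictionary only (illustrative).
a.__add__ = lambda other: "instance-level __add__"

# The operator looks __add__ up on type(a); the instance attribute is ignored.
operator_result = a + b

# Ordinary attribute access finds the instance attribute first.
attribute_result = a.__add__(b)

print(operator_result)
print(attribute_result)
```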

[issue30509] Optimize calling type slots

2017-05-31 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: pull_requests: +1959

[issue30509] Optimize calling type slots

2017-05-30 Thread STINNER Victor
STINNER Victor added the comment: The PR makes several different changes:
* replace lookup_method() with lookup_maybe_method()
* specialize call_xxx() functions for a fixed number of parameters
* rename lookup_maybe() to _PyObject_LookupSpecial()
If possible, I would prefer to not have to duplicate

[issue30509] Optimize calling type slots

2017-05-30 Thread STINNER Victor
STINNER Victor added the comment:
> I provided just a patch because I expected that you perhaps will want to play
> with it and propose alternative patch. It is simpler to compare patches with
> Rietveld than on GitHub.
It seems like Rietveld is broken: there is no [Review] button on your

[issue30509] Optimize calling type slots

2017-05-30 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: pull_requests: +1944

[issue30509] Optimize calling type slots

2017-05-30 Thread Serhiy Storchaka
Serhiy Storchaka added the comment:
> type-slot-calls.diff: Can you please create a pull request?
I provided just a patch because I expected that you might want to play with it and propose an alternative patch. It is simpler to compare patches with Rietveld than on GitHub. But if you

[issue30509] Optimize calling type slots

2017-05-30 Thread STINNER Victor
STINNER Victor added the comment: type-slot-calls.diff: Can you please create a pull request?
> `a + b` still is 25-30% slower than `a.__add__(b)`
Hum, can you please post microbenchmark results to see the effect of the patch?
> After analyzing the article and comparing it with the current

[issue30509] Optimize calling type slots

2017-05-30 Thread STINNER Victor
STINNER Victor added the comment:
> I have other patch that makes `a + b` yet tens percents faster, faster than
> `a.__add__(b)`, by adding functions and declarations that are never used.
> Confusing.
It seems like you are a victim of the "deadcode" issue related to code locality:

[issue30509] Optimize calling type slots

2017-05-30 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: Added file: http://bugs.python.org/file46913/type-slot-calls.diff

[issue30509] Optimize calling type slots

2017-05-30 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: Removed file: http://bugs.python.org/file46912/type-slot-calls.diff

[issue30509] Optimize calling type slots

2017-05-30 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I have another patch that makes `a + b` tens of percent faster still, faster even than `a.__add__(b)`, by adding functions and declarations that are never used. Confusing.

[issue30509] Optimize calling type slots

2017-05-30 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Although `a + b` is still slower than `a.__add__(b)` in 3.7, it is more than 2 times faster than in 2.7 and 3.5.

[issue30509] Optimize calling type slots

2017-05-30 Thread Serhiy Storchaka
New submission from Serhiy Storchaka: In his excellent article "Why are slots so slow?" [1], Peter Cawley analysed why `a + b` is slower than `a.__add__(b)` for a custom __add__ and provided suggestions for optimizing type slot calls. `a + b` and `a.__add__(b)` execute the same user code,
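
The comparison described in this submission can be tried with a small script. The following is a rough sketch using the standard-library `timeit` module rather than the `perf` module used elsewhere in this thread; the helper name `best_ns` and the repeat counts are arbitrary choices, and the absolute numbers depend on the interpreter version and machine.

```python
import timeit

class A:
    def __add__(self, other):
        return self

a, b = A(), A()

# Both spellings run the same user-defined __add__ and return the same object.
assert (a + b) is a
assert a.__add__(b) is a

def best_ns(stmt, number=200_000, repeat=5):
    """Return the best per-call time in nanoseconds over several repeats."""
    times = timeit.repeat(stmt, globals={"a": a, "b": b},
                          number=number, repeat=repeat)
    return min(times) / number * 1e9

operator_ns = best_ns("a + b")
method_ns = best_ns("a.__add__(b)")

print(f"a + b:        {operator_ns:.1f} ns")
print(f"a.__add__(b): {method_ns:.1f} ns")
```

Taking the minimum over repeats reduces noise from other processes, but it is not a substitute for the statistics that `perf` reports.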