[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2017-05-17 Thread Jesús Cea Avión
Changes by Jesús Cea Avión: -- nosy: +jcea

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2017-02-01 Thread STINNER Victor
STINNER Victor added the comment: Victor: "FYI I wrote an article about this issue: https://haypo.github.io/analysis-python-performance-issue.html Sadly, it seems like I was just lucky when adding __attribute__((hot)) fixed the issue, because call_method is slow again!" I upgraded

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-22 Thread STINNER Victor
STINNER Victor added the comment: > But I failed to reproduce it. Hey, performance issues with code placement are a mysterious secret :-) Nobody understands them :-D The server running the benchmark has an Intel Xeon CPU from 2011. It seems like code placement issues are more important on this CPU

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-22 Thread INADA Naoki
INADA Naoki added the comment: I set up Ubuntu 14.04 on Azure and built Python with neither PGO nor LTO. But I failed to reproduce it. @haypo, would you give me two binaries? $ ~/local/py-2a143/bin/python3 -c 'import sys; print(sys.version)' 3.7.0a0 (default:2a14385710dc, Nov 22 2016, 12:02:34)

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-22 Thread STINNER Victor
STINNER Victor added the comment: Naoki: "Wow. It's sad that tagged version is accidentally slow..." If you use PGO compilation, for example with "./configure --enable-optimizations" (which configure itself suggests when the option is not enabled), you don't get the issue. I hope that most Linux

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-22 Thread STINNER Victor
STINNER Victor added the comment: 2016-11-22 12:07 GMT+01:00 INADA Naoki: > I want to reproduce it and check `perf record -e L1-icache-load-misses`. > But IaaS (EC2, GCE, Azure VM) doesn't support CPU performance counters. You don't need to go that far to check

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-22 Thread INADA Naoki
INADA Naoki added the comment: Wow. It's sad that the tagged version is accidentally slow... I want to reproduce it and check `perf record -e L1-icache-load-misses`. But IaaS (EC2, GCE, Azure VM) doesn't support CPU performance counters.

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-22 Thread STINNER Victor
STINNER Victor added the comment: FYI I wrote an article about this issue: https://haypo.github.io/analysis-python-performance-issue.html Sadly, it seems like I was just lucky when adding __attribute__((hot)) fixed the issue, because call_method is slow again! * acde821520fc (Nov 21): 16.3 ms

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-15 Thread STINNER Victor
STINNER Victor added the comment: Serhiy Storchaka: >> * json: scanstring_unicode() > > This doesn't look wise. This is specific to a single extension module and > perhaps to a single particular benchmark. Most Python code doesn't use json at > all. Well, I tried different things to make these

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-15 Thread STINNER Victor
STINNER Victor added the comment: > New changeset cfc956f13ce2 by Victor Stinner in branch 'default': > Issue #28618: Mark dict lookup functions as hot > https://hg.python.org/cpython/rev/cfc956f13ce2 Here are benchmark results on the speed-python server: haypo@speed-python$ PYTHONPATH=~/perf

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-15 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: > * json: scanstring_unicode() This doesn't look wise. This is specific to a single extension module and perhaps to a single particular benchmark. Most Python code doesn't use json at all. What is at the top of "perf report"? How does this list intersect with the list

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-15 Thread STINNER Victor
STINNER Victor added the comment: I wrote hot3.patch when trying to make the following benchmarks more reliable: - logging_silent: rev 8ebaa546a033 is 20% slower than the average in 2016 - json_loads: rev 0bd618fe0639 is 30% slower and rev 8ebaa546a033 is 15% slower than the average in 2016 -

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-15 Thread STINNER Victor
STINNER Victor added the comment: hot3.patch: Mark additional functions as hot * PyNumber_AsSsize_t() * _PyUnicode_FromUCS1() * json: scanstring_unicode() * siphash24() * sre_ucs1_match, sre_ucs2_match, sre_ucs4_match I'm not sure about this patch. It's hard to get reliable benchmark results

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-15 Thread STINNER Victor
STINNER Victor added the comment: > How about marking lookdict_unicode and lookdict_unicode_nodummy as hot? Ok, your benchmark results don't look bad, so I marked the following functions as hot: - lookdict - lookdict_unicode - lookdict_unicode_nodummy - lookdict_split It's common to see

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-15 Thread Roundup Robot
Roundup Robot added the comment: New changeset cfc956f13ce2 by Victor Stinner in branch 'default': Issue #28618: Mark dict lookup functions as hot https://hg.python.org/cpython/rev/cfc956f13ce2

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-15 Thread INADA Naoki
INADA Naoki added the comment: > so I suggest to run benchmarks and check that it has a non negligible effect > on benchmarks ;-) When I added _Py_HOT_FUNCTION to lookdict_unicode, lookdict_unicode_nodummy and lookdict_split (I can't measure L1 misses via `perf stat -d` because I use EC2 for

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-15 Thread INADA Naoki
INADA Naoki added the comment: > I don't understand well the effect of the hot attribute I compared the lookdict_unicode_nodummy assembly using `objdump -d dictobject.o`. It looks exactly the same, so I think the only difference is placement. Hot functions are placed in the .text.hot section and the linker groups hot
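To make the placement point above concrete, here is a minimal sketch of how a function can be tagged with the attribute. It assumes GCC or a compatible compiler; HOT_FUNCTION and find_key() are hypothetical names made up for this illustration and are not CPython's macro or dict lookup code. According to the GCC documentation, the attribute asks for more aggressive optimization and, on many targets, placement in a dedicated subsection of the text section so that hot functions end up close together.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro (an assumption for this sketch, not CPython's
   definition): expands to nothing when __GNUC__ is not defined. */
#if defined(__GNUC__)
#  define HOT_FUNCTION __attribute__((hot))
#else
#  define HOT_FUNCTION
#endif

/* Hypothetical hot function: linear search for a key in a small table.
   Returns the index of the first match, or -1 when the key is absent. */
HOT_FUNCTION ptrdiff_t
find_key(const int *keys, size_t n, int key)
{
    assert(keys != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (keys[i] == key) {
            return (ptrdiff_t)i;
        }
    }
    return -1;
}
```

The attribute should not change what the function computes; as noted in the comment above, the visible difference may be limited to where the code lands in the binary, which can be checked by comparing `objdump -d` output with and without the macro.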

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-14 Thread STINNER Victor
STINNER Victor added the comment: INADA Naoki added the comment: > How about marking lookdict_unicode and lookdict_unicode_nodummy as hot? I don't fully understand the effect of the hot attribute, so I suggest running benchmarks to check that it has a non-negligible effect on benchmarks ;-)

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-14 Thread INADA Naoki
INADA Naoki added the comment: How about marking lookdict_unicode and lookdict_unicode_nodummy as hot? -- nosy: +inada.naoki

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-12 Thread STINNER Victor
STINNER Victor added the comment: > Can we commit this to 3.6 too? I worked on patches to try to optimize json_loads and regex_effbot as well, but it's still unclear to me how the hot attribute works, and I'm not 100% sure that using the attribute explicitly does not introduce a performance

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-12 Thread Yury Selivanov
Yury Selivanov added the comment: Can we commit this to 3.6 too? -- nosy: +yselivanov

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-11 Thread STINNER Victor
STINNER Victor added the comment: > - scimark_sparse_mat_mult: 8.71 ms +- 0.19 ms -> 9.28 ms +- 0.12 ms: 1.07x > slower Same issue on this benchmark: * average on one year: 8.8 ms * peak at rev 59b91b4e9506: 9.3 ms * run after rev 59b91b4e9506: 9.0 ms The benchmark is unstable, but the

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-11 Thread STINNER Victor
STINNER Victor added the comment: > - json_loads: 71.4 us +- 0.8 us -> 72.9 us +- 1.4 us: 1.02x slower Hum, sadly this benchmark is still unstable after my change 59b91b4e9506 ("Mark hot functions using __attribute__((hot))", oops, I wanted to write Mark, not Make :-/). This benchmark is

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-11 Thread STINNER Victor
STINNER Victor added the comment: Final result on speed-python: haypo@speed-python$ python3 -m perf compare_to json_8nov/2016-11-10_15-39-default-8ebaa546a033.json 2016-11-11_02-13-default-59b91b4e9506.json -G Slower (12): - scimark_sparse_mat_mult: 8.71 ms +- 0.19 ms -> 9.28 ms +- 0.12 ms:

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-10 Thread STINNER Victor
STINNER Victor added the comment: I tried different patches and ran many quick & dirty benchmarks. I tried to use likely/unlikely macros (using GCC __builtin_expect): the effect is not significant on the call_simple microbenchmark. I gave up on this part. __attribute__((hot)) on a few Python
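For context, likely/unlikely macros of the kind mentioned above are usually thin wrappers around GCC's __builtin_expect. The following is a rough sketch under that assumption; LIKELY, UNLIKELY and sum_nonnegative() are hypothetical names chosen for this example and are not taken from the patches discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical branch-hint macros (assumed names). __builtin_expect
   returns its first argument and only hints at the expected value. */
#if defined(__GNUC__)
#  define LIKELY(x)   __builtin_expect(!!(x), 1)
#  define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define LIKELY(x)   (x)
#  define UNLIKELY(x) (x)
#endif

/* Hypothetical function: sums the non-negative entries of an array,
   assuming for the sake of the example that negative entries are rare. */
long
sum_nonnegative(const long *values, size_t n)
{
    long total = 0;
    assert(values != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (UNLIKELY(values[i] < 0)) {
            continue;  /* rare path: skip the negative entry */
        }
        total += values[i];
    }
    return total;
}
```

The hint only influences how the compiler lays out the branches, not the result, which is consistent with the observation above that its effect on a microbenchmark can be too small to measure.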

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-10 Thread Roundup Robot
Roundup Robot added the comment: New changeset 59b91b4e9506 by Victor Stinner in branch 'default': Issue #28618: Make hot functions using __attribute__((hot)) https://hg.python.org/cpython/rev/59b91b4e9506 -- nosy: +python-dev

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-08 Thread STINNER Victor
Changes by STINNER Victor: Added file: http://bugs.python.org/file45397/patch.json.gz

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-08 Thread STINNER Victor
STINNER Victor added the comment: >> Do you mean comparison between current Python with PGO and patched >> Python without PGO? > > Yes. OK, here you go. As expected, PGO compilation is faster than default compilation with my patch. PGO implements more optimizations than just

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-05 Thread STINNER Victor
STINNER Victor added the comment: Antoine Pitrou added the comment: >> Do you mean comparison between current Python with PGO and patched >> Python without PGO? > > Yes. Oh ok, sure. I will try to run these 2 benchmarks. >>> Ubuntu 14.04 is old, and I don't think this is something we should

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-05 Thread Antoine Pitrou
Antoine Pitrou added the comment: On 05/11/2016 at 16:37, STINNER Victor wrote: > > Antoine Pitrou added the comment: >> Can you compare against a PGO build? > > Do you mean comparison between current Python with PGO and patched > Python without PGO? Yes. >> Ubuntu 14.04 is old, and I

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-05 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: > Moreover, I like the idea of getting a fast(er) Python even when no advanced optimization techniques like LTO or PGO are used. Seconded. -- nosy: +serhiy.storchaka

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-05 Thread STINNER Victor
STINNER Victor added the comment: Antoine Pitrou added the comment: > Can you compare against a PGO build? Do you mean comparison between current Python with PGO and patched Python without PGO? The hot attribute is ignored by GCC when PGO compilation is used. > Ubuntu 14.04 is old, and I

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-05 Thread Antoine Pitrou
Antoine Pitrou added the comment: Can you compare against a PGO build? Ubuntu 14.04 is old, and I don't think this is something we should worry about. Overall I think this manual approach is really the wrong way to look at it. Compilers can do better than us. -- nosy: +pitrou

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-05 Thread STINNER Victor
STINNER Victor added the comment: Oh, I forgot to mention that I compiled Python with "./configure -C". The purpose of the patch is to optimize Python when LTO and/or PGO compilation are not explicitly used.

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-05 Thread STINNER Victor
STINNER Victor added the comment: I ran benchmarks. Globally, it seems like the impact of the patch is positive. regex_v8 and call_simple are slower, but these benchmarks are microbenchmarks impacted by low level stuff like CPU L1 cache. Well, my patch was supposed to optimize CPython for

[issue28618] Decorate hot functions using __attribute__((hot)) to optimize Python

2016-11-04 Thread STINNER Victor
New submission from STINNER Victor: When analyzing results of Python performance benchmarks, I noticed that call_method was 70% slower (!) between revisions 83877018ef97 (Oct 18) and 3e073e7b4460 (Oct 22), including these revisions, on the speed-python server. On these revisions, the CPU L1