https://github.com/python/cpython/commit/b9caa0977c512a5e7966ebfdc64fabdc4f3e4971
commit: b9caa0977c512a5e7966ebfdc64fabdc4f3e4971
branch: main
author: Pablo Galindo Salgado <[email protected]>
committer: pablogsal <[email protected]>
date: 2024-05-07T17:25:15+01:00
summary:

gh-118518: Improve perf docs (#118708)

files:
M Doc/howto/perf_profiling.rst

diff --git a/Doc/howto/perf_profiling.rst b/Doc/howto/perf_profiling.rst
index ed2b76ff4f410c..06459d1b222964 100644
--- a/Doc/howto/perf_profiling.rst
+++ b/Doc/howto/perf_profiling.rst
@@ -162,12 +162,12 @@ the :option:`!-X` option takes precedence over the environment variable.
 
 Example, using the environment variable::
 
-   $ PYTHONPERFSUPPORT=1 python script.py
+   $ PYTHONPERFSUPPORT=1 perf record -F 9999 -g -o perf.data python script.py
    $ perf report -g -i perf.data
 
 Example, using the :option:`!-X` option::
 
-   $ python -X perf script.py
+   $ perf record -F 9999 -g -o perf.data python -X perf script.py
    $ perf report -g -i perf.data
 
 Example, using the :mod:`sys` APIs in file :file:`example.py`:
@@ -184,7 +184,7 @@ Example, using the :mod:`sys` APIs in file :file:`example.py`:
 
 ...then::
 
-   $ python ./example.py
+   $ perf record -F 9999 -g -o perf.data python ./example.py
    $ perf report -g -i perf.data
 
 
@@ -210,31 +210,57 @@ of ``perf``.
 How to work without frame pointers
 ----------------------------------
 
-If you are working with a Python interpreter that has been compiled without frame pointers
-you can still use the ``perf`` profiler but the overhead will be a bit higher because Python
-needs to generate unwinding information for every Python function call on the fly. Additionally,
-``perf`` will take more time to process the data because it will need to use the DWARF debugging
-information to unwind the stack and this is a slow process.
+If you are working with a Python interpreter that has been compiled without
+frame pointers, you can still use the ``perf`` profiler, but the overhead will be
+a bit higher because Python needs to generate unwinding information for every
+Python function call on the fly. Additionally, ``perf`` will take more time to
+process the data because it will need to use the DWARF debugging information to
+unwind the stack and this is a slow process.
 
-To enable this mode, you can use the environment variable :envvar:`PYTHON_PERF_JIT_SUPPORT` or the
-:option:`-X perf_jit <-X>` option, which will enable the JIT mode for the ``perf`` profiler.
+To enable this mode, you can use the environment variable
+:envvar:`PYTHON_PERF_JIT_SUPPORT` or the :option:`-X perf_jit <-X>` option,
+which will enable the JIT mode for the ``perf`` profiler.
 
-When using the perf JIT mode, you need an extra step before you can run ``perf report``. You need to
-call the ``perf inject`` command to inject the JIT information into the ``perf.data`` file.
+.. note::
+
+    Due to a bug in the ``perf`` tool, only ``perf`` versions higher than v6.8
+    will work with the JIT mode.  The fix was also backported to the v6.7.2
+    version of the tool.
+
+    Note that when checking the version of the ``perf`` tool (which can be done
+    by running ``perf version``) you must take into account that some distros
+    add some custom version numbers including a ``-`` character.  This means
+    that ``perf 6.7-3`` is not necessarily ``perf 6.7.3``.
+
+When using the perf JIT mode, you need an extra step before you can run ``perf
+report``. You need to call the ``perf inject`` command to inject the JIT
+information into the ``perf.data`` file.::
 
     $ perf record -F 9999 -g --call-graph dwarf -o perf.data python -Xperf_jit my_script.py
-    $ perf inject -i perf.data --jit
-    $ perf report -g -i perf.data
+    $ perf inject -i perf.data --jit --output perf.jit.data
+    $ perf report -g -i perf.jit.data
 
 or using the environment variable::
 
     $ PYTHON_PERF_JIT_SUPPORT=1 perf record -F 9999 -g --call-graph dwarf -o perf.data python my_script.py
-    $ perf inject -i perf.data --jit
-    $ perf report -g -i perf.data
-
-Notice that when using ``--call-graph dwarf`` the ``perf`` tool will take snapshots of the stack of
-the process being profiled and save the information in the ``perf.data`` file. By default the size of
-the stack dump is 8192 bytes but the user can change the size by passing the size after comma like
-``--call-graph dwarf,4096``. The size of the stack dump is important because if the size is too small
-``perf`` will not be able to unwind the stack and the output will be incomplete.
+    $ perf inject -i perf.data --jit --output perf.jit.data
+    $ perf report -g -i perf.jit.data
+
+``perf inject --jit`` command will read ``perf.data``,
+automatically pick up the perf dump file that Python creates (in
+``/tmp/perf-$PID.dump``), and then create ``perf.jit.data`` which merges all the
+JIT information together. It should also create a lot of ``jitted-XXXX-N.so``
+files in the current directory which are ELF images for all the JIT trampolines
+that were created by Python.
+
+.. warning::
+    Notice that when using ``--call-graph dwarf`` the ``perf`` tool will take
+    snapshots of the stack of the process being profiled and save the
+    information in the ``perf.data`` file. By default the size of the stack dump
+    is 8192 bytes but the user can change the size by passing the size after
+    comma like ``--call-graph dwarf,4096``. The size of the stack dump is
+    important because if the size is too small ``perf`` will not be able to
+    unwind the stack and the output will be incomplete. On the other hand, if
+    the size is too big, then ``perf`` won't be able to sample the process as
+    frequently as it would like as the overhead will be higher.
 

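The version caveat in the new note can be sketched as a small check. This is not part of the commit, only an illustration: the helper names (``parse_perf_version``, ``jit_mode_supported``, ``installed_perf_version``) are hypothetical, and the sketch assumes that ``perf version`` prints a line such as ``perf version 6.7.2``. It also assumes that v6.8 itself carries the fix, although the note literally says "higher than v6.8". A distro-style string like ``6.7-3`` is reported as unknown rather than guessed.

```python
import re
import subprocess


def parse_perf_version(text):
    """Split a version string into (numeric parts, distro suffix or None)."""
    match = re.search(r"(\d+(?:\.\d+)*)(-\S+)?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    numbers = tuple(int(part) for part in match.group(1).split("."))
    return numbers, match.group(2)


def jit_mode_supported(text):
    """Return True, False, or None (unknown) for a ``perf version`` string.

    Assumption: 6.8 and later carry the fix, as does 6.7.2 and later in
    the 6.7 series (the backport mentioned in the note).
    """
    numbers, suffix = parse_perf_version(text)
    major, minor, patch = (numbers + (0, 0, 0))[:3]
    if (major, minor) >= (6, 8):
        return True
    if (major, minor) == (6, 7):
        if patch >= 2:
            return True
        # "6.7-3" is not necessarily "6.7.3": a distro suffix tells us
        # nothing reliable about the upstream patch level.
        return None if suffix else False
    return False


def installed_perf_version():
    """Run ``perf version`` and return its output (requires perf on PATH)."""
    result = subprocess.run(
        ["perf", "version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

On a machine with ``perf`` installed, ``jit_mode_supported(installed_perf_version())`` gives a first hint. A ``None`` result means the distro packaging has to be checked by hand.
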