This is an automated email from the ASF dual-hosted git repository. xiaoxiang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/nuttx.git
commit 43405d34c4950fb25c17ca16d120fe824c7da1ea
Author: chenxiaoyi <[email protected]>
AuthorDate: Fri Dec 12 17:08:31 2025 +0800

    Documentation: update the gprof usage

    Update the usage part to demonstrate two easily accessible environments,
    one QEMU and one real board.

    Signed-off-by: chenxiaoyi <[email protected]>
---
 Documentation/applications/system/gprof/index.rst | 117 ++++++++++++++--------
 1 file changed, 75 insertions(+), 42 deletions(-)

diff --git a/Documentation/applications/system/gprof/index.rst b/Documentation/applications/system/gprof/index.rst
index caf82dd198a..79ab53d0bc6 100644
--- a/Documentation/applications/system/gprof/index.rst
+++ b/Documentation/applications/system/gprof/index.rst
@@ -17,65 +17,98 @@ gprof can be used to:
 Usage
 =====

-Build
------
+QEMU example
+------------
+For this example, we're using **QEMU** and **aarch64-none-elf-gcc** with the **qemu-armv8a** board.

-Enable the following configuration in NuttX::
+1. Configure ``./tools/configure.sh -E qemu-armv8a:nsh`` and make sure ``CONFIG_SYSTEM_GPROF`` and ``CONFIG_PROFILE_MINI`` are enabled
+2. Build ``make -j``
+3. Launch qemu::

-   CONFIG_SYSTEM_GPROF
+     qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic \
+       -machine virt,virtualization=on,gic-version=3 \
+       -chardev stdio,id=con,mux=on -serial chardev:con \
+       -mon chardev=con,mode=readline -semihosting -kernel ./nuttx

-Using in NuttX
---------------
+4. Mount hostfs for saving data later::

-1. Start profiling::
+     nsh> mount -t hostfs -o fs=. /mnt

-     nsh> gprof start
+5. Start profiling::

-2. Stop profiling::
+     nsh> gprof start

-     nsh> gprof stop
+6. Do some test and stop profiling::

-3. Dump profiling data::
+     nsh> gprof stop

-     nsh> gprof dump /tmp/gmon.out
+7. Dump profiling data::

-Analyzing on Host
------------------
+     nsh> gprof dump /mnt/gmon.out

-1. Pull the profiling data to host::
+8. Analyze the data on host using gprof tool::

-     adb pull /tmp/gmon.out ./gmon.out
+     $ aarch64-none-elf-gprof nuttx gmon.out -b

-2. Analyze the data using gprof tool::
+.. note:: The saved file format complies with the standard gprof format.
+   For detailed instructions on gprof command usage, please refer to the GNU gprof manual:
+   https://ftp.gnu.org/old-gnu/Manuals/gprof-2.9.1/html_mono/gprof.html

-   The saved file format complies with the standard gprof format.
-   For detailed instructions on gprof command usage, please refer to the GNU gprof manual:
-   https://ftp.gnu.org/old-gnu/Manuals/gprof-2.9.1/html_mono/gprof.html
+Example output::

-     arm-none-eabi-gprof ./nuttx/nuttx gmon.out -b
+    $ aarch64-none-elf-gprof nuttx gmon.out -b
+    Flat profile:

-   Example output:
+    Each sample counts as 0.001 seconds.
+      %   cumulative   self              self     total
+     time   seconds   seconds    calls   s/call   s/call  name
+     75.58     12.44    12.44    12462     0.00     0.00  up_idle
+     24.30     16.44     4.00        4     1.00     1.00  up_ndelay
+      0.05     16.45     0.01      177     0.00     0.00  pl011_txint
+      0.02     16.45     0.00       35     0.00     0.00  uart_readv

-   ```
-   arm-none-eabi-gprof nuttx/nuttx gmon.out -b
-   Flat profile:
+This output shows the performance profile of the program,
+including execution time and call counts for each function.
+The flat profile table provides a quick overview of where the program spends most of its time.
+This information can be used to identify performance bottlenecks and optimize critical parts of the code.

-   Each sample counts as 0.001 seconds.
-     %   cumulative   self              self     total
-    time   seconds   seconds    calls   s/call   s/call  name
-    66.41      3.55     3.55       43     0.08     0.08  sdelay
-    33.44      5.34     1.79       44     0.04     0.04  delay
-     0.07      5.34     0.00                             up_idle
-     0.04      5.34     0.00                             nx_start
-     0.02      5.34     0.00                             fdtdump_main
-     0.02      5.34     0.00                             nxsem_wait
-     0.00      5.34     0.00        1     0.00     5.34  hello_main
-     0.00      5.34     0.00        1     0.00     0.00  singal_handler
+Real board example
+------------------
+Let's take **esp32s3-devkit** as an example.

-   ```
+Test the flat profile
+~~~~~~~~~~~~~~~~~~~~~
+1. Configure ``./tools/configure.sh -E esp32s3-devkit:nsh`` and make sure these items are enabled::

-   This output shows the performance profile of the program,
-   including execution time and call counts for each function.
-   The flat profile table provides a quick overview of where the program spends most of its time.
-   In this example, `sdelay` and `delay` functions consume the majority of execution time.
-   This information can be used to identify performance bottlenecks and optimize critical parts of the code.
+     # for gprof
+     CONFIG_PROFILE_MINI=y
+     CONFIG_SYSTEM_GPROF=y
+
+     # save and transfer data
+     CONFIG_FS_TMPFS=y
+     CONFIG_SYSTEM_YMODEM=y
+
+2. Build and flash ``make flash ESPTOOL_PORT=/dev/ttyUSB0 -j``
+3. Run ``minicom -D /dev/ttyUSB0 -b 115200`` to connect to the board
+4. Start profiling::
+
+     nsh> gprof start
+
+     # do some test here, such as ostest
+
+     nsh> gprof stop
+     nsh> gprof dump /tmp/gmon.out
+     nsh> sb /tmp/gmon.out
+
+5. Receive the file on PC, and analyze the data on host::
+
+     $ cp nuttx nuttx_prof
+     $ xtensa-esp32s3-elf-objcopy -I elf32-xtensa-le --rename-section .flash.text=.text nuttx_prof
+     $ xtensa-esp32s3-elf-gprof nuttx_prof gmon.out
+
+Test the call graph profile
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+1. Add compiler option ``-pg`` to the component, such as ostest Makefile, like: ``CFLAGS += -pg``
+2. Enable the configuration item ``CONFIG_FRAME_POINTER``
+
+The other steps are the same as the flat profile.
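
The flat profile that ``gprof -b`` prints in the QEMU example can also be post-processed on the host, for instance to compare runs. The following is a minimal Python sketch, not part of this commit, that assumes the seven-column row layout of the example output in the patch. The names ``parse_flat_profile``, ``ROW`` and ``SAMPLE`` are illustrative only, and rows that lack call counts (as in the removed example) are skipped.

```python
import re

# A full flat-profile row has seven columns:
# % time, cumulative s, self s, calls, self s/call, total s/call, name.
# Rows without call counts do not match and are skipped (assumption of this sketch).
ROW = re.compile(
    r"^\s*(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+)"
    r"\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\S+)\s*$"
)


def parse_flat_profile(text):
    """Return (name, percent_time, self_seconds, calls) for each full row."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            pct, _cum, self_s, calls, _self_call, _total_call, name = m.groups()
            rows.append((name, float(pct), float(self_s), int(calls)))
    return rows


# Sample text copied from the example output in the patch.
SAMPLE = """\
Flat profile:

Each sample counts as 0.001 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 75.58     12.44    12.44    12462     0.00     0.00  up_idle
 24.30     16.44     4.00        4     1.00     1.00  up_ndelay
  0.05     16.45     0.01      177     0.00     0.00  pl011_txint
  0.02     16.45     0.00       35     0.00     0.00  uart_readv
"""

if __name__ == "__main__":
    for name, pct, self_s, calls in parse_flat_profile(SAMPLE):
        print(f"{name:<12} {pct:6.2f}% {self_s:6.2f}s {calls} calls")
```

In practice the text would come from the saved output of the gprof command shown in step 8, for example by redirecting it to a file and reading that file instead of ``SAMPLE``.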
