dfanache opened a new pull request, #18868:
URL: https://github.com/apache/nuttx/pull/18868

   ## Summary
   
   Currently, three top-level assignments in `tools/Config.mk` use the form 
`export VAR ?= $(... ${shell ...} ...)`, which produces a recursively expanded 
(lazy) variable and slows down the build.
   
   The right-hand side is re-expanded, and the embedded `${shell ...}` rerun, 
every time the variable is used. Because these variables are also `export`ed, 
make expands them again whenever it builds a recipe's environment, spawning the 
`tools/incdir` / `tools/define` host helpers hundreds of times over a full build.
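   
   As a minimal sketch of the effect (not taken from `tools/Config.mk`; the 
file name `demo-lazy.mk`, the variable `LAZY` and the log file `lazy.log` are 
made up for illustration), the following makefile counts how often the shell 
command runs:
   ```makefile
   # demo-lazy.mk (hypothetical name); run with: make -f demo-lazy.mk
   # Start from an empty log so that the shell runs can be counted.
   $(shell : > lazy.log)

   # Recursively expanded: the $(shell ...) call is stored unevaluated and
   # re-runs on every expansion, appending one line to lazy.log each time.
   export LAZY ?= $(shell echo run >> lazy.log; echo value)

   # Two explicit parse-time expansions, so the shell command runs twice here.
   $(info first use:  $(LAZY))
   $(info second use: $(LAZY))

   # LAZY is exported, so make expands it again when it builds the
   # environment for this recipe; the exact total depends on the make version.
   all: ; @echo "shell ran $$(wc -l < lazy.log) time(s)"
   ```
   In a real tree the same pattern repeats for every recipe that make launches, 
which is where the helper invocations add up.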
   
   This PR replaces the three lines with:
   ```
   ifeq ($(origin VAR),undefined)
     VAR := $(... ${shell ...} ...)
   endif
   export VAR
   ```
   Here, `:=` creates a simply expanded variable, so the shell runs just once, 
at parse time. The `ifeq ...` wrapper is meant to preserve the override 
semantics of `?=` (passing `DEFINE_PREFIX=...` on the make command line or as 
an environment variable skips the assignment).
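   
   A small, self-contained sketch of the replacement pattern (again with 
made-up names: `demo-once.mk`, `ONCE` and `once.log` are not part of NuttX) 
can be used to check both the single shell run and the override behaviour:
   ```makefile
   # demo-once.mk (hypothetical name); run with either of:
   #   make -f demo-once.mk
   #   make -f demo-once.mk ONCE=override
   # Start from an empty log so that the shell runs can be counted.
   $(shell : > once.log)

   # Assign only if ONCE is not already set on the command line or in the
   # environment, which mirrors what `?=` does.
   ifeq ($(origin ONCE),undefined)
     # Simply expanded: the shell command runs here, once, and ONCE then
     # holds a plain string.
     ONCE := $(shell echo run >> once.log; echo value)
   endif
   export ONCE

   # Later expansions, including the one for the recipe environment,
   # reuse the stored string and do not start another shell.
   $(info first use:  $(ONCE))
   $(info second use: $(ONCE))

   all: ; @echo "ONCE=$(ONCE), shell ran $$(wc -l < once.log) time(s)"
   ```
   With `ONCE=override` on the command line the `ifeq` branch is skipped, so 
the log stays empty and the overriding value is the one that is exported.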
   
   ## Impact
   
   **Developers and build systems**: noticeably faster full builds on 
multi-core hosts; no behaviour change - the override semantics of `?=` are 
preserved.
   
   Measured impact on a 20-core build host is a ~26-27% reduction in wall time 
on the two standard board configs tested (see Testing).
   
   ## Testing
   
   Host: Linux, x86_64 with 20 threads, arm-none-eabi-gcc 15.2.1, GNU make 4.4.1.
   
   Methodology: three full clean builds per side (patched/unpatched), each 
preceded by `make distclean` and a fresh `configure.sh + olddefconfig`. The 
patched and unpatched runs were taken back to back on the same host, with the 
same `-j20` invocation. The reported times, in seconds, are medians over the 
three runs; run-to-run variation was always under 0.2s.
   
   ### Config 1: `raspberrypi-pico-2:nsh` (rp23xx, Cortex-M33)
   
   Extra apps enabled to increase the number of recipes: `TESTING_OSTEST`, 
`SYSTEM_SYSTEM_TOP`, `TESTING_GETPRIME`, `BENCHMARKS_COREMARK`, 
`BENCHMARKS_DHRYSTONE`, `LIBC_FLOATINGPOINT`.
   
   | | Baseline | Patched | Change |
   |---|---|---|---|
   | Wall | 9.67s  | 7.15s  | **-2.52s, -26%**  |
   | User | 75.48s  | 73.51s  | -1.97s  |
   | Sys | 25.09s  | 23.36s  | -1.73s  |
   
   The ELF was flashed to a Raspberry Pi Pico 2 W; ostest ran to completion 
with exit status 0.
   
   ### Config 2: `imxrt1064-evk:netnsh` (iMX RT 1064, Cortex-M7)
   
   Default config, no app overrides. No hardware test; build success only.
   
   | | Baseline | Patched | Change |
   |---|---|---|---|
   | Wall | 12.37s  | 9.01s  | **-3.36s, -27%** |
   | User | 106.57s  | 95.38s  | -11.19s  |
   | Sys | 35.14s  | 30.16s  | -4.98s  |
   
   
   ## Larger scale impact and fix origins
   
   This optimization came about while trying to speed up the build of an 
RP2350-based custom PX4-Autopilot board target, where the build time drops from 
126.01s to 14.69s wall time (an 88% reduction, 8.6x faster).
   
   In PX4, NuttX is built through CMake, which spawns multiple isolated 
sub-makes: each NuttX library is a separate `add_custom_command` invoking `make 
-C <libdir>`, and the wrappers reset `MAKELEVEL=0`, probably to avoid jobserver 
collisions. The speed increase is proportionally larger there because the cost 
of the repeated expansion is paid in every sub-make rather than once.
   
   I am not including timing logs of the PX4 builds as evidence for now; they 
are mentioned only as context on how the cost of the existing inefficiency 
grows with build complexity in other projects that include the OS.
   
   
   

