From: Xiang Gao <[email protected]>
Fix a performance regression in command substitution when the shell
has a large exported environment.

A minimal reproducer is:

cat > /tmp/repro_comsub.sh <<'EOF'
#!/usr/bin/env bash
conf=/tmp/test10000.conf
: > "$conf"
for ((i=0;i<10000;i++)); do
    printf 'export BATCH_ETL_PORT%d=192.168.100.%d\n' "$i" "$i" >> "$conf"
done
. "$conf"
TIMEFORMAT='comsub real=%3R user=%3U sys=%3S'
time for ((i=0;i<100;i++)); do
    x=$(date +%s%N)
done
EOF
bash --noprofile --norc /tmp/repro_comsub.sh

On the test system (Fedora 42), bash-4.4 completes this workload in about 1
second, bash-5.0 takes tens of seconds, and current bash-5.3.9 takes about
25 seconds. The same command substitution loop without exported variables
remains fast, which isolates the regression to exported environment
construction.
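
The control case without exported variables can be sketched as follows.
This is an assumed form of the control, not necessarily the exact script
used for the measurements: it defines the same 10000 variables without
`export` and times the same loop. The file names are hypothetical.

```shell
cat > /tmp/repro_comsub_noexport.sh <<'EOF'
#!/usr/bin/env bash
# Control sketch: same 10000 variables as the reproducer, but NOT exported,
# so the export environment stays small. File names are hypothetical.
conf=/tmp/test10000_noexport.conf
: > "$conf"
for ((i=0;i<10000;i++)); do
    printf 'BATCH_ETL_PORT%d=192.168.100.%d\n' "$i" "$i" >> "$conf"
done
. "$conf"
TIMEFORMAT='comsub-noexport real=%3R user=%3U sys=%3S'
time for ((i=0;i<100;i++)); do
    x=$(date +%s%N)
done
EOF
bash --noprofile --norc /tmp/repro_comsub_noexport.sh
```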

Perf shows that the CPU time is spent in bash rebuilding the export
environment, not in date or kernel work. The dominant stack is:

    __strcmp_avx2
    flatten
    map_over
    make_var_export_array
    maybe_make_export_env
    execute_disk_command
    execute_simple_command
    parse_and_execute
    command_substitute
    param_expand
    expand_word_internal
    expand_string_assignment

In the nofork exec path of command substitution, execute_disk_command()
adjusts the exported SHLVL before calling maybe_make_export_env(). Because
SHLVL is exported, the adjustment sets array_needs_making, so each child
rebuilds the full export_env before exec. With many exported variables,
every command substitution pays this full rebuild cost even though the
inherited export_env is otherwise clean.

Fix this by building export_env before the SHLVL adjustment and then
updating the exported SHLVL entry in place. This preserves the SHLVL value
visible to the external command while avoiding the unnecessary rebuild.

On bash-5.3.9, the reproducer drops from about 25s to under 1s; the noexport
control remains sub-second, as with bash-4.4.

Signed-off-by: Xiang Gao <[email protected]>
---
execute_cmd.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/execute_cmd.c b/execute_cmd.c
index 070f511..154f3e9 100644
--- a/execute_cmd.c
+++ b/execute_cmd.c
@@ -5814,9 +5814,17 @@ execute_disk_command (WORD_LIST *words, REDIRECT *redirects, char *command_line,
shell level like `exec' would do. Don't do this if we are already
in a pipeline environment, assuming it's already been done. */
if (nofork && pipe_in == NO_PIPE && pipe_out == NO_PIPE && (subshell_environment & SUBSHELL_PIPE) == 0)
- adjust_shell_level (-1);
-
- maybe_make_export_env ();
+ {
+ maybe_make_export_env ();
+ adjust_shell_level (-1);
+ update_export_env_inplace ("SHLVL=", 6, get_string_value ("SHLVL"));
+ /* adjust_shell_level() marks the export environment dirty because SHLVL is
+ exported. Since we have already built the environment and updated SHLVL in
+ place above, avoid forcing a full rebuild here. */
+ array_needs_making = 0;
+ }
+ else
+ maybe_make_export_env ();
put_command_name_into_env (command);
}
else if (command == 0 && notfound_str == 0) /* make sure */
--
2.53.0