On 7/26/19 5:55 AM, Alkis Georgopoulos wrote:
> While handling some big strings, I noticed that bash is a lot slower
> than other shells like dash, posh and busybox ash.
> I came up with the following little benchmark and results.
> While the specific benchmark isn't important, maybe some developer
> would like to use it to pinpoint and optimize some internal bash
> function that is a lot slower than in other shells?

Thanks for the report. There are places in bash where it copies and
re-processes strings too many times, and you uncovered a couple.

>
> # Avoid UTF-8 complications
> export LANG=C
>
> # Run the following COMMANDs with `time bash -c`
> # or `time busybox ash -c`
> # The time columns are in seconds, on an i5-4440 CPU
>
> ASH    BASH   COMMAND
> 0.1    0.1    printf "%100000000s" "." >/dev/null
> 0.7    1.1    x=$(printf "%100000000s" ".")

The first assignment is dominated by the command substitution and reading
the data through a pipe.

> 0.8    2.4    x=$(printf "%100000000s" "."); echo ${#x}
> 0.9    3.7    x=$(printf "%100000000s" "."); echo ${#x}; echo ${#x}

The length function was too general, and didn't optimize for the common
case. Bash would expand the parameter name following the `#' as if the `#'
were not present, then take the length of the results. Most uses don't need
that generality, or the common error handling if `set -u' is enabled.

Factoring out the common case provides substantial improvement:

$ time ./bash ./x1a

real    0m1.215s
user    0m0.959s
sys     0m0.248s

$ time ./bash ./x1b
100000000

real    0m1.242s
user    0m0.982s
sys     0m0.256s

$ time ./bash ./x1c
100000000
100000000

real    0m1.290s
user    0m1.020s
sys     0m0.265s

where the three scripts are the three cases above.

There's always more work to do, though.

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
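
For anyone who wants to try the three cases at a smaller scale, here is a
minimal sketch. It is not the actual x1a/x1b/x1c scripts: the function names
case_a, case_b and case_c, the variable N, and the reduced size of 1000000
(instead of 100000000) are assumptions made for illustration only.

```shell
#!/bin/bash
# Sketch only: N is reduced from the 100000000 used in the thread so the
# example finishes quickly. All names here are hypothetical.
export LANG=C          # avoid UTF-8 complications, as in the original report
N=1000000

# Case 1: assignment only (command substitution plus reading through a pipe)
case_a() { x=$(printf "%${N}s" "."); }

# Case 2: assignment followed by one length expansion
case_b() { x=$(printf "%${N}s" "."); echo ${#x}; }

# Case 3: assignment followed by two length expansions
case_c() { x=$(printf "%${N}s" "."); echo ${#x}; echo ${#x}; }

case_a
out_b=$(case_b)
out_c=$(case_c)
echo "$out_b"
```

To compare shells, each function body could be saved to its own file and run
under `time bash ./file` or `time busybox ash ./file`, with N raised back to
the size used in the report.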