On 7/26/19 5:55 AM, Alkis Georgopoulos wrote:
> While handling some big strings, I noticed that bash is a lot slower
> than other shells like dash, posh and busybox ash.
> I came up with the following little benchmark and results.
> While the specific benchmark isn't important, maybe some developer
> would like to use it to pinpoint and optimize some internal bash
> function that is a lot slower than in other shells?

Thanks for the report. There are places in bash where it copies and
re-processes strings too many times, and you uncovered a couple.

> 
> # Avoid UTF-8 complications
> export LANG=C
> 
> # Run the following COMMANDs with `time bash -c`
> # or `time busybox ash -c`
> # The time columns are in seconds, on an i5-4440 CPU
> 
> ASH BASH  COMMAND
> 0.1  0.1  printf "%100000000s" "." >/dev/null
> 0.7  1.1  x=$(printf "%100000000s" ".")

The first assignment is dominated by the command substitution and
reading the data through a pipe.

> 0.8  2.4  x=$(printf "%100000000s" "."); echo ${#x}
> 0.9  3.7  x=$(printf "%100000000s" "."); echo ${#x}; echo ${#x}

The length function was too general and didn't optimize for the common
case. Bash would expand the parameter name following the `#' as if the
`#' were not present, then take the length of the result. Most uses don't
need that generality, or the common error handling if `set -u' is enabled.
Factoring out the common case provides substantial improvement:

$ time ./bash ./x1a

real    0m1.215s
user    0m0.959s
sys     0m0.248s
$ time ./bash ./x1b
100000000

real    0m1.242s
user    0m0.982s
sys     0m0.256s
$ time ./bash ./x1c
100000000
100000000

real    0m1.290s
user    0m1.020s
sys     0m0.265s

where x1a, x1b, and x1c are scripts containing the three cases quoted above.
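
The scripts themselves aren't included in this message. As a rough sketch
only, assuming each one simply wraps one of the quoted command lines, they
could look like the functions below. The names case_a, case_b, and case_c
and the size variable n are hypothetical, added so the sketch can be run at
a smaller size than the 100000000 used in the report.

```shell
# Assumed reconstruction of the three test cases; the actual contents of
# x1a, x1b, and x1c are not shown in the message, so this is a guess.
export LANG=C            # as in the original report: avoid UTF-8 complications

# Hypothetical knob: field width passed to printf (the report used 100000000).
n=${n:-100000000}

# Case 1: command substitution only (fork + read through a pipe).
case_a() { x=$(printf "%${n}s" "."); }

# Case 2: same assignment, then one length expansion.
case_b() { x=$(printf "%${n}s" "."); echo ${#x}; }

# Case 3: same assignment, then two length expansions.
case_c() { x=$(printf "%${n}s" "."); echo ${#x}; echo ${#x}; }
```

Putting each function body in its own file and running it under `time`
with the shell being tested should roughly mirror the runs above; setting
n to something small first is a quick way to check the output.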

There's always more work to do, though.

Chet


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
