On 2023-09-09 07:58, Bruno Haible wrote:
> * Commit 5f27affb42337dc605a9a59f1c6a99516cd9747a has replaced a use of
>   mbiterf.h with mbuiterf.h. I'm not convinced this provides a speed
>   improvement, since the comments in mbuiterf.h say:
>
>     The mbuif_* macros are therefore suitable when there is a high probability
>     that only the first few multibyte characters need to be inspected.
>     Whereas the mbif_* macros are better if usually the iteration runs
>     through the entire string.
>
>   To validate or invalidate my hypothesis, someone would need to create
>   a benchmark test-bench-trim.c, using the same bench.h infrastructure
>   that we already have.

I made that change to avoid over-allocation, and (though I forgot to
write this down) to fix some undefined behavior in the single-byte code,
where it mistakenly computed "d + strlen (d) - 1". I didn't worry about
speed; that is, the switch from mbiterf to mbuiterf was done for
algorithmic reasons, not for raw performance.
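
To illustrate the undefined-behavior point: when d is an empty string,
"d + strlen (d) - 1" points one byte before the start of the array, and
merely computing that pointer is undefined in C. Below is a minimal
sketch of a single-byte trailing trim that works with a length instead.
The function name trim_trailing_sb is hypothetical, and this is not the
code that is in trim.c.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the code in gnulib's trim.c: remove trailing
   whitespace from the single-byte string D in place.  */
static void
trim_trailing_sb (char *d)
{
  /* Track a length rather than a pointer to the last byte, so that no
     pointer before the start of D is ever formed, even when D is "".  */
  size_t len = strlen (d);

  while (len > 0 && isspace ((unsigned char) d[len - 1]))
    len--;

  d[len] = '\0';
}
```

The "len > 0" test comes before the array access, so the empty string
and the all-whitespace string are both handled without stepping outside
the buffer.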

It wouldn't be hard to change trim.c to calculate the length first and
then use mbiterf. I'm not sure it's worth the hassle of benchmarking
this, though, as trim typically doesn't see much CPU time.
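
For the record, here is a rough sketch of what "calculate the length
first, then iterate up to a known end" could look like. It uses the
standard mbrtowc function rather than the mbif_* macros, so it only
illustrates the shape of the loop and is not a proposed patch; the name
trimmed_length and the handling of invalid bytes are my assumptions.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: return the length in bytes that S would have
   after trailing whitespace is removed, in the current locale.  */
static size_t
trimmed_length (const char *s)
{
  size_t n = strlen (s);        /* length computed up front */
  const char *end = s + n;
  const char *p = s;
  size_t keep = 0;              /* offset just past the last non-space char */
  mbstate_t state;

  memset (&state, 0, sizeof state);

  while (p < end)
    {
      wchar_t wc;
      size_t len = mbrtowc (&wc, p, (size_t) (end - p), &state);

      if (len == (size_t) -1 || len == (size_t) -2 || len == 0)
        {
          /* Invalid or incomplete sequence (assumption: treat one byte
             as a non-whitespace character and reset the state).  */
          memset (&state, 0, sizeof state);
          p += 1;
          keep = (size_t) (p - s);
          continue;
        }

      p += len;
      if (!iswspace ((wint_t) wc))
        keep = (size_t) (p - s);
    }

  return keep;
}
```

This makes two passes over the string (strlen, then the loop), which is
the cost that a benchmark would have to weigh against the cheaper
per-character end test.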