On 2023-09-09 07:58, Bruno Haible wrote:

* Commit 5f27affb42337dc605a9a59f1c6a99516cd9747a has replaced a use of
   mbiterf.h with mbuiterf.h. I'm not convinced this provides a speed
   improvement, since the comments in mbuiterf.h say:
     The mbuif_* macros are therefore suitable when there is a high probability
     that only the first few multibyte characters need to be inspected.
     Whereas the mbif_* macros are better if usually the iteration runs
     through the entire string.
   To validate or invalidate my hypothesis, someone would need to create
   a benchmark test-bench-trim.c, using the same bench.h infrastructure
   that we already have.

I made that change to avoid over-allocation and, though I forgot to write this down, to fix some undefined behavior in the single-byte code, where it mistakenly computed "d + strlen (d) - 1". I wasn't concerned with speed; the switch from mbiterf to mbuiterf was done for algorithmic reasons, not raw performance reasons.
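For illustration, here is a minimal sketch of why an expression like "d + strlen (d) - 1" is risky. It assumes the trigger is the empty string: strlen (d) is then 0, the expression forms the pointer d - 1, and in C merely computing a pointer before the start of an array is undefined, even if it is never dereferenced. The function name trim_trailing_sb is hypothetical and this is not the actual trim.c code; it only shows one way to strip trailing whitespace from a single-byte string by counting with an index instead.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the gnulib code: strip trailing whitespace
   from the single-byte string D in place.

   A loop of the form
       for (char *p = d + strlen (d) - 1; p >= d && isspace (*p); p--)
   would compute d - 1 when D is empty (and again when D consists only
   of whitespace), which is undefined behavior.  Working with a length
   avoids ever forming a pointer before D.  */
static void
trim_trailing_sb (char *d)
{
  size_t n = strlen (d);

  /* The cast avoids passing a negative value to isspace on platforms
     where plain char is signed.  */
  while (n > 0 && isspace ((unsigned char) d[n - 1]))
    n--;

  d[n] = '\0';
}
```

With this shape the empty string and the all-whitespace string both end up as "", without any pointer arithmetic outside the buffer.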

It wouldn't be hard to change trim.c to calculate the length first and then use mbiterf. I'm not sure it's worth the hassle of benchmarking this, though, as trim typically doesn't see much CPU time.
