On 02/04/2026 21:25, Bruno Haible wrote:
Paul Eggert wrote:
I'm a little lost here. Why do we always replace mbrtowc on glibc? And
why is mbrtoc32 not also replaced on glibc?
Gnulib overrides mbrtowc on glibc:
REPLACE_MBRTOWC=1
because gl_cv_func_mbrtowc_C_locale_sans_EILSEQ=no /
MBRTOWC_IN_C_LOCALE_MAYBE_EILSEQ=1
Gnulib also overrides mbrtoc32 on glibc:
REPLACE_MBRTOC32=1
because gl_cv_func_mbrtoc32_C_locale_sans_EILSEQ=no /
MBRTOC32_IN_C_LOCALE_MAYBE_EILSEQ=1
We're still relying on the above for this optimization to be enabled,
but that's unlikely to change I think, so that's probably fine.
If it were easy for the wchar-single module to ensure the replacement
of these functions, that would be good, I think. No worries either way.
Pádraig Brady wrote:
In the attached I adjusted things so that the efficient
dispatch routines are used once the wchar-single module is referenced.
I'm not sure about this approach
This patch has a major problem: it reuses the code path meant for AIX, which
requires
1. overriding mbstate_t,
2. locking around mbtowc() calls for non-UTF-8 locales.
Instead, the intended speedup (on glibc systems, in a UTF-8 locale) can
be obtained by inlining glibc-compatible code for mbrtowc/mbrtoc32 specialized
to UTF-8.
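
To illustrate what "specialized to UTF-8" can mean, here is a minimal
sketch of a stateless decoder. It is not the committed gnulib code; the
name utf8_decode_sketch and its return convention (mimicking the
(size_t)-1 / (size_t)-2 results of mbrtoc32) are assumptions made for
this example only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of gnulib or glibc.
   Decodes one UTF-8 character from s[0..n-1] into *out.
   Returns the number of bytes consumed (1..4),
   (size_t)-1 for an invalid sequence,
   (size_t)-2 for a truncated (incomplete) sequence.
   Simplification: a truncated prefix is reported as incomplete
   even if no continuation could make it valid (e.g. E0 80).  */
size_t utf8_decode_sketch(uint32_t *out, const unsigned char *s, size_t n)
{
    if (n == 0)
        return (size_t)-2;

    unsigned char c = s[0];
    if (c < 0x80) {              /* ASCII fast path */
        *out = c;
        return 1;
    }

    size_t need;                 /* total length of the sequence */
    uint32_t cp;                 /* code point being assembled */
    uint32_t min;                /* smallest value allowed for this length */
    if (c >= 0xC2 && c <= 0xDF) {
        need = 2; cp = c & 0x1F; min = 0x80;
    } else if ((c & 0xF0) == 0xE0) {
        need = 3; cp = c & 0x0F; min = 0x800;
    } else if (c >= 0xF0 && c <= 0xF4) {
        need = 4; cp = c & 0x07; min = 0x10000;
    } else {
        return (size_t)-1;       /* stray continuation byte or invalid lead */
    }

    for (size_t i = 1; i < need; i++) {
        if (i == n)
            return (size_t)-2;   /* ran out of input */
        if ((s[i] & 0xC0) != 0x80)
            return (size_t)-1;   /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }

    /* Reject overlong forms, surrogates, and values above U+10FFFF.  */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return (size_t)-1;

    *out = cp;
    return need;
}
```

Because such a decoder needs no mbstate_t and no locale lookup, the
common case is a few comparisons per character, which is presumably where
a speedup of this kind comes from.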
I'm committing the attached two patches. They don't cause test failures in
coreutils.
but it works with coreutils
on glibc-2.43 at least, and cut -c (mcel) is 2.6x faster,
and wc -m (mbrtoc32) is 2x faster.
My test case is
$ time src/wc -m mb10000.in
where mb10000.in is attached.
I observe that it is 2x faster. The profiling (attached, done with gprofng-gui,
see https://gitlab.com/ghwiki/gnow-how/-/wikis/Profiling/with_sampling )
shows that the mbrtoc32 time is reduced from 3.83 sec to 1.49 sec.
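
For readers who want to see why mbrtoc32 dominates such a profile, here
is a rough sketch of a character-counting loop of the kind 'wc -m' needs.
It is not the coreutils implementation; count_chars_sketch is a
hypothetical name, and counting each undecodable byte as one character is
a simplifying assumption of this example.

```c
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical helper, not taken from coreutils.
   Counts the characters in buf[0..len-1] according to the current
   locale, calling mbrtoc32 once per character.  */
size_t count_chars_sketch(const char *buf, size_t len)
{
    mbstate_t state;
    memset(&state, 0, sizeof state);

    size_t count = 0;
    size_t i = 0;
    while (i < len) {
        char32_t ch;
        size_t n = mbrtoc32(&ch, buf + i, len - i, &state);

        if (n == (size_t)-1 || n == (size_t)-2) {
            /* Invalid or truncated sequence: skip one byte and
               restart from the initial conversion state.  */
            memset(&state, 0, sizeof state);
            n = 1;
        } else if (n == (size_t)-3) {
            /* A character was produced without consuming input.  */
            n = 0;
        } else if (n == 0) {
            /* A NUL character was decoded; it occupies one byte.  */
            n = 1;
        }

        count++;
        i += n;
    }
    return count;
}
```

Every character costs one mbrtoc32 call, so on multibyte-heavy input the
per-call overhead of that function is what the profile ends up measuring.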
I can't observe a speedup on 'cut -c1 mb10000.in' because 'cut' does not
operate on multibyte characters:
- option 'c' is equivalent to option 'b',
- the profile of function cut_bytes shows no calls to multibyte functions.
Maybe you are using a modified 'cut' program? Or, can you attach your input
file (compressed)?
I'm not committing the proposed change to lib/mcel.h, because I don't have
a test case where it would make a difference. If you have one, please show it.
Yes I'm working on multi-byte cut which I'll probably push
in the next day or so. It's currently at:
$ git clone https://github.com/pixelb/coreutils.git
$ cd coreutils
$ git checkout cut-mb
Anyway I tested your change and it works really well.
I need to remove the '#undef mbrtoc32' from mcel.h to
get the win there of course. Again I get the same 2.6x win
as seen with my previous patch:
$ yes $(yes éééááé | head -n9 | paste -s -d,) |
head -n1M > mb.in
$ time LC_ALL=C.UTF-8 src/cut-before -c1 mb.in >/dev/null
real 0m1.582s
$ time LC_ALL=C.UTF-8 src/cut-after -c1 mb.in >/dev/null
real 0m0.592s
thank you!
Padraig