Paul Eggert wrote in
<https://lists.gnu.org/archive/html/bug-gnulib/2023-09/msg00055.html>:
> It's been a month and I couldn't think of anything better
I could hardly work on the gnulib i18n backlog because I've been
jumping in on other tasks over the last month (readutmp and boot-time
improvements, coreutils testing, countering attempts to use -Wextra, ...).
And I won't be working on it in the next week, since I want to prepare
a gettext bug-fix release now.
My current gnulib i18n backlog is as follows; maybe someone can jump in
on tasks that I have not yet started.
* Commit 5f27affb42337dc605a9a59f1c6a99516cd9747a has replaced a use of
mbiterf.h with mbuiterf.h. I'm not convinced this provides a speed
improvement, since the comments in mbuiterf.h say:
    The mbuif_* macros are therefore suitable when there is a high probability
    that only the first few multibyte characters need to be inspected.
    Whereas the mbif_* macros are better if usually the iteration runs
    through the entire string.
To validate or invalidate my hypothesis, someone would need to create
a benchmark test-bench-trim.c, using the same bench.h infrastructure
that we already have.
* Integrate Paul's improved handling of error byte sequences (MEE).
Based on the patch in
https://lists.gnu.org/archive/html/bug-gnulib/2023-08/msg00047.html
but as an 'extern' function, not inline, since it's about an
exceptional case. Keeping exceptional case handling out-of-line
will help the compiler's code generation.
Also update all modules' test suites accordingly.
* Commit b93de66735cd6f935ee0970f8cb26908d113e09d introduced mcel.h, but
it has tabs. Can we untabify
mcel.h
mountlist.c
verify.h
(as we do with all source files that are not shared with glibc)?
* Commit b93de66735cd6f935ee0970f8cb26908d113e09d introduced mcel.h.
Summarize, in comments, the discussion we had regarding SEE and MEE.
Basically, MEE is good in all circumstances, whereas SEE is only
good if the surrounding application does only specific things with
the strings.
The comments also need to mention for which encodings it makes a difference,
cf. https://lists.gnu.org/archive/html/bug-gnulib/2023-07/msg00131.html
* In https://lists.gnu.org/archive/html/bug-gnulib/2023-09/msg00055.html
Paul argues that "These patches shouldn't affect behavior"
but I had already explained why mbscasecmp with SEE likely has different
behaviour than with MEE,
in https://lists.gnu.org/archive/html/bug-gnulib/2023-07/msg00131.html .
Currently the mbscasecmp tests test only valid input. Someone should
extend the unit test to cover strings with invalid input bytes. Then
we could see exactly what difference it makes.
* Dependencies with "or" instead of "and", requested by Paul:
https://lists.gnu.org/archive/html/bug-gnulib/2023-09/msg00055.html
We have a few examples of it so far, but no straightforwardly
applicable technique. Need to think about it.
* Enhance the unit tests of the 'regex' module. (Already started by me.)
* Migrate the 'regex' module to use mbrtoc32-regular instead of mbrtowc.
(I have a patch. But it needs to wait until the unit tests are extended first.)
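
To make the mbscasecmp item above more concrete, here is a minimal sketch
of the kind of difference a test with invalid bytes could expose. It is a
toy model, not gnulib code. Assumptions: SEE is taken to mean that all
invalid bytes map to one shared error value, and MEE that each invalid
byte keeps a distinct value. Bytes >= 0x80 are simply treated as invalid
(roughly what a "C" locale would report), and the names error_policy,
unit_key and toy_casecmp are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical policies for how invalid bytes take part in comparisons.  */
enum error_policy { POLICY_SEE, POLICY_MEE };

/* Toy decoder: bytes < 0x80 are valid characters, all others are treated
   as encoding errors.  Returns the key used for comparison.  */
static int
unit_key (unsigned char b, enum error_policy policy)
{
  if (b < 0x80)
    return tolower (b);         /* valid: case-folded ASCII */
  if (policy == POLICY_SEE)
    return 0x100;               /* assumed SEE: one shared error value */
  return 0x100 + b;             /* assumed MEE: one value per invalid byte */
}

/* Case-insensitive comparison in the toy model.  This is NOT gnulib's
   mbscasecmp; it only illustrates the two policies.  */
static int
toy_casecmp (const char *s1, const char *s2, enum error_policy policy)
{
  assert (s1 != NULL && s2 != NULL);
  const unsigned char *p1 = (const unsigned char *) s1;
  const unsigned char *p2 = (const unsigned char *) s2;
  for (;; p1++, p2++)
    {
      if (*p1 == '\0' || *p2 == '\0')
        return (*p1 != '\0') - (*p2 != '\0');
      int k1 = unit_key (*p1, policy);
      int k2 = unit_key (*p2, policy);
      if (k1 != k2)
        return k1 < k2 ? -1 : 1;
    }
}
```

In this model, toy_casecmp ("a\xE9", "A\xE8", POLICY_SEE) reports the
strings as equal, whereas with POLICY_MEE they differ. A real test would
call mbscasecmp in the relevant locales instead.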
If you want to work on any of this, please start a new thread with
an appropriate Subject line. Thanks!
Bruno