On Thu, Jun 04, 2026 at 03:03:06PM +0200, Bruno Haible wrote:
> Pavel Cahyna wrote:
> > Perhaps you could start by providing
> > some commits to analyze (some of them that contain known regressions,
> > some of them that don't - without indicating which one is which) to have
> > a smaller set than e.g. all commits since 2026-01-01?
>
> Makes sense.
>
> Try these commits:
Here are the findings:
> 6aae65332edc58970c0dc287b28c8f1f6250282a
Review: commit 6aae65332e — "quotearg: lessen dependencies"
Verdict: No functional regressions found.
This commit is a straightforward dead-dependency cleanup:
- c-strcaseeq.h removal — The STRCASEEQ macro was eliminated from quotearg.c
in an earlier commit (751e42b0bc, Sep 2025) that rewrote gettext_quote to use
mbrtoc32 instead of locale_charset() comparisons. The #include was left
behind; this commit removes it.
- memcmp dependency removal — memcmp was replaced by memeq in another earlier
commit (6ca831b019, Sep 2025). The module dependency was left behind; this
commit removes it. The memeq dependency remains correctly listed.
Both removals are verified: zero occurrences of STRCASEEQ/CASEEQ/memcmp exist
in the post-commit quotearg.c. The remaining dependency list in
modules/quotearg correctly reflects actual usage.
> 17dc60e624cd6fc3491f9cb002f760d60e66ce8b
Review of commit 17dc60e624cd6fc3491f9cb002f760d60e66ce8b
"mbrtoc32, mbrtowc: do not optimze for uClibc-ng"
Summary: The commit adds `&& !__UCLIBC__` guards to prevent glibc-specific
optimizations from running on uClibc-ng, which defines __GLIBC__ but is not
actually glibc. The mbrtoc32.c changes are correct. However, mbrtowc.c
contains a copy-paste regression.
---
BUG: Copy-paste error in lib/mbrtowc.c n==0 guard (lines 122-123)
The preprocessor guard for the `n == 0` early return was changed from:
# if MBRTOWC_EMPTY_INPUT_BUG || (GNULIB_WCHAR_SINGLE_LOCALE && __GLIBC__ >= 2)
to:
# if (MBRTOC32_EMPTY_INPUT_BUG || _GL_SMALL_WCHAR_T \
|| (GNULIB_WCHAR_SINGLE_LOCALE && __GLIBC__ >= 2 && !__UCLIBC__))
This line was copied verbatim from the corresponding guard in lib/mbrtoc32.c
(line 137), introducing two problems:
1. MBRTOWC_EMPTY_INPUT_BUG was replaced by MBRTOC32_EMPTY_INPUT_BUG. These are
separate macros defined by separate m4 tests (m4/mbrtowc.m4 vs
m4/mbrtoc32.m4), probing different functions. On a system where mbrtowc()
has the empty-input bug but mbrtoc32() does not, the guard will no longer
trigger, and rpl_mbrtowc(pwc, s, 0, ps) will fall through to the buggy
system mbrtowc() instead of returning (size_t) -2.
2. _GL_SMALL_WCHAR_T was added, but this macro is defined only in
lib/uchar.in.h, which mbrtowc.c does not include. This term is dead code
in this compilation unit.
Note the inconsistency within the file itself: line 113 still correctly uses
MBRTOWC_EMPTY_INPUT_BUG in the s==NULL guard, while line 122 was changed to
the wrong macro.
The correct fix should have been:
# if (MBRTOWC_EMPTY_INPUT_BUG \
|| (GNULIB_WCHAR_SINGLE_LOCALE && __GLIBC__ >= 2 && !__UCLIBC__))
Impact: Regression on systems where mbrtowc() has the empty-input bug but
mbrtoc32() does not. On such systems, calling rpl_mbrtowc() with n=0 will
no longer be handled correctly.
Severity: Medium. The bug requires a specific combination of platform
characteristics (mbrtowc empty-input bug present, mbrtoc32 empty-input bug
absent), but on affected systems it silently produces incorrect behavior.
> 53d4558960659ba7c4e9e2757bfb0977a5027fae
Review of commit 53d4558960 — "quotearg: be nicer on macOS etc"
Overall: No functional regressions found. The commit correctly addresses
macOS/Android/musl C locales that use UTF-8 by introducing a
C_LOCALE_MIGHT_BE_MULTIBYTE macro and falling back to <wchar.h> / <wctype.h>
functions.
Findings
--------
1. ChangeLog entry: missing closing paren (Low / cosmetic)
The commit message has "(wch, mbstate, mbs_clear, mbrtowch, chisprint,
wchisprint" followed by "(quotearg_buffer_restyled):" — missing "):" after
"wchisprint". The same typo is in the ChangeLog file. Documentation-only,
no functional impact.
2. Preprocessor indentation inconsistency (Low / cosmetic)
Line 73: "# include <wctype.h>" uses one space after #, but it sits at
nesting depth 2 (inside #else -> # if USE_C_LOCALE). The sibling at line 76,
"# include <uchar.h>", correctly uses two spaces. Should be
"# include <wctype.h>" per the file's convention. No functional impact.
3. Dead (size_t)-3 guards when using mbrtowc (Informational)
When USE_C_LOCALE && C_LOCALE_MIGHT_BE_MULTIBYTE, mbrtowch calls mbrtowc
which never returns (size_t)-3. The #if !GNULIB_MBRTOC32_REGULAR guards that
handle -3 become dead code. The mbsinit loop-control guard is redundant but
correct (mbsinit always returns true after a complete mbrtowc conversion).
No incorrect behavior is possible — this is functionally harmless dead code.
No critical or medium-severity issues. The core logic — preprocessor
partitioning, type selection, unibyte_locale fix, and mbrtowc/iswprint
fallbacks — is correct.
> f6894c646a4f2f544209faf8ecb57ba64d9b8238
## Commit Review: f6894c646a — quotearg: remove SIZE_MAX, INT_BITS
**Verdict: No functional regressions found.**
This commit is a clean minor cleanup that replaces a custom `INT_BITS` macro
(`sizeof(int) * CHAR_BIT`) with the C23-standard `UINT_WIDTH` constant (with
gnulib fallback via the `limits-h` module), and removes a dead `SIZE_MAX`
fallback definition.
### Analysis
1. **INT_BITS → UINT_WIDTH substitution**: Semantically equivalent on all
real-world platforms (both evaluate to 32 for 32-bit int). On hypothetical
padding-bit platforms, `UINT_WIDTH` is actually *more correct* — the old
`INT_BITS` could shift into padding bits (undefined behavior), while
`UINT_WIDTH` stays within value bits. The `quote_these_too` array size is
unchanged on real platforms, and would be safely *larger* on padding-bit
platforms.
2. **SIZE_MAX removal**: Safe. The file already includes `<stdint.h>` (and
depends on the gnulib `stdint-h` module), which defines `SIZE_MAX` on all
supported toolchains. The `#ifndef` guard was dead code.
3. **ABI impact**: None. `struct quoting_options` is opaque — defined only in
`quotearg.c`, forward-declared in `quotearg.h`. External consumers never
allocate or directly access its fields.
4. **Dependency**: The added `limits-h` module dependency in `modules/quotearg`
correctly ensures `UINT_WIDTH` is available. The gnulib `lib/limits.in.h`
provides a fallback definition via `_GL_INTEGER_WIDTH(0, UINT_MAX)`.
5. **Cross-file impact**: `INT_BITS` and the `SIZE_MAX` fallback were
file-local to `quotearg.c`. No other files depended on them.
Regards, Pavel