Hi Pavel,
Many thanks for having done this experiment for us!
> Here are the findings:
>
> > 6aae65332edc58970c0dc287b28c8f1f6250282a
>
> Review: commit 6aae65332e — "quotearg: lessen dependencies"
>
> Verdict: No functional regressions found.
>
> This commit is a straightforward dead-dependency cleanup:
>
> - c-strcaseeq.h removal — The STRCASEEQ macro was eliminated from quotearg.c
> in an earlier commit (751e42b0bc, Sep 2025) that rewrote gettext_quote to
> use
> mbrtoc32 instead of locale_charset() comparisons. The #include was left
> behind; this commit removes it.
>
> - memcmp dependency removal — memcmp was replaced by memeq in another earlier
> commit (6ca831b019, Sep 2025). The module dependency was left behind; this
> commit removes it. The memeq dependency remains correctly listed.
>
> Both removals are verified: zero occurrences of STRCASEEQ/CASEEQ/memcmp exist
> in the post-commit quotearg.c. The remaining dependency list in
> modules/quotearg correctly reflects actual usage.
Perfectly correct.
> > 17dc60e624cd6fc3491f9cb002f760d60e66ce8b
>
> Review of commit 17dc60e624cd6fc3491f9cb002f760d60e66ce8b
> "mbrtoc32, mbrtowc: do not optimze for uClibc-ng"
>
> Summary: The commit adds `&& !__UCLIBC__` guards to prevent glibc-specific
> optimizations from running on uClibc-ng, which defines __GLIBC__ but is not
> actually glibc. The mbrtoc32.c changes are correct. However, mbrtowc.c
> contains a copy-paste regression.
>
> ---
>
> BUG: Copy-paste error in lib/mbrtowc.c n==0 guard (lines 122-123)
>
> The preprocessor guard for the `n == 0` early return was changed from:
>
> # if MBRTOWC_EMPTY_INPUT_BUG || (GNULIB_WCHAR_SINGLE_LOCALE && __GLIBC__ >=
> 2)
>
> to:
>
> # if (MBRTOC32_EMPTY_INPUT_BUG || _GL_SMALL_WCHAR_T \
> || (GNULIB_WCHAR_SINGLE_LOCALE && __GLIBC__ >= 2 && !__UCLIBC__))
>
> This line was copied verbatim from the corresponding guard in lib/mbrtoc32.c
> (line 137), introducing two problems:
>
> 1. MBRTOWC_EMPTY_INPUT_BUG was replaced by MBRTOC32_EMPTY_INPUT_BUG. These are
> separate macros defined by separate m4 tests (m4/mbrtowc.m4 vs
> m4/mbrtoc32.m4), probing different functions. On a system where mbrtowc()
> has the empty-input bug but mbrtoc32() does not, the guard will no longer
> trigger, and rpl_mbrtowc(pwc, s, 0, ps) will fall through to the buggy
> system mbrtowc() instead of returning (size_t) -2.
>
> 2. _GL_SMALL_WCHAR_T was added, but this macro is defined only in
> lib/uchar.in.h, which mbrtowc.c does not include. This term is dead code
> in this compilation unit.
>
> Note the inconsistency within the file itself: line 113 still correctly uses
> MBRTOWC_EMPTY_INPUT_BUG in the s==NULL guard, while line 122 was changed to
> the wrong macro.
>
> The correct fix should have been:
>
> # if (MBRTOWC_EMPTY_INPUT_BUG \
> || (GNULIB_WCHAR_SINGLE_LOCALE && __GLIBC__ >= 2 && !__UCLIBC__))
>
> Impact: Regression on systems where mbrtowc() has the empty-input bug but
> mbrtoc32() does not. On such systems, calling rpl_mbrtowc() with n=0 will
> no longer be handled correctly.
Perfectly correct as well! It detected the regression and explained it in
understandable terms.
> Severity: Medium. The bug requires a specific combination of platform
> characteristics (mbrtowc empty-input bug present, mbrtoc32 empty-input bug
> absent), but on affected systems it silently produces incorrect behavior.
I'm not familiar with this "Severity" scale. Maybe other people can make use
of it. Anyway, this paragraph is not wrong.
> > 53d4558960659ba7c4e9e2757bfb0977a5027fae
>
> Review of commit 53d4558960 — "quotearg: be nicer on macOS etc"
>
> Overall: No functional regressions found. The commit correctly addresses
> macOS/Android/musl C locales that use UTF-8 by introducing a
> C_LOCALE_MIGHT_BE_MULTIBYTE macro and falling back to <wchar.h> / <wctype.h>
> functions.
>
> Findings
> --------
>
> 1. ChangeLog entry: missing closing paren (Low / cosmetic)
>
> The commit message has "(wch, mbstate, mbs_clear, mbrtowch, chisprint,
> wchisprint" followed by "(quotearg_buffer_restyled):" — missing "):" after
> "wchisprint". The same typo is in the ChangeLog file. Documentation-only,
> no functional impact.
>
> 2. Preprocessor indentation inconsistency (Low / cosmetic)
>
> Line 73: "# include <wctype.h>" uses one space after #, but it sits at
> nesting depth 2 (inside #else -> # if USE_C_LOCALE). The sibling at line 76,
> "# include <uchar.h>", correctly uses two spaces. Should be
> "# include <wctype.h>" per the file's convention. No functional impact.
>
> 3. Dead (size_t)-3 guards when using mbrtowc (Informational)
>
> When USE_C_LOCALE && C_LOCALE_MIGHT_BE_MULTIBYTE, mbrtowch calls mbrtowc
> which never returns (size_t)-3. The #if !GNULIB_MBRTOC32_REGULAR guards that
> handle -3 become dead code. The mbsinit loop-control guard is redundant but
> correct (mbsinit always returns true after a complete mbrtowc conversion).
> No incorrect behavior is possible — this is functionally harmless dead code.
>
> No critical or medium-severity issues. The core logic — preprocessor
> partitioning, type selection, unibyte_locale fix, and mbrtowc/iswprint
> fallbacks — is correct.
I had not seen finding 1.
Finding 2 I had seen and corrected in a follow-up commit.
Finding 3: OK, it explains that it is "functionally harmless dead code".
However, it did miss the regression: The function c_isprint is used
if USE_C_LOCALE is true; however the include of <c-ctype.h> was changed
from USE_C_LOCALE to USE_C_LOCALE && !C_LOCALE_MIGHT_BE_MULTIBYTE.
> > f6894c646a4f2f544209faf8ecb57ba64d9b8238
>
> ## Commit Review: f6894c646a — quotearg: remove SIZE_MAX, INT_BITS
>
> **Verdict: No functional regressions found.**
>
> This commit is a clean minor cleanup that replaces a custom `INT_BITS` macro
> (`sizeof(int) * CHAR_BIT`) with the C23-standard `UINT_WIDTH` constant (with
> gnulib fallback via the `limits-h` module), and removes a dead `SIZE_MAX`
> fallback definition.
>
> ### Analysis
>
> 1. **INT_BITS → UINT_WIDTH substitution**: Semantically equivalent on all
> real-world platforms (both evaluate to 32 for 32-bit int). On hypothetical
> padding-bit platforms, `UINT_WIDTH` is actually *more correct* — the old
> `INT_BITS` could shift into padding bits (undefined behavior), while
> `UINT_WIDTH` stays within value bits. The `quote_these_too` array size is
> unchanged on real platforms, and would be safely *larger* on padding-bit
> platforms.
>
> 2. **SIZE_MAX removal**: Safe. The file already includes `<stdint.h>` (and
> depends on the gnulib `stdint-h` module), which defines `SIZE_MAX` on all
> supported toolchains. The `#ifndef` guard was dead code.
>
> 3. **ABI impact**: None. `struct quoting_options` is opaque — defined only in
> `quotearg.c`, forward-declared in `quotearg.h`. External consumers never
> allocate or directly access its fields.
>
> 4. **Dependency**: The added `limits-h` module dependency in
> `modules/quotearg`
> correctly ensures `UINT_WIDTH` is available. The gnulib `lib/limits.in.h`
> provides a fallback definition via `_GL_INTEGER_WIDTH(0, UINT_MAX)`.
>
> 5. **Cross-file impact**: `INT_BITS` and the `SIZE_MAX` fallback were
> file-local to `quotearg.c`. No other files depended on them.
All correct again.
Finding one regression out of two with such a tool, is impressing.
Two questions:
* Where did it get the information about the Gnulib modules from?
"The file already includes `<stdint.h>` (and depends on the gnulib
`stdint-h` module) ..."
Is it something that you explained in a prompt? Or did it "digest"
the Gnulib documentation first?
More generally, what's the size of the Gnulib specifics in the prompt
that you had to provide?
* What's an estimate for the cost that the analysis of these 4 commits
costed you or your employer (excluding your salary's cost)?
I'd like to compare it to the CI costs. The GitLab CI and GitHub CI
are zero-cost for us, but nonetheless I like to have an estimate based
on the (processing time * number of CPUs * electricity cost).
Bruno