Re: request for help: LLM-based quality assurance

Bruno Haible via Gnulib discussion list Mon, 08 Jun 2026 10:41:56 -0700

Hi Jeffrey,

> However, there are downsides to LLMs, and I would look
> into fixing the current processes while using the LLMs as a complement
> to existing practices.


That's a reasonable approach too.

> First, code should not be merged into Master until the CI tests have
> successfully run.  Commits that don't pass the CI test don't pass
> through the security gate.  The new code stays in a development branch
> or testing fork until they pass the CI tests.  This is how many (all?)
> organizations with a mature SDLC operate.

There are three reasons why we don't do it this way:

  - Gnulib commits can introduce many platform specific issues, and
    therefore the "CI tests" would need to be the multi-platform CI.
    This multi-platform CI is expensive: currently it takes more than
    11 hours of CPU time [1]. Even if GitHub use is free for us, there
    is some environmental cost: Assuming 100 W per CPU, a single run
    consumes 1.1 kWh, which costs ca. 0.21 $ in the USA or 0.40 € in Europe.

  - GitHub not being Free Software, we cannot make its use mandatory for
    all Gnulib contributors.

  - It is demotivating to have to struggle with such a gate. We all know
    that the "fun factor" is higher if a developer can complete a piece
    of work without going through robots and gates.
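
For illustration, the arithmetic in the first point can be sketched as
follows. The 11 CPU hours and the 100 W per CPU come from the text; the
per-kWh prices (0.19 $/kWh and 0.36 €/kWh) are assumptions, chosen only so
that the totals match the figures quoted above.

```python
# Rough energy and cost estimate for one multi-platform CI run.
CPU_HOURS = 11.0        # "more than 11 hours of CPU time", see [1]
WATTS_PER_CPU = 100.0   # assumption stated in the text

# Assumed electricity prices (hypothetical, back-derived from the totals).
USD_PER_KWH = 0.19
EUR_PER_KWH = 0.36


def energy_kwh(cpu_hours, watts):
    """Energy in kWh consumed by `cpu_hours` of CPU time at `watts` per CPU."""
    return cpu_hours * watts / 1000.0


def cost(kwh, price_per_kwh):
    """Electricity cost for `kwh` at the given price per kWh."""
    return kwh * price_per_kwh


kwh = energy_kwh(CPU_HOURS, WATTS_PER_CPU)
print(f"energy: {kwh:.1f} kWh")
print(f"USA:    {cost(kwh, USD_PER_KWH):.2f} $")
print(f"Europe: {cost(kwh, EUR_PER_KWH):.2f} EUR")
```

Since the run takes "more than" 11 hours, these numbers are a lower bound
under the stated assumptions.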

> Second, LLMs have at least three costs.  First is the deskilling that
> happens when relying on them. [1,2].  Relying too much on LLMs will
> have a negative effect on the talent contributing to the project.

I agree: When Collin or I review a commit from Paul, we can 1) learn from
him and 2) learn about Gnulib's internals. This build-up of skills is
lost if an LLM takes over this review.

> Second is the power consumed powering the algorithms, and its effect
> on renewable energy.[3,4]  The environmental and societal cost seems
> to be high.

I tend to agree: 15 minutes for an LLM-generated pretranslation of a
PO file seems adequate (because it would replace 1/2 hour of work of
a translator), whereas 15 minutes for an LLM-generated patch review
seems excessive (because a developer would do it in 1-2 minutes).

But things tend to get more efficient over time. Therefore if it's
not reasonable for cost reasons today, maybe it will be in two years?

Bruno

[1] https://github.com/gnu-gnulib/ci-testdir-check/actions/runs/27127350805
