Re: Tomas Vondra
> >> So I'm leaning to adjust pg_numa_init() to also check EPERM, per the
> >> attached patch. It still calls numa_available(), so that we don't
> >> silently miss future libnuma changes.
> >>
> >> Can you check this makes it work inside the docker container?
> > 
> > Yes your patch works. (Sorry I meant to test earlier, but RL...)
> 
> Thanks. I've pushed the fix (and backpatched to 18).

It looks like we are not done here yet :(

postgresql-18 is failing here intermittently with this diff:

12:20:24 --- /build/reproducible-path/postgresql-18-18.1/src/test/regress/expected/numa.out 2025-11-10 21:52:06.000000000 +0000
12:20:24 +++ /build/reproducible-path/postgresql-18-18.1/build/src/test/regress/results/numa.out 2025-12-11 11:20:22.618989603 +0000
12:20:24 @@ -6,8 +6,4 @@
12:20:24  -- switch to superuser
12:20:24  \c -
12:20:24  SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa;
12:20:24 - ok
12:20:24 -----
12:20:24 - t
12:20:24 -(1 row)
12:20:24 -
12:20:24 +ERROR:  invalid NUMA node id outside of allowed range [0, 0]: -2

That's REL_18_STABLE @ 580b5c, with the Debian packaging on top.

I've seen it on unstable/amd64, unstable/arm64, and Ubuntu
questing/amd64, where libnuma should take care of this itself, without
the extra patch in PG. There was another case on bullseye/amd64 which
has the old libnuma.

It's been frequent enough that it killed 4 out of the 10 builds
currently visible on
https://jengus.postgresql.org/job/postgresql-18-binaries-snapshot/.
(Though to be fair, only one distribution/arch combination was failing
for each of them.)

There is also one instance of it in
https://jengus.postgresql.org/job/postgresql-19-binaries-snapshot/

I currently have no idea what's happening.
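
The only hint I can offer is the value itself. On Linux, ENOENT is 2,
and move_pages(2) documents -ENOENT in its per-page status array as
"the page is not present". Whether that is really where the -2 comes
from is an assumption on my part, not something I have verified. A
minimal sketch for decoding such a value, where describe_page_status()
is a hypothetical helper and not PostgreSQL code:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of PostgreSQL or libnuma): describe one
 * per-page status value, ASSUMING it follows the move_pages(2) convention
 * where values >= 0 are NUMA node ids and negative values are negated
 * errno codes.
 */
static void
describe_page_status(int status, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);

    if (status >= 0)
        snprintf(buf, buflen, "node %d", status);
    else
        snprintf(buf, buflen, "errno %d (%s)", -status, strerror(-status));
}
```

Calling describe_page_status(-2, buf, sizeof(buf)) yields the errno
text for ENOENT, which would be consistent with a page that was not
(or no longer) resident at the time it was queried.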

Christoph

