Re: How did queensnake corrupt zic.o?

2022-02-15 Thread Filipe Rosset
Hi guys,
I had to disable ccache on queensnake; all builds are fine right now. Let's
see how it goes.
Due to the OS migrations on queensnake (Scientific Linux -> CentOS -> Rocky Linux),
I think it's time to decommission queensnake and maybe request a new animal
for a new server.

On Mon, Feb 14, 2022, 03:25 Tom Lane wrote:

> Thomas Munro writes:
> > We see a successful compile and then a failure to read the file while
> > linking.  We see that the animal got into that state recently and then
> > fixed itself, and now it's back in that state.  I don't know if it's
> > significant, but it happened to fix itself when a configure change
> > came along, which might be explained by ccache invalidation; that is,
> > the failure mode doesn't depend on the input files, but once it's
> > borked you need a change to kick ccache.  My memory may be playing
> > tricks on me but I vaguely recall seeing another animal do this, a
> > while back.
>
> queensnake's seen repeated cycles of unexplainable build failures.
> I wonder if it's using a bogus ccache version, or if the machine
> itself is flaky.
>
> regards, tom lane
>


Re: sys_siglist[] is causing us trouble again

2020-07-15 Thread Filipe Rosset
On Wed, Jul 15, 2020 at 7:48 PM Tom Lane wrote:

> As of a couple days ago, buildfarm member caiman (Fedora rawhide)
> is failing like this in all the pre-v12 branches:
>
> ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith
> -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute
> -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard
> -Wno-format-truncation -Wno-stringop-truncation -g -O2 -DFRONTEND
> -I../../src/include -D_GNU_SOURCE -I/usr/include/libxml2   -c -o
> wait_error.o wait_error.c
> wait_error.c: In function ‘wait_result_to_str’:
> wait_error.c:71:6: error: ‘sys_siglist’ undeclared
> (first use in this function)
>    71 |  sys_siglist[WTERMSIG(exitstatus)] : "(unknown)");
>       |  ^~~
> wait_error.c:71:6: note: each undeclared identifier is reported only once
> for each function it appears in
> make[2]: *** [: wait_error.o] Error 1
>
> We haven't changed anything, ergo something changed at the OS level.
>
> Oddly, we'd not get to this code unless configure set
> HAVE_DECL_SYS_SIGLIST, so it's defined *somewhere*.  I suspect the root
> issue here is some rearrangement of system header files combined with
> wait_error.c (and maybe other places?) not including exactly the same
> headers that configure tested.
>
> Anyway, rather than installing rawhide and trying to debug this,
> I'd like to make a modest proposal: let's back-patch the v12
> patches that made us stop relying on sys_siglist[], viz a73d08319
> and cc92cca43.  Per the discussions that led to those patches,
> it's been decades since any platform didn't have POSIX-compliant
> strsignal(), so we'd be much better off relying on that.
>
> regards, tom lane
>

I believe it's related to these recent glibc changes in rawhide:
https://src.fedoraproject.org/rpms/glibc/c/0aab7eb58528999277c626fc16682da179de03d0?branch=master

  - signal: Move sys_errlist to a compat symbol
  - signal: Move sys_siglist to a compat symbol
SHA512 (glibc-2.31.9000-683-gffb17e7ba3.tar.xz) =
103ff3c04de5dc149df93e5399de1630f6fff1b8d7f127881d6e530492b8b953a8064205ceecb311a77c0a10de3a5ab2056121fd1fa833a30327c6b1f08beacc
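
For illustration, here is a minimal sketch of the approach Tom proposes above: describing a child's wait status with POSIX strsignal() instead of indexing sys_siglist[]. The helper name describe_wait_status and its message wording are hypothetical; this is not the code from a73d08319 or cc92cca43, and it assumes a platform that provides strsignal() in <string.h>.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX.1-2008 strsignal() declaration */
#endif

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

/*
 * Hypothetical helper (not PostgreSQL's wait_result_to_str): format a
 * status value, as returned by wait() or waitpid(), into buf.
 */
static void
describe_wait_status(int exitstatus, char *buf, size_t buflen)
{
    if (WIFEXITED(exitstatus))
    {
        snprintf(buf, buflen, "child process exited with exit code %d",
                 WEXITSTATUS(exitstatus));
    }
    else if (WIFSIGNALED(exitstatus))
    {
        int         signo = WTERMSIG(exitstatus);
        /* strsignal() replaces the sys_siglist[signo] lookup */
        const char *name = strsignal(signo);

        /* keep a fallback in case an implementation returns NULL */
        snprintf(buf, buflen, "child process was terminated by signal %d: %s",
                 signo, name ? name : "(unknown)");
    }
    else
    {
        snprintf(buf, buflen, "child process exited with unrecognized status %d",
                 exitstatus);
    }
}
```

A caller would pass the status obtained from waitpid() along with a buffer of, say, 128 bytes. Because the sketch never references sys_siglist, it should not be affected by glibc moving that symbol to a compat symbol.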