Re: seawasp failing, maybe in glibc allocator

2021-06-26 Thread Fabien COELHO
Seawasp should turn green on its next run. It did! -- Fabien.

Re: seawasp failing, maybe in glibc allocator

2021-06-25 Thread Fabien COELHO
Hello Thomas, Seawasp should turn green on its next run. Hopefully. It is not scheduled very soon because Tom complained about the induced noise in one buildfarm report, so I set the check to run once a week. I changed it to start a run in a few minutes. I've rescheduled to once a day after

Re: seawasp failing, maybe in glibc allocator

2021-06-24 Thread Thomas Munro
On Mon, Jun 21, 2021 at 11:57 AM Tom Lane wrote: > If that's an accurate characterization of the tradeoff, I have little > difficulty in voting for #2. A crash is strictly worse than a memory > leak. Besides which, I've heard little indication that they might > revert. Agreed. On Mon, Jun 21,

Re: seawasp failing, maybe in glibc allocator

2021-06-21 Thread Andres Freund
Hi, On 2021-06-20 19:56:56 -0400, Tom Lane wrote: > Thomas Munro writes: > > Looking at their release schedule on https://llvm.org/, I see we have > > a gamble to make. They currently plan to cut RC1 at the end of July, > > and to release in late September (every second LLVM major release > > co

Re: seawasp failing, maybe in glibc allocator

2021-06-20 Thread Thomas Munro
On Sun, Jun 20, 2021 at 11:01 PM Andres Freund wrote: > On 2021-06-19 10:12:03 -0400, Tom Lane wrote: > > Is a compile-time conditional really going to be reliable? See nearby > > arguments about compile-time vs run-time checks for libpq features. > > It's not clear to me how tightly LLVM binds i

Re: seawasp failing, maybe in glibc allocator

2021-06-20 Thread Tom Lane
Thomas Munro writes: > Looking at their release schedule on https://llvm.org/, I see we have > a gamble to make. They currently plan to cut RC1 at the end of July, > and to release in late September (every second LLVM major release > coincides approximately with a PG major release). Option 1: wa

Re: seawasp failing, maybe in glibc allocator

2021-06-20 Thread Thomas Munro
On Sun, Jun 20, 2021 at 10:59 PM Andres Freund wrote: > I think this should be part of the earlier loop? Once > LLVMOrcAbsoluteSymbols() is called that owns the reference, so there > doesn't seem to be a reason to increase the refcount only later? Right, that makes sense. Here's a patch like tha
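The ownership point discussed above can be illustrated with a toy reference-count model. This is only a sketch: `PoolEntry`, `entry_retain`, `entry_release`, `consume_entries` and `hand_over` are hypothetical names invented for illustration and are not the LLVM C API. It assumes a consumer that takes over one reference per entry and drops it when it is done, which is how the thread describes the new behaviour.

```c
#include <assert.h>

/* Hypothetical stand-in for a reference-counted string pool entry. */
typedef struct PoolEntry
{
    int refcount;   /* number of outstanding references */
    int freed;      /* set to 1 instead of calling free(), so it can be inspected */
} PoolEntry;

/* Take one additional reference. */
static void
entry_retain(PoolEntry *e)
{
    e->refcount++;
}

/* Drop one reference; the entry is "freed" when the count reaches zero. */
static void
entry_release(PoolEntry *e)
{
    assert(e->refcount > 0);
    if (--e->refcount == 0)
        e->freed = 1;
}

/*
 * Hypothetical consumer that takes ownership of one reference per entry
 * and drops that reference when it has finished with it.
 */
static void
consume_entries(PoolEntry **entries, int n)
{
    for (int i = 0; i < n; i++)
        entry_release(entries[i]);
}

/*
 * Hand an entry the caller still needs to the consumer.
 * Returns 1 if the caller's entry is still alive afterwards, 0 if it was freed.
 */
static int
hand_over(PoolEntry *e, int retain_first)
{
    if (retain_first)
        entry_retain(e);    /* extra reference for the consumer to own */
    consume_entries(&e, 1);
    return !e->freed;
}
```

In this model, calling `hand_over(&e, 1)` on an entry with one reference leaves the caller's reference intact, while `hand_over(&e, 0)` frees the entry under the caller, which is the kind of situation that could later show up as a double free.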

Re: seawasp failing, maybe in glibc allocator

2021-06-20 Thread Andres Freund
Hi, On 2021-06-19 10:12:03 -0400, Tom Lane wrote: > Is a compile-time conditional really going to be reliable? See nearby > arguments about compile-time vs run-time checks for libpq features. > It's not clear to me how tightly LLVM binds its headers and running > code. It should be fine (and if

Re: seawasp failing, maybe in glibc allocator

2021-06-20 Thread Andres Freund
Hi, On 2021-06-19 17:07:51 +1200, Thomas Munro wrote: > On Sat, May 22, 2021 at 12:25 PM Andres Freund wrote: > > On 2021-05-21 15:57:01 -0700, Andres Freund wrote: > > > I found the LLVM commit to blame > > > (c8fc5e3ba942057d6c4cdcd1faeae69a28e7b671). > > > Contacting the author and reading th

Re: seawasp failing, maybe in glibc allocator

2021-06-19 Thread Tom Lane
Thomas Munro writes: > On Sat, Jun 19, 2021 at 5:07 PM Thomas Munro wrote: >> if (error != LLVMErrorSuccess) >> LLVMOrcDisposeMaterializationUnit(mu); >> >> +#if LLVM_VERSION_MAJOR > 12 >> + for (int i = 0; i < LookupSetSize; i++) >> + LLVMOrcRetainSymbolStringPoolEntry(symbo

Re: seawasp failing, maybe in glibc allocator

2021-06-19 Thread Thomas Munro
On Sat, Jun 19, 2021 at 5:07 PM Thomas Munro wrote: > if (error != LLVMErrorSuccess) > LLVMOrcDisposeMaterializationUnit(mu); > > +#if LLVM_VERSION_MAJOR > 12 > + for (int i = 0; i < LookupSetSize; i++) > + LLVMOrcRetainSymbolStringPoolEntry(symbols[i].N

Re: seawasp failing, maybe in glibc allocator

2021-06-18 Thread Thomas Munro
On Sat, May 22, 2021 at 12:25 PM Andres Freund wrote: > On 2021-05-21 15:57:01 -0700, Andres Freund wrote: > > I found the LLVM commit to blame (c8fc5e3ba942057d6c4cdcd1faeae69a28e7b671). > > Contacting the author and reading the change to see if I can spot the > > issue myself. > > Hrmpf. It's a

Re: seawasp failing, maybe in glibc allocator

2021-05-22 Thread Fabien COELHO
We know that seawasp was okay as of configure: using compiler=clang version 13.0.0 (https://github.com/llvm/llvm-project.git f22d3813850f9e87c5204df6844a93b8c5db7730) and not okay as of configure: using compiler=clang version 13.0.0 (https://github.com/llvm/llvm-project.git 0e8f5e4a686483

Re: seawasp failing, maybe in glibc allocator

2021-05-21 Thread Andres Freund
Hi, On 2021-05-21 15:57:01 -0700, Andres Freund wrote: > I found the LLVM commit to blame (c8fc5e3ba942057d6c4cdcd1faeae69a28e7b671). > Contacting the author and reading the change to see if I can spot the > issue myself. Hrmpf. It's a silent API breakage. The author intended to email us about it

Re: seawasp failing, maybe in glibc allocator

2021-05-21 Thread Andres Freund
Hi, On 2021-05-21 18:18:54 -0400, Tom Lane wrote: > Andres Freund writes: > > Interesting. I tried this with a slightly older LLVM checkout > > (6f4f0afaa8ae), from 2021-04-20, contrib/ltree tests run without an > > issue, even if I force everything to be jitted+inlined+optimized. The > > git has

Re: seawasp failing, maybe in glibc allocator

2021-05-21 Thread Andres Freund
Hi, On 2021-05-21 14:58:38 -0700, Andres Freund wrote: > Interesting. I tried this with a slightly older LLVM checkout > (6f4f0afaa8ae), from 2021-04-20, contrib/ltree tests run without an > issue, even if I force everything to be jitted+inlined+optimized. The > git hash in the package version ind

Re: seawasp failing, maybe in glibc allocator

2021-05-21 Thread Tom Lane
Andres Freund writes: > Interesting. I tried this with a slightly older LLVM checkout > (6f4f0afaa8ae), from 2021-04-20, contrib/ltree tests run without an > issue, even if I force everything to be jitted+inlined+optimized. The > git hash in the package version indicates the commit is from > 2021-

Re: seawasp failing, maybe in glibc allocator

2021-05-21 Thread Andres Freund
Hi, On 2021-05-21 21:37:22 +1200, Thomas Munro wrote: > I installed Clang/LLVM version > "1:13~++20210520071732+02f2d739e074-1~exp1~20210520052519.57" from > https://apt.llvm.org/ on a Debian buster box, and I saw that > contrib/ltree's tests fail about half the time with a range of weird > and won

Re: seawasp failing, maybe in glibc allocator

2021-05-21 Thread Thomas Munro
On Wed, May 19, 2021 at 5:02 AM Fabien COELHO wrote: > It seems that the upload of the valgrind run (many hours…) failed on "413 > request entity too large", and everything seems to have been cleaned > despite the "--keepall" I think I put when I started the run. I installed Clang/LLVM version "1

Re: seawasp failing, maybe in glibc allocator

2021-05-18 Thread Fabien COELHO
The issue is non-deterministically triggered in contrib checks, either in int or ltree, but not elsewhere. This suggests issues specific to these modules, or triggered by these modules. Hmmm… Hmm, yeah. A couple of different ways that ltreetest fails without crashing: https://buildfarm.postg

Re: seawasp failing, maybe in glibc allocator

2021-05-16 Thread Thomas Munro
On Sat, May 15, 2021 at 6:41 PM Fabien COELHO wrote: > The issue is non-deterministically triggered in contrib checks, either in > int or ltree, but not elsewhere. This suggests issues specific to these > modules, or triggered by these modules. Hmmm… Hmm, yeah. A couple of different ways that lt

Re: seawasp failing, maybe in glibc allocator

2021-05-14 Thread Fabien COELHO
Hello Andres, It finally failed with a core on 8f72bba, in llvm_shutdown, AFAIKS in a free while doing malloc-related housekeeping. My guess is that there is an actual memory corruption somewhere. It is not obvious whether it is in bleeding-edge llvm or bleeding-edge postgres though. The is

Re: seawasp failing, maybe in glibc allocator

2021-05-12 Thread Fabien COELHO
Possibly I have just added "ulimit -c unlimited" in the script, we should see the effect on next round. For def5b065 it ended on the contrib ltree test: 2021-05-12 20:12:52.528 CEST [3042602:410] pg_regress/ltree LOG: disconnection: session time: 0:00:13.426 user=buildfarm database=co

Re: seawasp failing, maybe in glibc allocator

2021-05-11 Thread Fabien COELHO
Hello Andres, Unless perhaps the hard rlimit for -C is set? ulimit -c -H should show that. Possibly I have just added "ulimit -c unlimited" in the script, we should see the effect on next round. If it's the hard limit that won't help, because the hard limit can only be increased by a priv
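The soft versus hard limit distinction mentioned above can be checked from C with the POSIX `getrlimit`/`setrlimit` calls. The following is a minimal sketch, not part of the buildfarm scripts; the helper name `raise_core_soft_limit` is made up for illustration. It raises the soft core-file limit as far as the hard limit allows, which is all an unprivileged process can do.

```c
#include <sys/resource.h>

/*
 * Raise the soft RLIMIT_CORE to the current hard limit and report the
 * resulting limits in *out. Returns 0 on success, -1 on error.
 * An unprivileged process cannot go beyond the hard limit, so if the hard
 * limit is 0 no core file will be written regardless of the soft setting.
 */
static int
raise_core_soft_limit(struct rlimit *out)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;

    rl.rlim_cur = rl.rlim_max;      /* soft limit may be raised up to the hard limit */
    if (setrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;

    /* Read the limits back so the caller sees what is actually in effect. */
    if (getrlimit(RLIMIT_CORE, out) != 0)
        return -1;
    return 0;
}
```

After a successful call, `out->rlim_max == 0` would indicate that the hard limit itself forbids core files, the case where "ulimit -c unlimited" in a script cannot help.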

Re: seawasp failing, maybe in glibc allocator

2021-05-11 Thread Andres Freund
On 2021-05-11 10:22:02 +0200, Fabien COELHO wrote: > > > On 2021-05-11 12:16:44 +1200, Thomas Munro wrote: > > > OK we got the SIGABRT this time, but still no backtrace. If the > > > kernel's core_pattern is "core", gdb is installed, then considering > > > that the buildfarm core_file_glob is "co

Re: seawasp failing, maybe in glibc allocator

2021-05-11 Thread Fabien COELHO
On 2021-05-11 12:16:44 +1200, Thomas Munro wrote: OK we got the SIGABRT this time, but still no backtrace. If the kernel's core_pattern is "core", gdb is installed, then considering that the buildfarm core_file_glob is "core*" and the script version is recent (REL_12), then I'm out of ideas.

Re: seawasp failing, maybe in glibc allocator

2021-05-10 Thread Andres Freund
On 2021-05-11 12:16:44 +1200, Thomas Munro wrote: > OK we got the SIGABRT this time, but still no backtrace. If the > kernel's core_pattern is "core", gdb is installed, then considering > that the buildfarm core_file_glob is "core*" and the script version is > recent (REL_12), then I'm out of idea

Re: seawasp failing, maybe in glibc allocator

2021-05-10 Thread Thomas Munro
On Mon, May 10, 2021 at 11:21 PM Fabien COELHO wrote: > > And of course this time it succeeded :-) > > Hmmm. ISTM that failures are on and off every few attempts. OK we got the SIGABRT this time, but still no backtrace. If the kernel's core_pattern is "core", gdb is installed, then considering t

Re: seawasp failing, maybe in glibc allocator

2021-05-10 Thread Fabien COELHO
And of course this time it succeeded :-) Hmmm. ISTM that failures are on and off every few attempts. Just by the way, I noticed it takes ~40 minutes to compile. Is there a reason you don't install ccache and set eg CC="ccache /path/to/clang", CXX="ccache /path/to/clang++", CLANG="ccache /p

Re: seawasp failing, maybe in glibc allocator

2021-05-10 Thread Thomas Munro
On Mon, May 10, 2021 at 9:30 PM Fabien COELHO wrote: > I forced-removed apport (which meant removing xserver-xorg). Let's see > whether the reports are better or whether I break something. And of course this time it succeeded :-) Just by the way, I noticed it takes ~40 minutes to compile. Is th

Re: seawasp failing, maybe in glibc allocator

2021-05-10 Thread Fabien COELHO
If you don't care about Ubuntu "apport" on this system (something for sending crash/bug reports to developers with a GUI), you could uninstall it (otherwise it overwrites the core_pattern every time it restarts, no matter what you write in your sysctl.conf, apparently), and then sudo sysctl -w
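Whether the kernel writes a plain core file or pipes it to a handler can be read from `/proc/sys/kernel/core_pattern`: on Linux, a pattern beginning with `|` means the dump is piped to a program. The sketch below is only illustrative; the helper names are hypothetical, and the `core*` glob is the `core_file_glob` value quoted elsewhere in this thread.

```c
#include <fnmatch.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the pattern pipes core dumps to a handler program. */
static int
core_pattern_is_piped(const char *pattern)
{
    return pattern[0] == '|';
}

/* Returns 1 if a file name would be found by the glob "core*". */
static int
matches_core_glob(const char *filename)
{
    return fnmatch("core*", filename, 0) == 0;
}

/*
 * Read the current core_pattern into buf (at most len bytes, newline stripped).
 * Returns 0 on success, -1 if the file cannot be read (e.g. not on Linux).
 */
static int
read_core_pattern(char *buf, size_t len)
{
    FILE *f = fopen("/proc/sys/kernel/core_pattern", "r");

    if (f == NULL)
        return -1;
    if (fgets(buf, (int) len, f) == NULL)
    {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

If `read_core_pattern` succeeds and `core_pattern_is_piped` returns 1 for the result, core files are going to a handler rather than to the working directory, so a glob such as `core*` would find nothing there.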

Re: seawasp failing, maybe in glibc allocator

2021-05-10 Thread Thomas Munro
On Mon, May 10, 2021 at 6:59 PM Fabien COELHO wrote: > > Is gdb installed, and are core files being dumped by that SIGABRT, and > > are they using the default name (/proc/sys/kernel/core_pattern = core), > > which the BF can find with the value it's using, namely 'core_file_glob' > > => 'core*'? >

Re: seawasp failing, maybe in glibc allocator

2021-05-09 Thread Fabien COELHO
Hello Thomas, Since seawasp's bleeding-edge clang moved to "20210226", it failed every run except 4, and a couple of days ago it moved to "20210508" and it's still broken. I have noticed that there is indeed an issue, but the investigation is not very high on my current too deep pg-u

seawasp failing, maybe in glibc allocator

2021-05-09 Thread Thomas Munro
Hi, Since seawasp's bleeding-edge clang moved to "20210226", it failed every run except 4, and a couple of days ago it moved to "20210508" and it's still broken. It's always like this: 2021-05-09 03:31:37.602 CEST [1678796:171] pg_regress/_int LOG: statement: RESET enable_seqscan; corrupted doub