Re: LLVM 16 (opaque pointers)

2023-10-24 Thread Mark Wong
On Mon, Oct 23, 2023 at 10:47:24PM -0400, Tom Lane wrote:
> Andres Freund  writes:
> > FC 29 is well out of support, so I don't think it makes sense to invest any
> > further time in this. Personally, I don't think it's useful to have years 
> > old
> > fedora in the buildfarm...
> 
> +1.  It's good to test old LTS distros, but Fedora releases have a
> short shelf life by design.

I'll start retiring those old Fedora ones I have. :)

Regards,
Mark




Re: LLVM 16 (opaque pointers)

2023-10-23 Thread Tom Lane
Andres Freund  writes:
> FC 29 is well out of support, so I don't think it makes sense to invest any
> further time in this. Personally, I don't think it's useful to have years old
> fedora in the buildfarm...

+1.  It's good to test old LTS distros, but Fedora releases have a
short shelf life by design.

regards, tom lane




Re: LLVM 16 (opaque pointers)

2023-10-23 Thread Andres Freund
Hi,

On 2023-10-24 10:17:22 +1300, Thomas Munro wrote:
> This POWER animal fails, unexpectedly to me:
> 
> > bonito: 7.0.1 Fedora 29
> 
> We could try to chase that down, or we could rejoice that at least it
> works on current release.  It must begin working somewhere between 7
> and 11, but when I checked which LLVM releases I could easily install
> on eg cascabel (if I could get access) using the repo at apt.llvm.org,
> I saw that they don't even have anything older than 11.  So someone
> with access who wants to figure this out might have many days or weeks
> of compiling ahead of them.

I could reproduce the failure on bonito. The stack trace is:
#0  0x7fffb83541e8 in raise () from /lib64/libc.so.6
#1  0x7fffb833448c in abort () from /lib64/libc.so.6
#2  0x7fff9c68dd78 in std::__replacement_assert (_file=, 
_line=, _function=, _condition=)
at /usr/include/c++/8/ppc64le-redhat-linux/bits/c++config.h:447
#3  0x7fff9df90838 in std::unique_ptr >::operator* (
this=0x1b946cb8) at ../include/llvm/Support/MemAlloc.h:29
#4  llvm::OrcCBindingsStack::OrcCBindingsStack(llvm::TargetMachine&, 
std::function > ()>) (this=0x1b946be0, 
TM=..., IndirectStubsMgrBuilder=...) at 
../lib/ExecutionEngine/Orc/OrcCBindingsStack.h:242
#5  0x7fff9df90940 in LLVMOrcCreateInstance (TM=0x1b933ae0) at 
/usr/include/c++/8/bits/move.h:182
#6  0x7fffa0618f8c in llvm_session_initialize () at 
/home/andres/src/postgres/src/backend/jit/llvm/llvmjit.c:981
#7  0x7fffa06179a8 in llvm_create_context (jitFlags=25) at 
/home/andres/src/postgres/src/backend/jit/llvm/llvmjit.c:219
#8  0x7fffa0626cbc in llvm_compile_expr (state=0x1b8ef390) at 
/home/andres/src/postgres/src/backend/jit/llvm/llvmjit_expr.c:142
#9  0x10a76fc8 in jit_compile_expr (state=0x1b8ef390) at 
/home/andres/src/postgres/src/backend/jit/jit.c:177
#10 0x10404550 in ExecReadyExpr (state=0x1b8ef390) at 
/home/andres/src/postgres/src/backend/executor/execExpr.c:875

with this assertion message printed:
/usr/include/c++/8/bits/unique_ptr.h:328: typename 
std::add_lvalue_reference<_Tp>::type std::unique_ptr<_Tp, Dp>::operator*() 
const [with Tp = llvm::orc::JITCompileCallbackManager; _Dp = 
std::default_delete; typename 
std::add_lvalue_reference<_Tp>::type = llvm::orc::JITCompileCallbackManager&]: 
Assertion 'get() != pointer()' failed.


I wanted to use a debug build to investigate in more detail, but bonito is a
small VM. Thus I built llvm 7 on a more powerful gcc cfarm machine, running on
AlmaLinux 9.2.  The problem doesn't reproduce there.

Given the crash in some c++ standard library code, that the fc29 patches to
llvm look harmless, that building/using llvm 7 on a newer distro does not show
issues on PPC, it seems likely that this is a compiler / standard library
issue.

FC 29 is well out of support, so I don't think it makes sense to invest any
further time in this. Personally, I don't think it's useful to have years old
fedora in the buildfarm...

Greetings,

Andres Freund




Re: LLVM 16 (opaque pointers)

2023-10-23 Thread Mark Wong
On Tue, Oct 24, 2023 at 10:17:22AM +1300, Thomas Munro wrote:
> On Tue, Oct 24, 2023 at 4:27 AM Mark Wong  wrote:
> > I haven't gotten around to disabling llvm on any of my animals with llvm
> > < 7 yet.  Do you still want to hold on that?
> 
> Yes, please disable --with-llvm on s390x and POWER animals with LLVM <
> 7 (see below).  Also, you have a bunch of machines with LLVM 16 that
> are failing to compile on REL_11_STABLE.  That is expected, because we
> agreed not to back-patch the LLVM 16 API changes into REL_11_STABLE:
> 
> > kingsnake: 16.0.6 Fedora Linux 38
> > krait: CentOS 16.0.6 Stream 8
> > lancehead: CentOS 16.0.6 Stream 8

I should have updated these to not use --with-llvm for REL_11_STABLE.

> These POWER machines fail as expected, and it's unfixable:
> 
> > elasmobranch: 5.0.1 openSUSE Leap 15.0
> > demoiselle: 5.0.1 openSUSE Leap 15.0
> > cavefish: 6.0.0 Ubuntu 18.04.6 LTS

These should now be updated to not use --with-llvm at all.

> These s390x animals are failing, but don't show the layout complaint.
> I can see that LLVM 6 also lacked a case for s390x in
> llvm::orc::createLocalIndirectStubsManagerBuilder(), the thing that
> was fixed in 7 with the addition of a default case.  Therefore these
> presumably fail just like old LLVM on POWER, and it's unfixable.  So I
> suggest turning off --with-llvm on these two:
> 
> > cotinga: 6.0.0 Ubuntu 18.04.6 LTS
> > perch: 6.0.0 Ubuntu 18.04.6 LTS

Ok, I should have removed --with-llvm here too.

> This s390x animal doesn't actually have --with-llvm enabled so it
> passes, but surely it'd be just like lora:
> 
> > mamushi: 15.0.7 Red Hat Enterprise Linux 9.2

Oops, I think I added it now.


I think I made all the recommended changes, and trimmed out the lines
where I didn't need to do anything. :)

Andres pointed out to me that my animals aren't set up to collect core
file so I'm also trying to update that too...

Regards,
Mark




Re: LLVM 16 (opaque pointers)

2023-10-23 Thread Thomas Munro
On Tue, Oct 24, 2023 at 4:27 AM Mark Wong  wrote:
> I haven't gotten around to disabling llvm on any of my animals with llvm
> < 7 yet.  Do you still want to hold on that?

Yes, please disable --with-llvm on s390x and POWER animals with LLVM <
7 (see below).  Also, you have a bunch of machines with LLVM 16 that
are failing to compile on REL_11_STABLE.  That is expected, because we
agreed not to back-patch the LLVM 16 API changes into REL_11_STABLE:

> kingsnake: 16.0.6 Fedora Linux 38
> krait: CentOS 16.0.6 Stream 8
> lancehead: CentOS 16.0.6 Stream 8

These POWER machines fail as expected, and it's unfixable:

> elasmobranch: 5.0.1 openSUSE Leap 15.0
> demoiselle: 5.0.1 openSUSE Leap 15.0
> cavefish: 6.0.0 Ubuntu 18.04.6 LTS

(Well, we could in theory supply our own fixed
llvm::orc::createLocalIndirectStubsManagerBuilder() function to hide
the broken one in LLVM <= 6, but that way lies madness IMHO.  An LTS
distro that cares could look into back-patching LLVM's fixes, but for
us, let us focus on current software.)

This POWER animal fails, unexpectedly to me:

> bonito: 7.0.1 Fedora 29

We could try to chase that down, or we could rejoice that at least it
works on current release.  It must begin working somewhere between 7
and 11, but when I checked which LLVM releases I could easily install
on eg cascabel (if I could get access) using the repo at apt.llvm.org,
I saw that they don't even have anything older than 11.  So someone
with access who wants to figure this out might have many days or weeks
of compiling ahead of them.

These POWER animals are passing, as expected:

> cascabel: 11.0.1 Debian GNU/Linux 11
> babbler: 15.0.7 AlmaLinux 8.8
> pytilia: 15.0.7 AlmaLinux 8.8
> nicator: 15.0.7 AlmaLinux 9.2
> twinspot: 15.0.7 AlmaLinux 9.2
> habu: 16.0.6 Fedora Linux 38
> kingsnake: 16.0.6 Fedora Linux 38
> krait: CentOS 16.0.6 Stream 8
> lancehead: CentOS 16.0.6 Stream 8

These s390x animals are passing:

> branta: 10.0.0 Ubuntu 20.04.4 LTS
> pike: 11.0.1 Debian GNU/Linux 11
> rinkhals: 11.0.1 Debian GNU/Linux 11
> sarus: 14.0.0 Ubuntu 22.04.1 LTS

These s390x animals are failing, but don't show the layout complaint.
I can see that LLVM 6 also lacked a case for s390x in
llvm::orc::createLocalIndirectStubsManagerBuilder(), the thing that
was fixed in 7 with the addition of a default case.  Therefore these
presumably fail just like old LLVM on POWER, and it's unfixable.  So I
suggest turning off --with-llvm on these two:

> cotinga: 6.0.0 Ubuntu 18.04.6 LTS
> perch: 6.0.0 Ubuntu 18.04.6 LTS

These s390x animals are failing with the mismatched layout problem,
which should be fixed by Andres's patch to tolerate the changing
z12/z13 ABI thing by falling back to whatever clang picked (at a cost
of not using all the features of your newer CPU, unless you explicitly
tell clang to target it):

> aracari: 15.0.7 Red Hat Enterprise Linux 8.6
> pipit: 15.0.7 Red Hat Enterprise Linux 8.6
> lora: 15.0.7 Red Hat Enterprise Linux 9.2

This s390x animal doesn't actually have --with-llvm enabled so it
passes, but surely it'd be just like lora:

> mamushi: 15.0.7 Red Hat Enterprise Linux 9.2




Re: LLVM 16 (opaque pointers)

2023-10-23 Thread Mark Wong
On Mon, Oct 23, 2023 at 01:15:04PM +1300, Thomas Munro wrote:
> On Sun, Oct 22, 2023 at 2:44 PM Thomas Munro  wrote:
> > On Sat, Oct 21, 2023 at 2:45 PM Tom Lane  wrote:
> > > Thomas Munro  writes:
> > > > (It'd be nice if the
> > > > build farm logged "$LLVM_CONFIG --version" somewhere.)
> > >
> > > It's not really the buildfarm script's responsibility to do that,
> > > but feel free to make configure do so.
> >
> > Done, copying the example of how we do it for perl and various other things.
> 
> Build farm measles update:
> 
> With that we can see that nicator (LLVM 15 on POWER) is green.  We can
> see that cavefish (LLVM 6 on POWER) is red as expected.  We can also
> see that bonito (LLVM 7 on POWER) is red, so my earlier theory that
> this might be due to the known bug we got fixed in LLVM 7 is not
> enough.  Maybe there are other things fixed on POWER somewhere between
> those LLVM versions?  I suspect it'll be hard to figure out without
> debug builds and backtraces.

I haven't gotten around to disabling llvm on any of my animals with llvm
< 7 yet.  Do you still want to hold on that?

> One thing is definitely our fault, though.  xenodermus shows failures
> on REL_12_STABLE and REL_13_STABLE, using debug LLVM 6 on x86.  I
> couldn't reproduce this locally on (newer) debug LLVM, so I bugged
> Andres for access to the host/libraries and chased it down.  There is
> some type punning for a function parameter REL_13_STABLE and earlier,
> removed by Andres in REL_14_STABLE, and when I back-patched my
> refactoring I effectively back-patched a few changes from his commit
> df99ddc70b97 that removed the type punning, but I should have brought
> one more line from that commit to remove another trace of it.  See
> attached.

Here are my list of llvm-config versions and distros for s390x and POWER
(didn't see any issues on aarch64 but I grabbed all the info at the same
time.)

s390x:

branta: 10.0.0 Ubuntu 20.04.4 LTS
cotinga: 6.0.0 Ubuntu 18.04.6 LTS
perch: 6.0.0 Ubuntu 18.04.6 LTS
sarus: 14.0.0 Ubuntu 22.04.1 LTS
aracari: 15.0.7 Red Hat Enterprise Linux 8.6
pipit: 15.0.7 Red Hat Enterprise Linux 8.6
lora: 15.0.7 Red Hat Enterprise Linux 9.2
mamushi: 15.0.7 Red Hat Enterprise Linux 9.2
pike: 11.0.1 Debian GNU/Linux 11
rinkhals: 11.0.1 Debian GNU/Linux 11


POWER:

bonito: 7.0.1 Fedora 29
cavefish: 6.0.0 Ubuntu 18.04.6 LTS
demoiselle: 5.0.1 openSUSE Leap 15.0
elasmobranch: 5.0.1 openSUSE Leap 15.0
babbler: 15.0.7 AlmaLinux 8.8
pytilia: 15.0.7 AlmaLinux 8.8
nicator: 15.0.7 AlmaLinux 9.2
twinspot: 15.0.7 AlmaLinux 9.2
cascabel: 11.0.1 Debian GNU/Linux 11
habu: 16.0.6 Fedora Linux 38
kingsnake: 16.0.6 Fedora Linux 38
krait: CentOS 16.0.6 Stream 8
lancehead: CentOS 16.0.6 Stream 8


aarch64:

boiga: 14.0.5 Fedora Linux 36
corzo: 14.0.5 Fedora Linux 36
desman: 16.0.6 Fedora Linux 38
motmot: 16.0.6 Fedora Linux 38
whinchat: 11.0.1 Debian GNU/Linux 11
jackdaw: 11.0.1 Debian GNU/Linux 11
blackneck: 7.0.1 Debian GNU/Linux 10
alimoche: 7.0.1 Debian GNU/Linux 10
bulbul: 15.0.7 AlmaLinux 8.8
broadbill: 15.0.7 AlmaLinux 8.8
oystercatcher: 15.0.7 AlmaLinux 9.2
potoo: 15.0.7 AlmaLinux 9.2
whiting: 6.0.0 Ubuntu 18.04.5 LTS
vimba: 6.0.0 Ubuntu 18.04.5 LTS
splitfin: 10.0.0 Ubuntu 20.04.6 LTS
rudd: 10.0.0 Ubuntu 20.04.6 LTS
turbot: 14.0.0 Ubuntu 22.04.3 LTS
shiner: 14.0.0 Ubuntu 22.04.3 LTS
ziege: 16.0.6 CentOS Stream 8
chevrotain: 11.0.1 Debian GNU/Linux 11

Regards,
Mark




Re: LLVM 16 (opaque pointers)

2023-10-22 Thread Thomas Munro
On Sun, Oct 22, 2023 at 2:44 PM Thomas Munro  wrote:
> On Sat, Oct 21, 2023 at 2:45 PM Tom Lane  wrote:
> > Thomas Munro  writes:
> > > (It'd be nice if the
> > > build farm logged "$LLVM_CONFIG --version" somewhere.)
> >
> > It's not really the buildfarm script's responsibility to do that,
> > but feel free to make configure do so.
>
> Done, copying the example of how we do it for perl and various other things.

Build farm measles update:

With that we can see that nicator (LLVM 15 on POWER) is green.  We can
see that cavefish (LLVM 6 on POWER) is red as expected.  We can also
see that bonito (LLVM 7 on POWER) is red, so my earlier theory that
this might be due to the known bug we got fixed in LLVM 7 is not
enough.  Maybe there are other things fixed on POWER somewhere between
those LLVM versions?  I suspect it'll be hard to figure out without
debug builds and backtraces.

One thing is definitely our fault, though.  xenodermus shows failures
on REL_12_STABLE and REL_13_STABLE, using debug LLVM 6 on x86.  I
couldn't reproduce this locally on (newer) debug LLVM, so I bugged
Andres for access to the host/libraries and chased it down.  There is
some type punning for a function parameter REL_13_STABLE and earlier,
removed by Andres in REL_14_STABLE, and when I back-patched my
refactoring I effectively back-patched a few changes from his commit
df99ddc70b97 that removed the type punning, but I should have brought
one more line from that commit to remove another trace of it.  See
attached.
From 52397f8c70641080f2487ee5f019f143dd35957c Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Sun, 22 Oct 2023 23:20:56 +
Subject: [PATCH] jit: Fix LLVM back-patching bug in 12 and 13.

While back-patching f90b4a84, I missed that branches before 14 did some
type punning in a function parameter.  That didn't cause any problem for
release builds of LLVM, but debug builds of some older versions would
fail a type cross-check assertion.  To fix this, we need to back-patch a
line of df99ddc7.

Per build farm animal xenodermus, which runs a debug build of LLVM 6
with jit_above_cost=0.

diff --git a/src/backend/jit/llvm/llvmjit_expr.c b/src/backend/jit/llvm/llvmjit_expr.c
index d84fbba7cc..c2e367f00d 100644
--- a/src/backend/jit/llvm/llvmjit_expr.c
+++ b/src/backend/jit/llvm/llvmjit_expr.c
@@ -1126,7 +1126,7 @@ llvm_compile_expr(ExprState *state)
 		 llvm_pg_var_type("TypeExecEvalSubroutine"));
 
 	v_params[0] = v_state;
-	v_params[1] = l_ptr_const(op, l_ptr(TypeSizeT));
+	v_params[1] = l_ptr_const(op, l_ptr(StructExprEvalStep));
 	v_params[2] = v_econtext;
 	l_call(b,
 		   LLVMGetFunctionType(ExecEvalSubroutineTemplate),
-- 
2.42.0



Re: LLVM 16 (opaque pointers)

2023-10-21 Thread Thomas Munro
On Sat, Oct 21, 2023 at 2:45 PM Tom Lane  wrote:
> Thomas Munro  writes:
> > (It'd be nice if the
> > build farm logged "$LLVM_CONFIG --version" somewhere.)
>
> It's not really the buildfarm script's responsibility to do that,
> but feel free to make configure do so.

Done, copying the example of how we do it for perl and various other things.




Re: LLVM 16 (opaque pointers)

2023-10-21 Thread Thomas Munro
On Sat, Oct 21, 2023 at 7:08 PM Andres Freund  wrote:
> I've attached a patch revision that I spent the last couple hours working
> on. It's very very roughly based on a patch Tom Stellard had written (which I
> think a few rpm packages use). But instead of encoding details about specific
> layout details, I made the code check if the data layout works and fall back
> to the cpu / features used for llvmjit_types.bc.  This way it's not s390x
> specific, future odd architecture behaviour would "automatically" be handled
> the same

The explanation makes sense and this seems like a solid plan to deal
with it.  I didn't try on a s390x, but I tested locally on our master
branch with LLVM 7, 10, 17, 18, and then I hacked your patch to take
the fallback path as if a layout mismatch had been detected, and it
worked fine:

2023-10-22 11:49:55.663 NZDT [12000] DEBUG:  detected CPU "skylake",
with features "...", resulting in layout
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
2023-10-22 11:49:55.664 NZDT [12000] DEBUG:  detected CPU / features
yield incompatible data layout, using values from module instead
2023-10-22 11:49:55.664 NZDT [12000] DETAIL:  module CPU "x86-64",
features "...", resulting in layout
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"

+To deal with that, check if data layouts match during JIT
initialization. If
+the runtime detected cpu / features result in a different layout,
try if the
+cpu/features recorded in in llvmjit_types.bc work.

s|try |check |
s| in in | in |

+   errmsg_internal("could not
determine working CPU / feature comination for JIT compilation"),

s|comination|combination|
s| / |/|g




Re: LLVM 16 (opaque pointers)

2023-10-21 Thread Andres Freund
Hi,

On 2023-10-19 06:20:26 +1300, Thomas Munro wrote:
> Interestingly, a new problem just showed up on the the RHEL9 s390x
> machine "lora", where a previously reported problem [1] apparently
> re-appeared.  It complains about incompatible layout, previously
> blamed on mismatch between clang and LLVM versions.

I've attached a patch revision that I spent the last couple hours working
on. It's very very roughly based on a patch Tom Stellard had written (which I
think a few rpm packages use). But instead of encoding details about specific
layout details, I made the code check if the data layout works and fall back
to the cpu / features used for llvmjit_types.bc.  This way it's not s390x
specific, future odd architecture behaviour would "automatically" be handled
the same.

With that at least the main regression tests pass on s390x, even with
jit_above_cost=0.


> I can see that its clang is v15 from clues in the conflig log, but I don't
> know which version of LLVM is being used.  However, I see now that
> --with-llvm was literally just turned on, so there is no reason to think
> that this would have worked before or this work is relevant.  Strange though
> -- we must be able to JIT further than that on s390x because we have crash
> reports in other threads (ie we made it past this and into other more
> advanced brokenness).

You can avoid the borkedness by a) running on an older cpu b) adding
compilation flags to change the code generation target
(e.g. -march=native). And some RPM packages have applied the patch by Tom
Stellard.

> [1] 
> https://www.postgresql.org/message-id/flat/20210319190047.7o4bwhbp5dzkqif3%40alap3.anarazel.de#ec51b488ca8eac8c603d91c0439d38b2

Greetings,

Andres Freund
>From e7ab9eb11576ef85e4d2f6e1ade0a6028279634d Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Fri, 20 Oct 2023 23:03:53 -0700
Subject: [PATCH v2] jit: Add fallback in case of runtime/compile time ABI
 mismatch

LLVM's s390x target uses a different ABI (called data layout in LLVM) for z13
and newer processors.  If IR files (like llvmjit_types.bc) are compiled to
target a processor older than z13, which is the default, and JIT occurs on a
z13 or newer processor, the ABI mismatch will cause JIT to fail at runtime.

To deal with that, check if data layouts match during JIT initialization. If
the runtime detected cpu / features result in a different layout, try if the
cpu/features recorded in in llvmjit_types.bc work.

Author: Andres Freund 
Author: Tom Stellard  (in an older version)
Discussion: 16971-5d004d34742a3...@postgresql.org
Discussion: ca+hukg+hoperbgvws4je3uy1uo3trnhf08hmispmrpodgti...@mail.gmail.com
---
 src/backend/jit/llvm/llvmjit.c | 191 -
 1 file changed, 166 insertions(+), 25 deletions(-)

diff --git a/src/backend/jit/llvm/llvmjit.c b/src/backend/jit/llvm/llvmjit.c
index 58f638859a4..fa2a982991d 100644
--- a/src/backend/jit/llvm/llvmjit.c
+++ b/src/backend/jit/llvm/llvmjit.c
@@ -101,6 +101,8 @@ static size_t llvm_jit_context_in_use_count = 0;
 static size_t llvm_llvm_context_reuse_count = 0;
 static const char *llvm_triple = NULL;
 static const char *llvm_layout = NULL;
+static char *llvm_cpu = NULL;
+static char *llvm_cpu_features = NULL;
 static LLVMContextRef llvm_context;
 
 
@@ -123,6 +125,7 @@ static void llvm_optimize_module(LLVMJitContext *context, LLVMModuleRef module);
 
 static void llvm_create_types(void);
 static void llvm_set_target(void);
+static void llvm_set_cpu_and_features(void);
 static void llvm_recreate_llvm_context(void);
 static uint64_t llvm_resolve_symbol(const char *name, void *ctx);
 
@@ -884,9 +887,6 @@ static void
 llvm_session_initialize(void)
 {
 	MemoryContext oldcontext;
-	char	   *error = NULL;
-	char	   *cpu = NULL;
-	char	   *features = NULL;
 	LLVMTargetMachineRef opt0_tm;
 	LLVMTargetMachineRef opt3_tm;
 
@@ -931,38 +931,21 @@ llvm_session_initialize(void)
 	 */
 	llvm_set_target();
 
-	if (LLVMGetTargetFromTriple(llvm_triple, _targetref, ) != 0)
-	{
-		elog(FATAL, "failed to query triple %s", error);
-	}
-
-	/*
-	 * We want the generated code to use all available features. Therefore
-	 * grab the host CPU string and detect features of the current CPU. The
-	 * latter is needed because some CPU architectures default to enabling
-	 * features not all CPUs have (weird, huh).
-	 */
-	cpu = LLVMGetHostCPUName();
-	features = LLVMGetHostCPUFeatures();
-	elog(DEBUG2, "LLVMJIT detected CPU \"%s\", with features \"%s\"",
-		 cpu, features);
+	llvm_set_cpu_and_features();
 
 	opt0_tm =
-		LLVMCreateTargetMachine(llvm_targetref, llvm_triple, cpu, features,
+		LLVMCreateTargetMachine(llvm_targetref, llvm_triple,
+llvm_cpu, llvm_cpu_features,
 LLVMCodeGenLevelNone,
 LLVMRelocDefault,
 LLVMCodeModelJITDefault);
 	opt3_tm =
-		LLVMCreateTargetMachine(llvm_targetref, llvm_triple, cpu, features,
+		LLVMCreateTargetMachine(llvm_targetref, llvm_triple,
+llvm_cpu, llvm_cpu_features,
 

Re: LLVM 16 (opaque pointers)

2023-10-20 Thread Tom Lane
Thomas Munro  writes:
> (It'd be nice if the
> build farm logged "$LLVM_CONFIG --version" somewhere.)

It's not really the buildfarm script's responsibility to do that,
but feel free to make configure do so.

regards, tom lane




Re: LLVM 16 (opaque pointers)

2023-10-20 Thread Thomas Munro
On Sat, Oct 21, 2023 at 12:02 PM Thomas Munro  wrote:
> On Sat, Oct 21, 2023 at 11:12 AM Andres Freund  wrote:
> > I'm quite sure that jiting did pass on ppc64 at some point.
>
> Yeah, I remember debugging it on EDB's POWER machine.  First off, we
> know that LLVM < 7 doesn't work for us on POWER, because:
>
> https://www.postgresql.org/message-id/CAEepm%3D39F_B3Ou8S3OrUw%2BhJEUP3p%3DwCu0ug-TTW67qKN53g3w%40mail.gmail.com
>
> That was fixed:
>
> https://github.com/llvm/llvm-project/commit/a95b0df5eddbe7fa1e9f8fe0b1ff62427e1c0318
>
> So I think that means that we'd first have to go through those animals
> and figure out which ones have older LLVM, and ignore those results --
> they just can't use --with-llvm.  Unfortunately there doesn't seem to
> be any clue on the version from the paths used by OpenSUSE.  Mark, do
> you know?

Adding Mark to this subthread.  Concretely, could you please disable
--with-llvm on any of those machines running LLVM < 7?  And report
what version any remaining animals are running?  (It'd be nice if the
build farm logged "$LLVM_CONFIG --version" somewhere.)  One of them
seems to have clang 5 which is a clue -- if the LLVM is also 5 it's
just not going to work, as LLVM is one of those forwards-only projects
that doesn't back-patch fixes like that.




Re: LLVM 16 (opaque pointers)

2023-10-20 Thread Andres Freund
Hi,

On 2023-10-21 12:02:51 +1300, Thomas Munro wrote:
> On Sat, Oct 21, 2023 at 11:12 AM Andres Freund  wrote:
> > > I doubt I can get anywhere near an s390x though, and we definitely had
> > > pre-existing problems on that arch.
> >
> > IMO an LLVM bug, rather than a postgres bug, but I guess it's all a matter 
> > of
> > perspective.
> > https://github.com/llvm/llvm-project/issues/53009#issuecomment-1042748553
> 
> Ah, good to know about that.  But there are also reports of crashes in
> released versions that manage to get passed that ABI-wobble business:
> 
> https://www.postgresql.org/message-id/flat/CAF1DzPXjpPxnsgySz2Zjm8d2dx7%3DJ070C%2BMQBFh%2B9NBNcBKCAg%40mail.gmail.com

Trying to debug that now, using access to an s390x box provided by Mark...

Greetings,

Andres Freund




Re: LLVM 16 (opaque pointers)

2023-10-20 Thread Thomas Munro
On Sat, Oct 21, 2023 at 11:12 AM Andres Freund  wrote:
> On 2023-10-21 10:48:47 +1300, Thomas Munro wrote:
> > On Thu, Oct 19, 2023 at 6:20 AM Thomas Munro  wrote:
> > I see that Mark has also just enabled --with-llvm on some POWER Linux
> > animals, and they have failed in various ways.  The failures are
> > strangely lacking in detail.  It seems we didn't have coverage before,
> > and I recall that there were definitely versions of LLVM that *didn't*
> > work for our usage in the past, which I'll need to dredge out of the
> > archives.  I will try to get onto a cfarm POWER machine and see if I
> > can reproduce that, before and after these commits, and whose bug is
> > it etc.
>
> I'm quite sure that jiting did pass on ppc64 at some point.

Yeah, I remember debugging it on EDB's POWER machine.  First off, we
know that LLVM < 7 doesn't work for us on POWER, because:

https://www.postgresql.org/message-id/CAEepm%3D39F_B3Ou8S3OrUw%2BhJEUP3p%3DwCu0ug-TTW67qKN53g3w%40mail.gmail.com

That was fixed:

https://github.com/llvm/llvm-project/commit/a95b0df5eddbe7fa1e9f8fe0b1ff62427e1c0318

So I think that means that we'd first have to go through those animals
and figure out which ones have older LLVM, and ignore those results --
they just can't use --with-llvm.  Unfortunately there doesn't seem to
be any clue on the version from the paths used by OpenSUSE.  Mark, do
you know?

> > I doubt I can get anywhere near an s390x though, and we definitely had
> > pre-existing problems on that arch.
>
> IMO an LLVM bug, rather than a postgres bug, but I guess it's all a matter of
> perspective.
> https://github.com/llvm/llvm-project/issues/53009#issuecomment-1042748553

Ah, good to know about that.  But there are also reports of crashes in
released versions that manage to get passed that ABI-wobble business:

https://www.postgresql.org/message-id/flat/CAF1DzPXjpPxnsgySz2Zjm8d2dx7%3DJ070C%2BMQBFh%2B9NBNcBKCAg%40mail.gmail.com




Re: LLVM 16 (opaque pointers)

2023-10-20 Thread Andres Freund
Hi,

On 2023-10-21 10:48:47 +1300, Thomas Munro wrote:
> On Thu, Oct 19, 2023 at 6:20 AM Thomas Munro  wrote:
> > Interestingly, a new problem just showed up on the the RHEL9 s390x
> > machine "lora", where a previously reported problem [1] apparently
> > re-appeared.  It complains about incompatible layout, previously
> > blamed on mismatch between clang and LLVM versions.  I can see that
> > its clang is v15 from clues in the conflig log, but I don't know which
> > version of LLVM is being used.  However, I see now that --with-llvm
> > was literally just turned on, so there is no reason to think that this
> > would have worked before or this work is relevant.  Strange though --
> > we must be able to JIT further than that on s390x because we have
> > crash reports in other threads (ie we made it past this and into other
> > more advanced brokenness).
> 
> I see that Mark has also just enabled --with-llvm on some POWER Linux
> animals, and they have failed in various ways.  The failures are
> strangely lacking in detail.  It seems we didn't have coverage before,
> and I recall that there were definitely versions of LLVM that *didn't*
> work for our usage in the past, which I'll need to dredge out of the
> archives.  I will try to get onto a cfarm POWER machine and see if I
> can reproduce that, before and after these commits, and whose bug is
> it etc.

I'm quite sure that jiting did pass on ppc64 at some point.


> I doubt I can get anywhere near an s390x though, and we definitely had
> pre-existing problems on that arch.

IMO an LLVM bug, rather than a postgres bug, but I guess it's all a matter of
perspective.
https://github.com/llvm/llvm-project/issues/53009#issuecomment-1042748553

I had made another bug report about this issue at some point, but I can't
refind it right now.

Greetings,

Andres Freund




Re: LLVM 16 (opaque pointers)

2023-10-20 Thread Tom Lane
Thomas Munro  writes:
> I doubt I can get anywhere near an s390x though, and we definitely had
> pre-existing problems on that arch.

Yeah.  Too bad there's no s390x in the gcc compile farm.
(I'm wondering how straight a line to draw between that fact
and llvm's evident shortcomings on s390x.)

I'm missing my old access to Red Hat's dev machines.  But in
the meantime, Mark's clearly got beaucoup access to s390
machines, so I wonder if he can let you into any of them?

regards, tom lane




Re: LLVM 16 (opaque pointers)

2023-10-20 Thread Mark Wong
On Sat, Oct 21, 2023 at 10:48:47AM +1300, Thomas Munro wrote:
> On Thu, Oct 19, 2023 at 6:20 AM Thomas Munro  wrote:
> > Interestingly, a new problem just showed up on the the RHEL9 s390x
> > machine "lora", where a previously reported problem [1] apparently
> > re-appeared.  It complains about incompatible layout, previously
> > blamed on mismatch between clang and LLVM versions.  I can see that
> > its clang is v15 from clues in the conflig log, but I don't know which
> > version of LLVM is being used.  However, I see now that --with-llvm
> > was literally just turned on, so there is no reason to think that this
> > would have worked before or this work is relevant.  Strange though --
> > we must be able to JIT further than that on s390x because we have
> > crash reports in other threads (ie we made it past this and into other
> > more advanced brokenness).
> 
> I see that Mark has also just enabled --with-llvm on some POWER Linux
> animals, and they have failed in various ways.  The failures are
> strangely lacking in detail.  It seems we didn't have coverage before,
> and I recall that there were definitely versions of LLVM that *didn't*
> work for our usage in the past, which I'll need to dredge out of the
> archives.  I will try to get onto a cfarm POWER machine and see if I
> can reproduce that, before and after these commits, and whose bug is
> it etc.

Yeah, I'm slowing enabling --with-llvm on POWER, s390x, and aarch64 (but
none here yet as I write this)...

> I doubt I can get anywhere near an s390x though, and we definitely had
> pre-existing problems on that arch.

If you want to send me your ssh key, I have access to these systems
through OSUOSL and LinuxFoundation programs.

Regards,
Mark

--
Mark Wong
EDB https://enterprisedb.com




Re: LLVM 16 (opaque pointers)

2023-10-20 Thread Thomas Munro
On Thu, Oct 19, 2023 at 6:20 AM Thomas Munro  wrote:
> Interestingly, a new problem just showed up on the the RHEL9 s390x
> machine "lora", where a previously reported problem [1] apparently
> re-appeared.  It complains about incompatible layout, previously
> blamed on mismatch between clang and LLVM versions.  I can see that
> its clang is v15 from clues in the conflig log, but I don't know which
> version of LLVM is being used.  However, I see now that --with-llvm
> was literally just turned on, so there is no reason to think that this
> would have worked before or this work is relevant.  Strange though --
> we must be able to JIT further than that on s390x because we have
> crash reports in other threads (ie we made it past this and into other
> more advanced brokenness).

I see that Mark has also just enabled --with-llvm on some POWER Linux
animals, and they have failed in various ways.  The failures are
strangely lacking in detail.  It seems we didn't have coverage before,
and I recall that there were definitely versions of LLVM that *didn't*
work for our usage in the past, which I'll need to dredge out of the
archives.  I will try to get onto a cfarm POWER machine and see if I
can reproduce that, before and after these commits, and whose bug is
it etc.

I doubt I can get anywhere near an s390x though, and we definitely had
pre-existing problems on that arch.




Re: LLVM 16 (opaque pointers)

2023-10-18 Thread Devrim Gündüz


Hi,

On Thu, 2023-10-19 at 06:20 +1300, Thomas Munro wrote:
> Pushed, after digging up some old LLVM skeletons to test, and those
> "old LLVM" animals are turning green now.  I went ahead and pushed the
> much smaller and simpler patch for LLVM 17.

Thank you! I can confirm that RPMs built fine on Fedora 39 with those
patches, which ships LLVM 17.0.2 as of today.

I can also confirm that builds are not broken on RHEL 9 and 8 and Fedora
37 which ship LLVM 15, and Fedora 38 (LLVM 16).

Thanks again!

Regards,
-- 
Devrim Gündüz
Open Source Solution Architect, PostgreSQL Major Contributor
Twitter: @DevrimGunduz , @DevrimGunduzTR




Re: LLVM 16 (opaque pointers)

2023-10-18 Thread Andres Freund
Hi,

On 2023-10-19 06:20:26 +1300, Thomas Munro wrote:
> On Thu, Oct 19, 2023 at 1:06 AM Thomas Munro  wrote:
> > On Thu, Oct 19, 2023 at 12:02 AM Thomas Munro  
> > wrote:
> > > I pushed the first patch, for LLVM 16, and the build farm told me that
> > > some old LLVM versions don't like it.  The problem seems to be the
> > > function LLVMGlobalGetValueType().  I can see that that was only added
> > > to the C API in 2018, so it looks like I may need to back-port that
> > > (trivial) wrapper into our own llvmjit_wrap.cpp for LLVM < 8.
> >
> > Concretely something like the attached should probably fix it, but
> > it'll take me a little while to confirm that...
> 
> Pushed, after digging up some old LLVM skeletons to test, and those
> "old LLVM" animals are turning green now.  I went ahead and pushed the
> much smaller and simpler patch for LLVM 17.

I enabled a new set of buildfarm animals to test LLVM 16 and 17. I initially
forgot to disable them for 11, which means we'll have those failed build on 11
until they age out :/.

Particularly for the LLVM debug builds it'll take a fair bit to run on all
branches. Each branch takes about 3h.

Greetings,

Andres Freund




Re: LLVM 16 (opaque pointers)

2023-10-18 Thread Thomas Munro
On Thu, Oct 19, 2023 at 1:06 AM Thomas Munro  wrote:
> On Thu, Oct 19, 2023 at 12:02 AM Thomas Munro  wrote:
> > I pushed the first patch, for LLVM 16, and the build farm told me that
> > some old LLVM versions don't like it.  The problem seems to be the
> > function LLVMGlobalGetValueType().  I can see that that was only added
> > to the C API in 2018, so it looks like I may need to back-port that
> > (trivial) wrapper into our own llvmjit_wrap.cpp for LLVM < 8.
>
> Concretely something like the attached should probably fix it, but
> it'll take me a little while to confirm that...

Pushed, after digging up some old LLVM skeletons to test, and those
"old LLVM" animals are turning green now.  I went ahead and pushed the
much smaller and simpler patch for LLVM 17.

Interestingly, a new problem just showed up on the the RHEL9 s390x
machine "lora", where a previously reported problem [1] apparently
re-appeared.  It complains about incompatible layout, previously
blamed on mismatch between clang and LLVM versions.  I can see that
its clang is v15 from clues in the conflig log, but I don't know which
version of LLVM is being used.  However, I see now that --with-llvm
was literally just turned on, so there is no reason to think that this
would have worked before or this work is relevant.  Strange though --
we must be able to JIT further than that on s390x because we have
crash reports in other threads (ie we made it past this and into other
more advanced brokenness).

[1] 
https://www.postgresql.org/message-id/flat/20210319190047.7o4bwhbp5dzkqif3%40alap3.anarazel.de#ec51b488ca8eac8c603d91c0439d38b2




Re: LLVM 16 (opaque pointers)

2023-10-18 Thread Thomas Munro
On Thu, Oct 19, 2023 at 12:02 AM Thomas Munro  wrote:
> I pushed the first patch, for LLVM 16, and the build farm told me that
> some old LLVM versions don't like it.  The problem seems to be the
> function LLVMGlobalGetValueType().  I can see that that was only added
> to the C API in 2018, so it looks like I may need to back-port that
> (trivial) wrapper into our own llvmjit_wrap.cpp for LLVM < 8.

Concretely something like the attached should probably fix it, but
it'll take me a little while to confirm that...


0001-Supply-missing-LLVMGlobalGetValueType-for-LLVM-8.patch
Description: Binary data


Re: LLVM 16 (opaque pointers)

2023-10-18 Thread Thomas Munro
I pushed the first patch, for LLVM 16, and the build farm told me that
some old LLVM versions don't like it.  The problem seems to be the
function LLVMGlobalGetValueType().  I can see that that was only added
to the C API in 2018, so it looks like I may need to back-port that
(trivial) wrapper into our own llvmjit_wrap.cpp for LLVM < 8.




Re: LLVM 16 (opaque pointers)

2023-10-16 Thread Ronan Dunklau
Le vendredi 13 octobre 2023, 22:32:13 CEST Thomas Munro a écrit :
> On Wed, Oct 11, 2023 at 10:31 PM Ronan Dunklau  
wrote:
> > Le mercredi 11 octobre 2023, 10:59:50 CEST Thomas Munro a écrit :
> > > The back-patch to 12 was a little trickier than anticipated, but after
> > > taking a break and trying again I now have PG 12...17 patches that
> > > I've tested against LLVM 10...18 (that's 54 combinations), in every
> > > case only with the clang corresponding to LLVM.
> > 
> > Thank you Thomas for those patches, and the extensive testing, I will run
> > my own and let you know.
> 
> Thanks!  No news is good news, I hope?  I'm hoping to commit this today.
> 
> > > I've attached only the patches for master, but the 12-16 versions are
> > > available at https://github.com/macdice/postgres/tree/llvm16-$N in
> > > case anyone has comments on those.
> > 
> > For PG13 and PG12, it looks like the ExecEvalBoolSubroutineTemplate is not
> > used anywhere, as ExecEvalBoolSubroutine was introduced in PG14 if I'm not
> > mistaken.
> 
> Right, looks like I can remove that in those branches.

Oh sorry I thought I followed up. I ran the same stress testing involving 
several hours of sqlsmith with all jit costs set to zero and didn't notice 
anything with LLVM16.

Thank you !

--
Ronan Dunklau







Re: LLVM 16 (opaque pointers)

2023-10-13 Thread Thomas Munro
On Sat, Oct 14, 2023 at 3:56 AM Andres Freund  wrote:
> On 2023-10-13 16:44:13 +0200, Dmitry Dolgov wrote:
> > Here is what I had in mind (only this part in the second patch was changed).
>
> Makes sense to me. I think we'll likely eventually want to use a custom
> pipeline anyway, and I think we should consider using an optimization level
> inbetween "not at all" "as hard as possible"...

Thanks Dmitry and Andres.  I'm planning to commit these today if there
are no further comments.




Re: LLVM 16 (opaque pointers)

2023-10-13 Thread Thomas Munro
On Wed, Oct 11, 2023 at 10:31 PM Ronan Dunklau  wrote:
> Le mercredi 11 octobre 2023, 10:59:50 CEST Thomas Munro a écrit :
> > The back-patch to 12 was a little trickier than anticipated, but after
> > taking a break and trying again I now have PG 12...17 patches that
> > I've tested against LLVM 10...18 (that's 54 combinations), in every
> > case only with the clang corresponding to LLVM.
>
> Thank you Thomas for those patches, and the extensive testing, I will run my
> own and let you know.

Thanks!  No news is good news, I hope?  I'm hoping to commit this today.

> > I've attached only the patches for master, but the 12-16 versions are
> > available at https://github.com/macdice/postgres/tree/llvm16-$N in
> > case anyone has comments on those.
>
> For PG13 and PG12, it looks like the ExecEvalBoolSubroutineTemplate is not
> used anywhere, as ExecEvalBoolSubroutine was introduced in PG14 if I'm not
> mistaken.

Right, looks like I can remove that in those branches.




Re: LLVM 16 (opaque pointers)

2023-10-13 Thread Andres Freund
On 2023-10-13 16:44:13 +0200, Dmitry Dolgov wrote:
> Here is what I had in mind (only this part in the second patch was changed).

Makes sense to me. I think we'll likely eventually want to use a custom
pipeline anyway, and I think we should consider using an optimization level
inbetween "not at all" "as hard as possible"...

Greetings,

Andres Freund




Re: LLVM 16 (opaque pointers)

2023-10-13 Thread Andres Freund
Hi,

On 2023-10-13 11:06:21 +0200, Dmitry Dolgov wrote:
> > On Thu, Oct 12, 2023 at 04:31:20PM -0700, Andres Freund wrote:
> > I also don't think we should add the mem2reg pass outside of -O0 - running 
> > it
> > after a real optimization pipeline doesn't seem useful and might even make 
> > the
> > code worse? mem2reg is included in default (and obviously also in O3).
> 
> My understanding was that while mem2reg is included everywhere above
> -O0, this set of passes won't hurt. But yeah, if you say it could
> degrade the final result, it's better to not do this. I'll update this
> part.

It's indeed included anywhere above that, but adding it explicitly to the
schedule means it's excuted twice:

echo 'int foo(int a) { return a / 343; }' | clang-16 -emit-llvm -x c -c -o - -S 
-|sed -e 's/optnone//'|opt-17 -debug-pass-manager -passes='default,mem2reg' 
-o /dev/null 2>&1|grep Promote
Running pass: PromotePass on foo (2 instructions)
Running pass: PromotePass on foo (2 instructions)

The second one is in a point in the pipeline where it doesn't help. It also
requires another analysis pass to be executed unnecessarily.

Greetings,

Andres Freund




Re: LLVM 16 (opaque pointers)

2023-10-13 Thread Dmitry Dolgov
> On Fri, Oct 13, 2023 at 11:06:21AM +0200, Dmitry Dolgov wrote:
> > On Thu, Oct 12, 2023 at 04:31:20PM -0700, Andres Freund wrote:
> > I don't think the "function(no-op-function),no-op-module" bit does something
> > particularly useful?
>
> Right, looks like leftovers after verifying which passes were actually
> applied. My bad, could be removed.
>
> > I also don't think we should add the mem2reg pass outside of -O0 - running 
> > it
> > after a real optimization pipeline doesn't seem useful and might even make 
> > the
> > code worse? mem2reg is included in default (and obviously also in O3).
>
> My understanding was that while mem2reg is included everywhere above
> -O0, this set of passes won't hurt. But yeah, if you say it could
> degrade the final result, it's better to not do this. I'll update this
> part.

Here is what I had in mind (only this part in the second patch was changed).
>From 040121be6d150e8f18f154a64b409baef2b15ffb Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Mon, 9 May 2022 08:27:47 +1200
Subject: [PATCH v4 1/2] jit: Support opaque pointers in LLVM 16.

Remove use of LLVMGetElementType() and provide the type of all pointers
to LLVMBuildXXX() functions when emitting IR, as required by modern LLVM
versions[1].

 * For LLVM <= 14, we'll still use the old LLVMBuildXXX() functions.
 * For LLVM == 15, we'll continue to do the same, explicitly opting
   out of opaque pointer mode.
 * For LLVM >= 16, we'll use the new LLVMBuildXXX2() functions that take
   the extra type argument.

The difference is hidden behind some new IR emitting wrapper functions
l_load(), l_gep(), l_call() etc.  The change is mostly mechanical,
except that at each site the correct type had to be provided.

In some places we needed to do some extra work to get functions types,
including some new wrappers for C++ APIs that are not yet exposed by in
LLVM's C API, and some new "example" functions in llvmjit_types.c
because it's no longer possible to start from the function pointer type
and ask for the function type.

Back-patch to 12, because it's a little tricker in 11 and we agreed not
to put the latest LLVM support into the upcoming final release of 11.

[1] https://llvm.org/docs/OpaquePointers.html

Reviewed-by: Dmitry Dolgov <9erthali...@gmail.com>
Reviewed-by: Ronan Dunklau 
Reviewed-by: Andres Freund 
Discussion: https://postgr.es/m/CA%2BhUKGKNX_%3Df%2B1C4r06WETKTq0G4Z_7q4L4Fxn5WWpMycDj9Fw%40mail.gmail.com
---
 src/backend/jit/llvm/llvmjit.c|  59 ++--
 src/backend/jit/llvm/llvmjit_deform.c | 119 
 src/backend/jit/llvm/llvmjit_expr.c   | 401 --
 src/backend/jit/llvm/llvmjit_types.c  |  39 ++-
 src/backend/jit/llvm/llvmjit_wrap.cpp |  12 +
 src/backend/jit/llvm/meson.build  |   2 +-
 src/include/jit/llvmjit.h |   7 +
 src/include/jit/llvmjit_emit.h| 106 +--
 8 files changed, 481 insertions(+), 264 deletions(-)

diff --git a/src/backend/jit/llvm/llvmjit.c b/src/backend/jit/llvm/llvmjit.c
index 4dfaf797432..1c8e1ab3a76 100644
--- a/src/backend/jit/llvm/llvmjit.c
+++ b/src/backend/jit/llvm/llvmjit.c
@@ -64,8 +64,10 @@ LLVMTypeRef TypeStorageBool;
 LLVMTypeRef TypePGFunction;
 LLVMTypeRef StructNullableDatum;
 LLVMTypeRef StructHeapTupleData;
+LLVMTypeRef StructMinimalTupleData;
 LLVMTypeRef StructTupleDescData;
 LLVMTypeRef StructTupleTableSlot;
+LLVMTypeRef StructHeapTupleHeaderData;
 LLVMTypeRef StructHeapTupleTableSlot;
 LLVMTypeRef StructMinimalTupleTableSlot;
 LLVMTypeRef StructMemoryContextData;
@@ -76,8 +78,11 @@ LLVMTypeRef StructExprState;
 LLVMTypeRef StructAggState;
 LLVMTypeRef StructAggStatePerGroupData;
 LLVMTypeRef StructAggStatePerTransData;
+LLVMTypeRef StructPlanState;
 
 LLVMValueRef AttributeTemplate;
+LLVMValueRef ExecEvalSubroutineTemplate;
+LLVMValueRef ExecEvalBoolSubroutineTemplate;
 
 static LLVMModuleRef llvm_types_module = NULL;
 
@@ -451,11 +456,7 @@ llvm_pg_var_type(const char *varname)
 	if (!v_srcvar)
 		elog(ERROR, "variable %s not in llvmjit_types.c", varname);
 
-	/* look at the contained type */
-	typ = LLVMTypeOf(v_srcvar);
-	Assert(typ != NULL && LLVMGetTypeKind(typ) == LLVMPointerTypeKind);
-	typ = LLVMGetElementType(typ);
-	Assert(typ != NULL);
+	typ = LLVMGlobalGetValueType(v_srcvar);
 
 	return typ;
 }
@@ -467,12 +468,14 @@ llvm_pg_var_type(const char *varname)
 LLVMTypeRef
 llvm_pg_var_func_type(const char *varname)
 {
-	LLVMTypeRef typ = llvm_pg_var_type(varname);
+	LLVMValueRef v_srcvar;
+	LLVMTypeRef typ;
+
+	v_srcvar = LLVMGetNamedFunction(llvm_types_module, varname);
+	if (!v_srcvar)
+		elog(ERROR, "function %s not in llvmjit_types.c", varname);
 
-	/* look at the contained type */
-	Assert(LLVMGetTypeKind(typ) == LLVMPointerTypeKind);
-	typ = LLVMGetElementType(typ);
-	Assert(typ != NULL && LLVMGetTypeKind(typ) == LLVMFunctionTypeKind);
+	typ = LLVMGetFunctionType(v_srcvar);
 
 	return typ;
 }
@@ -502,7 +505,7 @@ llvm_pg_func(LLVMModuleRef mod, const char *funcname)
 
 	v_fn = LLVMAddFunction(mod,
 			

Re: LLVM 16 (opaque pointers)

2023-10-13 Thread Dmitry Dolgov
> On Thu, Oct 12, 2023 at 04:31:20PM -0700, Andres Freund wrote:
> Hi,
>
> On 2023-10-11 21:59:50 +1300, Thomas Munro wrote:
> > +#else
> > +   LLVMPassBuilderOptionsRef options;
> > +   LLVMErrorRef err;
> > +   int compile_optlevel;
> > +   char   *passes;
> > +
> > +   if (context->base.flags & PGJIT_OPT3)
> > +   compile_optlevel = 3;
> > +   else
> > +   compile_optlevel = 0;
> > +
> > +   passes = 
> > psprintf("default,mem2reg,function(no-op-function),no-op-module",
> > + compile_optlevel);
>
> I don't think the "function(no-op-function),no-op-module" bit does something
> particularly useful?

Right, looks like leftovers after verifying which passes were actually
applied. My bad, could be removed.

> I also don't think we should add the mem2reg pass outside of -O0 - running it
> after a real optimization pipeline doesn't seem useful and might even make the
> code worse? mem2reg is included in default (and obviously also in O3).

My understanding was that while mem2reg is included everywhere above
-O0, this set of passes won't hurt. But yeah, if you say it could
degrade the final result, it's better to not do this. I'll update this
part.




Re: LLVM 16 (opaque pointers)

2023-10-12 Thread Andres Freund
Hi,

On 2023-10-11 21:59:50 +1300, Thomas Munro wrote:
> +#else
> + LLVMPassBuilderOptionsRef options;
> + LLVMErrorRef err;
> + int compile_optlevel;
> + char   *passes;
> +
> + if (context->base.flags & PGJIT_OPT3)
> + compile_optlevel = 3;
> + else
> + compile_optlevel = 0;
> +
> + passes = 
> psprintf("default,mem2reg,function(no-op-function),no-op-module",
> +   compile_optlevel);

I don't think the "function(no-op-function),no-op-module" bit does something
particularly useful?

I also don't think we should add the mem2reg pass outside of -O0 - running it
after a real optimization pipeline doesn't seem useful and might even make the
code worse? mem2reg is included in default (and obviously also in O3).


Thanks for working on this stuff!


I'm working on setting up buildfarm animals for 16, 17, each once with
a normal and an assertion enabled LLVM build.

Greetings,

Andres Freund




Re: LLVM 16 (opaque pointers)

2023-10-11 Thread Ronan Dunklau
Le mercredi 11 octobre 2023, 10:59:50 CEST Thomas Munro a écrit :
> The back-patch to 12 was a little trickier than anticipated, but after
> taking a break and trying again I now have PG 12...17 patches that
> I've tested against LLVM 10...18 (that's 54 combinations), in every
> case only with the clang corresponding to LLVM.

Thank you Thomas for those patches, and the extensive testing, I will run my 
own and let you know.

> I've attached only the patches for master, but the 12-16 versions are
> available at https://github.com/macdice/postgres/tree/llvm16-$N in
> case anyone has comments on those.

For PG13 and PG12, it looks like the ExecEvalBoolSubroutineTemplate is not 
used anywhere, as ExecEvalBoolSubroutine was introduced in PG14 if I'm not 
mistaken. 

Best regards,

--
Ronan Dunklau







Re: LLVM 16 (opaque pointers)

2023-10-11 Thread Thomas Munro
On Thu, Sep 21, 2023 at 12:47 PM Thomas Munro  wrote:
> On Thu, Sep 21, 2023 at 12:24 PM Devrim Gündüz  wrote:
> > RHEL releases new LLVM version along with their new minor releases every
> > 6 month, and we have to build older versions with new LLVM each time.
> > From RHEL point of view, it would be great if we can back-patch back to
> > v12 :(
>
> Got it.  OK, I'll work on 12 and 13 now.

The back-patch to 12 was a little trickier than anticipated, but after
taking a break and trying again I now have PG 12...17 patches that
I've tested against LLVM 10...18 (that's 54 combinations), in every
case only with the clang corresponding to LLVM.

For 12, I decided to back-patch the llvm_types_module variable that
was introduced in 13, to keep the code more similar.

For master, I had to rebase over Daniel's recent commits, which
required re-adding unused variables removed by 2dad308e, and
then changing a bunch of LLVM type constructors like LLVMInt8Type() to
the LLVMInt8TypeInContext(lc, ...) variants following the example of
9dce2203.  Without that, type assertions in my LLVM 18 debug build
would fail (and maybe there could be a leak problem, though I'm not
sure that really applied to integer (non-struct) types).

I've attached only the patches for master, but the 12-16 versions are
available at https://github.com/macdice/postgres/tree/llvm16-$N in
case anyone has comments on those.
From 6d7b21fe103978ceb34f758284938e66391c0a7e Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Mon, 9 May 2022 08:27:47 +1200
Subject: [PATCH v3 1/2] jit: Support opaque pointers in LLVM 16.

Remove use of LLVMGetElementType() and provide the type of all pointers
to LLVMBuildXXX() functions when emitting IR, as required by modern LLVM
versions[1].

 * For LLVM <= 14, we'll still use the old LLVMBuildXXX() functions.
 * For LLVM == 15, we'll continue to do the same, explicitly opting
   out of opaque pointer mode.
 * For LLVM >= 16, we'll use the new LLVMBuildXXX2() functions that take
   the extra type argument.

The difference is hidden behind some new IR emitting wrapper functions
l_load(), l_gep(), l_call() etc.  The change is mostly mechanical,
except that at each site the correct type had to be provided.

In some places we needed to do some extra work to get functions types,
including some new wrappers for C++ APIs that are not yet exposed by in
LLVM's C API, and some new "example" functions in llvmjit_types.c
because it's no longer possible to start from the function pointer type
and ask for the function type.

Back-patch to 12, because it's a little tricker in 11 and we agreed not
to put the latest LLVM support into the upcoming final release of 11.

[1] https://llvm.org/docs/OpaquePointers.html

Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Ronan Dunklau 
Reviewed-by: Andres Freund 
Discussion: https://postgr.es/m/CA%2BhUKGKNX_%3Df%2B1C4r06WETKTq0G4Z_7q4L4Fxn5WWpMycDj9Fw%40mail.gmail.com
---
 src/backend/jit/llvm/llvmjit.c|  59 ++--
 src/backend/jit/llvm/llvmjit_deform.c | 119 
 src/backend/jit/llvm/llvmjit_expr.c   | 401 --
 src/backend/jit/llvm/llvmjit_types.c  |  39 ++-
 src/backend/jit/llvm/llvmjit_wrap.cpp |  12 +
 src/backend/jit/llvm/meson.build  |   2 +-
 src/include/jit/llvmjit.h |   7 +
 src/include/jit/llvmjit_emit.h| 106 +--
 8 files changed, 481 insertions(+), 264 deletions(-)

diff --git a/src/backend/jit/llvm/llvmjit.c b/src/backend/jit/llvm/llvmjit.c
index 4dfaf79743..1c8e1ab3a7 100644
--- a/src/backend/jit/llvm/llvmjit.c
+++ b/src/backend/jit/llvm/llvmjit.c
@@ -64,8 +64,10 @@ LLVMTypeRef TypeStorageBool;
 LLVMTypeRef TypePGFunction;
 LLVMTypeRef StructNullableDatum;
 LLVMTypeRef StructHeapTupleData;
+LLVMTypeRef StructMinimalTupleData;
 LLVMTypeRef StructTupleDescData;
 LLVMTypeRef StructTupleTableSlot;
+LLVMTypeRef StructHeapTupleHeaderData;
 LLVMTypeRef StructHeapTupleTableSlot;
 LLVMTypeRef StructMinimalTupleTableSlot;
 LLVMTypeRef StructMemoryContextData;
@@ -76,8 +78,11 @@ LLVMTypeRef StructExprState;
 LLVMTypeRef StructAggState;
 LLVMTypeRef StructAggStatePerGroupData;
 LLVMTypeRef StructAggStatePerTransData;
+LLVMTypeRef StructPlanState;
 
 LLVMValueRef AttributeTemplate;
+LLVMValueRef ExecEvalSubroutineTemplate;
+LLVMValueRef ExecEvalBoolSubroutineTemplate;
 
 static LLVMModuleRef llvm_types_module = NULL;
 
@@ -451,11 +456,7 @@ llvm_pg_var_type(const char *varname)
 	if (!v_srcvar)
 		elog(ERROR, "variable %s not in llvmjit_types.c", varname);
 
-	/* look at the contained type */
-	typ = LLVMTypeOf(v_srcvar);
-	Assert(typ != NULL && LLVMGetTypeKind(typ) == LLVMPointerTypeKind);
-	typ = LLVMGetElementType(typ);
-	Assert(typ != NULL);
+	typ = LLVMGlobalGetValueType(v_srcvar);
 
 	return typ;
 }
@@ -467,12 +468,14 @@ llvm_pg_var_type(const char *varname)
 LLVMTypeRef
 llvm_pg_var_func_type(const char *varname)
 {
-	LLVMTypeRef typ = llvm_pg_var_type(varname);
+	LLVMValueRef v_srcvar;
+	LLVMTypeRef typ;
+
+	

Re: LLVM 16 (opaque pointers)

2023-09-20 Thread Thomas Munro
On Thu, Sep 21, 2023 at 12:24 PM Devrim Gündüz  wrote:
> On Thu, 2023-09-21 at 08:22 +1200, Thomas Munro wrote:
> > So far I've tested LLVM versions 10, 15, 16, 17, 18 (= their main
> > branch) against PostgreSQL versions 14, 15, 16.  I've attached the
> > versions that apply to master and 16, and pushed back-patches to 14
> > and 15 to public branches if anyone's interested[1].  Back-patching
> > further seems a bit harder.  I'm quite willing to do it, but ... do we
> > actually need to, ie does anyone really *require* old PostgreSQL
> > release branches to work with new LLVM?
>
> RHEL releases new LLVM version along with their new minor releases every
> 6 month, and we have to build older versions with new LLVM each time.
> From RHEL point of view, it would be great if we can back-patch back to
> v12 :(

Got it.  OK, I'll work on 12 and 13 now.




Re: LLVM 16 (opaque pointers)

2023-09-20 Thread Devrim Gündüz


Hi Thomas,

On Thu, 2023-09-21 at 08:22 +1200, Thomas Munro wrote:
> So far I've tested LLVM versions 10, 15, 16, 17, 18 (= their main
> branch) against PostgreSQL versions 14, 15, 16.  I've attached the
> versions that apply to master and 16, and pushed back-patches to 14
> and 15 to public branches if anyone's interested[1].  Back-patching
> further seems a bit harder.  I'm quite willing to do it, but ... do we
> actually need to, ie does anyone really *require* old PostgreSQL
> release branches to work with new LLVM?

RHEL releases new LLVM version along with their new minor releases every
6 month, and we have to build older versions with new LLVM each time.
>From RHEL point of view, it would be great if we can back-patch back to
v12 :(

Regards,
-- 
Devrim Gündüz
Open Source Solution Architect, PostgreSQL Major Contributor
Twitter: @DevrimGunduz , @DevrimGunduzTR




Re: LLVM 16 (opaque pointers)

2023-09-20 Thread Thomas Munro
Belated thanks Dmitry, Ronan, Andres for your feedback.  Here's a new
version, also including Dmitry's patch for 17 which it is now also
time to push.  It required a bit more trivial #if magic to be
conditional, as Dmitry already mentioned.  I just noticed that Dmitry
had the LLVMPassBuilderOptionsSetInlinerThreshold() function added to
LLVM 17's C API for this patch.  Thanks!  (Better than putting stuff
in llvmjit_wrap.c, if you can get it upstreamed in time.)

I thought I needed to block users from building with too-old clang and
too-new LLVM, but I haven't managed to find a combination that
actually breaks anything.  I wouldn't recommend it, but for example
clang 10 bitcode seems to be inlinable without problems by LLVM 16 on
my system (I didn't use an assert build of LLVM though).  I think that
could be a separate adjustment if we learn that we need to enforce or
document a constraint there.

So far I've tested LLVM versions 10, 15, 16, 17, 18 (= their main
branch) against PostgreSQL versions 14, 15, 16.  I've attached the
versions that apply to master and 16, and pushed back-patches to 14
and 15 to public branches if anyone's interested[1].  Back-patching
further seems a bit harder.  I'm quite willing to do it, but ... do we
actually need to, ie does anyone really *require* old PostgreSQL
release branches to work with new LLVM?

(I'll start a separate thread about the related question of when we
get to drop support for old LLVMs.)

One point from an earlier email:

On Sat, Aug 12, 2023 at 6:09 AM Andres Freund  wrote:
> >  AttributeTemplate(PG_FUNCTION_ARGS)
> >  {
> > + PGFunction  fp PG_USED_FOR_ASSERTS_ONLY;
> > +
> > + fp = 

> Other parts of the file do this by putting the functions into
> referenced_functions[], i'd copy that here and below.

Actually here I just wanted to assert that the 3 template functions
match certain function pointer types.  To restate what these functions
are about:  in the JIT code I need the function type, but we have only
the function pointer type, and it is now impossible to go from a
function pointer type to a function type, so I needed to define some
example functions with the right prototype (well, one of them existed
already but I needed more), and then I wanted to assert that they are
assignable to the appropriate function pointer types.  Does that make
sense?

In this version I changed it to what I hope is a more obvious/explicit
expression of that goal:

+   AssertVariableIsOfType(,
+  ExecEvalSubroutine);

[1] https://github.com/macdice/postgres/tree/llvm16-14 and -15
From d5e4ac6dd9c2016c00bedfd8cd404e2f277701e1 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Mon, 9 May 2022 08:27:47 +1200
Subject: [PATCH v2 1/2] jit: Support opaque pointers in LLVM 16.

Remove use of LLVMGetElementType() and provide the type of all pointers
to LLVMBuildXXX() functions when emitting IR, as required by modern LLVM
versions[1].

 * For LLVM <= 14, we'll still use the old LLVMBuildXXX() functions.
 * For LLVM == 15, we'll continue to do the same, explicitly opting
   out of opaque pointer mode.
 * For LLVM >= 16, we'll use the new LLVMBuildXXX2() functions that take
   the extra type argument.

The difference is hidden behind some new IR emitting wrapper functions
l_load(), l_gep(), l_call() etc.  The change is mostly mechanical,
except that at each site the correct type had to be provided.

In some places we needed to do some extra work to get functions types,
including some new wrappers for C++ APIs that are not yet exposed by in
LLVM's C API, and some new "example" functions in llvmjit_types.c
because it's no longer possible to start from the function pointer type
and ask for the function type.

Back-patch to 12, because it's a little tricker in 11 and we agreed not
to put the latest LLVM support into the upcoming final release of 11.

[1] https://llvm.org/docs/OpaquePointers.html

Reviewed-by: Dmitry Dolgov <9erthali...@gmail.com>
Reviewed-by: Ronan Dunklau 
Reviewed-by: Andres Freund 
Discussion: https://postgr.es/m/CA%2BhUKGKNX_%3Df%2B1C4r06WETKTq0G4Z_7q4L4Fxn5WWpMycDj9Fw%40mail.gmail.com
---
 src/backend/jit/llvm/llvmjit.c|  57 ++--
 src/backend/jit/llvm/llvmjit_deform.c | 119 
 src/backend/jit/llvm/llvmjit_expr.c   | 401 --
 src/backend/jit/llvm/llvmjit_types.c  |  39 ++-
 src/backend/jit/llvm/llvmjit_wrap.cpp |  12 +
 src/backend/jit/llvm/meson.build  |   2 +-
 src/include/jit/llvmjit.h |   7 +
 src/include/jit/llvmjit_emit.h| 106 +--
 8 files changed, 479 insertions(+), 264 deletions(-)

diff --git a/src/backend/jit/llvm/llvmjit.c b/src/backend/jit/llvm/llvmjit.c
index 09650e2c70..06bb440503 100644
--- a/src/backend/jit/llvm/llvmjit.c
+++ b/src/backend/jit/llvm/llvmjit.c
@@ -85,8 +85,11 @@ LLVMTypeRef StructExprState;
 LLVMTypeRef StructAggState;
 LLVMTypeRef StructAggStatePerGroupData;
 LLVMTypeRef StructAggStatePerTransData;
+LLVMTypeRef 

Re: LLVM 16 (opaque pointers)

2023-08-15 Thread Fabien COELHO



[...] Further changes are already needed for their "main" branch (LLVM 
17-to-be), so this won't quite be enough to shut seawasp up.


For information, the physical server which was hosting my 2 bf animals 
(seawasp and moonjelly) has given up rebooting after a power cut a few 
weeks/months ago, and I have not setup a replacement (yet).


--
Fabien.




Re: LLVM 16 (opaque pointers)

2023-08-11 Thread Andres Freund
Hi,

On 2023-05-21 15:01:41 +1200, Thomas Munro wrote:
> *For anyone working with this type of IR generation code and
> questioning their sanity, I can pass on some excellent advice I got
> from Andres: build LLVM yourself with assertions enabled, as they
> catch some classes of silly mistake that otherwise just segfault
> inscrutably on execution.

Hm. I think we need a buildfarm animal with an assertion enabled llvm 16 once
we merge this. I think after an upgrade my buildfarm machine has the necessary
resources.


> @@ -150,7 +150,7 @@ llvm_compile_expr(ExprState *state)
>  
>   /* create function */
>   eval_fn = LLVMAddFunction(mod, funcname,
> -   
> llvm_pg_var_func_type("TypeExprStateEvalFunc"));
> +   
> llvm_pg_var_func_type("ExecInterpExprStillValid"));

Hm, that's a bit ugly. But ...

> @@ -77,9 +80,44 @@ extern Datum AttributeTemplate(PG_FUNCTION_ARGS);
>  Datum
>  AttributeTemplate(PG_FUNCTION_ARGS)
>  {
> + PGFunction  fp PG_USED_FOR_ASSERTS_ONLY;
> +
> + fp = 
>   PG_RETURN_NULL();
>  }

Other parts of the file do this by putting the functions into
referenced_functions[], i'd copy that here and below.

> +void
> +ExecEvalSubroutineTemplate(ExprState *state,
> +struct ExprEvalStep *op,
> +ExprContext *econtext)
> +{
> + ExecEvalSubroutine fp PG_USED_FOR_ASSERTS_ONLY;
> +
> + fp = 
> +}
> +
> +extern bool ExecEvalBoolSubroutineTemplate(ExprState *state,
> + 
>struct ExprEvalStep *op,
> + 
>ExprContext *econtext);
> +bool
> +ExecEvalBoolSubroutineTemplate(ExprState *state,
> +struct ExprEvalStep 
> *op,
> +ExprContext 
> *econtext)
> +{
> + ExecEvalBoolSubroutine fp PG_USED_FOR_ASSERTS_ONLY;
> +
> + fp = 
> + return false;
> +}
> +


Thanks for working on this!

Greetings,

Andres Freund




Re: LLVM 16 (opaque pointers)

2023-08-11 Thread Andres Freund
Hi,

On 2023-08-10 16:56:54 +0200, Ronan Dunklau wrote:
> I tried my hand at backporting it to previous versions, and not knowing
> anything about it made me indeed question my sanity.  It's quite easy for PG
> 15, 14, 13. PG 12 is nothing insurmontable either, but PG 11 is a bit hairier
> most notably due to to the change in fcinfo args representation. But I guess
> that's also a topic for another day :-)

Given that 11 is about to be EOL, I don't think it's worth spending the time
to support a new LLVM version for it.

Greetings,

Andres Freund




Re: LLVM 16 (opaque pointers)

2023-08-10 Thread Ronan Dunklau
Le dimanche 21 mai 2023, 05:01:41 CEST Thomas Munro a écrit :
> Hi,
> 
> Here is a draft version of the long awaited patch to support LLVM 16.
> It's mostly mechanical donkeywork, but it took some time: this donkey
> found it quite hard to understand the mighty getelementptr
> instruction[1] and the code generation well enough to figure out all
> the right types, and small mistakes took some debugging effort*.  I
> now finally have a patch that passes all tests.
> 
> Though it's not quite ready yet, I thought I should give this status
> update to report that the main task is more or less complete, since
> we're starting to get quite a few emails about it (mostly from Fedora
> users) and there is an entry for it on the Open Items for 16 wiki
> page.  Comments/review/testing welcome.

Hello Thomas,

Thank you for this effort !

I've tested it against llvm 15 and 16, and found no problem with it.

> 6. I need to go through the types again with a fine tooth comb, and
> check the test coverage to look out for eg GEP array arithmetic with
> the wrong type/size that isn't being exercised.

I haven't gone through the test coverage myself, but I exercised the following 
things: 

 - running make installcheck with jit_above_cost = 0
 - letting sqlsmith hammer random queries at it for a few hours.

This didn't show obvious issues.

> *For anyone working with this type of IR generation code and
> questioning their sanity, I can pass on some excellent advice I got
> from Andres: build LLVM yourself with assertions enabled, as they
> catch some classes of silly mistake that otherwise just segfault
> > inscrutably on execution.

I tried my hand at backporting it to previous versions, and not knowing 
anything about it made me indeed question my sanity.  It's quite easy for PG 
15, 14, 13. PG 12 is nothing insurmontable either, but PG 11 is a bit hairier 
most notably due to to the change in fcinfo args representation. But I guess 
that's also a topic for another day :-)

Best regards,

--
Ronan Dunklau








Re: LLVM 16 (opaque pointers)

2023-06-04 Thread Dmitry Dolgov
> On Mon, May 22, 2023 at 03:38:44PM +1200, Thomas Munro wrote:
> Further changes are already needed for their "main" branch (LLVM
> 17-to-be), so this won't quite be enough to shut seawasp up.  At a
> glance, we will need to change from the "old pass manager" API that
> has recently been vaporised[1]
> (llvm-c/Transforms/PassManagerBuilder.h) to the new one[2][3]
> (llvm-c/Transforms/PassBuilder.h), which I suspect/hope will be as
> simple as changing llvmjit.c to call LLVMRunPasses() with a string
> describing the passes we want in "opt -passes" format, instead of our
> code that calls LLVMAddFunctionInlingPass() etc.  But that'll be a
> topic for another day, and another thread.
>
> [1] 
> https://github.com/llvm/llvm-project/commit/0aac9a2875bad4f065367e4a6553fad78605f895
> [2] https://llvm.org/docs/NewPassManager.html
> [3] https://reviews.llvm.org/D102136

Thanks for tackling the topic! I've tested it with a couple of versions,
LLVM 12 that comes with my Gentoo box, LLVM 15 build from sources and
the modified version of patch adopted for LLVM 17 (build form sources as
well). In all three cases everything seems to be working fine.

Simple benchmarking with a query stolen from some other jit thread
(pgbench running single client with multiple unions of selects a-la
SELECT a, count(*), sum(b) FROM test WHERE c = 2 GROUP BY a) show some
slight performance differences, but nothing dramatic so far. LLVM 17
version produces the lowest latency, with faster generation, inlining
and optimization, but slower emission time. LLVM 12 version produces the
largest latencies with everything except emission timings being slower.
LLVM 15 is somewhere in between.

I'll continue reviewing and, for the records, attach adjustments I was
using for LLVM 17 (purely for testing, not taking into account other
versions), in case if I've missed something.
>From 33e39a376fdbcceb6d4e757f4a798b3922e7068c Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Sun, 4 Jun 2023 11:10:41 +0200
Subject: [PATCH 2/2] LLVM 17

---
 src/backend/jit/llvm/llvmjit.c| 73 ---
 src/backend/jit/llvm/llvmjit_wrap.cpp |  4 ++
 2 files changed, 25 insertions(+), 52 deletions(-)

diff --git a/src/backend/jit/llvm/llvmjit.c b/src/backend/jit/llvm/llvmjit.c
index b83bc04ae3a..b701e167abd 100644
--- a/src/backend/jit/llvm/llvmjit.c
+++ b/src/backend/jit/llvm/llvmjit.c
@@ -18,6 +18,9 @@
 #include 
 #include 
 #include 
+#if LLVM_VERSION_MAJOR > 16
+#include 
+#endif
 #if LLVM_VERSION_MAJOR > 11
 #include 
 #include 
@@ -27,12 +30,14 @@
 #endif
 #include 
 #include 
+#if LLVM_VERSION_MAJOR < 17
 #include 
 #include 
 #include 
 #if LLVM_VERSION_MAJOR > 6
 #include 
 #endif
+#endif
 
 #include "jit/llvmjit.h"
 #include "jit/llvmjit_emit.h"
@@ -559,69 +564,33 @@ llvm_function_reference(LLVMJitContext *context,
 static void
 llvm_optimize_module(LLVMJitContext *context, LLVMModuleRef module)
 {
-	LLVMPassManagerBuilderRef llvm_pmb;
-	LLVMPassManagerRef llvm_mpm;
-	LLVMPassManagerRef llvm_fpm;
-	LLVMValueRef func;
+	LLVMPassBuilderOptionsRef options;
+	LLVMErrorRef err;
 	int			compile_optlevel;
+	char	   *passes;
 
 	if (context->base.flags & PGJIT_OPT3)
 		compile_optlevel = 3;
 	else
 		compile_optlevel = 0;
 
-	/*
-	 * Have to create a new pass manager builder every pass through, as the
-	 * inliner has some per-builder state. Otherwise one ends up only inlining
-	 * a function the first time though.
-	 */
-	llvm_pmb = LLVMPassManagerBuilderCreate();
-	LLVMPassManagerBuilderSetOptLevel(llvm_pmb, compile_optlevel);
-	llvm_fpm = LLVMCreateFunctionPassManagerForModule(module);
+	passes = psprintf("default,mem2reg,function(no-op-function),no-op-module",
+	  compile_optlevel);
 
-	if (context->base.flags & PGJIT_OPT3)
-	{
-		/* TODO: Unscientifically determined threshold */
-		LLVMPassManagerBuilderUseInlinerWithThreshold(llvm_pmb, 512);
-	}
-	else
-	{
-		/* we rely on mem2reg heavily, so emit even in the O0 case */
-		LLVMAddPromoteMemoryToRegisterPass(llvm_fpm);
-	}
+	options = LLVMCreatePassBuilderOptions();
 
-	LLVMPassManagerBuilderPopulateFunctionPassManager(llvm_pmb, llvm_fpm);
+#ifdef LLVM_PASS_DEBUG
+	LLVMPassBuilderOptionsSetDebugLogging(options, 1);
+#endif
 
-	/*
-	 * Do function level optimization. This could be moved to the point where
-	 * functions are emitted, to reduce memory usage a bit.
-	 */
-	LLVMInitializeFunctionPassManager(llvm_fpm);
-	for (func = LLVMGetFirstFunction(context->module);
-		 func != NULL;
-		 func = LLVMGetNextFunction(func))
-		LLVMRunFunctionPassManager(llvm_fpm, func);
-	LLVMFinalizeFunctionPassManager(llvm_fpm);
-	LLVMDisposePassManager(llvm_fpm);
+	LLVMPassBuilderOptionsSetInlinerThreshold(options, 512);
 
-	/*
-	 * Perform module level optimization. We do so even in the non-optimized
-	 * case, so always-inline functions etc get inlined. It's cheap enough.
-	 */
-	llvm_mpm = LLVMCreatePassManager();
-	LLVMPassManagerBuilderPopulateModulePassManager(llvm_pmb,
-		

Re: LLVM 16 (opaque pointers)

2023-05-21 Thread Thomas Munro
Oh, one important thing I forgot to mention: that patch is for LLVM 16
only, and I developed it with a local build of their "release/16.x"
branch on a FreeBSD box, and also tested with a released package for
16 on a Debian box.  Further changes are already needed for their
"main" branch (LLVM 17-to-be), so this won't quite be enough to shut
seawasp up.  At a glance, we will need to change from the "old pass
manager" API that has recently been vaporised[1]
(llvm-c/Transforms/PassManagerBuilder.h) to the new one[2][3]
(llvm-c/Transforms/PassBuilder.h), which I suspect/hope will be as
simple as changing llvmjit.c to call LLVMRunPasses() with a string
describing the passes we want in "opt -passes" format, instead of our
code that calls LLVMAddFunctionInlingPass() etc.  But that'll be a
topic for another day, and another thread.

[1] 
https://github.com/llvm/llvm-project/commit/0aac9a2875bad4f065367e4a6553fad78605f895
[2] https://llvm.org/docs/NewPassManager.html
[3] https://reviews.llvm.org/D102136




LLVM 16 (opaque pointers)

2023-05-20 Thread Thomas Munro
Hi,

Here is a draft version of the long awaited patch to support LLVM 16.
It's mostly mechanical donkeywork, but it took some time: this donkey
found it quite hard to understand the mighty getelementptr
instruction[1] and the code generation well enough to figure out all
the right types, and small mistakes took some debugging effort*.  I
now finally have a patch that passes all tests.

Though it's not quite ready yet, I thought I should give this status
update to report that the main task is more or less complete, since
we're starting to get quite a few emails about it (mostly from Fedora
users) and there is an entry for it on the Open Items for 16 wiki
page.  Comments/review/testing welcome.

Here are some things I think I need to do next (probably after PGCon):

1. If you use non-matching clang and LLVM versions I think we might
use "clang -no-opaque-pointers" at the wrong times (I've not looked
into that interaction yet).
2. The treatment of function types is a bit inconsistent/messy and
could be tidied up.
3. There are quite a lot of extra function calls that could perhaps be
elided (ie type variables instead of LLVMTypeInt8(), and calls to
LLVMStructGetTypeAtIndex() that are not used in LLVM < 16).
4. Could use some comments.
5. I need to test with very old versions of LLVM and Clang that we
claim to support (I have several years' worth of releases around but
nothing older than 9).
6. I need to go through the types again with a fine tooth comb, and
check the test coverage to look out for eg GEP array arithmetic with
the wrong type/size that isn't being exercised.

*For anyone working with this type of IR generation code and
questioning their sanity, I can pass on some excellent advice I got
from Andres: build LLVM yourself with assertions enabled, as they
catch some classes of silly mistake that otherwise just segfault
inscrutably on execution.

[1] https://llvm.org/docs/GetElementPtr.html
From 996e4bceee21a9f27116623c1fdac8eab0cdf814 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Mon, 9 May 2022 08:27:47 +1200
Subject: [PATCH] jit: Support opaque pointers in LLVM 16.

Remove use of LLVMGetElementType() and provide the type of all pointers
to LLVMBuildXXX() functions when emitting IR, as required by modern LLVM
versions[1].

 * For LLVM <= 14, we'll still use the old LLVMBuildXXX() functions.
 * For LLVM == 15, we'll continue to do the same, explicitly opting
   out of opaque pointer mode.
 * For LLVM >= 16, we'll use the new LLVMBuildXXX2() functions that take
   the extra type argument.

The difference is hidden behind some new IR emitting wrapper functions
l_load(), l_gep(), l_call() etc.  The change is mostly mechanical,
except that at each site the correct type had to be provided.

In some places we needed to do some extra work to get functions types,
including some new wrappers for C++ APIs that are not yet exposed by in
LLVM's C API, and some new "example" functions in llvmjit_types.c
because it's no longer possible to start from the function pointer type
and ask for the function type.

Back-patch to all releases.

[1] https://llvm.org/docs/OpaquePointers.html

diff --git a/src/backend/jit/llvm/llvmjit.c b/src/backend/jit/llvm/llvmjit.c
index 04ae3052a8..b83bc04ae3 100644
--- a/src/backend/jit/llvm/llvmjit.c
+++ b/src/backend/jit/llvm/llvmjit.c
@@ -85,8 +85,11 @@ LLVMTypeRef StructExprState;
 LLVMTypeRef StructAggState;
 LLVMTypeRef StructAggStatePerGroupData;
 LLVMTypeRef StructAggStatePerTransData;
+LLVMTypeRef StructPlanState;
 
 LLVMValueRef AttributeTemplate;
+LLVMValueRef ExecEvalSubroutineTemplate;
+LLVMValueRef ExecEvalBoolSubroutineTemplate;
 
 LLVMModuleRef llvm_types_module = NULL;
 
@@ -382,11 +385,7 @@ llvm_pg_var_type(const char *varname)
 	if (!v_srcvar)
 		elog(ERROR, "variable %s not in llvmjit_types.c", varname);
 
-	/* look at the contained type */
-	typ = LLVMTypeOf(v_srcvar);
-	Assert(typ != NULL && LLVMGetTypeKind(typ) == LLVMPointerTypeKind);
-	typ = LLVMGetElementType(typ);
-	Assert(typ != NULL);
+	typ = LLVMGlobalGetValueType(v_srcvar);
 
 	return typ;
 }
@@ -398,12 +397,14 @@ llvm_pg_var_type(const char *varname)
 LLVMTypeRef
 llvm_pg_var_func_type(const char *varname)
 {
-	LLVMTypeRef typ = llvm_pg_var_type(varname);
+	LLVMValueRef v_srcvar;
+	LLVMTypeRef typ;
+
+	v_srcvar = LLVMGetNamedFunction(llvm_types_module, varname);
+	if (!v_srcvar)
+		elog(ERROR, "function %s not in llvmjit_types.c", varname);
 
-	/* look at the contained type */
-	Assert(LLVMGetTypeKind(typ) == LLVMPointerTypeKind);
-	typ = LLVMGetElementType(typ);
-	Assert(typ != NULL && LLVMGetTypeKind(typ) == LLVMFunctionTypeKind);
+	typ = LLVMGetFunctionType(v_srcvar);
 
 	return typ;
 }
@@ -433,7 +434,7 @@ llvm_pg_func(LLVMModuleRef mod, const char *funcname)
 
 	v_fn = LLVMAddFunction(mod,
 		   funcname,
-		   LLVMGetElementType(LLVMTypeOf(v_srcfn)));
+		   LLVMGetFunctionType(v_srcfn));
 	llvm_copy_attributes(v_srcfn, v_fn);
 
 	return v_fn;
@@ -529,7 +530,7 @@