Hi,

On 2026-01-14 17:12:45 +0100, Anthonin Bonnefoy wrote:
> I've tried to generate multiple bitcode for a simple 'select aid % 2
> FROM pgbench_accounts limit 10;' query. To keep bitcode simple, I've
> modified the passes to use "default<O0>,mem2reg,inline" when we have
> JIT inline without optimization (as described in [0]). I've tried the
> following
> - LLVM21: With lifetime
> - LLVM21: Without lifetime
> - LLVM22: With Poison
> - LLVM22: Without Poison
>
> In the 4 scenarios, the generated bc were the same with the exact same
> instructions. Removing the lifetime end or the poison value doesn't
> seem to change anything at this level of optimisation.
>
> I'm not sure how to interpret this. Maybe the test is incorrect and a
> different function needs to be called to possibly trigger the issue?
> Or the poison/lifetime is only useful when going through the O3
> optimisation pass?

I think it's the latter - at -O0 there's nothing that could use the
information.

The goal of the lifetime annotations was to allow LLVM to remove stores and
loads of FunctionCallInfo->{args,isnull}. After we stored e.g. fcinfo->isnull
before a function call and then checked it after the function call, we don't
need it anymore.  I think that can only matter when the called function is
actually inlined, otherwise there's no way that LLVM can see the store is
unnecessary.
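
For illustration, a rough C-level sketch of that pattern. The names CallInfo,
add_one and eval_step are made up for this example and are not the actual
PostgreSQL definitions; it only shows the store / call / check shape.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for FunctionCallInfo; not the real PostgreSQL struct. */
typedef struct CallInfo
{
    bool    isnull;     /* set by the callee if the result is NULL */
    int64_t args[1];
} CallInfo;

/* Callee that is small enough to be inlined into the caller. */
static inline int64_t
add_one(CallInfo *fcinfo)
{
    return fcinfo->args[0] + 1;
}

/*
 * Caller: store into fcinfo, call, then check isnull.  After the check the
 * contents of fcinfo are not needed anymore.  The compiler cannot drop the
 * stores on its own, because fcinfo is visible to our caller; telling it
 * otherwise is what the lifetime / poison annotation is meant to do.
 */
int64_t
eval_step(CallInfo *fcinfo, int64_t input)
{
    int64_t result;

    fcinfo->args[0] = input;
    fcinfo->isnull = false;

    result = add_one(fcinfo);

    if (fcinfo->isnull)
        return 0;
    return result;
}
```

Without inlining, add_one() is opaque at the call site and the stores have to
stay regardless of any annotation.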


Unfortunately there's an issue with modern LLVM, regardless of lifetime or
poison.  Generally it's able to eliminate stores that are followed by a
poison, but if there's a load in between, it fails. The odd part is that it
*is* able to eliminate the load (by forwarding the stored value).

It seems to be an ordering issue - instcombine is required to remove the load,
but also removes the poison, which in turn is required for dead store
elimination.  Gngng.

I've attached a reproducer.

I'm not sure the llvm folks will be all that interested - there's no real C
correspondence to this. And, as it turns out, if I feed the memory to
something like free(), the analysis actually *does* figure out that it's not
needed anymore.
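
Something along these lines is what the free() variant looks like in C. This
is a sketch only: evalexpr_free is a made-up name, the struct just uses the
same layout as %struct.Arg in the attached reproducer, and which passes fire
on this exact snippet is an assumption rather than something shown here.

```c
#include <stdint.h>
#include <stdlib.h>

/* Same layout as %struct.Arg in the attached reproducer. */
struct Arg
{
    int8_t  isnull;
    int64_t arg0;
};

/*
 * Hypothetical variant of evalexpr_broken: rather than storing poison, the
 * memory is handed to free().  The optimizer knows that free() ends the
 * object's life, so the earlier store to isnull can be treated as dead.
 */
int64_t
evalexpr_free(struct Arg *arg)
{
    int64_t arg0;
    int8_t  isnull;

    arg->isnull = 38;       /* only needed temporarily */
    arg0 = arg->arg0;
    isnull = arg->isnull;   /* can be forwarded from the store above */
    (void) isnull;

    free(arg);              /* memory is not needed anymore */
    return arg0;
}
```

The caller has to pass malloc()ed memory here, which is of course not how the
expression evaluation code manages fcinfo.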


I think if / once we move most of this to a stack allocation, the problem
would also vanish.

Greetings,

Andres Freund
; ModuleID = '/tmp/test.c'
source_filename = "/tmp/test.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"


%struct.Arg = type { i8, i64 }


define noundef i64 @evalexpr_broken(ptr %0) {
entry:
  %isnullp = getelementptr %struct.Arg, ptr %0, i32 0
  %arg0p = getelementptr %struct.Arg, ptr %0, i32 1

  ; value that's needed just temporarily, during "operation" below
  store i8 38, ptr %isnullp

  %arg0 = load i64, ptr %arg0p

  ; operation that has been optimized out would be here, elided for brevity

  %isnull = load i8, ptr %isnullp

  ; signal that memory at %isnullp isn't needed anymore
  store i8 poison, ptr %isnullp

  ret i64 %arg0
}


define noundef i64 @evalexpr_works(ptr %0) {
entry:
  %isnullp = getelementptr %struct.Arg, ptr %0, i32 0
  %arg0p = getelementptr %struct.Arg, ptr %0, i32 1

  store i8 38, ptr %isnullp

  %arg0 = load i64, ptr %arg0p

  %isnull = load i8, ptr %isnullp
  store i8 poison, ptr %isnullp

  ; now that the return value isn't needed anymore, optimizer can optimize away
  ; the store to %isnullp
  ret i64 0
}
