[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-02-18 Thread Joseph Huber via cfe-commits

jhuber6 wrote:

The reproducer I posted above still crashes, just in a different place.
```
clang-21: 
/home/jhuber/Documents/llvm/llvm-project/llvm/lib/IR/Instructions.cpp:744: void 
llvm::CallInst::init(llvm::FunctionType*, llvm::Value*, 
llvm::ArrayRef, 
llvm::ArrayRef >, const llvm::Twine&): 
Assertion `(i >= FTy->getNumParams() || FTy->getParamType(i) == 
Args[i]->getType()) && "Calling a function with a bad signature!"' failed.
```

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-02-17 Thread Alex Voicu via cfe-commits

AlexVlx wrote:

> > seeing breaks in downstream build of rocPRIM
> 
> Probably need to revert the downstream revert of the original problematic 
> patch

It's not quite that, we have a problem with the following pattern: 
. `r` is a returned val so it will come 
from an / as the result of uncasted `alloca`, therefore its address will point 
to the AllocaAS. C and C++ casts yield bitcasts, not AS casts, so we end up 
with an C cast expr (see AST) that tries to bitcast from a pointer to the 
AllocaAS to a pointer to flat / generic, which is invalid. I'm not entirely yet 
sure how to fix this yet except going to an earlier iteration of this where we 
use casted `alloca`s everywhere and handle `sret` a bit more noisily.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-02-17 Thread via cfe-commits

llvmbot wrote:

/pull-request llvm/llvm-project#127552

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-02-17 Thread Matt Arsenault via cfe-commits

arsenm wrote:

/cherry-pick 39ec9de

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-02-17 Thread Alex Voicu via cfe-commits

AlexVlx wrote:

> @AlexVlx I found this breaks my clang build, reduced the C++ source to the 
> following reproducer https://godbolt.org/z/jGnvKeqvr. Please verify that it 
> breaks for you as well and revert or fix. (The issue is the s() function call 
> not using AS(5))
> 
> ```c++
> #pragma omp begin declare target
> struct S {
>   ~S() { };
> };
> S s();
> struct E {
>   S foo;
>   E();
> };
> E::E() : foo(s()) {}
> #pragma omp end declare target
> ```
> 
> ```
> > ./bin/clang++ omp-bug.cpp -fopenmp --offload-arch=gfx1030 -nogpulib
> clang-21: 
> /home/jhuber/Documents/llvm/llvm-project/clang/lib/CodeGen/CGCall.cpp:5648: 
> clang::CodeGen::RValue clang::CodeGen::CodeGenFunction::EmitCall(const 
> clang::CodeGen::CGFunctionInfo&, const clang::CodeGen::CGCallee&, 
> clang::CodeGen::ReturnValueSlot, const clang::CodeGen::CallArgList&, 
> llvm::CallBase**, bool, clang::SourceLocation, bool): Assertion 
> `IRCallArgs[i]->getType() == IRFuncTy->getParamType(i)' failed.
> ```

Thank you for flagging this. Please see #127528.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-02-15 Thread Matt Arsenault via cfe-commits

arsenm wrote:

> seeing breaks in downstream build of rocPRIM

Probably need to revert the downstream revert of the original problematic patch 

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-02-15 Thread via cfe-commits

ronlieb wrote:

seeing breaks in downstream build of rocPRIM

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-02-14 Thread Joseph Huber via cfe-commits

jhuber6 wrote:

@AlexVlx I found this breaks my clang build, reduced the C++ source to the 
following reproducer https://godbolt.org/z/jGnvKeqvr. Please verify that it 
breaks for you as well and revert or fix.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-02-14 Thread LLVM Continuous Integration via cfe-commits

llvm-ci wrote:

LLVM Buildbot has detected a new failure on builder `flang-runtime-cuda-clang` 
running on `as-builder-7` while building `clang` at step 10 
"build-flang-runtime-FortranRuntime".

Full details are available at: 
https://lab.llvm.org/buildbot/#/builders/7/builds/10932


Here is the relevant piece of the build log for the reference

```
Step 10 (build-flang-runtime-FortranRuntime) failure: cmake (failure)
...
  |   ^~
1 warning generated.
In file included from 
/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/extrema.cpp:13:
In file included from 
/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/reduction-templates.h:24:
In file included from 
/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/numeric-templates.h:22:
In file included from 
/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/tools.h:17:
/home/buildbot/worker/as-builder-7/ramdisk/flang-runtime-cuda-clang/llvm-project/flang/runtime/../include/flang/Runtime/freestanding-tools.h:115:27:
 warning: unused function 'MemmoveWrapper' [-Wunused-function]
  115 | static RT_API_ATTRS void *MemmoveWrapper(
  |   ^~
1 warning generated.
command timed out: 1200 seconds without output running [b'cmake', b'--build', 
b'.', b'--target', b'FortranRuntime'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=1786.866735

```



https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-02-14 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx closed 
https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-02-13 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx commented:

> We can't keep waiting for this to get in the release  

I'll merge later today once CI passes.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-02-13 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx edited 
https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-02-12 Thread Matt Arsenault via cfe-commits

https://github.com/arsenm approved this pull request.

We can't keep waiting for this to get in the release 

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-02-12 Thread Matt Arsenault via cfe-commits

https://github.com/arsenm milestoned 
https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-02-10 Thread Alex Voicu via cfe-commits

AlexVlx wrote:

> ping, this really needs to be in the release branch. The device library build 
> is broken without this

@rjmccall any additional issues / comments / suggestions / objections?

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-02-10 Thread Matt Arsenault via cfe-commits

arsenm wrote:

ping, this really needs to be in the release branch. The device library build 
is broken without this 

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-02-03 Thread Matt Arsenault via cfe-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-02-03 Thread Matt Arsenault via cfe-commits


@@ -0,0 +1,68 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: %clang_cc1 -cl-std=CL2.0 -triple amdgcn-amd-amdhsa 
-disable-llvm-passes -emit-llvm -o - %s | FileCheck %s
+
+// Check there's no assertion when passing a pointer to an address space
+// qualified argument.
+
+extern void private_ptr(__private int *);
+extern void local_ptr(__local int *);
+extern void generic_ptr(__generic int *);
+
+// CHECK-LABEL: define dso_local void @use_of_private_var(
+// CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:[[X:%.*]] = alloca i32, align 4, addrspace(5)
+// CHECK-NEXT:[[X_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[X]] to 
ptr
+// CHECK-NEXT:call void @llvm.lifetime.start.p5(i64 4, ptr addrspace(5) 
[[X]]) #[[ATTR4:[0-9]+]]
+// CHECK-NEXT:store i32 0, ptr [[X_ASCAST]], align 4, !tbaa 
[[TBAA4:![0-9]+]]
+// CHECK-NEXT:[[TMP0:%.*]] = addrspacecast ptr [[X_ASCAST]] to ptr 
addrspace(5)
+// CHECK-NEXT:call void @private_ptr(ptr addrspace(5) noundef [[TMP0]]) 
#[[ATTR5:[0-9]+]]
+// CHECK-NEXT:call void @generic_ptr(ptr noundef [[X_ASCAST]]) #[[ATTR5]]
+// CHECK-NEXT:call void @llvm.lifetime.end.p5(i64 4, ptr addrspace(5) 
[[X]]) #[[ATTR4]]
+// CHECK-NEXT:ret void
+//
+void use_of_private_var()
+{
+int x = 0 ;
+private_ptr(&x);
+generic_ptr(&x);
+}
+
+// CHECK-LABEL: define dso_local void @addr_of_arg(
+// CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:[[X_ADDR:%.*]] = alloca i32, align 4, addrspace(5)
+// CHECK-NEXT:[[X_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) 
[[X_ADDR]] to ptr
+// CHECK-NEXT:store i32 [[X]], ptr [[X_ADDR_ASCAST]], align 4, !tbaa 
[[TBAA4]]
+// CHECK-NEXT:[[TMP0:%.*]] = addrspacecast ptr [[X_ADDR_ASCAST]] to ptr 
addrspace(5)
+// CHECK-NEXT:call void @private_ptr(ptr addrspace(5) noundef [[TMP0]]) 
#[[ATTR5]]
+// CHECK-NEXT:call void @generic_ptr(ptr noundef [[X_ADDR_ASCAST]]) 
#[[ATTR5]]
+// CHECK-NEXT:ret void
+//
+void addr_of_arg(int x)
+{
+private_ptr(&x);
+generic_ptr(&x);
+}
+
+// CHECK-LABEL: define dso_local amdgpu_kernel void @use_of_local_var(
+// CHECK-SAME: ) #[[ATTR3:[0-9]+]] !kernel_arg_addr_space [[META8:![0-9]+]] 
!kernel_arg_access_qual [[META8]] !kernel_arg_type [[META8]] 
!kernel_arg_base_type [[META8]] !kernel_arg_type_qual [[META8]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:call void @local_ptr(ptr addrspace(3) noundef 
@use_of_local_var.x) #[[ATTR5]]
+// CHECK-NEXT:call void @generic_ptr(ptr noundef addrspacecast (ptr 
addrspace(3) @use_of_local_var.x to ptr)) #[[ATTR5]]
+// CHECK-NEXT:ret void
+//
+__kernel void use_of_local_var()
+{
+__local int x;
+local_ptr(&x);
+generic_ptr(&x);
+}
+
+//.
+// CHECK: [[TBAA4]] = !{[[META5:![0-9]+]], [[META5]], i64 0}
+// CHECK: [[META5]] = !{!"int", [[META6:![0-9]+]], i64 0}
+// CHECK: [[META6]] = !{!"omnipotent char", [[META7:![0-9]+]], i64 0}
+// CHECK: [[META7]] = !{!"Simple C/C++ TBAA"}
+// CHECK: [[META8]] = !{}
+//.

arsenm wrote:

```suggestion
//.

```

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-01-28 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx commented:

Gentle ping.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-01-28 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx edited 
https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-01-23 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx commented:

> There are a still a few places using getTargetAddressSpace(LangAS::Default), 
> which I don't understand.
> 
> Can you also extract my testcase from #115093? I think this should fix that 
> issue (also, this absolutely needs to be fixed in the release branching next 
> week)  

The former was me being daft and missing them. The latter is done (it does fix 
the issue).

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-01-23 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx edited 
https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-01-22 Thread Alex Voicu via cfe-commits


@@ -3283,12 +3293,14 @@ ABIArgInfo WinX86_64ABIInfo::classify(QualType Ty, 
unsigned &FreeSSERegs,
   if (RT) {
 if (!IsReturnType) {
   if (CGCXXABI::RecordArgABI RAA = getRecordArgABI(RT, getCXXABI()))
-return getNaturalAlignIndirect(Ty, RAA == 
CGCXXABI::RAA_DirectInMemory);
+return getNaturalAlignIndirect(Ty, 
getDataLayout().getAllocaAddrSpace(),
+   RAA == CGCXXABI::RAA_DirectInMemory);
 }
 
 if (RT->getDecl()->hasFlexibleArrayMember())
-  return getNaturalAlignIndirect(Ty, /*ByVal=*/false);
-
+  return getNaturalAlignIndirect(
+  Ty, getContext().getTargetAddressSpace(LangAS::Default),

AlexVlx wrote:

Fixed.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-01-21 Thread Alex Voicu via cfe-commits

AlexVlx wrote:

Gentle ping.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-01-08 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx edited 
https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-01-08 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx edited 
https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-01-08 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx commented:

> In this case I think just using the hardcoded 0 is less wrong than using the 
> "LangAS::Default". This just happens to work out correctly for the OpenCL 1.x 
> hack on amdgpu  

I don't have a strong preference / that makes sense-ish but I was actually 
leaning towards switching this to the AllocaAS, which I think is still safe and 
would do the RightThingTM.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-01-07 Thread Matt Arsenault via cfe-commits


@@ -21,9 +21,12 @@ ABIArgInfo DefaultABIInfo::classifyArgumentType(QualType Ty) 
const {
 // Records with non-trivial destructors/copy-constructors should not be
 // passed by value.
 if (CGCXXABI::RecordArgABI RAA = getRecordArgABI(Ty, getCXXABI()))
-  return getNaturalAlignIndirect(Ty, RAA == CGCXXABI::RAA_DirectInMemory);
+  return getNaturalAlignIndirect(
+  Ty, getContext().getTargetAddressSpace(LangAS::Default),
+  RAA == CGCXXABI::RAA_DirectInMemory);
 
-return getNaturalAlignIndirect(Ty);
+return getNaturalAlignIndirect(
+Ty, getContext().getTargetAddressSpace(LangAS::Default));

arsenm wrote:

In this case I think just using the hardcoded 0 is less wrong than using the 
"LangAS::Default". This just happens to work out correctly for the OpenCL 1.x 
hack on amdgpu 

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-01-07 Thread Alex Voicu via cfe-commits


@@ -814,7 +816,10 @@ static ABIArgInfo classifyType(CodeGenModule &CGM, 
CanQualType type,
 auto &layout = CGM.getContext().getASTRecordLayout(record);
 
 if (mustPassRecordIndirectly(CGM, record))
-  return ABIArgInfo::getIndirect(layout.getAlignment(), /*byval*/ false);
+  return ABIArgInfo::getIndirect(
+  layout.getAlignment(),
+  /*AddrSpace*/ 
CGM.getContext().getTargetAddressSpace(LangAS::Default),
+  /*byval*/ false);

AlexVlx wrote:

Reworked altogether.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-01-07 Thread Matt Arsenault via cfe-commits


@@ -49,6 +49,8 @@ class ABIInfo {
   CodeGen::CodeGenTypes &CGT;
   llvm::CallingConv::ID RuntimeCC;
 
+  unsigned getTargetDefaultAS() const;

arsenm wrote:

I'm fine with the getContext().get 

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-01-06 Thread Alex Voicu via cfe-commits


@@ -21,9 +21,12 @@ ABIArgInfo DefaultABIInfo::classifyArgumentType(QualType Ty) 
const {
 // Records with non-trivial destructors/copy-constructors should not be
 // passed by value.
 if (CGCXXABI::RecordArgABI RAA = getRecordArgABI(Ty, getCXXABI()))
-  return getNaturalAlignIndirect(Ty, RAA == CGCXXABI::RAA_DirectInMemory);
+  return getNaturalAlignIndirect(
+  Ty, getContext().getTargetAddressSpace(LangAS::Default),
+  RAA == CGCXXABI::RAA_DirectInMemory);
 
-return getNaturalAlignIndirect(Ty);
+return getNaturalAlignIndirect(
+Ty, getContext().getTargetAddressSpace(LangAS::Default));

AlexVlx wrote:

Eventually they should, this ties into what @rjmccall was saying elsewhere re: 
specifying and encoding an actual rule for indirects, which this initial change 
was trying to avoid. The other contexts are for indirect returns, where with 
the current Clang implementation you can only get `alloca`d storage.

This is for indirect, non-aliased args, which in the past didn't actually carry 
an AS, so this is just trying to keep existing (implicit) behaviour; for the 
time being there's no use of the AS for non-aliased indirect args, as it'd not 
have been available before. Probably for args we want what was suggested 
elsewhere, which is what you are saying, namely just use the `alloca` AS, I 
just didn't do it in this change / was going to follow up. Guess it cannot hurt 
to just bite the bullet, which would also remove the need for the helper?

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-01-06 Thread Alex Voicu via cfe-commits


@@ -49,6 +49,8 @@ class ABIInfo {
   CodeGen::CodeGenTypes &CGT;
   llvm::CallingConv::ID RuntimeCC;
 
+  unsigned getTargetDefaultAS() const;

AlexVlx wrote:

Sure, but I couldn't quite figure out any other convenient, accessible, 
non-intrusive place to stash this; the alternative is to keep the spammy calls 
to `getContext().bla`, as per the prior iteration. I'm open to suggestions 
though, if you have something in mind?

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-01-06 Thread Matt Arsenault via cfe-commits


@@ -814,7 +816,10 @@ static ABIArgInfo classifyType(CodeGenModule &CGM, 
CanQualType type,
 auto &layout = CGM.getContext().getASTRecordLayout(record);
 
 if (mustPassRecordIndirectly(CGM, record))
-  return ABIArgInfo::getIndirect(layout.getAlignment(), /*byval*/ false);
+  return ABIArgInfo::getIndirect(
+  layout.getAlignment(),
+  /*AddrSpace*/ 
CGM.getContext().getTargetAddressSpace(LangAS::Default),
+  /*byval*/ false);

arsenm wrote:

```suggestion
  /*AddrSpace=*/ 
CGM.getContext().getTargetAddressSpace(LangAS::Default),
  /*byval=*/ false);
```

as clang-format prefers 

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-01-06 Thread Matt Arsenault via cfe-commits


@@ -49,6 +49,8 @@ class ABIInfo {
   CodeGen::CodeGenTypes &CGT;
   llvm::CallingConv::ID RuntimeCC;
 
+  unsigned getTargetDefaultAS() const;

arsenm wrote:

I don't really like this as a helper, it's not really expressing any new target 
concept. It's still just querying what do use for the Default LangAS 

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2025-01-06 Thread Matt Arsenault via cfe-commits


@@ -21,9 +21,12 @@ ABIArgInfo DefaultABIInfo::classifyArgumentType(QualType Ty) 
const {
 // Records with non-trivial destructors/copy-constructors should not be
 // passed by value.
 if (CGCXXABI::RecordArgABI RAA = getRecordArgABI(Ty, getCXXABI()))
-  return getNaturalAlignIndirect(Ty, RAA == CGCXXABI::RAA_DirectInMemory);
+  return getNaturalAlignIndirect(
+  Ty, getContext().getTargetAddressSpace(LangAS::Default),
+  RAA == CGCXXABI::RAA_DirectInMemory);
 
-return getNaturalAlignIndirect(Ty);
+return getNaturalAlignIndirect(
+Ty, getContext().getTargetAddressSpace(LangAS::Default));

arsenm wrote:

This should be getAllocaAddrSpace like the rest of the get*Indirect contexts?

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-12-18 Thread Alex Voicu via cfe-commits


@@ -225,7 +225,9 @@ ABIArgInfo AMDGPUABIInfo::classifyArgumentType(QualType Ty, 
bool Variadic,
 // Records with non-trivial destructors/copy-constructors should not be
 // passed by value.
 if (auto RAA = getRecordArgABI(Ty, getCXXABI()))
-  return getNaturalAlignIndirect(Ty, RAA == CGCXXABI::RAA_DirectInMemory);
+  return getNaturalAlignIndirect(
+  Ty, getContext().getTargetAddressSpace(LangAS::Default),
+  RAA == CGCXXABI::RAA_DirectInMemory);

AlexVlx wrote:

Apologies if my reply came off as defensive (and for the delay), it wasn't 
intended as such, merely wanted to convey trepidation around doing too much in 
one change. I am in agreement with the points you made / the way forward. In 
what regards the proposed rule, as I mentioned elsewhere, I'm of the opinion 
that we should leave indirect returns pointing to the AllocaAS (since this is 
the only possibility at the moment), with a note that this would require 
reworking in the future (where the base ABI would be to return a pointer to the 
default AS, with targets potentially altering this).

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-12-12 Thread John McCall via cfe-commits


@@ -225,7 +225,9 @@ ABIArgInfo AMDGPUABIInfo::classifyArgumentType(QualType Ty, 
bool Variadic,
 // Records with non-trivial destructors/copy-constructors should not be
 // passed by value.
 if (auto RAA = getRecordArgABI(Ty, getCXXABI()))
-  return getNaturalAlignIndirect(Ty, RAA == CGCXXABI::RAA_DirectInMemory);
+  return getNaturalAlignIndirect(
+  Ty, getContext().getTargetAddressSpace(LangAS::Default),
+  RAA == CGCXXABI::RAA_DirectInMemory);

rjmccall wrote:

Okay, let's step back for a second. Compiler implementation is a naturally 
*highly* combinatoric problem; everything we do here is prone to an explosion 
of complexity wherever features and/or target variations interact. It is very 
important that, as we work on the compiler, we always have a philosophical 
mindset, trying to figure out the right way to think about the problem and 
organize our solution. Otherwise, we find ourselves drowning in "special cases" 
that never quite seem to fix the bugs.

I am assuming here that this is one patch in a series that, cumulatively, will 
implement a sensible plan for handling address spaces for indirect argument and 
return values. This is a part of the compiler that clearly got overlooked a 
bit, so we shouldn't be surprised to find inconsistencies and unexpected 
behavior. Some of those will be bugs; others will actually be intended, and we 
should be looking at those for that philosophical insight about the right way 
to model what we're doing.

My experience is that patches that refactor bits of the API and therefore 
necessitate touching a lot of code are the ideal time to be looking at each of 
those places and trying to figure out what the code is supposed to be doing. 
You don't have to actually make semantic changes as you go, but you need to at 
least have these conversations where you acknowledge "okay, this is doing X 
right now, but it should really be using the general rule for Y" and then leave 
a FIXME behind saying that, one which you intend to fix within the next few 
patches.

But that is what I am doing in this code review: I'm not trying to derail your 
PR, I'm trying to set your whole series of patches up for success so that we 
end up with a coherent design and implementation.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-12-12 Thread John McCall via cfe-commits


@@ -800,7 +800,9 @@ static ABIArgInfo classifyExpandedType(SwiftAggLowering 
&lowering,
   if (lowering.empty()) {
 return ABIArgInfo::getIgnore();
   } else if (lowering.shouldPassIndirectly(forReturn)) {
-return ABIArgInfo::getIndirect(alignmentForIndirect, /*byval*/ false);
+return ABIArgInfo::getIndirect(alignmentForIndirect,
+   /*AddrSpace*/ 0,
+   /*byval*/ false);

rjmccall wrote:

I'm asking you to change behavior in a way that's consistent with the way 
you're changing behavior in general in your patch.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-12-10 Thread Alex Voicu via cfe-commits


@@ -800,7 +800,9 @@ static ABIArgInfo classifyExpandedType(SwiftAggLowering 
&lowering,
   if (lowering.empty()) {
 return ABIArgInfo::getIgnore();
   } else if (lowering.shouldPassIndirectly(forReturn)) {
-return ABIArgInfo::getIndirect(alignmentForIndirect, /*byval*/ false);
+return ABIArgInfo::getIndirect(alignmentForIndirect,
+   /*AddrSpace*/ 0,
+   /*byval*/ false);

AlexVlx wrote:

I did not intend to alter behaviour here, but merely to adopt the changed 
interface. There's no easy way to use she LangAS::Default -> Target mapping 
here in this helper function, hence the hardcoding to 0.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-12-10 Thread Alex Voicu via cfe-commits


@@ -225,7 +225,9 @@ ABIArgInfo AMDGPUABIInfo::classifyArgumentType(QualType Ty, 
bool Variadic,
 // Records with non-trivial destructors/copy-constructors should not be
 // passed by value.
 if (auto RAA = getRecordArgABI(Ty, getCXXABI()))
-  return getNaturalAlignIndirect(Ty, RAA == CGCXXABI::RAA_DirectInMemory);
+  return getNaturalAlignIndirect(
+  Ty, getContext().getTargetAddressSpace(LangAS::Default),
+  RAA == CGCXXABI::RAA_DirectInMemory);

AlexVlx wrote:

This is not intended to alter current behaviour, as without this patch non 
aliased indirect args wouldn't carry an AS, so it's just dealing with the 
interface extension for `getIndirect` / `getNaturalAlignIndirect`.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-12-10 Thread Alex Voicu via cfe-commits

AlexVlx wrote:

> I'm fine with how you're handling the address spaces for now.
> 
> I'd like to talk about the rule you're implementing, though. It looks like 
> it's supposed to be:
> 
> * return values always use the alloca AS
> * arguments always use the default AS
> * whether something is indirect because it's non-POD or simply too big to fit 
> in registers doesn't make a difference
> 
> That's a surprising rule; in fact, it's the exact opposite of the rule I 
> would expect.
> 
> Indirect arguments are always true temporaries. The caller has total control 
> over where to allocate and initialize the temporary, and it has very little 
> reason to not always use the stack. So there's no reason for the ABI to not 
> specify that the argument pointer is passed in the alloca AS.
> 
> Return values, in contrast, can be used to directly initialize all sorts of 
> different memory, not just objects on the stack. So the ABI should probably 
> be to pass as generic a pointer as the target supports. Moreover, while 
> passing a restricted pointer in C is okay because we can always use pass a 
> temporary and then relocate the object after the call, the same is not true 
> for types that are non-trivial to copy in C++ — we are not generally allowed 
> to introduce extra moves of such objects. So even if you normally want to 
> return values in the alloca AS as an optimization, you do need to make an 
> exception for non-trivial C++ objects.
> 
> Unrelated note: I think most of the targets can just hardcode that they use 
> AS 0 for alloca and default AS; no need to query for that all over the place.

I think that calling it a "rule" gives it far too much credit, since it's just 
an artifact of wanting to avoid tinkering with current behaviour:) Previously 
`getIndirect` (and the convenience variant that forwards to it) would not have 
yielded an address space for the returned value, so this is not used at the 
moment anywhere / no extant caller would care about it being there / using it. 
Thus, args using the default AS is just the equivalent to hardcoding 0, now 
that the interface takes an AS. I am opposed to hardcoding 0 in general, and 
there were objections in this very review to the notion, hence using default. 
TL;DR, for arguments this is just maintaining current implicit behaviour whilst 
adopting the new interfaces, it's not trying to introduce anything new, and 
should really be NFC.

In what regards the return value, whilst I appreciate the theoretical 
possibility, I will note that with how Clang works today we will never 
practically get a non `alloca`d return value, as far as I can tell. Moving away 
from that to the more general case you describe would require more work (in 
Clang itself), probably actually encoding the rule you describe, leveraging the 
interfaces we're adding with this change. IMHO, as currently written, we're 
just reflecting what Clang does for indirect returns, and adding interfaces 
that can be used to improve this in the future.

Overall, if this has ended up doing too much, I'm happy to shrink it to merely 
addressing the immediate concern for AMDGPU + `sret`, and we can fork out the 
wider conversation into a separate PR. I would like to at least address the 
AMDGPU part of the problem as the current state of affairs creates some 
challenges.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-12-10 Thread John McCall via cfe-commits

https://github.com/rjmccall edited 
https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-12-10 Thread John McCall via cfe-commits


@@ -225,7 +225,9 @@ ABIArgInfo AMDGPUABIInfo::classifyArgumentType(QualType Ty, 
bool Variadic,
 // Records with non-trivial destructors/copy-constructors should not be
 // passed by value.
 if (auto RAA = getRecordArgABI(Ty, getCXXABI()))
-  return getNaturalAlignIndirect(Ty, RAA == CGCXXABI::RAA_DirectInMemory);
+  return getNaturalAlignIndirect(
+  Ty, getContext().getTargetAddressSpace(LangAS::Default),
+  RAA == CGCXXABI::RAA_DirectInMemory);

rjmccall wrote:

AMDGPU seems to generally use the private AS for indirect arguments; are you 
intentionally using the default AS for non-POD C++ arguments?

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-12-10 Thread John McCall via cfe-commits

https://github.com/rjmccall requested changes to this pull request.

I'm fine with how you're handling the address spaces for now.

I'd like to talk about the rule you're implementing, though.  It looks like 
it's supposed to be:
- return values always use the alloca AS
- arguments always use the default AS
- whether something is indirect because it's non-POD or simply too big to fit 
in registers doesn't make a difference
That's a surprising rule; in fact, it's the exact opposite of the rule I would 
expect.

Indirect arguments are always true temporaries. The caller has total control 
over where to allocate and initialize the temporary, and it has very little 
reason to not always use the stack. So there's no reason for the ABI to not 
specify that the argument pointer is passed in the alloca AS.

Return values, in contrast, can be used to directly initialize all sorts of 
different memory, not just objects on the stack. So the ABI should probably be 
to pass as generic a pointer as the target supports. Moreover, while passing a 
restricted pointer in C is okay because we can always use pass a temporary and 
then relocate the object after the call, the same is not true for types that 
are non-trivial to copy in C++ — we are not generally allowed to introduce 
extra moves of such objects. So if you want to return values in the alloca AS 
in general, you do need to make an exception for non-trivial C++ objects.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-12-10 Thread John McCall via cfe-commits

https://github.com/rjmccall edited 
https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-12-10 Thread John McCall via cfe-commits


@@ -800,7 +800,9 @@ static ABIArgInfo classifyExpandedType(SwiftAggLowering 
&lowering,
   if (lowering.empty()) {
 return ABIArgInfo::getIgnore();
   } else if (lowering.shouldPassIndirectly(forReturn)) {
-return ABIArgInfo::getIndirect(alignmentForIndirect, /*byval*/ false);
+return ABIArgInfo::getIndirect(alignmentForIndirect,
+   /*AddrSpace*/ 0,
+   /*byval*/ false);

rjmccall wrote:

Swift should use the general rules for the target.  Note that `forReturn` 
indicates whether we have an argument (false) or a return value (true).

This site is for types that are passed indirectly because they're too large; 
the site below is for types that are non-trivial.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-12-10 Thread John McCall via cfe-commits

https://github.com/rjmccall edited 
https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-12-09 Thread Alex Voicu via cfe-commits


@@ -105,6 +105,11 @@ void AMDGPUABIInfo::computeInfo(CGFunctionInfo &FI) const {
   if (!getCXXABI().classifyReturnType(FI))
 FI.getReturnInfo() = classifyReturnType(FI.getReturnType());
 
+  // srets / indirect returns are unconditionally in the alloca AS.
+  if (FI.getReturnInfo().isIndirect())
+FI.getReturnInfo().setIndirectAddrSpace(
+getDataLayout().getAllocaAddrSpace());

AlexVlx wrote:

@rjmccall are you OK with the current solution (@arsenm  appears to be)?

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-12-05 Thread Matt Arsenault via cfe-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-12-05 Thread Alex Voicu via cfe-commits


@@ -1350,7 +1350,7 @@ bool ItaniumCXXABI::classifyReturnType(CGFunctionInfo 
&FI) const {
   // If C++ prohibits us from making a copy, return by address.
   if (!RD->canPassInRegisters()) {
 auto Align = CGM.getContext().getTypeAlignInChars(FI.getReturnType());
-FI.getReturnInfo() = ABIArgInfo::getIndirect(Align, /*ByVal=*/false);
+FI.getReturnInfo() = ABIArgInfo::getIndirect(Align, 0, /*ByVal=*/false);

AlexVlx wrote:

I actually went ahead and bit the bullet, switching indirect returns to the 
AllocaAS (this is at least initially covered by tests here), and leaving other 
uses of indirect using LangAS::Default. I'm going to have to revisit this 
around indirect args, but I wasn't brave enough to go and add even more noise 
to what is already a chunky patch.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-12-04 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx updated 
https://github.com/llvm/llvm-project/pull/114062

>From d2d2d3d5db3f639aab178f9ca9a20db2842d2b65 Mon Sep 17 00:00:00 2001
From: Alex Voicu 
Date: Tue, 29 Oct 2024 14:20:44 +
Subject: [PATCH 01/11] `sret` args should always point to the `alloca` AS, so
 we can use that.

---
 clang/lib/CodeGen/CGCall.cpp   | 15 ---
 clang/test/CodeGen/partial-reinitialization2.c |  4 ++--
 clang/test/CodeGen/sret.c  | 11 +++
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 8f4f5d3ed81601..56acfae7ae9e51 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1672,8 +1672,7 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) {
 
   // Add type for sret argument.
   if (IRFunctionArgs.hasSRetArg()) {
-QualType Ret = FI.getReturnType();
-unsigned AddressSpace = CGM.getTypes().getTargetAddressSpace(Ret);
+unsigned AddressSpace = CGM.getDataLayout().getAllocaAddrSpace();
 ArgTypes[IRFunctionArgs.getSRetArgNo()] =
 llvm::PointerType::get(getLLVMContext(), AddressSpace);
   }
@@ -5145,7 +5144,6 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // If the call returns a temporary with struct return, create a temporary
   // alloca to hold the result, unless one is given to us.
   Address SRetPtr = Address::invalid();
-  RawAddress SRetAlloca = RawAddress::invalid();
   llvm::Value *UnusedReturnSizePtr = nullptr;
   if (RetAI.isIndirect() || RetAI.isInAlloca() || RetAI.isCoerceAndExpand()) {
 // For virtual function pointer thunks and musttail calls, we must always
@@ -5159,16 +5157,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 } else if (!ReturnValue.isNull()) {
   SRetPtr = ReturnValue.getAddress();
 } else {
-  SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca);
+  SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp");
   if (HaveInsertPoint() && ReturnValue.isUnused()) {
 llvm::TypeSize size =
 CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy));
-UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer());
+UnusedReturnSizePtr = EmitLifetimeStart(size, 
SRetPtr.getBasePointer());
   }
 }
 if (IRFunctionArgs.hasSRetArg()) {
+  // If the caller allocated the return slot, it is possible that the
+  // alloca was AS casted to the default as, so we ensure the cast is
+  // stripped before binding to the sret arg, which is in the allocaAS.
   IRCallArgs[IRFunctionArgs.getSRetArgNo()] =
-  getAsNaturalPointerTo(SRetPtr, RetTy);
+  getAsNaturalPointerTo(SRetPtr, RetTy)->stripPointerCasts();
 } else if (RetAI.isInAlloca()) {
   Address Addr =
   Builder.CreateStructGEP(ArgMemory, RetAI.getInAllocaFieldIndex());
@@ -5740,7 +5741,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // pop this cleanup later on. Being eager about this is OK, since this
   // temporary is 'invisible' outside of the callee.
   if (UnusedReturnSizePtr)
-pushFullExprCleanup(NormalEHLifetimeMarker, SRetAlloca,
+pushFullExprCleanup(NormalEHLifetimeMarker, SRetPtr,
  UnusedReturnSizePtr);
 
   llvm::BasicBlock *InvokeDest = CannotThrow ? nullptr : getInvokeDest();
diff --git a/clang/test/CodeGen/partial-reinitialization2.c 
b/clang/test/CodeGen/partial-reinitialization2.c
index e709c1d4ad1ee1..7949a69555031e 100644
--- a/clang/test/CodeGen/partial-reinitialization2.c
+++ b/clang/test/CodeGen/partial-reinitialization2.c
@@ -91,8 +91,8 @@ void test5(void)
 // CHECK-LABEL: test6
 void test6(void)
 {
-  // CHECK: [[LP:%[a-z0-9]+]] = getelementptr{{.*}}%struct.LLP2P2, ptr{{.*}}, 
i32 0, i32 0
-  // CHECK: call {{.*}}get456789(ptr {{.*}}[[LP]])
+  // CHECK: [[VAR:%[a-z0-9]+]] = alloca
+  // CHECK: call {{.*}}get456789(ptr {{.*}}sret{{.*}} [[VAR]])
 
   // CHECK: [[CALL:%[a-z0-9]+]] = call {{.*}}@get235()
   // CHECK: store{{.*}}[[CALL]], {{.*}}[[TMP0:%[a-z0-9.]+]]
diff --git a/clang/test/CodeGen/sret.c b/clang/test/CodeGen/sret.c
index 6d905e89b2c6fd..3b4914f29d2bfe 100644
--- a/clang/test/CodeGen/sret.c
+++ b/clang/test/CodeGen/sret.c
@@ -1,4 +1,5 @@
 // RUN: %clang_cc1 %s -Wno-strict-prototypes -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 %s -Wno-strict-prototypes -triple amdgcn-amd-amdhsa 
-emit-llvm -o - | FileCheck --check-prefix=NONZEROALLOCAAS %s
 
 struct abc {
  long a;
@@ -6,18 +7,28 @@ struct abc {
  long c;
  long d;
  long e;
+ long f;
+ long g;
+ long h;
+ long i;
+ long j;
 };
 
 struct abc foo1(void);
 // CHECK-DAG: declare {{.*}} @foo1(ptr dead_on_unwind writable 
sret(%struct.abc)
+// NONZEROALLOCAAS-DAG: declare {{.*}} @foo1(ptr addrspace(5) dead_on_unwind 
writable sret(%struct.abc)
 struct abc foo2();
 // CHECK-DAG: declare {{.*}} @foo2(ptr dead_on_unwind wri

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-12-04 Thread Matt Arsenault via cfe-commits


@@ -1350,7 +1350,7 @@ bool ItaniumCXXABI::classifyReturnType(CGFunctionInfo 
&FI) const {
   // If C++ prohibits us from making a copy, return by address.
   if (!RD->canPassInRegisters()) {
 auto Align = CGM.getContext().getTypeAlignInChars(FI.getReturnType());
-FI.getReturnInfo() = ABIArgInfo::getIndirect(Align, /*ByVal=*/false);
+FI.getReturnInfo() = ABIArgInfo::getIndirect(Align, 0, /*ByVal=*/false);

arsenm wrote:

I'm assuming these should default to DL.getAllocaAddrSpace but I guess that's a 
separate change that calls for new test coverage 

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-12-04 Thread Matt Arsenault via cfe-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-12-04 Thread Matt Arsenault via cfe-commits


@@ -105,6 +105,11 @@ void AMDGPUABIInfo::computeInfo(CGFunctionInfo &FI) const {
   if (!getCXXABI().classifyReturnType(FI))
 FI.getReturnInfo() = classifyReturnType(FI.getReturnType());
 
+  // srets / indirect returns are unconditionally in the alloca AS.
+  if (FI.getReturnInfo().isIndirect())
+FI.getReturnInfo().setIndirectAddrSpace(
+getDataLayout().getAllocaAddrSpace());

arsenm wrote:

It's also ok to just hardcode to AMDGPUAS::PRIVATE_ADDRESS

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-24 Thread Alex Voicu via cfe-commits


@@ -5158,14 +5155,17 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 } else if (!ReturnValue.isNull()) {
   SRetPtr = ReturnValue.getAddress();
 } else {
-  SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca);
+  SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp");
   if (HaveInsertPoint() && ReturnValue.isUnused()) {
 llvm::TypeSize size =
 CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy));
-UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer());
+UnusedReturnSizePtr = EmitLifetimeStart(size, 
SRetPtr.getBasePointer());
   }
 }
 if (IRFunctionArgs.hasSRetArg()) {
+  // If the caller allocated the return slot, it is possible that the

AlexVlx wrote:

Done.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-24 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx updated 
https://github.com/llvm/llvm-project/pull/114062

>From d2d2d3d5db3f639aab178f9ca9a20db2842d2b65 Mon Sep 17 00:00:00 2001
From: Alex Voicu 
Date: Tue, 29 Oct 2024 14:20:44 +
Subject: [PATCH 01/11] `sret` args should always point to the `alloca` AS, so
 we can use that.

---
 clang/lib/CodeGen/CGCall.cpp   | 15 ---
 clang/test/CodeGen/partial-reinitialization2.c |  4 ++--
 clang/test/CodeGen/sret.c  | 11 +++
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 8f4f5d3ed81601..56acfae7ae9e51 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1672,8 +1672,7 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) {
 
   // Add type for sret argument.
   if (IRFunctionArgs.hasSRetArg()) {
-QualType Ret = FI.getReturnType();
-unsigned AddressSpace = CGM.getTypes().getTargetAddressSpace(Ret);
+unsigned AddressSpace = CGM.getDataLayout().getAllocaAddrSpace();
 ArgTypes[IRFunctionArgs.getSRetArgNo()] =
 llvm::PointerType::get(getLLVMContext(), AddressSpace);
   }
@@ -5145,7 +5144,6 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // If the call returns a temporary with struct return, create a temporary
   // alloca to hold the result, unless one is given to us.
   Address SRetPtr = Address::invalid();
-  RawAddress SRetAlloca = RawAddress::invalid();
   llvm::Value *UnusedReturnSizePtr = nullptr;
   if (RetAI.isIndirect() || RetAI.isInAlloca() || RetAI.isCoerceAndExpand()) {
 // For virtual function pointer thunks and musttail calls, we must always
@@ -5159,16 +5157,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 } else if (!ReturnValue.isNull()) {
   SRetPtr = ReturnValue.getAddress();
 } else {
-  SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca);
+  SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp");
   if (HaveInsertPoint() && ReturnValue.isUnused()) {
 llvm::TypeSize size =
 CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy));
-UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer());
+UnusedReturnSizePtr = EmitLifetimeStart(size, 
SRetPtr.getBasePointer());
   }
 }
 if (IRFunctionArgs.hasSRetArg()) {
+  // If the caller allocated the return slot, it is possible that the
+  // alloca was AS casted to the default as, so we ensure the cast is
+  // stripped before binding to the sret arg, which is in the allocaAS.
   IRCallArgs[IRFunctionArgs.getSRetArgNo()] =
-  getAsNaturalPointerTo(SRetPtr, RetTy);
+  getAsNaturalPointerTo(SRetPtr, RetTy)->stripPointerCasts();
 } else if (RetAI.isInAlloca()) {
   Address Addr =
   Builder.CreateStructGEP(ArgMemory, RetAI.getInAllocaFieldIndex());
@@ -5740,7 +5741,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // pop this cleanup later on. Being eager about this is OK, since this
   // temporary is 'invisible' outside of the callee.
   if (UnusedReturnSizePtr)
-pushFullExprCleanup(NormalEHLifetimeMarker, SRetAlloca,
+pushFullExprCleanup(NormalEHLifetimeMarker, SRetPtr,
  UnusedReturnSizePtr);
 
   llvm::BasicBlock *InvokeDest = CannotThrow ? nullptr : getInvokeDest();
diff --git a/clang/test/CodeGen/partial-reinitialization2.c 
b/clang/test/CodeGen/partial-reinitialization2.c
index e709c1d4ad1ee1..7949a69555031e 100644
--- a/clang/test/CodeGen/partial-reinitialization2.c
+++ b/clang/test/CodeGen/partial-reinitialization2.c
@@ -91,8 +91,8 @@ void test5(void)
 // CHECK-LABEL: test6
 void test6(void)
 {
-  // CHECK: [[LP:%[a-z0-9]+]] = getelementptr{{.*}}%struct.LLP2P2, ptr{{.*}}, 
i32 0, i32 0
-  // CHECK: call {{.*}}get456789(ptr {{.*}}[[LP]])
+  // CHECK: [[VAR:%[a-z0-9]+]] = alloca
+  // CHECK: call {{.*}}get456789(ptr {{.*}}sret{{.*}} [[VAR]])
 
   // CHECK: [[CALL:%[a-z0-9]+]] = call {{.*}}@get235()
   // CHECK: store{{.*}}[[CALL]], {{.*}}[[TMP0:%[a-z0-9.]+]]
diff --git a/clang/test/CodeGen/sret.c b/clang/test/CodeGen/sret.c
index 6d905e89b2c6fd..3b4914f29d2bfe 100644
--- a/clang/test/CodeGen/sret.c
+++ b/clang/test/CodeGen/sret.c
@@ -1,4 +1,5 @@
 // RUN: %clang_cc1 %s -Wno-strict-prototypes -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 %s -Wno-strict-prototypes -triple amdgcn-amd-amdhsa 
-emit-llvm -o - | FileCheck --check-prefix=NONZEROALLOCAAS %s
 
 struct abc {
  long a;
@@ -6,18 +7,28 @@ struct abc {
  long c;
  long d;
  long e;
+ long f;
+ long g;
+ long h;
+ long i;
+ long j;
 };
 
 struct abc foo1(void);
 // CHECK-DAG: declare {{.*}} @foo1(ptr dead_on_unwind writable 
sret(%struct.abc)
+// NONZEROALLOCAAS-DAG: declare {{.*}} @foo1(ptr addrspace(5) dead_on_unwind 
writable sret(%struct.abc)
 struct abc foo2();
 // CHECK-DAG: declare {{.*}} @foo2(ptr dead_on_unwind wri

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-22 Thread Alex Voicu via cfe-commits


@@ -1350,7 +1350,7 @@ bool ItaniumCXXABI::classifyReturnType(CGFunctionInfo 
&FI) const {
   // If C++ prohibits us from making a copy, return by address.
   if (!RD->canPassInRegisters()) {
 auto Align = CGM.getContext().getTypeAlignInChars(FI.getReturnType());
-FI.getReturnInfo() = ABIArgInfo::getIndirect(Align, /*ByVal=*/false);
+FI.getReturnInfo() = ABIArgInfo::getIndirect(Align, 0, /*ByVal=*/false);

AlexVlx wrote:

We could use the target equivalent of `LangAS::Default`, which would at least 
partially cover one of @rjmccall's suggestions.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-22 Thread Alex Voicu via cfe-commits


@@ -296,18 +296,25 @@ void AggExprEmitter::withReturnValueSlot(
  (RequiresDestruction && Dest.isIgnored());
 
   Address RetAddr = Address::invalid();
-  RawAddress RetAllocaAddr = RawAddress::invalid();
 
   EHScopeStack::stable_iterator LifetimeEndBlock;
   llvm::Value *LifetimeSizePtr = nullptr;
   llvm::IntrinsicInst *LifetimeStartInst = nullptr;
   if (!UseTemp) {
-RetAddr = Dest.getAddress();
+// It is possible for the existing slot we are using directly to have been
+// allocated in the correct AS for an indirect return, and then cast to
+// the default AS (this is the behaviour of CreateMemTemp), however we know
+// that the return address is expected to point to the uncasted AS, hence 
we
+// strip possible pointer casts here.

AlexVlx wrote:

Yes, Dest could have been allocated via simple `CreateMemTemp`, which would 
then cast from the AllocaAS to the DefaultAS. As the code is currently 
structured I didn't see a way to disambiguate between allocating for a return 
vs allocating for a "normal" aggregate, so handling it here seemed like the 
only way to do it tidily.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-22 Thread Alex Voicu via cfe-commits


@@ -5158,14 +5155,17 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 } else if (!ReturnValue.isNull()) {
   SRetPtr = ReturnValue.getAddress();
 } else {
-  SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca);
+  SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp");
   if (HaveInsertPoint() && ReturnValue.isUnused()) {
 llvm::TypeSize size =
 CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy));
-UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer());
+UnusedReturnSizePtr = EmitLifetimeStart(size, 
SRetPtr.getBasePointer());
   }
 }
 if (IRFunctionArgs.hasSRetArg()) {
+  // If the caller allocated the return slot, it is possible that the

AlexVlx wrote:

Forgot to remove this, apologies.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-22 Thread Matt Arsenault via cfe-commits


@@ -5158,14 +5155,17 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 } else if (!ReturnValue.isNull()) {
   SRetPtr = ReturnValue.getAddress();
 } else {
-  SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca);
+  SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp");
   if (HaveInsertPoint() && ReturnValue.isUnused()) {
 llvm::TypeSize size =
 CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy));
-UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer());
+UnusedReturnSizePtr = EmitLifetimeStart(size, 
SRetPtr.getBasePointer());
   }
 }
 if (IRFunctionArgs.hasSRetArg()) {
+  // If the caller allocated the return slot, it is possible that the

arsenm wrote:

Comment false now that this uses CreateMemTempWithoutCast?

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-22 Thread Matt Arsenault via cfe-commits


@@ -5389,11 +5389,22 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 V->getType()->isIntegerTy())
   V = Builder.CreateZExt(V, ArgInfo.getCoerceToType());
 
-// If the argument doesn't match, perform a bitcast to coerce it.  This
-// can happen due to trivial type mismatches.
+// The only plausible mismatch here would be for pointer address 
spaces,
+// which can happen e.g. when passing a sret arg that is in the 
AllocaAS
+// to a function that takes a pointer to and argument in the DefaultAS.
+// We assume that the target has a reasonable mapping for the DefaultAS
+// (it can be casted to from incoming specific ASes), and insert an AS
+// cast to address the mismatch.
 if (FirstIRArg < IRFuncTy->getNumParams() &&
-V->getType() != IRFuncTy->getParamType(FirstIRArg))
-  V = Builder.CreateBitCast(V, IRFuncTy->getParamType(FirstIRArg));
+V->getType() != IRFuncTy->getParamType(FirstIRArg)) {
+  assert(V->getType()->isPointerTy() && "Only pointers can mismatch!");
+  auto FormalAS = CallInfo.arguments()[ArgNo]
+  .type.getQualifiers()
+  .getAddressSpace();
+  auto ActualAS = I->Ty.getAddressSpace();
+  V = getTargetHooks().performAddrSpaceCast(
+  *this, V, ActualAS, FormalAS, 
IRFuncTy->getParamType(FirstIRArg));

arsenm wrote:

Not sure why a target hook is needed to just insert an addrspacecast, but this 
seems to be prior art 

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-22 Thread Matt Arsenault via cfe-commits


@@ -1350,7 +1350,7 @@ bool ItaniumCXXABI::classifyReturnType(CGFunctionInfo 
&FI) const {
   // If C++ prohibits us from making a copy, return by address.
   if (!RD->canPassInRegisters()) {
 auto Align = CGM.getContext().getTypeAlignInChars(FI.getReturnType());
-FI.getReturnInfo() = ABIArgInfo::getIndirect(Align, /*ByVal=*/false);
+FI.getReturnInfo() = ABIArgInfo::getIndirect(Align, 0, /*ByVal=*/false);

arsenm wrote:

Comment parameter name. Maybe it's possible to do better than just hardcoded 0? 

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-22 Thread Matt Arsenault via cfe-commits


@@ -296,18 +296,25 @@ void AggExprEmitter::withReturnValueSlot(
  (RequiresDestruction && Dest.isIgnored());
 
   Address RetAddr = Address::invalid();
-  RawAddress RetAllocaAddr = RawAddress::invalid();
 
   EHScopeStack::stable_iterator LifetimeEndBlock;
   llvm::Value *LifetimeSizePtr = nullptr;
   llvm::IntrinsicInst *LifetimeStartInst = nullptr;
   if (!UseTemp) {
-RetAddr = Dest.getAddress();
+// It is possible for the existing slot we are using directly to have been
+// allocated in the correct AS for an indirect return, and then cast to
+// the default AS (this is the behaviour of CreateMemTemp), however we know
+// that the return address is expected to point to the uncasted AS, hence 
we
+// strip possible pointer casts here.

arsenm wrote:

Is this still true in this version? 

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-22 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx commented:

Gentle ping.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-22 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx edited 
https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-22 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx updated 
https://github.com/llvm/llvm-project/pull/114062

>From d2d2d3d5db3f639aab178f9ca9a20db2842d2b65 Mon Sep 17 00:00:00 2001
From: Alex Voicu 
Date: Tue, 29 Oct 2024 14:20:44 +
Subject: [PATCH 01/10] `sret` args should always point to the `alloca` AS, so
 we can use that.

---
 clang/lib/CodeGen/CGCall.cpp   | 15 ---
 clang/test/CodeGen/partial-reinitialization2.c |  4 ++--
 clang/test/CodeGen/sret.c  | 11 +++
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 8f4f5d3ed81601..56acfae7ae9e51 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1672,8 +1672,7 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) {
 
   // Add type for sret argument.
   if (IRFunctionArgs.hasSRetArg()) {
-QualType Ret = FI.getReturnType();
-unsigned AddressSpace = CGM.getTypes().getTargetAddressSpace(Ret);
+unsigned AddressSpace = CGM.getDataLayout().getAllocaAddrSpace();
 ArgTypes[IRFunctionArgs.getSRetArgNo()] =
 llvm::PointerType::get(getLLVMContext(), AddressSpace);
   }
@@ -5145,7 +5144,6 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // If the call returns a temporary with struct return, create a temporary
   // alloca to hold the result, unless one is given to us.
   Address SRetPtr = Address::invalid();
-  RawAddress SRetAlloca = RawAddress::invalid();
   llvm::Value *UnusedReturnSizePtr = nullptr;
   if (RetAI.isIndirect() || RetAI.isInAlloca() || RetAI.isCoerceAndExpand()) {
 // For virtual function pointer thunks and musttail calls, we must always
@@ -5159,16 +5157,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 } else if (!ReturnValue.isNull()) {
   SRetPtr = ReturnValue.getAddress();
 } else {
-  SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca);
+  SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp");
   if (HaveInsertPoint() && ReturnValue.isUnused()) {
 llvm::TypeSize size =
 CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy));
-UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer());
+UnusedReturnSizePtr = EmitLifetimeStart(size, 
SRetPtr.getBasePointer());
   }
 }
 if (IRFunctionArgs.hasSRetArg()) {
+  // If the caller allocated the return slot, it is possible that the
+  // alloca was AS casted to the default as, so we ensure the cast is
+  // stripped before binding to the sret arg, which is in the allocaAS.
   IRCallArgs[IRFunctionArgs.getSRetArgNo()] =
-  getAsNaturalPointerTo(SRetPtr, RetTy);
+  getAsNaturalPointerTo(SRetPtr, RetTy)->stripPointerCasts();
 } else if (RetAI.isInAlloca()) {
   Address Addr =
   Builder.CreateStructGEP(ArgMemory, RetAI.getInAllocaFieldIndex());
@@ -5740,7 +5741,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // pop this cleanup later on. Being eager about this is OK, since this
   // temporary is 'invisible' outside of the callee.
   if (UnusedReturnSizePtr)
-pushFullExprCleanup(NormalEHLifetimeMarker, SRetAlloca,
+pushFullExprCleanup(NormalEHLifetimeMarker, SRetPtr,
  UnusedReturnSizePtr);
 
   llvm::BasicBlock *InvokeDest = CannotThrow ? nullptr : getInvokeDest();
diff --git a/clang/test/CodeGen/partial-reinitialization2.c 
b/clang/test/CodeGen/partial-reinitialization2.c
index e709c1d4ad1ee1..7949a69555031e 100644
--- a/clang/test/CodeGen/partial-reinitialization2.c
+++ b/clang/test/CodeGen/partial-reinitialization2.c
@@ -91,8 +91,8 @@ void test5(void)
 // CHECK-LABEL: test6
 void test6(void)
 {
-  // CHECK: [[LP:%[a-z0-9]+]] = getelementptr{{.*}}%struct.LLP2P2, ptr{{.*}}, 
i32 0, i32 0
-  // CHECK: call {{.*}}get456789(ptr {{.*}}[[LP]])
+  // CHECK: [[VAR:%[a-z0-9]+]] = alloca
+  // CHECK: call {{.*}}get456789(ptr {{.*}}sret{{.*}} [[VAR]])
 
   // CHECK: [[CALL:%[a-z0-9]+]] = call {{.*}}@get235()
   // CHECK: store{{.*}}[[CALL]], {{.*}}[[TMP0:%[a-z0-9.]+]]
diff --git a/clang/test/CodeGen/sret.c b/clang/test/CodeGen/sret.c
index 6d905e89b2c6fd..3b4914f29d2bfe 100644
--- a/clang/test/CodeGen/sret.c
+++ b/clang/test/CodeGen/sret.c
@@ -1,4 +1,5 @@
 // RUN: %clang_cc1 %s -Wno-strict-prototypes -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 %s -Wno-strict-prototypes -triple amdgcn-amd-amdhsa 
-emit-llvm -o - | FileCheck --check-prefix=NONZEROALLOCAAS %s
 
 struct abc {
  long a;
@@ -6,18 +7,28 @@ struct abc {
  long c;
  long d;
  long e;
+ long f;
+ long g;
+ long h;
+ long i;
+ long j;
 };
 
 struct abc foo1(void);
 // CHECK-DAG: declare {{.*}} @foo1(ptr dead_on_unwind writable 
sret(%struct.abc)
+// NONZEROALLOCAAS-DAG: declare {{.*}} @foo1(ptr addrspace(5) dead_on_unwind 
writable sret(%struct.abc)
 struct abc foo2();
 // CHECK-DAG: declare {{.*}} @foo2(ptr dead_on_unwind wri

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-18 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx updated 
https://github.com/llvm/llvm-project/pull/114062

>From d2d2d3d5db3f639aab178f9ca9a20db2842d2b65 Mon Sep 17 00:00:00 2001
From: Alex Voicu 
Date: Tue, 29 Oct 2024 14:20:44 +
Subject: [PATCH 01/10] `sret` args should always point to the `alloca` AS, so
 we can use that.

---
 clang/lib/CodeGen/CGCall.cpp   | 15 ---
 clang/test/CodeGen/partial-reinitialization2.c |  4 ++--
 clang/test/CodeGen/sret.c  | 11 +++
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 8f4f5d3ed81601..56acfae7ae9e51 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1672,8 +1672,7 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) {
 
   // Add type for sret argument.
   if (IRFunctionArgs.hasSRetArg()) {
-QualType Ret = FI.getReturnType();
-unsigned AddressSpace = CGM.getTypes().getTargetAddressSpace(Ret);
+unsigned AddressSpace = CGM.getDataLayout().getAllocaAddrSpace();
 ArgTypes[IRFunctionArgs.getSRetArgNo()] =
 llvm::PointerType::get(getLLVMContext(), AddressSpace);
   }
@@ -5145,7 +5144,6 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // If the call returns a temporary with struct return, create a temporary
   // alloca to hold the result, unless one is given to us.
   Address SRetPtr = Address::invalid();
-  RawAddress SRetAlloca = RawAddress::invalid();
   llvm::Value *UnusedReturnSizePtr = nullptr;
   if (RetAI.isIndirect() || RetAI.isInAlloca() || RetAI.isCoerceAndExpand()) {
 // For virtual function pointer thunks and musttail calls, we must always
@@ -5159,16 +5157,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 } else if (!ReturnValue.isNull()) {
   SRetPtr = ReturnValue.getAddress();
 } else {
-  SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca);
+  SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp");
   if (HaveInsertPoint() && ReturnValue.isUnused()) {
 llvm::TypeSize size =
 CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy));
-UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer());
+UnusedReturnSizePtr = EmitLifetimeStart(size, 
SRetPtr.getBasePointer());
   }
 }
 if (IRFunctionArgs.hasSRetArg()) {
+  // If the caller allocated the return slot, it is possible that the
+  // alloca was AS casted to the default as, so we ensure the cast is
+  // stripped before binding to the sret arg, which is in the allocaAS.
   IRCallArgs[IRFunctionArgs.getSRetArgNo()] =
-  getAsNaturalPointerTo(SRetPtr, RetTy);
+  getAsNaturalPointerTo(SRetPtr, RetTy)->stripPointerCasts();
 } else if (RetAI.isInAlloca()) {
   Address Addr =
   Builder.CreateStructGEP(ArgMemory, RetAI.getInAllocaFieldIndex());
@@ -5740,7 +5741,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // pop this cleanup later on. Being eager about this is OK, since this
   // temporary is 'invisible' outside of the callee.
   if (UnusedReturnSizePtr)
-pushFullExprCleanup(NormalEHLifetimeMarker, SRetAlloca,
+pushFullExprCleanup(NormalEHLifetimeMarker, SRetPtr,
  UnusedReturnSizePtr);
 
   llvm::BasicBlock *InvokeDest = CannotThrow ? nullptr : getInvokeDest();
diff --git a/clang/test/CodeGen/partial-reinitialization2.c 
b/clang/test/CodeGen/partial-reinitialization2.c
index e709c1d4ad1ee1..7949a69555031e 100644
--- a/clang/test/CodeGen/partial-reinitialization2.c
+++ b/clang/test/CodeGen/partial-reinitialization2.c
@@ -91,8 +91,8 @@ void test5(void)
 // CHECK-LABEL: test6
 void test6(void)
 {
-  // CHECK: [[LP:%[a-z0-9]+]] = getelementptr{{.*}}%struct.LLP2P2, ptr{{.*}}, 
i32 0, i32 0
-  // CHECK: call {{.*}}get456789(ptr {{.*}}[[LP]])
+  // CHECK: [[VAR:%[a-z0-9]+]] = alloca
+  // CHECK: call {{.*}}get456789(ptr {{.*}}sret{{.*}} [[VAR]])
 
   // CHECK: [[CALL:%[a-z0-9]+]] = call {{.*}}@get235()
   // CHECK: store{{.*}}[[CALL]], {{.*}}[[TMP0:%[a-z0-9.]+]]
diff --git a/clang/test/CodeGen/sret.c b/clang/test/CodeGen/sret.c
index 6d905e89b2c6fd..3b4914f29d2bfe 100644
--- a/clang/test/CodeGen/sret.c
+++ b/clang/test/CodeGen/sret.c
@@ -1,4 +1,5 @@
 // RUN: %clang_cc1 %s -Wno-strict-prototypes -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 %s -Wno-strict-prototypes -triple amdgcn-amd-amdhsa 
-emit-llvm -o - | FileCheck --check-prefix=NONZEROALLOCAAS %s
 
 struct abc {
  long a;
@@ -6,18 +7,28 @@ struct abc {
  long c;
  long d;
  long e;
+ long f;
+ long g;
+ long h;
+ long i;
+ long j;
 };
 
 struct abc foo1(void);
 // CHECK-DAG: declare {{.*}} @foo1(ptr dead_on_unwind writable 
sret(%struct.abc)
+// NONZEROALLOCAAS-DAG: declare {{.*}} @foo1(ptr addrspace(5) dead_on_unwind 
writable sret(%struct.abc)
 struct abc foo2();
 // CHECK-DAG: declare {{.*}} @foo2(ptr dead_on_unwind wri

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-15 Thread Alex Voicu via cfe-commits


@@ -5390,11 +5390,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 V->getType()->isIntegerTy())
   V = Builder.CreateZExt(V, ArgInfo.getCoerceToType());
 
-// If the argument doesn't match, perform a bitcast to coerce it.  This
-// can happen due to trivial type mismatches.
+// If the argument doesn't match, we are either trying to pass an
+// alloca-ed sret argument directly, and the alloca AS does not match
+// the default AS, case in which we AS cast it, or we have a trivial
+// type mismatch, and thus perform a bitcast to coerce it.
 if (FirstIRArg < IRFuncTy->getNumParams() &&
-V->getType() != IRFuncTy->getParamType(FirstIRArg))
-  V = Builder.CreateBitCast(V, IRFuncTy->getParamType(FirstIRArg));
+V->getType() != IRFuncTy->getParamType(FirstIRArg)) {
+  auto IRTy = IRFuncTy->getParamType(FirstIRArg);
+  auto MaybeSRetArg = dyn_cast_or_null(V);
+  if (MaybeSRetArg && MaybeSRetArg->hasStructRetAttr())
+V = Builder.CreateAddrSpaceCast(V, IRTy);

AlexVlx wrote:

OK, I *think* that I've found a possibly acceptable middle ground (both for 
this and your other objection). Note that I am not rejecting the fact that we 
probably want `LangAS`es threaded through the indirect interfaces that deal 
with ASes, but it gives me a bit of trepidation to do it as part of this PR. 
I'd prefer to wrap this up, as it blocks some other work, and then open a 
separate PR/discussion around re-doing the interfaces.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-15 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx updated 
https://github.com/llvm/llvm-project/pull/114062

>From d2d2d3d5db3f639aab178f9ca9a20db2842d2b65 Mon Sep 17 00:00:00 2001
From: Alex Voicu 
Date: Tue, 29 Oct 2024 14:20:44 +
Subject: [PATCH 01/10] `sret` args should always point to the `alloca` AS, so
 we can use that.

---
 clang/lib/CodeGen/CGCall.cpp   | 15 ---
 clang/test/CodeGen/partial-reinitialization2.c |  4 ++--
 clang/test/CodeGen/sret.c  | 11 +++
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 8f4f5d3ed81601..56acfae7ae9e51 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1672,8 +1672,7 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) {
 
   // Add type for sret argument.
   if (IRFunctionArgs.hasSRetArg()) {
-QualType Ret = FI.getReturnType();
-unsigned AddressSpace = CGM.getTypes().getTargetAddressSpace(Ret);
+unsigned AddressSpace = CGM.getDataLayout().getAllocaAddrSpace();
 ArgTypes[IRFunctionArgs.getSRetArgNo()] =
 llvm::PointerType::get(getLLVMContext(), AddressSpace);
   }
@@ -5145,7 +5144,6 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // If the call returns a temporary with struct return, create a temporary
   // alloca to hold the result, unless one is given to us.
   Address SRetPtr = Address::invalid();
-  RawAddress SRetAlloca = RawAddress::invalid();
   llvm::Value *UnusedReturnSizePtr = nullptr;
   if (RetAI.isIndirect() || RetAI.isInAlloca() || RetAI.isCoerceAndExpand()) {
 // For virtual function pointer thunks and musttail calls, we must always
@@ -5159,16 +5157,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 } else if (!ReturnValue.isNull()) {
   SRetPtr = ReturnValue.getAddress();
 } else {
-  SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca);
+  SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp");
   if (HaveInsertPoint() && ReturnValue.isUnused()) {
 llvm::TypeSize size =
 CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy));
-UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer());
+UnusedReturnSizePtr = EmitLifetimeStart(size, 
SRetPtr.getBasePointer());
   }
 }
 if (IRFunctionArgs.hasSRetArg()) {
+  // If the caller allocated the return slot, it is possible that the
+  // alloca was AS casted to the default as, so we ensure the cast is
+  // stripped before binding to the sret arg, which is in the allocaAS.
   IRCallArgs[IRFunctionArgs.getSRetArgNo()] =
-  getAsNaturalPointerTo(SRetPtr, RetTy);
+  getAsNaturalPointerTo(SRetPtr, RetTy)->stripPointerCasts();
 } else if (RetAI.isInAlloca()) {
   Address Addr =
   Builder.CreateStructGEP(ArgMemory, RetAI.getInAllocaFieldIndex());
@@ -5740,7 +5741,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // pop this cleanup later on. Being eager about this is OK, since this
   // temporary is 'invisible' outside of the callee.
   if (UnusedReturnSizePtr)
-pushFullExprCleanup(NormalEHLifetimeMarker, SRetAlloca,
+pushFullExprCleanup(NormalEHLifetimeMarker, SRetPtr,
  UnusedReturnSizePtr);
 
   llvm::BasicBlock *InvokeDest = CannotThrow ? nullptr : getInvokeDest();
diff --git a/clang/test/CodeGen/partial-reinitialization2.c 
b/clang/test/CodeGen/partial-reinitialization2.c
index e709c1d4ad1ee1..7949a69555031e 100644
--- a/clang/test/CodeGen/partial-reinitialization2.c
+++ b/clang/test/CodeGen/partial-reinitialization2.c
@@ -91,8 +91,8 @@ void test5(void)
 // CHECK-LABEL: test6
 void test6(void)
 {
-  // CHECK: [[LP:%[a-z0-9]+]] = getelementptr{{.*}}%struct.LLP2P2, ptr{{.*}}, 
i32 0, i32 0
-  // CHECK: call {{.*}}get456789(ptr {{.*}}[[LP]])
+  // CHECK: [[VAR:%[a-z0-9]+]] = alloca
+  // CHECK: call {{.*}}get456789(ptr {{.*}}sret{{.*}} [[VAR]])
 
   // CHECK: [[CALL:%[a-z0-9]+]] = call {{.*}}@get235()
   // CHECK: store{{.*}}[[CALL]], {{.*}}[[TMP0:%[a-z0-9.]+]]
diff --git a/clang/test/CodeGen/sret.c b/clang/test/CodeGen/sret.c
index 6d905e89b2c6fd..3b4914f29d2bfe 100644
--- a/clang/test/CodeGen/sret.c
+++ b/clang/test/CodeGen/sret.c
@@ -1,4 +1,5 @@
 // RUN: %clang_cc1 %s -Wno-strict-prototypes -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 %s -Wno-strict-prototypes -triple amdgcn-amd-amdhsa 
-emit-llvm -o - | FileCheck --check-prefix=NONZEROALLOCAAS %s
 
 struct abc {
  long a;
@@ -6,18 +7,28 @@ struct abc {
  long c;
  long d;
  long e;
+ long f;
+ long g;
+ long h;
+ long i;
+ long j;
 };
 
 struct abc foo1(void);
 // CHECK-DAG: declare {{.*}} @foo1(ptr dead_on_unwind writable 
sret(%struct.abc)
+// NONZEROALLOCAAS-DAG: declare {{.*}} @foo1(ptr addrspace(5) dead_on_unwind 
writable sret(%struct.abc)
 struct abc foo2();
 // CHECK-DAG: declare {{.*}} @foo2(ptr dead_on_unwind wri

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-15 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx updated 
https://github.com/llvm/llvm-project/pull/114062

>From d2d2d3d5db3f639aab178f9ca9a20db2842d2b65 Mon Sep 17 00:00:00 2001
From: Alex Voicu 
Date: Tue, 29 Oct 2024 14:20:44 +
Subject: [PATCH 1/9] `sret` args should always point to the `alloca` AS, so we
 can use that.

---
 clang/lib/CodeGen/CGCall.cpp   | 15 ---
 clang/test/CodeGen/partial-reinitialization2.c |  4 ++--
 clang/test/CodeGen/sret.c  | 11 +++
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 8f4f5d3ed81601..56acfae7ae9e51 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1672,8 +1672,7 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) {
 
   // Add type for sret argument.
   if (IRFunctionArgs.hasSRetArg()) {
-QualType Ret = FI.getReturnType();
-unsigned AddressSpace = CGM.getTypes().getTargetAddressSpace(Ret);
+unsigned AddressSpace = CGM.getDataLayout().getAllocaAddrSpace();
 ArgTypes[IRFunctionArgs.getSRetArgNo()] =
 llvm::PointerType::get(getLLVMContext(), AddressSpace);
   }
@@ -5145,7 +5144,6 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // If the call returns a temporary with struct return, create a temporary
   // alloca to hold the result, unless one is given to us.
   Address SRetPtr = Address::invalid();
-  RawAddress SRetAlloca = RawAddress::invalid();
   llvm::Value *UnusedReturnSizePtr = nullptr;
   if (RetAI.isIndirect() || RetAI.isInAlloca() || RetAI.isCoerceAndExpand()) {
 // For virtual function pointer thunks and musttail calls, we must always
@@ -5159,16 +5157,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 } else if (!ReturnValue.isNull()) {
   SRetPtr = ReturnValue.getAddress();
 } else {
-  SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca);
+  SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp");
   if (HaveInsertPoint() && ReturnValue.isUnused()) {
 llvm::TypeSize size =
 CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy));
-UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer());
+UnusedReturnSizePtr = EmitLifetimeStart(size, 
SRetPtr.getBasePointer());
   }
 }
 if (IRFunctionArgs.hasSRetArg()) {
+  // If the caller allocated the return slot, it is possible that the
+  // alloca was AS casted to the default as, so we ensure the cast is
+  // stripped before binding to the sret arg, which is in the allocaAS.
   IRCallArgs[IRFunctionArgs.getSRetArgNo()] =
-  getAsNaturalPointerTo(SRetPtr, RetTy);
+  getAsNaturalPointerTo(SRetPtr, RetTy)->stripPointerCasts();
 } else if (RetAI.isInAlloca()) {
   Address Addr =
   Builder.CreateStructGEP(ArgMemory, RetAI.getInAllocaFieldIndex());
@@ -5740,7 +5741,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // pop this cleanup later on. Being eager about this is OK, since this
   // temporary is 'invisible' outside of the callee.
   if (UnusedReturnSizePtr)
-pushFullExprCleanup(NormalEHLifetimeMarker, SRetAlloca,
+pushFullExprCleanup(NormalEHLifetimeMarker, SRetPtr,
  UnusedReturnSizePtr);
 
   llvm::BasicBlock *InvokeDest = CannotThrow ? nullptr : getInvokeDest();
diff --git a/clang/test/CodeGen/partial-reinitialization2.c 
b/clang/test/CodeGen/partial-reinitialization2.c
index e709c1d4ad1ee1..7949a69555031e 100644
--- a/clang/test/CodeGen/partial-reinitialization2.c
+++ b/clang/test/CodeGen/partial-reinitialization2.c
@@ -91,8 +91,8 @@ void test5(void)
 // CHECK-LABEL: test6
 void test6(void)
 {
-  // CHECK: [[LP:%[a-z0-9]+]] = getelementptr{{.*}}%struct.LLP2P2, ptr{{.*}}, 
i32 0, i32 0
-  // CHECK: call {{.*}}get456789(ptr {{.*}}[[LP]])
+  // CHECK: [[VAR:%[a-z0-9]+]] = alloca
+  // CHECK: call {{.*}}get456789(ptr {{.*}}sret{{.*}} [[VAR]])
 
   // CHECK: [[CALL:%[a-z0-9]+]] = call {{.*}}@get235()
   // CHECK: store{{.*}}[[CALL]], {{.*}}[[TMP0:%[a-z0-9.]+]]
diff --git a/clang/test/CodeGen/sret.c b/clang/test/CodeGen/sret.c
index 6d905e89b2c6fd..3b4914f29d2bfe 100644
--- a/clang/test/CodeGen/sret.c
+++ b/clang/test/CodeGen/sret.c
@@ -1,4 +1,5 @@
 // RUN: %clang_cc1 %s -Wno-strict-prototypes -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 %s -Wno-strict-prototypes -triple amdgcn-amd-amdhsa 
-emit-llvm -o - | FileCheck --check-prefix=NONZEROALLOCAAS %s
 
 struct abc {
  long a;
@@ -6,18 +7,28 @@ struct abc {
  long c;
  long d;
  long e;
+ long f;
+ long g;
+ long h;
+ long i;
+ long j;
 };
 
 struct abc foo1(void);
 // CHECK-DAG: declare {{.*}} @foo1(ptr dead_on_unwind writable 
sret(%struct.abc)
+// NONZEROALLOCAAS-DAG: declare {{.*}} @foo1(ptr addrspace(5) dead_on_unwind 
writable sret(%struct.abc)
 struct abc foo2();
 // CHECK-DAG: declare {{.*}} @foo2(ptr dead_on_unwind writa

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-14 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx edited 
https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-14 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx edited 
https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-14 Thread Alex Voicu via cfe-commits


@@ -5159,16 +5156,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 } else if (!ReturnValue.isNull()) {
   SRetPtr = ReturnValue.getAddress();
 } else {
-  SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca);
+  SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp");
   if (HaveInsertPoint() && ReturnValue.isUnused()) {
 llvm::TypeSize size =
 CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy));
-UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer());
+UnusedReturnSizePtr = EmitLifetimeStart(size, 
SRetPtr.getBasePointer());
   }
 }
 if (IRFunctionArgs.hasSRetArg()) {
+  // If the caller allocated the return slot, it is possible that the
+  // alloca was AS casted to the default as, so we ensure the cast is
+  // stripped before binding to the sret arg, which is in the allocaAS.

AlexVlx wrote:

It's not really blind, this is actually captured in current tests e.g. 
`CodeGen/sret.c`, please see: . This 
currently works because `sret` gets arbitrarily placed in the default AS, 
switching it over to anything but will break it. This happens when we receive a 
pre-`alloca`ed return value slot, which gets created in 
`AggExprEmitter::withReturnValueSlot` iff we cannot elide the temporary. This 
uses `CreateMemTemp` which inserts a cast to the default AS. An alternative 
would be to instead use `CreateMemTempWithoutCast`.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-14 Thread John McCall via cfe-commits


@@ -5159,16 +5156,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 } else if (!ReturnValue.isNull()) {
   SRetPtr = ReturnValue.getAddress();
 } else {
-  SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca);
+  SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp");
   if (HaveInsertPoint() && ReturnValue.isUnused()) {
 llvm::TypeSize size =
 CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy));
-UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer());
+UnusedReturnSizePtr = EmitLifetimeStart(size, 
SRetPtr.getBasePointer());
   }
 }
 if (IRFunctionArgs.hasSRetArg()) {
+  // If the caller allocated the return slot, it is possible that the
+  // alloca was AS casted to the default as, so we ensure the cast is
+  // stripped before binding to the sret arg, which is in the allocaAS.

rjmccall wrote:

Sorry, what?  It seems really wrong to be blindly stripping pointer casts here. 
 Can you explain what code pattern is leading to us not having a pointer in the 
right address space?

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-14 Thread John McCall via cfe-commits


@@ -5390,11 +5390,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 V->getType()->isIntegerTy())
   V = Builder.CreateZExt(V, ArgInfo.getCoerceToType());
 
-// If the argument doesn't match, perform a bitcast to coerce it.  This
-// can happen due to trivial type mismatches.
+// If the argument doesn't match, we are either trying to pass an
+// alloca-ed sret argument directly, and the alloca AS does not match
+// the default AS, case in which we AS cast it, or we have a trivial
+// type mismatch, and thus perform a bitcast to coerce it.
 if (FirstIRArg < IRFuncTy->getNumParams() &&
-V->getType() != IRFuncTy->getParamType(FirstIRArg))
-  V = Builder.CreateBitCast(V, IRFuncTy->getParamType(FirstIRArg));
+V->getType() != IRFuncTy->getParamType(FirstIRArg)) {
+  auto IRTy = IRFuncTy->getParamType(FirstIRArg);
+  auto MaybeSRetArg = dyn_cast_or_null(V);
+  if (MaybeSRetArg && MaybeSRetArg->hasStructRetAttr())
+V = Builder.CreateAddrSpaceCast(V, IRTy);

rjmccall wrote:

This is the prevailing existing practice in Clang CodeGen; you'll notice we do 
the same thing in `CreateTempAlloca`.  We are trying to allow targets to 
completely own the lowering of address spaces to IR.  The idea is that targets 
may want to distinguish address spaces in the frontend without distinguishing 
them in the backend, or they may decide that they need the address space 
conversion operation to be more complex than a simple IR `addrspacecast`.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-14 Thread Alex Voicu via cfe-commits


@@ -5390,11 +5390,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 V->getType()->isIntegerTy())
   V = Builder.CreateZExt(V, ArgInfo.getCoerceToType());
 
-// If the argument doesn't match, perform a bitcast to coerce it.  This
-// can happen due to trivial type mismatches.
+// If the argument doesn't match, we are either trying to pass an
+// alloca-ed sret argument directly, and the alloca AS does not match
+// the default AS, case in which we AS cast it, or we have a trivial
+// type mismatch, and thus perform a bitcast to coerce it.
 if (FirstIRArg < IRFuncTy->getNumParams() &&
-V->getType() != IRFuncTy->getParamType(FirstIRArg))
-  V = Builder.CreateBitCast(V, IRFuncTy->getParamType(FirstIRArg));
+V->getType() != IRFuncTy->getParamType(FirstIRArg)) {
+  auto IRTy = IRFuncTy->getParamType(FirstIRArg);
+  auto MaybeSRetArg = dyn_cast_or_null(V);
+  if (MaybeSRetArg && MaybeSRetArg->hasStructRetAttr())
+V = Builder.CreateAddrSpaceCast(V, IRTy);

AlexVlx wrote:

That's fair, but I'm not entirely sure that isn't simply excessive here - we 
already have the types, and the only mismatch for `sret` can be in the AS, I 
believe; reverting to LangAS from target ASes seems a bit roundabout. I think 
@arsenm had a related objection to this cast being unconditional, which I 
haven't handled.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-14 Thread John McCall via cfe-commits


@@ -5390,11 +5390,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 V->getType()->isIntegerTy())
   V = Builder.CreateZExt(V, ArgInfo.getCoerceToType());
 
-// If the argument doesn't match, perform a bitcast to coerce it.  This
-// can happen due to trivial type mismatches.
+// If the argument doesn't match, we are either trying to pass an
+// alloca-ed sret argument directly, and the alloca AS does not match
+// the default AS, case in which we AS cast it, or we have a trivial
+// type mismatch, and thus perform a bitcast to coerce it.
 if (FirstIRArg < IRFuncTy->getNumParams() &&
-V->getType() != IRFuncTy->getParamType(FirstIRArg))
-  V = Builder.CreateBitCast(V, IRFuncTy->getParamType(FirstIRArg));
+V->getType() != IRFuncTy->getParamType(FirstIRArg)) {
+  auto IRTy = IRFuncTy->getParamType(FirstIRArg);
+  auto MaybeSRetArg = dyn_cast_or_null(V);
+  if (MaybeSRetArg && MaybeSRetArg->hasStructRetAttr())
+V = Builder.CreateAddrSpaceCast(V, IRTy);

rjmccall wrote:

As a general rule, we try to use the target hook to perform address-space 
conversions.  That target hook is expressed in terms of AST address spaces, 
which is one reason I think we need to thread a LangAS through.  If we need to 
do the same for the other indirect cases, so be it.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-13 Thread Alex Voicu via cfe-commits

AlexVlx wrote:

Gentle ping - I'd like to progress this in one form or another (target hook or 
extending indirect to carry an AS). If there are other ideas / strong 
preferences / over-my-dead-body objections, they'd be more than welcome.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-13 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx updated 
https://github.com/llvm/llvm-project/pull/114062

>From d2d2d3d5db3f639aab178f9ca9a20db2842d2b65 Mon Sep 17 00:00:00 2001
From: Alex Voicu 
Date: Tue, 29 Oct 2024 14:20:44 +
Subject: [PATCH 1/8] `sret` args should always point to the `alloca` AS, so we
 can use that.

---
 clang/lib/CodeGen/CGCall.cpp   | 15 ---
 clang/test/CodeGen/partial-reinitialization2.c |  4 ++--
 clang/test/CodeGen/sret.c  | 11 +++
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 8f4f5d3ed81601..56acfae7ae9e51 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1672,8 +1672,7 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) {
 
   // Add type for sret argument.
   if (IRFunctionArgs.hasSRetArg()) {
-QualType Ret = FI.getReturnType();
-unsigned AddressSpace = CGM.getTypes().getTargetAddressSpace(Ret);
+unsigned AddressSpace = CGM.getDataLayout().getAllocaAddrSpace();
 ArgTypes[IRFunctionArgs.getSRetArgNo()] =
 llvm::PointerType::get(getLLVMContext(), AddressSpace);
   }
@@ -5145,7 +5144,6 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // If the call returns a temporary with struct return, create a temporary
   // alloca to hold the result, unless one is given to us.
   Address SRetPtr = Address::invalid();
-  RawAddress SRetAlloca = RawAddress::invalid();
   llvm::Value *UnusedReturnSizePtr = nullptr;
   if (RetAI.isIndirect() || RetAI.isInAlloca() || RetAI.isCoerceAndExpand()) {
 // For virtual function pointer thunks and musttail calls, we must always
@@ -5159,16 +5157,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 } else if (!ReturnValue.isNull()) {
   SRetPtr = ReturnValue.getAddress();
 } else {
-  SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca);
+  SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp");
   if (HaveInsertPoint() && ReturnValue.isUnused()) {
 llvm::TypeSize size =
 CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy));
-UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer());
+UnusedReturnSizePtr = EmitLifetimeStart(size, 
SRetPtr.getBasePointer());
   }
 }
 if (IRFunctionArgs.hasSRetArg()) {
+  // If the caller allocated the return slot, it is possible that the
+  // alloca was AS casted to the default as, so we ensure the cast is
+  // stripped before binding to the sret arg, which is in the allocaAS.
   IRCallArgs[IRFunctionArgs.getSRetArgNo()] =
-  getAsNaturalPointerTo(SRetPtr, RetTy);
+  getAsNaturalPointerTo(SRetPtr, RetTy)->stripPointerCasts();
 } else if (RetAI.isInAlloca()) {
   Address Addr =
   Builder.CreateStructGEP(ArgMemory, RetAI.getInAllocaFieldIndex());
@@ -5740,7 +5741,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // pop this cleanup later on. Being eager about this is OK, since this
   // temporary is 'invisible' outside of the callee.
   if (UnusedReturnSizePtr)
-pushFullExprCleanup(NormalEHLifetimeMarker, SRetAlloca,
+pushFullExprCleanup(NormalEHLifetimeMarker, SRetPtr,
  UnusedReturnSizePtr);
 
   llvm::BasicBlock *InvokeDest = CannotThrow ? nullptr : getInvokeDest();
diff --git a/clang/test/CodeGen/partial-reinitialization2.c 
b/clang/test/CodeGen/partial-reinitialization2.c
index e709c1d4ad1ee1..7949a69555031e 100644
--- a/clang/test/CodeGen/partial-reinitialization2.c
+++ b/clang/test/CodeGen/partial-reinitialization2.c
@@ -91,8 +91,8 @@ void test5(void)
 // CHECK-LABEL: test6
 void test6(void)
 {
-  // CHECK: [[LP:%[a-z0-9]+]] = getelementptr{{.*}}%struct.LLP2P2, ptr{{.*}}, 
i32 0, i32 0
-  // CHECK: call {{.*}}get456789(ptr {{.*}}[[LP]])
+  // CHECK: [[VAR:%[a-z0-9]+]] = alloca
+  // CHECK: call {{.*}}get456789(ptr {{.*}}sret{{.*}} [[VAR]])
 
   // CHECK: [[CALL:%[a-z0-9]+]] = call {{.*}}@get235()
   // CHECK: store{{.*}}[[CALL]], {{.*}}[[TMP0:%[a-z0-9.]+]]
diff --git a/clang/test/CodeGen/sret.c b/clang/test/CodeGen/sret.c
index 6d905e89b2c6fd..3b4914f29d2bfe 100644
--- a/clang/test/CodeGen/sret.c
+++ b/clang/test/CodeGen/sret.c
@@ -1,4 +1,5 @@
 // RUN: %clang_cc1 %s -Wno-strict-prototypes -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 %s -Wno-strict-prototypes -triple amdgcn-amd-amdhsa 
-emit-llvm -o - | FileCheck --check-prefix=NONZEROALLOCAAS %s
 
 struct abc {
  long a;
@@ -6,18 +7,28 @@ struct abc {
  long c;
  long d;
  long e;
+ long f;
+ long g;
+ long h;
+ long i;
+ long j;
 };
 
 struct abc foo1(void);
 // CHECK-DAG: declare {{.*}} @foo1(ptr dead_on_unwind writable 
sret(%struct.abc)
+// NONZEROALLOCAAS-DAG: declare {{.*}} @foo1(ptr addrspace(5) dead_on_unwind 
writable sret(%struct.abc)
 struct abc foo2();
 // CHECK-DAG: declare {{.*}} @foo2(ptr dead_on_unwind writa

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-07 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx updated 
https://github.com/llvm/llvm-project/pull/114062

>From d2d2d3d5db3f639aab178f9ca9a20db2842d2b65 Mon Sep 17 00:00:00 2001
From: Alex Voicu 
Date: Tue, 29 Oct 2024 14:20:44 +
Subject: [PATCH 1/8] `sret` args should always point to the `alloca` AS, so we
 can use that.

---
 clang/lib/CodeGen/CGCall.cpp   | 15 ---
 clang/test/CodeGen/partial-reinitialization2.c |  4 ++--
 clang/test/CodeGen/sret.c  | 11 +++
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 8f4f5d3ed81601..56acfae7ae9e51 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1672,8 +1672,7 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) {
 
   // Add type for sret argument.
   if (IRFunctionArgs.hasSRetArg()) {
-QualType Ret = FI.getReturnType();
-unsigned AddressSpace = CGM.getTypes().getTargetAddressSpace(Ret);
+unsigned AddressSpace = CGM.getDataLayout().getAllocaAddrSpace();
 ArgTypes[IRFunctionArgs.getSRetArgNo()] =
 llvm::PointerType::get(getLLVMContext(), AddressSpace);
   }
@@ -5145,7 +5144,6 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // If the call returns a temporary with struct return, create a temporary
   // alloca to hold the result, unless one is given to us.
   Address SRetPtr = Address::invalid();
-  RawAddress SRetAlloca = RawAddress::invalid();
   llvm::Value *UnusedReturnSizePtr = nullptr;
   if (RetAI.isIndirect() || RetAI.isInAlloca() || RetAI.isCoerceAndExpand()) {
 // For virtual function pointer thunks and musttail calls, we must always
@@ -5159,16 +5157,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 } else if (!ReturnValue.isNull()) {
   SRetPtr = ReturnValue.getAddress();
 } else {
-  SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca);
+  SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp");
   if (HaveInsertPoint() && ReturnValue.isUnused()) {
 llvm::TypeSize size =
 CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy));
-UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer());
+UnusedReturnSizePtr = EmitLifetimeStart(size, 
SRetPtr.getBasePointer());
   }
 }
 if (IRFunctionArgs.hasSRetArg()) {
+  // If the caller allocated the return slot, it is possible that the
+  // alloca was AS casted to the default as, so we ensure the cast is
+  // stripped before binding to the sret arg, which is in the allocaAS.
   IRCallArgs[IRFunctionArgs.getSRetArgNo()] =
-  getAsNaturalPointerTo(SRetPtr, RetTy);
+  getAsNaturalPointerTo(SRetPtr, RetTy)->stripPointerCasts();
 } else if (RetAI.isInAlloca()) {
   Address Addr =
   Builder.CreateStructGEP(ArgMemory, RetAI.getInAllocaFieldIndex());
@@ -5740,7 +5741,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // pop this cleanup later on. Being eager about this is OK, since this
   // temporary is 'invisible' outside of the callee.
   if (UnusedReturnSizePtr)
-pushFullExprCleanup(NormalEHLifetimeMarker, SRetAlloca,
+pushFullExprCleanup(NormalEHLifetimeMarker, SRetPtr,
  UnusedReturnSizePtr);
 
   llvm::BasicBlock *InvokeDest = CannotThrow ? nullptr : getInvokeDest();
diff --git a/clang/test/CodeGen/partial-reinitialization2.c 
b/clang/test/CodeGen/partial-reinitialization2.c
index e709c1d4ad1ee1..7949a69555031e 100644
--- a/clang/test/CodeGen/partial-reinitialization2.c
+++ b/clang/test/CodeGen/partial-reinitialization2.c
@@ -91,8 +91,8 @@ void test5(void)
 // CHECK-LABEL: test6
 void test6(void)
 {
-  // CHECK: [[LP:%[a-z0-9]+]] = getelementptr{{.*}}%struct.LLP2P2, ptr{{.*}}, 
i32 0, i32 0
-  // CHECK: call {{.*}}get456789(ptr {{.*}}[[LP]])
+  // CHECK: [[VAR:%[a-z0-9]+]] = alloca
+  // CHECK: call {{.*}}get456789(ptr {{.*}}sret{{.*}} [[VAR]])
 
   // CHECK: [[CALL:%[a-z0-9]+]] = call {{.*}}@get235()
   // CHECK: store{{.*}}[[CALL]], {{.*}}[[TMP0:%[a-z0-9.]+]]
diff --git a/clang/test/CodeGen/sret.c b/clang/test/CodeGen/sret.c
index 6d905e89b2c6fd..3b4914f29d2bfe 100644
--- a/clang/test/CodeGen/sret.c
+++ b/clang/test/CodeGen/sret.c
@@ -1,4 +1,5 @@
 // RUN: %clang_cc1 %s -Wno-strict-prototypes -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 %s -Wno-strict-prototypes -triple amdgcn-amd-amdhsa 
-emit-llvm -o - | FileCheck --check-prefix=NONZEROALLOCAAS %s
 
 struct abc {
  long a;
@@ -6,18 +7,28 @@ struct abc {
  long c;
  long d;
  long e;
+ long f;
+ long g;
+ long h;
+ long i;
+ long j;
 };
 
 struct abc foo1(void);
 // CHECK-DAG: declare {{.*}} @foo1(ptr dead_on_unwind writable 
sret(%struct.abc)
+// NONZEROALLOCAAS-DAG: declare {{.*}} @foo1(ptr addrspace(5) dead_on_unwind 
writable sret(%struct.abc)
 struct abc foo2();
 // CHECK-DAG: declare {{.*}} @foo2(ptr dead_on_unwind writa

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-06 Thread Alex Voicu via cfe-commits


@@ -1780,6 +1780,14 @@ class TargetInfo : public TransferrableTargetInfo,
 return 0;
   }
 
+  /// \returns Target specific address space for indirect (e.g. sret) 
arguments.
+  /// If such an address space exists, it must be convertible to and from the
+  /// alloca address space. If it does not, std::nullopt is returned and the
+  /// alloca address space will be used.
+  virtual std::optional getIndirectArgAddressSpace() const {

AlexVlx wrote:

Err, I forgot to delete this. RE: `unsigned` vs `LangAS`, I used the former for 
symmetry with other interfaces. I agree with @rjmccall that `LangAS` would be 
safer / would make more sense, however `getIndirectAliased` already uses 
numeric vs typed.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-06 Thread Matt Arsenault via cfe-commits


@@ -1780,6 +1780,14 @@ class TargetInfo : public TransferrableTargetInfo,
 return 0;
   }
 
+  /// \returns Target specific address space for indirect (e.g. sret) 
arguments.
+  /// If such an address space exists, it must be convertible to and from the
+  /// alloca address space. If it does not, std::nullopt is returned and the
+  /// alloca address space will be used.
+  virtual std::optional getIndirectArgAddressSpace() const {

arsenm wrote:

This also shouldn't be optional. There always must be a definitive IR address 
space.

Also I'm not sure I follow why this is still necessary if you've modified 
getIndirect to carry the address space. RetAI should have this info now? 

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-06 Thread John McCall via cfe-commits


@@ -1780,6 +1780,14 @@ class TargetInfo : public TransferrableTargetInfo,
 return 0;
   }
 
+  /// \returns Target specific address space for indirect (e.g. sret) 
arguments.
+  /// If such an address space exists, it must be convertible to and from the
+  /// alloca address space. If it does not, std::nullopt is returned and the
+  /// alloca address space will be used.
+  virtual std::optional getIndirectArgAddressSpace() const {

rjmccall wrote:

I would expect this to be a `LangAS`, since that's what our address-space 
conversion lowerings are generally expressed in terms of.  This also has the 
advantage of avoiding a *lot* of heartache with implicit conversions around 
`ABIInfo::getIndirect`, since `LangAS` is a scoped enum.  And `LangAS::Default` 
is a much more reasonable default argument for things like 
`ABIArgInfo::getIndirect` than IR addrspace 0.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-05 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx updated 
https://github.com/llvm/llvm-project/pull/114062

>From d2d2d3d5db3f639aab178f9ca9a20db2842d2b65 Mon Sep 17 00:00:00 2001
From: Alex Voicu 
Date: Tue, 29 Oct 2024 14:20:44 +
Subject: [PATCH 1/5] `sret` args should always point to the `alloca` AS, so we
 can use that.

---
 clang/lib/CodeGen/CGCall.cpp   | 15 ---
 clang/test/CodeGen/partial-reinitialization2.c |  4 ++--
 clang/test/CodeGen/sret.c  | 11 +++
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 8f4f5d3ed81601..56acfae7ae9e51 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1672,8 +1672,7 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) {
 
   // Add type for sret argument.
   if (IRFunctionArgs.hasSRetArg()) {
-QualType Ret = FI.getReturnType();
-unsigned AddressSpace = CGM.getTypes().getTargetAddressSpace(Ret);
+unsigned AddressSpace = CGM.getDataLayout().getAllocaAddrSpace();
 ArgTypes[IRFunctionArgs.getSRetArgNo()] =
 llvm::PointerType::get(getLLVMContext(), AddressSpace);
   }
@@ -5145,7 +5144,6 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // If the call returns a temporary with struct return, create a temporary
   // alloca to hold the result, unless one is given to us.
   Address SRetPtr = Address::invalid();
-  RawAddress SRetAlloca = RawAddress::invalid();
   llvm::Value *UnusedReturnSizePtr = nullptr;
   if (RetAI.isIndirect() || RetAI.isInAlloca() || RetAI.isCoerceAndExpand()) {
 // For virtual function pointer thunks and musttail calls, we must always
@@ -5159,16 +5157,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 } else if (!ReturnValue.isNull()) {
   SRetPtr = ReturnValue.getAddress();
 } else {
-  SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca);
+  SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp");
   if (HaveInsertPoint() && ReturnValue.isUnused()) {
 llvm::TypeSize size =
 CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy));
-UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer());
+UnusedReturnSizePtr = EmitLifetimeStart(size, 
SRetPtr.getBasePointer());
   }
 }
 if (IRFunctionArgs.hasSRetArg()) {
+  // If the caller allocated the return slot, it is possible that the
+  // alloca was AS casted to the default as, so we ensure the cast is
+  // stripped before binding to the sret arg, which is in the allocaAS.
   IRCallArgs[IRFunctionArgs.getSRetArgNo()] =
-  getAsNaturalPointerTo(SRetPtr, RetTy);
+  getAsNaturalPointerTo(SRetPtr, RetTy)->stripPointerCasts();
 } else if (RetAI.isInAlloca()) {
   Address Addr =
   Builder.CreateStructGEP(ArgMemory, RetAI.getInAllocaFieldIndex());
@@ -5740,7 +5741,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // pop this cleanup later on. Being eager about this is OK, since this
   // temporary is 'invisible' outside of the callee.
   if (UnusedReturnSizePtr)
-pushFullExprCleanup(NormalEHLifetimeMarker, SRetAlloca,
+pushFullExprCleanup(NormalEHLifetimeMarker, SRetPtr,
  UnusedReturnSizePtr);
 
   llvm::BasicBlock *InvokeDest = CannotThrow ? nullptr : getInvokeDest();
diff --git a/clang/test/CodeGen/partial-reinitialization2.c 
b/clang/test/CodeGen/partial-reinitialization2.c
index e709c1d4ad1ee1..7949a69555031e 100644
--- a/clang/test/CodeGen/partial-reinitialization2.c
+++ b/clang/test/CodeGen/partial-reinitialization2.c
@@ -91,8 +91,8 @@ void test5(void)
 // CHECK-LABEL: test6
 void test6(void)
 {
-  // CHECK: [[LP:%[a-z0-9]+]] = getelementptr{{.*}}%struct.LLP2P2, ptr{{.*}}, 
i32 0, i32 0
-  // CHECK: call {{.*}}get456789(ptr {{.*}}[[LP]])
+  // CHECK: [[VAR:%[a-z0-9]+]] = alloca
+  // CHECK: call {{.*}}get456789(ptr {{.*}}sret{{.*}} [[VAR]])
 
   // CHECK: [[CALL:%[a-z0-9]+]] = call {{.*}}@get235()
   // CHECK: store{{.*}}[[CALL]], {{.*}}[[TMP0:%[a-z0-9.]+]]
diff --git a/clang/test/CodeGen/sret.c b/clang/test/CodeGen/sret.c
index 6d905e89b2c6fd..3b4914f29d2bfe 100644
--- a/clang/test/CodeGen/sret.c
+++ b/clang/test/CodeGen/sret.c
@@ -1,4 +1,5 @@
 // RUN: %clang_cc1 %s -Wno-strict-prototypes -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 %s -Wno-strict-prototypes -triple amdgcn-amd-amdhsa 
-emit-llvm -o - | FileCheck --check-prefix=NONZEROALLOCAAS %s
 
 struct abc {
  long a;
@@ -6,18 +7,28 @@ struct abc {
  long c;
  long d;
  long e;
+ long f;
+ long g;
+ long h;
+ long i;
+ long j;
 };
 
 struct abc foo1(void);
 // CHECK-DAG: declare {{.*}} @foo1(ptr dead_on_unwind writable 
sret(%struct.abc)
+// NONZEROALLOCAAS-DAG: declare {{.*}} @foo1(ptr addrspace(5) dead_on_unwind 
writable sret(%struct.abc)
 struct abc foo2();
 // CHECK-DAG: declare {{.*}} @foo2(ptr dead_on_unwind writa

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-04 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx updated 
https://github.com/llvm/llvm-project/pull/114062

>From d2d2d3d5db3f639aab178f9ca9a20db2842d2b65 Mon Sep 17 00:00:00 2001
From: Alex Voicu 
Date: Tue, 29 Oct 2024 14:20:44 +
Subject: [PATCH 1/5] `sret` args should always point to the `alloca` AS, so we
 can use that.

---
 clang/lib/CodeGen/CGCall.cpp   | 15 ---
 clang/test/CodeGen/partial-reinitialization2.c |  4 ++--
 clang/test/CodeGen/sret.c  | 11 +++
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 8f4f5d3ed81601..56acfae7ae9e51 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1672,8 +1672,7 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) {
 
   // Add type for sret argument.
   if (IRFunctionArgs.hasSRetArg()) {
-QualType Ret = FI.getReturnType();
-unsigned AddressSpace = CGM.getTypes().getTargetAddressSpace(Ret);
+unsigned AddressSpace = CGM.getDataLayout().getAllocaAddrSpace();
 ArgTypes[IRFunctionArgs.getSRetArgNo()] =
 llvm::PointerType::get(getLLVMContext(), AddressSpace);
   }
@@ -5145,7 +5144,6 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // If the call returns a temporary with struct return, create a temporary
   // alloca to hold the result, unless one is given to us.
   Address SRetPtr = Address::invalid();
-  RawAddress SRetAlloca = RawAddress::invalid();
   llvm::Value *UnusedReturnSizePtr = nullptr;
   if (RetAI.isIndirect() || RetAI.isInAlloca() || RetAI.isCoerceAndExpand()) {
 // For virtual function pointer thunks and musttail calls, we must always
@@ -5159,16 +5157,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 } else if (!ReturnValue.isNull()) {
   SRetPtr = ReturnValue.getAddress();
 } else {
-  SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca);
+  SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp");
   if (HaveInsertPoint() && ReturnValue.isUnused()) {
 llvm::TypeSize size =
 CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy));
-UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer());
+UnusedReturnSizePtr = EmitLifetimeStart(size, 
SRetPtr.getBasePointer());
   }
 }
 if (IRFunctionArgs.hasSRetArg()) {
+  // If the caller allocated the return slot, it is possible that the
+  // alloca was AS casted to the default as, so we ensure the cast is
+  // stripped before binding to the sret arg, which is in the allocaAS.
   IRCallArgs[IRFunctionArgs.getSRetArgNo()] =
-  getAsNaturalPointerTo(SRetPtr, RetTy);
+  getAsNaturalPointerTo(SRetPtr, RetTy)->stripPointerCasts();
 } else if (RetAI.isInAlloca()) {
   Address Addr =
   Builder.CreateStructGEP(ArgMemory, RetAI.getInAllocaFieldIndex());
@@ -5740,7 +5741,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // pop this cleanup later on. Being eager about this is OK, since this
   // temporary is 'invisible' outside of the callee.
   if (UnusedReturnSizePtr)
-pushFullExprCleanup(NormalEHLifetimeMarker, SRetAlloca,
+pushFullExprCleanup(NormalEHLifetimeMarker, SRetPtr,
  UnusedReturnSizePtr);
 
   llvm::BasicBlock *InvokeDest = CannotThrow ? nullptr : getInvokeDest();
diff --git a/clang/test/CodeGen/partial-reinitialization2.c 
b/clang/test/CodeGen/partial-reinitialization2.c
index e709c1d4ad1ee1..7949a69555031e 100644
--- a/clang/test/CodeGen/partial-reinitialization2.c
+++ b/clang/test/CodeGen/partial-reinitialization2.c
@@ -91,8 +91,8 @@ void test5(void)
 // CHECK-LABEL: test6
 void test6(void)
 {
-  // CHECK: [[LP:%[a-z0-9]+]] = getelementptr{{.*}}%struct.LLP2P2, ptr{{.*}}, 
i32 0, i32 0
-  // CHECK: call {{.*}}get456789(ptr {{.*}}[[LP]])
+  // CHECK: [[VAR:%[a-z0-9]+]] = alloca
+  // CHECK: call {{.*}}get456789(ptr {{.*}}sret{{.*}} [[VAR]])
 
   // CHECK: [[CALL:%[a-z0-9]+]] = call {{.*}}@get235()
   // CHECK: store{{.*}}[[CALL]], {{.*}}[[TMP0:%[a-z0-9.]+]]
diff --git a/clang/test/CodeGen/sret.c b/clang/test/CodeGen/sret.c
index 6d905e89b2c6fd..3b4914f29d2bfe 100644
--- a/clang/test/CodeGen/sret.c
+++ b/clang/test/CodeGen/sret.c
@@ -1,4 +1,5 @@
 // RUN: %clang_cc1 %s -Wno-strict-prototypes -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 %s -Wno-strict-prototypes -triple amdgcn-amd-amdhsa 
-emit-llvm -o - | FileCheck --check-prefix=NONZEROALLOCAAS %s
 
 struct abc {
  long a;
@@ -6,18 +7,28 @@ struct abc {
  long c;
  long d;
  long e;
+ long f;
+ long g;
+ long h;
+ long i;
+ long j;
 };
 
 struct abc foo1(void);
 // CHECK-DAG: declare {{.*}} @foo1(ptr dead_on_unwind writable 
sret(%struct.abc)
+// NONZEROALLOCAAS-DAG: declare {{.*}} @foo1(ptr addrspace(5) dead_on_unwind 
writable sret(%struct.abc)
 struct abc foo2();
 // CHECK-DAG: declare {{.*}} @foo2(ptr dead_on_unwind writa

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-03 Thread John McCall via cfe-commits


@@ -1672,10 +1672,11 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) 
{
 
   // Add type for sret argument.
   if (IRFunctionArgs.hasSRetArg()) {
-QualType Ret = FI.getReturnType();
-unsigned AddressSpace = CGM.getTypes().getTargetAddressSpace(Ret);
+auto AddressSpace = CGM.getTarget().getIndirectArgAddressSpace();

rjmccall wrote:

That's what I was thinking, yeah.  There should be plenty of space for that 
without inflating `ABIInfo`, right?

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-02 Thread Alex Voicu via cfe-commits


@@ -1672,10 +1672,11 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) 
{
 
   // Add type for sret argument.
   if (IRFunctionArgs.hasSRetArg()) {
-QualType Ret = FI.getReturnType();
-unsigned AddressSpace = CGM.getTypes().getTargetAddressSpace(Ret);
+auto AddressSpace = CGM.getTarget().getIndirectArgAddressSpace();

AlexVlx wrote:

Sadly no, that's not usable here as is, that's only for `byref` args 
(IndirectAliased). I do wonder if we should extend Indirect to also carry an 
AS, maybe that's the natural solution here.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-01 Thread Matt Arsenault via cfe-commits


@@ -1672,10 +1672,11 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) 
{
 
   // Add type for sret argument.
   if (IRFunctionArgs.hasSRetArg()) {
-QualType Ret = FI.getReturnType();
-unsigned AddressSpace = CGM.getTypes().getTargetAddressSpace(Ret);
+auto AddressSpace = CGM.getTarget().getIndirectArgAddressSpace();

arsenm wrote:

I would expect this to come from the ABIArgInfo/retAI, for the specific value 
and not a new side hook. Actually, is the address space already correct in 
retAI.getIndirectAddrSpace? 

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-01 Thread Alex Voicu via cfe-commits


@@ -23,8 +25,10 @@ X Test()
   // sret argument.
   // CHECK-CXX98: call void @_ZN1XC1ERKS_(
   // CHECK-CXX11: call void @_ZN1XC1EOS_(
+  // CHECK-CXX11-NONZEROALLOCAAS: call void @_ZN1XC1EOS_(

AlexVlx wrote:

Made an attempt to.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-01 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx updated 
https://github.com/llvm/llvm-project/pull/114062

>From d2d2d3d5db3f639aab178f9ca9a20db2842d2b65 Mon Sep 17 00:00:00 2001
From: Alex Voicu 
Date: Tue, 29 Oct 2024 14:20:44 +
Subject: [PATCH 1/5] `sret` args should always point to the `alloca` AS, so we
 can use that.

---
 clang/lib/CodeGen/CGCall.cpp   | 15 ---
 clang/test/CodeGen/partial-reinitialization2.c |  4 ++--
 clang/test/CodeGen/sret.c  | 11 +++
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 8f4f5d3ed81601..56acfae7ae9e51 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1672,8 +1672,7 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) {
 
   // Add type for sret argument.
   if (IRFunctionArgs.hasSRetArg()) {
-QualType Ret = FI.getReturnType();
-unsigned AddressSpace = CGM.getTypes().getTargetAddressSpace(Ret);
+unsigned AddressSpace = CGM.getDataLayout().getAllocaAddrSpace();
 ArgTypes[IRFunctionArgs.getSRetArgNo()] =
 llvm::PointerType::get(getLLVMContext(), AddressSpace);
   }
@@ -5145,7 +5144,6 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // If the call returns a temporary with struct return, create a temporary
   // alloca to hold the result, unless one is given to us.
   Address SRetPtr = Address::invalid();
-  RawAddress SRetAlloca = RawAddress::invalid();
   llvm::Value *UnusedReturnSizePtr = nullptr;
   if (RetAI.isIndirect() || RetAI.isInAlloca() || RetAI.isCoerceAndExpand()) {
 // For virtual function pointer thunks and musttail calls, we must always
@@ -5159,16 +5157,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 } else if (!ReturnValue.isNull()) {
   SRetPtr = ReturnValue.getAddress();
 } else {
-  SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca);
+  SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp");
   if (HaveInsertPoint() && ReturnValue.isUnused()) {
 llvm::TypeSize size =
 CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy));
-UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer());
+UnusedReturnSizePtr = EmitLifetimeStart(size, 
SRetPtr.getBasePointer());
   }
 }
 if (IRFunctionArgs.hasSRetArg()) {
+  // If the caller allocated the return slot, it is possible that the
+  // alloca was AS casted to the default as, so we ensure the cast is
+  // stripped before binding to the sret arg, which is in the allocaAS.
   IRCallArgs[IRFunctionArgs.getSRetArgNo()] =
-  getAsNaturalPointerTo(SRetPtr, RetTy);
+  getAsNaturalPointerTo(SRetPtr, RetTy)->stripPointerCasts();
 } else if (RetAI.isInAlloca()) {
   Address Addr =
   Builder.CreateStructGEP(ArgMemory, RetAI.getInAllocaFieldIndex());
@@ -5740,7 +5741,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // pop this cleanup later on. Being eager about this is OK, since this
   // temporary is 'invisible' outside of the callee.
   if (UnusedReturnSizePtr)
-pushFullExprCleanup(NormalEHLifetimeMarker, SRetAlloca,
+pushFullExprCleanup(NormalEHLifetimeMarker, SRetPtr,
  UnusedReturnSizePtr);
 
   llvm::BasicBlock *InvokeDest = CannotThrow ? nullptr : getInvokeDest();
diff --git a/clang/test/CodeGen/partial-reinitialization2.c 
b/clang/test/CodeGen/partial-reinitialization2.c
index e709c1d4ad1ee1..7949a69555031e 100644
--- a/clang/test/CodeGen/partial-reinitialization2.c
+++ b/clang/test/CodeGen/partial-reinitialization2.c
@@ -91,8 +91,8 @@ void test5(void)
 // CHECK-LABEL: test6
 void test6(void)
 {
-  // CHECK: [[LP:%[a-z0-9]+]] = getelementptr{{.*}}%struct.LLP2P2, ptr{{.*}}, 
i32 0, i32 0
-  // CHECK: call {{.*}}get456789(ptr {{.*}}[[LP]])
+  // CHECK: [[VAR:%[a-z0-9]+]] = alloca
+  // CHECK: call {{.*}}get456789(ptr {{.*}}sret{{.*}} [[VAR]])
 
   // CHECK: [[CALL:%[a-z0-9]+]] = call {{.*}}@get235()
   // CHECK: store{{.*}}[[CALL]], {{.*}}[[TMP0:%[a-z0-9.]+]]
diff --git a/clang/test/CodeGen/sret.c b/clang/test/CodeGen/sret.c
index 6d905e89b2c6fd..3b4914f29d2bfe 100644
--- a/clang/test/CodeGen/sret.c
+++ b/clang/test/CodeGen/sret.c
@@ -1,4 +1,5 @@
 // RUN: %clang_cc1 %s -Wno-strict-prototypes -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 %s -Wno-strict-prototypes -triple amdgcn-amd-amdhsa 
-emit-llvm -o - | FileCheck --check-prefix=NONZEROALLOCAAS %s
 
 struct abc {
  long a;
@@ -6,18 +7,28 @@ struct abc {
  long c;
  long d;
  long e;
+ long f;
+ long g;
+ long h;
+ long i;
+ long j;
 };
 
 struct abc foo1(void);
 // CHECK-DAG: declare {{.*}} @foo1(ptr dead_on_unwind writable 
sret(%struct.abc)
+// NONZEROALLOCAAS-DAG: declare {{.*}} @foo1(ptr addrspace(5) dead_on_unwind 
writable sret(%struct.abc)
 struct abc foo2();
 // CHECK-DAG: declare {{.*}} @foo2(ptr dead_on_unwind writa

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-01 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx updated 
https://github.com/llvm/llvm-project/pull/114062

>From d2d2d3d5db3f639aab178f9ca9a20db2842d2b65 Mon Sep 17 00:00:00 2001
From: Alex Voicu 
Date: Tue, 29 Oct 2024 14:20:44 +
Subject: [PATCH 1/4] `sret` args should always point to the `alloca` AS, so we
 can use that.

---
 clang/lib/CodeGen/CGCall.cpp   | 15 ---
 clang/test/CodeGen/partial-reinitialization2.c |  4 ++--
 clang/test/CodeGen/sret.c  | 11 +++
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 8f4f5d3ed81601..56acfae7ae9e51 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1672,8 +1672,7 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) {
 
   // Add type for sret argument.
   if (IRFunctionArgs.hasSRetArg()) {
-QualType Ret = FI.getReturnType();
-unsigned AddressSpace = CGM.getTypes().getTargetAddressSpace(Ret);
+unsigned AddressSpace = CGM.getDataLayout().getAllocaAddrSpace();
 ArgTypes[IRFunctionArgs.getSRetArgNo()] =
 llvm::PointerType::get(getLLVMContext(), AddressSpace);
   }
@@ -5145,7 +5144,6 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // If the call returns a temporary with struct return, create a temporary
   // alloca to hold the result, unless one is given to us.
   Address SRetPtr = Address::invalid();
-  RawAddress SRetAlloca = RawAddress::invalid();
   llvm::Value *UnusedReturnSizePtr = nullptr;
   if (RetAI.isIndirect() || RetAI.isInAlloca() || RetAI.isCoerceAndExpand()) {
 // For virtual function pointer thunks and musttail calls, we must always
@@ -5159,16 +5157,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 } else if (!ReturnValue.isNull()) {
   SRetPtr = ReturnValue.getAddress();
 } else {
-  SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca);
+  SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp");
   if (HaveInsertPoint() && ReturnValue.isUnused()) {
 llvm::TypeSize size =
 CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy));
-UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer());
+UnusedReturnSizePtr = EmitLifetimeStart(size, 
SRetPtr.getBasePointer());
   }
 }
 if (IRFunctionArgs.hasSRetArg()) {
+  // If the caller allocated the return slot, it is possible that the
+  // alloca was AS casted to the default as, so we ensure the cast is
+  // stripped before binding to the sret arg, which is in the allocaAS.
   IRCallArgs[IRFunctionArgs.getSRetArgNo()] =
-  getAsNaturalPointerTo(SRetPtr, RetTy);
+  getAsNaturalPointerTo(SRetPtr, RetTy)->stripPointerCasts();
 } else if (RetAI.isInAlloca()) {
   Address Addr =
   Builder.CreateStructGEP(ArgMemory, RetAI.getInAllocaFieldIndex());
@@ -5740,7 +5741,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // pop this cleanup later on. Being eager about this is OK, since this
   // temporary is 'invisible' outside of the callee.
   if (UnusedReturnSizePtr)
-pushFullExprCleanup(NormalEHLifetimeMarker, SRetAlloca,
+pushFullExprCleanup(NormalEHLifetimeMarker, SRetPtr,
  UnusedReturnSizePtr);
 
   llvm::BasicBlock *InvokeDest = CannotThrow ? nullptr : getInvokeDest();
diff --git a/clang/test/CodeGen/partial-reinitialization2.c 
b/clang/test/CodeGen/partial-reinitialization2.c
index e709c1d4ad1ee1..7949a69555031e 100644
--- a/clang/test/CodeGen/partial-reinitialization2.c
+++ b/clang/test/CodeGen/partial-reinitialization2.c
@@ -91,8 +91,8 @@ void test5(void)
 // CHECK-LABEL: test6
 void test6(void)
 {
-  // CHECK: [[LP:%[a-z0-9]+]] = getelementptr{{.*}}%struct.LLP2P2, ptr{{.*}}, 
i32 0, i32 0
-  // CHECK: call {{.*}}get456789(ptr {{.*}}[[LP]])
+  // CHECK: [[VAR:%[a-z0-9]+]] = alloca
+  // CHECK: call {{.*}}get456789(ptr {{.*}}sret{{.*}} [[VAR]])
 
   // CHECK: [[CALL:%[a-z0-9]+]] = call {{.*}}@get235()
   // CHECK: store{{.*}}[[CALL]], {{.*}}[[TMP0:%[a-z0-9.]+]]
diff --git a/clang/test/CodeGen/sret.c b/clang/test/CodeGen/sret.c
index 6d905e89b2c6fd..3b4914f29d2bfe 100644
--- a/clang/test/CodeGen/sret.c
+++ b/clang/test/CodeGen/sret.c
@@ -1,4 +1,5 @@
 // RUN: %clang_cc1 %s -Wno-strict-prototypes -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 %s -Wno-strict-prototypes -triple amdgcn-amd-amdhsa 
-emit-llvm -o - | FileCheck --check-prefix=NONZEROALLOCAAS %s
 
 struct abc {
  long a;
@@ -6,18 +7,28 @@ struct abc {
  long c;
  long d;
  long e;
+ long f;
+ long g;
+ long h;
+ long i;
+ long j;
 };
 
 struct abc foo1(void);
 // CHECK-DAG: declare {{.*}} @foo1(ptr dead_on_unwind writable 
sret(%struct.abc)
+// NONZEROALLOCAAS-DAG: declare {{.*}} @foo1(ptr addrspace(5) dead_on_unwind 
writable sret(%struct.abc)
 struct abc foo2();
 // CHECK-DAG: declare {{.*}} @foo2(ptr dead_on_unwind writa

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-01 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx updated 
https://github.com/llvm/llvm-project/pull/114062

>From d2d2d3d5db3f639aab178f9ca9a20db2842d2b65 Mon Sep 17 00:00:00 2001
From: Alex Voicu 
Date: Tue, 29 Oct 2024 14:20:44 +
Subject: [PATCH 1/3] `sret` args should always point to the `alloca` AS, so we
 can use that.

---
 clang/lib/CodeGen/CGCall.cpp   | 15 ---
 clang/test/CodeGen/partial-reinitialization2.c |  4 ++--
 clang/test/CodeGen/sret.c  | 11 +++
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 8f4f5d3ed81601..56acfae7ae9e51 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1672,8 +1672,7 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) {
 
   // Add type for sret argument.
   if (IRFunctionArgs.hasSRetArg()) {
-QualType Ret = FI.getReturnType();
-unsigned AddressSpace = CGM.getTypes().getTargetAddressSpace(Ret);
+unsigned AddressSpace = CGM.getDataLayout().getAllocaAddrSpace();
 ArgTypes[IRFunctionArgs.getSRetArgNo()] =
 llvm::PointerType::get(getLLVMContext(), AddressSpace);
   }
@@ -5145,7 +5144,6 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // If the call returns a temporary with struct return, create a temporary
   // alloca to hold the result, unless one is given to us.
   Address SRetPtr = Address::invalid();
-  RawAddress SRetAlloca = RawAddress::invalid();
   llvm::Value *UnusedReturnSizePtr = nullptr;
   if (RetAI.isIndirect() || RetAI.isInAlloca() || RetAI.isCoerceAndExpand()) {
 // For virtual function pointer thunks and musttail calls, we must always
@@ -5159,16 +5157,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 } else if (!ReturnValue.isNull()) {
   SRetPtr = ReturnValue.getAddress();
 } else {
-  SRetPtr = CreateMemTemp(RetTy, "tmp", &SRetAlloca);
+  SRetPtr = CreateMemTempWithoutCast(RetTy, "tmp");
   if (HaveInsertPoint() && ReturnValue.isUnused()) {
 llvm::TypeSize size =
 CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(RetTy));
-UnusedReturnSizePtr = EmitLifetimeStart(size, SRetAlloca.getPointer());
+UnusedReturnSizePtr = EmitLifetimeStart(size, 
SRetPtr.getBasePointer());
   }
 }
 if (IRFunctionArgs.hasSRetArg()) {
+  // If the caller allocated the return slot, it is possible that the
+  // alloca was AS casted to the default as, so we ensure the cast is
+  // stripped before binding to the sret arg, which is in the allocaAS.
   IRCallArgs[IRFunctionArgs.getSRetArgNo()] =
-  getAsNaturalPointerTo(SRetPtr, RetTy);
+  getAsNaturalPointerTo(SRetPtr, RetTy)->stripPointerCasts();
 } else if (RetAI.isInAlloca()) {
   Address Addr =
   Builder.CreateStructGEP(ArgMemory, RetAI.getInAllocaFieldIndex());
@@ -5740,7 +5741,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
   // pop this cleanup later on. Being eager about this is OK, since this
   // temporary is 'invisible' outside of the callee.
   if (UnusedReturnSizePtr)
-pushFullExprCleanup(NormalEHLifetimeMarker, SRetAlloca,
+pushFullExprCleanup(NormalEHLifetimeMarker, SRetPtr,
  UnusedReturnSizePtr);
 
   llvm::BasicBlock *InvokeDest = CannotThrow ? nullptr : getInvokeDest();
diff --git a/clang/test/CodeGen/partial-reinitialization2.c 
b/clang/test/CodeGen/partial-reinitialization2.c
index e709c1d4ad1ee1..7949a69555031e 100644
--- a/clang/test/CodeGen/partial-reinitialization2.c
+++ b/clang/test/CodeGen/partial-reinitialization2.c
@@ -91,8 +91,8 @@ void test5(void)
 // CHECK-LABEL: test6
 void test6(void)
 {
-  // CHECK: [[LP:%[a-z0-9]+]] = getelementptr{{.*}}%struct.LLP2P2, ptr{{.*}}, 
i32 0, i32 0
-  // CHECK: call {{.*}}get456789(ptr {{.*}}[[LP]])
+  // CHECK: [[VAR:%[a-z0-9]+]] = alloca
+  // CHECK: call {{.*}}get456789(ptr {{.*}}sret{{.*}} [[VAR]])
 
   // CHECK: [[CALL:%[a-z0-9]+]] = call {{.*}}@get235()
   // CHECK: store{{.*}}[[CALL]], {{.*}}[[TMP0:%[a-z0-9.]+]]
diff --git a/clang/test/CodeGen/sret.c b/clang/test/CodeGen/sret.c
index 6d905e89b2c6fd..3b4914f29d2bfe 100644
--- a/clang/test/CodeGen/sret.c
+++ b/clang/test/CodeGen/sret.c
@@ -1,4 +1,5 @@
 // RUN: %clang_cc1 %s -Wno-strict-prototypes -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 %s -Wno-strict-prototypes -triple amdgcn-amd-amdhsa 
-emit-llvm -o - | FileCheck --check-prefix=NONZEROALLOCAAS %s
 
 struct abc {
  long a;
@@ -6,18 +7,28 @@ struct abc {
  long c;
  long d;
  long e;
+ long f;
+ long g;
+ long h;
+ long i;
+ long j;
 };
 
 struct abc foo1(void);
 // CHECK-DAG: declare {{.*}} @foo1(ptr dead_on_unwind writable 
sret(%struct.abc)
+// NONZEROALLOCAAS-DAG: declare {{.*}} @foo1(ptr addrspace(5) dead_on_unwind 
writable sret(%struct.abc)
 struct abc foo2();
 // CHECK-DAG: declare {{.*}} @foo2(ptr dead_on_unwind writa

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-10-30 Thread John McCall via cfe-commits

rjmccall wrote:

I agree that it doesn't meaningfully come from a source-level type and should 
be specified by the target lowering.  I just want to make sure we write the new 
code in a way that plausibly supports the target ABI specifying something other 
than "it's always in the alloca AS".  Can we put the required AS in the 
`ABIInfo` for the result/parameter and then just make a best effort to perform 
AS conversions in all the places necessary?  (It'd be a requirement that the 
alloca AS can be promoted into that AS, of course.) And then, of course, if we 
miss a few then some target will have some bugs to fix, but we'll at least have 
tried.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-10-29 Thread Alex Voicu via cfe-commits

https://github.com/AlexVlx commented:

> Is this a target-independent decision?  I could certainly imagine a target 
> with a generic AS wanting to specify that indirect return addresses (and 
> maybe even parameters?) should be in that rather than the alloca AS; among 
> other things, it would allow return values to be used to initialize objects 
> in arbitrary memory.  In C and C-derived languages, maybe that just avoids a 
> memcpy, but in C++ it avoids a potentially non-trivial move. 

This is a good point that I had not fully considered; I'll (weakly) push back 
by pointing out that if a target wanted to do that, it'd have to change quite a 
few things because now we seem to pretty much `alloca` storage for `sret` args 
by default, and thus we just end up with some extra spurious AS casts / 
reliance on AS inference doing the right thing. However, your question made me 
go and look at C++ use cases, which unearthed a pretty big (general) challenge 
with this change: for cases where the alloca AS and the default AS differ, we 
can end up trying to pass an `sret` arg directly to a callee that takes a 
pointer to the default AS. I've added a test that exposes this and a potential 
fix, but I'm not hyper keen on the latter - unfortunately, I see no better way 
to deal with this.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-10-29 Thread Matt Arsenault via cfe-commits


@@ -23,8 +25,10 @@ X Test()
   // sret argument.
   // CHECK-CXX98: call void @_ZN1XC1ERKS_(
   // CHECK-CXX11: call void @_ZN1XC1EOS_(
+  // CHECK-CXX11-NONZEROALLOCAAS: call void @_ZN1XC1EOS_(

arsenm wrote:

Can you add more context checks here?

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-10-29 Thread Matt Arsenault via cfe-commits


@@ -5390,11 +5391,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 V->getType()->isIntegerTy())
   V = Builder.CreateZExt(V, ArgInfo.getCoerceToType());
 
-// If the argument doesn't match, perform a bitcast to coerce it.  This
-// can happen due to trivial type mismatches.
+// If the argument doesn't match, we are either trying to pass an
+// alloca-ed sret argument directly, and the alloca AS does not match
+// the default AS, case in which we AS cast it, or we have a trivial
+// type mismatch, and thus perform a bitcast to coerce it.

arsenm wrote:

The cast is probably unavoidable here. You need to support flat addressing for 
any of c++ to work anyway, so that's fine for the GPU case 

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-10-29 Thread Alex Voicu via cfe-commits


@@ -5390,11 +5391,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 V->getType()->isIntegerTy())
   V = Builder.CreateZExt(V, ArgInfo.getCoerceToType());
 
-// If the argument doesn't match, perform a bitcast to coerce it.  This
-// can happen due to trivial type mismatches.
+// If the argument doesn't match, we are either trying to pass an
+// alloca-ed sret argument directly, and the alloca AS does not match
+// the default AS, case in which we AS cast it, or we have a trivial
+// type mismatch, and thus perform a bitcast to coerce it.

AlexVlx wrote:

No this is not the `inalloca` case, it's the case when you have e.g. a C++ move 
ctor (`Foo(Foo&&)`) which in IR expands into a function taking two pointers to 
the default AS (`this` and a pointer to the moved from arg). If you're moving 
into the `sret` arg, you try to bind this to `this`, and you end up here. See 
the `no-elide-constructors` test that's part of this PR. 

Re: not inserting the cast, you're right it's probably not correct to insert it 
blindly. I *think* the only thing we can safely handle is if the mismatched arg 
is a pointer to the default AS, and should error out otherwise. The only 
mechanism we have for creating temporaries is `alloca`ing them, and it's not 
even clear what it'd mean to create a temporary in some arbitrary AS. This is 
probably fine though because I think the only offenders here would be the C++ 
ctors (perhaps member functions in general, at worst), as their IR signature is 
derived from the default AS, as there's no fixed argument type to inform it.

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-10-29 Thread Matt Arsenault via cfe-commits

https://github.com/arsenm edited 
https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-10-29 Thread Matt Arsenault via cfe-commits


@@ -5390,11 +5391,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 V->getType()->isIntegerTy())
   V = Builder.CreateZExt(V, ArgInfo.getCoerceToType());
 
-// If the argument doesn't match, perform a bitcast to coerce it.  This
-// can happen due to trivial type mismatches.
+// If the argument doesn't match, we are either trying to pass an
+// alloca-ed sret argument directly, and the alloca AS does not match
+// the default AS, case in which we AS cast it, or we have a trivial
+// type mismatch, and thus perform a bitcast to coerce it.
 if (FirstIRArg < IRFuncTy->getNumParams() &&
-V->getType() != IRFuncTy->getParamType(FirstIRArg))
-  V = Builder.CreateBitCast(V, IRFuncTy->getParamType(FirstIRArg));
+V->getType() != IRFuncTy->getParamType(FirstIRArg)) {
+  auto IRTy = IRFuncTy->getParamType(FirstIRArg);
+  auto MaybeSRetArg = dyn_cast_or_null(V);
+  if (MaybeSRetArg && MaybeSRetArg->hasStructRetAttr())
+V = Builder.CreateAddrSpaceCast(V, IRTy);
+  else
+V = Builder.CreateBitCast(V, IRTy);

arsenm wrote:

I'm assuming this is a pointer bitcast, which isn't necessary anymore 

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-10-29 Thread Matt Arsenault via cfe-commits


@@ -5390,11 +5391,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo 
&CallInfo,
 V->getType()->isIntegerTy())
   V = Builder.CreateZExt(V, ArgInfo.getCoerceToType());
 
-// If the argument doesn't match, perform a bitcast to coerce it.  This
-// can happen due to trivial type mismatches.
+// If the argument doesn't match, we are either trying to pass an
+// alloca-ed sret argument directly, and the alloca AS does not match
+// the default AS, case in which we AS cast it, or we have a trivial
+// type mismatch, and thus perform a bitcast to coerce it.

arsenm wrote:

Inserting the cast might not be correct. Might need to create another temporary 
with the other address space, and memcpy.

Is this only the inalloca case? That's the weird windows only thing? 

https://github.com/llvm/llvm-project/pull/114062
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


  1   2   >