Issue 183265
Summary [AArch64][SVE] SVE load/store intrinsics fail with “Calling a function with a bad signature!” for non-zero address space pointers
Labels new issue
Assignees
Reporter nikhil-m-k
    I’m encountering a codegen failure when SVE load/store intrinsics are used with pointers in non-zero address spaces. Running the `interleaved-access` pass, which converts `llvm.vector.deinterleave4.nxv16f32` to SVE intrinsics, produces the following error:
`Calling a function with a bad signature!`
This happens when a `ptr addrspace(1)` value is passed to the SVE load/store intrinsics, as shown in the reproducer test case below.

This is because in `llvm/include/llvm/IR/IntrinsicsAArch64.td` the SVE load intrinsics such as `AdvSIMD_1Vec_PredLoad_Intrinsic`, `AdvSIMD_2Vec_PredLoad_Intrinsic`, etc. are defined to accept only **`llvm_ptr_ty`** as the pointer argument.

However, their NEON counterparts, such as `AdvSIMD_1Vec_Load_Intrinsic` and `AdvSIMD_2Vec_Load_Intrinsic`, are defined to accept **`llvm_anyptr_ty`** instead.

- Was this restriction intentional?

- If not, would it be acceptable to generalize SVE intrinsics to accept arbitrary address spaces (e.g., via `llvm_anyptr_ty`) and update the AArch64 ISel lowering accordingly?
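For illustration, the proposed generalization might look like the sketch below. The class bodies are written from memory and may not match the current tree exactly; the point is only the `llvm_ptr_ty` → `llvm_anyptr_ty` swap:

```tablegen
// Current shape (approximate): the pointer operand is fixed to
// address space 0 via llvm_ptr_ty, so ptr addrspace(1) arguments
// fail the signature check when the pass builds the call.
class AdvSIMD_1Vec_PredLoad_Intrinsic
    : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
                            [LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
                             llvm_ptr_ty],
                            [IntrReadMem]>;

// Possible generalization, mirroring the NEON load classes: with
// llvm_anyptr_ty the pointer's address space becomes an overloaded
// type parameter of the intrinsic.
class AdvSIMD_1Vec_PredLoad_Intrinsic
    : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
                            [LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
                             llvm_anyptr_ty],
                            [IntrReadMem]>;
```

Note that making the pointer type overloaded changes the mangled intrinsic names (the pointer type gets encoded in the suffix), so existing ISel patterns and any tests matching the zero-address-space names would presumably need updating as well.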

I’m happy to work on a patch and tests if this is considered a supported use case.

Reproducer:
```llvm
; ModuleID = 'test.ll'
source_filename = "test.ll"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128-Fn32"
target triple = "aarch64-unknown-linux-gnu"

declare ptr @malloc_fn(i64, i8)

define void @forward() local_unnamed_addr {
  %1 = call ptr @malloc_fn(i64 6656, i8 1)
  %2 = call ptr @malloc_fn(i64 6666, i8 1)
  %3 = addrspacecast ptr %2 to ptr addrspace(1)
  %lsr.iv16089 = addrspacecast ptr %1 to ptr addrspace(1)

  %promoted2396 = load float, ptr addrspace(1) %3, align 4
  %wide.load4241 = load <vscale x 4 x float>, ptr addrspace(1) %lsr.iv16089, align 4
  %wide.vec4243  = load <vscale x 16 x float>, ptr addrspace(1) %lsr.iv16089, align 4

  %strided.vec4244 =
    call { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> }
      @llvm.vector.deinterleave4.nxv16f32(<vscale x 16 x float> %wide.vec4243)

  %4 = extractvalue { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> } %strided.vec4244, 0
  %5 = fmul <vscale x 4 x float> %wide.load4241, %4
  %6 = call float @llvm.vector.reduce.fadd.nxv4f32(float %promoted2396, <vscale x 4 x float> %5)
  %7 = call float @llvm.vector.reduce.fadd.nxv4f32(float %6, <vscale x 4 x float> %5)

  store float %7, ptr addrspace(1) %3, align 4
  ret void
}

declare { <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x float> }
  @llvm.vector.deinterleave4.nxv16f32(<vscale x 16 x float>)

declare float @llvm.vector.reduce.fadd.nxv4f32(float, <vscale x 4 x float>)
```
Command:
`llvm-project/llvm-build/bin/opt -passes=interleaved-access -mtriple=aarch64-linux-gnu -mattr=+sve -S test.ll`