| Issue | 55417 |
| Summary | [AArch64] Crash in aarch64 backend when compiling vsri/vcvtfxs2fp intrinsics in certain pattern |
| Labels | |
| Assignees | |
| Reporter | Benjins |

Using ARM Neon intrinsics in a certain pattern causes the AArch64 backend to crash on an invalid DAG. In a debug build, an assertion fires earlier in the process.
Reduced version of the original C++ that triggered the issue when compiled with clang for armv8 at -O1:
(Godbolt link: https://godbolt.org/z/MEvPj1EWf )
```
#include <arm_neon.h>
float64_t do_stuff(const double* dVals) {
    float64x1_t var0 = vld1_f64((const float64_t*)&dVals[0]);
    float64x1_t var1 = vrndi_f64(var0);
    float64_t var2 = vget_lane_f64(var1, 0);
    int64_t var3 = vcvtd_s64_f64(var2);
    int64_t var4 = vsrid_n_s64(var3, var3, 1);
    float64_t var5 = vcvtd_n_f64_s64(var4, 1);
    return var5;
}
```
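For readers without an AArch64 toolchain at hand, the arithmetic performed by the reduced function can be approximated in portable C++. This is only a sketch of the scalar semantics as commonly documented for these intrinsics (round to integral, saturating convert toward zero, shift right and insert, fixed-point convert). The `model_*` names are hypothetical helpers that do not exist in `arm_neon.h`, and the model has no bearing on the backend crash itself.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical model of vcvtd_s64_f64 (FCVTZS): truncate toward zero,
// saturate on overflow, and map NaN to 0.
inline std::int64_t model_fcvtzs(double x) {
    if (std::isnan(x)) return 0;
    if (x >= 0x1p63) return std::numeric_limits<std::int64_t>::max();
    if (x < -0x1p63) return std::numeric_limits<std::int64_t>::min();
    return static_cast<std::int64_t>(x);
}

// Hypothetical model of vsrid_n_s64 (SRI): keep the top n bits of a and
// insert b, logically shifted right by n, into the remaining low bits.
inline std::int64_t model_sri(std::int64_t a, std::int64_t b, int n) {
    assert(n >= 1 && n <= 64);
    const std::uint64_t ua = static_cast<std::uint64_t>(a);
    const std::uint64_t ub = static_cast<std::uint64_t>(b);
    const std::uint64_t keep = (n == 64) ? ~0ULL : ~(~0ULL >> n);
    const std::uint64_t inserted = (n == 64) ? 0ULL : (ub >> n);
    return static_cast<std::int64_t>((ua & keep) | inserted);
}

// Hypothetical model of vcvtd_n_f64_s64 (SCVTF, fixed-point): treat v as a
// fixed-point number with n fractional bits, i.e. v / 2^n.
inline double model_scvtf_fixed(std::int64_t v, int n) {
    return std::ldexp(static_cast<double>(v), -n);
}

// Scalar model of do_stuff() applied to a single input value.
// std::nearbyint rounds using the current rounding mode, like vrndi_f64.
inline double model_do_stuff(double in) {
    const double rounded = std::nearbyint(in);
    const std::int64_t asInt = model_fcvtzs(rounded);
    const std::int64_t shifted = model_sri(asInt, asInt, 1);
    return model_scvtf_fixed(shifted, 1);
}
```

Under this model, `model_do_stuff(5.0)` rounds to 5, shifts to 2, and converts with one fractional bit, so it returns 1.0. Because both SRI operands are the same value and the shift is 1, the sign bit is preserved and the step behaves like an arithmetic shift right by one.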
which in a Release build gives:
```
fatal error: error in backend: Cannot select: intrinsic %llvm.aarch64.neon.vcvtfxs2fp
```
[Full error log](https://github.com/llvm/llvm-project/files/8683239/2022_05_12_clang_arm_neon_backend_error.txt)
This can be reduced to the following IR:
(Godbolt link: https://godbolt.org/z/dGG6474P4 )
```
; Function Attrs: argmemonly mustprogress nofree nosync nounwind readonly willreturn uwtable
define dso_local noundef double @do_stuff(ptr nocapture noundef readnone %iVals, ptr nocapture noundef readnone %fVals, ptr nocapture noundef readonly %dVals) local_unnamed_addr #0 {
entry:
  %arrayidx = getelementptr inbounds double, ptr %dVals, i64 16
  %0 = load <1 x double>, ptr %arrayidx, align 8
  %vrndi_v1.i = call <1 x double> @llvm.nearbyint.v1f64(<1 x double> %0) #3
  %vget_lane = extractelement <1 x double> %vrndi_v1.i, i64 0
  %vcvtd_s64_f64.i = call i64 @llvm.aarch64.neon.fcvtzs.i64.f64(double %vget_lane) #3
  %1 = insertelement <1 x i64> poison, i64 %vcvtd_s64_f64.i, i64 0
  %vsrid_n_s647 = call <1 x i64> @llvm.aarch64.neon.vsri.v1i64(<1 x i64> %1, <1 x i64> %1, i32 1)
  %2 = extractelement <1 x i64> %vsrid_n_s647, i64 0
  %vcvtd_n_f64_s64 = call double @llvm.aarch64.neon.vcvtfxs2fp.f64.i64(i64 %2, i32 1)
  ret double %vcvtd_n_f64_s64
}
; Function Attrs: mustprogress nocallback nofree nosync nounwind readnone willreturn
declare <1 x i64> @llvm.aarch64.neon.vsri.v1i64(<1 x i64>, <1 x i64>, i32) #1
; Function Attrs: mustprogress nocallback nofree nosync nounwind readnone willreturn
declare double @llvm.aarch64.neon.vcvtfxs2fp.f64.i64(i64, i32) #1
; Function Attrs: mustprogress nocallback nofree nosync nounwind readnone speculatable willreturn
declare <1 x double> @llvm.nearbyint.v1f64(<1 x double>) #2
; Function Attrs: mustprogress nocallback nofree nosync nounwind readnone willreturn
declare i64 @llvm.aarch64.neon.fcvtzs.i64.f64(double) #1
```
The error about the vcvtfxs2fp intrinsic seems to be a downstream symptom, since running llc in Debug mode hits the following assertion earlier:
```
Assertion failed: Vec.getValueSizeInBits() == 128 && "unexpected vector size on extract_vector_elt!", file llvm-project\llvm\lib\Target\AArch64\AArch64ISelLowering.cpp, line 15025
```
at
```
> llc.exe!tryCombineFixedPointConvert(llvm::SDNode * N, llvm::TargetLowering::DAGCombinerInfo & DCI, llvm::SelectionDAG & DAG) Line 15024 C++
llc.exe!performIntrinsicCombine(llvm::SDNode * N, llvm::TargetLowering::DAGCombinerInfo & DCI, const llvm::AArch64Subtarget * Subtarget) Line 15999 C++
llc.exe!llvm::AArch64TargetLowering::PerformDAGCombine(llvm::SDNode * N, llvm::TargetLowering::DAGCombinerInfo & DCI) Line 18773 C++
llc.exe!`anonymous namespace'::DAGCombiner::combine(llvm::SDNode * N) Line 1787 C++
llc.exe!`anonymous namespace'::DAGCombiner::Run(llvm::CombineLevel AtLevel) Line 1574 C++
llc.exe!llvm::SelectionDAG::Combine(llvm::CombineLevel Level, llvm::AAResults * AA, llvm::CodeGenOpt::Level OptLevel) Line 24699 C++
llc.exe!llvm::SelectionDAGISel::CodeGenAndEmitDAG() Line 917 C++
```
The DAG at that point:
```
SelectionDAG has 19 nodes:
t8: v1i64 = BUILD_VECTOR Constant:i64<0>
t0: ch = EntryToken
t19: f64 = fnearbyint ConstantFP:f64<0.000000e+00>
t20: v1f64 = BUILD_VECTOR t19
t5: f64 = extract_vector_elt t20, Constant:i64<0>
t7: i64 = llvm.aarch64.neon.fcvtzs TargetConstant:i64<474>, t5
t9: v1i64 = insert_vector_elt t8, t7, Constant:i64<0>
t12: v1i64 = llvm.aarch64.neon.vsri TargetConstant:i64<630>, t8, t9, Constant:i32<1>
t13: i64 = extract_vector_elt t12, Constant:i64<0>
t15: f64 = llvm.aarch64.neon.vcvtfxs2fp TargetConstant:i64<626>, t13, Constant:i32<1>
t17: ch,glue = CopyToReg t0, Register:f64 $d0, t15
t18: ch = AArch64ISD::RET_FLAG t17, Register:f64 $d0, t17:1
```
This does initially appear to be invalid: the vector operand of the extract_vector_elt at t13 is t12, a v1i64, which is only 64 bits wide, whereas the assertion expects a 128-bit vector.
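The size mismatch can be spelled out with a small stand-alone sketch. `VecType` and `passesSizeCheck` are hypothetical names used only for illustration; they are not LLVM's types or API, and the check merely mirrors the condition quoted in the assertion message.

```cpp
// Hypothetical stand-in for a fixed-width vector type (not LLVM's EVT/MVT).
struct VecType {
    unsigned numElts;  // number of lanes
    unsigned eltBits;  // width of each lane in bits
    constexpr unsigned sizeInBits() const { return numElts * eltBits; }
};

// Mirrors the asserted condition: the extracted-from vector must be 128 bits.
constexpr bool passesSizeCheck(VecType t) { return t.sizeInBits() == 128; }

constexpr VecType v1i64{1, 64};  // type of t12 in the DAG dump
constexpr VecType v2i64{2, 64};  // a full 128-bit Neon vector, for comparison

static_assert(v1i64.sizeInBits() == 64, "v1i64 is one 64-bit lane");
static_assert(!passesSizeCheck(v1i64), "a v1i64 operand fails the 128-bit check");
static_assert(passesSizeCheck(v2i64), "a v2i64 operand satisfies the check");
```

In other words, a single-lane 64-bit vector such as the one produced by the vsri node falls outside the 128-bit case that the assertion assumes.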
I have verified that this still repros on the latest trunk (b1aed14bfea07508e4b9d864168c1ae6b5b5c665).
For context: this code was produced by a fuzzer to test codegen; it was not manually written.