Issue 55417
Summary [AArch64] Crash in aarch64 backend when compiling vsri/vcvtfxs2fp intrinsics in certain pattern
Labels
Assignees
Reporter Benjins
    Using ARM Neon intrinsics in a certain pattern crashes the AArch64 backend: a Release build fails with a "Cannot select" error on what appears to be an invalid DAG, and a Debug build hits an assertion earlier, during DAG combining.

Reduced version of the original C++ that triggers the issue when compiled with clang for ARMv8 at -O1:
(Godbolt link: https://godbolt.org/z/MEvPj1EWf )
```
#include <arm_neon.h>
float64_t do_stuff(const double* dVals) {
	float64x1_t var0 = vld1_f64((const float64_t*)&dVals[0]);
	float64x1_t var1 = vrndi_f64(var0);
	float64_t var2 = vget_lane_f64(var1, 0);
	int64_t var3 = vcvtd_s64_f64(var2);
	int64_t var4 = vsrid_n_s64(var3, var3, 1);
	float64_t var5 = vcvtd_n_f64_s64(var4, 1);
	return var5;
}
```
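
For reference, the value this function would be expected to return if it compiled can be sketched in portable C++. This is only a scalar model, not part of the reproducer: the `model_*` names are hypothetical, and the semantics are assumed from the Arm descriptions of FRINTI, FCVTZS, SRI, and fixed-point SCVTF (default rounding mode, no floating-point exception behaviour modelled).

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Assumed model of vcvtd_s64_f64 (FCVTZS): truncate toward zero,
// saturate on overflow, and map NaN to 0.
int64_t model_fcvtzs(double x) {
    if (std::isnan(x)) return 0;
    if (x >= 9223372036854775808.0) return std::numeric_limits<int64_t>::max();
    if (x <= -9223372036854775808.0) return std::numeric_limits<int64_t>::min();
    return static_cast<int64_t>(x);
}

// Assumed model of vsrid_n_s64 (SRI): shift b right by n (logically) and
// insert it into a, keeping the top n bits of a. Valid for n in 1..64.
int64_t model_sri(int64_t a, int64_t b, int n) {
    const uint64_t ua = static_cast<uint64_t>(a);
    const uint64_t ub = static_cast<uint64_t>(b);
    // Shifting a 64-bit value by 64 is undefined in C++, so handle it apart.
    const uint64_t low_mask = (n >= 64) ? 0 : (~uint64_t{0} >> n);
    const uint64_t shifted = (n >= 64) ? 0 : (ub >> n);
    return static_cast<int64_t>((ua & ~low_mask) | shifted);
}

// Assumed model of vcvtd_n_f64_s64 (SCVTF with n fractional bits): x / 2^n.
double model_scvtf_fixed(int64_t x, int n) {
    return std::ldexp(static_cast<double>(x), -n);
}

// Mirrors the intrinsic sequence in do_stuff for a single input value.
double model_do_stuff(double in) {
    const double rounded = std::nearbyint(in);        // vrndi_f64
    const int64_t as_int = model_fcvtzs(rounded);     // vcvtd_s64_f64
    const int64_t sri = model_sri(as_int, as_int, 1); // vsrid_n_s64
    return model_scvtf_fixed(sri, 1);                 // vcvtd_n_f64_s64
}
```

Under these assumptions, because both SRI operands are the same value and the shift is 1, the SRI step behaves like an arithmetic shift right by one, so an input of 6.0 would yield 1.5 and an input of -6.0 would yield -1.5.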

With a Release build of clang, this gives:
```
fatal error: error in backend: Cannot select: intrinsic %llvm.aarch64.neon.vcvtfxs2fp
```

[Full error log](https://github.com/llvm/llvm-project/files/8683239/2022_05_12_clang_arm_neon_backend_error.txt)

This can be reduced to the following IR:
(Godbolt link: https://godbolt.org/z/dGG6474P4 )
```
; Function Attrs: argmemonly mustprogress nofree nosync nounwind readonly willreturn uwtable
define dso_local noundef double @do_stuff(ptr nocapture noundef readnone %iVals, ptr nocapture noundef readnone %fVals, ptr nocapture noundef readonly %dVals) local_unnamed_addr #0 {
entry:
  %arrayidx = getelementptr inbounds double, ptr %dVals, i64 16
  %0 = load <1 x double>, ptr %arrayidx, align 8
  %vrndi_v1.i = call <1 x double> @llvm.nearbyint.v1f64(<1 x double> %0) #3
  %vget_lane = extractelement <1 x double> %vrndi_v1.i, i64 0
  %vcvtd_s64_f64.i = call i64 @llvm.aarch64.neon.fcvtzs.i64.f64(double %vget_lane) #3
  %1 = insertelement <1 x i64> poison, i64 %vcvtd_s64_f64.i, i64 0
  %vsrid_n_s647 = call <1 x i64> @llvm.aarch64.neon.vsri.v1i64(<1 x i64> %1, <1 x i64> %1, i32 1)
  %2 = extractelement <1 x i64> %vsrid_n_s647, i64 0
  %vcvtd_n_f64_s64 = call double @llvm.aarch64.neon.vcvtfxs2fp.f64.i64(i64 %2, i32 1)
  ret double %vcvtd_n_f64_s64
}

; Function Attrs: mustprogress nocallback nofree nosync nounwind readnone willreturn
declare <1 x i64> @llvm.aarch64.neon.vsri.v1i64(<1 x i64>, <1 x i64>, i32) #1

; Function Attrs: mustprogress nocallback nofree nosync nounwind readnone willreturn
declare double @llvm.aarch64.neon.vcvtfxs2fp.f64.i64(i64, i32) #1

; Function Attrs: mustprogress nocallback nofree nosync nounwind readnone speculatable willreturn
declare <1 x double> @llvm.nearbyint.v1f64(<1 x double>) #2

; Function Attrs: mustprogress nocallback nofree nosync nounwind readnone willreturn
declare i64 @llvm.aarch64.neon.fcvtzs.i64.f64(double) #1
```

The "Cannot select" error for the vcvtfxs2fp intrinsic seems to be a downstream symptom, since running llc from a Debug build hits the following assertion first:
```
Assertion failed: Vec.getValueSizeInBits() == 128 && "unexpected vector size on extract_vector_elt!", file llvm-project\llvm\lib\Target\AArch64\AArch64ISelLowering.cpp, line 15025
```
with this call stack:
```
>	llc.exe!tryCombineFixedPointConvert(llvm::SDNode * N, llvm::TargetLowering::DAGCombinerInfo & DCI, llvm::SelectionDAG & DAG) Line 15024	C++
 	llc.exe!performIntrinsicCombine(llvm::SDNode * N, llvm::TargetLowering::DAGCombinerInfo & DCI, const llvm::AArch64Subtarget * Subtarget) Line 15999	C++
 	llc.exe!llvm::AArch64TargetLowering::PerformDAGCombine(llvm::SDNode * N, llvm::TargetLowering::DAGCombinerInfo & DCI) Line 18773	C++
 	llc.exe!`anonymous namespace'::DAGCombiner::combine(llvm::SDNode * N) Line 1787	C++
 	llc.exe!`anonymous namespace'::DAGCombiner::Run(llvm::CombineLevel AtLevel) Line 1574	C++
 	llc.exe!llvm::SelectionDAG::Combine(llvm::CombineLevel Level, llvm::AAResults * AA, llvm::CodeGenOpt::Level OptLevel) Line 24699	C++
 	llc.exe!llvm::SelectionDAGISel::CodeGenAndEmitDAG() Line 917	C++
```

The DAG at that point:
```
SelectionDAG has 19 nodes:
  t8: v1i64 = BUILD_VECTOR Constant:i64<0>
    t0: ch = EntryToken
                  t19: f64 = fnearbyint ConstantFP:f64<0.000000e+00>
                t20: v1f64 = BUILD_VECTOR t19
              t5: f64 = extract_vector_elt t20, Constant:i64<0>
            t7: i64 = llvm.aarch64.neon.fcvtzs TargetConstant:i64<474>, t5
          t9: v1i64 = insert_vector_elt t8, t7, Constant:i64<0>
        t12: v1i64 = llvm.aarch64.neon.vsri TargetConstant:i64<630>, t8, t9, Constant:i32<1>
      t13: i64 = extract_vector_elt t12, Constant:i64<0>
    t15: f64 = llvm.aarch64.neon.vcvtfxs2fp TargetConstant:i64<626>, t13, Constant:i32<1>
  t17: ch,glue = CopyToReg t0, Register:f64 $d0, t15
  t18: ch = AArch64ISD::RET_FLAG t17, Register:f64 $d0, t17:1
```
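At first glance this does appear to be the case the assertion rejects: the vector operand of extract_vector_elt (t12, of type v1i64) is only 64 bits wide, whereas tryCombineFixedPointConvert asserts that it is 128 bits.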

I have verified that this still repros on the latest trunk (b1aed14bfea07508e4b9d864168c1ae6b5b5c665)

For context: this code was produced by a fuzzer to test codegen; it was not written by hand.

_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
