[PATCH] D70253: [AArch64][SVE2] Implement remaining SVE2 floating-point intrinsics
kmclaughlin added inline comments.

Comment at: llvm/test/CodeGen/AArch64/sve2-intrinsics-fp-int-binary-logarithm.ll:31
+; CHECK-NEXT: ret
+  %out = call <vscale x 2 x i64> @llvm.aarch64.sve.flogb.nxv2f64(<vscale x 2 x i64> %a,
+                                                                 <vscale x 2 x i1> %pg,

Allen wrote:
> hi, kmclaughlin:
> Sorry for the naive question:
> flogb is a unary instruction, as shown in the assembly. Why do we need %a as an **input** operand in the intrinsic? Could it be similar to
> ```
> %a = call <vscale x 2 x i64> @llvm.aarch64.sve.flogb.nxv2f64(<vscale x 2 x i1> %pg, <vscale x 2 x double> %b)
> ```

Hi @Allen,
The first input to this intrinsic is the passthru, which contains the values used for inactive lanes of the predicate `%pg`. Depending on the passthru supplied, the inactive lanes can be set to zero, merged with a separate vector, or left unknown.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70253/new/

https://reviews.llvm.org/D70253

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
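The three passthru choices mentioned in this reply can be sketched in LLVM IR as follows. This is only an illustration, not part of the patch: the function names are hypothetical, and the declaration assumes the operand order used in the test under review (passthru, predicate, source) with an integer vector result.

```llvm
; Assumed declaration: passthru, governing predicate, floating-point source.
declare <vscale x 2 x i64> @llvm.aarch64.sve.flogb.nxv2f64(<vscale x 2 x i64>, <vscale x 2 x i1>, <vscale x 2 x double>)

; Merging: inactive lanes of the result take their values from %inactive.
define <vscale x 2 x i64> @flogb_merge(<vscale x 2 x i64> %inactive, <vscale x 2 x i1> %pg, <vscale x 2 x double> %b) {
  %out = call <vscale x 2 x i64> @llvm.aarch64.sve.flogb.nxv2f64(<vscale x 2 x i64> %inactive, <vscale x 2 x i1> %pg, <vscale x 2 x double> %b)
  ret <vscale x 2 x i64> %out
}

; Zeroing: an all-zero passthru makes the inactive lanes zero.
define <vscale x 2 x i64> @flogb_zero(<vscale x 2 x i1> %pg, <vscale x 2 x double> %b) {
  %out = call <vscale x 2 x i64> @llvm.aarch64.sve.flogb.nxv2f64(<vscale x 2 x i64> zeroinitializer, <vscale x 2 x i1> %pg, <vscale x 2 x double> %b)
  ret <vscale x 2 x i64> %out
}

; Unknown: an undef passthru says the caller does not care about inactive lanes.
define <vscale x 2 x i64> @flogb_dontcare(<vscale x 2 x i1> %pg, <vscale x 2 x double> %b) {
  %out = call <vscale x 2 x i64> @llvm.aarch64.sve.flogb.nxv2f64(<vscale x 2 x i64> undef, <vscale x 2 x i1> %pg, <vscale x 2 x double> %b)
  ret <vscale x 2 x i64> %out
}
```

Keeping the passthru as an explicit operand lets a single intrinsic cover all three forms, which is why the call takes three operands even though the instruction itself is unary.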
[PATCH] D70253: [AArch64][SVE2] Implement remaining SVE2 floating-point intrinsics
Allen added inline comments.
Herald added a project: All.

Comment at: llvm/test/CodeGen/AArch64/sve2-intrinsics-fp-int-binary-logarithm.ll:31
+; CHECK-NEXT: ret
+  %out = call <vscale x 2 x i64> @llvm.aarch64.sve.flogb.nxv2f64(<vscale x 2 x i64> %a,
+                                                                 <vscale x 2 x i1> %pg,

hi, kmclaughlin:
Sorry for the naive question:
flogb is a unary instruction, as shown in the assembly. Why do we need %a as an **input** operand in the intrinsic? Could it be similar to
```
%a = call <vscale x 2 x i64> @llvm.aarch64.sve.flogb.nxv2f64(<vscale x 2 x i1> %pg, <vscale x 2 x double> %b)
```
[PATCH] D70253: [AArch64][SVE2] Implement remaining SVE2 floating-point intrinsics
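This revision was automatically updated to reflect the committed changes.
Closed by commit rG8881ac9c3986: [AArch64][SVE2] Implement remaining SVE2 floating-point intrinsics (authored by kmclaughlin).

Changed prior to commit:
  https://reviews.llvm.org/D70253?vs=229341&id=231886#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70253/new/

https://reviews.llvm.org/D70253

Files:
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/lib/Target/AArch64/SVEInstrFormats.td
  llvm/test/CodeGen/AArch64/sve2-intrinsics-fp-int-binary-logarithm.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-fp-widening-mul-acc.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-non-widening-pairwise-arith.ll

Index: llvm/test/CodeGen/AArch64/sve2-intrinsics-non-widening-pairwise-arith.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AArch64/sve2-intrinsics-non-widening-pairwise-arith.ll
@@ -0,0 +1,191 @@
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve2 < %s | FileCheck %s
+
+;
+; FADDP
+;
+
+define <vscale x 8 x half> @faddp_f16(<vscale x 8 x i1> %pg, <vscale x 8 x half> %a, <vscale x 8 x half> %b) {
+; CHECK-LABEL: faddp_f16:
+; CHECK: faddp z0.h, p0/m, z0.h, z1.h
+; CHECK-NEXT: ret
+  %out = call <vscale x 8 x half> @llvm.aarch64.sve.faddp.nxv8f16(<vscale x 8 x i1> %pg,
+                                                                  <vscale x 8 x half> %a,
+                                                                  <vscale x 8 x half> %b)
+  ret <vscale x 8 x half> %out
+}
+
+define <vscale x 4 x float> @faddp_f32(<vscale x 4 x i1> %pg, <vscale x 4 x float> %a, <vscale x 4 x float> %b) {
+; CHECK-LABEL: faddp_f32:
+; CHECK: faddp z0.s, p0/m, z0.s, z1.s
+; CHECK-NEXT: ret
+  %out = call <vscale x 4 x float> @llvm.aarch64.sve.faddp.nxv4f32(<vscale x 4 x i1> %pg,
+                                                                   <vscale x 4 x float> %a,
+                                                                   <vscale x 4 x float> %b)
+  ret <vscale x 4 x float> %out
+}
+
+define <vscale x 2 x double> @faddp_f64(<vscale x 2 x i1> %pg, <vscale x 2 x double> %a, <vscale x 2 x double> %b) {
+; CHECK-LABEL: faddp_f64:
+; CHECK: faddp z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+  %out = call <vscale x 2 x double> @llvm.aarch64.sve.faddp.nxv2f64(<vscale x 2 x i1> %pg,
+                                                                    <vscale x 2 x double> %a,
+                                                                    <vscale x 2 x double> %b)
+  ret <vscale x 2 x double> %out
+}
+
+;
+; FMAXP
+;
+
+define <vscale x 8 x half> @fmaxp_f16(<vscale x 8 x i1> %pg, <vscale x 8 x half> %a, <vscale x 8 x half> %b) {
+; CHECK-LABEL: fmaxp_f16:
+; CHECK: fmaxp z0.h, p0/m, z0.h, z1.h
+; CHECK-NEXT: ret
+  %out = call <vscale x 8 x half> @llvm.aarch64.sve.fmaxp.nxv8f16(<vscale x 8 x i1> %pg,
+                                                                  <vscale x 8 x half> %a,
+                                                                  <vscale x 8 x half> %b)
+  ret <vscale x 8 x half> %out
+}
+
+define <vscale x 4 x float> @fmaxp_f32(<vscale x 4 x i1> %pg, <vscale x 4 x float> %a, <vscale x 4 x float> %b) {
+; CHECK-LABEL: fmaxp_f32:
+; CHECK: fmaxp z0.s, p0/m, z0.s, z1.s
+; CHECK-NEXT: ret
+  %out = call <vscale x 4 x float> @llvm.aarch64.sve.fmaxp.nxv4f32(<vscale x 4 x i1> %pg,
+                                                                   <vscale x 4 x float> %a,
+                                                                   <vscale x 4 x float> %b)
+  ret <vscale x 4 x float> %out
+}
+
+define <vscale x 2 x double> @fmaxp_f64(<vscale x 2 x i1> %pg, <vscale x 2 x double> %a, <vscale x 2 x double> %b) {
+; CHECK-LABEL: fmaxp_f64:
+; CHECK: fmaxp z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+  %out = call <vscale x 2 x double> @llvm.aarch64.sve.fmaxp.nxv2f64(<vscale x 2 x i1> %pg,
+                                                                    <vscale x 2 x double> %a,
+                                                                    <vscale x 2 x double> %b)
+  ret <vscale x 2 x double> %out
+}
+
+;
+; FMAXNMP
+;
+
+define <vscale x 8 x half> @fmaxnmp_f16(<vscale x 8 x i1> %pg, <vscale x 8 x half> %a, <vscale x 8 x half> %b) {
+; CHECK-LABEL: fmaxnmp_f16:
+; CHECK: fmaxnmp z0.h, p0/m, z0.h, z1.h
+; CHECK-NEXT: ret
+  %out = call <vscale x 8 x half> @llvm.aarch64.sve.fmaxnmp.nxv8f16(<vscale x 8 x i1> %pg,
+                                                                    <vscale x 8 x half> %a,
+                                                                    <vscale x 8 x half> %b)
+  ret <vscale x 8 x half> %out
+}
+
+define <vscale x 4 x float> @fmaxnmp_f32(<vscale x 4 x i1> %pg, <vscale x 4 x float> %a, <vscale x 4 x float> %b) {
+; CHECK-LABEL: fmaxnmp_f32:
+; CHECK: fmaxnmp z0.s, p0/m, z0.s, z1.s
+; CHECK-NEXT: ret
+  %out = call <vscale x 4 x float> @llvm.aarch64.sve.fmaxnmp.nxv4f32(<vscale x 4 x i1> %pg,
+                                                                     <vscale x 4 x float> %a,
+                                                                     <vscale x 4 x float> %b)
+  ret <vscale x 4 x float> %out
+}
+
+define <vscale x 2 x double> @fmaxnmp_f64(<vscale x 2 x i1> %pg, <vscale x 2 x double> %a, <vscale x 2 x double> %b) {
+; CHECK-LABEL: fmaxnmp_f64:
+; CHECK: fmaxnmp z0.d, p0/m, z0.d, z1.d
+; CHECK-NEXT: ret
+  %out = call <vscale x 2 x double> @llvm.aarch64.sve.fmaxnmp.nxv2f64(<vscale x 2 x i1> %pg,
+                                                                      <vscale x 2 x double> %a,
+                                                                      <vscale x 2 x double> %b)
+  ret <vscale x 2 x double> %out
+}
+
+;
+; FMINP
+;
+
+define <vscale x 8 x half> @fminp_f16(<vscale x 8 x i1> %pg, <vscale x 8 x half> %a, <vscale x 8 x half> %b) {
+; CHECK-LABEL: fminp_f16:
+; CHECK: fminp z0.h, p0/m, z0.h, z1.h
+; CHECK-NEXT: ret
+  %out = call <vscale x 8 x half> @llvm.aarch64.sve.fminp.nxv8f16(<vscale x 8 x i1> %pg,
+                                                                  <vscale x 8 x half> %a,
+                                                                  <vscale x 8 x half> %b)
+  ret <vscale x 8 x half> %out
+}
+
+define <vscale x 4 x float> @fminp_f32(<vscale x 4 x i1> %pg, <vscale x 4 x float> %a, <vscale x 4 x float> %b) {
+; CHECK-LABEL: fminp_f32:
+; CHECK: fminp z0.s, p0/m, z0.s, z1.s
+; CHECK-NEXT: ret
+  %out = call <vscale x 4 x float> @llvm.aarch64.sve.fminp.nxv4f32(<vscale x 4 x i1> %pg,
+                                                                   <vscale x 4 x float> %a,
+                                                                   <vscale x 4 x float> %b)
+  ret <vscale x 4 x float> %out
+}
+
+define <vscale x 2 x double> @fminp_f64(<vscale x 2 x i1> %pg, <vscale x 2 x double> %a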
[PATCH] D70253: [AArch64][SVE2] Implement remaining SVE2 floating-point intrinsics
sdesmalen accepted this revision.
sdesmalen added a comment.
This revision is now accepted and ready to land.

Thanks @kmclaughlin, LGTM.

Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:898
+ llvm_i32_ty],
+ [IntrNoMem]>;
+

kmclaughlin wrote:
> sdesmalen wrote:
> > efriedma wrote:
> > > kmclaughlin wrote:
> > > > sdesmalen wrote:
> > > > > I'd expect the `llvm_i32_ty` to be an immediate for these instructions, right? If so you'll need to add `ImmArg` to the list of properties.
> > > > Thanks for taking a look at this :) I tried your suggestion of adding `ImmArg` to the list of properties here but had some problems with it (i.e. Cannot select: intrinsic %llvm.aarch64.sve.fmlalb.lane). I don't think this is too much of an issue here as we have additional checks on the immediate with VectorIndexH32b, which ensures the immediate is in the correct range.
> > > The point of immarg markings isn't to assist the backend; it's to ensure IR optimizations don't break your intrinsic calls.
> > The pattern is probably not matching because the immediate operand is a `TargetConstant` where the `AsmVectorIndexOpnd` derives from `ImmLeaf`, rather than `TImmLeaf` as introduced by D58232.
> Thanks for the suggestion, this was the reason why the patterns were not matching! As this also affects many of the existing intrinsics not added here or in D70437, I would prefer to address this fully in a separate patch - do you have objections to this?

Okay, I'm happy for you to make that change in a separate patch. It will also be needed for several of the other SVE intrinsics.
[PATCH] D70253: [AArch64][SVE2] Implement remaining SVE2 floating-point intrinsics
kmclaughlin added inline comments.

Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:898
+ llvm_i32_ty],
+ [IntrNoMem]>;
+

sdesmalen wrote:
> efriedma wrote:
> > kmclaughlin wrote:
> > > sdesmalen wrote:
> > > > I'd expect the `llvm_i32_ty` to be an immediate for these instructions, right? If so you'll need to add `ImmArg` to the list of properties.
> > > Thanks for taking a look at this :) I tried your suggestion of adding `ImmArg` to the list of properties here but had some problems with it (i.e. Cannot select: intrinsic %llvm.aarch64.sve.fmlalb.lane). I don't think this is too much of an issue here as we have additional checks on the immediate with VectorIndexH32b, which ensures the immediate is in the correct range.
> > The point of immarg markings isn't to assist the backend; it's to ensure IR optimizations don't break your intrinsic calls.
> The pattern is probably not matching because the immediate operand is a `TargetConstant` where the `AsmVectorIndexOpnd` derives from `ImmLeaf`, rather than `TImmLeaf` as introduced by D58232.

Thanks for the suggestion, this was the reason why the patterns were not matching! As this also affects many of the existing intrinsics not added here or in D70437, I would prefer to address this fully in a separate patch. Do you have any objections to this?
[PATCH] D70253: [AArch64][SVE2] Implement remaining SVE2 floating-point intrinsics
sdesmalen added inline comments.
Herald added a reviewer: efriedma.

Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:898
+ llvm_i32_ty],
+ [IntrNoMem]>;
+

efriedma wrote:
> kmclaughlin wrote:
> > sdesmalen wrote:
> > > I'd expect the `llvm_i32_ty` to be an immediate for these instructions, right? If so you'll need to add `ImmArg` to the list of properties.
> > Thanks for taking a look at this :) I tried your suggestion of adding `ImmArg` to the list of properties here but had some problems with it (i.e. Cannot select: intrinsic %llvm.aarch64.sve.fmlalb.lane). I don't think this is too much of an issue here as we have additional checks on the immediate with VectorIndexH32b, which ensures the immediate is in the correct range.
> The point of immarg markings isn't to assist the backend; it's to ensure IR optimizations don't break your intrinsic calls.

The pattern is probably not matching because the immediate operand is a `TargetConstant`, whereas the `AsmVectorIndexOpnd` derives from `ImmLeaf` rather than from `TImmLeaf` as introduced by D58232.
[PATCH] D70253: [AArch64][SVE2] Implement remaining SVE2 floating-point intrinsics
efriedma added inline comments.

Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:898
+ llvm_i32_ty],
+ [IntrNoMem]>;
+

kmclaughlin wrote:
> sdesmalen wrote:
> > I'd expect the `llvm_i32_ty` to be an immediate for these instructions, right? If so you'll need to add `ImmArg` to the list of properties.
> Thanks for taking a look at this :) I tried your suggestion of adding `ImmArg` to the list of properties here but had some problems with it (i.e. Cannot select: intrinsic %llvm.aarch64.sve.fmlalb.lane). I don't think this is too much of an issue here as we have additional checks on the immediate with VectorIndexH32b, which ensures the immediate is in the correct range.

The point of immarg markings isn't to assist the backend; it's to ensure IR optimizations don't break your intrinsic calls.
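The concern about IR optimizations can be made concrete with a small sketch. It is an illustration only: the function name is hypothetical, and the `.nxv4f32` suffix and operand types are assumed from the intrinsic name quoted in the error message above.

```llvm
; Assumed signature: accumulator, two half-precision sources, lane index.
declare <vscale x 4 x float> @llvm.aarch64.sve.fmlalb.lane.nxv4f32(<vscale x 4 x float>, <vscale x 8 x half>, <vscale x 8 x half>, i32)

define <vscale x 4 x float> @fmlalb_lane0(<vscale x 4 x float> %acc, <vscale x 8 x half> %b, <vscale x 8 x half> %c) {
  ; The lane index is a literal constant, which the indexed instruction form requires.
  %out = call <vscale x 4 x float> @llvm.aarch64.sve.fmlalb.lane.nxv4f32(<vscale x 4 x float> %acc, <vscale x 8 x half> %b, <vscale x 8 x half> %c, i32 0)
  ret <vscale x 4 x float> %out
}

; If the index operand is not marked immarg, a transform may legally replace
; the constant with a computed value, for example after merging two calls:
;   %idx = select i1 %cond, i32 0, i32 1
;   %out = call ... @llvm.aarch64.sve.fmlalb.lane.nxv4f32(..., i32 %idx)
; Such a call has no lane-indexed encoding, so instruction selection fails.
; With immarg on that operand, the IR verifier rejects the rewritten call and
; optimization passes know not to create it.
```

A range check in the instruction selection pattern, such as the one `VectorIndexH32b` provides, only helps once a constant reaches the backend. It does nothing to stop a mid-level pass from turning the constant into a variable first.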
[PATCH] D70253: [AArch64][SVE2] Implement remaining SVE2 floating-point intrinsics
kmclaughlin added inline comments.

Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:898
+ llvm_i32_ty],
+ [IntrNoMem]>;
+

sdesmalen wrote:
> I'd expect the `llvm_i32_ty` to be an immediate for these instructions, right? If so you'll need to add `ImmArg` to the list of properties.

Thanks for taking a look at this :) I tried your suggestion of adding `ImmArg` to the list of properties here but had some problems with it (i.e. Cannot select: intrinsic %llvm.aarch64.sve.fmlalb.lane). I don't think this is too much of an issue here as we have additional checks on the immediate with VectorIndexH32b, which ensures the immediate is in the correct range.
[PATCH] D70253: [AArch64][SVE2] Implement remaining SVE2 floating-point intrinsics
sdesmalen added inline comments.

Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:898
+ llvm_i32_ty],
+ [IntrNoMem]>;
+

I'd expect the `llvm_i32_ty` to be an immediate for these instructions, right? If so you'll need to add `ImmArg` to the list of properties.
[PATCH] D70253: [AArch64][SVE2] Implement remaining SVE2 floating-point intrinsics
kmclaughlin created this revision.
kmclaughlin added reviewers: huntergr, sdesmalen, dancgr.
Herald added subscribers: hiraditya, kristof.beyls, tschuett.
Herald added a project: LLVM.

Adds the following intrinsics:
- faddp
- fmaxp, fminp, fmaxnmp & fminnmp
- fmlalb, fmlalt, fmlslb & fmlslt
- flogb

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D70253

Files:
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
  llvm/lib/Target/AArch64/SVEInstrFormats.td
  llvm/test/CodeGen/AArch64/sve2-intrinsics-fp-int-binary-logarithm.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-fp-widening-mul-acc.ll
  llvm/test/CodeGen/AArch64/sve2-intrinsics-non-widening-pairwise-arith.ll