Andrew Pinski <[email protected]> writes:
> __builtin_aarch64_im_lane_boundsi is known not to throw or call back into
> another
> function since it will either folded into an NOP or will produce a compiler
> error.
>
> This fixes the ICE by fixing the missed optimization. It does not fix the
> underlying
> issue with fold_marked_statements; which I filed as PR 117668.
>
> Built and tested for aarch64-linux-gnu.
>
> PR target/117665
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-builtins.cc
> (aarch64_init_simd_builtin_functions):
> Pass nothrow and leaf as attributes to aarch64_general_add_builtin for
> __builtin_aarch64_im_lane_boundsi.
>
> gcc/testsuite/ChangeLog:
>
> * g++.target/aarch64/lane-bound-1.C: New test.
> * gcc.target/aarch64/lane-bound-3.c: New test.
>
> Signed-off-by: Andrew Pinski <[email protected]>
Yeah, ok. I'm a bit nervous about making this builtin easier to optimise,
but the real fix for that of course is to do the checking in the frontend
(as for the new pragma-based approach). The C++ test is convincing that
(a) we do still emit the error for obvious dead code and (b) not marking
the function has a user-visible effect.
Thanks,
Richard
> ---
> gcc/config/aarch64/aarch64-builtins.cc | 6 ++++-
> .../g++.target/aarch64/lane-bound-1.C | 21 +++++++++++++++
> .../gcc.target/aarch64/lane-bound-3.c | 27 +++++++++++++++++++
> 3 files changed, 53 insertions(+), 1 deletion(-)
> create mode 100644 gcc/testsuite/g++.target/aarch64/lane-bound-1.C
> create mode 100644 gcc/testsuite/gcc.target/aarch64/lane-bound-3.c
>
> diff --git a/gcc/config/aarch64/aarch64-builtins.cc
> b/gcc/config/aarch64/aarch64-builtins.cc
> index b860e22f01f..e26ee323a2d 100644
> --- a/gcc/config/aarch64/aarch64-builtins.cc
> +++ b/gcc/config/aarch64/aarch64-builtins.cc
> @@ -1482,10 +1482,14 @@ aarch64_init_simd_builtin_functions (bool
> called_from_pragma)
> size_type_node,
> intSI_type_node,
> NULL);
> + /* aarch64_im_lane_boundsi should be leaf and nothrow as it
> + is expanded as nop or will cause an user error. */
> + tree attrs = aarch64_add_attribute ("nothrow", NULL_TREE);
> + attrs = aarch64_add_attribute ("leaf", attrs);
> aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_LANE_CHECK]
> = aarch64_general_add_builtin ("__builtin_aarch64_im_lane_boundsi",
> lane_check_fpr,
> - AARCH64_SIMD_BUILTIN_LANE_CHECK);
> + AARCH64_SIMD_BUILTIN_LANE_CHECK, attrs);
> }
>
> for (i = 0; i < ARRAY_SIZE (aarch64_simd_builtin_data); i++, fcode++)
> diff --git a/gcc/testsuite/g++.target/aarch64/lane-bound-1.C
> b/gcc/testsuite/g++.target/aarch64/lane-bound-1.C
> new file mode 100644
> index 00000000000..cb3e99816a1
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/aarch64/lane-bound-1.C
> @@ -0,0 +1,21 @@
> +// { dg-do compile }
> +// { dg-options "" }
> +#include <arm_neon.h>
> +
> +// vgetq_lane_u64 should not cause any
> +// exceptions to thrown so even at -O0
> +// removeme should have been removed.
> +void removeme()
> +__attribute__((error("nothrow")));
> +int _setjmp();
> +void hh(uint64x2_t c, int __b)
> +{
> + try {
> + vgetq_lane_u64(c, __b);
> + // { dg-error "must be a constant immediate" "" { target *-*-* } 0 }
> + } catch (...)
> + {
> + removeme(); // { dg-bogus "declared with attribute error" }
> + }
> +}
> +
> diff --git a/gcc/testsuite/gcc.target/aarch64/lane-bound-3.c
> b/gcc/testsuite/gcc.target/aarch64/lane-bound-3.c
> new file mode 100644
> index 00000000000..9e0dad372cb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/lane-bound-3.c
> @@ -0,0 +1,27 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +/* PR target/117665 */
> +/* __builtin_aarch64_im_lane_boundsi was causing an abnormal
> + edge to the setjmp but then the builtin was folded into a nop
> + and that edge was never removed but the edge was not needed in
> + the first place. */
> +
> +#include <arm_neon.h>
> +
> +__attribute__((always_inline))
> +static inline
> +void h(uint64x2_t c, int __b) {
> + /* Use vgetq_lane_u64 to get a
> + __builtin_aarch64_im_lane_boundsi */
> + vgetq_lane_u64(c, __b);
> +
> + __builtin_unreachable();
> +}
> +
> +int _setjmp();
> +void hh(uint64x2_t c) {
> + int __b = 0;
> + if (_setjmp())
> + h(c, 0);
> +}