On Tue, Nov 26, 2024 at 2:25 AM <[email protected]> wrote:
>
> From: Pan Li <[email protected]>
>
> There are some forms like below failed to recog the SAT_ADD
Some forms like the ones below failed to be recognized as SAT_ADD ...
> pattern for target i386. It is related to some match pattern
> extraction but get fixed after the refactor of the SAT_ADD
> pattern. Thus, add testcases to ensure we may have similar
> issue in futrue.
>
> #define DEF_SAT_ADD(T) \
> T sat_add_##T (T x, T y) \
> { \
> T res; \
> res = x + y; \
> res |= -(T)(res < x); \
> return res; \
> }
>
> #define VEC_DEF_SAT_ADD(T) \
> void vec_sat_add(T * restrict a, T * restrict b) \
> { \
> for (int i = 0; i < 8; i++) \
> b[i] = sat_add_##T (a[i], b[i]); \
> }
>
> DEF_SAT_ADD (uint32_t)
> VEC_DEF_SAT_ADD (uint32_t)
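
For readers following along, here is a minimal sketch of why the branchless form above saturates, instantiated by hand for uint32_t. The names sat_add_u32 and sat_add_u32_ref are hypothetical and not part of the patch; the reference version with an explicit comparison is there only for checking the idiom.

```c
#include <stdint.h>

/* Same branchless form as in the patch, written out for uint32_t.  */
uint32_t sat_add_u32 (uint32_t x, uint32_t y)
{
  uint32_t res = x + y;          /* Wraps modulo 2^32 on overflow.  */
  /* res < x holds only when the addition wrapped, so the mask is
     either 0 (no overflow) or all ones (overflow).  */
  res |= -(uint32_t)(res < x);
  return res;
}

/* Hypothetical reference with an explicit overflow check.  */
uint32_t sat_add_u32_ref (uint32_t x, uint32_t y)
{
  return (y > UINT32_MAX - x) ? UINT32_MAX : x + y;
}
```

Both functions should agree for all inputs; the first is the shape the middle-end is expected to match as .SAT_ADD.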
>
> The below test suites are passed for this patch.
> make -k check-gcc RUNTESTFLAGS="--target_board=unix\{,-m32\} i386.exp=pr112600-5a-*.c"
>
> PR target/112600
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr112600-5a-u16.c: New test.
> * gcc.target/i386/pr112600-5a-u32.c: New test.
> * gcc.target/i386/pr112600-5a-u64.c: New test.
> * gcc.target/i386/pr112600-5a-u8.c: New test.
> * gcc.target/i386/pr112600-5a.h: New test.
>
> Signed-off-by: Pan Li <[email protected]>
> ---
> .../gcc.target/i386/pr112600-5a-u16.c | 10 +++++++++
> .../gcc.target/i386/pr112600-5a-u32.c | 10 +++++++++
> .../gcc.target/i386/pr112600-5a-u64.c | 12 ++++++++++
> .../gcc.target/i386/pr112600-5a-u8.c | 11 ++++++++++
> gcc/testsuite/gcc.target/i386/pr112600-5a.h | 22 +++++++++++++++++++
> 5 files changed, 65 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/i386/pr112600-5a-u16.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr112600-5a-u32.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr112600-5a-u64.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr112600-5a-u8.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr112600-5a.h
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr112600-5a-u16.c b/gcc/testsuite/gcc.target/i386/pr112600-5a-u16.c
> new file mode 100644
> index 00000000000..8a4a3e4443c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr112600-5a-u16.c
> @@ -0,0 +1,10 @@
> +/* PR target/112600 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -msse2 -fdump-tree-optimized" } */
> +
> +#include "pr112600-5a.h"
> +
> +DEF_SAT_ADD (uint16_t)
> +VEC_DEF_SAT_ADD (uint16_t)
> +
> +/* { dg-final { scan-tree-dump-times ".SAT_ADD " 2 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.target/i386/pr112600-5a-u32.c b/gcc/testsuite/gcc.target/i386/pr112600-5a-u32.c
> new file mode 100644
> index 00000000000..3a35f4c9770
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr112600-5a-u32.c
> @@ -0,0 +1,10 @@
> +/* PR target/112600 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -msse2 -fdump-tree-optimized" } */
> +
> +#include "pr112600-5a.h"
> +
> +DEF_SAT_ADD (uint32_t)
> +VEC_DEF_SAT_ADD (uint32_t)
Remove the vector form, since we know it won't be recognized.
> +/* { dg-final { scan-tree-dump-times ".SAT_ADD " 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.target/i386/pr112600-5a-u64.c b/gcc/testsuite/gcc.target/i386/pr112600-5a-u64.c
> new file mode 100644
> index 00000000000..57d05d33fd7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr112600-5a-u64.c
> @@ -0,0 +1,12 @@
> +/* PR target/112600 */
> +/* { dg-do compile } */
Please use:
/* { dg-do compile { target { ! ia32 } } } */
to limit the testcase to 64-bit targets only.
> +/* { dg-options "-O2 -msse2 -fdump-tree-optimized" } */
> +
> +#include "pr112600-5a.h"
> +
> +DEF_SAT_ADD (uint64_t)
> +VEC_DEF_SAT_ADD (uint64_t)
Remove the vector form, since we know it won't be recognized.
> +
> +/* { dg-final { scan-tree-dump-times ".SAT_ADD " 0 "optimized" { target { any-opts { "-m32" } } } } } */
> +/* { dg-final { scan-tree-dump-times ".SAT_ADD " 2 "optimized" { target { no-opts { "-m32" } } } } } */
Why are there 2 instances detected for the 64-bit target? Only the scalar
form can be optimized, so I'd expect:
/* { dg-final { scan-tree-dump-times ".SAT_ADD " 1 "optimized" } } */
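
Putting the three suggestions for this file together, the test might end up looking roughly like the sketch below. This is an illustration only: the real test would keep including "pr112600-5a.h", and the macro is copied inline here just so the sketch compiles on its own. The expected count of 1 assumes that only the scalar form is recognized; it has not been verified against an actual dump.

```c
/* PR target/112600 */
/* { dg-do compile { target { ! ia32 } } } */
/* { dg-options "-O2 -msse2 -fdump-tree-optimized" } */

/* Assumption: inlined copy of DEF_SAT_ADD from pr112600-5a.h, used
   here only to keep the sketch self-contained.  */
#include <stdint.h>

#define DEF_SAT_ADD(T) \
T sat_add_##T (T x, T y) \
{ \
  T res; \
  res = x + y; \
  res |= -(T)(res < x); \
  return res; \
}

/* Scalar form only; the vector form is dropped as suggested above.  */
DEF_SAT_ADD (uint64_t)

/* { dg-final { scan-tree-dump-times ".SAT_ADD " 1 "optimized" } } */
```

With the ia32 restriction in place, the -m32 specific scan lines are no longer needed.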
> diff --git a/gcc/testsuite/gcc.target/i386/pr112600-5a-u8.c b/gcc/testsuite/gcc.target/i386/pr112600-5a-u8.c
> new file mode 100644
> index 00000000000..f8f224af730
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr112600-5a-u8.c
> @@ -0,0 +1,11 @@
> +/* PR target/112600 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -msse2 -fdump-tree-optimized" } */
> +
> +#include "pr112600-5a.h"
> +
> +DEF_SAT_ADD (uint8_t)
> +VEC_DEF_SAT_ADD (uint8_t)
> +
> +/* { dg-final { scan-tree-dump-times ".SAT_ADD " 3 "optimized" { target { any-opts { "-m32" } } } } } */
> +/* { dg-final { scan-tree-dump-times ".SAT_ADD " 2 "optimized" { target { no-opts { "-m32" } } } } } */
Why are the results different between 32-bit and 64-bit targets? The
results should be the same, because both the scalar and the vector
uint8_t forms can be optimized on both targets, similar to the uint16_t
case.
Uros.
> diff --git a/gcc/testsuite/gcc.target/i386/pr112600-5a.h b/gcc/testsuite/gcc.target/i386/pr112600-5a.h
> new file mode 100644
> index 00000000000..1e753695e81
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr112600-5a.h
> @@ -0,0 +1,22 @@
> +#ifndef HAVE_DEFINED_PR112600_5A_H
> +#define HAVE_DEFINED_PR112600_5A_H
> +
> +#include <stdint.h>
> +
> +#define DEF_SAT_ADD(T) \
> +T sat_add_##T (T x, T y) \
> +{ \
> + T res; \
> + res = x + y; \
> + res |= -(T)(res < x); \
> + return res; \
> +}
> +
> +#define VEC_DEF_SAT_ADD(T) \
> +void vec_sat_add(T * restrict a, T * restrict b) \
> +{ \
> + for (int i = 0; i < 8; i++) \
> + b[i] = sat_add_##T (a[i], b[i]); \
> +}
> +
> +#endif
> --
> 2.43.0
>