Re: [PATCH] amdgcn: Enable SIMD vectorization of math functions

Andrew Stubbs Wed, 01 Mar 2023 02:02:05 -0800

On 28/02/2023 23:01, Kwok Cheung Yeung wrote:

Hello
This patch implements the TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION target hook for the AMD GCN architecture, such that when vectorized, calls to builtin standard math functions such as asinf, exp, pow etc. are converted to calls to the recently added vectorized math functions for GCN in Newlib. The -fno-math-errno flag is required in addition to the usual vectorization optimization flags for this to occur, and some of the math functions (the larger double-precision ones) require a large stack size to function properly.
This patch requires the GCN vector math functions in Newlib to function - these were included in the recent 4.3.0.20230120 snapshot. As this was a minimum requirement starting from the patch 'amdgcn, libgomp: Manually allocated stacks', this should not be a problem.
I have added new testcases in the testsuite that compare the output of the vectorized math functions against the scalar, passing if they are sufficiently close. With the testcase for standalone GCN (without libgomp) in gcc.target/gcn/, there is a problem since gcn-run currently cannot set the stack size correctly in DejaGnu testing, so I have made it a compile test for now - it is still useful to check that calls to the correct functions are being made. The runtime correctness is still covered by the libgomp test.
Okay for trunk?


The main part of the patch is OK, with the small changes below.

Others have pointed out that "omp declare simd" exists, but you and I have been all through that verbally, long ago, and as Tobias says the offload compiler cannot rely on markup in the host compiler's header files to solve this problem.

@@ -7324,6 +7429,11 @@ gcn_dwarf_register_span (rtx rtl)
   gcn_simd_clone_compute_vecsize_and_simdlen
 #undef  TARGET_SIMD_CLONE_USABLE
 #define TARGET_SIMD_CLONE_USABLE gcn_simd_clone_usable
+#undef TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION
+#define TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION \
+  gcn_vectorize_builtin_vectorized_function
+#undef TARGET_LIBC_HAS_FUNCTION
+#define TARGET_LIBC_HAS_FUNCTION gcn_libc_has_function
 #undef  TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P
 #define TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P \
   gcn_small_register_classes_for_mode_p


Please keep these in alphabetical order.

+/* Ideally this test should be run, but the math routines require a large
+   stack and gcn-run currently does not respect the stack-size parameter.  */
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -fno-math-errno -mstack-size=3000000 -fdump-tree-vect" } */

This isn't ideal. The dg-set-target-env-var directive (I think this is it?) can set GCN_STACK_SIZE, which gcn-run does honour, but I realise that doesn't work with remote test targets (like ours).

I suggest adding an additional test that sets the envvar and #includes the code from this one; one test to scan the dumps, one test to run it. Like this .... (untested, syntax uncertain).


/* { dg-do run } */
/* { dg-options "-O2 -ftree-vectorize -fno-math-errno" } */
/* { dg-set-target-env-var "GCN_STACK_SIZE" "3000000" } */
#include "simd-math-1.c"

The run test will get skipped in our test environment (and anyone else using remote), but the libgomp test should make up for that.


Andrew
