On Wed, 19 Jan 2022, Jakub Jelinek wrote:

> Hi!
>
> power10 has modv4si3 expander and so vectorizes the following testcase
> where Fortran modulo is FLOOR_MOD_EXPR.
> optab_for_tree_code indicates that the optab for all the *_MOD_EXPR
> variants is umod_optab or smod_optab, but that isn't true, that optab
> actually expands just TRUNC_MOD_EXPR. For the other tree codes expmed.cc
> has code to adjust the TRUNC_MOD_EXPR into those by emitting some
> extra comparisons and conditional updates. Similarly for *_DIV_EXPR,
> except in that case it actually needs both division and modulo.
>
> While it would be possible to handle it in expmed.cc for vectors as well,
> we'd need to be sure all the vector operations we need for that are
> available, and furthermore we wouldn't account for that in the costing.
>
> So, IMHO it is better to stop pretending those non-truncating (and
> non-exact) div/mod operations have an optab. For GCC 13, we should
> IMHO pattern match these in tree-vect-patterns.cc and transform them
> to truncating div/mod with follow-up adjustments and let the vectorizer
> vectorize that. As written in the PR, for signed operands:
> r = x %[fl] y;
> is
> r = x % y; if (r && (x ^ y) < 0) r += y;
> and
> d = x /[fl] y;
> is
> r = x % y; d = x / y; if (r && (x ^ y) < 0) --d;
> and
> r = x %[cl] y;
> is
> r = x % y; if (r && (x ^ y) >= 0) r -= y;
> and
> d = x /[cl] y;
> is
> r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d;
> (too lazy to figure out rounding div/mod now). I'll create a PR
> for that.
> The patch also extends a match.pd optimization that floor_mod on
> unsigned operands is actually trunc_mod.
>
> Bootstrapped/regtested on powerpc64le-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2022-01-19 Jakub Jelinek <ja...@redhat.com>
>
>     PR middle-end/102860
>     * match.pd (x %[fl] y -> x % y): New simplification for
>     unsigned integral types.
>     * optabs-tree.cc (optab_for_tree_code): Return unknown_optab
>     for {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR with VECTOR_TYPE.
>
>     * gfortran.dg/pr102860.f90: New test.
>
> --- gcc/match.pd.jj 2022-01-19 00:00:57.876401009 +0100
> +++ gcc/match.pd 2022-01-19 10:46:56.710675088 +0100
> @@ -425,12 +425,15 @@ (define_operator_list SYNC_FETCH_AND_AND
>     { build_minus_one_cst (type); })))
>
>  /* For unsigned integral types, FLOOR_DIV_EXPR is the same as
> -   TRUNC_DIV_EXPR. Rewrite into the latter in this case. */
> -(simplify
> - (floor_div @0 @1)
> - (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
> -      && TYPE_UNSIGNED (type))
> -  (trunc_div @0 @1)))
> +   TRUNC_DIV_EXPR. Rewrite into the latter in this case. Similarly
> +   for MOD instead of DIV. */
> +(for floor_divmod (floor_div floor_mod)
> +     trunc_divmod (trunc_div trunc_mod)
> + (simplify
> +  (floor_divmod @0 @1)
> +  (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
> +       && TYPE_UNSIGNED (type))
> +   (trunc_divmod @0 @1))))
>
>  /* Combine two successive divisions. Note that combining ceil_div
>     and floor_div is trickier and combining round_div even more so. */
> --- gcc/optabs-tree.cc.jj 2022-01-18 11:58:59.740979799 +0100
> +++ gcc/optabs-tree.cc 2022-01-19 10:56:13.755956061 +0100
> @@ -56,17 +56,32 @@ optab_for_tree_code (enum tree_code code
>      case MULT_HIGHPART_EXPR:
>        return TYPE_UNSIGNED (type) ? umul_highpart_optab : smul_highpart_optab;
>
> -    case TRUNC_MOD_EXPR:
>      case CEIL_MOD_EXPR:
>      case FLOOR_MOD_EXPR:
>      case ROUND_MOD_EXPR:
> +      /* {s,u}mod_optab implements TRUNC_MOD_EXPR. For scalar modes,
> +         expansion has code to adjust TRUNC_MOD_EXPR into the desired other
> +         modes, but for vector modes it does not. The adjustment code
> +         should be instead emitted in tree-vect-patterns.cc. */
> +      if (TREE_CODE (type) == VECTOR_TYPE)
> +        return unknown_optab;
> +      /* FALLTHRU */
> +    case TRUNC_MOD_EXPR:
>        return TYPE_UNSIGNED (type) ? umod_optab : smod_optab;
>
> -    case RDIV_EXPR:
> -    case TRUNC_DIV_EXPR:
>      case CEIL_DIV_EXPR:
>      case FLOOR_DIV_EXPR:
>      case ROUND_DIV_EXPR:
> +      /* {,u}{s,u}div_optab implements {TRUNC,EXACT}_DIV_EXPR or RDIV_EXPR.
> +         For scalar modes, expansion has code to adjust TRUNC_DIV_EXPR
> +         into the desired other modes, but for vector modes it does not.
> +         The adjustment code should be instead emitted in
> +         tree-vect-patterns.cc. */
> +      if (TREE_CODE (type) == VECTOR_TYPE)
> +        return unknown_optab;
> +      /* FALLTHRU */
> +    case RDIV_EXPR:
> +    case TRUNC_DIV_EXPR:
>      case EXACT_DIV_EXPR:
>        if (TYPE_SATURATING (type))
>          return TYPE_UNSIGNED (type) ? usdiv_optab : ssdiv_optab;
> --- gcc/testsuite/gfortran.dg/pr102860.f90.jj 2022-01-19 10:46:56.712675060 +0100
> +++ gcc/testsuite/gfortran.dg/pr102860.f90 2022-01-19 10:46:56.712675060 +0100
> @@ -0,0 +1,10 @@
> +! PR middle-end/102860
> +! { dg-do compile { target { powerpc*-*-* } } }
> +! { dg-require-effective-target powerpc_vsx_ok }
> +! { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power10" } }
> +! { dg-options "-O2 -mcpu=power10" }
> +
> +function foo(a)
> +  integer(kind=4) :: a(1024)
> +  a(:) = modulo (a(:), 39)
> +end function
>
> Jakub
>

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
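
For reference, the signed floor/ceil identities quoted from the PR can be
sanity-checked in plain scalar C. The following is only a minimal sketch and
not part of the patch: the helper names (floor_mod, floor_div, ceil_mod,
ceil_div, count_mismatches) are made up for illustration, and it assumes
C99 truncating / and % on int with a nonzero divisor and no INT_MIN / -1
overflow. The checker tests the defining properties (the quotient and
remainder reconstruct x, and the remainder has the expected sign and
magnitude) rather than comparing against any GCC internals.

```c
#include <assert.h>

/* r = x %[fl] y, built from truncating %.  Adjust only when the
   remainder is nonzero and the operands have opposite signs.  */
int floor_mod(int x, int y)
{
    int r = x % y;
    if (r && (x ^ y) < 0)
        r += y;
    return r;
}

/* d = x /[fl] y, built from truncating / and %.  */
int floor_div(int x, int y)
{
    int r = x % y;
    int d = x / y;
    if (r && (x ^ y) < 0)
        --d;
    return d;
}

/* r = x %[cl] y: adjust when the remainder is nonzero and the
   operands have the same sign.  */
int ceil_mod(int x, int y)
{
    int r = x % y;
    if (r && (x ^ y) >= 0)
        r -= y;
    return r;
}

/* d = x /[cl] y.  */
int ceil_div(int x, int y)
{
    int r = x % y;
    int d = x / y;
    if (r && (x ^ y) >= 0)
        ++d;
    return d;
}

/* Hypothetical checker: count violations of the defining properties
   for all x, y in [lo, hi] with y != 0.  */
int count_mismatches(int lo, int hi)
{
    int bad = 0;
    for (int x = lo; x <= hi; x++)
        for (int y = lo; y <= hi; y++) {
            if (y == 0)
                continue;
            int ay = y < 0 ? -y : y;
            int fd = floor_div(x, y), fm = floor_mod(x, y);
            int cd = ceil_div(x, y), cm = ceil_mod(x, y);
            /* Quotient times divisor plus remainder gives back x.  */
            if (fd * y + fm != x) bad++;
            if (cd * y + cm != x) bad++;
            /* Remainders are smaller in magnitude than the divisor.  */
            if (fm <= -ay || fm >= ay) bad++;
            if (cm <= -ay || cm >= ay) bad++;
            /* Floor remainder is zero or has the sign of y.  */
            if (fm != 0 && ((fm < 0) != (y < 0))) bad++;
            /* Ceil remainder is zero or has the opposite sign of y.  */
            if (cm != 0 && ((cm < 0) == (y < 0))) bad++;
        }
    return bad;
}
```

With these helpers, floor_mod (-1, 39) gives 38, which is what the Fortran
modulo in the new test is expected to compute for a negative element,
whereas the truncating -1 % 39 gives -1.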