[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

2022-05-11 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #17 from CVS Commits  ---
The releases/gcc-9 branch has been updated by Jakub Jelinek
:

https://gcc.gnu.org/g:6d79958a50eb5419ebb9baa5ef880aabeef05467

commit r9-10157-g6d79958a50eb5419ebb9baa5ef880aabeef05467
Author: Jakub Jelinek 
Date:   Wed May 11 20:37:01 2022 +0200

testsuite: Fix up pr102860.f90 for gcc 9 [PR105570]

Apparently -mcpu=power10 is gcc 10+, but the PR102860 change otherwise
made sense also for 9.x.  So just adjusting testcase...

2022-05-11  Jakub Jelinek  

PR middle-end/102860
PR testsuite/105570
* gfortran.dg/pr102860.f90: Use -mcpu=power9 instead of
-mcpu=power10.

[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

2022-05-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #16 from CVS Commits  ---
The releases/gcc-9 branch has been updated by Jakub Jelinek
:

https://gcc.gnu.org/g:95f6eb7ae707482fdeed57b0906dacb8e675385d

commit r9-10118-g95f6eb7ae707482fdeed57b0906dacb8e675385d
Author: Jakub Jelinek 
Date:   Wed Jan 19 15:03:45 2022 +0100

match.pd, optabs: Avoid vectorization of {FLOOR,CEIL,ROUND}_{DIV,MOD}_EXPR
[PR102860]

power10 has modv4si3 expander and so vectorizes the following testcase
where Fortran modulo is FLOOR_MOD_EXPR.
optabs_for_tree_code indicates that the optab for all the *_MOD_EXPR
variants is umod_optab or smod_optab, but that isn't true, that optab
actually expands just TRUNC_MOD_EXPR.  For the other tree codes expmed.cc
has code how to adjust the TRUNC_MOD_EXPR into those by emitting some
extra comparisons and conditional updates.  Similarly for *_DIV_EXPR,
except in that case it actually needs both division and modulo.

While it would be possible to handle it in expmed.cc for vectors as well,
we'd need to be sure all the vector operations we need for that are
available, and furthermore we wouldn't account for that in the costing.

So, IMHO it is better to stop pretending those non-truncating (and
non-exact) div/mod operations have an optab.  For GCC 13, we should
IMHO pattern match these in tree-vect-patterns.cc and transform them
to truncating div/mod with follow-up adjustments and let the vectorizer
vectorize that.  As written in the PR, for signed operands:
r = x %[fl] y;
is
r = x % y; if (r && (x ^ y) < 0) r += y;
and
d = x /[fl] y;
is
r = x % y; d = x / y; if (r && (x ^ y) < 0) --d;
and
r = x %[cl] y;
is
r = x % y; if (r && (x ^ y) >= 0) r -= y;
and
d = /[cl] y;
is
r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d;
(too lazy to figure out rounding div/mod now).  I'll create a PR
for that.
The patch also extends a match.pd optimization that floor_mod on
unsigned operands is actually trunc_mod.

2022-01-19  Jakub Jelinek  

PR middle-end/102860
* match.pd (x %[fl] y -> x % y): New simplification for
unsigned integral types.
* optabs-tree.c (optab_for_tree_code): Return unknown_optab
for {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR with VECTOR_TYPE.

* gfortran.dg/pr102860.f90: New test.

(cherry picked from commit ffc7f200adbdf47f14b3594d9b21855c19cf797a)

[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

2022-05-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #15 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Jakub Jelinek
:

https://gcc.gnu.org/g:03bd2b252bca8c0ba4aeecab8b560d751f2ef57d

commit r10-10669-g03bd2b252bca8c0ba4aeecab8b560d751f2ef57d
Author: Jakub Jelinek 
Date:   Wed Jan 19 15:03:45 2022 +0100

match.pd, optabs: Avoid vectorization of {FLOOR,CEIL,ROUND}_{DIV,MOD}_EXPR
[PR102860]

power10 has modv4si3 expander and so vectorizes the following testcase
where Fortran modulo is FLOOR_MOD_EXPR.
optabs_for_tree_code indicates that the optab for all the *_MOD_EXPR
variants is umod_optab or smod_optab, but that isn't true, that optab
actually expands just TRUNC_MOD_EXPR.  For the other tree codes expmed.cc
has code how to adjust the TRUNC_MOD_EXPR into those by emitting some
extra comparisons and conditional updates.  Similarly for *_DIV_EXPR,
except in that case it actually needs both division and modulo.

While it would be possible to handle it in expmed.cc for vectors as well,
we'd need to be sure all the vector operations we need for that are
available, and furthermore we wouldn't account for that in the costing.

So, IMHO it is better to stop pretending those non-truncating (and
non-exact) div/mod operations have an optab.  For GCC 13, we should
IMHO pattern match these in tree-vect-patterns.cc and transform them
to truncating div/mod with follow-up adjustments and let the vectorizer
vectorize that.  As written in the PR, for signed operands:
r = x %[fl] y;
is
r = x % y; if (r && (x ^ y) < 0) r += y;
and
d = x /[fl] y;
is
r = x % y; d = x / y; if (r && (x ^ y) < 0) --d;
and
r = x %[cl] y;
is
r = x % y; if (r && (x ^ y) >= 0) r -= y;
and
d = /[cl] y;
is
r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d;
(too lazy to figure out rounding div/mod now).  I'll create a PR
for that.
The patch also extends a match.pd optimization that floor_mod on
unsigned operands is actually trunc_mod.

2022-01-19  Jakub Jelinek  

PR middle-end/102860
* match.pd (x %[fl] y -> x % y): New simplification for
unsigned integral types.
* optabs-tree.c (optab_for_tree_code): Return unknown_optab
for {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR with VECTOR_TYPE.

* gfortran.dg/pr102860.f90: New test.

(cherry picked from commit ffc7f200adbdf47f14b3594d9b21855c19cf797a)

[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

2022-01-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #14 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Jakub Jelinek
:

https://gcc.gnu.org/g:baf18a6d47c4325be004cb6289d4057b113de282

commit r11-9504-gbaf18a6d47c4325be004cb6289d4057b113de282
Author: Jakub Jelinek 
Date:   Wed Jan 19 15:03:45 2022 +0100

match.pd, optabs: Avoid vectorization of {FLOOR,CEIL,ROUND}_{DIV,MOD}_EXPR
[PR102860]

power10 has modv4si3 expander and so vectorizes the following testcase
where Fortran modulo is FLOOR_MOD_EXPR.
optabs_for_tree_code indicates that the optab for all the *_MOD_EXPR
variants is umod_optab or smod_optab, but that isn't true, that optab
actually expands just TRUNC_MOD_EXPR.  For the other tree codes expmed.cc
has code how to adjust the TRUNC_MOD_EXPR into those by emitting some
extra comparisons and conditional updates.  Similarly for *_DIV_EXPR,
except in that case it actually needs both division and modulo.

While it would be possible to handle it in expmed.cc for vectors as well,
we'd need to be sure all the vector operations we need for that are
available, and furthermore we wouldn't account for that in the costing.

So, IMHO it is better to stop pretending those non-truncating (and
non-exact) div/mod operations have an optab.  For GCC 13, we should
IMHO pattern match these in tree-vect-patterns.cc and transform them
to truncating div/mod with follow-up adjustments and let the vectorizer
vectorize that.  As written in the PR, for signed operands:
r = x %[fl] y;
is
r = x % y; if (r && (x ^ y) < 0) r += y;
and
d = x /[fl] y;
is
r = x % y; d = x / y; if (r && (x ^ y) < 0) --d;
and
r = x %[cl] y;
is
r = x % y; if (r && (x ^ y) >= 0) r -= y;
and
d = /[cl] y;
is
r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d;
(too lazy to figure out rounding div/mod now).  I'll create a PR
for that.
The patch also extends a match.pd optimization that floor_mod on
unsigned operands is actually trunc_mod.

2022-01-19  Jakub Jelinek  

PR middle-end/102860
* match.pd (x %[fl] y -> x % y): New simplification for
unsigned integral types.
* optabs-tree.c (optab_for_tree_code): Return unknown_optab
for {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR with VECTOR_TYPE.

* gfortran.dg/pr102860.f90: New test.

(cherry picked from commit ffc7f200adbdf47f14b3594d9b21855c19cf797a)

[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

2022-01-19 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #13 from Jakub Jelinek  ---
Fixed now.

[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

2022-01-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #12 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:ffc7f200adbdf47f14b3594d9b21855c19cf797a

commit r12-6739-gffc7f200adbdf47f14b3594d9b21855c19cf797a
Author: Jakub Jelinek 
Date:   Wed Jan 19 15:03:45 2022 +0100

match.pd, optabs: Avoid vectorization of {FLOOR,CEIL,ROUND}_{DIV,MOD}_EXPR
[PR102860]

power10 has modv4si3 expander and so vectorizes the following testcase
where Fortran modulo is FLOOR_MOD_EXPR.
optabs_for_tree_code indicates that the optab for all the *_MOD_EXPR
variants is umod_optab or smod_optab, but that isn't true, that optab
actually expands just TRUNC_MOD_EXPR.  For the other tree codes expmed.cc
has code how to adjust the TRUNC_MOD_EXPR into those by emitting some
extra comparisons and conditional updates.  Similarly for *_DIV_EXPR,
except in that case it actually needs both division and modulo.

While it would be possible to handle it in expmed.cc for vectors as well,
we'd need to be sure all the vector operations we need for that are
available, and furthermore we wouldn't account for that in the costing.

So, IMHO it is better to stop pretending those non-truncating (and
non-exact) div/mod operations have an optab.  For GCC 13, we should
IMHO pattern match these in tree-vect-patterns.cc and transform them
to truncating div/mod with follow-up adjustments and let the vectorizer
vectorize that.  As written in the PR, for signed operands:
r = x %[fl] y;
is
r = x % y; if (r && (x ^ y) < 0) r += y;
and
d = x /[fl] y;
is
r = x % y; d = x / y; if (r && (x ^ y) < 0) --d;
and
r = x %[cl] y;
is
r = x % y; if (r && (x ^ y) >= 0) r -= y;
and
d = /[cl] y;
is
r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d;
(too lazy to figure out rounding div/mod now).  I'll create a PR
for that.
The patch also extends a match.pd optimization that floor_mod on
unsigned operands is actually trunc_mod.

2022-01-19  Jakub Jelinek  

PR middle-end/102860
* match.pd (x %[fl] y -> x % y): New simplification for
unsigned integral types.
* optabs-tree.cc (optab_for_tree_code): Return unknown_optab
for {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR with VECTOR_TYPE.

* gfortran.dg/pr102860.f90: New test.

[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

2022-01-19 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #11 from Jakub Jelinek  ---
Created attachment 52229
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52229&action=edit
gcc12-pr102860-2.patch

And here is IMHO a better one that for VECTOR_TYPE just returns unknown_optab
so that the callers don't expect vectorized %[fl] etc. will work.

For both patches, tree-vect-patterns.cc work deferred for stage1, will create a
PR for it once this PR is fixed.

[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

2022-01-19 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #52218|0   |1
is obsolete||

--- Comment #10 from Jakub Jelinek  ---
Created attachment 52228
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52228&action=edit
gcc12-pr102860-1.patch

One possible patch.

[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

2022-01-18 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #9 from rguenther at suse dot de  ---
On Tue, 18 Jan 2022, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860
> 
> Jakub Jelinek  changed:
> 
>What|Removed |Added
> 
>  CC||jakub at gcc dot gnu.org
> 
> --- Comment #7 from Jakub Jelinek  ---
> Short testcase:
> function foo(a)
>   integer(kind=4) :: a(1024)
>   a(:) = modulo (a(:), 39)
> end function
> -O2 -mcpu=power10.
> vect_recog_divmod_pattern only handles TRUNC_{DIV,MOD}_EXPR and EXACT_DIV_EXPR
> (and isn't guaranteed to succeed anyway), but optab_for_tree_code returns the
> same smod_optab or sdiv_optab (if signed; FLOOR_* for unsigned is mapped to
> TRUNC_*).
> I guess the quickest way would be to punt on {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR
> in the vectorizer and tree-vect-generic.cc

True.

> Further gradual improvements can be:
> 1) match.pd has:
> /* For unsigned integral types, FLOOR_DIV_EXPR is the same as
>TRUNC_DIV_EXPR.  Rewrite into the latter in this case.  */
> (simplify
>  (floor_div @0 @1)
>  (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
>   && TYPE_UNSIGNED (type))
>   (trunc_div @0 @1)))
> but expmed.cc has:
>   /* Promote floor rounding to trunc rounding for unsigned operations.  */
>   if (unsignedp)
> {
>   if (code == FLOOR_DIV_EXPR)
> code = TRUNC_DIV_EXPR;
>   if (code == FLOOR_MOD_EXPR)
> code = TRUNC_MOD_EXPR;
>   if (code == EXACT_DIV_EXPR && op1_is_pow2)
> code = TRUNC_DIV_EXPR;
> }
> Shouldn't we make it
> (for floor_divmod (floor_div floor_mod)
>  trunc_divmod (trunc_div trunc_mod)
>  (simplify
>   (floor_divmod @0 @1)
>   (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
>&& TYPE_UNSIGNED (type))
>(trunc_divmod @0 @1
> ?

Yeah, if the simplification is incomplete we should amend it.

> 2) as the RTL optabs really do just trunc div/mod, perhaps
> tree-vect-patterns.cc
> could be changed to replace some or all of those operations with the trunc
> operation followed by some arith and cond_exprs so that the vectorizer knows
> actual cost of those operations.
> E.g. it seems expmed.cc expands
> r = x %[fl] y;
> as
> r = x % y; if (r && (x ^ y) < 0) r += y;
> and
> d = x /[fl] y;
> would be
> r = x % y; d = x / y; if (r && (x ^ y) < 0) --d;
> Looking at wide-int.h,
> r = x %[cl] y;
> as
> r = x % y; if (r && (x ^ y) >= 0) r -= y;
> and
> d = /[cl] y;
> as
> r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d;
> All of the above for signed, as I said earlier, unsigned [fl] is the same as
> trunc and unsigned [cl] should replace (x ^ y) >= 0 with 1.
> [rd] is even more complex.

That sounds reasonable as well.  I think we can do 0) and 1) now and
defer 2) to the next stage1, maybe tracking it with an enhancement
bugreport.

[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

2022-01-18 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #8 from Jakub Jelinek  ---
Created attachment 52218
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52218&action=edit
gcc12-pr102860-wip.patch

Untested patch that avoids the ICE, but still doesn't add the
tree-vect-patterns.cc cases.

[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

2022-01-18 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek  ---
Short testcase:
function foo(a)
  integer(kind=4) :: a(1024)
  a(:) = modulo (a(:), 39)
end function
-O2 -mcpu=power10.
vect_recog_divmod_pattern only handles TRUNC_{DIV,MOD}_EXPR and EXACT_DIV_EXPR
(and isn't guaranteed to succeed anyway), but optab_for_tree_code returns the
same smod_optab or sdiv_optab (if signed; FLOOR_* for unsigned is mapped to
TRUNC_*).
I guess the quickest way would be to punt on {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR
in the vectorizer and tree-vect-generic.cc
Further gradual improvements can be:
1) match.pd has:
/* For unsigned integral types, FLOOR_DIV_EXPR is the same as
   TRUNC_DIV_EXPR.  Rewrite into the latter in this case.  */
(simplify
 (floor_div @0 @1)
 (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
  && TYPE_UNSIGNED (type))
  (trunc_div @0 @1)))
but expmed.cc has:
  /* Promote floor rounding to trunc rounding for unsigned operations.  */
  if (unsignedp)
{
  if (code == FLOOR_DIV_EXPR)
code = TRUNC_DIV_EXPR;
  if (code == FLOOR_MOD_EXPR)
code = TRUNC_MOD_EXPR;
  if (code == EXACT_DIV_EXPR && op1_is_pow2)
code = TRUNC_DIV_EXPR;
}
Shouldn't we make it
(for floor_divmod (floor_div floor_mod)
 trunc_divmod (trunc_div trunc_mod)
 (simplify
  (floor_divmod @0 @1)
  (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
   && TYPE_UNSIGNED (type))
   (trunc_divmod @0 @1
?
2) as the RTL optabs really do just trunc div/mod, perhaps
tree-vect-patterns.cc
could be changed to replace some or all of those operations with the trunc
operation followed by some arith and cond_exprs so that the vectorizer knows
actual cost of those operations.
E.g. it seems expmed.cc expands
r = x %[fl] y;
as
r = x % y; if (r && (x ^ y) < 0) r += y;
and
d = x /[fl] y;
would be
r = x % y; d = x / y; if (r && (x ^ y) < 0) --d;
Looking at wide-int.h,
r = x %[cl] y;
as
r = x % y; if (r && (x ^ y) >= 0) r -= y;
and
d = /[cl] y;
as
r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d;
All of the above for signed, as I said earlier, unsigned [fl] is the same as
trunc and unsigned [cl] should replace (x ^ y) >= 0 with 1.
[rd] is even more complex.

[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

2022-01-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P1

[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

2021-12-14 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #6 from luoxhu at gcc dot gnu.org ---
Fortran's modulo is floor_mod as documented here:
https://gcc.gnu.org/onlinedocs/gfortran/MODULO.html?

Syntax:
RESULT = MODULO(A, P)

Return value:
The type and kind of the result are those of the arguments. (As a GNU
extension, kind is the largest kind of the actual arguments.)

If A and P are of type INTEGER:
MODULO(A,P) has the value R such that A=Q*P+R, where Q is an integer and R is
between 0 (inclusive) and P (exclusive).

If A and P are of type REAL:
MODULO(A,P) has the value of A - FLOOR (A / P) * P.

The returned value has the same sign as P and a magnitude less than the
magnitude of P.


program test_modulo
  print *, modulo(17,3)
  print *, modulo(17.5,5.5)

  print *, modulo(-17,3)
  print *, modulo(-17.5,5.5)

  print *, modulo(17,-3)
  print *, modulo(17.5,-5.5)
end program


LD_LIBRARY_PATH=./x86_64-pc-linux-gnu/libgfortran/.libs/ ./a.out

   2
   1.
   1
   4.5000
  -1
  -4.5000

[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

2021-12-14 Thread luoxhu at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

luoxhu at gcc dot gnu.org changed:

   What|Removed |Added

 CC||luoxhu at gcc dot gnu.org

--- Comment #5 from luoxhu at gcc dot gnu.org ---
P8, P9 and X86 doesn't vectorize the floor_mod operation, so they passed.
The fix in #c2 only fixes ICE, but execution still fails, reason is R239 is
used but not defined.

[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

2021-10-26 Thread seurer at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #4 from seurer at gcc dot gnu.org ---
For completeness this also affects two other test cases:

FAIL: libgomp.fortran/simd2.f90   -O2  (internal compiler error)
FAIL: libgomp.fortran/simd2.f90   -O2  (test for excess errors)
FAIL: libgomp.fortran/simd2.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (internal compiler error)
FAIL: libgomp.fortran/simd2.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
FAIL: libgomp.fortran/simd2.f90   -O3 -g  (internal compiler error)
FAIL: libgomp.fortran/simd2.f90   -O3 -g  (test for excess errors)
FAIL: libgomp.fortran/simd3.f90   -O2  (internal compiler error)
FAIL: libgomp.fortran/simd3.f90   -O2  (test for excess errors)
FAIL: libgomp.fortran/simd3.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (internal compiler error)
FAIL: libgomp.fortran/simd3.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
FAIL: libgomp.fortran/simd3.f90   -O3 -g  (internal compiler error)
FAIL: libgomp.fortran/simd3.f90   -O3 -g  (test for excess errors)
FAIL: libgomp.fortran/simd4.f90   -O2  (internal compiler error)
FAIL: libgomp.fortran/simd4.f90   -O2  (test for excess errors)
FAIL: libgomp.fortran/simd4.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (internal compiler error)
FAIL: libgomp.fortran/simd4.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
FAIL: libgomp.fortran/simd4.f90   -O3 -g  (internal compiler error)
FAIL: libgomp.fortran/simd4.f90   -O3 -g  (test for excess errors)

[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

2021-10-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #3 from Richard Biener  ---
*** Bug 102938 has been marked as a duplicate of this bug. ***

[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

2021-10-20 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2021-10-21
  Component|target  |middle-end

--- Comment #2 from Richard Biener  ---
We are expanding

vect__5.73_271 = vect__4.72_269 %[fl] { 39, 39, 39, 39 };

produced from vectorizing

  _5 = _4 %[fl] 39;

optab_for_tree_code does

case TRUNC_MOD_EXPR:
case CEIL_MOD_EXPR:
case FLOOR_MOD_EXPR:
case ROUND_MOD_EXPR:
  return TYPE_UNSIGNED (type) ? umod_optab : smod_optab;

somehow the vectorizer finds an optab to vectorize this but RTL expansion
fails up to

/* No luck with division elimination or divmod.  Have to do it
   by conditionally adjusting op0 *and* the result.  */

expand_divmod never seems to use smod_optab for FLOOR_MOD_EXPR.

So this seems to be a latent issue but definitely this expansion code
doing compare & jump has to be gated on !VECTOR_TYPE since do_cmp_and_jump
cannot work with vector arguments.  And then there's a fallback missing
I guess.  Simply gating produces

(insn 35 34 36 (set (reg:V4SI 238)
(const_vector:V4SI [
(const_int -52 [0xffcc]) repeated x4
])) "simd2.f90":11:30 -1
 (nil))

(insn 36 35 37 (set (reg:V4SI 237 [ vect__4.72 ])
(plus:V4SI (reg:V4SI 181 [ vect_vec_iv_.68 ])
(reg:V4SI 238))) "simd2.f90":11:30 -1
 (nil))

(insn 37 36 38 (set (reg:V4SI 240)
(reg:V4SI 239)) "simd2.f90":11:30 -1
 (nil))

(insn 38 37 39 (set (reg:V4SI 242)
(const_vector:V4SI [
(const_int 2 [0x2]) repeated x4
])) "simd2.f90":11:30 -1
 (nil))

(insn 39 38 40 (set (reg:V4SI 241)
(ashift:V4SI (reg:V4SI 240)
(reg:V4SI 242))) "simd2.f90":11:30 -1
 (nil))

(insn 40 39 41 (set (reg:V4SI 240)
(reg:V4SI 241)) "simd2.f90":11:30 -1
 (nil))

(insn 41 40 42 (set (reg:V4SI 243)
(plus:V4SI (reg:V4SI 240)
(reg:V4SI 239))) "simd2.f90":11:30 -1
 (nil))

(insn 42 41 43 (set (reg:V4SI 245)
(const_vector:V4SI [
(const_int 3 [0x3]) repeated x4
])) "simd2.f90":11:30 -1
 (nil))

(insn 43 42 44 (set (reg:V4SI 244)
(ashift:V4SI (reg:V4SI 243)
(reg:V4SI 245))) "simd2.f90":11:30 -1
 (nil))

(insn 44 43 45 (set (reg:V4SI 243)
(reg:V4SI 244)) "simd2.f90":11:30 -1
 (nil))

(insn 45 44 46 (set (reg:V4SI 246)
(minus:V4SI (reg:V4SI 243)
(reg:V4SI 239))) "simd2.f90":11:30 -1
 (nil))

(insn 46 45 0 (set (reg:V4SI 221 [ vect__5.73 ])
(minus:V4SI (reg:V4SI 237 [ vect__4.72 ])
(reg:V4SI 246))) "simd2.f90":11:30 -1
 (nil))

which I guess is OK for trunc_mod but not floor_mod, but it fixes the ICE.

diff --git a/gcc/expmed.c b/gcc/expmed.c
index bbdd0e71d20..0ae57cc3f8a 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -4850,7 +4850,7 @@ expand_divmod (int rem_flag, enum tree_code code,
machine_mode mode,

/* No luck with division elimination or divmod.  Have to do it
   by conditionally adjusting op0 *and* the result.  */
-   {
+   if (!VECTOR_MODE_P (mode)) {
  rtx_code_label *label1, *label2, *label3, *label4, *label5;
  rtx adjusted_op0;
  rtx tem;


The floor-mod is present in .original already:

simd2.f90.005t.original:b[(integer(kind=8)) i + -1] = (i + -52) %[fl]
39;

looks like the modulo intrinsic is floor_mod?

b(i) = modulo (i - 52, 39)