[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 --- Comment #17 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:6d79958a50eb5419ebb9baa5ef880aabeef05467 commit r9-10157-g6d79958a50eb5419ebb9baa5ef880aabeef05467 Author: Jakub Jelinek Date: Wed May 11 20:37:01 2022 +0200 testsuite: Fix up pr102860.f90 for gcc 9 [PR105570] Apparently -mcpu=power10 is gcc 10+, but the PR102860 change otherwise made sense also for 9.x. So just adjusting testcase... 2022-05-11 Jakub Jelinek PR middle-end/102860 PR testsuite/105570 * gfortran.dg/pr102860.f90: Use -mcpu=power9 instead of -mcpu=power10.
[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 --- Comment #16 from CVS Commits --- The releases/gcc-9 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:95f6eb7ae707482fdeed57b0906dacb8e675385d commit r9-10118-g95f6eb7ae707482fdeed57b0906dacb8e675385d Author: Jakub Jelinek Date: Wed Jan 19 15:03:45 2022 +0100 match.pd, optabs: Avoid vectorization of {FLOOR,CEIL,ROUND}_{DIV,MOD}_EXPR [PR102860] power10 has modv4si3 expander and so vectorizes the following testcase where Fortran modulo is FLOOR_MOD_EXPR. optabs_for_tree_code indicates that the optab for all the *_MOD_EXPR variants is umod_optab or smod_optab, but that isn't true, that optab actually expands just TRUNC_MOD_EXPR. For the other tree codes expmed.cc has code how to adjust the TRUNC_MOD_EXPR into those by emitting some extra comparisons and conditional updates. Similarly for *_DIV_EXPR, except in that case it actually needs both division and modulo. While it would be possible to handle it in expmed.cc for vectors as well, we'd need to be sure all the vector operations we need for that are available, and furthermore we wouldn't account for that in the costing. So, IMHO it is better to stop pretending those non-truncating (and non-exact) div/mod operations have an optab. For GCC 13, we should IMHO pattern match these in tree-vect-patterns.cc and transform them to truncating div/mod with follow-up adjustments and let the vectorizer vectorize that. As written in the PR, for signed operands: r = x %[fl] y; is r = x % y; if (r && (x ^ y) < 0) r += y; and d = x /[fl] y; is r = x % y; d = x / y; if (r && (x ^ y) < 0) --d; and r = x %[cl] y; is r = x % y; if (r && (x ^ y) >= 0) r -= y; and d = /[cl] y; is r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d; (too lazy to figure out rounding div/mod now). I'll create a PR for that. The patch also extends a match.pd optimization that floor_mod on unsigned operands is actually trunc_mod. 2022-01-19 Jakub Jelinek PR middle-end/102860 * match.pd (x %[fl] y -> x % y): New simplification for unsigned integral types. * optabs-tree.c (optab_for_tree_code): Return unknown_optab for {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR with VECTOR_TYPE. * gfortran.dg/pr102860.f90: New test. (cherry picked from commit ffc7f200adbdf47f14b3594d9b21855c19cf797a)
[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 --- Comment #15 from CVS Commits --- The releases/gcc-10 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:03bd2b252bca8c0ba4aeecab8b560d751f2ef57d commit r10-10669-g03bd2b252bca8c0ba4aeecab8b560d751f2ef57d Author: Jakub Jelinek Date: Wed Jan 19 15:03:45 2022 +0100 match.pd, optabs: Avoid vectorization of {FLOOR,CEIL,ROUND}_{DIV,MOD}_EXPR [PR102860] power10 has modv4si3 expander and so vectorizes the following testcase where Fortran modulo is FLOOR_MOD_EXPR. optabs_for_tree_code indicates that the optab for all the *_MOD_EXPR variants is umod_optab or smod_optab, but that isn't true, that optab actually expands just TRUNC_MOD_EXPR. For the other tree codes expmed.cc has code how to adjust the TRUNC_MOD_EXPR into those by emitting some extra comparisons and conditional updates. Similarly for *_DIV_EXPR, except in that case it actually needs both division and modulo. While it would be possible to handle it in expmed.cc for vectors as well, we'd need to be sure all the vector operations we need for that are available, and furthermore we wouldn't account for that in the costing. So, IMHO it is better to stop pretending those non-truncating (and non-exact) div/mod operations have an optab. For GCC 13, we should IMHO pattern match these in tree-vect-patterns.cc and transform them to truncating div/mod with follow-up adjustments and let the vectorizer vectorize that. As written in the PR, for signed operands: r = x %[fl] y; is r = x % y; if (r && (x ^ y) < 0) r += y; and d = x /[fl] y; is r = x % y; d = x / y; if (r && (x ^ y) < 0) --d; and r = x %[cl] y; is r = x % y; if (r && (x ^ y) >= 0) r -= y; and d = /[cl] y; is r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d; (too lazy to figure out rounding div/mod now). I'll create a PR for that. The patch also extends a match.pd optimization that floor_mod on unsigned operands is actually trunc_mod. 2022-01-19 Jakub Jelinek PR middle-end/102860 * match.pd (x %[fl] y -> x % y): New simplification for unsigned integral types. * optabs-tree.c (optab_for_tree_code): Return unknown_optab for {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR with VECTOR_TYPE. * gfortran.dg/pr102860.f90: New test. (cherry picked from commit ffc7f200adbdf47f14b3594d9b21855c19cf797a)
[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 --- Comment #14 from CVS Commits --- The releases/gcc-11 branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:baf18a6d47c4325be004cb6289d4057b113de282 commit r11-9504-gbaf18a6d47c4325be004cb6289d4057b113de282 Author: Jakub Jelinek Date: Wed Jan 19 15:03:45 2022 +0100 match.pd, optabs: Avoid vectorization of {FLOOR,CEIL,ROUND}_{DIV,MOD}_EXPR [PR102860] power10 has modv4si3 expander and so vectorizes the following testcase where Fortran modulo is FLOOR_MOD_EXPR. optabs_for_tree_code indicates that the optab for all the *_MOD_EXPR variants is umod_optab or smod_optab, but that isn't true, that optab actually expands just TRUNC_MOD_EXPR. For the other tree codes expmed.cc has code how to adjust the TRUNC_MOD_EXPR into those by emitting some extra comparisons and conditional updates. Similarly for *_DIV_EXPR, except in that case it actually needs both division and modulo. While it would be possible to handle it in expmed.cc for vectors as well, we'd need to be sure all the vector operations we need for that are available, and furthermore we wouldn't account for that in the costing. So, IMHO it is better to stop pretending those non-truncating (and non-exact) div/mod operations have an optab. For GCC 13, we should IMHO pattern match these in tree-vect-patterns.cc and transform them to truncating div/mod with follow-up adjustments and let the vectorizer vectorize that. As written in the PR, for signed operands: r = x %[fl] y; is r = x % y; if (r && (x ^ y) < 0) r += y; and d = x /[fl] y; is r = x % y; d = x / y; if (r && (x ^ y) < 0) --d; and r = x %[cl] y; is r = x % y; if (r && (x ^ y) >= 0) r -= y; and d = /[cl] y; is r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d; (too lazy to figure out rounding div/mod now). I'll create a PR for that. The patch also extends a match.pd optimization that floor_mod on unsigned operands is actually trunc_mod. 2022-01-19 Jakub Jelinek PR middle-end/102860 * match.pd (x %[fl] y -> x % y): New simplification for unsigned integral types. * optabs-tree.c (optab_for_tree_code): Return unknown_optab for {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR with VECTOR_TYPE. * gfortran.dg/pr102860.f90: New test. (cherry picked from commit ffc7f200adbdf47f14b3594d9b21855c19cf797a)
[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #13 from Jakub Jelinek --- Fixed now.
[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 --- Comment #12 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:ffc7f200adbdf47f14b3594d9b21855c19cf797a commit r12-6739-gffc7f200adbdf47f14b3594d9b21855c19cf797a Author: Jakub Jelinek Date: Wed Jan 19 15:03:45 2022 +0100 match.pd, optabs: Avoid vectorization of {FLOOR,CEIL,ROUND}_{DIV,MOD}_EXPR [PR102860] power10 has modv4si3 expander and so vectorizes the following testcase where Fortran modulo is FLOOR_MOD_EXPR. optabs_for_tree_code indicates that the optab for all the *_MOD_EXPR variants is umod_optab or smod_optab, but that isn't true, that optab actually expands just TRUNC_MOD_EXPR. For the other tree codes expmed.cc has code how to adjust the TRUNC_MOD_EXPR into those by emitting some extra comparisons and conditional updates. Similarly for *_DIV_EXPR, except in that case it actually needs both division and modulo. While it would be possible to handle it in expmed.cc for vectors as well, we'd need to be sure all the vector operations we need for that are available, and furthermore we wouldn't account for that in the costing. So, IMHO it is better to stop pretending those non-truncating (and non-exact) div/mod operations have an optab. For GCC 13, we should IMHO pattern match these in tree-vect-patterns.cc and transform them to truncating div/mod with follow-up adjustments and let the vectorizer vectorize that. As written in the PR, for signed operands: r = x %[fl] y; is r = x % y; if (r && (x ^ y) < 0) r += y; and d = x /[fl] y; is r = x % y; d = x / y; if (r && (x ^ y) < 0) --d; and r = x %[cl] y; is r = x % y; if (r && (x ^ y) >= 0) r -= y; and d = /[cl] y; is r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d; (too lazy to figure out rounding div/mod now). I'll create a PR for that. The patch also extends a match.pd optimization that floor_mod on unsigned operands is actually trunc_mod. 2022-01-19 Jakub Jelinek PR middle-end/102860 * match.pd (x %[fl] y -> x % y): New simplification for unsigned integral types. * optabs-tree.cc (optab_for_tree_code): Return unknown_optab for {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR with VECTOR_TYPE. * gfortran.dg/pr102860.f90: New test.
[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 --- Comment #11 from Jakub Jelinek --- Created attachment 52229 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52229&action=edit gcc12-pr102860-2.patch And here is IMHO a better one that for VECTOR_TYPE just returns unknown_optab so that the callers don't expect vectorized %[fl] etc. will work. For both patches, tree-vect-patterns.cc work deferred for stage1, will create a PR for it once this PR is fixed.
[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 Jakub Jelinek changed: What|Removed |Added Attachment #52218|0 |1 is obsolete|| --- Comment #10 from Jakub Jelinek --- Created attachment 52228 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52228&action=edit gcc12-pr102860-1.patch One possible patch.
[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 --- Comment #9 from rguenther at suse dot de --- On Tue, 18 Jan 2022, jakub at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 > > Jakub Jelinek changed: > >What|Removed |Added > > CC||jakub at gcc dot gnu.org > > --- Comment #7 from Jakub Jelinek --- > Short testcase: > function foo(a) > integer(kind=4) :: a(1024) > a(:) = modulo (a(:), 39) > end function > -O2 -mcpu=power10. > vect_recog_divmod_pattern only handles TRUNC_{DIV,MOD}_EXPR and EXACT_DIV_EXPR > (and isn't guaranteed to succeed anyway), but optab_for_tree_code returns the > same smod_optab or sdiv_optab (if signed; FLOOR_* for unsigned is mapped to > TRUNC_*). > I guess the quickest way would be to punt on {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR > in the vectorizer and tree-vect-generic.cc True. > Further gradual improvements can be: > 1) match.pd has: > /* For unsigned integral types, FLOOR_DIV_EXPR is the same as >TRUNC_DIV_EXPR. Rewrite into the latter in this case. */ > (simplify > (floor_div @0 @1) > (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type)) > && TYPE_UNSIGNED (type)) > (trunc_div @0 @1))) > but expmed.cc has: > /* Promote floor rounding to trunc rounding for unsigned operations. */ > if (unsignedp) > { > if (code == FLOOR_DIV_EXPR) > code = TRUNC_DIV_EXPR; > if (code == FLOOR_MOD_EXPR) > code = TRUNC_MOD_EXPR; > if (code == EXACT_DIV_EXPR && op1_is_pow2) > code = TRUNC_DIV_EXPR; > } > Shouldn't we make it > (for floor_divmod (floor_div floor_mod) > trunc_divmod (trunc_div trunc_mod) > (simplify > (floor_divmod @0 @1) > (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type)) >&& TYPE_UNSIGNED (type)) >(trunc_divmod @0 @1 > ? Yeah, if the simplification is incomplete we should amend it. > 2) as the RTL optabs really do just trunc div/mod, perhaps > tree-vect-patterns.cc > could be changed to replace some or all of those operations with the trunc > operation followed by some arith and cond_exprs so that the vectorizer knows > actual cost of those operations. > E.g. it seems expmed.cc expands > r = x %[fl] y; > as > r = x % y; if (r && (x ^ y) < 0) r += y; > and > d = x /[fl] y; > would be > r = x % y; d = x / y; if (r && (x ^ y) < 0) --d; > Looking at wide-int.h, > r = x %[cl] y; > as > r = x % y; if (r && (x ^ y) >= 0) r -= y; > and > d = /[cl] y; > as > r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d; > All of the above for signed, as I said earlier, unsigned [fl] is the same as > trunc and unsigned [cl] should replace (x ^ y) >= 0 with 1. > [rd] is even more complex. That sounds reasonable as well. I think we can do 0) and 1) now and defer 2) to the next stage1, maybe tracking it with an enhancement bugreport.
[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 Jakub Jelinek changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org --- Comment #8 from Jakub Jelinek --- Created attachment 52218 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52218&action=edit gcc12-pr102860-wip.patch Untested patch that avoids the ICE, but still doesn't add the tree-vect-patterns.cc cases.
[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #7 from Jakub Jelinek --- Short testcase: function foo(a) integer(kind=4) :: a(1024) a(:) = modulo (a(:), 39) end function -O2 -mcpu=power10. vect_recog_divmod_pattern only handles TRUNC_{DIV,MOD}_EXPR and EXACT_DIV_EXPR (and isn't guaranteed to succeed anyway), but optab_for_tree_code returns the same smod_optab or sdiv_optab (if signed; FLOOR_* for unsigned is mapped to TRUNC_*). I guess the quickest way would be to punt on {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR in the vectorizer and tree-vect-generic.cc Further gradual improvements can be: 1) match.pd has: /* For unsigned integral types, FLOOR_DIV_EXPR is the same as TRUNC_DIV_EXPR. Rewrite into the latter in this case. */ (simplify (floor_div @0 @1) (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type)) && TYPE_UNSIGNED (type)) (trunc_div @0 @1))) but expmed.cc has: /* Promote floor rounding to trunc rounding for unsigned operations. */ if (unsignedp) { if (code == FLOOR_DIV_EXPR) code = TRUNC_DIV_EXPR; if (code == FLOOR_MOD_EXPR) code = TRUNC_MOD_EXPR; if (code == EXACT_DIV_EXPR && op1_is_pow2) code = TRUNC_DIV_EXPR; } Shouldn't we make it (for floor_divmod (floor_div floor_mod) trunc_divmod (trunc_div trunc_mod) (simplify (floor_divmod @0 @1) (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type)) && TYPE_UNSIGNED (type)) (trunc_divmod @0 @1 ? 2) as the RTL optabs really do just trunc div/mod, perhaps tree-vect-patterns.cc could be changed to replace some or all of those operations with the trunc operation followed by some arith and cond_exprs so that the vectorizer knows actual cost of those operations. E.g. it seems expmed.cc expands r = x %[fl] y; as r = x % y; if (r && (x ^ y) < 0) r += y; and d = x /[fl] y; would be r = x % y; d = x / y; if (r && (x ^ y) < 0) --d; Looking at wide-int.h, r = x %[cl] y; as r = x % y; if (r && (x ^ y) >= 0) r -= y; and d = /[cl] y; as r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d; All of the above for signed, as I said earlier, unsigned [fl] is the same as trunc and unsigned [cl] should replace (x ^ y) >= 0 with 1. [rd] is even more complex.
[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 --- Comment #6 from luoxhu at gcc dot gnu.org --- Fortran's modulo is floor_mod as documented here: https://gcc.gnu.org/onlinedocs/gfortran/MODULO.html? Syntax: RESULT = MODULO(A, P) Return value: The type and kind of the result are those of the arguments. (As a GNU extension, kind is the largest kind of the actual arguments.) If A and P are of type INTEGER: MODULO(A,P) has the value R such that A=Q*P+R, where Q is an integer and R is between 0 (inclusive) and P (exclusive). If A and P are of type REAL: MODULO(A,P) has the value of A - FLOOR (A / P) * P. The returned value has the same sign as P and a magnitude less than the magnitude of P. program test_modulo print *, modulo(17,3) print *, modulo(17.5,5.5) print *, modulo(-17,3) print *, modulo(-17.5,5.5) print *, modulo(17,-3) print *, modulo(17.5,-5.5) end program LD_LIBRARY_PATH=./x86_64-pc-linux-gnu/libgfortran/.libs/ ./a.out 2 1. 1 4.5000 -1 -4.5000
[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 luoxhu at gcc dot gnu.org changed: What|Removed |Added CC||luoxhu at gcc dot gnu.org --- Comment #5 from luoxhu at gcc dot gnu.org --- P8, P9 and X86 doesn't vectorize the floor_mod operation, so they passed. The fix in #c2 only fixes ICE, but execution still fails, reason is R239 is used but not defined.
[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 --- Comment #4 from seurer at gcc dot gnu.org --- For completeness this also affects two other test cases: FAIL: libgomp.fortran/simd2.f90 -O2 (internal compiler error) FAIL: libgomp.fortran/simd2.f90 -O2 (test for excess errors) FAIL: libgomp.fortran/simd2.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (internal compiler error) FAIL: libgomp.fortran/simd2.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) FAIL: libgomp.fortran/simd2.f90 -O3 -g (internal compiler error) FAIL: libgomp.fortran/simd2.f90 -O3 -g (test for excess errors) FAIL: libgomp.fortran/simd3.f90 -O2 (internal compiler error) FAIL: libgomp.fortran/simd3.f90 -O2 (test for excess errors) FAIL: libgomp.fortran/simd3.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (internal compiler error) FAIL: libgomp.fortran/simd3.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) FAIL: libgomp.fortran/simd3.f90 -O3 -g (internal compiler error) FAIL: libgomp.fortran/simd3.f90 -O3 -g (test for excess errors) FAIL: libgomp.fortran/simd4.f90 -O2 (internal compiler error) FAIL: libgomp.fortran/simd4.f90 -O2 (test for excess errors) FAIL: libgomp.fortran/simd4.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (internal compiler error) FAIL: libgomp.fortran/simd4.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) FAIL: libgomp.fortran/simd4.f90 -O3 -g (internal compiler error) FAIL: libgomp.fortran/simd4.f90 -O3 -g (test for excess errors)
[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 --- Comment #3 from Richard Biener --- *** Bug 102938 has been marked as a duplicate of this bug. ***
[Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860 Richard Biener changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2021-10-21 Component|target |middle-end --- Comment #2 from Richard Biener --- We are expanding vect__5.73_271 = vect__4.72_269 %[fl] { 39, 39, 39, 39 }; produced from vectorizing _5 = _4 %[fl] 39; optab_for_tree_code does case TRUNC_MOD_EXPR: case CEIL_MOD_EXPR: case FLOOR_MOD_EXPR: case ROUND_MOD_EXPR: return TYPE_UNSIGNED (type) ? umod_optab : smod_optab; somehow the vectorizer finds an optab to vectorize this but RTL expansion fails up to /* No luck with division elimination or divmod. Have to do it by conditionally adjusting op0 *and* the result. */ expand_divmod never seems to use smod_optab for FLOOR_MOD_EXPR. So this seems to be a latent issue but definitely this expansion code doing compare & jump has to be gated on !VECTOR_TYPE since do_cmp_and_jump cannot work with vector arguments. And then there's a fallback missing I guess. Simply gating produces (insn 35 34 36 (set (reg:V4SI 238) (const_vector:V4SI [ (const_int -52 [0xffcc]) repeated x4 ])) "simd2.f90":11:30 -1 (nil)) (insn 36 35 37 (set (reg:V4SI 237 [ vect__4.72 ]) (plus:V4SI (reg:V4SI 181 [ vect_vec_iv_.68 ]) (reg:V4SI 238))) "simd2.f90":11:30 -1 (nil)) (insn 37 36 38 (set (reg:V4SI 240) (reg:V4SI 239)) "simd2.f90":11:30 -1 (nil)) (insn 38 37 39 (set (reg:V4SI 242) (const_vector:V4SI [ (const_int 2 [0x2]) repeated x4 ])) "simd2.f90":11:30 -1 (nil)) (insn 39 38 40 (set (reg:V4SI 241) (ashift:V4SI (reg:V4SI 240) (reg:V4SI 242))) "simd2.f90":11:30 -1 (nil)) (insn 40 39 41 (set (reg:V4SI 240) (reg:V4SI 241)) "simd2.f90":11:30 -1 (nil)) (insn 41 40 42 (set (reg:V4SI 243) (plus:V4SI (reg:V4SI 240) (reg:V4SI 239))) "simd2.f90":11:30 -1 (nil)) (insn 42 41 43 (set (reg:V4SI 245) (const_vector:V4SI [ (const_int 3 [0x3]) repeated x4 ])) "simd2.f90":11:30 -1 (nil)) (insn 43 42 44 (set (reg:V4SI 244) (ashift:V4SI (reg:V4SI 243) (reg:V4SI 245))) "simd2.f90":11:30 -1 (nil)) (insn 44 43 45 (set (reg:V4SI 243) (reg:V4SI 244)) "simd2.f90":11:30 -1 (nil)) (insn 45 44 46 (set (reg:V4SI 246) (minus:V4SI (reg:V4SI 243) (reg:V4SI 239))) "simd2.f90":11:30 -1 (nil)) (insn 46 45 0 (set (reg:V4SI 221 [ vect__5.73 ]) (minus:V4SI (reg:V4SI 237 [ vect__4.72 ]) (reg:V4SI 246))) "simd2.f90":11:30 -1 (nil)) which I guess is OK for trunc_mod but not floor_mod, but it fixes the ICE. diff --git a/gcc/expmed.c b/gcc/expmed.c index bbdd0e71d20..0ae57cc3f8a 100644 --- a/gcc/expmed.c +++ b/gcc/expmed.c @@ -4850,7 +4850,7 @@ expand_divmod (int rem_flag, enum tree_code code, machine_mode mode, /* No luck with division elimination or divmod. Have to do it by conditionally adjusting op0 *and* the result. */ - { + if (!VECTOR_MODE_P (mode)) { rtx_code_label *label1, *label2, *label3, *label4, *label5; rtx adjusted_op0; rtx tem; The floor-mod is present in .original already: simd2.f90.005t.original:b[(integer(kind=8)) i + -1] = (i + -52) %[fl] 39; looks like the modulo intrinsic is floor_mod? b(i) = modulo (i - 52, 39)