[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #30 from GCC Commits ---
The releases/gcc-13 branch has been updated by Richard Sandiford:

https://gcc.gnu.org/g:2602b71103d5ef2ef86000cac832b31dad3dfe2b

commit r13-8813-g2602b71103d5ef2ef86000cac832b31dad3dfe2b
Author: Richard Sandiford
Date:   Fri May 31 15:56:05 2024 +0100

    vect: Tighten vect_determine_precisions_from_range [PR113281]

    This was another PR caused by the way that
    vect_determine_precisions_from_range handles shifts.  We tried to
    narrow 32768 >> x to a 16-bit shift based on range information for
    the inputs and outputs, with vect_recog_over_widening_pattern
    (after PR110828) adjusting the shift amount.  But this doesn't
    work for the case where x is in [16, 31], since then 32-bit
    32768 >> x is a well-defined zero, whereas no well-defined
    16-bit 32768 >> y will produce 0.

    We could perhaps generate x < 16 ? 32768 >> x : 0 instead,
    but since vect_determine_precisions_from_range was never really
    supposed to rely on fix-ups, it seems better to fix that instead.

    The patch also makes the code more selective about which codes
    can be narrowed based on input and output ranges.  This showed
    that vect_truncatable_operation_p was missing cases for
    BIT_NOT_EXPR (equivalent to BIT_XOR_EXPR of -1) and NEGATE_EXPR
    (equivalent to BIT_NOT_EXPR followed by a PLUS_EXPR of 1).

    pr113281-1.c is the original testcase.  pr113281-[23].c failed
    before the patch due to overly optimistic narrowing.  pr113281-[45].c
    previously passed and are meant to protect against accidental
    optimisation regressions.

    gcc/
            PR target/113281
            * tree-vect-patterns.cc (vect_recog_over_widening_pattern): Remove
            workaround for right shifts.
            (vect_truncatable_operation_p): Handle NEGATE_EXPR and BIT_NOT_EXPR.
            (vect_determine_precisions_from_range): Be more selective about
            which codes can be narrowed based on their input and output ranges.
            For shifts, require at least one more bit of precision than the
            maximum shift amount.

    gcc/testsuite/
            PR target/113281
            * gcc.dg/vect/pr113281-1.c: New test.
            * gcc.dg/vect/pr113281-2.c: Likewise.
            * gcc.dg/vect/pr113281-3.c: Likewise.
            * gcc.dg/vect/pr113281-4.c: Likewise.
            * gcc.dg/vect/pr113281-5.c: Likewise.

    (cherry picked from commit 1a8261e047f7a2c2b0afb95716f7615cba718cd1)
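
The arithmetic in the commit message above can be checked with a small scalar sketch. This is not code from the patch or from the testsuite: every function name below is hypothetical, the 16-bit lane is modelled with uint16_t, and shift_precision_ok is only a paraphrase of the ChangeLog wording ("at least one more bit of precision than the maximum shift amount").

```c
#include <assert.h>
#include <stdint.h>

/* The shift as C evaluates it: in 32-bit int, defined for x in [0, 31]. */
static int32_t shift_wide(unsigned x)
{
    return INT32_C(32768) >> x;
}

/* Illustrative model only (not GCC code): is there any in-range 16-bit
   shift amount y in [0, 15] whose result equals the 32-bit result? */
static int narrow_shift_exists(unsigned x)
{
    const uint16_t value = 32768;
    for (unsigned y = 0; y < 16; y++) {
        uint16_t narrow = (uint16_t)(value >> y);
        if (narrow == shift_wide(x))
            return 1;
    }
    return 0;
}

/* Paraphrase of the ChangeLog rule: a shift needs at least one more bit
   of precision than the maximum shift amount. */
static int shift_precision_ok(unsigned precision, unsigned max_shift)
{
    return precision >= max_shift + 1;
}

/* BIT_NOT_EXPR behaves like BIT_XOR_EXPR with -1, so truncating before
   or after the operation gives the same low 16 bits. */
static int bit_not_truncates(uint32_t x)
{
    uint16_t wide_then_truncate = (uint16_t)~x;
    uint16_t truncate_then_op = (uint16_t)((uint16_t)x ^ 0xFFFFu);
    return wide_then_truncate == truncate_then_op;
}

/* NEGATE_EXPR behaves like BIT_NOT_EXPR followed by PLUS_EXPR of 1. */
static int negate_truncates(uint32_t x)
{
    uint16_t wide_then_truncate = (uint16_t)(0u - x);
    uint16_t truncate_then_op = (uint16_t)((uint16_t)~(uint16_t)x + 1u);
    return wide_then_truncate == truncate_then_op;
}

/* Spot checks for the cases discussed in the commit message. */
static void check_examples(void)
{
    assert(shift_wide(15) == 1);
    assert(shift_wide(16) == 0);
    assert(narrow_shift_exists(15));   /* y = 15 reproduces the result */
    assert(!narrow_shift_exists(16));  /* no 16-bit shift can give 0 */
    assert(shift_precision_ok(32, 31));
    assert(!shift_precision_ok(16, 31));
    assert(bit_not_truncates(0x12345678u));
    assert(negate_truncates(0x12345678u));
}
```

Calling check_examples() from a test driver exercises the x = 15 and x = 16 boundary. The smallest value a well-defined 16-bit shift of 32768 can produce is 32768 >> 15 = 1, which is why no shift amount y can stand in for x in [16, 31].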
[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #29 from Robin Dapp ---
Just to document again: The test case should not be vectorized and at some point we will adjust the cost model so it is not going to be. I'd prefer to base that decision on real uarchs rather than adjust the generic cost model right away though.
[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #28 from JuzheZhong ---
The original cost model I wrote worked for all cases, but after some middle-end changes it no longer does. I don't have time to figure out what's going on here. Robin may be interested in it.
[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #27 from Patrick O'Neill ---
(In reply to Andrew Pinski from comment #26)
> (In reply to Edwin Lu from comment #25)
> > It's still persisting on trunk (at least for pr113281-1.c
> > https://godbolt.org/z/M9EK44hKe)
>
> I looked into what the vectorizer produces:
>   vect__22.13_31 = (vector(8) int) vect_vec_iv_.12_8;
>   _22 = (int) a.4_25;
>   vect__12.14_33 = { 32872, 32872, 32872, 32872, 32872, 32872, 32872, 32872 } >> vect__22.13_31;
>   _12 = 32872 >> _22;
>   vect_b_7.15_34 = (vector(8) short int) vect__12.14_33;
>
> that is valid thing to do. That is do the shift in `vector(8) int` and then
> do a truncation. The issue originally was about doing the shift in
> `vector(8) short` which is not happening here.

The regressed testcase looks like it's testing whether riscv vectorizes the code at all (the first issue Juzhe noted in comment #3 and then fixed). So this is a performance regression for risc-v, not a correctness regression.
[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #26 from Andrew Pinski ---
(In reply to Edwin Lu from comment #25)
> It's still persisting on trunk (at least for pr113281-1.c
> https://godbolt.org/z/M9EK44hKe)

I looked into what the vectorizer produces:
  vect__22.13_31 = (vector(8) int) vect_vec_iv_.12_8;
  _22 = (int) a.4_25;
  vect__12.14_33 = { 32872, 32872, 32872, 32872, 32872, 32872, 32872, 32872 } >> vect__22.13_31;
  _12 = 32872 >> _22;
  vect_b_7.15_34 = (vector(8) short int) vect__12.14_33;

that is valid thing to do. That is do the shift in `vector(8) int` and then do a truncation. The issue originally was about doing the shift in `vector(8) short` which is not happening here.
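
As a rough scalar reading of the dump above, each lane widens the shift amount to int, shifts the constant 32872 in int, and only then truncates to short. The sketch below is an assumption-laden model, not the testcase: the function name lane_shift_then_truncate is hypothetical, and the source type of the shift amount (short here) is a guess, since the dump only shows that it is converted to int.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one vector lane from the dump:
     _22 = (int) a;  _12 = 32872 >> _22;  b = (short int) _12;
   The shift is performed at int precision, so every amount in
   [0, 31] is well defined.  The caller must keep a in that range. */
static int16_t lane_shift_then_truncate(short a)
{
    int amount = (int)a;        /* widen the shift amount first */
    int wide = 32872 >> amount; /* 32-bit shift, as in the dump */
    /* Truncation to 16 bits.  For a == 0 the value 32872 does not fit
       in int16_t, so that one result is implementation-defined in C. */
    return (int16_t)wide;
}

/* Spot checks; a == 0 is skipped because of the note above. */
static void check_lane(void)
{
    assert(lane_shift_then_truncate(1) == 16436);
    assert(lane_shift_then_truncate(15) == 1);
    assert(lane_shift_then_truncate(16) == 0);
    assert(lane_shift_then_truncate(31) == 0);
}
```

Because the shift happens before the truncation, amounts in [16, 31] yield a well-defined zero, which is the behaviour a shift performed directly in `vector(8) short` could not reproduce.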
[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

Edwin Lu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ewlu at rivosinc dot com

--- Comment #25 from Edwin Lu ---
(In reply to Richard Sandiford from comment #24)
> Fixed on trunk so far, but it's latent on branches. I'll see what
> the trunk fallout is like before asking about backports.

It looks like we have a regression for riscv.

I was going through the scan dump failures on trunk and ended up revisiting https://github.com/patrick-rivos/gcc-postcommit-ci/issues/463 where gcc.dg/vect/costmodel/riscv/rvv/pr113281-[125].c are failing the scan-dump checks. I didn't realize at the time that the scan dumps were checking code correctness and ended up ignoring it.

It's still persisting on trunk (at least for pr113281-1.c https://godbolt.org/z/M9EK44hKe)

A bisection on https://github.com/patrick-rivos/gcc-postcommit-ci/issues/463 commit range suggests https://gcc.gnu.org/g:1a8261e047f7a2c2b0afb95716f7615cba718cd1 introduced it.

# first bad commit: [1a8261e047f7a2c2b0afb95716f7615cba718cd1] vect: Tighten vect_determine_precisions_from_range [PR113281]

Configuration
  ../configure --prefix=$(pwd) --with-multilib-generator="rv64gcv-lp64d--"
  make stamps/build-gcc-linux-stage1 -j 32

Testing
  ./build-gcc-linux-stage1/gcc/cc1 ../gcc/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-1.c -march=rv64gcv -mabi=lp64d -mtune=rocket -mcmodel=medlow -fdiagnostics-plain-output -march=rv64gcv_zvl256b -mabi=lp64d -O3 -ftree-vectorize -ffat-lto-objects -fno-ident -o pr113281-1.s
[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P1                          |P2
   Target Milestone|14.0                        |11.5
            Summary|[14 Regression] Wrong code  |[11/12/13 Regression]
                   |due to vectorization of     |Latent wrong code due to
                   |shift reduction and missing |vectorization of shift
                   |promotions since r14-3027   |reduction and missing
                   |                            |promotions since r9-1590