[Bug fortran/69368] [6 Regression] spec2006 test case 416.gamess fails with the g++ 6.0 compiler starting with r232508

2016-03-11 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 alalaw01 at gcc dot gnu.org changed: What|Removed |Added Assignee|alalaw01 at gcc dot gnu.org|unassigned at gcc

[Bug target/63679] [5/6 Regression][AArch64] Failure to constant fold.

2016-03-11 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63679 alalaw01 at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED

[Bug middle-end/70189] New: Combine constant-pool logic from gimplify + SRA

2016-03-11 Thread alalaw01 at gcc dot gnu.org
Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: alalaw01 at gcc dot gnu.org Target Milestone: --- Following PR/63679 (r232506), gimplify.c (gimplify_init_constructor) uses lots of heuristics to choose between pushing initializers out to the constant pool

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-11 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 --- Comment #13 from alalaw01 at gcc dot gnu.org --- Author: alalaw01 Date: Fri Mar 11 12:08:01 2016 New Revision: 234138 URL: https://gcc.gnu.org/viewcvs?rev=234138&root=gcc&view=rev Log: Fix PR/70013 gcc: PR tree-optimization/70013

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-11 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 --- Comment #12 from alalaw01 at gcc dot gnu.org --- Thanks, Martin - yes, I see. Patch posted at https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00680.html after full regtest.

[Bug tree-optimization/67681] Missed vectorization: induction variable used after loop

2016-03-10 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67681 --- Comment #8 from alalaw01 at gcc dot gnu.org --- Indeed, the -DFOO=1 case vectorizes with -fno-tree-dominator-opts.

[Bug tree-optimization/67681] Missed vectorization: induction variable used after loop

2016-03-10 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67681 --- Comment #7 from alalaw01 at gcc dot gnu.org --- Looking at where the peeling happens. In both -DFOO=0 and -DFOO=1 cases, 107.ch2 peels the inner loop header, so there is an i<=max test in the outer loop before the inner loop. Howe

[Bug tree-optimization/67681] Missed vectorization: induction variable used after loop

2016-03-09 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67681 --- Comment #5 from alalaw01 at gcc dot gnu.org --- In the -DFOO=0 case, we have peeled an extra copy of the inner loop condition, i <= max_7, above the loop. scalar evolution (final_value_replacement_loop) works, because it sees the inner l

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-09 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 --- Comment #10 from alalaw01 at gcc dot gnu.org --- Hmmm, so this fixes the ICE, generating: SR.5_12 = MEM[(struct S0[2] *)&*.LC0].f0; MEM[(struct S0[2] *)&*.LC0].f0 = SR.5_12; d = *.LC0; d$3$f0_14 = MEM[(struct S0[2] *)&*.

[Bug tree-optimization/67681] Missed vectorization: induction variable used after loop

2016-03-09 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67681 --- Comment #4 from alalaw01 at gcc dot gnu.org --- loopinit introduces the exit phi in much the same way for both -DFOO=0 and -DFOO=1, so the difference is in sccp. In the -DFOO=0 case, sccp does this (removing TODO_cleanup_cfg from

[Bug tree-optimization/67681] Missed vectorization: induction variable used after loop

2016-03-09 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67681 --- Comment #3 from alalaw01 at gcc dot gnu.org --- So in the not-vectorized case (-DFOO=1), we get for the inner loop: : # i_27 = PHI <i_22(5), i_16(7)> _8 = (long unsigned int) i_27; _9 = _8 * 4; _11 = data_10(D) + _9; _13

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-07 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 --- Comment #9 from alalaw01 at gcc dot gnu.org --- In analyze_access_subtree (since r147980, "New implementation of SRA", 2009): else if (root->grp_write || TREE_CODE (root->base) == PARM_DECL) root->grp_un

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-07 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 --- Comment #7 from alalaw01 at gcc dot gnu.org --- *second* half, sorry. grp_to_be_replaced is here true, but grp_unscalarized_data is false, so handle_unscalarized_data_in_subtree sets sad->refreshed=UDH_LEFT and we build the access to the

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-07 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 --- Comment #6 from alalaw01 at gcc dot gnu.org --- Ugh, initializing the scalar replacement for the first half of d, with a value read from the first half of d (should be from the first half of *.LC0).

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-07 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 --- Comment #5 from alalaw01 at gcc dot gnu.org --- Prior to SRA, we have d = *.LC0; d$0$f0_7 = MEM[(struct S0[2] *)&*.LC0].f0; e$f0_9 = MEM[(struct S0[2] *) + 3B].f0; _3 = (int) d$0$f0_7; c = _3; _5 = (int) e$f0_9; __builtin_pr

[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization

2016-03-07 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 alalaw01 at gcc dot gnu.org changed: What|Removed |Added CC||alalaw01 at gcc dot gnu.org

[Bug fortran/69368] [6 Regression] spec2006 test case 416.gamess fails with the g++ 6.0 compiler starting with r232508

2016-03-03 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 --- Comment #87 from alalaw01 at gcc dot gnu.org --- Great, many thanks for the tests, I was worried if we had hit another distinct issue. (Of course this would be better on gcc-patches!)

[Bug fortran/69368] [6 Regression] spec2006 test case 416.gamess fails with the g++ 6.0 compiler starting with r232508

2016-03-03 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 --- Comment #84 from alalaw01 at gcc dot gnu.org --- Bah. Do you normally use -fno-aggressive-loop-optimizations? With -funknown-commons, did you try with/out aggressive loop opts? Powerpc{,64}{be,le} ? The unknown-commons testcase I included

[Bug bootstrap/60632] ICE in regcprop.c (copyprop_hardreg_forward_1)

2016-03-03 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60632 alalaw01 at gcc dot gnu.org changed: What|Removed |Added Status|WAITING |RESOLVED

[Bug fortran/69368] [6 Regression] spec2006 test case 416.gamess fails with the g++ 6.0 compiler starting with r232508

2016-03-02 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 --- Comment #82 from alalaw01 at gcc dot gnu.org --- For those who haven't seen it, I've put forward this patch on the mailing list: https://gcc.gnu.org/ml/gcc-patches/2016-02/msg01746.html based on a suggestion from Jakub. (Unlike Richi's

[Bug tree-optimization/65963] Missed vectorization of loads strided with << when equivalent * succeeds

2016-02-23 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65963 alalaw01 at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED

[Bug middle-end/66877] [6 Regression] FAIL: gcc.dg/vect/vect-over-widen-3-big-array.c -flto -ffat-lto-objects scan-tree-dump-times vect "vect_recog_over_widening_pattern: detected" 2

2016-02-23 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66877 alalaw01 at gcc dot gnu.org changed: What|Removed |Added Status|ASSIGNED|RESOLVED

[Bug fortran/69368] [6 Regression] spec2006 test case 416.gamess fails with the g++ 6.0 compiler starting with r232508

2016-02-23 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 --- Comment #79 from alalaw01 at gcc dot gnu.org --- (In reply to rguent...@suse.de from comment #78) > > That would pessimize it too much IMHO. I'm not sure how to evaluate the pessimization, given it's thought to be a widespread pseudo-F

[Bug fortran/69368] [6 Regression] spec2006 test case 416.gamess fails with the g++ 6.0 compiler starting with r232508

2016-02-23 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 --- Comment #77 from alalaw01 at gcc dot gnu.org --- (In reply to rguent...@suse.de from comment #72) > > Patch as posted passed bootstrap & regtest. Adjusted according to > comments but not tested otherwise - please s

[Bug fortran/69368] [6 Regression] spec2006 test case 416.gamess fails with the g++ 6.0 compiler starting with r232508

2016-02-18 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 --- Comment #53 from alalaw01 at gcc dot gnu.org --- (In reply to Thomas Koenig from comment #44) > I don't have access to SPEC, so I can only guess... Is there maybe an > equivalence involved, something like Turns out the COMMON is ac

[Bug fortran/69368] [6 Regression] spec2006 test case 416.gamess fails with the g++ 6.0 compiler starting with r232508

2016-02-18 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 --- Comment #43 from alalaw01 at gcc dot gnu.org --- Yeah, I plan to add a fortran-specific option for this, it's easy enough, but I can't run the gfortran testsuite with that, because there are lots of C files in there too, for which

[Bug fortran/69368] [6 Regression] spec2006 test case 416.gamess fails with the g++ 6.0 compiler starting with r232508

2016-02-17 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 --- Comment #39 from alalaw01 at gcc dot gnu.org --- Created attachment 37726 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37726&action=edit Proposed patch (without flag). Here's a prototype patch, that sets TYPE_SIZE to NULL_TREE but lea

[Bug fortran/69368] [6 Regression] spec2006 test case 416.gamess fails with the g++ 6.0 compiler starting with r232508

2016-02-09 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 --- Comment #37 from alalaw01 at gcc dot gnu.org --- (In reply to Jakub Jelinek from comment #36) > As Richard said, you can do similar (invalid too) stuff in C too, say: > struct S { int a[1]; } s; > in one TU and > struct S { i

[Bug fortran/69368] [6 Regression] spec2006 test case 416.gamess fails with the g++ 6.0 compiler starting with r232508

2016-02-08 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 alalaw01 at gcc dot gnu.org changed: What|Removed |Added Status|RESOLVED|REOPENED

[Bug fortran/69368] [6 Regression] spec2006 test case 416.gamess fails with the g++ 6.0 compiler starting with r232508

2016-02-05 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 --- Comment #32 from alalaw01 at gcc dot gnu.org --- (In reply to rguent...@suse.de from comment #31) > > Thus a "fix" for the case where treating a[i] as a[0] is the issue > would be > &

[Bug fortran/69368] [6 Regression] spec2006 test case 416.gamess fails with the g++ 6.0 compiler starting with r232508

2016-02-05 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 alalaw01 at gcc dot gnu.org changed: What|Removed |Added Resolution|INVALID |FIXED --- Comment #27 from

[Bug fortran/69368] [6 Regression] spec2006 test case 416.gamess fails with the g++ 6.0 compiler starting with r232508

2016-02-04 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 --- Comment #20 from alalaw01 at gcc dot gnu.org --- Hmmm, hang on. In unport.fppized.f, shouldn't we be using the 'F2C/GCC COMPILER ON PC RUNNING UNIX (LINUX,BSD386,ETC)' version? In which case X has size (1) everywhere?

[Bug fortran/69368] [6 Regression] spec2006 test case 416.gamess fails with the g++ 6.0 compiler starting with r232508

2016-02-04 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 alalaw01 at gcc dot gnu.org changed: What|Removed |Added Resolution|DUPLICATE |FIXED --- Comment #23 from

[Bug tree-optimization/69368] [6 Regression] spec2006 test case 416.gamess fails with the g++ 6.0 compiler starting with r232508

2016-02-03 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 --- Comment #10 from alalaw01 at gcc dot gnu.org --- The stores are getting optimized out because equal_mem_array_ref_p considers equal pairs of MEM_REFS like fmcom.x[_168] and fmcom.x[_208] That is, a ARRAY_REF whose first operand

[Bug middle-end/66877] [6 Regression] FAIL: gcc.dg/vect/vect-over-widen-3-big-array.c -flto -ffat-lto-objects scan-tree-dump-times vect "vect_recog_over_widening_pattern: detected" 2

2016-01-22 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66877 alalaw01 at gcc dot gnu.org changed: What|Removed |Added Status|WAITING |ASSIGNED

[Bug testsuite/69380] [6 Regression] FAIL: g++.dg/tree-ssa/pr69336.C scan-tree-dump-not optimized "cmap"

2016-01-21 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69380 alalaw01 at gcc dot gnu.org changed: What|Removed |Added Target|arm-none-eabi powerpc*-*-* |arm-none-eabi powerpc

[Bug tree-optimization/69352] [6 Regression] profiledbootstrap failure with --with-build-config=bootstrap-lto

2016-01-19 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69352 --- Comment #9 from alalaw01 at gcc dot gnu.org --- (In reply to Jakub Jelinek from comment #7) > There are various bugs in the r232508 change. > The > gcc_assert (sz0 == sz1); > gcc_assert (max0 == max1); > gcc_asser

[Bug tree-optimization/69336] Constant value not detected

2016-01-18 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69336 alalaw01 at gcc dot gnu.org changed: What|Removed |Added CC||alalaw01 at gcc dot gnu.org

[Bug target/63679] [5/6 Regression][AArch64] Failure to constant fold.

2016-01-18 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63679 --- Comment #40 from alalaw01 at gcc dot gnu.org --- Author: alalaw01 Date: Mon Jan 18 12:40:43 2016 New Revision: 232508 URL: https://gcc.gnu.org/viewcvs?rev=232508&root=gcc&view=rev Log: Equate MEM_REFs and ARRAY_REFs in tree-ssa-scopedtables.c PR

[Bug target/63679] [5/6 Regression][AArch64] Failure to constant fold.

2016-01-18 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63679 --- Comment #39 from alalaw01 at gcc dot gnu.org --- Author: alalaw01 Date: Mon Jan 18 12:29:02 2016 New Revision: 232506 URL: https://gcc.gnu.org/viewcvs?rev=232506&root=gcc&view=rev Log: Make SRA scalarize constant-pool loads PR target/63679 gcc

[Bug middle-end/68112] [6 Regression] FAIL: gcc.target/i386/avx512ifma-vpmaddhuq-2.c (test for excess errors)

2016-01-13 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68112 alalaw01 at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED

[Bug target/69053] [6 Regression] ICE in build_vector_from_val

2016-01-12 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69053 --- Comment #9 from alalaw01 at gcc dot gnu.org --- I can confirm that both Richi's patch in comment 6 and my patchlet in comment 3, pass bootstrap + check-gcc on ARM and AArch64, and fix the ICE observed on ARM. (ICE never observed on AArch64.)

[Bug tree-optimization/67682] Missed vectorization: (another) straight-line memcpy/memset not vectorized when equivalent loop is

2016-01-11 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67682 alalaw01 at gcc dot gnu.org changed: What|Removed |Added Status|WAITING |RESOLVED

[Bug tree-optimization/69166] [6 Regression] ICE in get_initial_def_for_reduction, at tree-vect-loop.c:4188

2016-01-08 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69166 alalaw01 at gcc dot gnu.org changed: What|Removed |Added Status|RESOLVED|REOPENED Last

[Bug target/69053] [6 Regression] ICE in build_vector_from_val

2016-01-06 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69053 --- Comment #3 from alalaw01 at gcc dot gnu.org --- Well, this fixes it, but I'm not sure it fixes it in the right place... diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c index ee32166..bd66aa5 100644 --- a/gcc/tree-vect-loop.c +++ b

[Bug target/69053] [6 Regression] ICE in build_vector_from_val

2016-01-05 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69053 alalaw01 at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed

[Bug target/69053] [6 Regression] ICE in build_vector_from_val

2016-01-05 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69053 --- Comment #2 from alalaw01 at gcc dot gnu.org --- build_vector_from_val then gets called to build a vector (4) unsigned long, from an int* (which is the right signedness and size, but being a pointer it is not types_compatible_p).

[Bug tree-optimization/68707] [6 Regression] testcase gcc.dg/vect/O3-pr36098.c vectorized using VEC_PERM_EXPR rather than VEC_LOAD_LANES

2015-12-22 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68707 --- Comment #23 from alalaw01 at gcc dot gnu.org --- Yes, difficult. I'm conscious that this is stage 3, and worried about adding too much complexity, especially if we're writing code that we'd eventually drop in favour of a more complete

[Bug tree-optimization/68707] [6 Regression] testcase gcc.dg/vect/O3-pr36098.c vectorized using VEC_PERM_EXPR rather than VEC_LOAD_LANES

2015-12-17 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68707 --- Comment #20 from alalaw01 at gcc dot gnu.org --- > Would be nice to have a reduced testcase for this one. Working on it. Sadly it's fortran :( The SLP tree that gets cancelled, is quite big (and quite untreelike, if we could

[Bug tree-optimization/68707] [6 Regression] testcase gcc.dg/vect/O3-pr36098.c vectorized using VEC_PERM_EXPR rather than VEC_LOAD_LANES

2015-12-17 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68707 --- Comment #21 from alalaw01 at gcc dot gnu.org --- Here's the smallest testcase I could come up with (where SLP gets cancelled, but we end up with fewer st2's than before)...the key seems to be things being used in multiple places. #define N

[Bug tree-optimization/68707] [6 Regression] testcase gcc.dg/vect/O3-pr36098.c vectorized using VEC_PERM_EXPR rather than VEC_LOAD_LANES

2015-12-16 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68707 --- Comment #18 from alalaw01 at gcc dot gnu.org --- Well, we've seen this patch fix some of the vectorizer performance regressions we've had on some benchmarks. On SPEC...the "SLP cancelled" case triggers all over the place, b

[Bug tree-optimization/68707] [6 Regression] testcase gcc.dg/vect/O3-pr36098.c vectorized using VEC_PERM_EXPR rather than VEC_LOAD_LANES

2015-12-14 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68707 --- Comment #13 from alalaw01 at gcc dot gnu.org --- Hmmm, I realize a "definite" codegen improvement was maybe a bad choice of wording. A "substantial" (albeit uncertain!) improvement, may have been more accurate... Howeve

[Bug tree-optimization/68707] [6 Regression] testcase gcc.dg/vect/O3-pr36098.c vectorized using VEC_PERM_EXPR rather than VEC_LOAD_LANES

2015-12-11 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68707 --- Comment #10 from alalaw01 at gcc dot gnu.org --- This causes to FAIL the scan-tree-dump-times 'vectorizing stmts using SLP' in slp-perm-{1,2,3,5,6,7,8,11}.c. Looking at the assembler before and after... slp-perm-1.c: this looks a big win

[Bug tree-optimization/68707] [6 Regression] testcase gcc.dg/vect/O3-pr36098.c vectorized using VEC_PERM_EXPR rather than VEC_LOAD_LANES

2015-12-08 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68707 --- Comment #6 from alalaw01 at gcc dot gnu.org --- Well, I can confirm that the patch generates load-lanes/store-lanes instead of SLP, all over the (vect) testsuite. All execution tests are passing :) so it *may* just be a case of updating a lot

[Bug tree-optimization/68707] [6 Regression] testcase gcc.dg/vect/O3-pr36098.c vectorized using VEC_PERM_EXPR rather than VEC_LOAD_LANES

2015-12-08 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68707 --- Comment #8 from alalaw01 at gcc dot gnu.org --- Adding a check against BB SLP avoids some regressions caused by bailing out of BB SLP when we can't then do a load/store-lanes.

[Bug tree-optimization/68707] New: testcase gcc.dg/vect/O3-pr36098.c vectorized using VEC_PERM_EXPR rather than VEC_LOAD_LANES

2015-12-04 Thread alalaw01 at gcc dot gnu.org
Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: alalaw01 at gcc dot gnu.org Target Milestone: --- Target: aarch64, arm Created attachment 36928 --> https://gcc.gnu.org/bugzi

[Bug tree-optimization/68707] testcase gcc.dg/vect/O3-pr36098.c vectorized using VEC_PERM_EXPR rather than VEC_LOAD_LANES

2015-12-04 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68707 --- Comment #1 from alalaw01 at gcc dot gnu.org --- Created attachment 36929 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36929&action=edit tree-vect-details dump (after patch, with SLP)

[Bug tree-optimization/68681] New: testcase gcc.dg/vect/pr45752.c fails on AArch64

2015-12-03 Thread alalaw01 at gcc dot gnu.org
Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: alalaw01 at gcc dot gnu.org Target Milestone: --- Target: aarch64 Created attachment 36900 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36900&action=edit tree-vect-details dump Since r231015 (ht

[Bug tree-optimization/68549] [6 Regression] ICE: in verify_loop_structure, at cfgloop.c:1669

2015-11-26 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68549 alalaw01 at gcc dot gnu.org changed: What|Removed |Added CC||alalaw01 at gcc dot gnu.org

[Bug c/68385] New: ICE building libstdc++ on arm-none-eabi

2015-11-17 Thread alalaw01 at gcc dot gnu.org
Assignee: unassigned at gcc dot gnu.org Reporter: alalaw01 at gcc dot gnu.org Target Milestone: --- Target: arm-none-eabi Created attachment 36738 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36738&action=edit Reduced testcase Starting with r230365, building gcc for

[Bug tree-optimization/65963] Missed vectorization of loads strided with << when equivalent * succeeds

2015-11-06 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65963 --- Comment #4 from alalaw01 at gcc dot gnu.org --- I confirm the testcase fails execution on armeb-none-eabi (also at -O0), but it does so both with and without the patch to tree-scalar-evolution.c, which did not change codegen (at -O2 -ftree

[Bug tree-optimization/65963] Missed vectorization of loads strided with << when equivalent * succeeds

2015-11-05 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65963 --- Comment #2 from alalaw01 at gcc dot gnu.org --- Author: alalaw01 Date: Thu Nov 5 18:39:38 2015 New Revision: 229825 URL: https://gcc.gnu.org/viewcvs?rev=229825&root=gcc&view=rev Log: [PATCH] tree-scalar-evolution.c: Handle LSHIFT by constant gcc

[Bug rtl-optimization/68182] ICE in reorder_basic_blocks_simple building libitm/beginend.cc

2015-11-02 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68182 --- Comment #1 from alalaw01 at gcc dot gnu.org --- Created attachment 36636 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36636&action=edit Preprocessed source (compressed)

[Bug rtl-optimization/68182] New: ICE in reorder_basic_blocks_simple building libitm/beginend.cc

2015-11-02 Thread alalaw01 at gcc dot gnu.org
Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: alalaw01 at gcc dot gnu.org Target Milestone: --- Host: x86_64 Target: x86_64 Preprocessed source attached; command-line $ /work/alalaw01/build/./gcc

[Bug tree-optimization/56118] Piecewise vector / complex initialization from constants not combined

2015-11-02 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56118 alalaw01 at gcc dot gnu.org changed: What|Removed |Added CC||alalaw01 at gcc dot gnu.org

[Bug tree-optimization/68165] Not constant-folding setting vector element

2015-11-02 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68165 alalaw01 at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED

[Bug tree-optimization/68165] New: Not constant-folding setting vector element

2015-10-30 Thread alalaw01 at gcc dot gnu.org
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: alalaw01 at gcc dot gnu.org Target Milestone: --- I believe these two C functions are equivalent: typedef float __attribute__((__vector_size__ (2 * sizeof(float
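
The testcase in this report is cut off above, so the following is only a minimal sketch of the kind of pair the report describes, not the reporter's code. It assumes the GCC vector extension (`vector_size` attribute and element subscripting); the names `v2f`, `by_literal`, `by_elements` and `lanes_equal` are hypothetical. One function builds a two-float vector from a literal, the other sets each element in turn; both should yield the same value, which is the equivalence a constant folder would ideally recognize.

```c
#include <stdbool.h>

/* Two-lane float vector, using the GCC vector extension. */
typedef float v2f __attribute__((vector_size(2 * sizeof(float))));

/* Build the vector directly from a constant initializer. */
v2f by_literal(void)
{
    return (v2f){ 1.0f, 2.0f };
}

/* Build the same vector by assigning each element separately. */
v2f by_elements(void)
{
    v2f v = { 0.0f, 0.0f };
    v[0] = 1.0f;
    v[1] = 2.0f;
    return v;
}

/* Lane-by-lane comparison helper (hypothetical, for illustration only). */
bool lanes_equal(v2f a, v2f b)
{
    return a[0] == b[0] && a[1] == b[1];
}
```

Comparing the assembly of `by_literal` and `by_elements` at -O2 is one way to see whether the element-wise form has been folded to the same constant.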

[Bug middle-end/68112] [6 Regression] FAIL: gcc.target/i386/avx512ifma-vpmaddhuq-2.c (test for excess errors)

2015-10-29 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68112 --- Comment #4 from alalaw01 at gcc dot gnu.org --- Sure, but gcc exploits undefinedness of multiply, so rewriting shift to multiply is not equivalent in the general case :(. One way forward might be to make definedness of overflow a bit finer

[Bug middle-end/68112] [6 Regression] FAIL: gcc.target/i386/avx512ifma-vpmaddhuq-2.c (test for excess errors)

2015-10-28 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68112 --- Comment #2 from alalaw01 at gcc dot gnu.org --- So (a << CONSTANT) is not equivalent to a * (1<<CONSTANT), as the former is well-defined, whereas the latter invokes UB if bits would have been shifted off the far end.
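
A small sketch of the distinction drawn in this comment, under stated assumptions: the function names are hypothetical, and `__builtin_mul_overflow` is the GCC overflow-checking builtin (available since GCC 5). For unsigned operands both forms wrap and agree. For signed operands, the helper reports when the product `a * (1 << k)` would not fit in `int`, which is exactly the case where rewriting a shift as a signed multiply would introduce overflow that the shift form (as the comment describes GCC treating it) does not have.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Unsigned arithmetic wraps modulo 2^N, so these two agree
   for any shift count smaller than the width of unsigned. */
unsigned shl_u(unsigned a, unsigned k) { return a << k; }
unsigned mul_u(unsigned a, unsigned k) { return a * (1u << k); }

/* Returns true when the signed product a * (1 << k) overflows int,
   i.e. when the multiply form would be undefined for these inputs.
   k must be small enough that 1 << k is itself representable. */
bool mul_form_overflows(int a, int k)
{
    assert(k >= 0 && k < (int)(sizeof(int) * CHAR_BIT) - 1);
    int result;
    return __builtin_mul_overflow(a, 1 << k, &result);
}
```

For instance, `mul_form_overflows(1, 3)` is false, while `mul_form_overflows(INT_MAX, 1)` is true: the latter is an input for which the multiply form cannot stand in for the shift.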

[Bug tree-optimization/67683] Missed vectorization: shifts of an induction variable

2015-10-06 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67683 alalaw01 at gcc dot gnu.org changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill

[Bug tree-optimization/57558] Loop not vectorized if iteration count could be infinite

2015-09-25 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57558 --- Comment #4 from alalaw01 at gcc dot gnu.org --- Here's another example, extracted from another benchmark - it vectorizes if INDEX is defined to 'long' but not if INDEX is 'short': #include unsigned char *t_run_test(unsigned char *in, int N

[Bug tree-optimization/67681] Missed vectorization: induction variable used after loop

2015-09-23 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67681 --- Comment #2 from alalaw01 at gcc dot gnu.org --- Being stupid here, but why does the outer loop having multiple exits matter - it's the inner loop that should be vectorized? FOO was a macro used to selectively make the test i>max disapp

[Bug tree-optimization/67682] New: Missed vectorization: (another) straight-line memcpy/memset not vectorized when equivalent loop is

2015-09-22 Thread alalaw01 at gcc dot gnu.org
Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: alalaw01 at gcc dot gnu.org Target Milestone: --- Target: aarch64 This code: void test (int*__restrict a, int*__restrict b

[Bug tree-optimization/67681] New: Missed vectorization: induction variable used after loop

2015-09-22 Thread alalaw01 at gcc dot gnu.org
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: alalaw01 at gcc dot gnu.org Target Milestone: --- The inner loop here: void addlog2 (int *data) { int i = 1; for (int j=0; j<=30; j++) { int max = 1 << j; if

[Bug tree-optimization/67683] New: Missed vectorization: shifts of an induction variable

2015-09-22 Thread alalaw01 at gcc dot gnu.org
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: alalaw01 at gcc dot gnu.org Blocks: 53947 Target Milestone: --- This testcase: void test (unsigned char *data, int max) { unsigned short val = 0xcdef

[Bug middle-end/65965] Straight-line memcpy/memset not vectorized when equivalent loop is

2015-09-22 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65965 --- Comment #4 from alalaw01 at gcc dot gnu.org --- (In reply to Richard Biener from comment #3) > Fixed for GCC 6. Indeed. I note that the same testcase does _not_ SLP/vectorize if I use consecutive indices: void test (int*__restrict a,

[Bug tree-optimization/67283] GCC regression over inlining of returned structures

2015-09-18 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67283 --- Comment #13 from alalaw01 at gcc dot gnu.org --- Author: alalaw01 Date: Fri Sep 18 10:55:11 2015 New Revision: 227901 URL: https://gcc.gnu.org/viewcvs?rev=227901&root=gcc&view=rev Log: completely_scalarize arrays as well as records. gcc/: PR

[Bug target/63870] [Aarch64] [ARM] Errors in use of NEON intrinsics are reported incorrectly

2015-09-08 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63870 --- Comment #10 from alalaw01 at gcc dot gnu.org --- Author: alalaw01 Date: Tue Sep 8 19:43:39 2015 New Revision: 227557 URL: https://gcc.gnu.org/viewcvs?rev=227557&root=gcc&view=rev Log: [ARM/AArch64 Testsuite] Add float16 lane_f16_indices tests

[Bug target/67439] ICE: unrecognizable insn compiling arm-fp16 testcases with -march=armv7-a and -mrestrict-it

2015-09-03 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67439 alalaw01 at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed

[Bug tree-optimization/67283] GCC regression over inlining of returned structures

2015-08-28 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67283 --- Comment #12 from alalaw01 at gcc dot gnu.org --- Author: alalaw01 Date: Fri Aug 28 15:04:17 2015 New Revision: 227303 URL: https://gcc.gnu.org/viewcvs?rev=227303&root=gcc&view=rev Log: Revert: completely_scalarize arrays as well as records

[Bug tree-optimization/67283] GCC regression over inlining of returned structures

2015-08-27 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67283 --- Comment #7 from alalaw01 at gcc dot gnu.org --- Author: alalaw01 Date: Thu Aug 27 15:40:10 2015 New Revision: 227265 URL: https://gcc.gnu.org/viewcvs?rev=227265&root=gcc&view=rev Log: completely_scalarize arrays as well as records gcc

[Bug tree-optimization/67283] GCC regression over inlining of returned structures

2015-08-27 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67283 alalaw01 at gcc dot gnu.org changed: What|Removed |Added CC||alalaw01 at gcc dot gnu.org

[Bug target/63679] [5/6 Regression][AArch64] Failure to constant fold.

2015-08-03 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63679 --- Comment #37 from alalaw01 at gcc dot gnu.org --- Hmmm, no it's not the hashing - that pretty much ignores all types. It's the comparison in hashable_expr_equal_p, which just uses operand_equal_p, specifically this part (in fold-const.c

[Bug target/63679] [5/6 Regression][AArch64] Failure to constant fold.

2015-07-29 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63679 --- Comment #35 from alalaw01 at gcc dot gnu.org --- So it should be happening in dom2. On x86, input to dom2 is vect_cst_.9_31 = { 0, 1, 2, 3 }; [...]MEM[(int *)a] = vect_cst_.9_31; [...]vect__13.3_20 = MEM[(int *)a]; resulting

[Bug target/63679] [5/6 Regression][AArch64] Failure to constant fold.

2015-07-28 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63679 alalaw01 at gcc dot gnu.org changed: What|Removed |Added CC||alalaw01 at gcc dot gnu.org

[Bug target/66964] Assembler error during ARM cross compile

2015-07-23 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66964 --- Comment #7 from alalaw01 at gcc dot gnu.org --- No new regressions bootstrapping that patch on gcc-5-branch (--with-arch=armv7-a --with-fpu=neon-fp16 --with-float=hard). However, compiling the testcase with -dp reveals the bad strd's

[Bug target/66964] Assembler error during ARM cross compile

2015-07-22 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66964 --- Comment #6 from alalaw01 at gcc dot gnu.org --- Bootstrap+test in progress FYI. However, that patch *does not* fix this failure; there must be some other route.

[Bug target/66791] New: Replace builtins with gcc vector extensions code

2015-07-07 Thread alalaw01 at gcc dot gnu.org
: target Assignee: unassigned at gcc dot gnu.org Reporter: alalaw01 at gcc dot gnu.org Blocks: 47562 Target Milestone: --- Target: arm Lots of ARM neon intrinsics are implemented using builtins backing onto patterns in neon.md. These are opaque

[Bug target/65956] [5/6 Regression] Another ARM overaligned arg passing issue

2015-07-06 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65956 --- Comment #5 from alalaw01 at gcc dot gnu.org --- Author: alalaw01 Date: Mon Jul 6 17:32:07 2015 New Revision: 225469 URL: https://gcc.gnu.org/viewcvs?rev=225469&root=gcc&view=rev Log: 2015-07-06 Alan Lawrence alan.lawre...@arm.com

[Bug target/65956] [5/6 Regression] Another ARM overaligned arg passing issue

2015-07-06 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65956 --- Comment #3 from alalaw01 at gcc dot gnu.org --- Author: alalaw01 Date: Mon Jul 6 16:58:16 2015 New Revision: 225465 URL: https://gcc.gnu.org/viewcvs?rev=225465&root=gcc&view=rev Log: [ARM] PR/65956 AAPCS update for alignment attribute gcc

[Bug target/65956] [5/6 Regression] Another ARM overaligned arg passing issue

2015-07-06 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65956 --- Comment #4 from alalaw01 at gcc dot gnu.org --- Author: alalaw01 Date: Mon Jul 6 17:06:00 2015 New Revision: 225466 URL: https://gcc.gnu.org/viewcvs?rev=225466&root=gcc&view=rev Log: Fix eipa_sra AAPCS issue (PR target/65956) 2015-05-05

[Bug target/65956] [5/6 Regression] Another ARM overaligned arg passing issue

2015-07-06 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65956 --- Comment #6 from alalaw01 at gcc dot gnu.org --- Author: alalaw01 Date: Mon Jul 6 17:37:50 2015 New Revision: 225470 URL: https://gcc.gnu.org/viewcvs?rev=225470&root=gcc&view=rev Log: Backport r225466: tests from 'Fix eipa_sra AAPCS issue (PR

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2015-07-02 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 65946, which changed state. Bug 65946 Summary: Simple loop with if-statement not vectorized https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65946 What|Removed |Added

[Bug middle-end/65946] Simple loop with if-statement not vectorized

2015-07-02 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65946 alalaw01 at gcc dot gnu.org changed: What|Removed |Added Status|ASSIGNED|RESOLVED

[Bug target/64134] (vector float){0, 0, b, a} Uses stores when it does not need to

2015-06-26 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64134 alalaw01 at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED

[Bug tree-optimization/57600] Turn 2 comparisons into 1 with the min

2015-06-19 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57600 alalaw01 at gcc dot gnu.org changed: What|Removed |Added CC||alalaw01 at gcc dot gnu.org

[Bug tree-optimization/61171] vectorization fails for a reduction in presence of subtraction

2015-06-17 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61171 alalaw01 at gcc dot gnu.org changed: What|Removed |Added CC||alalaw01 at gcc dot gnu.org

[Bug target/65952] [AArch64] Will not vectorize storing induction of pointer addresses for LP64

2015-06-17 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65952 --- Comment #7 from alalaw01 at gcc dot gnu.org --- (In reply to Richard Biener from comment #6) So aarch64 has no DImode vectors? Or just no DImode multiply (but it has a DImode vector shift?). Yes, the latter.

[Bug target/65952] [AArch64] Will not vectorize storing induction of pointer addresses for LP64

2015-06-17 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65952 --- Comment #8 from alalaw01 at gcc dot gnu.org --- (In reply to alalaw01 from comment #7) (In reply to Richard Biener from comment #6) So aarch64 has no DImode vectors? Or just no DImode multiply (but it has a DImode vector shift?). Yes

[Bug target/65952] [AArch64] Will not vectorize storing induction of pointer addresses for LP64

2015-06-17 Thread alalaw01 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65952 --- Comment #5 from alalaw01 at gcc dot gnu.org --- So the above example tends to get fully unrolled, but even on an example with 32 ptrs rather than 4, yes the vectorizer fails because of the multiplication - but the multiplication is gone
