from:"Alan Lawrence"

Re: [PATCH] Fix PR68067

2015-11-06 Thread Alan Lawrence

On 06/11/15 10:39, Richard Biener wrote: ../spec2000/benchspec/CINT2000/254.gap/src/polynom.c:358:11: error: location references block not in block tree l1_279 = PHI <1(28), l1_299(33)> ^^^ this is the error to look at! It means that the GC heap will be corrupted quite easily. Thanks, I'll

Re: [PATCH] Fix PR68067

2015-11-06 Thread Alan Lawrence

On 28/10/15 13:38, Richard Biener wrote: Applied as follows. Bootstrapped / tested on x86_64-unknown-linux-gnu. Richard. 2015-10-28 Richard Biener * fold-const.c (negate_expr_p): Adjust the division case to properly avoid introducing undefined overflow. (fold_negat

Re: [PATCH 6/6] Make SRA replace constant-pool loads

2015-11-05 Thread Alan Lawrence

On 3 November 2015 at 14:01, Richard Biener wrote: > > Hum. I still wonder why we need all this complication ... Well, certainly I'd love to make it simpler, and if the complication is because I've gone about trying to deal with especially Ada in the wrong way... > I would > expect that if > w

Re: [PATCH] PR/67682, break SLP groups up if only some elements match

2015-11-05 Thread Alan Lawrence

On 03/11/15 13:39, Richard Biener wrote: > On Tue, Oct 27, 2015 at 6:38 PM, Alan Lawrence wrote: >> >> Say I...P are consecutive, the input would have gaps 0 1 1 1 1 1 1 1. If we >> split the load group, we would want subgroups with gaps 0 1 1 1 and 0 1 1 1? > > As sai

Re: [PATCH] tree-scalar-evolution.c: Handle LSHIFT by constant

2015-11-05 Thread Alan Lawrence

On 3 November 2015 at 11:35, Richard Biener wrote: > > I think this should simply re-write A << B to (type) (unsigned-type) A > * (1U << B). > > Does that then still vectorize the signed case? I didn't realize our representation of chrec's could express that. Yes, it does - thanks! (And the avx51
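
A minimal source-level sketch of the rewrite suggested in the message above, A << B as (type) ((unsigned-type) A * (1U << B)): the multiplication is done in unsigned arithmetic, which wraps instead of overflowing a signed type. The name shl_via_unsigned_mul is hypothetical and this is not the chrec code from the patch. The final conversion back to int is implementation-defined in ISO C; GCC documents it as reduction modulo 2^N, and the sketch assumes that behaviour.

```c
#include <limits.h>

/* Hypothetical helper: computes a << b by multiplying in unsigned
   arithmetic, which wraps rather than invoking undefined behaviour.
   Assumes 0 <= b < number of value bits in unsigned int.  */
int
shl_via_unsigned_mul (int a, unsigned int b)
{
  unsigned int ua = (unsigned int) a;
  unsigned int product = ua * (1U << b);
  /* Converting an out-of-range value to int is implementation-defined;
     GCC wraps modulo 2^N, so INT_MAX "shifted" by 1 gives -2 there.  */
  return (int) product;
}
```

With GCC on a target with 32-bit int, shl_via_unsigned_mul (INT_MAX, 1) yields -2, whereas INT_MAX * 2 written directly in signed arithmetic would be undefined.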

Re: [PATCH 5/6]tree-sra.c: Fix completely_scalarize for negative array indices

2015-11-05 Thread Alan Lawrence

On 30/10/15 10:54, Eric Botcazou wrote: > On 30/10/15 10:44, Richard Biener wrote: >> >> I think you want to use wide-ints here and >> >> wide_int idx = wi::from (minidx, TYPE_PRECISION (TYPE_DOMAIN >> (...)), TYPE_SIGN (TYPE_DOMAIN (..))); >> wide_int maxidx = ... >> >> you can then simply

Re: [PATCH 3/6] Share code from fold_array_ctor_reference with fold.

2015-11-04 Thread Alan Lawrence

> s/explicitely/explicitly/ And remove the '*' from the 2nd and 3rd lines > of the comment. > > It looks like get_ctor_element_at_index has numerous formatting > problems. In particular you didn't indent the braces across the board > properly. Also check for tabs vs spaces issues please. Yes, y

[PATCH/RFTesting][MIPS] Migrate reduction optabs in mips-ps-3d.md

2015-11-03 Thread Alan Lawrence

There are still a few uses of the old reduc_[us](plus|min|max)_ optabs remaining. This migrates the instances in mips-ps-3d.md. This seemed straightforward, as mips-ps-3d.md also provides a vec_extractv2sf. I tried to be conservative and handle all the possible cases for endianness, this may be ov

[PATCH][i386]Migrate reduction optabs to reduc__scal

2015-11-03 Thread Alan Lawrence

This migrates the various reduction optabs in sse.md to use the reduce-to-scalar form. I took the straightforward approach (equivalent to the migration code in expr.c/optabs.c) of generating a vector temporary, using the existing code to reduce to that, and extracting lane 0, in each pattern. Boot
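
The approach described above (reduce into a vector temporary, then extract lane 0) can be sketched in plain C as follows. This only illustrates the shape of the migration and is not the sse.md code; NUNITS, reduc_plus_to_vector and reduc_plus_scal are hypothetical names, and zeroing the unused lanes is an assumption made for the sketch.

```c
#define NUNITS 4

/* Stand-in for an old-style reduce-to-vector pattern: the sum of SRC
   ends up in lane 0 of DEST.  The other lanes are zeroed here, which is
   an assumption of this sketch only.  */
static void
reduc_plus_to_vector (int dest[NUNITS], const int src[NUNITS])
{
  int sum = 0;
  for (int i = 0; i < NUNITS; i++)
    sum += src[i];
  for (int i = 0; i < NUNITS; i++)
    dest[i] = 0;
  dest[0] = sum;
}

/* Reduce-to-scalar form: reuse the vector reduction, then extract
   lane 0 from the temporary.  */
int
reduc_plus_scal (const int src[NUNITS])
{
  int tmp[NUNITS];
  reduc_plus_to_vector (tmp, src);
  return tmp[0];
}
```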

Re: [PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF

2015-11-03 Thread Alan Lawrence

On 3 November 2015 at 10:27, Alan Lawrence wrote: > That is, ssa-dom-cse-7.c passes (and the patch series solves PR/63679) if > instead of my patch 2 (normalization of MEM_REFs) we have this: > > diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c > index 4327990..2889a96 100644 > --

Re: [PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF

2015-11-03 Thread Alan Lawrence

On 30/10/15 05:35, Jeff Law wrote: > On 10/29/2015 01:18 PM, Alan Lawrence wrote: >> This patch just teaches DOM that ARRAY_REFs can be equivalent to MEM_REFs >> (with >> pointer type to the array element type). >> >> gcc/ChangeLog: >> >> * t

Re: [PATCH] tree-scalar-evolution.c: Handle LSHIFT by constant

2015-11-03 Thread Alan Lawrence

On 27/10/15 22:27, H.J. Lu wrote: > > It caused: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68112 Bah :(. So yes, in general case, we can't rewrite (a << 1) to (a * 2) as for signed types (0x7f...f) << 1 == -2 whereas (0x7f...f * 2) is undefined behaviour. Oh well :(... I don't have a real

Re: [PATCH][AArch64] Fix ICE on (const_double:HF 0.0)

2015-11-02 Thread Alan Lawrence

On 26/10/15 16:26, Alan Lawrence wrote: The included testcase demonstrates the ICE: aarch64_valid_floating_const (via aarch64_float_const_representable_p) disables HFmode immediates, but allows 0.0. However, *movhf_aarch64 does not allow this insn: (insn 7 6 10 2 (set (mem:HF (reg/f:DI 73) [0

Re: [PATCH 3/4] bb-reorder: Add -freorder-blocks-algorithm= and wire it up

2015-11-02 Thread Alan Lawrence

On 02/11/15 14:38, Alan Lawrence wrote: > I'm a bit puzzled as to why nobody else has been seeing this, as it's been happening to me as part of building gcc on x86_64, but since this patch I've been seeing an ICE in vec::operator[] in reorder_basic_blocks_simple, building

[PATCH 0/6 v2] PR/63679 Make SRA scalarize constant-pool loads

2015-10-29 Thread Alan Lawrence

This is a revision of previous series at https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01485.html , and follows on from the first two patches of that series, which have been pushed already. A few things have happened since. The previous patch 3, making SRA generate ARRAY_REFS, is removed. As Marti

[PATCH 3/6] Share code from fold_array_ctor_reference with fold.

2015-10-29 Thread Alan Lawrence

This is in response to https://gcc.gnu.org/ml/gcc/2015-10/msg00097.html, where Richi points out that CONSTRUCTOR elements are not necessarily ordered. I wasn't sure of a good naming convention for the new get_ctor_element_at_index, other suggestions welcome. gcc/ChangeLog: * gimple-fold.

[PATCH 4/6][Trivial] tree-sra.c: A few comment fixes/additions.

2015-10-29 Thread Alan Lawrence

gcc/ChangeLog: * tree-sra.c (scalarizable_type_p): Comment variable-length arrays. (completely_scalarize): Comment zero-length arrays. (get_access_replacement): Correct comment re. precondition. --- gcc/tree-sra.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) d

[PATCH 6/6] Make SRA replace constant-pool loads

2015-10-29 Thread Alan Lawrence

This has changed quite a bit since the previous revision (https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01484.html), mostly due to Ada and specifically Ada on ARM. I didn't find a good alternative to scanning for constant-pool accesses "as we go" through the function, and although I didn't find an

[PATCH 5/6]tree-sra.c: Fix completely_scalarize for negative array indices

2015-10-29 Thread Alan Lawrence

The code I added to completely_scalarize for arrays isn't right in some cases of negative array indices (e.g. arrays with indices from -1 to 1 in the Ada testsuite). On ARM, this prevents a failure bootstrapping Ada with the next patch, as well as a few ACATS tests (e.g. c64106a). Some discussion
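
To illustrate why negative array indices need care, here is a hedged sketch of walking an index domain [min_idx, max_idx] with signed arithmetic and computing each element's byte offset relative to the lower bound. All names are hypothetical and this is not the tree-sra.c code. If the bounds were compared as unsigned values, a lower bound of -1 would look larger than an upper bound of 1; whether that is exactly what went wrong in the patch is not stated in the excerpt above.

```c
#include <stddef.h>

/* Hypothetical sketch: visit each element of an array whose index
   domain is [min_idx, max_idx] (bounds may be negative) and record its
   byte offset from the start of the array.  Returns the number of
   offsets written.  Assumes max_idx is well below LONG_MAX.  */
size_t
element_offsets (long min_idx, long max_idx, size_t elem_size,
                 size_t *offsets, size_t capacity)
{
  size_t n = 0;
  if (max_idx < min_idx)
    return 0;                   /* Empty domain: nothing to visit.  */
  for (long idx = min_idx; idx <= max_idx && n < capacity; idx++)
    {
      /* The position is relative to the lower bound, so it is
         non-negative even when idx itself is negative.  */
      offsets[n] = (size_t) (idx - min_idx) * elem_size;
      n++;
    }
  return n;
}
```

For an array indexed from -1 to 1 with 4-byte elements this records the offsets 0, 4 and 8.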

[PATCH 2/6] tree-ssa-dom.c: Normalize data types in MEM_REFs.

2015-10-29 Thread Alan Lawrence

This makes dom2 identify e.g. MEM[(int[8] *)...] with MEM[(int *)...]. These are not generally equivalent as they have different aliasing behaviour but they have the same value as far as dom is concerned and so this helps find more equivalences. There is some question over the best policy here, bu

[PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF

2015-10-29 Thread Alan Lawrence

This patch just teaches DOM that ARRAY_REFs can be equivalent to MEM_REFs (with pointer type to the array element type). gcc/ChangeLog: * tree-ssa-dom.c (dom_normalize_single_rhs): New. (dom_normalize_gimple_stmt): New. (lookup_avail_expr): Call dom_normalize_gimple_stmt.
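
As a rough source-level analogy for the equivalence described above, the two functions below load the same element, once by array indexing and once through a pointer to the element type at a byte offset. The struct and function names are hypothetical, and how GCC actually represents either form in GIMPLE depends on the context; this is only meant to show why the two loads have the same value.

```c
#include <stddef.h>

struct buf { int a[8]; };

/* Array-indexing form, loosely analogous to an ARRAY_REF.  */
int
load_indexed (const struct buf *b, size_t i)
{
  return b->a[i];
}

/* Pointer-plus-byte-offset form, loosely analogous to a MEM_REF whose
   pointer type points to the array element type.  */
int
load_offset (const struct buf *b, size_t i)
{
  const char *base = (const char *) b->a;
  return *(const int *) (base + i * sizeof (int));
}
```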

Re: [PATCH] PR/67682, break SLP groups up if only some elements match

2015-10-27 Thread Alan Lawrence

On 26/10/15 15:04, Richard Biener wrote: apart from the fact that you'll post a new version you need to adjust GROUP_GAP. You also seem to somewhat "confuse" "first I stmts" and "a group of size I", those are not the same when the group has gaps. I'd say "a group of size i" makes the most sense


Re: [PATCH] tree-scalar-evolution.c: Handle LSHIFT by constant

2015-10-27 Thread Alan Lawrence

--in-reply-to On 26/10/15 08:58, Richard Biener wrote: > > On Fri, Oct 23, 2015 at 5:15 PM, Alan Lawrence wrote: >> + chrec2 = fold_build2 (LSHIFT_EXPR, TREE_TYPE (rhs1), >> + build_int_cst (TREE_TYPE (rhs1), 1), > > 'type' inst

[PATCH][AArch64] Fix ICE on (const_double:HF 0.0)

2015-10-26 Thread Alan Lawrence

The included testcase demonstrates the ICE: aarch64_valid_floating_const (via aarch64_float_const_representable_p) disables HFmode immediates, but allows 0.0. However, *movhf_aarch64 does not allow this insn: (insn 7 6 10 2 (set (mem:HF (reg/f:DI 73) [0 *f_2(D)+0 S2 A16]) (const_double:HF

Re: [PATCH] PR/67682, break SLP groups up if only some elements match

2015-10-25 Thread Alan Lawrence

On 23 October 2015 at 16:20, Alan Lawrence wrote: > diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-7.c > b/gcc/testsuite/gcc.dg/vect/bb-slp-7.c > index ab54a48..b012d78 100644 > --- a/gcc/testsuite/gcc.dg/vect/bb-slp-7.c > +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-7.c > @@ -16,

[PATCH] PR/67682, break SLP groups up if only some elements match

2015-10-23 Thread Alan Lawrence

vect_analyze_slp_instance currently only creates an slp_instance if _all_ stores in a group fitted the same pattern. This patch splits non-matching groups up on vector boundaries, allowing only part of the group to be SLP'd, or multiple subgroups to be SLP'd differently. The algorithm could be mad

Re: [PATCH] tree-scalar-evolution.c: Handle LSHIFT by constant

2015-10-23 Thread Alan Lawrence

On 19/10/15 12:49, Richard Biener wrote: > Err, you should always do the shift in the type of rhs1. You should also > avoid the chrec_convert of rhs2 above for shifts. Err, yes, indeed. Needed to keep the chrec_convert before the chrec_fold_multiply, and the rest followed. How's this? Bootstr

Re: [PATCH, MIPS, PR/61114] Migrate to reduc_..._scal optabs.

2015-10-22 Thread Alan Lawrence

On closer inspection I think you can also remove this guy (from loongson.md): (define_insn "reduc_uplus_v8qi" [(set (match_operand:V8QI 0 "register_operand" "=f") (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "f")] UNSPEC_LOONGSON_BIADD))] "TARGET_HARD_FL

Re: [PATCH] vectorizing conditional expressions (PR tree-optimization/65947)

2015-10-22 Thread Alan Lawrence

Just one very small point... On 19/10/15 09:17, Alan Hayward wrote: > - if (check_reduction > - && (!commutative_tree_code (code) || !associative_tree_code (code))) > + if (check_reduction) > { > - if (dump_enabled_p ()) > -report_vect_op (MSG_MISSED_OPTIMIZATION, def_st

[PATCH][Testsuite] Add --param sra-max-scalarization-size-Ospeed to sra-12.c

2015-10-21 Thread Alan Lawrence

gcc.dg/tree-ssa/sra-12.c is skipped on a bunch of targets, including AArch64, because the default max-scalarization-size depends on MOVE_RATIO, and on those targets thus ends up being too small for SRA to optimize the testcase. Recently I noticed that the test has been failing for some time on ARM

[PATCH][AArch64 Testsuite][Trivial?] Remove divisions-to-produce-NaN from vdiv_f.c

2015-10-20 Thread Alan Lawrence

The test vdiv_f.c #define's NAN to (0.0 / 0.0). This produces extra scalar fdiv's, which complicate the scan-assembler testing. We can remove these by using __builtin_nan instead. Tested on AArch64 Linux. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vdiv_f.c: Use __builtin_nan. --- g
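
A small sketch of the substitution described above, assuming GCC or a compatible compiler that provides __builtin_nan. The macro name TEST_NAN and the helper is_nan are hypothetical (the test itself defines NAN); the point is only that the NaN is produced as a constant rather than by a 0.0 / 0.0 division.

```c
/* Assumption: GCC-compatible compiler.  __builtin_nan ("") yields a
   quiet NaN of type double without emitting a division.  */
#define TEST_NAN (__builtin_nan (""))

/* Hypothetical helper: a NaN is the only value that compares unequal
   to itself (assuming IEEE semantics, i.e. no -ffast-math).  */
int
is_nan (double x)
{
  return x != x;
}
```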

Re: [PATCH 1/3] [ARM] PR63870 Add qualifiers for NEON builtins

2015-10-19 Thread Alan Lawrence

On 14/10/15 23:02, Charles Baylis wrote: On 12 October 2015 at 11:58, Alan Lawrence wrote: > Given we are making changes here to how this all works on bigendian, have you tested armeb at all? I tested on big endian, and it passes, except Well, I asked because it seemed good to m

[PATCH] tree-scalar-evolution.c: Handle LSHIFT by constant

2015-10-16 Thread Alan Lawrence

This lets the vectorizer handle some simple strides expressed using left-shift rather than mul, e.g. a[i << 1] (whereas previously only a[i * 2] would have been handled). This patch does *not* handle the general case of shifts - neither a[i << j] nor a[1 << i] will be handled; that would be a sign
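
The two loops below show the kind of access the message above refers to: the same stride-2 walk written once with a left shift and once with a multiplication. The function names are hypothetical, and the sketch says nothing about whether a particular GCC version vectorizes either loop; it only shows that the two forms read the same elements.

```c
/* Stride expressed with a left shift: a[i << 1].  */
int
sum_stride_shift (const int *a, int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += a[i << 1];
  return sum;
}

/* The same stride expressed with a multiplication: a[i * 2].  */
int
sum_stride_mul (const int *a, int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += a[i * 2];
  return sum;
}
```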

[PATCH][Testsuite] Turn on 64-bit-vector tests for AArch64

2015-10-16 Thread Alan Lawrence

This enables tests bb-slp-11.c and bb-slp-26.c for AArch64. Both of these are currently passing on little- and big-endian. (Tested on aarch64-none-linux-gnu and aarch64_be-none-elf). OK for trunk? gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_vect64): Add AA

Re: [PATCH 1/3] [ARM] PR63870 Add qualifiers for NEON builtins

2015-10-12 Thread Alan Lawrence

On 07/10/15 00:59, charles.bay...@linaro.org wrote: diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c ... case NEON_ARG_MEMORY: /* Check if expand failed. */ if (op[argc] == const0_rtx) { - va_end (a

Re: [PATCH 2/3] [ARM] PR63870 Mark lane indices of vldN/vstN with appropriate qualifier

2015-10-12 Thread Alan Lawrence

On 07/10/15 00:59, charles.bay...@linaro.org wrote: diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index 2667866..251afdc 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -4261,8 +4261,9 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VLD1_LANE))] "TARG

Re: [[Boolean Vector, patch 5/5] Support boolean vectors in vector lowering

2015-10-12 Thread Alan Lawrence

On 09/10/15 22:01, Jeff Law wrote: So my question for the series as a whole is whether or not we need to do something for the other languages, particularly Fortran. I was a bit surprised to see this stuff bleed into the C/C++ front-ends and obviously wonder if it's bled into Fortran, Ada, Java,

Re: [PATCH, MIPS, PR/61114] Migrate to reduc_..._scal optabs.

2015-10-07 Thread Alan Lawrence

On 07/10/15 11:50, Simon Dardis wrote: On the change from smin/smax it was a deliberate change as I managed to confuse myself of the mode patterns, correct version follows. Reverted back to VWHB for smax/smin. Stylistic point addressed. No new regression, ok for commit? Well, I'm not a MIPS

Re: [PATCH, MIPS, PR/61114] Migrate to reduc_..._scal optabs.

2015-10-06 Thread Alan Lawrence

Thanks for working on this, Simon! On 01/10/15 15:43, Simon Dardis wrote: -(define_expand "reduc_smax_" - [(match_operand:VWHB 0 "register_operand" "") - (match_operand:VWHB 1 "register_operand" "")] +(define_expand "reduc_smax_scal_" + [(match_operand:HI 0 "register_operand" "") + (match_

Re: [AArch64] Fix vcvt_high_f64_f32 and vcvt_figh_f32_f64 intrinsics.

2015-09-21 Thread Alan Lawrence

On 21/09/15 15:38, James Greenhalgh wrote: On Mon, Sep 21, 2015 at 10:44:32AM +0100, Alan Lawrence wrote: [Resending in plain text] This makes sense to me now, although I find your comment slightly confusing: [] in that +;; the meaning of HI and LO is always taken with a little-endian

Re: [AArch64] Fix vcvt_high_f64_f32 and vcvt_figh_f32_f64 intrinsics.

2015-09-21 Thread Alan Lawrence

[Resending in plain text] This makes sense to me now, although I find your comment slightly confusing: [] in that +;; the meaning of HI and LO is always taken with a little-endian view of +;; the vector You mean vec_unpacks_{hi,lo} (which seems to go against the *architectural* bit after this

Re: [PR64164] drop copyrename, integrate into expand

2015-09-18 Thread Alan Lawrence

On 02/09/15 23:12, Alexandre Oliva wrote: On Sep 2, 2015, Alan Lawrence wrote: One more failure to report, I'm afraid. On AArch64 Bigendian, aapcs64/func-ret-4.c ICEs in simplify_subreg (line refs here are from r227348): Thanks. The failure mode was different in the current, revampe

Re: [PATCH, rs6000] Add expansions for min/max vector reductions

2015-09-18 Thread Alan Lawrence

On 18/09/15 09:35, Richard Biener wrote: Btw, we ditched the original reduce-to-vector variant due to its endianness issues (it only had _one_ element of the vector contain the reduction result). Re-introducing reduce-to-vector but with the reduction result in all elements wouldn't have any issu

[PATCH][RS6000] Migrate from reduc_xxx to reduc_xxx_scal optabs

2015-09-18 Thread Alan Lawrence

This is a respin of https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01024.html after discovering that patch was broken on power64le - thanks to Bill Schmidt for pointing out that gcc112 is the opposite endianness to gcc110... This time I decided to avoid any funny business with making RTL match othe

Re: [PATCH] vectorizing conditional expressions (PR tree-optimization/65947)

2015-09-18 Thread Alan Lawrence

On 18/09/15 13:17, Richard Biener wrote: Ok, I see. That this case is already vectorized is because it implements MAX_EXPR, modifying it slightly to int foo (int *a) { int val = 0; for (int i = 0; i < 1024; ++i) if (a[i] > val) val = a[i] + 1; return val; } makes it no lo
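
For comparison, here is a sketch of the two loop shapes discussed above: a plain maximum, which amounts to a MAX_EXPR reduction, and the modified variant quoted in the message, where the stored value is a[i] + 1. The function names and the length parameter n are assumptions of this sketch (the quoted code hard-codes 1024 iterations).

```c
/* Plain maximum: equivalent to val = MAX (val, a[i]) on each step.  */
int
max_reduction (const int *a, int n)
{
  int val = 0;
  for (int i = 0; i < n; ++i)
    if (a[i] > val)
      val = a[i];
  return val;
}

/* Modified variant from the message: the value stored differs from the
   value compared, so this is no longer a simple maximum.  */
int
cond_reduction (const int *a, int n)
{
  int val = 0;
  for (int i = 0; i < n; ++i)
    if (a[i] > val)
      val = a[i] + 1;
  return val;
}
```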

Re: [PATCH 2/5] completely_scalarize arrays as well as records.

2015-09-17 Thread Alan Lawrence

On 15/09/15 08:43, Richard Biener wrote: > > Sorry for chiming in so late... Not at all, TYVM for your help! > TREE_CONSTANT isn't the correct thing to test. You should use > TREE_CODE () == INTEGER_CST instead. Done (in some cases, via tree_fits_shwi_p). > Also you need to handle > NULL_TREE

Re: [PATCH, rs6000] Add expansions for min/max vector reductions

2015-09-16 Thread Alan Lawrence

On 16/09/15 17:19, Bill Schmidt wrote: On Wed, 2015-09-16 at 16:29 +0100, Alan Lawrence wrote: I proposed a patch to migrate PPC off the old patterns, but have forgotten to ping it recently - last at https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01024.html ... (ping?!) Hi Alan, Thanks for

Re: [PATCH, rs6000] Add expansions for min/max vector reductions

2015-09-16 Thread Alan Lawrence

On 16/09/15 17:10, Bill Schmidt wrote: On Wed, 2015-09-16 at 16:29 +0100, Alan Lawrence wrote: On 16/09/15 15:28, Bill Schmidt wrote: 2015-09-16 Bill Schmidt * config/rs6000/altivec.md (UNSPEC_REDUC_SMAX, UNSPEC_REDUC_SMIN, UNSPEC_REDUC_UMAX, UNSPEC_REDUC_UMIN

Re: [PATCH, rs6000] Add expansions for min/max vector reductions

2015-09-16 Thread Alan Lawrence

On 16/09/15 15:28, Bill Schmidt wrote: 2015-09-16 Bill Schmidt * config/rs6000/altivec.md (UNSPEC_REDUC_SMAX, UNSPEC_REDUC_SMIN, UNSPEC_REDUC_UMAX, UNSPEC_REDUC_UMIN, UNSPEC_REDUC_SMAX_SCAL, UNSPEC_REDUC_SMIN_SCAL, UNSPEC_REDUC_UMAX_SCAL, UNSPEC_REDUC_UMIN_

Re: [PATCH][AArch64 array_mode 7/8] Combine the expanders using VSTRUCT:nregs

2015-09-15 Thread Alan Lawrence

On 15/09/15 10:43, James Greenhalgh wrote: > > It is convenient that this falls out, but likely surprising for nregs. > Please add a comment to nregs explaining the dual use of nregs to represent > both the number of Q registers used for the type, and the number of elements > touched by the structu

[PATCH][AArch64 array_mode 1/8] Rename vec_store_lanes_lane to aarch64_vec_store_lanes_lane

2015-09-15 Thread Alan Lawrence

vec_store_lanes{oi,ci,xi}_lane are not standard pattern names, so using them in aarch64-simd.md is misleading. This adds an aarch64_ prefix to those pattern names, paralleling aarch64_vec_load_lanes_lane. bootstrapped and check-gcc on aarch64-none-linux-gnu gcc/ChangeLog: * config/aarc

[PATCH][AArch64 array_mode 4/8] Remove EImode

2015-09-15 Thread Alan Lawrence

This removes EImode from the (AArch64) compiler, and all mention of or support for it. bootstrapped and check-gcc on aarch64-none-linux-gnu gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_simd_attr_length_rglist): Update comment. * config/aarch64/aarch64-builtins.c (

[PATCH][AArch64 array_mode 7/8] Combine the expanders using VSTRUCT:nregs

2015-09-15 Thread Alan Lawrence

The previous patches leave ld[234]_lane, st[234]_lane, and ld[234]r expanders all nearly identical, so we can easily parameterize across the number of lanes and combine them. For the ld_lane pattern, I switched from the VCONQ attribute to just using the MODE attribute, this is identical for all

[PATCH][AArch64 array_mode 5/8] Remove V_FOUR_ELEM, again using BLKmode + set_mem_size.

2015-09-15 Thread Alan Lawrence

This removes V_FOUR_ELEM in the same way that patch 3 removed V_THREE_ELEM, again using BLKmode + set_mem_size. (This makes the four-lane expanders very similar to the three-lane expanders, and they will be combined in patch 7.) bootstrapped and check-gcc on aarch64-none-linux-gnu gcc/ChangeLog:

[PATCH][AArch64 array_mode 8/8] Add d-registers to TARGET_ARRAY_MODE_SUPPORTED_P

2015-09-15 Thread Alan Lawrence

This adds an AARCH64_VALID_SIMD_DREG_MODE exactly paralleling the existing ...QREG... macro. The new test now compiles (at -O3) to: test_1: add v1.2s, v1.2s, v5.2s add v2.2s, v2.2s, v6.2s add v3.2s, v3.2s, v7.2s add v0.2s, v0.2s, v4.2s ret

[PATCH][AArch64 array_mode 6/8] Remove V_TWO_ELEM, again using BLKmode + set_mem_size.

2015-09-15 Thread Alan Lawrence

Same logic as previous; this makes the 2-, 3-, and 4-lane expanders all follow the same pattern. bootstrapped and check-gcc on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-simd.md (aarch64_simd_ld2r, aarch64_vec_load_lanesoi_lane, aarch64_vec_store_lan

[PATCH][AArch64 array_mode 2/8] Remove VSTRUCT_DREG, use BLKmode for d-reg aarch64_st/ld expands

2015-09-15 Thread Alan Lawrence

aarch64_st and aarch64_ld expanders back onto 12 insns aarch64_{ld,st}{2,3,4}_dreg (for VD and DX modes), using the VSTRUCT_DREG iterator over TI/EI/OI modes to represent the block of memory transferred. Instead, use BLKmode for all memory transfers, explicitly setting mem_size. Bootstrapped and c

[PATCH][AArch64 array_mode 3/8] Stop using EImode in aarch64-simd.md and iterators.md

2015-09-15 Thread Alan Lawrence

The V_THREE_ELEM attribute used BLKmode for most sizes, but occasionally EImode. This patch changes to BLKmode in all cases, explicitly setting memory size (thus, preserving size for the cases that were EImode, and setting size for the first time for cases that were already BLKmode). The patterns

Re: [PATCH][AArch64 0/8] Add D-registers to TARGET_ARRAY_MODE_SUPPORTED_P

2015-09-15 Thread Alan Lawrence

Here's a rebased version, which fixes conflicts with float16 and Christophe's fixes for bigendian lane indices. Also fiddled around with whitespace in aarch64-simd.md

Re: [PATCH 2/5] completely_scalarize arrays as well as records.

2015-09-14 Thread Alan Lawrence

Ping. (Rerevert with 5 lines extra paranoia in scalarizable_type_p). Thanks, Alan On 08/09/15 13:43, Martin Jambor wrote: Hi, On Mon, Sep 07, 2015 at 02:15:45PM +0100, Alan Lawrence wrote: In-Reply-To: <55e0697d.2010...@arm.com> On 28/08/15 16:08, Alan Lawrence wrote: Alan Lawrence

Re: [PATCH] vectorizing conditional expressions (PR tree-optimization/65947)

2015-09-14 Thread Alan Lawrence

On 11/09/15 14:19, Bill Schmidt wrote: A secondary concern for powerpc is that REDUC_MAX_EXPR produces a scalar that has to be broadcast back to a vector, and the best way to implement it for us already has the max value in all positions of a vector. But that is something we should be able to f

Re: [AArch64] Fix vcvt_high_f64_f32 and vcvt_figh_f32_f64 intrinsics.

2015-09-10 Thread Alan Lawrence

On 09/09/15 11:31, Alan Lawrence wrote: Hmmm, hang on. I'm not quite sure what the actual issue/bug is here, but is this the same issue as my patch 12 "with BE RTL fix"? (https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01482.html, explanation last at https://gcc.gnu.org/ml/gcc-

Re: [AArch64] Fix vcvt_high_f64_f32 and vcvt_figh_f32_f64 intrinsics.

2015-09-09 Thread Alan Lawrence

Hmmm, hang on. I'm not quite sure what the actual issue/bug is here, but is this the same issue as my patch 12 "with BE RTL fix"? (https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01482.html, explanation last at https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02365.html) I pushed this as r227551 las

Re: [PATCH 15/15][ARM] Update sourcebuild.texi with testsuite/effective-target hooks

2015-09-08 Thread Alan Lawrence

Original message here: https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02363.html On 28/07/15 12:27, Alan Lawrence wrote: > This documents the change to arm_neon_fp16_ok in the first patch; the addition > of arm_neon_fp16_hw_ok in the last patch; and corrects a cross-reference. > > (I

Re: [PATCH][AArch64] Improve code generation for float16 vector code

2015-09-08 Thread Alan Lawrence

On 08/09/15 09:26, James Greenhalgh wrote: On Tue, Sep 08, 2015 at 09:21:08AM +0100, James Greenhalgh wrote: On Mon, Sep 07, 2015 at 02:09:01PM +0100, Alan Lawrence wrote: On 04/09/15 13:32, James Greenhalgh wrote: In that case, these should be implemented as inline assembly blocks. As it

Re: [PATCH 14/15][ARM/AArch64 Testsuite]Add test of vcvt{,_high}_i{f32_f16,f16_f32}

2015-09-08 Thread Alan Lawrence

Ping. (Thanks, Christophe!) Correct version here: https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01501.html Cheers, Alan On 25/08/15 15:21, Christophe Lyon wrote: On 25 August 2015 at 15:57, Alan Lawrence wrote: Sorry - wrong version posted. The hunk for add_options_for_arm_neon_fp16 has

Re: [PATCH 13/15][ARM/AArch64 Testsuite] Add float16 tests to advsimd-intrinsics testsuite

2015-09-08 Thread Alan Lawrence

Ping. (Thanks, Christophe!). Original message: https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02366.html On 25/08/15 14:28, Alan Lawrence wrote: Christophe Lyon wrote: On 28 July 2015 at 13:26, Alan Lawrence wrote: This is a respin of https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00488.html

Re: [PATCH 2/5] completely_scalarize arrays as well as records.

2015-09-07 Thread Alan Lawrence

In-Reply-To: <55e0697d.2010...@arm.com> On 28/08/15 16:08, Alan Lawrence wrote: > Alan Lawrence wrote: >> >> Right. I think VLA's are the problem with pr64312.C also. I'm testing a fix >> (that declares arrays with any of these properties as unscalarizable). &

[PATCH][AArch64] Improve code generation for float16 vector code

2015-09-07 Thread Alan Lawrence

On 04/09/15 13:32, James Greenhalgh wrote: > In that case, these should be implemented as inline assembly blocks. As it > stands, the code generation for these intrinsics will be very poor with this > patch applied. > > I'm going to hold off OKing this until I see a follow-up to fix the code > gene

Re: [PR64164] drop copyrename, integrate into expand

2015-09-03 Thread Alan Lawrence

On 02/09/15 23:12, Alexandre Oliva wrote: On Sep 2, 2015, Alan Lawrence wrote: One more failure to report, I'm afraid. On AArch64 Bigendian, aapcs64/func-ret-4.c ICEs in simplify_subreg (line refs here are from r227348): Thanks. The failure mode was different in the current, revampe

Re: [PR64164] drop copyrename, integrate into expand

2015-09-02 Thread Alan Lawrence

On 14/08/15 19:57, Alexandre Oliva wrote: I'm glad it appears to be working to everyone's satisfaction now. I've just committed it as r226901, with only a context adjustment to account for a change in use_register_for_decl in function.c. /me crosses fingers :-) Here's the patch as checked in:

Re: [testsuite] Don't xfail gcc.dg/vect/no-scevccp-outer-11.c

2015-09-01 Thread Alan Lawrence

Rainer Orth wrote: It seems that since 20150717, gcc.dg/vect/no-scevccp-outer-11.c XPASSes everywhere: XPASS: gcc.dg/vect/no-scevccp-outer-11.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1 To reduce testsuite noise, I'd like to remove the xfail as follows. Tested with the appropriate r

Re: [PATCH 2/5] completely_scalarize arrays as well as records

2015-08-28 Thread Alan Lawrence

Alan Lawrence wrote: Right. I think VLA's are the problem with pr64312.C also. I'm testing a fix (that declares arrays with any of these properties as unscalarizable). Monday is a bank holiday in UK and so I expect to get back to you on Tuesday. --Alan In the meantime I'

[PATCH] Tidy tree-ssa-dom.c: Use dom_valueize more.

2015-08-28 Thread Alan Lawrence

The code in the dom_valueize function is duplicated a number of times; so, call the function. Also remove a comment in lookup_avail_expr re const_and_copies, describing one of said duplicates, that looks like it was superseded in r87787. Bootstrapped + check-gcc on x86-none-linux-gnu. gcc/Change

Re: [PATCH 2/5] completely_scalarize arrays as well as records

2015-08-28 Thread Alan Lawrence

Richard Biener wrote: On Fri, 28 Aug 2015, Alan Lawrence wrote: Christophe Lyon wrote: I asked because I assumed that Alan saw it pass in his configuration. Bah. No - I now discover a problem in my C++ testsuite setup that was causing a large number of tests to not be executed. I see the

Re: [PATCH 2/5] completely_scalarize arrays as well as records

2015-08-28 Thread Alan Lawrence

Christophe Lyon wrote: I asked because I assumed that Alan saw it pass in his configuration. Bah. No - I now discover a problem in my C++ testsuite setup that was causing a large number of tests to not be executed. I see the problem too now, investigating --Alan

Fixing sra-12.c (was: Re: [PATCH 2/5] completely_scalarize arrays as well as records)

2015-08-27 Thread Alan Lawrence

Jeff Law wrote: diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sra-15.c b/gcc/testsuite/gcc.dg/tree-ssa/sra-15.c new file mode 100644 index 000..e251058 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/sra-15.c @@ -0,0 +1,38 @@ +/* Verify that SRA total scalarization works on records containi

Re: [PATCH 2/5] completely_scalarize arrays as well as records

2015-08-27 Thread Alan Lawrence

Martin Jambor wrote: > > First, I would be much > happier if you added a proper comment to scalarize_elem function which > you forgot completely. The name is not very descriptive and it has > quite few parameters too. > > Second, this patch should also fix PR 67283. It would be great if you > cou

Re: [PATCH 1/5] Refactor completely_scalarize_var

2015-08-27 Thread Alan Lawrence

Martin Jambor wrote: > > If you change what the function does, you have to change the comment > too. If I am not mistaken, even with the whole patch set applied, the > first sentence would still be: "Create total_scalarization accesses > for all scalar type fields in VAR and for VAR as a whole." A

Re: [PATCH 2/5] completely_scalarize arrays as well as records

2015-08-26 Thread Alan Lawrence

Richard Biener wrote: One extra question is does the way we limit total scalarization work well for arrays? I suppose we have either sth like the maximum size of an aggregate we scalarize or the maximum number of component accesses we create? Only the former and that would be kept intact.

Re: [RFC 4/5] Handle constant-pool entries

2015-08-26 Thread Alan Lawrence

Jeff Law wrote: The question I have is why this differs from the effects of patch #5. That would seem to indicate that there's things we're not getting into the candidate tables with this approach?!? I'll answer this first, as I think (Richard and) Martin have identified enough other issues

[PATCH][AArch64 array_mode 8/8] Add d-registers to TARGET_ARRAY_MODE_SUPPORTED_P

2015-08-26 Thread Alan Lawrence

This adds an AARCH64_VALID_SIMD_DREG_MODE exactly paralleling the existing ...QREG... macro, and as a driveby fixes mode->(MODE) in the latter. The new test now compiles (at -O3) to: test_1: add v1.2s, v1.2s, v5.2s add v2.2s, v2.2s, v6.2s add v3.2s, v3.2s, v7.2

[PATCH][AArch64 array_mode 7/8] Combine the expanders using VSTRUCT:nregs

2015-08-26 Thread Alan Lawrence

The previous patches leave ld[234]_lane, st[234]_lane, and ld[234]r expanders all nearly identical, so we can easily parameterize across the number of lanes and combine them. For the ld_lane pattern, I switched from the VCONQ attribute to just using the MODE attribute, this is identical for all

[PATCH][AArch64 array_mode 5/8] Remove V_FOUR_ELEM, again using BLKmode + set_mem_size.

2015-08-26 Thread Alan Lawrence

This removes V_FOUR_ELEM in the same way that patch 3 removed V_THREE_ELEM, again using BLKmode + set_mem_size. (This makes the four-lane expanders very similar to the three-lane expanders, and they will be combined in patch 7.) bootstrapped and check-gcc on aarch64-none-linux-gnu gcc/ChangeLog:

[PATCH][AArch64 array_mode 3/8] Stop using EImode in aarch64-simd.md and iterators.md

2015-08-26 Thread Alan Lawrence

The V_THREE_ELEM attribute used BLKmode for most sizes, but occasionally EImode. This patch changes to BLKmode in all cases, explicitly setting memory size (thus, preserving size for the cases that were EImode, and setting size for the first time for cases that were already BLKmode). The patterns

[PATCH][AArch64 array_mode 6/8] Remove V_TWO_ELEM, again using BLKmode + set_mem_size.

2015-08-26 Thread Alan Lawrence

Same logic as previous; this makes the 2-, 3-, and 4-lane expanders all follow the same pattern. bootstrapped and check-gcc on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-simd.md (aarch64_simd_ld2r, aarch64_vec_load_lanesoi_lane, aarch64_vec_store_lan

[PATCH][AArch64 array_mode 2/8] Remove VSTRUCT_DREG, use BLKmode for d-reg aarch64_st/ld expands

2015-08-26 Thread Alan Lawrence

aarch64_st and aarch64_ld expanders back onto 12 insns aarch64_{ld,st}{2,3,4}_dreg (for VD and DX modes), using the VSTRUCT_DREG iterator over TI/EI/OI modes to represent the block of memory transferred. Instead, use BLKmode for all memory transfers, explicitly setting mem_size. Bootstrapped and c

[PATCH][AArch64 array_mode 4/8] Remove EImode

2015-08-26 Thread Alan Lawrence

This removes EImode from the (AArch64) compiler, and all mention of or support for it. bootstrapped and check-gcc on aarch64-none-linux-gnu gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_simd_attr_length_rglist): Update comment. * config/aarch64/aarch64-builtins.c (e

[PATCH][AArch64 array_mode 1/8] Rename vec_store_lanes_lane to aarch64_vec_store_lanes_lane

2015-08-26 Thread Alan Lawrence

vec_store_lanes{oi,ci,xi}_lane are not standard pattern names, so using them in aarch64-simd.md is misleading. This adds an aarch64_ prefix to those pattern names, paralleling aarch64_vec_load_lanes_lane. bootstrapped and check-gcc on aarch64-none-linux-gnu gcc/ChangeLog: * config/aarc

[PATCH][AArch64 0/8] Add D-registers to TARGET_ARRAY_MODE_SUPPORTED_P

2015-08-26 Thread Alan Lawrence

The end goal of this series of patches is to enable 64bit vector modes for TARGET_ARRAY_MODE_SUPPORTED_P, achieved in the last patch. At present, doing so causes ICEs with illegal subregs (e.g. returning the middle bits from a large int mode covering 3 vectors); the patchset avoids these by first r

Re: [PATCH 14/15][ARM/AArch64 Testsuite]Add test of vcvt{,_high}_{f16_f32,f32_f16}

2015-08-25 Thread Alan Lawrence

Christophe Lyon wrote: On 28 July 2015 at 13:27, Alan Lawrence wrote: gcc/testsuite/ChangeLog: * gcc.target/aarch64/advsimd-intrinsics/advsimd-intrinsics.exp: set additional flags for neon-fp16 support. * gcc.target/aarch64/advsimd-intrinsics/vcvt_f16.c: New. Is that

Re: [PATCH 14/15][ARM/AArch64 Testsuite]Add test of vcvt{,_high}_i{f32_f16,f16_f32}

2015-08-25 Thread Alan Lawrence

Sorry - wrong version posted. The hunk for add_options_for_arm_neon_fp16 has moved to the previous patch! This version also fixes some whitespace issues. gcc/testsuite/ChangeLog: * gcc.target/aarch64/advsimd-intrinsics/vcvt_f16.c: New. * lib/target-supports.exp (check_effe

Re: [PATCH 13/15][ARM/AArch64 Testsuite] Add float16 tests to advsimd-intrinsics testsuite

2015-08-25 Thread Alan Lawrence

Christophe Lyon wrote: On 28 July 2015 at 13:26, Alan Lawrence wrote: This is a respin of https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00488.html, fixing up the testsuite for float16 vectors. Relative to the previous version, most of the additions to the tests are now within #if..#endif such

Re: [PATCH 0/15][ARM/AArch64] Add support for float16_t vectors (v3)

2015-08-25 Thread Alan Lawrence

Alan Lawrence wrote: All AArch64 patches are unchanged from previous version. However, in response to discussion, the ARM patches are changed (much as I suggested https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02249.html); this version: * Hides the existing vcvt_f16_f32 and vcvt_f32_f16

[PATCH 3/5] Build ARRAY_REFs when the base is of ARRAY_TYPE.

2015-08-25 Thread Alan Lawrence

When SRA completely scalarizes an array, this patch changes the generated accesses from e.g. MEM[(int[8] *)&a + 4B] = 1; to a[1] = 1; This overcomes a limitation in dom2, that accesses to equivalent chunks of e.g. MEM[(int[8] *)&a] are not hashable_expr_equal_p with accesses to e.g. ME
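A minimal sketch, not taken from the patch, of why the two access forms in this snippet name the same element: a store at a byte offset of one `int` from the array base and a store to index 1 hit the same location. The helper names (`store_by_byte_offset`, `store_by_index`, `same_location_demo`) are hypothetical, and the offset is written as `sizeof(int)` rather than assuming the 4-byte `int` implied by `+ 4B`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Store through a byte offset from the array base, roughly what
   MEM[(int[8] *)&a + 4B] = 1 denotes when int is 4 bytes wide. */
static void store_by_byte_offset(int (*a)[8], size_t byte_off, int value)
{
    assert(byte_off % sizeof(int) == 0 && byte_off < sizeof *a);
    memcpy((char *)a + byte_off, &value, sizeof value);
}

/* Store through an ordinary array reference, i.e. a[idx] = value. */
static void store_by_index(int (*a)[8], size_t idx, int value)
{
    assert(idx < 8);
    (*a)[idx] = value;
}

/* Returns 1 when both forms wrote the same element of the array. */
int same_location_demo(void)
{
    int x[8] = {0};
    int y[8] = {0};

    store_by_byte_offset(&x, 1 * sizeof(int), 1); /* offset form */
    store_by_index(&y, 1, 1);                     /* ARRAY_REF-like form */

    return memcmp(x, y, sizeof x) == 0 && x[1] == 1;
}
```

The two stores are equivalent at the source level; the snippet's point is that the compiler's internal comparison did not treat the two representations as equal, which is why the patch prefers the indexed form.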

[PATCH 1/5] Refactor completely_scalarize_var

2015-08-25 Thread Alan Lawrence

This is a small refactoring/renaming patch, it just moves the call to "completely_scalarize_record" out from completely_scalarize_var, and renames the latter to create_total_scalarization_access. This is because the next patch needs to drop the "_record" suffix and I felt it would be confusing to

[RFC 5/5] Always completely replace constant pool entries

2015-08-25 Thread Alan Lawrence

I used this as a means of better-testing the previous changes, as it exercises the constant replacement code a whole lot more. Indeed, quite a few tests are now optimized away to nothing on AArch64... Always pulling in constants, is almost certainly not what we want, but we may nonetheless want so

[PATCH 2/5] completely_scalarize arrays as well as records

2015-08-25 Thread Alan Lawrence

This changes the completely_scalarize_record path to also work on arrays (thus allowing records containing arrays, etc.). This just required extending the existing type_consists_of_records_p and completely_scalarize_record methods to handle things of ARRAY_TYPE as well as RECORD_TYPE. Hence, I rena

[PATCH 0/5][tree-sra.c] PR/63679 Make SRA replace constant pool loads

2015-08-25 Thread Alan Lawrence

ssa-dom-cse-2.c fails on a number of platforms because the input array is pushed out to the constant pool, preventing later stages from folding away the entire computation. This patch series fixes the failure by extending SRA to pull the constants back in. This is my first patch(set) to SRA and as

[RFC 4/5] Handle constant-pool entries

2015-08-25 Thread Alan Lawrence

This makes SRA replace loads of records/arrays from constant pool entries, with elementwise assignments of the constant values, hence, overcoming the fundamental problem in PR/63679. As a first pass, the approach I took was to look for constant-pool loads as we scanned through other accesses, and
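As a rough source-level illustration of the transformation this snippet describes (an assumption about its effect, not the SRA implementation), the sketch below contrasts a block copy from a constant aggregate with the equivalent elementwise assignments. The type `pair_with_array`, the object `pool_entry`, and both function names are hypothetical stand-ins; a real constant-pool entry is created by the compiler, not written by hand.

```c
#include <assert.h>
#include <string.h>

/* A record containing an array, the kind of aggregate discussed. */
struct pair_with_array {
    int tag;
    int vals[3];
};

/* Hand-written stand-in for a compiler-generated constant-pool entry. */
static const struct pair_with_array pool_entry = { 7, { 1, 2, 3 } };

/* Before: the local is filled by a block copy from the constant,
   so later passes see an opaque load from memory. */
int sum_via_block_copy(void)
{
    struct pair_with_array local;
    memcpy(&local, &pool_entry, sizeof local);
    return local.tag + local.vals[0] + local.vals[1] + local.vals[2];
}

/* After: each element is assigned its constant value directly,
   which leaves only scalars that are easy to fold. */
int sum_via_elementwise(void)
{
    int tag = 7;
    int v0 = 1;
    int v1 = 2;
    int v2 = 3;
    return tag + v0 + v1 + v2;
}
```

Both functions compute the same value; the second form exposes the constants as scalars, which is what allows later stages to fold the computation away.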


101 - 200 of 583 matches
