[PATCH AArch64 2/2] Replace temporary inline assembler for vget_high

2014-08-12 Thread Alan Lawrence
This patch replaces the current inline assembler for the vget_high intrinsics in arm_neon.h with a sequence of other calls, in a similar fashion to vget_low. Unlike the assembler, these are all transparent to the front-end, so should enable better optimization through the mid-end. Tested

[PATCH AArch64 1/3] Don't disparage add/sub in SIMD registers

2014-08-12 Thread Alan Lawrence
(It is no more expensive.) gcc/ChangeLog: * config/aarch64/aarch64.md (subdi3, adddi3_aarch64): Don't penalize SIMD reg variant. diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index

[PATCH AArch64 2/3] Add SIMD-reg variants of logical operators and/ior/xor/not

2014-08-12 Thread Alan Lawrence
This patch adds SIMD register variants for and, ior, xor and not - similarly to add/sub, the H/W supports it, and it'll be more efficient if the values are there already, e.g. if passed as [u]int64x1_t parameters. gcc/ChangeLog: * config/aarch64/aarch64.md (<optab><mode>3, one_cmpl<mode>2):

[PATCH AArch64 0/3] Optimizations for 64x1 vectors, also fixes/enables XOR

2014-08-12 Thread Alan Lawrence
Following the change to make arm_neon.h's (u?)int64x1_t types into vectors, these types are now passed in the SIMD registers rather than general purpose registers, which often results in poor quality code in functions taking or returning these types. Often values are moved from vector registers

[PATCH AArch64 3/3] Fix XOR_one_cmpl pattern; add SIMD-reg variants for BIC,ORN,EON

2014-08-12 Thread Alan Lawrence
[When I wrote that xor was broken on GPRs and this fixes it, I meant xor_one_cmpl rather than xor, sorry!] The pattern for xor_one_cmpl never matched, due to the action of combine_simplify_rtx; hence, separate this pattern out from that for ORN/BIC. ORN/BIC have equivalent SIMD-reg variants,

Re: [PATCH AArch64 3/3] Fix XOR_one_cmpl pattern; add SIMD-reg variants for BIC,ORN,EON

2014-08-12 Thread Alan Lawrence
...patch attached... Alan Lawrence wrote: [When I wrote that xor was broken on GPRs and this fixes it, I meant xor_one_cmpl rather than xor, sorry!] The pattern for xor_one_cmpl never matched, due to the action of combine_simplify_rtx; hence, separate this pattern out from that for ORN/BIC

[PATCH AArch64] Prefer dup to zip for vec_perm_const; enable dup for bigendian; add testcase.

2014-08-04 Thread Alan Lawrence
At the moment, for two-element vectors, __builtin_shuffle (vector, (mask) {C, C}) for identical constants C outputs a zip (with both argument vectors the same) rather than a dup. Dup is more obvious and easier to read, so prefer it. For big-endian, aarch64_evpc_dup always aborts; however tests

[RFC][Vectorizer, AArch64] Fix PR/61114 by redefining REDUC_xxx_EXPR tree codes to return scalars

2014-08-01 Thread Alan Lawrence
This fixes PR/61114 by redefining the REDUC_{MIN,MAX,PLUS}_EXPR tree codes. These are presently documented as producing a vector with the result in element 0, and this is inconsistent with their use in tree-vect-loop.c (which on bigendian targets pulls the bits out of the other end of the

[PATCH AArch64] Removed unused get_lane and dup_lane builtins.

2014-08-01 Thread Alan Lawrence
None of the variants of __builtin_aarch64_get_lane or __builtin_aarch64_dup_lane are used in arm_neon.h (vdup_lane uses vget_lane and vdup_n, vget_lane uses be_checked_get_lane to do an endianness swap, vdup_n uses gcc vector extension code). So remove them. Regression tested on

[PATCH AArch64] Rename [u]int32x1_t to [u]int32_t (resp 16x1, 8x1) in arm_neon.h

2014-07-24 Thread Alan Lawrence
The ACLE spec does not mention the int32x1_t, uint32x1_t, int16x1_t, uint16x1_t, int8x1_t or uint8x1_t types currently in arm_neon.h, but just 'standard' types int32_t, int16_t, etc. This patch is a global search-and-replace across arm_neon.h (and the tests that depend on it). Regressed

Re: Does anyone use Ada on Alpha?

2014-07-24 Thread Alan Lawrence
[_ZNKSt7num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE6do_getES3_S3_RSt8ios_baseRSt12_Ios_IostateRe]+0x2ac): additional relocation overflows omitted from the output collect2: error: ld returned 1 exit status ../../gcc-4.9.1/gcc/ada/gcc-interface/Make-lang.in:596: recipe for target 'gnat1' faile Alan Lawrence

Re: [PATCH] Relax check against commuting XOR and ASHIFTRT in combine.c

2014-07-17 Thread Alan Lawrence
. So I'd defer to the maintainers as to whether one might be preferable to the other...(but my unproven suspicion is that the two are equivalent, and no case where result_mode != shift_mode is possible!) --Alan Alan Lawrence wrote: Thanks for the suggestions! I think I've got a reasonably

Re: [PATCH] Relax check against commuting XOR and ASHIFTRT in combine.c

2014-07-16 Thread Alan Lawrence
to produce an ashiftrt of one mode containing an XOR of another without an intervening sign_extend, zero_extend or subreg.) --Alan Jeff Law wrote: On 06/30/14 13:05, Alan Lawrence wrote: combine.c includes a check which prevents (ashiftrt (xor A C2) C1) from being commuted to (xor (ashiftrt A C1

Does anyone use Ada on Alpha?

2014-07-16 Thread Alan Lawrence
...as I've not managed to build such a gcc. If so, is there any chance you could please test check-ada with the following patch (in gcc/ directory), which rolls back r76965: Index: combine.c === --- combine.c (revision 212523)

[PATCH 4.9][ARM] Backport r211369 PR/61062 Fix arm_neon.h ZIP/UZP/TRN for Bigendian

2014-07-08 Thread Alan Lawrence
Alan Lawrence alan.lawre...@arm.com * config/arm/arm_neon.h (vtrn_s8, vtrn_s16, vtrn_u8, vtrn_u16, vtrn_p8, vtrn_p16, vtrn_s32, vtrn_f32, vtrn_u32, vtrnq_s8, vtrnq_s16, vtrnq_s32, vtrnq_f32, vtrnq_u8, vtrnq_u16, vtrnq_u32, vtrnq_p8, vtrnq_p16, vzip_s8, vzip_s16

[PATCH 4.9][AArch64] Backport 211892: PR/60825 Make float64x1_t in arm_neon.h a proper vector type

2014-07-08 Thread Alan Lawrence
vreinterpret implementation gcc/ChangeLog: Backport r211892 from mainline. 2014-06-23 Alan Lawrence alan.lawre...@arm.com PR target/60825 * config/aarch64/aarch64.c (aarch64_simd_mangle_map): Add entry for V1DFmode. * config/aarch64/aarch64

[PATCH 4.9][AArch64][testsuite] Backport r211502: PR/59843 Fix ICE on singleton vector of float on AArch64

2014-07-04 Thread Alan Lawrence
No regressions on aarch64-none-elf; new tests passing on aarch64-none-elf, arm-none-eabi, x86_64-unknown-linux-gnu: NA-PASS gcc.dg/vect/vect-singleton_1.c (test for warnings, line 20) NA-PASS gcc.dg/vect/vect-singleton_1.c (test for excess errors) gcc/ChangeLog: *

Re: [PATCH 4.9][AArch64][testsuite] Backport r211502: PR/59843 Fix ICE on singleton vector of float on AArch64

2014-07-04 Thread Alan Lawrence
Patch here. Alan Lawrence wrote: No regressions on aarch64-none-elf; new tests passing on aarch64-none-elf, arm-none-eabi, x86_64-unknown-linux-gnu: NA-PASS gcc.dg/vect/vect-singleton_1.c (test for warnings, line 20) NA-PASS gcc.dg/vect/vect-singleton_1.c (test for excess errors) gcc

[PATCH][ARM/AArch64 Testsuite] Fix vext[us]64_1.c test on ARM by unsharing test body

2014-07-03 Thread Alan Lawrence
Moving into own thread from https://gcc.gnu.org/ml/gcc-patches/2014-06/msg01895.html This fixes the compilation failures of gcc.target/arm/simd/vexts64_1.c and gcc.target/arm/simd/vextu64_1.c that I introduced in r by unsharing the test body on AArch64. (As [u]int64x1_t are vector types on

[PATCH] Relax check against commuting XOR and ASHIFTRT in combine.c

2014-06-30 Thread Alan Lawrence
combine.c includes a check which prevents (ashiftrt (xor A C2) C1) from being commuted to (xor (ashiftrt A C1) (ashiftrt C2 C1)) for constants C1, C2 if C2 has its sign bit set. Specifically, this prevents (ashiftrt (not A) C1) from being commuted to (not (ashiftrt A C1)) because the
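
As a rough illustration of why the commutation is sound when the shift and the XOR are done in the same mode, the sketch below checks the identity on a handful of 32-bit values, including constants with the sign bit set (C2 = -1 is the NOT case). The helpers asr32, xor_shift_commutes and count_commute_failures are hypothetical names invented for this example; none of this is code from combine.c, and the case where the result mode and shift mode differ is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable arithmetic right shift of a 32-bit value, 0 <= n <= 31.
   Right-shifting a negative signed value is implementation-defined in C,
   so negative inputs are handled by shifting the (non-negative) complement. */
static int32_t asr32(int32_t x, unsigned n)
{
    assert(n < 32);
    return x < 0 ? ~(~x >> n) : x >> n;
}

/* Returns 1 when
     (ashiftrt (xor a c2) c1) == (xor (ashiftrt a c1) (ashiftrt c2 c1))
   holds for the given operands, 0 otherwise. */
static int xor_shift_commutes(int32_t a, int32_t c2, unsigned c1)
{
    int32_t lhs = asr32(a ^ c2, c1);
    int32_t rhs = asr32(a, c1) ^ asr32(c2, c1);
    return lhs == rhs;
}

/* Tries every pair of sample values with every shift count 0..31 and
   returns the number of combinations for which the identity fails. */
static int count_commute_failures(void)
{
    static const int32_t samples[] = {
        0, 1, -1, 0x12345678, -0x1234567, INT32_MAX, INT32_MIN
    };
    const size_t n = sizeof samples / sizeof samples[0];
    int failures = 0;

    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            for (unsigned c1 = 0; c1 < 32; c1++)
                if (!xor_shift_commutes(samples[i], samples[j], c1))
                    failures++;
    return failures;
}
```

Calling count_commute_failures() from a test harness exercises sign-bit-set constants as well as the NOT case; this is a spot check on sample values, not a proof.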

[Obvious][AArch64 testsuite] Add --save-temps to singleton_intrinsics_1.c test.

2014-06-24 Thread Alan Lawrence
Scan-assembler test was running with dg-do assemble and not generating any assembler to scan. Tested on aarch64-none-elf and aarch64_be-none-elf; 40 * UNRESOLVED-PASS tests in singleton_intrinsics_1.c Patch (below) committed as r211934. --Alan -- Index:

Re: [PATCH AArch64 2/2] PR/60825 Make {int,uint}64x1_t in arm_neon.h a proper vector type

2014-06-24 Thread Alan Lawrence
: Likewise. * gcc.target/aarch64/simd/ext_u64_1.c: Likewise. * gcc.target/aarch64/simd/ext_s64.x: Remove. * gcc.target/aarch64/simd/ext_u64.x: Remove. --Alan James Greenhalgh wrote: On Thu, Jun 19, 2014 at 01:30:32PM +0100, Alan Lawrence wrote: diff --git a/gcc/testsuite

[PATCH 4.9 ARM] Backport r210219: Neon Intrinsics TLC - remove ML

2014-06-23 Thread Alan Lawrence
As for 4.8, I'm intending to backport the ZIP/UZP/TRN fix for ARM big-endian in r211369 of mainline. That patches arm_neon.h, so again we need to remove the OCAML code by which that file is autogenerated...ok? --Alan commit e83cb5fff3687316ff391e9e7a8c65df2d35c880 Author: Alan Lawrence

[PATCH AArch64 0/2] PR/60825 Make {int,uint,float}64x1_t in arm_neon.h a proper vector type

2014-06-19 Thread Alan Lawrence
According to the ARM C Language Extensions the 64x1 types should all be passed in the SIMD registers rather than GPRs, and should not be assignment-compatible with [u]int64_t / float64_t (as they are at present). These two patches (first for float64x1_t, second for [u]int64x1_t) make these

[PATCH AArch64 1/2] PR/60825 Make float64x1_t in arm_neon.h a proper vector type

2014-06-19 Thread Alan Lawrence
equivalent) type signature; and add some new ABI tests. gcc/ChangeLog: 2014-06-19 Alan Lawrence alan.lawre...@arm.com * config/aarch64/aarch64.c (aarch64_simd_mangle_map): Add entry for V1DFmode. * config/aarch64/aarch64-builtins.c (aarch64_simd_builtin_type_mode

[PATCH 4.8 ARM] Backport of r211369: PR/61062 Fix arm_neon.h ZIP/UZP/TRN for bigendian

2014-06-19 Thread Alan Lawrence
This backports straightforwardly; no regressions on arm-none-eabi or armeb-none-eabi, and FAIL-PASS of the new ZIP, UZP, and TRN execution tests from r209908, r209947 and r210422 (running locally). --Alan diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h index 4d945ce..a930e05

[PATCH 4.8 ARM] Backport r210219: Neon Intrinsics TLC - remove ML

2014-06-16 Thread Alan Lawrence
: Delete. * doc/arm-neon-intrinsics.texi: Update comment. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index c9ab54f..80fa36b 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,14 @@ +2014-06-16 Alan Lawrence alan.lawre...@arm.com + + Backport r210219 from mainline + + 2014-05-08

[Committed, Obvious] Remove duplicated vect-singleton_1.c test body

2014-06-13 Thread Alan Lawrence
) - -TEST (char, int8x1_t, s8) Index: gcc/testsuite/ChangeLog === --- gcc/testsuite/ChangeLog (revision 211634) +++ gcc/testsuite/ChangeLog (revision 211635) @@ -1,3 +1,7 @@ +2014-06-13 Alan Lawrence alan.lawre...@arm.com + + * gcc.dg/vect

Re: [PATCH ARM] PR/61062 Fix arm_neon.h ZIP/UZP/TRN for bigendian

2014-06-09 Thread Alan Lawrence
Baking on trunk as of rev 211369. --Alan Ramana Radhakrishnan wrote: On Wed, May 14, 2014 at 2:52 PM, Alan Lawrence alan.lawre...@arm.com wrote: Hi, Due to differences in how the ARM C Language Extensions and gcc's vector extensions deal with indices within vectors, the __builtin_shuffle

Re: [AArch64/ARM 2/3] Detect EXT patterns to vec_perm_const, use for EXT intrinsics

2014-06-03 Thread Alan Lawrence
Ok, this fixes it. We'll output an ext...#0, which is little more than a MOV, but that seems appropriate in the circumstance. Regression tested check-gcc and check-g++ on aarch64-none-elf and aarch64_be-none-elf. Ok for trunk? --Alan Alan Lawrence wrote: Yes, reproduced. Seems the mid-end

Re: [AArch64/ARM 2/3] Detect EXT patterns to vec_perm_const, use for EXT intrinsics

2014-06-03 Thread Alan Lawrence
gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_evpc_ext): Allow+handle location==0. ? --Alan Marcus Shawcroft wrote: On 3 June 2014 11:21, Alan Lawrence alan.lawre...@arm.com wrote: Ok, this fixes it. We'll output an ext...#0, which is little more than a MOV, but that seems

Re: [AArch64/ARM 2/3] Recognize shuffle patterns for REV instructions on AArch64, rewrite intrinsics.

2014-06-03 Thread Alan Lawrence
to aarch64_evpc_rev ahead of the call to aarch64_evpc_ext in aarch64_expand_vec_perm_const_1. The actual patch I committed is attached. Cheers, Alan Marcus Shawcroft wrote: On 15 May 2014 16:52, Alan Lawrence alan.lawre...@arm.com wrote: 2014-05-15 Alan Lawrence alan.lawre...@arm.com

Re: [AArch64/ARM 2/3] Detect EXT patterns to vec_perm_const, use for EXT intrinsics

2014-06-03 Thread Alan Lawrence
Pushed as r211177. Thanks, Alan Marcus Shawcroft wrote: On 3 June 2014 12:19, Alan Lawrence alan.lawre...@arm.com wrote: gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_evpc_ext): Allow+handle location==0. ? Allow and handle location == 0. Otherwise OK /Marcus

Re: [AArch64/ARM 2/3] Detect EXT patterns to vec_perm_const, use for EXT intrinsics

2014-06-02 Thread Alan Lawrence
/build/cross-validation/gcc/211058/report-build-info.html Can you check? Thanks, Christophe. On 19 May 2014 14:53, Marcus Shawcroft marcus.shawcr...@gmail.com wrote: On 23 April 2014 21:22, Alan Lawrence alan.lawre...@arm.com wrote: 2014-03-27 Alan Lawrence alan.lawre...@arm.com

[PATCH AArch64 0/2] Correct signedness of builtins, remove casts from arm_neon.h

2014-05-29 Thread Alan Lawrence
The __builtin_ functions registered by aarch64_init_simd_builtins use signed and/or unsigned types according to the qualifiers defined in aarch64-builtins.c and used in aarch64-simd-builtins.def. These __builtin functions are then used in arm_neon.h, with explicit casts converting between the

[PATCH AArch64 1/2] Correct signedness of builtins, remove casts from arm_neon.h

2014-05-29 Thread Alan Lawrence
, vsriq_n_u8, vsriq_n_u16, vsriq_n_u32, vsriq_n_u64, vsrid_n_u64): Remove casts. Alan Lawrence wrote: The __builtin_ functions registered by aarch64_init_simd_builtins use signed and/or unsigned types according to the qualifiers defined in aarch64-builtins.c and used in aarch64-simd

[PATCH AArch64 2/2] Correct signedness of builtins, remove casts from arm_neon.h

2014-05-29 Thread Alan Lawrence
This adds another set of qualifiers to aarch64-builtins.c, and removes more casts from arm_neon.h, for the suqadd, ushl, urshl, urshr_n, ushll_n, and sshl intrinsics. gcc/ChangeLog: * gcc/config/aarch64/aarch64-builtins.c (aarch64_types_binop_ssu_qualifiers): New static data.

Re: [PATCH AArch64] Remove from arm_neon.h functions not in the spec

2014-05-29 Thread Alan Lawrence
functions as they are not in the spec. Alan Lawrence wrote: No, hold that, vfmaq_n_f64 has been added back in the latest version (to which I linked). Hang on... --Alan Alan Lawrence wrote: arm_neon.h contains a bunch of functions (for example, the wonderful vcgez_u* intrinsics - that's

Re: [AArch64/ARM 3/3] Add execution tests of ARM EXT intrinsics

2014-05-29 Thread Alan Lawrence
Alan Lawrence alan.lawre...@arm.com gcc.target/arm/simd/vextQf32.c: New file. gcc.target/arm/simd/vextQp16.c: New file. gcc.target/arm/simd/vextQp8.c: New file. gcc.target/arm/simd/vextQs16.c: New file. gcc.target/arm/simd/vextQs32.c: New file

[PATCH AArch64] Remove from arm_neon.h functions not in the spec

2014-05-27 Thread Alan Lawrence
This patch just deletes those intrinsics. OK for trunk? Cheers, Alan gcc/ChangeLog: 2014-05-27 Alan Lawrence alan.lawre...@arm.com * config/aarch64/arm_neon.h (vfmaq_n_f64, vmlaq_n_f64, vmlsq_n_f64, vrsrtsq_f64, vtst_p16, vtstq_p16, vcge_p8, vcgeq_p8, vcgez_p8, vcgez_u8

Re: [PATCH AArch64] Remove from arm_neon.h functions not in the spec

2014-05-27 Thread Alan Lawrence
No, hold that, vfmaq_n_f64 has been added back in the latest version (to which I linked). Hang on... --Alan Alan Lawrence wrote: arm_neon.h contains a bunch of functions (for example, the wonderful vcgez_u* intrinsics - that's an unsigned comparison of greater-than-or-equal-to zero

Re: [AArch64/ARM 2/3] Recognize shuffle patterns for REV instructions on AArch64, rewrite intrinsics.

2014-05-15 Thread Alan Lawrence
Sure, here is a revised patch (replacing a with __a). I've retested (the various TBL dependencies have all been committed), no regressions on aarch64-none-elf or aarch64_be-none-elf. May I propose gcc/ChangeLog: 2014-05-15 Alan Lawrence alan.lawre...@arm.com * config/aarch64/aarch64

[PATCH AArch64 / testsuite] Add V1DFmode, fixes PR/59843

2014-05-15 Thread Alan Lawrence
This fixes an ICE on AArch64 when compiling code with a vector of exactly one double, and seems the most specific/accurate way of fixing that specific case. I've included a test case of a range of other singleton vector types too (compiles on aarch64-none-elf, x86_64, arm-none-eabi). No

Re: [PATCH AArch64 / testsuite] Add V1DFmode, fixes PR/59843

2014-05-15 Thread Alan Lawrence
Oops, I missed: gcc/ChangeLog: 2014-05-15 Alan Lawrence alan.lawre...@arm.com * config/aarch64/aarch64-modes.def: Add V1DFmode. * config/aarch64/aarch64.c (aarch64_vector_mode_supported_p): Support V1DFmode. gcc/testsuite/ChangeLog: 2014-05-15 Alan Lawrence

Pushed with obvious fix: Re: [AArch64/ARM 3/3] Add execution tests of ARM TRN Intrinsics

2014-05-14 Thread Alan Lawrence
, Alan Ramana Radhakrishnan wrote: On Fri, Mar 28, 2014 at 3:50 PM, Alan Lawrence alan.lawre...@arm.com wrote: Final patch in series, adds new tests of the ARM TRN Intrinsics, that also check the execution results, reusing the test bodies introduced into AArch64 in the first patch. (These tests

[PATCH ARM] PR/61062 Fix arm_neon.h ZIP/UZP/TRN for bigendian

2014-05-14 Thread Alan Lawrence
Hi, Due to differences in how the ARM C Language Extensions and gcc's vector extensions deal with indices within vectors, the __builtin_shuffle masks used to implement the ZIP, UZP and TRN Neon Intrinsics in arm_neon.h are correct only for little-endian. (The problem on bigendian has recently

Re: [AArch64/ARM 3/3] Add execution tests of ARM ZIP Intrinsics

2014-05-06 Thread Alan Lawrence
, Alan Lawrence alan.lawre...@arm.com wrote: Final patch adds new tests of the ARM ZIP Intrinsics (subsuming the autogenerated ones in testsuite/gcc.target/arm/neon/), that also check the execution results, reusing the test bodies introduced into AArch64 in the first patch. All tests passing on arm

Re: [PATCH AARCH64] One-line tidy of bit-twiddle expression in aarch64.c

2014-05-02 Thread Alan Lawrence
Whilst I agree with Richard H that it is obvious, my feeling is that the assertion does no harm, so have committed rev 210005 with Richard E's changes. --Alan Richard Henderson wrote: On 04/29/2014 05:42 AM, Richard Earnshaw wrote: On 23/04/14 16:20, Alan Lawrence wrote: This patch

[Committed] [AArch64/ARM 1/3] Add execution + assembler tests of the AArch64 ZIP Intrinsics.

2014-04-29 Thread Alan Lawrence
Committed as revision 209893. Marcus Shawcroft wrote: On 27 March 2014 10:52, Alan Lawrence alan.lawre...@arm.com wrote: This adds DejaGNU tests of the existing AArch64 vzip_* intrinsics, both checking the assembler output and the runtime results. Test bodies are in separate files ready

[Committed] Added myself to MAINTAINERS (Write After Approval)

2014-04-29 Thread Alan Lawrence
langt...@llnl.gov Chris Lattner sa...@nondot.org Terry Laurenzo tlaure...@gmail.com +Alan Lawrence alan.lawre...@arm.com Georg-Johann Lay

[Committed] [AArch64/ARM 2/3] Rewrite AArch64 ZIP Intrinsics using __builtin_shuffle

2014-04-29 Thread Alan Lawrence
Committed as revision 209906. A. Marcus Shawcroft wrote: On 27 March 2014 10:52, Alan Lawrence alan.lawre...@arm.com wrote: This patch replaces the temporary inline assembler for vzip_* in arm_neon.h with equivalent calls to __builtin_shuffle. These are matched

[PATCH AARCH64] One-line tidy of bit-twiddle expression in aarch64.c

2014-04-23 Thread Alan Lawrence
patches in http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01309.html and http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00579.html. gcc/ChangeLog: 2014-04-23 Alan Lawrence alan.lawre...@arm.com * config/aarch64/aarch64.c (aarch64_expand_vec_perm_1): tidy bit-flip expression. diff --git

[PATCH AARCH64] fix and enable non-const shuffle for bigendian using TBL instruction

2014-04-23 Thread Alan Lawrence
broken prior to this patch and are not affected. gcc/ChangeLog: 2014-04-23 Alan Lawrence alan.lawre...@arm.com * config/aarch64/aarch64-simd.md (vec_perm): Enable for bigendian. * config/aarch64/aarch64.c (aarch64_expand_vec_perm): Remove assert against bigendian and adjust

[AArch64/ARM 0/3] Patch series for REV permute instructions

2014-04-23 Thread Alan Lawrence
The meat of this is in the second patch, which makes the AArch64 backend look for shuffle masks that can be turned into REV instructions, and updates the VREV Neon Intrinsics to use __builtin_shuffle rather than the current inline assembler; this then produces the same instructions (unless the

[AArch64/ARM 1/3] Add execution + assembler tests of AArch64 REV Neon Intrinsics

2014-04-23 Thread Alan Lawrence
: 2014-04-23 Alan Lawrence alan.lawre...@arm.com * gcc.target/aarch64/simd/vrev16p8_1.c: New file. * gcc.target/aarch64/simd/vrev16p8.x: New file. * gcc.target/aarch64/simd/vrev16qp8_1.c: New file. * gcc.target/aarch64/simd/vrev16qp8.x: New file. * gcc.target

[AArch64/ARM 2/3] Recognize shuffle patterns for REV instructions on AArch64, rewrite intrinsics.

2014-04-23 Thread Alan Lawrence
in vectorization) without that patch. gcc/ChangeLog: 2014-04-23 Alan Lawrence alan.lawre...@arm.com * config/aarch64/iterators.md: add a REVERSE iterator and rev_op attribute for REV64/32/16 insns. * config/aarch64/aarch64-simd.md: add corresponding define_insn

[AArch64/ARM 3/3] Add execution tests of ARM REV intrinsics

2014-04-23 Thread Alan Lawrence
patch. Testsuite driver simd.exp from http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01500.html, will ensure that's committed first. All passing on arm-none-eabi. gcc/testsuite/ChangeLog: 2014-04-23 Alan Lawrence alan.lawre...@arm.com gcc.target/arm/simd/vrev16p8_1.c: New file

[AArch64/ARM 0/3] Patch series for EXT instructions

2014-04-23 Thread Alan Lawrence
The meat of this is in the second patch, which makes the AArch64 backend look for shuffle masks that can be turned into EXT instructions, and updates the vext[q]_* Neon Intrinsics to use __builtin_shuffle rather than the current inline assembler; this then produces the same instructions (unless

[AArch64/ARM 2/3] Detect EXT patterns to vec_perm_const, use for EXT intrinsics

2014-04-23 Thread Alan Lawrence
, no regressions following http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00579.html ; prior to this, there is some testsuite noise because of unexpected success in vectorizing loops. New tests from first patch passing on both targets. gcc/ChangeLog: 2014-03-27 Alan Lawrence alan.lawre...@arm.com

[AArch64/ARM 3/3] Add execution tests of ARM EXT intrinsics

2014-04-23 Thread Alan Lawrence
gcc.target/arm/simd/simd.exp from corresponding patch for ZIP Intrinsics http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01500.html, will commit that first. All tests passing on arm-none-eabi. gcc/testsuite/ChangeLog: 2014-04-23 Alan Lawrence alan.lawre...@arm.com gcc.target/arm/simd

[PATCH, AArch64] Enable shuffle on big-endian and turn on the testsuite

2014-04-11 Thread Alan Lawrence
/gcc-patches/2014-04/msg00079.html) ? gcc/ChangeLog: 2014-04-11 Alan Lawrence alan.lawre...@arm.com * config/aarch64/aarch64.c (aarch64_evpc_tbl): enable for bigendian. gcc/testsuite/ChangeLog: 2014-04-11 Alan Lawrence alan.lawre...@arm.com * lib/target-supports.exp

Re: [Patch, AArch64] Fix shuffle for big-endian.

2014-04-04 Thread Alan Lawrence
Sorry to be pedantic again, but 'wierd' should be spelt 'weird'. Otherwise, looks good to me and much neater than before. (Seems you'd rather keep the re-enabling, here and in the testsuite, for another patch?) --Alan Tejas Belagod wrote: Richard Henderson wrote: On 02/21/2014 08:30 AM, Tejas

[AArch64/ARM 0/3] Patch series for TRN Intrinsics

2014-03-28 Thread Alan Lawrence
Much like the ZIP and UZP intrinsics, the vtrn[q]_* intrinsics are implemented with inline __asm__, which blocks compiler analysis. This series replaces those calls with __builtin_shuffle, which produce the same** assembler instructions. ** except for two-element vectors, where UZP, ZIP and
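
The two-element exception can be checked with a plain-array model of the three permutes. This is only a sketch of the lane selection performed by ZIP1/2, UZP1/2 and TRN1/2; it does not use arm_neon.h or __builtin_shuffle, and the names permute_zip, permute_uzp, permute_trn and permutes_agree are hypothetical helpers made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* In all three models, part == 0 selects the "1" form (ZIP1/UZP1/TRN1)
   and part == 1 selects the "2" form. n is the (even) lane count. */

/* ZIP: interleave the low (part 0) or high (part 1) halves of a and b. */
static void permute_zip(const int *a, const int *b, size_t n, int part, int *out)
{
    size_t base = part ? n / 2 : 0;
    for (size_t i = 0; i < n / 2; i++) {
        out[2 * i]     = a[base + i];
        out[2 * i + 1] = b[base + i];
    }
}

/* UZP: even (part 0) or odd (part 1) lanes of a, followed by those of b. */
static void permute_uzp(const int *a, const int *b, size_t n, int part, int *out)
{
    for (size_t i = 0; i < n / 2; i++) {
        out[i]         = a[2 * i + part];
        out[n / 2 + i] = b[2 * i + part];
    }
}

/* TRN: even (part 0) or odd (part 1) lanes of a and b, pairwise interleaved. */
static void permute_trn(const int *a, const int *b, size_t n, int part, int *out)
{
    for (size_t i = 0; i < n / 2; i++) {
        out[2 * i]     = a[2 * i + part];
        out[2 * i + 1] = b[2 * i + part];
    }
}

/* Returns 1 if ZIP, UZP and TRN give identical results for n-lane inputs. */
static int permutes_agree(size_t n, int part)
{
    int a[8], b[8], z[8], u[8], t[8];

    assert(n % 2 == 0 && n <= 8);
    for (size_t i = 0; i < n; i++) {
        a[i] = (int)i;        /* lanes of the first operand  */
        b[i] = 100 + (int)i;  /* lanes of the second operand */
    }
    permute_zip(a, b, n, part, z);
    permute_uzp(a, b, n, part, u);
    permute_trn(a, b, n, part, t);
    return memcmp(z, u, n * sizeof z[0]) == 0
        && memcmp(z, t, n * sizeof z[0]) == 0;
}
```

With two lanes every "1" form reduces to {a[0], b[0]} and every "2" form to {a[1], b[1]}, so the three models coincide; with four lanes they already differ.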

[AArch64/ARM 1/3] Add execution + assembler tests of AArch64 TRN Intrinsics

2014-03-28 Thread Alan Lawrence
there (will commit ZIP tests first). All tests passing on aarch64-none-elf and aarch64_be-none-elf. testsuite/ChangeLog: 2012-03-28 Alan Lawrence alan.lawre...@arm.com * gcc.target/aarch64/simd/vtrnf32_1.c: New file. * gcc.target/aarch64/simd/vtrnf32.x: New file

[AArch64/ARM 3/3] Add execution tests of ARM TRN Intrinsics

2014-03-28 Thread Alan Lawrence
gcc.target/arm/simd/simd.exp from corresponding patch for ZIP Intrinsics, will commit that first. All tests passing on arm-none-eabi. testsuite/ChangeLog: 2012-03-28 Alan Lawrence alan.lawre...@arm.com * gcc.target/arm/simd/vtrnqf32_1.c: New file. * gcc.target/arm/simd

[AArch64/ARM 2/3] Reimplement AArch64 TRN intrinsics with __builtin_shuffle

2014-03-28 Thread Alan Lawrence
Lawrence alan.lawre...@arm.com * gcc.target/aarch64/vtrns32.c: Expect zip[12] insn rather than trn[12]. * gcc.target/aarch64/vtrnu32.c: Likewise. * gcc.target/aarch64/vtrnf32.c: Likewise. gcc/ChangeLog: 2014-03-28 Alan Lawrence alan.lawre...@arm.com * config

[AArch64/ARM 0/3] Patch series for ZIP intrinsics

2014-03-27 Thread Alan Lawrence
Hi, AArch64 vzip_* intrinsics are currently implemented with temporary inline asm, which prevents the compiler from analysing code through them. This series replaces those asm blocks with (equivalent) calls to __builtin_shuffle, which produce the same assembler instructions (unless gcc can do better). First

[AArch64/ARM 2/3] Rewrite AArch64 ZIP Intrinsics using __builtin_shuffle

2014-03-27 Thread Alan Lawrence
-none-elf. gcc/ChangeLog: 2012-03-27 Alan Lawrence alan.lawre...@arm.com * config/aarch64/arm_neon.h (vzip1_f32, vzip1_p8, vzip1_p16, vzip1_s8, vzip1_s16, vzip1_s32, vzip1_u8, vzip1_u16, vzip1_u32, vzip1q_f32, vzip1q_f64, vzip1q_p8, vzip1q_p16, vzip1q_s8, vzip1q_s16

[AArch64/ARM 1/3] Add execution + assembler tests of the AArch64 ZIP Intrinsics.

2014-03-27 Thread Alan Lawrence
tests passing on aarch64-none-elf and aarch64_be-none-elf. testsuite/ChangeLog: 2014-03-25 Alan Lawrence alan.lawre...@arm.com * gcc.target/aarch64/simd/simd.exp: New file. * gcc.target/aarch64/simd/vzipf32_1.c: New file. * gcc.target/aarch64/simd/vzipf32.x: New file

[AArch64/ARM 3/3] Add execution tests of ARM ZIP Intrinsics

2014-03-27 Thread Alan Lawrence
: 2012-03-27 Alan Lawrence alan.lawre...@arm.com * gcc.target/arm/simd/simd.exp: New file. * gcc.target/arm/simd/vzipqf32_1.c: New file. * gcc.target/arm/simd/vzipqp16_1.c: New file. * gcc.target/arm/simd/vzipqp8_1.c: New file. * gcc.target/arm/simd/vzipqs16_1

[AArch64/ARM 0/3] Patch series for UZP intrinsics

2014-03-27 Thread Alan Lawrence
Hi, Much like the zip intrinsics, the vuzp_* intrinsics are implemented with inline ASM, which prevents compiler analysis. This series replaces those with calls to __builtin_shuffle, which produce the same** assembler instructions. (**except for two-element vectors where UZP and ZIP are

[AArch64/ARM 1/3] Add execution + assembler tests of AArch64 UZP Intrinsics

2014-03-27 Thread Alan Lawrence
there (will commit ZIP tests first). All tests passing on aarch64-none-elf and aarch64_be-none-elf. testsuite/ChangeLog: 2014-03-27 Alan Lawrence alan.lawre...@arm.com * gcc.target/aarch64/simd/vuzpf32_1.c: New file. * gcc.target/aarch64/simd/vuzpf32.x: New file. * gcc.target/aarch64

[AArch64/ARM 2/3] Rewrite AArch64 UZP Intrinsics using __builtin_shuffle

2014-03-27 Thread Alan Lawrence
Lawrence alan.lawre...@arm.com * gcc.target/aarch64/vuzps32_1.c: Expect zip1/2 insn rather than uzp1/2. * gcc.target/aarch64/vuzpu32_1.c: Likewise. * gcc.target/aarch64/vuzpf32_1.c: Likewise. gcc/ChangeLog: 2014-03-27 Alan Lawrence alan.lawre...@arm.com * config

[AArch64/ARM 3/3] Add execution tests of ARM UZP Intrinsics

2014-03-27 Thread Alan Lawrence
corresponding patch for ZIP Intrinsics, will commit that first. All tests passing on arm-none-eabi. gcc/testsuite/ChangeLog: 2014-03-27 Alan Lawrence alan.lawre...@arm.com * gcc.target/arm/simd/vuzpqf32_1.c: New file. * gcc.target/arm/simd/vuzpqp16_1.c: New file. * gcc.target/arm/simd

Re: [Patch, AArch64] Fix shuffle for big-endian.

2014-03-24 Thread Alan Lawrence
this and fix up the testsuite shortly... Cheers, Alan Alan Lawrence wrote: I've been doing some local testing using this patch as a basis for some of my own work on NEON intrinsics, and it seems good to me. A couple of points: (1) Re. the comment that If two vectors, we end up with a wierd mixed

[PATCH AArch64] Fix aarch64_simd_valid_immediate for Bigendian

2014-03-21 Thread Alan Lawrence
++ and FAIL-PASS as listed below. Ok for trunk (stage 4) ? Cheers, Alan gcc/ChangeLog: 2014-03-21 Alan Lawrence alan.lawre...@arm.com * config/aarch64/aarch64.c (aarch64_simd_valid_immediate): reverse order of elements for bigendian. = FAIL-PASS in gcc testsuite: c-c++-common

Re: [Patch, AArch64] Fix shuffle for big-endian.

2014-03-12 Thread Alan Lawrence
I've been doing some local testing using this patch as a basis for some of my own work on NEON intrinsics, and it seems good to me. A couple of points: (1) Re. the comment that If two vectors, we end up with a wierd mixed-endian mode on NEON: firstly wierd should be spelt weird; secondly, if I

[AArch64] Make -mcpu, -march and -mtune case-insensitive.

2014-01-17 Thread Alan Lawrence
Small patch to make the -mcpu, -march and -mtune command-line options case-insensitive, allowing e.g. -mcpu=CortexA57 -march=ARMv8-A. Tested on aarch64-none-elf with no regressions; options passed onto e.g. ld are always lowercase (as before). OK for trunk? --Alan ChangeLog: 2014-01-17 Alan

[ARM] Make -mcpu, -march and -mtune case-insensitive

2014-01-16 Thread Alan Lawrence
This is a small patch that makes the -mcpu, -march and -mtune command-line options case-insensitive, allowing e.g. -mcpu=Cortex-A15 -march=ARMv7. Regression tested on arm-none-eabi with no issues; options passed onto e.g. ld are always lowercase (as previously). OK for trunk? --Alan ChangeLog:
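
One way to get this kind of case-insensitive matching is to lowercase the user-supplied value before comparing it against the canonical spelling. The sketch below assumes that approach purely for illustration; it is not the code from the patch, and lowercase_option and option_matches are hypothetical helper names.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copies src into dst with every character lowercased.
   Returns 0 on success, -1 if dst (capacity dst_size) is too small. */
static int lowercase_option(const char *src, char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL);

    size_t len = strlen(src);
    if (len + 1 > dst_size)
        return -1;
    /* i <= len so the terminating NUL is copied as well. */
    for (size_t i = 0; i <= len; i++)
        dst[i] = (char)tolower((unsigned char)src[i]);
    return 0;
}

/* Returns 1 if user_value equals the (lowercase) canonical name when
   compared case-insensitively, 0 otherwise. */
static int option_matches(const char *user_value, const char *canonical)
{
    char buf[64];  /* arbitrary limit chosen for this sketch */

    if (lowercase_option(user_value, buf, sizeof buf) != 0)
        return 0;
    return strcmp(buf, canonical) == 0;
}
```

Under this scheme option_matches("Cortex-A15", "cortex-a15") succeeds, and anything forwarded to later tools would be the lowercased copy, in line with the note that options passed on to ld stay lowercase.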
