Re: [PATCH][AArch64][2/2] (Re)Implement vcopy_lane intrinsics

2016-06-22 Thread Marcus Shawcroft
On 7 June 2016 at 17:56, Kyrill Tkachov  wrote:

> Ok for trunk?
>
> Thanks,
> Kyrill
>
> 2016-06-07  Kyrylo Tkachov  
> James Greenhalgh  
>
> * config/aarch64/arm_neon.h (vcopyq_lane_f32, vcopyq_lane_f64,
> vcopyq_lane_p8, vcopyq_lane_p16, vcopyq_lane_s8, vcopyq_lane_s16,
> vcopyq_lane_s32, vcopyq_lane_s64, vcopyq_lane_u8, vcopyq_lane_u16,
> vcopyq_lane_u32, vcopyq_lane_u64): Reimplement in C.
> (vcopy_lane_f32, vcopy_lane_f64, vcopy_lane_p8, vcopy_lane_p16,
> vcopy_lane_s8, vcopy_lane_s16, vcopy_lane_s32, vcopy_lane_s64,
> vcopy_lane_u8, vcopy_lane_u16, vcopy_lane_u32, vcopy_lane_u64,
> vcopy_laneq_f32, vcopy_laneq_f64, vcopy_laneq_p8, vcopy_laneq_p16,
> vcopy_laneq_s8, vcopy_laneq_s16, vcopy_laneq_s32, vcopy_laneq_s64,
> vcopy_laneq_u8, vcopy_laneq_u16, vcopy_laneq_u32, vcopy_laneq_u64,
> vcopyq_laneq_f32, vcopyq_laneq_f64, vcopyq_laneq_p8, vcopyq_laneq_p16,
> vcopyq_laneq_s8, vcopyq_laneq_s16, vcopyq_laneq_s32, vcopyq_laneq_s64,
> vcopyq_laneq_u8, vcopyq_laneq_u16, vcopyq_laneq_u32, vcopyq_laneq_u64):
> New intrinsics.
>
> 2016-06-07  Kyrylo Tkachov  
> James Greenhalgh  
>
> * gcc.target/aarch64/vect_copy_lane_1.c: New test.

OK, thanks /Marcus


Re: [Patch AArch64] Simplify reduc_plus_scal_v2[sd]f sequence

2016-05-17 Thread Marcus Shawcroft
On 17 May 2016 at 12:02, James Greenhalgh <james.greenha...@arm.com> wrote:
> On Tue, May 17, 2016 at 11:32:36AM +0100, Marcus Shawcroft wrote:
>> On 17 May 2016 at 10:06, James Greenhalgh <james.greenha...@arm.com> wrote:
>> >
>> > Hi,
>> >
>> > This is just a simplification, it probably makes life easier for register
>> > allocation in some corner cases and seems the right thing to do. We don't
>> > use the internal version elsewhere, so we're safe to delete it and change
>> > the types.
>> >
>> > OK?
>> >
>> > Bootstrapped on AArch64 with no issues.
>>
>> Help me understand why this is ok for BE ?
>
> The reduc_plus_scal_ pattern wants to take a vector and return a scalar
> value representing the sum of the lanes of that vector. We want to go
> from V2DFmode to DFmode.
>
> The architectural instruction FADDP writes to a scalar value in the low
> bits of the register, leaving zeroes in the upper bits.
>
> i.e.
>
> faddp  d0, v1.2d
>
> 128 640
>  |0x0| v1.d[0] + v1.d[1]  |
>
> In the current implementation, we use the
> aarch64_reduc_plus_internal pattern, which treats the result of
> FADDP as a vector of two elements. We then need an extra step to extract
> the correct scalar value from that vector. From GCC's point of view the lane
> containing the result is either lane 0 (little-endian) or lane 1
> (big-endian), which is why the current code is endian dependent. The extract
> operation will always be a NOP move from architectural bits 0-63 to
> architectural bits 0-63 - but we never elide the move as future passes can't
> be certain that the upper bits are zero (they come out of an UNSPEC so
> could be anything).
>
> However, this is all unneccesary. FADDP does exactly what we want,
> regardless of endianness, we just need to model the instruction as writing
> the scalar value in the first place. Which is what this patch wires up.
>
> We probably just missed this optimization in the migration from the
> reduc_splus optabs (which required a vector return value) to the
> reduc_plus_scal optabs (which require a scalar return value).
>
> Does that help?


Yep. Thanks. OK to commit. /Marcus

> Thanks,
> James
>


Re: [Patch AArch64] Simplify reduc_plus_scal_v2[sd]f sequence

2016-05-17 Thread Marcus Shawcroft
On 17 May 2016 at 10:06, James Greenhalgh  wrote:
>
> Hi,
>
> This is just a simplification, it probably makes life easier for register
> allocation in some corner cases and seems the right thing to do. We don't
> use the internal version elsewhere, so we're safe to delete it and change
> the types.
>
> OK?
>
> Bootstrapped on AArch64 with no issues.

Help me understand why this is ok for BE ?

Cheers
/Marcus


Re: [Patch AArch64] Delete ASM_OUTPUT_DEF and fallback to default .set directive

2016-05-17 Thread Marcus Shawcroft
On 17 May 2016 at 10:13, James Greenhalgh  wrote:
>
> Hi,
>
> As in the ARM port [1] , the AArch64 port wants to put out "b = a" to set
> an alias. This doesn't cause us any trouble yet, as the AArch64 port doesn't
> warn for this construct - but at the same time there is no reason for us
> not to put out a .set directive - this seems to have been copied from the
> ARM port when section anchor support was added in 2012. Looking through
> the chain, we'll get a default definition for ASM_OUTPUT_DEF if SET_ASM_OP
> is defined, and we get SET_ASM_OP defined through config/elfos.h for
> all the AArch64 targets I can see in config.gcc. So we're safe to drop
> this.
>
> Bootstrapped on aarch64-none-linux-gnu.
>
> OK?

OK /Marcus


Re: [AArch64] Remove AARCH64_EXTRA_TUNE_RECIP_SQRT from Cortex-A57 tuning

2016-02-16 Thread Marcus Shawcroft
On 11 January 2016 at 12:04, James Greenhalgh  wrote:

> 2015-12-11  James Greenhalgh  
>
> * config/aarch64/aarch64.c (cortexa57_tunings): Remove
> AARCH64_EXTRA_TUNE_RECIP_SQRT.
>

OK /Marcus


Re: [Patch AArch64] Restrict 16-bit sqrdml{sa}h instructions to FP_LO_REGS

2016-02-16 Thread Marcus Shawcroft
On 26 January 2016 at 16:04, James Greenhalgh  wrote:

> 2016-01-25  James Greenhalgh  
>
> * config/aarch64/aarch64.md
> (arch64_sqrdmlh_lane): Fix register
> constraints for operand 3.
> (aarch64_sqrdmlh_laneq): Likewise.
>

OK /Marcus


Re: [Patch AArch64] GCC 6 regression in vector performance. - Fix vector initialization to happen with lane load instructions.

2016-02-16 Thread Marcus Shawcroft
On 20 January 2016 at 15:22, James Greenhalgh  wrote:

> gcc/
>
> 2016-01-20  James Greenhalgh  
> Ramana Radhakrishnan  
>
> * config/aarch64/aarch64.c (aarch64_expand_vector_init): Refactor,
> always use lane loads to construct non-constant vectors.
>
> gcc/testsuite/
>
> 2016-01-20  James Greenhalgh  
> Ramana Radhakrishnan  
>
> * gcc.target/aarch64/vector_initialization_nostack.c: New.
>

OK /Marcus


Re: [Patch AArch64] Use software sqrt expansion always for -mlow-precision-recip-sqrt

2016-02-16 Thread Marcus Shawcroft
On 11 January 2016 at 11:53, James Greenhalgh  wrote:
>

> ---
> 2015-12-10  James Greenhalgh  
>
> * config/aarch64/aarch64.c (use_rsqrt_p): Always use software
> reciprocal sqrt for -mlow-precision-recip-sqrt.
>

OK /Marcus


Re: Backport: [Patch AArch64] Reinstate CANNOT_CHANGE_MODE_CLASS to fix pr67609

2016-01-12 Thread Marcus Shawcroft
On 18 December 2015 at 12:13, James Greenhalgh  wrote:

> Looking back at the patch just before I hit commit, the 4.9 backport was
> a little different (as we still have a CANNOT_CHANGE_MODE_CLASS there).
> We can drop the aarch64-protos.h and aarch64.h changes, and we need to
> change the sense of the new check, such that we can return true for the
> case added by this patch, and false for the limited number of other safe
> cases in 4.9.
>
> Bootstrapped on aarch64-none-linux-gnu.
>
> OK?
>
> Thanks,
> James
>
> ---
> gcc/
>
> 2015-12-14  James Greenhalgh  
>
> Backport from mainline.
> 2015-12-09  James Greenhalgh  
>
> PR rtl-optimization/67609
> * config/aarch64/aarch64.c
> (aarch64_cannot_change_mode_class): Don't permit word_mode
> subregs of full vector modes.
> * config/aarch64/aarch64.md (aarch64_movdi_low): Use
> zero_extract rather than truncate.
> (aarch64_movdi_high): Likewise.
>
> gcc/testsuite/
>
> 2015-12-14  James Greenhalgh  
>
> Backport from mainline.
> 2015-12-09  James Greenhalgh  
>
> PR rtl-optimization/67609
> * gcc.dg/torture/pr67609.c: New.
>

OK /Marcus


Re: [AARCH64][ACLE] Implement __ARM_FP_FENV_ROUNDING in aarch64 backend.

2016-01-11 Thread Marcus Shawcroft
On 11 January 2016 at 08:12, Bilyan Borisov  wrote:

> 2015-XX-XX  Bilyan Borisov  
>
> * config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): New
> macro
> definition.
>
> gcc/testsuite/
>
> 2015-XX-XX  Bilyan Borisov  
>
> * gcc.target/aarch64/fesetround-checking-baremetal.c: New.
> * gcc.target/aarch64/fesetround-checking-linux.c: Likewise.

for new test cases in gcc.target/aarch64 use the name convention here
https://gcc.gnu.org/wiki/TestCaseWriting, in this case:

fesetround_checking_baremetal_1.c
fesetround_checking_linux_1.c

If we are distinguishing between glibc/musl/bionic/uclibc then the use
of the name "linux" in the last test case would seem to be
inappropriate (all of glibc/musl/bionic and uclibc can be used on top
of a linux kernel). Suggest s/linux/glibc/ in the test case name.

Cheers
/Marcus


Re: [AARCH64][ACLE] Implement __ARM_FP_FENV_ROUNDING in aarch64 backend.

2016-01-11 Thread Marcus Shawcroft
On 11 January 2016 at 10:46, Alan Lawrence  wrote:

> However, the test doesn't really look at whether we're using glibc vs
> musl/bionic/uclibc, only at whether we are targeting -linux-gnu or
> -none-elf.

Fair point, the test case is not aligned with the implementation.
Rather than hang the test on a target triple the other approach would
be to try and write a test that uses the functionality that is gated
by the predefine.

Cheers
/Marcus


Re: [AArch64] Simplify TLS pattern by hardcoding relocation modifiers into pattern

2015-12-17 Thread Marcus Shawcroft
On 10 September 2015 at 12:28, Jiong Wang  wrote:
>
> TLS instruction sequences are always with fixed format, there is no need
> to use operand modifier, we can hardcode the relocation modifiers into
> instruction pattern, all those redundant checks in aarch64_print_operand
> can be removed.
>
> OK for trunk?
>
> 2015-09-10  Jiong Wang  
>
> gcc/
>   * config/aarch64/aarch64.md (ldr_got_tiny): Hardcode relocation
>   modifers.
>   (tlsgd_small): Likewise.
>   (tlsgd_tiny): Likewise.
>   (tlsie_small_): Likewise.
>   (tlsie_small_sidi): Likewise.
>   (tlsie_tiny_): Likewise.
>   (tlsie_tiny_sidi): Likewise.
>   (tlsle12_): Likewise.
>   (tlsle24_): Likewise.
>   (tlsdesc_small_): Likewise.
>   (tlsdesc_small_pseudo_): Likewise.
>   (tlsdesc_tiny_): Likewise.
>   (tlsdesc_tiny_pseudo_): Likewise.
>   * config/aarch64/aarch64.c (aarch64_print_operand): Delete useless
>   check on 'A', 'L', 'G'.

Hi, Is there a particular reason for wanting to change this?

Cheers
/Marcus


Re: [Patch AArch64] Reinstate CANNOT_CHANGE_MODE_CLASS to fix pr67609

2015-12-14 Thread Marcus Shawcroft
On 14 December 2015 at 11:01, James Greenhalgh <james.greenha...@arm.com> wrote:
> On Wed, Dec 09, 2015 at 01:13:20PM +0000, Marcus Shawcroft wrote:
>> On 27 November 2015 at 13:01, James Greenhalgh <james.greenha...@arm.com> 
>> wrote:
>>
>> > 2015-11-27  James Greenhalgh  <james.greenha...@arm.com>
>> >
>> > * config/aarch64/aarch64-protos.h
>> > (aarch64_cannot_change_mode_class): Bring back.
>> > * config/aarch64/aarch64.c
>> > (aarch64_cannot_change_mode_class): Likewise.
>> > * config/aarch64/aarch64.h (CANNOT_CHANGE_MODE_CLASS): Likewise.
>> > * config/aarch64/aarch64.md (aarch64_movdi_low): Use
>> > zero_extract rather than truncate.
>> > (aarch64_movdi_high): Likewise.
>> >
>> > 2015-11-27  James Greenhalgh  <james.greenha...@arm.com>
>> >
>> > * gcc.dg/torture/pr67609.c: New.
>> >
>>
>> + detailed dicussion.  In all other cases, we want to be premissive
>>
>> s/premissive/permissive/
>>
>> OK /Marcus
>
> Thanks.
>
> This has had a week or so to soak on trunk now, is it OK to backport to GCC
> 5 and 4.9?
>
> The patch applies as-good-as clean, with only a little bit to fix up in
> aarch64-protos.h to keep alphabetical order, and I've bootstrapped and tested
> the backports with no issue.

OK /Marcus


Re: [Patch AArch64] Reinstate CANNOT_CHANGE_MODE_CLASS to fix pr67609

2015-12-09 Thread Marcus Shawcroft
On 27 November 2015 at 13:01, James Greenhalgh  wrote:

> 2015-11-27  James Greenhalgh  
>
> * config/aarch64/aarch64-protos.h
> (aarch64_cannot_change_mode_class): Bring back.
> * config/aarch64/aarch64.c
> (aarch64_cannot_change_mode_class): Likewise.
> * config/aarch64/aarch64.h (CANNOT_CHANGE_MODE_CLASS): Likewise.
> * config/aarch64/aarch64.md (aarch64_movdi_low): Use
> zero_extract rather than truncate.
> (aarch64_movdi_high): Likewise.
>
> 2015-11-27  James Greenhalgh  
>
> * gcc.dg/torture/pr67609.c: New.
>

+ detailed dicussion.  In all other cases, we want to be premissive

s/premissive/permissive/

OK /Marcus


Re: [AArch64] Emit square root using the Newton series

2015-12-09 Thread Marcus Shawcroft
On 8 December 2015 at 21:35, Evandro Menezes  wrote:
>Emit square root using the Newton series
>
>2015-12-03  Evandro Menezes  
>
>gcc/
> * config/aarch64/aarch64-protos.h (aarch64_emit_swsqrt):
>Declare new
> function.
> * config/aarch64/aarch64-simd.md (sqrt2): New
>expansion and
> insn definitions.
> * config/aarch64/aarch64-tuning-flags.def
> (AARCH64_EXTRA_TUNE_FAST_SQRT): New tuning macro.
> * config/aarch64/aarch64.c (aarch64_emit_swsqrt): Define
>new function.
> * config/aarch64/aarch64.md (sqrt2): New expansion
>and insn
> definitions.
> * config/aarch64/aarch64.opt (mlow-precision-recip-sqrt):
>Expand option
> description.
> * doc/invoke.texi (mlow-precision-recip-sqrt): Likewise.
>
> This patch extends the patch that added support for implementing x^-1/2
> using the Newton series by adding support for x^1/2 as well.

Hi Evandro, What benchmarking have you done on this patch?
/M


Re: [ARM] Fix PR middle-end/65958

2015-12-04 Thread Marcus Shawcroft
On 3 December 2015 at 12:17, Eric Botcazou  wrote:
>> I can understand this restriction, but...
>>
>> > +  /* See the same assertion on PROBE_INTERVAL above.  */
>> > +  gcc_assert ((first % 4096) == 0);
>>
>> ... why isn't this a test that FIRST is aligned to PROBE_INTERVAL?
>
> Because that isn't guaranteed, FIRST is related to the size of the protection
> area while PROBE_INTERVAL is related to the page size.
>
>> blank line between declarations and code. Also, can we come up with a
>> suitable define for 4096 here that expresses the context and then use
>> that consistently through the remainder of this function?
>
> OK, let's use ARITH_BASE.
>
>> > +(define_insn "probe_stack_range"
>> > +  [(set (match_operand:DI 0 "register_operand" "=r")
>> > +   (unspec_volatile:DI [(match_operand:DI 1 "register_operand" "0")
>> > +(match_operand:DI 2 "register_operand" "r")]
>> > +UNSPEC_PROBE_STACK_RANGE))]
>>
>> I think this should really use PTRmode, so that it's ILP32 ready (I'm
>> not going to ask you to make sure that works though, since I suspect
>> there are still other issues to resolve with ILP32 at this time).
>
> Done.  Manually tested for now, I'll fully test it if approved.

Looks ok to me.  OK /Marcus


Re: [PATCH AArch64]Use aarch64_sync_memory_operand in atomic_store pattern

2015-12-04 Thread Marcus Shawcroft
On 4 December 2015 at 03:34, Bin Cheng  wrote:

> 2015-12-01  Bin Cheng  
>
> * config/aarch64/atomics.md (atomic_store): Use predicate
> aarch64_sync_memory_operand.
>

OK /Marcus


Re: [PATCH][AArch64] Don't allow -mgeneral-regs-only to change the .arch assembler directives

2015-12-04 Thread Marcus Shawcroft

On 04/12/15 14:40, Kyrill Tkachov wrote:

Ping.
This almost fell through the cracks.
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00055.html

Thanks,
Kyrill

On 01/10/15 14:00, Kyrill Tkachov wrote:

Hi all,

As part of the SWITCHABLE_TARGET work I inadvertently changed the
behaviour of -mgeneral-regs-only with respect to the .arch directives
that we emit.
The behaviour of -mgeneral-regs-only in GCC 5 and earlier is such that
it disallows the usage of FP/SIMD registers but does *not* stop the
compiler from
emitting the +fp,+simd etc extensions in the .arch directive of the
generated assembly. This is to accommodate users who may want to write
inline assembly
in a file compiled with -mgeneral-regs-only.

This patch restores the trunk behaviour in that respect to that of GCC
5 and the documentation for the option is tweaked a bit to reflect that.
Bootstrapped and tested on aarch64.

Ok for trunk?

Thanks,
Kyrill

2015-10-01  Kyrylo Tkachov  

  * config/aarch64/aarch64.c (aarch64_override_options_internal):
  Do not alter target_flags due to TARGET_GENERAL_REGS_ONLY_P.
  * doc/invoke.texi (AArch64 options): Mention that
-mgeneral-regs-only
  does not affect the assembler directives.

2015-10-01  Kyrylo Tkachov  

  * gcc.target/aarch64/mgeneral-regs_4.c: New test.




OK /M



Re: [AArch64] Add register constraints to add3_pluslong

2015-12-04 Thread Marcus Shawcroft
On 4 December 2015 at 19:42, James Greenhalgh  wrote:
>
> Hi,
>
> This patch fixes a bug I spotted in the add3_pluslong insn_and_split
> pattern. We need to give register constraints, otherwise the register
> allocator can do whatever it likes. This manifests as an ICE on AArch64
> with -mabi=ilp32:
>
> gcc foo.c -O2 -mabi=ilp32
>
>   error: could not split insn
>}
>^
>
>   (insn:TI 85 95 7 (set (mem/c:DI (plus:DI (reg/f:DI 29 x29)
>   (const_int 40 [0x28])) [1 %sfp+-65528 S8 A64])
>   (plus:DI (plus:DI (reg/f:DI 29 x29)
>   (const_int 16 [0x10]))
>   (const_int 65552 [0x10010]))) foo.c:7 95 {*adddi3_pluslong}
>(nil))
>
> The patch simply constrains the pattern to use w/x registers.
>
> Bootstrapped on aarch64-none-linux-gnu and cross-tested on aarch64-none-elf
> with no issues.
>
> OK?
>
> Thanks,
> James
>
> ---
> gcc/
>
> 2015-12-04  James Greenhalgh  
>
> * config/aarch64/aarch64.md (add3_pluslong): Add register
> constraints.
>
> gcc/testsuite/
>
> 2015-12-04  James Greenhalgh  
>
> * gcc.c-torture/compile/20151204.c: New.
>

+main(int argc, char** argv)

Space before (.

OK
/M


Re: [PATCH 1/5] [AARCH64]: Move #undef into .def files.

2015-11-18 Thread Marcus Shawcroft
On 17 November 2015 at 22:10, Andrew Pinski  wrote:
>
> This moves the #undef from the header files to the .def files like was done
> for builtins.def (https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00662.html).
>
> OK?   Bootstrapped and tested on aarch64-linux-gnu with no regressions.
>
> Thanks,
> Andrew Pinski
>
> * config/aarch64/aarch64-arches.def (AARCH64_ARCH): #undef at the end.
> * config/aarch64/aarch64-cores.def (AARCH64_CORE): Likewise.
> * config/aarch64/aarch64-fusion-pairs.def (AARCH64_FUSION_PAIR): Likewise.
> * config/aarch64/aarch64-tuning-flags.def (AARCH64_EXTRA_TUNING_OPTION): 
> Likewise.
> * config/aarch64/aarch64-opts.h (AARCH64_CORE): Don't #undef here.
> (AARCH64_ARCH): Likewise.
> * config/aarch64/aarch64-protos.h (AARCH64_FUSION_PAIR): Likewise.
> (AARCH64_EXTRA_TUNING_OPTION): Likewise.
> * config/aarch64/aarch64.c (AARCH64_FUION_PAIR): Likewise.
> (AARCH64_EXTRA_TUNING_OPTION): Likewise.
> (AARCH64_ARCH): Likewise.
> (AARCH64_CORE): Likewise.
> (AARCH64_OPT_EXTENSION): Likewise.
> * config/aarch64/aarch64.h (AARCH64_CORE): Likewise.
> * config/aarch64/driver-aarch64.c (AARCH64_OPT_EXTENSION): Likewise.
> (AARCH64_CORE): Likewise.
> (AARCH64_ARCH): Likewise.
> * common/config/aarch64/aarch64-common.c: Likewise.

OK Thanks /Marcus


Re: [Patch AArch64] Add support for Cortex-A35

2015-11-17 Thread Marcus Shawcroft
On 16 November 2015 at 14:36, James Greenhalgh  wrote:

> 2015-11-16  James Greenhalgh  
>
> * config/aarch64/aarch64-cores.def (cortex-a35): New.
> * config/aarch64/aarch64.c (cortexa35_tunings): New.
> * config/aarch64/aarch64-tune.md: Regenerate.
> * doc/invoke.texi (-mcpu): Add Cortex-A35
>

OK /M


Re: [PATCH][AArch64] PR target/68129: Define TARGET_SUPPORTS_WIDE_INT

2015-11-09 Thread Marcus Shawcroft
On 9 November 2015 at 11:32, Kyrill Tkachov  wrote:

> 2015-11-09  Kyrylo Tkachov  
>
> PR target/68129
> * config/aarch64/aarch64.h (TARGET_SUPPORTS_WIDE_INT): Define to 1.
> * config/aarch64/aarch64.c (aarch64_print_operand, CONST_DOUBLE):
> Delete VOIDmode case.  Assert that mode is not VOIDmode.
> * config/aarch64/predicates.md (const0_operand): Remove const_double
> match.
>
> 2015-11-09  Kyrylo Tkachov  
>
> PR target/68129
> * gcc.target/aarch64/pr68129_1.c: New test.

Hi, This test isn't aarch64 specific, does it need to be in gcc.target/aarch64 ?

Cheers
/Marcus


Re: [PATCH][AArch64] PR target/68129: Define TARGET_SUPPORTS_WIDE_INT

2015-11-09 Thread Marcus Shawcroft
On 9 November 2015 at 15:45, Kyrill Tkachov <kyrylo.tkac...@arm.com> wrote:
>
> On 09/11/15 15:34, Marcus Shawcroft wrote:
>>
>> On 9 November 2015 at 11:32, Kyrill Tkachov <kyrylo.tkac...@arm.com>
>> wrote:
>>
>>> 2015-11-09  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>>>
>>>  PR target/68129
>>>  * config/aarch64/aarch64.h (TARGET_SUPPORTS_WIDE_INT): Define to 1.
>>>  * config/aarch64/aarch64.c (aarch64_print_operand, CONST_DOUBLE):
>>>  Delete VOIDmode case.  Assert that mode is not VOIDmode.
>>>  * config/aarch64/predicates.md (const0_operand): Remove const_double
>>>  match.
>>>
>>> 2015-11-09  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>>>
>>>  PR target/68129
>>>  * gcc.target/aarch64/pr68129_1.c: New test.
>>
>> Hi, This test isn't aarch64 specific, does it need to be in
>> gcc.target/aarch64 ?
>
>
> Not really, here is the patch with the test in gcc.dg/ if that's preferred.


OK /Marcus


Re: [PATCH][AArch64][cleanup] Remove uses of CONST_DOUBLE_HIGH, CONST_DOUBLE_LOW

2015-11-09 Thread Marcus Shawcroft
On 9 November 2015 at 11:34, Kyrill Tkachov  wrote:

> * config/aarch64/aarch64.c (aarch64_simd_valid_immediate):
> Remove integer CONST_DOUBLE handling.  It should never occur.

OK /Marcus


Re: [PATCH][AArch64] Fix insn types

2015-10-30 Thread Marcus Shawcroft
On 20 October 2015 at 17:14, Evandro Menezes  wrote:
> Kyrill,
>
> Indeed, the correct log would be:
>
> The type assigned to some insn definitions was not correct.
>
> gcc/
> * config/aarch64/aarch64.md
> (*movhf_aarch64): Change the type of "mov %0.h[0], %1.h[0] to
> "neon_move".
> (*movtf_aarch64): Change the type of "fmov %s0, wzr" to "f_mcr".
> (*cmov_insn): Change the types of "mov %0, {-1,1}" to
> "mov_imm".
> (*cmovsi_insn_uxtw): Likewise.
>
> Thank you,
>

OK thanks,  committed as r229572.

/Marcus


Re: [PATCH][AArch64] Replace insn to zero up DF register

2015-10-30 Thread Marcus Shawcroft
On 20 October 2015 at 00:40, Evandro Menezes  wrote:
> In the existing targets, it seems that it's always faster to zero up a DF
> register with "movi %d0, #0" instead of "fmov %d0, xzr".
>
> This patch modifies the respective pattern.


Hi Evandro,

This patch changes the generic, u architecture independent instruction
selection. The ARM ARM (C3.5.3) makes a specific recommendation about
the choice of instruction in this situation and the current
implementation in GCC follows that recommendation.  Wilco has also
picked up on this issue he has the same patch internal to ARM along
with an ongoing discussion with ARM architecture folk regarding this
recommendation.  I'm reluctant to take this patch right now on the
basis that it runs contrary to ARM ARM recommendation pending the
conclusion of Wilco's discussion with ARM architecture folk.

Cheers
/Marcus


Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check

2015-10-29 Thread Marcus Shawcroft
On 28 October 2015 at 10:07, Kyrill Tkachov  wrote:
> Hi all,
>
> This RTL checking error occurs on aarch64 in aarch_accumulator_forwarding
> when processing an msubsi insn
> with subregs:
> (insn 15 14 16 3 (set (reg/v:SI 78 [ i ])
> (minus:SI (subreg:SI (reg/v:DI 76 [ aul ]) 0)
> (mult:SI (subreg:SI (reg:DI 83) 0)
> (subreg:SI (reg:DI 75 [ _20 ]) 0 schedice.c:10 357
> {*msubsi}
>
> The register_operand predicate for that pattern allows subregs (I think
> correctly).
> The code in aarch_accumulator_forwarding doesn't take that into account and
> ends up
> taking a REGNO of a SUBREG, causing a checking error.
>
> This patch fixes that by stripping the subregs off the accumulator rtx
> before
> checking that the inner expression is a REG and taking its REGNO.
>
> The testcase now works fine with an aarch64-none-elf toolchain configure for
> RTL checking.
>
> The testcase is taken verbatim from the BZ entry for PR 68088.
> Since this function is shared between arm and aarch64 I've bootstrapped and
> tested it on both
> and I'll need ok's for both ports.
>
> Ok for trunk?

rtl.h exposes reg_or_subregno() already doesn't that do what we need here?

The test case is not aarch64 specific therefore I think convention is
that it should go into a generic directory.

Cheers
/Marcus


Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check

2015-10-29 Thread Marcus Shawcroft
On 29 October 2015 at 13:50, Kyrill Tkachov  wrote:

>>> Ok for trunk?
>>
>> rtl.h exposes reg_or_subregno() already doesn't that do what we need here?
>
>
> reg_or_subregno assumes that what it's passed is REG or a SUBREG.
> It will ICE on any other rtx. Here I want to strip the subreg if it is
> a subreg, but leave it as it is otherwise.

OK, I follow.

>> The test case is not aarch64 specific therefore I think convention is
>> that it should go into a generic directory.
>
>
> Ok, I'll put it in gcc.dg/


OK with the test case moved. Thanks /Marcus


Re: Fwd: PING^3: [PATCH]: New configure options that make the compiler use -fPIE and -pie as default option

2015-10-28 Thread Marcus Shawcroft
On 23 October 2015 at 13:34, H.J. Lu <hjl.to...@gmail.com> wrote:
> On Fri, Oct 23, 2015 at 4:54 AM, Marcus Shawcroft
> <marcus.shawcr...@gmail.com> wrote:
>> Hi,
>>
>> This patch breaks the distinction between build and host. For example
>> consider a configure along these lines:
>>
>> ./configure --host=aarch64-none-linux-gnu
>> --target=aarch64-none-linux-gnu --build=x86_64-pc-linux-gnu
>> HJ, could you take a look into this issue?
>>
>
> Try this.

Hi,  Can we get a review from one of the build machinery maintainers
for this patch?

Cheers
/Marcus


Re: Fwd: PING^3: [PATCH]: New configure options that make the compiler use -fPIE and -pie as default option

2015-10-23 Thread Marcus Shawcroft
Hi,

This patch breaks the distinction between build and host. For example
consider a configure along these lines:

./configure --host=aarch64-none-linux-gnu
--target=aarch64-none-linux-gnu --build=x86_64-pc-linux-gnu

Will result in:

CXX_FOR_BUILD='g++'
CXX='aarch64-none-linux-gnu-g++'

the gcc/configure fragment:

+# Check if -no-pie works.
+AC_CACHE_CHECK([for -no-pie option],
+  [gcc_cv_no_pie],
+  [saved_LDFLAGS="$LDFLAGS"
+   LDFLAGS="$LDFLAGS -no-pie"
+   AC_LINK_IFELSE([int main(void) {return 0;}],
+ [gcc_cv_no_pie=yes],
+ [gcc_cv_no_pie=no])
+   LDFLAGS="$saved_LDFLAGS"])
+if test "$gcc_cv_no_pie" = "yes"; then
+  NO_PIE_FLAG="-no-pie"
+fi
+AC_SUBST([NO_PIE_FLAG])

will check if CXX supports -no-pic and set NO_PIE_FLAG accordingly.
The gcc/Makefile.in fragment:

@@ -761,6 +769,7 @@ BUILD_LINKERFLAGS = $(BUILD_CXXFLAGS)

 # Native linker and preprocessor flags.  For x-fragment overrides.
 BUILD_LDFLAGS=@BUILD_LDFLAGS@
+BUILD_LDFLAGS += @NO_PIE_FLAG@

constructs the flags which will ultimately be used to compile the
build machines gen* programs.  That compilation uses CXX_FOR_BUILD
with the NO_PIE_FLAG that was originally probed for CXX.  This is
incorrect if CXX_FOR_BUILD != CXX

HJ, could you take a look into this issue?

Thanks
/Marcus


Re: Fwd: PING^3: [PATCH]: New configure options that make the compiler use -fPIE and -pie as default option

2015-10-23 Thread Marcus Shawcroft
On 23 October 2015 at 13:34, H.J. Lu <hjl.to...@gmail.com> wrote:
> On Fri, Oct 23, 2015 at 4:54 AM, Marcus Shawcroft
> <marcus.shawcr...@gmail.com> wrote:
>> Hi,
>>
>> This patch breaks the distinction between build and host. For example
>> consider a configure along these lines:
>>
>> ./configure --host=aarch64-none-linux-gnu
>> --target=aarch64-none-linux-gnu --build=x86_64-pc-linux-gnu

>> HJ, could you take a look into this issue?
>>
>
> Try this.

Thanks HJ. This solves the issue I was seeing.

Cheers
/Marcus


Re: [PATCH] [AArch64] support -mfentry feature for arm64

2015-10-22 Thread Marcus Shawcroft
On 22 October 2015 at 14:21, Li Bin  wrote:
> From: Jiangjiji 
>
> * gcc/config/aarch64/aarch64.opt: Add a new option.
> * gcc/config/aarch64/aarch64.c: Add some new functions and Macros.
> * gcc/config/aarch64/aarch64.h: Modify PROFILE_HOOK and FUNCTION_PROFILER.
>
> Signed-off-by: Jiangjiji 
> Signed-off-by: Li Bin 

Hi,  Please work with Maxim
https://gcc.gnu.org/ml/gcc/2015-10/msg00090.html towards a generic
solution to this issue.  If, in the future it becomes apparent that
his proposal is not viable then we can reconsider a backend centric
solution.

Cheers
/Marcus


Re: [Patch AArch64 63304] Fix issue with global state.

2015-10-20 Thread Marcus Shawcroft
On 16 October 2015 at 12:05, Ramana Radhakrishnan
 wrote:

> 2015-10-15  Ramana Radhakrishnan  
>
> PR target/63304
> * config/aarch64/aarch64.c (aarch64_nopcrelative_literal_loads): New.
> (aarch64_expand_mov_immediate): Use 
> aarch64_nopcrelative_literal_loads.
> (aarch64_classify_address): Likewise.
> (aarch64_secondary_reload): Likewise.
> (aarch64_override_options_after_change_1): Adjust.
> * config/aarch64/aarch64.md 
> (aarch64_reload_movcp):
> Use aarch64_nopcrelative_literal_loads.
> (aarch64_reload_movcp): Likewise.
> * config/aarch64/aarch64-protos.h 
> (aarch64_nopcrelative_literal_loads): Declare
>
> 2015-10-15  Jiong Wang  
> Ramana Radhakrishnan  
>
> PR target/63304
> * gcc.target/aarch64/pr63304-1.c: New test.

> +/* Global flag for PC relative loads.  */
> +bool aarch64_nopcrelative_literal_loads;

Yuk... but I don't have a better suggestion to hand.

> +++ b/gcc/testsuite/gcc.target/aarch64/pr63304-1.c

In that directory I've been pestering folks to follow the name
convention on the wiki for new additions, so _1 please (or change the
wiki ;-).

Otherwise OK with me.

Cheers
/Marcus


Re: [PATCH][AArch64] Enable fusion of AES instructions

2015-10-20 Thread Marcus Shawcroft
On 14 October 2015 at 13:30, Wilco Dijkstra  wrote:
> Enable instruction fusion of dependent AESE; AESMC and AESD; AESIMC pairs. 
> This can give up to 2x
> speedup on many AArch64 implementations. Also model the crypto instructions 
> on Cortex-A57 according
> to the Optimization Guide.
>
> Passes regression tests.
>
> ChangeLog:
> 2015-10-14  Wilco Dijkstra  
>
> * gcc/config/aarch64/aarch64.c (cortexa53_tunings): Add AES fusion.
> (cortexa57_tunings): Likewise.
> (cortexa72_tunings): Likewise.
> (arch_macro_fusion_pair_p): Add support for AES fusion.
> * gcc/config/aarch64/aarch64-fusion-pairs.def: Add AES_AESMC entry.

These AArch64 changes are OK

> * gcc/config/arm/aarch-common.c (aarch_crypto_can_dual_issue):
> Allow virtual registers before reload so early scheduling works.
> * gcc/config/arm/cortex-a57.md (cortex_a57_crypto_simple): Use
> correct latency and pipeline.
> (cortex_a57_crypto_complex): Likewise.
> (cortex_a57_crypto_xor): Likewise.
> (define_bypass): Add AES bypass.

... but wait for Ramana or Kyrill to respond on these changes before you commit.

Cheers
/Marcus


Re: [PATCH][AArch64][1/2] Add fmul-by-power-of-2+fcvt optimisation

2015-10-20 Thread Marcus Shawcroft
On 19 October 2015 at 14:57, Kyrill Tkachov  wrote:

> 2015-10-19  Kyrylo Tkachov  
>
> * config/aarch64/aarch64.md
>  (*aarch64_fcvt2_mult): New pattern.
> * config/aarch64/aarch64-simd.md
>  (*aarch64_fcvt2_mult): Likewise.
> * config/aarch64/aarch64.c (aarch64_rtx_costs): Handle above patterns.
> (aarch64_fpconst_pow_of_2): New function.
> (aarch64_vec_fpconst_pow_of_2): Likewise.
> * config/aarch64/aarch64-protos.h (aarch64_fpconst_pow_of_2): Declare
> prototype.
> (aarch64_vec_fpconst_pow_of_2): Likewise.
> * config/aarch64/predicates.md (aarch64_fp_pow2): New predicate.
> (aarch64_fp_vec_pow2): Likewise.
>
> 2015-10-19  Kyrylo Tkachov  
>
> * gcc.target/aarch64/fmul_fcvt_1.c: New test.
> * gcc.target/aarch64/fmul_fcvt_2.c: Likewise.

+char buf[64];
+sprintf (buf, "fcvtz\\t%%0., %%1., #%d", fbits);

Prefer snprintf please.

+  }
+  [(set_attr "type" "neon_fp_to_int_")]
+)
+
+

Superflous blank line here ?

+  *cost += rtx_cost (XEXP (x, 0), VOIDmode,
+ (enum rtx_code) code, 0, speed);

My understanding is the unnecessary use of enum  is now discouraged,
(rtx_code) is sufficient in this case.

+  int count = CONST_VECTOR_NUNITS (x);
+  int i;
+  for (i = 1; i < count; i++)

Push the int into the for initializer.
Push the rhs of the count assignment into the for condition and drop
the definition of count.

+/* { dg-final { scan-assembler "fcvtzs\tw\[0-9\], s\[0-9\]*.*#2" } } */

I'd prefer scan-assembler-times or do you have a particular reason to
avoid it in these tests?

Cheers
/Marcus


Re: [PATCH][AArch64] Add support for 64-bit vector-mode ldp/stp

2015-10-20 Thread Marcus Shawcroft
On 20 October 2015 at 17:26, Kyrill Tkachov <kyrylo.tkac...@arm.com> wrote:
> Hi Marcus,
>
> On 20/10/15 17:05, Marcus Shawcroft wrote:
>>
>> On 16 October 2015 at 13:58, Kyrill Tkachov <kyrylo.tkac...@arm.com>
>> wrote:
>>>
>>> Hi all,
>>>
>>> We already support load/store-pair operations on the D-registers when
>>> they
>>> contain an FP value, but the peepholes/sched-fusion machinery that
>>> do all the hard work currently ignore 64-bit vector modes.
>>>
>>> This patch adds support for fusing loads/stores of 64-bit vector operands
>>> into ldp and stp instructions.
>>> I've seen this trigger a few times in SPEC2006. Not too many times, but
>>> the
>>> times it did trigger the code seemed objectively better
>>> i.e. long sequences of ldr and str instructions essentially halved in
>>> size.
>>>
>>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>>
>>> Ok for trunk?
>>>
>>> Thanks,
>>> Kyrill
>>>
>>> 2015-10-16  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>>>
>>>  * config/aarch64/aarch64.c (aarch64_mode_valid_for_sched_fusion_p):
>>
>> We have several different flavours of fusion in the backend, this one
>> is specifically load/stores, perhaps making that clear in the name of
>> this predicate will avoid confusion further down the line?
>
> Thanks for the review,
>
> This particular type of fusion is called sched_fusion in various
> places in the compiler and its implementation in aarch64 is only for
> load/store merging (indeed, the only usage of sched_fusion currently
> is to merge loads/stores in arm and aarch64).
>
> So, I think that sched_fusion in the name already conveys the information
> that it's the ldp/stp one rather than macro fusion. In fact, there is a
> macro fusion of ADRP and an LDR instruction,
> so having sched_fusion in the name is actually a better differentiator than
> mentioning loads/stores as both types of fusion deal with loads in some way.
>
> Is it ok to keep the name as is?

Thanks for the justification, patch is OK to commit /Marcus


Re: [PATCH][Testsuite] Turn on 64-bit-vector tests for AArch64

2015-10-20 Thread Marcus Shawcroft
On 16 October 2015 at 12:26, Alan Lawrence  wrote:
> This enables tests bb-slp-11.c and bb-slp-26.c for AArch64. Both of these are
> currently passing on little- and big-endian.
>
> (Tested on aarch64-none-linux-gnu and aarch64_be-none-elf).
>
> OK for trunk?
>
> gcc/testsuite/ChangeLog:
>
> * lib/target-supports.exp (check_effective_target_vect64): Add 
> AArch64.
> ---
>  gcc/testsuite/lib/target-supports.exp | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index 3088369..bd03108 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -4762,6 +4762,7 @@ proc check_effective_target_vect64 { } {
>  if { ([istarget arm*-*-*]
>   && [check_effective_target_arm_neon_ok]
>   && [check_effective_target_arm_little_endian])
> +|| [istarget aarch64*-*-*]
>   || [istarget sparc*-*-*] } {
> set et_vect64_saved 1
>  }
> --
> 1.9.1
>

OK /Marcus


Re: [PATCH][AArch64] Add support for 64-bit vector-mode ldp/stp

2015-10-20 Thread Marcus Shawcroft
On 16 October 2015 at 13:58, Kyrill Tkachov  wrote:
> Hi all,
>
> We already support load/store-pair operations on the D-registers when they
> contain an FP value, but the peepholes/sched-fusion machinery that
> do all the hard work currently ignore 64-bit vector modes.
>
> This patch adds support for fusing loads/stores of 64-bit vector operands
> into ldp and stp instructions.
> I've seen this trigger a few times in SPEC2006. Not too many times, but the
> times it did trigger the code seemed objectively better
> i.e. long sequences of ldr and str instructions essentially halved in size.
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> Ok for trunk?
>
> Thanks,
> Kyrill
>
> 2015-10-16  Kyrylo Tkachov  
>
> * config/aarch64/aarch64.c (aarch64_mode_valid_for_sched_fusion_p):

We have several different flavours of fusion in the backend, this one
is specifically load/stores, perhaps making that clear in the name of
this predicate will avoid confusion further down the line?

> New function.
> (fusion_load_store): Use it.
> * config/aarch64/aarch64-ldpstp.md: Add new peephole2s for
> ldp and stp in VD modes.
> * config/aarch64/aarch64-simd.md (load_pair, VD): New pattern.
> (store_pair, VD): Likewise.
>
> 2015-10-16  Kyrylo Tkachov  
>
> * gcc.target/aarch64/stp_vec_64_1.c: New test.
> * gcc.target/aarch64/ldp_vec_64_1.c: New test.

Otherwise OK /Marcus


Re: [PATCH][AArch64][1/2] Add fmul-by-power-of-2+fcvt optimisation

2015-10-20 Thread Marcus Shawcroft
On 20 October 2015 at 17:31, Kyrill Tkachov <kyrylo.tkac...@arm.com> wrote:
>
> On 20/10/15 17:28, Ramana Radhakrishnan wrote:
>>
>> On Tue, Oct 20, 2015 at 4:26 PM, Marcus Shawcroft
>> <marcus.shawcr...@gmail.com> wrote:
>>>
>>> On 19 October 2015 at 14:57, Kyrill Tkachov <kyrylo.tkac...@arm.com>
>>> wrote:
>>>
>>>> 2015-10-19  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>>>>
>>>>  * config/aarch64/aarch64.md
>>>>   (*aarch64_fcvt2_mult): New pattern.
>>>>  * config/aarch64/aarch64-simd.md
>>>>   (*aarch64_fcvt2_mult): Likewise.
>>>>  * config/aarch64/aarch64.c (aarch64_rtx_costs): Handle above
>>>> patterns.
>>>>  (aarch64_fpconst_pow_of_2): New function.
>>>>  (aarch64_vec_fpconst_pow_of_2): Likewise.
>>>>  * config/aarch64/aarch64-protos.h (aarch64_fpconst_pow_of_2):
>>>> Declare
>>>>  prototype.
>>>>  (aarch64_vec_fpconst_pow_of_2): Likewise.
>>>>  * config/aarch64/predicates.md (aarch64_fp_pow2): New predicate.
>>>>  (aarch64_fp_vec_pow2): Likewise.
>>>>
>>>> 2015-10-19  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>>>>
>>>>  * gcc.target/aarch64/fmul_fcvt_1.c: New test.
>>>>  * gcc.target/aarch64/fmul_fcvt_2.c: Likewise.
>>>
>>> +char buf[64];
>>> +sprintf (buf, "fcvtz\\t%%0., %%1., #%d", fbits);
>>>
>>> Prefer snprintf please
>>
>> Should we also update our coding standards for this ? i.e. use the non
>> `n' versions of the string functions only if you have a very good
>> reason. Even more so in the run time support libraries !
>
>
> My understanding is one should *really* use snprintf if any part of the
> string comes from the user. So in this case it probably doesn't make a
> difference. However, it's not hard to use the 'n' version and thinking
> about the maximum length is probably a good practice to get into.

Using snprintf here means that if/when someone fettles around with the
pattern in the future they won't silently over run the buffer that
they forgot to resize for their new longer pattern. Although in the
case the likely hood of that happening seems rather small it is good
practice to simply avoid the none n version.

Cheers
/Marcus


Re: [PATCH][AArch64][1/2] Add fmul-by-power-of-2+fcvt optimisation

2015-10-20 Thread Marcus Shawcroft
On 20 October 2015 at 16:47, Kyrill Tkachov  wrote:

> Here's the patch updated as per your feedback.
>
> How's this?
>
> Thanks,
> Kyrill
>
> 2015-10-20  Kyrylo Tkachov  
>
> * config/aarch64/aarch64.md
> (*aarch64_fcvt2_mult): New pattern.
> * config/aarch64/aarch64-simd.md
> (*aarch64_fcvt2_mult): Likewise.
> * config/aarch64/aarch64.c (aarch64_rtx_costs): Handle above patterns.
> (aarch64_fpconst_pow_of_2): New function.
> (aarch64_vec_fpconst_pow_of_2): Likewise.
> * config/aarch64/aarch64-protos.h (aarch64_fpconst_pow_of_2): Declare
> prototype.
> (aarch64_vec_fpconst_pow_of_2): Likewise.
> * config/aarch64/predicates.md (aarch64_fp_pow2): New predicate.
> (aarch64_fp_vec_pow2): Likewise.
>
> 2015-10-20  Kyrylo Tkachov  
>
>
> * gcc.target/aarch64/fmul_fcvt_1.c: New test.
> * gcc.target/aarch64/fmul_fcvt_2.c: Likewise.
>

OK /Marcus


Re: [PATCH][AArch64 Testsuite][Trivial?] Remove divisions-to-produce-NaN from vdiv_f.c

2015-10-20 Thread Marcus Shawcroft
On 20 October 2015 at 13:40, Alan Lawrence  wrote:
> The test vdiv_f.c #define's NAN to (0.0 / 0.0). This produces extra scalar 
> fdiv's, which complicate the scan-assembler testing. We can remove these by 
> using __builtin_nan instead.
>
> Tested on AArch64 Linux.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/vdiv_f.c: Use __builtin_nan.

OK
/Marcus


Re: [PATCH] 2015-10-19 Benedikt Huber <benedikt.hu...@theobroma-systems.com> Philipp Tomsich <philipp.toms...@theobroma-systems.com>

2015-10-20 Thread Marcus Shawcroft
On 4 January 1970 at 00:02, Benedikt Huber
 wrote:
> * config/aarch64/aarch64-builtins.c: Builtins for rsqrt and rsqrtf.
> * config/aarch64/aarch64-protos.h: Declare.
> * config/aarch64/aarch64-simd.md: Matching expressions for frsqrte and
> frsqrts.
> * config/aarch64/aarch64-tuning-flags.def: Added recip_sqrt.
> * config/aarch64/aarch64.c: New functions. Emit rsqrt estimation code 
> when
> applicable.
> * config/aarch64/aarch64.md: Added enum entries.
> * config/aarch64/aarch64.opt: Added option -mlow-precision-recip-sqrt.
> * testsuite/gcc.target/aarch64/rsqrt_asm_check_common.h: Common 
> macros for
> assembly checks.
> * testsuite/gcc.target/aarch64/rsqrt_asm_check_negative_1.c: Make sure
> frsqrts and frsqrte are not emitted.
> * testsuite/gcc.target/aarch64/rsqrt_asm_check_1.c: Make sure frsqrts 
> and
> frsqrte are emitted.
> * testsuite/gcc.target/aarch64/rsqrt_1.c: Functional tests for rsqrt.

OK /Marcus


Re: [PATCH] 2015-10-15 Benedikt Huber <benedikt.hu...@theobroma-systems.com> Philipp Tomsich <philipp.toms...@theobroma-systems.com>

2015-10-16 Thread Marcus Shawcroft
Hi,

 A few more style nits:

> +  builtin_decls_data bdda[] = {

New line before  {

> +{double_type_node, "__builtin_aarch64_rsqrt_df", 
> AARCH64_BUILTIN_RSQRT_DF},

Space after {
Space  before }

> +void aarch64_emit_swrsqrt (rtx, rtx);
> +

> +tree aarch64_builtin_rsqrt (unsigned int fn, bool md_fn);
> +

Drop the formal argument names as you did in the first declaration.

See my previous comment w.r.t the naming of new test cases in
gcc.target/aarch64, at least the following still need s/-/_/

> diff --git a/gcc/testsuite/gcc.target/aarch64/rsqrt-asm-check-common.h 
> b/gcc/testsuite/gcc.target/aarch64/rsqrt-asm-check-common.h
> diff --git a/gcc/testsuite/gcc.target/aarch64/rsqrt-asm-check-negative_1.c 
> b/gcc/testsuite/gcc.target/aarch64/rsqrt-asm-check-negative_1.c
> diff --git a/gcc/testsuite/gcc.target/aarch64/rsqrt-asm-check_1.c 
> b/gcc/testsuite/gcc.target/aarch64/rsqrt-asm-check_1.c

> +//   With -ffast-math these return positive INF.
> +//   t_double (-0.0, -inf);
> +//   t_float (-0.0, -inff);
> +
> +//   The reason here is that -ffast-math flushes to zero.
> +//   t_double  (__DBL_MIN__/256, 0X1.00P+515);
> +//   t_float (__FLT_MIN__/256, 0X1.00P+67);

Comment consistently with the rest of the backend ie /* */

Thanks
/Marcus


Re: [PATCH] 2015-10-15 Benedikt Huber <benedikt.hu...@theobroma-systems.com> Philipp Tomsich <philipp.toms...@theobroma-systems.com>

2015-10-16 Thread Marcus Shawcroft
On 16 October 2015 at 15:31, Benedikt Huber
 wrote:
> I introduced this in revision 7 due to a request from James Greenhalgh.
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00963.html
>
>> Given that this is all so mechanical, I'd have a preference towards
>> refactoring this to loop over some structured data.
>
> Do you mean, that I should get rid of the typedef and leave the struct 
> without it?
> Or should I completely drop the struct?

The use of the struct is fine, we are being discouraged from using
unnecessary typedefs.  Just rewrite it as:

 struct builtin_decls_data
   {
   ...
   };

The references to the typedef'd name don't need to be modified.
Cheers
/Marcus


Re: [PATCH] 2015-10-15 Benedikt Huber <benedikt.hu...@theobroma-systems.com> Philipp Tomsich <philipp.toms...@theobroma-systems.com>

2015-10-16 Thread Marcus Shawcroft
On 16 October 2015 at 14:59, Benedikt Huber
 wrote:

> +  typedef struct
> +  {
> +tree type_node;
> +const char *builtin_name;
> +int function_code;
> +  } builtin_decls_data;

Please address Oleg's comment.

Cheers
/Marcus


Re: [RFC VTV] Fix VTV for targets that have section anchors.

2015-10-13 Thread Marcus Shawcroft

On 09/10/15 10:17, Ramana Radhakrishnan wrote:

This started as a Friday afternoon project ...

It turned out enabling VTV for AArch64 and ARM was a matter of fixing PR67868 
which essentially comes from building libvtv with section anchors turned on. 
The problem was that the flow of control from output_object_block through to 
switch_section did not have the same special casing for the vtable section that 
exists in assemble_variable.

Once this was done, I managed to build and test aarch64-none-linux-gnu with 
--enable-vtable-verify, a similar test was done for armhf.

Testing showed no regressions in the gcc/ g++ testsuites for aarch64 and armhf
Testing showed no failures in libvtv testsuite for aarch64 but a few more 
failures - see below.
Testing showed 2 failures in libstdc++-v3 testsuite compared to without vtable 
verification.

FAIL: libstdc++-abi/abi_check
FAIL: experimental/filesystem/iterators/directory_iterator.cc execution test

However both these failures also occur on x86_64 - so I'm content to declare 
victory on AArch64
as far as basic enablement goes.

On ARM I see the following failures that I still need to debug - I can see that 
the failure is because the write to _ZN4_VTVI1BE12__vtable_mapE does not elicit 
a SEGV but I need to go further than that.

FAIL: libvtv.cc/thunk_vtable_map_attack.cc -O0 -fvtable-verify=std execution 
test
FAIL: libvtv.cc/thunk_vtable_map_attack.cc -O2 -fvtable-verify=std execution 
test
FAIL: libvtv.cc/thunk_vtable_map_attack.cc -O0 -fvtable-verify=preinit 
execution test
FAIL: libvtv.cc/thunk_vtable_map_attack.cc -O2 -fvtable-verify=preinit 
execution test


Questions -

1. Are the generic changes to varasm.c ok ?
2. Can we take the AArch64 support in now, given this amount of testing ? 
Marcus / Caroline ?


+  aarch64*-*-linux*)
+   VTV_SUPPORTED=yes
;;

Ramana, Go ahead an add the aarch64 enable once you have the 
pre-requisite varasm changes approved.


Cheers
/Marcus


3. Any suggestions / helpful debug hints for VTV debugging (other than turning 
VTV_DEBUG on and inspecting trace) ?

There's an arm*-*-* hunk there but I'm easy about applying that right now and 
figuring out the issues
over time. In case we don't fix it for 6.0 we can rip the support out before 
release.


Thanks,
Ramana

P.S. (Yes, I'll provide a Changelog :) )





Re: [PATCH, aarch64]: Remove AARCH64_ROUND_UP and AARCH64_ROUND_DOWN defines

2015-10-12 Thread Marcus Shawcroft
On 12 October 2015 at 12:28, Uros Bizjak  wrote:
> Remove private definitions and use equivalent global macros instead.
>
> 2015-10-12  Uros Bizjak  
>
> * config/aarch/aarch64.h (AARCH64_ROUND_UP): Remove.
> (AARCH64_ROUND_DOWN): Ditto.
> * config/aarch64/aarch64.c: Use ROUND_UP instead of AARCH64_ROUND_UP.
>
> Tested by building a crosscompiler to aarch64-linux-gnu.
>
> OK for mainline?

OK Thanks /Marcus


Re: C PATCH for c/65345 (file-scope _Atomic expansion with floats)

2015-10-06 Thread Marcus Shawcroft
On 6 October 2015 at 12:29, Ramana Radhakrishnan
 wrote:

> Thanks for the explanation Eric, by that explanation I do not see the need to 
> adjust for TARGET_EXPR or mark_addressable in the backends.
>
> Here are the patches that I'm testing - I will apply the ARM one after 
> testing finishes - my previous testing broke because of some other reasons.
>
> The AArch64 patch cleared testing - ok to apply ?
>

> PR c/65345
>
> * config/aarch64/aarch64-builtins.c 
> (aarch64_atomic_assign_expand_fenv): Use create_tmp_var_raw instead of 
> create_tmp_var.

OK /Marcus


Re: [AArch64] [TLSIE][2/2] Implement TLS IE for tiny model

2015-10-06 Thread Marcus Shawcroft
On 5 October 2015 at 14:19, Jiong Wang  wrote:

> 2015-10-05 James Greenhalgh 
>Jiong Wang  
>
> gcc/
>   * config/aarch64/aarch64.md (tlsie_tiny_sidi): Replace "" with "w".
>

How about a test case?
/Marcus


Re: [AArch64][TLSGD][2/2] Implement TLS GD traditional for tiny code model

2015-10-06 Thread Marcus Shawcroft
On 27 August 2015 at 10:54, Jiong Wang  wrote:
>
> As described this is the main implementaion patch.
>
> 2015-08-26  Jiong Wang  
>
> gcc/
>   * configure.ac: Add check for binutils global dynamic tiny code model
>   relocation support.
>   * configure: Regenerate.
>   * config.in: Regenerate.
>   * config/aarch64/aarch64.md (tlsgd_tiny): New define_insn.
>   * config/aarch64/aarch64-protos.h (aarch64_symbol_type): New
>   enumeration SYMBOL_TINY_TLSGD.
>   (aarch64_symbol_context): New comment on SYMBOL_TINY_TLSGD.
>   * config/aarch64/aarch64.c (aarch64_classify_tls_symbol): Support
>   SYMBOL_TINY_TLSGD.
>   (aarch64_print_operand): Likewise.
>   (aarch64_expand_mov_immediate): Likewise.
>   (aarch64_load_symref_appropriately): Likewise.
>
> gcc/testsuite/
>   * lib/target-supports.exp (check_effective_target_aarch64_tlsgdtiny):
>   New effective check.
>   * gcc.target/aarch64/tlsgd_small_1.c: New testcase.
>   * gcc.target/aarch64/tlsgd_tiny_1.c: Likewise.
>

+#ifdef HAVE_AS_TINY_TLSGD_RELOCS
+  return SYMBOL_TINY_TLSGD;
+#else
+  return SYMBOL_SMALL_TLSGD;
+#endif

Rather than introduce blocks of conditional compilation it is better
to gate different behaviours with a test on a constant expression. In
this case add something like this:

#if define(HAVE_AS_TINY_TLSGD_RELOCS)
#define USE_TINY_TLSGD 1
#else
#define USE_TINY_TLSGD 0
#endif

up near the definition of TARGET_HAVE_TLS then write the above
fragment without using the preprocessor:

return USE_TINY_TLSGD ? SYMBOL_TINY_TLSGD : SYMBOL_SMALL_TLSGD;

or similar.


@@ -1024,6 +1024,7 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm,
  return;
   }

+case SYMBOL_TINY_TLSGD:
 case SYMBOL_SMALL_TLSGD:
   {
  rtx_insn *insns;
@@ -1031,7 +1032,10 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm,
  rtx resolver = aarch64_tls_get_addr ();

  start_sequence ();
- aarch64_emit_call_insn (gen_tlsgd_small (result, imm, resolver));
+ if (type == SYMBOL_SMALL_TLSGD)
+  aarch64_emit_call_insn (gen_tlsgd_small (result, imm, resolver));
+ else
+  aarch64_emit_call_insn (gen_tlsgd_tiny (result, imm, resolver));
  insns = get_insns ();
  end_sequence ();

Add a separate case statment for SYMBOL_TINY_TLSGD rather than reusing
the case statement for SYMBOL_SMALL_TLSGD and then needing to add
another test against symbol type within the body of the case
statement.

+(define_insn "tlsgd_tiny"
+  [(set (match_operand 0 "register_operand" "")
+ (call (mem:DI (match_operand:DI 2 "" "")) (const_int 1)))
+   (unspec:DI [(match_operand:DI 1 "aarch64_valid_symref" "S")]
UNSPEC_GOTTINYTLS)
+   (clobber (reg:DI LR_REGNUM))
+  ]
+  ""
+  "adr\tx0, %A1;bl\t%2;nop";
+  [(set_attr "type" "multiple")
+   (set_attr "length" "12")])

I don't think the explicit clobber LR_REGNUM is required since your
change last September:
https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02654.html

Thanks
/Marcus


Re: [AArch64][TLSGD][1/2] Remove unncessary define_expand for TLS GD traditional

2015-10-06 Thread Marcus Shawcroft
On 27 August 2015 at 10:52, Jiong Wang  wrote:

> 2015-08-27  Jiong Wang  
>
> gcc/
>   * config/aarch64/aarch64.md (tlsgd_small): Delete this define_expand.
>   (*tlsgd_small): Rename this define_insn to "tlsgd_small";

OK /Marcus


Re: [Patch AArch64] Use default_elf_asm_named_section instead of special cased hook

2015-10-02 Thread Marcus Shawcroft
On 2 October 2015 at 11:08, Ramana Radhakrishnan
 wrote:

> * config/aarch64/aarch64-elf.h (TARGET_ASM_NAMED_SECTION): Use
> default_elf_asm_named_section.
> * config/aarch64/aarch64.c (aarch64_elf_asm_named_section): Delete.
> ---
>  gcc/config/aarch64/aarch64-elf.h |  2 +-
>  gcc/config/aarch64/aarch64.c | 74 
> 
>  2 files changed, 1 insertion(+), 75 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64-elf.h 
> b/gcc/config/aarch64/aarch64-elf.h
> index 1ce6343..70aa845 100644
> --- a/gcc/config/aarch64/aarch64-elf.h
> +++ b/gcc/config/aarch64/aarch64-elf.h
> @@ -154,7 +154,7 @@ ASM_MABI_SPEC
>  #define TYPE_OPERAND_FMT   "%%%s"
>
>  #undef TARGET_ASM_NAMED_SECTION
> -#define TARGET_ASM_NAMED_SECTION  aarch64_elf_asm_named_section
> +#define TARGET_ASM_NAMED_SECTION  default_elf_asm_named_section

Isn't it sufficient to simply remove the #define completely and rely
on the default from elfos.h?
/M


Re: [AArch64/testsuite] Add more TLS local executable testcases

2015-10-02 Thread Marcus Shawcroft
On 22 September 2015 at 17:49, Jiong Wang  wrote:

> 2015-09-22  Jiong Wang  
>
> gcc/testsuite/
>* gcc.target/aarch64/tlsle12_tiny_1.c: New testcase for tiny model.
>* gcc.target/aarch64/tlsle24_tiny_1.c: Likewise.
>* gcc.target/aarch64/tlsle_sizeadj_tiny_1.c: TLS size truncation test
>for tiny model.
>* gcc.target/aarch64/tlsle_sizeadj_small_1.c: TLS size truncation test
>for small model.
>

OK /Marcus


Re: [Patch AArch64] Use default_elf_asm_named_section instead of special cased hook

2015-10-02 Thread Marcus Shawcroft
On 2 October 2015 at 14:01, Ramana Radhakrishnan
 wrote:

>>>  #undef TARGET_ASM_NAMED_SECTION
>>> -#define TARGET_ASM_NAMED_SECTION  aarch64_elf_asm_named_section
>>> +#define TARGET_ASM_NAMED_SECTION  default_elf_asm_named_section
>>
>> Isn't it sufficient to simply remove the #define completely and rely
>> on the default from elfos.h?
>
> Doh ! you're right - Yeah should be coming in from config/elfos.h given that 
> we don't support anything other than elf in the aarch64 port.
>
> Ok to commit without that hunk ?

Yes. /Marcus


Re: [Patch AArch64] Improve SIMD concatenation with zeroes

2015-10-02 Thread Marcus Shawcroft

On 02/10/15 09:12, James Greenhalgh wrote:


2015-10-01  James Greenhalgh  

* config/aarch64/aarch64-simd.md (*aarch64_combinez): Add
alternatives for reads from memory and moves from general-purpose
registers.
(*aarch64_combinez_be): Likewise.

2015-10-01  James Greenhalgh  

* gcc.target/aarch64/vect_combine_zeroes_1.c: New.



OK /Marcus



Re: [Patch 2/2 ARM/AArch64] Add a new Cortex-A53 scheduling model

2015-10-01 Thread Marcus Shawcroft

On 25/09/15 08:59, James Greenhalgh wrote:


Hi,

This patch introduces a new scheduling model for Cortex-A53.

Bootstrapped and tested on arm-none-linux-gnueabi and aarch64-none-linux-gnu
and checked with a variety of popular benchmarking and microbenchmarking
suites to show a benefit.

OK?

Thanks,
James

---
2015-09-25  James Greenhalgh  

* config/arm/aarch-common-protos.h
(aarch_accumulator_forwarding): New.
(aarch_forward_to_shift_is_not_shifted_reg): Likewise.
* config/arm/aarch-common.c (aarch_accumulator_forwarding): New.
(aarch_forward_to_shift_is_not_shifted_reg): Liekwise.
* config/arm/cortex-a53.md: Rewrite.



OK aarch64 with Kyrill's comments fixed.
/M



Re: [PATCH][AArch64] Add separate insn sched class for vector LDP & STP

2015-09-29 Thread Marcus Shawcroft

On 29/09/15 00:52, Evandro Menezes wrote:

In some micro-architectures the insns to load or store pairs of vector
registers are implemented rather differently from those affecting lanes
in vector registers.  Then, it's important that such insns be described
likewise differently in the scheduling model.

This patch adds the insn types neon_ldp{,_q} and neon_stp{,_q} apart
from the current neon_load2_2reg_q and neon_store2_2reg_q types,
respectively.



Hi,

The AArch64 part of this is OK. Please wait for Kyrill or Ramana to 
comment on ARM side.  Cheers /Marcus



Thank you,

-- Evandro Menezes


0001-AArch64-Add-separate-insn-sched-class-for-vector-LDP.patch


 From 340249dcd2af8dfce486cb4f62d4eaf285c6a799 Mon Sep 17 00:00:00 2001
From: Evandro Menezes
Date: Mon, 28 Sep 2015 15:00:00 -0500
Subject: [PATCH] [AArch64] Add separate insn sched class for vector LDP & STP

2015-09-28  Evandro Menezes

gcc/
* config/arm/types.md (neon_ldp, neon_ldp_q, neon_stp, neon_stp_q):
add new insn types for vector load and store pairs.


s/add/Add/ and likewise the rest of the changelog comments.


* config/arm/cortex-a53.md (cortex_a53_f_load_2reg): add insn
types "neon_ldp{,_q}".
* config/arm/cortex-a57.md (neon_load_c): add insn types
"neon_ldp{,_q}".
(neon_store_complex): add insn types "neon_stp{,_q}".
* config/aarch64/aarch64-simd.md (aarch64_be_movoi): add insn types
"neon_{ldp,stp}_q".




Re: [AArch64] Fix Prefetch ICE

2015-09-29 Thread Marcus Shawcroft
On 28 September 2015 at 06:27, Hurugalawadi, Naveen
 wrote:
> Hi Marcus,
>
> Thanks for the review and comments.
>
>>> OK and can you back port to 5 ?
>
> Please find attached the backported patch on gcc-5-branch.
>
> Regression tested on AArch64 without any issues.
>
> 2015-09-28  Andrew Pinski  
>
> ChangeLog
>
> * config/aarch64/aarch64.md (prefetch):
> Change the predicate of operand 0 to register_operand.

Thank you, please commit it if you have not already.
/M


Re: [AArch64] Improve TLS Descriptor pattern to release RTL loop IV opt

2015-09-28 Thread Marcus Shawcroft
On 28 July 2015 at 14:12, Jiong Wang  wrote:
>
> The instruction sequences for preparing argument for TLS descriptor
> runtime resolver and the later function call to resolver can actually be
> hoisted out of the loop.
>
> Currently we can't because we have exposed the hard register X0 as
> destination of "set".  While GCC's RTL data flow infrastructure will
> skip or do very conservative assumption when hard register involved in
> and thus some loop IV opportunities are missed.

This patch feels like we are botching the back end to over come a
limitation in RTL IV opt.  Isn't the real solution in RTL IV opt ?

/Marcus

>
> This patch add another "tlsdesc_small_pseudo_" pattern, and avoid
> expose x0 to gcc generic code.
>
> Generally, we define a new register class FIXED_R0 which only contains 
> register
> 0, so the instruction sequences generated from the new add pattern is the same
> as tlsdesc_small_, while the operand 0 is wrapped as pseudo register 
> that
> RTL IV opt can handle it.
>
> Ideally, we should allow operand 0 to be any pseudo register, but then
> we can't model the override of x0 caused by the function call which is
> hidded by the UNSPEC.
>
> So here, we restricting operand 0 to be x0, the override of x0 can be
> reflected to the gcc.
>
> OK for trunk?
>
> 2015-07-28  Ramana Radhakrishnan  
> Jiong Wang  
>
> gcc/
>   * config/aarch64/aarch64.d (tlsdesc_small_pseudo_): New pattern.
>   * config/aarch64/aarch64.h (reg_class): New enumeration FIXED_REG0.
>   (REG_CLASS_NAMES): Likewise.
>   (REG_CLASS_CONTENTS): Likewise.
>   * config/aarch64/aarch64.c (aarch64_class_max_nregs): Likewise.
>   (aarch64_register_move_cost): Likewise.
>   (aarch64_load_symref_appropriately): Invoke the new added pattern if
>   possible.
>   * config/aarch64/constraints.md (Uc0): New constraint.
>
> gcc/testsuite.
>   * gcc.target/aarch64/tlsdesc_hoist.c: New testcase.
>
> --
> Regards,
> Jiong
>


Re: [Patch 1/2 AArch64/ARM] Give AArch64 ROR (Immediate) a new type attribute

2015-09-25 Thread Marcus Shawcroft
On 25 September 2015 at 14:19, James Greenhalgh
 wrote:

> 2015-09-25  James Greenhalgh  
>
> * config/arm/types.md (type): Add rotate_imm.
> * config/aarch64/aarch64.md (*ror3_insn): Split out the
> ROR immediate case.
> (*rorsi3_insn_uxtw): Likewise.
> * config/aarch64/thunderx.md (thunderx_shift): Add rotate_imm.
> * config/arm/cortex-a53.md (cortex_a53_alu_shift): Add rotate_imm.
> * config/arm/cortex-a57.md (cortex_a53_alu): Add rotate_imm.

OK with me.  /Marcus


Re: [AArch64] Fix Prefetch ICE

2015-09-24 Thread Marcus Shawcroft
On 24 September 2015 at 07:47, Hurugalawadi, Naveen
 wrote:
> Hi,
>
> Please find attached the patch that fixes an ICE for prefetch.
>
> The predicate is too lose for the constraints. Hence, the patch tightens
> up the predicate to be exactly as constraint  allows, avoids a “reload”
> and allows better code generation.
>
> Submitted on behalf of Andrew Pinski.
>
> Thanks,
> Naveen
>
> 2015-09-24  Andrew Pinski  
>
> ChangeLog
>
> * config/aarch64/aarch64.md (prefetch):
> Change the predicate of operand 0 to register_operand.

Hi, OK and can you back port to 5 ? Thanks /Marcus


Re: [AArch64] Fix vcvt_high_f64_f32 and vcvt_figh_f32_f64 intrinsics.

2015-09-22 Thread Marcus Shawcroft
On 21 September 2015 at 15:38, James Greenhalgh
 wrote:

> ---
> gcc/
>
> 2015-09-21  James Greenhalgh  
>
> * config/aarch64/aarch64-simd.md
>
> (aarch64_float_truncate_hi_v4sf): Rewrite as an expand.
> (aarch64_float_truncate_hi_v4sf_le): New.
> (aarch64_float_truncate_hi_v4sf_be): Likewise.
>
> gcc/testsuite/
>
> 2015-09-21  James Greenhalgh  
>
> * gcc.target/aarch64/advsimd-intrinsics/vcvt_high_1.c: New.
>

+#include 
+

I don't think this include is required, otherwise OK /Marcus


Re: [AArch64/testsuite] Add more TLS local executable testcases

2015-09-22 Thread Marcus Shawcroft
On 26 August 2015 at 14:58, Jiong Wang  wrote:
>
> This patch cover tlsle tiny model tests, tls size truncation for tiny &
> small model included also.
>
> All testcases pass native test.
>
> OK for trunk?
>
> 2015-08-26  Jiong Wang  
>
> gcc/testsuite/
>   * gcc.target/aarch64/tlsle12_tiny_1.c: New testcase for tiny model.
>   * gcc.target/aarch64/tlsle24_tiny_1.c: Likewise.
>   * gcc.target/aarch64/tlsle_sizeadj_tiny_1.c: TLS size truncation test
>   for tiny model.
>   * gcc.target/aarch64/tlsle_sizeadj_small_1.c: TLS size truncation test
>   for small model.

Hi Jiong
These tests are fragile if the test suite is run with an explicit
-mcmodel=* option.  I think they need appropriate dg-skip-if
directives to ensure they only run with the expected model.
/Marcus


Re: [PATCH 14/15][ARM/AArch64 Testsuite]Add test of vcvt{,_high}_i{f32_f16,f16_f32}

2015-09-22 Thread Marcus Shawcroft
On 25 August 2015 at 14:57, Alan Lawrence  wrote:
> Sorry - wrong version posted. The hunk for add_options_for_arm_neon_fp16 has
> moved to the previous patch! This version also fixes some whitespace issues.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/advsimd-intrinsics/vcvt_f16.c: New.
> * lib/target-supports.exp
> (check_effective_target_arm_neon_fp16_hw_ok): New.
> ---
>  .../aarch64/advsimd-intrinsics/vcvt_f16.c  | 98 
> ++
>  gcc/testsuite/lib/target-supports.exp  | 15 
>  2 files changed, 113 insertions(+)
>  create mode 100644 
> gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_f16.c
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_f16.c 
> b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_f16.c
> new file mode 100644
> index 000..a2cfd38
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_f16.c

> +#include 

We should avoid dependencies on external header files.  In this case I
can't see that anything in math.h is needed any way. Otherwise OK with
me.

/Marcus


Re: [AArch64] Handle const address in aarch64_print_operand

2015-09-22 Thread Marcus Shawcroft
On 8 September 2015 at 16:03, Jiong Wang  wrote:

> 2015-09-08  Jiong Wang  
>
> gcc/
>   * config/aarch64/aarch64.c (aarch64_print_operand): Add "CONST"
>   support.

OK /Marcus


Re: [Aarch64][target/PR 67143][5.2] Backport correct constraints for atomic operations.

2015-09-22 Thread Marcus Shawcroft
On 9 September 2015 at 12:43, Matthew Wahab  wrote:

> gcc/
> 2015-09-09  Matthew Wahab  
>
> Backport from mainline
> 2015-08-14  Matthew Wahab  
>
> PR target/67143
> * config/aarch64/atomics.md (atomic_): Replace
> 'lconst_atomic' with 'const_atomic'.
> (atomic_fetch_): Likewise.
> (atomic__fetch): Likewise.
> * config/aarch64/iterators.md (lconst-atomic): Move below
> 'const_atomic'.
> (const_atomic): New.
>
> gcc/testsuite/
> 2015-09-09  Matthew Wahab  
>
> Backport from mainline
> 2015-08-14  Matthew Wahab  
> Matthias Klose  
>
> PR target/67143
> * gcc.c-torture/compile/pr67143.c: New
> * gcc.target/aarch64/atomic-op-imm.c
> (atomic_fetch_add_negative_RELAXED): New.
> (atomic_fetch_sub_negative_ACQUIRE): New.

OK for gcc-5 branch /Marcus


Re: [AArch64] Delete aarch64_symbol_context which is not used

2015-09-22 Thread Marcus Shawcroft
On 8 September 2015 at 16:00, Jiong Wang  wrote:
>
> The concept of aarch64_symbol_context is not used in AArch64, this patch
> remove it and all relevant code.
>
> ok for trunk?
>
> 2015-09-08  Jiong. Wang  
>
> gcc/
>   * config/aarch64/aarch64-protos.h (aarch64_symbol_context): Delete.
>   * config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Likewise.
>   (aarch64_cannot_force_const_mem): Likewise.
>   (aarch64_classify_address): Likewise.
>   (aarch64_classify_symbolic_expression): Likewise.
>   (aarch64_print_operand): Likewise.
>   (aarch64_classify_symbol): Likewise.
>   (aarch64_mov_operand_p): Likewise.
>   * config/aarch64/predicates.md (aarch64_valid_symref): Likewise.
>   (aarch64_mov_operand): Likewise.

OK /Marcus


Re: [PATCH] 2015-09-03 Benedikt Huber <benedikt.hu...@theobroma-systems.com> Philipp Tomsich <philipp.toms...@theobroma-systems.com>

2015-09-21 Thread Marcus Shawcroft
Hi,

Thanks for your work on this.  There are a bunch of predominantly
style nits in line below.  My none nit comments on this patch are:

This should be left turned off for all cores where we have not seen
benchmark numbers to indicate that this optimization is a benefit, we
can take patches for each core in the future once numbers are
available.

Given the absence of numbers for some of the cores and the significant
impact on one core this should remain disabled by default on generic.

We don't necessarily want a proliferation of user orientated tuning
options without good reason, instead the per core default tuning
behaviour should make the right choice per core, hence I think -mrecip
should be dropped.  We already have the -moverride= mechanism to
provide a developer orientated mechanism to override the per core
tuning flag.

Further comments inline

Thanks

/Marcus

On 7 September 2015 at 11:40, Benedikt Huber
 wrote:


--- a/gcc/config/aarch64/aarch64-tuning-flags.def
+++ b/gcc/config/aarch64/aarch64-tuning-flags.def
@@ -29,4 +29,5 @@
  AARCH64_TUNE_ to give an enum name. */

 AARCH64_EXTRA_TUNING_OPTION ("rename_fma_regs", RENAME_FMA_REGS)
+AARCH64_EXTRA_TUNING_OPTION ("mrecip_default_enabled", MRECIP_DEFAULT_ENABLED)

This name is exposed via the -moverride=tune= option, perhaps a
better name would be:
+AARCH64_EXTRA_TUNING_OPTION ("recip_sqrt", RECIP_SQRT)


> +/* Add builtins for reciprocal square root. */
> +void
> +aarch64_add_builtin_rsqrt (void)
> +{
> +  tree fndecl = NULL;
> +  tree ftype = NULL;
> +
> +  tree V2SF_type_node = build_vector_type (float_type_node, 2);
> +  tree V2DF_type_node = build_vector_type (double_type_node, 2);
> +  tree V4SF_type_node = build_vector_type (float_type_node, 4);
> +
> +  ftype = build_function_type_list (double_type_node, double_type_node, 
> NULL_TREE);

Line length exceeds 80 characters.

> +  fndecl = add_builtin_function ("__builtin_aarch64_rsqrt_df",
> +ftype, AARCH64_BUILTIN_RSQRT_DF, BUILT_IN_MD, NULL, NULL_TREE);
> +  aarch64_builtin_decls[AARCH64_BUILTIN_RSQRT_DF] = fndecl;
> +
> +  ftype = build_function_type_list (float_type_node, float_type_node, 
> NULL_TREE);

Line length exceeds 80 characters.

> +/* Function to expand reciprocal square root builtins. */
> +static rtx
> +aarch64_expand_builtin_rsqrt (int fcode, tree exp, rtx target)
> +{
> +  rtx pat;
> +  tree arg0 = CALL_EXPR_ARG (exp, 0);
> +  rtx op0 = expand_normal (arg0);
> +
> +  enum insn_code c;
> +
> +  switch (fcode)
> +{
> +  case AARCH64_BUILTIN_RSQRT_DF:
> +c = CODE_FOR_rsqrt_df2; break;

Leading blocks of 8 spaces should be TAB's, likewise the rest of the patch.

> +/* Return builtin for reciprocal square root. */
> +tree
> +aarch64_builtin_rsqrt (unsigned int fn, bool md_fn)

Blank line between function comments and function, likewise through
the rest of the patch please.

> diff --git a/gcc/config/aarch64/aarch64-opts.h 
> b/gcc/config/aarch64/aarch64-opts.h
> index bf6bb7b..f8e79cb 100644
> --- a/gcc/config/aarch64/aarch64-opts.h
> +++ b/gcc/config/aarch64/aarch64-opts.h
> @@ -73,4 +73,11 @@ enum aarch64_code_model {
>AARCH64_CMODEL_LARGE
>  };
>
> +/* Each core can have -mrecip enabled or disabled by default. */
> +enum aarch64_mrecip {

{ on new line please.

> +  AARCH64_MRECIP_OFF = 0,
> +  AARCH64_MRECIP_ON,
> +  AARCH64_MRECIP_DEFAULT,

Trailing , will give a "comma at end of enumerator list" warning every
time this file is included,  drop the comma please.

> +/* Function to decide when to use
> + * reciprocal square root builtins. */

Drop the leading * on follow on comment lines, here and elsewhere in
the patch please.

> +/* Select reciprocal square root initial estimate
> + * insn depending on machine mode. */
> +rsqrte_type get_rsqrte_type (enum machine_mode mode)

New line between type and function name please, likewise for other
instance in this patch.

> +  for (int i = 0; i < iterations; ++i)
> +{
> +  rtx x1 = gen_reg_rtx (mode);
> +  rtx x2 = gen_reg_rtx (mode);
> +  rtx x3 = gen_reg_rtx (mode);
> +  emit_set_insn (x2, gen_rtx_MULT (mode, x0, x0));
> +
> +  emit_insn ((*get_rsqrts_type (mode)) (x3, xsrc, x2));
> +
> +  emit_set_insn (x1, gen_rtx_MULT (mode, x0, x3));
> +  x0 = x1;
> +}
> +
> +  emit_move_insn (dst, x0);
> +  return;

Superflous return, drop it please.

> +#undef TARGET_BUILTIN_RECIPROCAL
> +#define TARGET_BUILTIN_RECIPROCAL aarch64_builtin_reciprocal
> +

The rest of these TARGET_* defines are in alphabetical order, please
insert the new one in order.

> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> +
> +mlow-precision-recip-sqrt
> +Common Var(flag_mrecip_low_precision_sqrt) Optimization
> +Run fewer approximation steps to reduce latency and precision.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi

> +@item -mlow-precision-recip-sqrt
> +@item -mno-low-precision-recip-sqrt
> +@opindex 

Re: [AArch64] Implement copysign[ds]f3

2015-09-16 Thread Marcus Shawcroft
On 16 September 2015 at 08:40, James Greenhalgh
 wrote:
>
> Hi,
>
> This patch adds expanders for copysigndf3 and copysignsf3 to the AArch64
> backend. These use the BSL/BIT/BIF insn to save us from the default
> expansion pattern.
>
> Bootstrapped on aarch64-none-linux-gnu with no issues, and checked
> the performance to show a slight improvement to FP routines using
> copysign or SIGN from fortran.
>
> OK?
>
> Thanks,
> James
>
> ---
> gcc/
>
> 2015-09-16  James Greenhalgh  
>
> * config/aarch64/aarch64.md (copysigndf3): New.
> (copysignsf3): Likewise.
>
> gcc/testsuite/
>
> 2015-09-16  James Greenhalgh  
>
> * gcc.target/aarch64/copysign_1.c: New.
> * gcc.target/aarch64/copysign_2.c: New.
>

OK /Marcus


Re: [RFC AArch64][PR 63304] Handle literal pools for functions 1 MiB in size.

2015-08-27 Thread Marcus Shawcroft
On 27 July 2015 at 15:33, Ramana Radhakrishnan
ramana.radhakrish...@foss.arm.com wrote:

 DATE  Ramana Radhakrishnan  ramana.radhakrish...@arm.com

 PR target/63304
 * config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Handle
 nopcrelative_literal_loads.
 (aarch64_classify_address): Likewise.
 (aarch64_constant_pool_reload_icode): Define.
 (aarch64_secondary_reload): Handle secondary reloads for
 literal pools.
 (aarch64_override_options): Handle nopcrelative_literal_loads.
 (aarch64_classify_symbol): Handle nopcrelative_literal_loads.
 * config/aarch64/aarch64.md 
 (aarch64_reload_movcpALLTF:modeP:mode):
 Define.
 (aarch64_reload_movcpVALL:modeP:mode): Likewise.
 * config/aarch64/aarch64.opt: New option mnopc-relative-literal-loads
 * config/aarch64/predicates.md (aarch64_constant_pool_symref): New
 predicate.
 * doc/invoke.texi (mnopc-relative-literal-loads): Document.

This looks OK to me. It needs rebasing, but OK if the rebase is
trival.  Default on is fine.  Hold off on the back ports for a couple
of weeks.
Cheers
/Marcus


Re: [PATCH][AARCH64]Fix for branch offsets over 1 MiB

2015-08-26 Thread Marcus Shawcroft
On 25 August 2015 at 14:12, Andre Vieira andre.simoesdiasvie...@arm.com wrote:

 gcc/ChangeLog:
 2015-08-07  Ramana Radhakrishnan  ramana.radhakrish...@arm.com
 Andre Vieira  andre.simoesdiasvie...@arm.com

 * config/aarch64/aarch64.md (*condjump): Handle functions  1 Mib.
 (*cboptabmode1): Likewise.
 (*tboptabmode1): Likewise.
 (*cboptabmode1): Likewise.
 * config/aarch64/iterators.md (inv_cb): New code attribute.
 (inv_tb): Likewise.
 * config/aarch64/aarch64.c (aarch64_gen_far_branch): New.
 * config/aarch64/aarch64-protos.h (aarch64_gen_far_branch): New.

 gcc/testsuite/ChangeLog:
 2015-08-07  Andre Vieira  andre.simoesdiasvie...@arm.com

 * gcc.target/aarch64/long_branch_1.c: New test.

OK /Marcus


Re: [AArch64][TLSLE][1/3] Add the option -mtls-size for AArch64

2015-08-26 Thread Marcus Shawcroft
On 25 August 2015 at 15:15, Jiong Wang jiong.w...@arm.com wrote:

 2015-08-25  Jiong Wang  jiong.w...@arm.com

 gcc/
   * config/aarch64/aarch64.opt (mtls-size): New entry.
   * config/aarch64/aarch64.c (initialize_aarch64_tls_size): New function.
   (aarch64_override_options_internal): Call initialize_aarch64_tls_size.
   * doc/invoke.texi (AArch64 Options): Document -mtls-size.


OK Thanks /Marcus


Re: [AArch64] [TLSIE][1/2] Rename test source file for reuse

2015-08-25 Thread Marcus Shawcroft
On 19 June 2015 at 10:15, Jiong Wang jiong.w...@arm.com wrote:

 Rename test source from tlsle.c into tls.c for reuse purpose.

 tls.c will be used as test source file for all TLS test, we just need to
 specify different tls options in different testcases.

 2015-06-19  Jiong Wang  jiong.w...@arm.com

 gcc/testsuite/
   * gcc.target/aarch64/tlsle.c: Rename to tls.c
   * gcc.target/aarch64/aarch64/tlsle12.c: Update source file name.
   * gcc.target/aarch64/aarch64/tlsle24.c: Ditto.
   * gcc.target/aarch64/aarch64/tlsle32.c: Ditto.

OK Thanks /Marcus


Re: [AArch64] [TLSIE][2/2] Implement TLS IE for tiny model

2015-08-25 Thread Marcus Shawcroft
On 19 June 2015 at 10:15, Jiong Wang jiong.w...@arm.com wrote:

 Currently, TLS IE is supported on small model only. This patch implement
 TLS Initial-exec model support for AArch64 tiny memory model.

 Under tiny model, we only allow 1M loadable segment size, one single ldr
 instruction is enough for addressing the got entry for TLS IE directly.

 The code sequence is:

 A:  mrs  tp, tpidr_el0
 B0: ldr  t0, :gottprel:x1  R_AARCH64_TLSIE_LD_GOTTPREL_PREL19  x1
 B1: add  t0, t0, tp

 B0 and B1 should not be scheduled, as the pattern will be recognized
 later for linker IE model to LE model optimization.

 2015-06-19  Marcus Shawcroft marcus.shawcr...@arm.com
 Jiong Wang  jiong.w...@arm.com

 gcc/
   * config/aarch64/aarch64.md (UNSPEC_GOTTINYTLS): New UNSPEC.
   (tlsie_tiny_mode): New define_insn.
   (tlsie_tiny_sidi): Ditto.
   * config/aarch64/aarch64-protos.h (aarch64_symbol_type): Define
   SYMBOL_TINY_TLSIE.
   (aarch64_symbol_context): New comment for SYMBOL_TINY_TLSIE.
   * config/aarch64/aarch64.c (aarch64_load_symref_appropriately): Support
   SYMBOL_TINY_TLSIE.
   (aarch64_expand_mov_immediate): Ditto.
   (aarch64_print_operand): Ditto.
   (arch64_classify_tls_symbol): Ditto.

 gcc/testsuite/
   * gcc.target/aarch64/tlsie_tiny.c: New testcase.

 --
 Regards,
 Jiong


OK /Marcus


Re: [AArch64][TLSLE][1/3] Add the option -mtls-size for AArch64

2015-08-25 Thread Marcus Shawcroft
On 19 August 2015 at 15:26, Jiong Wang jiong.w...@arm.com wrote:

 2015-08-19  Jiong Wang  jiong.w...@arm.com

 gcc/
   * config/aarch64/aarch64.opt (mtls-size): New entry.
   * config/aarch64/aarch64.c (initialize_aarch64_tls_size): New function.
   (aarch64_override_options_internal): Call initialize_aarch64_tls_size.
   * doc/invoke.texi (AArch64 Options): Document -mtls-size.

 --
 Regards,
 Jiong


+case AARCH64_CMODEL_TINY:
+  /* The maximum TLS size allowed under tiny is 1M.  */
+  if (aarch64_tls_size  20)
+ aarch64_tls_size = 20;

The only valid values of aarch64_tls_size handled/expected by the
remainder of the patch set is 12,24,32,48 so setting the value to 20
here doesn;t make sense.

/Marcus


Re: [AArch64][TLSLE][2/3] Add the option -mtls-size for AArch64

2015-08-25 Thread Marcus Shawcroft
 2015-08-19  Jiong Wang  jiong.w...@arm.com

 gcc/
   * config/aarch64/aarch64-protos.h (aarch64_symbol_type): Rename
   SYMBOL_TLSLE to SYMBOL_TLSLE24.
   * config/aarch64/aarch64.c (aarch64_load_symref_appropriately): Likewise
   (aarch64_expand_mov_immediate): Likewise
   (aarch64_print_operand): Likewise
   (aarch64_classify_symbol): Likewise


OK /Marcus


Re: [AArch64][TLSLE][3/3] Implement local executable mode for all memory model

2015-08-25 Thread Marcus Shawcroft
 2015-08-19  Marcus Shawcroft  marcus.shawcr...@arm.com
 Jiong Wang  jiong.w...@arm.com
 gcc/
   * config/aarch64/aarch64.c (initialize_aarch64_tls_size): Set default
   tls size for tiny, small, large memory model.
   (aarch64_load_symref_appropriately): Support new symbol types.
   (aarch64_expand_mov_immediate): Likewise.
   (aarch64_print_operand): Likewise.
   (aarch64_classify_tls_symbol): Likewise.
   * config/aarch64/aarch64-protos.h (aarch64_symbol_context): Likewise.
   (aarch64_symbol_type): Likewise.
   * config/aarch64/aarch64.md (tlsle): Deleted.
   (tlsle12_mode): New define_insn.
   (tlsle24_mode): Likewise.
   (tlsle32_mode): Likewise.
   (tlsle48_mode): Likewise.
   * doc/sourcebuild.texi (AArch64-specific attributes): Document
   aarch64_tlsle32.

 gcc/testsuite/
   * lib/target-supports.exp (check_effective_target_aarch64_tlsle32):
   New test directive.
   * gcc.target/aarch64/tlsle_1.x: New test source.
   * gcc.target/aarch64/tlsle12.c: New testcase.
   * gcc.target/aarch64/tlsle24.c: New testcase.
   * gcc.target/aarch64/tlsle32.c: New testcase.
 --

OK /Marcus


Re: [AArch64] Break -mcpu tie between the compiler and assembler

2015-08-20 Thread Marcus Shawcroft
On 20 August 2015 at 09:15, James Greenhalgh james.greenha...@arm.com wrote:

 2015-08-19  James Greenhalgh  james.greenha...@arm.com

 * common/config/aarch64/aarch64-common.c
 (AARCH64_CPU_NAME_LENGTH): Delete.
 (aarch64_option_extension): New.
 (all_extensions): Likewise.
 (processor_name_to_arch): Likewise.
 (arch_to_arch_name): Likewise.
 (all_cores): New.
 (all_architectures): Likewise.
 (aarch64_get_extension_string_for_isa_flags): Likewise.
 (aarch64_rewrite_selected_cpu): Change to rewrite CPU names to
 architecture names.
 * config/aarch64/aarch64-protos.h
 (aarch64_get_extension_string_for_isa_flags): New.
 * config/aarch64/aarch64.c (aarch64_print_extension): Delete.
 (aarch64_option_print): Get the string to print from
 aarch64_get_extension_string_for_isa_flags.
 (aarch64_declare_function_name): Likewise.
 * config/aarch64/aarch64.h (BIG_LITTLE_SPEC): Rename to...
 (MCPU_TO_MARCH_SPEC): This.
 (ASM_CPU_SPEC): Use it.
 (BIG_LITTLE_SPEC_FUNCTIONS): Rename to...
 (MCPU_TO_MARCH_SPEC_FUNCTIONS): ...This.
 (EXTRA_SPEC_FUNCTIONS): Use it.

OK /Marcus


Re: [Patch] Add to the libgfortran/newlib bodge to detect ftruncate support in ARM/AArch64/SH

2015-08-20 Thread Marcus Shawcroft
On 20 August 2015 at 09:31, James Greenhalgh james.greenha...@arm.com wrote:

 Hi,

 Steve's patch in 2013 [1] to fix the MIPS newlib/libgfortran build
 causes subtle issues for an ARM/AArch64 newlib/libgfortran build. The
 problem is that ARM/AArch64 (and SH) define a stub function for
 ftruncate, which we would previously have auto-detected, but which is not
 part of the hardwiring Steve added.

 Continuing the tradition of building bodge on bodge on bodge, this patch
 hardwires HAVE_FTRUNCATE on for ARM/AArch64/SH, which does fix the issue
 I was seeing.

This is the second breakage I'm aware of due to the introduction of
this hardwire code, the first being related to strtold.  My
recollection is that it is only the mips target that requires the
newlib API hardwiring. Ideally we should rely only on the
AC_CHECK_FUNCS_ONCE probe code and avoid the hardwire entirely.

Perhaps a better approach for trunk would be something along the lines of:

case ${host}--x${with_newlib} in
mips*--xyes)
  hardwire_newlib=1;;
esac
if test ${hardwire_newlib:-0} -eq 1; then
  ... existing AC_DEFINES hardwire code
else
  ... existing AC_CHECK_FUNCS_ONCE probe code
fi

In effect limiting the hardwire to just the target which is unable to
probe.  For backport to 4.9 and 5 I think James' more conservative
patch is probably more appropriate.

What do folks think?

Cheers
/Marcus


Re: ira.c update_equiv_regs patch causes gcc/testsuite/gcc.target/arm/pr43920-2.c regression

2015-08-18 Thread Marcus Shawcroft
On 18 August 2015 at 10:25, Alex Velenko alex.vele...@arm.com wrote:


 On 31/07/15 12:04, Alex Velenko wrote:

 On 29/07/15 23:14, Jeff Law wrote:

 On 07/28/2015 12:18 PM, Alex Velenko wrote:

 On 21/04/15 06:27, Jeff Law wrote:

 On 04/20/2015 01:09 AM, Shiva Chen wrote:

 Hi, Jeff

 Thanks for your advice.

 can_replace_by.patch is the new patch to handle both cases.

 pr43920-2.c.244r.jump2.ori is the original  jump2 rtl dump

 pr43920-2.c.244r.jump2.patch_can_replace_by is the jump2 rtl dump
 after patch  can_replace_by.patch

 Could you help me to review the patch?

 Thanks.  This looks pretty good.

 I expanded the comment for the new function a bit and renamed the
 function in an effort to clarify its purpose.  From reviewing
 can_replace_by, it seems it should have been handling this case, but
 clearly wasn't due to implementation details.

 I then bootstrapped and regression tested the patch on x86_64-linux-gnu
 where it passed.  I also instrumented that compiler to see how often
 this code triggers.  During a bootstrap it triggers a couple hundred
 times (which is obviously a proxy for cross jumping improvements).  So
 it's triggering regularly on x86_64, which is good.

 I also verified that this fixes BZ64916 for an arm-non-eabi toolchain
 configured with --with-arch=armv7.

 Installed on the trunk.  No new testcase as it's covered by existing
 tests.

 Thanks,,
 jeff


 Hi,
 I see this patch been committed in r56 on trunk. Is it okay to port
 this to fsf-5?

 It's not a regression, so backporting it would be generally frowned
 upon.  If you feel strongly about it, you should ask Jakub, Joseph or
 Richi (the release managers) for an exception to the general policy.

 jeff

 Hi Jakub,
 Can this commit be ported to fsf-5? It fixed gcc.target/arm/pr43920-2.c
 at the time, so I think it is a good idea to port. Please, see
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64916
 Kind regards,
 Alex


 Ping!

 Currently this test is passed on fsf-trunk, but not passed on fsf-5, so I
 think it is a regression on fsf-5:

That does not make it a regression, it is only a regression if a
version prior to 5 passes, how does this test behave on 4.9?

Cheers
/Marcus


Re: ira.c update_equiv_regs patch causes gcc/testsuite/gcc.target/arm/pr43920-2.c regression

2015-08-18 Thread Marcus Shawcroft
On 18 August 2015 at 10:25, Alex Velenko alex.vele...@arm.com wrote:


 On 31/07/15 12:04, Alex Velenko wrote:

 On 29/07/15 23:14, Jeff Law wrote:

 On 07/28/2015 12:18 PM, Alex Velenko wrote:

 On 21/04/15 06:27, Jeff Law wrote:

 On 04/20/2015 01:09 AM, Shiva Chen wrote:

 Hi, Jeff

 Thanks for your advice.

 can_replace_by.patch is the new patch to handle both cases.

 pr43920-2.c.244r.jump2.ori is the original  jump2 rtl dump

 pr43920-2.c.244r.jump2.patch_can_replace_by is the jump2 rtl dump
 after patch  can_replace_by.patch

 Could you help me to review the patch?

 Thanks.  This looks pretty good.

 I expanded the comment for the new function a bit and renamed the
 function in an effort to clarify its purpose.  From reviewing
 can_replace_by, it seems it should have been handling this case, but
 clearly wasn't due to implementation details.

 I then bootstrapped and regression tested the patch on x86_64-linux-gnu
 where it passed.  I also instrumented that compiler to see how often
 this code triggers.  During a bootstrap it triggers a couple hundred
 times (which is obviously a proxy for cross jumping improvements).  So
 it's triggering regularly on x86_64, which is good.

 I also verified that this fixes BZ64916 for an arm-non-eabi toolchain
 configured with --with-arch=armv7.

 Installed on the trunk.  No new testcase as it's covered by existing
 tests.

 Thanks,,
 jeff


 Hi,
 I see this patch been committed in r56 on trunk. Is it okay to port
 this to fsf-5?

 It's not a regression, so backporting it would be generally frowned
 upon.  If you feel strongly about it, you should ask Jakub, Joseph or
 Richi (the release managers) for an exception to the general policy.

 jeff

 Hi Jakub,
 Can this commit be ported to fsf-5? It fixed gcc.target/arm/pr43920-2.c
 at the time, so I think it is a good idea to port. Please, see
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64916
 Kind regards,
 Alex


 Ping!

 Currently this test is passed on fsf-trunk, but not passed on fsf-5, so I
 think it is a regression on fsf-5:

That does not make it a regression, it is only a regression if a
version prior to 5 passes, how does this test behave on 4.9?

Cheers
/Marcus


Re: [Aarch64] Adjust tests to take LSE extension into account.

2015-08-18 Thread Marcus Shawcroft
On 18 August 2015 at 09:51, Matthew Wahab matthew.wa...@foss.arm.com wrote:

 gcc/testsuite
 2015-08-18  Matthew Wahab  matthew.wa...@arm.com

 * gcc.target/aarch64/atomic-comp-swap-release-acquire.c: Adjust
 dg-options to disable LSE extensions.
 * gcc.target/aarch64/atomic-op-acq_rel.c: Likewise.
 * gcc.target/aarch64/atomic-op-acquire.c: Likewise.
 * gcc.target/aarch64/atomic-op-char.c: Likewise.
 * gcc.target/aarch64/atomic-op-consume.c: Likewise.
 * gcc.target/aarch64/atomic-op-imm.c: Likewise.
 * gcc.target/aarch64/atomic-op-int.c: Likewise.
 * gcc.target/aarch64/atomic-op-long.c: Likewise.
 * gcc.target/aarch64/atomic-op-relaxed.c: Likewise.
 * gcc.target/aarch64/atomic-op-seq_cst.c: Likewise.
 * gcc.target/aarch64/atomic-op-release.c: Likewise.

 * gcc.target/aarch64/atomic-op-short.c: Likewise.
 * gcc.target/aarch64/sync-comp-swap.c: Likewise.
 * gcc.target/aarch64/sync-op-acquire.c: Likewise.
 * gcc.target/aarch64/sync-op-full.c: Likewise.
 * gcc.target/aarch64/sync-op-release.c: Likewise.



OK /Marcus


Re: [PATCH][AARCH64]Add backend combine_bfi pattern.

2015-08-18 Thread Marcus Shawcroft
On 5 August 2015 at 17:46, Renlin Li renlin...@arm.com wrote:
 Hi Kyrill,


 On 30/07/15 17:08, Kyrill Tkachov wrote:

 Hi Renlin,

 On 30/07/15 16:50, Renlin Li wrote:

 Hi all,

 This insn should match the following similar rtx pattern and remove the
 redundant zero_extend operation if the width of zero_extract and
 inner-size of zero_extend totally match.

 (set (zero_extract:SI (reg/i:SI 0 x0)
  (const_int 8 [0x8])
  (const_int 0 [0]))
 (zero_extend:SI (reg:QI 1 x1 [ y ])))


 aarch64-none-elf regression tests Okay. Okay to commit?

 Regards,
 Renlin

 gcc/ChangeLog:

 2015-07-30  Renlin Li  renlin...@arm.com

* config/aarch64/aarch64.md (combine_bfi): New pattern.

 gcc/testsuite/ChangeLog:

 2015-07-30  Renlin Li  renlin...@arm.com

* gcc.target/aarch64/combine-bfi.c: New.

 +(define_insn *combine_bfiGPI:modeALLX:mode
 +  [(set (zero_extract:GPI (match_operand:GPI 0 register_operand +r)
 + (match_operand 1 const_int_operand n)
 + (match_operand 2 const_int_operand n))
 +   (zero_extend:GPI (match_operand:ALLX 3  register_operand r)))]
 +  UINTVAL (operands[1]) == ALLX:sizen
 +  bfi\\t%w0, %w3, %2, %1
 +  [(set_attr type bfm)]
 +)

 I notice we don't have any other patterns in aarch64 that start with
 combine_*.
 Would it be better to name them something like
 *aarch64_bfiGPI:modeALLX:mode4 instead?

 Thanks for the suggestion. I have adjust the patch accordingly.

 Regards,
 Renlin



 gcc/ChangeLog:

 2015-08-05  Renlin Li  renlin...@arm.com

 * config/aarch64/aarch64.md
   (*aarch64_bfiGPI:modeALLX:mode4): New pattern.

 gcc/testsuite/ChangeLog:

 2015-08-05  Renlin Li  renlin...@arm.com

 * gcc.target/aarch64/combine-bfi.c: New.

For new test cases in this directory, follow the guidance given on the
wiki here https://gcc.gnu.org/wiki/TestCaseWriting, specifically add
the suffix _N, hence combine_bfi_1.c

OK with that change.
/Marcus


Re: [PATCH][AARCH64] Make arm_align_max_stack_pwr.c and arm_align_max_pwr.c compile testcase, instead of execution.

2015-08-18 Thread Marcus Shawcroft
On 28 July 2015 at 16:51, Renlin Li renlin...@arm.com wrote:

 2015-07-28  Renlin Li  renlin...@arm.com

 * gcc.target/aarch64/arm_align_max_pwr.c: Make it a compile test case,
 check the assembly.
 * gcc.target/aarch64/arm_align_max_stack_pwr.c: Likewise.


Hi,

 #include stdio.h
 #include assert.h

Test cases should not rely on external headers, see
https://gcc.gnu.org/wiki/HowToPrepareATestcase and it looks like
neither of these headers are actually required so OK with the includes
removed (and tested).

Cheers
/Marcus


Re: [PATCH][AArch64][8/14] Implement TARGET_OPTION_VALID_ATTRIBUTE_P

2015-07-24 Thread Marcus Shawcroft
On 21 July 2015 at 16:37, James Greenhalgh james.greenha...@arm.com wrote:
 On Thu, Jul 16, 2015 at 04:20:59PM +0100, Kyrill Tkachov wrote:

 +static bool
 +aarch64_process_one_target_attr (char *arg_str, const char* pragma_or_attr)
 +{
 +  bool ret;
 +  bool invert = false;
 +
 +  int len = strlen (arg_str);
 +
 +  if (len == 0)
 +{
 +  error (malformed target %s, pragma_or_attr);
 +  return false;
 +}
 +
 +  char *str_to_check = (char *) alloca (len + 1);

 Seems to go against your approach earlier in the patch series of XSTRDUP,
 it would be nice to stay consistent.

Agreed, we should be consistent.  Each of the other instances of
alloca/xstrdup that I've seen in this patch series are actually
superflous, we simply copy the string, use it then bin it, better to
remove the dup completely.

In the choice between xstrdup() and alloca(), since alloca() is an
option, better to choose alloca() and remove the need for each of the
explicit free() calls in the exit paths. The alloca() route  makes it
less likely that we accidentally introduce a space leak in the future.

Cheers
/Marcus


Re: [PATCH][AArch64] Fix LINUX_TARGET_LINK_SPEC to be consistent with ARM

2015-07-24 Thread Marcus Shawcroft
On 22 July 2015 at 18:13, Szabolcs Nagy szabolcs.n...@arm.com wrote:
 Same as
 https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01387.html
 but for AArch64.

 -dynamic-linker is only passed to the linker if !static  !shared.

 -rdynamic handling is changed too to be consistent with arm:
 only pass -export-dynamic if !static.

 2015-07-22  Szabolcs Nagy  szabolcs.n...@arm.com

 PR target/65711
 * config/aarch64/aarch64-linux.h (LINUX_TARGET_LINK_SPEC): Move
 -dynamic-linker within %{!static %{!shared, and -rdynamic within
 %{!static.

OK and I think we should backport this to 5 and 4.9

Cheers
/Marcus


Re: [PATCH][AArch64] elf toolchain does not pass -shared linker option

2015-07-24 Thread Marcus Shawcroft
On 22 July 2015 at 18:22, Szabolcs Nagy szabolcs.n...@arm.com wrote:

 2015-07-22  Szabolcs Nagy  szabolcs.n...@arm.com

 * config/aarch64/aarch64-elf-raw.h (LINK_SPEC): Handle -h, -static,
 -shared, -symbolic, -rdynamic.

OK, this should be back ported to 5 and 4.9 aswell.

Thanks
/Marcus


Re: [AArch64][1/2] Mark GOT related MEM rtx as const to help RTL loop IV

2015-07-10 Thread Marcus Shawcroft
On 7 July 2015 at 13:33, Jiong Wang jiong.w...@arm.com wrote:

 2015-07-06  Jiong Wang  jiong.w...@arm.com

 gcc/
   * config/aarch64/aarch64.c (aarch64_load_symref_appropriately): Mark mem as
   READONLY and NOTRAP for PIC symbol.

 gcc/testsuite/
   * gcc.target/aarch64/got_mem_hoist.c: New test.

Looks, OK to me.  Follow the guidance on the wiki here
https://gcc.gnu.org/wiki/TestCaseWriting when naming new test cases
and add _1 suffix.  Otherwise OK.
/Marcus


Re: [PATCH][AArch64] PR target/66731 Fix fnmul insn with -frounding-math

2015-07-06 Thread Marcus Shawcroft
On 6 July 2015 at 09:20, Szabolcs Nagy szabolcs.n...@arm.com wrote:

 2015-07-06  Szabolcs Nagy  szabolcs.n...@arm.com

 * gcc.target/aarch64/fnmul-1.c: New.
 * gcc.target/aarch64/fnmul-2.c: New.
 * gcc.target/aarch64/fnmul-3.c: New.
 * gcc.target/aarch64/fnmul-4.c: New.

+float
+foo_s (float a, float b)
+{
+   /* { dg-final { scan-assembler fnmul\\ts\[0-9\]+, s\[0-9\]+,
s\[0-9\]+ } } */
+   return -(a * b);
+}

Indentation should set at two spaces.
/Marcus


Re: [Patch ARM-AArch64/testsuite Neon intrinsics: vget_lane

2015-07-03 Thread Marcus Shawcroft
On 2 July 2015 at 14:44, Christophe Lyon christophe.l...@linaro.org wrote:
 Hi,

 Here is the missing test for ARM/AArch64 AdvSIMD intrinsic: vget_lane.

 Tested on arm, armeb, aarch64 and aarch64_be targets (using QEMU).

 The tests all pass, expect on armeb where vgetq_lane_s64 and
 vgetq_lane_u64 fail. I haven't investigated in details yet.

 OK for trunk?

 2015-07-02  Christophe Lyon  christophe.l...@linaro.org

 * gcc.target/aarch64/advsimd-intrinsics/vget_lane.c: New testcase.


OK /Marcus


Re: [AArch64][TLSLE][2/N] Rename tlsle_small to tlsle

2015-06-26 Thread Marcus Shawcroft
On 20 May 2015 at 12:19, Jiong Wang jiong.w...@arm.com wrote:

 Similar to the rename from SYMBOL_SMALL_TPREL to SYMBOL_TLSLE, this
 patch rename the rtl pattern name.

 ok for trunk?

 2015-05-19  Jiong Wang  jiong.w...@arm.com
 gcc/
   * config/aarch64/aarch64.md (tlsle_small): Rename to tlsle.
   (tlsle_small_mode): Rename to tlsle_mode.
   * config/aarc64/aarch64.c (aarch64_load_symref_appropriately): Use new
   pattern name.

OK /Marcus


Re: [AArch64][TLSLE][3/N] Add UNSPEC_TLSLE

2015-06-26 Thread Marcus Shawcroft
On 20 May 2015 at 12:21, Jiong Wang jiong.w...@arm.com wrote:

 Add new unspec name UNSPEC_TLSLE, use it for all tlsle pattern.

 ok for trunk?

 2015-05-19  Jiong Wang  jiong.w...@arm.com

 gcc/
   * config/aarch64/aarch64.md (UNSPEC_TLSLE): New enumeration.
   (tlsle): Use new unspec name.
   (tlsle_mode): Ditto.

OK /Marcus


Re: [Aarch64] Expand +rdma documentation, small changes to march and mcpu text.

2015-06-26 Thread Marcus Shawcroft
On 22 June 2015 at 17:17, Matthew Wahab matthew.wa...@arm.com wrote:
 Hello,

 The documentation for the ARMv8.1 +rdma option doesn't mention that enabling
 it
 also implies enabling Adv.SIMD. This patch fixes that.

 The documentation for the -march and -mcpu options are also a little messy,
 this
 patch tries to make the text clearer and adds a (texinfo) link to the
 subsection
 documenting the feature modifiers.

 Tested by checking the html output.

 Ok for trunk?
 Matthew

 2015-06-22  Matthew Wahab  matthew.wa...@arm.com

 * doc/invoke.texi (Aarch64 Options, -march): Split out arch and
 feature description, split out the native option, add a link to
 the feature documentation, rearrange and slightly rewrite text.
 (Aarch64 options, -mcpu): Likewise.
 (Aarch64 options, Feature Modifiers): Add an anchor.  Mention +rdma
 implies Adv. SIMD.

OK /Marcus


Re: [AArch64][TLSLE][N/N] Implement local executable mode for all memory model

2015-06-26 Thread Marcus Shawcroft
On 21 May 2015 at 17:49, Jiong Wang jiong.w...@arm.com wrote:

 2015-05-14  Jiong Wang  jiong.w...@arm.com
 gcc/
   * config/aarch64/aarch64.c (aarch64_print_operand): Support tls_size.
   * config/aarch64/aarch64.md (tlsle): Choose proper instruction
   sequences.
   (tlsle_mode): New define_insn.
   (tlsle_movsym_mode): Ditto.
   * config/aarch64/constraints.md (Uta): New constraint.
   (Utb): Ditto.
   (Utc): Ditto.
   (Utd): Ditto.

 gcc/testsuite/
   * gcc.target/aarch64/tlsle.c: New test source.
   * gcc.target/aarch64/tlsle12.c: New testcase.
   * gcc.target/aarch64/tlsle24.c: New testcase.
   * gcc.target/aarch64/tlsle32.c: New testcase.



  case SYMBOL_TLSLE:
-  asm_fprintf (asm_out_file, :tprel_lo12_nc:);
+  if (aarch64_tls_size = 12)
+/* Make sure TLS offset fit into 12bit.  */
+asm_fprintf (asm_out_file, :tprel_lo12:);
+  else
+asm_fprintf (asm_out_file, :tprel_lo12_nc:);
   break;

Use the existing classify_symbol mechanism we use throughout the
aarch64 backend.  Specifically rename SYMBOL_TLSLE as SYMBOL_TLSLE24
and introduce the 3 missing flavours then use the symbol
classification to control behaviour such as this modifier selection.

Cheers
/Marcus


Re: [AArch64][TLSLE][1/N] Rename SYMBOL_SMALL_TPREL to SYMBOL_TLSLE

2015-06-26 Thread Marcus Shawcroft
On 20 May 2015 at 11:56, Jiong Wang jiong.w...@arm.com wrote:

 2015-05-19  Marcus Shawcroft  marcus.shawcr...@arm.com
 Jiong Wang  jiong.w...@arm.com

 gcc/
   * config/aarch64/aarch64-protos.h (arch64_symbol_type): Rename
   SYMBOL_SMALL_TPREL to SYMBOL_TLSLE.
   (aarch64_symbol_context): Ditto.
   * config/aarch64/aarch64.c (aarch64_load_symref_appropriately): Ditto.
   (aarch64_expand_mov_immediate): Ditto.
   (aarch64_print_operand): Ditto.
   (aarch64_classify_tls_symbol): Ditto.
 --
 Regards,
 Jiong


OK /Marcus


Re: [AArch64][TLSLE][5/N] Recognize -mtls-size

2015-06-26 Thread Marcus Shawcroft
On 21 May 2015 at 17:44, Jiong Wang jiong.w...@arm.com wrote:

 This patch add -mtls-size option for AArch64. This option let user to do
 finer control on code generation for various TLS model on AArch64.

 For example, for TLS LE, user can specify smaller tls-size, for example
 4K which is quite usual, to let AArch64 backend generate more efficient
 instruction sequences.

 Currently, -mtls-size accept all integer, then will translate it into
 12(4K), 24(16M), 32(4G), 48(256TB) based on the value.

 no functional change.

 ok for trunk?

 2015-05-20  Jiong Wang  jiong.w...@arm.com

 gcc/
   * config/aarch64/aarch64.opt (mtls-size): New entry.
   * config/aarch64/aarch64.c (initialize_aarch64_tls_size): New function.
   * doc/invoke.texi (AArch64 Options): Document -mtls-size.

+mtls-size=
+Target RejectNegative Joined UInteger Var(aarch64_tls_size) Init(24)
+Specifies size of the TLS data area, default size is 16M. Accept any
integer, but the value
+will be transformed into 12(4K), 24(16M), 32(4G), 48(256TB)
+

Can we follow the mechanism used by rs6000 and limit the accepted
values here using an Enum to just the valid values: 12, 24, 32, 48?

+@item -mtls-size=@var{size}
+@opindex mtls-size
+Specify the size of TLS area. You can specify smaller value to get better code
+generation for TLS variable access. Currently, we accept any integer, but will
+turn them into 12(4K), 24(16M), 32(4G), 48(256TB) according to the integer
+value.
+

How about:
Specify bit size of immediate TLS offsets.  Valid values are 12, 24, 32, 48.

Thanks
/Marcus


Re: [Patch AArch64 3/4] De-const-ify struct tune_params

2015-06-26 Thread Marcus Shawcroft
On 23 June 2015 at 09:49, James Greenhalgh james.greenha...@arm.com wrote:

 Hi,

 If we want to overwrite parts of this structure, we're going to need it
 to be more malleable than it is presently.

 Run through and remove const from each of the members, create a non-const
 tuning structure we can modify, and set aarch64_tune_params to always
 point to this new structure. Change the -mtune parsing code to take a
 copy of the tuning structure in use rather than just taking the
 reference from within the processor struct. Change all the current
 users of aarch64_tune_params which no longer need to dereference a
 pointer.

 Checked on aarch64-none-linux-gnueabi with no issues.

 OK?

 Thanks,
 James

 ---
 2015-06-23  James Greenhalgh  james.greenha...@arm.com

 * config/aarch64/aarch64-protos.h (tune_params): Remove
 const from members.
 (aarch64_tune_params): Remove const, change to no longer be
 a pointer.
 * config/aarch64/aarch64.c (aarch64_tune_params): Remove const,
 change to no longer be a pointer, initialize to generic_tunings.
 (aarch64_min_divisions_for_recip_mul): Change dereference of
 aarch64_tune_params to member access.
 (aarch64_reassociation_width): Likewise.
 (aarch64_rtx_mult_cost): Likewise.
 (aarch64_address_cost): Likewise.
 (aarch64_branch_cost): Likewise.
 (aarch64_rtx_costs): Likewise.
 (aarch64_register_move_cost): Likewise.
 (aarch64_memory_move_cost): Likewise.
 (aarch64_sched_issue_rate): Likewise.
 (aarch64_builtin_vectorization_cost): Likewise.
 (aarch64_override_options): Take a copy of the selected tuning
 struct in to aarch64_tune_params, rather than just setting
 a pointer, change dereferences of aarch64_tune_params to member
 accesses.
 (aarch64_override_options_after_change): Change dereferences of
 aarch64_tune_params to member access.
 (aarch64_macro_fusion_p): Likewise.
 (aarch_macro_fusion_pair_p): Likewise.
 * config/aarch64/cortex-a57-fma-steering.c (gate): Likewise.


OK
/Marcus


Re: [AArch64][1/2] Rename SYMBOL_SMALL_GOT to SYMBOL_SMALL_GOT_4G

2015-06-26 Thread Marcus Shawcroft
On 26 June 2015 at 10:23, Jiong Wang jiong.w...@arm.com wrote:

 OK. Reworked this patch. Removed those redundant memory model check by
 adding new symbol classification. Patch splitted into two:

   * [1/2] Rename SYMBOL_SMALL_GOT to SYMBOL_SMALL_GOT_4G
   * [2/2] Implement -fpic for -mcmodel=small

 This is the first patch which only renmae SYMBOL_SMALL_GOT into
 SYMBOL_SMALL_GOT_4G. Please review, Thanks.

 2015-06-26  Jiong Wang  jiong.w...@arm.com

 gcc/
   * config/aarch64/aarch64-protos.h (aarch64_symbol_type): Rename
   SYMBOL_SMALL_GOT to SYMBOL_SMALL_GOT_4G.
   * config/aarch64/aarch64.c (aarch64_load_symref_appropriately): Ditto.
   (aarch64_expand_mov_immediate): Ditto.
   (aarch64_print_operand): Ditto.
   (aarch64_classify_symbol): Ditto.


OK, thanks Jiong.  /Marcus


Re: [AArch64][2/2] Implement -fpic for -mcmodel=small

2015-06-26 Thread Marcus Shawcroft
On 26 June 2015 at 10:32, Jiong Wang jiong.w...@arm.com wrote:

 This patch respin https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01804.html.

 A new symbol classification SYMBOL_SMALL_GOT_28K added to represent symbol
 which needs go through GOT table and it's under -fpic/-mcmodel-small. the 
 _28K
 suffix can reflect the symbol's attribute better, and by introducing this new
 symbol type, we could avoid checking aarch64_cmodel at some extent
 though still needs the check somewhere.

 All other code logic not changed.

 OK for trunk?

 Thanks.

 2015-06-26  Jiong. Wang  jiong.w...@arm.com

 gcc/
   * config/aarch64/aarch64-protos.h (aarch64_symbol_type): New type
   SYMBOL_SMALL_GOT_28K.
   * config/aarch64/aarch64.md: (ldr_got_small_mode): Support new GOT
   relocation modifiers.
   (unspec): New enum UNSPEC_GOTMALLPIC28K.
   (ldr_got_small_28k_mode): New.
   (ldr_got_small_28k_sidi): New.
   * config/aarch64/iterators.md (got_modifier): New mode iterator.
   * config/aarch64/aarch64-otps.h (aarch64_code_model): New model.
   * config/aarch64/aarch64.c (aarch64_load_symref_appropriately): Support
   SYMBOL_SMALL_GOT_28K.
   (aarch64_rtx_costs): Add costs for new instruction sequences.
   (initialize_aarch64_code_model): Initialize new model.
   (aarch64_classify_symbol): Recognize new model and new symbol 
 classification.
   (aarch64_asm_preferred_eh_data_format): Support new model.
   (aarch64_load_symref_appropriately): Generate new instruction
   sequences for -fpic.
   (TARGET_USE_PSEUDO_PIC_REG): New definition.
   (aarch64_use_pseudo_pic_reg): New function.

 gcc/testsuite/
   * gcc.target/aarch64/pic-small.c: New testcase.


OK, Thanks Jiong.  Could you prepare a NEWS entry for this change?
Cheers
/Marcus


Re: [Patch AArch64 4/4] Add -moverride tuning command, and wire it up for control of fusion and fma-steering

2015-06-26 Thread Marcus Shawcroft
On 23 June 2015 at 09:49, James Greenhalgh james.greenha...@arm.com wrote:

 Hi,

 This final patch adds support for the new command line option
 -moverride. The purpose of this command line is to allow expert-level users
 of the compiler, and those comfortable with experimenting with the compiler,
 *unsupported* full access to the tuning structures used in the AArch64
 back-end.

 For now, we only enable command-line access to the fusion pairs to
 enable and whether or not to use the Cortex-A57 FMA register renaming
 pass. Though in future we can expand this further.

 With this patch, you might write something like:

   -moverride=fuse=adrp+add.cmp+branch:tune=rename_fma_regs

 To enable fusion of adrp+add and cmp+branch and to enable the
 fma-rename pass.

 I've bootstrapped and tested the patch set on aarch64-none-linux-gnu
 with BOOT_CFLAGS set to the example string above, and again in the
 standard configuration with no issues.

 OK?

 Thanks,
 James

 ---
 2015-06-23  James Greenhalgh  james.greenha...@arm.com

 * config/aarch64/aarch64.opt: (override): New.
 * doc/invoke.texi (override): Document.
 * config/aarch64/aarch64.c (aarch64_flag_desc): New
 (aarch64_fusible_pairs): Likewise.
 (aarch64_tuning_flags): Likewise.
 (aarch64_tuning_override_function): Likewise.
 (aarch64_tuning_override_functions): Likewise.
 (aarch64_parse_one_option_token): Likewise.
 (aarch64_parse_boolean_options): Likewise.
 (aarch64_parse_fuse_string): Likewise.
 (aarch64_parse_tune_string): Likewise.
 (aarch64_parse_one_override_token): Likewise.
 (aarch64_parse_override_string): Likewise.
 (aarch64_override_options): Parse the -override string if it
 is present.


+static const struct aarch64_tuning_override_function
+  aarch64_tuning_override_functions[] =

The indentation looks odd here, but otherwise OK /Marcus


  1   2   3   4   5   6   7   8   9   10   >