RE: [committed] Fix MIPS p5600 scheduler

2014-07-17 Thread Matthew Fortune
Hi Richard,

Thanks for fixing this. I'm afraid I managed to get confused between
failures we had seen sporadically in our development work and thought
they were known regressions on trunk waiting to be fixed when actually
we were introducing them.

Apologies for the breakage.

Regards,
Matthew

> -----Original Message-----
> From: Richard Sandiford [mailto:rdsandif...@googlemail.com]
> Sent: 17 July 2014 21:18
> To: gcc-patches@gcc.gnu.org
> Cc: Jaydeep Patil; Matthew Fortune
> Subject: [committed] Fix MIPS p5600 scheduler
> 
> The p5600 scheduler wasn't restricting itself to -mtune=p5600 and so
> was being used for other CPUs too.  This showed up as a failure in
> various tests, including gcc.target/mips/octeon-pipe-1.c.  (Thinking
> about it, it was probably also why umips-lwp-*.c started failing,
> although the patch I just committed is still OK after this fix.)
> 
> Guys: please make sure you do a before-and-after comparison of test results,
> even if it "obviously" shouldn't be necessary.  This amount of fallout
> in gcc.target/mips would have been a red flag that something was wrong.
> 
> Tested on mips64-linux-gnu and applied.
> 
> Thanks,
> Richard
> 
> 
> gcc/
>   * config/mips/p5600.md: Add missing cpu tests.
> 
> Index: gcc/config/mips/p5600.md
> ===
> --- gcc/config/mips/p5600.md  2014-07-17 20:53:50.423095856 +0100
> +++ gcc/config/mips/p5600.md  2014-07-17 20:53:50.764100479 +0100
> @@ -47,52 +47,62 @@ (define_reservation "p5600_alq_alu" "p56
> 
>  ;; fadd, fsub
>  (define_insn_reservation "p5600_fpu_fadd" 4
> -  (eq_attr "type" "fadd,fabs,fneg")
> +  (and (eq_attr "cpu" "p5600")
> +   (eq_attr "type" "fadd,fabs,fneg"))
>"p5600_fpu_long, p5600_fpu_apu")
> 
>  ;; fabs, fneg, fcmp
>  (define_insn_reservation "p5600_fpu_fabs" 2
> -  (eq_attr "type" "fabs,fneg,fcmp,fmove")
> +  (and (eq_attr "cpu" "p5600")
> +   (eq_attr "type" "fabs,fneg,fcmp,fmove"))
>"p5600_fpu_short, p5600_fpu_apu")
> 
>  ;; fload
>  (define_insn_reservation "p5600_fpu_fload" 8
> -  (eq_attr "type" "fpload,fpidxload")
> +  (and (eq_attr "cpu" "p5600")
> +   (eq_attr "type" "fpload,fpidxload"))
>"p5600_fpu_long, p5600_fpu_apu")
> 
>  ;; fstore
>  (define_insn_reservation "p5600_fpu_fstore" 1
> -  (eq_attr "type" "fpstore,fpidxstore")
> +  (and (eq_attr "cpu" "p5600")
> +   (eq_attr "type" "fpstore,fpidxstore"))
>"p5600_fpu_short, p5600_fpu_apu")
> 
>  ;; fmadd
>  (define_insn_reservation "p5600_fpu_fmadd" 9
> -  (eq_attr "type" "fmadd")
> +  (and (eq_attr "cpu" "p5600")
> +   (eq_attr "type" "fmadd"))
>"p5600_fpu_long, p5600_fpu_apu")
> 
>  ;; fmul
>  (define_insn_reservation "p5600_fpu_fmul" 5
> -  (eq_attr "type" "fmul")
> +  (and (eq_attr "cpu" "p5600")
> +   (eq_attr "type" "fmul"))
>"p5600_fpu_long, p5600_fpu_apu")
> 
>  ;; fdiv, fsqrt
>  (define_insn_reservation "p5600_fpu_div" 17
> -  (eq_attr "type" "fdiv,frdiv,fsqrt,frsqrt")
> +  (and (eq_attr "cpu" "p5600")
> +   (eq_attr "type" "fdiv,frdiv,fsqrt,frsqrt"))
>"p5600_fpu_long, p5600_fpu_apu*17")
> 
>  ;; fcvt
>  (define_insn_reservation "p5600_fpu_fcvt" 4
> -  (eq_attr "type" "fcvt")
> +  (and (eq_attr "cpu" "p5600")
> +   (eq_attr "type" "fcvt"))
>"p5600_fpu_long, p5600_fpu_apu")
> 
>  ;; mtc
>  (define_insn_reservation "p5600_fpu_fmtc" 7
> -  (eq_attr "type" "mtc")
> +  (and (eq_attr "cpu" "p5600")
> +   (eq_attr "type" "mtc"))
>"p5600_fpu_short, p5600_fpu_store")
> 
>  ;; mfc
>  (define_insn_reservation "p5600_fpu_fmfc" 4
> -  (eq_attr "type" "mfc")
> +  (and (eq_attr "cpu" "p5600")
> +   (eq_attr "type" "mfc"))
>"p5600_fpu_short, p5600_fpu_store")
> 
>  ;; madd/msub feeding into the add source
> @@ -105,100 +115,120 @@ (define_bypass 5 "p5600_fpu_fmadd" "p560
> 
>  ;; and
>  (define_insn_reservation "p5600_int_and" 1
> -  (eq_attr "move_type" "logical")
> +  (and (eq_attr "cpu" "p5600")
> +   (eq_attr "move_type" "logical"))
>"p5600_alq_alu")
> 
>  ;; lui
>  (define_insn_reservation "p5600_int_lui" 1
> -  (eq_attr "move_type" "const")
> +  (and (eq_attr "cpu" "p5600")
> +   (eq_attr "move_type" "const"))
>"p5600_alq_alu")
> 
>  ;; Load lb, lbu, lh, lhu, lq, lw, lw_i2f, lwxs
>  (define_insn_reservation "p5600_int_load" 4
> -  (eq_attr "move_type" "load")
> +  (and (eq_attr "cpu" "p5600")
> +   (eq_attr "move_type" "load"))
>"p5600_agq_ldsta")
> 
>  ;; store
>  (define_insn_reservation "p5600_int_store" 3
> -  (eq_attr "move_type" "store")
> +  (and (eq_attr "cpu" "p5600")
> +   (eq_attr "move_type" "store"))
>"p5600_agq_ldsta")
> 
>  ;; andi, sll, srl, seb, seh
>  (define_insn_reservation "p5600_int_arith_1" 1
> -  (eq_attr "move_type" "andi,sll0,signext")
> +  (and (eq_attr "cpu" "p5600")
> +   (eq_attr "move_type" "andi,sll0,signext"))
>"p5600_agq_al2 | p5600_alq_alu")
> 
>  ;; addi, addiu, ori, xori, add, addu
>  (define_insn_reservation "p5600_int_arith_2" 1
> -  (eq_attr "alu_

[PATCH, libatomic, alpha]: Add -mfp-trap-mode=sui to compile flags

2014-07-17 Thread Uros Bizjak
Hello!

-mfp-trap-mode=sui is needed in addition to -mieee to compile fenv.c
for older alphas to generate inexact exceptions.

2013-07-18  Uros Bizjak  

* configure.tgt (alpha*): Add -mfp-trap-mode=sui to XCFLAGS.

Tested on alpha-linux-gnu, committed to mainline SVN.

Uros.

Index: configure.tgt
===
--- configure.tgt   (revision 212748)
+++ configure.tgt   (working copy)
@@ -27,7 +27,11 @@
 # work out any special compilation flags as necessary.

 case "${target_cpu}" in
-  alpha*)  ARCH=alpha ;;
+  alpha*)
+   # fenv.c needs this option to generate inexact exceptions.
+   XCFLAGS="${XCFLAGS} -mfp-trap-mode=sui"
+   ARCH=alpha
+   ;;
   rs6000 | powerpc*)   ARCH=powerpc ;;
   sh*) ARCH=sh ;;


Re: [PATCH] RTEMS: Add Nios 2 support

2014-07-17 Thread Chung-Lin Tang
For the default multilib settings, it looks like you just intended to
use -mcustom-fpu-cfg=60-2. I suggest you modify t-rtems to do that
instead of enumerating the individual FPU insn options.

Other than that, the patch looks okay.

Chung-Lin

On 2014/6/26 07:43 PM, Sebastian Huber wrote:
> diff --git a/gcc/config/nios2/t-rtems b/gcc/config/nios2/t-rtems
> new file mode 100644
> index 000..f95fa3c
> --- /dev/null
> +++ b/gcc/config/nios2/t-rtems
> @@ -0,0 +1,133 @@
> +# Custom RTEMS multilibs
> +
> +MULTILIB_OPTIONS = mhw-mul mhw-mulx mhw-div mcustom-fadds=253 
> mcustom-fdivs=255 mcustom-fmuls=252 mcustom-fsubs=254
> +
> +# Enumeration of multilibs
> +
> +# MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fmuls=252/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fmuls=252
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fdivs=255
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fmuls=252/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fmuls=252
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-mulx/mhw-div/mcustom-fdivs=255/mcustom-fmuls=252/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-mulx/mhw-div/mcustom-fdivs=255/mcustom-fmuls=252
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-mulx/mhw-div/mcustom-fdivs=255/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fdivs=255
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-mulx/mhw-div/mcustom-fmuls=252/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fmuls=252
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fsubs=254
> +# MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fmuls=252/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fmuls=252
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fdivs=255
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fmuls=252/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fmuls=252
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fadds=253
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-mulx/mcustom-fdivs=255/mcustom-fmuls=252/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fdivs=255/mcustom-fmuls=252
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fdivs=255/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fdivs=255
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fmuls=252/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fmuls=252
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-div/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fmuls=252/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-div/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fmuls=252
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-div/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fadds=253/mcustom-fdivs=255
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-div/mcustom-fadds=253/mcustom-fmuls=252/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fadds=253/mcustom-fmuls=252
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fadds=253/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fadds=253
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mhw-div/mcustom-fdivs=255/mcustom-fmuls=252/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fdivs=255/mcustom-fmuls=252
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fdivs=255/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fdivs=255
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fmuls=252/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fmuls=252
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-div/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += mhw-mul/mhw-div
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fmuls=252/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fmuls=252
> +MULTILIB_EXCEPTIONS += 
> mhw-mul/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fsubs=254
> +MULTILIB_EXCEPTIONS += mhw-mul/mcustom-fadds=253/mcustom-fdivs=255

Re: [PATCH] RTEMS: Add Nios 2 support

2014-07-17 Thread Chung-Lin Tang
On 2014/7/18 05:19 AM, Joel Sherrill wrote:
> Unless someone objects, I am going to commit this to the
> 4.9 branch and head.
> 
> --joel

Sorry about the delay, I'll review it today.

Thanks,
Chung-Lin

> On 7/7/2014 1:42 AM, Sebastian Huber wrote:
>> Ping.
>>
>> On 2014-06-26 13:43, Sebastian Huber wrote:
>>> This patch should be applied to GCC 4.9 and mainline.  I do not have
>>> write access, so in case this gets approved, please commit it for me.
>>>
>>> gcc/ChangeLog
>>> 2014-06-26  Sebastian Huber  
>>>
>>> * config.gcc (nios2-*-*): Add RTEMS support.
>>> * config/nios2/rtems.h: New file.
>>> * config/nios2/t-rtems: Likewise.
> 



Re: [PATCH, Pointer Bounds Checker 4/x] Built-in functions

2014-07-17 Thread Jeff Law

On 04/16/14 06:19, Ilya Enkovich wrote:

Hi,

This patch introduces built-in functions used by Pointer Bounds Checker.  It is
mostly similar to what was reverted from 4.9; I just added types and attributes
to builtins.  This patch also introduces pointer_bounds_type_node to be used in
built-in function type declarations.

Bootstrapped and tested on linux-x86_64.

OK for trunk?

Thanks,
Ilya
--
gcc/

2014-04-16  Ilya Enkovich  

* tree-core.h (tree_index): Add TI_POINTER_BOUNDS_TYPE.
* tree.h (pointer_bounds_type_node): New.
* tree.c (build_common_tree_nodes): Initialize
pointer_bounds_type_node.
* builtin-types.def (BT_BND): New.
(BT_FN_PTR_CONST_PTR): New.
(BT_FN_CONST_PTR_CONST_PTR): New.
(BT_FN_BND_CONST_PTR): New.
(BT_FN_CONST_PTR_BND): New.
(BT_FN_PTR_CONST_PTR_SIZE): New.
(BT_FN_PTR_CONST_PTR_CONST_PTR): New.
(BT_FN_VOID_PTRPTR_CONST_PTR): New.
(BT_FN_VOID_CONST_PTR_SIZE): New.
(BT_FN_VOID_PTR_BND): New.
(BT_FN_CONST_PTR_CONST_PTR_CONST_PTR): New.
(BT_FN_BND_CONST_PTR_SIZE): New.
(BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE): New.
(BT_FN_VOID_CONST_PTR_BND_CONST_PTR): New.
* chkp-builtins.def: New.
* builtins.def: include chkp-builtins.def.
(DEF_CHKP_BUILTIN): New.
* builtins.c (expand_builtin): Support BUILT_IN_CHKP_INIT_PTR_BOUNDS,
BUILT_IN_CHKP_NULL_PTR_BOUNDS, BUILT_IN_CHKP_COPY_PTR_BOUNDS,
BUILT_IN_CHKP_CHECK_PTR_LBOUNDS, BUILT_IN_CHKP_CHECK_PTR_UBOUNDS,
BUILT_IN_CHKP_CHECK_PTR_BOUNDS, BUILT_IN_CHKP_SET_PTR_BOUNDS,
BUILT_IN_CHKP_NARROW_PTR_BOUNDS, BUILT_IN_CHKP_STORE_PTR_BOUNDS,
BUILT_IN_CHKP_GET_PTR_LBOUND, BUILT_IN_CHKP_GET_PTR_UBOUND,
BUILT_IN_CHKP_BNDMK, BUILT_IN_CHKP_BNDSTX, BUILT_IN_CHKP_BNDCL,
BUILT_IN_CHKP_BNDCU, BUILT_IN_CHKP_BNDLDX, BUILT_IN_CHKP_BNDRET,
BUILT_IN_CHKP_INTERSECT, BUILT_IN_CHKP_NARROW,
BUILT_IN_CHKP_EXTRACT_LOWER, BUILT_IN_CHKP_EXTRACT_UPPER.
* c-family/c.opt (fcheck-pointer-bounds): New.
* toplev.c (process_options): Check Pointer Bounds Checker is supported.
* doc/extend.texi: Document Pointer Bounds Checker built-in functions.

OK for the trunk.  Please commit when the entire patchset has been approved.

Thanks for your patience,
Jeff



Re: Fix BLOCK_SUPERCONTEXT for cilk produced functions

2014-07-17 Thread Jeff Law

On 07/07/14 14:14, Jan Hubicka wrote:

On 07/04/14 03:05, Jan Hubicka wrote:

Hi,
BLOCK_SUPERCONTEXT for normal blocks leads all the way to FUNCTION_DECL.  This
is not the case for some automatically generated functions, like the one for
cilk.  I think it is a bug.  This patch fixes cilk; I will look for other cases
if that looks OK.

Bootstrapped/regtested x86_64-linux

Honza

* cilk.c (create_cilk_helper_decl): Create toplevel block correctly.

But doesn't this put the CILK created FUNCTION_DECL within the scope FNDECL?


I am not sure I understand your question...
FNDECL is a newly created decl that will get its body filled in later. I
believe the BLOCK_SUPERCONTEXT of its main block should always be FNDECL
instead of NULL, as for the vast majority of functions we produce...
If that function contains code outlined from the main function, I think
these are linked by ABSTRACT_ORIGIN.

Nevermind.  I was confused about the state fndecl & the context chains.

OK for the trunk and 4.9 if you think it's important to address there as 
well


jeff


Re: [PATCH] Fix -imacros (PR c/57653)

2014-07-17 Thread Jeff Law

On 07/15/14 02:18, Marek Polacek wrote:

This is a revised patch that Peter recently submitted, but
it was lacking a testcase and a better comment.  This patch
adds a testcase (kind of a hacky one), the comment is hopefully
better too.  Joseph already said that the code changes look ok.

Bootstrapped/regtested on x86_64-linux, ok for trunk?
Ok also for 4.9/4.8?

2014-07-14  Marek Polacek  
Manuel López-Ibáñez  

PR c/57653
* c-opts.c (c_finish_options): If -imacros is in effect, return.

* c-c++-common/pr57653.c: New test.
* c-c++-common/pr57653.h: New file.

I was really hoping someone could add tests from the old (2004?) thread 
between DJ and Per to ensure we weren't regressing any of those cases 
while fixing 57653.  In fact, I think I'd pre-approved with those tests 
added ;-)



Jeff



Re: [PATCH, Pointer Bounds Checker 3/x] Target hooks for Pointer Bounds Checker

2014-07-17 Thread Jeff Law

On 04/16/14 05:52, Ilya Enkovich wrote:

Hi,

This patch introduces target hooks to be used by Pointer Bounds Checker.  The
set of hooks differs from what was approved for 4.9 (and later reverted).  I
added hooks to work with returned bounds and to prepare incoming bounds for
vararg functions.  This allowed removing some target assumptions from the
expand code.

Bootstrapped and tested on linux-x86_64.

OK for trunk?

Thanks,
Ilya
--
gcc/

2014-04-16  Ilya Enkovich  

* target.def (builtin_chkp_function): New.
(chkp_bound_type): New.
(chkp_bound_mode): New.
(chkp_make_bounds_constant): New.
(chkp_initialize_bounds): New.
(fn_abi_va_list_bounds_size): New.
(load_bounds_for_arg): New.
(store_bounds_for_arg): New.
(load_returned_bounds): New.
(store_returned_bounds): New.
(chkp_function_value_bounds): New.
(setup_incoming_vararg_bounds): New.
* targhooks.h (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_load_returned_bounds): New.
(default_store_returned_bounds): New.
(default_fn_abi_va_list_bounds_size): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode): New.
(default_builtin_chkp_function): New.
(default_chkp_function_value_bounds): New.
(default_chkp_make_bounds_constant): New.
(default_chkp_initialize_bounds): New.
(default_setup_incoming_vararg_bounds): New.
* targhooks.c (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_load_returned_bounds): New.
(default_store_returned_bounds): New.
(default_fn_abi_va_list_bounds_size): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode): New.
(default_builtin_chkp_function): New.
(default_chkp_function_value_bounds): New.
(default_chkp_make_bounds_constant): New.
(default_chkp_initialize_bounds): New.
(default_setup_incoming_vararg_bounds): New.
* doc/tm.texi.in (TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE): New.
(TARGET_LOAD_BOUNDS_FOR_ARG): New.
(TARGET_STORE_BOUNDS_FOR_ARG): New.
(TARGET_LOAD_RETURNED_BOUNDS): New.
(TARGET_STORE_RETURNED_BOUNDS): New.
(TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New.
(TARGET_SETUP_INCOMING_VARARG_BOUNDS): New.
(TARGET_BUILTIN_CHKP_FUNCTION): New.
(TARGET_CHKP_BOUND_TYPE): New.
(TARGET_CHKP_BOUND_MODE): New.
(TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New.
(TARGET_CHKP_INITIALIZE_BOUNDS): New.
* doc/tm.texi: Regenerated.

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index b8ca17e..d868129 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -4333,6 +4333,13 @@ This hook returns the va_list type of the calling 
convention specified by
  The default version of this hook returns @code{va_list_type_node}.
  @end deftypefn

+@deftypefn {Target Hook} tree TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE (tree 
@var{fndecl})
+This hook returns size for @code{va_list} object in function specified
+by @var{fndecl}.  This hook is used by Pointer Bounds Checker to build bounds
+for @code{va_list} object.  Return @code{integer_zero_node} if no bounds
+should be used (e.g. @code{va_list} is a scalar pointer to the stack).
+@end deftypefn

What if va_list is an aggregate, but lives in registers?  I'm not 
familiar with the different va_list implementations on all the targets, 
but GCC has supported aggregates in registers for various ABIs through 
the years.



+@deftypefn {Built-in Function} size_t __chkp_sizeof (const void *@var{ptr})
+Function code - @code{BUILT_IN_CHKP_SIZEOF}.  This built-in function
+returns size of object referenced by @var{ptr}. @var{ptr} is always
+@code{ADDR_EXPR} of @code{VAR_DECL}.  This built-in is used by
+Pointer Boudns Checker when bounds of object cannot be computed statically
+(e.g. object has incomplete type).

s/Boudns/Bounds/

OK for the trunk with those two doc fixes.  As with the other patches, 
wait for the remainder to be approved before committing.


jeff



Re: [PATCH, cprop] Check rtx_cost when propagating constant

2014-07-17 Thread Jeff Law

On 06/19/14 03:44, Zhenqiang Chen wrote:


ChangeLog:
2014-06-17  Zhenqiang Chen  

 * cprop.c (try_replace_reg): Check cost for constants.

diff --git a/gcc/cprop.c b/gcc/cprop.c
index aef3ee8..c9cf02a 100644
--- a/gcc/cprop.c
+++ b/gcc/cprop.c
@@ -733,6 +733,14 @@ try_replace_reg (rtx from, rtx to, rtx insn)
rtx src = 0;
int success = 0;
rtx set = single_set (insn);
+  int old_cost = 0;
+  bool copy_p = false;
+  bool speed = optimize_bb_for_speed_p (BLOCK_FOR_INSN (insn));
+
+  if (set && SET_SRC (set) && REG_P (SET_SRC (set)))
+copy_p = true;
+  else
+old_cost = set_rtx_cost (set, speed);


Looks bogus for set == NULL?


set_rtx_cost has checked it. If it is NULL, the function will return 0;


Also what about register pressure?


Do you think it has big register pressure impact? I think it does not
increase register pressure.

I would expect a small impact on register pressure.  In general, anytime 
we avoid propagating a constant to its uses and instead hold the value 
in a register, register pressure will increase.





Here is a summary for performance result on X86-64 and ARM.

For X86-64, I ran SPEC2000 INT and FP (-O3). There is no improvement
or regression. As a test, I moved the code segment to the end of function
try_replace_reg and checked for insns which meet "success && new_cost >
old_cost". Logs show only 52 occurrences across the whole SPEC2000 build, and
only one instruction pattern, *adddi_1, is impacted. For *adddi_1,
rtx_cost increases from 8 to 10 when changing a register operand to a
constant.

For ARM Cortex-M4, there are minimal changes for Coremark, Dhrystone and EEMBC.
For an ARM Chromebook (Cortex-A15), there is some fluctuation in the SPEC2000
INT test, but the final result does not show improvement or regression.

The patch is updated to remove the "bogus" code and keep more constants.

Bootstrapped with no make check regressions on X86-64, i686 and ARM.

So with no notable improvements, do you still want this patch to go forward?

You certainly need a testcase.  It's fine if it's ARM specific, though 
obviously you get wider testing if you've got an x86_64 test.




diff --git a/gcc/cprop.c b/gcc/cprop.c
index aef3ee8..6ea6be0 100644
--- a/gcc/cprop.c
+++ b/gcc/cprop.c
@@ -733,6 +733,28 @@ try_replace_reg (rtx from, rtx to, rtx insn)
rtx src = 0;
int success = 0;
rtx set = single_set (insn);
+  int old_cost = 0;
+  bool const_p = false;
+  bool speed = optimize_bb_for_speed_p (BLOCK_FOR_INSN (insn));
+
+  if (set && SET_SRC (set))
+{
+  rtx src = SET_SRC (set);
+  if (REG_P (src) || GET_CODE (src) == SUBREG)
+const_p = true;
+  else
+   {
+ if (note != 0
+ && REG_NOTE_KIND (note) == REG_EQUAL
+ && (GET_CODE (XEXP (note, 0)) == CONST
+ || CONSTANT_P (XEXP (note, 0))))
+   {
+ const_p = true;
+   }
+ else
+   old_cost = set_rtx_cost (set, speed);
+   }
+}

Can you come up with a better name than "const_p"?  I see that and my 
first thought is "that variable indicates if some object is a constant". 
 But that's not how it's used here.





/* Usually we substitute easy stuff, so we won't copy everything.
   We however need to take care to not duplicate non-trivial CONST
@@ -740,6 +762,20 @@ try_replace_reg (rtx from, rtx to, rtx insn)
to = copy_rtx (to);

validate_replace_src_group (from, to, insn);
+
+  /* For CONSTANT_P (TO), loop2_invariant pass might hoist it out the loop.
+ And it can be shared by different references.  So skip propagation if
+ it makes INSN's rtx cost higher.  */
+  if (set && SET_SRC (set) && !const_p && CONSTANT_P (to))
+{
+  if (!CONSTANT_P (SET_SRC (set))
+ && (set_rtx_cost (set, speed) > old_cost))
+   {
+ cancel_changes (0);
+ return false;
+   }
+}
+
if (num_changes_pending () && apply_change_group ())
  success = 1;

I guess this needs to be after replacement so that you can easily 
compute the new cost.  Another case where our costing API is lame :( 
I'm not going to have you fix that.


Why the !CONSTANT_P check?  Why not just let the rtx costing alone 
determine if we want to avoid this propagation?
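
The order of operations in the hunk above (record the old cost, make the replacement tentatively, compare, then cancel if the insn got more expensive) can be pictured with a toy model in plain C. This is only a sketch: struct toy_insn and try_propagate_const are invented names for illustration, and the two cost fields stand in for whatever set_rtx_cost would return before and after the substitution.

```c
#include <stdbool.h>

/* Toy model, not GCC code.  An "insn" is reduced to the facts the check
   needs: its cost with the register operand, its cost once the constant
   has been substituted in, and which form it currently has.  */
struct toy_insn
{
  int cost_with_reg;
  int cost_with_const;
  bool uses_const;
};

/* Tentatively substitute the constant, then keep the change only if it
   does not make the insn more expensive; otherwise roll it back.  */
static bool
try_propagate_const (struct toy_insn *insn)
{
  int old_cost = insn->cost_with_reg;

  insn->uses_const = true;            /* tentative replacement */
  if (insn->cost_with_const > old_cost)
    {
      insn->uses_const = false;       /* cancel the change */
      return false;
    }
  return true;
}
```

With the *adddi_1 figures quoted earlier in the thread (cost 8 with a register, 10 with a constant), the model rejects the propagation; with equal costs it keeps it.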


jeff



Re: [RTL] (vec_select (vec_concat a b) c) may be just a or b

2014-07-17 Thread Jeff Law

On 06/21/14 13:16, Marc Glisse wrote:

Hello,

this is another small simplification of RTL for vectors. Note that it
doesn't really solve the problem, because these simplifications are only
performed for single-use objects. If I start from vectors [a,b] and
[c,d] and concatenate them into [a,b,c,d], then extract both halves, as
in the original testcase in the PR, we won't notice that those are the
original vectors. Still, better than nothing...
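
As a rough picture of the simplification, here is a toy model in plain C rather than RTL. It assumes both operands of the concatenation have the same number of lanes and describes the selection as a list of lane indices; select_of_concat_operand is a made-up name, not a function in simplify-rtx.c.

```c
#include <stddef.h>

/* Toy model, not GCC code.  A vec_concat of two n-lane operands has 2n
   lanes; SEL lists the n lanes picked by the vec_select.  Returns 0 if
   the selection is exactly the first operand, 1 if it is exactly the
   second, and -1 if it is neither (no simplification applies).  */
static int
select_of_concat_operand (const int *sel, size_t n)
{
  int first;
  size_t i;

  if (n == 0)
    return -1;
  first = sel[0];
  if (first != 0 && first != (int) n)
    return -1;
  for (i = 1; i < n; i++)
    if (sel[i] != first + (int) i)
      return -1;
  return first == 0 ? 0 : 1;
}
```

For [a,b] concatenated with [c,d], selecting lanes {0,1} gives back the first vector and {2,3} the second, while a mixed selection such as {1,2} is left alone.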

(we output a vzeroupper for the testcase, that seems unnecessary)

Bootstrap+testsuite on x86_64-linux-gnu.

2014-06-22  Marc Glisse  

 PR target/44551
gcc/
 * simplify-rtx.c (simplify_binary_operation_1) :
 Optimize inverse of a VEC_CONCAT.
gcc/testsuite/
 * gcc.target/i386/pr44551-1.c: New file.


OK.

Jeff



Re: [tree-optimization/61607] Look through SSA_NAME_VALUE chains

2014-07-17 Thread Jeff Law

On 07/07/14 03:02, Richard Biener wrote:

On Mon, Jun 30, 2014 at 8:56 AM, Jeff Law  wrote:


SSA_NAME_VALUE is, in effect, a chain of values.  ie, it's possible for
SSA_NAME_VALUE of any given SSA_NAME to refer to another SSA_NAME.


Hmm, but it shouldn't: as we walk in dominator order and should
always do set_ssa_name_value (name, ssa_name_value (value)),
there shouldn't be any chains.

They're certainly there and always have been.  You can get chains in 
cases where we derive implicit sets, for example, as well as due to loops 
in the CFG.





But maybe I'm missing sth about the lattice used by DOM here ...
(at least none of the callers in DOM care to valueize 'value' before
setting it as value for an SSA name).

It's not really a lattice in the way you'd see in VRP.  It's just an 
equality relationship with an unwinding stack to allow us to restore 
state from earlier points in the domwalk.
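
The chains discussed in this message can be pictured with a toy model in plain C. This is a sketch under assumptions, not the code from the patch: names are small integers, value_of[] stands in for SSA_NAME_VALUE, and the hop limit is one simple way to stay safe if a loop in the CFG has produced a cycle.

```c
#include <stddef.h>

#define NO_VALUE (-1)

/* Toy model, not GCC code.  value_of[i] is either NO_VALUE or the index
   of another name that name i is known to be equal to.  Follow the chain
   from NAME as far as it goes, but take at most N hops so that a cycle
   in the chain cannot loop forever.  */
static int
look_through_values (const int *value_of, size_t n, int name)
{
  size_t hops;

  for (hops = 0; hops < n; hops++)
    {
      int next = value_of[name];
      if (next == NO_VALUE)
        break;
      name = next;
    }
  return name;
}
```

With value_of = {1, 2, NO_VALUE}, looking through name 0 ends at name 2. With the two-element cycle {1, 0} the walk stops after two hops instead of spinning.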




jeff



Re: Warn when returning the address of a temporary (middle-end) v2

2014-07-17 Thread Jeff Law

On 06/22/14 12:20, Marc Glisse wrote:

Hello,

I followed the advice in this discussion:
https://gcc.gnu.org/ml/gcc-patches/2014-04/msg00269.html

and here is a new patch. I made an effort to isolate a path in at least
one subcase so it doesn't look too strange that the warning is in this
file. Computing the dominance info just to tweak the warning message may
be a bit excessive. I kept the same option as the front-ends, I don't
know if we want a different one, or maybe a Wmaybe-... version. There
will be cases where we get a duplicate warning from -Wtarget-lifetime in
fortran, but none in the testsuite, and I would rather have 2 warnings
than miss such broken code. The uninit-G testcase is about
initialization, not returning, so I am changing that, even if it is
unnecessary with the current version of the patch (only activated at -O2).

Bootstrap+testsuite (--enable-languages=all,obj-c++,ada,go) on
x86_64-unknown-linux-gnu.

(by the way, contrib/compare_tests is confused when I use all languages,
it prints "comm: file 1 is not in sorted order" and tons of spurious
differences)

2014-06-23  Marc Glisse  

 PR c++/60517
gcc/c/
 * c-typeck.c (c_finish_return): Return 0 instead of the address of
 a local variable.
gcc/cp/
 * typeck.c (maybe_warn_about_returning_address_of_local): Return
 whether it is returning the address of a local variable.
 (check_return_expr): Return 0 instead of the address of a local
 variable.
gcc/c-family/
 * c.opt (-Wreturn-local-addr): Move to common.opt.
gcc/
 * common.opt (-Wreturn-local-addr): Moved from c.opt.
 * gimple-ssa-isolate-paths.c: Include diagnostic-core.h.
 (isolate_path): New argument to avoid inserting a trap.
 (find_implicit_erroneous_behaviour): Handle returning the address
 of a local variable.
 (find_explicit_erroneous_behaviour): Likewise.
 (gimple_ssa_isolate_erroneous_paths): Calculate dominance info.
gcc/testsuite/
 * c-c++-common/addrtmp.c: New file.
 * c-c++-common/uninit-G.c: Adapt.

I note you don't catch return &localvar in the isolation code -- it 
looks like you just catch those which potentially flow from PHIs.


I realize you're primarily catching that in the front-ends, but can't we 
have cases which aren't caught by the front end, but after optimizations 
we're able to propagate &somelocal into the return statement?


It generally looks good and I'm ready to approve if the answer to the 
above question is "can't happen".  If it can happen, then we ought to 
handle it in the isolation code as well (ought to be relatively easy).
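
For readers following along, the PHI case mentioned above looks roughly like this in C. It is a minimal illustration with invented names (pick_pointer, fallback), not one of the tests from the patch, and whether a given compiler version diagnoses it is not something this sketch claims.

```c
static int fallback;

/* The address of `local` reaches the return statement only through the
   merge of the two branches (a PHI node once the function is in SSA
   form), so a check that looks only at the returned expression sees a
   plain pointer variable.  Calling this with a nonzero argument returns
   a dangling pointer and is deliberately wrong.  */
static int *
pick_pointer (int use_local)
{
  int local = 0;
  int *p;

  if (use_local)
    p = &local;
  else
    p = &fallback;
  return p;
}
```

Only the pick_pointer (0) path is safe to use; the other path is the erroneous one that path isolation would be expected to find.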


Jeff








Re: [PATCH] Limit combines work at -Og, introduce --param max-combine-insns

2014-07-17 Thread Jeff Law

On 07/15/14 05:58, Richard Biener wrote:


The following introduces a new param, max-combine-insns, to
be able to limit the work done at -Og to linear complexity
in the number of log-links (thus, two-insn combines).  It
also records statistics of performed combines where for
fold-const.ii on x86_64 we see at -Og (unpatched):

208 combine "three-insn combine" 25
208 combine "two-insn combine" 14393

and patched:

208 combine "two-insn combine" 14392

Bootstrap / regtest scheduled on x86_64-unknown-linux-gnu.

Ok?  (I can rip out the statistics stuff if you mind)

(the combine_insns diff is so large because of re-indenting)

Thanks,
Richard.

2014-07-15  Richard Biener  

* params.def (PARAM_MAX_COMBINE_INSNS): New.
* combine.c: Include statistics.h and params.h.
(combine_instructions): Guard three and four insn combines
with max-combine-insns value.  Record statistics for combines
performed.
* doc/invoke.texi (max-combine-insns): Document new param.

Please don't rip out the statistics :-)  Sometimes I wonder how much 
time we've lost because the statistics gathering often gets dropped as a 
patch goes from initial prototyping into production.  Then later we find 
the need to look at those statistics again.


The patch is fine.  I only gave it a cursory review as it looked like 
90% of it was just the indentation change.  A diff -b variant would have 
made it fairly easy to do a more thorough review, though I don't think 
it's strictly necessary.


Jeff



Re: [PATCH] Fix PR 61225

2014-07-17 Thread Jeff Law

On 05/22/14 03:52, Zhenqiang Chen wrote:

On 21 May 2014 20:43, Steven Bosscher  wrote:

On Wed, May 21, 2014 at 11:58 AM, Zhenqiang Chen wrote:

Hi,

The patch fixes the gcc.target/i386/pr49095.c FAIL in PR61225. The
test case is intended to check a peephole2 optimization, which optimizes
the following sequence

 2: bx:SI=ax:SI
 25: ax:SI=[bx:SI]
 7: {ax:SI=ax:SI-0x1;clobber flags:CC;}
 8: [bx:SI]=ax:SI
 9: flags:CCZ=cmp(ax:SI,0)
to
2: bx:SI=ax:SI
41: {flags:CCZ=cmp([bx:SI]-0x1,0);[bx:SI]=[bx:SI]-0x1;}

The enhanced shrink-wrapping, which calls copyprop_hardreg_forward,
changes INSN 25 to

 25: ax:SI=[ax:SI]

Then peephole2 cannot optimize it since the two memory_operands look
different.

To fix it, the patch adds another peephole2 rule to read one more
insn. From the register copy, it knows the address is the same.
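
For orientation, a C function of roughly the following shape gives rise to a load, decrement, store and compare-with-zero sequence like the one shown above. This is an illustrative guess at the pattern, not a copy of gcc.target/i386/pr49095.c, and the names dec_and_test and zero_hits are invented.

```c
/* Counts how many times a counter reached zero.  */
static int zero_hits;

/* Read-modify-write of *counter followed by a test of the new value
   against zero: the shape that the peephole2 wants to fold into a single
   memory-operand decrement that also sets the flags.  */
static void
dec_and_test (int *counter)
{
  if (--*counter == 0)
    zero_hits++;
}
```

Whether the folded form is actually emitted depends on the target and on the passes that run before peephole2, which is the point of the discussion in this thread.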


That is one complex peephole2 to deal with a transformation like this.
It seems to me like too specific a solution for a bigger problem.

Could you please try one of the following solutions instead:

1. Track register values for peephole2 and try different alternatives
based on known register equivalences? E.g. in your example, perhaps
there is already a REG_EQUAL/REG_EQUIV note available on insn 25 after
copyprop_hardreg_forward, to annotate that [ax:SI] is equivalent to
[bx:SI] at that point (or if that information is not available, it is
not very difficult to make it available). Then you could try applying
peephole2 on the original pattern but also on patterns modified with
the known equivalences (i.e. try peephole2 on multiple equivalent
patterns for the same insn). This may expose other peephole2
opportunities, not just the specific one your patch addresses.


The patch is updated according to the comment. There is no REG_EQUAL
note, so I add one in replace_oldest_value_reg.

ChangeLog:
2014-05-22  Zhenqiang Chen  

 Part of PR rtl-optimization/61225
 * config/i386/i386-protos.h (ix86_peephole2_rtx_equal_p): New proto.
 * config/i386/i386.c (ix86_peephole2_rtx_equal_p): New function.
 * regcprop.c (replace_oldest_value_reg): Add REG_EQUAL note when
 propagating to SET.

I can't help but wonder why the new 4 insn combination code isn't 
presenting this as a nice big fat insn to the x86 backend which would 
eliminate the need for the peep2.


But, assuming there's a fundamental reason why that's not kicking in...

In replace_oldest_value_reg, why not use reg_overlap_mentioned_p to 
determine if the REGNO of NEW_RTX is modified by INSN?  I'd look to 
avoid some of those calls to single_set (insn).  Just call it once and 
reuse the value.


Shouldn't you be ensuring the REG_EQUAL note is unique?  I think we have 
a routine to avoid creating a note that already exists.


Don't you have to ensure that the value in the REG_EQUAL note has not 
changed?  A REG_EQUAL note denotes an equivalence that holds at the 
single insn where it appears.  If you want to use the value elsewhere 
you'd have to ensure the value hasn't been changed.  If RTX referred to 
by the REG_EQUAL note is a MEM, this can be relatively difficult due to 
aliasing issues.
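
A C-level analogy of the aliasing concern, offered as a sketch rather than as anything RTL-specific: an equivalence between a cached value and a memory location holds at the point of the load, but a later store through a pointer that may alias the location can invalidate it. All names here are hypothetical.

```c
/* Record the value of *P, store through Q, then report whether the
   recorded value still equals *P.  The equivalence "cached == *p"
   holds right after the load, but it only survives the store when
   P and Q do not alias.  */
int
equivalence_survives (int *p, int *q)
{
  int cached = *p;

  *q = cached + 1;
  return cached == *p;
}

/* Hypothetical driver: ALIAS selects whether Q points at the same
   object as P.  */
int
check_equivalence (int alias)
{
  int x = 1;
  int y = 1;

  return equivalence_survives (&x, alias ? &x : &y);
}
```

With distinct objects the recorded value stays valid; with aliasing pointers it does not, which is why reusing a note about a MEM away from the insn that carries it needs a proof that no intervening store can touch that memory.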


Jeff






Re: [DOC PATCH] Clarify docs about stmt exprs (PR c/51088)

2014-07-17 Thread Jeff Law

On 04/08/14 10:23, Marek Polacek wrote:

On Fri, Mar 28, 2014 at 02:44:21PM +, Joseph S. Myers wrote:

On Fri, 28 Mar 2014, Marek Polacek wrote:


PR51088 contains some Really Bizarre code.  We should tell users
not to do any shenanigans like that.

Ok for trunk?


I don't think a doc patch resolves this bug.  The compiler should never
generate code with an undefined reference to a local label like that;
either the code should get a compile-time error (that's what I suggest),
or it should generate output that links but has undefined behavior at
runtime.


Ok, with this patch the compiler should issue an error if someone's
trying to take an address of a label defined in a statement expression
outside of that statement expression.
I admit this was very tricky; I had to completely revamp the patch
several times, this one is the least disrupting and simplest one
I could come up with.  It works by marking labels that are declared
outside of stmt expr while we're entering a stmt expr (but we mustn't
do this for nested stmt exprs).  If we're then defining the label in
stmt expr and it was referenced outside of this stmt expr, raise an error.
This patch doesn't catch cases like ({ A:0; }); &&A;, in that case the
behavior is just undefined.
Does this approach make sense?
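
As a contrast to the rejected case, here is a minimal sketch of a use that keeps both the label definitions and the && operator inside the same statement expression. It relies on the GNU C extensions for statement expressions and labels as values, so it needs GCC or a compatible compiler; the function name is hypothetical, and this is not the test case from the patch.

```c
/* Return -1 for negative N and 1 otherwise, using a computed goto
   whose labels and address-taking all live inside one statement
   expression.  Nothing outside the ({ ... }) refers to the labels.  */
int
sign_via_label (int n)
{
  return ({
    void *target = n < 0 ? &&negative : &&non_negative;
    int result;

    goto *target;
  negative:
    result = -1;
    goto done;
  non_negative:
    result = 1;
  done:
    ;
    result;
  });
}
```

Moving the && expression outside the statement expression while leaving the labels inside gives the situation the patch is meant to diagnose.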

Regtested/bootstrapped on x86_64-linux.  I don't think it's stage4 material,
so ok for next stage1?

2014-04-08  Marek Polacek  

PR c/51088
* c-decl.c (stmt_expr_depth): New variable.
(struct c_label_vars): Add seen_outside_stmt_expr variable.
(c_bindings_start_stmt_expr): Bump stmt_expr_depth.  Mark labels
declared outside of statement expressions.
(c_bindings_end_stmt_expr): Decrement stmt_expr_depth.
(make_label): Set seen_outside_stmt_expr.
(check_earlier_gotos): Return true if error was issued.
(define_label): Give error if taking an address of a label defined
in statement expression outside of the statement expression.

* doc/extend.texi (Statement Exprs): Add note about taking
addresses of labels inside of statement expressions.

* gcc.c-torture/compile/pr17913.c (f): Add dg-error.
* gcc.dg/pr51088.c: New test.
So this seems OK if you just want to warn when we take the address of 
the label at function scope, but don't you want to warn when the depth 
of the address-taken operation is different from (lower than) the depth 
at which the label is defined?


Am I missing something here?

jeff



Re: [PATCH 2/3, Cilk+] Fix for PR61455

2014-07-17 Thread Jeff Law

On 07/12/14 03:51, Zamyatin, Igor wrote:

Hi!

This patch adds correct handling of declarations with initializations that 
contain array notation.
It fixes ICE in PR61455.

Regtested for x86_64 (along with the first patch in the chain).

Ok for trunk/4.9?

Thanks,
Igor


gcc/c-family/ChangeLog:

2014-07-08  Igor Zamyatin  

   PR middle-end/61455
   * array-notation-common.c (extract_array_notation_exprs): Handling
   of DECL_EXPR added.

gcc/c/ChangeLog:

2014-07-08  Igor Zamyatin  

   PR middle-end/61455
   * c-array-notation.c (expand_array_notations): Handling
   of DECL_EXPR added.

gcc/cp/ChangeLog:

2014-07-08  Igor Zamyatin  

   PR middle-end/61455
   * cp-array-notation.c (expand_array_notation_exprs): Handling of
   DECL_EXPR improved. Changed handling for INIT_EXPR.

gcc/testsuite/ChangeLog:

2014-07-08  Igor Zamyatin  

   PR middle-end/61455
   * c-c++-common/cilk-plus/AN/pr61455.c: New test.
   * c-c++-common/cilk-plus/AN/pr61455-2.c: Likewise.

This is fine for the trunk/4.9.  Thanks,
Jeff



Re: [PATCH] Add statistical printout of rank_for_schedule decisions

2014-07-17 Thread Jeff Law

On 07/13/14 22:17, Maxim Kuvyrkov wrote:

Hi,

This patch adds dump printouts for scheduling heuristics in
rank_for_schedule.  Rank_for_schedule is one of the cornerstones of
the haifa scheduler, yet its decisions are hard to track and debug.

This patch adds statistical gathering for each branch of
rank_for_schedule, and prints them out according to sched verbosity.
This patch helped me track down several bugs in rank_for_schedule
that result in stupid scheduling decisions.

Bootstrapped and tested on x86_64-linux-gnu.

OK to apply?

Presumably you use the

return increment, retval;

construct to avoid the need for braces?

I can see how it's useful here, but I don't think we've generally used 
comma operators like that and it's a style that I've never liked all 
that much.


Could you go ahead and split it into two statements and add the 
necessary braces?  Approved with that change.
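
To make the two styles concrete, here is a small sketch. The counter array and function names are hypothetical stand-ins and are not the code from the patch; the point is only that the comma form and the split form behave the same.

```c
/* Hypothetical per-branch counters, standing in for the patch's
   statistics.  */
static int rank_stats[2];

/* Comma-operator form: bump the counter and return in one statement,
   so the if needs no braces.  */
int
rank_with_comma (int a, int b)
{
  if (a != b)
    return rank_stats[0]++, a - b;

  return rank_stats[1]++, 0;
}

/* Requested form: two statements, with braces where the if now
   needs them.  */
int
rank_with_braces (int a, int b)
{
  if (a != b)
    {
      rank_stats[0]++;
      return a - b;
    }

  rank_stats[1]++;
  return 0;
}
```

Both functions return the same values and update the same counters; the second is just the more conventional spelling.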


Jeff


Re: [PATCH, DOC]: Fix for Options That Control Optimization section

2014-07-17 Thread Jeff Law

On 07/11/14 03:01, Martin Liška wrote:

Hello,
I fixed the Options That Control Optimization section according to 'gcc
-Q --help=optimizers' and, after consultation with Jakub, I added the missing
-foptimize-strlen option.

Ready for trunk?
Martin

ChangeLog:

2014-07-11  Martin Liska  

 * doc/invoke.texi: Added missing options to options
 that control optimization. Missing -foptimize-strlen option
 introduced.

OK.
jeff



Re: [PATCH] Clarify always_inline docs, fix PR61782

2014-07-17 Thread Jeff Law

On 07/15/14 06:03, Richard Biener wrote:


The following tries to clarify the always_inline docs, in particular
that it applies to non-O0 as well and what the failure mode is
(derived from how our implementation behaves).
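
For readers unfamiliar with the attribute, a minimal usage sketch follows. The function names are hypothetical. The attribute asks for the call to be inlined regardless of optimization level; the exact diagnostics when inlining is not possible are what the doc patch describes and are not reproduced here.

```c
/* Request inlining of this helper regardless of the optimization level.  */
static inline __attribute__ ((always_inline)) int
square (int x)
{
  return x * x;
}

/* A direct caller: both calls to square are expected to be expanded
   in place.  */
int
sum_of_squares (int a, int b)
{
  return square (a) + square (b);
}
```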

Does this look ok?

Thanks,
Richard.

2014-07-15  Richard Biener  

PR other/61782
* doc/extend.texi (always_inline): Clarify.

Seems reasonable.
jeff



Re: Fix PR 61461: -fdump-rtl-all-slim causes ICE

2014-07-17 Thread Jeff Law

On 07/14/14 20:25, Bernd Edlinger wrote:

Hi,

When I try to print the dumps of fold-const.c
with -fdump-rtl-all-slim cc1plus crashes.
This is caused by pretty-printing the ADDR_VEC pattern,
which has a single argument, a vector of LABEL_REFs at position 0.
But instead XEXP(x,0) is printed, which causes the ICE.

However, ADDR_DIFF_VEC also seems to do the wrong thing,
because XEXP(x,0) is an uninteresting label, which just references
its own position.

Instead of that, the list of LABEL_REFs should be printed IMO,
which is at position 1, in this case.

The attached patch fixes both, by printing the list of LABEL_REFs
separated by semicolons.


Boot-strapped and regression-tested on x86_64-linux-gnu.
OK for trunk?


Thanks
Bernd.



changelog-pr61461.txt


2014-07-15  Bernd Edlinger

PR rtl-optimization/61461
* sched-vis.c (print_pattern) : Fixed.

OK.
jeff


[BUILDROBOT][PATCH] Fix mmix (unused variable)

2014-07-17 Thread Jan-Benedict Glaw
Hi!

As a leftover of r210931, an unused variable resulted in:

 g++ -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions 
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
-Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic 
-Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common 
 -DHAVE_CONFIG_H -I. -I. -I../../../gcc/gcc -I../../../gcc/gcc/. 
-I../../../gcc/gcc/../include -I../../../gcc/gcc/../libcpp/include 
-I/opt/cfarm/mpc/include  -I../../../gcc/gcc/../libdecnumber 
-I../../../gcc/gcc/../libdecnumber/dpd -I../libdecnumber 
-I../../../gcc/gcc/../libbacktrace -o mmix.o -MT mmix.o -MMD -MP -MF 
./.deps/mmix.TPo ../../../gcc/gcc/config/mmix/mmix.c
../../../gcc/gcc/config/mmix/mmix.c: In function ‘int64_t 
mmix_intval(const_rtx)’:
../../../gcc/gcc/config/mmix/mmix.c:2694:12: error: unused variable ‘retval’ 
[-Werror=unused-variable]
   uint64_t retval;
^
cc1plus: all warnings being treated as errors
make[2]: *** [mmix.o] Error 1


(See eg.  http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=305352)



Committed as obvious, fixed like this:


2014-07-18  Jan-Benedict Glaw  

* config/mmix/mmix.c (mmix_intval): Drop unused automatic variable.

diff --git a/gcc/config/mmix/mmix.c b/gcc/config/mmix/mmix.c
index e0b8ce7..b9edc3c 100644
--- a/gcc/config/mmix/mmix.c
+++ b/gcc/config/mmix/mmix.c
@@ -2691,8 +2691,6 @@ mmix_output_condition (FILE *stream, const_rtx x, int 
reversed)
 int64_t
 mmix_intval (const_rtx x)
 {
-  uint64_t retval;
-
   if (GET_CODE (x) == CONST_INT)
 return INTVAL (x);
 


-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
  Signature of:  Zensur im Internet? Nein danke!
  the second  :




Re: [PATCH 1/4] Add an abstract incremental hash data type

2014-07-17 Thread Trevor Saunders
On Thu, Jul 17, 2014 at 06:36:31AM +0200, Andi Kleen wrote:
> On Wed, Jul 16, 2014 at 10:40:53PM -0400, Trevor Saunders wrote:
> > 
> > > + public:
> > > +
> > > +  /* Start incremential hashing, optionally with SEED.  */
> > > +  void begin (hashval_t seed = 0)
> > > +  {
> > > +val = seed;
> > 
> > why isn't this the ctor?
> 
> It's standard for hash classes to have explicit begin()/end().
> All the existing ones I've seen work this way.

 I only know of one vaguely similar thing
 http://mxr.mozilla.org/mozilla-central/source/mfbt/SHA1.h#37  which
 doesn't do that, and a bunch of people doing something doesn't
 necessarily mean it makes sense.  Now there may be a good reason it
 does make sense, but unless these other people need begin() to be
 fallible I don't see it.

> > > +  /* Add unsigned value V.  */
> > > +  void add_int (unsigned v)
> > > +  {
> > > +val = iterative_hash_hashval_t (v, val);
> > > +  }
> > 
> > istm this is a great spot to provide a bunch of overloads of just add()
> > and let the compiler pick the appropriate one for your type.
> 
> Sorry I'm not into code obfuscation. With hashing it's far better
> to work with explicit visible types instead of invisible magic.

if that were true I'd expect you'd see lots of cases of people using a
different hash function than the type of the expression being passed,
but a quick look at the later patches didn't show me any of those.
Not repeating the type of something is hardly obfuscation, and in most
cases there's only one sane function to call for a given expression.

but I guess it's easy enough to change later if somebody gets really
annoyed by it so whatever.

Trev

> 
> -Andi


Re: Strenghten assumption about dynamic type changes (placement new)

2014-07-17 Thread Jason Merrill

On 07/08/2014 02:50 PM, Jan Hubicka wrote:

I am looking into tracking dynamic types now. Obviously I need to set very
exact rules about when these may change.


Let me first say that this area is somewhat in flux in the standard; if 
we have a model of what we want the rules to be for GCC, there's a good 
chance of getting them into the standard.  There are several unresolved 
DRs in this area already (1027, 1116, 1776).



I think b variants are invalid


Yes, by 3.8/7; we can't use 'a' to call foo after we've changed the 
object there to a C.



currently we also assume t1 to be invalid, but
t2 to be valid.


I think the compiler ought to be able to treat both as undefined, 
because 'a' is either defined (t1) or allocated (t2) as a B, and B does 
not contain an array of char, so changing the dynamic type of that 
memory before the end of its storage duration ought to be undefined.


But the standard doesn't currently say that, though it's along the lines 
of my proposed drafting for 1116 (which needs reworking).


And I suppose that my notion of 'allocated type' can really only apply 
when using the library allocation functions in 18.6.1.1 and 18.6.1.2, 
not the inline placement new.


Jason



[committed] Define HAVE_sync_compare_and_swap* on hppa-linux

2014-07-17 Thread John David Anglin
The attached change enables future support using the kernel-assisted
atomic support.  It's a bit of a fudge since the libcall routines are not
lock free, but everything appears to work.


Tested on hppa-unknown-linux-gnu.  Committed to trunk.

Dave
--
John David Anglin   dave.ang...@bell.net



2014-07-17  John David Anglin  

* config/pa/pa-linux.h (TARGET_OS_CPP_BUILTINS): Remove defines for
__GCC_HAVE_SYNC_COMPARE_AND_SWAP_1, __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
and __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4.
(HAVE_sync_compare_and_swapqi): Define.
(HAVE_sync_compare_and_swaphi): Likewise.
(HAVE_sync_compare_and_swapsi): Likewise.

Index: config/pa/pa-linux.h
===
--- config/pa/pa-linux.h(revision 212767)
+++ config/pa/pa-linux.h(working copy)
@@ -22,9 +22,6 @@
 #define TARGET_OS_CPP_BUILTINS()   \
   do   \
 {  \
-   builtin_define ("__GCC_HAVE_SYNC_COMPARE_AND_SWAP_1");  \
-   builtin_define ("__GCC_HAVE_SYNC_COMPARE_AND_SWAP_2");  \
-   builtin_define ("__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4");  \
GNU_USER_TARGET_OS_CPP_BUILTINS();  \
builtin_assert ("machine=bigendian");   \
 }  \
@@ -133,3 +130,13 @@
 
 #undef TARGET_SYNC_LIBCALL
 #define TARGET_SYNC_LIBCALL 1
+
+/* The SYNC operations are implemented as library functions, not
+   INSN patterns.  As a result, the HAVE defines for the patterns are
+   not defined.  We need to define them to generate the corresponding
+   __GCC_HAVE_SYNC_COMPARE_AND_SWAP_* and __GCC_ATOMIC_*_LOCK_FREE
+   defines.  */
+
+#define HAVE_sync_compare_and_swapqi 1
+#define HAVE_sync_compare_and_swaphi 1
+#define HAVE_sync_compare_and_swapsi 1
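
As a rough sketch of what the predefined macros mean for user code (not part of the committed change), a program can test __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 before relying on the 4-byte __sync compare-and-swap builtin. The function names and the non-atomic fallback are hypothetical, and the sketch assumes a 4-byte int.

```c
/* Try to move *FLAG from 0 to 1.  Return nonzero if this call made
   the change.  */
int
try_claim (int *flag)
{
#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
  /* The compiler advertises a 4-byte compare-and-swap.  */
  return __sync_bool_compare_and_swap (flag, 0, 1);
#else
  /* Hypothetical fallback for illustration only: NOT atomic.  */
  if (*flag != 0)
    return 0;
  *flag = 1;
  return 1;
#endif
}

/* Hypothetical driver: the first claim should succeed and the second
   should fail, giving 10.  */
int
claim_twice (void)
{
  int flag = 0;
  int first = try_claim (&flag);
  int second = try_claim (&flag);

  return first * 10 + second;
}
```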


[committed] Use __kernel_cmpxchg for __sync_lock_release

2014-07-17 Thread John David Anglin
Because the atomic sync functions in config/pa/linux-atomic.c are not
lock free, we need to use __kernel_cmpxchg for the __sync_lock_release.
This was found in glibc's pthread_spin_unlock implementation.

Tested on hppa-unknown-linux-gnu.  Committed to trunk.

Dave
--
John David Anglin   dave.ang...@bell.net



2014-07-17  John David Anglin  

* config/pa/linux-atomic.c (__sync_lock_release_4): New.
(SYNC_LOCK_RELEASE): Update to use __kernel_cmpxchg for release.
Don't use SYNC_LOCK_RELEASE for int type.

Index: config/pa/linux-atomic.c
===
--- config/pa/linux-atomic.c(revision 210671)
+++ config/pa/linux-atomic.c(working copy)
@@ -293,13 +293,34 @@
 SUBWORD_TEST_AND_SET (unsigned short, 2)
 SUBWORD_TEST_AND_SET (unsigned char,  1)
 
+void HIDDEN
+__sync_lock_release_4 (int *ptr)
+{
+  int failure, oldval;
+
+  do {
+oldval = *ptr;
+failure = __kernel_cmpxchg (oldval, 0, ptr);
+  } while (failure != 0);
+}
+
 #define SYNC_LOCK_RELEASE(TYPE, WIDTH) \
   void HIDDEN  \
   __sync_lock_release_##WIDTH (TYPE *ptr)  \
   {\
-*ptr = 0;  \
+int failure;   \
+unsigned int oldval, newval, shift, mask;  \
+int *wordptr = (int *) ((unsigned long) ptr & ~3); \
+   \
+shift = (((unsigned long) ptr & 3) << 3) ^ INVERT_MASK_##WIDTH;\
+mask = MASK_##WIDTH << shift;  \
+   \
+do {   \
+  oldval = *wordptr;   \
+  newval = oldval & ~mask; \
+  failure = __kernel_cmpxchg (oldval, newval, wordptr);\
+} while (failure != 0);\
   }
 
-SYNC_LOCK_RELEASE (int,   4)
 SYNC_LOCK_RELEASE (short, 2)
 SYNC_LOCK_RELEASE (char,  1)
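
The subword case in the macro above can be sketched portably as follows. This is an illustration under stated assumptions, not the committed code: __sync_bool_compare_and_swap stands in for __kernel_cmpxchg, the endianness test uses the compiler's __BYTE_ORDER__ macro in place of INVERT_MASK_1, and the function names are hypothetical.

```c
#include <stdint.h>

/* Clear the byte at PTR by updating its containing aligned 32-bit
   word with a compare-and-swap loop, so that the other three bytes
   are written back unchanged.  */
void
release_byte (unsigned char *ptr)
{
  uintptr_t addr = (uintptr_t) ptr;
  uint32_t *wordptr = (uint32_t *) (addr & ~(uintptr_t) 3);
  unsigned int shift = (unsigned int) (addr & 3) << 3;
  uint32_t mask, oldval, newval;

#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  /* On big-endian targets byte 0 is the most significant byte.  */
  shift ^= 24;
#endif
  mask = (uint32_t) 0xff << shift;

  do
    {
      oldval = *wordptr;
      newval = oldval & ~mask;
    }
  while (!__sync_bool_compare_and_swap (wordptr, oldval, newval));
}

/* Hypothetical check: clear byte INDEX (0..3) of a word holding the
   bytes 1, 2, 3, 4 and report whether only that byte changed.  */
int
only_byte_cleared (unsigned int index)
{
  union { uint32_t word; unsigned char bytes[4]; } u
    = { .bytes = { 1, 2, 3, 4 } };
  unsigned int i;

  release_byte (&u.bytes[index]);
  for (i = 0; i < 4; i++)
    if (u.bytes[i] != (i == index ? 0 : i + 1))
      return 0;
  return 1;
}
```

The loop retries whenever another writer changed the word between the load and the swap, which is what keeps the release consistent with the other word-based operations.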


[wwwdocs] Patch for Re: PLEASE RE-ADD MIRRORS (small correction)

2014-07-17 Thread Gerald Pfeifer
On Tue, 15 Jul 2014, Dan D. wrote:
> Are you still interested in the mirrors?

Yep.  This is the patch I just committed to our web site.

If there are further updates, best propose a patch against
https://gcc.gnu.org/mirrors.html , that is the fastest way
and ensures things show up as you want them to.

Gerald

Index: mirrors.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/mirrors.html,v
retrieving revision 1.222
diff -u -r1.222 mirrors.html
--- mirrors.html6 Jul 2014 22:00:24 -   1.222
+++ mirrors.html17 Jul 2014 22:59:18 -
@@ -14,6 +14,10 @@
 (Phoenix, Arizona, USA) directly:
 
 
+Australia:
+  http://mirrors-au.go-parts.com/gcc/";>http://mirrors-au.go-parts.com/gcc
+| ftp://mirrors-au.go-parts.com/gcc";>ftp://mirrors-au.go-parts.com/gcc
+| rsync://mirrors-au.go-parts.com/gcc,
 thanks to Dan Derebenskiy (dderebens...@go-parts.com) at Go-Parts.
 Austria: ftp://gd.tuwien.ac.at/gnu/gcc/";>gd.tuwien.ac.at, 
thanks to Antonin.Sprinzl at tuwien.ac.at
 Bulgaria: http://gcc.igor.onlinedirect.bg/";>gcc.igor.onlinedirect.bg, thanks to 
igor at onlinedirect.bg
 Canada: http://gcc.parentingamerica.com";>http://gcc.parentingamerica.com, 
thanks to James Miller (jmiller at parentingamerica.com).
@@ -36,11 +40,19 @@
 Japan: http://ftp.tsukuba.wide.ad.jp/software/gcc/";>ftp.tsukuba.wide.ad.jp, 
thanks to Kohei Takahashi (tsukuba-ftp-servers at tsukuba.wide.ad.jp)
 Latvia, Riga: http://mirrors.webhostinggeeks.com/gcc/";>mirrors.webhostinggeeks.com/gcc/,
 thanks to Igor (whg.igp at gmail.com)
 The Netherlands, Nijmegen: ftp://ftp.nluug.nl/mirror/languages/gcc";>ftp.nluug.nl, thanks to Jan 
Cristiaan van Winkel (jc at ATComputing.nl)
+Russia:
+  http://mirrors-ru.go-parts.com/gcc/";>http://mirrors-ru.go-parts.com/gcc
+| ftp://mirrors-ru.go-parts.com/gcc";>ftp://mirrors-ru.go-parts.com/gcc
+| rsync://mirrors-ru.go-parts.com/gcc
 Slovakia, Bratislava: http://gcc.fyxm.net/";>gcc.fyxm.net, 
thanks to Jan Teluch (admin at 2600.sk)
 UK: ftp://ftp.mirrorservice.org/sites/sourceware.org/pub/gcc/";>ftp://ftp.mirrorservice.org/sites/sourceware.org/pub/gcc/,
 thanks to mirror at mirrorservice.org
 UK, London: http://gcc-uk.internet.bs";>http://gcc-uk.internet.bs, thanks to 
Internet.bs (info at internet.bs)
 US, Saint Louis: http://gcc.petsads.us";>http://gcc.petsads.us, thanks to Sergey 
Kutserey (s.kutserey at gmail.com)
 US, San Jose: http://www.netgull.com/gcc/";>http://www.netgull.com, thanks to admin 
at netgull.com
+US:
+  http://mirrors-usa.go-parts.com/gcc/";>http://mirrors-usa.go-parts.com/gcc
+| ftp://mirrors-usa.go-parts.com/gcc";>ftp://mirrors-usa.go-parts.com/gcc
+| rsync://mirrors-usa.go-parts.com/gcc
 
 
 The archives there will be signed by one of the following GnuPG keys:


Re: [PATCH] RTEMS: Add Nios 2 support

2014-07-17 Thread Joel Sherrill
Unless someone objects, I am going to commit this to the
4.9 branch and head.

--joel

On 7/7/2014 1:42 AM, Sebastian Huber wrote:
> Ping.
>
> On 2014-06-26 13:43, Sebastian Huber wrote:
>> This patch should be applied to GCC 4.9 and mainline.  I do not have
>> write access, so in case this gets approved, please commit it for me.
>>
>> gcc/ChangeLog
>> 2014-06-26  Sebastian Huber  
>>
>>  * config.gcc (nios2-*-*): Add RTEMS support.
>>  * config/nios2/rtems.h: New file.
>>  * config/nios2/t-rtems: Likewise.

-- 
Joel Sherrill, Ph.D. Director of Research & Development
joel.sherr...@oarcorp.comOn-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available(256) 722-9985



[committed] Fix MIPS p5600 scheduler

2014-07-17 Thread Richard Sandiford
The p5600 scheduler wasn't restricting itself to -mtune=p5600 and so
was being used for other CPUs too.  This showed up as a failure in
various tests, including gcc.target/mips/octeon-pipe-1.c.  (Thinking
about it, it was probably also why umips-lwp-*.c started failing,
although the patch I just committed is still OK after this fix.)

Guys: please make sure you do a before-and-after comparison of test results,
even if it "obviously" shouldn't be necessary.  This amount of fallout
in gcc.target/mips would have been a red flag that something was wrong.

Tested on mips64-linux-gnu and applied.

Thanks,
Richard


gcc/
* config/mips/p5600.md: Add missing cpu tests.

Index: gcc/config/mips/p5600.md
===
--- gcc/config/mips/p5600.md2014-07-17 20:53:50.423095856 +0100
+++ gcc/config/mips/p5600.md2014-07-17 20:53:50.764100479 +0100
@@ -47,52 +47,62 @@ (define_reservation "p5600_alq_alu" "p56
 
 ;; fadd, fsub
 (define_insn_reservation "p5600_fpu_fadd" 4
-  (eq_attr "type" "fadd,fabs,fneg")
+  (and (eq_attr "cpu" "p5600")
+   (eq_attr "type" "fadd,fabs,fneg"))
   "p5600_fpu_long, p5600_fpu_apu")
 
 ;; fabs, fneg, fcmp
 (define_insn_reservation "p5600_fpu_fabs" 2
-  (eq_attr "type" "fabs,fneg,fcmp,fmove")
+  (and (eq_attr "cpu" "p5600")
+   (eq_attr "type" "fabs,fneg,fcmp,fmove"))
   "p5600_fpu_short, p5600_fpu_apu")
 
 ;; fload
 (define_insn_reservation "p5600_fpu_fload" 8
-  (eq_attr "type" "fpload,fpidxload")
+  (and (eq_attr "cpu" "p5600")
+   (eq_attr "type" "fpload,fpidxload"))
   "p5600_fpu_long, p5600_fpu_apu")
 
 ;; fstore
 (define_insn_reservation "p5600_fpu_fstore" 1
-  (eq_attr "type" "fpstore,fpidxstore")
+  (and (eq_attr "cpu" "p5600")
+   (eq_attr "type" "fpstore,fpidxstore"))
   "p5600_fpu_short, p5600_fpu_apu")
 
 ;; fmadd
 (define_insn_reservation "p5600_fpu_fmadd" 9
-  (eq_attr "type" "fmadd")
+  (and (eq_attr "cpu" "p5600")
+   (eq_attr "type" "fmadd"))
   "p5600_fpu_long, p5600_fpu_apu")
 
 ;; fmul
 (define_insn_reservation "p5600_fpu_fmul" 5
-  (eq_attr "type" "fmul")
+  (and (eq_attr "cpu" "p5600")
+   (eq_attr "type" "fmul"))
   "p5600_fpu_long, p5600_fpu_apu")
 
 ;; fdiv, fsqrt
 (define_insn_reservation "p5600_fpu_div" 17
-  (eq_attr "type" "fdiv,frdiv,fsqrt,frsqrt")
+  (and (eq_attr "cpu" "p5600")
+   (eq_attr "type" "fdiv,frdiv,fsqrt,frsqrt"))
   "p5600_fpu_long, p5600_fpu_apu*17")
 
 ;; fcvt
 (define_insn_reservation "p5600_fpu_fcvt" 4
-  (eq_attr "type" "fcvt")
+  (and (eq_attr "cpu" "p5600")
+   (eq_attr "type" "fcvt"))
   "p5600_fpu_long, p5600_fpu_apu")
 
 ;; mtc
 (define_insn_reservation "p5600_fpu_fmtc" 7
-  (eq_attr "type" "mtc")
+  (and (eq_attr "cpu" "p5600")
+   (eq_attr "type" "mtc"))
   "p5600_fpu_short, p5600_fpu_store")
 
 ;; mfc
 (define_insn_reservation "p5600_fpu_fmfc" 4
-  (eq_attr "type" "mfc")
+  (and (eq_attr "cpu" "p5600")
+   (eq_attr "type" "mfc"))
   "p5600_fpu_short, p5600_fpu_store")
 
 ;; madd/msub feeding into the add source
@@ -105,100 +115,120 @@ (define_bypass 5 "p5600_fpu_fmadd" "p560
 
 ;; and
 (define_insn_reservation "p5600_int_and" 1
-  (eq_attr "move_type" "logical")
+  (and (eq_attr "cpu" "p5600")
+   (eq_attr "move_type" "logical"))
   "p5600_alq_alu")
 
 ;; lui
 (define_insn_reservation "p5600_int_lui" 1
-  (eq_attr "move_type" "const")
+  (and (eq_attr "cpu" "p5600")
+   (eq_attr "move_type" "const"))
   "p5600_alq_alu")
 
 ;; Load lb, lbu, lh, lhu, lq, lw, lw_i2f, lwxs
 (define_insn_reservation "p5600_int_load" 4
-  (eq_attr "move_type" "load")
+  (and (eq_attr "cpu" "p5600")
+   (eq_attr "move_type" "load"))
   "p5600_agq_ldsta")
 
 ;; store
 (define_insn_reservation "p5600_int_store" 3
-  (eq_attr "move_type" "store")
+  (and (eq_attr "cpu" "p5600")
+   (eq_attr "move_type" "store"))
   "p5600_agq_ldsta")
 
 ;; andi, sll, srl, seb, seh
 (define_insn_reservation "p5600_int_arith_1" 1
-  (eq_attr "move_type" "andi,sll0,signext")
+  (and (eq_attr "cpu" "p5600")
+   (eq_attr "move_type" "andi,sll0,signext"))
   "p5600_agq_al2 | p5600_alq_alu")
 
 ;; addi, addiu, ori, xori, add, addu
 (define_insn_reservation "p5600_int_arith_2" 1
-  (eq_attr "alu_type" "add,or,xor")
+  (and (eq_attr "cpu" "p5600")
+   (eq_attr "alu_type" "add,or,xor"))
   "p5600_agq_al2 | p5600_alq_alu")
 
 ;; nor, sub
 (define_insn_reservation "p5600_int_arith_3" 1
-  (eq_attr "alu_type" "nor,sub")
+  (and (eq_attr "cpu" "p5600")
+   (eq_attr "alu_type" "nor,sub"))
   "p5600_alq_alu")
 
 ;; srl, sra, rotr, slt, sllv, srlv
 (define_insn_reservation "p5600_int_arith_4" 1
-  (eq_attr "type" "shift,slt,move")
+  (and (eq_attr "cpu" "p5600")
+   (eq_attr "type" "shift,slt,move"))
   "p5600_agq_al2 | p5600_alq_alu")
 
 ;; nop
 (define_insn_reservation "p5600_int_nop" 0
-  (eq_attr "type" "nop")
+  (and (eq_attr "cpu" "p5600")
+   (eq_attr "type" "nop"))
   "p5600_agq_al2")
 
 ;; clo, clz
 (define_insn_reservation "p5600_int_countbits" 1

[committed] Fix umips-lwp-*.c tests

2014-07-17 Thread Richard Sandiford
umips-lwp-[1234].c test that two loads from consecutive memory locations
to consecutive registers ($5 and $6) can use LWP.  The idea was to have
2 tests where $5 was loaded first before $6 and 2 tests that were the
other way around (at least when scheduling is enabled).  I'd tried to force
that by having one register involved in multiplication and so obviously
on the critical path.  However, the multiplication and store now happen
before the load of the other register, which defeats the point of the
thing and makes the test fail.

Here I've used a shift and add sequence instead, after verifying that
the pre-peephole2 orderings are still different for 1,3 vs. 2,4.

Tested on mips64-linux-gnu and applied.

Thanks,
Richard


gcc/testsuite/
* gcc.target/mips/umips-lwp-1.c (foo): Use a shift/add sequence
involving both inputs rather than a multiplication involving one.
* gcc.target/mips/umips-lwp-2.c (foo): Likewise.
* gcc.target/mips/umips-lwp-3.c (foo): Likewise.
* gcc.target/mips/umips-lwp-4.c (foo): Likewise.

Index: gcc/testsuite/gcc.target/mips/umips-lwp-1.c
===
--- gcc/testsuite/gcc.target/mips/umips-lwp-1.c 2014-07-17 20:17:43.255936742 
+0100
+++ gcc/testsuite/gcc.target/mips/umips-lwp-1.c 2014-07-17 20:27:46.543603109 
+0100
@@ -6,7 +6,7 @@ foo (int *r4)
 {
   int r5 = r4[0];
   int r6 = r4[1];
-  r4[2] = r5 * r5;
+  r4[2] = (r5 << 1) + r6;
   {
 register int r5asm asm ("$5") = r5;
 register int r6asm asm ("$6") = r6;
Index: gcc/testsuite/gcc.target/mips/umips-lwp-2.c
===
--- gcc/testsuite/gcc.target/mips/umips-lwp-2.c 2014-04-25 10:40:00.806595073 
+0100
+++ gcc/testsuite/gcc.target/mips/umips-lwp-2.c 2014-07-17 20:28:03.451761888 
+0100
@@ -6,7 +6,7 @@ foo (int *r4)
 {
   int r5 = r4[0];
   int r6 = r4[1];
-  r4[2] = r6 * r6;
+  r4[2] = (r6 << 1) + r5;
   {
 register int r5asm asm ("$5") = r5;
 register int r6asm asm ("$6") = r6;
Index: gcc/testsuite/gcc.target/mips/umips-lwp-3.c
===
--- gcc/testsuite/gcc.target/mips/umips-lwp-3.c 2014-04-25 10:40:00.995596575 
+0100
+++ gcc/testsuite/gcc.target/mips/umips-lwp-3.c 2014-07-17 20:28:38.133087563 
+0100
@@ -6,7 +6,7 @@ foo (int *r4)
 {
   int r5 = r4[511];
   int r6 = r4[512];
-  r4[2] = r5 * r5;
+  r4[2] = (r5 << 1) + r6;
   {
 register int r5asm asm ("$5") = r5;
 register int r6asm asm ("$6") = r6;
Index: gcc/testsuite/gcc.target/mips/umips-lwp-4.c
===
--- gcc/testsuite/gcc.target/mips/umips-lwp-4.c 2014-04-25 10:40:01.222598380 
+0100
+++ gcc/testsuite/gcc.target/mips/umips-lwp-4.c 2014-07-17 20:28:46.754168518 
+0100
@@ -6,7 +6,7 @@ foo (int *r4)
 {
   int r5 = r4[511];
   int r6 = r4[512];
-  r4[2] = r6 * r6;
+  r4[2] = (r6 << 1) + r5;
   {
 register int r5asm asm ("$5") = r5;
 register int r6asm asm ("$6") = r6;


PR61629 (was Re: Delay RTL initialization until it is really needed)

2014-07-17 Thread Richard Sandiford
Richard Sandiford  writes:
> Jan Hubicka  writes:
>> Hi,
>
>> IRA initialization shows high in profiles even when building lto
>> objects.  This patch simply delays RTL backend initialization until we
>> really decide to output a function.  In some cases this avoids the
>> initialization completely (like in the case of LTO but also user
>> target attributes) and there is some hope for better cache locality.
>>
>> Basic idea is to have two flags saying whether lang and target
>> dependent bits needs initialization and check it when starting
>> function codegen.
>>
>> Bootstrapped/regtested x86_64-linux, testing also at AIX. Ok if it passes?
>>
>> Honza
>>
>>  * toplev.c (backend_init_target): Move init_emit_regs and init_regs 
>> to...
>>  (backend_init) ... here; skip ira_init_once and backend_init_target.
>>  (target_reinit) ... and here; clear 
>> this_target_rtl->lang_dependent_initialized.
>>  (lang_dependent_init_target): Clear 
>> this_target_rtl->lang_dependent_initialized;
>>  break out rtl initialization to ...
>>  (initialize_rtl): ... here; call also backend_init_target and 
>> ira_init_once.
>>  * toplev.h (initialize_rtl): New function.
>>  * function.c: Include toplev.h
>>  (init_function_start): Call initialize_rtl.
>>  * rtl.h (target_rtl): Add target_specific_initialized,
>>  lang_dependent_initialized.
>> Index: toplev.c
>> ===
>> --- toplev.c (revision 211837)
>> +++ toplev.c (working copy)
>> @@ -1583,14 +1583,6 @@ backend_init_target (void)
>>/* Initialize alignment variables.  */
>>init_alignments ();
>>  
>> -  /* This reinitializes hard_frame_pointer, and calls 
>> init_reg_modes_target()
>> - to initialize reg_raw_mode[].  */
>> -  init_emit_regs ();
>> -
>> -  /* This invokes target hooks to set fixed_reg[] etc, which is
>> - mode-dependent.  */
>> -  init_regs ();
>> -
>>/* This depends on stack_pointer_rtx.  */
>>init_fake_stack_mems ();
>>  
>> @@ -1632,9 +1624,13 @@ backend_init (void)
>>init_varasm_once ();
>>save_register_info ();
>>  
>> -  /* Initialize the target-specific back end pieces.  */
>> -  ira_init_once ();
>> -  backend_init_target ();
>> +  /* Middle end needs this initialization for default mem attributes
>> + used by early calls to make_decl_rtl.  */
>> +  init_emit_regs ();
>> +
>> +  /* Middle end needs this initialization for mode tables used to assign
>> + modes to vector variables.  */
>> +  init_regs ();
>
> This causes a segfault on gcc.target/mips/umips-store16-1.c.  The register
> asm:
>
> register unsigned int global asm ("$16");
>
> causes us to globalise $16 and call reinit_regs.  reinit_regs in turn
> calls ira_init, but IRA hasn't been initialised at this point and
> prerequisites like init_fake_stack_mems haven't yet been called.
>
> Does the patch below look OK?
>
>> @@ -1686,6 +1682,31 @@ lang_dependent_init_target (void)
>>   front end is initialized.  It also depends on the HAVE_xxx macros
>>   generated from the target machine description.  */
>>init_optabs ();
>> +  this_target_rtl->lang_dependent_initialized = false;
>> +}
>> +
>> +/* Perform initializations that are lang-dependent or target-dependent.
>> +   but matters only for late optimizations and RTL generation.  */
>> +
>> +void
>> +initialize_rtl (void)
>> +{
>> +  static int initialized_once;
>> +
>> +  /* Initialization done just once per compilation, but delayed
>> + till code generation.  */
>> +  if (!initialized_once)
>> +ira_init_once ();
>> +  initialized_once = true;
>> +
>> +  /* Target specific RTL backend initialization.  */
>> +  if (!this_target_rtl->target_specific_initialized)
>> +backend_init_target ();
>> +  this_target_rtl->target_specific_initialized = true;
>> +
>> +  if (this_target_rtl->lang_dependent_initialized)
>> +return;
>> +  this_target_rtl->lang_dependent_initialized = true;
>>  
>>/* The following initialization functions need to generate rtl, so
>>   provide a dummy function context for them.  */
>
> Why do you need both these flags?  We only call this function once
> the language has been initialised, so we should always be initialising
> both sets of information (backend_init_target and the stuff after
> the comment above, from the old lang_dependent_init_target).
>
> How about the second patch below, still under testing?  The new assert
> is OK for target_reinit because it has:
>
>   this_target_rtl->target_specific_initialized = false;
>
>   /* This initializes hard_frame_pointer, and calls init_reg_modes_target()
>  to initialize reg_raw_mode[].  */
>   init_emit_regs ();
>
>   /* This invokes target hooks to set fixed_reg[] etc, which is
>  mode-dependent.  */
>   init_regs ();
>
>   /* Reinitialize lang-dependent parts.  */
>   lang_dependent_init_target ();
>
> i.e. it sets the flag to say that the RTL stuff hasn't been initialised
> and then goes on 

Re: FDO and source changes

2014-07-17 Thread Xinliang David Li
>>
>> I see why you do not like first_global_object_name because changing it
>> would cause all functions from that unit to drop the profiles. Perhaps we
>> can combine function name and compilation unit (gcov file) name?
>
> that is a good idea -- it will also solve the LTO problem you mentioned above.
>
> Will update the patch.

It already does this (similarly):

chksum = coverage_checksum_string
(chksum, aux_base_name);


The static function defined in the same header will have different
'aux_base_name' depending on the including module.
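
As a rough illustration of the idea discussed here (an id derived from the unit name plus the function's assembler name), the following is a minimal sketch. It is not GCC's code: checksum_string is a hypothetical FNV-1a stand-in for coverage_checksum_string, and sketch_profile_id and its parameters are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for coverage_checksum_string: folds a string
   into a running checksum using FNV-1a.  GCC's real helper differs.  */
static uint32_t
checksum_string (uint32_t chksum, const char *s)
{
  for (; *s; s++)
    {
      chksum ^= (unsigned char) *s;
      chksum *= 16777619u;
    }
  return chksum;
}

/* Illustrative id built only from the unit's base name and the
   function's assembler name.  Neither the line number nor the
   definition order contributes, so reordering or inserting functions
   leaves the id unchanged.  */
static uint32_t
sketch_profile_id (const char *unit_base_name, const char *asm_name)
{
  assert (unit_base_name != NULL && asm_name != NULL);

  uint32_t chksum = 2166136261u;
  chksum = checksum_string (chksum, unit_base_name);
  chksum = checksum_string (chksum, asm_name);
  return chksum;
}
```

With this shape, a static function pulled in from the same header by two different units gets two different ids, while the id of a given function within one unit stays stable across simple source edits.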

David


>
> David
>
>>
>> Honza
>>
>>>   chksum = coverage_checksum_string
>>> (chksum, first_global_object_name);
>>>chksum = coverage_checksum_string
>>> @@ -645,7 +650,12 @@ coverage_begin_function (unsigned lineno
>>>
>>>/* Announce function */
>>>offset = gcov_write_tag (GCOV_TAG_FUNCTION);
>>> -  gcov_write_unsigned (current_function_funcdef_no + 1);
>>> +  if (PARAM_VALUE (PARAM_PROFILE_FUNC_INTERNAL_ID))
>>> +gcov_write_unsigned (current_function_funcdef_no + 1);
>>> +  else
>>> +gcov_write_unsigned (coverage_compute_profile_id (
>>> +   cgraph_get_node (current_function_decl)));
>>> +
>>>gcov_write_unsigned (lineno_checksum);
>>>gcov_write_unsigned (cfg_checksum);
>>>gcov_write_string (IDENTIFIER_POINTER
>>> @@ -682,8 +692,13 @@ coverage_end_function (unsigned lineno_c
>>>if (!DECL_EXTERNAL (current_function_decl))
>>>   {
>>> item = ggc_alloc ();
>>> -
>>> -   item->ident = current_function_funcdef_no + 1;
>>> +
>>> +  if (PARAM_VALUE (PARAM_PROFILE_FUNC_INTERNAL_ID))
>>> + item->ident = current_function_funcdef_no + 1;
>>> +  else
>>> +item->ident = coverage_compute_profile_id (
>>> +   cgraph_get_node (cfun->decl));
>>> +
>>> item->lineno_checksum = lineno_checksum;
>>> item->cfg_checksum = cfg_checksum;
>>>
>>


Re: FDO and source changes

2014-07-17 Thread Xinliang David Li
On Wed, Jul 16, 2014 at 4:42 PM, Jan Hubicka wrote:
>> Instrumentation based FDO is designed to work when the source files
>> that are used to generate the instr binary match exactly with the
>> sources in profile-use compile. It is known historically that using
>> stale profile (due to source changes, not gcda format change) can lead
>> to lots of mismatch warnings and even worse -- compiler ICEs.  This is
>> due to two reasons:
>> 1) the profile lookup for each function is based on funcdef_no which
>> can change when function definition order is changed or new functions
>> are inserted in the middle of a source
>> 2) the indirect call target id may change due to source changes:
>> before GCC4.9, the id uses cgraph uid which is as bad as funcdef_no.
>> Attributing wrong IC target to the indirect call site is the main
>> cause of compiler ICE (we have signature match check, but bad target
>> can leak through result in problem later). Starting from gcc49, the
>> indirect target profiling uses profile_id which is stable for public
>> functions.
>
> We should not ICE however when the targets gets wrong. There is some basic
> type checking on the place, do you have testcase where we still ICE?

I don't have test cases at hand (when it happened, it was not
considered important to fix due to the stale profile used by the
user). The basic check may have some holes - e.g. related to vararg
functions.


>>
>> This patch introduces a new parameter for FDO to determine whether to
>> use internal id or assembler name based external id for profile
>> lookup. When the external id is used, GCC FDO will become very
>> tolerant to simple source changes.
>>
>> Note that autoFDO solves this problem but it is currently limited to
>> Intel platforms with LBR support.
>>
>> (Tested with SPEC, SPEC06 and large internal benchmarks. No performance
>> impact).
>>
>> Ok for trunk?
>
> I wonder if there are any downsides for using this always?
> We still compare checksums so we should warn user that profile is out of date,
> so I would consistently switch from funcdef_no to profile_id...

I wonder about the same. The tests show no downside.

>
>> Index: coverage.c
>> ===
>> --- coverage.c(revision 212682)
>> +++ coverage.c(working copy)
>> @@ -54,6 +54,7 @@ along with GCC; see the file COPYING3.
>>  #include "intl.h"
>>  #include "filenames.h"
>>  #include "target.h"
>> +#include "params.h"
>>
>>  #include "gcov-io.h"
>>  #include "gcov-io.c"
>> @@ -369,8 +370,10 @@ get_coverage_counts (unsigned counter, u
>>   da_file_name);
>>return NULL;
>>  }
>> -
>> -  elt.ident = current_function_funcdef_no + 1;
>> +  if (PARAM_VALUE (PARAM_PROFILE_FUNC_INTERNAL_ID))
>> +elt.ident = current_function_funcdef_no + 1;
>> +  else
>> +elt.ident = coverage_compute_profile_id (cgraph_get_node (cfun->decl));
>>elt.ctr = counter;
>>entry = counts_hash->find (&elt);
>>if (!entry || !entry->summary.num)
>> @@ -416,7 +419,8 @@ get_coverage_counts (unsigned counter, u
>>  }
>>else if (entry->lineno_checksum != lineno_checksum)
>>  {
>> -  warning (0, "source locations for function %qE have changed,"
>> +  warning (OPT_Wcoverage_mismatch,
>> +   "source locations for function %qE have changed,"
>>  " the profile data may be out of date",
>>  DECL_ASSEMBLER_NAME (current_function_decl));
>>  }
>> @@ -581,12 +585,13 @@ coverage_compute_profile_id (struct cgra
>>  {
>>expanded_location xloc
>>   = expand_location (DECL_SOURCE_LOCATION (n->decl));
>> +  bool use_name_only = (PARAM_VALUE (PARAM_PROFILE_FUNC_INTERNAL_ID) == 0);
>>
>> -  chksum = xloc.line;
>> +  chksum = (use_name_only ? 0 : xloc.line);
>>chksum = coverage_checksum_string (chksum, xloc.file);
>>chksum = coverage_checksum_string
>>   (chksum, IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (n->decl)));
>> -  if (first_global_object_name)
>> +  if (!use_name_only && first_global_object_name)
>
> I think this will cause troubles with static functions and LTO indirect call
> optimization.  We really want to make two static functions with same name to
> have different IDs when they come from different units.
>
> I see why you do not like first_global_object_name because changing it would
> cause all functions from that unit to drop the profiles. Perhaps we can
> combine function name and compilation unit (gcov file) name?

that is a good idea -- it will also solve the LTO problem you mentioned above.

Will update the patch.

David

>
> Honza
>
>>   chksum = coverage_checksum_string
>> (chksum, first_global_object_name);
>>chksum = coverage_checksum_string
>> @@ -645,7 +650,12 @@ coverage_begin_function (unsigned lineno
>>
>>/* Announce function */
>>offset = gcov_write_tag (GCOV_TAG_FUNCTION);
>> -  gcov_write_unsigned (

Re: [PATCH] Relax check against commuting XOR and ASHIFTRT in combine.c

2014-07-17 Thread Alan Lawrence
Ok, the attached tests are passing on x86_64-none-linux-gnu, aarch64-none-elf, 
arm-none-eabi, and a bunch of smaller platforms for which I've only built a 
stage 1 compiler (i.e. as far as necessary to assemble). That's with either 
change to simplify_shift_const_1.


As to the addition of "result_mode != shift_mode", or removing the whole check 
against XOR - I've now tested the latter:


bootstrapped on x86_64-none-linux-gnu, check-gcc and check-ada;
bootstrapped on arm-none-linux-gnueabihf;
bootstrapped on aarch64-none-linux-gnu;
cross-tested check-gcc on aarch64-none-elf;
cross-tested on arm-none-eabi;
(and Uros has bootstrapped on alpha and done a suite of tests, as per 
https://gcc.gnu.org/ml/gcc-testresults/2014-07/msg01236.html).


From a perspective of paranoia, I'd lean towards adding "result_mode != 
shift_mode", but for neatness removing the whole check against XOR is nicer. So 
I'd defer to the maintainers as to whether one might be preferable to the 
other...(but my unproven suspicion is that the two are equivalent, and no case 
where result_mode != shift_mode is possible!)
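
For the case where both modes are the same, the identity behind the transformation can be checked directly. The sketch below is only an illustration on 32-bit values, with helper names (ashiftrt32, commutes) made up for this example. It says nothing about the result_mode != shift_mode case that the check in combine.c was guarding.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic shift right on a 32-bit value.  Negative inputs are
   handled through their complement so the code does not rely on the
   implementation-defined behaviour of >> on negative signed ints.  */
static int32_t
ashiftrt32 (int32_t x, unsigned count)
{
  assert (count < 32);
  return x >= 0 ? x >> count : ~(~x >> count);
}

/* Return 1 when (ashiftrt (xor A C2) C1) equals
   (xor (ashiftrt A C1) (ashiftrt C2 C1)) for the given operands.  */
static int
commutes (int32_t a, int32_t c2, unsigned c1)
{
  int32_t before = ashiftrt32 (a ^ c2, c1);
  int32_t after = ashiftrt32 (a, c1) ^ ashiftrt32 (c2, c1);
  return before == after;
}
```

Every result bit of an arithmetic shift is a copy of one input bit (the sign bit for the vacated positions), so the shift distributes over XOR even when C2 has its sign bit set. With C2 = -1 this is the (ashiftrt (not A) C1) to (not (ashiftrt A C1)) case.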


--Alan

Alan Lawrence wrote:
Thanks for the suggestions! I think I've got a reasonably platform-independent 
testcase that scans the rtl dump, just trying it on a few more platforms now...


As to running on Alpha: bootstrap succeeds, and the regression testsuite doesn't 
raise any issues (https://gcc.gnu.org/ml/gcc-testresults/2014-07/msg01236.html) 
- and that's with a more aggressive patch that completely rolls back the 
original r76965:


Index: combine.c
===
--- combine.c   (revision 212523)
+++ combine.c   (working copy)
@@ -10218,9 +10218,6 @@
 if (CONST_INT_P (XEXP (varop, 1))
 /* We can't do this if we have (ashiftrt (xor))  and the
constant has its sign bit set in shift_mode.  */
- && !(code == ASHIFTRT && GET_CODE (varop) == XOR
-  && 0 > trunc_int_for_mode (INTVAL (XEXP (varop, 1)),
- shift_mode))
 && (new_rtx = simplify_const_binary_operation
 (code, result_mode,
  gen_int_mode (INTVAL (XEXP (varop, 1)), result_mode),
@@ -10237,10 +10234,7 @@
logical expression, make a new logical expression, and apply
the inverse distributive law.  This also can't be done
for some (ashiftrt (xor)).  */
- if (CONST_INT_P (XEXP (varop, 1))
-&& !(code == ASHIFTRT && GET_CODE (varop) == XOR
- && 0 > trunc_int_for_mode (INTVAL (XEXP (varop, 1)),
-shift_mode)))
+ if (CONST_INT_P (XEXP (varop, 1)))
   {
 rtx lhs = simplify_shift_const (NULL_RTX, code, shift_mode,
 XEXP (varop, 0), count);

I'm testing this version more widely but initial indications are good.

However, I've not succeeded in checking Ada on Alpha, as GCC's Ada frontend 
requires an Ada compiler to bootstrap. So I have to ask: does anyone actually 
use Ada on Alpha? (And if so, would they please be able to test the above patch?)


Moreover, I don't really see that we have much reason to believe the check against
commuting is required even for Ada/Alpha. GCC's internals have changed
substantially in the interim, with the Ada frontend no longer generating RTL
directly, as we now have intervening GENERIC/GIMPLE tree stages. Unless there is
a logical/bitwise explanation for why the commuting of ashiftrt and xor is
unsafe, is the best explanation that the Ada frontend was generating RTL that
may have looked OK at the time but we would now consider dubious, malformed, bad?


(E.g., these days I don't see how to produce an ashiftrt of one mode containing
an XOR of another without an intervening sign_extend, zero_extend or subreg.)

--Alan

Jeff Law wrote:

On 06/30/14 13:05, Alan Lawrence wrote:

combine.c includes a check which prevents

(ashiftrt (xor A C2) C1)

from being commuted to

(xor (ashiftrt A C1) (ashiftrt C2 C1))

for constants C1, C2 if C2 has its sign bit set.

Specifically, this prevents (ashiftrt (not A) C1) from being commuted to

(not (ashiftrt A C1))

because the former is rewritten to (ashiftrt (xor A -1) C1) above, with
a comment /* Make this fit the case below.  */ - which it no longer does.

If result_mode == shift_mode, I can see no reason to prevent this
normalisation (indeed, I'm not sure I can see any reason to prevent it
even if result_mode != shift_mode - but I've not managed to produce such
a case in my own testing, as there are always intervening subreg's or
sign_extend's, or to build a toolchain on which to reproduce the
original bug, so I'm being cautious). Hence this patch allows
commutation if the two modes are equal.

As an illustrative example, on AArch64, without this patch, compiling
this

Re: [PATCH, testsuite] Reliably prune GCC notes in C++ compat suite

2014-07-17 Thread Bernhard Reutner-Fischer

On 17 July 2014 02:51:14 "Ulrich Weigand"  wrote:


Hello,

in testing the rs6000 ABI patches I noted a weird effect: usually, the
-Wpsabi warning notes are ignored in the compat test suites, so we get
a clean test run anyway.

However, when running the C++ version of the struct-layout-1.exp case
*alone* (using RUNTESTFLAGS=struct-layout-1.exp), suddenly tests are
failing because of those extra notes.  This does *not* happen with
the C version of that suite ...

It turns out that pruning those notes is supposed to happen
from within gcc-defs.exp:${tool}_check_compile:

    if { [info proc ${tool}-dg-prune] != "" } {
        global target_triplet
        set gcc_output [${tool}-dg-prune $target_triplet $gcc_output]
    }

However, the g++-dg-prune routine is defined in g++-dg.exp, which
is never included from g++.dg/compat/struct-layout-1.exp (directly
or indirectly).  Now, when running the full suite, that file would
have been loaded by some earlier g++.dg .exp file, so everything
works out.  But when running struct-layout-1.exp stand-alone, the
g++-dg-prune routine is never defined and thus silently no pruning
takes place.

To fix this, the following patch simply loads g++-dg.exp directly
from g++.dg/compat/struct-layout-1.exp.

Tested on powerpc64-linux and powerpc64le-linux.

OK for mainline (and 4.8/4.9 once the rs6000 ABI patches are
backported there)?


Mike,
On a related note:
https://gcc.gnu.org/ml/fortran/2012-03/msg00094.html
The libdirs was applied to dejagnu some time ago, so once we can require 
that version (anytime now) we could apply the patch referenced above.

Also see BUG: in the URL above for another missing load_lib.

Thanks,


Bye,
Ulrich


gcc/testsuite/ChangeLog:

* g++.dg/compat/struct-layout-1.exp: Load g++-dg.exp


Index: gcc-4_9-branch/gcc/testsuite/g++.dg/compat/struct-layout-1.exp
===
--- gcc-4_9-branch.orig/gcc/testsuite/g++.dg/compat/struct-layout-1.exp
+++ gcc-4_9-branch/gcc/testsuite/g++.dg/compat/struct-layout-1.exp
@@ -89,6 +89,9 @@ proc compat-use-tst-compiler { } {
 # This must be done after the compat-use-*-compiler definitions.
 load_lib compat.exp

+# Provide the g++-dg-prune routine (gcc-dp.exp is loaded by compat.exp)
+load_lib g++-dg.exp
+
 g++_init

 # Save variables for the C++ compiler under test, which each test will

--
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com









Re: [C++ Patch/RFC] PR 50961

2014-07-17 Thread Jason Merrill

OK.

Jason


Re: [RFC, PATCH 2/n] IPA C++ refactoring

2014-07-17 Thread Jan Hubicka
> 
> Bootstrapped/regtested x86_64-linux
> 
> Ready for trunk?
> 
> gcc/ChangeLog:
> 2014-07-17  Martin Liska  
> 
>   * cgraph.h (varpool_node):
>   (availability get_availability (void)):
> created from cgraph_variable_initializer_availability
>   (inline varpool_node *ultimate_alias_target (availability *availability = NULL)
> created from: cgraph_variable_initializer_availability
>   (inline varpool_node *get_alias_target (void)): created from 
> varpool_alias_target
>   (void finalize_named_section_flags (void)):
> created from varpool_finalize_named_section_flags
>   (bool assemble_decl (void)): created from varpool_assemble_decl
>   (void analyze (void)): created from varpool_analyze_node
>   (bool call_for_node_and_aliases (bool (*callback) (varpool_node *, void *),
> void *data, bool include_overwritable)): created from
> varpool_for_node_and_aliases
>   (void remove_initializer (void)): created from varpool_remove_initializer
>   (tree get_constructor (void)): created from varpool_get_constructor
>   (bool externally_visible_p (void)): created from 
> varpool_externally_visible_p
>   (bool ctor_useable_for_folding_p (void)): created from 
> varpool_ctor_useable_for_folding_p
>   (inline bool all_refs_explicit_p ()): created from 
> varpool_all_refs_explicit_p
>   (inline bool can_remove_if_no_refs_p (void)): created from 
> varpool_can_remove_if_no_refs
>   (static inline varpool_node *get (const_tree decl)): created from 
> varpool_get_node
>   (static void finalize_decl (tree decl)): created from varpool_finalize_decl
>   (static bool output_variables (void)): created from varpool_output_variables
>   (static varpool_node * extra_name_alias (tree alias, tree decl)):
> created from varpool_extra_name_alias
>   (static varpool_node * create_alias (tree, tree)): created from 
> varpool_create_variable_alias
>   (static void dump_varpool (FILE *f)): created from dump_varpool
>   (static void DEBUG_FUNCTION debug_varpool (void)): created from 
> debug_varpool
>   (static varpool_node *create_empty (void)): created from 
> varpool_create_empty_node
>   (static varpool_node *symbol_for_decl (tree decl)): created from 
> varpool_node_for_decl
>   (static varpool_node *symbol_for_asmname (tree asmname)): created from 
> varpool_node_for_asm
>   (void assemble_aliases (void)): created from assemble_aliases
> 

> +  /* Dump given varpool node to F.  */
> +  void dump (FILE *f);
> +
> +  /* Dump given varpool node to stderr.  */
> +  void DEBUG_FUNCTION debug (void);
> +
> +  /* Remove variable from symbol table.  */
> +  void remove (void);
> +
> +  /* Return variable availability.  */
> +  availability get_availability (void);
> +
> +  /* For given variable pool node, walk the alias chain to return the function
> + the variable is alias of. Do not walk through thunks.
> + When AVAILABILITY is non-NULL, get minimal availability in the chain.  */
> +  inline varpool_node *ultimate_alias_target
> +(availability *availability = NULL);
> +
> +  /* Return node that alias is aliasing.  */
> +  inline varpool_node *get_alias_target (void);
> +
> +  /* For variables in named sections make sure get_variable_section
> + is called before we switch to those sections.  Then section
> + conflicts between read-only and read-only requiring relocations
> + sections can be resolved.  */
> +  void finalize_named_section_flags (void);

Quite special, so it should go below the other declarations.
> +
> +  /* Output one variable, if necessary.  Return whether we output it.  */
> +  bool assemble_decl (void);
> +
> +  void analyze (void);
> +
> +  /* Call calback on varpool symbol and aliases associated to varpool symbol.
> + When INCLUDE_OVERWRITABLE is false, overwritable aliases and thunks are
> + skipped. */
> +  bool call_for_node_and_aliases (bool (*callback) (varpool_node *, void *),
> +   void *data,
> +bool include_overwritable);
> +
> +  /* Remove node initializer when it is no longer needed.  */
> +  void remove_initializer (void);
> +
> +  /* When doing LTO, read variable's constructor from disk if
> + it is not already present.  */
> +  tree get_constructor (void);
> +
> +  /* Return true when variable should be considered externally visible.  */
> +  bool externally_visible_p (void);

We have a flag and this helper function to set it.  The comment should probably
mention that this is not to be used except in varpool maintenance code.
> +
> +  /* Return true if variable has constructor that can be used for folding.  */
> +  bool ctor_useable_for_folding_p (void);

Group it with get_constructor and move earlier in the list (probably basic
manipulation first, constructor next, aliases and random predicates last)
> +
> +  /* Return true when all references to variable must be visible
> + in ipa_ref_list.
> + i.e. if the variable is not externally visible or not used in some magic
> + 

Re: [Info], Add suport for PowerPC IEEE 128-bit floating point

2014-07-17 Thread Michael Meissner
On Tue, Jul 15, 2014 at 04:50:33PM -0500, Segher Boessenkool wrote:
> On Tue, Jul 15, 2014 at 05:20:31PM -0400, Michael Meissner wrote:
> > I did some timing tests to compare the new PowerPC IEEE 128-bit results to
> > the current implementation of long double using the IBM extended format.
> > 
> > The test consisted of a short loop doing the operation over arrays of 1,024
> > elements, reading in two values, doing the operation, and then storing it
> > back.  This loop in turn was done multiple times, with the idea that most of
> > the values would be in the cache, and we didn't have to worry about
> > pre-fetching, etc.
> > 
> > The float and double tests were done with vectorization disabled, while for
> > the vector float and vector double tests the compiler was allowed to do the
> > normal auto vectorization.
> > 
> > The number reported was how much longer the second column took over the
> > first:
> 
> I assume you mean the other way around?
> 
> > Generally, the __float128 is 2x slower than the current IBM extended double
> > format, except for divide, where it is 5x slower.  I must say, the software
> > floating point emulation routines worked well, and once the proper macros
> > were set up, I only needed to override the type used for IEEE 128-bit.
> > 
> > Add loop
> > 
> > 
> > float   vs double:  2.00x
> 
> Why is float twice as slow as double?

Pat re-ran the tests, and now float/double are the same speed.  Since I was
running this on a development machine, and not a dedicated machine, it was
probably just luck of the draw that somebody was doing a large build at the
time I ran the tests.
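
A loop of the kind described in the quoted text might look roughly like the sketch below. This is an assumption about the shape of the test, not the benchmark that was actually run: the names add_pass, time_add_loop and best_of_trials are invented here, and taking the minimum over several trials is just one common way to reduce the effect of other load on a shared machine.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define N_ELEMS 1024

/* One pass: read two inputs, add them, store the result.  */
static void
add_pass (const double *a, const double *b, double *out)
{
  for (size_t i = 0; i < N_ELEMS; i++)
    out[i] = a[i] + b[i];
}

/* Run REPEATS passes over the same arrays (so the data stays in cache)
   and return the elapsed CPU time in seconds.  An optimizing compiler
   may drop the redundant passes, so a real harness would need to
   prevent that.  */
static double
time_add_loop (const double *a, const double *b, double *out, int repeats)
{
  clock_t start = clock ();
  for (int r = 0; r < repeats; r++)
    add_pass (a, b, out);
  return (double) (clock () - start) / CLOCKS_PER_SEC;
}

/* Repeat the measurement TRIALS times and keep the fastest run, which
   is the one least disturbed by unrelated activity on the machine.  */
static double
best_of_trials (const double *a, const double *b, double *out,
                int repeats, int trials)
{
  assert (repeats > 0 && trials > 0);

  double best = time_add_loop (a, b, out, repeats);
  for (int t = 1; t < trials; t++)
    {
      double elapsed = time_add_loop (a, b, out, repeats);
      if (elapsed < best)
        best = elapsed;
    }
  return best;
}
```

Comparing two element types then comes down to dividing one best-of time by the other, which also makes it explicit which column is the numerator.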

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



[PATCH, rs6000, 4.9] Fix many powerpc*-linux ASAN test suite failures

2014-07-17 Thread Peter Bergner
With a recent mainline libsanitizer merge from upstream, we're now seeing a
lot of mainline ASAN test suite failures with the following error:

 ==26426==ASan runtime does not come first in initial library list; you should
 either link runtime to your application or manually preload it with LD_PRELOAD.
 FAIL: c-c++-common/asan/asan-interface-1.c   -O0  execution test

This is caused by mainline libasan detecting that libasan is not linked
first and erroring out.  With 4.8 and 4.9, we may just silently run
into problems.  The root cause is that powerpc*-linux does not define
LIBASAN_EARLY_SPEC which is defined in gnu-user.h.  It looks like all
*-linux architectures include gnu-user.h except for powerpc*-linux.
As discussed, for the 4.8 and 4.9 backports of the original patch, we
will just copy those defines to the rs6000 header files and not try and
include gnu-user.h itself.

This is slightly different than the 4.8 patch, since the STATIC_LIB[AT]SAN_LIBS
macro was deleted in 4.9.

This passed bootstrap and regtesting on powerpc64-linux with no regressions.
Ok for 4.9?

Peter

* config/rs6000/sysv4.h:

Index: gcc/config/rs6000/sysv4.h
===
--- gcc/config/rs6000/sysv4.h   (revision 212695)
+++ gcc/config/rs6000/sysv4.h   (working copy)
@@ -949,3 +949,19 @@ ncrtn.o%s"
 #define TARGET_USES_SYSV4_OPT 1
 
 #undef DBX_REGISTER_NUMBER
+
+/* Link -lasan early on the command line.  For -static-libasan, don't link
+   it for -shared link, the executable should be compiled with -static-libasan
+   in that case, and for executable link link with --{,no-}whole-archive around
+   it to force everything into the executable.  And similarly for -ltsan.  */
+#if defined(HAVE_LD_STATIC_DYNAMIC)
+#undef LIBASAN_EARLY_SPEC
+#define LIBASAN_EARLY_SPEC "%{!shared:libasan_preinit%O%s} " \
+  "%{static-libasan:%{!shared:" \
+  LD_STATIC_OPTION " --whole-archive -lasan --no-whole-archive " \
+  LD_DYNAMIC_OPTION "}}%{!static-libasan:-lasan}"
+#undef LIBTSAN_EARLY_SPEC
+#define LIBTSAN_EARLY_SPEC "%{static-libtsan:%{!shared:" \
+  LD_STATIC_OPTION " --whole-archive -ltsan --no-whole-archive " \
+  LD_DYNAMIC_OPTION "}}%{!static-libtsan:-ltsan}"
+#endif




[PATCH, rs6000, 4.8] Fix many powerpc*-linux ASAN test suite failures

2014-07-17 Thread Peter Bergner
With a recent mainline libsanitizer merge from upstream, we're now seeing a
lot of mainline ASAN test suite failures with the following error:

 ==26426==ASan runtime does not come first in initial library list; you should
 either link runtime to your application or manually preload it with LD_PRELOAD.
 FAIL: c-c++-common/asan/asan-interface-1.c   -O0  execution test

This is caused by mainline libasan detecting that libasan is not linked
first and erroring out.  With 4.8 and 4.9, we may just silently run
into problems.  The root cause is that powerpc*-linux does not define
LIBASAN_EARLY_SPEC which is defined in gnu-user.h.  It looks like all
*-linux architectures include gnu-user.h except for powerpc*-linux.
As discussed, for the 4.8 and 4.9 backports of the original patch, we
will just copy those defines to the rs6000 header files and not try and
include gnu-user.h itself.

This passed bootstrap and regtesting on powerpc64-linux with no regressions.
Ok for 4.8?

Peter


* config/rs6000/sysv4.h:

Index: gcc/config/rs6000/sysv4.h
===
--- gcc/config/rs6000/sysv4.h   (revision 212695)
+++ gcc/config/rs6000/sysv4.h   (working copy)
@@ -949,3 +949,27 @@ ncrtn.o%s"
 #define TARGET_USES_SYSV4_OPT 1
 
 #undef DBX_REGISTER_NUMBER
+
+/* Link -lasan early on the command line.  For -static-libasan, don't link
+   it for -shared link, the executable should be compiled with -static-libasan
+   in that case, and for executable link link with --{,no-}whole-archive around
+   it to force everything into the executable.  And similarly for -ltsan.  */
+#if defined(HAVE_LD_STATIC_DYNAMIC)
+#undef LIBASAN_EARLY_SPEC
+#define LIBASAN_EARLY_SPEC "%{!shared:libasan_preinit%O%s} " \
+  "%{static-libasan:%{!shared:" \
+  LD_STATIC_OPTION " --whole-archive -lasan --no-whole-archive " \
+  LD_DYNAMIC_OPTION "}}%{!static-libasan:-lasan}"
+#undef LIBTSAN_EARLY_SPEC
+#define LIBTSAN_EARLY_SPEC "%{static-libtsan:%{!shared:" \
+  LD_STATIC_OPTION " --whole-archive -ltsan --no-whole-archive " \
+  LD_DYNAMIC_OPTION "}}%{!static-libtsan:-ltsan}"
+#endif
+
+/* Additional libraries needed by -static-libasan.  */
+#undef STATIC_LIBASAN_LIBS
+#define STATIC_LIBASAN_LIBS "-ldl -lpthread"
+
+/* Additional libraries needed by -static-libtsan.  */
+#undef STATIC_LIBTSAN_LIBS
+#define STATIC_LIBTSAN_LIBS "-ldl -lpthread"




Re: [PATCH 2/5] Existing call graph infrastructure enhancement

2014-07-17 Thread Martin Liška


On 06/30/2014 08:54 PM, Jeff Law wrote:

On 06/30/14 05:49, Martin Liška wrote:


On 06/17/2014 10:00 PM, Jeff Law wrote:

On 06/13/14 04:26, mliska wrote:

Hi,
 this small patch prepares remaining needed infrastructure for
the new pass.

Changelog:

2014-06-13  Martin Liska  
Honza Hubicka  

* ipa-utils.h (polymorphic_type_binfo_p): Function marked external
instead of static.
* ipa-devirt.c (polymorphic_type_binfo_p): Likewise.
* ipa-prop.h (count_formal_params): Likewise.
* ipa-prop.c (count_formal_params): Likewise.
* ipa-utils.c (ipa_merge_profiles): Be more tolerant if we merge
profiles for semantically equivalent functions.
* passes.c (do_per_function): If we load body of a function
during WPA,
this condition should behave same.
* varpool.c (ctor_for_folding): More tolerant assert for variable
aliases created during WPA.

Presumably we don't have any useful way to merge the cases where we
have profiles for both SRC & DST in ipa_merge_profiles, or even to guess
which is more useful when presented with both?  Does it make sense to
log this into a debugging file when we drop one?

Hello,
this merge function was written by Honza.  Honza, what do you think
about this note?


I think this patch is fine.  If adding logging makes sense, then feel
free to do so and consider that trivial change pre-approved.

I made a small change to this patch, where I moved
'gsi_next_nonvirtual_phi' from the pass to gimple-iterator.h.

Ready for trunk with this change?

Yes.  I think with the exception of patch #3/5 everything looks good. I'll try 
to get another pass over #3 this week.  What I looked at last week was pretty 
good; I'm pretty confident this will be wrapped up shortly.

If #1/#2 make sense to install independent of #3, go ahead.  #4/#5 are 
obviously dependent on #3.

Jeff


Hello,
   thank you for the approval. This final version removes a few hunks that are
no longer needed. The changes are just cosmetic and I will commit the patch at
the beginning of next week.

Thanks,
Martin

diff --git a/gcc/gimple-iterator.h b/gcc/gimple-iterator.h
index 909d58b..47168b9 100644
--- a/gcc/gimple-iterator.h
+++ b/gcc/gimple-iterator.h
@@ -281,6 +281,30 @@ gsi_last_nondebug_bb (basic_block bb)
   return i;
 }
 
+/* Iterates I statement iterator to the next non-virtual statement.  */
+
+static inline void
+gsi_next_nonvirtual_phi (gimple_stmt_iterator *i)
+{
+  gimple phi;
+
+  if (gsi_end_p (*i))
+return;
+
+  phi = gsi_stmt (*i);
+  gcc_assert (phi != NULL);
+
+  while (virtual_operand_p (gimple_phi_result (phi)))
+{
+  gsi_next (i);
+
+  if (gsi_end_p (*i))
+	return;
+
+  phi = gsi_stmt (*i);
+}
+}
+
 /* Return the basic block associated with this iterator.  */
 
 static inline basic_block
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 40f696b..aecba07 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -211,7 +211,7 @@ ipa_populate_param_decls (struct cgraph_node *node,
 
 /* Return how many formal parameters FNDECL has.  */
 
-static inline int
+int
 count_formal_params (tree fndecl)
 {
   tree parm;
diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
index 8886e93..bc6249e 100644
--- a/gcc/ipa-prop.h
+++ b/gcc/ipa-prop.h
@@ -529,6 +529,7 @@ void ipa_free_all_edge_args (void);
 void ipa_free_all_structures_after_ipa_cp (void);
 void ipa_free_all_structures_after_iinln (void);
 void ipa_register_cgraph_hooks (void);
+int count_formal_params (tree fndecl);
 
 /* This function ensures the array of node param infos is big enough to
accommodate a structure for all nodes and reallocates it if not.  */
diff --git a/gcc/ipa-utils.c b/gcc/ipa-utils.c
index c191210..d58b170 100644
--- a/gcc/ipa-utils.c
+++ b/gcc/ipa-utils.c
@@ -660,13 +660,8 @@ ipa_merge_profiles (struct cgraph_node *dst,
   if (dst->tp_first_run > src->tp_first_run && src->tp_first_run)
 dst->tp_first_run = src->tp_first_run;
 
-  if (src->profile_id)
-{
-  if (!dst->profile_id)
-	dst->profile_id = src->profile_id;
-  else
-	gcc_assert (src->profile_id == dst->profile_id);
-}
+  if (src->profile_id && !dst->profile_id)
+dst->profile_id = src->profile_id;
 
   if (!dst->count)
 return;
diff --git a/gcc/passes.c b/gcc/passes.c
index 61b4c12..bae302b 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -1478,7 +1478,7 @@ do_per_function (void (*callback) (function *, void *data), void *data)
 {
   struct cgraph_node *node;
   FOR_EACH_DEFINED_FUNCTION (node)
-	if (node->analyzed && gimple_has_body_p (node->decl)
+	if (node->analyzed && (gimple_has_body_p (node->decl) && !in_lto_p)
 	&& (!node->clone_of || node->decl != node->clone_of->decl))
 	  callback (DECL_STRUCT_FUNCTION (node->decl), data);
 }
diff --git a/gcc/varpool.c b/gcc/varpool.c
index 04ce714..5662985 100644
--- a/gcc/varpool.c
+++ b/gcc/varpool.c
@@ -397,6 +397,7 @@ ctor_for_folding (tree decl)
   if (decl != real_decl)
 {
   gcc_assert (!DECL_INITIAL (decl)
+		  || (node->al

Re: [GSoC] generation of Gimple code from isl_ast_node_user

2014-07-17 Thread Tobias Grosser

On 17/07/2014 16:08, Roman Gareev wrote:

I see. Could you use vec_safe_grow_cleared(iv_map, loop_num) instead?
>This probably shows better that you zero-initialize the vector.

If I am not mistaken, vec_safe_grow_cleared has the following declaration:

vec_safe_grow_cleared (vec *&v, unsigned len CXX_MEM_STAT_INFO)

Should we rewrite all the functions, which interact with iv_map?


Can you explain why all functions would need to be rewritten? I proposed
this function as an easier way to NULL initialize the vector and did not 
expect any rewrite to be necessary.


If there is no such thing, please just add a comment that your loop NULL 
initializes the vector. We can later improve this.
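
The difference between growing a vector and then clearing it in a separate loop, versus a single "grow cleared" operation, can be sketched in plain C. This is only an analogy: struct ptr_vec and ptr_vec_grow_cleared are hypothetical names for this example and are not GCC's vec API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical minimal vector of pointers.  */
struct ptr_vec
{
  void **data;
  size_t len;
};

/* Grow V to NEW_LEN elements and set every newly added slot to NULL,
   so callers never see uninitialized entries.  Return 0 on success and
   -1 if the allocation fails (V is left unchanged in that case).  */
static int
ptr_vec_grow_cleared (struct ptr_vec *v, size_t new_len)
{
  assert (v != NULL && new_len >= v->len);
  if (new_len == v->len)
    return 0;

  void **p = realloc (v->data, new_len * sizeof *p);
  if (p == NULL)
    return -1;

  for (size_t i = v->len; i < new_len; i++)
    p[i] = NULL;

  v->data = p;
  v->len = new_len;
  return 0;
}
```

Folding the clearing into the growth step makes the zero-initialization visible at the call site, which is the readability point made above. A separate loop with a comment achieves the same result.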



I've added test cases, which produce the following ISL code:

isl-ast-gen-single-loop-1.c

for (int c1 = 0; c1 <= 49; c1 += 1)
   S_3(c1);

isl-ast-gen-single-loop-2.c
for (int c1 = 0; c1 <= -n.0 + 69; c1 += 1)
   S_5(c1);

isl-ast-gen-single-loop-3.c
for (int c1 = 0; c1 < n.0; c1 += 1)
   S_5(c1);

The second and the third one use arrays. I wanted to make them similar
to the first one, but inability to handle blocks prevented this. For
example,



OK. The tests look good.

Cheers,
Tobias


Re: [GSoC] Addition of ISL AST generation to Graphite

2014-07-17 Thread Tobias Grosser

On 17/07/2014 16:11, Roman Gareev wrote:

I've attached the patch, which adds the requirement for isl 0.12.
Tobias, is it important to accept only 0.12.1, 0.12.2 and forbid 0.12?


I am not aware of any problems with isl 0.12 and would be surprised if 
such problems exist. Are you?
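
On the question of 0.12 versus 0.12.1 and 0.12.2: the configure test in this thread compares the library's version string against a prefix with strncmp, so a check for "isl-0.12" accepts any string that starts with it. The sketch below isolates that comparison. version_has_prefix is a name made up for this example, and the version strings in the usage note are assumed formats rather than values taken from isl itself.

```c
#include <assert.h>
#include <string.h>

/* Return 1 when VERSION starts with PREFIX, mirroring the
   strncmp (..., strlen (prefix)) comparison used by the configure
   check.  */
static int
version_has_prefix (const char *version, const char *prefix)
{
  assert (version != NULL && prefix != NULL);
  return strncmp (version, prefix, strlen (prefix)) == 0;
}
```

Under that assumption, version_has_prefix ("isl-0.12.2", "isl-0.12") and version_has_prefix ("isl-0.12", "isl-0.12") both succeed, while version_has_prefix ("isl-0.11.1", "isl-0.12") fails. Restricting the check to specific point releases would need a longer prefix or an exact comparison.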


The patch itself looks good. As it is trivial, fixing an annoying 
bootstrapping bug, and people agreed that this is the right direction, I 
propose that you commit it right ahead.


Further reviews are still welcome.

Cheers,
Tobias

P.S: As Richard suggested, we may also want to forbid CLooG 0.17.


Re: [GSoC] Addition of ISL AST generation to Graphite

2014-07-17 Thread Roman Gareev
I've attached the patch, which adds the requirement for isl 0.12.
Tobias, is it important to accept only 0.12.1, 0.12.2 and forbid 0.12?

--
   Cheers, Roman Gareev
2014-07-12  Roman Gareev  

* configure.ac: Don't accept isl 0.11.
* configure: Regenerate.
Index: configure
===
--- configure   (revision 212756)
+++ configure   (working copy)
@@ -5898,54 +5898,6 @@
 LDFLAGS="${_isl_saved_LDFLAGS} ${isllibs}"
 LIBS="${_isl_saved_LIBS} -lisl"
 
-{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for version 0.11 of ISL" >&5
-$as_echo_n "checking for version 0.11 of ISL... " >&6; }
-if test "$cross_compiling" = yes; then :
-  gcc_cv_isl=yes
-else
-  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
-/* end confdefs.h.  */
-#include 
-   #include 
-int
-main ()
-{
-if (strncmp (isl_version (), "isl-0.11", strlen ("isl-0.11")) != 0)
- return 1;
-
-  ;
-  return 0;
-}
-_ACEOF
-if ac_fn_c_try_run "$LINENO"; then :
-  gcc_cv_isl=yes
-else
-  gcc_cv_isl=no
-fi
-rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \
-  conftest.$ac_objext conftest.beam conftest.$ac_ext
-fi
-
-{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_isl" >&5
-$as_echo "$gcc_cv_isl" >&6; }
-
-CFLAGS=$_isl_saved_CFLAGS
-LDFLAGS=$_isl_saved_LDFLAGS
-LIBS=$_isl_saved_LIBS
-  fi
-
-
-  if test "${gcc_cv_isl}" = no ; then
-
-  if test "${ENABLE_ISL_CHECK}" = yes ; then
-_isl_saved_CFLAGS=$CFLAGS
-_isl_saved_LDFLAGS=$LDFLAGS
-_isl_saved_LIBS=$LIBS
-
-CFLAGS="${_isl_saved_CFLAGS} ${islinc} ${gmpinc}"
-LDFLAGS="${_isl_saved_LDFLAGS} ${isllibs}"
-LIBS="${_isl_saved_LIBS} -lisl"
-
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for version 0.12 of ISL" >&5
 $as_echo_n "checking for version 0.12 of ISL... " >&6; }
 if test "$cross_compiling" = yes; then :
@@ -5983,7 +5935,6 @@
   fi
 
 
-  fi
 
 
 
Index: configure.ac
===
--- configure.ac(revision 212756)
+++ configure.ac(working copy)
@@ -1650,10 +1650,7 @@
   dnl with user input.
   ISL_INIT_FLAGS
   dnl The versions of ISL that work for Graphite
-  ISL_CHECK_VERSION(0,11)
-  if test "${gcc_cv_isl}" = no ; then
-ISL_CHECK_VERSION(0,12)
-  fi
+  ISL_CHECK_VERSION(0,12)
   dnl Only execute fail-action, if ISL has been requested.
   ISL_IF_FAILED([
 AC_MSG_ERROR([Unable to find a usable ISL.  See config.log for details.])])


Re: [GSoC] generation of Gimple code from isl_ast_node_user

2014-07-17 Thread Roman Gareev
> I see. Could you use vec_safe_grow_cleared(iv_map, loop_num) instead?
> This shows probably better that you zero initialize the vector.

If I am not mistaken, vec_safe_grow_cleared has the following declaration:

vec_safe_grow_cleared (vec *&v, unsigned len CXX_MEM_STAT_INFO)

Should we rewrite all the functions, which interact with iv_map?

I've added test cases, which produce the following ISL code:

isl-ast-gen-single-loop-1.c

for (int c1 = 0; c1 <= 49; c1 += 1)
  S_3(c1);

isl-ast-gen-single-loop-2.c
for (int c1 = 0; c1 <= -n.0 + 69; c1 += 1)
  S_5(c1);

isl-ast-gen-single-loop-3.c
for (int c1 = 0; c1 < n.0; c1 += 1)
  S_5(c1);

The second and the third one use arrays. I wanted to make them similar
to the first one, but inability to handle blocks prevented this. For
example,

/* { dg-do run } */
/* { dg-options "-O2 -fgraphite-identity -fgraphite-code-generator=isl" } */

int n = 25;

int
foo ()
{
  int i, res;

  for (i = n, res = 0; i < 50; i++)
    res += i;

  return res;
}

extern void abort ();

int
main (void)
{
  int res = foo ();

  if (res != 1225)
    abort ();

  return 0;
}

produces the following code:

{
  S_6();
  for (int c1 = 0; c1 <= -i + 49; c1 += 1)
    S_4(c1);
}
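
As a quick sanity check of the constant in the test above, here is a minimal
stand-alone sketch (the helper names are hypothetical) that evaluates the
same sum with a loop and with a closed form. With n = 25 the loop adds 25
through 49, which gives 925; 1225 is the value for a start of 0, so the
expected constant or the initial value of n may need a second look:

```c
#include <assert.h>

/* Sum of first, first + 1, ..., limit - 1, using the same loop shape as
   foo () in the test case.  */
static int
sum_range (int first, int limit)
{
  int res = 0;
  for (int i = first; i < limit; i++)
    res += i;
  return res;
}

/* Closed form for the same sum: (number of terms) * (first + last) / 2.
   Only meaningful when FIRST <= LIMIT.  */
static int
sum_range_closed (int first, int limit)
{
  assert (first <= limit);
  return (limit - first) * (first + limit - 1) / 2;
}
```

Both helpers agree: a start of 25 and a limit of 50 yield 925, and a start
of 0 with the same limit yields 1225.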
2014-07-12  Roman Gareev  

gcc/
* graphite-isl-ast-to-gimple.c:
Add inclusion of gimple-ssa.h, tree-into-ssa.h.
(ivs_params_clear):
(build_iv_mapping): New function.
(translate_isl_ast_node_user): Likewise.
(translate_isl_ast): Add calling of translate_isl_ast_node_user.

gcc/testsuite/gcc.dg/graphite/
* isl-ast-gen-single-loop-1.c: New testcase.
* isl-ast-gen-single-loop-2.c: New testcase.
* isl-ast-gen-single-loop-3.c: New testcase.
Index: gcc/graphite-isl-ast-to-gimple.c
===
--- gcc/graphite-isl-ast-to-gimple.c(revision 212756)
+++ gcc/graphite-isl-ast-to-gimple.c(working copy)
@@ -51,6 +51,8 @@
 #include "sese.h"
 #include "tree-ssa-loop-manip.h"
 #include "tree-scalar-evolution.h"
+#include "gimple-ssa.h"
+#include "tree-into-ssa.h"
 #include 
 
 #ifdef HAVE_cloog
@@ -541,6 +543,72 @@
   return last_e;
 }
 
+/* Inserts in iv_map a tuple (OLD_LOOP->num, NEW_NAME) for the induction
+   variables of the loops around GBB in SESE.
+ 
+   FIXME: Instead of using a vec that maps each loop id to a possible
+   chrec, we could consider using a map that maps loop ids to the
+   corresponding tree expressions.  */
+
+static void
+build_iv_mapping (vec iv_map, gimple_bb_p gbb,
+ __isl_keep isl_ast_expr *user_expr, ivs_params &ip,
+ sese region)
+{
+  gcc_assert (isl_ast_expr_get_type (user_expr) == isl_ast_expr_op &&
+  isl_ast_expr_get_op_type (user_expr) == isl_ast_op_call);
+  int i;
+  isl_ast_expr *arg_expr;
+  for (i = 1; i < isl_ast_expr_get_op_n_arg (user_expr); i++)
+{
+  arg_expr = isl_ast_expr_get_op_arg (user_expr, i);
+  tree type = *graphite_expression_size_type;
+  tree t = gcc_expression_from_isl_expression (type, arg_expr, ip);
+  loop_p old_loop = gbb_loop_at_index (gbb, region, i - 1);
+  iv_map[old_loop->num] = t;
+}
+
+}
+
+/* Translates an isl_ast_node_user to Gimple. */
+
+static edge
+translate_isl_ast_node_user (__isl_keep isl_ast_node *node,
+edge next_e, ivs_params &ip)
+{
+  gcc_assert (isl_ast_node_get_type (node) == isl_ast_node_user);
+  isl_ast_expr *user_expr = isl_ast_node_user_get_expr (node);
+  isl_ast_expr *name_expr = isl_ast_expr_get_op_arg (user_expr, 0);
+  gcc_assert (isl_ast_expr_get_type (name_expr) == isl_ast_expr_id);
+  isl_id *name_id = isl_ast_expr_get_id (name_expr);
+  poly_bb_p pbb = (poly_bb_p) isl_id_get_user (name_id);
+  gcc_assert (pbb);
+  gimple_bb_p gbb = PBB_BLACK_BOX (pbb);
+  vec iv_map;
+  isl_ast_expr_free (name_expr);
+  isl_id_free (name_id);
+
+  gcc_assert (GBB_BB (gbb) != ENTRY_BLOCK_PTR_FOR_FN (cfun) &&
+ "The entry block should not even appear within a scop");
+
+  loop_p loop = gbb_loop (gbb);
+  iv_map.create (loop->num + 1);
+  int i;
+  for (i = 0; i < loop->num + 1; i++)
+iv_map.quick_push (NULL_TREE);
+
+  build_iv_mapping (iv_map, gbb, user_expr, ip, SCOP_REGION (pbb->scop));
+  isl_ast_expr_free (user_expr);
+  next_e = copy_bb_and_scalar_dependences (GBB_BB (gbb),
+  SCOP_REGION (pbb->scop), next_e,
+  iv_map,
+  &graphite_regenerate_error);
+  iv_map.release ();
+  mark_virtual_operands_for_renaming (cfun);
+  update_ssa (TODO_update_ssa);
+  return next_e;
+}
+
 /* Translates an ISL AST node NODE to GCC representation in the
context of a SESE.  */
 
@@ -561,7 +629,7 @@
   return next_e;
 
 case isl_ast_node_user:
-  return next_e;
+  return translate_isl_ast_node_user (node, next_e, ip);
 
 case isl_ast_node_block:
   return

[PATCH] Fix PR61823

2014-07-17 Thread Richard Biener

This fixes PR61823; we now need to use the proper varpool API
to access DECL_INITIAL of globals.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2014-07-17  Richard Biener  

PR ipa/61823
* tree-ssa-structalias.c (create_variable_info_for_1):
Use varpool_get_constructor.
(create_variable_info_for): Likewise.

* gcc.dg/torture/pr43879_1.c: Add -fno-inline.

Index: gcc/tree-ssa-structalias.c
===
--- gcc/tree-ssa-structalias.c  (revision 212737)
+++ gcc/tree-ssa-structalias.c  (working copy)
@@ -5671,7 +5671,7 @@ create_variable_info_for_1 (tree decl, c
 in IPA mode.  Else we'd have to parse arbitrary initializers.  */
   && !(in_ipa_mode
   && is_global_var (decl)
-  && DECL_INITIAL (decl)))
+  && varpool_get_constructor (varpool_get_node (decl))))
 {
   fieldoff_s *fo = NULL;
   bool notokay = false;
@@ -5797,13 +5797,13 @@ create_variable_info_for (tree decl, con
 
  /* If this is a global variable with an initializer and we are in
 IPA mode generate constraints for it.  */
- if (DECL_INITIAL (decl)
+ if (varpool_get_constructor (vnode)
  && vnode->definition)
{
  auto_vec rhsc;
  struct constraint_expr lhs, *rhsp;
  unsigned i;
- get_constraint_for_rhs (DECL_INITIAL (decl), &rhsc);
+ get_constraint_for_rhs (varpool_get_constructor (vnode), &rhsc);
  lhs.var = vi->id;
  lhs.offset = 0;
  lhs.type = SCALAR;
Index: gcc/testsuite/gcc.dg/torture/pr43879_1.c
===
--- gcc/testsuite/gcc.dg/torture/pr43879_1.c(revision 212737)
+++ gcc/testsuite/gcc.dg/torture/pr43879_1.c(working copy)
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-fipa-pta" } */
+/* { dg-options "-fipa-pta -fno-inline" } */
 /* { dg-additional-sources "pr43879_2.c" } */
 
 void bar(int c)


Re: C++ PATCH for c++/61687 (extra errors with -O2)

2014-07-17 Thread Jan Hubicka
> My earlier patch for 61659 caused more virtual functions to be
> instantiated when -fdevirtualize is on, leading to additional errors
> appearing at higher optimization levels.  This patch shifts that
> instantiation to a new flag, -fuse-all-virtuals, which is on by
> default, and adds an explanatory note to help people understand why
> their code is breaking and how they can work around it if needed.
> 
> Tested x86_64-pc-linux-gnu, applying to trunk.

> commit 82187fb06863161765a270f8ba00bdbf975b3af2
> Author: Jason Merrill 
> Date:   Mon Jul 7 03:15:02 2014 -0700
> 
>   PR c++/61659
>   PR c++/61687
> gcc/c-family/
>   * c.opt (-fuse-all-virtuals): New.
> gcc/cp/
>   * decl2.c (mark_all_virtuals): New variable.
>   (maybe_emit_vtables): Check it instead of flag_devirtualize.
>   (cp_write_global_declarations): Set it and give helpful diagnostic
>   if it introduces errors.
>   * class.c (finish_struct_1): Check it.
>   * decl.c (grokdeclarator): Clear virtualp after 'virtual auto' error.

Given my experience with the number of functions that become reachable when
you stream all virtuals into LTO, I wonder whether we want to use
possible_polymorphic_call_targets within the front end to avoid
instantiating those that can't be called.
I think it should not be too hard: all we need is to populate the type
inheritance graph from the FE and then, for each polymorphic call, produce
the list of possible targets and mark them as reachable.

Honza
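
The idea of walking a type inheritance graph to find the possible targets of
a polymorphic call can be sketched on a toy model. This is only an
illustration under simplifying assumptions (single inheritance, one virtual
function f, a fixed number of types); the struct and function names are
hypothetical and this is not how possible_polymorphic_call_targets is
implemented in GCC:

```c
#include <assert.h>
#include <stdbool.h>

#define N_TYPES 4
#define NO_TYPE (-1)

/* Toy single-inheritance hierarchy: parent[t] is the base of type t (or
   NO_TYPE for a root), defines_f[t] says whether t has its own body of
   the virtual function f.  */
struct hierarchy
{
  int parent[N_TYPES];
  bool defines_f[N_TYPES];
};

/* Return true if type T is BASE itself or is derived from BASE.  */
static bool
derives_from (const struct hierarchy *h, int t, int base)
{
  for (; t != NO_TYPE; t = h->parent[t])
    if (t == base)
      return true;
  return false;
}

/* Return the type whose body of f runs for an object of dynamic type T:
   the nearest ancestor (including T) that defines f, or NO_TYPE.  */
static int
final_overrider (const struct hierarchy *h, int t)
{
  for (; t != NO_TYPE; t = h->parent[t])
    if (h->defines_f[t])
      return t;
  return NO_TYPE;
}

/* For a call to f through static type STATIC_TYPE, set reachable[d] for
   every type d whose body of f may be the call target.  */
static void
mark_possible_targets (const struct hierarchy *h, int static_type,
                       bool reachable[N_TYPES])
{
  assert (static_type >= 0 && static_type < N_TYPES);
  for (int t = 0; t < N_TYPES; t++)
    if (derives_from (h, t, static_type))
      {
        int target = final_overrider (h, t);
        if (target != NO_TYPE)
          reachable[target] = true;
      }
}
```

For a hierarchy A, B : A, C : B, D : A in which A, B and D define f, a call
through B only marks B's body as reachable, so in this model the bodies in A
and D would not need to be instantiated for that call.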
> 
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index faef774..3a2084f 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -1268,6 +1268,10 @@ funsigned-char
>  C ObjC C++ ObjC++ LTO Var(flag_signed_char, 0)
>  Make \"char\" unsigned by default
>  
> +fuse-all-virtuals
> +C++ ObjC++ Var(flag_use_all_virtuals) Init(1)
> +Treat all virtual functions as odr-used
> +
>  fuse-cxa-atexit
>  C++ ObjC++ Var(flag_use_cxa_atexit) Init(DEFAULT_USE_CXA_ATEXIT)
>  Use __cxa_atexit to register destructors
> diff --git a/gcc/cp/class.c b/gcc/cp/class.c
> index 3a44dba..d0eb103 100644
> --- a/gcc/cp/class.c
> +++ b/gcc/cp/class.c
> @@ -6408,7 +6408,7 @@ finish_struct_1 (tree t)
>in every translation unit where the class definition appears.  If
>we're devirtualizing, we can look into the vtable even if we
>aren't emitting it.  */
> -  if (CLASSTYPE_KEY_METHOD (t) == NULL_TREE || flag_devirtualize)
> +  if (CLASSTYPE_KEY_METHOD (t) == NULL_TREE || flag_use_all_virtuals)
>   keyed_classes = tree_cons (NULL_TREE, t, keyed_classes);
>  }
>  
> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
> index 1ade586..01d74e3 100644
> --- a/gcc/cp/decl.c
> +++ b/gcc/cp/decl.c
> @@ -9631,8 +9631,11 @@ grokdeclarator (const cp_declarator *declarator,
>   "-std=gnu++1y");
> }
>   else if (virtualp)
> -   error ("virtual function cannot "
> -  "have deduced return type");
> +   {
> + error ("virtual function cannot "
> +"have deduced return type");
> + virtualp = false;
> +   }
> }
>   else if (!is_auto (type))
> {
> diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
> index 98897f4..0926dbc 100644
> --- a/gcc/cp/decl2.c
> +++ b/gcc/cp/decl2.c
> @@ -106,6 +106,11 @@ static GTY(()) vec *no_linkage_decls;
>  /* Nonzero if we're done parsing and into end-of-file activities.  */
>  
>  int at_eof;
> +
> +/* Nonzero if we've instantiated everything used directly, and now want to
> +   mark all virtual functions as used so that they are available for
> +   devirtualization.  */
> +static int mark_all_virtuals;
>  
>  
>  /* Return a member function type (a METHOD_TYPE), given FNTYPE (a
> @@ -2009,7 +2014,7 @@ maybe_emit_vtables (tree ctype)
>if (DECL_COMDAT (primary_vtbl)
> && CLASSTYPE_DEBUG_REQUESTED (ctype))
>   note_debug_info_needed (ctype);
> -  if (flag_devirtualize)
> +  if (mark_all_virtuals)
>   /* Make sure virtual functions get instantiated/synthesized so that
>  they can be inlined after devirtualization even if the vtable is
>  never emitted.  */
> @@ -4340,6 +4345,8 @@ cp_write_global_declarations (void)
>   instantiated, etc., etc.  */
>  
>emit_support_tinfos ();
> +  int errs = errorcount + sorrycount;
> +  bool explained_devirt = false;
>  
>do
>  {
> @@ -4572,6 +4579,27 @@ cp_write_global_declarations (void)
>pending_statics->length ()))
>   reconsider = true;
>  
> +  if (flag_use_all_virtuals)
> + {
> +   if (!reconsider && !mark_all_virtuals)
> + {
> +   mark_all_virtuals = true;
> +   reconsider = true;
> +   errs = errorcount + sorrycount;
> + }

Re: [PING][PATCH] Fix for PR 61561

2014-07-17 Thread Marat Zakirov


On 07/16/2014 01:32 PM, Kyrill Tkachov wrote:


> On 16/07/14 10:22, Marat Zakirov wrote:
>> Christophe,
>>
>> Please look at a new patch.  Draft tests are OK.
>> I'll ask your commit approval when full regression (ARM/thumb1/thumb2)
>> tests are done.
>
> Hi Marat,
>
> I was about to propose the thumb2.md hunk myself, but I'll defer to
> the arm maintainers to comment on the other parts.
>
> Also, in the ChangeLog it is helpful to specify which patterns are
> being affected, so in your case it would be something like:
>
> * config/arm/thumb1.md (*thumb1_movhi_insn): Handle stack pointer.
> (*thumb1_movqi_insn): Likewise.
> * config/arm/thumb2.md (*thumb2_movhi_insn): Ditto.
>
> Kyrill



Christophe, Kirill,

finally I've finished regression testing.
Please check if my patch is OK for trunk.

The following configure options were used:

configure --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu 
--target=arm-linux-gnueabi  --with-interwork --enable-long-long 
--enable-languages=c,c++,fortran --enable-shared --with-gnu-as 
--with-gnu-ld --with-arch=$ARCH --with-mode=$MODE


Thumb-1

$ARCH=armv4t
$MODE=thumb

Thumb-2

$ARCH=armv7
$MODE=thumb

ARM

$ARCH=armv7-a
$MODE=arm

No regressions detected, test pr61561.c passed in all cases.

Thank you all.
--Marat
gcc/ChangeLog:

2014-07-16  Marat Zakirov  

	* config/arm/thumb1.md (*thumb1_movhi_insn): Handle stack pointer.
	(*thumb1_movqi_insn): Likewise.
	* config/arm/thumb2.md (*thumb2_movhi_insn): Likewise.

diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index c044fd5..47b5cbd 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -708,7 +708,7 @@
 
 (define_insn "*thumb1_movhi_insn"
   [(set (match_operand:HI 0 "nonimmediate_operand" "=l,l,m,*r,*h,l")
-	(match_operand:HI 1 "general_operand"   "l,m,l,*h,*r,I"))]
+	(match_operand:HI 1 "general_operand"   "lk,m,l,*h,*r,I"))]
   "TARGET_THUMB1
&& (   register_operand (operands[0], HImode)
|| register_operand (operands[1], HImode))"
@@ -762,7 +762,7 @@
 
 (define_insn "*thumb1_movqi_insn"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=l,l,m,*r,*h,l")
-	(match_operand:QI 1 "general_operand"  "l, m,l,*h,*r,I"))]
+	(match_operand:QI 1 "general_operand"  "lk, m,l,*h,*r,I"))]
   "TARGET_THUMB1
&& (   register_operand (operands[0], QImode)
|| register_operand (operands[1], QImode))"
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 6ea0810..7228069 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -318,7 +318,7 @@
 ;; of the messiness associated with the ARM patterns.
 (define_insn "*thumb2_movhi_insn"
   [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,l,r,m,r")
-	(match_operand:HI 1 "general_operand"  "r,I,Py,n,r,m"))]
+	(match_operand:HI 1 "general_operand"  "rk,I,Py,n,r,m"))]
   "TARGET_THUMB2
   && (register_operand (operands[0], HImode)
  || register_operand (operands[1], HImode))"


Re: [patch 0/2] gcc re-arch status

2014-07-17 Thread Richard Biener
On Wed, Jul 16, 2014 at 2:03 AM, Andrew MacLeod  wrote:
> I don't have the time I'd like to since I'm running about a month behind
> schedule,  but I wanted to make the code I've been experimenting with
> available a bit before cauldron so we can discuss it both here and in the
> upcoming BOF.
>
> I created a quick overview document, and managed to keep it to 5 pages :-)
> Therefore it is far from complete, but it touches all the major things I
> think.  It is here:
> https://gcc.gnu.org/wiki/rearch?action=AttachFile&do=view&target=gccrestructureplan3.0.pdf
>
> This code is all located on the branch 're-arch2'.  It's based off mainline
> revision 211758 dated June 17/2014, so it's relatively modern (after wide-int
> went in). It builds, bootstraps and passes all regressions.
>
> I'll send the code changes as 2 patches, directly from the branch. Neither
> are intended to be proposed for mainline as is, so you don't need to
> critique them... I'm not looking for a formal review, I know it's a mess..
> and there is no changelog.  It's really to give you a feel for what it is
> and how it works.  Before this code is ready to go in, I'll also be
> re-porting whatever optimizations/files are converted since there are a
> couple of generations of changes, some of which are probably obsolete now.
> I've thrown  a few comments in, but I primarily want to discuss the general
> approach and future direction.
>
> The first patch is the new code.. The gimple classes and the wrapper classes
> that act as smart pointers to the gimple objects, as well as some interface
> stuff to tree.[ch] routines gimple needs at the moment.
>
> The second patch is the rest of the changes which represent a selection of
> .h and .c files I converted, plus some unfortunate ripple effects required
> in other .c and .h files :-P  (more on that later).  I fully converted:
> tree-into-ssa.[ch]   <- seemed like a good test
> gimple-expr.[ch] <- lots of gimple utilities
> tree-ssa-ccp.c<- lot of constants and expressions
> ssa-iterators.h<- core file used by a lot of optimizations
> most of gimple.h<- Wanted access to gimple statements to use gimple
> classes
>
> I'm travelling Wednesday night, expect to be jetlagged Thurs and have
> meetings Friday, but I will try to respond here when I can. I'll be around
> throughout cauldron and at the re-arch2 BOF if anyone wants to tackle me to
> discuss any of this... good, bad, or indifferent :-).

Just to mention - the "regimplification removal" and a "gimple-building
facility" is provided on the match-and-simplify branch worked on by
me and Prathamesh (a GSoC student this year).  I'll present about
this during the Cauldron with the title "Unifying GENERIC and GIMPLE
folding with a pattern description".  It falls under the folding umbrella
as the important feature passes get from using fold + re-gimplification
is expression simplification.

Another related topic (well, not so much maybe as you are focused
on removing references to "tree"s from GIMPLE) is (language dependent)
debug information for types and decls, esp. in the context of LTO
(but also in the context of information we need to retain during the
GIMPLE/RTL phases).  Recently on IRC we concluded that the simplest
approach to start tackling this is to emit (aka run the dwarf2out machinery)
debug info for decls and types early (at least before LTO streaming).
To reference DIEs created there we annotate decls which we can
later complete (function and variable definitions) with hidden symbols
we need to remember for those entities (and stream them via LTO).
At LTO LTRANS time we build a DWARF translation unit importing
referenced units with decls/types from compile-time and complete
functions and variables, referring to the compile-time decl/type dwarf
via those symbols.  And we of course link this "early" debug info
at link time.  A similar scheme could be used even without LTO
(emit the early debug info into the asm early and later emit another blob of
debug info referring to it).  Volunteers to prototype that welcome ;) (hah)
It would basically allow us to build a completely separate representation
for types (the "real" types, not 'tree') and decls on GIMPLE (yeah, really
no trees!), and Frontends like GFortran could emit this GIMPLE directly,
skipping GENERIC, if it knows to emit debug info itself.

Richard.

> Andrew


Re: [Patch] Fix UBSan tests for arm-linux-gnueabi.

2014-07-17 Thread Maxim Ostapenko


On 07/17/2014 02:31 PM, Jakub Jelinek wrote:

> On Thu, Jul 17, 2014 at 02:12:23PM +0400, Maxim Ostapenko wrote:
>> Thanks, fixed. Ok now?
>>
>> -Maxim
>
>> 2014-07-17  Max Ostapenko  
>>
>> 	* c-c++-common/ubsan/bounds-2.c: Change output pattern.
>> 	* c-c++-common/ubsan/bounds-5.c: Likewise.
>> 	* c-c++-common/ubsan/bounds-6.c: Likewise.
>> 	* c-c++-common/ubsan/bounds-7.c: Likewise.
>> 	* c-c++-common/ubsan/float-cast-overflow-1.c: Likewise.
>> 	* c-c++-common/ubsan/float-cast-overflow-10.c: Likewise.
>> 	* c-c++-common/ubsan/float-cast-overflow-2.c: Likewise.
>> 	* c-c++-common/ubsan/float-cast-overflow-3.c: Likewise.
>> 	* c-c++-common/ubsan/float-cast-overflow-4.c: Likewise.
>> 	* c-c++-common/ubsan/float-cast-overflow-5.c: Likewise.
>> 	* c-c++-common/ubsan/float-cast-overflow-6.c: Likewise.
>> 	* c-c++-common/ubsan/float-cast-overflow-8.c: Likewise.
>> 	* gcc.dg/ubsan/float-cast-overflow-bf.c: Likewise.
>> 	* g++.dg/ubsan/float-cast-overflow-bf.C: Likewise.
>
> Ok, thanks.
>
> 	Jakub



Thanks, committed in r212755.

-Maxim.


[patch] Fix typo in extend.texi

2014-07-17 Thread Jonathan Wakely

Committed as obvious.

commit 713d72fb657827ced70636147fd1c7217891f4f7
Author: Jonathan Wakely 
Date:   Thu Jul 17 11:43:21 2014 +0100

	* doc/extend.texi (Template Instantiation): Remove stray parenthesis.

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index a47b316..53fab8d 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -18158,7 +18158,7 @@ syntax to support instantiation of the compiler support data for a
 template class (i.e.@: the vtable) without instantiating any of its
 members (with @code{inline}), and instantiation of only the static data
 members of a template class, without the support data or member
-functions (with (@code{static}):
+functions (with @code{static}):
 
 @smallexample
 extern template int max (int, int);


Re: [Patch] Fix UBSan tests for arm-linux-gnueabi.

2014-07-17 Thread Jakub Jelinek
On Thu, Jul 17, 2014 at 02:12:23PM +0400, Maxim Ostapenko wrote:
> Thanks, fixed. Ok now?
> 
> -Maxim

> 2014-07-17  Max Ostapenko  
> 
>   * c-c++-common/ubsan/bounds-2.c: Change output pattern.
>   * c-c++-common/ubsan/bounds-5.c: Likewise.
>   * c-c++-common/ubsan/bounds-6.c: Likewise.
>   * c-c++-common/ubsan/bounds-7.c: Likewise.
>   * c-c++-common/ubsan/float-cast-overflow-1.c: Likewise.
>   * c-c++-common/ubsan/float-cast-overflow-10.c: Likewise.
>   * c-c++-common/ubsan/float-cast-overflow-2.c: Likewise.
>   * c-c++-common/ubsan/float-cast-overflow-3.c: Likewise.
>   * c-c++-common/ubsan/float-cast-overflow-4.c: Likewise.
>   * c-c++-common/ubsan/float-cast-overflow-5.c: Likewise.
>   * c-c++-common/ubsan/float-cast-overflow-6.c: Likewise.
>   * c-c++-common/ubsan/float-cast-overflow-8.c: Likewise.
>   * gcc.dg/ubsan/float-cast-overflow-bf.c: Likewise.
>   * g++.dg/ubsan/float-cast-overflow-bf.C: Likewise.

Ok, thanks.

Jakub


Re: [Patch] Fix UBSan tests for arm-linux-gnueabi.

2014-07-17 Thread Maxim Ostapenko


On 07/17/2014 01:49 PM, Jakub Jelinek wrote:

> On Thu, Jul 17, 2014 at 01:45:42PM +0400, Maxim Ostapenko wrote:
>> I see regressions in some UBSan output pattern tests (bounds-2.c and some
>> others) on arm-linux-gnueabi if we use ssh to deliver executables to remote
>> target. After some investigation, I discovered that DejaGnu deletes
>> trailing '\n' symbol from output message in this case. My patch removes
>> trailing newlines from these tests.
>>
>> Regtested on x86_64-unknown-linux-gnu and arm-linux-gnueabi.
>>
>> Ok to commit?
>
> Keeping 'int \\\[5\\\]'\[^\n\r]*" } */
> etc. doesn't make much sense, I'd also remove the * stuff, so
> 'int \\\[5\\\]'" } */
>
> 	Jakub


Thanks, fixed. Ok now?

-Maxim
2014-07-17  Max Ostapenko  

	* c-c++-common/ubsan/bounds-2.c: Change output pattern.
	* c-c++-common/ubsan/bounds-5.c: Likewise.
	* c-c++-common/ubsan/bounds-6.c: Likewise.
	* c-c++-common/ubsan/bounds-7.c: Likewise.
	* c-c++-common/ubsan/float-cast-overflow-1.c: Likewise.
	* c-c++-common/ubsan/float-cast-overflow-10.c: Likewise.
	* c-c++-common/ubsan/float-cast-overflow-2.c: Likewise.
	* c-c++-common/ubsan/float-cast-overflow-3.c: Likewise.
	* c-c++-common/ubsan/float-cast-overflow-4.c: Likewise.
	* c-c++-common/ubsan/float-cast-overflow-5.c: Likewise.
	* c-c++-common/ubsan/float-cast-overflow-6.c: Likewise.
	* c-c++-common/ubsan/float-cast-overflow-8.c: Likewise.
	* gcc.dg/ubsan/float-cast-overflow-bf.c: Likewise.
	* g++.dg/ubsan/float-cast-overflow-bf.C: Likewise.

diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-2.c b/gcc/testsuite/c-c++-common/ubsan/bounds-2.c
index 7ef71aa..812e0db 100644
--- a/gcc/testsuite/c-c++-common/ubsan/bounds-2.c
+++ b/gcc/testsuite/c-c++-common/ubsan/bounds-2.c
@@ -162,4 +162,4 @@ main (void)
 /* { dg-output "\[^\n\r]*index 10 out of bounds for type 'int \\\[10\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*index 10 out of bounds for type 'long int \\\*\\\[10\\\]\\\[5\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*index 5 out of bounds for type 'char \\\*\\\*\\\*\\\[5\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*index 5 out of bounds for type 'int \\\[5\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 5 out of bounds for type 'int \\\[5\\\]'" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-5.c b/gcc/testsuite/c-c++-common/ubsan/bounds-5.c
index 9698529..c82287a 100644
--- a/gcc/testsuite/c-c++-common/ubsan/bounds-5.c
+++ b/gcc/testsuite/c-c++-common/ubsan/bounds-5.c
@@ -110,4 +110,4 @@ main (void)
 /* { dg-output "\[^\n\r]*index 1 out of bounds for type 'char \\\[1\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*index 1 out of bounds for type 'char \\\[1\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*index 1 out of bounds for type 'char \\\[1\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*index 1 out of bounds for type 'char \\\[1\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 1 out of bounds for type 'char \\\[1\\\]'" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-6.c b/gcc/testsuite/c-c++-common/ubsan/bounds-6.c
index 78ad29a..e7d15d5 100644
--- a/gcc/testsuite/c-c++-common/ubsan/bounds-6.c
+++ b/gcc/testsuite/c-c++-common/ubsan/bounds-6.c
@@ -34,4 +34,4 @@ main (void)
 /* { dg-output "\[^\n\r]*index 4 out of bounds for type 'int \\\[4\\\]\\\[4\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*index 4 out of bounds for type 'U \\\[4\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*index 4 out of bounds for type 'U \\\[4\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*index 11 out of bounds for type 'int \\\[10\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 11 out of bounds for type 'int \\\[10\\\]'" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-7.c b/gcc/testsuite/c-c++-common/ubsan/bounds-7.c
index 2d7cc3b..e08a3fd 100644
--- a/gcc/testsuite/c-c++-common/ubsan/bounds-7.c
+++ b/gcc/testsuite/c-c++-common/ubsan/bounds-7.c
@@ -51,4 +51,4 @@ main (void)
 /* { dg-output "index -1 out of bounds for type 'int \\\[7\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*index -1 out of bounds for type 'int \\\[7\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*index -1 out of bounds for type 'int \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*index -1 out of bounds for type 'int \\\[10\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index -1 out of bounds for type 'int \\\[10\\\]'" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-1.c b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-1.c
index f94588e..cd6941c 100644
--- a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-1.c
+++ b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-1.c
@@ -201,4 +201,4 @@ main (void)
 /* { dg-output "\[^\n\r]*value nan is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value -?nan is outside t

Re: [AArch64] Implement some vca*_f[32,64] intrinsics

2014-07-17 Thread Marcus Shawcroft
On 10 July 2014 08:56, Kyrill Tkachov  wrote:
>
> On 02/07/14 08:59, Christophe Lyon wrote:
>>
>> Hi,
>>
>> It seems some of the scan-assembler directives fail:
>>
>> http://cbuild.validation.linaro.org/build/cross-validation/gcc/trunk/212196/aarch64-none-elf/diff-gcc-rh50-aarch64-none-elf-default-default-default.txt
>>
>> Christophe.
>
>
> This patch should fix the tests by marking the appropriate variable
> volatile.
> Sorry for the delay and breakage.
> Ok for trunk?

OK /Marcus


Re: [PATCH][AArch64] Handle fcvta[su] and frint in RTX cost function

2014-07-17 Thread Marcus Shawcroft
On 10 July 2014 08:52, Kyrill Tkachov  wrote:
> Hi all,
>
> This patch adds rtx costing code for the round-to-integer instructions in
> AArch64.
> Some of them are implemented as UNSPECs, so I've added the appropriate case
> to the big switch that can be
> expanded upon to handle other unspecs in the future.
> Tested aarch64-none-elf and bootstrapped on aarch64-none-linux-gnu.
>
> Ok for trunk?
>
> Thanks,
> Kyrill
>
>
> 2014-07-10  Kyrylo Tkachov  
>
> * config/aarch64/aarch64.c (aarch64_frint_unspec_p): New function.
> (aarch64_rtx_costs): Handle FIX, UNSIGNED_FIX, UNSPEC.

OK Thanks /Marcus


Re: [PATCH][AArch64] Implement vfma_f64, vmla_f64, vfms_f64, vmls_f64 intrinsics

2014-07-17 Thread Marcus Shawcroft
On 20 June 2014 15:17, Kyrill Tkachov  wrote:
> Hi all,
>
> Now that Alan fixed the float64x1_t machinery, this patch implements some
> low-hanging intrinsics
> in arm_neon.h.
>
> Tested aarch64-none-elf and bootstrapped on aarch64-linux.
>
> Ok for trunk?
>
> Thanks,
> Kyrill
>
> 2014-06-20  Kyrylo Tkachov  
>
> * config/aarch64/arm_neon.h (vfma_f64): New intrinsic.
> (vmla_f64): Likewise.
> (vfms_f64): Likewise.
> (vmls_f64): Likewise.
>
> 2014-06-20  Kyrylo Tkachov  
>
> * gcc.target/aarch64/simd/vfma_f64.c: New test.
> * gcc.target/aarch64/simd/vmla_f64.c: Likewise.
> * gcc.target/aarch64/simd/vfms_f64.c: Likewise.
> * gcc.target/aarch64/simd/vmls_f64.c: Likewise.

+  fprintf (stderr, "Expected: %lf, got %lf\n", expected, actual);

Drop these expected output messages from the test cases and the
associated #include of stdio.h, please.

Cheers
/Marcus


Re: [PATCH][AArch64] Fix argument types for some high_lane* intrinsics implemented in assembly

2014-07-17 Thread Marcus Shawcroft
On 9 July 2014 15:37, Kyrill Tkachov  wrote:
> Hi all,
>
> These intrinsics are implemented as macros that map down to asms but the
> types they accept are inconsistent with the ACLE spec. This patch fixes
> them, although they should be reimplemented properly in C in the future.
>
> This is a bugfix and it applies cleanly to trunk, 4.9 and 4.8.
> I know we're close to the 4.9.1 release, but this is not an ABI-breaking
> change so it's the aarch64 maintainers' call on whether it should be
> backported.
>
> Tested aarch64-none-elf
>
> Ok?
>
> Thanks,
> Kyrill
>
> 2014-07-09  Kyrylo Tkachov  
>
> * config/aarch64/arm_neon.h (vmlal_high_lane_s16): Fix type.
> (vmlal_high_lane_s32): Likewise.
> (vmlal_high_lane_u16): Likewise.
> (vmlal_high_lane_u32): Likewise.
> (vmlsl_high_lane_s16): Likewise.
> (vmlsl_high_lane_s32): Likewise.
> (vmlsl_high_lane_u16): Likewise.
> (vmlsl_high_lane_u32): Likewise.

OK thanks.

/Marcus


Re: [Patch] Fix UBSan tests for arm-linux-gnueabi.

2014-07-17 Thread Jakub Jelinek
On Thu, Jul 17, 2014 at 01:45:42PM +0400, Maxim Ostapenko wrote:
> I see regressions in some UBSan output pattern tests (bounds-2.c and some
> others) on arm-linux-gnueabi if we use ssh to deliver executables to remote
> target. After some investigation, I discovered that DejaGnu deletes the
> trailing '\n' character from the output message in this case. My patch
> removes trailing newlines from these tests.
> 
> Regtested on x86_64-unknown-linux-gnu and arm-linux-gnueabi.
> 
> Ok to commit?

Keeping 'int \\\[5\\\]'\[^\n\r]*" } */
etc. doesn't make much sense, I'd also remove the * stuff, so
'int \\\[5\\\]'" } */

Jakub
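
The effect described in this thread can be reproduced outside the testsuite. The sketch below is an illustration only: it uses POSIX regcomp/regexec as a stand-in for the Tcl regular expressions that DejaGnu actually applies, and output_matches and the two pattern variables are hypothetical names. It shows that a pattern requiring a line terminator stops matching once the final newline has been stripped, while a pattern that ends at the closing quote matches either way.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if the POSIX extended regex PATTERN matches somewhere in
   TEXT, 0 otherwise.  */
static int
output_matches (const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  rc = regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB);
  assert (rc == 0);
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}

/* Old-style pattern: demands a line terminator after the message.  */
static const char with_newline[]
  = "index 5 out of bounds for type 'int \\[5\\]'[^\n\r]*(\n|\r\n|\r)";

/* Pattern in the suggested form: stops at the closing quote.  */
static const char without_newline[]
  = "index 5 out of bounds for type 'int \\[5\\]'";
```

With the message text followed by "\n", both patterns match. With the newline removed, only without_newline still matches.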


[linaro/gcc-4_9-branch] Merge from gcc-4_9-branch and backports

2014-07-17 Thread Yvan Roux
Hi all,

we have merged the gcc-4_9-branch into linaro/gcc-4_9-branch up to
revision 212419 as r212661.  We have also backported this set of revisions:

r209794 as r212697 : PR c/60114
r209797 as r212675 : [ARM] Wrap long literals in HOST_WIDE_INT_C in
aarch-common.c
r209858 as r212697 : [ARM/AArch64] Use signed chars in gcc.dg/pr60114.c.
r209940 as r212665 : Add execution + assembler tests of AArch64 UZP Intrinsics.
r209943 as r212665 : Rewrite AArch64 UZP Intrinsics using __builtin_shuffle.
r209947 as r212665 : Add execution tests of ARM UZP Intrinsics.
r210148 as r212698 : Add execution + assembler tests of AArch64 TRN Intrinsics.
r210151 as r212698 : Reimplement AArch64 TRN intrinsics with __builtin_shuffle.
r210152 as r212677 : Add execution + assembler tests of AArch64 EXT intrinsics.
r210153 as r212715 : Add execution + assembler tests of AArch64 REV
Neon Intrinsics.
r210216 as r212714 : Neon intrinsics TLC - Replace intrinsics with GNU
C implementations.
r210218 as r212714 : Neon intrinsics TLC - remove dead code.
r210219 as r212714 : Neon intrinsics TLC - remove ML
r210355 as r212669 : Implement HARD_REGNO_CALLER_SAVE_MODE for AArch64
r210369 as r212678 : [ARM] Remove vzip, vuzp, vtrn builtins and cleanup
r210422 as r212698 : Add execution tests of ARM TRN Intrinsics.
r210471 as r212679 : [ARM][cleanup] Use enum name instead of integer
value for PARAM_SCHED_PRESSURE_ALGORITHM.
r210828 as r212672 : TARGET_ATOMIC_ASSIGN_EXPAND_FENV AArch64
r210861 as r212695 : [AARCH64] Support tail indirect function call.
r210967 as r212680 : [ARM] Vectorise bswap* in aarch32.
r210996 as r212681 : [AArch64] Fix stack protector for ILP32
r211050 as r212682 : [AArch32] Fix PR/61331
r211058 as r212677 : Detect EXT patterns to vec_perm_const, use for
EXT intrinsics
r211059 as r212677 : Add execution tests of ARM EXT intrinsics
r211073 as r212683 : [ARM] Use mov_imm type for movw operations consistently
r211103 as r212672 : TARGET_ATOMIC_ASSIGN_EXPAND_FENV ARM
r211129 as r212685 : Fix PR target/61154
r211148 as r212673 : ILP32 dynamic linker
r211174 as r212715 : Recognize shuffle patterns for REV instructions
on AArch64, rewrite intrinsics.
r211177 as r212677 : Detect EXT patterns to vec_perm_const, use for
EXT intrinsics.
r211185 as r212690 : [PATCH AArch64 1/2] Correct signedness of
builtins, remove casts from arm_neon.h
r211186 as r212690 : AArch64 2/2] Correct signedness of builtins,
remove casts from arm_neon.h
r211268 as r212686 : [AArch64] clarify stack layout diagram
r211314 as r212691 : [AArch64] Implement movmem for the benefit of inline memcpy
r211371 as r212687 : Remove XFmode from ARM backend.
r211408 as r212689 : [AArch64] Fix REG_CFA_RESTORE mode.
r211416 as r212689 : [AArch64] Fix layout of frame layout code.
r211418 as r212688 : [AArch64] Fix some reg-to-reg move scheduler types.
r211440 as r212720 : [AArch64] Implement CRC32 ACLE intrinsics
r211441 as r212720 : [AArch64] Add CRC32 ACLE intrinsics testsuite.
r211771 as r212696 : [genattrtab] Fix memory corruption, allocate
enough memory for all bypassed reservations
r211887 as r212722 : [AArch64] Implement ADD in vector registers for
32-bit scalar values.
r211899 as r212722 : [AArch64] Implement ADD in vector registers for
32-bit scalar values.

This will be part of our 2014.07 release.

Thanks,
Yvan


[Patch] Fix UBSan tests for arm-linux-gnueabi.

2014-07-17 Thread Maxim Ostapenko

Hi,

I see regressions in some UBSan output pattern tests (bounds-2.c and 
some others) on arm-linux-gnueabi if we use ssh to deliver executables 
to remote target. After some investigation, I discovered that DejaGnu
deletes the trailing '\n' character from the output message in this case.
My patch removes trailing newlines from these tests.


Regtested on x86_64-unknown-linux-gnu and arm-linux-gnueabi.

Ok to commit?

-Maxim
gcc/testsuite/ChangeLog:

2014-07-16  Max Ostapenko  

	* c-c++-common/ubsan/bounds-2.c: Change output pattern.
	* c-c++-common/ubsan/bounds-5.c: Likewise.
	* c-c++-common/ubsan/bounds-6.c: Likewise.
	* c-c++-common/ubsan/bounds-7.c: Likewise.
	* c-c++-common/ubsan/float-cast-overflow-1.c: Likewise.
	* c-c++-common/ubsan/float-cast-overflow-10.c: Likewise.
	* c-c++-common/ubsan/float-cast-overflow-2.c: Likewise.
	* c-c++-common/ubsan/float-cast-overflow-3.c: Likewise.
	* c-c++-common/ubsan/float-cast-overflow-4.c: Likewise.
	* c-c++-common/ubsan/float-cast-overflow-5.c: Likewise.
	* c-c++-common/ubsan/float-cast-overflow-6.c: Likewise.
	* c-c++-common/ubsan/float-cast-overflow-8.c: Likewise.
	* gcc.dg/ubsan/float-cast-overflow-bf.c: Likewise.
	* g++.dg/ubsan/float-cast-overflow-bf.C: Likewise.

diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-2.c b/gcc/testsuite/c-c++-common/ubsan/bounds-2.c
index 7ef71aa..a4e77f9 100644
--- a/gcc/testsuite/c-c++-common/ubsan/bounds-2.c
+++ b/gcc/testsuite/c-c++-common/ubsan/bounds-2.c
@@ -162,4 +162,4 @@ main (void)
 /* { dg-output "\[^\n\r]*index 10 out of bounds for type 'int \\\[10\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*index 10 out of bounds for type 'long int \\\*\\\[10\\\]\\\[5\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*index 5 out of bounds for type 'char \\\*\\\*\\\*\\\[5\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*index 5 out of bounds for type 'int \\\[5\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 5 out of bounds for type 'int \\\[5\\\]'\[^\n\r]*" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-5.c b/gcc/testsuite/c-c++-common/ubsan/bounds-5.c
index 9698529..5c1de99 100644
--- a/gcc/testsuite/c-c++-common/ubsan/bounds-5.c
+++ b/gcc/testsuite/c-c++-common/ubsan/bounds-5.c
@@ -110,4 +110,4 @@ main (void)
 /* { dg-output "\[^\n\r]*index 1 out of bounds for type 'char \\\[1\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*index 1 out of bounds for type 'char \\\[1\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*index 1 out of bounds for type 'char \\\[1\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*index 1 out of bounds for type 'char \\\[1\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 1 out of bounds for type 'char \\\[1\\\]'\[^\n\r]*" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-6.c b/gcc/testsuite/c-c++-common/ubsan/bounds-6.c
index 78ad29a..5bbcfc8 100644
--- a/gcc/testsuite/c-c++-common/ubsan/bounds-6.c
+++ b/gcc/testsuite/c-c++-common/ubsan/bounds-6.c
@@ -34,4 +34,4 @@ main (void)
 /* { dg-output "\[^\n\r]*index 4 out of bounds for type 'int \\\[4\\\]\\\[4\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*index 4 out of bounds for type 'U \\\[4\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*index 4 out of bounds for type 'U \\\[4\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*index 11 out of bounds for type 'int \\\[10\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 11 out of bounds for type 'int \\\[10\\\]'\[^\n\r]*" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-7.c b/gcc/testsuite/c-c++-common/ubsan/bounds-7.c
index 2d7cc3b..05cbc1db 100644
--- a/gcc/testsuite/c-c++-common/ubsan/bounds-7.c
+++ b/gcc/testsuite/c-c++-common/ubsan/bounds-7.c
@@ -51,4 +51,4 @@ main (void)
 /* { dg-output "index -1 out of bounds for type 'int \\\[7\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*index -1 out of bounds for type 'int \\\[7\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*index -1 out of bounds for type 'int \\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*index -1 out of bounds for type 'int \\\[10\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index -1 out of bounds for type 'int \\\[10\\\]'\[^\n\r]*" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-1.c b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-1.c
index f94588e..96757c6 100644
--- a/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-1.c
+++ b/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-1.c
@@ -201,4 +201,4 @@ main (void)
 /* { dg-output "\[^\n\r]*value nan is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value -?nan is outside the range of representable values of type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*value inf is outside the range of representable values of type 'long long unsigned in

Re: [C++ Patch/RFC] PR 50961

2014-07-17 Thread Paolo Carlini

Hi,

On 07/17/2014 02:40 AM, Jason Merrill wrote:
> On 07/16/2014 12:39 AM, Paolo Carlini wrote:
>> In practice, both for the original testcase and for a conditional (and
>> in more cases, eg conditional expressions), what happens is that
>> perform_implicit_conversion_flags is called, which, when
>> implicit_conversion fails, calls instantiate_type (and then
>> resolve_address_of_overloaded_function) only to get a good error
>> message. Thus, would it make sense to use resolve_nondeduced_context in
>> perform_implicit_conversion_flags itself?!? Conservatively, only when
>> the target type is a boolean_type_node, maybe?
> How about in standard_conversion in the type_unknown_p case after
> we've checked for pointer-to(-member)-function?

Ah, I noticed that place a few days ago, when I didn't know about
resolve_nondeduced_context, then I completely forgot about it...


The below passes testing.

Thanks!
Paolo.


Index: cp/call.c
===
--- cp/call.c   (revision 212742)
+++ cp/call.c   (working copy)
@@ -1107,14 +1107,22 @@ standard_conversion (tree to, tree from, tree expr
   to = strip_top_quals (to);
   from = strip_top_quals (from);
 
-  if ((TYPE_PTRFN_P (to) || TYPE_PTRMEMFUNC_P (to))
-  && expr && type_unknown_p (expr))
+  if (expr && type_unknown_p (expr))
 {
-  tsubst_flags_t tflags = tf_conv;
-  expr = instantiate_type (to, expr, tflags);
-  if (expr == error_mark_node)
-   return NULL;
-  from = TREE_TYPE (expr);
+  if (TYPE_PTRFN_P (to) || TYPE_PTRMEMFUNC_P (to))
+   {
+ tsubst_flags_t tflags = tf_conv;
+ expr = instantiate_type (to, expr, tflags);
+ if (expr == error_mark_node)
+   return NULL;
+ from = TREE_TYPE (expr);
+   }
+  else if (TREE_CODE (to) == BOOLEAN_TYPE)
+   {
+ /* Necessary for eg, TEMPLATE_ID_EXPRs (c++/50961).  */
+ expr = resolve_nondeduced_context (expr);
+ from = TREE_TYPE (expr);
+   }
 }
 
   fcode = TREE_CODE (from);
Index: testsuite/g++.dg/template/operator13.C
===
--- testsuite/g++.dg/template/operator13.C  (revision 0)
+++ testsuite/g++.dg/template/operator13.C  (working copy)
@@ -0,0 +1,12 @@
+// PR c++/50961
+
+template < class > void foo ();
+
+bool b1 = !foo;
+bool b2 = foo ? true : false;
+
+void bar()
+{
+  if (foo)
+;
+}


[PATCH 3/3]Improve induction variable elimination

2014-07-17 Thread Bin Cheng
Hi,
Function iv_elimination_compare_lt is used to eliminate an induction variable
when the loop's latch could run zero times (i.e., may_be_zero in the loop
niter information evaluates to true).  As stated in the first message, it
only handles a very specific case that rarely happens for either GCC
bootstrap or spec2k/spec2k6 compilation.  The function has two restrictions
which could be relaxed:
  a) When checking that the candidate iv doesn't overflow, it only handles
candidates that are computed in a type that guarantees no overflow.  More
complex analysis can be used to prove non-overflow, as in this patch.
  b) The function only handles the original form of may_be_zero, like
"a + 1 > b", but that expression could have been folded into other forms.
This patch handles three folded forms and does iv elimination for them as
well.  I don't think this is much of a corner case, because for many loops
iterating from "0" (i.e., we have "a == 0"), the expression will be folded.
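
To make point b) concrete, here is a small sketch of one way such folding can look. It is an assumption for illustration, not a list of the three forms the patch recognizes: for unsigned operands with a known to be 0, "a + 1 > b" reduces to "b == 0". The function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Original form of the may_be_zero condition quoted in the message.  */
static int
may_be_zero_original (unsigned int a, unsigned int b)
{
  return a + 1 > b;
}

/* Assumed folded form once a is known to be 0: 0 + 1 > b holds
   exactly when b == 0.  */
static int
may_be_zero_folded (unsigned int b)
{
  return b == 0;
}

/* Usage example: both forms agree for a == 0.  */
static void
example_checks (void)
{
  assert (may_be_zero_original (0, 0) == may_be_zero_folded (0));
  assert (may_be_zero_original (0, 1) == may_be_zero_folded (1));
  assert (may_be_zero_original (0, UINT_MAX) == may_be_zero_folded (UINT_MAX));
}
```

A matcher that only looks for the literal "a + 1 > b" shape would miss the second function even though it encodes the same condition.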

I also refactored period check from may_eliminate_iv into a single function
so that it can be reused.
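
A plain-integer sketch of the two pieces involved, assuming a 32-bit unsigned iv. nowrap_period restates the formula from iv_nowrap_period in the patch, (max - base + step - 1) / step, and period_greater_niter mirrors the constant-niter comparison in period_greater_niter_exit. The C function names and the fixed 32-bit width are illustrative choices, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Number of steps a 32-bit unsigned iv starting at BASE with step STEP
   can take before wrapping: (max - base + step - 1) / step.  The
   arithmetic is done in 64 bits so the sum itself cannot wrap.  */
static uint64_t
nowrap_period (uint32_t base, uint32_t step)
{
  assert (step != 0);
  uint64_t span = (uint64_t) UINT32_MAX - base + step - 1;
  return span / step;
}

/* Constant-niter case of the period check: when the use comes after
   the increment, niter must be strictly below the period; otherwise
   niter may equal the period.  */
static int
period_greater_niter (uint64_t period, uint64_t niter, int after_increment)
{
  return after_increment ? niter < period : niter <= period;
}
```

For example, an iv starting at 4294967290 with step 2 takes the values 4294967290, 4294967292 and 4294967294 before wrapping, and nowrap_period returns 3.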

Thanks,
bin


2014-07-17  Bin Cheng  

* tree-ssa-loop-ivopts.c (iv_nowrap_period)
(nowrap_cand_for_loop_niter_p): New functions.
(period_greater_niter_exit): New function refactored from
may_eliminate_iv.
(iv_elimination_compare_lt): New parameter.  Check wrapping
behavior for candidate of wrapping type.  Handle folded forms
of may_be_zero expression.
(may_eliminate_iv): Call period_greater_niter_exit.  Pass new
argument for iv_elimination_compare_lt.

gcc/testsuite/ChangeLog
2014-07-17  Bin Cheng  

* gcc.dg/tree-ssa/ivopts-lt-3.c: New test.
* gcc.dg/tree-ssa/ivopts-lt-4.c: New test.

Index: gcc/tree-ssa-loop-ivopts.c
===
--- gcc/tree-ssa-loop-ivopts.c  (revision 212387)
+++ gcc/tree-ssa-loop-ivopts.c  (working copy)
@@ -4432,6 +4432,44 @@ iv_period (struct iv *iv)
   return period;
 }
 
+/* Returns no wrapping period of induction variable IV.  For now
+   only unsigned type IV is handled, we could extend it in case
+   of non-overflow for signed ones.  Return zero if it can't be
+   decided.  */
+
+static tree
+iv_nowrap_period (struct iv *iv)
+{
+  bool overflow;
+  tree type;
+  tree base = iv->base, step = iv->step;
+  widest_int base_val, step_val, max_val, span, period;
+
+  gcc_assert (step && TREE_CODE (step) == INTEGER_CST);
+
+  type = TREE_TYPE (base);
+  if (!TYPE_UNSIGNED (type) || TREE_CODE (base) != INTEGER_CST)
+return integer_zero_node;
+
+  base_val = wi::to_widest (base);
+  step_val = wi::to_widest (step);
+  if (!POINTER_TYPE_P (type) && TYPE_MAX_VALUE (type)
+  && TREE_CODE (TYPE_MAX_VALUE (type)) == INTEGER_CST)
+max_val = wi::to_widest (TYPE_MAX_VALUE (type));
+  else
+{
+  wide_int max_wi = wi::max_value (TYPE_PRECISION (type), UNSIGNED);
+  max_val = wi::to_widest (wide_int_to_tree (type, max_wi));
+}
+
+  span = max_val - base_val + step_val - 1;
+  period = wi::div_trunc (span, step_val, UNSIGNED, &overflow);
+  if (overflow)
+return integer_zero_node;
+
+  return wide_int_to_tree (type, period);
+}
+
 /* Returns the comparison operator used when eliminating the iv USE.  */
 
 static enum tree_code
@@ -4560,7 +4598,84 @@ difference_cannot_overflow_p (tree base, tree offs
 }
 }
 
-/* Tries to replace loop exit by one formulated in terms of a LT_EXPR
+/* Check whether PERIOD of CAND is greater than the number of iterations
+   described by DESC for which the exit condition is true.  The exit
+   condition is comparison against USE.  */
+
+static bool
+period_greater_niter_exit (struct ivopts_data *data,
+  struct iv_use *use, struct iv_cand *cand,
+  tree period, struct tree_niter_desc *desc)
+{
+  struct loop *loop = data->current_loop;
+
+  /* If the number of iterations is constant, compare against it directly.  */
+  if (TREE_CODE (desc->niter) == INTEGER_CST)
+{
+  /* See cand_value_at.  */
+  if (stmt_after_increment (loop, cand, use->stmt))
+{
+  if (!tree_int_cst_lt (desc->niter, period))
+return false;
+}
+  else
+{
+  if (tree_int_cst_lt (period, desc->niter))
+return false;
+}
+}
+
+  /* If not, and if this is the only possible exit of the loop, see whether
+ we can get a conservative estimate on the number of iterations of the
+ entire loop and compare against that instead.  */
+  else
+{
+  widest_int period_value, max_niter;
+
+  max_niter = desc->max;
+  if (stmt_after_increment (loop, cand, use->stmt))
+max_niter += 1;
+  period_value = wi::to_widest (period);
+  if (wi::gtu_p (max_niter, period_value))
+{
+  /* See if we can take advantage of inferred loop bound information.  */
+  if (data->loop_single_exit_p)
+ 

[PATCH 2/3]Improve induction variable elimination

2014-07-17 Thread Bin Cheng
Hi,
As quoted from the function difference_cannot_overflow_p,

  /* TODO: deeper inspection may be necessary to prove the equality.  */
  switch (code)
{
case PLUS_EXPR:
  return expr_equal_p (e1, offset) || expr_equal_p (e2, offset);
case POINTER_PLUS_EXPR:
  return expr_equal_p (e2, offset);

default:
  return false;
}

The overflow check can be improved by using deeper inspection to prove the
equality.  This patch deals with that by making below two improvements:
  a) Handles constant cases.
  b) Uses affine expansion as deeper inspection to check the equality.

As a result, functions strip_wrap_conserving_type_conversions and
expr_equal_p can be removed now.  A test case is also added to illustrate iv
elimination opportunity captured by this patch.
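
For the constant case in a), the question is simply whether BASE - OFFSET is representable. A minimal sketch for plain int operands follows, written with comparisons only so the check itself cannot overflow. This is an illustration of the idea and not the wide_int code from the patch; the function name is reused for readability only.

```c
#include <assert.h>
#include <limits.h>

/* Return 1 if BASE - OFFSET is representable in int, 0 if the
   subtraction would overflow.  Only comparisons against adjusted
   limits are evaluated, so no overflowing arithmetic is performed.  */
static int
difference_cannot_overflow (int base, int offset)
{
  if (offset > 0)
    return base >= INT_MIN + offset;
  if (offset < 0)
    return base <= INT_MAX + offset;
  return 1;
}

/* Usage example.  */
static void
example_checks (void)
{
  assert (difference_cannot_overflow (10, 3));
  assert (!difference_cannot_overflow (INT_MIN, 1));
  assert (!difference_cannot_overflow (INT_MAX, -1));
}
```

Note the polarity: the predicate is true when no overflow can happen, which is the opposite of an "overflow occurred" flag.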

Thanks,
bin


2014-07-17  Bin Cheng  

* tree-ssa-loop-ivopts.c (ivopts_data): New field name_expansion.
(tree_ssa_iv_optimize_init): Initialize name_expansion.
(tree_ssa_iv_optimize_finalize): Free name_expansion.
(strip_wrap_conserving_type_conversions, expr_equal_p): Delete.
(difference_cannot_overflow_p): New parameter.  Handle constant
cases.  Use affine expansion for equality check.
(iv_elimination_compare_lt): Pass new argument.

gcc/testsuite/ChangeLog
2014-07-17  Bin Cheng  

* gcc.dg/tree-ssa/ivopts-lt-2.c: New test.
Index: gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt-2.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt-2.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt-2.c (revision 0)
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ivopts" } */
+
+void
+f1 (int *p, unsigned int i)
+{
+  p += i;
+  do
+{
+  *p = 0;
+  p += 1;
+  i++;
+}
+  while (i < 100);
+}
+
+/* { dg-final { scan-tree-dump-times "PHI" 1 "ivopts" } } */
+/* { dg-final { scan-tree-dump-times "PHI iv_candidates.create (20);
   data->inv_expr_tab = new hash_table (10);
   data->inv_expr_id = 0;
+  data->name_expansion = NULL;
   decl_rtl_to_reset.create (20);
 }
 
@@ -4449,76 +4453,40 @@ iv_elimination_compare (struct ivopts_data *data,
   return (exit->flags & EDGE_TRUE_VALUE ? EQ_EXPR : NE_EXPR);
 }
 
-static tree
-strip_wrap_conserving_type_conversions (tree exp)
-{
-  while (tree_ssa_useless_type_conversion (exp)
-&& (nowrap_type_p (TREE_TYPE (exp))
-== nowrap_type_p (TREE_TYPE (TREE_OPERAND (exp, 0)
-exp = TREE_OPERAND (exp, 0);
-  return exp;
-}
+/* Returns true if we can prove that BASE - OFFSET does not overflow.  For now,
+   we only check either the case BASE and OFFSET are integer constants, or the
+   situation that BASE = SOMETHING + OFFSET, where the calculation is performed
+   in non-wrapping type.  For the latter case, we use affine expansion for
+   further equality check.
 
-/* Walk the SSA form and check whether E == WHAT.  Fairly simplistic, we
-   check for an exact match.  */
+   TODO: More generally, we could test for the situation that
+BASE = SOMETHING + OFFSET' and OFFSET is between OFFSET' and zero.
+This would require knowing the sign of OFFSET.  */
 
 static bool
-expr_equal_p (tree e, tree what)
+difference_cannot_overflow_p (struct ivopts_data *data, tree base, tree offset)
 {
-  gimple stmt;
   enum tree_code code;
+  tree e1, e2;
+  aff_tree aff_e1, aff_e2, aff_offset;
 
-  e = strip_wrap_conserving_type_conversions (e);
-  what = strip_wrap_conserving_type_conversions (what);
-
-  code = TREE_CODE (what);
-  if (TREE_TYPE (e) != TREE_TYPE (what))
-return false;
-
-  if (operand_equal_p (e, what, 0))
+  /* No overflow if offset is zero.  */
+  if (offset == integer_zero_node)
 return true;
 
-  if (TREE_CODE (e) != SSA_NAME)
-return false;
-
-  stmt = SSA_NAME_DEF_STMT (e);
-  if (gimple_code (stmt) != GIMPLE_ASSIGN
-  || gimple_assign_rhs_code (stmt) != code)
-return false;
-
-  switch (get_gimple_rhs_class (code))
+  /* Overflow can be checked easily for constant values.  */
+  if (TREE_CODE (base) == INTEGER_CST && TREE_CODE (offset) == INTEGER_CST)
 {
-case GIMPLE_BINARY_RHS:
-  if (!expr_equal_p (gimple_assign_rhs2 (stmt), TREE_OPERAND (what, 1)))
-   return false;
-  /* Fallthru.  */
+  bool overflow = false;
+  tree type = TREE_TYPE (base);
+  signop sign = TYPE_SIGN (type);
 
-case GIMPLE_UNARY_RHS:
-case GIMPLE_SINGLE_RHS:
-  return expr_equal_p (gimple_assign_rhs1 (stmt), TREE_OPERAND (what, 0));
-default:
-  return false;
+  wide_int arg2 = wide_int::from (offset, TYPE_PRECISION (type),
+ TYPE_SIGN (TREE_TYPE (offset)));
+  (void) wi::sub (base, arg2, sign, &overflow);
+  return !overflow;
 }
-}
 
-/* Returns true if we can prove that BASE - OFFSET does not overflow.  For now,
-   we only detect the situation that BASE = SOMETHING + OFFSET, where the
-   calculatio

[PATCH 1/3]Improve induction variable elimination

2014-07-17 Thread Bin Cheng
Hi,
This is a series of three patches improving induction variable elimination.
Currently GCC only eliminates iv for very specific case when the loop’s
latch could run zero times, i.e., when may_be_zero field of loop niter
information evaluates to true.  In fact, it’s so specific that
iv_elimination_compare_lt rarely succeeds during either GCC bootstrap or
spec2000/spec2006 compilation.  Though intrusive data shows these patches
don’t help iv elimination that much for GCC bootstrap, they do capture
5%~15% more eliminations for compiling spec2000/2006.  Detailed numbers are
like:
                  2k/int       2k/fp       2k6/int       2k6/fp
improve ~9.6%      ~4.8%      ~5.5%        ~14.4%

All patches pass bootstrap and regression test on x86_64/x86.  I will
bootstrap and test them on aarch64/arm platforms too.

The first patch turns to tree operand_equal_p to check the number of
iterations in iv_elimination_lt.  Though I think this change isn’t necessary
for current code, it’s needed if we further relax iv elimination for cases
in which sign/unsigned conversion is involved.
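
As a hand-written illustration of what this kind of elimination achieves (not compiler output, and with hypothetical function names), the sketch below shows a loop with a pointer iv and a counter iv, plus an equivalent loop in which the counter is gone and the exit test is a less-than comparison on the pointer. The loop shape follows the f1 test case from patch 2/3. same_effect compares the two on a scratch buffer.

```c
#include <assert.h>
#include <string.h>

#define LIMIT 100u
#define BUF_LEN 128

/* Loop with two induction variables: the pointer p and the counter i.  */
static void
clear_with_counter (int *p, unsigned int i)
{
  p += i;
  do
    {
      *p = 0;
      p += 1;
      i++;
    }
  while (i < LIMIT);
}

/* Hand-written equivalent with the counter eliminated: the exit test
   is a less-than comparison on the pointer.  */
static void
clear_pointer_only (int *p, unsigned int i)
{
  int *end = p + LIMIT;

  p += i;
  do
    {
      *p = 0;
      p += 1;
    }
  while (p < end);
}

/* Return 1 if both versions leave identical buffers for start index I.  */
static int
same_effect (unsigned int i)
{
  int a[BUF_LEN], b[BUF_LEN];
  size_t k;

  assert (i < BUF_LEN);
  for (k = 0; k < BUF_LEN; k++)
    a[k] = b[k] = 1;
  clear_with_counter (a, i);
  clear_pointer_only (b, i);
  return memcmp (a, b, sizeof a) == 0;
}
```

The start index 99 and anything above it are the cases where the latch runs zero times: the body executes once and both exit tests fail immediately. A less-than test still gives the right answer there, where an equality test on the pointer would not.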

Thanks,
bin

2014-07-17  Bin Cheng  

* tree-ssa-loop-ivopts.c (iv_elimination_compare_lt): Check number
of iterations using tree comparison.

Index: gcc/tree-ssa-loop-ivopts.c
===
--- gcc/tree-ssa-loop-ivopts.c  (revision 212387)
+++ gcc/tree-ssa-loop-ivopts.c  (working copy)
@@ -4605,7 +4605,7 @@ iv_elimination_compare_lt (struct ivopts_data *dat
   struct tree_niter_desc *niter)
 {
   tree cand_type, a, b, mbz, nit_type = TREE_TYPE (niter->niter), offset;
-  struct aff_tree nit, tmpa, tmpb;
+  struct aff_tree nit, tmp1, tmpa, tmpb;
   enum tree_code comp;
   HOST_WIDE_INT step;
 
@@ -4661,15 +4661,19 @@ iv_elimination_compare_lt (struct ivopts_data *dat
 return false;
 
   /* Expected number of iterations is B - A - 1.  Check that it matches
- the actual number, i.e., that B - A - NITER = 1.  */
+ the actual number, i.e., that B - A = NITER + 1.  */
   tree_to_aff_combination (niter->niter, nit_type, &nit);
-  tree_to_aff_combination (fold_convert (nit_type, a), nit_type, &tmpa);
-  tree_to_aff_combination (fold_convert (nit_type, b), nit_type, &tmpb);
-  aff_combination_scale (&nit, -1);
-  aff_combination_scale (&tmpa, -1);
-  aff_combination_add (&tmpb, &tmpa);
-  aff_combination_add (&tmpb, &nit);
-  if (tmpb.n != 0 || tmpb.offset != 1)
+  aff_combination_const (&tmp1, nit_type, 1);
+  tree_to_aff_combination (b, TREE_TYPE (b), &tmpb);
+  aff_combination_add (&nit, &tmp1);
+  if (a != integer_zero_node)
+{
+  tree_to_aff_combination (a, TREE_TYPE (b), &tmpa);
+  aff_combination_scale (&tmpa, -1);
+  aff_combination_add (&tmpb, &tmpa);
+}
+  if (!operand_equal_p (aff_combination_to_tree (&nit),
+   aff_combination_to_tree (&tmpb), 0))
 return false;
 
   /* Finally, check that CAND->IV->BASE - CAND->IV->STEP * A does not


[PATCH][match-and-simplify] API adjustments

2014-07-17 Thread Richard Biener

This marks two APIs (gimple_match_and_simplify overloads) as
obsolete - users should use fold_stmt and gimple_fold_stmt_to_constant
instead (actual implementation re-org pending).

It also adds a missing overload to gimple_convert with a location_t
argument.

Committed.

Richard.

2014-07-17  Richard Biener  

* gimple-fold.h (gimple_convert): Add overload with location_t
argument.
(gimple_match_and_simplify): Mark overloads on tree and gsi
as obsolete.

Index: gcc/gimple-fold.h
===
--- gcc/gimple-fold.h   (revision 212515)
+++ gcc/gimple-fold.h   (working copy)
@@ -91,19 +91,23 @@ gimple_build (gimple_seq *seq,
 extern bool useless_type_conversion_p (tree, tree);
 
 inline tree
-gimple_convert (gimple_seq *seq, tree type, tree op)
+gimple_convert (gimple_seq *seq, location_t loc, tree type, tree op)
 {
   if (useless_type_conversion_p (type, TREE_TYPE (op)))
 return op;
-  return gimple_build (seq, NOP_EXPR, type, op);
+  return gimple_build (seq, loc, NOP_EXPR, type, op);
+}
+
+inline tree
+gimple_convert (gimple_seq *seq, tree type, tree op)
+{
+  return gimple_convert (seq, UNKNOWN_LOCATION, type, op);
 }
 
 /* Add gimple_seq_discard (gimple_seq *) that releases defs of all stmts
in the sequence.  */
 
 /* In gimple-match.c.  */
-tree gimple_match_and_simplify (tree, gimple_seq *, tree (*)(tree));
-bool gimple_match_and_simplify (gimple_stmt_iterator *, tree (*)(tree));
 tree gimple_match_and_simplify (enum tree_code, tree, tree,
gimple_seq *, tree (*)(tree));
 tree gimple_match_and_simplify (enum tree_code, tree, tree, tree,
@@ -112,5 +116,9 @@ tree gimple_match_and_simplify (enum tre
gimple_seq *, tree (*)(tree));
 tree gimple_match_and_simplify (enum built_in_function, tree, tree,
gimple_seq *, tree (*)(tree));
+/* The following two APIs are an artifact and should vanish in favor
+   of the existing gimple_fold_stmt_to_constant and fold_stmt APIs.  */
+tree gimple_match_and_simplify (tree, gimple_seq *, tree (*)(tree));
+bool gimple_match_and_simplify (gimple_stmt_iterator *, tree (*)(tree));
 
 #endif  /* GCC_GIMPLE_FOLD_H */


Re: [PATCH] Generate canonical infinity for the Motorola extended real format

2014-07-17 Thread Andreas Schwab
Since this only affects m68k I have pushed this now.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


[Ada] Crash while processing illegal state refinement

2014-07-17 Thread Arnaud Charlet
This patch modifies the parser to catch a case where the argument of SPARK
aspect Refined_State is not properly parenthesized.


-- Source --


--  no_parens.ads

package No_Parens
  with SPARK_Mode => On,
   Abstract_State => State
is
   pragma Elaborate_Body;
end No_Parens;

--  no_parens.adb

package body No_Parens
  with SPARK_Mode => On,
   Refined_State => State => (Speed, Status)
is
   Speed  : Integer := 0;
   Status : Integer := 0;
end No_Parens;


-- Compilation and output --


$ gcc -c no_parens.adb
no_parens.adb:3:25: missing "("

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-07-17  Hristian Kirtchev  

* par-ch13.adb (Get_Aspect_Specifications):
Catch a case where the argument of SPARK aspect Refined_State
is not properly parenthesized.

Index: par-ch13.adb
===
--- par-ch13.adb(revision 212640)
+++ par-ch13.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 1992-2013, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2014, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -308,8 +308,8 @@
end if;
 
--  Detect a common error where the non-null definition of
-   --  aspect Depends, Global, Refined_Depends or Refined_Global
-   --  must be enclosed in parentheses.
+   --  aspect Depends, Global, Refined_Depends, Refined_Global
+   --  or Refined_State lacks enclosing parentheses.
 
if Token /= Tok_Left_Paren and then Token /= Tok_Null then
 
@@ -400,6 +400,48 @@
Restore_Scan_State (Scan_State);
 end if;
  end;
+
+  --  Refined_State
+
+  elsif A_Id = Aspect_Refined_State then
+ if Token = Tok_Identifier then
+declare
+   Scan_State : Saved_Scan_State;
+
+begin
+   Save_Scan_State (Scan_State);
+   Scan;  --  past state
+
+   --  The refinement contains a constituent, the whole
+   --  argument of Refined_State must be parenthesized.
+
+   --with Refined_State => State => Constit
+
+   if Token = Tok_Arrow then
+  Restore_Scan_State (Scan_State);
+  Error_Msg_SC -- CODEFIX
+("missing ""(""");
+  Resync_Past_Malformed_Aspect;
+
+  --  Return when the current aspect is the last
+  --  in the list of specifications and the list
+  --  applies to a body.
+
+  if Token = Tok_Is then
+ return Aspects;
+  end if;
+
+   --  The refinement lacks constituents. Do not flag
+   --  this case as the error would be misleading. The
+   --  diagnostic is left to the analysis.
+
+   --with Refined_State => State
+
+   else
+  Restore_Scan_State (Scan_State);
+   end if;
+end;
+ end if;
   end if;
end if;