FRE may run out of memory

2014-02-07 Thread dxq
hi all,

We found that gcc would run out of memory on Windows when compiling a *big*
function (10 lines).

Further investigation shows that gcc crashes in the function *compute_avail*,
in the tree-fre pass.  *compute_avail* collects information from basic blocks,
so memory is allocated to record that information.
However, if there is a huge number of basic blocks, memory is exhausted and
gcc crashes, especially on Windows PCs, which generally have only 2G or 4G of
memory.  It works on Linux, where *compute_avail* allocates *2.4G* of memory.
I guess some optimization passes in gcc, such as FRE, did not consider this
extreme case.

When the tree-fre pass is disabled, gcc crashes in the IRA pass instead.  I
will investigate that further.

Any suggestions?

Thanks!

danxiaoqiang
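
A sketch of how an input of this kind could be produced, for anyone who wants
to try this locally.  The original test case is not included in the message,
so the generated shape (one function made of many independent `if`
statements, each of which adds basic blocks) and the names emit_big_function
and big are assumptions, and this may not reproduce the reported
compute_avail memory growth.

```c
#include <stdio.h>

/* Write one C function containing N_BRANCHES conditional statements to OUT.
   Each `if` introduces additional basic blocks in the compiler's CFG.
   Returns the number of branches written.  Hypothetical helper, not part
   of GCC.  */
static unsigned
emit_big_function (FILE *out, unsigned n_branches)
{
  unsigned i;

  fprintf (out, "int big (int x)\n{\n  int r = 0;\n");
  for (i = 0; i < n_branches; i++)
    fprintf (out, "  if (x == %u) r += %u;\n", i, i + 1);
  fprintf (out, "  return r;\n}\n");
  return n_branches;
}
```

Writing the output for a large n_branches to a file and compiling it with and
without -fno-tree-fre would show whether FRE is the pass that dominates
memory use on a given host.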





RE: [PATCH][4.8] Backport strict-volatile-bitfields fixes to 4.8

2014-02-07 Thread Joey Ye
Ping ^ 2

OK to 4.8?

> -----Original Message-----
> From: Joey Ye [mailto:joey...@arm.com]
> Sent: Monday, January 20, 2014 10:47
> To: gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH][4.8] Backport strict-volatile-bitfields fixes to 4.8
> 
> Ping
> 
> > -----Original Message-----
> > From: Joey Ye [mailto:joey...@arm.com]
> > Sent: Thursday, January 16, 2014 16:28
> > To: gcc-patches@gcc.gnu.org
> > Subject: [PATCH][4.8] Backport strict-volatile-bitfields fixes to 4.8
> >
> > 4.8 has a number of strict-volatile-bitfields issues that can be fixed
> > by following patches.
> > trunk@205899, 205898, 205897, 205896, 203003
> >
> > Tested on x86_64 and arm without regression.
> >
> > OK to 4.8?
> >
> > 2013-09-28  Sandra Loosemore  
> >
> > gcc/
> > * expr.h (extract_bit_field): Remove packedp parameter.
> > * expmed.c (extract_fixed_bit_field): Remove packedp parameter
> > from forward declaration.
> > (store_split_bit_field): Remove packedp arg from calls to
> > extract_fixed_bit_field.
> > (extract_bit_field_1): Remove packedp parameter and packedp
> > argument from recursive calls and calls to extract_fixed_bit_field.
> > (extract_bit_field): Remove packedp parameter and corresponding
> > arg to extract_bit_field_1.
> > (extract_fixed_bit_field): Remove packedp parameter.  Remove code
> > to issue warnings.
> > (extract_split_bit_field): Remove packedp arg from call to
> > extract_fixed_bit_field.
> > * expr.c (emit_group_load_1): Adjust calls to extract_bit_field.
> > (copy_blkmode_from_reg): Likewise.
> > (copy_blkmode_to_reg): Likewise.
> > (read_complex_part): Likewise.
> > (store_field): Likewise.
> > (expand_expr_real_1): Likewise.
> > * calls.c (store_unaligned_arguments_into_pseudos): Adjust call
> > to extract_bit_field.
> > * config/tilegx/tilegx.c (tilegx_expand_unaligned_load): Adjust
> > call to extract_bit_field.
> > * config/tilepro/tilepro.c (tilepro_expand_unaligned_load): Adjust
> > call to extract_bit_field.
> > * doc/invoke.texi (Code Gen Options): Remove mention of warnings
> > and special packedp behavior from -fstrict-volatile-bitfields
> > documentation.
> >
> > 2013-12-11  Bernd Edlinger  
> >
> > * expr.c (expand_assignment): Remove dependency on
> > flag_strict_volatile_bitfields. Always set the memory
> > access mode.
> > (expand_expr_real_1): Likewise.
> >
> > 2013-12-11  Sandra Loosemore  
> >
> > PR middle-end/23623
> > PR middle-end/48784
> > PR middle-end/56341
> > PR middle-end/56997
> >
> > gcc/
> > * expmed.c (strict_volatile_bitfield_p): New function.
> > (store_bit_field_1): Don't special-case strict volatile
> > bitfields here.
> > (store_bit_field): Handle strict volatile bitfields here instead.
> > (store_fixed_bit_field): Don't special-case strict volatile
> > bitfields here.
> > (extract_bit_field_1): Don't special-case strict volatile
> > bitfields here.
> > (extract_bit_field): Handle strict volatile bitfields here instead.
> > (extract_fixed_bit_field): Don't special-case strict volatile
> > bitfields here.  Simplify surrounding code to resemble that in
> > store_fixed_bit_field.
> > * doc/invoke.texi (Code Gen Options): Update
> > -fstrict-volatile-bitfields description.
> >
> > gcc/testsuite/
> > * gcc.dg/pr23623.c: New test.
> > * gcc.dg/pr48784-1.c: New test.
> > * gcc.dg/pr48784-2.c: New test.
> > * gcc.dg/pr56341-1.c: New test.
> > * gcc.dg/pr56341-2.c: New test.
> > * gcc.dg/pr56997-1.c: New test.
> > * gcc.dg/pr56997-2.c: New test.
> > * gcc.dg/pr56997-3.c: New test.
> >
> > 2013-12-11  Bernd Edlinger  
> >  Sandra Loosemore  
> >
> > PR middle-end/23623
> > PR middle-end/48784
> > PR middle-end/56341
> > PR middle-end/56997
> > * expmed.c (strict_volatile_bitfield_p): Add bitregion_start
> > and bitregion_end parameters.  Test for compliance with C++
> > memory model.
> > (store_bit_field): Adjust call to strict_volatile_bitfield_p.
> > Add fallback logic for cases where -fstrict-volatile-bitfields
> > is supposed to apply, but cannot.
> > (extract_bit_field): Likewise. Use narrow_bit_field_mem and
> > extract_fixed_bit_field_1 to do the extraction.
> > (extract_fixed_bit_field): Revert to previous mode selection
> > algorithm.  Call extract_fixed_bit_field_1 to do the real work.
> > (extract_fixed_bit_field_1): New function.
> >
> > testsuite:
> > * gcc.dg/pr23623.c: Update to test interaction with C++
> > memory model.
> >
> > 2
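
For readers unfamiliar with the option these ChangeLog entries concern, here
is a small illustration of the kind of code it affects.  The register layout
and the names ctrl_reg, read_mode and write_mode are hypothetical and are not
taken from the patches or their test cases.

```c
/* A hypothetical 32-bit device register described with volatile bit-fields.
   With -fstrict-volatile-bitfields, GCC is documented to access such fields
   with a single access of the width of the field's declared type
   (here unsigned int) where that is possible.  */
struct ctrl_reg
{
  volatile unsigned int enable : 1;
  volatile unsigned int mode : 3;
  volatile unsigned int reserved : 28;
};

/* Read the 3-bit mode field.  */
static unsigned int
read_mode (struct ctrl_reg *reg)
{
  return reg->mode;
}

/* Store the low 3 bits of MODE into the mode field.  */
static void
write_mode (struct ctrl_reg *reg, unsigned int mode)
{
  reg->mode = mode & 7u;
}
```

The C semantics are the same with or without the option; what changes is the
width of the memory access the compiler emits, which matters for
memory-mapped hardware.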

[PATCH, ARM] Document armv7e-m for ARM option -march

2014-02-07 Thread Terry Guo
Hi,

This small patch intends to add missing armv7e-m in the documentation of ARM
option -march. I will commit it to trunk and then back port to 4.7/4.8
branch as obvious.

BR,
Terry

2014-02-08  Terry Guo  

* doc/invoke.texi: Document ARM -march=armv7e-m.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e3dc9df..4d1b657 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -12231,8 +12231,8 @@ of the @option{-mcpu=} option.  Permissible names are: @samp{armv2},
 @samp{armv5}, @samp{armv5t}, @samp{armv5e}, @samp{armv5te},
 @samp{armv6}, @samp{armv6j},
 @samp{armv6t2}, @samp{armv6z}, @samp{armv6zk}, @samp{armv6-m},
-@samp{armv7}, @samp{armv7-a}, @samp{armv7-r}, @samp{armv7-m}, @samp{armv7ve},
-@samp{armv8-a}, @samp{armv8-a+crc},
+@samp{armv7}, @samp{armv7-a}, @samp{armv7-r}, @samp{armv7-m}, @samp{armv7e-m},
+@samp{armv7ve}, @samp{armv8-a}, @samp{armv8-a+crc},
 @samp{iwmmxt}, @samp{iwmmxt2}, @samp{ep9312}.
 
 @option{-march=armv7ve} is the armv7-a architecture with virtualization
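
One way user code can observe the architecture selected with -march is
through the predefined macros.  The sketch below relies on __ARM_ARCH_7EM__,
which GCC predefines when targeting armv7e-m; the helper name is made up for
illustration.

```c
/* Report whether this translation unit is being compiled for ARMv7E-M.
   __ARM_ARCH_7EM__ is predefined by GCC for -march=armv7e-m; the function
   name is a hypothetical helper.  */
static int
built_for_armv7e_m (void)
{
#if defined (__ARM_ARCH_7EM__)
  return 1;
#else
  return 0;
#endif
}
```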




Re: [RFC] PR 59776 - esra vs gimple_debug

2014-02-07 Thread Jakub Jelinek
On Fri, Feb 07, 2014 at 04:37:22PM -0800, Richard Henderson wrote:
> >> Thoughts on how this might really be solved?
> > 
> > Add a VIEW_CONVERT_EXPR around the rhs of the debug statement.
> 
> Well, ok, though I'm pretty sure that the debug info will pretty much barf on
> that immediately.

Why?  That is typically DW_OP_GNU_reinterpret.

Jakub


Re: [RFC] PR 59776 - esra vs gimple_debug

2014-02-07 Thread Richard Henderson
On 02/07/2014 03:12 PM, Richard Biener wrote:
> On February 7, 2014 8:35:16 PM GMT+01:00, Richard Henderson  
> wrote:
>> In the testcases with the PR, we have a bit of type punning going on,
>>
>>  *(int *) &s2.f = 0;
>>  s2 = s1;
>>
>> which SRA transforms to
>>
>>  # DEBUG s2 => 0
>>  MEM[(int *)&s2] = 0;
>>  # DEBUG s2 => s1$f_7
>>  # DEBUG s2$g => s1$g_6
>>  s2 ={v} {CLOBBER};
>>
>> Note that it has chosen not to expand s1.f like s1.g, but to expand that
>> field as the type-punned integer.  Which means that "s2 => s1$f_7" has
>> mismatched types across lhs and rhs: SI => SF.  Which understandably ICEs
>> during rtl expansion.
>>
>> I'm not really sure how this is avoided for the actual code generation, but
>> this minimal patch (aka hack) simply drops the debug info to avoid the ICE.
>>
>> Thoughts on how this might really be solved?
> 
> Add a VIEW_CONVERT_EXPR around the rhs of the debug statement.

Well, ok, though I'm pretty sure that the debug info will pretty much barf on
that immediately.

What I really meant is: where's a better place to put this check, since such a
check _must_ exist somewhere else for the regular code generation.


r~
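
The SI => SF mismatch discussed here is a same-size, different-mode
situation: the integer and float views share 32 bits, but one cannot simply
be used where the other is expected.  A VIEW_CONVERT_EXPR reinterprets those
bits rather than converting the value.  A source-level analogy is sketched
below, assuming float is a 32-bit IEEE 754 single; the helper names are
illustrative and do not come from the PR.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit integer without converting
   the value.  Assumes float is a 32-bit IEEE 754 single.  */
static uint32_t
float_bits (float f)
{
  uint32_t bits;

  memcpy (&bits, &f, sizeof bits);
  return bits;
}

/* A value conversion, by contrast, produces a different bit pattern.  */
static uint32_t
float_value (float f)
{
  return (uint32_t) f;
}
```

For 1.0f the first helper yields 0x3f800000 while the second yields 1, which
is the distinction between reinterpreting and converting.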


Re: Disable accumulate-outgoing-args for Generic and Buldozers

2014-02-07 Thread Jakub Jelinek
On Thu, Feb 06, 2014 at 06:25:16PM +0100, Jan Hubicka wrote:
> > The expr.[ch]/function.h/tree-tailcall.c bits are ok.
> > I see your changes clash with my PR60077 fix, does your patch make them
> > obsolete and you take care of using proper alignment info?
> > If so, at least the two tests from that PR's patch should be added,
> > but I can do that as a follow-up.
> 
> Yes, this patch was made to fix gcc.target/i386/pr35767-5.c
> by obtaining correct alignment (and also alias) info on the store.
> Sorry for making you to duplicate the effort - seems I should have pinged it
> earlier.

I've committed the two testcases I had in my patch now.

2014-02-08  Jakub Jelinek  

PR target/60077
* gcc.target/i386/pr60077-1.c: New test.
* gcc.target/i386/pr60077-2.c: New test.

--- gcc/testsuite/gcc.target/i386/pr60077-1.c.jj  2014-02-06 11:46:56.772700220 +0100
+++ gcc/testsuite/gcc.target/i386/pr60077-1.c  2014-02-06 11:44:52.0 +0100
@@ -0,0 +1,18 @@
+/* Test that we generate aligned load when memory is aligned.  */
+/* { dg-do compile } */
+/* { dg-options "-O -mavx -mtune=generic" } */
+/* { dg-final { scan-assembler-not "movups" } } */
+/* { dg-final { scan-assembler "movaps" } } */
+
+typedef float v8sf __attribute__ ((__vector_size__ (32)));
+
+extern void foo (v8sf, v8sf, v8sf, v8sf, v8sf, v8sf, v8sf, v8sf, v8sf);
+
+int
+test (void)
+{
+  v8sf x = { 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0 };
+
+  foo (x, x, x, x, x, x, x, x, x);
+  return 0;
+}
--- gcc/testsuite/gcc.target/i386/pr60077-2.c.jj  2014-02-06 11:46:59.986683676 +0100
+++ gcc/testsuite/gcc.target/i386/pr60077-2.c  2014-02-06 11:45:04.0 +0100
@@ -0,0 +1,18 @@
+/* Test that we generate aligned load when memory is aligned.  */
+/* { dg-do compile } */
+/* { dg-options "-O -mavx -mtune=generic" } */
+/* { dg-final { scan-assembler-not "movups" } } */
+/* { dg-final { scan-assembler "movaps" } } */
+
+typedef float v8sf __attribute__ ((__vector_size__ (32)));
+
+extern void foo (int, int, int, int, int, int, int, v8sf, v8sf, v8sf, v8sf, v8sf, v8sf, v8sf, v8sf, v8sf);
+
+int
+test (void)
+{
+  v8sf x = { 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0 };
+
+  foo (1, 2, 3, 4, 5, 6, 7, x, x, x, x, x, x, x, x, x);
+  return 0;
+}


Jakub


RE: [PATCH] Fix Cilk+ catch_exc.cc

2014-02-07 Thread Iyer, Balaji V
Hi Jakub,
This should fix PR 59834 
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59834) on i686-linux.

Thanks,

Balaji V. Iyer.

> -----Original Message-----
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Friday, February 7, 2014 6:51 PM
> To: Iyer, Balaji V; Richard Biener
> Cc: gcc-patches@gcc.gnu.org
> Subject: [PATCH] Fix Cilk+ catch_exc.cc
> 
> Hi!
> 
> install_builtin calls build_fn_decl, which sets TREE_NOTHROW by default.
> In most cases I think that is desirable, but __cilkrts_rethrow apparently
> conditionally throws an exception, thus marking it TREE_NOTHROW is very
> much undesirable and the fact that the testcase happened to work (except
> for i?86 recently) must have been by pure accident (that the call was
> between two instructions belonging to the right EH region?).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2014-02-08  Jakub Jelinek  
> 
>   * cilk-common.c (cilk_init_builtins): Clear TREE_NOTHROW
>   flag on __cilkrts_rethrow builtin.
> 
> --- gcc/cilk-common.c.jj  2014-02-06 23:06:47.0 +0100
> +++ gcc/cilk-common.c 2014-02-07 11:11:15.253128977 +0100
> @@ -285,6 +285,7 @@ cilk_init_builtins (void)
>/* __cilkrts_rethrow (struct stack_frame *);  */
>cilk_rethrow_fndecl = install_builtin ("__cilkrts_rethrow", fptr_fun,
>BUILT_IN_CILK_RETHROW, false);
> +  TREE_NOTHROW (cilk_rethrow_fndecl) = 0;
> 
>/* __cilkrts_save_fp_ctrl_state (__cilkrts_stack_frame *);  */
>cilk_save_fp_fndecl = install_builtin ("__cilkrts_save_fp_ctrl_state",
> 
>   Jakub


[PATCH] Fix Cilk+ catch_exc.cc

2014-02-07 Thread Jakub Jelinek
Hi!

install_builtin calls build_fn_decl, which sets TREE_NOTHROW by default.
In most cases I think that is desirable, but __cilkrts_rethrow apparently
conditionally throws an exception, thus marking it TREE_NOTHROW is very much
undesirable and the fact that the testcase happened to work (except for i?86
recently) must have been by pure accident (that the call was between two
instructions belonging to the right EH region?).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-02-08  Jakub Jelinek  

* cilk-common.c (cilk_init_builtins): Clear TREE_NOTHROW
flag on __cilkrts_rethrow builtin.

--- gcc/cilk-common.c.jj  2014-02-06 23:06:47.0 +0100
+++ gcc/cilk-common.c   2014-02-07 11:11:15.253128977 +0100
@@ -285,6 +285,7 @@ cilk_init_builtins (void)
   /* __cilkrts_rethrow (struct stack_frame *);  */
   cilk_rethrow_fndecl = install_builtin ("__cilkrts_rethrow", fptr_fun, 
 BUILT_IN_CILK_RETHROW, false);
+  TREE_NOTHROW (cilk_rethrow_fndecl) = 0;
 
   /* __cilkrts_save_fp_ctrl_state (__cilkrts_stack_frame *);  */
   cilk_save_fp_fndecl = install_builtin ("__cilkrts_save_fp_ctrl_state", 

Jakub


Re: [RFC] PR 59776 - esra vs gimple_debug

2014-02-07 Thread Richard Biener
On February 7, 2014 8:35:16 PM GMT+01:00, Richard Henderson  
wrote:
>In the testcases with the PR, we have a bit of type punning going on,
>
>  *(int *) &s2.f = 0;
>  s2 = s1;
>
>which SRA transforms to
>
>  # DEBUG s2 => 0
>  MEM[(int *)&s2] = 0;
>  # DEBUG s2 => s1$f_7
>  # DEBUG s2$g => s1$g_6
>  s2 ={v} {CLOBBER};
>
>Note that it has chosen not to expand s1.f like s1.g, but to expand that
>field as the type-punned integer.  Which means that "s2 => s1$f_7" has
>mismatched types across lhs and rhs: SI => SF.  Which understandably ICEs
>during rtl expansion.
>
>I'm not really sure how this is avoided for the actual code generation, but
>this minimal patch (aka hack) simply drops the debug info to avoid the ICE.
>
>Thoughts on how this might really be solved?

Add a VIEW_CONVERT_EXPR around the rhs of the debug statement.

Richard. 

>
>r~




[PR target/40977] Split ashldi_extsi

2014-02-07 Thread Jeff Law
As outlined in the PR, we end up generating poor code because of the 
existence of the ashldi_extsi insn in the m68k backend.


The pattern recognizes that left shifting a DImode value by 32 deposits 
the low part of the input into the high part of the output and clears 
the low part of the output.


That's a fine thing to recognize, except that it's a two instruction 
insn.  If (for example) the next insn happens to overwrite those low 
bits then the clearing done by the 2nd instruction generated by 
ashldi_extsi is dead.
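
In C terms, the operation the pattern matches looks like the first helper
below.  The second shows the case described above, where a following
operation overwrites the low half and so makes the clear redundant.  This is
a sketch with illustrative names, not the test case from the PR.

```c
#include <stdint.h>

/* Extend a 32-bit value to 64 bits and shift it left by 32: the input ends
   up in the high word and the low word becomes zero.  The cast through
   uint64_t keeps the shift well defined for negative inputs.  */
static uint64_t
shift_ext_32 (int32_t x)
{
  return (uint64_t) (int64_t) x << 32;
}

/* Same shift, but the low word is then filled in by the caller's value,
   so clearing it first is wasted work.  */
static uint64_t
shift_then_fill (int32_t x, uint32_t low)
{
  return shift_ext_32 (x) | low;
}
```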


The obvious way to fix this is to turn the pattern into a 
define_insn_and_split.  One could certainly make an argument that to the 
extent possible every insn that generates multiple instructions like 
this ought to turn into a define_insn_and_split.  However, I don't think 
there's enough interest in the m68k port to warrant the time spent.


Tested by building m68k cross and verifying correct output for the 
testcase.  Installed on the trunk.



diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index ce9c066..1237904 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,9 @@
 2014-02-07  Jeff Law  
 
+   PR target/40977
+   * config/m68k/m68k.md (ashldi_extsi): Turn into a
+   define_insn_and_split.
+
* ipa-inline.c (inline_small_functions): Fix typos.
 
 2014-02-07  Richard Sandiford  
diff --git a/gcc/config/m68k/m68k.md b/gcc/config/m68k/m68k.md
index 7bf9abd..e61048b 100644
--- a/gcc/config/m68k/m68k.md
+++ b/gcc/config/m68k/m68k.md
@@ -4338,25 +4338,18 @@
 
 ;; arithmetic shift instructions
 ;; We don't need the shift memory by 1 bit instruction
-
-(define_insn "ashldi_extsi"
+(define_insn_and_split "ashldi_extsi"
   [(set (match_operand:DI 0 "nonimmediate_operand" "=ro")
 (ashift:DI
   (match_operator:DI 2 "extend_operator"
 [(match_operand:SI 1 "general_operand" "rm")])
   (const_int 32)))]
   ""
-{
-  CC_STATUS_INIT;
-  if (GET_CODE (operands[0]) == REG)
-operands[2] = gen_rtx_REG (SImode, REGNO (operands[0]) + 1);
-  else
-operands[2] = adjust_address (operands[0], SImode, 4);
-  if (ADDRESS_REG_P (operands[0]))
-return "move%.l %1,%0\;sub%.l %2,%2";
-  else
-return "move%.l %1,%0\;clr%.l %2";
-})
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 3) (match_dup 1))
+   (set (match_dup 2) (const_int 0))]
+  "split_di(operands, 1, operands + 2, operands + 3);")
 
 (define_insn "ashldi_sexthi"
   [(set (match_operand:DI 0 "nonimmediate_operand" "=m,a*d")


Re: [PATCH] Add alloc_align and assume_aligned attributes (PR middle-end/60092)

2014-02-07 Thread Bernhard Reutner-Fischer

On 6 February 2014 16:42:05 Jakub Jelinek  wrote:


Hi!

As discussed on IRC, this patch introduces two new attributes,
so that the C library (and other headers) have a way to
a) tell the compiler something about functions like aligned_alloc
   or memalign
b) tell the compiler the alignment of pointers returned say by malloc

Ok for trunk if bootstrap/regtest passes?
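
As an illustration of the two cases a) and b), the sketch below attaches the
attributes to a toy bump allocator.  The attribute spellings alloc_align and
assume_aligned are the ones this patch introduces; the pool, the function
names and the 64-byte limit are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Expand to nothing on compilers without GNU attribute syntax.  */
#if defined (__GNUC__)
# define ATTR_ALLOC_ALIGN(pos)  __attribute__ ((alloc_align (pos)))
# define ATTR_ASSUME_ALIGNED(n) __attribute__ ((assume_aligned (n)))
#else
# define ATTR_ALLOC_ALIGN(pos)
# define ATTR_ASSUME_ALIGNED(n)
#endif

/* Hypothetical backing store, itself aligned to 64 bytes.  */
static _Alignas (64) unsigned char pool[1024];
static size_t pool_used;

/* a) The alignment of the result is given by argument 1, as with
   aligned_alloc or memalign.  */
static void *pool_alloc_aligned (size_t align, size_t size)
  ATTR_ALLOC_ALIGN (1);

/* b) The result always has a fixed alignment, as with malloc.  */
static void *pool_alloc16 (size_t size) ATTR_ASSUME_ALIGNED (16);

static void *
pool_alloc_aligned (size_t align, size_t size)
{
  size_t start;

  /* Only powers of two up to the pool's own alignment are supported.  */
  if (align == 0 || (align & (align - 1)) != 0 || align > 64)
    return NULL;
  /* Round the current offset up to the requested alignment.  */
  start = (pool_used + align - 1) & ~(align - 1);
  if (start > sizeof pool || size > sizeof pool - start)
    return NULL;
  pool_used = start + size;
  return pool + start;
}

static void *
pool_alloc16 (size_t size)
{
  return pool_alloc_aligned (16, size);
}
```

With the attributes in place the compiler may assume the low bits of the
returned pointers are zero, which is the information the quoted
bit_value_alloc_assume_aligned_attribute function feeds into the propagation
lattice.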



+/* Return the propagation value for functions with assume_aligned
+   or alloc_aligned attribute.  */
+
+static prop_value_t
+bit_value_alloc_assume_aligned_attribute (gimple stmt, tree attr,
+ prop_value_t ptrval,
+ bool alloc_aligned)
+{
+  tree lhs = gimple_call_lhs (stmt), align, misalign = NULL_TREE;
+  tree type = TREE_TYPE (lhs);
+  unsigned HOST_WIDE_INT aligni, misaligni = 0;
+  prop_value_t alignval;
+  double_int value, mask;
+  prop_value_t val;


Do we have an optimization that moves most of the above down..


+  if (ptrval.lattice_val == UNDEFINED)
+return ptrval;
+  gcc_assert ((ptrval.lattice_val == CONSTANT
+  && TREE_CODE (ptrval.value) == INTEGER_CST)
+ || ptrval.mask.is_minus_one ());
+  if (TREE_VALUE (attr) == NULL_TREE)
+return ptrval;
+  attr = TREE_VALUE (attr);
+  align = TREE_VALUE (attr);
+  if (!tree_fits_uhwi_p (align))
+return ptrval;
+  aligni = tree_to_uhwi (align);
+  if (alloc_aligned)
+{
+  if (aligni == 0 || aligni > gimple_call_num_args (stmt))
+   return ptrval;
+  align = gimple_call_arg (stmt, aligni - 1);
+  if (!tree_fits_uhwi_p (align))
+   return ptrval;
+  aligni = tree_to_uhwi (align);
+}
+  if (aligni <= 1
+  || (aligni & (aligni - 1)) != 0)
+return ptrval;
+  if (!alloc_aligned && TREE_CHAIN (attr) && TREE_VALUE (TREE_CHAIN (attr)))
+{
+  misalign = TREE_VALUE (TREE_CHAIN (attr));
+  if (!tree_fits_uhwi_p (misalign))
+   return ptrval;
+  misaligni = tree_to_uhwi (misalign);
+  if (misaligni >= aligni)
+   return ptrval;
+}


.. here, btw? Or would one have to do that manually?
Just curious.
Thanks,


+  align = build_int_cst_type (type, -aligni);
+  alignval = get_value_for_expr (align, true);
+  bit_value_binop_1 (BIT_AND_EXPR, type, &value, &mask,
+type, value_to_double_int (ptrval), ptrval.mask,
+type, value_to_double_int (alignval), alignval.mask);
+  if (!mask.is_minus_one ())
+{
+  val.lattice_val = CONSTANT;
+  val.mask = mask;
+  gcc_assert ((mask.low & (aligni - 1)) == 0);
+  gcc_assert ((value.low & (aligni - 1)) == 0);
+  value.low |= misaligni;
+  /* ???  Delay building trees here.  */
+  val.value = double_int_to_tree (type, value);
+}
+  else
+{
+  val.lattice_val = VARYING;
+  val.value = NULL_TREE;
+  val.mask = double_int_minus_one;
+}
+  return val;
+}
+
 /* Evaluate statement STMT.
Valid only for assignments, calls, conditionals, and switches. */








Re: [RFA] [middle-end/54041] Convert modes as needed from expand_expr

2014-02-07 Thread Jeff Law

On 02/07/14 02:17, Richard Biener wrote:

diff --git a/gcc/expr.c b/gcc/expr.c
index 878a51b..9609c45 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -7708,6 +7708,11 @@ expand_expr_addr_expr_1 (tree exp, rtx target, enum machine_mode tmode,
  modifier == EXPAND_INITIALIZER
   ? EXPAND_INITIALIZER : EXPAND_NORMAL);

+  /* expand_expr is allowed to return an object in a mode other
+than TMODE.  If it did, we need to convert.  */
+  if (tmode != GET_MODE (tmp))
+   tmp = convert_modes (tmode, GET_MODE (tmp),
+tmp, TYPE_UNSIGNED (TREE_TYPE (offset)));


What about CONSTANT_P tmp?  Don't you need to use
TYPE_MODE (TREE_TYPE (offset)) in that case?

Good question.  Given the behaviour of c_m_a_a_s, we should do nothing
if GET_MODE (tmp) == VOIDmode.  I'll update the patch and resubmit.


jeff



Re: RFA: patch for PR59535

2014-02-07 Thread Jeff Law

On 02/07/14 11:20, Vladimir Makarov wrote:

   The following patch improves code size for ARM.  Before the patch
CSiBE size generated by GCC configured --with-arch=armv7-a
--with-fpu=vfpv3-d16 --with-float=hard (with -mthumb) was

2414926

After the patch the size is

2396798

For comparison, when the reload pass is used the size is

2400154
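
To put the three figures side by side, the differences can be worked out as
below.  The sizes are the ones quoted in the message; the helper names are
made up for this sketch.

```c
/* Code size figures quoted in the message, in bytes.  */
enum
{
  SIZE_LRA_BEFORE = 2414926,
  SIZE_LRA_AFTER = 2396798,
  SIZE_RELOAD = 2400154
};

/* Bytes saved going from OLD_SIZE to NEW_SIZE.  */
static long
bytes_saved (long old_size, long new_size)
{
  return old_size - new_size;
}

/* The same saving expressed as a percentage of OLD_SIZE.  */
static double
percent_saved (long old_size, long new_size)
{
  return 100.0 * (double) (old_size - new_size) / (double) old_size;
}
```

The patch saves 18128 bytes relative to LRA before the change (about 0.75%),
and the patched LRA result is 3356 bytes smaller than the reload result
(about 0.14%).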

The change in arm.h is to prevent reloading sp as an address by LRA.
Reload has no such problem as it uses legitimate address hook and LRA
mostly relies on base_reg_class.

Richard, is this part ok to commit to the trunk?

I think so.  It's fixing a P1 regression, that makes it "in scope" as
far as I'm concerned.





The change in lra-constraints.c is for correct alternative choice in
move patterns when pseudo is of class of general reg and one alternative
contains lo regs and another one contains hi regs.

The patch was bootstrapped on x86/x86-64 and arm.

2014-02-07  Vladimir Makarov  

 PR rtl-optimization/59535
 * lra-constraints.c (process_alt_operands): Encourage alternative
 when unassigned pseudo class is superset of the alternative class.
 * config/arm/arm.h (MODE_BASE_REG_CLASS): Return CORE_REGS for
 Thumb2 for LRA.

Just one nit in the comment in lra-constraints.c
s/stil/still/
Jeff




Re: remove C_EXPR_APPEND macro

2014-02-07 Thread Prathamesh Kulkarni
On Sat, Feb 8, 2014 at 3:06 AM, Marek Polacek  wrote:
> On Sat, Feb 08, 2014 at 02:51:03AM +0530, Prathamesh Kulkarni wrote:
>> On Sat, Feb 8, 2014 at 2:22 AM, Joseph S. Myers  
>> wrote:
>> > On Sat, 8 Feb 2014, Prathamesh Kulkarni wrote:
>> >
>> >> This patch removes C_EXPR_APPEND macro in c-tree.h
>> >> OK for trunk ?
>> >
>> > Thanks, this is OK with the orphan comment "A varray of c_expr_t." also
>> > removed (please send the revised patch if you'd like someone to commit it
>> > for you).
>> Ah, I missed that, sorry.
>> Here's the revised patch:
>>
>> * c-parser.c (c_parser_get_builtin_args): replace calls to
>> C_EXPR_APPEND (cexpr_list, expr) by vec_safe_push (cexpr_list, expr)
>>
>> * c-tree.h (C_EXPR_APPEND): removed
>
> I'll fix up the CL and commit it for you.  I suppose you tested this
> patch.
Yes, bootstrapped/reg tested on x86_64-linux-gnu
>
> Marek


Re: [Bug fortran/60066] Bad elemental invocation of non-scalar base object

2014-02-07 Thread Mikael Morin
Le 07/02/2014 19:18, Paul Richard Thomas a écrit :
> Dear All,
> 
> I propose to add the attached to the testsuite.  It is the testcase
> from PR60066, which was fixed by the patch for PR59066.
> 
> OK for trunk, 4.8 and 4.7?
> 
Yes, sure.

Mikael.


Re: remove C_EXPR_APPEND macro

2014-02-07 Thread Marek Polacek
On Sat, Feb 08, 2014 at 02:51:03AM +0530, Prathamesh Kulkarni wrote:
> On Sat, Feb 8, 2014 at 2:22 AM, Joseph S. Myers  
> wrote:
> > On Sat, 8 Feb 2014, Prathamesh Kulkarni wrote:
> >
> >> This patch removes C_EXPR_APPEND macro in c-tree.h
> >> OK for trunk ?
> >
> > Thanks, this is OK with the orphan comment "A varray of c_expr_t." also
> > removed (please send the revised patch if you'd like someone to commit it
> > for you).
> Ah, I missed that, sorry.
> Here's the revised patch:
> 
> * c-parser.c (c_parser_get_builtin_args): replace calls to
> C_EXPR_APPEND (cexpr_list, expr) by vec_safe_push (cexpr_list, expr)
> 
> * c-tree.h (C_EXPR_APPEND): removed

I'll fix up the CL and commit it for you.  I suppose you tested this
patch.

Marek


Re: remove C_EXPR_APPEND macro

2014-02-07 Thread Prathamesh Kulkarni
On Sat, Feb 8, 2014 at 2:22 AM, Joseph S. Myers  wrote:
> On Sat, 8 Feb 2014, Prathamesh Kulkarni wrote:
>
>> This patch removes C_EXPR_APPEND macro in c-tree.h
>> OK for trunk ?
>
> Thanks, this is OK with the orphan comment "A varray of c_expr_t." also
> removed (please send the revised patch if you'd like someone to commit it
> for you).
Ah, I missed that, sorry.
Here's the revised patch:

* c-parser.c (c_parser_get_builtin_args): replace calls to
C_EXPR_APPEND (cexpr_list, expr) by vec_safe_push (cexpr_list, expr)

* c-tree.h (C_EXPR_APPEND): removed

Index: gcc/c/c-parser.c
===
--- gcc/c/c-parser.c(revision 207610)
+++ gcc/c/c-parser.c(working copy)
@@ -6659,12 +6659,12 @@ c_parser_get_builtin_args (c_parser *par
   force_folding_builtin_constant_p
 = saved_force_folding_builtin_constant_p;
   vec_alloc (cexpr_list, 1);
-  C_EXPR_APPEND (cexpr_list, expr);
+  vec_safe_push (cexpr_list, expr);
   while (c_parser_next_token_is (parser, CPP_COMMA))
 {
   c_parser_consume_token (parser);
   expr = c_parser_expr_no_commas (parser, NULL);
-  C_EXPR_APPEND (cexpr_list, expr);
+  vec_safe_push (cexpr_list, expr);
 }

   if (!c_parser_require (parser, CPP_CLOSE_PAREN, "expected %<)%>"))
Index: gcc/c/c-tree.h
===
--- gcc/c/c-tree.h(revision 207610)
+++ gcc/c/c-tree.h(working copy)
@@ -132,15 +132,6 @@ struct c_expr
inside the VEC types.  */
 typedef struct c_expr c_expr_t;

-/* A varray of c_expr_t.  */
-
-/* Append a new c_expr_t element to V.  */
-#define C_EXPR_APPEND(V, ELEM) \
-  do { \
-c_expr_t __elem = (ELEM); \
-vec_safe_push (V, __elem); \
-  } while (0)
-
 /* A kind of type specifier.  Note that this information is currently
only used to distinguish tag definitions, tag references and typeof
uses.  */
>
> Although this is small enough to go in without a copyright assignment,
> it's a good idea to start on the paperwork if you might be making larger
> changes in future.
>
> http://git.savannah.gnu.org/cgit/gnulib.git/plain/doc/Copyright/request-assign.future
Thanks
>
> --
> Joseph S. Myers
> jos...@codesourcery.com


Re: Fix _Unwind_GetIPInfo detection on Mac OS X 10.4

2014-02-07 Thread Ian Lance Taylor
On Fri, Feb 7, 2014 at 12:59 PM, Misty De Meo  wrote:
> On Fri, Feb 7, 2014 at 11:09 AM, Ian Lance Taylor  wrote:
>> On Fri, Feb 7, 2014 at 9:06 AM, Misty De Meo  wrote:
>>> Revision 192853 added a new test for availability of _Unwind_GetIPInfo
>>> in the system unwinder to the configure script of libbacktrace:
>>> http://repo.or.cz/w/official-gcc.git/commitdiff/a4a5a77adfc9c28d6963e5ae054c997d57cfc7fa
>>> It was apparently added to fix a bug building with GCC 4.0 on Mac OS X
>>> 10.5.
>>>
>>> This is one of a few checks for that function (there's also one in the
>>> top-level configure script). Unfortunately, while the other tests have
>>> special handling for Mac OS X 10.4, this new test does not take it
>>> into account.
>>>
>>> In 10.4, _Unwind_GetIPInfo is not exported in the system unwinder.
>>> Trying to build a test program will succeed, but will fail to link due
>>> to the symbol being undefined. The test added to libbacktrace's
>>> configure used AC_COMPILE_IFELSE, which meant it didn't try to link
>>> and erroneously flagged _Unwind_GetIPInfo as usable. This led to build
>>> errors when linking.
>>>
>>> This small patch changes configure to use AC_LINK_IFELSE, which fixes
>>> the issue on OS X 10.4. I've been able to build GCC successfully.
>>>
>>> GCC 4.8 and 4.9 are affected by this issue.
>>>
>>> A bug report is filed for this as #58710
>>> (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58710).
>>>
>>> diff --git libbacktrace/configure.ac libbacktrace/configure.ac
>>> index 28b2a1c..e0e0e08 100644
>>> --- libbacktrace/configure.ac
>>> +++ libbacktrace/configure.ac
>>> @@ -144,7 +144,7 @@ else
>>>ac_save_CFFLAGS="$CFLAGS"
>>>CFLAGS="$CFLAGS -Werror-implicit-function-declaration"
>>>AC_MSG_CHECKING([for _Unwind_GetIPInfo])
>>> -  AC_COMPILE_IFELSE(
>>> +  AC_LINK_IFELSE(
>>>  [AC_LANG_PROGRAM(
>>> [#include "unwind.h"
>>>  struct _Unwind_Context *context;
>>
>>
>> This is OK with a ChangeLog entry for 4.8 branch and mainline.
>
> Thanks. Is this sufficient?
>
> 2014-02-07  Misty De Meo  
>
> target/PR58710
> * configure.ac: Use AC_LINK_IFELSE in check for _Unwind_GetIPInfo.
> * configure: Regenerate.

Thanks.

Committed to mainline and 4.8 branch.

Ian


Re: Fix _Unwind_GetIPInfo detection on Mac OS X 10.4

2014-02-07 Thread Misty De Meo
On Fri, Feb 7, 2014 at 11:09 AM, Ian Lance Taylor  wrote:
> On Fri, Feb 7, 2014 at 9:06 AM, Misty De Meo  wrote:
>> Revision 192853 added a new test for availability of _Unwind_GetIPInfo
>> in the system unwinder to the configure script of libbacktrace:
>> http://repo.or.cz/w/official-gcc.git/commitdiff/a4a5a77adfc9c28d6963e5ae054c997d57cfc7fa
>> It was apparently added to fix a bug building with GCC 4.0 on Mac OS X
>> 10.5.
>>
>> This is one of a few checks for that function (there's also one in the
>> top-level configure script). Unfortunately, while the other tests have
>> special handling for Mac OS X 10.4, this new test does not take it
>> into account.
>>
>> In 10.4, _Unwind_GetIPInfo is not exported in the system unwinder.
>> Trying to build a test program will succeed, but will fail to link due
>> to the symbol being undefined. The test added to libbacktrace's
>> configure used AC_COMPILE_IFELSE, which meant it didn't try to link
>> and erroneously flagged _Unwind_GetIPInfo as usable. This led to build
>> errors when linking.
>>
>> This small patch changes configure to use AC_LINK_IFELSE, which fixes
>> the issue on OS X 10.4. I've been able to build GCC successfully.
>>
>> GCC 4.8 and 4.9 are affected by this issue.
>>
>> A bug report is filed for this as #58710
>> (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58710).
>>
>> diff --git libbacktrace/configure.ac libbacktrace/configure.ac
>> index 28b2a1c..e0e0e08 100644
>> --- libbacktrace/configure.ac
>> +++ libbacktrace/configure.ac
>> @@ -144,7 +144,7 @@ else
>>ac_save_CFFLAGS="$CFLAGS"
>>CFLAGS="$CFLAGS -Werror-implicit-function-declaration"
>>AC_MSG_CHECKING([for _Unwind_GetIPInfo])
>> -  AC_COMPILE_IFELSE(
>> +  AC_LINK_IFELSE(
>>  [AC_LANG_PROGRAM(
>> [#include "unwind.h"
>>  struct _Unwind_Context *context;
>
>
> This is OK with a ChangeLog entry for 4.8 branch and mainline.

Thanks. Is this sufficient?

2014-02-07  Misty De Meo  

target/PR58710
* configure.ac: Use AC_LINK_IFELSE in check for _Unwind_GetIPInfo.
* configure: Regenerate.
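
The compile-versus-link distinction can be seen with a probe shaped like the
one below.  It is a sketch, not the actual conftest source from configure.ac,
and it assumes a target where unwind.h declares _Unwind_GetIPInfo as a real
function rather than a macro; the typedef and helper name are illustrative.

```c
#include <unwind.h>

/* Pointer type matching the declaration of _Unwind_GetIPInfo in unwind.h.  */
typedef _Unwind_Ptr (*getipinfo_fn) (struct _Unwind_Context *, int *);

/* Taking the function's address creates an undefined-symbol reference in
   the object file.  Compiling this succeeds wherever unwind.h declares the
   function; only the link step fails if the unwinder does not export it.
   The volatile qualifier keeps the reference from being optimized away.  */
static int
unwinder_exports_getipinfo (void)
{
  getipinfo_fn volatile fn = _Unwind_GetIPInfo;

  return fn != 0;
}
```

A check that stops after compilation never sees the missing symbol, which is
why switching the test to AC_LINK_IFELSE detects the situation described for
Mac OS X 10.4.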


Re: remove C_EXPR_APPEND macro

2014-02-07 Thread Joseph S. Myers
On Sat, 8 Feb 2014, Prathamesh Kulkarni wrote:

> This patch removes C_EXPR_APPEND macro in c-tree.h
> OK for trunk ?

Thanks, this is OK with the orphan comment "A varray of c_expr_t." also 
removed (please send the revised patch if you'd like someone to commit it 
for you).

Although this is small enough to go in without a copyright assignment, 
it's a good idea to start on the paperwork if you might be making larger 
changes in future.

http://git.savannah.gnu.org/cgit/gnulib.git/plain/doc/Copyright/request-assign.future

-- 
Joseph S. Myers
jos...@codesourcery.com


remove C_EXPR_APPEND macro

2014-02-07 Thread Prathamesh Kulkarni
This patch removes C_EXPR_APPEND macro in c-tree.h
OK for trunk ?

* c-parser.c (c_parser_get_builtin_args): replace calls to
C_EXPR_APPEND (cexpr_list, expr) by vec_safe_push (cexpr_list, expr)

* c-tree.h (C_EXPR_APPEND): removed

Index: gcc/c/c-parser.c
===
--- gcc/c/c-parser.c(revision 207610)
+++ gcc/c/c-parser.c(working copy)
@@ -6659,12 +6659,12 @@ c_parser_get_builtin_args (c_parser *par
   force_folding_builtin_constant_p
 = saved_force_folding_builtin_constant_p;
   vec_alloc (cexpr_list, 1);
-  C_EXPR_APPEND (cexpr_list, expr);
+  vec_safe_push (cexpr_list, expr);
   while (c_parser_next_token_is (parser, CPP_COMMA))
 {
   c_parser_consume_token (parser);
   expr = c_parser_expr_no_commas (parser, NULL);
-  C_EXPR_APPEND (cexpr_list, expr);
+  vec_safe_push (cexpr_list, expr);
 }

   if (!c_parser_require (parser, CPP_CLOSE_PAREN, "expected %<)%>"))
Index: gcc/c/c-tree.h
===
--- gcc/c/c-tree.h(revision 207610)
+++ gcc/c/c-tree.h(working copy)
@@ -134,13 +134,6 @@ typedef struct c_expr c_expr_t;

 /* A varray of c_expr_t.  */

-/* Append a new c_expr_t element to V.  */
-#define C_EXPR_APPEND(V, ELEM) \
-  do { \
-c_expr_t __elem = (ELEM); \
-vec_safe_push (V, __elem); \
-  } while (0)
-
 /* A kind of type specifier.  Note that this information is currently
only used to distinguish tag definitions, tag references and typeof
uses.  */


Re: [Patch, Fortran] PR 58470: [4.9 Regression] [OOP] ICE on invalid with FINAL procedure and type extension

2014-02-07 Thread Mikael Morin
On 07/02/2014 21:42, Mikael Morin wrote:
> maybe add gcc_assert to
> make it clear that fini->proc_tree should be set at this point.
Or better: a comment ;-)


Re: [Patch, Fortran] PR 58470: [4.9 Regression] [OOP] ICE on invalid with FINAL procedure and type extension

2014-02-07 Thread Mikael Morin
On 06/02/2014 23:40, Janus Weil wrote:
> Hi Mikael,
> 
> thanks for your comments ...
> 
>>> attached is a small patch which fixes an ICE-on-invalid regression
>>> with finalization. In the PR, Dominique objected to the patch, but I
>>> think it's the correct thing to do after all. The line that I'm
>>> removing was added in a patch authored by Tobias and myself. I suspect
>>> it was added to work around some other problem in the finalization
>>> implementation, and there is no evidence it's actually needed.
>>>
>>> The patch regtests cleanly on x86_64-unknown-linux-gnu. Ok for trunk?
>>>
>> Wait a bit; let's try to understand the problem.
>>
>> Normally I would say calling gfc_is_finalizable here in
>> resolve_fl_derived0 is harmless because gfc_resolve_finalizers has been
>> called before in resolve_fl_derived.
>> BUT:
>> resolve_fl_derived0 recurses on its parent type, while
>> gfc_resolve_finalizers doesn't; and in this case we end up recursing on
>> type "cfml" whose finalizers haven't been resolved yet.
> 
> Yes, that's more or less what happens.
> 
> And the real problem is that gfc_is_finalizable already generates the
> finalization wrapper (via gfc_find_derived_vtab ->
> generate_finalization_wrapper) before we have checked that the
> finalizer is actually valid (which is what gfc_resolve_finalizers
> does). Once we get into gfc_resolve_finalizers, it is fooled to
> believe that the finalizer has already been resolved and therefore
> skips the checks and produces no error message.
> 
> 
>> Now whether your patch is the right thing to do... I'm a bit skeptical
>> about removing the one use of gfc_is_finalizable in resolve.c.
> 
> Well, all other occurrences of 'gfc_is_finalizable' are in trans*, so
> this is the only one that comes too early.
> 
Yeah OK.  gfc_is_finalizable is almost a no-op anyway (assuming the vtab
has been generated at resolution stage).
I suggest the following additional patch to make sure that the
finalization wrapper is never generated without prior resolution (in
this case, it replaces one ICE with another); maybe add gcc_assert to
make it clear that fini->proc_tree should be set at this point.
Patch is OK anyway. Thanks.

Mikael

diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c
index d3569fd..20488c0 100644
--- a/gcc/fortran/class.c
+++ b/gcc/fortran/class.c
@@ -1880,8 +1880,6 @@ generate_finalization_wrapper (gfc_symbol *derived, gfc_namespace *ns,

   for (fini = derived->f2k_derived->finalizers; fini; fini = fini->next)
{
- if (!fini->proc_tree)
-   fini->proc_tree = gfc_find_sym_in_symtree (fini->proc_sym);
  if (fini->proc_tree->n.sym->attr.elemental)
{
  fini_elem = fini;
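
For anyone following the thread, here is a minimal sketch of the invariant the diff above relies on: once resolution has run, every finalizer in the list already has its proc_tree set, so the wrapper generator can assert that instead of looking the symbol up again. The struct and function names are hypothetical stand-ins, not the gfortran types, and plain assert stands in for gcc_assert.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for gfc_symtree and gfc_finalizer.  */
struct symtree
{
  bool elemental;
};

struct finalizer
{
  struct symtree *proc_tree;   /* filled in by the resolution step */
  struct finalizer *next;
};

/* Return the first elemental finalizer in LIST, or NULL if there is
   none.  Resolution must already have set proc_tree on every entry.  */
struct finalizer *
find_elemental_finalizer (struct finalizer *list)
{
  for (struct finalizer *fini = list; fini; fini = fini->next)
    {
      /* Stands in for gcc_assert (fini->proc_tree).  */
      assert (fini->proc_tree != NULL);
      if (fini->proc_tree->elemental)
        return fini;
    }
  return NULL;
}
```

With the assert in place, a wrapper generated before resolution fails loudly at the point of the broken assumption rather than later.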



[RFC] PR 59776 - esra vs gimple_debug

2014-02-07 Thread Richard Henderson
In the testcases with the PR, we have a bit of type punning going on,

  *(int *) &s2.f = 0;
  s2 = s1;

which SRA transforms to

  # DEBUG s2 => 0
  MEM[(int *)&s2] = 0;
  # DEBUG s2 => s1$f_7
  # DEBUG s2$g => s1$g_6
  s2 ={v} {CLOBBER};

Note that it has chosen not to expand s1.f like s1.g, but to expand that field
as the type-punned integer.  Which means that "s2 => s1$f_7" has mismatched
types across lhs and rhs: SI => SF.  Which understandably ICEs during rtl
expansion.

I'm not really sure how this is avoided for the actual code generation, but
this minimal patch (aka hack) simply drops the debug info to avoid the ICE.

Thoughts on how this might really be solved?


r~
diff --git a/gcc/testsuite/gcc.dg/debug/pr59776.c b/gcc/testsuite/gcc.dg/debug/pr59776.c
new file mode 100644
index 000..245c3d7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/pr59776.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+
+struct S { float f, g; };
+
+void
+sub_ (struct S *p)
+{
+  struct S s1, s2;
+  s1 = *p;
+  *(int *) &s2.f = 0;
+  s2 = s1;
+}
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 4992b4c..f9ff0a4 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -2950,9 +2950,15 @@ load_assign_lhs_subreplacements (struct access *lacc, struct access *top_racc,
  lacc);
  else
drhs = NULL_TREE;
- ds = gimple_build_debug_bind (get_access_replacement (lacc),
-   drhs, gsi_stmt (*old_gsi));
- gsi_insert_after (new_gsi, ds, GSI_NEW_STMT);
+
+ // ??? Can this type mismatch only happen with debuginfo,
+ // or can it happen with real code as well?
+ if (drhs && types_compatible_p (lacc->type, TREE_TYPE (drhs)))
+   {
+ ds = gimple_build_debug_bind (get_access_replacement (lacc),
+   drhs, gsi_stmt (*old_gsi));
+ gsi_insert_after (new_gsi, ds, GSI_NEW_STMT);
+   }
}
}
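
As a toy model of the guard in the hunk above: the debug bind is only emitted when there is a right-hand side and both sides agree on type. The mode enum and the replacement struct below are assumptions made up for illustration; they are not the tree-sra.c data structures, and equality of modes is only a crude stand-in for types_compatible_p.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical modes: SI is a 32-bit integer, SF a 32-bit float.  */
enum mode { MODE_SI, MODE_SF };

/* Hypothetical scalar replacement created for one field.  */
struct replacement
{
  enum mode mode;
};

/* Decide whether a debug bind "lhs => rhs" may be emitted.  It is
   dropped when there is no rhs or when the two sides disagree on
   type, mirroring drhs && types_compatible_p (...) in the patch.  */
bool
can_emit_debug_bind (const struct replacement *lhs,
                     const struct replacement *rhs)
{
  assert (lhs != NULL);
  return rhs != NULL && lhs->mode == rhs->mode;
}
```

In the PR's case the left side was expanded as the punned integer (SI) and the right side as the float field (SF), so the bind would be skipped.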
 


Re: Fix _Unwind_GetIPInfo detection on Mac OS X 10.4

2014-02-07 Thread Ian Lance Taylor
On Fri, Feb 7, 2014 at 9:06 AM, Misty De Meo  wrote:
> Revision 192853 added a new test for availability of _Unwind_GetIPInfo
> in the system unwinder to the configure script of libbacktrace:
> http://repo.or.cz/w/official-gcc.git/commitdiff/a4a5a77adfc9c28d6963e5ae054c997d57cfc7fa
> It was apparently added to fix a bug building with GCC 4.0 on Mac OS X
> 10.5.
>
> This is one of a few checks for that function (there's also one in the
> top-level configure script). Unfortunately, while the other tests have
> special handling for Mac OS X 10.4, this new test does not take it
> into account.
>
> In 10.4, _Unwind_GetIPInfo is not exported in the system unwinder.
> Trying to build a test program will succeed, but will fail to link due
> to the symbol being undefined. The test added to libbacktrace's
> configure used AC_COMPILE_IFELSE, which meant it didn't try to link
> and erroneously flagged _Unwind_GetIPInfo as usable. This led to build
> errors when linking.
>
> This small patch changes configure to use AC_LINK_IFELSE, which fixes
> the issue on OS X 10.4. I've been able to build GCC successfully.
>
> GCC 4.8 and 4.9 are affected by this issue.
>
> A bug report is filed for this as #58710
> (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58710).
>
> diff --git libbacktrace/configure.ac libbacktrace/configure.ac
> index 28b2a1c..e0e0e08 100644
> --- libbacktrace/configure.ac
> +++ libbacktrace/configure.ac
> @@ -144,7 +144,7 @@ else
>ac_save_CFFLAGS="$CFLAGS"
>CFLAGS="$CFLAGS -Werror-implicit-function-declaration"
>AC_MSG_CHECKING([for _Unwind_GetIPInfo])
> -  AC_COMPILE_IFELSE(
> +  AC_LINK_IFELSE(
>  [AC_LANG_PROGRAM(
> [#include "unwind.h"
>  struct _Unwind_Context *context;


This is OK with a ChangeLog entry for 4.8 branch and mainline.

Thanks.

Ian


Re: [Patch, Fortran] PR 58470: [4.9 Regression] [OOP] ICE on invalid with FINAL procedure and type extension

2014-02-07 Thread Janus Weil
> But after all I think that the patch should not hurt. After giving it
> some second thoughts, the only alternative I could see is this:
>
> Index: gcc/fortran/resolve.c
> ===
> --- gcc/fortran/resolve.c(revision 207485)
> +++ gcc/fortran/resolve.c(working copy)
> @@ -11224,13 +11224,6 @@ gfc_resolve_finalizers (gfc_symbol* derived)
>gfc_finalizer* i;
>int my_rank;
>
> -  /* Skip this finalizer if we already resolved it.  */
> -  if (list->proc_tree)
> -{
> -  prev_link = &(list->next);
> -  continue;
> -}
> -
>/* Check this exists and is a SUBROUTINE.  */
>if (!list->proc_sym->attr.subroutine)
>  {
>
>
> It also gets rid of the ICE, but I haven't regtested it yet. Does this
> look better to you than the original patch? (It might give duplicate
> error messages in some cases?)

Unfortunately this ICEs on a good number of finalize_* test cases ...

Cheers,
Janus


RFA: patch for PR59535

2014-02-07 Thread Vladimir Makarov
  The following patch improves code size for ARM.  Before the patch 
CSiBE size generated by GCC configured --with-arch=armv7-a 
--with-fpu=vfpv3-d16 --with-float=hard (with -mthumb) was


2414926

After the patch the size is

2396798

For comparison, when the reload pass is used the size is

2400154

The change in arm.h is to prevent reloading sp as an address by LRA. 
Reload has no such problem as it uses legitimate address hook and LRA 
mostly relies on base_reg_class.


Richard, is this part ok to commit to the trunk?

The change in lra-constraints.c is for correct alternative choice in 
move patterns when pseudo is of class of general reg and one alternative 
contains lo regs and another one contains hi regs.


The patch was bootstrapped on x86/x86-64 and arm.

2014-02-07  Vladimir Makarov  

PR rtl-optimization/59535
* lra-constraints.c (process_alt_operands): Encourage alternative
when unassigned pseudo class is superset of the alternative class.
* config/arm/arm.h (MODE_BASE_REG_CLASS): Return CORE_REGS for
Thumb2 for LRA.

Index: lra-constraints.c
===
--- lra-constraints.c   (revision 207562)
+++ lra-constraints.c   (working copy)
@@ -2112,6 +2112,21 @@ process_alt_operands (int only_alternati
  goto fail;
}
 
+ /* If not assigned pseudo has a class which a subset of
+required reg class, it is a less costly alternative
+as the pseudo stil can get a hard reg of necessary
+class.  */
+ if (! no_regs_p && REG_P (op) && hard_regno[nop] < 0
+ && (cl = get_reg_class (REGNO (op))) != NO_REGS
+ && ira_class_subset_p[this_alternative][cl])
+   {
+ if (lra_dump_file != NULL)
+   fprintf
+ (lra_dump_file,
+  "%d Super set class reg: reject-=3\n", nop);
+ reject -= 3;
+   }
+
  this_alternative_offmemok = offmemok;
  if (this_costly_alternative != NO_REGS)
{
Index: config/arm/arm.h
===
--- config/arm/arm.h(revision 207562)
+++ config/arm/arm.h(working copy)
@@ -1272,8 +1272,8 @@ enum reg_class
when addressing quantities in QI or HI mode; if we don't know the
mode, then we must be conservative.  */
 #define MODE_BASE_REG_CLASS(MODE)  \
-(TARGET_ARM || (TARGET_THUMB2 && !optimize_size) ? CORE_REGS :  \
- (((MODE) == SImode) ? BASE_REGS : LO_REGS))
+(TARGET_ARM || (TARGET_THUMB2 && (!optimize_size || arm_lra_flag)) \
+ ? CORE_REGS : ((MODE) == SImode ? BASE_REGS : LO_REGS))
 
 /* For Thumb we can not support SP+reg addressing, so we return LO_REGS
instead of BASE_REGS.  */
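
A rough sketch of the lra-constraints.c heuristic, under stated assumptions: register classes are modelled as bitmasks over a made-up 16-register file, and class_subset_p plays the role of ira_class_subset_p. The masks and function names are hypothetical; only the "reject -= 3" adjustment is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register classes as bitmasks over 16 registers.  */
#define TOY_NO_REGS    0x0000u
#define TOY_LO_REGS    0x00ffu   /* low half of the file */
#define TOY_HI_REGS    0xff00u   /* high half of the file */
#define TOY_CORE_REGS  0xffffu   /* everything */

/* True if every register in SUB is also in SUPER.  */
bool
class_subset_p (unsigned sub, unsigned super)
{
  return (sub & ~super) == 0;
}

/* Make an alternative cheaper when the operand is a pseudo without a
   hard register yet and the alternative's class is a subset of the
   pseudo's class: the pseudo can still be given a suitable register.  */
int
adjust_reject (int reject, bool pseudo_unassigned,
               unsigned pseudo_class, unsigned alternative_class)
{
  if (pseudo_unassigned
      && pseudo_class != TOY_NO_REGS
      && class_subset_p (alternative_class, pseudo_class))
    reject -= 3;
  return reject;
}
```

For example, an unassigned pseudo of class TOY_CORE_REGS matched against a TOY_LO_REGS alternative gets the discount, while a TOY_LO_REGS pseudo matched against a TOY_CORE_REGS alternative does not.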


Contents of PO file 'cpplib-4.9-b20140202.fi.po'

2014-02-07 Thread Translation Project Robot


cpplib-4.9-b20140202.fi.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.



New Finnish PO file for 'cpplib' (version 4.9-b20140202)

2014-02-07 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Finnish team of translators.  The file is available at:

http://translationproject.org/latest/cpplib/fi.po

(This file, 'cpplib-4.9-b20140202.fi.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [Bug fortran/60066] Bad elemental invocation of non-scalar base object

2014-02-07 Thread Paul Richard Thomas
Dear All,

I propose to add the attached to the testsuite.  It is the testcase
from PR60066, which was fixed by the patch for PR59906.

OK for trunk, 4.8 and 4.7?

On 5 February 2014 12:38, pault at gcc dot gnu.org
 wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60066
>
> Paul Thomas  changed:
>
>What|Removed |Added
> 
>  Resolution|DUPLICATE   |FIXED
>
> --- Comment #8 from Paul Thomas  ---
> (In reply to Dominique d'Humieres from comment #5)
>> > I have applied the patch at 
>> > http://gcc.gnu.org/ml/fortran/2014-02/txtX3eVILZEGw.txt
>> > on top of 4.8.3 r206497 and the test runs successfully ...
>>
>> Marking as duplicate of pr49906.
>>
>> Paul,
>>
>> For the record, no regression when testing with
>>
>> make -k -j8 check-gfortran RUNTESTFLAGS="--target_board=unix'{-m32,-m64}'"
>>
>> *** This bug has been marked as a duplicate of bug 49906 ***
>
> I will, however, add this testcase to that of PR59906 - it is different yet
> again from the verification tests although it is fixed by the patch.
>
> Cheers
>
> Pau
>
> --
> You are receiving this mail because:
> You are on the CC list for the bug.



-- 
The knack of flying is learning how to throw yourself at the ground and miss.
   --Hitchhikers Guide to the Galaxy
Index: gcc/testsuite/gfortran.dg/elemental_subroutine_10.f90
===
*** gcc/testsuite/gfortran.dg/elemental_subroutine_10.f90   (revision 0)
--- gcc/testsuite/gfortran.dg/elemental_subroutine_10.f90   (working copy)
***
*** 0 
--- 1,68 
+ ! { dg-do run }
+ !
+ ! PR fortran/60066
+ !
+ ! Contributed by F Martinez Fadrique  
+ !
+ ! Fixed by the patch for PR59906 but adds another, different test.
+ !
+ module m_assertion_character
+   implicit none
+   type :: t_assertion_character
+ character(len=8) :: name
+   contains
+ procedure :: assertion_character
+ procedure :: write => assertion_array_write
+   end type t_assertion_character
+ contains
+   elemental subroutine assertion_character( ast, name )
+ class(t_assertion_character), intent(out) :: ast
+ character(len=*), intent(in) :: name
+ ast%name = name
+   end subroutine assertion_character
+   subroutine assertion_array_write( ast, unit )
+ class(t_assertion_character), intent(in) :: ast
+ character(*), intent(inOUT) :: unit
+ write(unit,*) trim (unit(2:len(unit)))//trim (ast%name)
+   end subroutine assertion_array_write
+ end module m_assertion_character
+ 
+ module m_assertion_array_character
+   use m_assertion_character
+   implicit none
+   type :: t_assertion_array_character
+ type(t_assertion_character), dimension(:), allocatable :: rast
+   contains
+ procedure :: assertion_array_character
+ procedure :: write => assertion_array_character_write
+   end type t_assertion_array_character
+ contains
+   pure subroutine assertion_array_character( ast, name, nast )
+ class(t_assertion_array_character), intent(out) :: ast
+ character(len=*), intent(in) :: name
+ integer, intent(in) :: nast
+ integer :: i
+ allocate ( ast%rast(nast) )
+ call ast%rast%assertion_character ( name )
+   end subroutine assertion_array_character
+   subroutine assertion_array_character_write( ast, unit )
+ class(t_assertion_array_character), intent(in) :: ast
+ CHARACTER(*), intent(inOUT) :: unit
+ integer :: i
+ do i = 1, size (ast%rast)
+   call ast%rast(i)%write (unit)
+ end do
+   end subroutine assertion_array_character_write
+ end module m_assertion_array_character
+ 
+ program main
+   use m_assertion_array_character
+   implicit none
+   type(t_assertion_array_character) :: ast
+   character(len=8) :: name
+   character (26) :: line = ''
+   name = 'test'
+   call ast%assertion_array_character ( name, 5 )
+   call ast%write (line)
+   if (line(2:len (line)) .ne. "testtesttesttesttest") call abort
+ end program main


Re: [PATCH] optabs: Allow CAS expanders to fail

2014-02-07 Thread Joseph S. Myers
On Fri, 7 Feb 2014, Andreas Krebbel wrote:

> Hi,
> 
> on S/390 128 bit atomic operations are not allowed for misaligned
> operands.  The expanders are supposed to FAIL in that case.  While it
> works for the other routines atomic_load/store it does not work
> currently during compare and swap expansion.
> 
> The patch just turns an expand_insn into maybe_expand_insn to allow
> other alternatives to be chosen when the expander fails.

I expect this is correct for cases when a misaligned operand gets through 
that the front end thought was aligned - I doubt it's possible to avoid 
all such cases where misalignment only becomes visible here.  But it also 
suggests there may well be problems in other places.  In particular:

* c-common.c:resolve_overloaded_atomic_* may need to check if the 
alignment is sufficient and generate library calls if not.  (This would 
need to be conservative about alignment, like C11 _Alignof, for cases such 
as x86 long long in structures; see 
.)

* It's possible your target should also increase alignment of relevant 
types when _Atomic-qualified.  (But increasing to an alignment greater 
than is returned by malloc may not be a good idea.)
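
To illustrate the first point above, here is a minimal sketch of a conservative alignment test that picks between an inline sequence and a library call. The enum and both function names are hypothetical; nothing here is the c-common.c code, and a real front end would reason about types and declared alignment rather than runtime addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if P is a multiple of ALIGNMENT, which must be a nonzero
   power of two.  */
bool
is_aligned_for (const void *p, size_t alignment)
{
  assert (alignment != 0 && (alignment & (alignment - 1)) == 0);
  return ((uintptr_t) p & (alignment - 1)) == 0;
}

/* Hypothetical outcome of expanding a compare-and-swap.  */
enum cas_strategy { CAS_INLINE, CAS_LIBCALL };

/* Use the inline sequence only when the operand is known to be
   sufficiently aligned; otherwise fall back to a library call.  */
enum cas_strategy
choose_cas_strategy (const void *operand, size_t required_alignment)
{
  return is_aligned_for (operand, required_alignment)
         ? CAS_INLINE : CAS_LIBCALL;
}
```

For a 16-byte operand, an address such as 0x1000 would take the inline path and 0x1008 would go to the library.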

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: RFA: MN10300: Include saved registers in stack usage

2014-02-07 Thread Jeff Law

On 02/06/14 08:37, nick clifton wrote:

Hi Jeff,


According to our coding conventions, the ability to build with something
other than gcc is still desirable.  You could argue that you're unlikely
to be bootstrapping on a mn103 with something other than GCC and if
you're building a cross, you could start by first building gcc native.

However, it's pretty easy to avoid the headaches and just provide a
popcount routine.


OK, here is a version of the patch with a homebrew popcount() function
in it.  OK to apply ?

Yes.  This is fine.
jeff
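
For reference, a homebrew popcount of the kind discussed above can be as small as the sketch below. This is not the function from Nick's patch, just one common way to write it without relying on a GCC builtin; the name is an assumption.

```c
/* Count the set bits in MASK without __builtin_popcount, so the code
   also builds with compilers other than GCC.  */
int
popcount_portable (unsigned int mask)
{
  int count = 0;
  while (mask != 0)
    {
      mask &= mask - 1;   /* clear the lowest set bit */
      ++count;
    }
  return count;
}
```

The loop runs once per set bit, which is cheap for a register save mask that only has a handful of bits set.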



Re: [PATCH] Fix PR52289, a typoed word in an error message

2014-02-07 Thread Jeff Law

On 02/06/14 13:39, Benno Schulenberg wrote:


[Oops, had a wrong bug number in the subject line.]

Below patch fixes another miswording in an error message,
reported by Roland Stigge.  Please apply.


2014-02-06  Benno Schulenberg  

 PR translation/52289
 * fortran/resolve.c (resolve_ordinary_assign): Fix typoed word
 in an error message.

Thanks.  Installed.

jeff



Fix more trivial comment typos

2014-02-07 Thread Jeff Law


I meant to install this a week or so ago, but got sidetracked by other 
more pressing issues.


Installed on the trunk as obvious.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 5703bb5..ce9c066 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,7 @@
+2014-02-07  Jeff Law  
+
+   * ipa-inline.c (inline_small_functions): Fix typos.
+
 2014-02-07  Richard Sandiford  
 
* config/s390/s390-protos.h (s390_can_use_simple_return_insn)
diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index ce24ea5..d304133 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -1749,9 +1749,9 @@ inline_small_functions (void)
  continue;
}
 
-  /* Heuristics for inlining small functions works poorly for
-recursive calls where we do efect similar to loop unrolling.
-When inliing such edge seems profitable, leave decision on
+  /* Heuristics for inlining small functions work poorly for
+recursive calls where we do effects similar to loop unrolling.
+When inlining such edge seems profitable, leave decision on
 specific inliner.  */
   if (cgraph_edge_recursive_p (edge))
{
@@ -1779,10 +1779,11 @@ inline_small_functions (void)
  struct cgraph_node *outer_node = NULL;
  int depth = 0;
 
- /* Consider the case where self recursive function A is inlined into B.
-This is desired optimization in some cases, since it leads to effect
-similar of loop peeling and we might completely optimize out the
-recursive call.  However we must be extra selective.  */
+ /* Consider the case where self recursive function A is inlined
+into B.  This is desired optimization in some cases, since it
+leads to effect similar of loop peeling and we might completely
+optimize out the recursive call.  However we must be extra
+selective.  */
 
  where = edge->caller;
  while (where->global.inlined_to)


Re: minor help message fix

2014-02-07 Thread Xinliang David Li
On Fri, Feb 7, 2014 at 1:22 AM, Richard Biener
 wrote:
> On Thu, Feb 6, 2014 at 10:30 PM, Xinliang David Li  wrote:
>> Hi the following patch removes the 'state' print for
>> -ftree-vectorize option which does not make sense anymore. Ok for
>> trunk?
>
> Hmm, isn't it more appropriate to remove 'Report' from ftree-vectorize
> in common.opt?

For a supported option, we should probably report it. Do we want to
deprecate it in the future?

> Or simply treat the [enabled]/[disabled] literally?

Not clear what you mean.  I was thinking of putting something like [see
-ftree-loop-vectorize and -ftree-slp-vectorize] but it wraps around
and looks bad.

> Or simply also enable (redundantly) OPT_ftree_vectorize at -O3?

This does not feel right. The flag does not represent one single
optimization. Imagine we have a default level where loop vectorize is
on, but slp is off (O2 will likely end up like that), what will the
enabled/disabled state be for tree-vectorize?

David


>
> Richard.
>
>> thanks,
>>
>> David


Fix _Unwind_GetIPInfo detection on Mac OS X 10.4

2014-02-07 Thread Misty De Meo
Revision 192853 added a new test for availability of _Unwind_GetIPInfo
in the system unwinder to the configure script of libbacktrace:
http://repo.or.cz/w/official-gcc.git/commitdiff/a4a5a77adfc9c28d6963e5ae054c997d57cfc7fa
It was apparently added to fix a bug building with GCC 4.0 on Mac OS X
10.5.

This is one of a few checks for that function (there's also one in the
top-level configure script). Unfortunately, while the other tests have
special handling for Mac OS X 10.4, this new test does not take it
into account.

In 10.4, _Unwind_GetIPInfo is not exported in the system unwinder.
Trying to build a test program will succeed, but will fail to link due
to the symbol being undefined. The test added to libbacktrace's
configure used AC_COMPILE_IFELSE, which meant it didn't try to link
and erroneously flagged _Unwind_GetIPInfo as usable. This led to build
errors when linking.

This small patch changes configure to use AC_LINK_IFELSE, which fixes
the issue on OS X 10.4. I've been able to build GCC successfully.

GCC 4.8 and 4.9 are affected by this issue.

A bug report is filed for this as #58710
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58710).

diff --git libbacktrace/configure.ac libbacktrace/configure.ac
index 28b2a1c..e0e0e08 100644
--- libbacktrace/configure.ac
+++ libbacktrace/configure.ac
@@ -144,7 +144,7 @@ else
   ac_save_CFFLAGS="$CFLAGS"
   CFLAGS="$CFLAGS -Werror-implicit-function-declaration"
   AC_MSG_CHECKING([for _Unwind_GetIPInfo])
-  AC_COMPILE_IFELSE(
+  AC_LINK_IFELSE(
 [AC_LANG_PROGRAM(
[#include "unwind.h"
 struct _Unwind_Context *context;
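
As an illustration of why the link step matters: a program like the sketch below compiles on any system whose unwind.h declares _Unwind_GetIPInfo, but it only links where the unwinder actually exports the symbol. This is not the test program configure generates; the function names count_frame and count_frames_with_ipinfo are made up, and the sketch assumes a GCC-style unwind.h that also provides _Unwind_Backtrace.

```c
#include <assert.h>
#include <stddef.h>
#include <unwind.h>

/* Callback for _Unwind_Backtrace: ask the unwinder for the IP of each
   frame and count the frames that report one.  */
static _Unwind_Reason_Code
count_frame (struct _Unwind_Context *context, void *data)
{
  int ip_before_insn = 0;

  assert (data != NULL);
  /* This reference is what has to resolve at link time.  */
  _Unwind_Ptr ip = _Unwind_GetIPInfo (context, &ip_before_insn);
  if (ip != 0)
    ++*(int *) data;
  return _URC_NO_REASON;   /* keep walking */
}

/* Walk the current call stack and return the number of frames seen.  */
int
count_frames_with_ipinfo (void)
{
  int frames = 0;
  _Unwind_Backtrace (count_frame, &frames);
  return frames;
}
```

A compile-only check accepts this file on Mac OS X 10.4 because the declaration is enough; a link check rejects it there because the symbol is undefined.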


Re: [PATCH] Fix linemap_location_before_p with adhoc locs (PR preprocessor/56824)

2014-02-07 Thread Tom Tromey
> "Jakub" == Jakub Jelinek  writes:

Jakub> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.  Thanks.

Tom


Re: [PATCH] New optimize(0) versioning fix (PR target/60026, take 2)

2014-02-07 Thread Jan Hubicka
> On Fri, Feb 07, 2014 at 12:50:22AM +0100, Jan Hubicka wrote:
> > Don't we want to check opt_for_fn (node->decl, cp) instead and arrange 
> > -fipa-cp
> > to be false when !optimize?
> 
> I can easily imagine using
>   !opt_for_fn (node->decl, optimize)
>   || !opt_for_fn (node->decl, flag_ipa_cp)
> but guaranteeing flag_ipa_cp or flag_ipa_sra is never true for optimize == 0
> could be harder, what if something is built with -O0 -fipa-cp or
> __attribute__((optimize (0), "fipa-cp"))) or similar?  Checking optimize
> value is among other things about the lack of vdef/vuse for !optimize.

I always thought it would be better to inform users that -O0 -fipa-cp is
a broken combination of flags (but it seems to be our policy not to do that)
or just clear -fipa-cp while processing arguments, as we do for some
other contradictory combinations.
But yes, let's go with those two checks now, to do what the gate does:
static bool
cgraph_gate_cp (void)
{
  /* FIXME: We should remove the optimize check after we ensure we never run
 IPA passes when not optimizing.  */
  return flag_ipa_cp && optimize;
}

Honza
> 
>   Jakub
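
A small sketch of the per-function form of that gate, for readers of the archive. The struct is a hypothetical stand-in for the option values the thread reads with opt_for_fn; it is not a GCC data structure, and the function name is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-function option record.  */
struct fn_options
{
  int optimize;       /* the -O level in effect for this function */
  bool flag_ipa_cp;   /* whether -fipa-cp is in effect for it */
};

/* Per-function equivalent of cgraph_gate_cp: both conditions must
   hold, so -O0 -fipa-cp still counts as disabled.  */
bool
ipa_cp_enabled_for (const struct fn_options *opts)
{
  assert (opts != NULL);
  return opts->optimize != 0 && opts->flag_ipa_cp;
}
```

Checking both values covers the optimize (0) attribute case as well as an explicit -O0 -fipa-cp on the command line.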


Re: Avoid unnnecesary copying of ipa-prop's expressions

2014-02-07 Thread Jan Hubicka
> On Thu, 6 Feb 2014, Jan Hubicka wrote:
> 
> > Hi,
> > at WPA we currently read trees accessed by jump functions and then copy them
> > to remove location that is already known to be UNKNOWN and then keep copying
> > them for every inline clone introduced (and there are many for firefox)
> > 
> > This patch makes us to copy only when expression really has an location in 
> > it.
> > 
> > Bootstrapped/regtested x86_64-linux, OK?
> 
> Hmm, I think you either can use just
> 
> if (EXPR_P (expr))
>   walk_tree (&expr, prune_expr_location, NULL, NULL);
> 
> or you miss unsharing and create invalid shared trees when
> the expr does not contain locations.
> 
> I fear it's the latter, given how ipa_set_jf_* is used.

Well, ipa-prop analysis takes random operands from GIMPLE bodies and stores
them into jump functions. Then it streams them in/out, propagates them and
eventually uses them as replacements in bodies.

We use unshare_without_location primarily to prevent LTO from needing to
stream stale BLOCK expressions and to avoid inserting wrong blocks into clones:
http://gcc.gnu.org/ml/gcc-patches/2012-12/msg01176.html
http://gcc.gnu.org/ml/gcc-patches/2012-09/msg01343.html

Calling prune_expr_location would kill locations in the original function
body they are taken from. For constant JFs, they are IP invariants, so I do
not think they need unsharing. The arithmetic is never inserted back into
GIMPLE code.

I can also just add a no_unshare parameter to jump functions to avoid the
unsharing during stream-in, possibly.

Honza
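
The "copy only when there is a location" idea can be sketched on a toy expression tree as below. The node type and all three function names are hypothetical; this is not GCC's tree representation or the actual patch. Note that the no-copy path returns the original node, which is exactly the sharing Richard is asking about.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical two-operand expression node; not GCC's tree.  */
struct expr
{
  bool has_location;
  struct expr *op0;
  struct expr *op1;
};

/* True if E or any operand below it carries a location.  */
bool
expr_contains_location (const struct expr *e)
{
  return e != NULL
         && (e->has_location
             || expr_contains_location (e->op0)
             || expr_contains_location (e->op1));
}

/* Deep-copy E with every location dropped.  The caller owns the copy.  */
struct expr *
copy_without_location (const struct expr *e)
{
  if (e == NULL)
    return NULL;
  struct expr *copy = malloc (sizeof *copy);
  if (copy == NULL)
    abort ();
  copy->has_location = false;
  copy->op0 = copy_without_location (e->op0);
  copy->op1 = copy_without_location (e->op1);
  return copy;
}

/* Copy only when there is a location to strip; otherwise hand back E
   itself, so the result is shared with the original.  */
struct expr *
maybe_copy_without_location (struct expr *e)
{
  return expr_contains_location (e) ? copy_without_location (e) : e;
}
```

Whether the shared result is safe depends on whether the expression is ever inserted back into a function body, which is the open question in this thread.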


Re: [PING] [PATCH] _Cilk_for for C and C++

2014-02-07 Thread Jakub Jelinek
On Fri, Feb 07, 2014 at 02:33:41PM +, Iyer, Balaji V wrote:
> > So, the issues I see:
> > 1) what is iter.1, why do you have it at all, and, after all, the iterator 
> > is a class
> > that needs to be constructed/destructed in the general way, so creating any
> > further copies of something is both costly and undesirable
> > 
> 
> Well, to get the loop count, I need to calculate it using
> operator-(array.end (), &iter).
> 
> Now, if I do that iter is already set. I need to reset iter back to the
> original one (array.begin ()) in the child function.  This is why I used a
> temporary variable called iter1.

operator- shouldn't really change iter, if it does, it is purely the user's
fault, isn't it?  It isn't operator -=, so it shouldn't really change
array.end () either.

> > 2) the schedule clause doesn't belong on the omp parallel, but on the
> > _Cilk_for
> > 
> 
> What if grain is a variable say "x"? If I have it in the _Cilk_for, then
> won't it create omp_data_i->x.  That is not correct.  It should just emit
> "x." But let me look into this to make sure...

You certainly should gimplify the clause operand before the omp parallel, it
must be an integral anyway, right?  So just use get_temp_regvar?
Then simply use firstprivate on the #pragma omp parallel.  When you actually
omp expand, you'll still be able to find the original variable and look it
up on the parallel.  But, if you can't make it work, guess I could live with
the clause on the parallel.

> > 3) iter should be firstprivate, and there should be no explicit private var 
> > with
> > assignment during gimplification, just handle it like any other firstprivate
> > during omp lowering
> > 
> 
> Do you mean to say I should manually insert a firstprivate for iter and
> not the system figure out that it is shared?

Yes.  The class iterator is quite special thing, because already the C++ FE
lowers it to an integral iterator instead.  And when you make it
firstprivate, omp lowering/expansion should take care of running the copy
constructor/destructor in the parallel for you.

Jakub
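
A rough C analogue of the point about operator- and the integral iterator, with plain pointers standing in for the C++ iterator class: computing the trip count leaves both operands untouched, and the loop then runs over an integer index while each iteration derives its element from the original begin. The function names are assumptions for illustration only, not anything from the Cilk patch.

```c
#include <assert.h>
#include <stddef.h>

/* Trip count from two pointers into the same array.  Neither operand
   is modified, just as a well-behaved operator- would not modify its
   arguments.  */
ptrdiff_t
trip_count (const int *begin, const int *end)
{
  assert (begin != NULL && end != NULL && begin <= end);
  return end - begin;
}

/* Iterate with an integral index; every iteration works from the
   untouched BEGIN, so no saved copy of it is needed.  */
long
sum_range (const int *begin, const int *end)
{
  ptrdiff_t n = trip_count (begin, end);
  long total = 0;
  for (ptrdiff_t i = 0; i != n; ++i)
    total += begin[i];
  return total;
}
```

Since begin is never advanced, there is nothing to reset afterwards, which is the argument against keeping a separate iter.1 copy.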


RE: [PING] [PATCH] _Cilk_for for C and C++

2014-02-07 Thread Iyer, Balaji V


> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Jakub Jelinek
> Sent: Friday, February 7, 2014 9:03 AM
> To: Iyer, Balaji V
> Cc: 'Jason Merrill'; 'Jeff Law'; 'Aldy Hernandez'; 'gcc-patches@gcc.gnu.org';
> 'r...@redhat.com'
> Subject: Re: [PING] [PATCH] _Cilk_for for C and C++
> 
> On Wed, Feb 05, 2014 at 05:27:26AM +, Iyer, Balaji V wrote:
> > Attached, please find a fixed patch (diff.txt) that will do as you
> requested (model _Cilk_for like a #pragma omp parallel for). Along with this,
> I have also attached two Changelog entries (1 for C and 1 for C++).
> > It passes all the tests on my x86_64 box (both 32 and 64 bit modes)
> and does not affect any other tests in the testsuite.
> > Is this Ok for trunk?
> 
> A step in the right direction, but I still see issues just from looking at the
> *.gimple dump:
> 
> For the first testcase, I see:
> iter = std::vector::begin (array); [return slot optimization]
> iter.1 = iter;
> D.13615 = std::vector::end (array); [return slot 
> optimization]
> try
>   {
> retval.0 = __gnu_cxx::operator- > 
> (&D.13615,
> &iter);
>   }
> finally
>   {
> D.13615 = {CLOBBER};
>   }
> #pragma omp parallel schedule(cilk-for grain,0) if(retval.0)
> #shared(iter.1) shared(D.13632) shared(D.13615) shared(iter)
>   {
> difference_type retval.2;
> const difference_type D.13633;
> int D.13725;
> struct __normal_iterator & D.13726;
> bool retval.3;
> int & D.13728;
> int D.13729;
> int & D.13732;
> 
> iter = iter.1;
>  private(D.13631)
> _Cilk_for (D.13631 = 0; D.13631 != retval.2; D.13631 = 
> D.13631 + 1)
>   D.13725 = D.13631 - D.13632;
> 
> So, the issues I see:
> 1) what is iter.1, why do you have it at all, and, after all, the iterator is 
> a class
> that needs to be constructed/destructed in the general way, so creating any
> further copies of something is both costly and undesirable
> 

Well, to get the loop count, I need to calculate it using
operator-(array.end (), &iter).

Now, if I do that iter is already set. I need to reset iter back to the 
original one (array.begin ()) in the child function. This is why I used a 
temporary variable called iter1.



> 2) the schedule clause doesn't belong on the omp parallel, but on the
> _Cilk_for
> 

What if grain is a variable say "x"? If I have it in the _Cilk_for, then won't 
it create omp_data_i->x. That is not correct. It should just emit "x." But let 
me look into this to make sure...

> 3) iter should be firstprivate, and there should be no explicit private var 
> with
> assignment during gimplification, just handle it like any other firstprivate
> during omp lowering
> 

Do you mean to say I should manually insert a firstprivate for iter and not the 
system figure out that it is shared? 


> 4) the printing looks weird for _Cilk_for, as I said earlier, the clauses 
> should
> probably be printed after the closing ) of _Cilk_for rather than after nothing
> on the previous line; also, there is no {} printed around the _Cilk_for body
> and the next line is weirdly indented
> 

Ok will look into this.

> But more importantly, if I create some testcase with a generic C++
> conforming iterator (copied over from libgomp/testsuite/libgomp.c++/for-
> 1.C), as in the second testcase, the *.gimple dump shows that _Cilk_for is 
> still
> around the #pragma omp parallel.
> The intent of the second testcase is that you can really eyeball all the
> ctors/dtors/copy ctors etc. that should happen, and for -O0 shouldn't be
> really inlined.
> 
>   Jakub


Re: [PING] [PATCH] _Cilk_for for C and C++

2014-02-07 Thread Jakub Jelinek
On Wed, Feb 05, 2014 at 05:27:26AM +, Iyer, Balaji V wrote:
>   Attached, please find a fixed patch (diff.txt) that will do as you 
> requested (model _Cilk_for like a #pragma omp parallel for). Along with this, 
> I have also attached two Changelog entries (1 for C and 1 for C++).
>   It passes all the tests on my x86_64 box (both 32 and 64 bit modes) and 
> does not affect any other tests in the testsuite.
>   Is this Ok for trunk?

A step in the right direction, but I still see issues just from looking at
the *.gimple dump:

For the first testcase, I see:
iter = std::vector::begin (array); [return slot optimization]
iter.1 = iter;
D.13615 = std::vector::end (array); [return slot optimization]
try
  {
retval.0 = __gnu_cxx::operator- > 
(&D.13615, &iter);
  }
finally
  {
D.13615 = {CLOBBER};
  }
#pragma omp parallel schedule(cilk-for grain,0) if(retval.0)
#shared(iter.1) shared(D.13632) shared(D.13615) shared(iter)
  {
difference_type retval.2;
const difference_type D.13633;
int D.13725;
struct __normal_iterator & D.13726;
bool retval.3;
int & D.13728;
int D.13729;
int & D.13732;

iter = iter.1;
 private(D.13631)
_Cilk_for (D.13631 = 0; D.13631 != retval.2; D.13631 = D.13631 
+ 1)
  D.13725 = D.13631 - D.13632;

So, the issues I see:
1) what is iter.1, why do you have it at all, and, after all, the iterator
is a class that needs to be constructed/destructed in the general way, so
creating any further copies of something is both costly and undesirable

2) the schedule clause doesn't belong on the omp parallel, but on the _Cilk_for

3) iter should be firstprivate, and there should be no explicit private var
with assignment during gimplification, just handle it like any other
firstprivate during omp lowering

4) the printing looks weird for _Cilk_for, as I said earlier, the clauses
should probably be printed after the closing ) of _Cilk_for rather than
after nothing on the previous line; also, there is no {} printed around the
_Cilk_for body and the next line is weirdly indented

But more importantly, if I create some testcase with a generic C++
conforming iterator (copied over from
libgomp/testsuite/libgomp.c++/for-1.C), as in the second testcase, the
*.gimple dump shows that _Cilk_for is still around the #pragma omp parallel.
The intent of the second testcase is that you can really eyeball all the
ctors/dtors/copy ctors etc. that should happen, and for -O0 shouldn't be
really inlined.

Jakub
#include <vector>

void
foo (std::vector<int> &array)
{
  _Cilk_for (std::vector<int>::iterator iter = array.begin(); iter != array.end(); iter++)
  {
    if (*iter  == 6)
      *iter = 13;
  }
}
typedef __PTRDIFF_TYPE__ ptrdiff_t;

template <typename T>
class I
{
public:
  typedef ptrdiff_t difference_type;
  I ();
  ~I ();
  I (T *);
  I (const I &);
  T &operator * ();
  T *operator -> ();
  T &operator [] (const difference_type &) const;
  I &operator = (const I &);
  I &operator ++ ();
  I operator ++ (int);
  I &operator -- ();
  I operator -- (int);
  I &operator += (const difference_type &);
  I &operator -= (const difference_type &);
  I operator + (const difference_type &) const;
  I operator - (const difference_type &) const;
  template <typename S> friend bool operator == (I<S> &, I<S> &);
  template <typename S> friend bool operator == (const I<S> &, const I<S> &);
  template <typename S> friend bool operator < (I<S> &, I<S> &);
  template <typename S> friend bool operator < (const I<S> &, const I<S> &);
  template <typename S> friend bool operator <= (I<S> &, I<S> &);
  template <typename S> friend bool operator <= (const I<S> &, const I<S> &);
  template <typename S> friend bool operator > (I<S> &, I<S> &);
  template <typename S> friend bool operator > (const I<S> &, const I<S> &);
  template <typename S> friend bool operator >= (I<S> &, I<S> &);
  template <typename S> friend bool operator >= (const I<S> &, const I<S> &);
  template <typename S> friend typename I<S>::difference_type operator - (I<S> &, I<S> &);
  template <typename S> friend typename I<S>::difference_type operator - (const I<S> &, const I<S> &);
  template <typename S> friend I<S> operator + (typename I<S>::difference_type , const I<S> &);
private:
  T *p;
};
template <typename T> I<T>::I () : p (0) {}
template <typename T> I<T>::~I () {}
template <typename T> I<T>::I (T *x) : p (x) {}
template <typename T> I<T>::I (const I &x) : p (x.p) {}
template <typename T> T &I<T>::operator * () { return *p; }
template <typename T> T *I<T>::operator -> () { return p; }
template <typename T> T &I<T>::operator [] (const difference_type &x) const { return p[x]; }
template <typename T> I<T> &I<T>::operator = (const I &x) { p = x.p; return *this; }
template <typename T> I<T> &I<T>::operator ++ () { ++p; return *this; }
template <typename T> I<T> I<T>::operator ++ (int) { return I (p++); }
template <typename T> I<T> &I<T>::operator -- () { --p; return *this; }
template <typename T> I<T> I<T>::operator -- (int) { return I (p--); }
template <typename T> I<T> &I<T>::operator += (const difference_type &x) { p += x; return

Re: [PATCH] Add alloc_align and assume_aligned attributes (PR middle-end/60092)

2014-02-07 Thread Richard Biener
On Fri, 7 Feb 2014, Jakub Jelinek wrote:

> On Fri, Feb 07, 2014 at 10:02:29AM +0100, Richard Biener wrote:
> > > +  if (TREE_CODE (position) != INTEGER_CST
> > > +  || TREE_INT_CST_HIGH (position)
> > > +  || TREE_INT_CST_LOW (position) < 1
> > > +  || TREE_INT_CST_LOW (position) > arg_count)
> > 
> > You make it easier for wide-int folks if you use tree_fits_uhwi_p
> > and tree_to_uhwi ...
> 
> That was just a copy of the code from alloc_size, changed that too.
> 
> > > +static prop_value_t
> > > +bit_value_alloc_assume_aligned_attribute (gimple stmt, tree attr,
> > > +   prop_value_t ptrval,
> > > +   bool alloc_aligned)
> > > +{
> > 
> > This function is very similar to the existing bit_value_assume_aligned
> > which asks for some factoring?  Like share the tails once you've
> > figured out align and misalign values?
> 
> I've added support for the two attributes and original
> __builtin_assume_aligned in just one function, will test it momentarily.
> 
> > I wonder if we want to backport support for these attributes
> > to 4.8 (and 4.7?).
> 
> I think it doesn't help much.  At least glibc will need to conditionalize
> the attributes on gcc version anyway, so they will be used only for GCC >=
> 4.9 anyway (unless we'd do it for >= 4.8.4 or something, can't be 4.8.3,
> because, while it hasn't been released, current 4.8 branch snapshots mark
> themselves as 4.8.3 in the patchlevel).

Ah, indeed.  Didn't think about the snapshots...

> > Will you be working on a glibc patch?
> 
> I'll tell our glibc folks.

Thanks.  Updated patch looks ok.

Richard.

> 2014-02-07  Jakub Jelinek  
> 
>   PR middle-end/60092
>   * tree-ssa-ccp.c (surely_varying_stmt_p): Don't return true
>   if TYPE_ATTRIBUTES (gimple_call_fntype ()) contain
>   assume_aligned or alloc_align attributes.
>   (bit_value_assume_aligned): Add ATTR, PTRVAL and ALLOC_ALIGN
>   arguments.  Handle also assume_aligned and alloc_align attributes.
>   (evaluate_stmt): Adjust bit_value_assume_aligned caller.
>   Handle calls to functions with assume_aligned or alloc_align
>   attributes.
>   * doc/extend.texi: Document assume_aligned and alloc_align
>   attributes.
> c-family/
>   * c-common.c (handle_alloc_size_attribute): Use tree_fits_uhwi_p
>   and tree_to_uhwi.
>   (handle_alloc_align_attribute, handle_assume_aligned_attribute): New
>   functions.
>   (c_common_attribute_table): Add alloc_align and assume_aligned
>   attributes.
> testsuite/
>   * gcc.dg/attr-alloc_align-1.c: New test.
>   * gcc.dg/attr-alloc_align-2.c: New test.
>   * gcc.dg/attr-alloc_align-3.c: New test.
>   * gcc.dg/attr-assume_aligned-1.c: New test.
>   * gcc.dg/attr-assume_aligned-2.c: New test.
>   * gcc.dg/attr-assume_aligned-3.c: New test.
> 
> --- gcc/c-family/c-common.c.jj2014-02-07 11:44:07.114924852 +0100
> +++ gcc/c-family/c-common.c   2014-02-07 11:58:23.996841044 +0100
> @@ -366,6 +366,8 @@ static tree handle_warn_unused_result_at
>  static tree handle_sentinel_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_type_generic_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_alloc_size_attribute (tree *, tree, tree, int, bool *);
> +static tree handle_alloc_align_attribute (tree *, tree, tree, int, bool *);
> +static tree handle_assume_aligned_attribute (tree *, tree, tree, int, bool 
> *);
>  static tree handle_target_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_optimize_attribute (tree *, tree, tree, int, bool *);
>  static tree ignore_attribute (tree *, tree, tree, int, bool *);
> @@ -766,6 +768,10 @@ const struct attribute_spec c_common_att
> handle_omp_declare_simd_attribute, false },
>{ "omp declare target", 0, 0, true, false, false,
> handle_omp_declare_target_attribute, false },
> +  { "alloc_align", 1, 1, false, true, true,
> +   handle_alloc_align_attribute, false },
> +  { "assume_aligned",  1, 2, false, true, true,
> +   handle_assume_aligned_attribute, false },
>{ NULL, 0, 0, false, false, false, NULL, false }
>  };
>  
> @@ -8043,16 +8049,62 @@ handle_alloc_size_attribute (tree *node,
> && TREE_CODE (position) != FUNCTION_DECL)
>   position = default_conversion (position);
>  
> -  if (TREE_CODE (position) != INTEGER_CST
> -   || TREE_INT_CST_HIGH (position)
> -   || TREE_INT_CST_LOW (position) < 1
> -   || TREE_INT_CST_LOW (position) > arg_count )
> +  if (!tree_fits_uhwi_p (position)
> +   || !IN_RANGE (tree_to_uhwi (position), 1, arg_count))
>   {
> warning (OPT_Wattributes,
>  "alloc_size parameter outside range");
> *no_add_attrs = true;
> return NULL_TREE;
>   }
> +}
> +  return NULL

Re: [PATCH] Add alloc_align and assume_aligned attributes (PR middle-end/60092)

2014-02-07 Thread Jakub Jelinek
On Fri, Feb 07, 2014 at 10:02:29AM +0100, Richard Biener wrote:
> > +  if (TREE_CODE (position) != INTEGER_CST
> > +  || TREE_INT_CST_HIGH (position)
> > +  || TREE_INT_CST_LOW (position) < 1
> > +  || TREE_INT_CST_LOW (position) > arg_count)
> 
> You make it easier for wide-int folks if you use tree_fits_uhwi_p
> and tree_to_uhwi ...

That was just a copy of the code from alloc_size, changed that too.

> > +static prop_value_t
> > +bit_value_alloc_assume_aligned_attribute (gimple stmt, tree attr,
> > + prop_value_t ptrval,
> > + bool alloc_aligned)
> > +{
> 
> This function is very similar to the existing bit_value_assume_aligned
> which asks for some factoring?  Like share the tails once you've
> figured out align and misalign values?

I've added support for the two attributes and original
__builtin_assume_aligned in just one function, will test it momentarily.

> I wonder if we want to backport support for these attributes
> to 4.8 (and 4.7?).

I think it doesn't help much.  At least glibc will need to conditionalize
the attributes on gcc version anyway, so they will be used only for GCC >=
4.9 anyway (unless we'd do it for >= 4.8.4 or something, can't be 4.8.3,
because, while it hasn't been released, current 4.8 branch snapshots mark
themselves as 4.8.3 in the patchlevel).

> Will you be working on a glibc patch?

I'll tell our glibc folks.
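
For illustration, a minimal sketch of how user code might conditionalize the
two new attributes on the GCC version, as discussed above.  The macro names
MY_ALLOC_ALIGN and MY_ASSUME_ALIGNED and the functions my_aligned_alloc and
my_buffer are hypothetical and not part of the patch; the use of C11
aligned_alloc is likewise just an assumption for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper macros: expand to the new attributes only for
   GCC >= 4.9 and to nothing for older compilers.  */
#if defined (__GNUC__) \
    && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 9))
# define MY_ALLOC_ALIGN(pos) __attribute__ ((alloc_align (pos)))
# define MY_ASSUME_ALIGNED(n) __attribute__ ((assume_aligned (n)))
#else
# define MY_ALLOC_ALIGN(pos)
# define MY_ASSUME_ALIGNED(n)
#endif

/* Parameter 1 (ALIGN) gives the alignment of the returned pointer.  */
void *my_aligned_alloc (size_t align, size_t size) MY_ALLOC_ALIGN (1);

/* The returned pointer may be assumed to be 32-byte aligned.  */
float *my_buffer (void) MY_ASSUME_ALIGNED (32);

void *
my_aligned_alloc (size_t align, size_t size)
{
  /* ALIGN must be a nonzero power of two.  */
  assert (align != 0 && (align & (align - 1)) == 0);

  /* C11 aligned_alloc wants SIZE to be a multiple of ALIGN.  */
  size_t rounded = (size + align - 1) / align * align;
  return aligned_alloc (align, rounded);
}

float *
my_buffer (void)
{
  /* Storage that really has the alignment promised by the attribute.  */
  static float storage[64] __attribute__ ((aligned (32)));
  return storage;
}
```

With older compilers the macros expand to nothing, so the declarations stay
valid and only the extra alignment information is lost.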

2014-02-07  Jakub Jelinek  

PR middle-end/60092
* tree-ssa-ccp.c (surely_varying_stmt_p): Don't return true
if TYPE_ATTRIBUTES (gimple_call_fntype ()) contain
assume_aligned or alloc_align attributes.
(bit_value_assume_aligned): Add ATTR, PTRVAL and ALLOC_ALIGN
arguments.  Handle also assume_aligned and alloc_align attributes.
(evaluate_stmt): Adjust bit_value_assume_aligned caller.
Handle calls to functions with assume_aligned or alloc_align
attributes.
* doc/extend.texi: Document assume_aligned and alloc_align
attributes.
c-family/
* c-common.c (handle_alloc_size_attribute): Use tree_fits_uhwi_p
and tree_to_uhwi.
(handle_alloc_align_attribute, handle_assume_aligned_attribute): New
functions.
(c_common_attribute_table): Add alloc_align and assume_aligned
attributes.
testsuite/
* gcc.dg/attr-alloc_align-1.c: New test.
* gcc.dg/attr-alloc_align-2.c: New test.
* gcc.dg/attr-alloc_align-3.c: New test.
* gcc.dg/attr-assume_aligned-1.c: New test.
* gcc.dg/attr-assume_aligned-2.c: New test.
* gcc.dg/attr-assume_aligned-3.c: New test.

--- gcc/c-family/c-common.c.jj  2014-02-07 11:44:07.114924852 +0100
+++ gcc/c-family/c-common.c 2014-02-07 11:58:23.996841044 +0100
@@ -366,6 +366,8 @@ static tree handle_warn_unused_result_at
 static tree handle_sentinel_attribute (tree *, tree, tree, int, bool *);
 static tree handle_type_generic_attribute (tree *, tree, tree, int, bool *);
 static tree handle_alloc_size_attribute (tree *, tree, tree, int, bool *);
+static tree handle_alloc_align_attribute (tree *, tree, tree, int, bool *);
+static tree handle_assume_aligned_attribute (tree *, tree, tree, int, bool *);
 static tree handle_target_attribute (tree *, tree, tree, int, bool *);
 static tree handle_optimize_attribute (tree *, tree, tree, int, bool *);
 static tree ignore_attribute (tree *, tree, tree, int, bool *);
@@ -766,6 +768,10 @@ const struct attribute_spec c_common_att
  handle_omp_declare_simd_attribute, false },
   { "omp declare target", 0, 0, true, false, false,
  handle_omp_declare_target_attribute, false },
+  { "alloc_align",   1, 1, false, true, true,
+ handle_alloc_align_attribute, false },
+  { "assume_aligned",1, 2, false, true, true,
+ handle_assume_aligned_attribute, false },
   { NULL, 0, 0, false, false, false, NULL, false }
 };
 
@@ -8043,16 +8049,62 @@ handle_alloc_size_attribute (tree *node,
  && TREE_CODE (position) != FUNCTION_DECL)
position = default_conversion (position);
 
-  if (TREE_CODE (position) != INTEGER_CST
- || TREE_INT_CST_HIGH (position)
- || TREE_INT_CST_LOW (position) < 1
- || TREE_INT_CST_LOW (position) > arg_count )
+  if (!tree_fits_uhwi_p (position)
+ || !IN_RANGE (tree_to_uhwi (position), 1, arg_count))
{
  warning (OPT_Wattributes,
   "alloc_size parameter outside range");
  *no_add_attrs = true;
  return NULL_TREE;
}
+}
+  return NULL_TREE;
+}
+
+/* Handle a "alloc_align" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_alloc_align_attribute (tree *node, tree, tree args, int,
+ bool *no_add_attrs)
+{
+  unsigned arg_count = type_num_

[PATCH] Fix more typos in error messages

2014-02-07 Thread Benno Schulenberg

Hi,

The patch below fixes some more typos in GCC's error messages.
If it looks okay, please apply.


2014-02-07  Benno Schulenberg  

* config/arc/arc.c (arc_init): Fix typo in error message.
* config/i386/i386.c (ix86_expand_builtin): Likewise.
(split_stack_prologue_scratch_regno): Likewise.
* fortran/check.c (gfc_check_fn_rc2008): Remove duplicate
word from error message.


Index: gcc/fortran/check.c
===
--- gcc/fortran/check.c (revision 207597)
+++ gcc/fortran/check.c (working copy)
@@ -1736,7 +1736,7 @@
 return false;
 
   if (a->ts.type == BT_COMPLEX
-  && !gfc_notify_std (GFC_STD_F2008, "COMPLEX argument '%s' "
+  && !gfc_notify_std (GFC_STD_F2008, "COMPLEX '%s' "
  "argument of '%s' intrinsic at %L", 
  gfc_current_intrinsic_arg[0]->name, 
  gfc_current_intrinsic, &a->where))
Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c  (revision 207597)
+++ gcc/config/i386/i386.c  (working copy)
@@ -11804,7 +11804,7 @@
  if (regparm >= 2)
{
  sorry ("-fsplit-stack does not support 2 register "
-" parameters for a nested function");
+"parameters for a nested function");
  return INVALID_REGNUM;
}
  return DX_REG;
@@ -36006,7 +36006,7 @@
 
   if (!insn_data[icode].operand[3].predicate (op3, mode3))
{
- error ("the forth argument must be scale 1, 2, 4, 8");
+ error ("the fourth argument must be scale 1, 2, 4, 8");
  return const0_rtx;
}
 
Index: gcc/config/arc/arc.c
===
--- gcc/config/arc/arc.c(revision 207597)
+++ gcc/config/arc/arc.c(working copy)
@@ -746,7 +746,7 @@
   error ("-mmul32x16 supported only for ARC600 or ARC601");
 
   if (!TARGET_DPFP && TARGET_DPFP_DISABLE_LRSR)
-  error ("-mno-dpfp-lrsr suppforted only with -mdpfp");
+  error ("-mno-dpfp-lrsr supported only with -mdpfp");
 
   /* FPX-1. No fast and compact together.  */
   if ((TARGET_DPFP_FAST_SET && TARGET_DPFP_COMPACT_SET)


Re: [PATCH] optabs: Allow CAS expanders to fail

2014-02-07 Thread Jakub Jelinek
On Fri, Feb 07, 2014 at 12:58:37PM +0100, Andreas Krebbel wrote:
> 2014-02-07  Andreas Krebbel  
> 
>   * optabs.c (expand_atomic_compare_and_swap): Allow expander to
>   fail.

Ok.

> --- a/gcc/optabs.c
> +++ b/gcc/optabs.c
> @@ -7383,12 +7383,13 @@ expand_atomic_compare_and_swap (rtx *ptarget_bool, 
> rtx *ptarget_oval,
>create_integer_operand (&ops[5], is_weak);
>create_integer_operand (&ops[6], succ_model);
>create_integer_operand (&ops[7], fail_model);
> -  expand_insn (icode, 8, ops);
> -
> -  /* Return success/failure.  */
> -  target_bool = ops[0].value;
> -  target_oval = ops[1].value;
> -  goto success;
> +  if (maybe_expand_insn (icode, 8, ops))
> + {
> +   /* Return success/failure.  */
> +   target_bool = ops[0].value;
> +   target_oval = ops[1].value;
> +   goto success;
> + }
>  }
>  
>/* Otherwise fall back to the original __sync_val_compare_and_swap

Jakub


Re: [PATCH] sync builtin testcase: Add alignment attribute on TImode variable

2014-02-07 Thread Jakub Jelinek
On Fri, Feb 07, 2014 at 02:12:44PM +0100, Andreas Krebbel wrote:
> 2014-02-07  Andreas Krebbel  
> 
>   * gcc.dg/gcc-have-sync-compare-and-swap.c: Align the 16 byte
>   variable used for atomic operations.

Ok.

> --- a/gcc/testsuite/gcc.dg/gcc-have-sync-compare-and-swap.c
> +++ b/gcc/testsuite/gcc.dg/gcc-have-sync-compare-and-swap.c
> @@ -40,10 +40,12 @@ void f8()
>  #endif
>  }
>  
> +/* aligned (16): On S/390 16 byte compare and swap operations are only
> +   available if the memory operand resides on a 16 byte boundary.  */
>  void f16()
>  {
>  #ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
> -  typedef int __attribute__ ((__mode__ (__TI__))) ti_int_type;
> +  typedef int __attribute__ ((__mode__ (__TI__), aligned (16))) ti_int_type;
>ti_int_type ti_int;
>__sync_bool_compare_and_swap (&ti_int, (ti_int_type)0, (ti_int_type)1);
>  #endif

Jakub


Re: [PATCH] PR60092 - lower posix_memalign to make align-info accessible

2014-02-07 Thread Jakub Jelinek
On Fri, Feb 07, 2014 at 10:33:45AM +0100, Richard Biener wrote:
> Thus like the following.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk
> at this stage?
> 
> Thanks,
> Richard.
> 
> 2014-02-07  Richard Biener  
> 
>   PR middle-end/60092
>   * gimple-low.c (lower_builtin_posix_memalign): New function.
>   (lower_stmt): Call it to lower posix_memalign in a way
>   to make alignment info accessible.
> 
>   * gcc.dg/vect/pr60092-2.c: New testcase.

Ok.

Jakub


[PATCH] sync builtin testcase: Add alignment attribute on TImode variable

2014-02-07 Thread Andreas Krebbel
Hi,

the S/390 expanders reject operands that are not naturally aligned.  This
makes gcc-have-sync-compare-and-swap.c fail.

The attached patch adds an alignment attribute to the data
type used in the check to make it succeed on S/390 again.

In the future perhaps it would be more appropriate to introduce a 128
bit data type supposed to be used for atomic operations.  Given that
we already have __int128_t perhaps we should also provide a
__atomic_int128_t?  For S/390 we cannot change the alignment of
__int128_t anymore but we would require a stricter alignment for
__atomic_int128_t right from the beginning in order to enable our
atomic hardware instructions.  Does that sound reasonable?
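
As a rough sketch of what such an over-aligned 128 bit type looks like in user
code today (the function name try_cas16 is hypothetical, and the whole body is
guarded so that it also compiles on targets without a 16 byte compare and
swap):

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if the swap happened, 0 if it did not, and -1 when the
   target offers no inline 16 byte compare and swap.  */
int
try_cas16 (void)
{
#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
  /* Same typedef as in the adjusted testcase: TImode with an explicit
     16 byte alignment.  */
  typedef int __attribute__ ((__mode__ (__TI__), aligned (16))) ti_int_type;
  ti_int_type value = 0;

  /* The alignment attribute guarantees a 16 byte boundary.  */
  assert ((uintptr_t) &value % 16 == 0);

  return __sync_bool_compare_and_swap (&value, (ti_int_type) 0,
                                       (ti_int_type) 1) ? 1 : 0;
#else
  return -1;
#endif
}
```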

Ok for mainline?

Bye,

-Andreas-


2014-02-07  Andreas Krebbel  

* gcc.dg/gcc-have-sync-compare-and-swap.c: Align the 16 byte
variable used for atomic operations.


commit 4b91700e39db14f474b030956b14be94794b589c
Author: Andreas Krebbel 
Date:   Fri Feb 7 13:59:47 2014 +0100

Add alignment attribute to 128 bit type for atomic operations.

diff --git a/gcc/testsuite/gcc.dg/gcc-have-sync-compare-and-swap.c 
b/gcc/testsuite/gcc.dg/gcc-have-sync-compare-and-swap.c
index faed818..5affeba 100644
--- a/gcc/testsuite/gcc.dg/gcc-have-sync-compare-and-swap.c
+++ b/gcc/testsuite/gcc.dg/gcc-have-sync-compare-and-swap.c
@@ -40,10 +40,12 @@ void f8()
 #endif
 }
 
+/* aligned (16): On S/390 16 byte compare and swap operations are only
+   available if the memory operand resides on a 16 byte boundary.  */
 void f16()
 {
 #ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
-  typedef int __attribute__ ((__mode__ (__TI__))) ti_int_type;
+  typedef int __attribute__ ((__mode__ (__TI__), aligned (16))) ti_int_type;
   ti_int_type ti_int;
   __sync_bool_compare_and_swap (&ti_int, (ti_int_type)0, (ti_int_type)1);
 #endif



[PATCH] S/390: Reject misaligned operands in atomic expanders

2014-02-07 Thread Andreas Krebbel
Hi,

on S/390 atomic operands need to be naturally aligned.  Otherwise the
instruction throws a specification exception.  With the patch the
expanders reject operands that are not properly aligned.
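
To illustrate what "naturally aligned" means here, a small run-time analogue
of the compile-time MEM_ALIGN check in the patch below.  This is only a
sketch; the helper names and the storage buffer are hypothetical and do not
appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero if P is aligned to SIZE bytes, where SIZE is the
   operand size and must be a power of two.  */
int
is_naturally_aligned (const void *p, size_t size)
{
  assert (size != 0 && (size & (size - 1)) == 0);
  return ((uintptr_t) p & (size - 1)) == 0;
}

/* Hypothetical storage with a known 16 byte alignment.  */
static unsigned char storage[32] __attribute__ ((aligned (16)));

/* A 16 byte operand starting here is naturally aligned.  */
const void *
aligned_operand (void)
{
  return storage;
}

/* A 16 byte operand starting here is only 8 byte aligned.  */
const void *
misaligned_operand (void)
{
  return storage + 8;
}
```

A 16 byte operand at storage + 8 would be rejected, while an 8 byte operand
at the same address would still be fine.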

This only works if the expander code actually allows atomic expanders
to fail.  This did not seem to be the case for the compare and swap code.
Fixed with a separate patch.

I'll commit the patch after the optab fix is in.

This fixes 45 atomic fails in the testsuite on s390x.

Bye,

-Andreas-

2014-02-07  Andreas Krebbel  

* config/s390/s390.md ("atomic_load", "atomic_store")
("atomic_compare_and_swap", "atomic_fetch_"):
Reject misaligned operands.

commit 9431dff0d0dcf8771689c1df49e0bea50ac96d5a
Author: Andreas Krebbel 
Date:   Fri Feb 7 12:51:17 2014 +0100

S/390: Prevent 128bit atomic ops from being used on misaligned memory 
operands.

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index bccc159..3f86304 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -9108,6 +9108,9 @@
(match_operand:SI 2 "const_int_operand")]   ;; model
   ""
 {
+  if (MEM_ALIGN (operands[1]) < GET_MODE_BITSIZE (GET_MODE (operands[1])))
+FAIL;
+
   if (mode == TImode)
 emit_insn (gen_atomic_loadti_1 (operands[0], operands[1]));
   else if (mode == DImode && !TARGET_ZARCH)
@@ -9149,6 +9152,9 @@
 {
   enum memmodel model = (enum memmodel) INTVAL (operands[2]);
 
+  if (MEM_ALIGN (operands[0]) < GET_MODE_BITSIZE (GET_MODE (operands[0])))
+FAIL;
+
   if (mode == TImode)
 emit_insn (gen_atomic_storeti_1 (operands[0], operands[1]));
   else if (mode == DImode && !TARGET_ZARCH)
@@ -9203,6 +9209,9 @@
   if (!register_operand (output, mode))
 output = gen_reg_rtx (mode);
 
+  if (MEM_ALIGN (operands[2]) < GET_MODE_BITSIZE (GET_MODE (operands[2])))
+FAIL;
+
   emit_insn (gen_atomic_compare_and_swap_internal
 (output, operands[2], operands[3], operands[4]));
 
@@ -9319,6 +9328,9 @@
(match_operand:SI 3 "const_int_operand")]   ;; model
   "TARGET_Z196"
 {
+  if (MEM_ALIGN (operands[1]) < GET_MODE_BITSIZE (GET_MODE (operands[1])))
+FAIL;
+
   emit_insn (gen_atomic_fetch__iaf
 (operands[0], operands[1], operands[2]));
   DONE;



[PATCH] optabs: Allow CAS expanders to fail

2014-02-07 Thread Andreas Krebbel
Hi,

on S/390 128 bit atomic operations are not allowed for misaligned
operands.  The expanders are supposed to FAIL in that case.  While this
works for the other routines (atomic_load/store), it currently does not
work during compare and swap expansion.

The patch just turns an expand_insn into maybe_expand_insn to allow
other alternatives to be chosen when the expander fails.
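
The idea can be modelled outside of GCC roughly as follows.  This is only a
sketch of the "try an alternative, fall through on failure" pattern; none of
the names below (expander_fn, expand_inline_cas, expand_fallback_cas,
expand_cas) exist in optabs.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical expander: returns false to signal FAIL.  */
typedef bool (*expander_fn) (unsigned align_bits, const char **chosen);

/* Models a target expander that FAILs for misaligned operands.  */
bool
expand_inline_cas (unsigned align_bits, const char **chosen)
{
  if (align_bits < 128)
    return false;
  *chosen = "inline";
  return true;
}

/* Models a fallback that always succeeds.  */
bool
expand_fallback_cas (unsigned align_bits, const char **chosen)
{
  (void) align_bits;
  *chosen = "fallback";
  return true;
}

/* Try each alternative in order and keep the first one that succeeds,
   instead of insisting that the first one must work.  */
const char *
expand_cas (unsigned align_bits)
{
  static const expander_fn alternatives[] =
    { expand_inline_cas, expand_fallback_cas };
  const char *chosen = NULL;

  for (size_t i = 0; i < sizeof alternatives / sizeof alternatives[0]; i++)
    if (alternatives[i] (align_bits, &chosen))
      {
        assert (chosen != NULL);
        return chosen;
      }
  return NULL;
}
```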

The patch is required to fix many c11-atomic* testcases on S/390.

gcc.dg/atomic/c11-atomic-exec-5.c still fails since
TARGET_ATOMIC_ASSIGN_EXPAND_FENV is not defined yet.

Ok for mainline?

2014-02-07  Andreas Krebbel  

* optabs.c (expand_atomic_compare_and_swap): Allow expander to
fail.

commit 63b1953fadae985ef8302159dacbfc8bf541ec1f
Author: Andreas Krebbel 
Date:   Fri Feb 7 12:50:50 2014 +0100

optabs: Allow compare and swap expanders to fail

diff --git a/gcc/optabs.c b/gcc/optabs.c
index e36fd13..cec25a4 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -7383,12 +7383,13 @@ expand_atomic_compare_and_swap (rtx *ptarget_bool, rtx 
*ptarget_oval,
   create_integer_operand (&ops[5], is_weak);
   create_integer_operand (&ops[6], succ_model);
   create_integer_operand (&ops[7], fail_model);
-  expand_insn (icode, 8, ops);
-
-  /* Return success/failure.  */
-  target_bool = ops[0].value;
-  target_oval = ops[1].value;
-  goto success;
+  if (maybe_expand_insn (icode, 8, ops))
+   {
+ /* Return success/failure.  */
+ target_bool = ops[0].value;
+ target_oval = ops[1].value;
+ goto success;
+   }
 }
 
   /* Otherwise fall back to the original __sync_val_compare_and_swap



[PATCH] Fix for PR60080

2014-02-07 Thread Bernd Edlinger
Hi,

there has been an ICE on Solaris 9 and 10 when dumping ASM_INPUT objects
without a valid source location in print-rtl.c.

print_rtx did not check for this, and tried to print NULL with printf format %s.
This happens to be handled by glibc's printf, which prints "(null)", but not on
Solaris.

Attached is my proposed patch for this: first, attach the source location to
ASM_INPUT objects, which should not hurt; second, check for
UNKNOWN_LOCATION in print-rtl.c for ASM_INPUT and ASM_OPERANDS.
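
The general shape of such a check, sketched outside of GCC: never hand a NULL
pointer to a %s conversion, and print nothing when no location is known.  The
function format_location and its output format are hypothetical and not taken
from the attached patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Writes " FILE:LINE" into BUF, or an empty string when FILE is NULL.
   Returns the number of characters that the full text requires.  */
int
format_location (char *buf, size_t size, const char *file, int line)
{
  assert (buf != NULL && size > 0);

  /* Passing NULL for %s is undefined; glibc prints "(null)", other
     C libraries may crash.  */
  if (file == NULL)
    {
      buf[0] = '\0';
      return 0;
    }
  return snprintf (buf, size, " %s:%d", file, line);
}
```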

Bootstrapped and regression-tested on x86_64-linux-gnu.


Thanks
Bernd.

2014-02-06  Bernd Edlinger  

PR middle-end/60080
* cfgexpand.c (expand_asm_operands): Attach source location to
ASM_INPUT rtx objects.
* print-rtl.c (print_rtx): Check for UNKNOWN_LOCATION.



patch-pr60080.diff
Description: Binary data


Re: [PATCH 12/8] [AVX-512] Improve EAS, ICC, GCC conformance.

2014-02-07 Thread Uros Bizjak
On Fri, Feb 7, 2014 at 11:13 AM, Uros Bizjak  wrote:

>> This (should be) the last patch for AVX-512 support in v4.9.
>>
>> It improves correspondence between ICC, SDM [1], and official
>> intrinsics guide [2].
>>
>> What was done:
>>   - Fixed shifts such as VPSLLD and friends. Actual instruction
>> loads 128-bit of count and uses only 64-bit (see [1]). So,
>> we're changing V4S* -> V2D* in built-ins.
>>
>>   - Rename (back) _mm512_[load|store]u_epi32 to
>> _mm512_[load|store]u_si512 according to [2].
>>
>>   - Remove floor and ceil with zero-masking and/or rounding.
>>
>>   - Remove _mm512_expand_p[s|d] as it is absent in [2].
>>
>>   - Make scatter prefetch take 1,2,5,6 immediates. 1 and 5 mean
>> L1, 2 and 6 mean L2.
>>
>>   - Make gather prefetch take 1 and 2 meaning corresponding cache
>> levels.
>>
>>   - Fix all tests accordingly.
>>
>> gcc/
>> * config/i386/avx512fintrin.h (_mm512_storeu_epi64): Removed.
>> (_mm512_loadu_epi32): Renamed into...
>> (_mm512_loadu_si512): This.
>> (_mm512_storeu_epi32): Renamed into...
>> (_mm512_storeu_si512): This.
>> (_mm512_maskz_ceil_ps): Removed.
>> (_mm512_maskz_ceil_pd): Ditto.
>> (_mm512_maskz_floor_ps): Ditto.
>> (_mm512_maskz_floor_pd): Ditto.
>> (_mm512_floor_round_ps): Ditto.
>> (_mm512_floor_round_pd): Ditto.
>> (_mm512_ceil_round_ps): Ditto.
>> (_mm512_ceil_round_pd): Ditto.
>> (_mm512_mask_floor_round_ps): Ditto.
>> (_mm512_mask_floor_round_pd): Ditto.
>> (_mm512_mask_ceil_round_ps): Ditto.
>> (_mm512_mask_ceil_round_pd): Ditto.
>> (_mm512_maskz_floor_round_ps): Ditto.
>> (_mm512_maskz_floor_round_pd): Ditto.
>> (_mm512_maskz_ceil_round_ps): Ditto.
>> (_mm512_maskz_ceil_round_pd): Ditto.
>> (_mm512_expand_pd): Ditto.
>> (_mm512_expand_ps): Ditto.
>> (_mm512_sll_epi32): Updated parameter type.
>> (_mm512_mask_sll_epi32): Ditto.
>> (_mm512_maskz_sll_epi32): Ditto.
>> (_mm512_srl_epi32): Ditto.
>> (_mm512_mask_srl_epi32): Ditto.
>> (_mm512_maskz_srl_epi32): Ditto.
>> (_mm512_sra_epi32): Ditto.
>> (_mm512_mask_sra_epi32): Ditto.
>> (_mm512_maskz_sra_epi32): Ditto.
>> * config/i386/i386-builtin-type.def 
>> (V16SI_FTYPE_V16SI_V4SI_V16SI_HI):
>> Change into...
>> (V16SI_FTYPE_V16SI_V2DI_V16SI_HI): This.
>> * config/i386/i386.c (ix86_builtins): Remove
>> IX86_BUILTIN_EXPANDPD512_NOMASK, IX86_BUILTIN_EXPANDPS512_NOMASK.
>> (bdesc_args): Ditto.
>> * config/i386/predicates.md (const1256_operand): New.
>> (const_1_to_2_operand): Ditto.
>> * config/i386/sse.md (avx512pf_gatherpfsf): Change hint value.
>> (*avx512pf_gatherpfsf_mask): Ditto.
>> (*avx512pf_gatherpfsf): Ditto.
>> (avx512pf_gatherpfdf): Ditto.
>> (*avx512pf_gatherpfdf_mask): Ditto.
>> (*avx512pf_gatherpfdf): Ditto.
>> (avx512pf_scatterpfsf): Ditto.
>> (*avx512pf_scatterpfsf_mask): Ditto.
>> (*avx512pf_scatterpfsf): Ditto.
>> (avx512pf_scatterpfdf): Ditto.
>> (*avx512pf_scatterpfdf_mask): Ditto.
>> (*avx512pf_scatterpfdf): Ditto.
>> (avx512f_expand): Removed.
>> (3): Change parameter type.
>>
>> gcc/testsuite/
>> * gcc.target/i386/avx512f-vexpandpd-1.c: Update intrinsics.
>> * gcc.target/i386/avx512f-vexpandps-1.c: Ditto.
>> * gcc.target/i386/avx512f-vexpandpd-2.c: Ditto.
>> * gcc.target/i386/avx512f-vexpandps-2.c: Ditto.
>> * gcc.target/i386/avx512f-vmovdqu32-1: Ditto.
>> * gcc.target/i386/avx512f-vmovdqu32-2: Ditto.
>> * gcc.target/i386/avx512f-vmovdqu64-1: Ditto.
>> * gcc.target/i386/avx512f-vmovdqu64-2: Ditto.
>> * gcc.target/i386/avx512f-vpcmpd-2.c: Ditto.
>> * gcc.target/i386/avx512f-vpcmpq-2.c: Ditto.
>> * gcc.target/i386/avx512f-vpcmupd-2.c: Ditto.
>> * gcc.target/i386/avx512f-vpcmupq-2.c: Ditto.
>> * gcc.target/i386/avx512f-vrndscalepd-1.c: Ditto.
>> * gcc.target/i386/avx512f-vrndscaleps-1.c: Ditto.
>> * gcc.target/i386/avx512f-vrndscalepd-2.c: Ditto.
>> * gcc.target/i386/avx512f-vrndscaleps-2.c: Ditto.
>> * gcc.target/i386/avx512pf-vgatherpf0dpd-1.c: Update parameters.
>> * gcc.target/i386/avx512pf-vgatherpf0dps-1.c: Ditto.
>> * gcc.target/i386/avx512pf-vgatherpf0qpd-1.c: Ditto.
>> * gcc.target/i386/avx512pf-vgatherpf0qps-1.c: Ditto.
>> * gcc.target/i386/avx512pf-vgatherpf1dpd-1.c: Ditto.
>> * gcc.target/i386/avx512pf-vgatherpf1dps-1.c: Ditto.
>> * gcc.target/i386/avx512pf-vgatherpf1qpd-1.c: Ditto.
>> * gcc.target/i386/avx512pf-vgatherpf1qps-1.c: Ditto.
>> * gcc.target/i386/avx512f-vpsrad-2.c: Ditto.
>> * gcc.target/i386/avx512f-vpslld-2.c: Di

Re: [PATCH 12/8] [AVX-512] Improve EAS, ICC, GCC conformance.

2014-02-07 Thread Uros Bizjak
On Fri, Feb 7, 2014 at 10:49 AM, Kirill Yukhin  wrote:

> This (should be) the last patch for AVX-512 support in v4.9.
>
> It improves correspondence between ICC, SDM [1], and official
> intrinsics guide [2].
>
> What was done:
>   - Fixed shifts such as VPSLLD and friends. Actual instruction
> loads 128-bit of count and uses only 64-bit (see [1]). So,
> we're changing V4S* -> V2D* in built-ins.
>
>   - Rename (back) _mm512_[load|store]u_epi32 to
> _mm512_[load|store]u_si512 according to [2].
>
>   - Remove floor and ceil with zero-masking and/or rounding.
>
>   - Remove _mm512_expand_p[s|d] as it is absent in [2].
>
>   - Make scatter prefetch take 1,2,5,6 immediates. 1 and 5 mean
> L1, 2 and 6 mean L2.
>
>   - Make gather prefetch take 1 and 2 meaning corresponding cache
> levels.
>
>   - Fix all tests accordingly.
>
> gcc/
> * config/i386/avx512fintrin.h (_mm512_storeu_epi64): Removed.
> (_mm512_loadu_epi32): Renamed into...
> (_mm512_loadu_si512): This.
> (_mm512_storeu_epi32): Renamed into...
> (_mm512_storeu_si512): This.
> (_mm512_maskz_ceil_ps): Removed.
> (_mm512_maskz_ceil_pd): Ditto.
> (_mm512_maskz_floor_ps): Ditto.
> (_mm512_maskz_floor_pd): Ditto.
> (_mm512_floor_round_ps): Ditto.
> (_mm512_floor_round_pd): Ditto.
> (_mm512_ceil_round_ps): Ditto.
> (_mm512_ceil_round_pd): Ditto.
> (_mm512_mask_floor_round_ps): Ditto.
> (_mm512_mask_floor_round_pd): Ditto.
> (_mm512_mask_ceil_round_ps): Ditto.
> (_mm512_mask_ceil_round_pd): Ditto.
> (_mm512_maskz_floor_round_ps): Ditto.
> (_mm512_maskz_floor_round_pd): Ditto.
> (_mm512_maskz_ceil_round_ps): Ditto.
> (_mm512_maskz_ceil_round_pd): Ditto.
> (_mm512_expand_pd): Ditto.
> (_mm512_expand_ps): Ditto.
> (_mm512_sll_epi32): Updated parameter type.
> (_mm512_mask_sll_epi32): Ditto.
> (_mm512_maskz_sll_epi32): Ditto.
> (_mm512_srl_epi32): Ditto.
> (_mm512_mask_srl_epi32): Ditto.
> (_mm512_maskz_srl_epi32): Ditto.
> (_mm512_sra_epi32): Ditto.
> (_mm512_mask_sra_epi32): Ditto.
> (_mm512_maskz_sra_epi32): Ditto.
> * config/i386/i386-builtin-type.def (V16SI_FTYPE_V16SI_V4SI_V16SI_HI):
> Change into...
> (V16SI_FTYPE_V16SI_V2DI_V16SI_HI): This.
> * config/i386/i386.c (ix86_builtins): Remove
> IX86_BUILTIN_EXPANDPD512_NOMASK, IX86_BUILTIN_EXPANDPS512_NOMASK.
> (bdesc_args): Ditto.
> * config/i386/predicates.md (const1256_operand): New.
> (const_1_to_2_operand): Ditto.
> * config/i386/sse.md (avx512pf_gatherpfsf): Change hint value.
> (*avx512pf_gatherpfsf_mask): Ditto.
> (*avx512pf_gatherpfsf): Ditto.
> (avx512pf_gatherpfdf): Ditto.
> (*avx512pf_gatherpfdf_mask): Ditto.
> (*avx512pf_gatherpfdf): Ditto.
> (avx512pf_scatterpfsf): Ditto.
> (*avx512pf_scatterpfsf_mask): Ditto.
> (*avx512pf_scatterpfsf): Ditto.
> (avx512pf_scatterpfdf): Ditto.
> (*avx512pf_scatterpfdf_mask): Ditto.
> (*avx512pf_scatterpfdf): Ditto.
> (avx512f_expand): Removed.
> (3): Change parameter type.
>
> gcc/testsuite/
> * gcc.target/i386/avx512f-vexpandpd-1.c: Update intrinsics.
> * gcc.target/i386/avx512f-vexpandps-1.c: Ditto.
> * gcc.target/i386/avx512f-vexpandpd-2.c: Ditto.
> * gcc.target/i386/avx512f-vexpandps-2.c: Ditto.
> * gcc.target/i386/avx512f-vmovdqu32-1: Ditto.
> * gcc.target/i386/avx512f-vmovdqu32-2: Ditto.
> * gcc.target/i386/avx512f-vmovdqu64-1: Ditto.
> * gcc.target/i386/avx512f-vmovdqu64-2: Ditto.
> * gcc.target/i386/avx512f-vpcmpd-2.c: Ditto.
> * gcc.target/i386/avx512f-vpcmpq-2.c: Ditto.
> * gcc.target/i386/avx512f-vpcmupd-2.c: Ditto.
> * gcc.target/i386/avx512f-vpcmupq-2.c: Ditto.
> * gcc.target/i386/avx512f-vrndscalepd-1.c: Ditto.
> * gcc.target/i386/avx512f-vrndscaleps-1.c: Ditto.
> * gcc.target/i386/avx512f-vrndscalepd-2.c: Ditto.
> * gcc.target/i386/avx512f-vrndscaleps-2.c: Ditto.
> * gcc.target/i386/avx512pf-vgatherpf0dpd-1.c: Update parameters.
> * gcc.target/i386/avx512pf-vgatherpf0dps-1.c: Ditto.
> * gcc.target/i386/avx512pf-vgatherpf0qpd-1.c: Ditto.
> * gcc.target/i386/avx512pf-vgatherpf0qps-1.c: Ditto.
> * gcc.target/i386/avx512pf-vgatherpf1dpd-1.c: Ditto.
> * gcc.target/i386/avx512pf-vgatherpf1dps-1.c: Ditto.
> * gcc.target/i386/avx512pf-vgatherpf1qpd-1.c: Ditto.
> * gcc.target/i386/avx512pf-vgatherpf1qps-1.c: Ditto.
> * gcc.target/i386/avx512f-vpsrad-2.c: Ditto.
> * gcc.target/i386/avx512f-vpslld-2.c: Ditto.
> * gcc.target/i386/avx512f-vpsrld-2.c: Ditto.
>
> Is it ok for trunk?

OK. The changes look triv

Re: [s390] Add return and simple_return patterns

2014-02-07 Thread Andreas Krebbel
On 04/02/14 12:27, Richard Sandiford wrote:
> This patch adds return and simple_return patterns to the s390 backend,
> which enables shrink-wrapping and conditional returns to be used.
> 
> Perhaps the only subtle thing is the handling of call-clobbered base
> registers.  The idea is to emit the initialising main_pool pattern in
> both early_mach -- at the very beginning of the function -- and in the
> prologue.  Then, if shrink-wrapping is used, the one added by early_mach
> will still be the first in the function.  If shrink-wrapping isn't used
> then the one added by the prologue will be the first in the function.
> s390_mainpool_start then deletes whichever isn't needed.
> 
> Tested in the same way as the previous patches.  OK to install?
> 
> Thanks,
> Richard
> 
> 
> gcc/
>   * config/s390/s390-protos.h (s390_can_use_simple_return_insn)
>   (s390_can_use_return_insn): Declare.
>   * config/s390/s390.h (EPILOGUE_USES): Define.
>   * config/s390/s390.c (s390_mainpool_start): Allow two main_pool
>   instructions.
>   (s390_chunkify_start): Handle return JUMP_LABELs.
>   (s390_early_mach): Emit a main_pool instruction on the entry edge.
>   (s390_set_up_by_prologue, s390_can_use_simple_return_insn)
>   (s390_can_use_return_insn): New functions.
>   (s390_fix_long_loop_prediction): Handle conditional returns.
>   (TARGET_SET_UP_BY_PROLOGUE): Define.
>   * config/s390/s390.md (ANY_RETURN): New code iterator.
>   (*creturn, *csimple_return, return, simple_return): New patterns.

Ok to apply.  Thanks!

-Andreas-



Re: [s390] Fix some epilogue CFA notes

2014-02-07 Thread Andreas Krebbel
On 04/02/14 12:19, Richard Sandiford wrote:
> This patch fixes the CFA notes used when an epilogue restores a GPR from
> an FPR.  It also makes sure that s390_optimize_prologue preserves the
> CFA information.
> 
> Tested in the same way as the previous patch.  OK to install?
> 
> Thanks,
> Richard
> 
> 
> gcc/
>   * config/s390/s390.c (s390_restore_gprs_from_fprs): Add REG_CFA_RESTORE
>   notes to each restore.  Also add REG_CFA_DEF_CFA when restoring %r15.
>   (s390_optimize_prologue): Don't clear RTX_FRAME_RELATED_P.  Update the
>   REG_CFA_RESTORE list when deciding not to restore a register.

Ok to apply.  Thanks!

-Andreas-




Re: [s390] Split out pre-prologue rewrite into separate pass

2014-02-07 Thread Andreas Krebbel
On 04/02/14 12:14, Richard Sandiford wrote:
> s390_emit_prologue performs some optimisations on the function before
> emitting the prologue.  It also rewrites constant pool accesses to make
> the base register explicit.
> 
> Doing this in the prologue pattern makes the interaction with direct
> returns and simple_returns less obvious, so this patch splits the code
> out into a new target-specific pre-prologue pass.  I've called it
> "early_mach" for want of a better name.
> 
> I also moved s390_option_override to the end of the file in order
> to avoid some forward declarations that would have been needed otherwise.
> The only change is at the very end of the function.
> 
> Tested on s390-linux-gnu and s390x-linux-gnu, using
> unix{,-m31,-march=z10/-m31,-march=z10,-march=z196,-march=zEC12}
> for the latter.  OK to install?
> 
> Thanks,
> Richard
> 
> 
> gcc/
>   * config/s390/s390.c: Include tree-pass.h and context.h.
>   (s390_early_mach): New function, split out from...
>   (s390_emit_prologue): ...here.
>   (pass_data_s390_early_mach): New pass structure.
>   (pass_s390_early_mach): New class.
>   (s390_option_override): Create and register early_mach pass.
>   Move to end of file.

Ok to apply.  Thanks a lot for doing this work!

-Andreas-



Re: [PATCH] PR60092 - lower posix_memalign to make align-info accessible

2014-02-07 Thread Richard Biener
On Thu, 6 Feb 2014, Richard Biener wrote:

> On Thu, 6 Feb 2014, Richard Biener wrote:
> 
> > 
> > This re-writes posix_memalign calls to
> > 
> >   posix_memalign (ptr, align, size);
> >   tem = *ptr;
> > >   tem = __builtin_assume_aligned (tem, align);
> >   *ptr = tem;
> > 
> > during CF lowering (yeah, ok ...) to make alignment info accessible
> > to SSA based analysis.
> > 
> > I have to adjust the added alias-31.c testcase again because with
> > the above we end up with
> > 
> >   :
> >   res_3 = *p_2(D);
> >   posix_memalign (&q.q1, 128, 512);
> >   _5 = MEM[(void *)&q];
> >   _6 = __builtin_assume_aligned (_5, 128);
> >   MEM[(void *)&q] = _6;
> >   posix_memalign (&q.q2, 128, 512);
> >   _17 = res_3 + res_3;
> >   _20 = _17 + 1;
> >   _23 = _20 + 2;
> >   q ={v} {CLOBBER};
> >   return _23;
> > 
> > after early DCE.  This is because DCE only has "baby" DSE built-in
> > and the store to MEM[(void *)&q] which it doesn't remove keeps
> > the rest live.  DSE removes the store and the DCE following it
> > the rest.
> > 
> > Not sure if more sophisticated lowering is wanted here.  Special-casing
> > &... operands to posix_memalign as stated in the PR, generating
> > for posix_memalign (&ptr, 128, 512);
> > 
> >   posix_memalign (&tem, 128, 512);
> >   reg = tem;
> >   reg = __builtin_assume_aligned (reg, 128);
> >   ptr = reg;
> > 
> > instead would be possible (hoping for ptr to become non-address-taken).
> 
> Ok, doing that was simple and avoids pessimizing the testcase.

Thus like the following.

Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk
at this stage?

Thanks,
Richard.
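
For readers following along outside the compiler, the effect of the lowering
can be approximated by hand in user code.  This is only a rough sketch, not
part of the patch: alloc_aligned_128 is a hypothetical helper, and the
alignment of 128 is simply the value used in the dump quoted earlier.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate SIZE bytes with 128-byte alignment and
   pass the result through __builtin_assume_aligned, which is roughly
   what the lowering arranges automatically.  Returns NULL on failure.  */
void *
alloc_aligned_128 (size_t size)
{
  void *tem;

  assert (size > 0);
  if (posix_memalign (&tem, 128, size) != 0)
    return NULL;

  /* The alignment is kept a literal constant here.  */
  return __builtin_assume_aligned (tem, 128);
}
```

The returned pointer is released with free () as usual.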

2014-02-07  Richard Biener  

PR middle-end/60092
* gimple-low.c (lower_builtin_posix_memalign): New function.
(lower_stmt): Call it to lower posix_memalign in a way
to make alignment info accessible.

* gcc.dg/vect/pr60092-2.c: New testcase.

Index: trunk/gcc/gimple-low.c
===
*** trunk.orig/gcc/gimple-low.c 2014-02-06 15:06:39.013419315 +0100
--- trunk/gcc/gimple-low.c  2014-02-06 15:41:14.855276396 +0100
*** static void lower_gimple_bind (gimple_st
*** 83,88 
--- 83,89 
  static void lower_try_catch (gimple_stmt_iterator *, struct lower_data *);
  static void lower_gimple_return (gimple_stmt_iterator *, struct lower_data *);
  static void lower_builtin_setjmp (gimple_stmt_iterator *);
+ static void lower_builtin_posix_memalign (gimple_stmt_iterator *);
  
  
  /* Lower the body of current_function_decl from High GIMPLE into Low
*** lower_stmt (gimple_stmt_iterator *gsi, s
*** 327,338 
  }
  
if (decl
!   && DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL
!   && DECL_FUNCTION_CODE (decl) == BUILT_IN_SETJMP)
  {
!   lower_builtin_setjmp (gsi);
!   data->cannot_fallthru = false;
!   return;
  }
  
if (decl && (flags_from_decl_or_type (decl) & ECF_NORETURN))
--- 328,346 
  }
  
if (decl
!   && DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL)
  {
!   if (DECL_FUNCTION_CODE (decl) == BUILT_IN_SETJMP)
! {
!   lower_builtin_setjmp (gsi);
!   data->cannot_fallthru = false;
!   return;
! }
!   else if (DECL_FUNCTION_CODE (decl) == BUILT_IN_POSIX_MEMALIGN)
! {
!   lower_builtin_posix_memalign (gsi);
!   return;
! }
  }
  
if (decl && (flags_from_decl_or_type (decl) & ECF_NORETURN))
*** lower_builtin_setjmp (gimple_stmt_iterat
*** 771,776 
--- 779,827 
/* Remove the call to __builtin_setjmp.  */
gsi_remove (gsi, false);
  }
+ 
+ /* Lower calls to posix_memalign to
+  posix_memalign (ptr, align, size);
+  tem = *ptr;
+  tem = __builtin_assume_aligned (tem, align);
+  *ptr = tem;
+or to
+  void *tem;
+  posix_memalign (&tem, align, size);
+  ttem = tem;
+  ttem = __builtin_assume_aligned (ttem, align);
+  ptr = tem;
+in case the first argument was &ptr.  That way we can get at the
+alignment of the heap pointer in CCP.  */
+ 
+ static void
+ lower_builtin_posix_memalign (gimple_stmt_iterator *gsi)
+ {
+   gimple stmt = gsi_stmt (*gsi);
+   tree pptr = gimple_call_arg (stmt, 0);
+   tree align = gimple_call_arg (stmt, 1);
+   tree ptr = create_tmp_reg (ptr_type_node, NULL);
+   if (TREE_CODE (pptr) == ADDR_EXPR)
+ {
+   tree tem = create_tmp_var (ptr_type_node, NULL);
+   TREE_ADDRESSABLE (tem) = 1;
+   gimple_call_set_arg (stmt, 0, build_fold_addr_expr (tem));
+   stmt = gimple_build_assign (ptr, tem);
+ }
+   else
+ stmt = gimple_build_assign (ptr,
+   fold_build2 (MEM_REF, ptr_type_node, pptr,
+build_int_cst (ptr_type_node, 0)));
+   gsi_insert_after

Re: [ARM][PATCH] Vectorizer generates unaligned access when -mno-unaligned-access is enabled

2014-02-07 Thread Richard Biener
On Fri, Feb 7, 2014 at 7:40 AM, Yury Gribov  wrote:
>> As can be seen here:
>>
>> http://cbuild.validation.linaro.org/build/cross-validation/gcc/207533/report-build-info.html
>> this has caused some regressions on armv5t targets.
>
> IMHO this is expected: with this patch we prohibited unaligned loads on all
> platforms with -mno-unaligned-access. Perhaps we should set vect_no_align
> for armv5t?

Or provide a vec_realign_load_optab?  I'm sure the target supports a
re-alignment scheme if it cannot do unaligned loads?

Richard.

> -Y
>


Re: minor help message fix

2014-02-07 Thread Richard Biener
On Thu, Feb 6, 2014 at 10:30 PM, Xinliang David Li  wrote:
> Hi, the following patch removes the 'state' print for the
> -ftree-vectorize option, which does not make sense anymore. Ok for
> trunk?

Hmm, isn't it more appropriate to remove 'Report' from ftree-vectorize
in common.opt?  Or simply treat the [enabled]/[disabled] literally?
Or simply also enable (redundantly) OPT_ftree_vectorize at -O3?

Richard.

> thanks,
>
> David


Re: [RFA] [middle-end/54041] Convert modes as needed from expand_expr

2014-02-07 Thread Richard Biener
On Thu, Feb 6, 2014 at 8:33 PM, Jeff Law  wrote:
>
> expand_expr has, for as long as I can remember, had the ability to ignore
> the desired mode provided by its callers and instead return something in
> a completely different mode.  It's always been the caller's responsibility
> to deal with that.
>
> For the testcase in 54041, we call expand_expr with a desired mode of
> SImode, but it actually returns a HImode object.  This causes the assertion
> in convert_memory_address_addr_space to trip because the passed mode must be
> the same as the mode of the memory address.
>
> The fix is simple.  If expand_expr returns something in the wrong mode,
> convert it to the desired mode.
>
> I've reviewed the resulting code for the m68k target and it looks correct to
> me.  I've also bootstrapped and regression tested on
> x86_64-unknown-linux-gnu, though the new code most certainly does not
> trigger there.
>
> I guess if someone really wanted to be thorough, they'd test on a target
> where pointers and integers are different sizes.
>
> OK for the trunk?
>
> Thanks,
> Jeff
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 2dbab72..4c7da83 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,9 @@
> +2014-02-05  Jeff Law  
> +
> +   PR middle-end/54041
> +   * expr.c (expand_expr_addr_expr_1): Handle expand_expr returning an
> +   object with an undesirable mode.
> +
>  2014-02-05  Bill Schmidt  
>
> * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Change
> diff --git a/gcc/expr.c b/gcc/expr.c
> index 878a51b..9609c45 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -7708,6 +7708,11 @@ expand_expr_addr_expr_1 (tree exp, rtx target, enum
> machine_mode tmode,
>  modifier == EXPAND_INITIALIZER
>   ? EXPAND_INITIALIZER : EXPAND_NORMAL);
>
> +  /* expand_expr is allowed to return an object in a mode other
> +than TMODE.  If it did, we need to convert.  */
> +  if (tmode != GET_MODE (tmp))
> +   tmp = convert_modes (tmode, GET_MODE (tmp),
> +tmp, TYPE_UNSIGNED (TREE_TYPE (offset)));

What about CONSTANT_P tmp?  Don't you need to use
TYPE_MODE (TREE_TYPE (offset)) in that case?

Richard.

>result = convert_memory_address_addr_space (tmode, result, as);
>tmp = convert_memory_address_addr_space (tmode, tmp, as);
>
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index c81a00d..283912d 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2014-02-05  Jeff Law  
> +
> +   PR middle-end/54041
> +   * gcc.target/m68k/pr54041.c: New test.
> +
>  2014-02-05  Bill Schmidt  
>
> * gcc.dg/vmx/sum2s.c: New.
> diff --git a/gcc/testsuite/gcc.target/m68k/pr54041.c
> b/gcc/testsuite/gcc.target/m68k/pr54041.c
> new file mode 100644
> index 000..645cb6d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/m68k/pr54041.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -mshort" } */
> +
> +extern int r[];
> +
> +int *fn(int i)
> +{
> +   return &r[i];
> +}
> +
>


Re: wide-int, lto

2014-02-07 Thread Richard Biener
On Thu, Feb 6, 2014 at 7:56 PM, Mike Stump  wrote:
> On Nov 25, 2013, at 3:09 AM, Richard Biener  
> wrote:
>> please add streamer_read/write_wi () helpers to data-streamer*
>>
>> replicating the above loop N times is too ugly.
>
> Agreed.  Below is the patch to collapse the code.  Thanks.

Ok.

>
> diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c
> index 08eba48..3e02840 100644
> --- a/gcc/lto-streamer-in.c
> +++ b/gcc/lto-streamer-in.c
> @@ -617,6 +617,21 @@ make_new_block (struct function *fn, unsigned int index)
>  }
>
>
> +/* Read a wide-int.  */
> +
> +static widest_int
> +streamer_read_wi (struct lto_input_block *ib)
> +{
> +  HOST_WIDE_INT a[WIDE_INT_MAX_ELTS];
> +  int i;
> +  int prec ATTRIBUTE_UNUSED = streamer_read_uhwi (ib);
> +  int len = streamer_read_uhwi (ib);
> +  for (i = 0; i < len; i++)
> +a[i] = streamer_read_hwi (ib);
> +  return widest_int::from_array (a, len);
> +}
> +
> +
>  /* Read the CFG for function FN from input block IB.  */
>
>  static void
> @@ -726,28 +741,10 @@ input_cfg (struct lto_input_block *ib, struct data_in 
> *data_in,
>loop->estimate_state = streamer_read_enum (ib, loop_estimation, 
> EST_LAST);
>loop->any_upper_bound = streamer_read_hwi (ib);
>if (loop->any_upper_bound)
> -   {
> - HOST_WIDE_INT a[WIDE_INT_MAX_ELTS];
> - int i;
> - int prec ATTRIBUTE_UNUSED = streamer_read_uhwi (ib);
> - int len = streamer_read_uhwi (ib);
> - for (i = 0; i < len; i++)
> -   a[i] = streamer_read_hwi (ib);
> -
> - loop->nb_iterations_upper_bound = widest_int::from_array (a, len);
> -   }
> +   loop->nb_iterations_upper_bound = streamer_read_wi (ib);
>loop->any_estimate = streamer_read_hwi (ib);
>if (loop->any_estimate)
> -   {
> - HOST_WIDE_INT a[WIDE_INT_MAX_ELTS];
> - int i;
> - int prec ATTRIBUTE_UNUSED = streamer_read_uhwi (ib);
> - int len = streamer_read_uhwi (ib);
> - for (i = 0; i < len; i++)
> -   a[i] = streamer_read_hwi (ib);
> -
> - loop->nb_iterations_estimate = widest_int::from_array (a, len);
> -   }
> +   loop->nb_iterations_estimate = streamer_read_wi (ib);
>
>/* Read OMP SIMD related info.  */
>loop->safelen = streamer_read_hwi (ib);
> diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
> index de19235..60acb42 100644
> --- a/gcc/lto-streamer-out.c
> +++ b/gcc/lto-streamer-out.c
> @@ -1622,6 +1622,21 @@ output_ssa_names (struct output_block *ob, struct 
> function *fn)
>  }
>
>
> +/* Output a wide-int.  */
> +
> +static void
> +streamer_write_wi (struct output_block *ob,
> +  const widest_int &w)
> +{
> +  int len = w.get_len ();
> +
> +  streamer_write_uhwi (ob, w.get_precision ());
> +  streamer_write_uhwi (ob, len);
> +  for (int i = 0; i < len; i++)
> +streamer_write_hwi (ob, w.elt (i));
> +}
> +
> +
>  /* Output the cfg.  */
>
>  static void
> @@ -1694,26 +1709,10 @@ output_cfg (struct output_block *ob, struct function 
> *fn)
>loop_estimation, EST_LAST, loop->estimate_state);
>streamer_write_hwi (ob, loop->any_upper_bound);
>if (loop->any_upper_bound)
> -   {
> - int len = loop->nb_iterations_upper_bound.get_len ();
> - int i;
> -
> - streamer_write_uhwi (ob, 
> loop->nb_iterations_upper_bound.get_precision ());
> - streamer_write_uhwi (ob, len);
> - for (i = 0; i < len; i++)
> -   streamer_write_hwi (ob, loop->nb_iterations_upper_bound.elt (i));
> -   }
> +   streamer_write_wi (ob, loop->nb_iterations_upper_bound);
>streamer_write_hwi (ob, loop->any_estimate);
>if (loop->any_estimate)
> -   {
> - int len = loop->nb_iterations_estimate.get_len ();
> - int i;
> -
> - streamer_write_uhwi (ob, loop->nb_iterations_estimate.get_precision 
> ());
> - streamer_write_uhwi (ob, len);
> - for (i = 0; i < len; i++)
> -   streamer_write_hwi (ob, loop->nb_iterations_estimate.elt (i));
> -   }
> +   streamer_write_wi (ob, loop->nb_iterations_estimate);
>
>/* Write OMP SIMD related info.  */
>streamer_write_hwi (ob, loop->safelen);
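
The helper pair in this patch uses a simple length-prefixed layout:
precision, element count, then the elements.  A minimal standalone sketch of
that layout follows.  It does not use GCC's streamer API; struct word_stream,
write_wi, read_wi and the sizes are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ELTS 4        /* stand-in for WIDE_INT_MAX_ELTS */
#define STREAM_WORDS 16   /* arbitrary capacity for the sketch */

/* A flat buffer of 64-bit words standing in for the LTO stream.  */
struct word_stream
{
  int64_t words[STREAM_WORDS];
  size_t pos;
};

/* Write PRECISION, LEN and then LEN elements, in that order.  */
void
write_wi (struct word_stream *s, int precision, const int64_t *elts, int len)
{
  assert (len >= 0 && len <= MAX_ELTS);
  assert (s->pos + 2 + (size_t) len <= STREAM_WORDS);
  s->words[s->pos++] = precision;
  s->words[s->pos++] = len;
  for (int i = 0; i < len; i++)
    s->words[s->pos++] = elts[i];
}

/* Read back one value written by write_wi.  ELTS must have room for
   MAX_ELTS entries.  Returns the number of elements read.  */
int
read_wi (struct word_stream *s, int *precision, int64_t *elts)
{
  assert (s->pos + 2 <= STREAM_WORDS);
  *precision = (int) s->words[s->pos++];
  int len = (int) s->words[s->pos++];
  assert (len >= 0 && len <= MAX_ELTS);
  assert (s->pos + (size_t) len <= STREAM_WORDS);
  for (int i = 0; i < len; i++)
    elts[i] = s->words[s->pos++];
  return len;
}
```

As in the patch, the reader consumes the precision field even though it
only needs the length to rebuild the value.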


Re: [PATCH] Add alloc_align and assume_aligned attributes (PR middle-end/60092)

2014-02-07 Thread Richard Biener
On Thu, 6 Feb 2014, Jakub Jelinek wrote:

> Hi!
> 
> As discussed on IRC, this patch introduces two new attributes,
> so that the C library (and other headers) have a way to
> a) tell the compiler something about functions like aligned_alloc
>or memalign
> b) tell the compiler the alignment of pointers returned say by malloc
> 
> Ok for trunk if bootstrap/regtest passes?
> 
> 2014-02-06  Jakub Jelinek  
> 
>   PR middle-end/60092
>   * tree-ssa-ccp.c (surely_varying_stmt_p): Don't return true
>   if TYPE_ATTRIBUTES (gimple_call_fntype ()) contain
>   assume_aligned or alloc_align attributes.
>   (bit_value_alloc_assume_aligned_attribute): New function.   
>   (evaluate_stmt): Handle calls to functions with
>   assume_aligned or alloc_align attributes.
>   * doc/extend.texi: Document assume_aligned and alloc_align
>   attributes.
> c-family/
>   * c-common.c (handle_alloc_align_attribute,
>   handle_assume_aligned_attribute): New functions.
>   (c_common_attribute_table): Add alloc_align and assume_aligned
>   attributes.
> testsuite/
>   * gcc.dg/attr-alloc_align-1.c: New test.
>   * gcc.dg/attr-alloc_align-2.c: New test.
>   * gcc.dg/attr-alloc_align-3.c: New test.
>   * gcc.dg/attr-assume_aligned-1.c: New test.
>   * gcc.dg/attr-assume_aligned-2.c: New test.
>   * gcc.dg/attr-assume_aligned-3.c: New test.
> 
> --- gcc/c-family/c-common.c.jj2014-02-05 10:37:58.0 +0100
> +++ gcc/c-family/c-common.c   2014-02-06 15:35:15.707333771 +0100
> @@ -366,6 +366,8 @@ static tree handle_warn_unused_result_at
>  static tree handle_sentinel_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_type_generic_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_alloc_size_attribute (tree *, tree, tree, int, bool *);
> +static tree handle_alloc_align_attribute (tree *, tree, tree, int, bool *);
> +static tree handle_assume_aligned_attribute (tree *, tree, tree, int, bool 
> *);
>  static tree handle_target_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_optimize_attribute (tree *, tree, tree, int, bool *);
>  static tree ignore_attribute (tree *, tree, tree, int, bool *);
> @@ -766,6 +768,10 @@ const struct attribute_spec c_common_att
> handle_omp_declare_simd_attribute, false },
>{ "omp declare target", 0, 0, true, false, false,
> handle_omp_declare_target_attribute, false },
> +  { "alloc_align", 1, 1, false, true, true,
> +   handle_alloc_align_attribute, false },
> +  { "assume_aligned",  1, 2, false, true, true,
> +   handle_assume_aligned_attribute, false },
>{ NULL, 0, 0, false, false, false, NULL, false }
>  };
>  
> @@ -8046,13 +8052,64 @@ handle_alloc_size_attribute (tree *node,
>if (TREE_CODE (position) != INTEGER_CST
> || TREE_INT_CST_HIGH (position)
> || TREE_INT_CST_LOW (position) < 1
> -   || TREE_INT_CST_LOW (position) > arg_count )
> +   || TREE_INT_CST_LOW (position) > arg_count)
>   {
> warning (OPT_Wattributes,
>  "alloc_size parameter outside range");
> *no_add_attrs = true;
> return NULL_TREE;
>   }
> +}
> +  return NULL_TREE;
> +}
> +
> +/* Handle a "alloc_align" attribute; arguments as in
> +   struct attribute_spec.handler.  */
> +
> +static tree
> +handle_alloc_align_attribute (tree *node, tree ARG_UNUSED (name), tree args,
> +   int, bool *no_add_attrs)
> +{
> +  unsigned arg_count = type_num_arguments (*node);
> +  tree position = TREE_VALUE (args);
> +  if (position && TREE_CODE (position) != IDENTIFIER_NODE
> +  && TREE_CODE (position) != FUNCTION_DECL)
> +position = default_conversion (position);
> +
> +  if (TREE_CODE (position) != INTEGER_CST
> +  || TREE_INT_CST_HIGH (position)
> +  || TREE_INT_CST_LOW (position) < 1
> +  || TREE_INT_CST_LOW (position) > arg_count)

You make it easier for wide-int folks if you use tree_fits_uhwi_p
and tree_to_uhwi ...

> +{
> +  warning (OPT_Wattributes,
> +"alloc_align parameter outside range");
> +  *no_add_attrs = true;
> +  return NULL_TREE;
> +}
> +  return NULL_TREE;
> +}
> +
> +/* Handle a "assume_aligned" attribute; arguments as in
> +   struct attribute_spec.handler.  */
> +
> +static tree
> +handle_assume_aligned_attribute (tree *node, tree ARG_UNUSED (name), tree 
> args,
> +  int, bool *no_add_attrs)
> +{
> +  for (; args; args = TREE_CHAIN (args))
> +{
> +  tree position = TREE_VALUE (args);
> +  if (position && TREE_CODE (position) != IDENTIFIER_NODE
> +   && TREE_CODE (position) != FUNCTION_DECL)
> + position = default_conversion (position);
> +
> +  if (TREE_CODE (position) != INTEGER_CST)
> + {
> +   warning (OPT_Wattributes,
>
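
As a rough illustration of how the two proposed attributes would look in a
header, here is a small sketch.  Only the attribute names and argument
counts come from the patch; my_memalign and my_alloc64 are hypothetical
wrappers, and a compiler that implements the attributes is assumed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Argument 1 gives the alignment of the returned pointer.  */
void *my_memalign (size_t align, size_t size)
  __attribute__ ((alloc_align (1)));

/* The returned pointer is always at least 64-byte aligned.  */
void *my_alloc64 (size_t size)
  __attribute__ ((assume_aligned (64)));

void *
my_memalign (size_t align, size_t size)
{
  void *p;

  /* posix_memalign wants a power of two that is a multiple of
     sizeof (void *); callers of this sketch are expected to comply.  */
  assert (align >= sizeof (void *) && (align & (align - 1)) == 0);
  return posix_memalign (&p, align, size) == 0 ? p : NULL;
}

void *
my_alloc64 (size_t size)
{
  return my_memalign (64, size);
}
```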

Re: [PATCH] Fix up __builtin_setjmp_receiver handling (PR c++/60082, take 2)

2014-02-07 Thread Richard Biener
On Thu, 6 Feb 2014, Jakub Jelinek wrote:

> On Thu, Feb 06, 2014 at 11:00:14AM +0100, Richard Biener wrote:
> > Ah, so __builtin_setjmp_receiver is like setjmp in this regard
> > and setjmp is LEAF (it's a stmt that doesn't direct control-flow
> > anywhere else).  So __builtin_setjmp_receiver should be LEAF as well
> > (and so should __builtin_setjmp_setup).
> > 
> > Doesn't sound dangerous at all to me ...
> 
> This passed bootstrap/regtest and fixed the timeouts too.  Ok for trunk?

Ok.

Thanks,
Richard.

> 2014-02-06  Jakub Jelinek  
> 
>   PR c++/60082
>   * tree.c (build_common_builtin_nodes): Set ECF_LEAF for
>   __builtin_setjmp_receiver.
> 
>   Revert
>   2014-02-05  Balaji V. Iyer  
> 
>   * g++.dg/cilk-plus/CK/catch_exc.cc: Disable test for -O1.
>   * c-c++-common/cilk-plus/CK/spawner_inline.c: Likewise.
> 
> --- gcc/tree.c.jj 2014-02-04 10:30:15.010121513 +0100
> +++ gcc/tree.c2014-02-06 17:03:49.034656060 +0100
> @@ -9980,7 +9980,7 @@ build_common_builtin_nodes (void)
>ftype = build_function_type_list (void_type_node, ptr_type_node, 
> NULL_TREE);
>local_define_builtin ("__builtin_setjmp_receiver", ftype,
>   BUILT_IN_SETJMP_RECEIVER,
> - "__builtin_setjmp_receiver", ECF_NOTHROW);
> + "__builtin_setjmp_receiver", ECF_NOTHROW | ECF_LEAF);
>  
>ftype = build_function_type_list (ptr_type_node, NULL_TREE);
>local_define_builtin ("__builtin_stack_save", ftype, BUILT_IN_STACK_SAVE,
> --- gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc(revision 207519)
> +++ gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc(revision 207518)
> @@ -1,7 +1,6 @@
>  /* { dg-options "-fcilkplus" } */
>  /* { dg-do run { target i?86-*-* x86_64-*-* arm*-*-* } } */
>  /* { dg-options "-fcilkplus -lcilkrts" { target { i?86-*-* x86_64-*-* 
> arm*-*-* } } } */
> -/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */
>  
>  #include 
>  #include 
> --- gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c  (revision 
> 207519)
> +++ gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c  (revision 
> 207518)
> @@ -1,7 +1,6 @@
>  /* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
>  /* { dg-options "-fcilkplus" } */
>  /* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } 
> */
> -/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */
>  
>  #include 
>  #define DEFAULT_VALUE 30
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer


Re: Free constructor elts in LTO merging

2014-02-07 Thread Richard Biener
On Thu, 6 Feb 2014, Jan Hubicka wrote:

> Hi,
> according to memory stats this is a relatively common reason for garbage left
> after tree merging.
> 
> Bootstrapped/regtested x86_64-linux, OK?

As CONSTRUCTOR_ELTS is a vec<, va_gc> please use

  if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR))
vec_free (CONSTRUCTOR_ELTS (scc->entries[i]));

Ok with that change.

Thanks,
Richard.

> Honza
> 
>   * lto/lto.c (unify_scc): Free also CONSTRUCTOR_ELTS.
> Index: lto/lto.c
> ===
> --- lto/lto.c (revision 207515)
> +++ lto/lto.c (working copy)
> @@ -1807,8 +1807,13 @@ unify_scc (struct streamer_tree_cache_d
> /* Free the tree nodes from the read SCC.  */
> for (unsigned i = 0; i < len; ++i)
>   {
> +   enum tree_code code;
> if (TYPE_P (scc->entries[i]))
>   num_merged_types++;
> +   code = TREE_CODE (scc->entries[i]);
> +   if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR)
> +   && CONSTRUCTOR_ELTS (scc->entries[i]))
> + ggc_free (CONSTRUCTOR_ELTS (scc->entries[i]));
> ggc_free (scc->entries[i]);
>   }
>  
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer


Re: Avoid unnecessary copying of ipa-prop's expressions

2014-02-07 Thread Richard Biener
On Thu, 6 Feb 2014, Jan Hubicka wrote:

> Hi,
> at WPA we currently read trees accessed by jump functions and then copy them
> to remove the location, which is already known to be UNKNOWN, and then keep
> copying them for every inline clone introduced (and there are many for Firefox).
> 
> This patch makes us copy only when the expression really has a location in it.
> 
> Bootstrapped/regtested x86_64-linux, OK?

Hmm, I think you either can use just

if (EXPR_P (expr))
  walk_tree (&expr, prune_expr_location, NULL, NULL);

or you miss unsharing and create invalid shared trees when
the expr does not contain locations.

I fear it's the latter, given how ipa_set_jf_* is used.

So the patch looks wrong.

Richard.
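
To make the concern concrete, here is a toy model of the walker in the
quoted patch.  It is a sketch under simplifying assumptions: struct expr and
find_location are invented stand-ins for GCC trees and for
expr_with_location_p, and every node is treated as an expression.

```c
#include <assert.h>
#include <stddef.h>

#define UNKNOWN_LOC 0

/* Toy expression node; not GCC's tree representation.  */
struct expr
{
  int location;               /* UNKNOWN_LOC when nothing is attached */
  struct expr *operand[2];    /* NULL for missing operands */
};

/* Return the first node in E's tree that carries a location, or NULL
   if no node does.  */
const struct expr *
find_location (const struct expr *e)
{
  if (e == NULL)
    return NULL;
  if (e->location != UNKNOWN_LOC)
    return e;
  for (int i = 0; i < 2; i++)
    {
      const struct expr *hit = find_location (e->operand[i]);
      if (hit != NULL)
        return hit;
    }
  return NULL;
}
```

When the walk finds nothing, a helper built on it hands back the original
pointer rather than a copy, so the caller ends up sharing the tree.  That
is the case the review comment is about.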

> Honza
> 
>   * gimplify.c (expr_with_location_p): New function.
>   (expr_without_location): Likewise.
>   * gimplify.h (expr_without_location): Declare.
>   * ipa-prop.c (ipa_set_jf_constant, ipa_set_jf_arith_pass_through,
>   determine_known_aggregate_parts): Do not unshare.
> Index: gimplify.c
> ===
> --- gimplify.c(revision 207514)
> +++ gimplify.c(working copy)
> @@ -901,6 +901,32 @@ unshare_expr_without_location (tree expr
>  walk_tree (&expr, prune_expr_location, NULL, NULL);
>return expr;
>  }
> +
> +/* Worker for expr_without_location.  */
> +
> +static tree
> +expr_with_location_p (tree *tp, int *walk_subtrees, void *)
> +{
> +  if (EXPR_P (*tp))
> +{
> +  if (EXPR_LOCATION (*tp) != UNKNOWN_LOCATION)
> + return *tp;
> +}
> +  else
> +*walk_subtrees = 0;
> +  return NULL_TREE;
> +}
> +
> +/* Return EXPR if it has no location, otherwise make its copy
> +   without location.  */
> +
> +tree
> +expr_without_location (tree expr)
> +{
> +  if (walk_tree (&expr, expr_with_location_p, NULL, NULL))
> +return (unshare_expr_without_location (expr));
> +  return expr;
> +}
>  
>  /* WRAPPER is a code such as BIND_EXPR or CLEANUP_POINT_EXPR which can both
> contain statements and have a value.  Assign its value to a temporary
> Index: gimplify.h
> ===
> --- gimplify.h(revision 207514)
> +++ gimplify.h(working copy)
> @@ -62,6 +62,7 @@ extern void declare_vars (tree, gimple,
>  extern void gimple_add_tmp_var (tree);
>  extern tree unshare_expr (tree);
>  extern tree unshare_expr_without_location (tree);
> +extern tree expr_without_location (tree);
>  extern tree voidify_wrapper_expr (tree, tree);
>  extern tree build_and_jump (tree *);
>  extern enum gimplify_status gimplify_self_mod_expr (tree *, gimple_seq *,
> Index: ipa-prop.c
> ===
> --- ipa-prop.c(revision 207514)
> +++ ipa-prop.c(working copy)
> @@ -419,11 +419,8 @@ static void
>  ipa_set_jf_constant (struct ipa_jump_func *jfunc, tree constant,
>struct cgraph_edge *cs)
>  {
> -  constant = unshare_expr (constant);
> -  if (constant && EXPR_P (constant))
> -SET_EXPR_LOCATION (constant, UNKNOWN_LOCATION);
>jfunc->type = IPA_JF_CONST;
> -  jfunc->value.constant.value = unshare_expr_without_location (constant);
> +  jfunc->value.constant.value = expr_without_location (constant);
>  
>if (TREE_CODE (constant) == ADDR_EXPR
>&& TREE_CODE (TREE_OPERAND (constant, 0)) == FUNCTION_DECL)
> @@ -463,7 +460,7 @@ ipa_set_jf_arith_pass_through (struct ip
>  tree operand, enum tree_code operation)
>  {
>jfunc->type = IPA_JF_PASS_THROUGH;
> -  jfunc->value.pass_through.operand = unshare_expr_without_location 
> (operand);
> +  jfunc->value.pass_through.operand = expr_without_location (operand);
>jfunc->value.pass_through.formal_id = formal_id;
>jfunc->value.pass_through.operation = operation;
>jfunc->value.pass_through.agg_preserved = false;
> @@ -1538,7 +1535,7 @@ determine_known_aggregate_parts (gimple
> struct ipa_agg_jf_item item;
> item.offset = list->offset - arg_offset;
> gcc_assert ((item.offset % BITS_PER_UNIT) == 0);
> -   item.value = unshare_expr_without_location (list->constant);
> +   item.value = expr_without_location (list->constant);
> jfunc->agg.items->quick_push (item);
>   }
> list = list->next;
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer


Re: Make gimple_get_virt_method_for_vtable O(1) and not allocating garbage

2014-02-07 Thread Richard Biener
On Thu, 6 Feb 2014, Jan Hubicka wrote:

> Hi,
> I did some memory measurements for Firefox.  We seem to be in good shape, with
> the exception of linemaps, which take about 20% of memory.  I also noticed that
> we produce a lot of garbage via copy_tree_r.  Analyzing the reasons for
> copy_tree_r, I found two that are about equally important.  The first is that
> gimple_get_virt_method_for_vtable looks up the virtual method via
> fold_ctor_reference, which eventually calls unshare_expr on the resulting
> value (that is, an ADDR_EXPR of a FUNCTION_DECL).  We make a copy of the
> ADDR_EXPR and immediately throw it away in gimple_get_virt_method_for_vtable,
> caring only about the decl.
> 
> I always considered it stupid to do a linear walk of the fields here when we
> can do a simple O(1) lookup, but considered it SEP.  This patch implements that.
> The constructors are fully controlled by the frontend (we verify that it is a
> vtable by the DECL_VIRTUAL flag first), so I think it is safe to use
> special-purpose lookup code.
> 
> Bootstrapped/regtested x86_64-linux, OK?

Ok.

Thanks,
Richard.

> Honza
> 
>   * gimple-fold.c (gimple_get_virt_method_for_vtable): Rewrite
>   constructor lookup to be O(1).
> Index: gimple-fold.c
> ===
> --- gimple-fold.c (revision 207523)
> +++ gimple-fold.c (working copy)
> @@ -3179,6 +3180,8 @@ gimple_get_virt_method_for_vtable (HOST_
>  {
>tree vtable = v, init, fn;
>unsigned HOST_WIDE_INT size;
> +  unsigned HOST_WIDE_INT elt_size, access_index;
> +  tree domain_type;
>  
>/* First of all double check we have virtual table.  */
>if (TREE_CODE (v) != VAR_DECL
> @@ -3202,10 +3205,30 @@ gimple_get_virt_method_for_vtable (HOST_
>offset *= BITS_PER_UNIT;
>offset += token * size;
>  
> -  /* Do not pass from_decl here, we want to know even about values we can
> - not use and will check can_refer_decl_in_current_unit_p ourselves.  */
> -  fn = fold_ctor_reference (TREE_TYPE (TREE_TYPE (v)), init,
> - offset, size, NULL);
> +  /* Lookup the value in the constructor that is assumed to be array.
> + This is equivalent to
> + fn = fold_ctor_reference (TREE_TYPE (TREE_TYPE (v)), init,
> +offset, size, NULL);
> + but in a constant time.  We expect that frontend produced a simple
> + array without indexed initializers.  */
> +
> +  gcc_checking_assert (TREE_CODE (TREE_TYPE (init)) == ARRAY_TYPE);
> +  domain_type = TYPE_DOMAIN (TREE_TYPE (init));
> +  gcc_checking_assert (integer_zerop (TYPE_MIN_VALUE (domain_type)));
> +  elt_size = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (init))));
> +
> +  access_index = offset / BITS_PER_UNIT / elt_size;
> +  gcc_checking_assert (offset % (elt_size * BITS_PER_UNIT) == 0);
> +
> +  /* This code makes an assumption that there are no 
> + indexed fields produced by C++ FE, so we can directly index the array.  */
> +  if (access_index < CONSTRUCTOR_NELTS (init))
> +{
> +  fn = CONSTRUCTOR_ELT (init, access_index)->value;
> +  gcc_checking_assert (!CONSTRUCTOR_ELT (init, access_index)->index);
> +}
> +  else
> +fn = NULL;
>  
>/* For type inconsistent program we may end up looking up virtual method
>   in virtual table that does not contain TOKEN entries.  We may overrun
> 
> 
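The index arithmetic in the patch above can be sketched outside GCC. This is only an illustration, assuming a dense array with no explicit indexes: `toy_ctor` and `toy_lookup` are hypothetical names rather than GCC APIs, and `BITS_PER_UNIT` is assumed to be 8 here.

```c
#include <assert.h>
#include <stddef.h>

#define BITS_PER_UNIT 8  /* assumption: 8 bits per addressable unit */

/* Hypothetical stand-in for a vtable initializer: a dense array of
   slots without explicit indexes, so slot N lives at position N.  */
struct toy_ctor
{
  const char **elts;  /* slot values, in order */
  size_t nelts;       /* number of slots present */
  size_t elt_size;    /* size of one slot, in bytes */
};

/* Constant-time lookup mirroring the arithmetic in the patch.
   BYTE_OFFSET is the offset into the table in bytes and TOKEN is the
   slot number relative to that offset.  Returns NULL when the computed
   index is past the end of the initializer.  */
static const char *
toy_lookup (const struct toy_ctor *ctor, unsigned long byte_offset,
            unsigned long token)
{
  unsigned long size_bits = ctor->elt_size * BITS_PER_UNIT;
  unsigned long offset = byte_offset * BITS_PER_UNIT + token * size_bits;
  unsigned long access_index;

  /* The access must start on a slot boundary.  */
  assert (offset % size_bits == 0);

  access_index = offset / BITS_PER_UNIT / ctor->elt_size;
  if (access_index < ctor->nelts)
    return ctor->elts[access_index];
  return NULL;
}
```

With 8-byte slots, a byte offset of 16 and token 1 give a bit offset of 192 and therefore index 3. No walk over the elements is needed, which is the point of the change.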

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer


Re: [PATCH] New optimize(0) versioning fix (PR target/60026, take 2)

2014-02-07 Thread Jakub Jelinek
On Fri, Feb 07, 2014 at 12:50:22AM +0100, Jan Hubicka wrote:
> Don't we want to check opt_for_fn (node->decl, cp) instead and arrange
> -fipa-cp to be false when !optimize?

I can easily imagine using
  !opt_for_fn (node->decl, optimize)
  || !opt_for_fn (node->decl, flag_ipa_cp)
but guaranteeing that flag_ipa_cp or flag_ipa_sra is never true for
optimize == 0 could be harder.  What if something is built with -O0 -fipa-cp,
or with __attribute__((optimize (0, "fipa-cp"))) or similar?  Checking the
optimize value is, among other things, about the lack of vdef/vuse for !optimize.
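
As a rough model of the two-part check above (a sketch only: `toy_fn_opts` and `toy_skip_ipa_cp` are hypothetical names, not GCC's option machinery), the point is that both fields are tested independently, so -O0 combined with -fipa-cp is still skipped:

```c
#include <stdbool.h>

/* Hypothetical per-function option record standing in for what
   opt_for_fn would read; not GCC's real data structure.  */
struct toy_fn_opts
{
  int optimize;      /* -O level in effect for this function */
  bool flag_ipa_cp;  /* whether -fipa-cp is in effect */
};

/* Returns true when the function should be left alone: either it is
   not optimized at all, or IPA-CP is disabled for it.  */
static bool
toy_skip_ipa_cp (const struct toy_fn_opts *opts)
{
  return !opts->optimize || !opts->flag_ipa_cp;
}
```
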

Jakub