date:20140205

Re: [C,C++] integer constants in attribute arguments

2014-02-05 Thread Andreas Schwab

Marc Glisse  writes:

>> /usr/local/gcc/gcc-20140204/gcc/testsuite/g++.dg/cpp0x/constexpr-attribute2.C:32:6:
>>  error: alignment for 'void g()' must be at least 16
>
> (I don't know why we error out for this, align specifies a minimal
> alignment, if it ends up more aligned, that's fine)

Probably because there is no packed attribute for functions, so the
alignment is always taken to be exact.

> Feel free to replace 8 with 16 in the initialization of size (you can
> commit it as obvious if it works for you).

Are there any targets that may reject an overaligned function decl?

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

Re: [C,C++] integer constants in attribute arguments

2014-02-05 Thread Jakub Jelinek

On Wed, Feb 05, 2014 at 09:28:16AM +0100, Andreas Schwab wrote:
> > Feel free to replace 8 with 16 in the initialization of size (you can
> > commit it as obvious if it works for you).
> 
> Are there any targets that may reject an overaligned function decl?

I think that is very well possible.  Why not just #ifdef that particular
test line to a handful of widely used targets where it is known to work?

Jakub

Re: [C,C++] integer constants in attribute arguments

2014-02-05 Thread Dominique Dhumieres

The test g++.dg/cpp0x/constexpr-attribute2.C fails on darwin
with

FAIL: g++.dg/cpp0x/constexpr-attribute2.C (test for excess errors)
Excess errors:
/opt/gcc/_clean/gcc/testsuite/g++.dg/cpp0x/constexpr-attribute2.C:9:40: error: 
'init_priority' attribute is not supported on this platform
/opt/gcc/_clean/gcc/testsuite/g++.dg/cpp0x/constexpr-attribute2.C:16:46: error: 
'init_priority' attribute is not supported on this platform
/opt/gcc/_clean/gcc/testsuite/g++.dg/cpp0x/constexpr-attribute2.C:23:45: error: 
'init_priority' attribute is not supported on this platform

IMO the test should check that 'init_priority' is supported.

TIA

Dominique

[PATCH][4.10] Move PRE:inhibit_phi_insertion to where it better applies

2014-02-05 Thread Richard Biener


The following patch moves the code that avoids creating loop
carried data dependences in PRE when later vectorizing to a place
where it can use the optimizations PRE performs (in particular
loop invariant motion).  This makes sure it applies for Himeno
when built with -fipa-pta which then allows the critical loop
to be vectorized without any runtime checks for aliasing.

Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for 4.10.

Richard.

2014-02-05  Richard Biener  

PR tree-optimization/60042
* tree-ssa-pre.c (inhibit_phi_insertion): Remove.
(insert_into_preds_of_block): Do not prevent PHI insertion
for REFERENCE exprs here ...
(eliminate_dom_walker::before_dom_children): ... but prevent
their use here under similar conditions when applied to the
IL after PRE optimizations.

Index: gcc/tree-ssa-pre.c
===
*** gcc/tree-ssa-pre.c  (revision 207455)
--- gcc/tree-ssa-pre.c  (working copy)
*** create_expression_by_pieces (basic_block
*** 3013,3078 
  }
  
  
- /* Returns true if we want to inhibit the insertions of PHI nodes
-for the given EXPR for basic block BB (a member of a loop).
-We want to do this, when we fear that the induction variable we
-create might inhibit vectorization.  */
- 
- static bool
- inhibit_phi_insertion (basic_block bb, pre_expr expr)
- {
-   vn_reference_t vr = PRE_EXPR_REFERENCE (expr);
-   vec ops = vr->operands;
-   vn_reference_op_t op;
-   unsigned i;
- 
-   /* If we aren't going to vectorize we don't inhibit anything.  */
-   if (!flag_tree_loop_vectorize)
- return false;
- 
-   /* Otherwise we inhibit the insertion when the address of the
-  memory reference is a simple induction variable.  In other
-  cases the vectorizer won't do anything anyway (either it's
-  loop invariant or a complicated expression).  */
-   FOR_EACH_VEC_ELT (ops, i, op)
- {
-   switch (op->opcode)
-   {
-   case CALL_EXPR:
- /* Calls are not a problem.  */
- return false;
- 
-   case ARRAY_REF:
-   case ARRAY_RANGE_REF:
- if (TREE_CODE (op->op0) != SSA_NAME)
-   break;
- /* Fallthru.  */
-   case SSA_NAME:
- {
-   basic_block defbb = gimple_bb (SSA_NAME_DEF_STMT (op->op0));
-   affine_iv iv;
-   /* Default defs are loop invariant.  */
-   if (!defbb)
- break;
-   /* Defined outside this loop, also loop invariant.  */
-   if (!flow_bb_inside_loop_p (bb->loop_father, defbb))
- break;
-   /* If it's a simple induction variable inhibit insertion,
-  the vectorizer might be interested in this one.  */
-   if (simple_iv (bb->loop_father, bb->loop_father,
-  op->op0, &iv, true))
- return true;
-   /* No simple IV, vectorizer can't do anything, hence no
-  reason to inhibit the transformation for this operand.  */
-   break;
- }
-   default:
- break;
-   }
- }
-   return false;
- }
- 
  /* Insert the to-be-made-available values of expression EXPRNUM for each
 predecessor, stored in AVAIL, into the predecessors of BLOCK, and
 merge the result with a phi node, given the same value number as
--- 3013,3018 
*** insert_into_preds_of_block (basic_block
*** 3106,3113 
EDGE_PRED (block, 1)->src);
/* Induction variables only have one edge inside the loop.  */
if ((firstinsideloop ^ secondinsideloop)
! && (expr->kind != REFERENCE
! || inhibit_phi_insertion (block, expr)))
{
  if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "Skipping insertion of phi for partial 
redundancy: Looks like an induction variable\n");
--- 3046,3052 
EDGE_PRED (block, 1)->src);
/* Induction variables only have one edge inside the loop.  */
if ((firstinsideloop ^ secondinsideloop)
! && expr->kind != REFERENCE)
{
  if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "Skipping insertion of phi for partial 
redundancy: Looks like an induction variable\n");
*** eliminate_dom_walker::before_dom_childre
*** 4234,4239 
--- 4173,4228 
  
  gcc_assert (sprime != rhs);
  
+ /* Inhibit the use of an inserted PHI on a loop header when
+the address of the memory reference is a simple induction
+variable.  In other cases the vectorizer won't do anything
+anyway (either it's loop invariant or a complicated
+expression).  */
+ if (flag_tree_loop_vectorize
+ && gimple_assign_single_p (stmt)
+ && TREE_CODE (s

Re: [C,C++] integer constants in attribute arguments

2014-02-05 Thread Marc Glisse


On Wed, 5 Feb 2014, Jakub Jelinek wrote:


On Wed, Feb 05, 2014 at 09:28:16AM +0100, Andreas Schwab wrote:

Feel free to replace 8 with 16 in the initialization of size (you can
commit it as obvious if it works for you).


Are there any targets that may reject an overaligned function decl?


I think that is very well possible.  Why not just #ifdef that particular
test line to a handful of widely used targets where it is known to work?


With Dominique's message that Darwin doesn't support init_priority, would 
this be ok? That's the most frequently tested platform, some attributes 
are already tested elsewhere, etc.


// { dg-do compile { target x86_64-*linux-gnu } }


--
Marc Glisse

Re: [C,C++] integer constants in attribute arguments

2014-02-05 Thread Jakub Jelinek

On Wed, Feb 05, 2014 at 09:58:28AM +0100, Marc Glisse wrote:
> On Wed, 5 Feb 2014, Jakub Jelinek wrote:
> 
> >On Wed, Feb 05, 2014 at 09:28:16AM +0100, Andreas Schwab wrote:
> >>>Feel free to replace 8 with 16 in the initialization of size (you can
> >>>commit it as obvious if it works for you).
> >>
> >>Are there any targets that may reject an overaligned function decl?
> >
> >I think that is very well possible.  Why not just #ifdef that particular
> >test line to a handful of widely used targets where it is known to work?
> 
> With Dominique's message that Darwin doesn't support init_priority,
> would this be ok? That's the most frequently tested platform, some
> attributes are already tested elsewhere, etc.
> 
> // { dg-do compile { target x86_64-*linux-gnu } }

I'd use i?86-*linux* x86_64-*linux* instead, but otherwise LGTM.  Please
give others a chance to comment on that though.

Jakub

[PATCH] Improve diagnostics for PR60042

2014-02-05 Thread Richard Biener


The following improves dumping / messages for vectorizer runtime
alias checks.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.  I'll consider
this "documentation" and thus appropriate at this stage.

Richard.

2014-02-05  Richard Biener  

* tree-vect-loop.c (vect_analyze_loop_2): Be more informative
when not vectorizing because of too many alias checks.
* tree-vect-data-refs.c (vect_prune_runtime_alias_test_list):
Add more verboseness, avoid duplicate MSG_MISSED_OPTIMIZATION.

Index: gcc/tree-vect-loop.c
===
*** gcc/tree-vect-loop.c(revision 207497)
--- gcc/tree-vect-loop.c(working copy)
*** vect_analyze_loop_2 (loop_vec_info loop_
*** 1723,1730 
  {
if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
!"too long list of versioning for alias "
!"run-time tests.\n");
return false;
  }
  
--- 1723,1732 
  {
if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
!"number of versioning for alias "
!"run-time tests exceeds %d "
!"(--param vect-max-version-for-alias-checks)\n",
!PARAM_VALUE (PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS));
return false;
  }
  
Index: gcc/tree-vect-data-refs.c
===
*** gcc/tree-vect-data-refs.c   (revision 207497)
--- gcc/tree-vect-data-refs.c   (working copy)
*** vect_prune_runtime_alias_test_list (loop
*** 2901,2906 
--- 2923,2946 
  && diff - (HOST_WIDE_INT) TREE_INT_CST_LOW (dr_a1->seg_len) <
 min_seg_len_b))
{
+ if (dump_enabled_p ())
+   {
+ dump_printf_loc (MSG_NOTE, vect_location,
+  "merging ranges for ");
+ dump_generic_expr (MSG_NOTE, TDF_SLIM,
+DR_REF (dr_a1->dr));
+ dump_printf (MSG_NOTE,  ", ");
+ dump_generic_expr (MSG_NOTE, TDF_SLIM,
+DR_REF (dr_b1->dr));
+ dump_printf (MSG_NOTE,  " and ");
+ dump_generic_expr (MSG_NOTE, TDF_SLIM,
+DR_REF (dr_a2->dr));
+ dump_printf (MSG_NOTE,  ", ");
+ dump_generic_expr (MSG_NOTE, TDF_SLIM,
+DR_REF (dr_b2->dr));
+ dump_printf (MSG_NOTE, "\n");
+   }
+ 
  dr_a1->seg_len = size_binop (PLUS_EXPR,
   dr_a2->seg_len, size_int (diff));
  comp_alias_ddrs.ordered_remove (i--);
*** vect_prune_runtime_alias_test_list (loop
*** 2908,2925 
}
  }
  
if ((int) comp_alias_ddrs.length () >
PARAM_VALUE (PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS))
! {
!   if (dump_enabled_p ())
!   {
! dump_printf_loc (MSG_MISSED_OPTIMIZATION,  vect_location,
!  "disable versioning for alias - max number of "
!  "generated checks exceeded.\n");
!   }
! 
!   return false;
! }
  
return true;
  }
--- 2948,2959 
}
  }
  
+   dump_printf_loc (MSG_NOTE, vect_location,
+  "improved number of alias checks from %d to %d\n",
+  may_alias_ddrs.length (), comp_alias_ddrs.length ());
if ((int) comp_alias_ddrs.length () >
PARAM_VALUE (PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS))
! return false;
  
return true;
  }

Re: [C,C++] integer constants in attribute arguments

2014-02-05 Thread Marc Glisse


On Wed, 5 Feb 2014, Jakub Jelinek wrote:


On Wed, Feb 05, 2014 at 09:58:28AM +0100, Marc Glisse wrote:

On Wed, 5 Feb 2014, Jakub Jelinek wrote:


On Wed, Feb 05, 2014 at 09:28:16AM +0100, Andreas Schwab wrote:

Feel free to replace 8 with 16 in the initialization of size (you can
commit it as obvious if it works for you).


Are there any targets that may reject an overaligned function decl?


I think that is very well possible.  Why not just #ifdef that particular
test line to a handful of widely used targets where it is known to work?


With Dominique's message that Darwin doesn't support init_priority,
would this be ok? That's the most frequently tested platform, some
attributes are already tested elsewhere, etc.

// { dg-do compile { target x86_64-*linux-gnu } }


I'd use i?86-*linux* x86_64-*linux* instead, but otherwise LGTM.


I wasn't sure if all libc (not just glibc) used on linux could handle all 
the attributes, but now you've said it, good :-)



Please give others a chance to comment on that though.


I'll give it a day (it can easily be changed again later).

--
Marc Glisse

Re: [C,C++] integer constants in attribute arguments

2014-02-05 Thread Dominique Dhumieres

> I'll give it a day (it can easily be changed again later).

IMO it would be better (more general) to use a dg-require-effective-target
with the corresponding test(s) in target-supports-dg.exp.

Dominique

Re: [C,C++] integer constants in attribute arguments

2014-02-05 Thread Marc Glisse


On Wed, 5 Feb 2014, Dominique Dhumieres wrote:


I'll give it a day (it can easily be changed again later).


IMO it would be better (more general) to use a dg-require-effective-target
with the corresponding test(s) in target-supports-dg.exp.


But which tests? One for over/under alignment of functions, one for 
init_priority, maybe yet one more, all that for a file that is just 
checking that the parser understands the attribute argument is a 
constant...


I agree on the principle, but unless those tests already exist, it doesn't 
seem worth it (well, if you are volunteering to write them, I certainly 
won't object).


--
Marc Glisse

Re: [C,C++] integer constants in attribute arguments

2014-02-05 Thread Jakub Jelinek

On Wed, Feb 05, 2014 at 10:43:24AM +0100, Marc Glisse wrote:
> On Wed, 5 Feb 2014, Dominique Dhumieres wrote:
> 
> >>I'll give it a day (it can easily be changed again later).
> >
> >IMO it would be better (more general) to use a dg-require-effective-target
> >with the corresponding test(s) in target-supports-dg.exp.
> 
> But which tests? One for over/under alignment of functions, one for
> init_priority, maybe yet one more, all that for a file that is just
> checking that the parser understands the attribute argument is a
> constant...
> 
> I agree on the principle, but unless those tests already exist, it
> doesn't seem worth it (well, if you are volunteering to write them,
> I certainly won't object).

Yeah, the tests really don't test anything target specific, but test it
on attributes that have target specific meaning/applicability.
So it is enough to catch it on some most widely tested targets, any
regression in that area will be known the same day when it breaks.

Jakub

Re: PR ipa/59831 (ipa-cp devirt issues)

2014-02-05 Thread Martin Jambor

Hi,

On Wed, Feb 05, 2014 at 12:47:30AM +0100, Jan Hubicka wrote:
> > > -  if (TREE_CODE (t) != TREE_BINFO)
> > > +  /* Try to work out BINFO from virtual table pointer value in 
> > > replacements.  */
> > > +  if (!t && agg_reps && !ie->indirect_info->by_ref)
> > 
> > At this point you know that !ie->indirect_info->polymorphic is set and
> > thus ie->indirect_info->by_ref is always false because it really has
> > no meaning (it is only meaningful when agg_contents is set which is
> > mutually exclusive with the polymorphic flag).
> 
> I was worried here about case where in future we may want to represent call of
> virtual methods from pointers passed by reference (i.e. in some other object).
> We don't do that at the moment, but for that would really need better jump
> functions.
> 
> If you preffer, I can remove that check.
> 

I think it would be better, yes.  IIRC, We want to re-organize
indirect_info anyway (I vaguely remember we want to split the
overloaded offset field into two but forgot the exact reason why but I
have it written somewhere), I suppose we'll be turning it into a union
(or class hierarchy?) and this would make it slightly awkward.  If we
ever support the pointer-by-reference scenario, quite a few more
places will need to be adjusted anyway.

Thanks,

Martin

Re: [RFC] [PATCH, AARCH64] : Using standard patterns for stack protection.

2014-02-05 Thread Venkataramanan Kumar

Hi Marcus,

> +  "ldr\\t%x2, %1\;str\\t%x2, %0\;mov\t%x2,0"
> +  [(set_attr "length" "12")])
>
> This pattern emits an opaque sequence of instructions that cannot be
> scheduled, is that necessary? Can we not expand individual
> instructions or at least split ?

Almost all the ports emits a template of assembly instructions.
I m not sure why they have to be generated this way.
But usage of these pattern is to clear the register that holds canary
value immediately after its usage.

> -/* { dg-do compile { target i?86-*-* x86_64-*-* rs6000-*-* s390x-*-* } } */
> +/* { dg-do compile { target stack_protection } } */
>  /* { dg-options "-O2 -fstack-protector-strong" } */
>
> Do we need a new effective target test, why is the existing
> "fstack_protector" not appropriate?

"stack_protector" does a run time test. It failed in cross compilation
environment and these are compile only tests.
Also I thought  richard suggested  me to add a new option for this.
ref: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03358.html

regards,
Venkat.

On 4 February 2014 21:39, Marcus Shawcroft  wrote:
> Hi Venkat,
>
> On 22 January 2014 16:57, Venkataramanan Kumar
>  wrote:
>> Hi Marcus,
>>
>> After we changed the frame growing direction (downwards) in Aarch64,
>> the back-end now generates stack smashing set and test based on
>> generic code available in GCC.
>>
>> But most of the ports (i386, spu, rs6000, s390, sh, sparc, tilepro and
>> tilegx) define machine descriptions using standard pattern names
>> 'stack_protect_set' and 'stack_protect_test'. This is done for both
>> TLS model as well as global variable based stack guard model.
>
> +  ""
> +  "ldr\\t%x2, %1\;str\\t%x2, %0\;mov\t%x2,0"
> +  [(set_attr "length" "12")])
>
> This pattern emits an opaque sequence of instructions that cannot be
> scheduled, is that necessary? Can we not expand individual
> instructions or at least split ?
>
> +  "ldr\t%x3, %x1\;ldr\t%x0, %x2\;eor\t%x0, %x3, %x0"
> +  [(set_attr "length" "12")])
>
> Likewise.
>
> -/* { dg-do compile { target i?86-*-* x86_64-*-* rs6000-*-* s390x-*-* } } */
> +/* { dg-do compile { target stack_protection } } */
>  /* { dg-options "-O2 -fstack-protector-strong" } */
>
> Do we need a new effective target test, why is the existing
> "fstack_protector" not appropriate?
>
> Cheers
> /Marcus

Re: [ARM Documentation] Clarify -mcpu, -mtune, -march

2014-02-05 Thread James Greenhalgh

*ping*

Thanks,
James

On Mon, Jan 27, 2014 at 10:01:51AM +, James Greenhalgh wrote:
> 
> Hi,
> 
> I've tripped myself over with these three options too many times,
> actually, their behaviour is very simple.
> 
> This patch clarifies the language used to describe the options, and
> puts them in a logical order. I'm happy to reword again if this
> is still not clear.
> 
> OK?
> 
> Thanks,
> James
> 
> ---
> gcc/
> 
> 2014-01-27  James Greenhalgh  
> 
>   PR target/59718
>   * doc/invoke.texi (-march=): Clarify documentation for ARM.
>   (-mtune=): Likewise.
>   (-mcpu=): Likewise.
> 

> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 8c620a5..38a55a0 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -12221,11 +12221,35 @@ option should only be used if you require 
> compatibility with code for
>  big-endian ARM processors generated by versions of the compiler prior to
>  2.8.  This option is now deprecated.
>  
> -@item -mcpu=@var{name}
> -@opindex mcpu
> -This specifies the name of the target ARM processor.  GCC uses this name
> -to determine what kind of instructions it can emit when generating
> -assembly code.  Permissible names are: @samp{arm2}, @samp{arm250},
> +@item -march=@var{name}
> +@opindex march
> +This specifies the name of the target ARM architecture.  GCC uses this
> +name to determine what kind of instructions it can emit when generating
> +assembly code.  This option can be used in conjunction with or instead
> +of the @option{-mcpu=} option.  Permissible names are: @samp{armv2},
> +@samp{armv2a}, @samp{armv3}, @samp{armv3m}, @samp{armv4}, @samp{armv4t},
> +@samp{armv5}, @samp{armv5t}, @samp{armv5e}, @samp{armv5te},
> +@samp{armv6}, @samp{armv6j},
> +@samp{armv6t2}, @samp{armv6z}, @samp{armv6zk}, @samp{armv6-m},
> +@samp{armv7}, @samp{armv7-a}, @samp{armv7-r}, @samp{armv7-m},
> +@samp{armv8-a}, @samp{armv8-a+crc},
> +@samp{iwmmxt}, @samp{iwmmxt2}, @samp{ep9312}.
> +
> +@option{-march=armv8-a+crc} enables code generation for the ARMv8-A
> +architecture together with the optional CRC32 extensions.
> +
> +@option{-march=native} causes the compiler to auto-detect the architecture
> +of the build computer.  At present, this feature is only supported on
> +Linux, and not all architectures are recognized.  If the auto-detect is
> +unsuccessful the option has no effect.
> +
> +@item -mtune=@var{name}
> +@opindex mtune
> +This option specifies the name of the target ARM processor for
> +which GCC should tune the performance of the code.
> +For some ARM implementations better performance can be obtained by using
> +this option.
> +Permissible names are: @samp{arm2}, @samp{arm250},
>  @samp{arm3}, @samp{arm6}, @samp{arm60}, @samp{arm600}, @samp{arm610},
>  @samp{arm620}, @samp{arm7}, @samp{arm7m}, @samp{arm7d}, @samp{arm7dm},
>  @samp{arm7di}, @samp{arm7dmi}, @samp{arm70}, @samp{arm700},
> @@ -12259,26 +12283,6 @@ Additionally, this option can specify that GCC 
> should tune the performance
>  of the code for a big.LITTLE system.  Permissible names are:
>  @samp{cortex-a15.cortex-a7}, @samp{cortex-a57.cortex-a53}.
>  
> -@option{-mcpu=generic-@var{arch}} is also permissible, and is
> -equivalent to @option{-march=@var{arch} -mtune=generic-@var{arch}}.
> -See @option{-mtune} for more information.
> -
> -@option{-mcpu=native} causes the compiler to auto-detect the CPU
> -of the build computer.  At present, this feature is only supported on
> -Linux, and not all architectures are recognized.  If the auto-detect is
> -unsuccessful the option has no effect.
> -
> -@item -mtune=@var{name}
> -@opindex mtune
> -This option is very similar to the @option{-mcpu=} option, except that
> -instead of specifying the actual target processor type, and hence
> -restricting which instructions can be used, it specifies that GCC should
> -tune the performance of the code as if the target were of the type
> -specified in this option, but still choosing the instructions it
> -generates based on the CPU specified by a @option{-mcpu=} option.
> -For some ARM implementations better performance can be obtained by using
> -this option.
> -
>  @option{-mtune=generic-@var{arch}} specifies that GCC should tune the
>  performance for a blend of processors within architecture @var{arch}.
>  The aim is to generate code that run well on the current most popular
> @@ -12291,24 +12295,21 @@ of the build computer.  At present, this feature is 
> only supported on
>  Linux, and not all architectures are recognized.  If the auto-detect is
>  unsuccessful the option has no effect.
>  
> -@item -march=@var{name}
> -@opindex march
> -This specifies the name of the target ARM architecture.  GCC uses this
> -name to determine what kind of instructions it can emit when generating
> -assembly code.  This option can be used in conjunction with or instead
> -of the @option{-mcpu=} option.  Permissible names are: @samp{armv2},
> -@samp{armv2a}, @samp{armv3}, @samp{armv3m}, @samp{armv4}, @samp{ar

RFA: MN10300: Include saved registers in stack usage

2014-02-05 Thread Nick Clifton

Hi Jeff, Hi Alex,

  Sorry - another MN10300 patch - this time for the stack size reported
  with -fstack-usage.  Currently the value includes the stack frame, but
  it does not take into account any registers that might have been
  pushed onto the stack.

  The patch below fixes this problem, but there is one thing that I am
  not sure about - is it OK to use __builtin_popcount() or should I be
  calling some other function ?

  Tested with no regression on an mn10300-elf toolchain.

  OK to apply ?

Cheers
  Nick

gcc/ChangeLog
2014-02-05  Nick Clifton  

* config/mn10300/mn10300.c (mn10300_expand_prologue): Include
saved registers in current function's stack size.

Index: gcc/config/mn10300/mn10300.c
===
--- gcc/config/mn10300/mn10300.c(revision 207498)
+++ gcc/config/mn10300/mn10300.c(working copy)
@@ -745,13 +745,15 @@
 mn10300_expand_prologue (void)
 {
   HOST_WIDE_INT size = mn10300_frame_size ();
+  int mask;
 
+  mask = mn10300_get_live_callee_saved_regs (NULL);
+  /* If we use any of the callee-saved registers, save them now.  */
+  mn10300_gen_multiple_store (mask);
+
   if (flag_stack_usage_info)
-current_function_static_stack_size = size;
+current_function_static_stack_size = size + __builtin_popcount (mask) * 4;
 
-  /* If we use any of the callee-saved registers, save them now.  */
-  mn10300_gen_multiple_store (mn10300_get_live_callee_saved_regs (NULL));
-
   if (TARGET_AM33_2 && fp_regs_to_save ())
 {
   int num_regs_to_save = fp_regs_to_save (), i;
@@ -767,6 +769,9 @@
   unsigned int strategy_size = (unsigned)-1, this_strategy_size;
   rtx reg;
 
+  if (flag_stack_usage_info)
+   current_function_static_stack_size += num_regs_to_save * 4;
+
   /* We have several different strategies to save FP registers.
 We can store them using SP offsets, which is beneficial if
 there are just a few registers to save, or we can use `a0' in

[PATCH] Remove lto_global_var_decls, simplify necessary analysis for PR60060

2014-02-05 Thread Richard Biener


When looking at PR60060, outputting the declaration debug info for
a local static variable twice (once via BLOCK_VARS and once via
calling the debug hook on all globals) I noticed that the
rest_of_decl_compilation call in materialize_cgraph always works
on the empty lto_global_var_decls vector (we only populate it
later).  That results in the opportunity to effectively remove it
and in lto_write_globals walk over the defined vars in the
varpool instead.

Eventually the fix for PR60060 is to not push TREE_ASM_WRITTEN
decls there (but I'm not sure of other side-effects of that ...).

Thus, this cleanup first.

LTO bootstrap / regtest running on x86_64-unknown-linux-gnu.

Ok?

Thanks,
Richard.

2014-02-05  Richard Biener  

lto/
* lto.h (lto_global_var_decls): Remove.
* lto-lang.c (lto_init): Do not allocate lto_global_var_decls.
(lto_write_globals): Do nothing in WPA stage, gather globals from
the varpool here ...
* lto.c (lto_main): ... not here.
(materialize_cgraph): Do not call rest_of_decl_compilation
on the empty lto_global_var_decls vector.
(lto_global_var_decls): Remove.

Index: gcc/lto/lto-lang.c
===
*** gcc/lto/lto-lang.c  (revision 207497)
--- gcc/lto/lto-lang.c  (working copy)
*** lto_getdecls (void)
*** 1075,1085 
  static void
  lto_write_globals (void)
  {
!   tree *vec = lto_global_var_decls->address ();
!   int len = lto_global_var_decls->length ();
wrapup_global_declarations (vec, len);
emit_debug_global_declarations (vec, len);
!   vec_free (lto_global_var_decls);
  }
  
  static tree
--- 1075,1094 
  static void
  lto_write_globals (void)
  {
!   if (flag_wpa)
! return;
! 
!   /* Record the global variables.  */
!   vec lto_global_var_decls = vNULL;
!   varpool_node *vnode;
!   FOR_EACH_DEFINED_VARIABLE (vnode)
! lto_global_var_decls.safe_push (vnode->decl);
! 
!   tree *vec = lto_global_var_decls.address ();
!   int len = lto_global_var_decls.length ();
wrapup_global_declarations (vec, len);
emit_debug_global_declarations (vec, len);
!   lto_global_var_decls.release ();
  }
  
  static tree
*** lto_init (void)
*** 1218,1224 
  #undef NAME_TYPE
  
/* Initialize LTO-specific data structures.  */
-   vec_alloc (lto_global_var_decls, 256);
in_lto_p = true;
  
return true;
--- 1227,1232 
Index: gcc/lto/lto.c
===
*** gcc/lto/lto.c   (revision 207497)
--- gcc/lto/lto.c   (working copy)
*** along with GCC; see the file COPYING3.
*** 50,57 
  #include "context.h"
  #include "pass_manager.h"
  
- /* Vector to keep track of external variables we've seen so far.  */
- vec *lto_global_var_decls;
  
  static GTY(()) tree first_personality_decl;
  
--- 50,55 
*** read_cgraph_and_symbols (unsigned nfiles
*** 3009,3017 
  static void
  materialize_cgraph (void)
  {
-   tree decl;
struct cgraph_node *node; 
-   unsigned i;
timevar_id_t lto_timer;
  
if (!quiet_flag)
--- 3007,3013 
*** materialize_cgraph (void)
*** 3043,3052 
current_function_decl = NULL;
set_cfun (NULL);
  
-   /* Inform the middle end about the global variables we have seen.  */
-   FOR_EACH_VEC_ELT (*lto_global_var_decls, i, decl)
- rest_of_decl_compilation (decl, 1, 0);
- 
if (!quiet_flag)
  fprintf (stderr, "\n");
  
--- 3039,3044 
*** lto_main (void)
*** 3309,3316 
do_whole_program_analysis ();
else
{
- varpool_node *vnode;
- 
  timevar_start (TV_PHASE_OPT_GEN);
  
  materialize_cgraph ();
--- 3301,3306 
*** lto_main (void)
*** 3330,3339 
 this.  */
  if (flag_lto_report || (flag_wpa && flag_lto_report_wpa))
print_lto_report_1 ();
- 
- /* Record the global variables.  */
- FOR_EACH_DEFINED_VARIABLE (vnode)
-   vec_safe_push (lto_global_var_decls, vnode->decl);
}
  }
  
--- 3320,3325 
Index: gcc/lto/lto.h
===
*** gcc/lto/lto.h   (revision 207497)
--- gcc/lto/lto.h   (working copy)
*** extern tree lto_eh_personality (void);
*** 41,49 
  extern void lto_main (void);
  extern void lto_read_all_file_options (void);
  
- /* In lto-symtab.c  */
- extern GTY(()) vec *lto_global_var_decls;
- 
  /* In lto-elf.c or lto-coff.c  */
  extern lto_file *lto_obj_file_open (const char *filename, bool writable);
  extern void lto_obj_file_close (lto_file *file);
--- 41,46

Re: [C,C++] integer constants in attribute arguments

2014-02-05 Thread Andreas Schwab

Marc Glisse  writes:

> On Wed, 5 Feb 2014, Dominique Dhumieres wrote:
>
>>> I'll give it a day (it can easily be changed again later).
>>
>> IMO it would be better (more general) to use a dg-require-effective-target
>> with the corresponding test(s) in target-supports-dg.exp.
>
> But which tests? One for over/under alignment of functions, one for
> init_priority, maybe yet one more, all that for a file that is just
> checking that the parser understands the attribute argument is a
> constant...

Wrt. the function decl it should probably just be changed to a variable
decl.  Alignments on function decls are kind of exotic.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

Re: [ARM Documentation] Clarify -mcpu, -mtune, -march

2014-02-05 Thread Ramana Radhakrishnan


On 01/27/14 10:01, James Greenhalgh wrote:


Hi,

I've tripped myself over with these three options too many times,
actually, their behaviour is very simple.

This patch clarifies the language used to describe the options, and
puts them in a logical order. I'm happy to reword again if this
is still not clear.

OK?

Thanks,
James

---
gcc/

2014-01-27  James Greenhalgh  

PR target/59718
* doc/invoke.texi (-march=): Clarify documentation for ARM.
(-mtune=): Likewise.
(-mcpu=): Likewise.




s/=/ in Changelog :)


Where this option is used in conjunction with @option{-march} or 
@option{-mtune}, those options override this option


Also, rewording this sentence in the mcpu description like the AArch64 
version would be better.


i.e.

Where this option is used in conjunction with @option{-march} or 
@option{-mtune}, those options override appropriate parts of this option.


Ok with those changes.

Thanks,
Ramana

--
Ramana Radhakrishnan
Principal Engineer
ARM Ltd.

New Dutch PO file for 'gcc' (version 4.9-b20140202)

2014-02-05 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Dutch team of translators.  The file is available at:

http://translationproject.org/latest/gcc/nl.po

(This file, 'gcc-4.9-b20140202.nl.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Re: [C,C++] integer constants in attribute arguments

2014-02-05 Thread Marc Glisse


On Wed, 5 Feb 2014, Andreas Schwab wrote:


Marc Glisse  writes:


On Wed, 5 Feb 2014, Dominique Dhumieres wrote:


I'll give it a day (it can easily be changed again later).


IMO it would be better (more general) to use a dg-require-effective-target
with the corresponding test(s) in target-supports-dg.exp.


But which tests? One for over/under alignment of functions, one for
init_priority, maybe yet one more, all that for a file that is just
checking that the parser understands the attribute argument is a
constant...


Wrt. the function decl it should probably just be changed to a variable
decl.  Alignments on function decls are kind of exotic.


Good idea. How about this patch then?

--
Marc GlisseIndex: gcc/testsuite/g++.dg/cpp0x/constexpr-attribute2.C
===
--- gcc/testsuite/g++.dg/cpp0x/constexpr-attribute2.C   (revision 207501)
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-attribute2.C   (working copy)
@@ -1,10 +1,11 @@
+// { dg-do compile { target init_priority } }
 // { dg-options -std=gnu++11 }
 
 struct t { t(); };
 
 constexpr int prio = 123;
 constexpr int size = 8;
 constexpr int pos = 1;
 enum A { zero = 0, one, two };
 __attribute__((init_priority(prio))) t a;
 
@@ -22,11 +23,11 @@ enum E2 {
 };
 __attribute__((init_priority(E2_second))) t c;
 
 void* my_calloc(unsigned, unsigned) __attribute__((alloc_size(pos,two)));
 void* my_realloc(void*, unsigned) __attribute__((alloc_size(two)));
 
 typedef char vec __attribute__((vector_size(size)));
 
 void f(char*) __attribute__((nonnull(pos)));
 
-void g() __attribute__((aligned(size)));
+char g __attribute__((aligned(size)));

[PATCH] Don't ICE on empty bbs with no successors in EH opt (PR middle-end/57499)

2014-02-05 Thread Jakub Jelinek

Hi!

As the testcase shows, invalid use of noreturn attribute (which we diagnose)
can result in in empty bb with no successors.  This patch fixes
cleanup_empty_eh not to ICE on these.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-02-04  Jakub Jelinek  

PR middle-end/57499
* tree-eh.c (cleanup_empty_eh): Bail out on totally empty
bb with no successors.

* g++.dg/torture/pr57499.C: New test.

--- gcc/tree-eh.c.jj2014-01-31 22:22:09.0 +0100
+++ gcc/tree-eh.c   2014-02-04 12:35:05.611066852 +0100
@@ -4396,8 +4396,11 @@ cleanup_empty_eh (eh_landing_pad lp)
   /* If the block is totally empty, look for more unsplitting cases.  */
   if (gsi_end_p (gsi))
 {
-  /* For the degenerate case of an infinite loop bail out.  */
-  if (infinite_empty_loop_p (e_out))
+  /* For the degenerate case of an infinite loop bail out.
+If bb has no successors and is totally empty, which can happen e.g.
+because of incorrect noreturn attribute, bail out too.  */
+  if (e_out == NULL
+ || infinite_empty_loop_p (e_out))
return ret;
 
   return ret | cleanup_empty_eh_unsplit (bb, e_out, lp);
--- gcc/testsuite/g++.dg/torture/pr57499.C.jj   2014-02-04 12:52:10.241846755 
+0100
+++ gcc/testsuite/g++.dg/torture/pr57499.C  2014-02-04 12:51:51.0 
+0100
@@ -0,0 +1,14 @@
+// PR middle-end/57499
+// { dg-do compile }
+
+struct S
+{
+  ~S () __attribute__ ((noreturn)) {} // { dg-warning "function does return" }
+};
+
+void
+foo ()
+{
+  S s;
+  throw 1;
+}

Jakub

[PATCH] Fix ix86_function_regparm with optimize attribute (PR target/60062)

2014-02-05 Thread Jakub Jelinek

Hi!

Using !!optimize to determine if we should switch local ABI to regparm
convention isn't compatible with optimize attribute, as !!optimize is
whether the current function is being optimized, but for the ABI decisions
we actually need the caller and callee to agree on the calling convention.

Fixed by looking at callee's optimize settings all the time.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-02-05  Jakub Jelinek  

PR target/60062
* config/i386/i386.c (ix86_function_regparm): Use optimize
from the callee instead of current function's optimize
to determine if local regparm convention should be used.

* gcc.c-torture/execute/pr60062.c: New test.
* gcc.c-torture/execute/pr60072.c: New test.

--- gcc/config/i386/i386.c.jj   2014-02-04 01:36:00.0 +0100
+++ gcc/config/i386/i386.c  2014-02-05 09:09:29.603827877 +0100
@@ -5608,11 +5608,22 @@ ix86_function_regparm (const_tree type,
   /* Use register calling convention for local functions when possible.  */
   if (decl
   && TREE_CODE (decl) == FUNCTION_DECL
-  && optimize
   && !(profile_flag && !flag_fentry))
 {
   /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
   struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE (decl));
+
+  /* Caller and callee must agree on the calling convention, so
+checking here just optimize means that with
+__attribute__((optimize (...))) caller could use regparm convention
+and callee not, or vice versa.  Instead look at whether the callee
+is optimized or not.  */
+  tree opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (decl);
+  if (opts == NULL_TREE)
+   opts = optimization_default_node;
+  if (!TREE_OPTIMIZATION (opts)->x_optimize)
+   i = NULL;
+
   if (i && i->local && i->can_change_signature)
{
  int local_regparm, globals = 0, regno;
--- gcc/testsuite/gcc.c-torture/execute/pr60062.c.jj2014-02-05 
09:02:13.703085235 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr60062.c   2014-02-05 
09:01:54.0 +0100
@@ -0,0 +1,25 @@
+/* PR target/60062 */
+
+int a;
+
+static void
+foo (const char *p1, int p2)
+{
+  if (__builtin_strcmp (p1, "hello") != 0)
+__builtin_abort ();
+}
+
+static void
+bar (const char *p1)
+{
+  if (__builtin_strcmp (p1, "hello") != 0)
+__builtin_abort ();
+}
+
+__attribute__((optimize (0))) int
+main ()
+{
+  foo ("hello", a);
+  bar ("hello");
+  return 0;
+}
--- gcc/testsuite/gcc.c-torture/execute/pr60072.c.jj2013-08-25 
18:20:55.717911035 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr60072.c   2014-02-05 
11:33:07.501748426 +0100
@@ -0,0 +1,16 @@
+/* PR target/60072 */
+
+int c = 1;
+
+__attribute__ ((optimize (1)))
+static int *foo (int *p)
+{
+  return p;
+}
+
+int
+main ()
+{
+  *foo (&c) = 2;
+  return c - 2;
+}

Jakub

[PATCH] Make pr59597 test PIC-friendly

2014-02-05 Thread Ian Bolton

PR59597 reinstated some code to cancel unnecessary jump threading,
and brought with it a testcase to check that the cancelling happened.

http://gcc.gnu.org/ml/gcc-patches/2014-01/msg01448.html

With PIC enabled for arm and aarch64, the unnecessary jump threading
already never took place, so there is nothing to cancel, leading the
test case to fail.  My suspicion is that similar issues will happen
for other architectures too.

This patch changes the called function to be static, so that jump
threading and the resulting cancellation happen for PIC variants too.

OK for stage 4 or wait for stage 1?

Cheers,
Ian


2014-02-05  Ian Bolton  

testsuite/
* gcc.dg/tree-ssa/pr59597.c: Make called function static so that
expected outcome works for PIC variants too.diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr59597.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr59597.c
index 814d299..bc9d730 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr59597.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr59597.c
@@ -8,7 +8,8 @@ typedef unsigned int u32;
 
 u32 f[NNN], t[NNN];
 
-u16 Calc_crc8(u8 data, u16 crc )
+static u16
+Calc_crc8 (u8 data, u16 crc)
 {
   u8 i=0,x16=0,carry=0;
   for (i = 0; i < 8; i++)
@@ -31,7 +32,9 @@ u16 Calc_crc8(u8 data, u16 crc )
 }
   return crc;
 }
-int main (int argc, char argv[])
+
+int
+main (int argc, char argv[])
 {
   int i, j; u16 crc;
   for (j = 0; j < 1000; j++)

[PATCH] One possible fix for the compute_bb_predicate oscillation (PR ipa/60013)

2014-02-05 Thread Jakub Jelinek

Hi!

Here is one possible fix for the PR60013 compute_bb_predicates oscillation,
the PR contains long analysis, but to sum it up, the problem is that
we only allow 8 conjuction operands for the predicates and when we reach 8
and want to add some further one, we just throw some of the disjunctions
on the floor, which makes the dataflow computation not guaranteed to
terminate.

The following patch fixes that by just turning all predicates that would
need more than 8 CNF toplevel operands into always true predicates (meaning
that we conservatively consider the particular basic block to be potentially
always executed independently on the function arguments).

I've bootstrapped/regtested this patch on x86_64-linux and i686-linux,
the (i2 == MAX_CLAUSES) condition was true in 2447 cases while inside
of compute_bb_predicates (on average compute_bb_predicates call that
triggered at least one such conservative bail out would see 3.14 of those)
and 482 times outside of compute_bb_predicate, on 159 unique source files
something triggered.  But, the fact that it triggered doesn't mean at all
the function would be even considered for inlining and even if it would, it
often would not change the inlining decisions.

As I said in the PR, the other possibility I see is just in letting
compute_bb_predicates do the dataflow until it set's the predicates on all
basic blocks, and then perhaps have some counter for each bb how many times
it has changed in the following iterations and if it changes too many times,
just use true_predicate for it or something similar.

Or the following patch could be combined with some MAX_CLAUSES growth (say
from 8 to 16) to make it punt less often.

2014-02-05  Jakub Jelinek  

PR ipa/60013
* ipa-inline-analysis.c (add_clause): Fix comment typos.  If
clause couldn't be added because there are already MAX_CLAUSES
clauses, turn p into a true predicate.
(and_predicates, or_predicates): If add_clause turned the
result into a true_predicate_p, break out of the loop nest.

* gcc.dg/pr60013.c: New test.

--- gcc/ipa-inline-analysis.c.jj2014-02-03 08:53:12.0 +0100
+++ gcc/ipa-inline-analysis.c   2014-02-05 08:01:57.093878954 +0100
@@ -310,7 +310,7 @@ add_clause (conditions conditions, struc
   if (false_predicate_p (p))
 return;
 
-  /* No one should be sily enough to add false into nontrivial clauses.  */
+  /* No one should be silly enough to add false into nontrivial clauses.  */
   gcc_checking_assert (!(clause & (1 << predicate_false_condition)));
 
   /* Look where to insert the clause.  At the same time prune out
@@ -372,9 +372,13 @@ add_clause (conditions conditions, struc
 }
 
 
-  /* We run out of variants.  Be conservative in positive direction.  */
+  /* We run out of variants.  Conservatively turn clause into true
+ predicate.  */
   if (i2 == MAX_CLAUSES)
-return;
+{
+  p->clause[0] = 0;
+  return;
+}
   /* Keep clauses in decreasing order. This makes equivalence testing easy.  */
   p->clause[i2 + 1] = 0;
   if (insert_here >= 0)
@@ -412,6 +416,8 @@ and_predicates (conditions conditions,
 {
   gcc_checking_assert (i < MAX_CLAUSES);
   add_clause (conditions, &out, p2->clause[i]);
+  if (true_predicate_p (&out))
+   break;
 }
   return out;
 }
@@ -459,6 +465,8 @@ or_predicates (conditions conditions,
   {
gcc_checking_assert (i < MAX_CLAUSES && j < MAX_CLAUSES);
add_clause (conditions, &out, p->clause[i] | p2->clause[j]);
+   if (true_predicate_p (&out))
+ return out;
   }
   return out;
 }
@@ -1035,7 +1043,7 @@ inline_node_removal_hook (struct cgraph_
   memset (info, 0, sizeof (inline_summary_t));
 }
 
-/* Remap predicate P of former function to be predicate of duplicated functoin.
+/* Remap predicate P of former function to be predicate of duplicated function.
POSSIBLE_TRUTHS is clause of possible truths in the duplicated node,
INFO is inline summary of the duplicated node.  */
 
--- gcc/testsuite/gcc.dg/pr60013.c.jj   2014-02-05 09:49:51.632142747 +0100
+++ gcc/testsuite/gcc.dg/pr60013.c  2014-02-05 09:49:14.0 +0100
@@ -0,0 +1,47 @@
+/* PR ipa/60013 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+typedef long int jmp_buf[64];
+extern int _setjmp (jmp_buf) __attribute__ ((__nothrow__));
+struct S { int a, b, c; };
+extern struct S *baz (struct S *);
+static jmp_buf j;
+
+static inline int
+bar (int b, int d)
+{
+  return (b & d) < 0;
+}
+
+struct S *
+foo (int a, struct S *b, struct S *c, struct S *d)
+{
+  if (b->a == 0)
+{
+  switch (a)
+   {
+   case 8:
+ return baz (b);
+   case 7:
+ bar (b->c, c->b);
+ return 0;
+   case 6:
+   case 5:
+   case 4:
+ return baz (c);
+   case 3:
+   case 2:
+ return baz (d);
+   }
+  return 0;
+}
+  if (b->a == 1)
+{
+  if (baz (c))
+   return c;
+  els

Re: [PATCH] Don't ICE on empty bbs with no successors in EH opt (PR middle-end/57499)

2014-02-05 Thread Richard Biener

On Wed, 5 Feb 2014, Jakub Jelinek wrote:

> Hi!
> 
> As the testcase shows, invalid use of noreturn attribute (which we diagnose)
> can result in in empty bb with no successors.  This patch fixes
> cleanup_empty_eh not to ICE on these.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2014-02-04  Jakub Jelinek  
> 
>   PR middle-end/57499
>   * tree-eh.c (cleanup_empty_eh): Bail out on totally empty
>   bb with no successors.
> 
>   * g++.dg/torture/pr57499.C: New test.
> 
> --- gcc/tree-eh.c.jj  2014-01-31 22:22:09.0 +0100
> +++ gcc/tree-eh.c 2014-02-04 12:35:05.611066852 +0100
> @@ -4396,8 +4396,11 @@ cleanup_empty_eh (eh_landing_pad lp)
>/* If the block is totally empty, look for more unsplitting cases.  */
>if (gsi_end_p (gsi))
>  {
> -  /* For the degenerate case of an infinite loop bail out.  */
> -  if (infinite_empty_loop_p (e_out))
> +  /* For the degenerate case of an infinite loop bail out.
> +  If bb has no successors and is totally empty, which can happen e.g.
> +  because of incorrect noreturn attribute, bail out too.  */
> +  if (e_out == NULL
> +   || infinite_empty_loop_p (e_out))
>   return ret;
>  
>return ret | cleanup_empty_eh_unsplit (bb, e_out, lp);
> --- gcc/testsuite/g++.dg/torture/pr57499.C.jj 2014-02-04 12:52:10.241846755 
> +0100
> +++ gcc/testsuite/g++.dg/torture/pr57499.C2014-02-04 12:51:51.0 
> +0100
> @@ -0,0 +1,14 @@
> +// PR middle-end/57499
> +// { dg-do compile }
> +
> +struct S
> +{
> +  ~S () __attribute__ ((noreturn)) {} // { dg-warning "function does return" 
> }
> +};
> +
> +void
> +foo ()
> +{
> +  S s;
> +  throw 1;
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer

Re: [PATCH] Fix ix86_function_regparm with optimize attribute (PR target/60062)

2014-02-05 Thread Richard Biener

On Wed, 5 Feb 2014, Jakub Jelinek wrote:

> Hi!
> 
> Using !!optimize to determine if we should switch local ABI to regparm
> convention isn't compatible with optimize attribute, as !!optimize is
> whether the current function is being optimized, but for the ABI decisions
> we actually need the caller and callee to agree on the calling convention.
> 
> Fixed by looking at callee's optimize settings all the time.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Seeing a 2nd variant of such code I'd rather have an abstraction
for this ... optimize_fn ()?  opt_for_fn (fn, OPT_...)?

Richard.

> 2014-02-05  Jakub Jelinek  
> 
>   PR target/60062
>   * config/i386/i386.c (ix86_function_regparm): Use optimize
>   from the callee instead of current function's optimize
>   to determine if local regparm convention should be used.
> 
>   * gcc.c-torture/execute/pr60062.c: New test.
>   * gcc.c-torture/execute/pr60072.c: New test.
> 
> --- gcc/config/i386/i386.c.jj 2014-02-04 01:36:00.0 +0100
> +++ gcc/config/i386/i386.c2014-02-05 09:09:29.603827877 +0100
> @@ -5608,11 +5608,22 @@ ix86_function_regparm (const_tree type,
>/* Use register calling convention for local functions when possible.  */
>if (decl
>&& TREE_CODE (decl) == FUNCTION_DECL
> -  && optimize
>&& !(profile_flag && !flag_fentry))
>  {
>/* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
>struct cgraph_local_info *i = cgraph_local_info (CONST_CAST_TREE 
> (decl));
> +
> +  /* Caller and callee must agree on the calling convention, so
> +  checking here just optimize means that with
> +  __attribute__((optimize (...))) caller could use regparm convention
> +  and callee not, or vice versa.  Instead look at whether the callee
> +  is optimized or not.  */
> +  tree opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (decl);
> +  if (opts == NULL_TREE)
> + opts = optimization_default_node;
> +  if (!TREE_OPTIMIZATION (opts)->x_optimize)
> + i = NULL;
> +
>if (i && i->local && i->can_change_signature)
>   {
> int local_regparm, globals = 0, regno;
> --- gcc/testsuite/gcc.c-torture/execute/pr60062.c.jj  2014-02-05 
> 09:02:13.703085235 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr60062.c 2014-02-05 
> 09:01:54.0 +0100
> @@ -0,0 +1,25 @@
> +/* PR target/60062 */
> +
> +int a;
> +
> +static void
> +foo (const char *p1, int p2)
> +{
> +  if (__builtin_strcmp (p1, "hello") != 0)
> +__builtin_abort ();
> +}
> +
> +static void
> +bar (const char *p1)
> +{
> +  if (__builtin_strcmp (p1, "hello") != 0)
> +__builtin_abort ();
> +}
> +
> +__attribute__((optimize (0))) int
> +main ()
> +{
> +  foo ("hello", a);
> +  bar ("hello");
> +  return 0;
> +}
> --- gcc/testsuite/gcc.c-torture/execute/pr60072.c.jj  2013-08-25 
> 18:20:55.717911035 +0200
> +++ gcc/testsuite/gcc.c-torture/execute/pr60072.c 2014-02-05 
> 11:33:07.501748426 +0100
> @@ -0,0 +1,16 @@
> +/* PR target/60072 */
> +
> +int c = 1;
> +
> +__attribute__ ((optimize (1)))
> +static int *foo (int *p)
> +{
> +  return p;
> +}
> +
> +int
> +main ()
> +{
> +  *foo (&c) = 2;
> +  return c - 2;
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer

Re: [ARM Documentation] Clarify -mcpu, -mtune, -march

2014-02-05 Thread James Greenhalgh

On Wed, Feb 05, 2014 at 11:02:42AM +, Ramana Radhakrishnan wrote:
> On 01/27/14 10:01, James Greenhalgh wrote:
> >
> > Hi,
> >
> > I've tripped myself over with these three options too many times,
> > actually, their behaviour is very simple.
> >
> > This patch clarifies the language used to describe the options, and
> > puts them in a logical order. I'm happy to reword again if this
> > is still not clear.
> >
> > OK?
> >
> > Thanks,
> > James
> >
> > ---
> > gcc/
> >
> > 2014-01-27  James Greenhalgh  
> >
> > PR target/59718
> > * doc/invoke.texi (-march=): Clarify documentation for ARM.
> > (-mtune=): Likewise.
> > (-mcpu=): Likewise.
> >
> 
> 

> Ok with those changes.

Thanks,

I've attached what I ended up committing.

As far as I know the behaviour of this flag has always been this way.  So is
this also OK to backport to release branches? 

Thanks,
James

>From b72249d400685a3c1a2e7eee5ad21db86006d34c Mon Sep 17 00:00:00 2001
From: James Greenhalgh 
Date: Wed, 8 Jan 2014 12:26:12 +
Subject: [ARM Documentation] Clarify -mcpu, -mtune, -march
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="1.8.3-rc0"

This is a multi-part message in MIME format.
--1.8.3-rc0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


Hi,

I've tripped myself over with these three options too many times,
actually, their behaviour is very simple.

This patch clarifies the language used to describe the options, and
puts them in a logical order. I'm happy to reword again if this
is still not clear.

OK?

Thanks,
James

---
gcc/

2014-02-05  James Greenhalgh  

	PR target/59718
	* doc/invoke.texi (-march=): Clarify documentation for ARM.
	(-mtune=): Likewise.
	(-mcpu=): Likewise.


--1.8.3-rc0
Content-Type: text/x-patch; name="0001-ARM-Documentation-Clarify-mcpu-mtune-march.patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment; filename="0001-ARM-Documentation-Clarify-mcpu-mtune-march.patch"

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 640c123..e3dc9df 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -12221,11 +12221,38 @@ option should only be used if you require compatibility with code for
 big-endian ARM processors generated by versions of the compiler prior to
 2.8.  This option is now deprecated.
 
-@item -mcpu=@var{name}
-@opindex mcpu
-This specifies the name of the target ARM processor.  GCC uses this name
-to determine what kind of instructions it can emit when generating
-assembly code.  Permissible names are: @samp{arm2}, @samp{arm250},
+@item -march=@var{name}
+@opindex march
+This specifies the name of the target ARM architecture.  GCC uses this
+name to determine what kind of instructions it can emit when generating
+assembly code.  This option can be used in conjunction with or instead
+of the @option{-mcpu=} option.  Permissible names are: @samp{armv2},
+@samp{armv2a}, @samp{armv3}, @samp{armv3m}, @samp{armv4}, @samp{armv4t},
+@samp{armv5}, @samp{armv5t}, @samp{armv5e}, @samp{armv5te},
+@samp{armv6}, @samp{armv6j},
+@samp{armv6t2}, @samp{armv6z}, @samp{armv6zk}, @samp{armv6-m},
+@samp{armv7}, @samp{armv7-a}, @samp{armv7-r}, @samp{armv7-m}, @samp{armv7ve},
+@samp{armv8-a}, @samp{armv8-a+crc},
+@samp{iwmmxt}, @samp{iwmmxt2}, @samp{ep9312}.
+
+@option{-march=armv7ve} is the armv7-a architecture with virtualization
+extensions.
+
+@option{-march=armv8-a+crc} enables code generation for the ARMv8-A
+architecture together with the optional CRC32 extensions.
+
+@option{-march=native} causes the compiler to auto-detect the architecture
+of the build computer.  At present, this feature is only supported on
+Linux, and not all architectures are recognized.  If the auto-detect is
+unsuccessful the option has no effect.
+
+@item -mtune=@var{name}
+@opindex mtune
+This option specifies the name of the target ARM processor for
+which GCC should tune the performance of the code.
+For some ARM implementations better performance can be obtained by using
+this option.
+Permissible names are: @samp{arm2}, @samp{arm250},
 @samp{arm3}, @samp{arm6}, @samp{arm60}, @samp{arm600}, @samp{arm610},
 @samp{arm620}, @samp{arm7}, @samp{arm7m}, @samp{arm7d}, @samp{arm7dm},
 @samp{arm7di}, @samp{arm7dmi}, @samp{arm70}, @samp{arm700},
@@ -12259,26 +12286,6 @@ Additionally, this option can specify that GCC should tune the performance
 of the code for a big.LITTLE system.  Permissible names are:
 @samp{cortex-a15.cortex-a7}, @samp{cortex-a57.cortex-a53}.
 
-@option{-mcpu=generic-@var{arch}} is also permissible, and is
-equivalent to @option{-march=@var{arch} -mtune=generic-@var{arch}}.
-See @option{-mtune} for more information.
-
-@option{-mcpu=native} causes the compiler to auto-detect the CPU
-of the build computer.  At present, this feature is only supported on
-Linux, and not all architectures are recognized.  If the auto-detect is
-unsuccessful the option has no effect.
-
-@item -mtune=@var{name}

Re: [ARM Documentation] Clarify -mcpu, -mtune, -march

2014-02-05 Thread Ramana Radhakrishnan




I've attached what I ended up committing.

As far as I know the behaviour of this flag has always been this way.  So is
this also OK to backport to release branches?


Ok by me. It's a documentation fix to make things more explicit.

And s/=/ again in Changelog :) .

Ramana



Thanks,
James




--
Ramana Radhakrishnan
Principal Engineer
ARM Ltd.

[RS6000] SDmode and reload

2014-02-05 Thread Alan Modra

This little patch fixes a number of dfp failures:

Fail: c-c++-common/dfp/func-vararg-alternate-d32.c execution test
FAIL: c-c++-common/dfp/func-vararg-alternate-d32.c -std=c++11 execution test
FAIL: c-c++-common/dfp/func-vararg-alternate-d32.c -std=c++98 execution test
FAIL: c-c++-common/dfp/func-vararg-dfp.c execution test
FAIL: c-c++-common/dfp/func-vararg-dfp.c -std=c++11 execution test
FAIL: c-c++-common/dfp/func-vararg-dfp.c -std=c++98 execution test
FAIL: c-c++-common/dfp/func-vararg-mixed.c execution test
FAIL: c-c++-common/dfp/func-vararg-mixed.c -std=c++11 execution test
FAIL: c-c++-common/dfp/func-vararg-mixed.c -std=c++98 execution test
FAIL: decimal/pr54036-1.cc (test for excess errors)
FAIL: gcc.dg/compat/scalar-by-value-dfp c_compat_x_tst.o-c_compat_y_tst.o 
execute 
FAIL: gcc.dg/compat/scalar-return-dfp c_compat_x_tst.o-c_compat_y_tst.o execute 
FAIL: gcc.target/powerpc/ppc32-abi-dfp-1.c execution test
FAIL: g++.dg/compat/decimal/pass-1 cp_compat_x_tst.o-cp_compat_y_tst.o execute 
FAIL: g++.dg/compat/decimal/pass-2 cp_compat_x_tst.o-cp_compat_y_tst.o execute 
FAIL: g++.dg/compat/decimal/pass-3 cp_compat_x_tst.o-cp_compat_y_tst.o execute 
FAIL: g++.dg/compat/decimal/pass-4 cp_compat_x_tst.o-cp_compat_y_tst.o execute 
FAIL: g++.dg/compat/decimal/pass-5 cp_compat_x_tst.o-cp_compat_y_tst.o execute 
FAIL: g++.dg/compat/decimal/pass-6 cp_compat_x_tst.o-cp_compat_y_tst.o execute 
FAIL: g++.dg/compat/decimal/return-1 cp_compat_x_tst.o-cp_compat_y_tst.o 
execute 
FAIL: g++.dg/compat/decimal/return-2 cp_compat_x_tst.o-cp_compat_y_tst.o 
execute 
FAIL: g++.dg/compat/decimal/return-3 cp_compat_x_tst.o-cp_compat_y_tst.o 
execute 
FAIL: g++.dg/compat/decimal/return-4 cp_compat_x_tst.o-cp_compat_y_tst.o 
execute 
FAIL: g++.dg/compat/decimal/return-5 cp_compat_x_tst.o-cp_compat_y_tst.o 
execute 
FAIL: g++.dg/compat/decimal/return-6 cp_compat_x_tst.o-cp_compat_y_tst.o 
execute 

Vlad added rs6000_secondary_memory_needed_mode for lra, and in so
doing broke reload's use of cfun->machine->sdmode_stack_slot in
rs6000_secondary_memory_needed_rtx (mode was no longer SDmode there).
Bootstrapped and regression tested powerpc64-linux, both with and
without -mlra.  OK to commit?

PR target/60032
* config/rs6000/rs6000.c (rs6000_secondary_memory_needed_mode): Only
change SDmode to DDmode when lra_in_progress.

* gcc.target/powerpc/pr60032.c: New.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 207417)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -15581,7 +15581,7 @@
 enum machine_mode
 rs6000_secondary_memory_needed_mode (enum machine_mode mode)
 {
-  if (mode == SDmode)
+  if (lra_in_progress && mode == SDmode)
 return DDmode;
   return mode;
 }
Index: gcc/testsuite/gcc.target/powerpc/pr60032.c
===
--- gcc/testsuite/gcc.target/powerpc/pr60032.c  (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr60032.c  (revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target dfp } */
+/* { dg-options "-O2" } */
+
+void foo (void)
+{
+  register float __attribute__ ((mode(SD))) r31 __asm__ ("r31");
+  register float __attribute__ ((mode(SD))) fr1 __asm__ ("fr1");
+
+  __asm__ ("#" : "=d" (fr1));
+  r31 = fr1;
+  __asm__ ("#" : : "r" (r31));
+}

-- 
Alan Modra
Australia Development Lab, IBM

[PATCH] Fix loop preserving in loop header copying

2014-02-05 Thread Richard Biener


There's a simple mistake in gimple_duplicate_sese_region that
causes all loop-header copied loops to be thrown away (but soon
re-discovered, hopefully without somebody inbetween messing things up).

Fixed like the following, bootstrapped on x86_64-unknown-linux-gnu,
testing in progress.

Richard.

2014-02-05  Richard Biener  

* tree-cfg.c (gimple_duplicate_sese_region): Fix ordering of
set_loop_copy and initialize_original_copy_tables.

Index: gcc/tree-cfg.c
===
*** gcc/tree-cfg.c  (revision 207497)
--- gcc/tree-cfg.c  (working copy)
*** gimple_duplicate_sese_region (edge entry
*** 5879,5892 
return false;
  }
  
-   set_loop_copy (loop, loop);
- 
/* In case the function is used for loop header copying (which is the 
primary
   use), ensure that EXIT and its copy will be new latch and entry edges.  
*/
if (loop->header == entry->dest)
  {
copying_header = true;
-   set_loop_copy (loop, loop_outer (loop));
  
if (!dominated_by_p (CDI_DOMINATORS, loop->latch, exit->src))
return false;
--- 5905,5915 
*** gimple_duplicate_sese_region (edge entry
*** 5897,5910 
  return false;
  }
  
if (!region_copy)
  {
region_copy = XNEWVEC (basic_block, n_region);
free_region_copy = true;
  }
  
-   initialize_original_copy_tables ();
- 
/* Record blocks outside the region that are dominated by something
   inside.  */
if (update_dominance)
--- 5920,5938 
  return false;
  }
  
+   initialize_original_copy_tables ();
+ 
+   if (copying_header)
+ set_loop_copy (loop, loop_outer (loop));
+   else
+ set_loop_copy (loop, loop);
+ 
if (!region_copy)
  {
region_copy = XNEWVEC (basic_block, n_region);
free_region_copy = true;
  }
  
/* Record blocks outside the region that are dominated by something
   inside.  */
if (update_dominance)

Re: [ARM][PATCH] Vectorizer generates unaligned access when -mno-unaligned-access is enabled

2014-02-05 Thread Yury Gribov


Ramana wrote:

The error is caused by unaligned vector load via two vldr instructions:


So, does this mean that we get unaligned vector loads
when munaligned-access is on, using vldr instructions ?


No, when -munaligned-access is on, the backend will enable movmisalign 
patterns that generate friendly vst1/vld1 instructions (well, at least 
on little-endian ARMs).



Cause for this error is, even when -mno-unaligned-access is enabled,
backend will inform vectorizer that it supports misaligned accesses
via TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT.


Ramana wrote:

Attached patch fixes this. Is this OK for trunk?


How has it been tested ?


I was hoping Linaro people could run their magical cbuild on it...

-Y

Re: [C,C++] integer constants in attribute arguments

2014-02-05 Thread Andreas Schwab

Marc Glisse  writes:

> On Wed, 5 Feb 2014, Andreas Schwab wrote:
>
>> Wrt. the function decl it should probably just be changed to a variable
>> decl.  Alignments on function decls are kind of exotic.
>
> Good idea. How about this patch then?

LGTM.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

Re: [PATCH, rs6000] Implement -maltivec=be for vec_pack, vec_unpackh, vec_unpackl Altivec builtins

2014-02-05 Thread David Edelsohn

On Tue, Feb 4, 2014 at 2:13 PM, Bill Schmidt
 wrote:

> Yet another -maltivec=be patch, this one for the vector pack/unpack
> builtins.  For the pack operations, we need to reverse the order of the
> inputs for little endian without -maltivec=be.  For the unpack
> operations, we need to replace unpackh with unpackl, and vice versa, for
> little endian without -maltivec=be.
>
> For both pack and unpack, there are some internal uses that should not
> have their semantics changed from the existing implementation.  For
> these, "_direct" forms of the insns are provided, as has been done in
> previous patches in this series.
>
> Four new test cases are added to test these builtins for all applicable
> vector types, with and without -maltivec=be.
>
> Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no
> regressions.  Is this ok for trunk?
>
> Thanks,
> Bill
>
>
> gcc:
>
> 2014-02-04  Bill Schmidt  
>
> * altivec.md (UNSPEC_VPACK_UNS_UNS_MOD_DIRECT): New unspec.
> (UNSPEC_VUNPACK_HI_SIGN_DIRECT): Likewise.
> (UNSPEC_VUNPACK_LO_SIGN_DIRECT): Likewise.
> (mulv8hi3): Use gen_altivec_vpkuwum_direct instead of
> gen_altivec_vpkuwum.
> (altivec_vpkpx): Test for VECTOR_ELT_ORDER_BIG instead of for
> BYTES_BIG_ENDIAN.
> (altivec_vpksss): Likewise.
> (altivec_vpksus): Likewise.
> (altivec_vpkuus): Likewise.
> (altivec_vpkuum): Likewise.
> (altivec_vpkuum_direct): New (copy of
> altivec_vpkuum that still relies on BYTES_BIG_ENDIAN, for
> internal use).
> (altivec_vupkhs): Emit vupkls* instead of vupkhs* when
> target is little endian and -maltivec=be is not specified.
> (*altivec_vupkhs_direct): New (copy of
> altivec_vupkhs that always emits vupkhs*, for internal
> use).
> (altivec_vupkls): Emit vupkhs* instead of vupkls* when
> target is little endian and -maltivec=be is not specified.
> (*altivec_vupkls_direct): New (copy of
> altivec_vupkls that always emits vupkls*, for internal
> use).
> (altivec_vupkhpx): Emit vupklpx instead of vupkhpx when target is
> little endian and -maltivec=be is not specified.
> (altivec_vupklpx): Emit vupkhpx instead of vupklpx when target is
> little endian and -maltivec=be is not specified.
>
> gcc/testsuite:
>
> 2014-02-04  Bill Schmidt  
>
> * gcc.dg/vmx/pack.c: New.
> * gcc.dg/vmx/pack-be-order.c: New.
> * gcc.dg/vmx/unpack.c: New.
> * gcc.dg/vmx/unpack-be-order.c: New.

Okay.

Thanks, David

Re: [ARM][PATCH] Vectorizer generates unaligned access when -mno-unaligned-access is enabled

2014-02-05 Thread Ramana Radhakrishnan






Ramana wrote:

Attached patch fixes this. Is this OK for trunk?


How has it been tested ?


I was hoping Linaro people could run their magical cbuild on it...


Ok if no regressions.

regards
Ramana



-Y




--
Ramana Radhakrishnan
Principal Engineer
ARM Ltd.

[PATCH] FIx PR60076

2014-02-05 Thread Richard Biener


Tested on x86_64-unknown-linux-gnu, applied.

Richard.

2014-02-05  Richard Biener  

PR testsuite/60076
* gcc.dg/vect/pr60012.c: Require vect_extract_even_odd and
avoid using unsigned long long.

Index: gcc/testsuite/gcc.dg/vect/pr60012.c
===
--- gcc/testsuite/gcc.dg/vect/pr60012.c (revision 207504)
+++ gcc/testsuite/gcc.dg/vect/pr60012.c (working copy)
@@ -8,14 +8,14 @@ typedef struct
 } complex16_t;
 
 void
-libvector_AccSquareNorm_ref (unsigned long long  *acc,
+libvector_AccSquareNorm_ref (unsigned int *acc,
 const complex16_t *x, unsigned len)
 {
   unsigned i;
   for (i = 0; i < len; i++)
-acc[i] += ((unsigned long long)((int)x[i].real * x[i].real))
-   + ((unsigned long long)((int)x[i].imag * x[i].imag));
+acc[i] += ((unsigned int)((int)x[i].real * x[i].real))
+   + ((unsigned int)((int)x[i].imag * x[i].imag));
 }
 
-/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { 
vect_extract_even_odd } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */

Re: [PATCH, rs6000] Handle -maltivec=be for vec_sum2s builtin (last -maltivec=be patch)

2014-02-05 Thread David Edelsohn

On Tue, Feb 4, 2014 at 10:15 PM, Bill Schmidt
 wrote:
> Hi,
>
> One final patch in the series, this one for vec_sum2s.  This builtin
> requires some additional code generation for the case of little endian
> without -maltivec=be.  Here's an example:
>
>   va = {-10,1,2,3};0x 0003 0002 0001 fff6
>   vb = {100,101,102,-103}; 0x ff99 0066 0065 0064
>   vc = vec_sum2s (va, vb); 0x ff9e  005c 
>   = {0,92,0,-98};
>
> We need to add -10 + 1 + 101 = 92 and place it in vc[1], and add 2 + 3 +
> -103 and place the result in vc[3], with zeroes in the other two
> elements.  To do this, we first use "vsldoi vs,vb,vb,12" to rotate 101
> and -103 into big-endian elements 1 and 3, as required by the vsum2sws
> instruction:
>
>   0x ff99 0066 0065 0064 ff99 0066 0065 0064
>    
>   vs =  0064 ff99 0066 0065
>
> Executing "vsum2sws vs,va,vs" then gives
>
>   vs = 0x  ff9e  005c
>
> which then must be shifted into position with "vsldoi vc,vs,vs,4"
>
>   0x  ff9e  005c  ff9e  005c
>      
>  vc = ff9e  005c 
>
> which is the desired result.
>
> In addition to this change, I noticed a redundant test from one of my
> previous patches and simplified it.  (BYTES_BIG_ENDIAN implies
> VECTOR_ELT_ORDER_BIG, so we don't need to test BYTES_BIG_ENDIAN.)
>
> As usual, new test cases are added to cover the possible cases.  These
> are simpler this time since only vector signed integer is a legal type
> for vec_sum2s.
>
> Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no
> regressions.  Is this ok for trunk?
>
> Thanks,
> Bill
>
>
> gcc:
>
> 2014-02-04  Bill Schmidt  
>
> * config/rs6000/altivec.md (altivec_vsum2sws): Adjust code
> generation for -maltivec=be.
> (altivec_vsumsws): Simplify redundant test.
>
> gcc/testsuite:
>
> 2014-02-04  Bill Schmidt  
>
> * gcc.dg/vmx/sum2s.c: New.
> * gcc.dg/vmx/sum2s-be-order.c: New.

Okay.

The multi-instruction sequences really should be emitted as separate
instructions and the scratch only allocated for the LE case.

Thanks, David

Re: [PATCH] Fix ix86_function_regparm with optimize attribute (PR target/60062)

2014-02-05 Thread Jan Hubicka

> On Wed, 5 Feb 2014, Jakub Jelinek wrote:
> 
> > Hi!
> > 
> > Using !!optimize to determine if we should switch local ABI to regparm
> > convention isn't compatible with optimize attribute, as !!optimize is
> > whether the current function is being optimized, but for the ABI decisions
> > we actually need the caller and callee to agree on the calling convention.
> > 
> > Fixed by looking at callee's optimize settings all the time.
> > 
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> Seeing a 2nd variant of such code I'd rather have an abstraction
> for this ... optimize_fn ()?  opt_for_fn (fn, OPT_...)?

Propbable opt_for_fn.  I wanted to implement SSE passing conventions for local
functions but didn't get to it since I was concerned about -fno-sse being
passed to teh callee by target attribute.  Having API to check those would be 
nice.

Honza

Re: [PATCH] Remove lto_global_var_decls, simplify necessary analysis for PR60060

2014-02-05 Thread Jan Hubicka

> 
> When looking at PR60060, outputting the declaration debug info for
> a local static variable twice (once via BLOCK_VARS and once via
> calling the debug hook on all globals) I noticed that the
> rest_of_decl_compilation call in materialize_cgraph always works
> on the empty lto_global_var_decls vector (we only populate it
> later).  That results in the opportunity to effectively remove it
> and in lto_write_globals walk over the defined vars in the
> varpool instead.
> 
> Eventually the fix for PR60060 is to not push TREE_ASM_WRITTEN
> decls there (but I'm not sure of other side-effects of that ...).
> 
> Thus, this cleanup first.
> 
> LTO bootstrap / regtest running on x86_64-unknown-linux-gnu.
> 
> Ok?

Seems OK to me.  I always had an impression that this machinery exists to
make decls that are revmoed fro symbol table (optimized out) to land the
debug info, but I think this is mostly broken by partitioning anyway, right?
Do we stream them at all?

Honza
> 
> Thanks,
> Richard.
> 
> 2014-02-05  Richard Biener  
> 
>   lto/
>   * lto.h (lto_global_var_decls): Remove.
>   * lto-lang.c (lto_init): Do not allocate lto_global_var_decls.
>   (lto_write_globals): Do nothing in WPA stage, gather globals from
>   the varpool here ...
>   * lto.c (lto_main): ... not here.
>   (materialize_cgraph): Do not call rest_of_decl_compilation
>   on the empty lto_global_var_decls vector.
>   (lto_global_var_decls): Remove.
> 
> Index: gcc/lto/lto-lang.c
> ===
> *** gcc/lto/lto-lang.c(revision 207497)
> --- gcc/lto/lto-lang.c(working copy)
> *** lto_getdecls (void)
> *** 1075,1085 
>   static void
>   lto_write_globals (void)
>   {
> !   tree *vec = lto_global_var_decls->address ();
> !   int len = lto_global_var_decls->length ();
> wrapup_global_declarations (vec, len);
> emit_debug_global_declarations (vec, len);
> !   vec_free (lto_global_var_decls);
>   }
>   
>   static tree
> --- 1075,1094 
>   static void
>   lto_write_globals (void)
>   {
> !   if (flag_wpa)
> ! return;
> ! 
> !   /* Record the global variables.  */
> !   vec lto_global_var_decls = vNULL;
> !   varpool_node *vnode;
> !   FOR_EACH_DEFINED_VARIABLE (vnode)
> ! lto_global_var_decls.safe_push (vnode->decl);
> ! 
> !   tree *vec = lto_global_var_decls.address ();
> !   int len = lto_global_var_decls.length ();
> wrapup_global_declarations (vec, len);
> emit_debug_global_declarations (vec, len);
> !   lto_global_var_decls.release ();
>   }
>   
>   static tree
> *** lto_init (void)
> *** 1218,1224 
>   #undef NAME_TYPE
>   
> /* Initialize LTO-specific data structures.  */
> -   vec_alloc (lto_global_var_decls, 256);
> in_lto_p = true;
>   
> return true;
> --- 1227,1232 
> Index: gcc/lto/lto.c
> ===
> *** gcc/lto/lto.c (revision 207497)
> --- gcc/lto/lto.c (working copy)
> *** along with GCC; see the file COPYING3.
> *** 50,57 
>   #include "context.h"
>   #include "pass_manager.h"
>   
> - /* Vector to keep track of external variables we've seen so far.  */
> - vec *lto_global_var_decls;
>   
>   static GTY(()) tree first_personality_decl;
>   
> --- 50,55 
> *** read_cgraph_and_symbols (unsigned nfiles
> *** 3009,3017 
>   static void
>   materialize_cgraph (void)
>   {
> -   tree decl;
> struct cgraph_node *node; 
> -   unsigned i;
> timevar_id_t lto_timer;
>   
> if (!quiet_flag)
> --- 3007,3013 
> *** materialize_cgraph (void)
> *** 3043,3052 
> current_function_decl = NULL;
> set_cfun (NULL);
>   
> -   /* Inform the middle end about the global variables we have seen.  */
> -   FOR_EACH_VEC_ELT (*lto_global_var_decls, i, decl)
> - rest_of_decl_compilation (decl, 1, 0);
> - 
> if (!quiet_flag)
>   fprintf (stderr, "\n");
>   
> --- 3039,3044 
> *** lto_main (void)
> *** 3309,3316 
>   do_whole_program_analysis ();
> else
>   {
> -   varpool_node *vnode;
> - 
> timevar_start (TV_PHASE_OPT_GEN);
>   
> materialize_cgraph ();
> --- 3301,3306 
> *** lto_main (void)
> *** 3330,3339 
>this.  */
> if (flag_lto_report || (flag_wpa && flag_lto_report_wpa))
>   print_lto_report_1 ();
> - 
> -   /* Record the global variables.  */
> -   FOR_EACH_DEFINED_VARIABLE (vnode)
> - vec_safe_push (lto_global_var_decls, vnode->decl);
>   }
>   }
>   
> --- 3320,3325 
> Index: gcc/lto/lto.h
> ===
> *** gcc/lto/lto.h (revision 207497)
> --- gcc/lto/lto.h (working copy)
> *** extern tree lto_eh_personality (void);
> *** 41,49 
>   extern void lto_main (void);
>   extern void lto_read_all_file_options (void

Re: [PATCH] Remove lto_global_var_decls, simplify necessary analysis for PR60060

2014-02-05 Thread Richard Biener

On Wed, 5 Feb 2014, Jan Hubicka wrote:

> > 
> > When looking at PR60060, outputting the declaration debug info for
> > a local static variable twice (once via BLOCK_VARS and once via
> > calling the debug hook on all globals) I noticed that the
> > rest_of_decl_compilation call in materialize_cgraph always works
> > on the empty lto_global_var_decls vector (we only populate it
> > later).  That results in the opportunity to effectively remove it
> > and in lto_write_globals walk over the defined vars in the
> > varpool instead.
> > 
> > Eventually the fix for PR60060 is to not push TREE_ASM_WRITTEN
> > decls there (but I'm not sure of other side-effects of that ...).
> > 
> > Thus, this cleanup first.
> > 
> > LTO bootstrap / regtest running on x86_64-unknown-linux-gnu.
> > 
> > Ok?
> 
> Seems OK to me.  I always had an impression that this machinery exists to
> make decls that are revmoed fro symbol table (optimized out) to land the
> debug info, but I think this is mostly broken by partitioning anyway, right?
> Do we stream them at all?

No, we don't stream those.  We'd need to stream what the FEs think
are the globals (lang_hooks.getdecls()) and somehow partition them
and ship them somewhere.

Richard.
 
> Honza
> > 
> > Thanks,
> > Richard.
> > 
> > 2014-02-05  Richard Biener  
> > 
> > lto/
> > * lto.h (lto_global_var_decls): Remove.
> > * lto-lang.c (lto_init): Do not allocate lto_global_var_decls.
> > (lto_write_globals): Do nothing in WPA stage, gather globals from
> > the varpool here ...
> > * lto.c (lto_main): ... not here.
> > (materialize_cgraph): Do not call rest_of_decl_compilation
> > on the empty lto_global_var_decls vector.
> > (lto_global_var_decls): Remove.
> > 
> > Index: gcc/lto/lto-lang.c
> > ===
> > *** gcc/lto/lto-lang.c  (revision 207497)
> > --- gcc/lto/lto-lang.c  (working copy)
> > *** lto_getdecls (void)
> > *** 1075,1085 
> >   static void
> >   lto_write_globals (void)
> >   {
> > !   tree *vec = lto_global_var_decls->address ();
> > !   int len = lto_global_var_decls->length ();
> > wrapup_global_declarations (vec, len);
> > emit_debug_global_declarations (vec, len);
> > !   vec_free (lto_global_var_decls);
> >   }
> >   
> >   static tree
> > --- 1075,1094 
> >   static void
> >   lto_write_globals (void)
> >   {
> > !   if (flag_wpa)
> > ! return;
> > ! 
> > !   /* Record the global variables.  */
> > !   vec lto_global_var_decls = vNULL;
> > !   varpool_node *vnode;
> > !   FOR_EACH_DEFINED_VARIABLE (vnode)
> > ! lto_global_var_decls.safe_push (vnode->decl);
> > ! 
> > !   tree *vec = lto_global_var_decls.address ();
> > !   int len = lto_global_var_decls.length ();
> > wrapup_global_declarations (vec, len);
> > emit_debug_global_declarations (vec, len);
> > !   lto_global_var_decls.release ();
> >   }
> >   
> >   static tree
> > *** lto_init (void)
> > *** 1218,1224 
> >   #undef NAME_TYPE
> >   
> > /* Initialize LTO-specific data structures.  */
> > -   vec_alloc (lto_global_var_decls, 256);
> > in_lto_p = true;
> >   
> > return true;
> > --- 1227,1232 
> > Index: gcc/lto/lto.c
> > ===
> > *** gcc/lto/lto.c   (revision 207497)
> > --- gcc/lto/lto.c   (working copy)
> > *** along with GCC; see the file COPYING3.
> > *** 50,57 
> >   #include "context.h"
> >   #include "pass_manager.h"
> >   
> > - /* Vector to keep track of external variables we've seen so far.  */
> > - vec *lto_global_var_decls;
> >   
> >   static GTY(()) tree first_personality_decl;
> >   
> > --- 50,55 
> > *** read_cgraph_and_symbols (unsigned nfiles
> > *** 3009,3017 
> >   static void
> >   materialize_cgraph (void)
> >   {
> > -   tree decl;
> > struct cgraph_node *node; 
> > -   unsigned i;
> > timevar_id_t lto_timer;
> >   
> > if (!quiet_flag)
> > --- 3007,3013 
> > *** materialize_cgraph (void)
> > *** 3043,3052 
> > current_function_decl = NULL;
> > set_cfun (NULL);
> >   
> > -   /* Inform the middle end about the global variables we have seen.  */
> > -   FOR_EACH_VEC_ELT (*lto_global_var_decls, i, decl)
> > - rest_of_decl_compilation (decl, 1, 0);
> > - 
> > if (!quiet_flag)
> >   fprintf (stderr, "\n");
> >   
> > --- 3039,3044 
> > *** lto_main (void)
> > *** 3309,3316 
> > do_whole_program_analysis ();
> > else
> > {
> > - varpool_node *vnode;
> > - 
> >   timevar_start (TV_PHASE_OPT_GEN);
> >   
> >   materialize_cgraph ();
> > --- 3301,3306 
> > *** lto_main (void)
> > *** 3330,3339 
> >  this.  */
> >   if (flag_lto_report || (flag_wpa && flag_lto_report_wpa))
> > print_lto_report_1 ();
> > - 
> > - /* Record the global variables.  */
> > - FOR_EACH_DEF

Re: PR ipa/59831 (ipa-cp devirt issues)

2014-02-05 Thread Jan Hubicka

> > 
> 
> I think it would be better, yes.  IIRC, We want to re-organize
> indirect_info anyway (I vaguely remember we want to split the
> overloaded offset field into two but forgot the exact reason why but I
> have it written somewhere), I suppose we'll be turning it into a union
> (or class hierarchy?) and this would make it slightly awkward.  If we
> ever support the pointer-by-reference scenario, quite a few more
> places will need to be adjusted anyway.

OK, I will remove the check. I wanted to split polymorphic call context
and its propagation from aggregate analysis.  For example

struct A
{
  struct B b;
  struct C c;
}

when you call method of A.c.foo() its context is not
{outer_type=a,offset=offsetof(c)}, since A is not polymorphic type at all. We
should instead use {outer_type=c,offset=0,not_derived_type}.  In
ipa-devirt there is function get_class_context that gets you from first to
second.

So as briefly discussed last july, I think ipa-prop can simply do two
propagations in parallel where the type one is done via functionality that
willbe exported from ipa-devirt, since the logic should be shared with
a local devirt pass.

There are also cases where the parameter is KNOWN_TYPE but also PASS_THROUGH at
the same type (as tings passed by invisible refernece, or when the function is
constructor and it initialized dynamic type of THIS pointer).

Those are cases where current code is losing information because one is not
supperset of the other.  We want to use KNOWN_TYPE information for 
devirtualization,
but we do not want to forget about PASS_THROUGH for normal constant propagation 
in
case constructor is only called on &decl.

Honza
> 
> Thanks,
> 
> Martin

Re: [RS6000] SDmode and reload

2014-02-05 Thread David Edelsohn

On Wed, Feb 5, 2014 at 8:27 AM, Alan Modra  wrote:

> Vlad added rs6000_secondary_memory_needed_mode for lra, and in so
> doing broke reload's use of cfun->machine->sdmode_stack_slot in
> rs6000_secondary_memory_needed_rtx (mode was no longer SDmode there).
> Bootstrapped and regression tested powerpc64-linux, both with and
> without -mlra.  OK to commit?
>
> PR target/60032
> * config/rs6000/rs6000.c (rs6000_secondary_memory_needed_mode): Only
> change SDmode to DDmode when lra_in_progress.
>
> * gcc.target/powerpc/pr60032.c: New.

So Vlad added the definition to the rs6000 port, but it only is needed
by LRA and does not interact correctly with classic reload.

Okay.

- David

Re: var-tracking and s390's LM(G)

2014-02-05 Thread Richard Sandiford

Jakub Jelinek  writes:
> On Tue, Feb 04, 2014 at 12:21:14PM +, Richard Sandiford wrote:
>> Jakub Jelinek  writes:
>> >> But then we wouldn't be able to use var-tracking when __builtin_eh_return
>> >> is used, since in that case replacing the (set (reg ...) (mem ...))
>> >> with a (plus ...) would be incorrect -- the value we're loading from the
>> >> stack will have had a variable adjustment applied.  And I know from 
>> >> painful
>> >> experience that being able to debug the unwind code is very useful. :-)
>> >
>> > Aren't functions using EH_RETURN typically using frame pointer?
>> 
>> Sorry, forgot to respond to this bit.  On s390, _Unwind_RaiseException and
>> _Unwind_ForcedUnwind don't use a frame pointer, at least not on trunk.
>> %r11 is used as a general scratch register instead.  E.g.:
>> 
>> 2ba8 <_Unwind_ForcedUnwind>:
>> 2ba8:   90 6f f0 18 stm %r6,%r15,24(%r15)
>> 2bac:   0d d0   basr%r13,%r0
>> 2bae:   60 40 f0 50 std %f4,80(%r15)
>> 2bb2:   60 60 f0 58 std %f6,88(%r15)
>> 2bb6:   a7 fa fd f8 ahi %r15,-520
>> 2bba:   58 c0 d0 9e l   %r12,158(%r13)
>> 2bbe:   58 10 d0 9a l   %r1,154(%r13)
>> 2bc2:   18 b2   lr  %r11,%r2
>> ...
>> 2c10:   98 6f f2 20 lm  %r6,%r15,544(%r15)
>> 2c14:   07 f4   br  %r4
>
> Looking at pr54693-2.c -O2 -g -m64 -march=z196 (was that the one you meant)
> with your patches, I see:
> (insn/f:TI 122 30 31 4 (parallel [
> (set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
> (const_int 80 [0x50])) [3  S8 A8])
> (reg:DI 10 %r10))
> (set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
> (const_int 88 [0x58])) [3  S8 A8])
> (reg:DI 11 %r11))
> (set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
> (const_int 96 [0x60])) [3  S8 A8])
> (reg:DI 12 %r12))
> (set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
> (const_int 104 [0x68])) [3  S8 A8])
> (reg:DI 13 %r13))
> (set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
> (const_int 112 [0x70])) [3  S8 A8])
> (reg:DI 14 %r14))
> (set/f (mem:DI (plus:DI (reg/f:DI 15 %r15)
> (const_int 120 [0x78])) [3  S8 A8])
> (reg/f:DI 15 %r15))
> ]) pr54693-2.c:16 94 {*store_multiple_di}
>  (expr_list:REG_DEAD (reg:DI 14 %r14)
> (expr_list:REG_DEAD (reg:DI 13 %r13)
> (expr_list:REG_DEAD (reg:DI 12 %r12)
> (expr_list:REG_DEAD (reg:DI 11 %r11)
> (expr_list:REG_DEAD (reg:DI 10 %r10)
> (nil)))
> ...
> (insn/f 123 31 124 4 (set (reg/f:DI 15 %r15)
> (plus:DI (reg/f:DI 15 %r15)
> (const_int -160 [0xff60]))) pr54693-2.c:16 65 {*la_64}
>  (expr_list:REG_FRAME_RELATED_EXPR (set (reg/f:DI 15 %r15)
> (plus:DI (reg/f:DI 15 %r15)
> (const_int -160 [0xff60])))
> (nil)))
> ...
> (insn/f:TI 135 134 136 7 (parallel [
> (set (reg:DI 10 %r10)
> (mem:DI (plus:DI (reg/f:DI 15 %r15)
> (const_int 240 [0xf0])) [3  S8 A8]))
> (set (reg:DI 11 %r11)
> (mem:DI (plus:DI (reg/f:DI 15 %r15)
> (const_int 248 [0xf8])) [3  S8 A8]))
> (set (reg:DI 12 %r12)
> (mem:DI (plus:DI (reg/f:DI 15 %r15)
> (const_int 256 [0x100])) [3  S8 A8]))
> (set (reg:DI 13 %r13)
> (mem:DI (plus:DI (reg/f:DI 15 %r15)
> (const_int 264 [0x108])) [3  S8 A8]))
> (set (reg:DI 14 %r14)
> (mem:DI (plus:DI (reg/f:DI 15 %r15)
> (const_int 272 [0x110])) [3  S8 A8]))
> (set (reg/f:DI 15 %r15)
> (mem:DI (plus:DI (reg/f:DI 15 %r15)
> (const_int 280 [0x118])) [3  S8 A8]))
> ]) pr54693-2.c:25 92 {*load_multiple_di}
>  (expr_list:REG_CFA_DEF_CFA (plus:DI (reg/f:DI 15 %r15)
> (const_int 160 [0xa0]))
> (expr_list:REG_CFA_RESTORE (reg/f:DI 15 %r15)
> (expr_list:REG_CFA_RESTORE (reg:DI 14 %r14)
> (expr_list:REG_CFA_RESTORE (reg:DI 13 %r13)
> (expr_list:REG_CFA_RESTORE (reg:DI 12 %r12)
> (expr_list:REG_CFA_RESTORE (reg:DI 11 %r11)
> (expr_list:REG_CFA_RESTORE (reg:DI 10 %r10)
> (nil)
> In a function that doesn't need frame pointer, I'd say this is a serious
> bloat of the unwind info, you are saving/restoring %r15 not because you have
> to, but just that i

Re: Fix bootstrap with -mno-accumulate-outgoing-args

2014-02-05 Thread Richard Henderson

On 02/04/2014 09:35 AM, Jan Hubicka wrote:
>> On 02/04/2014 08:48 AM, Jan Hubicka wrote:
>>> How things are supposed to work in this case? So perhaps we need scheduler 
>>> to
>>> understand and move around the ARGS_SIZE note?
>>
>> I believe we do need to have the scheduler move the notes around.
> 
> If we need notes on non-stack adjusting insns as you seem to show in your 
> testcase,
> I guess it is the only way around.  Still combine stack adjust may be smart 
> enough
> to not produce redundant note in the case of go's ICE.
> 
> I am not terribly familiar with the code, will you look into it or shall I 
> give it a try?

I had a brief look at find_modifiable_mems, which seems to be the only kind of
schedule-time dependency breaking that we do.  I started writing some new code
that would handle REG_ARGS_SIZE, but then considered that wasn't terribly
appropriate for stage4.

So I've left off with just the dependency links between the insns.  We'd need
these for the schedule-time adjustments anyway, and with them in place the
scheduler does not re-order the insns in a way that causes problems.

Testing for x86_64 has finished; i686 is nearly done.  I suppose the only
question I have is if anyone disagrees that OUTPUT is incorrect as the
dependency type.

r~

Re: Fix bootstrap with -mno-accumulate-outgoing-args

2014-02-05 Thread Richard Henderson

This time with the patch...

On 02/05/2014 07:40 AM, Richard Henderson wrote:
> On 02/04/2014 09:35 AM, Jan Hubicka wrote:
>>> On 02/04/2014 08:48 AM, Jan Hubicka wrote:
 How things are supposed to work in this case? So perhaps we need scheduler 
 to
 understand and move around the ARGS_SIZE note?
>>>
>>> I believe we do need to have the scheduler move the notes around.
>>
>> If we need notes on non-stack adjusting insns as you seem to show in your 
>> testcase,
>> I guess it is the only way around.  Still combine stack adjust may be smart 
>> enough
>> to not produce redundant note in the case of go's ICE.
>>
>> I am not terribly familiar with the code, will you look into it or shall I 
>> give it a try?
> 
> I had a brief look at find_modifiable_mems, which seems to be the only kind of
> schedule-time dependency breaking that we do.  I started writing some new code
> that would handle REG_ARGS_SIZE, but then considered that wasn't terribly
> appropriate for stage4.
> 
> So I've left off with just the dependency links between the insns.  We'd need
> these for the schedule-time adjustments anyway, and with them in place the
> scheduler does not re-order the insns in a way that causes problems.
> 
> Testing for x86_64 has finished; i686 is nearly done.  I suppose the only
> question I have is if anyone disagrees that OUTPUT is incorrect as the
> dependency type.
> 
> 
> r~
> 

* combine-stack-adj.c: Revert r206943.
* sched-int.h (struct deps_desc): Add last_args_size.
* sched-deps.c (init_deps): Initialize it.
(sched_analyze_insn): Add OUTPUT dependencies between insns that
contain REG_ARGS_SIZE notes.


diff --git a/gcc/combine-stack-adj.c b/gcc/combine-stack-adj.c
index c591c60..69fd5ea 100644
--- a/gcc/combine-stack-adj.c
+++ b/gcc/combine-stack-adj.c
@@ -567,7 +567,6 @@ combine_stack_adjustments_for_block (basic_block bb)
  && try_apply_stack_adjustment (insn, reflist, 0,
 -last_sp_adjust))
{
- rtx note;
  if (last2_sp_set)
maybe_move_args_size_note (last2_sp_set, last_sp_set, false);
  else
@@ -577,11 +576,6 @@ combine_stack_adjustments_for_block (basic_block bb)
  reflist = NULL;
  last_sp_set = NULL_RTX;
  last_sp_adjust = 0;
- /* We no longer adjust stack size.  Whoever adjusted it earlier
-hopefully got the note right.  */
- note = find_reg_note (insn, REG_ARGS_SIZE, NULL_RTX);
- if (note)
-   remove_note (insn, note);
  continue;
}
}
diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c
index 7efc937..efc4223 100644
--- a/gcc/sched-deps.c
+++ b/gcc/sched-deps.c
@@ -3470,6 +3470,15 @@ sched_analyze_insn (struct deps_desc *deps, rtx x, rtx 
insn)
 change_spec_dep_to_hard (sd_it);
 }
 }
+
+  /* We do not yet have code to adjust REG_ARGS_SIZE, therefore we must
+ honor their original ordering.  */
+  if (find_reg_note (insn, REG_ARGS_SIZE, NULL))
+{
+  if (deps->last_args_size)
+   add_dependence (insn, deps->last_args_size, REG_DEP_OUTPUT);
+  deps->last_args_size = insn;
+}
 }
 
 /* Return TRUE if INSN might not always return normally (e.g. call exit,
@@ -3876,6 +3885,7 @@ init_deps (struct deps_desc *deps, bool lazy_reg_last)
   deps->sched_before_next_jump = 0;
   deps->in_post_call_group_p = not_post_call;
   deps->last_debug_insn = 0;
+  deps->last_args_size = 0;
   deps->last_reg_pending_barrier = NOT_A_BARRIER;
   deps->readonly = 0;
 }
diff --git a/gcc/sched-int.h b/gcc/sched-int.h
index 3b1106f..2cec624 100644
--- a/gcc/sched-int.h
+++ b/gcc/sched-int.h
@@ -539,6 +539,9 @@ struct deps_desc
   /* The last debug insn we've seen.  */
   rtx last_debug_insn;
 
+  /* The last insn bearing REG_ARGS_SIZE that we've seen.  */
+  rtx last_args_size;
+
   /* The maximum register number for the following arrays.  Before reload
  this is max_reg_num; after reload it is FIRST_PSEUDO_REGISTER.  */
   int max_reg;

Re: Fix bootstrap with -mno-accumulate-outgoing-args

2014-02-05 Thread Jan Hubicka

> > I had a brief look at find_modifiable_mems, which seems to be the only kind 
> > of
> > schedule-time dependency breaking that we do.  I started writing some new 
> > code
> > that would handle REG_ARGS_SIZE, but then considered that wasn't terribly
> > appropriate for stage4.
> > 
> > So I've left off with just the dependency links between the insns.  We'd 
> > need
> > these for the schedule-time adjustments anyway, and with them in place the
> > scheduler does not re-order the insns in a way that causes problems.
> > 
> > Testing for x86_64 has finished; i686 is nearly done.  I suppose the only
> > question I have is if anyone disagrees that OUTPUT is incorrect as the
> > dependency type.

>From i386  perspective, this seems like resonable compromise for now.
I did some re-testing of scheduler recently for generic.
Call sequences is one of few places where scheduling matters (in addition to
tight loops), but I do not think we will lose measurable % of perofmrance by
disabling scheduling here.

Thanks!
Honza
> > 
> > 
> > r~
> > 
> 

>   * combine-stack-adj.c: Revert r206943.
>   * sched-int.h (struct deps_desc): Add last_args_size.
>   * sched-deps.c (init_deps): Initialize it.
>   (sched_analyze_insn): Add OUTPUT dependencies between insns that
>   contain REG_ARGS_SIZE notes.
> 
> 
> diff --git a/gcc/combine-stack-adj.c b/gcc/combine-stack-adj.c
> index c591c60..69fd5ea 100644
> --- a/gcc/combine-stack-adj.c
> +++ b/gcc/combine-stack-adj.c
> @@ -567,7 +567,6 @@ combine_stack_adjustments_for_block (basic_block bb)
> && try_apply_stack_adjustment (insn, reflist, 0,
>-last_sp_adjust))
>   {
> -   rtx note;
> if (last2_sp_set)
>   maybe_move_args_size_note (last2_sp_set, last_sp_set, false);
> else
> @@ -577,11 +576,6 @@ combine_stack_adjustments_for_block (basic_block bb)
> reflist = NULL;
> last_sp_set = NULL_RTX;
> last_sp_adjust = 0;
> -   /* We no longer adjust stack size.  Whoever adjusted it earlier
> -  hopefully got the note right.  */
> -   note = find_reg_note (insn, REG_ARGS_SIZE, NULL_RTX);
> -   if (note)
> - remove_note (insn, note);
> continue;
>   }
>   }
> diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c
> index 7efc937..efc4223 100644
> --- a/gcc/sched-deps.c
> +++ b/gcc/sched-deps.c
> @@ -3470,6 +3470,15 @@ sched_analyze_insn (struct deps_desc *deps, rtx x, rtx 
> insn)
>  change_spec_dep_to_hard (sd_it);
>  }
>  }
> +
> +  /* We do not yet have code to adjust REG_ARGS_SIZE, therefore we must
> + honor their original ordering.  */
> +  if (find_reg_note (insn, REG_ARGS_SIZE, NULL))
> +{
> +  if (deps->last_args_size)
> + add_dependence (insn, deps->last_args_size, REG_DEP_OUTPUT);
> +  deps->last_args_size = insn;
> +}
>  }
>  
>  /* Return TRUE if INSN might not always return normally (e.g. call exit,
> @@ -3876,6 +3885,7 @@ init_deps (struct deps_desc *deps, bool lazy_reg_last)
>deps->sched_before_next_jump = 0;
>deps->in_post_call_group_p = not_post_call;
>deps->last_debug_insn = 0;
> +  deps->last_args_size = 0;
>deps->last_reg_pending_barrier = NOT_A_BARRIER;
>deps->readonly = 0;
>  }
> diff --git a/gcc/sched-int.h b/gcc/sched-int.h
> index 3b1106f..2cec624 100644
> --- a/gcc/sched-int.h
> +++ b/gcc/sched-int.h
> @@ -539,6 +539,9 @@ struct deps_desc
>/* The last debug insn we've seen.  */
>rtx last_debug_insn;
>  
> +  /* The last insn bearing REG_ARGS_SIZE that we've seen.  */
> +  rtx last_args_size;
> +
>/* The maximum register number for the following arrays.  Before reload
>   this is max_reg_num; after reload it is FIRST_PSEUDO_REGISTER.  */
>int max_reg;

Re: [PATCH] Remove lto_global_var_decls, simplify necessary analysis for PR60060

2014-02-05 Thread Jan Hubicka

> > 
> > Seems OK to me.  I always had an impression that this machinery exists to
> > make decls that are revmoed fro symbol table (optimized out) to land the
> > debug info, but I think this is mostly broken by partitioning anyway, right?
> > Do we stream them at all?
> 
> No, we don't stream those.  We'd need to stream what the FEs think
> are the globals (lang_hooks.getdecls()) and somehow partition them
> and ship them somewhere.

My old idea was to simply output debug for dead stuff from WPA and add it asone
of object files to linking later.
But indeed, this is for later time.

Honza

Re: var-tracking and s390's LM(G)

2014-02-05 Thread Jakub Jelinek

On Wed, Feb 05, 2014 at 03:32:22PM +, Richard Sandiford wrote:
> > In a function that doesn't need frame pointer, I'd say this is a serious
> > bloat of the unwind info, you are saving/restoring %r15 not because you have
> > to, but just that it seems to be cheaper for the target.  So, I'd say you
> > shouldn't emit DW_CFA_offset 15, 5 on the insn 122 in the prologue and
> > DW_CFA_restore 15 in the epilogue (twice in this case), that just waste
> > of .eh_frame/.debug_frame space.  I'd say you should represent in
> > this case the *store_multiple_di as REG_FRAME_RELATED_EXPR
> > with all but the last set in the PARALLEL, and on the *load_multiple_di
> > remove REG_CFA_RESTORE (r15) note and change REG_CFA_DEF_CFA note to
> > REG_CFA_ADJUST_CFA note which would say that stack pointer has been
> > incremented by 160 bytes (epilogue expansion knows that value).
> 
> I think at the moment we're relying on the "DW_CFA_offset 15"s to
> provide correct %r15 values on eh_return.  Every non-leaf function
> saves the stack pointer and the unwind routines track the %r15 save
> slots like they would track any call-saved register.  In other words,
> the final eh_return %r15 comes directly from a save slot, without any
> CFA-specific handling.  If some frames omit the CFI for the %r15 save
> then we'll just keep the save slot from the previous (inner) frame instead.

CCing Richard Henderson into the discussion.

> The new unwind routines would handle 4.9+ and pre-4.9 frames, but pre-4.9
> unwind routines wouldn't handle 4.9+ frames, so would that require a version
> bump on the unwind symbols?

No need to version anything IMHO.  We've never done that for any port
where we've started to emit something the old unwinder wouldn't cope with,
it is user's responsibility to use new enough libgcc_s.

BTW, what is the reason why %r15 is saved/restored from the stack at all say
for simple:
void
foo (void)
{
  int a = 6;
  asm volatile ("# %0" : "+m" (a));
}

Is that some ABI matter where it effectively always uses requires to use
some kind of frame pointer (which the saved previous stack pointer is)?
It can surely make debugging without unwind info easier, ditto backtraces,
on the other side it must have a runtime cost.

Jakub

[testsuite] Fix gcc.target/i386/avx512f-vrndscaless-2.c on Solaris 9/x86

2014-02-05 Thread Rainer Orth

gcc.target/i386/avx512f-vrndscaless-2.c currently FAILs on Solaris 9/x86
with gas:

FAIL: gcc.target/i386/avx512f-vrndscaless-2.c (test for excess errors)
Excess errors:
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.target/i386/avx512f-vrndscaless-2.
c:21:14: warning: incompatible implicit declaration of built-in function 'floorf
' [enabled by default]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.target/i386/avx512f-vrndscaless-2.
c:24:14: warning: incompatible implicit declaration of built-in function 'ceilf'
 [enabled by default]

The platform lacks C99 support, but this can easily be avoided by using
the builtins instead.  The following patch does just that; tested
with the appropriate runtest invocation on i386-pc-solaris2.9 and
x86_64-unknown-linux-gnu.

Ok for mainline?

Rainer


2014-02-05  Rainer Orth  

* gcc.target/i386/avx512f-vrndscaless-2.c (compute_rndscaless):
Use __builtin_floorf, __builtin_ceilf.

# HG changeset patch
# Parent c8a18ca98263f4a2ca4e3e723f9d2b4596b67207
Fix gcc.target/i386/avx512f-vrndscaless-2.c on Solaris 9/x86

diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vrndscaless-2.c b/gcc/testsuite/gcc.target/i386/avx512f-vrndscaless-2.c
--- a/gcc/testsuite/gcc.target/i386/avx512f-vrndscaless-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vrndscaless-2.c
@@ -18,10 +18,10 @@ compute_rndscaless (float *s1, float *s2
   switch (rc)
 {
 case _MM_FROUND_FLOOR:
-  r[0] = floorf (s2[0] * pow (2, m)) / pow (2, m);
+  r[0] = __builtin_floorf (s2[0] * pow (2, m)) / pow (2, m);
   break;
 case _MM_FROUND_CEIL:
-  r[0] = ceilf (s2[0] * pow (2, m)) / pow (2, m);
+  r[0] = __builtin_ceilf (s2[0] * pow (2, m)) / pow (2, m);
   break;
 default:
   abort ();

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [testsuite] Fix gcc.target/i386/avx512f-vrndscaless-2.c on Solaris 9/x86

2014-02-05 Thread Uros Bizjak

On Wed, Feb 5, 2014 at 4:50 PM, Rainer Orth  
wrote:
> gcc.target/i386/avx512f-vrndscaless-2.c currently FAILs on Solaris 9/x86
> with gas:
>
> FAIL: gcc.target/i386/avx512f-vrndscaless-2.c (test for excess errors)
> Excess errors:
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.target/i386/avx512f-vrndscaless-2.
> c:21:14: warning: incompatible implicit declaration of built-in function 
> 'floorf
> ' [enabled by default]
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.target/i386/avx512f-vrndscaless-2.
> c:24:14: warning: incompatible implicit declaration of built-in function 
> 'ceilf'
>  [enabled by default]
>
> The platform lacks C99 support, but this can easily be avoided by using
> the builtins instead.  The following patch does just that; tested
> with the appropriate runtest invocation on i386-pc-solaris2.9 and
> x86_64-unknown-linux-gnu.

Let's solve this in the way sse4_1-floorf-vec.c solves it and simply add

extern float floorf (float);

after math.h include.

Does this work for you?

Uros.

Re: [testsuite] Fix gcc.target/i386/avx512f-vrndscaless-2.c on Solaris 9/x86

2014-02-05 Thread Jakub Jelinek

On Wed, Feb 05, 2014 at 05:09:31PM +0100, Uros Bizjak wrote:
> On Wed, Feb 5, 2014 at 4:50 PM, Rainer Orth  
> wrote:
> > gcc.target/i386/avx512f-vrndscaless-2.c currently FAILs on Solaris 9/x86
> > with gas:
> >
> > FAIL: gcc.target/i386/avx512f-vrndscaless-2.c (test for excess errors)
> > Excess errors:
> > /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.target/i386/avx512f-vrndscaless-2.
> > c:21:14: warning: incompatible implicit declaration of built-in function 
> > 'floorf
> > ' [enabled by default]
> > /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.target/i386/avx512f-vrndscaless-2.
> > c:24:14: warning: incompatible implicit declaration of built-in function 
> > 'ceilf'
> >  [enabled by default]
> >
> > The platform lacks C99 support, but this can easily be avoided by using
> > the builtins instead.  The following patch does just that; tested
> > with the appropriate runtest invocation on i386-pc-solaris2.9 and
> > x86_64-unknown-linux-gnu.
> 
> Let's solve this in the way sse4_1-floorf-vec.c solves it and simply add
> 
> extern float floorf (float);

Won't that break if math.h defines floorf as a macro?
You'd at least need then
extern float (floorf) (float);

Jakub

Re: [testsuite] Fix gcc.target/i386/avx512f-vrndscaless-2.c on Solaris 9/x86

2014-02-05 Thread Uros Bizjak

On Wed, Feb 5, 2014 at 5:11 PM, Jakub Jelinek  wrote:

>> > gcc.target/i386/avx512f-vrndscaless-2.c currently FAILs on Solaris 9/x86
>> > with gas:
>> >
>> > FAIL: gcc.target/i386/avx512f-vrndscaless-2.c (test for excess errors)
>> > Excess errors:
>> > /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.target/i386/avx512f-vrndscaless-2.
>> > c:21:14: warning: incompatible implicit declaration of built-in function 
>> > 'floorf
>> > ' [enabled by default]
>> > /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.target/i386/avx512f-vrndscaless-2.
>> > c:24:14: warning: incompatible implicit declaration of built-in function 
>> > 'ceilf'
>> >  [enabled by default]
>> >
>> > The platform lacks C99 support, but this can easily be avoided by using
>> > the builtins instead.  The following patch does just that; tested
>> > with the appropriate runtest invocation on i386-pc-solaris2.9 and
>> > x86_64-unknown-linux-gnu.
>>
>> Let's solve this in the way sse4_1-floorf-vec.c solves it and simply add
>>
>> extern float floorf (float);
>
> Won't that break if math.h defines floorf as a macro?
> You'd at least need then
> extern float (floorf) (float);

Looks like using builtins should be the correct way, so the original
patch is OK.

Rainer, can you please add the same cure to sse4_1-floor*.vec tests?

Thanks,
Uros.

Re: [C++ Patch/RFC] PR 60047

2014-02-05 Thread Paolo Carlini


Hi,

On 02/04/2014 12:51 PM, Paolo Carlini wrote:

Hi,

thus I tried to have a look to this issue, while experiencing some 
weird problems with the debugger, which slowed me down a lot. I have 
been able to figure out that we don't seem to actively set constexpr_p 
to true anywhere in implicitly_declare_fn (besides the unrelated case 
of inheriting constructors), thus I'm wondering if it would make sense 
to simply do the below?!? (and revert my recent tweak)
I spent a little more time on this. Outside the case of inheriting 
constructors and the special case of expected_trivial default 
constructors (which are handled in a special conditional), we are 
normally assigning true to constexpr_p only in one place, that is close 
to the beginning of synthesized_method_walk:


  /* If that user-written default constructor would satisfy the
 requirements of a constexpr constructor (7.1.5), the
 implicitly-defined default constructor is constexpr.  */
  if (constexpr_p)
*constexpr_p = ctor_p;

then, for the testcase at issue it becomes false in the place which we 
already discussed for the ICE:


  if (vec_safe_is_empty (vbases))
/* No virtual bases to worry about.  */;
  else if (!assign_p)
{
  if (constexpr_p)
*constexpr_p = false;

Then it doesn't gets a chance to return true. Thus, given the more 
accurate analysis, I think that we either do what I quickly proposed 
yesterday, or alternately, the only option I can see, we change 
process_subob_fn to set it to true in some cases (but I note that it 
does never set to true trivial_p).


Thanks,
Paolo.

Re: [testsuite] Fix gcc.target/i386/avx512f-vrndscaless-2.c on Solaris 9/x86

2014-02-05 Thread Rainer Orth

Hi Uros,

> Let's solve this in the way sse4_1-floorf-vec.c solves it and simply add
>
> extern float floorf (float);
>
> after math.h include.
>
> Does this work for you?

it does, but only because the floorf call is optimized away at -O2.
While the Solaris 9 libm contains __floorf, there's no floorf, neither
in libm nor in headers.  The builtin route avoids those problems.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [testsuite] Fix gcc.target/i386/avx512f-vrndscaless-2.c on Solaris 9/x86

2014-02-05 Thread Rainer Orth

Uros Bizjak  writes:

> Looks like using builtins should be the correct way, so the original
> patch is OK.
>
> Rainer, can you please add the same cure to sse4_1-floor*.vec tests?

Sure, will do.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [testsuite] Fix gcc.target/i386/avx512f-vrndscaless-2.c on Solaris 9/x86

2014-02-05 Thread Rainer Orth

Rainer Orth  writes:

> Uros Bizjak  writes:
>
>> Looks like using builtins should be the correct way, so the original
>> patch is OK.
>>
>> Rainer, can you please add the same cure to sse4_1-floor*.vec tests?
>
> Sure, will do.

Here's what I committed after retesting on i386-pc-solaris2.9,
i386-pc-solaris2.11 and x86_64-unknown-linux-gnu.

Rainer


2014-02-05  Rainer Orth  

* gcc.target/i386/avx512f-vrndscaless-2.c (compute_rndscaless):
Use __builtin_floorf, __builtin_ceilf.
* gcc.target/i386/sse4_1-floorf-sfix-vec.c (floorf): Remove
declaration.
(TEST): Use __builtin_floorf.
* gcc.target/i386/sse4_1-floorf-vec.c: Likewise.

# HG changeset patch
# Parent c8a18ca98263f4a2ca4e3e723f9d2b4596b67207
Fix gcc.target/i386/avx512f-vrndscaless-2.c on Solaris 9/x86

diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vrndscaless-2.c b/gcc/testsuite/gcc.target/i386/avx512f-vrndscaless-2.c
--- a/gcc/testsuite/gcc.target/i386/avx512f-vrndscaless-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vrndscaless-2.c
@@ -18,10 +18,10 @@ compute_rndscaless (float *s1, float *s2
   switch (rc)
 {
 case _MM_FROUND_FLOOR:
-  r[0] = floorf (s2[0] * pow (2, m)) / pow (2, m);
+  r[0] = __builtin_floorf (s2[0] * pow (2, m)) / pow (2, m);
   break;
 case _MM_FROUND_CEIL:
-  r[0] = ceilf (s2[0] * pow (2, m)) / pow (2, m);
+  r[0] = __builtin_ceilf (s2[0] * pow (2, m)) / pow (2, m);
   break;
 default:
   abort ();
diff --git a/gcc/testsuite/gcc.target/i386/sse4_1-floorf-sfix-vec.c b/gcc/testsuite/gcc.target/i386/sse4_1-floorf-sfix-vec.c
--- a/gcc/testsuite/gcc.target/i386/sse4_1-floorf-sfix-vec.c
+++ b/gcc/testsuite/gcc.target/i386/sse4_1-floorf-sfix-vec.c
@@ -15,8 +15,6 @@
 
 #include 
 
-extern float floorf (float);
-
 #define NUM 64
 
 static void
@@ -53,10 +51,10 @@ TEST (void)
   init_src (a);
 
   for (i = 0; i < NUM; i++)
-r[i] = (int) floorf (a[i]);
+r[i] = (int) __builtin_floorf (a[i]);
 
   /* check results:  */
   for (i = 0; i < NUM; i++)
-if (r[i] != (int) floorf (a[i]))
+if (r[i] != (int) __builtin_floorf (a[i]))
   abort();
 }
diff --git a/gcc/testsuite/gcc.target/i386/sse4_1-floorf-vec.c b/gcc/testsuite/gcc.target/i386/sse4_1-floorf-vec.c
--- a/gcc/testsuite/gcc.target/i386/sse4_1-floorf-vec.c
+++ b/gcc/testsuite/gcc.target/i386/sse4_1-floorf-vec.c
@@ -15,8 +15,6 @@
 
 #include 
 
-extern float floorf (float);
-
 #define NUM 64
 
 static void
@@ -53,10 +51,10 @@ TEST (void)
   init_src (a);
 
   for (i = 0; i < NUM; i++)
-r[i] = floorf (a[i]);
+r[i] = __builtin_floorf (a[i]);
 
   /* check results:  */
   for (i = 0; i < NUM; i++)
-if (r[i] != floorf (a[i]))
+if (r[i] != __builtin_floorf (a[i]))
   abort();
 }

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: regression test issue

2014-02-05 Thread Paolo Carlini


Hi,

On 02/05/2014 06:29 AM, Iyer, Balaji V wrote:

Hello Everyone,
The following two Cilk Plus tests is timing out at -O1 in my x86_64 box 
(-O2, -O3  and -O0 works fine). These tests were working fine till revision 
r207047. Can someone please look at this? It looks like a middle-end/back-end 
issue.

WARNING: program timed out.
FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -O1 -fcilkplus execution test
WARNING: program timed out.
FAIL: c-c++-common/cilk-plus/CK/spawner_inline.c  -O1 execution test
Thanks. I'm wondering if we could open an high priority Bugzilla about 
that and for the time being avoid running the test at -O1, it is slowing 
down the testsuite and using a lot of cpu. Also, maybe you could figure 
out which specific change caused the regression and ping the appropriate 
people...


Thanks again,
Paolo.

RE: regression test issue

2014-02-05 Thread Iyer, Balaji V



> -Original Message-
> From: Paolo Carlini [mailto:paolo.carl...@oracle.com]
> Sent: Wednesday, February 5, 2014 11:53 AM
> To: Iyer, Balaji V; gcc-patches@gcc.gnu.org
> Subject: Re: regression test issue
> 
> Hi,
> 
> On 02/05/2014 06:29 AM, Iyer, Balaji V wrote:
> > Hello Everyone,
> > The following two Cilk Plus tests is timing out at -O1 in my x86_64 box
> (-O2, -O3  and -O0 works fine). These tests were working fine till revision
> r207047. Can someone please look at this? It looks like a middle-end/back-
> end issue.
> >
> > WARNING: program timed out.
> > FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -O1 -fcilkplus execution test
> > WARNING: program timed out.
> > FAIL: c-c++-common/cilk-plus/CK/spawner_inline.c  -O1 execution test
> Thanks. I'm wondering if we could open an high priority Bugzilla about that
> and for the time being avoid running the test at -O1, it is slowing down the
> testsuite and using a lot of cpu. Also, maybe you could figure out which
> specific change caused the regression and ping the appropriate people...

Hi Paolo,   
OK. Here is a patch that will do so:

Index: gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp
===
--- gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp(revision 207437)
+++ gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp(working copy)
@@ -64,12 +64,12 @@
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/AN/*.cc]] " -O3 
-ftree-vectorize -fcilkplus -g" " "

 if { [check_libcilkrts_available] } {
-dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " 
-O1 -fcilkplus" " "
+#dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " 
-O1 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " 
-O3 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " 
-g -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " 
-g -O2 -fcilkplus" " "

-dg-runtest [lsort [glob -nocomplain 
$srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O1" " "
+#dg-runtest [lsort [glob -nocomplain 
$srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O1" " "
 dg-runtest [lsort [glob -nocomplain 
$srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O3" " "
 dg-runtest [lsort [glob -nocomplain 
$srcdir/c-c++-common/cilk-plus/CK/*.c]] " -g" " "
 dg-runtest [lsort [glob -nocomplain 
$srcdir/c-c++-common/cilk-plus/CK/*.c]] " -g -O2" " "
===

Here is the ChangeLog entry:

2014-02-05  Balaji V. Iyer  

* g++.dg/cilk-plus/cilk-plus.exp: Commented out -O1 Cilk Keywords
test for the time-being.


Note: I am purposely commenting the line out and not deleting them because I 
want to get them to run in -O1 some time, and deleting them might cause us 
(atleast me :-). ) to forget about it.

I will go ahead and file a bugzilla report.

Thanks,

Balaji V. Iyer.


> 
> Thanks again,
> Paolo.

Re: var-tracking and s390's LM(G)

2014-02-05 Thread Richard Henderson

On 02/05/2014 07:58 AM, Jakub Jelinek wrote:
> On Wed, Feb 05, 2014 at 03:32:22PM +, Richard Sandiford wrote:
>> I think at the moment we're relying on the "DW_CFA_offset 15"s to
>> provide correct %r15 values on eh_return.  Every non-leaf function
>> saves the stack pointer and the unwind routines track the %r15 save
>> slots like they would track any call-saved register.  In other words,
>> the final eh_return %r15 comes directly from a save slot, without any
>> CFA-specific handling.  If some frames omit the CFI for the %r15 save
>> then we'll just keep the save slot from the previous (inner) frame instead.
> 
> CCing Richard Henderson into the discussion.

If the sp is saved for a given frame, we'll use that instead of the CFA when
unwinding to a previous frame.  That should be unaffected by whether or not sp
is saved for any particular frame.

There appears to be code in uw_install_context_1 to handle a mix of sp-saving
and non-sp-saving frames.  Given that s390 is pretty much the only target that
uses these paths, I suppose it could be broken.  But if it is, it would surely
be better to fix than just say "it doesn't work".

> BTW, what is the reason why %r15 is saved/restored from the stack at all say
> for simple:
> void
> foo (void)
> {
>   int a = 6;
>   asm volatile ("# %0" : "+m" (a));
> }
> 
> Is that some ABI matter where it effectively always uses requires to use
> some kind of frame pointer (which the saved previous stack pointer is)?
> It can surely make debugging without unwind info easier, ditto backtraces,
> on the other side it must have a runtime cost.

As far as I recall, the return address is in r14 so the ABI also saves r15 for
all non-leaf functions simply because it is practically free to do so with
store-multiple.  This creates the familiar ra/fp unwinding chain.  Given how
cheap it is to maintain this chain on s390, I see no particular call to break 
it.

As for why r15 gets saved for this leaf example... I have no idea.
Seems like a bug to me, frankly.

r~

RE: [PATCH RFC] MIPS add support for MIPS SIMD ARCHITECTURE V1.07

2014-02-05 Thread Graham Stott

Hi Richard,

Attached is an updated patch for  feedback so  MSA support to MIPS backend can 
be added when open again for next stage1.

It is unfinished in that some comments from your review of the initial patch 
have yet to be addressed.

The diff is against svn 207500 

Graham




msa.svn.207500.diffs.gz
Description: msa.svn.207500.diffs.gz

Re: RFA: MN10300: Include saved registers in stack usage

2014-02-05 Thread Joseph S. Myers

On Wed, 5 Feb 2014, Jeff Law wrote:

> However, it's pretty easy to avoid the headaches and just provide a popcount
> routine.   I don't think this code is at all performance critical (once per
> function being compiled), so a simple popcount should be sufficient.  No need
> for lookup tables IMHO.

hwint.[ch] provide popcount_hwi, if HOST_WIDE_INT is a suitable type.

(Personally I think there are lots of bits of standard integer 
manipulation functionality, such as popcount, that are common in processor 
instructions and compiler intrinsics, but could do with having bindings in 
ISO C to provide a standard way to access that functionality.  Such 
bindings might take the form of library functions / macros that compilers 
/ library headers can then optimize as appropriate; that seems most in the 
ISO C style rather than __builtin_*.  Even outside of ISO C, I suspect 
there's quite a bit in common between target-specific builtins for 
different targets where a target-independent builtin would be better - and 
for that matter, target-specific builtins where some form of standard 
binding already exists but isn't supported in GCC, or where the 
target-specific builtin duplicates something generic in GNU C.)

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: RFA: MN10300: Include saved registers in stack usage

2014-02-05 Thread Jeff Law


On 02/05/14 03:44, Nick Clifton wrote:

Hi Jeff, Hi Alex,

   Sorry - another MN10300 patch - this time for the stack size reported
   with -fstack-usage.  Currently the value includes the stack frame, but
   it does not take into account any registers that might have been
   pushed onto the stack.

   The patch below fixes this problem, but there is one thing that I am
   not sure about - is it OK to use __builtin_popcount() or should I be
   calling some other function ?
According to our coding conventions, the ability to build with something 
other than gcc is still desirable.  You could argue that you're unlikely 
to be bootstrapping on a mn103 with something other than GCC and if 
you're building a cross, you could start by first building gcc native.


However, it's pretty easy to avoid the headaches and just provide a 
popcount routine.   I don't think this code is at all performance 
critical (once per function being compiled), so a simple popcount should 
be sufficient.  No need for lookup tables IMHO.


jeff

RE: regression test issue

2014-02-05 Thread Iyer, Balaji V

Sorry, I forgot to put [PATCH] in the subject line. Is the patch below OK to 
install?

Thanks,

Balaji V. Iyer.

> -Original Message-
> From: Iyer, Balaji V
> Sent: Wednesday, February 5, 2014 12:02 PM
> To: 'Paolo Carlini'; gcc-patches@gcc.gnu.org
> Subject: RE: regression test issue
> 
> 
> 
> > -Original Message-
> > From: Paolo Carlini [mailto:paolo.carl...@oracle.com]
> > Sent: Wednesday, February 5, 2014 11:53 AM
> > To: Iyer, Balaji V; gcc-patches@gcc.gnu.org
> > Subject: Re: regression test issue
> >
> > Hi,
> >
> > On 02/05/2014 06:29 AM, Iyer, Balaji V wrote:
> > > Hello Everyone,
> > >   The following two Cilk Plus tests is timing out at -O1 in my x86_64
> > > box
> > (-O2, -O3  and -O0 works fine). These tests were working fine till
> > revision r207047. Can someone please look at this? It looks like a
> > middle-end/back- end issue.
> > >
> > > WARNING: program timed out.
> > > FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -O1 -fcilkplus execution
> > > test
> > > WARNING: program timed out.
> > > FAIL: c-c++-common/cilk-plus/CK/spawner_inline.c  -O1 execution test
> > Thanks. I'm wondering if we could open an high priority Bugzilla about
> > that and for the time being avoid running the test at -O1, it is
> > slowing down the testsuite and using a lot of cpu. Also, maybe you
> > could figure out which specific change caused the regression and ping the
> appropriate people...
> 
> Hi Paolo,
>   OK. Here is a patch that will do so:
> 
> Index: gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp
> ==
> =
> --- gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp(revision 207437)
> +++ gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp(working copy)
> @@ -64,12 +64,12 @@
>  dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/AN/*.cc]] " -O3 
> -
> ftree-vectorize -fcilkplus -g" " "
> 
>  if { [check_libcilkrts_available] } {
> -dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " 
> -
> O1 -fcilkplus" " "
> +#dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] 
> " -
> O1 -fcilkplus" " "
>  dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " 
> -O3
> -fcilkplus" " "
>  dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " 
> -g -
> fcilkplus" " "
>  dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " 
> -g -
> O2 -fcilkplus" " "
> 
> -dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-
> plus/CK/*.c]] " -O1" " "
> +#dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-
> plus/CK/*.c]] " -O1" " "
>  dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-
> plus/CK/*.c]] " -O3" " "
>  dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-
> plus/CK/*.c]] " -g" " "
>  dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-
> plus/CK/*.c]] " -g -O2" " "
> ==
> =
> 
> Here is the ChangeLog entry:
> 
> 2014-02-05  Balaji V. Iyer  
> 
> * g++.dg/cilk-plus/cilk-plus.exp: Commented out -O1 Cilk Keywords
> test for the time-being.
> 
> 
> Note: I am purposely commenting the line out and not deleting them
> because I want to get them to run in -O1 some time, and deleting them
> might cause us (atleast me :-). ) to forget about it.
> 
> I will go ahead and file a bugzilla report.
> 
> Thanks,
> 
> Balaji V. Iyer.
> 
> 
> >
> > Thanks again,
> > Paolo.

Re: regression test issue

2014-02-05 Thread Jakub Jelinek

On Wed, Feb 05, 2014 at 07:08:32PM +, Iyer, Balaji V wrote:
> Sorry, I forgot to put [PATCH] in the subject line. Is the patch below OK to 
> install?

No.  If you want to skip a particular test, you should use dg-skip-if for
that, see http://gcc.gnu.org/onlinedocs/gccint/Directives.html

Jakub

RE: regression test issue

2014-02-05 Thread Iyer, Balaji V



> -Original Message-
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Wednesday, February 5, 2014 2:25 PM
> To: Iyer, Balaji V
> Cc: Paolo Carlini; gcc-patches@gcc.gnu.org
> Subject: Re: regression test issue
> 
> On Wed, Feb 05, 2014 at 07:08:32PM +, Iyer, Balaji V wrote:
> > Sorry, I forgot to put [PATCH] in the subject line. Is the patch below OK to
> install?
> 
> No.  If you want to skip a particular test, you should use dg-skip-if for 
> that,
> see http://gcc.gnu.org/onlinedocs/gccint/Directives.html

Ok. Here is the fixed patch:

Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog (revision 207437)
+++ gcc/testsuite/ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2014-02-05  Balaji V. Iyer  
+
+   * g++.dg/cilk-plus/CK/catch_exc.cc: Disable test for -O1
+   * c-c++-common/cilk-plus/CK/spawner_inline.c: Likewise.
+
 2014-02-03  Marc Glisse  

PR c++/53017
Index: gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc
===
--- gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc  (revision 207437)
+++ gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc  (working copy)
@@ -1,6 +1,7 @@
 /* { dg-options "-fcilkplus" } */
 /* { dg-do run { target i?86-*-* x86_64-*-* arm*-*-* } } */
 /* { dg-options "-fcilkplus -lcilkrts" { target { i?86-*-* x86_64-*-* arm*-*-* 
} } } */
+/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */

 #include 
 #include 
Index: gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c
===
--- gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c(revision 
207437)
+++ gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c(working copy)
@@ -1,6 +1,7 @@
 /* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
 /* { dg-options "-fcilkplus" } */
 /* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */

OK to install?

Thanks,
Balaji V. Iyer.

> 
>   Jakub

Re: regression test issue

2014-02-05 Thread Jakub Jelinek

On Wed, Feb 05, 2014 at 07:38:11PM +, Iyer, Balaji V wrote:
> +2014-02-05  Balaji V. Iyer  
> +
> +   * g++.dg/cilk-plus/CK/catch_exc.cc: Disable test for -O1
> +   * c-c++-common/cilk-plus/CK/spawner_inline.c: Likewise.

Ok.

> --- gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc  (revision 207437)
> +++ gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc  (working copy)
> @@ -1,6 +1,7 @@
>  /* { dg-options "-fcilkplus" } */
>  /* { dg-do run { target i?86-*-* x86_64-*-* arm*-*-* } } */
>  /* { dg-options "-fcilkplus -lcilkrts" { target { i?86-*-* x86_64-*-* 
> arm*-*-* } } } */
> +/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */
> 
>  #include 
>  #include 
> Index: gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c
> ===
> --- gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c(revision 
> 207437)
> +++ gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c(working copy)
> @@ -1,6 +1,7 @@
>  /* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
>  /* { dg-options "-fcilkplus" } */
>  /* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } 
> */
> +/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */
> 
> OK to install?

Jakub

[PATCH] Fix ix86_function_regparm with optimize attribute (PR target/60062, take 2)

2014-02-05 Thread Jakub Jelinek

On Wed, Feb 05, 2014 at 02:20:05PM +0100, Richard Biener wrote:
> > Using !!optimize to determine if we should switch local ABI to regparm
> > convention isn't compatible with optimize attribute, as !!optimize is
> > whether the current function is being optimized, but for the ABI decisions
> > we actually need the caller and callee to agree on the calling convention.
> > 
> > Fixed by looking at callee's optimize settings all the time.
> > 
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> Seeing a 2nd variant of such code I'd rather have an abstraction
> for this ... optimize_fn ()?  opt_for_fn (fn, OPT_...)?

Here is an updated version of the patch, unfortunately this one regressed in
quite a lot of tests.  The problem is that by using opt_for_fn there
it also sets the failure reason for all functions at -O0 that don't have
the optimize attribute at all.  Now copy_forbidden is used by the inliner,
where we have to handle always_inline functions, other -O0 functions we likely
don't want to inline, be it because the cmd line options are -O0 or because
the function we are considering for inlining has optimize (0), and
for versioning, where I'd say we want to never version the -O0 functions.

So, where do we want to do that instead?  E.g. should it be e.g. in
tree_versionable_function_p directly and let the inliner (if it doesn't do
already) also treat optimize(0) functions that aren't always_inline as
noinline?

I guess even without this patch my PR60026 fix broke inlining of
__attribute__((always_inline, optimize (0))) functions.

2014-02-05  Jakub Jelinek  

PR target/60062
* tree.h (opts_for_fn): New inline function.
(opt_for_fn): Define.
* config/i386/i386.c (ix86_function_regparm): Use
opt_for_fn (decl, optimize) instead of optimize.
* tree-inline.c (copy_forbidden): Use opt_for_fn.

* gcc.c-torture/execute/pr60062.c: New test.
* gcc.c-torture/execute/pr60072.c: New test.

--- gcc/tree.h.jj   2014-01-20 19:16:29.0 +0100
+++ gcc/tree.h  2014-02-05 15:54:04.681904492 +0100
@@ -4470,6 +4470,20 @@ may_be_aliased (const_tree var)
  || TREE_ADDRESSABLE (var)));
 }
 
+/* Return pointer to optimization flags of FNDECL.  */
+static inline struct cl_optimization *
+opts_for_fn (const_tree fndecl)
+{
+  tree fn_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl);
+  if (fn_opts == NULL_TREE)
+fn_opts = optimization_default_node;
+  return TREE_OPTIMIZATION (fn_opts);
+}
+
+/* opt flag for function FNDECL, e.g. opts_for_fn (fndecl, optimize) is
+   the optimization level of function fndecl.  */
+#define opt_for_fn(fndecl, opt) (opts_for_fn (fndecl)->x_##opt)
+
 /* For anonymous aggregate types, we need some sort of name to
hold on to.  In practice, this should not appear, but it should
not be harmful if it does.  */
--- gcc/config/i386/i386.c.jj   2014-02-05 10:37:36.166167116 +0100
+++ gcc/config/i386/i386.c  2014-02-05 15:56:36.779146394 +0100
@@ -5608,7 +5608,12 @@ ix86_function_regparm (const_tree type,
   /* Use register calling convention for local functions when possible.  */
   if (decl
   && TREE_CODE (decl) == FUNCTION_DECL
-  && optimize
+  /* Caller and callee must agree on the calling convention, so
+checking here just optimize means that with
+__attribute__((optimize (...))) caller could use regparm convention
+and callee not, or vice versa.  Instead look at whether the callee
+is optimized or not.  */
+  && opt_for_fn (decl, optimize)
   && !(profile_flag && !flag_fentry))
 {
   /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
--- gcc/tree-inline.c.jj2014-02-04 14:03:31.0 +0100
+++ gcc/tree-inline.c   2014-02-05 15:55:34.747448968 +0100
@@ -3315,16 +3315,10 @@ copy_forbidden (struct function *fun, tr
goto fail;
   }
 
-  tree fs_opts;
-  fs_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fun->decl);
-  if (fs_opts)
+  if (!opt_for_fn (fun->decl, optimize))
 {
-  struct cl_optimization *os = TREE_OPTIMIZATION (fs_opts);
-  if (!os->x_optimize)
-   {
- reason = G_("function %q+F compiled without optimizations");
- goto fail;
-   }
+  reason = G_("function %q+F compiled without optimizations");
+  goto fail;
 }
 
  fail:
--- gcc/testsuite/gcc.c-torture/execute/pr60062.c.jj2014-02-05 
15:55:48.639388697 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr60062.c   2014-02-05 
15:55:48.639388697 +0100
@@ -0,0 +1,25 @@
+/* PR target/60062 */
+
+int a;
+
+static void
+foo (const char *p1, int p2)
+{
+  if (__builtin_strcmp (p1, "hello") != 0)
+__builtin_abort ();
+}
+
+static void
+bar (const char *p1)
+{
+  if (__builtin_strcmp (p1, "hello") != 0)
+__builtin_abort ();
+}
+
+__attribute__((optimize (0))) int
+main ()
+{
+  foo ("hello", a);
+  bar ("hello");
+  return 0;
+}
--- gcc/testsuite/gcc.c-torture/execute/p

RE: regression test issue

2014-02-05 Thread Iyer, Balaji V



> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Jakub Jelinek
> Sent: Wednesday, February 5, 2014 2:43 PM
> To: Iyer, Balaji V
> Cc: Paolo Carlini; gcc-patches@gcc.gnu.org
> Subject: Re: regression test issue
> 
> On Wed, Feb 05, 2014 at 07:38:11PM +, Iyer, Balaji V wrote:
> > +2014-02-05  Balaji V. Iyer  
> > +
> > +   * g++.dg/cilk-plus/CK/catch_exc.cc: Disable test for -O1
> > +   * c-c++-common/cilk-plus/CK/spawner_inline.c: Likewise.
> 
> Ok.
> 

Committed.

-Balaji V. Iyer.


> > --- gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc  (revision 207437)
> > +++ gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc  (working copy)
> > @@ -1,6 +1,7 @@
> >  /* { dg-options "-fcilkplus" } */
> >  /* { dg-do run { target i?86-*-* x86_64-*-* arm*-*-* } } */
> >  /* { dg-options "-fcilkplus -lcilkrts" { target { i?86-*-* x86_64-*-* 
> > arm*-*-* }
> } } */
> > +/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */
> >
> >  #include 
> >  #include 
> > Index: gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c
> >
> ==
> =
> > --- gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c(revision
> 207437)
> > +++ gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c(working
> copy)
> > @@ -1,6 +1,7 @@
> >  /* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
> >  /* { dg-options "-fcilkplus" } */
> >  /* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } 
> > } */
> > +/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */
> >
> > OK to install?
> 
>   Jakub

Re: var-tracking and s390's LM(G)

2014-02-05 Thread Richard Sandiford

Richard Henderson  writes:
> On 02/05/2014 07:58 AM, Jakub Jelinek wrote:
>> On Wed, Feb 05, 2014 at 03:32:22PM +, Richard Sandiford wrote:
>>> I think at the moment we're relying on the "DW_CFA_offset 15"s to
>>> provide correct %r15 values on eh_return.  Every non-leaf function
>>> saves the stack pointer and the unwind routines track the %r15 save
>>> slots like they would track any call-saved register.  In other words,
>>> the final eh_return %r15 comes directly from a save slot, without any
>>> CFA-specific handling.  If some frames omit the CFI for the %r15 save
>>> then we'll just keep the save slot from the previous (inner) frame instead.
>> 
>> CCing Richard Henderson into the discussion.
>
> If the sp is saved for a given frame, we'll use that instead of the CFA when
> unwinding to a previous frame.  That should be unaffected by whether or not sp
> is saved for any particular frame.
>
> There appears to be code in uw_install_context_1 to handle a mix of sp-saving
> and non-sp-saving frames.  Given that s390 is pretty much the only target that
> uses these paths, I suppose it could be broken.  But if it is, it would surely
> be better to fix than just say "it doesn't work".

OK.  When I was looking at it earlier today, the reason it didn't work
seemed to be that if %r15 wasn't saved for a particular frame, the context
would inherit the %r15 save slot for the previous frame, just like when
some functions use and save a particular call-saved register and some don't.
So we would only get to uw_install_context_1 without a %r15 slot if none
of the frames had one.

In other words, the reason seemed to be that the _Unwind_SetGRPtr in:

#ifdef EH_RETURN_STACKADJ_RTX
  /* Special handling here: Many machines do not use a frame pointer,
 and track the CFA only through offsets from the stack pointer from
 one frame to the next.  In this case, the stack pointer is never
 stored, so it has no saved address in the context.  What we do
 have is the CFA from the previous stack frame.

 In very special situations (such as unwind info for signal return),
 there may be location expressions that use the stack pointer as well.

 Do this conditionally for one frame.  This allows the unwind info
 for one frame to save a copy of the stack pointer from the previous
 frame, and be able to use much easier CFA mechanisms to do it.
 Always zap the saved stack pointer value for the next frame; carrying
 the value over from one frame to another doesn't make sense.  */

  _Unwind_SpTmp tmp_sp;

  if (!_Unwind_GetGRPtr (&orig_context, __builtin_dwarf_sp_column ()))
_Unwind_SetSpColumn (&orig_context, context->cfa, &tmp_sp);
  _Unwind_SetGRPtr (context, __builtin_dwarf_sp_column (), NULL);
#endif

was conditional on EH_RETURN_STACKADJ_RTX being defined.  It looked at
first glance like things would work if that call was moved outside,
so that we never inherit save slots for the stack pointer.  Does that
make sense?  I can give it a go tomorrow if so.

Thanks,
Richard

[PATCH] Remove unreachable return stmt (PR c/53123)

2014-02-05 Thread Marek Polacek

This is not a regression, on the other hand it's so obvious that
it could go in nevertheless.  Ok?

2014-02-05  Marek Polacek  

PR c/53123
c-family/
* c-omp.c (c_finish_omp_atomic): Remove unreachable return
statement.

diff --git gcc/c-family/c-omp.c gcc/c-family/c-omp.c
index 4ce51e4..dd0a45d 100644
--- gcc/c-family/c-omp.c
+++ gcc/c-family/c-omp.c
@@ -183,7 +183,6 @@ c_finish_omp_atomic (location_t loc, enum tree_code code,
   OMP_ATOMIC_SEQ_CST (x) = seq_cst;
   return build_modify_expr (loc, v, NULL_TREE, NOP_EXPR,
loc, x, NULL_TREE);
-  return x;
 }
 
   /* There are lots of warnings, errors, and conversions that need to happen

Marek

Re: [PATCH] Remove unreachable return stmt (PR c/53123)

2014-02-05 Thread Jakub Jelinek

On Wed, Feb 05, 2014 at 09:58:32PM +0100, Marek Polacek wrote:
> This is not a regression, on the other hand it's so obvious that
> it could go in nevertheless.  Ok?
> 
> 2014-02-05  Marek Polacek  
> 
>   PR c/53123
> c-family/
>   * c-omp.c (c_finish_omp_atomic): Remove unreachable return
>   statement.

Ok, thanks.
> --- gcc/c-family/c-omp.c
> +++ gcc/c-family/c-omp.c
> @@ -183,7 +183,6 @@ c_finish_omp_atomic (location_t loc, enum tree_code code,
>OMP_ATOMIC_SEQ_CST (x) = seq_cst;
>return build_modify_expr (loc, v, NULL_TREE, NOP_EXPR,
>   loc, x, NULL_TREE);
> -  return x;
>  }
>  
>/* There are lots of warnings, errors, and conversions that need to happen

Jakub

Re: regression test issue

2014-02-05 Thread Rainer Orth

"Iyer, Balaji V"  writes:

> Index: gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc
> ===
> --- gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc  (revision 207437)
> +++ gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc  (working copy)
> @@ -1,6 +1,7 @@
>  /* { dg-options "-fcilkplus" } */
>  /* { dg-do run { target i?86-*-* x86_64-*-* arm*-*-* } } */
>  /* { dg-options "-fcilkplus -lcilkrts" { target { i?86-*-* x86_64-*-* 
> arm*-*-* } } } */
> +/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */
>
>  #include 
>  #include 
> Index: gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c
> ===
> --- gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c(revision 
> 207437)
> +++ gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c(working copy)
> @@ -1,6 +1,7 @@
>  /* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
>  /* { dg-options "-fcilkplus" } */
>  /* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } 
> */
> +/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */

You should put a comment (e.g. a PR reference) in the comment field to
explain why you're skipping this test.  Most likely, the last arg to
dg-skip-if ({ "" }) is optional; if so, please omit it.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: var-tracking and s390's LM(G)

2014-02-05 Thread Ulrich Weigand

Richard Sandiford wrote:

> In other words, the reason seemed to be that the _Unwind_SetGRPtr in:
> 
> #ifdef EH_RETURN_STACKADJ_RTX
>   /* Special handling here: Many machines do not use a frame pointer,
>  and track the CFA only through offsets from the stack pointer from
>  one frame to the next.  In this case, the stack pointer is never
>  stored, so it has no saved address in the context.  What we do
>  have is the CFA from the previous stack frame.
> 
>  In very special situations (such as unwind info for signal return),
>  there may be location expressions that use the stack pointer as well.
> 
>  Do this conditionally for one frame.  This allows the unwind info
>  for one frame to save a copy of the stack pointer from the previous
>  frame, and be able to use much easier CFA mechanisms to do it.
>  Always zap the saved stack pointer value for the next frame; carrying
>  the value over from one frame to another doesn't make sense.  */
> 
>   _Unwind_SpTmp tmp_sp;
> 
>   if (!_Unwind_GetGRPtr (&orig_context, __builtin_dwarf_sp_column ()))
> _Unwind_SetSpColumn (&orig_context, context->cfa, &tmp_sp);
>   _Unwind_SetGRPtr (context, __builtin_dwarf_sp_column (), NULL);
> #endif
> 
> was conditional on EH_RETURN_STACKADJ_RTX being defined.  It looked at
> first glance like things would work if that call was moved outside,
> so that we never inherit save slots for the stack pointer.  Does that
> make sense?  I can give it a go tomorrow if so.

I haven't looked into this in detail right now, but I recall that I had
to specifically add that #ifdef a long time ago since unwinding wouldn't
work correctly on s390 otherwise:
http://gcc.gnu.org/ml/gcc-patches/2003-05/msg00904.html

And the background of the bug here:
http://gcc.gnu.org/ml/gcc/2003-05/msg00536.html

Actually, now I think the problem originally described there is still
valid: on s390 the CFA is *not* equal to the value at function entry,
but biased by 96/160 bytes.  So setting the SP to the CFA is wrong ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com

Re: [C++ Patch/RFC] PR 60047

2014-02-05 Thread Jason Merrill


On 02/05/2014 11:19 AM, Paolo Carlini wrote:

   if (vec_safe_is_empty (vbases))
 /* No virtual bases to worry about.  */;
   else if (!assign_p)
 {
   if (constexpr_p)
 *constexpr_p = false;


*constexpr_p should be false for a constructor of a class with virtual 
bases, according to the standard (7.1.5p4):


The definition of a constexpr constructor shall satisfy the following 
constraints:

— the class shall not have any virtual base classes;
...

So the assert in implicit_declare_fn is wrong.  I guess I would fix it 
by checking CLASSTYPE_VBASECLASSES.


Jason

Re: var-tracking and s390's LM(G)

2014-02-05 Thread Ulrich Weigand

I wrote and forgot to proof-read:

> Actually, now I think the problem originally described there is still
> valid: on s390 the CFA is *not* equal to the value at function entry,

... CFA is not equal to the *SP* value at function entry ...

> but biased by 96/160 bytes.  So setting the SP to the CFA is wrong ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com

RE: regression test issue

2014-02-05 Thread Iyer, Balaji V



> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Rainer Orth
> Sent: Wednesday, February 5, 2014 4:22 PM
> To: Iyer, Balaji V
> Cc: Jakub Jelinek; Paolo Carlini; gcc-patches@gcc.gnu.org
> Subject: Re: regression test issue
> 
> "Iyer, Balaji V"  writes:
> 
> > Index: gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc
> >
> ==
> =
> > --- gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc  (revision 207437)
> > +++ gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc  (working copy)
> > @@ -1,6 +1,7 @@
> >  /* { dg-options "-fcilkplus" } */
> >  /* { dg-do run { target i?86-*-* x86_64-*-* arm*-*-* } } */
> >  /* { dg-options "-fcilkplus -lcilkrts" { target { i?86-*-* x86_64-*-*
> > arm*-*-* } } } */
> > +/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */
> >
> >  #include 
> >  #include 
> > Index: gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c
> >
> ==
> =
> > --- gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c(revision
> 207437)
> > +++ gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c(working
> copy)
> > @@ -1,6 +1,7 @@
> >  /* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
> >  /* { dg-options "-fcilkplus" } */
> >  /* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-*
> > } } } */
> > +/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */
> 
> You should put a comment (e.g. a PR reference) in the comment field to
> explain why you're skipping this test.  Most likely, the last arg to 
> dg-skip-if ({
> "" }) is optional; if so, please omit it.

There isn't a PR reference for this. I can add a comment in the testcase. Would 
that be enough?

I will also remove the last {""} from the tests.

Thanks,

Balaji V. Iyer.
> 
> Thanks.
> Rainer
> 
> --
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [i386] Replace builtins with vector extensions

2014-02-05 Thread Marc Glisse


Hello,

I was wondering if the new #pragma target in *mmintrin.h make this 
approach more acceptable for 4.10?


http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00374.html

On Sun, 7 Apr 2013, Marc Glisse wrote:


Hello,

the attached patch is very incomplete (it passes bootstrap+testsuite on 
x86_64-linux-gnu), but it raises a number of questions that I'd like to 
settle before continuing.


* Is there any chance of a patch in this direction being accepted?

* May I remove the builtins (from i386.c and the doc) when they become 
unused?


* Do we want to keep the casts even when they don't seem strictly necessary? 
For instance for _mm_add_ps, we can write:

return __A + __B;
or:
return (__m128) ((__v4sf)__A + (__v4sf)__B);
Note that for _mm_add_epi8 for instance we do need the casts.

* For integer operations like _mm_add_epi16 I should probably use the 
unsigned typedefs to make it clear overflow is well defined? (the patch still 
has the signed version)


* Any better name than __v4su for the unsigned version of __v4si?

* Other comments?


2013-04-07  Marc Glisse  

* emmintrin.h (__v2du, __v4su, __v8hu): New typedefs.
(_mm_add_pd, _mm_sub_pd, _mm_mul_pd, _mm_div_pd,
_mm_cmpeq_pd, _mm_cmplt_pd, _mm_cmple_pd, _mm_cmpgt_pd, _mm_cmpge_pd,
_mm_cmpneq_pd, _mm_add_epi8, _mm_add_epi16, _mm_add_epi32,
_mm_add_epi64, _mm_slli_epi16, _mm_slli_epi32, _mm_slli_epi64,
_mm_srai_epi16, _mm_srai_epi32, _mm_srli_epi16, _mm_srli_epi32,
_mm_srli_epi64): Replace builtins with vector extensions.
* xmmintrin.h (_mm_add_ps, _mm_sub_ps, _mm_mul_ps, _mm_div_ps,
_mm_cmpeq_ps, _mm_cmplt_ps, _mm_cmple_ps, _mm_cmpgt_ps, _mm_cmpge_ps,
_mm_cmpneq_ps): Likewise.


--
Marc Glisse

Re: regression test issue

2014-02-05 Thread Jakub Jelinek

On Wed, Feb 05, 2014 at 09:48:12PM +, Iyer, Balaji V wrote:
> > You should put a comment (e.g. a PR reference) in the comment field to
> > explain why you're skipping this test.  Most likely, the last arg to 
> > dg-skip-if ({
> > "" }) is optional; if so, please omit it.
> 
> There isn't a PR reference for this. I can add a comment in the testcase. 
> Would that be enough?
> 
> I will also remove the last {""} from the tests.

Please keep it as is, I have an untested fix for the issue already and will
be testing it soon, so I'd prefer not to have to back out too many changes.

Jakub

Re: RFA: MN10300: Include saved registers in stack usage

2014-02-05 Thread Jeff Law


On 02/05/14 11:26, Joseph S. Myers wrote:

On Wed, 5 Feb 2014, Jeff Law wrote:


However, it's pretty easy to avoid the headaches and just provide a popcount
routine.   I don't think this code is at all performance critical (once per
function being compiled), so a simple popcount should be sufficient.  No need
for lookup tables IMHO.


hwint.[ch] provide popcount_hwi, if HOST_WIDE_INT is a suitable type.

Even better.



(Personally I think there are lots of bits of standard integer
manipulation functionality, such as popcount, that are common in processor
instructions and compiler intrinsics, but could do with having bindings in
ISO C to provide a standard way to access that functionality.

Agreed.


Jeff

Re: var-tracking and s390's LM(G)

2014-02-05 Thread Jakub Jelinek

On Wed, Feb 05, 2014 at 10:26:16PM +0100, Ulrich Weigand wrote:
> Actually, now I think the problem originally described there is still
> valid: on s390 the CFA is *not* equal to the value at function entry,
> but biased by 96/160 bytes.  So setting the SP to the CFA is wrong ...

Such biasing happens on most of the targets, e.g. x86_64 has upon function
entry CFA set to %rsp + 8, i?86 to %esp + 4.

Jakub

Re: Is testing libgomp outside of the build tree supported?

2014-02-05 Thread Matthias Klose

Am 04.02.2014 03:14, schrieb Mike Stump:
> On Feb 3, 2014, at 3:52 PM, Joseph S. Myers  wrote:
>> On Mon, 3 Feb 2014, Andrew Pinski wrote:
>>> On Mon, Feb 3, 2014 at 11:14 AM, Diego Novillo  wrote:
 On Mon, Feb 3, 2014 at 2:11 PM, Ian Lance Taylor  wrote:

> If the presence of the build
> tree makes writing some tests significantly simpler, I think that is
> OK.

 I would like to discourage that.  Testing an already installed GCC for
 which no build tree exists is a very useful feature.
>>>
>>> I agree.  Here at Cavium, we use the packaged up toolchain that comes
>>> from a RPM and test it so we are testing exactly what we ship out to
>>> our customers.
>>
>> Similarly at Mentor.
> 
> And the maintainer of the test suite thinks that supporting the people that 
> ship gcc to large numbers of people who help ensure the quality of gcc is 
> useful.  :-)  It is nice to hear from people that this type of testing is 
> useful; thanks.

could somebody please shed some light on how this is done?  It's nice that
everybody has this kind of testing, but the only bit in the gcc sources itself
seems to be a bit bit-rot and incomplete (contrib/test_installed).

thanks, Matthias

Re: var-tracking and s390's LM(G)

2014-02-05 Thread Ulrich Weigand

Jakub Jelinek wrote:
> On Wed, Feb 05, 2014 at 10:26:16PM +0100, Ulrich Weigand wrote:
> > Actually, now I think the problem originally described there is still
> > valid: on s390 the CFA is *not* equal to the value at function entry,
> > but biased by 96/160 bytes.  So setting the SP to the CFA is wrong ...
> 
> Such biasing happens on most of the targets, e.g. x86_64 has upon function
> entry CFA set to %rsp + 8, i?86 to %esp + 4.

Ah, but that's a different bias.  In those cases it is still correct
to *unwind* SP to the CFA, since that's the value SP needs to have
back in the caller immediately after the call site.

On s390, the call/return instructions do not modify the SP so the
SP at function entry is equal to the SP in the caller after return,
but neither of these is equal to the CFA.  Instead, SP points to
96/160 bytes below the CFA ...   Therefore you cannot simply
unwind SP to the CFA.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com

Re: [PATCH] Make pr59597 test PIC-friendly

2014-02-05 Thread Jeff Law


On 02/05/14 05:32, Ian Bolton wrote:


2014-02-05  Ian Bolton  

testsuite/
 * gcc.dg/tree-ssa/pr59597.c: Make called function static so that
 expected outcome works for PIC variants too.

This is fine for stage4.  Please install.

Thanks,
Jeff

Re: Is testing libgomp outside of the build tree supported?

2014-02-05 Thread Jeff Law


On 02/05/14 15:10, Matthias Klose wrote:

Am 04.02.2014 03:14, schrieb Mike Stump:

On Feb 3, 2014, at 3:52 PM, Joseph S. Myers  wrote:

On Mon, 3 Feb 2014, Andrew Pinski wrote:

On Mon, Feb 3, 2014 at 11:14 AM, Diego Novillo  wrote:

On Mon, Feb 3, 2014 at 2:11 PM, Ian Lance Taylor  wrote:


If the presence of the build
tree makes writing some tests significantly simpler, I think that is
OK.


I would like to discourage that.  Testing an already installed GCC for
which no build tree exists is a very useful feature.


I agree.  Here at Cavium, we use the packaged up toolchain that comes
from a RPM and test it so we are testing exactly what we ship out to
our customers.


Similarly at Mentor.


And the maintainer of the test suite thinks that supporting the people that 
ship gcc to large numbers of people who help ensure the quality of gcc is 
useful.  :-)  It is nice to hear from people that this type of testing is 
useful; thanks.


could somebody please shed some light on how this is done?  It's nice that
everybody has this kind of testing, but the only bit in the gcc sources itself
seems to be a bit bit-rot and incomplete (contrib/test_installed).
I suspect most folks have a site.exp they drop somewhere and explicitly 
call runtest --tool gcc 


I know we do it internally in our QE team, but I never asked them about 
the current mechanics of how it's done.


Jeff

[PATCH] C++ thunk section names

2014-02-05 Thread Sriraman Tallam

Hi,

  I would like this patch reviewed and considered for commit when
Stage 1 is active again.

Patch Description:

A C++ thunk's section name is set to be the same as the original function's
section name for which the thunk was created in order to place the two
together.  This is done in cp/method.c in function use_thunk.
However, with function reordering turned on, the original function's section
name can change to something like ".text.hot." or
".text.unlikely." in function default_function_section in varasm.c
based on the node count of that function.  The thunk function's section name
is not updated to be the same as the original here and also is not always
correct to do it as the original function can be hotter than the thunk.

I have created a patch to not name the thunk function's section to be the same
as the original function when function reordering is enabled.

Thanks
Sri
A C++ thunk's section name is set to be the same as the original function's
section name for which the thunk was created in order to place the two
together.  This is done in cp/method.c in function use_thunk.
However, with function reordering turned on, the original function's section
name can change to something like ".text.hot." or 
".text.unlikely." in function default_function_section in varasm.c
based on the node count of that function.  The thunk function's section name 
is not updated to be the same as the original here and also is not always
correct to do it as the original function can be hotter than the thunk.

I have created a patch to not name the thunk function's section to be the same
as the original function when function reordering is enabled. 


Index: testsuite/g++.dg/thunk_section_name.C
===
--- testsuite/g++.dg/thunk_section_name.C   (revision 0)
+++ testsuite/g++.dg/thunk_section_name.C   (revision 0)
@@ -0,0 +1,30 @@
+/* { dg-require-named-sections "" } */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-reorder-blocks-and-partition -ffunction-sections" } 
*/
+
+class base_class_1
+{
+public:
+  virtual void vfn () {}
+};
+
+class base_class_2
+{
+public:
+  virtual void vfn () {}
+};
+
+class need_thunk_class : public base_class_1, public base_class_2
+{
+public:
+  virtual void vfn () {} 
+};
+
+int main (int argc, char *argv[])
+{
+  base_class_1 *c = new need_thunk_class ();
+  c->vfn();
+  return 0;
+}
+
+/* { dg-final { scan-assembler "\.text\._ZThn8_N16need_thunk_class3vfnEv" } } 
*/
Index: cp/method.c
===
--- cp/method.c (revision 207517)
+++ cp/method.c (working copy)
@@ -362,8 +362,10 @@ use_thunk (tree thunk_fndecl, bool emit_p)
{
  resolve_unique_section (thunk_fndecl, 0, flag_function_sections);
 
- /* Output the thunk into the same section as function.  */
- DECL_SECTION_NAME (thunk_fndecl) = DECL_SECTION_NAME (function);
+ /* Output the thunk into the same section as function if function 
reordering
+is not switched on.  */
+ if (!flag_reorder_functions)
+   DECL_SECTION_NAME (thunk_fndecl) = DECL_SECTION_NAME (function);
}
 }

Re: [PATCH] Documentation for dump and optinfo output

2014-02-05 Thread Sharad Singhai

I am really sorry about the delay. I couldn't exactly reproduce the
warning which you described, but I found a place where two nodes were
out of order in optinfo.texi. Could you please apply the following
patch and see if the problem goes away? If it works for you, I will
commit the doc fixes.

Also I would appreciate the exact command which produces these warnings.

Thanks,
Sharad

2014-02-05  Sharad Singhai  

* doc/optinfo.texi: Fix order of nodes.

Index: doc/optinfo.texi
===
--- doc/optinfo.texi (revision 207525)
+++ doc/optinfo.texi (working copy)
@@ -12,8 +12,8 @@ information about various compiler transformations
 @menu
 * Dump setup:: Setup of optimization dumps.
 * Optimization groups::Groups made up of optimization passes.
+* Dump files and streams:: Dump output file names and streams.
 * Dump output verbosity::  How much information to dump.
-* Dump files and streams:: Dump output file names and streams.
 * Dump types:: Various types of dump functions.
 * Dump examples::  Sample usage.
 @end menu

Thanks,
Sharad


On Fri, Jan 24, 2014 at 2:20 AM, Thomas Schwinge
 wrote:
> Hi!
>
> On Thu, 19 Dec 2013 01:44:08 -0800, Sharad Singhai  wrote:
>> I ran 'make doc' and browsed the info output for basic sanity. Okay for 
>> trunk?
>
>>   * Makefile.in: Add optinfo.texi.
>>   * doc/optinfo.texi: New documentation for optimization info.
>>   * doc/passes.texi: Add new node.
>
> Thanks for contributing to the documentation!
>
> During build, I'm seeing the following warnings; would be great if you
> could address these:
>
> ../../source/gcc/doc/optinfo.texi:45: warning: node next `Optimization 
> groups' in menu `Dump output verbosity' and in sectioning `Dump files and 
> streams' differ
> ../../source/gcc/doc/optinfo.texi:77: warning: node next `Dump files and 
> streams' in menu `Dump types' and in sectioning `Dump output verbosity' differ
> ../../source/gcc/doc/optinfo.texi:77: warning: node prev `Dump files and 
> streams' in menu `Dump output verbosity' and in sectioning `Optimization 
> groups' differ
> ../../source/gcc/doc/optinfo.texi:104: warning: node next `Dump output 
> verbosity' in menu `Dump files and streams' and in sectioning `Dump types' 
> differ
> ../../source/gcc/doc/optinfo.texi:104: warning: node prev `Dump output 
> verbosity' in menu `Optimization groups' and in sectioning `Dump files and 
> streams' differ
> ../../source/gcc/doc/optinfo.texi:137: warning: node prev `Dump types' in 
> menu `Dump files and streams' and in sectioning `Dump output verbosity' differ
>
>
> Grüße,
>  Thomas

Re: Is testing libgomp outside of the build tree supported?

2014-02-05 Thread Joseph S. Myers

On Wed, 5 Feb 2014, Jeff Law wrote:

> I suspect most folks have a site.exp they drop somewhere and explicitly call
> runtest --tool gcc 

* Create site.exp (based on what GCC's makefiles do for build-tree 
testing).  Note that in some cases you may need different contents for 
different testsuite, especially runtime libraries.

* Set up PATH so the installed compilers are found, LD_LIBRARY_PATH so the 
installed shared libraries are found by the dynamic linker (or otherwise 
set up your board file appropriately for running newly compiled programs).

* runtest --srcdir /some/where --tool whatever

If you compare results of this for build-tree and installed testing, 
fixing differences is very useful.  (There are some known bugs for such 
differences; at least, bugs 23867, 25320, 58867.)

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: var-tracking and s390's LM(G)

2014-02-05 Thread Richard Henderson

On 02/05/2014 02:30 PM, Ulrich Weigand wrote:
> Jakub Jelinek wrote:
>> On Wed, Feb 05, 2014 at 10:26:16PM +0100, Ulrich Weigand wrote:
>>> Actually, now I think the problem originally described there is still
>>> valid: on s390 the CFA is *not* equal to the value at function entry,
>>> but biased by 96/160 bytes.  So setting the SP to the CFA is wrong ...
>>
>> Such biasing happens on most of the targets, e.g. x86_64 has upon function
>> entry CFA set to %rsp + 8, i?86 to %esp + 4.
> 
> Ah, but that's a different bias.  In those cases it is still correct
> to *unwind* SP to the CFA, since that's the value SP needs to have
> back in the caller immediately after the call site.
> 
> On s390, the call/return instructions do not modify the SP so the
> SP at function entry is equal to the SP in the caller after return,
> but neither of these is equal to the CFA.  Instead, SP points to
> 96/160 bytes below the CFA ...   Therefore you cannot simply
> unwind SP to the CFA.

I was about to reply the same to Jakub, Ulrich beat me to it.

Basically, the CFA has been defined "incorrectly" for s390, but it hasn't
mattered since they record the save of the SP.  But the result is that you
can't stop recording the save of SP without also adjusting how the CFA is 
defined.

Which _can_ be done, in a way that's fully compatible with all of the existing
unwinding.  But certainly shouldn't be attempted at this stage of development.

r~

New Vietnamese PO file for 'cpplib' (version 4.9-b20140202)

2014-02-05 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Vietnamese team of translators.  The file is available at:

http://translationproject.org/latest/cpplib/vi.po

(This file, 'cpplib-4.9-b20140202.vi.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Contents of PO file 'cpplib-4.9-b20140202.vi.po'

2014-02-05 Thread Translation Project Robot



cpplib-4.9-b20140202.vi.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.

New Vietnamese PO file for 'gcc' (version 4.9-b20140202)

2014-02-05 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Vietnamese team of translators.  The file is available at:

http://translationproject.org/latest/gcc/vi.po

(This file, 'gcc-4.9-b20140202.vi.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Re: [PATCH] One possible fix for the compute_bb_predicate oscillation (PR ipa/60013)

2014-02-05 Thread Jan Hubicka


Hi,
this is variant of patch I am testing (thanks Jakub for the hard analysis).
The dataflow was supposed to be monotonous by keeping the rule that oldvalue
imply new value at each step.  This is broken by the approximation done by
and/or operations.

Instead of dropping to true, it seems easier to simply or the result before
updating.  This operation should be no-op until we hit the clause limit.

Bootstrapping/regtesting x86_64-linux, will commit if it passes.

Honza

* ipa-inline-analysis.c (compute_bb_predicates): Ensure monotonicity
of the dataflow.

gcc.dg/pr60013.c: New testcase.
Index: ipa-inline-analysis.c
===
--- ipa-inline-analysis.c   (revision 207514)
+++ ipa-inline-analysis.c   (working copy)
@@ -310,7 +310,7 @@ add_clause (conditions conditions, struc
   if (false_predicate_p (p))
 return;
 
-  /* No one should be sily enough to add false into nontrivial clauses.  */
+  /* No one should be silly enough to add false into nontrivial clauses.  */
   gcc_checking_assert (!(clause & (1 << predicate_false_condition)));
 
   /* Look where to insert the clause.  At the same time prune out
@@ -1035,7 +1035,7 @@ inline_node_removal_hook (struct cgraph_
   memset (info, 0, sizeof (inline_summary_t));
 }
 
-/* Remap predicate P of former function to be predicate of duplicated functoin.
+/* Remap predicate P of former function to be predicate of duplicated function.
POSSIBLE_TRUTHS is clause of possible truths in the duplicated node,
INFO is inline summary of the duplicated node.  */
 
@@ -1887,8 +1887,15 @@ compute_bb_predicates (struct cgraph_nod
}
  else if (!predicates_equal_p (&p, (struct predicate *) bb->aux))
{
- done = false;
- *((struct predicate *) bb->aux) = p;
+ /* This OR operation is needed to ensure monotonous data flow
+in the case we hit the limit on number of clauses and the
+and/or operations above give approximate answers.  */
+ p = or_predicates (summary->conds, &p, (struct predicate 
*)bb->aux);
+ if (!predicates_equal_p (&p, (struct predicate *) bb->aux))
+   {
+ done = false;
+ *((struct predicate *) bb->aux) = p;
+   }
}
}
}
Index: testsuite/gcc.dg/pr60013.c
===
--- testsuite/gcc.dg/pr60013.c  (revision 0)
+++ testsuite/gcc.dg/pr60013.c  (revision 0)
@@ -0,0 +1,47 @@
+/* PR ipa/60013 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+typedef long int jmp_buf[64];
+extern int _setjmp (jmp_buf) __attribute__ ((__nothrow__));
+struct S { int a, b, c; };
+extern struct S *baz (struct S *);
+static jmp_buf j;
+
+static inline int
+bar (int b, int d)
+{
+  return (b & d) < 0;
+}
+
+struct S *
+foo (int a, struct S *b, struct S *c, struct S *d)
+{
+  if (b->a == 0)
+{
+  switch (a)
+   {
+   case 8:
+ return baz (b);
+   case 7:
+ bar (b->c, c->b);
+ return 0;
+   case 6:
+   case 5:
+   case 4:
+ return baz (c);
+   case 3:
+   case 2:
+ return baz (d);
+   }
+  return 0;
+}
+  if (b->a == 1)
+{
+  if (baz (c))
+   return c;
+  else if (_setjmp (j))
+   baz (b);
+}
+  return 0;
+}

[PATCH] Fix ix86_function_regparm with optimize attribute (PR target/60062, take 3)

2014-02-05 Thread Jakub Jelinek

On Wed, Feb 05, 2014 at 08:42:27PM +0100, Jakub Jelinek wrote:
> So, where do we want to do that instead?  E.g. should it be e.g. in
> tree_versionable_function_p directly and let the inliner (if it doesn't do
> already) also treat optimize(0) functions that aren't always_inline as
> noinline?

So, another attempt to put the && opt_for_fn (fndecl, optimize) into
tree_versionable_function_p also failed, because e.g. for TM (but also for
SIMD clones) we need to clone -O0 functions.  So, can I fix PR600{6,7}2
without touching tree-inline.c and fix PR60026 again separately somehow else
as follow-up?  Bootstrapped/regtested on x86_64-linux and i686-linux.

2014-02-06  Jakub Jelinek  

PR target/60062
* tree.h (opts_for_fn): New inline function.
(opt_for_fn): Define.
* config/i386/i386.c (ix86_function_regparm): Use
opt_for_fn (decl, optimize) instead of optimize.

* gcc.c-torture/execute/pr60062.c: New test.
* gcc.c-torture/execute/pr60072.c: New test.

--- gcc/tree.h.jj   2014-01-20 19:16:29.0 +0100
+++ gcc/tree.h  2014-02-05 15:54:04.681904492 +0100
@@ -4470,6 +4470,20 @@ may_be_aliased (const_tree var)
  || TREE_ADDRESSABLE (var)));
 }
 
+/* Return pointer to optimization flags of FNDECL.  */
+static inline struct cl_optimization *
+opts_for_fn (const_tree fndecl)
+{
+  tree fn_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl);
+  if (fn_opts == NULL_TREE)
+fn_opts = optimization_default_node;
+  return TREE_OPTIMIZATION (fn_opts);
+}
+
+/* opt flag for function FNDECL, e.g. opts_for_fn (fndecl, optimize) is
+   the optimization level of function fndecl.  */
+#define opt_for_fn(fndecl, opt) (opts_for_fn (fndecl)->x_##opt)
+
 /* For anonymous aggregate types, we need some sort of name to
hold on to.  In practice, this should not appear, but it should
not be harmful if it does.  */
--- gcc/config/i386/i386.c.jj   2014-02-05 10:37:36.166167116 +0100
+++ gcc/config/i386/i386.c  2014-02-05 15:56:36.779146394 +0100
@@ -5608,7 +5608,12 @@ ix86_function_regparm (const_tree type,
   /* Use register calling convention for local functions when possible.  */
   if (decl
   && TREE_CODE (decl) == FUNCTION_DECL
-  && optimize
+  /* Caller and callee must agree on the calling convention, so
+checking here just optimize means that with
+__attribute__((optimize (...))) caller could use regparm convention
+and callee not, or vice versa.  Instead look at whether the callee
+is optimized or not.  */
+  && opt_for_fn (decl, optimize)
   && !(profile_flag && !flag_fentry))
 {
   /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
--- gcc/testsuite/gcc.c-torture/execute/pr60062.c.jj2014-02-05 
15:55:48.639388697 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr60062.c   2014-02-05 
15:55:48.639388697 +0100
@@ -0,0 +1,25 @@
+/* PR target/60062 */
+
+int a;
+
+static void
+foo (const char *p1, int p2)
+{
+  if (__builtin_strcmp (p1, "hello") != 0)
+__builtin_abort ();
+}
+
+static void
+bar (const char *p1)
+{
+  if (__builtin_strcmp (p1, "hello") != 0)
+__builtin_abort ();
+}
+
+__attribute__((optimize (0))) int
+main ()
+{
+  foo ("hello", a);
+  bar ("hello");
+  return 0;
+}
--- gcc/testsuite/gcc.c-torture/execute/pr60072.c.jj2014-02-05 
15:55:48.640388641 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr60072.c   2014-02-05 
15:55:48.640388641 +0100
@@ -0,0 +1,16 @@
+/* PR target/60072 */
+
+int c = 1;
+
+__attribute__ ((optimize (1)))
+static int *foo (int *p)
+{
+  return p;
+}
+
+int
+main ()
+{
+  *foo (&c) = 2;
+  return c - 2;
+}


Jakub

[PATCH] Fix two missed Makefile.in dependencies

2014-02-05 Thread Jakub Jelinek

Hi!

I have noticed that with the .deps introduction for gcc/ we've lost
these two dependencies, which autodependency creation isn't aware of,
as the Makefile runs cat on those files and passes the content through
-D option.

Ok for trunk?

2014-02-06  Jakub Jelinek  

* Makefile.in (prefix.o, cppbuiltin.o): Depend on $(BASEVER).

--- gcc/Makefile.in.jj  2014-01-28 14:03:49.0 +0100
+++ gcc/Makefile.in 2014-02-05 13:09:26.871019299 +0100
@@ -1925,6 +1925,7 @@ default-c.o: config/default-c.c
 # Files used by all variants of C and some other languages.
 
 CFLAGS-prefix.o += -DPREFIX=\"$(prefix)\" -DBASEVER=$(BASEVER_s)
+prefix.o: $(BASEVER)
 
 # Language-independent files.
 
@@ -2540,6 +2541,7 @@ PREPROCESSOR_DEFINES = \
   @TARGET_SYSTEM_ROOT_DEFINE@
 
 CFLAGS-cppbuiltin.o += $(PREPROCESSOR_DEFINES) -DBASEVER=$(BASEVER_s)
+cppbuiltin.o: $(BASEVER)
 
 CFLAGS-cppdefault.o += $(PREPROCESSOR_DEFINES)
 

Jakub

[PATCH] Fix linemap_location_before_p with adhoc locs (PR preprocessor/56824)

2014-02-05 Thread Jakub Jelinek

Hi!

linemap_location_before_p (which is implemented as a call to
linemap_compare_locations) broke with the addition of IS_ADHOC_LOC,
because it doesn't look through those and thus can happily compare some
0x8... number to normal locus and expect it to work.  Most of the other
line-map.c routines handle IS_ADHOC_LOC already, just not this function.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

The rest is just formatting cleanup, seeing opening [ at the end of
lines was just too much for me.

2014-02-06  Jakub Jelinek  

PR preprocessor/56824
* line-map.c (get_combined_adhoc_loc, linemap_get_expansion_line,
linemap_get_expansion_filename, linemap_location_in_system_header_p,
linemap_location_from_macro_expansion_p,
linemap_macro_loc_to_spelling_point, linemap_macro_loc_to_def_point,
linemap_macro_loc_to_exp_point, linemap_expand_location): Fix
formatting.
(linemap_compare_locations): Look through adhoc locations for both
l0 and l1.

* gcc.dg/pr56824.c: New test.

--- libcpp/line-map.c.jj2014-01-23 10:53:07.0 +0100
+++ libcpp/line-map.c   2014-02-05 15:07:41.558748559 +0100
@@ -106,8 +106,8 @@ get_combined_adhoc_loc (struct line_maps
   linemap_assert (data);
 
   if (IS_ADHOC_LOC (locus))
-locus =
-   set->location_adhoc_data_map.data[locus & MAX_SOURCE_LOCATION].locus;
+locus
+  = set->location_adhoc_data_map.data[locus & MAX_SOURCE_LOCATION].locus;
   if (locus == 0 && data == NULL)
 return 0;
   lb.locus = locus;
@@ -141,8 +141,8 @@ get_combined_adhoc_loc (struct line_maps
}
   *slot = set->location_adhoc_data_map.data
  + set->location_adhoc_data_map.curr_loc;
-  set->location_adhoc_data_map.data[
- set->location_adhoc_data_map.curr_loc++] = lb;
+  
set->location_adhoc_data_map.data[set->location_adhoc_data_map.curr_loc++]
+   = lb;
 }
   return ((*slot) - set->location_adhoc_data_map.data) | 0x8000;
 }
@@ -833,8 +833,8 @@ linemap_get_expansion_line (struct line_
   const struct line_map *map = NULL;
 
   if (IS_ADHOC_LOC (location))
-location = set->location_adhoc_data_map.data[
-   location & MAX_SOURCE_LOCATION].locus;
+location = set->location_adhoc_data_map.data[location
+& MAX_SOURCE_LOCATION].locus;
 
   if (location < RESERVED_LOCATION_COUNT)
 return 0;
@@ -861,8 +861,8 @@ linemap_get_expansion_filename (struct l
   const struct line_map *map = NULL;
 
   if (IS_ADHOC_LOC (location))
-location = set->location_adhoc_data_map.data[
-   location & MAX_SOURCE_LOCATION].locus;
+location = set->location_adhoc_data_map.data[location
+& MAX_SOURCE_LOCATION].locus;
 
   if (location < RESERVED_LOCATION_COUNT)
 return NULL;
@@ -899,8 +899,8 @@ linemap_location_in_system_header_p (str
   const struct line_map *map = NULL;
 
   if (IS_ADHOC_LOC (location))
-location = set->location_adhoc_data_map.data[
-   location & MAX_SOURCE_LOCATION].locus;
+location = set->location_adhoc_data_map.data[location
+& MAX_SOURCE_LOCATION].locus;
 
   if (location < RESERVED_LOCATION_COUNT)
 return false;
@@ -942,8 +942,8 @@ linemap_location_from_macro_expansion_p
 source_location location)
 {
   if (IS_ADHOC_LOC (location))
-location = set->location_adhoc_data_map.data[
-   location & MAX_SOURCE_LOCATION].locus;
+location = set->location_adhoc_data_map.data[location
+& MAX_SOURCE_LOCATION].locus;
 
   linemap_assert (location <= MAX_SOURCE_LOCATION
  && (set->highest_location
@@ -1024,6 +1024,11 @@ linemap_compare_locations (struct line_m
   bool pre_virtual_p, post_virtual_p;
   source_location l0 = pre, l1 = post;
 
+  if (IS_ADHOC_LOC (l0))
+l0 = set->location_adhoc_data_map.data[l0 & MAX_SOURCE_LOCATION].locus;
+  if (IS_ADHOC_LOC (l1))
+l1 = set->location_adhoc_data_map.data[l1 & MAX_SOURCE_LOCATION].locus;
+
   if (l0 == l1)
 return 0;
 
@@ -1086,8 +1091,8 @@ linemap_macro_loc_to_spelling_point (str
   struct line_map *map;
 
   if (IS_ADHOC_LOC (location))
-location = set->location_adhoc_data_map.data[
-   location & MAX_SOURCE_LOCATION].locus;
+location = set->location_adhoc_data_map.data[location
+& MAX_SOURCE_LOCATION].locus;
 
   linemap_assert (set && location >= RESERVED_LOCATION_COUNT);
 
@@ -1124,8 +1129,8 @@ linemap_macro_loc_to_def_point (struct l
   struct line_map *map;
 
   if (IS_ADHOC_LOC (location))
-location = set->location_adhoc_data_map.data[
-   location & MAX_SOURCE_LOCATION].locus;
+location = set->location_adhoc_data_map.data[location
+& MAX_SOURCE_LOCATION].locus;

[PATCH] Fix various issues in vect_analyze_data_refs (PR middle-end/59150)

2014-02-05 Thread Jakub Jelinek

Hi!

This patch fixes a double free on simd-lane access if
get_vectype_for_scalar_type fails (the new dr is already in datarefs[i]
and if we free_data_ref it, it will be free_data_ref'ed again at the end of
the failed vectorization), fixes handling of clobbers (plugs a memory leak
due to missed free_data_ref and more importantly, if we goto again, dr
would be kept at the old dr and thus we'd basically remove all the remaining
drs in a loop) and plugs a memory leak with simd_lane_access.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

PS, just looking at the patch now again, in the light of PR59594
the reordering of datarefs before goto again looks also wrong, we IMHO have
to perform a vector ordered remove in that case, ok to handle it as a
follow-up?

2014-02-06  Jakub Jelinek  

PR middle-end/59150
* tree-vect-data-refs.c (vect_analyze_data_refs): For clobbers, call
free_data_ref on the dr first, and before goto again also set dr
to the next dr.  For simd_lane_access, free old datarefs[i] before
overwriting it.  For get_vectype_for_scalar_type failure, don't
free_data_ref if simd_lane_access.

--- gcc/tree-vect-data-refs.c.jj2014-02-05 15:28:10.0 +0100
+++ gcc/tree-vect-data-refs.c   2014-02-05 18:30:53.323659288 +0100
@@ -3297,12 +3297,13 @@ again:
  clobber stmts during vectorization.  */
   if (gimple_clobber_p (stmt))
{
+ free_data_ref (dr);
  if (i == datarefs.length () - 1)
{
  datarefs.pop ();
  break;
}
- datarefs[i] = datarefs.pop ();
+ datarefs[i] = dr = datarefs.pop ();
  goto again;
}
 
@@ -3643,13 +3644,14 @@ again:
   if (simd_lane_access)
{
  STMT_VINFO_SIMD_LANE_ACCESS_P (stmt_info) = true;
+ free_data_ref (datarefs[i]);
  datarefs[i] = dr;
}
 
   /* Set vectype for STMT.  */
   scalar_type = TREE_TYPE (DR_REF (dr));
-  STMT_VINFO_VECTYPE (stmt_info) =
-get_vectype_for_scalar_type (scalar_type);
+  STMT_VINFO_VECTYPE (stmt_info)
+   = get_vectype_for_scalar_type (scalar_type);
   if (!STMT_VINFO_VECTYPE (stmt_info))
 {
   if (dump_enabled_p ())
@@ -3669,7 +3671,8 @@ again:
  if (gather || simd_lane_access)
{
  STMT_VINFO_DATA_REF (stmt_info) = NULL;
- free_data_ref (dr);
+ if (gather)
+   free_data_ref (dr);
}
  return false;
 }

Jakub

[PATCH] Fix up __builtin_setjmp_receiver handling (PR c++/60082)

2014-02-05 Thread Jakub Jelinek

Hi!

I've looked at the spawner_inline.c timeout issue, and it looks like the
problem is that __builtin_setjmp_receiver is a stmt_ends_bb_p call and
insert_backedge_copies then happily inserts statements before the call,
because it can't insert them after the call.  But __builtin_setjmp_receiver
is quite special beast, it's bb is reachable only through an abnormal edge
and it pretty much relies on being at the beginning of the bb, as before the
call e.g. frame pointer value is bogus, the expansion of the call itself
fixes that up and then emits a blockage to make sure things aren't
reordered.  But when outof-ssa emits something before the builtin, it isn't
reordered, but is on the wrong side of the frame pointer restoration.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?  Alternatively we could special case BUILT_IN_SETJMP_RECEIVER instead
in call_can_make_abnormal_goto, there is probably no need to have AB edge
out of __builtin_setjmp_receiver, we already have one out of
__builtin_setjmp_setup and receiver is really expanded just to load of frame
pointer, so it doesn't jump anywhere.

Note that this fixes both the CK tests on x86_64-linux, on i686-linux
neither of them timeout, but catch_exc.cc still fails:
 FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -O1 -fcilkplus execution test
 FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -O3 -fcilkplus execution test
 FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -g -O2 -fcilkplus execution test

2014-02-06  Jakub Jelinek  

PR c++/60082
* tree-outof-ssa.c (insert_backedge_copies): Never insert statements
before __builtin_setjmp_receiver.

Revert
2014-02-05  Balaji V. Iyer  

* g++.dg/cilk-plus/CK/catch_exc.cc: Disable test for -O1.
* c-c++-common/cilk-plus/CK/spawner_inline.c: Likewise.

--- gcc/tree-outof-ssa.c.jj 2014-01-03 11:41:01.0 +0100
+++ gcc/tree-outof-ssa.c2014-02-05 22:12:03.637412102 +0100
@@ -1152,6 +1152,12 @@ insert_backedge_copies (void)
  if (TREE_CODE (arg) == SSA_NAME
  && SSA_NAME_DEF_STMT (arg) == last)
continue;
+ /* Never insert anything before
+__builtin_setjmp_receiver, that builtin should be
+immediately after labels.  */
+ if (gimple_call_builtin_p (last,
+BUILT_IN_SETJMP_RECEIVER))
+   continue;
}
 
  /* Create a new instance of the underlying variable of the
--- gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc  (revision 207519)
+++ gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc  (revision 207518)
@@ -1,7 +1,6 @@
 /* { dg-options "-fcilkplus" } */
 /* { dg-do run { target i?86-*-* x86_64-*-* arm*-*-* } } */
 /* { dg-options "-fcilkplus -lcilkrts" { target { i?86-*-* x86_64-*-* arm*-*-* 
} } } */
-/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */
 
 #include 
 #include 
--- gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c(revision 
207519)
+++ gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c(revision 
207518)
@@ -1,7 +1,6 @@
 /* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
 /* { dg-options "-fcilkplus" } */
 /* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
-/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */
 
 #include 
 #define DEFAULT_VALUE 30

Jakub

96 matches

Mail list logo