[PATCH] Fix PR53703

2012-06-17 Thread William J. Schmidt
The test case exposes a bug that occurs only when a diamond control flow
pattern has the arguments of the joining phi in a different order from
the successor arcs of the entry block.  My logic for setting
bb_for_def[12] was just brain-dead.  This cleans that up and also
prevents wasting time examining phis of virtual ops, which I noticed
happening while debugging this.
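
To illustrate the ordering problem, here is a minimal sketch in plain C.
It is a toy model and not the code in tree-ssa-phiopt.c: the names
toy_def and toy_match_defs are made up for illustration, and blocks are
represented by integer indices.  The point is only that each phi
argument's definition is matched to the arm that actually contains it,
rather than assuming that the first phi argument belongs to the first
successor of the entry block.

```c
/* Toy stand-in for an SSA definition: only the index of its block.  */
struct toy_def
{
  int block;
};

/* BB1 and BB2 are the two arms of a diamond; DEF1 and DEF2 define the
   two arguments of the joining phi, in phi-argument order.  On success,
   store in BB_FOR_DEF[0] and BB_FOR_DEF[1] the arm holding DEF1 and
   DEF2 respectively and return 1.  Return 0 if the definitions do not
   sit one in each arm.  */
int
toy_match_defs (int bb1, int bb2, struct toy_def def1,
                struct toy_def def2, int bb_for_def[2])
{
  if (def1.block == bb1 && def2.block == bb2)
    {
      /* Phi arguments are in the same order as the successor arcs.  */
      bb_for_def[0] = bb1;
      bb_for_def[1] = bb2;
    }
  else if (def1.block == bb2 && def2.block == bb1)
    {
      /* Phi arguments are in the opposite order.  */
      bb_for_def[0] = bb2;
      bb_for_def[1] = bb1;
    }
  else
    return 0;

  return 1;
}
```

With arms 5 and 7 and definitions living in blocks 7 and 5, the sketch
reports the swapped assignment instead of pairing by position.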

Bootstrapped and regtested on powerpc64-unknown-linux-gnu with no new
failures.  Ok for trunk?

Thanks,
Bill


gcc:

2012-06-17  Bill Schmidt  

PR tree-optimization/53703
* tree-ssa-phiopt.c (hoist_adjacent_loads): Skip virtual phis;
correctly set bb_for_def[12].

gcc/testsuite:

2012-06-17  Bill Schmidt  

PR tree-optimization/53703
* gcc.dg/torture/pr53703.c: New test.


Index: gcc/testsuite/gcc.dg/torture/pr53703.c
===
--- gcc/testsuite/gcc.dg/torture/pr53703.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr53703.c  (revision 0)
@@ -0,0 +1,149 @@
+/* Reduced test case from PR53703.  Used to ICE.  */
+
+/* { dg-do compile } */
+/* { dg-options "-w" } */
+
+typedef long unsigned int size_t;
+typedef unsigned short int sa_family_t;
+struct sockaddr   {};
+typedef unsigned char __u8;
+typedef unsigned short __u16;
+typedef unsigned int __u32;
+struct nlmsghdr {
+  __u32 nlmsg_len;
+  __u16 nlmsg_type;
+};
+struct ifaddrmsg {
+  __u8 ifa_family;
+};
+enum {
+  IFA_ADDRESS,
+  IFA_LOCAL,
+};
+enum {
+  RTM_NEWLINK = 16,
+  RTM_NEWADDR = 20,
+};
+struct rtattr {
+  unsigned short rta_len;
+  unsigned short rta_type;
+};
+struct ifaddrs {
+  struct ifaddrs *ifa_next;
+  unsigned short ifa_flags;
+};
+typedef unsigned short int uint16_t;
+typedef unsigned int uint32_t;
+struct nlmsg_list {
+  struct nlmsg_list *nlm_next;
+  int size;
+};
+struct rtmaddr_ifamap {
+  void *address;
+  void *local;
+  int address_len;
+  int local_len;
+};
+int usagi_getifaddrs (struct ifaddrs **ifap)
+{
+  struct nlmsg_list *nlmsg_list, *nlmsg_end, *nlm;
+  size_t dlen, xlen, nlen;
+  int build;
+  for (build = 0; build <= 1; build++)
+{
+  struct ifaddrs *ifl = ((void *)0), *ifa = ((void *)0);
+  struct nlmsghdr *nlh, *nlh0;
+  uint16_t *ifflist = ((void *)0);
+  struct rtmaddr_ifamap ifamap;
+  for (nlm = nlmsg_list; nlm; nlm = nlm->nlm_next)
+   {
+ int nlmlen = nlm->size;
+ for (nlh = nlh0;
+  ((nlmlen) >= (int)sizeof(struct nlmsghdr)
+   && (nlh)->nlmsg_len >= sizeof(struct nlmsghdr)
+   && (nlh)->nlmsg_len <= (nlmlen));
+  nlh = ((nlmlen) -= ( (((nlh)->nlmsg_len)+4U -1) & ~(4U -1) ),
+ (struct nlmsghdr*)(((char*)(nlh))
++ ( (((nlh)->nlmsg_len)+4U -1)
+& ~(4U -1) 
+   {
+ struct ifinfomsg *ifim = ((void *)0);
+ struct ifaddrmsg *ifam = ((void *)0);
+ struct rtattr *rta;
+ sa_family_t nlm_family = 0;
+ uint32_t nlm_scope = 0, nlm_index = 0;
+ memset (&ifamap, 0, sizeof (ifamap));
+ switch (nlh->nlmsg_type)
+   {
+   case RTM_NEWLINK:
+ ifim = (struct ifinfomsg *)
+   ((void*)(((char*)nlh)
++ ((0)+( int)
+( ((sizeof(struct nlmsghdr))+4U -1)
+  & ~(4U -1) )))+4U -1)
+ & ~(4U -1) ;
+   case RTM_NEWADDR:
+ ifam = (struct ifaddrmsg *)
+   ((void*)(((char*)nlh)
++ ((0)+( int)
+( ((sizeof(struct nlmsghdr))+4U -1)
+  & ~(4U -1) )))+4U -1)
+ & ~(4U -1) ;
+ nlm_family = ifam->ifa_family;
+ if (build)
+   ifa->ifa_flags = ifflist[nlm_index];
+ break;
+   default:
+ continue;
+   }
+ if (!build)
+   {
+ void *rtadata = ((void*)(((char*)(rta))
+  + (( ((sizeof(struct rtattr))+4 -1)
+   & ~(4 -1) ) + (0;
+ size_t rtapayload = ((int)((rta)->rta_len)
+  - (( ((sizeof(struct rtattr))+4 -1)
+   & ~(4 -1) ) + (0)));
+ switch (nlh->nlmsg_type)
+   {
+   case RTM_NEWLINK:
+ break;
+   case RTM_NEWADDR:
+ if (nlm_family == 17)
+   break;
+ switch (rta->rta_type)
+   {
+   case IFA_ADDRESS:

Re: PATCH: Always create a new language function for nested functions

2012-06-17 Thread Meador Inge
On 06/17/2012 09:28 AM, Markus Trippelsdorf wrote:

> On 2012.05.29 at 19:07 +, Joseph S. Myers wrote:
>> On Tue, 29 May 2012, Meador Inge wrote:
>>
>>> 2012-05-29  Meador Inge  
>>>
>>> * c-decl.c (c_push_function_context): Always create a new language
>>> function.
>>> (c_pop_function_context): Clear the language function created in
>>> c_push_function_context.
>>
>> Thanks, committed.
> 
> This patch caused http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53702 

Thanks.  I am looking into it.

-- 
Meador Inge
CodeSourcery / Mentor Embedded
http://www.mentor.com/embedded-software


[4.6][ARM] Backport fix PR48126

2012-06-17 Thread Joey Ye
OK for 4.6?

2012-06-18  Joey Ye  

Backport from mainline
2011-10-14  David Alan Gilbert  

PR target/48126
* config/arm/arm.c (arm_output_sync_loop): Move label before
barrier.


Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c(revision 188331)
+++ gcc/config/arm/arm.c(working copy)
@@ -23423,8 +23423,11 @@
}
 }
 
+  /* Note: label is before barrier so that in cmp failure case we still get
+ a barrier to stop subsequent loads floating upwards past the ldrex
+ PR target/48126.  */
+  arm_output_asm_insn (emit, 1, operands, "%sLSYB%%=:",
LOCAL_LABEL_PREFIX);
   arm_process_output_memory_barrier (emit, NULL);
-  arm_output_asm_insn (emit, 1, operands, "%sLSYB%%=:",
LOCAL_LABEL_PREFIX);
 }
 
 static rtx
Index: gcc/config/arm/arm.h
===
--- gcc/config/arm/arm.h(revision 188331)
+++ gcc/config/arm/arm.h(working copy)
@@ -294,7 +294,8 @@
 #define TARGET_HAVE_DMB(arm_arch7)
 
 /* Nonzero if this chip implements a memory barrier via CP15.  */
-#define TARGET_HAVE_DMB_MCR(arm_arch6k && ! TARGET_HAVE_DMB)
+#define TARGET_HAVE_DMB_MCR(arm_arch6 && ! TARGET_HAVE_DMB \
+&& ! TARGET_THUMB1)
 
 /* Nonzero if this chip implements a memory barrier instruction.  */
 #define TARGET_HAVE_MEMORY_BARRIER (TARGET_HAVE_DMB || TARGET_HAVE_DMB_MCR)






RE: [PATCH][Cilkplus]PR 53567

2012-06-17 Thread Iyer, Balaji V
Hello Everyone,
I missed one case where the bug comes up again. This patch should fix that.

Thanks,

Balaji V. Iyer.

-Original Message-
From: Iyer, Balaji V [mailto:balaji.v.i...@intel.com] 
Sent: Friday, June 15, 2012 3:37 PM
To: gcc-patches@gcc.gnu.org
Subject: [PATCH][Cilkplus]PR 53567

 Hello Everyone,
This patch is for the Cilkplus branch affecting both C and C++ compilers. 
The dwarf output function was looking for debugging information for an 
internally generated spawn helper which is not there. So this patch will make 
sure that those functions are excluded.
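
The exclusion test can be modelled outside the compiler as follows.
This is only a sketch under stated assumptions: toy_function, toy_decl
and toy_skip_debug_output are hypothetical stand-ins for struct
function, a FUNCTION_DECL and the guard in the dwarf output code; they
are not GCC APIs.  The second branch models the case where the
function structure is gone and only a flag on the decl is left.

```c
/* Hypothetical stand-in for GCC's struct function.  */
struct toy_function
{
  unsigned is_cilk_helper_function : 1;
};

/* Hypothetical stand-in for a FUNCTION_DECL.  */
struct toy_decl
{
  struct toy_function *fn;          /* may be a null pointer */
  unsigned is_cilk_helper_fn : 1;   /* flag kept on the decl itself */
};

/* Return 1 when debug output should be skipped for DECL, else 0.  */
int
toy_skip_debug_output (const struct toy_decl *decl)
{
  /* A spawn helper whose function structure is still available.  */
  if (decl->fn && decl->fn->is_cilk_helper_function)
    return 1;

  /* No function structure: fall back to the flag on the decl.  */
  if (!decl->fn && decl->is_cilk_helper_fn)
    return 1;

  return 0;
}
```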

Thanks,

Balaji V. Iyer.
Index: gcc/tree.h
===
--- gcc/tree.h  (revision 188714)
+++ gcc/tree.h  (working copy)
@@ -520,6 +520,7 @@
   unsigned deprecated_flag : 1;
   unsigned saturating_flag : 1;
   unsigned is_cilk_spawn : 1;
+  unsigned is_cilk_helper_fn : 1;
   unsigned default_def_flag : 1;
   unsigned lang_flag_0 : 1;
   unsigned lang_flag_1 : 1;
@@ -1891,6 +1892,10 @@
(TREE_CODE(N) == CALL_EXPR || TREE_CODE(N) == FUNCTION_DECL)
 #define SPAWN_CALL_P(N) (/* FUNCTION_DECL_CALL_CHECK */(N)->base.is_cilk_spawn)
 
+/* True if the function is a cilk helper function or something that cilk
+   touches */
+#define CILK_FN_P(N) (N->base.is_cilk_helper_fn)
+
 /* True if this call is the point at which a wrapper should detach. */
 #define SPAWN_DETACH_POINT(NODE) (CALL_EXPR_CHECK 
(NODE)->base.default_def_flag)
 
Index: gcc/cp/cilk.c
===
--- gcc/cp/cilk.c   (revision 188714)
+++ gcc/cp/cilk.c   (working copy)
@@ -585,6 +585,7 @@
  the uncopyable value in the outer frame. */
 
   cfun->is_cilk_function = 1;
+  CILK_FN_P (cfun->decl) = 1;
   pre = 0;
   lower_bound = cfd->lower_bound;
   if (!lower_bound)
@@ -833,7 +834,7 @@
   tree decl = cfun->cilk_frame_decl;
 
   cfun->is_cilk_function = 1;
-  
+  CILK_FN_P (cfun->decl) = 1;
   if (!decl)
 {
   tree addr, body, ctor, dtor, obody;
Index: gcc/cp/ChangeLog.cilk
===
--- gcc/cp/ChangeLog.cilk   (revision 188714)
+++ gcc/cp/ChangeLog.cilk   (working copy)
@@ -1,3 +1,8 @@
+2012-06-17  balaji.v.iyer  
+
+   * cilk.c (cp_build_cilk_for_body): Set CILK_FN_P field to 1.
+   (cp_make_cilk_frame): Likewise.
+
 2012-06-14  Balaji V. Iyer  
 
* pt.c (tsubst_expr): Added a check for CILK_SYNC statement.
Index: gcc/cilk.c
===
--- gcc/cilk.c  (revision 188714)
+++ gcc/cilk.c  (working copy)
@@ -1105,7 +1105,8 @@
   if (cfun) 
 { 
   cfun->calls_notify_intrinsic = 1;
-  cfun->is_cilk_function = 1; 
+  cfun->is_cilk_function = 1;
+  CILK_FN_P (cfun->decl) = 1;
 }
 
   return const0_rtx;
@@ -1184,3 +1185,4 @@
   
   return;
 }
+
Index: gcc/dwarf2out.c
===
--- gcc/dwarf2out.c (revision 188714)
+++ gcc/dwarf2out.c (working copy)
@@ -16565,6 +16565,14 @@
   /* Make sure we have the actual abstract inline, not a clone.  */
   decl = DECL_ORIGIN (decl);
 
+  if (flag_enable_cilk && decl && TREE_CODE (decl) == FUNCTION_DECL)
+{
+  struct function *f = DECL_STRUCT_FUNCTION (decl);
+  if (f && f->is_cilk_helper_function)
+   return; /* can't do debuging output for spawn helper */
+  else if (!f && CILK_FN_P (decl))
+   return; /* can't do it if it is a cilk function and f is NULL */
+}
   old_die = lookup_decl_die (decl);
   if (old_die && get_AT (old_die, DW_AT_inline))
 /* We've already generated the abstract instance.  */
@@ -19553,6 +19561,8 @@
   struct function *f = DECL_STRUCT_FUNCTION (decl);
   if (f && f->is_cilk_helper_function)
return; /* can't do debuging output for spawn helper */
+  else if (!f && CILK_FN_P (decl))
+   return;
 }
   dwarf2out_decl (decl);
   call_arg_locations = NULL;
Index: gcc/cilk-spawn.c
===
--- gcc/cilk-spawn.c(revision 188714)
+++ gcc/cilk-spawn.c(working copy)
@@ -129,6 +129,7 @@
 
   f->is_cilk_function = 1;
   f->is_cilk_helper_function = 1;
+  CILK_FN_P (fndecl) = 1;
   /* gimplify_body may garbage collect.  Save a root. */
   cilk_trees[CILK_TI_PENDING_FUNCTIONS] =
 tree_cons (NULL_TREE, fndecl, cilk_trees[CILK_TI_PENDING_FUNCTIONS]);
@@ -388,7 +389,7 @@
 
   cfun->calls_spawn = 1;
   cfun->is_cilk_function = 1;
-  
+  CILK_FN_P (cfun->decl) = 1;
 
   /* Convert this statement into a nested function, using capture
  by value when that is equivalent but faster. */
@@ -2549,7 +2550,7 @@
   set_cfun (DECL_STRUCT_FUNCTION (current_function_decl));
 
   cfun->is_cilk_function = 1;
-  
+  CILK_FN_P (cfun->decl) = 1;
   /* Apparently we need to gimplify now because we can't leave
  non-GIMPLE functions lying around. */
   cg_hacks (

[PATCH, testsuite]: Increase array size in gcc.target/i386/pr33329.c

2012-06-17 Thread Uros Bizjak
Hello!

gcc.target/i386/pr33329.c is fully optimized to a constant by the tree
optimizers. The attached patch increases the array size to avoid
over-optimization and to perform the intended RTL optimization check.

2012-06-17  Uros Bizjak  

* gcc.target/i386/pr33329.c (f): Increase tabs array to 1024.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.
Index: gcc.target/i386/pr33329.c
===
--- gcc.target/i386/pr33329.c   (revision 188714)
+++ gcc.target/i386/pr33329.c   (working copy)
@@ -5,11 +5,11 @@
 
 void f (void)
 {
-  int tabs[8], tabcount;
+  int tabs[1024], tabcount;
 
   for (tabcount = 1; tabcount <= 8; tabcount += 7)
 {
   int i;
-  for (i = 0; i < 8; i++)
+  for (i = 0; i < 1024; i++)
tabs[i] = i * 2;
   g (tabs);
 }


Re: [arm] Remove obsolete FPA support (2/n): Remove command-line options

2012-06-17 Thread Gerald Pfeifer
On Thu, 14 Jun 2012, Richard Earnshaw wrote:
> This patch removes the command line options that enabled generation of
> FPA and maverick instructions, along with their associated documentation.
> 
>   * arm.opt (mfp=2, mfp=3, mfpe, mfpe=2, mfpe=3): Delete options.
>   * arm-fpus.def (fpa, fpe2, fpe3, maverick): Delete FPU types.
>   * arm-tables.opt: Regenerated.
>   * doc/invoke.texi: Remove references to deleted options.

gcc-4.8/changes.html, please? :-)

If you prefer to provide raw text describing the change, I can
take it from there.

Gerald


Re: [Patch, Fortran, OOP] PR 53328: Ambiguous check for type-bound GENERIC shall ignore PASSed arguments

2012-06-17 Thread Janus Weil
Updated patch: After fixing two small errors, the patch also fixes PR
47710 (test case added).

Still regtesting cleanly ...

Cheers,
Janus


2012-06-17  Janus Weil  

PR fortran/47710
PR fortran/53328
* interface.c (count_types_test,generic_correspondence,
gfc_compare_interfaces): Ignore PASS arguments.
(check_interface1,compare_parameter): Pass NULL arguments to
gfc_compare_interfaces.
* gfortran.h (gfc_compare_interfaces): Modified Prototype.
* expr.c (gfc_check_pointer_assign): Pass NULL arguments to
gfc_compare_interfaces.
* resolve.c (resolve_structure_cons): Ditto.
(check_generic_tbp_ambiguity): Determine PASS arguments and pass them
to gfc_compare_interfaces.


2012-06-17  Janus Weil  

PR fortran/47710
PR fortran/53328
* gfortran.dg/typebound_generic_12.f03: New.
* gfortran.dg/typebound_generic_13.f03: New.




2012/6/17 Janus Weil :
> Hi all,
>
> here is a patch which concerns the ambiguity checking of generic TBPs,
> where F03 has similar rules as F95, with the difference that PASS
> arguments basically should be skipped in these tests. That is what the
> patch implements by passing the PASS arguments to
> 'gfc_compare_interfaces' and modifying the helper functions
> 'count_types_test' and 'generic_correspondence'.
>
> Patch is regtested on x86_64-unknown-linux-gnu. I also checked that it
> gives correct behavior for the extended tests posted by Salvatore at:
> http://gcc.gnu.org/ml/fortran/2012-05/msg00060.html
>
> Ok for trunk?
>
> Btw, we have a couple more PRs regarding generic TBPs, and I hope to
> find the time to tackle some of these soon.
>
> Cheers,
> Janus
>
>
>
> 2012-06-17  Janus Weil  
>
>        PR fortran/53328
>        * interface.c (count_types_test,generic_correspondence,
>        gfc_compare_interfaces): Ignore PASS arguments.
>        (check_interface1,compare_parameter): Pass NULL arguments to
>        gfc_compare_interfaces.
>        * gfortran.h (gfc_compare_interfaces): Modified Prototype.
>        * expr.c (gfc_check_pointer_assign): Pass NULL arguments to
>        gfc_compare_interfaces.
>        * resolve.c (resolve_structure_cons): Ditto.
>        (check_generic_tbp_ambiguity): Determine PASS arguments and pass them
>        to gfc_compare_interfaces.
>
>
> 2012-06-17  Janus Weil  
>
>        PR fortran/53328
>        * gfortran.dg/typebound_generic_12.f03: New.


pr53328_v2.diff
Description: Binary data


typebound_generic_12.f03
Description: Binary data


typebound_generic_13.f03
Description: Binary data


Re: [PATCH 2/3] Add XLP-specific atomic instructions and tweaks.

2012-06-17 Thread Richard Sandiford
Richard Henderson  writes:
> On 2012-06-16 00:45, Richard Sandiford wrote:
>>   [(mem:GPR (match_operand:P 1 "register_operand" "d"))]
>> 
>>Instead, we should define a new memory predicate/constraint pair
>>for memories that only accept register addresses.  I.e. there
>>should be a new predicate to go alongside things like
>>memory_operand and stack_operand, except that the new one would
>>be even more restrictive in the set of addresses that it allows.
>>mem_reg_operand seems as good a name as any, but I'm not wedded
>>to a particular name.
>
> Cf. mem_noofs_operand in the ARM port, and its uses in sync.md.
>
>>The atomic_exchange and atomic_fetch_add expanders should use
>>the code I quoted in the earlier message to force the original
>>memory_operand into this more restrictive form:
>> 
>> if (!mem_reg_operand (operands[1], mode))
>>   {
>> addr = force_reg (Pmode, XEXP (operands[1], 0));
>> operands[1] = replace_equiv_address (operands[1], addr);
>>   }
>
> Not required if you use the proper predicate in the expander.
> The middle-end will take care of this for you.

I might be misunderstanding, sorry, but this expander is shared with
the normal LL/SC path, which can accept plain memory_operands.
I was thinking we'd want to keep the expander predicates the same
and apply these SWAP-style restrictions only when needed.

I suppose we could define a predicate that's equivalent to
mem_reg_operand when ISA_HAS_SWAP and to memory_operand otherwise
(and maybe do the same for arith_operand vs. register_operand
in point (2)).  That might get confusing though.  E.g. Maxim
has kept the ISA flags for SWAP and LDADD separate, which seems
like a good thing.  We'd then need another pair of predicates
for the LDADD case.
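
To make the combined-predicate idea concrete, here is a rough sketch in
plain C.  It is a toy model only: toy_mem and the toy_* functions are
hypothetical, real predicates operate on rtx and check far more than
this, and the isa_has_swap parameter merely stands in for ISA_HAS_SWAP.

```c
/* Toy stand-in for a MEM: a base register plus a constant offset.  */
struct toy_mem
{
  int base_regno;
  long offset;
};

/* Models memory_operand: any base + offset address is accepted.  */
int
toy_memory_operand (const struct toy_mem *m)
{
  return m != 0;
}

/* Models the proposed mem_reg_operand: only a bare register address.  */
int
toy_mem_reg_operand (const struct toy_mem *m)
{
  return m != 0 && m->offset == 0;
}

/* The combined predicate: restrictive when the SWAP-style instruction
   is available, plain memory_operand otherwise.  */
int
toy_swap_mem_operand (const struct toy_mem *m, int isa_has_swap)
{
  return isa_has_swap ? toy_mem_reg_operand (m) : toy_memory_operand (m);
}
```

A second, near-identical function would be needed for the LDADD flag,
which is the duplication I was worried about.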

Richard


Re: [PATCH] Hoist adjacent loads

2012-06-17 Thread H.J. Lu
On Tue, Jun 12, 2012 at 8:45 AM, H.J. Lu  wrote:
> On Mon, Jun 11, 2012 at 1:37 PM, William J. Schmidt
>  wrote:
>> OK, once more with feeling... :)
>>
>> This patch differs from the previous one in two respects:  It disables
>> the optimization when either the then or else edge is well-predicted;
>> and it now uses the existing l1-cache-line-size parameter instead of a
>> new one (with updated commentary).
>>
>> Bootstraps and tests with no new regressions on
>> powerpc64-unknown-linux-gnu.  One last performance run is underway, but
>> I don't expect any surprises since both changes are more conservative.
>> The original benchmark issue is still resolved.
>>
>> Is this version ok for trunk?
>>
>> Thanks,
>> Bill
>>
>>
>> 2012-06-11  Bill Schmidt  
>>
>>        * opts.c: Add -fhoist-adjacent-loads to -O2 and above.
>>        * tree-ssa-phiopt.c (tree_ssa_phiopt_worker): Add argument to forward
>>        declaration.
>>        (hoist_adjacent_loads, gate_hoist_loads): New forward declarations.
>>        (tree_ssa_phiopt): Call gate_hoist_loads.
>>        (tree_ssa_cs_elim): Add parm to tree_ssa_phiopt_worker call.
>>        (tree_ssa_phiopt_worker): Add do_hoist_loads to formal arg list; call
>>        hoist_adjacent_loads.
>>        (local_mem_dependence): New function.
>>        (hoist_adjacent_loads): Likewise.
>>        (gate_hoist_loads): Likewise.
>>        * common.opt (fhoist-adjacent-loads): New switch.
>>        * Makefile.in (tree-ssa-phiopt.o): Added dependencies.
>>
>>
>
> This may have caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53647
>

This also caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53703


-- 
H.J.


Re: [PATCH 3/3] Handle const_vector in mulv4si3 for pre-sse4.1.

2012-06-17 Thread Uros Bizjak
On Sun, Jun 17, 2012 at 8:37 PM, Uros Bizjak  wrote:
> Hello!
>
> Please note that you will probably hit PR33329; this is the reason
> that we expand multiplications after reload. Please see [1] for
> further explanation. There is a gcc.target/i386/pr33329.c test to cover
> this issue, but it is not effective anymore since the simplification
> happens at tree level.
>
> [1] http://gcc.gnu.org/ml/gcc-patches/2007-09/msg00668.html


Please adapt the mentioned testcase with:

--cut here--
Index: testsuite/gcc.target/i386/pr33329.c
===
--- testsuite/gcc.target/i386/pr33329.c (revision 188703)
+++ testsuite/gcc.target/i386/pr33329.c (working copy)
@@ -5,12 +5,12 @@

 void f (void)
 {
-  int tabs[8], tabcount;
+  int tabs[1024], tabcount;

-  for (tabcount = 1; tabcount <= 8; tabcount += 7)
+  for (tabcount = 1; tabcount <= 1024; tabcount += 7)
 {
   int i;
-  for (i = 0; i < 8; i++)
+  for (i = 0; i < 1024; i++)
tabs[i] = i * 2;
   g (tabs);
 }
--cut here--

Uros.


Re: [PATCH 3/3] Handle const_vector in mulv4si3 for pre-sse4.1.

2012-06-17 Thread Uros Bizjak
Hello!

Please note that you will probably hit PR33329; this is the reason
that we expand multiplications after reload. Please see [1] for
further explanation. There is a gcc.target/i386/pr33329.c test to cover
this issue, but it is not effective anymore since the simplification
happens at tree level.

[1] http://gcc.gnu.org/ml/gcc-patches/2007-09/msg00668.html

Uros.
On Fri, Jun 15, 2012 at 10:57 PM, Richard Henderson  wrote:
> ---
>  gcc/config/i386/i386-protos.h |    1 +
>  gcc/config/i386/i386.c        |   76 
> +
>  gcc/config/i386/predicates.md |    7 
>  gcc/config/i386/sse.md        |   72 +++---
>  4 files changed, 97 insertions(+), 59 deletions(-)
>
> diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
> index f300a56..431db6c 100644
> --- a/gcc/config/i386/i386-protos.h
> +++ b/gcc/config/i386/i386-protos.h
> @@ -222,6 +222,7 @@ extern void ix86_expand_reduc (rtx (*)(rtx, rtx, rtx), 
> rtx, rtx);
>
>  extern void ix86_expand_vec_extract_even_odd (rtx, rtx, rtx, unsigned);
>  extern bool ix86_expand_pinsr (rtx *);
> +extern void ix86_expand_sse2_mulv4si3 (rtx, rtx, rtx);
>
>  /* In i386-c.c  */
>  extern void ix86_target_macros (void);
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 578a756..0dc08f3 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -38438,6 +38438,82 @@ ix86_expand_vec_extract_even_odd (rtx targ, rtx op0, 
> rtx op1, unsigned odd)
>   expand_vec_perm_even_odd_1 (&d, odd);
>  }
>
> +void
> +ix86_expand_sse2_mulv4si3 (rtx op0, rtx op1, rtx op2)
> +{
> +  rtx op1_m1, op1_m2;
> +  rtx op2_m1, op2_m2;
> +  rtx res_1, res_2;
> +
> +  /* Shift both input vectors down one element, so that elements 3
> +     and 1 are now in the slots for elements 2 and 0.  For K8, at
> +     least, this is faster than using a shuffle.  */
> +  op1_m1 = op1 = force_reg (V4SImode, op1);
> +  op1_m2 = gen_reg_rtx (V4SImode);
> +  emit_insn (gen_sse2_lshrv1ti3 (gen_lowpart (V1TImode, op1_m2),
> +                                gen_lowpart (V1TImode, op1),
> +                                GEN_INT (32)));
> +
> +  if (GET_CODE (op2) == CONST_VECTOR)
> +    {
> +      rtvec v;
> +
> +      /* Constant propagate the vector shift, leaving the dont-care
> +        vector elements as zero.  */
> +      v = rtvec_alloc (4);
> +      RTVEC_ELT (v, 0) = CONST_VECTOR_ELT (op2, 0);
> +      RTVEC_ELT (v, 2) = CONST_VECTOR_ELT (op2, 2);
> +      RTVEC_ELT (v, 1) = const0_rtx;
> +      RTVEC_ELT (v, 3) = const0_rtx;
> +      op2_m1 = gen_rtx_CONST_VECTOR (V4SImode, v);
> +      op2_m1 = force_reg (V4SImode, op2_m1);
> +
> +      v = rtvec_alloc (4);
> +      RTVEC_ELT (v, 0) = CONST_VECTOR_ELT (op2, 1);
> +      RTVEC_ELT (v, 2) = CONST_VECTOR_ELT (op2, 3);
> +      RTVEC_ELT (v, 1) = const0_rtx;
> +      RTVEC_ELT (v, 3) = const0_rtx;
> +      op2_m2 = gen_rtx_CONST_VECTOR (V4SImode, v);
> +      op2_m2 = force_reg (V4SImode, op2_m2);
> +    }
> +  else
> +    {
> +      op2_m1 = op2 = force_reg (V4SImode, op2);
> +      op2_m2 = gen_reg_rtx (V4SImode);
> +      emit_insn (gen_sse2_lshrv1ti3 (gen_lowpart (V1TImode, op2_m2),
> +                                    gen_lowpart (V1TImode, op2),
> +                                    GEN_INT (32)));
> +    }
> +
> +  /* Widening multiply of elements 0+2, and 1+3.  */
> +  res_1 = gen_reg_rtx (V4SImode);
> +  res_2 = gen_reg_rtx (V4SImode);
> +  emit_insn (gen_sse2_umulv2siv2di3 (gen_lowpart (V2DImode, res_1),
> +                                    op1_m1, op2_m1));
> +  emit_insn (gen_sse2_umulv2siv2di3 (gen_lowpart (V2DImode, res_2),
> +                                    op1_m2, op2_m2));
> +
> +  /* Move the results in element 2 down to element 1; we don't care
> +     what goes in elements 2 and 3.  Then we can merge the parts
> +     back together with an interleave.
> +
> +     Note that two other sequences were tried:
> +     (1) Use interleaves at the start instead of psrldq, which allows
> +     us to use a single shufps to merge things back at the end.
> +     (2) Use shufps here to combine the two vectors, then pshufd to
> +     put the elements in the correct order.
> +     In both cases the cost of the reformatting stall was too high
> +     and the overall sequence slower.  */
> +
> +  emit_insn (gen_sse2_pshufd_1 (res_1, res_1, const0_rtx, const2_rtx,
> +                               const0_rtx, const0_rtx));
> +  emit_insn (gen_sse2_pshufd_1 (res_2, res_2, const0_rtx, const2_rtx,
> +                               const0_rtx, const0_rtx));
> +  res_1 = emit_insn (gen_vec_interleave_lowv4si (op0, res_1, res_2));
> +
> +  set_unique_reg_note (res_1, REG_EQUAL, gen_rtx_MULT (V4SImode, op1, op2));
> +}
> +
>  /* Expand an insert into a vector register through pinsr insn.
>    Return true if successful.  */
>
> diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
> index 92db809..f23e932 100644
> ---
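
For readers following the quoted expander, the sequence can be
modelled in scalar C roughly as below.  This is a sketch of the
arithmetic only, assuming the structure described in the quoted
comments (shift down one element, two widening multiplies of elements
0 and 2, then interleave the low halves); the toy_* names are made up
and nothing here is the actual i386 backend code.

```c
#include <stdint.h>

/* Shift a 4-element vector down by one element, filling with zero.  */
static void
toy_shift_down (uint32_t out[4], const uint32_t in[4])
{
  out[0] = in[1];
  out[1] = in[2];
  out[2] = in[3];
  out[3] = 0;
}

/* Widening 32x32->64 multiply of elements 0 and 2 only.  */
static void
toy_umul_even (uint64_t out[2], const uint32_t x[4], const uint32_t y[4])
{
  out[0] = (uint64_t) x[0] * y[0];
  out[1] = (uint64_t) x[2] * y[2];
}

/* R = A * B element-wise, keeping the low 32 bits of each product.  */
void
toy_mulv4si (uint32_t r[4], const uint32_t a[4], const uint32_t b[4])
{
  uint32_t a_m2[4], b_m2[4];
  uint64_t even[2], odd[2];

  /* Bring elements 1 and 3 into the slots for elements 0 and 2.  */
  toy_shift_down (a_m2, a);
  toy_shift_down (b_m2, b);

  toy_umul_even (even, a, b);        /* products of elements 0 and 2 */
  toy_umul_even (odd, a_m2, b_m2);   /* products of elements 1 and 3 */

  /* Interleave the low halves back into element order.  */
  r[0] = (uint32_t) even[0];
  r[1] = (uint32_t) odd[0];
  r[2] = (uint32_t) even[1];
  r[3] = (uint32_t) odd[1];
}
```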

Re: Change the ordering of cdce pass

2012-06-17 Thread H.J. Lu
On Thu, Jun 14, 2012 at 6:38 PM, Easwaran Raman  wrote:
> The conditional dead call elimination pass shrink wraps certain dead
> calls to math functions. It doesn't handle cases like this:
>
> D.142420_139 = powD.549 (D.142421_138, D.142419_132);
>  fooD.120935.barD.113815 = D.142420_139;
> # foo.bar is dead here.
>
> This code gets cleaned up by DCE and leaves only pow, which can then
> be shrink-wrapped by cdce. So it seems reasonable to do this
> reordering. Bootstraps on x86_64 on linux with no test regression. OK
> for trunk?
>
> - Easwaran
>
> --
>
> 2012-06-14   Easwaran Raman  
>
>        * gcc/passes.c (init_optimization_passes): Remove pass_call_cdce
>        from its current position and insert after pass_dce.
>

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53710

It may expose a latent bug.

-- 
H.J.


Re: [PATCH 2/3] Add XLP-specific atomic instructions and tweaks.

2012-06-17 Thread Richard Henderson
On 2012-06-16 00:45, Richard Sandiford wrote:
>[(mem:GPR (match_operand:P 1 "register_operand" "d"))]
> 
>Instead, we should define a new memory predicate/constraint pair
>for memories that only accept register addresses.  I.e. there
>should be a new predicate to go alongside things like
>memory_operand and stack_operand, except that the new one would
>be even more restrictive in the set of addresses that it allows.
>mem_reg_operand seems as good a name as any, but I'm not wedded
>to a particular name.

Cf. mem_noofs_operand in the ARM port, and its uses in sync.md.

>The atomic_exchange and atomic_fetch_add expanders should use
>the code I quoted in the earlier message to force the original
>memory_operand into this more restrictive form:
> 
> if (!mem_reg_operand (operands[1], mode))
>   {
> addr = force_reg (Pmode, XEXP (operands[1], 0));
> operands[1] = replace_equiv_address (operands[1], addr);
>   }

Not required if you use the proper predicate in the expander.
The middle-end will take care of this for you.


r~


Re: [Patch, Fortran] PRs - fix TRANSFER checks

2012-06-17 Thread Janus Weil
Hi Tobias,

> Two rather simple patches.
>
> Build and regtested on x86-64-linux.
> As one of the issues is a 4.7/4.8 regression:
> OK for the trunk and 4.7?

I'd say ok for both trunk and 4.7.

Thanks for the patches,
Janus


Re: PATCH: Always create a new language function for nested functions

2012-06-17 Thread Markus Trippelsdorf
On 2012.05.29 at 19:07 +, Joseph S. Myers wrote:
> On Tue, 29 May 2012, Meador Inge wrote:
> 
> > 2012-05-29  Meador Inge  
> > 
> > * c-decl.c (c_push_function_context): Always create a new language
> > function.
> > (c_pop_function_context): Clear the language function created in
> > c_push_function_context.
> 
> Thanks, committed.

This patch caused http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53702 

-- 
Markus


[PATCH, libgcc]: Use __builtin_expect when checking for soft-fp exceptions

2012-06-17 Thread Uros Bizjak
Hello!

2012-06-17  Uros Bizjak  

* config/i386/sfp-machine.h (FP_HANDLE_EXCEPTIONS): Use
__builtin_expect when checking for exceptions.
* config/ia64/sfp-machine.h (FP_HANDLE_EXCEPTIONS): Ditto.

Tested on x86_64-pc-linux-gnu {,-m32} and ia64-unknown-linux-gnu,
committed to mainline SVN.
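
For reference, a minimal standalone sketch of the pattern.  The names
toy_handled, toy_handle_exceptions, TOY_HANDLE_EXCEPTIONS and toy_run
are hypothetical and only stand in for the soft-fp pieces;
__builtin_expect itself is the GCC builtin, which returns its first
argument and tells the compiler that the value is expected to equal
the second.

```c
/* Hypothetical accumulator standing in for the effect of the real
   exception handler.  */
static int toy_handled;

static void
toy_handle_exceptions (int fex)
{
  toy_handled |= fex;
}

/* Same shape as the patched macro: raising an exception is marked as
   the unlikely case.  This sketch leaves out the semicolon after
   "while (0)" so that the macro expands to a single statement.  */
#define TOY_HANDLE_EXCEPTIONS(fex)              \
  do {                                          \
    if (__builtin_expect ((fex), 0))            \
      toy_handle_exceptions (fex);              \
  } while (0)

/* Run the macro once on FEX and return what the handler recorded.  */
int
toy_run (int fex)
{
  toy_handled = 0;
  TOY_HANDLE_EXCEPTIONS (fex);
  return toy_handled;
}
```

The hint does not change the result, only which path the compiler is
encouraged to lay out as the fall-through.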

Uros.
Index: config/ia64/sfp-machine.h
===
--- config/ia64/sfp-machine.h   (revision 188696)
+++ config/ia64/sfp-machine.h   (working copy)
@@ -60,7 +60,7 @@
 
 #define FP_HANDLE_EXCEPTIONS   \
   do { \
-if (_fex)  \
+if (__builtin_expect (_fex, 0))\
   __sfp_handle_exceptions (_fex);  \
   } while (0);
 
Index: config/i386/sfp-machine.h
===
--- config/i386/sfp-machine.h   (revision 188696)
+++ config/i386/sfp-machine.h   (working copy)
@@ -51,7 +51,7 @@
 
 #define FP_HANDLE_EXCEPTIONS   \
   do { \
-if (_fex)  \
+if (__builtin_expect (_fex, 0))\
   __sfp_handle_exceptions (_fex);  \
   } while (0);
 


[PATCH, i386]: Fix vcvtph2ps vec_select selector

2012-06-17 Thread Uros Bizjak
Hello!

2012-06-17  Uros Bizjak  

* config/i386/sse.md (vcvtph2ps): Fix vec_select selector.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN, will
be backported to release branches.

Uros.
Index: config/i386/sse.md
===
--- config/i386/sse.md  (revision 188703)
+++ config/i386/sse.md  (working copy)
@@ -11647,7 +11647,7 @@
  (unspec:V8SF [(match_operand:V8HI 1 "register_operand" "x")]
   UNSPEC_VCVTPH2PS)
  (parallel [(const_int 0) (const_int 1)
-(const_int 1) (const_int 2)])))]
+(const_int 2) (const_int 3)])))]
   "TARGET_F16C"
   "vcvtph2ps\t{%1, %0|%0, %1}"
   [(set_attr "type" "ssecvt")


Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'

2012-06-17 Thread Gabriel Dos Reis
On Sun, Jun 17, 2012 at 6:58 AM, Richard Guenther
 wrote:
> On Sat, Jun 16, 2012 at 8:47 PM, Gabriel Dos Reis
>  wrote:
>> On Sat, Jun 16, 2012 at 3:54 AM, Duncan Sands  wrote:
>>> Hi,
>>>
>>>
> If ENABLE_BUILD_WITH_CXX is defined, then GCC itself is built with C++,
> and we want a C++ signature for functions.  If it is not defined, then
> GCC itself is not built with C++, and we want (and must have) a C
> signature.
>
> I suppose we would decide that fancy_abort always uses a C signature,
> but that seems odd.
>
> Ian


 I guess the issue is when people care only about C plugins, yet
 fancy_abort
 gets implicitly exported with a C++ linkage.

 I suspect this goes back to the eternal question: what do we consider as
 part of the public GCC API (no, Basile, I am not suggesting to have
 the same discussion again.)
>>>
>>>
>>> if the following are to hold
>>>
>>> (1) fancy_abort is declared in system.h
>>> (2) system.h should not be wrapped in extern "C" when included from a
>>> plugin,
>>> (3) it should be valid to include it from plugins compiled as C or as C++,
>>> (4) fancy_abort should use the same linkage as GCC, i.e. C when GCC built as
>>> C,
>>> C++ when built as C++ (aka ENABLE_BUILD_WITH_CXX).
>>>
>>> then something like the following seems inevitable:
>>>
>>> #ifdef ENABLE_BUILD_WITH_CXX
>>> #ifdef __cplusplus
>>> extern void fancy_abort(const char *, int, const char *) ATTRIBUTE_NORETURN;
>>> #else
>>> extern void _Z11fancy_abortPKciS0_(const char *, int, const char *)
>>> ATTRIBUTE_NORETURN;
>>> #endif
>>> #else
>>> #ifdef __cplusplus
>>> extern "C" void fancy_abort(const char *, int, const char *)
>>> ATTRIBUTE_NORETURN;
>>> #else
>>> extern void fancy_abort(const char *, int, const char *) ATTRIBUTE_NORETURN;
>>> #endif
>>> #endif
>>>
>>> That's pretty nasty.  But to avoid the nastiness one of (1) - (4) needs to
>>> be
>>> dropped.  Which one?
>>>
>>> Ciao, Duncan.
>>
>> It is not just nasty; it is fragile.
>> I think we should just give fancy_abort a C language specification.
>
> No, I think we should make system.h what system.h is about - include
> all system headers.
>
> declaring fancy_abort with either linkage is not part of "include
> system headers" and thus
> should not be done inside system.h.

but system.h is not just including system headers.


Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'

2012-06-17 Thread Gabriel Dos Reis
On Sun, Jun 17, 2012 at 6:55 AM, Richard Guenther
 wrote:
> On Fri, Jun 15, 2012 at 5:18 PM, Gabriel Dos Reis
>  wrote:
>> On Fri, Jun 15, 2012 at 10:13 AM, Duncan Sands  wrote:
>>> Hi Gabriel,
>>>
>> Richard just reminded me that we have two fancy_aborts.
>> Could you tell which one your code is indirectly using?
>
>
>
> the one installed as plugin/include/system.h, which seems to be
> gcc/include/system.h.


 OK.  I think that declaration has to have the C language spec.
 Would you prepare a patch for that?
>>>
>>>
>>> you mean: wrap the fancy_abort declaration in system.h in 'extern C'?
>>
>> Yes.  Thanks.
>
> I don't think that's correct - if GCC is built with a C++ compiler
> fancy_abort has
> C++ linkage.

But then, that would prevent C plugins from working, as Duncan
initially reported.
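
As a rough sketch of what giving the declaration a C language
specification looks like, assuming the usual __cplusplus guard idiom:
toy_fancy_abort and the toy_last_* variables below are hypothetical,
and unlike the real fancy_abort this stand-in records its arguments
and returns so that it can be exercised.

```c
#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical stand-in for the declaration in question.  With the
   guard, the symbol name is the same whether the including file is
   compiled as C or as C++.  */
void toy_fancy_abort (const char *file, int line, const char *func);

#ifdef __cplusplus
}
#endif

/* Last reported location, recorded instead of aborting.  */
static const char *toy_last_file;
static int toy_last_line;
static const char *toy_last_func;

void
toy_fancy_abort (const char *file, int line, const char *func)
{
  toy_last_file = file;
  toy_last_line = line;
  toy_last_func = func;
}
```

A plugin compiled as C and one compiled as C++ would then both refer
to the unmangled name.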


Re: [PATCH, testsuite]: Fix scan-tree-dump-times argument order in gcc.dg/tree-ssa/vrp68.c.

2012-06-17 Thread Richard Guenther
On Sun, Jun 17, 2012 at 10:41 AM, Uros Bizjak  wrote:
> Hello!
>
> The testcase still fails on x86_64-pc-linux-gnu with:
>
> FAIL: gcc.dg/tree-ssa/vrp68.c scan-tree-dump-times vrp1 "link_error" 1
>
> since there are two calls to link_error.

Oops.  I wonder how I did not see those failures myself ...

Richard.

> 2012-06-17  Uros Bizjak  
>
>        * gcc.dg/tree-ssa/vrp68.c: Fix scan-tree-dump-times argument order.
>
> Committed to mainline SVN.
>
> Uros.


Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'

2012-06-17 Thread Richard Guenther
On Sat, Jun 16, 2012 at 8:47 PM, Gabriel Dos Reis
 wrote:
> On Sat, Jun 16, 2012 at 3:54 AM, Duncan Sands  wrote:
>> Hi,
>>
>>
 If ENABLE_BUILD_WITH_CXX is defined, then GCC itself is built with C++,
 and we want a C++ signature for functions.  If it is not defined, then
 GCC itself is not built with C++, and we want (and must have) a C
 signature.

 I suppose we would decide that fancy_abort always uses a C signature,
 but that seems odd.

 Ian
>>>
>>>
>>> I guess the issue is when people care only about C plugins, yet
>>> fancy_abort gets implicitly exported with C++ linkage.
>>>
>>> I suspect this goes back to the eternal question: what do we consider as
>>> part of the public GCC API (no, Basile, I am not suggesting we have
>>> the same discussion again.)
>>
>>
>> if the following are to hold
>>
>> (1) fancy_abort is declared in system.h
>> (2) system.h should not be wrapped in extern "C" when included from a
>> plugin,
>> (3) it should be valid to include it from plugins compiled as C or as C++,
>> (4) fancy_abort should use the same linkage as GCC, i.e. C when GCC is built
>> as C, C++ when built as C++ (aka ENABLE_BUILD_WITH_CXX).
>>
>> then something like the following seems inevitable:
>>
>> #ifdef ENABLE_BUILD_WITH_CXX
>> #ifdef __cplusplus
>> extern void fancy_abort(const char *, int, const char *) ATTRIBUTE_NORETURN;
>> #else
>> extern void _Z11fancy_abortPKciS0_(const char *, int, const char *)
>> ATTRIBUTE_NORETURN;
>> #endif
>> #else
>> #ifdef __cplusplus
>> extern "C" void fancy_abort(const char *, int, const char *)
>> ATTRIBUTE_NORETURN;
>> #else
>> extern void fancy_abort(const char *, int, const char *) ATTRIBUTE_NORETURN;
>> #endif
>> #endif
>>
>> That's pretty nasty.  But to avoid the nastiness one of (1) - (4) needs to
>> be dropped.  Which one?
>>
>> Ciao, Duncan.
>
> It is not just nasty; it is fragile.
> I think we should just give fancy_abort a C language specification.

No, I think we should make system.h what system.h is about - include
all system headers.

Declaring fancy_abort with either linkage is not part of "include
system headers" and thus should not be done inside system.h.

Richard.


Re: [Patch 4.6] In system.h, wrap include of C++ header in 'extern C++'

2012-06-17 Thread Richard Guenther
On Fri, Jun 15, 2012 at 5:18 PM, Gabriel Dos Reis
 wrote:
> On Fri, Jun 15, 2012 at 10:13 AM, Duncan Sands  wrote:
>> Hi Gabriel,
>>
> Richard just reminded me that we have two fancy_aborts.
> Could you tell which one your code is indirectly using?



 the one installed as plugin/include/system.h, which seems to be
 gcc/include/system.h.
>>>
>>>
>>> OK.  I think that declaration has to have the C language spec.
>>> Would you prepare a patch for that?
>>
>>
>> you mean: wrap the fancy_abort declaration in system.h in 'extern C'?
>
> Yes.  Thanks.

I don't think that's correct - if GCC is built with a C++ compiler
fancy_abort has C++ linkage.

Richard.

>> Sure, I will prepare a patch.


Re: [RFC 0/3] Stuff related to pr53533

2012-06-17 Thread Richard Guenther
On Fri, 15 Jun 2012, Richard Henderson wrote:

> ... but not actually fixing it.
> 
> I was hoping that the first patch might give the vectorizer enough
> info to solve the costing problem, but no such luck.  Nevertheless
> it would seem that not having the info present at all would be a
> bit of a hindrance when actually tweaking the vectorizer later...
> 
> FYI the "equivalence" of fabs/fmul in the integer simd costing is
> based upon the AMD K8 document I had handy.  I don't know that I've
> ever seen proper latencies and such for the Intel cpus?
> 
> The second patch implements something that I mentioned in the PR,
> that we ought to be decomposing expensive vector multiplies just
> like we do for scalar multiplies.  I'm really quite surprised that
> we'd not noticed that v*4 wasn't being implemented via shift...
> 
> The third patch implements something that Richi mentioned in the
> PR, that we ought not be trying to shuffle the elements of a
> const_vector around; do that beforehand.
> 
> ---
> 
> Some additional notes on the testcase in the PR:
> 
> The computation is 3 iterations of a hash function:
>   (x + 12345) * 914237 - 13.
> 
> The -13 folds into the subsequent +12345 well enough, and so the
> simple expansion of this results in 7 operations.  And that's 
> exactly what we get when unrolling and vectorizing.
> 
> However, for the non-vectorized version, combine smooshes together
> (2.5) iterations of the hash function, utilizing modulo arithmetic:
>   h1 = (x + 12345) * 914237 - 13
>   h2 = (h1 + 12345) * 914237 - 13
>   h3 = (h2 + 12345) * 914237 - 13
>  = x*764146064584710053 + 10318335160567660
>  = x*0x101597a5 + 0x9deb476c (mod 2**32)
> 
> which is of course only 2 operations (combine actually misses one
> and leaves an extra plus for 3 operations).  Which means that even
> leaving aside everything above, the vectorized code is having
> to work much harder than the scalar code in the end.
> 
> Manually adjusting complete_hash_func with the above substitution
> and suddenly even the pre-sse4 vectorized version is faster than
> the unvectorized version (with 10 iterations and these patches):
> 
> scalar:   4.69 sec
> sse2: 2.46 sec
> sse4: 1.39 sec
> 
> So, it's not *really* the costing inside the vectorizer at all,
> and begs the question of why we're not taking modulo arithmetic
> into account earlier in the optimization path?

We certainly could, at the expense of losing the fact that signed
expressions do not overflow.  A few years ago I started adding
infrastructure on the no-undefined-overflow branch that would allow
doing this without intermediate casts to unsigned, but work on it
stalled (and I won't be able to pick that up again in time for 4.8,
but hopefully for 4.9).  On the gimple level it should be the reassoc
pass that eventually decides that using unsigned arithmetic is
profitable because it can then associate.
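
As an illustration of the folding discussed above, here is a minimal sketch
in C, assuming the hash step is exactly (x + 12345) * 914237 - 13 evaluated
in 32-bit unsigned arithmetic.  The function names are hypothetical and the
code is not taken from the PR.  One step is then a*x + c (mod 2**32), so
three steps collapse to a**3 * x + c * (a**2 + a + 1).  The sketch derives
both folded constants at run time instead of hard-coding the quoted values.

```c
#include <assert.h>
#include <stdint.h>

#define HASH_ADD 12345u
#define HASH_MUL 914237u
#define HASH_SUB 13u

/* One hash step in uint32_t, so overflow wraps modulo 2**32.  */
static uint32_t
hash_step (uint32_t x)
{
  return (x + HASH_ADD) * HASH_MUL - HASH_SUB;
}

/* Three steps applied one after another.  */
uint32_t
hash3_iterated (uint32_t x)
{
  return hash_step (hash_step (hash_step (x)));
}

/* The same three steps folded into one multiply and one add.
   One step is a*x + c with c = 12345*a - 13 (mod 2**32).  */
uint32_t
hash3_folded (uint32_t x)
{
  const uint32_t a = HASH_MUL;
  const uint32_t c = HASH_ADD * HASH_MUL - HASH_SUB;
  const uint32_t a2 = a * a;
  const uint32_t mul = a2 * a;             /* a**3 mod 2**32 */
  const uint32_t add = c * (a2 + a + 1u);  /* c*(a**2 + a + 1) mod 2**32 */
  return x * mul + add;
}

/* Check that both forms agree on COUNT consecutive inputs from X0.  */
void
hash3_check (uint32_t x0, uint32_t count)
{
  for (uint32_t i = 0; i < count; i++)
    assert (hash3_iterated (x0 + i) == hash3_folded (x0 + i));
}
```

The folding is valid only because unsigned arithmetic is defined to wrap;
with signed operands the intermediate products could overflow, which is
undefined behavior in C.  The additive constant depends on how the -13 terms
are folded in the PR's actual function, which is why the sketch computes its
own.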

Richard.

> Bootstrapped and tested on x86_64, but I'll leave some time for
> comment before committing any of this.
> 
> 
> r~
> 
> 
> Richard Henderson (3):
>   Add rtx costs for sse integer ops
>   Use synth_mult for vector multiplies vs scalar constant
>   Handle const_vector in mulv4si3 for pre-sse4.1.
> 
>  gcc/config/i386/i386-protos.h |    1 +
>  gcc/config/i386/i386.c        |  126 +++-
>  gcc/config/i386/predicates.md |    7 +
>  gcc/config/i386/sse.md        |   72 ++--
>  gcc/expmed.c                  |  438 +++--
>  gcc/machmode.h                |    8 +-
>  6 files changed, 386 insertions(+), 266 deletions(-)
> 
> 

-- 
Richard Guenther 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend


[Patch, Fortran, OOP] PR 53328: Ambiguous check for type-bound GENERIC shall ignore PASSed arguments

2012-06-17 Thread Janus Weil
Hi all,

here is a patch which concerns the ambiguity checking of generic TBPs,
where F03 has rules similar to F95, with the difference that PASS
arguments should basically be skipped in these tests. The patch
implements this by passing the PASS arguments to
'gfc_compare_interfaces' and modifying the helper functions
'count_types_test' and 'generic_correspondence'.

Patch is regtested on x86_64-unknown-linux-gnu. I also checked that it
gives correct behavior for the extended tests posted by Salvatore at:
http://gcc.gnu.org/ml/fortran/2012-05/msg00060.html

Ok for trunk?

Btw, we have a couple more PRs regarding generic TBPs, and I hope to
find the time to tackle some of these soon.

Cheers,
Janus



2012-06-17  Janus Weil  

PR fortran/53328
* interface.c (count_types_test,generic_correspondence,
gfc_compare_interfaces): Ignore PASS arguments.
(check_interface1,compare_parameter): Pass NULL arguments to
gfc_compare_interfaces.
* gfortran.h (gfc_compare_interfaces): Modified Prototype.
* expr.c (gfc_check_pointer_assign): Pass NULL arguments to
gfc_compare_interfaces.
* resolve.c (resolve_structure_cons): Ditto.
(check_generic_tbp_ambiguity): Determine PASS arguments and pass them
to gfc_compare_interfaces.


2012-06-17  Janus Weil  

PR fortran/53328
* gfortran.dg/typebound_generic_12.f03: New.


pr53328.diff
Description: Binary data


typebound_generic_12.f03
Description: Binary data


Re: [patch] Fix PR middle-end/53590

2012-06-17 Thread Richard Guenther
On Fri, Jun 15, 2012 at 6:48 PM, Eric Botcazou  wrote:
>> Btw, I think we should enable this flag by default for all languages but
>> Java so that if you enable -fnon-call-exceptions for C or C++ you don't get
>> too many spurious exceptions from dead code.
>
> The attached patch enables it for the C family of languages (I'm not too sure
> about the other languages).  It also adds the missing bits related to inlining
> (with the annoying FIXME for LTO in can_inline_edge_p).
>
> Bootstrapped/regtested on x86_64-suse-linux, OK for mainline?

Ok.

Thanks,
Richard.

>
> 2012-06-15  Eric Botcazou  
>
>        PR middle-end/53590
>        * doc/invoke.texi (-fdelete-dead-exceptions): Update.
>        * cif-code.def (DEAD_EXCEPTIONS): New code.
>        * ipa-inline.c (can_inline_edge_p): Return false if the caller can
>        delete dead exceptions but the callee cannot.
>        * tree-inline.c (initialize_cfun): Copy can_delete_dead_exceptions.
> c-family/
>        * c-opts.c (c_common_init_options_struct): Set
>        opts->x_flag_delete_dead_exceptions to 1.
>
>
> --
> Eric Botcazou


Re: [PATCH] Fix PR tree-optimization/53636 (SLP generates invalid misaligned access)

2012-06-17 Thread Richard Guenther
On Fri, Jun 15, 2012 at 5:00 PM, Ulrich Weigand  wrote:
> Richard Guenther wrote:
>> On Fri, Jun 15, 2012 at 3:13 PM, Ulrich Weigand  wrote:
>> > However, there is a second case where we need to check every pass: if
>> > we're not actually vectorizing any loop, but are performing basic-block
>> > SLP.  In this case, it would appear that we need the same check as
>> > described in the comment above, i.e. to verify that the stride is a
>> > multiple of the vector size.
>> >
>> > The patch below adds this check, and this indeed fixes the invalid access
>> > I was seeing in the test case (in the final assembler, we now get a
>> > vld1.16 instead of vldr).
>> >
>> > Tested on arm-linux-gnueabi with no regressions.
>> >
>> > OK for mainline?
>>
>> Ok.
>
> Thanks for the quick review; I've checked this in to mainline now.
>
> I just noticed that the test case also crashes on 4.7, but not on 4.6.
>
> Would a backport to 4.7 also be OK, once testing passes?

Yes.  Please leave it on mainline a few days to catch fallout from
autotesters.

Thanks,
Richard.

> Thanks,
> Ulrich
>
> --
>  Dr. Ulrich Weigand
>  GNU Toolchain for Linux on System z and Cell BE
>  ulrich.weig...@de.ibm.com
>


[PATCH, testsuite]: Fix scan-tree-dump-times argument order in gcc.dg/tree-ssa/vrp68.c.

2012-06-17 Thread Uros Bizjak
Hello!

The testcase still fails on x86_64-pc-linux-gnu with:

FAIL: gcc.dg/tree-ssa/vrp68.c scan-tree-dump-times vrp1 "link_error" 1

since there are two calls to link_error.

2012-06-17  Uros Bizjak  

* gcc.dg/tree-ssa/vrp68.c: Fix scan-tree-dump-times argument order.

Committed to mainline SVN.

Uros.
Index: gcc.dg/tree-ssa/vrp68.c
===
--- gcc.dg/tree-ssa/vrp68.c (revision 188702)
+++ gcc.dg/tree-ssa/vrp68.c (working copy)
@@ -19,6 +19,6 @@
merging [1, 5] with ~[0, 6] so the first VRP pass can only eliminate
the ~[0, 0] check as redundant.  */
 
-/* { dg-final { scan-tree-dump-times "vrp1" 0 "link_error" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times "vrp1" 1 "link_error" } } */
+/* { dg-final { scan-tree-dump-times "link_error" 0 "vrp1" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-times "link_error" 1 "vrp1" } } */
 /* { dg-final { cleanup-tree-dump "vrp1" } } */