date:20140605

Re: [fortran, patch] IEEE intrinsic modules

2014-06-05 Thread Uros Bizjak

Hello!

> +int
>  get_fpu_except_flags (void)
>  {
>unsigned short cw;
>int excepts;
>int result = 0;
>
> -  __asm__ __volatile__ ("fnstsw\t%0" : "=a" (cw));
> +  __asm__ __volatile__ ("fnstsw\t%0" : "=m" (cw));
>excepts = cw;
>
>if (has_sse())

You can use "=am" constraint here, and the compiler will be free to
choose the most appropriate form.

Also, you should use __asm__ __volatile__ consistently in the headers.

Uros.
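
A minimal sketch of the "=am" suggestion above, for illustration only. It is
not taken from the posted patch: the helper name read_x87_status_word is
hypothetical, and the non-x86 fallback that returns 0 is an assumption made so
the snippet compiles everywhere. With "=am" the compiler may pick either the
%ax form or the 16-bit memory form of fnstsw.

```c
/* Hypothetical helper (not from the patch): read the x87 status word.
   The "=am" constraint lets the compiler choose between the register
   form (fnstsw %ax) and the memory form (fnstsw m16).  */
static unsigned short
read_x87_status_word (void)
{
#if defined(__i386__) || defined(__x86_64__)
  unsigned short cw;

  __asm__ __volatile__ ("fnstsw\t%0" : "=am" (cw));
  return cw;
#else
  /* Assumption for portability of the sketch: no x87 unit, report 0.  */
  return 0;
#endif
}
```

The low six bits of the returned value are the x87 exception flags, which is
what get_fpu_except_flags goes on to inspect.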

Re: [PATCH] Add support for OpenMP fortran user defined reductions

2014-06-05 Thread Jakub Jelinek

On Tue, Jun 03, 2014 at 09:42:39AM +0200, FX wrote:
> > As discussed with Tobias on IRC yesterday, the fact that I'd like to
> > eventually backport the Fortran OpenMP 4.0 support to 4.9 branch
> > poses a problem with the module.c changes
> 
> But this is by design, because we’re not supposed to add new features
> (especially API-changing or module-changing ones) in a release branch. 
> The compatibility fixes you propose will increase the complexity of the
> module reader code, and creates some precedent.
> 
> I don’t think there’s much pressure from the “general public” for Fortran
> OpenMP 4.0, so having it in 4.10 only rather than 4.9 is probably not such
> a big deal.

We have some precedents already, e.g. 4.8.1 adding remaining C++11 bits to
complete C++11 support, instead of waiting for 4.9.0, and IMHO this is
similar, OpenMP 4.0 shipped only incomplete support in 4.9.0 (everything but
Fortran), and shouldn't be that risky to backport.

For the conditional MOD_VERSION I meant something like (incremental patch),
MOD_VERSION_OMP4 content is open for bikeshedding, can be "13" etc.
If it is "13", then perhaps such a patch could be applied only to 4.9.x,
but we'd need to make sure that if any new extensions to module.c are added
on the trunk, then MOD_VERSION is bumped there to "14".

--- gcc/fortran/module.c.jj 2014-06-02 16:23:21.0 +0200
+++ gcc/fortran/module.c	2014-06-05 09:28:39.616639930 +0200
@@ -82,7 +82,8 @@ along with GCC; see the file COPYING3.
 
 /* Don't put any single quote (') in MOD_VERSION, if you want it to be
recognized.  */
-#define MOD_VERSION "13"
+#define MOD_VERSION "12"
+#define MOD_VERSION_OMP4 "12 OpenMP 4"
 
 
 /* Structure that describes a position within a module file.  */
@@ -196,6 +197,7 @@ static char* module_content;
 static long module_pos;
 static int module_line, module_column, only_flag;
 static int prev_module_line, prev_module_column;
+static bool module_omp4;
 
 static enum
 { IO_INPUT, IO_OUTPUT }
@@ -4064,7 +4066,13 @@ mio_symbol (gfc_symbol *sym)
   if (sym->formal_ns
   && sym->formal_ns->proc_name == sym
   && sym->formal_ns->entries == NULL)
-mio_omp_declare_simd (sym->formal_ns, &sym->formal_ns->omp_declare_simd);
+{
+  if (module_omp4)
+   mio_omp_declare_simd (sym->formal_ns,
+ &sym->formal_ns->omp_declare_simd);
+  else if (iomode == IO_OUTPUT)
+   gcc_assert (sym->formal_ns->omp_declare_simd == NULL);
+}
 
   mio_rparen ();
 }
@@ -4871,7 +4879,8 @@ read_module (void)
 
   /* Skip OpenMP UDRs.  */
   get_module_locus (&omp_udrs);
-  skip_list ();
+  if (module_omp4)
+skip_list ();
 
   mio_lparen ();
 
@@ -5138,9 +5147,12 @@ read_module (void)
   load_commons ();
   load_equiv ();
 
-  /* Load OpenMP user defined reductions.  */
-  set_module_locus (&omp_udrs);
-  load_omp_udrs ();
+  if (module_omp4)
+{
+  /* Load OpenMP user defined reductions.  */
+  set_module_locus (&omp_udrs);
+  load_omp_udrs ();
+}
 
   /* At this point, we read those symbols that are needed but haven't
  been loaded yet.  If one symbol requires another, the other gets
@@ -5842,11 +5854,16 @@ write_module (void)
   write_char ('\n');
   write_char ('\n');
 
-  mio_lparen ();
-  write_omp_udrs (gfc_current_ns->omp_udr_root);
-  mio_rparen ();
-  write_char ('\n');
-  write_char ('\n');
+  if (module_omp4)
+{
+  mio_lparen ();
+  write_omp_udrs (gfc_current_ns->omp_udr_root);
+  mio_rparen ();
+  write_char ('\n');
+  write_char ('\n');
+}
+  else
+gcc_assert (gfc_current_ns->omp_udr_root == NULL);
 
   /* Write symbol information.  First we traverse all symbols in the
  primary namespace, writing those that need to be written.
@@ -5916,6 +5933,19 @@ read_crc32_from_module_file (const char*
 }
 
 
+/* Set module_omp4 if any symbol has !$OMP DECLARE SIMD directives.  */
+
+static void
+find_omp_declare_simd (gfc_symtree *st)
+{
+  gfc_symbol *sym = st->n.sym;
+  if (sym->formal_ns
+  && sym->formal_ns->proc_name == sym
+  && sym->formal_ns->omp_declare_simd)
+module_omp4 = true;
+}
+
+
 /* Given module, dump it to disk.  If there was an error while
processing the module, dump_flag will be set to zero and we delete
the module file, even if it was already there.  */
@@ -5958,6 +5988,12 @@ gfc_dump_module (const char *name, int d
   if (gfc_cpp_makedep ())
 gfc_cpp_add_target (filename);
 
+  module_omp4 = false;
+  if (gfc_current_ns->omp_udr_root)
+module_omp4 = true;
+  else
+gfc_traverse_symtree (gfc_current_ns->sym_root, find_omp_declare_simd);
+
   /* Write the module to the temporary file.  */
   module_fp = gzopen (filename_tmp, "w");
   if (module_fp == NULL)
@@ -5965,7 +6001,7 @@ gfc_dump_module (const char *name, int d
 filename_tmp, xstrerror (errno));
 
   gzprintf (module_fp, "GFORTRAN module version '%s' created from %s\n",
-   MOD_VERSION, gfc_source_file);
+

Re: [PATCH, PR 61391]

2014-06-05 Thread Richard Biener

On Wed, Jun 4, 2014 at 5:23 PM, Yuri Rumyantsev  wrote:
> Sorry, I sent you 'bad' patch, resend it.

Ok.

Thanks,
Richard.

> 2014-06-04 19:19 GMT+04:00 Yuri Rumyantsev :
>> I converted test-case to Unix format and new patch is attached.
>>
>> 2014-06-04 19:14 GMT+04:00 Jakub Jelinek :
>>> On Wed, Jun 04, 2014 at 07:11:26PM +0400, Yuri Rumyantsev wrote:
>>>> Here is update patch with test-case and new ChangeLog.
>>>
>>> If approved, please avoid the DOS style line endings in the testcase.
>>>
>>> Jakub

Re: [fortran, patch] IEEE intrinsic modules

2014-06-05 Thread Rainer Orth

Janne Blomqvist  writes:

> On Thu, Jun 5, 2014 at 1:04 AM, FX  wrote:
>>   2. Your review of the patch!
>
> Not a full review, just a few quick comments.
>
> - Wrt. libgfortran/gfortran.map: You have added the GFORTRAN_1.6
> symbol node, as you're the first one to export new symbols in the 4.10
> cycle. I've seen occasional confusion from users when they have symbol
> version mismatches and e.g. "1.4" doesn't match any version they've
> seen before. So I think it might be better to switch to a scheme where
> the symbol node name matches the compiler version, i.e. GFORTRAN_4.10.

Except libgcc_s.so.1, none of the other GCC runtime libraries does
that.  Changing schemes in the middle is going to be even more confusing
than staying with what we have here.  The only other reasonable scheme
is what libgomp.so.1 does, namely naming the versions after the OpenMP
standard they implement.

Rainer

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [patch, mips, tree] align microMIPS functions to 16 bits with -Os

2014-06-05 Thread Richard Biener

On Wed, 4 Jun 2014, Sandra Loosemore wrote:

> On 06/04/2014 06:20 AM, Richard Biener wrote:
> > On Tue, 3 Jun 2014, Richard Sandiford wrote:
> > 
> > > Richi suggested just changing the alignment at output time.  I assume
> > > that would be a case of replacing the DECL_ALIGN in:
> > > 
> > >/* Tell assembler to move to target machine's alignment for functions.
> > > */
> > >align = floor_log2 (DECL_ALIGN (decl) / BITS_PER_UNIT);
> > >if (align > 0)
> > >  {
> > >ASM_OUTPUT_ALIGN (asm_out_file, align);
> > >  }
> > > 
> > > with a hook.  (Is that right?)
> > 
> > Yeah, kind of.  Of course if DECL_ALIGN on function-decls is "unused"
> > then we may as well initialize it to 1 in tree.c and at an appropriate
> > stage adjust it to the result of a target hook invocation.
> > 
> > Appropriate stage would be the above place (which means DECL_ALIGN
> > is essentially "unused" for FUNCTION_DECLs).
> > 
> > So ... can you massage the DECL_ALIGN macro to ICE on FUNCTION_DECLs
> > and see where we access it?  (generic code will possibly trip on it,
> > but the question is is there any user that cares?)
> 
> Well, offhand, I know that one of the places is in handle_aligned_attribute,
> in c-common/c-common.c, where we handle user-specified alignment attributes.
> Here we are potentially both reading and writing the DECL_ALIGN field of a
> FUNCTION_DECL.

Ok, we definitely need to preserve that (documented) behavior.  I suppose
it also sets DECL_USER_ALIGN.  -falign-functions is probably another
setter of DECL_ALIGN here.

If we add a target hook that may adjust function alignment then it
has to honor any user set alignment, then -falign-functions and
then it may only increase alignment over the default FUNCTION_BOUNDARY.

The point to adjust alignment with the hook may still be output time,
but as we figured it can't simply ignore DECL_ALIGN.

Richard.
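
One possible reading of the precedence described above, sketched as plain C.
This is an illustration under stated assumptions, not GCC code: the function
name and all parameters are hypothetical stand-ins for DECL_ALIGN,
DECL_USER_ALIGN, -falign-functions, FUNCTION_BOUNDARY and the proposed target
hook, with every alignment expressed in bits.

```c
#include <stdbool.h>

/* Hypothetical model of the rule: a user-set alignment wins, then
   -falign-functions, and otherwise the hook may only raise the
   alignment above the default FUNCTION_BOUNDARY.  */
static unsigned
effective_function_align (unsigned decl_align, bool user_align,
			  unsigned falign_functions,
			  unsigned function_boundary,
			  unsigned hook_align)
{
  /* 1. An explicit aligned attribute is documented behavior: keep it.  */
  if (user_align)
    return decl_align;

  /* 2. -falign-functions (0 here means "not given") comes next.  */
  if (falign_functions != 0)
    return falign_functions;

  /* 3. The hook may increase, never decrease, the default.  */
  return hook_align > function_boundary ? hook_align : function_boundary;
}
```

For example, effective_function_align (8, false, 0, 32, 16) yields 32 because
the hook's request is below the default, while a hook value of 64 would be
honored.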

Re: Fix address space computation in expand_debug_expr

2014-06-05 Thread Richard Biener

On Wed, Jun 4, 2014 at 10:06 PM, Senthil Kumar Selvaraj
 wrote:
> For the AVR target, assertions in convert_debug_memory_address cause a
> couple of ICEs (see PR 52472). Jakub suggested returning a NULL rtx,
> which works, but on debugging further, I found that expand_debug_expr
> appears to incorrectly compute the address space for ADDR_EXPR and
> MEM_REFs.
>
> For ADDR_EXPR, TREE_TYPE(exp) is a POINTER_TYPE (always?), but in
> the generic address space, even if the object whose address is taken
> is in a different address space. expand_debug_expr takes
> TYPE_ADDR_SPACE(TREE_TYPE(exp)) and therefore computes the address
> space as generic. convert_debug_memory_address then asserts that the
> mode is a valid pointer mode in the address space and fails.
>
> Similarly, for MEM_REFs, TREE_TYPE(exp) is the type of the
> dereferenced value, and therefore checking if it is a POINTER_TYPE
> doesn't help for a single pointer dereference. The address space
> gets computed as generic even if the pointer points to a different
> address space. The address mode for the generic address space is
> passed to convert_debug_memory_address, and the assertion that that mode
> must match the mode of the rtx then fails.
>
> The below patch attempts to fix this by picking the right TREE_TYPE to
> pass to TYPE_ADDR_SPACE for MEM_REF (use type of arg 0) and
> ADDR_EXPR (check for pointer type and look at nested addr space).
>
> Does this look reasonable or did I get it all wrong?
>
> Regards
> Senthil
>
> diff --git gcc/cfgexpand.c gcc/cfgexpand.c
> index 8b0e466..ca78953 100644
> --- gcc/cfgexpand.c
> +++ gcc/cfgexpand.c
> @@ -3941,8 +3941,8 @@ expand_debug_expr (tree exp)
>   op0 = plus_constant (inner_mode, op0, INTVAL (op1));
> }
>
> -  if (POINTER_TYPE_P (TREE_TYPE (exp)))
> -   as = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (exp)));
> +  if (POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (exp, 0))))
> +   as = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (exp, 0))));
>else
> as = ADDR_SPACE_GENERIC;

TREE_OPERAND (exp, 0) always has pointer type so I'd change this to

as = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (exp, 0))));

> @@ -4467,7 +4467,11 @@ expand_debug_expr (tree exp)
>   return NULL;
> }
>
> -  as = TYPE_ADDR_SPACE (TREE_TYPE (exp));
> +  if (POINTER_TYPE_P (TREE_TYPE (exp)))
> +as = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (exp)));
> +  else
> +as = TYPE_ADDR_SPACE (TREE_TYPE (exp));
> +

Likewise - TREE_TYPE (exp) is always a pointer type.  Otherwise the
patch looks ok to me.

Richard.

>op0 = convert_debug_memory_address (mode, XEXP (op0, 0), as);
>
>return op0;

Re: [4.9, PR 61393] Disable IPA-CP of transactional memory clones

2014-06-05 Thread Richard Biener

On Wed, Jun 4, 2014 at 11:28 PM, Martin Jambor  wrote:
> Hi,
>
> this patch does the same thing (disables IPA-CP for nodes marked as
> tm_clone) for the same reason as described in the previous mail but
> for the 4.9 branch.
>
> I've confirmed it fixes the PR failure and passes bootstrap and
> testing on x86_64-linux.  OK for 4.9?

Ok for both the 4.8 and 4.9 patches.

We indeed should find a proper solution for trunk.

Richard.

> Thanks,
>
> Martin
>
>
> 2014-06-04  Martin Jambor  
>
> PR ipa/61393
> * ipa-cp.c (determine_versionability): Pretend that tm_clones are
> not versionable.
>
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 7fb7ae6..93b60d6 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -433,6 +433,8 @@ determine_versionability (struct cgraph_node *node)
>else if (!opt_for_fn (node->decl, optimize)
>|| !opt_for_fn (node->decl, flag_ipa_cp))
>  reason = "non-optimized function";
> +  else if (node->tm_clone)
> +reason = "transactional memory clone";
>else if (lookup_attribute ("omp declare simd", DECL_ATTRIBUTES (node->decl)))
>  {
>/* Ideally we should clone the SIMD clones themselves and create

[Patch ARM] Fix bootstrap issue with thumb state + neon.

2014-06-05 Thread Ramana Radhakrishnan


Hi,

In certain configurations of our auto-testers bootstrap fails because 
the enabled attribute depends on "opt_enabled" which is driven by state 
that is not constant for a compilation unit. We probably need a hot/ 
cold alternative which may be useful per insn.


For now, work around this till we know if we really need this attribute 
on these patterns and what the impact in reality is.


Bootstrapped, tested and applied to trunk.


Ramana

 * config/arm/arm.md (enabled): Disable opt_enabled.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 2efa59f..484b0c0 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,7 @@
+2014-06-03  Ramana Radhakrishnan  
+
+   * config/arm/arm.md (enabled): Remove opt_enabled.
+
 2014-06-02  Ramana Radhakrishnan  
 
PR target/61154
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index bec889a..f58a79b 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -201,6 +201,17 @@
 (const_string "no")))
 
 ; Enable all alternatives that are both arch_enabled and insn_enabled.
+; FIXME:: opt_enabled has been temporarily removed till the time we have
+; an attribute that allows the use of such alternatives.
+; This depends on caching of speed_p, size_p on a per
+; alternative basis. The problem is that the enabled attribute
+; cannot depend on any state that is not cached or is not constant
+; for a compilation unit. We probably need a generic "hot/cold"
+; alternative which if implemented can help with this. We disable this
+; until such a time as this is implemented and / or the improvements or
+; regressions with removing this attribute are double checked.
+; See ashldi3_neon and di3_neon in neon.md.
+
  (define_attr "enabled" "no,yes"
(cond [(and (eq_attr "predicable_short_it" "no")
   (and (eq_attr "predicated" "yes")
@@ -216,9 +227,6 @@
  (const_string "no")
 
  (eq_attr "arch_enabled" "no")
- (const_string "no")
-
- (eq_attr "opt_enabled" "no")
  (const_string "no")]
 (const_string "yes")))

Re: C++ PATCH for c++/61382 (init-list evaluation order)

2014-06-05 Thread Andreas Schwab

* g++.dg/cpp0x/initlist86.C (main): Initialize i.

diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist86.C b/gcc/testsuite/g++.dg/cpp0x/initlist86.C
index 16af476..ace2ef9 100644
--- a/gcc/testsuite/g++.dg/cpp0x/initlist86.C
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist86.C
@@ -11,7 +11,7 @@ extern "C" int printf (const char *, ...);
 
 int main()
 {
-  int i;
+  int i = 0;
   A a{i++,i++};
   if (a.i != 0 || a.j != 1)
 __builtin_abort();
-- 
2.0.0

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

Re: [RFC][AArch64] Define TARGET_SPILL_CLASS

2014-06-05 Thread Ramana Radhakrishnan

>
> Thanks Richard for the comments. My primary intention here is to use
> TARGET_SPILL_CLASS to make FP_REGS as spill registers.

> Do you think
> AArch64 can benefit from TARGET_SPILL_CLASS hook. I agree that just
> increasing GP2FP and FP2GP for all the modes as I am doing is not the
> right thing to do.
>

I suspect TARGET_SPILL_CLASS again needs to be a per-core decision,
the cost of moving between FP and Integer registers really depends on
the implementation and having this spill all the time to FP register
may not be good enough. So a default definition of TARGET_SPILL_CLASS
doesn't sound right to me prima facie.

I don't think increasing GP2FP and FP2GP costs is a bad thing. In a
number of benchmarks we've seen increased moves between FP and integer
registers and having this fix appears to help some of them. However
moving this to generic model needs more benchmarking across a variety
of cores before it can safely be applied there.

regards
Ramana


> Thanks,
> Kugan

[PATCH][match-and-simplify] Fix testsuite fallout

2014-06-05 Thread Richard Biener


This fixes most of the testsuite fallout of earlier patches.

Committed.

Richard.

2014-06-05  Richard Biener  

* gimple-match-head.c (gimple_resimplify2): Verify constant
folded result is really constant.
(gimple_match_and_simplify): Likewise.  Bail out for internal
functions.
* match.pd: Match what fold-const.c does for folding of
division and modulo with regard to possibly zero 2nd operand.
Add comment to bogus pow (x, 0.5) to sqrt (x) simplification.

Index: gcc/gimple-match-head.c
===
--- gcc/gimple-match-head.c (revision 211234)
+++ gcc/gimple-match-head.c (working copy)
@@ -173,7 +173,8 @@ gimple_resimplify2 (gimple_seq *seq,
}
}
}
-  if (tem != NULL_TREE)
+  if (tem != NULL_TREE
+ && CONSTANT_CLASS_P (tem))
{
  res_ops[0] = tem;
  res_ops[1] = NULL_TREE;
@@ -334,7 +335,8 @@ gimple_match_and_simplify (enum tree_cod
   if (constant_for_folding (op0))
 {
   tree res = fold_unary_to_constant (code, type, op0);
-  if (res != NULL_TREE)
+  if (res != NULL_TREE
+ && CONSTANT_CLASS_P (res))
return res;
 }
 
@@ -354,9 +356,8 @@ gimple_match_and_simplify (enum tree_cod
   if (constant_for_folding (op0) && constant_for_folding (op1))
 {
   tree res = fold_binary_to_constant (code, type, op0, op1);
-  /* ???  We can't assert that we fold this to a constant as
-for example we can't fold things like 1 / 0.  */
-  if (res != NULL_TREE)
+  if (res != NULL_TREE
+ && CONSTANT_CLASS_P (res))
return res;
 }
 
@@ -566,6 +567,9 @@ gimple_match_and_simplify (gimple stmt,
   && gimple_call_lhs (stmt) != NULL_TREE)
 {
   tree fn = gimple_call_fn (stmt);
+  /* ???  Internal function support missing.  */
+  if (!fn)
+   return false;
   if (TREE_CODE (fn) == SSA_NAME
  && valueize)
fn = valueize (fn);
Index: gcc/match.pd
===
--- gcc/match.pd (revision 211221)
+++ gcc/match.pd (working copy)
@@ -40,29 +40,21 @@ along with GCC; see the file COPYING3.
 (match_and_simplify
   (mult @0 integer_onep)
   @0)
+/* Make sure to preserve divisions by zero.  This is the reason why
+   we don't simplify x / x to 1 or 0 / x to 0.  */
 (match_and_simplify
   (trunc_div @0 integer_onep)
   @0)
-/* It's hard to preserve non-folding of / 0 which is done by a
-   positional check in fold-const.c (to preserve warnings).  The
-   issue here is that we fold too early in frontends.
-   Also fold happilt folds 0 / x to 0 (even if x turns out to be zero later).  */
-(match_and_simplify
-  (trunc_div integer_zerop@0 @1)
-  @0)
-(match_and_simplify
-  (trunc_div @0 @0)
-  { build_one_cst (type); })
 (match_and_simplify
   (trunc_mod @0 integer_onep)
   { build_zero_cst (type); })
+/* Same applies to modulo operations, but fold is inconsistent here
+   and simplifies 0 % x to 0.  */
 (match_and_simplify
   (trunc_mod integer_zerop@0 @1)
+  if (!integer_zerop (@1))
   @0)
 (match_and_simplify
-  (trunc_mod @0 @0)
-  { build_zero_cst (type); })
-(match_and_simplify
   (bit_ior @0 integer_zerop)
   @0)
 (match_and_simplify
@@ -280,6 +272,8 @@ to (minus @1 @0)
   (BUILT_IN_POW @0 (PLUS_EXPR @1 { build_one_cst (TREE_TYPE (@1)); })))
 (match_and_simplify
   (BUILT_IN_POW @0 REAL_CST_P@1)
+  /* This needs to be conditionalized on flag_unsafe_math_optimizations,
+ but we keep it for now to exercise function re-optimization.  */
   if (REAL_VALUES_EQUAL (TREE_REAL_CST (@1), dconsthalf))
   (BUILT_IN_SQRT @0))

[C PATCH] More locus tweaks (PR c/56724)

2014-06-05 Thread Marek Polacek

My previous patch for 56724 didn't handle all cases of this.
So let's fix this now by using expr_loc for ic_argpass.

Tested x86_64-unknown-linux-gnu, applying to trunk.

2014-06-05  Marek Polacek  

PR c/56724
* c-typeck.c (convert_for_assignment): Use expr_loc for ic_argpass.

* gcc.dg/pr56724-3.c: New test.

diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
index e0d3fde..f09f39e 100644
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -6025,8 +6025,9 @@ convert_for_assignment (location_t location, location_t expr_loc, tree type,
 
 where NULL is typically defined in C to be '(void *) 0'.  */
   if (VOID_TYPE_P (ttr) && rhs != null_pointer_node && !VOID_TYPE_P (ttl))
-   warning_at (location, OPT_Wc___compat,
-   "request for implicit conversion "
+   warning_at (errtype == ic_argpass ? expr_loc : location,
+   OPT_Wc___compat,
+   "request for implicit conversion "
"from %qT to %qT not permitted in C++", rhstype, type);
 
   /* See if the pointers point to incompatible address spaces.  */
@@ -6038,7 +6039,7 @@ convert_for_assignment (location_t location, location_t expr_loc, tree type,
  switch (errtype)
{
case ic_argpass:
- error_at (location, "passing argument %d of %qE from pointer to "
+ error_at (expr_loc, "passing argument %d of %qE from pointer to "
"non-enclosed address space", parmnum, rname);
  break;
case ic_assign:
@@ -6067,7 +6068,7 @@ convert_for_assignment (location_t location, location_t expr_loc, tree type,
  switch (errtype)
  {
  case ic_argpass:
-   warning_at (location, OPT_Wsuggest_attribute_format,
+   warning_at (expr_loc, OPT_Wsuggest_attribute_format,
"argument %d of %qE might be "
"a candidate for a format attribute",
parmnum, rname);
@@ -6246,9 +6247,10 @@ convert_for_assignment (location_t location, location_t expr_loc, tree type,
   switch (errtype)
 {
 case ic_argpass:
-  error_at (location, "incompatible type for argument %d of %qE", parmnum, rname);
+  error_at (expr_loc, "incompatible type for argument %d of %qE", parmnum,
+   rname);
   inform ((fundecl && !DECL_IS_BUILTIN (fundecl))
- ? DECL_SOURCE_LOCATION (fundecl) : input_location,
+ ? DECL_SOURCE_LOCATION (fundecl) : expr_loc,
  "expected %qT but argument is of type %qT", type, rhstype);
   break;
 case ic_assign:
@@ -6257,12 +6259,12 @@ convert_for_assignment (location_t location, location_t expr_loc, tree type,
   break;
 case ic_init:
   error_at (location,
-   "incompatible types when initializing type %qT using type %qT",
+   "incompatible types when initializing type %qT using type %qT",
type, rhstype);
   break;
 case ic_return:
   error_at (location,
-   "incompatible types when returning type %qT but %qT was "
+   "incompatible types when returning type %qT but %qT was "
"expected", rhstype, type);
   break;
 default:
diff --git gcc/testsuite/gcc.dg/pr56724-3.c gcc/testsuite/gcc.dg/pr56724-3.c
index e69de29..192d719 100644
--- gcc/testsuite/gcc.dg/pr56724-3.c
+++ gcc/testsuite/gcc.dg/pr56724-3.c
@@ -0,0 +1,14 @@
+/* PR c/56724 */
+/* { dg-do compile } */
+/* { dg-options "-Wc++-compat" } */
+
+extern void xfer (int, int, unsigned char *);
+struct T { int a; } t;
+
+void
+call (int x, int y, void *arg)
+{
+  unsigned char *uc = arg; /* { dg-warning "23:request for implicit conversion" } */
+  xfer (x, y, arg); /* { dg-warning "15:request for implicit conversion" } */
+  xfer (x, y, t); /* { dg-error "15:incompatible type for" } */
+}

Marek

Re: [fortran, patch] IEEE intrinsic modules

2014-06-05 Thread FX

Hi Janne,

Thanks for the quick feedback.

> - Wrt. libgfortran/gfortran.map: You have added the GFORTRAN_1.6
> symbol node, as you're the first one to export new symbols in the 4.10
> cycle. I've seen occasional confusion from users when they have symbol
> version mismatches and e.g. "1.4" doesn't match any version they've
> seen before. So I think it might be better to switch to a scheme where
> the symbol node name matches the compiler version, i.e. GFORTRAN_4.10.

I don’t have an opinion on that, I’ll follow what you settle on.


> - Many of the intrinsics are just thin wrappers around
> __builtin_foo(). Couldn't these be generated inline instead?

They could, but… having them in the library allows us to rely on its mechanism
for feature detection, providing IEEE modules and functions only when we
actually support them. I’m open to moving some of the IEEE handling towards the
front-end, but we need to think clearly about that…

FX

Re: [fortran, patch] IEEE intrinsic modules

2014-06-05 Thread Uros Bizjak

Hello!

> 0. Gradual underflow control is implemented as "not supported by the
> processor" (its SUPPORT function returns false, and the GET and SET
> procedures abort if you call them). That’s explicitly allowed by the
> standard, so it’s not actually “missing”. We can improve on this in the
> future, if people can help.

Please look at libgcc/config/i386/crtfastmath.c for how to set
MXCSR_FTZ from mxcsr. You already have all necessary bits in place,
the function is basically only:

+  if (has_sse())
+  {
+unsigned int cw_sse;
+
+__asm__ __volatile__ ("%vstmxcsr\t%0" : "=m" (cw_sse));
+cw_sse |= MXCSR_DAZ;
+__asm__ __volatile__ ("%vldmxcsr\t%0" : : "m" (cw_sse));
+  }

Please note, that FTZ applies only to SSE math. x87 and (IIRC) soft-FP
don't handle this setting.

Uros.

Re: [Patch] PR55189 enable -Wreturn-type by default

2014-06-05 Thread Sylvestre Ledru

On 05/06/2014 01:31, Joseph S. Myers wrote:
> On Wed, 4 Jun 2014, Sylvestre Ledru wrote:
> 
>> Hello,
>>
>> Finally, I have been able to update all tests with -Wreturn-type enabled
>> by default. AFAIK, under GNU/Linux Debian Jessie 64 bits, there is no
>> PASS->FAIL tests.
>>
>> Now, I would like to know if I can commit that into the repository. Who
>> can review that?
>>
>> As attachment, you will find the actual (tiny) patch.
>>
>> I split the tests update by languages. As they are big ( 1260 files
>> changed, 1638 insertions(+), 903 deletions(-) ), I uploaded the patches
>> on my server:
> 
> Some of those patches appear to be addressing cases where control appears 
> to reach the end of a function returning non-void, as opposed to cases 
> where the return type defaults to int. 
Do you have an example of the patches you are talking about?

> As I said in
> , I don't think that 
> warning is appropriate to enable by default as it catches perfectly valid 
> C90 / C99 code that avoids using extensions to annotate noreturn 
> functions.
I can try to implement that but I don't know where to start. Any clue?

> (I *do* think it's appropriate to enable by default the warning about 
> return type defaulting to int - more generally, to enable -Wimplicit-int 
> -Wimplicit-function-declaration - and the -Wreturn-type warning about a 
> return statement without a value in a function returning non-void also 
> seems appropriate to enable by default.  
I can try to enable them too by default. It seems my patches are
covering most of the tests updates.

> Warning about the absence of any
> return statement in a function returning non-void is probably also a 
> reasonable default warning from the -Wreturn-type set; it's specifically 
> the flow-based warnings that can give false positives in the absence of 
> noreturn annotations that I'm dubious about enabling by default.)

You are talking about code like this one (from Jonathan Wakely) ?

int f(int c)
{
if (c)
   return 0;
function_that_never_returns();
}

Initially, I implemented -Wmissing-return to manage this case (
https://gcc.gnu.org/ml/gcc-patches/2014-01/msg00820.html ) but Jason
suggested to remove that:
https://gcc.gnu.org/ml/gcc-patches/2014-01/msg01033.html
(I don't have a strong opinion on the subject).

Sylvestre
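
For illustration, a sketch of how the example above stops triggering the
flow-based warning once the callee is annotated. It assumes the GNU
__attribute__ ((noreturn)) extension, which is exactly the kind of annotation
that strictly conforming C90/C99 code cannot use; the body of
function_that_never_returns is made up for the sketch.

```c
#include <stdlib.h>

/* Assumption: the callee is declared with the GNU noreturn attribute,
   so the compiler knows control cannot fall off the end of f.  */
__attribute__ ((noreturn)) static void
function_that_never_returns (void)
{
  exit (1);
}

int
f (int c)
{
  if (c)
    return 0;
  function_that_never_returns ();
}
```

Without the attribute the same function is still valid C, which is why a
flow-based -Wreturn-type diagnostic on it would be a false positive.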

Re: [fortran, patch] IEEE intrinsic modules

2014-06-05 Thread FX

> Please look at libgcc/config/i386/crtfastmath.c for how to set
> MXCSR_FTZ from mxcsr. You already have all necessary bits in place,
> the function is basically only:
> 
> +  if (has_sse())
> +  {
> +unsigned int cw_sse;
> +
> +__asm__ __volatile__ ("%vstmxcsr\t%0" : "=m" (cw_sse));
> +cw_sse |= MXCSR_DAZ;
> +__asm__ __volatile__ ("%vldmxcsr\t%0" : : "m" (cw_sse));
> +  }

Thanks for the suggestion!


> Please note, that FTZ applies only to SSE math. x87 and (IIRC) soft-FP
> don't handle this setting.

Yeah, that’s also why I prefer for now to have it declared as unsupported: the 
Fortran standard doesn’t really allow for partial support such as this, so I’m 
still trying to figure out what The Right Thing To Do is.

FX

Re: [patch, arm] fix gcc.target/arm/pr45094.c options

2014-06-05 Thread Richard Earnshaw

On 30/05/14 19:43, Sandra Loosemore wrote:
> This execution test specifies -mcpu=cortex-a8 but there is no 
> corresponding check to make sure that the hardware/simulator being used 
> to run the test can run cortex-a8 code.  (The specific case we tripped 
> over was in combination with a -mbig-endian multilib; the combination of 
> the two options results in BE8 code rather than BE32.)  It seems 
> simplest just to remove the specific -mcpu option and rely on the 
> multilib options to supply appropriate test flags for the execution 
> environment.
> 
> OK to commit?
> 
> -Sandra
> 
> 
> 2014-05-30  Julian Brown  
>   Sandra Loosemore  
> 
>   gcc/testsuite/
>   * gcc.target/arm/pr45094.c: Remove -mcpu=cortex-a8, dg-skip-if
>   options.

OK

R.

> 
> 
> pr45094.patch
> 
> 
> Index: gcc/testsuite/gcc.target/arm/pr45094.c
> ===
> --- gcc/testsuite/gcc.target/arm/pr45094.c (revision 211087)
> +++ gcc/testsuite/gcc.target/arm/pr45094.c (working copy)
> @@ -1,7 +1,6 @@
>  /* { dg-do run } */
> -/* { dg-skip-if "incompatible options" { arm*-*-* } { "-march=*" } { "-march=armv7-a" } } */
>  /* { dg-require-effective-target arm_neon_hw } */
> -/* { dg-options "-O2 -mcpu=cortex-a8" } */
> +/* { dg-options "-O2" } */
>  /* { dg-add-options arm_neon } */
>  
>  #include 
>

Re: [PATCH, PR52252] Alternative way of vectorization for load groups of size 2 and 3.

2014-06-05 Thread Evgeny Stupachenko

make check passed: no new fails.

On Wed, May 28, 2014 at 5:09 PM, Evgeny Stupachenko  wrote:
> Hi,
>
> The patch introduces alternative way of permutations for load groups
> of size 2 and 3 which should be faster on architectures with low
> parallelism.
> The patch gives 2 times gain on Silvermont to the test from PR52252
> (in addition to already committed 3 times gain).
>
> Patch passes bootstrap on x86. Make check is in progress.
>
> ChangeLog:
>
> 2014-05-28  Evgeny Stupachenko  
>
> * config/i386/i386.c (ix86_have_vector_parallel_execution): New.
> (TARGET_VECTORIZE_HAVE_VECTOR_PARALLEL_EXECUTION): New.
> * config/i386/i386.h (TARGET_VECTOR_PARALLEL_EXECUTION): New.
> * config/i386/x86-tune.def (X86_TUNE_VECTOR_PARALLEL_EXECUTION): New.
> * target.def (have_vector_parallel_execution): New.
> * doc/tm.texi.in (have_vector_parallel_execution): New.
> * doc/tm.texi: Regenerate.
> * targhooks.c (default_have_vector_parallel_execution): New.
> * tree-vect-data-refs.c (vect_shift_permute_load_chain): New.
> Introduces alternative way of loads group permutations.
> (vect_transform_grouped_load): Try alternative way of permutations.
>
> Evgeny

Re: [fortran, patch] IEEE intrinsic modules

2014-06-05 Thread Uros Bizjak

On Thu, Jun 5, 2014 at 11:35 AM, FX  wrote:
>> Please look at libgcc/config/i386/crtfastmath.c for how to set
>> MXCSR_FTZ from mxcsr. You already have all necessary bits in place,
>> the function is basically only:
>>
>> +  if (has_sse())
>> +  {
>> +unsigned int cw_sse;
>> +
>> +__asm__ __volatile__ ("%vstmxcsr\t%0" : "=m" (cw_sse));
>> +cw_sse |= MXCSR_DAZ;
>> +__asm__ __volatile__ ("%vldmxcsr\t%0" : : "m" (cw_sse));
>> +  }

Oops, the above should read MXCSR_FTZ.

> Thanks for the suggestion!
>
>
>> Please note, that FTZ applies only to SSE math. x87 and (IIRC) soft-FP
>> don't handle this setting.
>
> Yeah, that’s also why I prefer for now to have it declared as unsupported: 
> the Fortran standard doesn’t really allow for partial support such as this, 
> so I’m still trying to figure out what The Right Thing To Do is.

Referring to some older mails [1], this looks like a performance-only
setting (sort of fast-math). So, we can perhaps just set this bit,
regardless of the details. Maybe soft-fp will grow support for FTZ
sometime, it looks like a useful addition from the performance POV.

[1] https://gcc.gnu.org/ml/fortran/2013-11/msg00133.html

Uros.

Re: [PATCH, PR52252] Alternative way of vectorization for load groups of size 2 and 3.

2014-06-05 Thread Ramana Radhakrishnan

On Wed, May 28, 2014 at 2:09 PM, Evgeny Stupachenko  wrote:
> Hi,
>
> The patch introduces alternative way of permutations for load groups
> of size 2 and 3 which should be faster on architectures with low
> parallelism.
> The patch gives 2 times gain on Silvermont to the test from PR52252
> (in addition to already committed 3 times gain).
>
> Patch passes bootstrap on x86. Make check is in progress.

Why do we need a new hook ? Can't you derive this information from
something which is equally badly named TARGET_SCHED_REASSOC_WIDTH
though used in the reassociation logic but also serves a similar
purpose ?

Also the documentation of this hook is incomplete at best and wrong at
worst as this is not applied everywhere in the vectorizer but just for
this special case for load store permuting. Implying this is useful
everywhere in the vectorizer does not appear to be correct.

regards
Ramana




>
> ChangeLog:
>
> 2014-05-28  Evgeny Stupachenko  
>
> * config/i386/i386.c (ix86_have_vector_parallel_execution): New.
> (TARGET_VECTORIZE_HAVE_VECTOR_PARALLEL_EXECUTION): New.
> * config/i386/i386.h (TARGET_VECTOR_PARALLEL_EXECUTION): New.
> * config/i386/x86-tune.def (X86_TUNE_VECTOR_PARALLEL_EXECUTION): New.
> * target.def (have_vector_parallel_execution): New.
> * doc/tm.texi.in (have_vector_parallel_execution): New.
> * doc/tm.texi: Regenerate.
> * targhooks.c (default_have_vector_parallel_execution): New.
> * tree-vect-data-refs.c (vect_shift_permute_load_chain): New.
> Introduces alternative way of loads group permutations.
> (vect_transform_grouped_load): Try alternative way of permutations.
>
> Evgeny

Re: [fortran, patch] IEEE intrinsic modules

2014-06-05 Thread Rainer Orth

Rainer Orth  writes:

> Janne Blomqvist  writes:
>
>> On Thu, Jun 5, 2014 at 1:04 AM, FX  wrote:
>>>   2. Your review of the patch!
>>
>> Not a full review, just a few quick comments.
>>
>> - Wrt. libgfortran/gfortran.map: You have added the GFORTRAN_1.6
>> symbol node, as you're the first one to export new symbols in the 4.10
>> cycle. I've seen occasional confusion from users when they have symbol
>> version mismatches and e.g. "1.4" doesn't match any version they've
>> seen before. So I think it might be better to switch to a scheme where
>> the symbol node name matches the compiler version, i.e. GFORTRAN_4.10.
>
> Except libgcc_s.so.1, none of the other GCC runtime libraries does
> that.  Changing schemes in the middle is going to be even more confusing
> than staying with what we have here.  The only other reasonable scheme
> is what libgomp.so.1 does, namely naming the versions after the OpenMP
> standard they implement.

Besides, the request constitutes a fundamental misunderstanding of how
interface (and symbol) versioning work.  I bet those same users clamour
to change the SONAME from libgfortran.so.3 to .so.4 to match the GCC
major version ;-(  Tell them to read up on interface vs. release
versioning in the libtool manual; symbol versioning is just a more
granular version of interface versioning.

Imagine the next version of gcc is called 5.0 instead of 4.10.  Would
you change the SONAME to .so.5 (suggesting an incompatible change) and
interface version to GFORTRAN_5.0 even if no symbols were added?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [RFC][AArch64] Define TARGET_SPILL_CLASS

2014-06-05 Thread Marcus Shawcroft

On 5 June 2014 09:29, Ramana Radhakrishnan  wrote:
>>
>> Thanks Richard for the comments. My primary intention here is to use
>> TARGET_SPILL_CLASS to make FP_REGS as spill registers.
>
>> Do you think
>> AArch64 can benefit from TARGET_SPILL_CLASS hook. I agree that just
>> increasing GP2FP and FP2GP for all the modes as I am doing is not the
>> right thing to do.
>>
>
> I suspect TARGET_SPILL_CLASS again needs to be a per-core decision,
> the cost of moving between FP and Integer registers really depends on
> the implementation and having this spill all the time to FP register
> may not be good enough. So a default definition of TARGET_SPILL_CLASS
> doesn't sound to me prima-facie.
>
> I don't think increasing GP2FP and FP2GP costs is a bad thing. In a
> number of benchmarks we've seen increased moves between FP and integer
> registers and having this fix appears to help some of them. However
> moving this to generic model needs more benchmarking across a variety
> of cores before it can safely be applied there.

I'm aligned with Richard's earlier comment on this topic.  Specifically
I'd like to see numbers in processor specific tuning tables, then we
can take a view on how to adjust the generic numbers.

/Marcus

Re: [RFC][AArch64] Define TARGET_SPILL_CLASS

2014-06-05 Thread Ramana Radhakrishnan

>> I don't think increasing GP2FP and FP2GP costs is a bad thing. In a
>> number of benchmarks we've seen increased moves between FP and integer
>> registers and having this fix appears to help some of them. However
>> moving this to generic model needs more benchmarking across a variety
>> of cores before it can safely be applied there.
>
> I'm aligned with Richard's earlier comment on this topic.  Specifically
> I'd like to see numbers in processor specific tuning tables, then we
> can take a view on how to adjust the generic numbers.

Agreed - that's what I wanted to say but probably wasn't clear enough.

Ramana

>
> /Marcus

Re: libgo patch committed: Merge from revision 18783 of master

2014-06-05 Thread Matthias Klose

Am 05.06.2014 03:28, schrieb Ian Lance Taylor:
> I have committed a patch to libgo to merge from revision
> 18783:00cce3a34d7e of the master library.  This revision was committed
> January 7.  I picked this revision to merge to because the next revision
> deleted a file that is explicitly merged in by the libgo/merge.sh
> script.
> 
> Among other things, this patch changes type descriptors to add a new
> pointer to a zero value.  In gccgo this is implemented as a common
> variable, and that requires some changes to the compiler and a small
> change to go-gcc.cc.
> 
> As usual the patch is too large to include in this e-mail message.  I've
> appended the changes to parts of libgo that are more gccgo-specific.
> 
> Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
> Committed to mainline.

Is it time to bump the soname on trunk?

Re: [fortran, patch] IEEE intrinsic modules

2014-06-05 Thread Janne Blomqvist

On Thu, Jun 5, 2014 at 1:13 PM, Rainer Orth  
wrote:
> Rainer Orth  writes:
>
>> Janne Blomqvist  writes:
>>
>>> On Thu, Jun 5, 2014 at 1:04 AM, FX  wrote:
>>>>   2. Your review of the patch!
>>>
>>> Not a full review, just a few quick comments.
>>>
>>> - Wrt. libgfortran/gfortran.map: You have added the GFORTRAN_1.6
>>> symbol node, as you're the first one to export new symbols in the 4.10
>>> cycle. I've seen occasional confusion from users when they have symbol
>>> version mismatches and e.g. "1.4" doesn't match any version they've
>>> seen before. So I think it might be better to switch to a scheme where
>>> the symbol node name matches the compiler version, i.e. GFORTRAN_4.10.
>>
>> Except libgcc_s.so.1, none of the other GCC runtime libraries does
>> that.  Changing schemes in the middle is going to be even more confusing
>> than staying with what we have here.  The only other reasonable scheme
>> is what libgomp.so.1 does, namely naming the versions after the OpenMP
>> standard they implement.

libgcc_s and glibc seem to use such a scheme, suggesting that
competently maintained libraries can choose to do it that way.

> Besides, the request constitutes a fundamental misunderstanding of how
> interface (and symbol) versioning work.

Considering that yours truly was the one who originally added symbol
versioning to libgfortran, I dare claim that I have at least a
superficial understanding how it works, thank you very much. Back then
there was some discussion what kind of versioning scheme to use, and I
arbitrarily chose the current one.

>  I bet those same users clamour
> to change the SONAME from libgfortran.so.3 to .so.4 to match the GCC
> major version ;-(

I haven't seen that particular complaint, but it's of course possible
that somebody complains about that.

>  Tell them to read up on interface vs. release
> versioning in the libtool manual; symbol versioning is just a more
> granular version of interface versioning.

I think it's quite infeasible to get most GFortran users to care for,
much less learn about, such a topic.

> Imagine the next version of gcc is called 5.0 instead of 4.10.  Would
> you change the SONAME to .so.5 (suggesting an incompatible change)

No, duh?

> and
> interface version to GFORTRAN_5.0 even if no symbols were added?

No, why would a new symbol node be added if no actual symbols are added?

Now, where I see minor trouble is if we first add a symbol node
"GFORTRAN_4.10" and it is later decided that the next release will be
5.0. Then we would have to rename the symbol node to match, which might
cause issues for people who have been building against trunk versions.
But I guess that is par for the course when using trunk.

-- 
Janne Blomqvist

Re: [PATCH, Pointer Bounds Checker 6/x] New static constructor types

2014-06-05 Thread Ilya Enkovich

2014-06-04 17:35 GMT+04:00 Richard Biener :
> On Wed, Jun 4, 2014 at 3:13 PM, Ilya Enkovich  wrote:
>> 2014-06-04 13:58 GMT+04:00 Richard Biener :
>>> On Wed, Apr 16, 2014 at 2:33 PM, Ilya Enkovich  
>>> wrote:
>>>> Hi,
>>>>
>>>> This patch add new static constructor types used by Pointer Bounds
>>>> Checker.  It was approved earlier for 4.9 and I'll assume patch is OK for
>>>> trunk if no objections arise.
>>>>
>>>> Patch was bootstrapped and tested for linux-x86_64.
>>>>
>>>> Thanks,
>>>> Ilya
>>>> --
>>>> gcc/
>>>>
>>>> 2014-04-16  Ilya Enkovich  
>>>>
>>>> * ipa.c (cgraph_build_static_cdtor_1): Support contructors
>>>> with "chkp ctor" and "bnd_legacy" attributes.
>>>> * gimplify.c (gimplify_init_constructor): Avoid infinite
>>>> loop during gimplification of bounds initializer.
>>>>
>>>>
>>>> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
>>>> index 7441784..67ab515 100644
>>>> --- a/gcc/gimplify.c
>>>> +++ b/gcc/gimplify.c
>>>> @@ -3803,10 +3803,19 @@ gimplify_init_constructor (tree *expr_p,
>>>> gimple_seq *pre_p, gimple_seq *post_p,
>>>> individual element initialization.  Also don't do this for small
>>>> all-zero initializers (which aren't big enough to merit
>>>> clearing), and don't try to make bitwise copies of
>>>> -  TREE_ADDRESSABLE types.  */
>>>> +  TREE_ADDRESSABLE types.
>>>> +
>>>> +  We cannot apply such transformation when compiling chkp static
>>>> +  initializer because creation of initializer image in the memory
>>>> +  will require static initialization of bounds for it.  It should
>>>> +  result in another gimplification of similar initializer and we
>>>> +  may fall into infinite loop.  */
>>>> if (valid_const_initializer
>>>> && !(cleared || num_nonzero_elements == 0)
>>>> -   && !TREE_ADDRESSABLE (type))
>>>> +   && !TREE_ADDRESSABLE (type)
>>>> +   && (!current_function_decl
>>>> +   || !lookup_attribute ("chkp ctor",
>>>> + DECL_ATTRIBUTES
>>>> (current_function_decl
>>>
>>> Simply make the type TREE_ADDRESSABLE?
>>
>> Wouldn't it be a hack to mark it addressable just to not hit this
>> condition? It would also require to have an unshared copy of the type
>> to not affect other statements with that type.
>>
>>>
>>>>   {
>>>> HOST_WIDE_INT size = int_size_in_bytes (type);
>>>> unsigned int align;
>>>> diff --git a/gcc/ipa.c b/gcc/ipa.c
>>>> index 26e9b03..5ab3aed 100644
>>>> --- a/gcc/ipa.c
>>>> +++ b/gcc/ipa.c
>>>> @@ -1345,9 +1345,11 @@ make_pass_ipa_whole_program_visibility
>>>> (gcc::context *ctxt)
>>>>  }
>>>>
>>>>  /* Generate and emit a static constructor or destructor.  WHICH must
>>>> -   be one of 'I' (for a constructor) or 'D' (for a destructor).  BODY
>>>> -   is a STATEMENT_LIST containing GENERIC statements.  PRIORITY is the
>>>> -   initialization priority for this constructor or destructor.
>>>> +   be one of 'I' (for a constructor), 'D' (for a destructor), 'P'
>>>> +   (for chp static vars constructor) or 'B' (for chkp static bounds
>>>> +   constructor).  BODY is a STATEMENT_LIST containing GENERIC
>>>> +   statements.  PRIORITY is the initialization priority for this
>>>> +   constructor or destructor.
>>>>
>>>> FINAL specify whether the externally visible name for collect2 should
>>>> be produced. */
>>>> @@ -1406,6 +1408,20 @@ cgraph_build_static_cdtor_1 (char which, tree body,
>>>> int priority, bool final)
>>>> DECL_STATIC_CONSTRUCTOR (decl) = 1;
>>>> decl_init_priority_insert (decl, priority);
>>>> break;
>>>> +case 'P':
>>>> +  DECL_STATIC_CONSTRUCTOR (decl) = 1;
>>>> +  DECL_ATTRIBUTES (decl) = tree_cons (get_identifier ("chkp ctor"),
>>>> + NULL,
>>>> + NULL_TREE);
>>>
>>> Ick.  Please try to avoid using attributes for this.  Rather adjust the
>>> caller of this function to set a flag in the cgraph node.
>>
>> It is too late because all early local passes are executed by that
>> time and I need this attribute to be set before them.
>
> Ok, so where do you call this from?  It should be possible to create
> the function in GIMPLE form directly, avoiding the need to go through
> gimplification.

I create constructors at the end of unit compilation in
chkp_finish_file. Constructor body in this case is MODIFY_EXPRs for
all statically initialized pointers. "chkp ctor" attribute is used to
instrument this constructor properly - all pointer modifications
should be replaced with corresponding bounds initialization. Thus
gimplification is not the only place where this attribute is used.

Ilya

>
> Richard.
>
>> Ilya
>>
>>>
>>> So I don't like this patch at all.
>>>
>>> Richard.
>>>
>>>> +  decl_init_priority_insert (decl, priority);
>>>> +  break;
>>>> +case 'B':
>>>> +  DECL

RE: [PATCH][MIPS] P5600 scheduling

2014-06-05 Thread Andrew Bennett

> FWIW, since regenerated files are often not posted as part of the patch,
> I'd just assumed the committer would do that.  I should have checked the
> changelog though...

That's fine.  I will remember in future not to include regenerated files in
my patches.

> 
> > Secondly, I have changed invoke.texi to document the -march=p5600
> > option.  Thirdly, binutils defines p5600 as mips32r5, not mips32r2,
> > (which was causing assembler errors if you build using the
> > -march=p5600 gcc command line option).  I have updated the
> > MIPS_ISA_LEVEL_SPEC to map -march=p5600 to -mips32r5, and made the
> > PROCESSOR_P5600 use the MIPS32r5 ISA value.  Finally I have updated
> > the processor for mips32r5 entry to use p5600 rather than 74k.
> >
> > The updated patch and ChangeLog are shown below.
> >
> > Ok to commit?
> 
> OK, thanks.

Great. Committed as r211265.

Regards,


Andrew

Re: [PATCH, Pointer Bounds Checker 13/x] Early versioning

2014-06-05 Thread Ilya Enkovich

On 04 Jun 11:59, Richard Biener wrote:
> On Wed, Jun 4, 2014 at 8:46 AM, Jeff Law  wrote:
> > On 06/03/14 03:29, Richard Biener wrote:
> >>
> >> On Tue, Jun 3, 2014 at 7:55 AM, Ilya Enkovich 
> >> wrote:
> >>>
> >>> 2014-06-02 21:27 GMT+04:00 Jeff Law :
> >>>>
> >>>> On 06/02/14 04:48, Ilya Enkovich wrote:
> >>
> >>
> >> Hmm, so if I understand things correctly, src_fun has no loop
> >> structures attached, thus there's nothing to copy.  Presumably at
> >> some later point we build loop structures for the copy from scratch?
> >
> >
> > I suppose it is just a simple bug with absent NULL pointer check.  Here
> > is
> > original code:
> >
> > /* Duplicate the loop tree, if available and wanted.  */
> > if (loops_for_fn (src_cfun) != NULL
> > && current_loops != NULL)
> >   {
> > copy_loops (id, entry_block_map->loop_father,
> > get_loop (src_cfun, 0));
> > /* Defer to cfgcleanup to update loop-father fields of
> > basic-blocks.  */
> > loops_state_set (LOOPS_NEED_FIXUP);
> >   }
> >
> > /* If the loop tree in the source function needed fixup, mark the
> >destination loop tree for fixup, too.  */
> > if (loops_for_fn (src_cfun)->state & LOOPS_NEED_FIXUP)
> >   loops_state_set (LOOPS_NEED_FIXUP);
> >
> > As you may see we have check for absent loops structure in the first
> > if-statement and no check in the second one.  I hit segfault and added
> > the
> > check.
> >>>>
> >>>>
> >>>> Downthread you indicated you're not in SSA form which might explain the
> >>>> inconsistency here.  If so, then we need to make sure that the loop & df
> >>>> structures do get set up properly later.
> >>>
> >>>
> >>> That is what init_data_structures pass will do for us as Richard pointed.
> >>> Right?
> >>
> >>
> >> loops are set up during the CFG construction and thus are available
> >> everywhere.
> >
> > Which would argue that the hunk that checks for the loop tree's existence
> > before accessing it should not be needed.  Ilya -- is it possible you hit
> > this prior to Richi's work to build the loop structures as part of CFG
> > construction and maintain them throughout compilation.
> 
> That's likely.  It's still on my list of janitor things to do to remove all
> those if (current_loops) checks ...

I tried to remove this loops check and got no failures this time.  So, here is 
a new patch version.

Bootstrapped and tested on linux-x86_64.

Thanks,
Ilya
--
gcc/

2014-06-05  Ilya Enkovich  

* tree-inline.c (tree_function_versioning): Check DF info existence
before accessing it.


diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 4293241..2972346 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -5350,8 +5350,9 @@ tree_function_versioning (tree old_decl, tree new_decl,
   DECL_ARGUMENTS (new_decl) = DECL_ARGUMENTS (old_decl);
   initialize_cfun (new_decl, old_decl,
   old_entry_block->count);
-  DECL_STRUCT_FUNCTION (new_decl)->gimple_df->ipa_pta
-= id.src_cfun->gimple_df->ipa_pta;
+  if (DECL_STRUCT_FUNCTION (new_decl)->gimple_df)
+DECL_STRUCT_FUNCTION (new_decl)->gimple_df->ipa_pta
+  = id.src_cfun->gimple_df->ipa_pta;
 
   /* Copy the function's static chain.  */
   p = DECL_STRUCT_FUNCTION (old_decl)->static_chain_decl;
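
The guard added by the hunk above can be read in isolation as the usual
"check the container before dereferencing it" pattern.  Here is a minimal
sketch in plain C, assuming simplified stand-in types: struct df_info,
struct fn_info and copy_ipa_pta are hypothetical names for illustration,
not GCC's real declarations.  As in the patch, only the destination's DF
info is checked; the source is assumed to have it.

```c
#include <stddef.h>

/* Hypothetical stand-ins for GCC's struct gimple_df / struct function;
   only the fields needed for the illustration are modelled.  */
struct df_info { int ipa_pta; };
struct fn_info { struct df_info *gimple_df; };

/* Copy the ipa_pta flag from SRC to DST, but only when DST actually has
   DF info attached.  Returns 1 if the flag was copied, 0 otherwise.
   SRC is assumed to carry DF info, mirroring the patch.  */
static int
copy_ipa_pta (struct fn_info *dst, const struct fn_info *src)
{
  if (dst->gimple_df == NULL)
    return 0;
  dst->gimple_df->ipa_pta = src->gimple_df->ipa_pta;
  return 1;
}
```

Without the NULL test, the assignment would dereference a null pointer
whenever the new function has no DF info yet.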

Re: [PATCH, PR C++/61038] - g++ -E is unusable with UDL strings

2014-06-05 Thread Ed Smith-Rowland


On 05/20/2014 04:44 PM, Jason Merrill wrote:
> On 05/13/2014 08:59 PM, Ed Smith-Rowland wrote:
>> +  escape_it = escape_it || cpp_userdef_string_p (token->type)
>> +|| cpp_userdef_char_p (token->type);
>
> Let's add the new cases to the previous statement instead of a new
> one.  OK with that change.
>
> Jason


PR c++/61038
I was asked to combine the escape logic for regular chars and strings
with the escape logic for user-defined literals chars and strings.
I just forgot the first time.

After rebuilding and testing committed as obvious.

ed@bad-horse:~/gcc_literal$ svn diff -rPREV  libcpp/ChangeLog 
libcpp/macro.c

Index: libcpp/ChangeLog
===
--- libcpp/ChangeLog(revision 211266)
+++ libcpp/ChangeLog(working copy)
@@ -1,3 +1,9 @@
+2014-06-04  Edward Smith-Rowland  <3dw...@verizon.net>
+
+PR c++/61038
+* macro.c (stringify_arg (cpp_reader *, macro_arg *)):
+Combine user-defined escape logic with the other string and char logic.
+
 2014-05-26  Richard Biener  

 * configure.ac: Remove long long and __int64 type checks,
Index: libcpp/macro.c
===
--- libcpp/macro.c(revision 211265)
+++ libcpp/macro.c(working copy)
@@ -492,11 +492,10 @@
|| token->type == CPP_WSTRING || token->type == CPP_WCHAR
|| token->type == CPP_STRING32 || token->type == CPP_CHAR32
|| token->type == CPP_STRING16 || token->type == CPP_CHAR16
-   || token->type == CPP_UTF8STRING);
+   || token->type == CPP_UTF8STRING
+   || cpp_userdef_string_p (token->type)
+   || cpp_userdef_char_p (token->type));

-  escape_it = escape_it || cpp_userdef_string_p (token->type)
-|| cpp_userdef_char_p (token->type);
-
   /* Room for each char being written in octal, initial space and
  final quote and NUL.  */
   len = cpp_token_len (token);

[AArch64][COMMITTED] Update stack layout comment.

2014-06-05 Thread Marcus Shawcroft


I've committed the attached to clarify the stack layout comment.
/Marcus

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 961e5c9..3348cf2 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2078,37 +2078,35 @@ aarch64_save_or_restore_callee_save_registers (HOST_WIDE_INT offset,
 	|   |
 	|  incoming stack arguments |
 	|   |
-	+---+ <-- arg_pointer_rtx
-	|   |
+	+---+
+	|   | <-- incoming stack pointer (aligned)
 	|  callee-allocated save area   |
 	|  for register varargs |
 	|   |
-	+---+ <-- frame_pointer_rtx
-	|   |
-	|  local variables  |
+	+---+
+	|  local variables  | <-- frame_pointer_rtx
 	|   |
 	+---+
 	|  padding0 | \
 	+---+  |
-	|   |  |
-	|   |  |
 	|  callee-saved registers   |  | frame.saved_regs_size
-	|   |  |
 	+---+  |
 	|  LR'  |  |
 	+---+  |
-	|  FP'  | /
-  P +---+ <-- hard_frame_pointer_rtx
+	|  FP'  | / <- hard_frame_pointer_rtx (aligned)
++---+
 	|  dynamic allocation   |
 	+---+
-	|   |
-	|  outgoing stack arguments |
-	|   |
-	+---+ <-- stack_pointer_rtx
+	|  padding  |
+	+---+
+	|  outgoing stack arguments | <-- arg_pointer
+|   |
+	+---+
+	|   | <-- stack_pointer_rtx (aligned)
 
-   Dynamic stack allocations such as alloca insert data at point P.
-   They decrease stack_pointer_rtx but leave frame_pointer_rtx and
-   hard_frame_pointer_rtx unchanged.  */
+   Dynamic stack allocations via alloca() decrease stack_pointer_rtx
+   but leave frame_pointer_rtx and hard_frame_pointer_rtx
+   unchanged.  */
 
 /* Generate the prologue instructions for entry into a function.
Establish the stack frame by decreasing the stack pointer with a
-- 
1.7.9.5

Re: [PATCH, PR52252] Alternative way of vectorization for load groups of size 2 and 3.

2014-06-05 Thread Evgeny Stupachenko

New hook is related to vector instructions only. Vector instructions
could be sequential in pipeline, but scalar - parallel. For x86
architectures TARGET_SCHED_REASSOC_WIDTH does not give required
differentiation.
General hooks could be potentially reused in other algorithms/by other
architectures.

Thanks,
Evgeny

On Thu, Jun 5, 2014 at 2:04 PM, Ramana Radhakrishnan
 wrote:
> On Wed, May 28, 2014 at 2:09 PM, Evgeny Stupachenko  
> wrote:
>> Hi,
>>
>> The patch introduces alternative way of permutations for load groups
>> of size 2 and 3 which should be faster on architectures with low
>> parallelism.
>> The patch gives 2 times gain on Silvermont to the test from PR52252
>> (in addition to already committed 3 times gain).
>>
>> Patch passes bootstrap on x86. Make check is in progress.
>
> Why do we need a new hook ? Can't you derive this information from
> something which is equally badly named TARGET_SCHED_REASSOC_WIDTH
> though used in the reassociation logic but also serves a similar
> purpose ?
>
> Also the documentation of this hook is incomplete at best and wrong at
> worst as this is not applied everywhere in the vectorizer but just for
> this special case for load store permuting. Implying this is useful
> everywhere in the vectorizer does not appear to be correct.
>
> regards
> Ramana
>
>
>
>
>>
>> ChangeLog:
>>
>> 2014-05-28  Evgeny Stupachenko  
>>
>> * config/i386/i386.c (ix86_have_vector_parallel_execution): New.
>> (TARGET_VECTORIZE_HAVE_VECTOR_PARALLEL_EXECUTION): New.
>> * config/i386/i386.h (TARGET_VECTOR_PARALLEL_EXECUTION): New.
>> * config/i386/x86-tune.def (X86_TUNE_VECTOR_PARALLEL_EXECUTION): New.
>> * target.def (have_vector_parallel_execution): New.
>> * doc/tm.texi.in (have_vector_parallel_execution): New.
>> * doc/tm.texi: Regenerate.
>> * targhooks.c (default_have_vector_parallel_execution): New.
>> * tree-vect-data-refs.c (vect_shift_permute_load_chain): New.
>> Introduces alternative way of loads group permutations.
>> (vect_transform_grouped_load): Try alternative way of permutations.
>>
>> Evgeny

[PATCH][match-and-simplify] Fix bogus transform

2014-06-05 Thread Richard Biener


Transcription error from forwprop, caught by pr54767.f90 in the
Fortran torture tests.

Committed.

2014-06-05  Richard Biener  

* match.pd: Fix bogus (~x & y) | x -> x & y transform which
should have simplified to x | y.

Index: gcc/match.pd
===
--- gcc/match.pd(revision 211258)
+++ gcc/match.pd(working copy)
@@ -356,11 +356,11 @@ to (minus @1 @0)
   if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
   (bit_and @0 @1))
 
-/* (~x & y) | x -> x & y */
+/* (~x & y) | x -> x | y */
 (match_and_simplify
   (bit_ior (bit_and (bit_not @0) @1) @0)
   if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
-  (bit_and @0 @1))
+  (bit_ior @0 @1))
 
 /* ~~x -> x */
 (match_and_simplify
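
The corrected identity is easy to confirm by brute force.  This is a
minimal sketch in plain C, checking all pairs of 8-bit values; the
function names are hypothetical and nothing here is taken from match.pd
itself.  It also shows one input for which the old x & y result is wrong.

```c
#include <stdint.h>

/* Left-hand side of the pattern: (~x & y) | x.  */
static inline uint8_t
pattern_lhs (uint8_t x, uint8_t y)
{
  return (uint8_t) ((~x & y) | x);
}

/* Return 1 if (~x & y) | x == x | y for every pair of 8-bit values.  */
static int
ior_identity_holds (void)
{
  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++)
      if (pattern_lhs ((uint8_t) x, (uint8_t) y) != (uint8_t) (x | y))
        return 0;
  return 1;
}

/* Return 1 if the old (bogus) result x & y differs from the pattern
   for x = 1, y = 0: (~1 & 0) | 1 is 1, whereas 1 & 0 is 0.  */
static int
and_transform_is_wrong (void)
{
  return pattern_lhs (1, 0) != (uint8_t) (1 & 0);
}
```

Since the operations are bitwise, checking one byte covers every bit
position independently, so the result carries over to wider integers.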

Re: [PATCH, PR52252] Alternative way of vectorization for load groups of size 2 and 3.

2014-06-05 Thread Ramana Radhakrishnan


On 06/05/14 12:43, Evgeny Stupachenko wrote:
> New hook is related to vector instructions only. Vector instructions
> could be sequential in pipeline, but scalar - parallel. For x86
> architectures TARGET_SCHED_REASSOC_WIDTH does not give required
> differentiation.
> General hooks could be potentially reused in other algorithms/by other
> architectures.


It already takes a "mode" argument. Couldn't you use a vector mode to 
work this out ?


If it is not enough then please be more specific about the documentation 
of this hook about where it is useful so that it's easy for people 
reading the documentation to understand at a glance what purpose it serves.



Ramana



> Thanks,
> Evgeny
>
> On Thu, Jun 5, 2014 at 2:04 PM, Ramana Radhakrishnan
>  wrote:
>> On Wed, May 28, 2014 at 2:09 PM, Evgeny Stupachenko  wrote:
>>> Hi,
>>>
>>> The patch introduces alternative way of permutations for load groups
>>> of size 2 and 3 which should be faster on architectures with low
>>> parallelism.
>>> The patch gives 2 times gain on Silvermont to the test from PR52252
>>> (in addition to already committed 3 times gain).
>>>
>>> Patch passes bootstrap on x86. Make check is in progress.
>>
>> Why do we need a new hook ? Can't you derive this information from
>> something which is equally badly named TARGET_SCHED_REASSOC_WIDTH
>> though used in the reassociation logic but also serves a similar
>> purpose ?
>>
>> Also the documentation of this hook is incomplete at best and wrong at
>> worst as this is not applied everywhere in the vectorizer but just for
>> this special case for load store permuting. Implying this is useful
>> everywhere in the vectorizer does not appear to be correct.
>>
>> regards
>> Ramana
>>
>>> ChangeLog:
>>>
>>> 2014-05-28  Evgeny Stupachenko  
>>>
>>> * config/i386/i386.c (ix86_have_vector_parallel_execution): New.
>>> (TARGET_VECTORIZE_HAVE_VECTOR_PARALLEL_EXECUTION): New.
>>> * config/i386/i386.h (TARGET_VECTOR_PARALLEL_EXECUTION): New.
>>> * config/i386/x86-tune.def (X86_TUNE_VECTOR_PARALLEL_EXECUTION): New.
>>> * target.def (have_vector_parallel_execution): New.
>>> * doc/tm.texi.in (have_vector_parallel_execution): New.
>>> * doc/tm.texi: Regenerate.
>>> * targhooks.c (default_have_vector_parallel_execution): New.
>>> * tree-vect-data-refs.c (vect_shift_permute_load_chain): New.
>>> Introduces alternative way of loads group permutations.
>>> (vect_transform_grouped_load): Try alternative way of permutations.
>>>
>>> Evgeny

Re: Fix a function decl in gfortran

2014-06-05 Thread Bernd Schmidt


On 06/04/2014 10:36 PM, Tobias Burnus wrote:
> Bernd Schmidt wrote:
>> Even with this applied, I'm still seeing similar failures.
>
> I didn't claim that the patch would fix everything – nor that it was
> well tested.

Just wanted to report back since the problem doesn't really show up on
normal targets.

> Can you try the attached version? The change is that I now properly use
> "se->ignore_optional" to test whether absent optional arguments should
> be skipped - rather than using this morning's ad-hoc solution of doing so
> unconditionally. Additionally, the patch has now survived stage2
> building – which is more testing than I could do this morning.

This seems to work. Thanks!


Bernd

[PATCH][Fortran] Do not search -L library path for libgfortran.spec

2014-06-05 Thread Richard Biener

(sorry for the dups, forgot to include gcc-patches)

The following patch fixes PR61418 by removing the odd code (probably
copied from Java) that adds a -specs= command line flag if
libgfortran.spec was found in a path specified by -L.  It shouldn't
be necessary to search those paths (and it could be harmful instead,
as random paths may contain random files named libgfortran.spec).
The proper way to add a search path for specs is to specify -B.

libgfortran.spec is handled specially here - libgomp.spec for
example is not found in -L paths (libgcj.spec is though, but I
don't intend to fix that ...).

Bootstrap and regtest in progress on x86_64-unknown-linux-gnu.

Ok for trunk and 4.9 branch?

Thanks,
Richard.

2014-06-05  Richard Biener  

PR fortran/61418
* gfortranspec.c (spec_file): Remove.
(find_spec_file): Likewise.
(lang_specific_driver): Do not look for specs file in -L
or append -specs command line argument.
(lang_specific_pre_link): Always %:include libgfortran.spec.

Index: gcc/fortran/gfortranspec.c
===
--- gcc/fortran/gfortranspec.c  (revision 211228)
+++ gcc/fortran/gfortranspec.c  (working copy)
@@ -73,34 +73,11 @@ static void append_arg (const struct cl_
 static unsigned int g77_newargc;
 static struct cl_decoded_option *g77_new_decoded_options;
 
-/* The path to the spec file.  */
-static char *spec_file = NULL;
-
 /* This will be NULL if we encounter a situation where we should not
link in the fortran libraries.  */
 static const char *library = NULL;
 
 
-/* Return full path name of spec file if it is in DIR, or NULL if
-   not.  */
-static char *
-find_spec_file (const char *dir)
-{
-  const char dirsep_string[] = { DIR_SEPARATOR, '\0' };
-  char *spec;
-  struct stat sb;
-
-  spec = XNEWVEC (char, strlen (dir) + sizeof (SPEC_FILE) + 4);
-  strcpy (spec, dir);
-  strcat (spec, dirsep_string);
-  strcat (spec, SPEC_FILE);
-  if (!stat (spec, &sb))
-return spec;
-  free (spec);
-  return NULL;
-}
-
-
 /* Return whether strings S1 and S2 are both NULL or both the same
string.  */
 
@@ -313,12 +290,6 @@ For more information about these matters
 cool facility for handling --help and --verbose --help.  */
  return;
 
-   case OPT_L:
- if (!spec_file)
-   spec_file = find_spec_file (decoded_options[i].arg);
- break;
-
-
default:
  break;
}
@@ -449,12 +420,6 @@ For more information about these matters
 
 #endif
 
-  /* Read the specs file corresponding to libgfortran.
- If we didn't find the spec file on the -L path, we load it
- via lang_specific_pre_link.  */
-  if (spec_file)
-append_option (OPT_specs_, spec_file, 1);
-
   if (verbose && g77_new_decoded_options != g77_x_decoded_options)
 {
   fprintf (stderr, _("Driving:"));
@@ -473,8 +438,7 @@ For more information about these matters
 int
 lang_specific_pre_link (void)
 {
-  free (spec_file);
-  if (spec_file == NULL && library)
+  if (library)
 do_spec ("%:include(libgfortran.spec)");
 
   return 0;

Re: [PATCH, Pointer Bounds Checker 13/x] Early versioning

2014-06-05 Thread Richard Biener

On Thu, Jun 5, 2014 at 1:18 PM, Ilya Enkovich  wrote:
> On 04 Jun 11:59, Richard Biener wrote:
>> On Wed, Jun 4, 2014 at 8:46 AM, Jeff Law  wrote:
>> > On 06/03/14 03:29, Richard Biener wrote:
>> >>
>> >> On Tue, Jun 3, 2014 at 7:55 AM, Ilya Enkovich 
>> >> wrote:
>> >>>
>> >>> 2014-06-02 21:27 GMT+04:00 Jeff Law :
>> >>>>
>> >>>> On 06/02/14 04:48, Ilya Enkovich wrote:
>> >>
>> >>
>> >> Hmm, so if I understand things correctly, src_fun has no loop
>> >> structures attached, thus there's nothing to copy.  Presumably at
>> >> some later point we build loop structures for the copy from scratch?
>> >
>> >
>> > I suppose it is just a simple bug with absent NULL pointer check.  Here
>> > is
>> > original code:
>> >
>> > /* Duplicate the loop tree, if available and wanted.  */
>> > if (loops_for_fn (src_cfun) != NULL
>> > && current_loops != NULL)
>> >   {
>> > copy_loops (id, entry_block_map->loop_father,
>> > get_loop (src_cfun, 0));
>> > /* Defer to cfgcleanup to update loop-father fields of
>> > basic-blocks.  */
>> > loops_state_set (LOOPS_NEED_FIXUP);
>> >   }
>> >
>> > /* If the loop tree in the source function needed fixup, mark the
>> >destination loop tree for fixup, too.  */
>> > if (loops_for_fn (src_cfun)->state & LOOPS_NEED_FIXUP)
>> >   loops_state_set (LOOPS_NEED_FIXUP);
>> >
>> > As you may see we have check for absent loops structure in the first
>> > if-statement and no check in the second one.  I hit segfault and added
>> > the
>> > check.
>> 
>> 
>>  Downthread you indicated you're not in SSA form which might explain the
>>  inconsistency here.  If so, then we need to make sure that the loop & df
>>  structures do get set up properly later.
>> >>>
>> >>>
>> >>> That is what init_data_structures pass will do for us as Richard pointed.
>> >>> Right?
>> >>
>> >>
>> >> loops are set up during the CFG construction and thus are available
>> >> everywhere.
>> >
>> > Which would argue that the hunk that checks for the loop tree's existence
>> > before accessing it should not be needed.  Ilya -- is it possible you hit
>> > this prior to Richi's work to build the loop structures as part of CFG
>> > construction and maintain them throughout compilation.
>>
>> That's likely.  It's still on my list of janitor things to do to remove all
>> those if (current_loops) checks ...
>
> I tried to remove this loops check and got no failures this time.  So, here 
> is a new patch version.
>
> Bootstrapped and tested on linux-x86_64.

Ok (you can commit this now).

Thanks,
Richard.

> Thanks,
> Ilya
> --
> gcc/
>
> 2014-06-05  Ilya Enkovich  
>
> * tree-inline.c (tree_function_versioning): Check DF info existence
> before accessing it.
>
>
> diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
> index 4293241..2972346 100644
> --- a/gcc/tree-inline.c
> +++ b/gcc/tree-inline.c
> @@ -5350,8 +5350,9 @@ tree_function_versioning (tree old_decl, tree new_decl,
>DECL_ARGUMENTS (new_decl) = DECL_ARGUMENTS (old_decl);
>initialize_cfun (new_decl, old_decl,
>old_entry_block->count);
> -  DECL_STRUCT_FUNCTION (new_decl)->gimple_df->ipa_pta
> -= id.src_cfun->gimple_df->ipa_pta;
> +  if (DECL_STRUCT_FUNCTION (new_decl)->gimple_df)
> +DECL_STRUCT_FUNCTION (new_decl)->gimple_df->ipa_pta
> +  = id.src_cfun->gimple_df->ipa_pta;
>
>/* Copy the function's static chain.  */
>p = DECL_STRUCT_FUNCTION (old_decl)->static_chain_decl;

[AARch64][COMMITTED] Move saved_varargs_size.

2014-06-05 Thread Marcus Shawcroft

I've just committed the attached to co-locate saved_varargs_size with 
the other target specific frame related state...


/Marcus

2014-06-05  Marcus Shawcroft  

* config/aarch64/aarch64.h (machine_function): Move
saved_varargs_size from here...
(aarch64_frameGTY): ... to here.

* config/aarch64/aarch64.c (aarch64_expand_prologue)
(aarch64_expand_epilogue, aarch64_final_eh_return_addr)
(aarch64_initial_elimination_offset)
(aarch64_setup_incoming_varargs): Adjust location of
saved_varargs_size.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 3348cf2..c67bac4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2130,9 +2130,9 @@ aarch64_expand_prologue (void)
   rtx insn;
 
   aarch64_layout_frame ();
-  original_frame_size = get_frame_size () + cfun->machine->saved_varargs_size;
-  gcc_assert ((!cfun->machine->saved_varargs_size || cfun->stdarg)
-	  && (cfun->stdarg || !cfun->machine->saved_varargs_size));
+  original_frame_size = get_frame_size () + cfun->machine->frame.saved_varargs_size;
+  gcc_assert ((!cfun->machine->frame.saved_varargs_size || cfun->stdarg)
+	  && (cfun->stdarg || !cfun->machine->frame.saved_varargs_size));
   frame_size = (original_frame_size + cfun->machine->frame.saved_regs_size
 		+ crtl->outgoing_args_size);
   offset = frame_size = AARCH64_ROUND_UP (frame_size,
@@ -2296,7 +2296,7 @@ aarch64_expand_epilogue (bool for_sibcall)
   rtx cfa_reg;
 
   aarch64_layout_frame ();
-  original_frame_size = get_frame_size () + cfun->machine->saved_varargs_size;
+  original_frame_size = get_frame_size () + cfun->machine->frame.saved_varargs_size;
   frame_size = (original_frame_size + cfun->machine->frame.saved_regs_size
 		+ crtl->outgoing_args_size);
   offset = frame_size = AARCH64_ROUND_UP (frame_size,
@@ -2495,7 +2495,7 @@ aarch64_final_eh_return_addr (void)
 {
   HOST_WIDE_INT original_frame_size, frame_size, offset, fp_offset;
   aarch64_layout_frame ();
-  original_frame_size = get_frame_size () + cfun->machine->saved_varargs_size;
+  original_frame_size = get_frame_size () + cfun->machine->frame.saved_varargs_size;
   frame_size = (original_frame_size + cfun->machine->frame.saved_regs_size
 		+ crtl->outgoing_args_size);
   offset = frame_size = AARCH64_ROUND_UP (frame_size,
@@ -4258,7 +4258,7 @@ aarch64_initial_elimination_offset (unsigned from, unsigned to)
   aarch64_layout_frame ();
   frame_size = (get_frame_size () + cfun->machine->frame.saved_regs_size
 		+ crtl->outgoing_args_size
-		+ cfun->machine->saved_varargs_size);
+		+ cfun->machine->frame.saved_varargs_size);
 
   frame_size = AARCH64_ROUND_UP (frame_size, STACK_BOUNDARY / BITS_PER_UNIT);
   offset = frame_size;
@@ -6943,7 +6943,7 @@ aarch64_setup_incoming_varargs (cumulative_args_t cum_v, enum machine_mode mode,
 
   /* We don't save the size into *PRETEND_SIZE because we want to avoid
  any complication of having crtl->args.pretend_args_size changed.  */
-  cfun->machine->saved_varargs_size
+  cfun->machine->frame.saved_varargs_size
 = (AARCH64_ROUND_UP (gr_saved * UNITS_PER_WORD,
 		  STACK_BOUNDARY / BITS_PER_UNIT)
+ vr_saved * UNITS_PER_VREG);
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index ced5a5e..392d095 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -514,6 +514,13 @@ extern enum aarch64_processor aarch64_tune;
 struct GTY (()) aarch64_frame
 {
   HOST_WIDE_INT reg_offset[FIRST_PSEUDO_REGISTER];
+
+  /* The number of extra stack bytes taken up by register varargs.
+ This area is allocated by the callee at the very top of the
+ frame.  This value is rounded up to a multiple of
+ STACK_BOUNDARY.  */
+  HOST_WIDE_INT saved_varargs_size;
+
   HOST_WIDE_INT saved_regs_size;
   /* Padding if needed after the all the callee save registers have
  been saved.  */
@@ -526,11 +533,6 @@ struct GTY (()) aarch64_frame
 typedef struct GTY (()) machine_function
 {
   struct aarch64_frame frame;
-
-  /* The number of extra stack bytes taken up by register varargs.
- This area is allocated by the callee at the very top of the frame.  */
-  HOST_WIDE_INT saved_varargs_size;
-
 } machine_function;
 #endif
 
-- 
1.7.9.5

Re: Fix address space computation in expand_debug_expr

2014-06-05 Thread Senthil Kumar Selvaraj

On Thu, Jun 05, 2014 at 09:46:25AM +0200, Richard Biener wrote:
> On Wed, Jun 4, 2014 at 10:06 PM, Senthil Kumar Selvaraj
>  wrote:
> > For the AVR target, assertions in convert_debug_memory_address cause a
> > couple of ICEs (see PR 52472). Jakub suggested returning a NULL rtx,
> > which works, but on debugging further, I found that expand_debug_expr
> > appears to incorrectly compute the address space for ADDR_EXPR and
> > MEM_REFs.
> >
> > For ADDR_EXPR, TREE_TYPE(exp) is a POINTER_TYPE (always?), but in
> > the generic address space, even if the object whose address is taken
> > is in a different address space. expand_debug_expr takes
> > TYPE_ADDR_SPACE(TREE_TYPE(exp)) and therefore computes the address
> > space as generic. convert_debug_memory_address then asserts that the
> > mode is a valid pointer mode in the address space and fails.
> >
> > Similarly, for MEM_REFs, TREE_TYPE(exp) is the type of the
> > dereferenced value, and therefore checking if it is a POINTER_TYPE
> > doesn't help for a single pointer dereference. The address space
> > gets computed as generic even if the pointer points to a different
> > address space. The address mode for the generic address space is
> > passed to convert_debug_memory_address, and the assertion that that mode
> > must match the mode of the rtx then fails.
> >
> > The below patch attempts to fix this by picking the right TREE_TYPE to
> > pass to TYPE_ADDR_SPACE for MEM_REF (use type of arg 0) and
> > ADDR_EXPR (check for pointer type and look at nested addr space).
> >
> > Does this look reasonable or did I get it all wrong?
> >
> > Regards
> > Senthil
> >
> > diff --git gcc/cfgexpand.c gcc/cfgexpand.c
> > index 8b0e466..ca78953 100644
> > --- gcc/cfgexpand.c
> > +++ gcc/cfgexpand.c
> > @@ -3941,8 +3941,8 @@ expand_debug_expr (tree exp)
> >   op0 = plus_constant (inner_mode, op0, INTVAL (op1));
> > }
> >
> > -  if (POINTER_TYPE_P (TREE_TYPE (exp)))
> > -   as = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (exp)));
> > +  if (POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (exp, 0))))
> > +   as = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (exp, 0))));
> >else
> > as = ADDR_SPACE_GENERIC;
> 
> TREE_OPERAND (exp, 0) always has pointer type so I'd change this to
> 
> as = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (exp, 0))));
> 
> > @@ -4467,7 +4467,11 @@ expand_debug_expr (tree exp)
> >   return NULL;
> > }
> >
> > -  as = TYPE_ADDR_SPACE (TREE_TYPE (exp));
> > +  if (POINTER_TYPE_P (TREE_TYPE (exp)))
> > +as = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (exp)));
> > +  else
> > +as = TYPE_ADDR_SPACE (TREE_TYPE (exp));
> > +
> 
> Likewise - TREE_TYPE (exp) is always a pointer type.  Otherwise the
> patch looks ok to me.
> 
> Richard.
> 
> >op0 = convert_debug_memory_address (mode, XEXP (op0, 0), as);
> >
> >return op0;

Modified patch attached. This fixes PR 52472 (the ADDR_EXPR case) as well 
as a couple of ICEs in gcc.target/avr/torture/addr-space-2-x.c (MEM_REF
case). I've also added a new testcase for the PR.
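
To make the ADDR_EXPR case concrete, here is a rough stand-alone sketch.
It is not GCC code: toy_type, toy_as and toy_target_addr_space are made-up
names that only mimic the shape of the problem, namely a pointer type that
is itself generic while the object it points to lives in another address
space.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for tree types; none of these names are GCC API.  */
enum toy_as { TOY_AS_GENERIC = 0, TOY_AS_FLASH = 1 };

struct toy_type
{
  int is_pointer;                 /* Nonzero for a pointer type.  */
  enum toy_as addr_space;         /* Address space recorded on this type.  */
  const struct toy_type *pointee; /* Pointed-to type, or NULL.  */
};

/* An object type placed in a non-generic address space...  */
static const struct toy_type toy_flash_char = { 0, TOY_AS_FLASH, NULL };

/* ...and an ordinary pointer to it.  The pointer type itself is generic.  */
static const struct toy_type toy_ptr_to_flash
  = { 1, TOY_AS_GENERIC, &toy_flash_char };

/* Address space that an address of type PTR_TYPE refers to: take it from
   the pointed-to type, not from the pointer type.  */
static enum toy_as
toy_target_addr_space (const struct toy_type *ptr_type)
{
  assert (ptr_type->is_pointer && ptr_type->pointee != NULL);
  return ptr_type->pointee->addr_space;
}
```

Reading toy_ptr_to_flash.addr_space directly yields TOY_AS_GENERIC, which
is the mistake described above; toy_target_addr_space looks one level down
and yields TOY_AS_FLASH.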

I don't have commit access - could someone apply this for me please?
It'd be great if this was backported to 4.9 and 4.8 branches as well.

Regards
Senthil


gcc/ChangeLog

2014-06-05  Senthil Kumar Selvaraj  

PR target/52472
* cfgexpand.c (expand_debug_expr): Use address space of nested
TREE_TYPE for ADDR_EXPR and MEM_REF.

gcc/testsuite/ChangeLog

2014-06-05  Senthil Kumar Selvaraj  

PR target/52472
* gcc.target/avr/pr52472.c: New test.

diff --git gcc/cfgexpand.c gcc/cfgexpand.c
index 8b0e466..e161cb7 100644
--- gcc/cfgexpand.c
+++ gcc/cfgexpand.c
@@ -3941,10 +3941,7 @@ expand_debug_expr (tree exp)
  op0 = plus_constant (inner_mode, op0, INTVAL (op1));
}
 
-  if (POINTER_TYPE_P (TREE_TYPE (exp)))
-   as = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (exp)));
-  else
-   as = ADDR_SPACE_GENERIC;
+  as = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (exp, 0))));
 
   op0 = convert_debug_memory_address (targetm.addr_space.address_mode (as),
  op0, as);
@@ -4467,7 +4464,7 @@ expand_debug_expr (tree exp)
  return NULL;
}
 
-  as = TYPE_ADDR_SPACE (TREE_TYPE (exp));
+  as = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (exp)));
   op0 = convert_debug_memory_address (mode, XEXP (op0, 0), as);
 
   return op0;
diff --git gcc/testsuite/gcc.target/avr/pr52472.c 
gcc/testsuite/gcc.target/avr/pr52472.c
new file mode 100644
index 000..701cfb4
--- /dev/null
+++ gcc/testsuite/gcc.target/avr/pr52472.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-Os -g -Wno-pointer-to-int-cast" } */
+
+/* This testcase exposes PR52472. expand_debug_expr mistakenly
+   considers the address space of data to be generic, and
+   asserts that PSImode pointers aren't valid in the generic 
+   address space. */
+
+

Re: [PATCH][C++] Fix PR61004

2014-06-05 Thread Richard Biener

On Tue, May 6, 2014 at 12:55 PM, Richard Biener  wrote:
> On Wed, 30 Apr 2014, Richard Biener wrote:
>
>>
>> With no longer recording alias subsets using BINFOs we now emit
>> bogus alias warnings for accessing empty bases.  The following
>> avoids this, accessing those with an incompatible alias-set is
>> harmless.
>>
>> Bootstrapped and tested on x86_64-unknown-linux-gnu, ok?
>
> Ping.

Ping.

Thanks,
Richard.

>> Thanks,
>> Richard.
>>
>> 2014-04-30  Richard Biener  
>>
>>   PR c++/61004
>>   * typeck.c (cp_build_indirect_ref): Do not emit strict-aliasing
>>   warnings for accessing empty classes.
>>
>>   * g++.dg/diagnostic/pr61004.C: New testcase.
>>
>> Index: gcc/cp/typeck.c
>> ===
>> --- gcc/cp/typeck.c   (revision 209928)
>> +++ gcc/cp/typeck.c   (working copy)
>> @@ -2921,8 +2921,9 @@ cp_build_indirect_ref (tree ptr, ref_ope
>>of  the  result  is  "T."  */
>>tree t = TREE_TYPE (type);
>>
>> -  if (CONVERT_EXPR_P (ptr)
>> -  || TREE_CODE (ptr) == VIEW_CONVERT_EXPR)
>> +  if ((CONVERT_EXPR_P (ptr)
>> +|| TREE_CODE (ptr) == VIEW_CONVERT_EXPR)
>> +   && (!CLASS_TYPE_P (t) || !CLASSTYPE_EMPTY_P (t)))
>>   {
>> /* If a warning is issued, mark it to avoid duplicates from
>>the backend.  This only needs to be done at
>> Index: gcc/testsuite/g++.dg/diagnostic/pr61004.C
>> ===
>> --- gcc/testsuite/g++.dg/diagnostic/pr61004.C (revision 0)
>> +++ gcc/testsuite/g++.dg/diagnostic/pr61004.C (working copy)
>> @@ -0,0 +1,11 @@
>> +// { dg-do compile }
>> +// { dg-options "-O2 -Wall" }
>> +
>> +struct A{ };
>> +struct B:A{};
>> +void f(A const&);
>> +int main()
>> +{
>> +  B b;
>> +  f(b); // { dg-bogus "strict-aliasing" }
>> +}
>>

[PATCH, tree-ssa] Optimize loop invariant phi defs constants

2014-06-05 Thread Christian Bruel

Hello,

while checking why a loop snippet like

  for (i = 0; i <= 5000; i++)
if (b)
  a = 2;
else
  a = x;

was not optimized at -O2 (unless loop unrolling or loop unswitching is
used), I found out that the case was already partially handled by Richard
in PR43934. So this patch just assigns a cost to constant PHI arguments to
allow the whole test to be hoisted out of the loop.
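
As an illustration only (hand-written, not compiler output), the effect
being aimed for is roughly the difference between the two functions below.
The names loop_before and loop_after are invented for this sketch, and the
snippet is wrapped in functions so that it compiles on its own.

```c
#include <assert.h>

/* Original shape: the test on B is evaluated on every iteration even
   though neither B nor X changes inside the loop.  */
static int
loop_before (int b, int x)
{
  int a = 0;
  int i;

  for (i = 0; i <= 5000; i++)
    if (b)
      a = 2;
    else
      a = x;
  return a;
}

/* Hand-hoisted shape: the invariant selection between the constant 2 and
   X is computed once, ahead of the loop.  */
static int
loop_after (int b, int x)
{
  int a = 0;
  int i;
  const int hoisted = b ? 2 : x;

  for (i = 0; i <= 5000; i++)
    a = hoisted;
  return a;
}
```

Both functions return the same value for any B and X; only the amount of
work done inside the loop differs.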

Richard, does this seem reasonable and OK for 4.10 ?

bootstrapped/regtested for x86

many thanks

Christian





2014-06-03  Christian Bruel  

	PR tree-optimization/43934
	* tree-ssa-loop-im.c (determine_max_movement): Add PHI def constant cost.
2014-06-03  Christian Bruel  

	PR tree-optimization/43934
	* gcc.dg/tree-ssa/ssa-lim-8.c: New testcase.

Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-8.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-8.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-8.c	(working copy)
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-lim1-details" } */
+
+void bar (int);
+void foo (int n, int m)
+{
+  unsigned i;
+  for (i = 0; i < n; ++i)
+{
+  int x;
+  if (m < 0)
+	x = 1;
+  else
+	x = m;
+  bar (x);
+}
+}
+
+/* { dg-final { scan-tree-dump-times "Moving PHI node" 1 "lim1"  } } */
+/* { dg-final { cleanup-tree-dump "lim1" } } */
Index: gcc/tree-ssa-loop-im.c
===
--- gcc/tree-ssa-loop-im.c	(revision 211255)
+++ gcc/tree-ssa-loop-im.c	(working copy)
@@ -719,8 +719,21 @@ determine_max_movement (gimple stmt, bool must_pre
   FOR_EACH_PHI_ARG (use_p, stmt, iter, SSA_OP_USE)
 	{
 	  val = USE_FROM_PTR (use_p);
+
 	  if (TREE_CODE (val) != SSA_NAME)
-	continue;
+	{
+	  unsigned cst_cost = 1;
+
+	  gcc_assert (TREE_CODE (val) == INTEGER_CST
+			  || TREE_CODE (val) == REAL_CST
+			  || TREE_CODE (val) == VECTOR_CST
+			  || TREE_CODE (val) == COMPLEX_CST
+			  || TREE_CODE (val) == ADDR_EXPR);
+
+	  min_cost = MIN (min_cost, cst_cost);
+	  total_cost += cst_cost;
+	  continue;
+	}
 	  if (!add_dependency (val, lim_data, loop, false))
 	return false;
 	  def_data = get_lim_data (SSA_NAME_DEF_STMT (val));

[AArch64] [COMMITTED] Restructure callee save slot allocation logic.

2014-06-05 Thread Marcus Shawcroft

Refactor the existing implementation to use the symbols SLOT_REQUIRED / 
SLOT_NOT_REQUIRED.


The magic number used for SLOT_REQUIRED is moved from 0 to -1 in
order to distinguish between the cases of SLOT_REQUIRED and offset 0.
This distinction is important for a subsequent patch to restructure the
handling of X29/X30.
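
A reduced sketch of why the two sentinels matter, outside of GCC. All
TOY_* names, the four-register file and the 8-byte word are assumptions
made for this example; only the -1 / -2 values and the >= 0 test mirror
the patch.

```c
#include <assert.h>

#define TOY_SLOT_NOT_REQUIRED (-2)
#define TOY_SLOT_REQUIRED (-1)
#define TOY_NUM_REGS 4
#define TOY_WORD 8

/* Mark each register, then hand out slots to the ones marked as required.
   After this, a non-negative entry is a real offset, 0 included.  */
static void
toy_layout (const int needs_save[TOY_NUM_REGS], long reg_offset[TOY_NUM_REGS])
{
  long offset = 0;
  int regno;

  for (regno = 0; regno < TOY_NUM_REGS; regno++)
    reg_offset[regno] = (needs_save[regno]
                         ? TOY_SLOT_REQUIRED : TOY_SLOT_NOT_REQUIRED);

  for (regno = 0; regno < TOY_NUM_REGS; regno++)
    if (reg_offset[regno] == TOY_SLOT_REQUIRED)
      {
        reg_offset[regno] = offset;
        offset += TOY_WORD;
      }
}

/* Nonzero if REGNO ended up with a save slot.  */
static int
toy_saved_on_entry (const long reg_offset[TOY_NUM_REGS], int regno)
{
  return reg_offset[regno] >= 0;
}

/* Example: registers 0 and 2 need saving; return REGNO's recorded value.  */
static long
toy_example_offset (int regno)
{
  static const int needs_save[TOY_NUM_REGS] = { 1, 0, 1, 0 };
  long reg_offset[TOY_NUM_REGS];

  toy_layout (needs_save, reg_offset);
  assert (toy_saved_on_entry (reg_offset, 0));
  assert (!toy_saved_on_entry (reg_offset, 1));
  return reg_offset[regno];
}
```

In the example, register 0 ends up with offset 0. With the old encoding,
where 0 also meant "slot required", that value could not be told apart
from a register still waiting for its slot.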


Committed.

/Marcus

2014-06-05  Marcus Shawcroft  

* config/aarch64/aarch64.c (SLOT_NOT_REQUIRED, SLOT_REQUIRED): Define.
(aarch64_layout_frame): Use SLOT_NOT_REQUIRED and SLOT_REQUIRED.
(aarch64_register_saved_on_entry): Adjust test.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index c67bac4..15ac880 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1812,28 +1812,32 @@ aarch64_layout_frame (void)
   if (reload_completed && cfun->machine->frame.laid_out)
 return;
 
+#define SLOT_NOT_REQUIRED (-2)
+#define SLOT_REQUIRED (-1)
+
   /* First mark all the registers that really need to be saved...  */
   for (regno = R0_REGNUM; regno <= R30_REGNUM; regno++)
-cfun->machine->frame.reg_offset[regno] = -1;
+cfun->machine->frame.reg_offset[regno] = SLOT_NOT_REQUIRED;
 
   for (regno = V0_REGNUM; regno <= V31_REGNUM; regno++)
-cfun->machine->frame.reg_offset[regno] = -1;
+cfun->machine->frame.reg_offset[regno] = SLOT_NOT_REQUIRED;
 
   /* ... that includes the eh data registers (if needed)...  */
   if (crtl->calls_eh_return)
 for (regno = 0; EH_RETURN_DATA_REGNO (regno) != INVALID_REGNUM; regno++)
-  cfun->machine->frame.reg_offset[EH_RETURN_DATA_REGNO (regno)] = 0;
+  cfun->machine->frame.reg_offset[EH_RETURN_DATA_REGNO (regno)]
+	= SLOT_REQUIRED;
 
   /* ... and any callee saved register that dataflow says is live.  */
   for (regno = R0_REGNUM; regno <= R30_REGNUM; regno++)
 if (df_regs_ever_live_p (regno)
 	&& !call_used_regs[regno])
-  cfun->machine->frame.reg_offset[regno] = 0;
+  cfun->machine->frame.reg_offset[regno] = SLOT_REQUIRED;
 
   for (regno = V0_REGNUM; regno <= V31_REGNUM; regno++)
 if (df_regs_ever_live_p (regno)
 	&& !call_used_regs[regno])
-  cfun->machine->frame.reg_offset[regno] = 0;
+  cfun->machine->frame.reg_offset[regno] = SLOT_REQUIRED;
 
   if (frame_pointer_needed)
 {
@@ -1844,14 +1848,14 @@ aarch64_layout_frame (void)
 
   /* Now assign stack slots for them.  */
   for (regno = R0_REGNUM; regno <= R28_REGNUM; regno++)
-if (cfun->machine->frame.reg_offset[regno] != -1)
+if (cfun->machine->frame.reg_offset[regno] == SLOT_REQUIRED)
   {
 	cfun->machine->frame.reg_offset[regno] = offset;
 	offset += UNITS_PER_WORD;
   }
 
   for (regno = V0_REGNUM; regno <= V31_REGNUM; regno++)
-if (cfun->machine->frame.reg_offset[regno] != -1)
+if (cfun->machine->frame.reg_offset[regno] == SLOT_REQUIRED)
   {
 	cfun->machine->frame.reg_offset[regno] = offset;
 	offset += UNITS_PER_WORD;
@@ -1863,7 +1867,7 @@ aarch64_layout_frame (void)
   offset += UNITS_PER_WORD;
 }
 
-  if (cfun->machine->frame.reg_offset[R30_REGNUM] != -1)
+  if (cfun->machine->frame.reg_offset[R30_REGNUM] == SLOT_REQUIRED)
 {
   cfun->machine->frame.reg_offset[R30_REGNUM] = offset;
   offset += UNITS_PER_WORD;
@@ -1896,7 +1900,7 @@ aarch64_set_frame_expr (rtx frame_pattern)
 static bool
 aarch64_register_saved_on_entry (int regno)
 {
-  return cfun->machine->frame.reg_offset[regno] != -1;
+  return cfun->machine->frame.reg_offset[regno] >= 0;
 }
 
 
-- 
1.7.9.5

Re: patch to fix PR61325

2014-06-05 Thread James Greenhalgh

On Wed, Jun 04, 2014 at 08:00:51PM +0100, Vladimir Makarov wrote:
> On 2014-06-03, 6:02 PM, James Greenhalgh wrote:
> > On Thu, May 29, 2014 at 06:38:22PM +0100, Vladimir Makarov wrote:
> >>The following patch PR61325.  The details can be found on
> >>
> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61325
> >>
> >>The patch was bootstrapped and tested on x86/x86-64.
> >>
> >>Committed as rev. 211060 to gcc-4.9 branch and as rev.211061 to trunk.
> >>
> >> 2014-05-29  Vladimir Makarov  
> >>
> >>  PR rtl-optimization/61325
> >>  * lra-constraints.c (process_address): Rename to
> >>  process_address_1.
> >>  (process_address): New function.
> >>
> >> 2014-05-29  Vladimir Makarov  
> >>
> >>  PR rtl-optimization/61325
> >>  * gcc.target/aarch64/pr61325.c: New.
> >
> > Hi Vlad,
> >
> > This patch appears to cause issues where the ARM backend can get stuck in a
> > seemingly infinite loop.
> >
> > Compiling:
> >
> > ./gcc.c-torture/compile/unalign-1.c
> >
> 
> Sorry for inconvenience.
> 
> Could you test the following patch

Hi Vlad,

The patch works for me. I've bootstrapped and tested it on
arm-none-linux-gnueabihf. It has also had a regression run on aarch64-none-elf.
Christophe, did you get a chance to do any more thorough testing of the patch?

From my perspective, I think this should go in, and be backported to 4.9.

Thanks,
James

Re: [PATCH][C++] Fix PR61004

2014-06-05 Thread Jason Merrill


OK.

Jason

Re: [PATCH, tree-ssa] Optimize loop invariant phi defs constants

2014-06-05 Thread Richard Biener

On Thu, 5 Jun 2014, Christian Bruel wrote:

> Hello,
> 
> while checking why a loop snippet like
> 
>   for (i = 0; i <= 5000; i++)
> if (b)
>   a = 2;
> else
>   a = x;
> 
> was not optimized in -O2 (unless loop unrolling or loop switching), I
> found out that the case was already  partially handled by Richard in
> PR43934. So this patch just adds a cost to the phi defs constants to
> allow the whole test to be hoisted out of the loop.
> 
> Richard, does this seem reasonable and OK for 4.10 ?

Ok with dropping the assert, constant-propagating cst_cost and
adding a comment like /* Assign const 1 to invariants.  */
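
For illustration, a stand-alone sketch of what the accounting might look
like with those changes applied. This is not the tree-ssa-loop-im.c code:
toy_phi_arg, toy_costs and the cost of 3 in the example are invented, and
the SSA-name arm is simplified to a plain cost lookup.

```c
#include <assert.h>
#include <limits.h>

#define TOY_MIN(a, b) ((a) < (b) ? (a) : (b))

/* One PHI argument: either an SSA name whose definition has some cost,
   or an invariant such as a constant.  */
struct toy_phi_arg
{
  int is_ssa_name;
  unsigned def_cost; /* Only meaningful when is_ssa_name is nonzero.  */
};

struct toy_costs
{
  unsigned min_cost;
  unsigned total_cost;
};

/* Accumulate the minimum and total cost over the N arguments in ARGS.  */
static struct toy_costs
toy_phi_arg_costs (const struct toy_phi_arg *args, int n)
{
  struct toy_costs c = { UINT_MAX, 0 };
  int i;

  for (i = 0; i < n; i++)
    {
      if (!args[i].is_ssa_name)
        {
          /* Assign cost 1 to invariants.  */
          c.min_cost = TOY_MIN (c.min_cost, 1u);
          c.total_cost += 1;
          continue;
        }
      c.min_cost = TOY_MIN (c.min_cost, args[i].def_cost);
      c.total_cost += args[i].def_cost;
    }
  return c;
}

/* Example: a PHI with one constant argument and one SSA name whose
   definition has the made-up cost 3.  */
static struct toy_costs
toy_example (void)
{
  static const struct toy_phi_arg args[] = { { 0, 0 }, { 1, 3 } };

  return toy_phi_arg_costs (args, 2);
}
```

Before the change the constant argument contributed nothing to either
total; here it contributes 1 to both the minimum and the sum.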

Thanks,
Richard.

Re: C++ PATCH for c++/61382 (init-list evaluation order)

2014-06-05 Thread Jason Merrill


Oops, thanks.

Jason

RE: [PATCH, Fortan] fix initialization of flag_errno_math and flag_associative_math

2014-06-05 Thread VandeVondele Joost

I have now verified that both new testcases indeed pass with

gcc version 4.6.0 20100520 (experimental) [trunk revision 159620] (GCC)

that is the revision where Tobias enabled associative math by default. So
this patch fixes a regression.

[COMMITTED][AArch64] Unify callee save slot allocation for X29 and X30.

2014-06-05 Thread Marcus Shawcroft

This patch restructures the callee save slot allocation code to handle 
X29 and X30 consistently with the other core registers.  The patch also 
ensures that the offset recorded for X30 is accurate.
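
A reduced model of the resulting layout order, for illustration only. The
TOY_* names and toy_fp_example_offset are invented, the word size of 8 is
an assumption, and the vector registers and padding are left out.

```c
#include <assert.h>

#define TOY_SLOT_NOT_REQUIRED (-2)
#define TOY_SLOT_REQUIRED (-1)
#define TOY_R29 29
#define TOY_R30 30
#define TOY_NUM_CORE_REGS 31
#define TOY_WORD 8

/* Assign slots to the core registers marked in REG_OFFSET and return the
   number of bytes used.  With a frame pointer, FP and LR get fixed slots
   first; everything else, X30 included, goes through the same loop.  */
static long
toy_layout_core (long reg_offset[TOY_NUM_CORE_REGS], int frame_pointer_needed)
{
  long offset = 0;
  int regno;

  if (frame_pointer_needed)
    {
      /* FP and LR are placed in the linkage record.  */
      reg_offset[TOY_R29] = 0;
      reg_offset[TOY_R30] = TOY_WORD;
      offset += 2 * TOY_WORD;
    }

  for (regno = 0; regno <= TOY_R30; regno++)
    if (reg_offset[regno] == TOY_SLOT_REQUIRED)
      {
        reg_offset[regno] = offset;
        offset += TOY_WORD;
      }
  return offset;
}

/* Example: X19 and X30 are marked as needing a slot; return the value
   recorded for REGNO after layout.  */
static long
toy_fp_example_offset (int regno, int frame_pointer_needed)
{
  long reg_offset[TOY_NUM_CORE_REGS];
  int i;

  for (i = 0; i < TOY_NUM_CORE_REGS; i++)
    reg_offset[i] = TOY_SLOT_NOT_REQUIRED;
  reg_offset[19] = TOY_SLOT_REQUIRED;
  reg_offset[TOY_R30] = TOY_SLOT_REQUIRED;
  toy_layout_core (reg_offset, frame_pointer_needed);
  return reg_offset[regno];
}
```

With a frame pointer the example gives X29 offset 0, X30 offset 8 and X19
offset 16; without one, X19 gets 0, X30 gets 8 and X29 stays at
TOY_SLOT_NOT_REQUIRED.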


Committed.

/Marcus

2014-06-05  Marcus Shawcroft  
Jiong Wang  

* config/aarch64/aarch64.c (aarch64_layout_frame): Correct
initialization of R30 offset.  Update offset.  Iterate core
registers up to X30.  Remove X29, X30 specific code.

From 4063c0f8ee2914a07001072b60da397a0bdbaa6e Mon Sep 17 00:00:00 2001
From: Marcus Shawcroft 
Date: Tue, 22 Apr 2014 18:12:53 +0100
Subject: [PATCH 4/8] [AArch64] Handle FP/LR slot allocation cleanly.

Rebase of Jiong's patch.
---
 gcc/config/aarch64/aarch64.c |   18 --
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 15ac880..6d9fe4d 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1841,13 +1841,15 @@ aarch64_layout_frame (void)
 
   if (frame_pointer_needed)
 {
-  cfun->machine->frame.reg_offset[R30_REGNUM] = 0;
+  /* FP and LR are placed in the linkage record.  */
   cfun->machine->frame.reg_offset[R29_REGNUM] = 0;
+  cfun->machine->frame.reg_offset[R30_REGNUM] = UNITS_PER_WORD;
   cfun->machine->frame.hardfp_offset = 2 * UNITS_PER_WORD;
+  offset += 2 * UNITS_PER_WORD;
 }
 
   /* Now assign stack slots for them.  */
-  for (regno = R0_REGNUM; regno <= R28_REGNUM; regno++)
+  for (regno = R0_REGNUM; regno <= R30_REGNUM; regno++)
 if (cfun->machine->frame.reg_offset[regno] == SLOT_REQUIRED)
   {
 	cfun->machine->frame.reg_offset[regno] = offset;
@@ -1861,18 +1863,6 @@ aarch64_layout_frame (void)
 	offset += UNITS_PER_WORD;
   }
 
-  if (frame_pointer_needed)
-{
-  cfun->machine->frame.reg_offset[R29_REGNUM] = offset;
-  offset += UNITS_PER_WORD;
-}
-
-  if (cfun->machine->frame.reg_offset[R30_REGNUM] == SLOT_REQUIRED)
-{
-  cfun->machine->frame.reg_offset[R30_REGNUM] = offset;
-  offset += UNITS_PER_WORD;
-}
-
   cfun->machine->frame.padding0 =
 (AARCH64_ROUND_UP (offset, STACK_BOUNDARY / BITS_PER_UNIT) - offset);
   offset = AARCH64_ROUND_UP (offset, STACK_BOUNDARY / BITS_PER_UNIT);
-- 
1.7.9.5

Re: [PATCH, Pointer Bounds Checker 13/x] Early versioning

2014-06-05 Thread Ilya Enkovich

2014-06-05 15:58 GMT+04:00 Richard Biener :
> On Thu, Jun 5, 2014 at 1:18 PM, Ilya Enkovich  wrote:
>> On 04 Jun 11:59, Richard Biener wrote:
>>> On Wed, Jun 4, 2014 at 8:46 AM, Jeff Law  wrote:
>>> > On 06/03/14 03:29, Richard Biener wrote:
>>> >>
>>> >> On Tue, Jun 3, 2014 at 7:55 AM, Ilya Enkovich 
>>> >> wrote:
>>> >>>
>>> >>> 2014-06-02 21:27 GMT+04:00 Jeff Law :
>>> 
>>>  On 06/02/14 04:48, Ilya Enkovich wrote:
>>> >>
>>> >>
>>> >> Hmm, so if I understand things correctly, src_fun has no loop
>>> >> structures attached, thus there's nothing to copy.  Presumably at
>>> >> some later point we build loop structures for the copy from scratch?
>>> >
>>> >
>>> > I suppose it is just a simple bug with absent NULL pointer check.  
>>> > Here
>>> > is
>>> > original code:
>>> >
>>> > /* Duplicate the loop tree, if available and wanted.  */
>>> > if (loops_for_fn (src_cfun) != NULL
>>> > && current_loops != NULL)
>>> >   {
>>> > copy_loops (id, entry_block_map->loop_father,
>>> > get_loop (src_cfun, 0));
>>> > /* Defer to cfgcleanup to update loop-father fields of
>>> > basic-blocks.  */
>>> > loops_state_set (LOOPS_NEED_FIXUP);
>>> >   }
>>> >
>>> > /* If the loop tree in the source function needed fixup, mark the
>>> >destination loop tree for fixup, too.  */
>>> > if (loops_for_fn (src_cfun)->state & LOOPS_NEED_FIXUP)
>>> >   loops_state_set (LOOPS_NEED_FIXUP);
>>> >
>>> > As you may see we have check for absent loops structure in the first
>>> > if-statement and no check in the second one.  I hit segfault and added
>>> > the
>>> > check.
>>> 
>>> 
>>>  Downthread you indicated you're not in SSA form which might explain the
>>>  inconsistency here.  If so, then we need to make sure that the loop & 
>>>  df
>>>  structures do get set up properly later.
>>> >>>
>>> >>>
>>> >>> That is what init_data_structures pass will do for us as Richard 
>>> >>> pointed.
>>> >>> Right?
>>> >>
>>> >>
>>> >> loops are set up during the CFG construction and thus are available
>>> >> everywhere.
>>> >
>>> > Which would argue that the hunk that checks for the loop tree's existence
>>> > before accessing it should not be needed.  Ilya -- is it possible you hit
>>> > this prior to Richi's work to build the loop structures as part of CFG
>>> > construction and maintain them throughout compilation.
>>>
>>> That's likely.  It's still on my list of janitor things to do to remove all
>>> those if (current_loops) checks ...
>>
>> I tried to remove this loops check and got no failures this time.  So, here 
>> is a new patch version.
>>
>> Bootstrapped and tested on linux-x86_64.
>
> Ok (you can commit this now).

Thanks! Committed to trunk

Ilya

>
> Thanks,
> Richard.
>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2014-06-05  Ilya Enkovich  
>>
>> * tree-inline.c (tree_function_versioning): Check DF info existence
>> before accessing it.
>>
>>
>> diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
>> index 4293241..2972346 100644
>> --- a/gcc/tree-inline.c
>> +++ b/gcc/tree-inline.c
>> @@ -5350,8 +5350,9 @@ tree_function_versioning (tree old_decl, tree new_decl,
>>DECL_ARGUMENTS (new_decl) = DECL_ARGUMENTS (old_decl);
>>initialize_cfun (new_decl, old_decl,
>>old_entry_block->count);
>> -  DECL_STRUCT_FUNCTION (new_decl)->gimple_df->ipa_pta
>> -= id.src_cfun->gimple_df->ipa_pta;
>> +  if (DECL_STRUCT_FUNCTION (new_decl)->gimple_df)
>> +DECL_STRUCT_FUNCTION (new_decl)->gimple_df->ipa_pta
>> +  = id.src_cfun->gimple_df->ipa_pta;
>>
>>/* Copy the function's static chain.  */
>>p = DECL_STRUCT_FUNCTION (old_decl)->static_chain_decl;

[C++ RFH] PR 56961

2014-06-05 Thread Paolo Carlini


Hi,

in this minor issue, after a permerror about "passing ‘volatile foo’ as 
‘this’ argument discards qualifiers" we crash with an infinite recursion 
in the gimplifier. The testcase:


struct foo { };

typedef struct
{
volatile foo fields;
} CSPHandleState;

CSPHandleState a;

void fn1 ()
{
CSPHandleState b;
b.fields = foo();
}

involves the empty struct foo, and I noticed that the crash doesn't 
happen otherwise. Therefore this comment in cp-gimplify.c seems relevant:


/* Remove any copies of empty classes.  We check that the RHS
   has a simple form so that TARGET_EXPRs and non-empty
   CONSTRUCTORs get reduced properly, and we leave the return
   slot optimization alone because it isn't a copy (FIXME so it
   shouldn't be represented as one).

   Also drop volatile variables on the RHS to avoid infinite
   recursion from gimplify_expr trying to load the value.  */
if (!TREE_SIDE_EFFECTS (op1)
    || (DECL_P (op1) && TREE_THIS_VOLATILE (op1)))
  *expr_p = op0;
else if (TREE_CODE (op1) == MEM_REF
         && TREE_THIS_VOLATILE (op1))
  {
    /* Similarly for volatile MEM_REFs on the RHS.  */

and in fact we get there, op1 is volatile, but we don't adjust things 
because op1 is a COMPONENT_REF, not a decl, not a MEM_REF. Then I'm 
wondering if we should handle in the same place the COMPONENT_REF 
case?!? Eg, brutally hacking the above to handle a COMPONENT_REF like a 
DECL_P avoids the infinite recursion. The below is *expr_p (its arg0 is 
op0 and its arg1 is op1). Hints?


Thanks!
Paolo.




unit size 
align 8 symtab 0 alias set -1 canonical type 0x76835150
fields 
nonlocal decl_4 VOID file 56961.C line 1 col 12
align 1 context  result 0x76835150 foo>

> context 
full-name "struct foo"
X() X(constX&) this=(X&) n_parents=0 use_template=0 interface-unknown
pointer_to_this  reference_to_this 
 chain >

side-effects
arg 0 
used ignored QI file 56961.C line 13 col 19 size 0x766d5918 8> unit size 

align 8 context >
arg 1 type type_6 QI size  unit size 0x766d5930 1>
align 8 symtab 0 alias set -1 canonical type 0x76835498 fields 
 context 0x766e0170 D.1>

full-name "volatile struct foo"
X() X(constX&) this=(X&) n_parents=0 use_template=0 interface-unknown
pointer_to_this >
side-effects volatile
arg 0 CSPHandleState>
addressable used tree_1 decl_5 QI file 56961.C line 12 col 18 size 
 unit size 

align 8 context >
arg 1 foo>
side-effects volatile used nonlocal decl_3 QI file 56961.C line 5 col 16 
size  unit size 

align 8 offset_align 128
offset 
bit offset  context 0x76835348 CSPHandleState> chain >>>

Re: [C++ RFH] PR 56961

2014-06-05 Thread Richard Biener

On Thu, Jun 5, 2014 at 2:59 PM, Paolo Carlini  wrote:
> Hi,
>
> in this minor issue, after a permerror about "passing ‘volatile foo’ as
> ‘this’ argument discards qualifiers" we crash with an infinite recursion in
> the gimplifier. The testcase:
>
> struct foo { };
>
> typedef struct
> {
> volatile foo fields;
> } CSPHandleState;
>
> CSPHandleState a;
>
> void fn1 ()
> {
> CSPHandleState b;
> b.fields = foo();
> }
>
> involves the empty struct foo, and I noticed that the crash doesn't happen
> otherwise. Therefore this comment in cp-gimplify.c seems relevant:
>
> /* Remove any copies of empty classes.  We check that the RHS
>    has a simple form so that TARGET_EXPRs and non-empty
>    CONSTRUCTORs get reduced properly, and we leave the return
>    slot optimization alone because it isn't a copy (FIXME so it
>    shouldn't be represented as one).
>
>    Also drop volatile variables on the RHS to avoid infinite
>    recursion from gimplify_expr trying to load the value.  */
> if (!TREE_SIDE_EFFECTS (op1)
>     || (DECL_P (op1) && TREE_THIS_VOLATILE (op1)))
>   *expr_p = op0;
> else if (TREE_CODE (op1) == MEM_REF
>          && TREE_THIS_VOLATILE (op1))
>   {
>     /* Similarly for volatile MEM_REFs on the RHS.  */
>
> and in fact we get there, op1 is volatile, but we don't adjust things
> because op1 is a COMPONENT_REF, not a decl, not a MEM_REF. Then I'm
> wondering if we should handle in the same place the COMPONENT_REF case?!?
> Eg, brutally hacking the above to handle a COMPONENT_REF like a DECL_P
> avoids the infinite recursion. The below is *expr_p (its arg0 is op0 and its
> arg1 is op1). Hints?

See my comment in the PR.  You need to handle all references/decls
here but preserve side-effects in operands of references.  For example
by gimplifying as unused address (just an idea):

Index: gcc/cp/cp-gimplify.c
===
--- gcc/cp/cp-gimplify.c(revision 211262)
+++ gcc/cp/cp-gimplify.c(working copy)
@@ -629,18 +629,14 @@ cp_gimplify_expr (tree *expr_p, gimple_s

   Also drop volatile variables on the RHS to avoid infinite
   recursion from gimplify_expr trying to load the value.  */
-   if (!TREE_SIDE_EFFECTS (op1)
-   || (DECL_P (op1) && TREE_THIS_VOLATILE (op1)))
+   if (!TREE_SIDE_EFFECTS (op1))
  *expr_p = op0;
-   else if (TREE_CODE (op1) == MEM_REF
-&& TREE_THIS_VOLATILE (op1))
+   else if (TREE_THIS_VOLATILE (op1)
+&& (REFERENCE_CLASS_P (op1) || DECL_P (op1)))
  {
-   /* Similarly for volatile MEM_REFs on the RHS.  */
-   if (!TREE_SIDE_EFFECTS (TREE_OPERAND (op1, 0)))
- *expr_p = op0;
-   else
- *expr_p = build2 (COMPOUND_EXPR, TREE_TYPE (*expr_p),
-   TREE_OPERAND (op1, 0), op0);
+   *expr_p = build2 (COMPOUND_EXPR, TREE_TYPE (*expr_p),
+ op0, build_fold_addr_expr (op1));
+
  }
else
  *expr_p = build2 (COMPOUND_EXPR, TREE_TYPE (*expr_p),

Richard.
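
To make the difference between the two conditions concrete, here is a toy model in plain C. It is only a sketch: the toy_* names, the enum and the struct are hypothetical stand-ins and not GCC's real tree API. It models TREE_SIDE_EFFECTS, TREE_THIS_VOLATILE, DECL_P and REFERENCE_CLASS_P as simple flags and checks, and reports whether the RHS is handled without being loaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a few tree codes; not GCC's definitions.  */
enum toy_code { TOY_VAR_DECL, TOY_MEM_REF, TOY_COMPONENT_REF, TOY_CALL_EXPR };

struct toy_tree
{
  enum toy_code code;
  bool side_effects;   /* models TREE_SIDE_EFFECTS */
  bool this_volatile;  /* models TREE_THIS_VOLATILE */
};

/* Models DECL_P: only the declaration code counts here.  */
bool toy_decl_p (const struct toy_tree *t)
{
  assert (t != NULL);
  return t->code == TOY_VAR_DECL;
}

/* Models REFERENCE_CLASS_P for the two reference codes in this toy.  */
bool toy_reference_class_p (const struct toy_tree *t)
{
  assert (t != NULL);
  return t->code == TOY_MEM_REF || t->code == TOY_COMPONENT_REF;
}

/* Condition before the patch: true if op1 is dropped or special-cased
   instead of being loaded.  */
bool old_check_skips_load (const struct toy_tree *op1)
{
  if (!op1->side_effects || (toy_decl_p (op1) && op1->this_volatile))
    return true;
  return op1->code == TOY_MEM_REF && op1->this_volatile;
}

/* Condition after the patch: every volatile reference or decl is
   special-cased.  */
bool new_check_skips_load (const struct toy_tree *op1)
{
  if (!op1->side_effects)
    return true;
  return op1->this_volatile
	 && (toy_reference_class_p (op1) || toy_decl_p (op1));
}
```

In this model a volatile COMPONENT_REF falls through the old check and is caught by the new one, which matches the behaviour Paolo describes for the testcase.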


> Thanks!
> Paolo.
>
> 
>
>  type  size 
> unit size 
> align 8 symtab 0 alias set -1 canonical type 0x76835150
> fields 
> nonlocal decl_4 VOID file 56961.C line 1 col 12
> align 1 context  result  0x76835150 foo>
>> context 
> full-name "struct foo"
> X() X(constX&) this=(X&) n_parents=0 use_template=0 interface-unknown
> pointer_to_this  reference_to_this
>  chain >
> side-effects
> arg 0 
> used ignored QI file 56961.C line 13 col 19 size  8> unit size 
> align 8 context >
> arg 1  type  QI size  unit size  1>
> align 8 symtab 0 alias set -1 canonical type 0x76835498 fields
>  context  D.1>
> full-name "volatile struct foo"
> X() X(constX&) this=(X&) n_parents=0 use_template=0 interface-unknown
> pointer_to_this >
> side-effects volatile
> arg 0  CSPHandleState>
> addressable used tree_1 decl_5 QI file 56961.C line 12 col 18 size
>  unit size 
> align 8 context >
> arg 1  foo>
> side-effects volatile used nonlocal decl_3 QI file 56961.C line 5 col 16
> size  unit size 
> align 8 offset_align 128
> offset 
> bit offset  context  0x76835348 CSPHandleState> chain >>>
>
>
>

Re: [PATCH, Fortan] fix initialization of flag_errno_math and flag_associative_math

2014-06-05 Thread Janne Blomqvist

On Thu, Jun 5, 2014 at 3:54 PM, VandeVondele Joost wrote:
> I have now verified that both new testcases indeed pass with
>
> gcc version 4.6.0 20100520 (experimental) [trunk revision 159620] (GCC)
>
> that is the revision where Tobias enabled associative math by default. So
> this patch fixes a regression.

Ok for trunk. Thanks!


-- 
Janne Blomqvist

Re: [C++ RFH] PR 56961

2014-06-05 Thread Jason Merrill


On 06/05/2014 09:05 AM, Richard Biener wrote:

> +   *expr_p = build2 (COMPOUND_EXPR, TREE_TYPE (*expr_p),
> + op0, build_fold_addr_expr (op1));


That seems like a fine approach.

Jason

[COMMITTED][AArch64] Add frame_size and hard_fp_offset to machine.frame

2014-06-05 Thread Marcus Shawcroft

Each of the various frame-related functions in the backend contains a 
bespoke set of frame layout calculations.  This patch recognizes that 
all of these functions need the same three pieces of information, from 
which they can trivially compute the various offsets and sizes they 
need: the STACK_BOUNDARY-rounded location of the frame_pointer, the 
location of the hard_frame_pointer, and the frame_size.  We compute 
these once and cache them in the machine.frame structure.
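
As a rough illustration of the arithmetic being cached, here is a standalone sketch in plain C. The toy_* names and the struct are hypothetical, and the 16-byte alignment is an assumption standing in for STACK_BOUNDARY / BITS_PER_UNIT; the two formulas mirror the ones the patch adds to aarch64_layout_frame.

```c
#include <assert.h>

/* Assumed value of STACK_BOUNDARY / BITS_PER_UNIT for this sketch.  */
#define TOY_STACK_ALIGN 16

/* Hypothetical stand-in for the fields the layout code reads and writes.  */
struct toy_frame
{
  long saved_varargs_size;
  long locals_size;         /* models get_frame_size () */
  long saved_regs_size;
  long outgoing_args_size;  /* models crtl->outgoing_args_size */
  long hard_fp_offset;      /* computed below */
  long frame_size;          /* computed below */
};

/* Round X up to a multiple of ALIGN, which must be a power of two.  */
long toy_round_up (long x, long align)
{
  assert (align > 0 && (align & (align - 1)) == 0);
  return (x + align - 1) & ~(align - 1);
}

/* Compute both cached values once, as the patch does.  */
void toy_layout_frame (struct toy_frame *f)
{
  f->hard_fp_offset = toy_round_up (f->saved_varargs_size
				    + f->locals_size
				    + f->saved_regs_size,
				    TOY_STACK_ALIGN);
  f->frame_size = toy_round_up (f->hard_fp_offset + f->outgoing_args_size,
				TOY_STACK_ALIGN);
}

/* The prologue and epilogue then only need a subtraction.  */
long toy_fp_offset (const struct toy_frame *f)
{
  return f->frame_size - f->hard_fp_offset;
}
```

With 24 bytes of locals, 32 bytes of saved registers and 8 bytes of outgoing arguments, this gives a hard_fp_offset of 64, a frame_size of 80 and an fp_offset of 16.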


Committed
/Marcus

2014-06-05  Marcus Shawcroft  

* config/aarch64/aarch64.h (aarch64_frame): Add hard_fp_offset and
frame_size.
* config/aarch64/aarch64.c (aarch64_layout_frame): Initialize
aarch64_frame hard_fp_offset and frame_size.
(aarch64_expand_prologue): Use aarch64_frame hard_fp_offset and
frame_size; remove original_frame_size.
(aarch64_expand_epilogue, aarch64_final_eh_return_addr): Likewise.
(aarch64_initial_elimination_offset): Remove frame_size and
offset.  Use aarch64_frame frame_size.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 6d9fe4d..0f86a5b 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1868,6 +1868,18 @@ aarch64_layout_frame (void)
   offset = AARCH64_ROUND_UP (offset, STACK_BOUNDARY / BITS_PER_UNIT);
 
   cfun->machine->frame.saved_regs_size = offset;
+
+  cfun->machine->frame.hard_fp_offset
+= AARCH64_ROUND_UP (cfun->machine->frame.saved_varargs_size
+			+ get_frame_size ()
+			+ cfun->machine->frame.saved_regs_size,
+			STACK_BOUNDARY / BITS_PER_UNIT);
+
+  cfun->machine->frame.frame_size
+= AARCH64_ROUND_UP (cfun->machine->frame.hard_fp_offset
+			+ crtl->outgoing_args_size,
+			STACK_BOUNDARY / BITS_PER_UNIT);
+
   cfun->machine->frame.laid_out = true;
 }
 
@@ -2118,26 +2130,20 @@ aarch64_expand_prologue (void)
 
  sub sp, sp, 
   */
-  HOST_WIDE_INT original_frame_size;	/* local variables + vararg save */
   HOST_WIDE_INT frame_size, offset;
-  HOST_WIDE_INT fp_offset;		/* FP offset from SP */
+  HOST_WIDE_INT fp_offset;		/* Offset from hard FP to SP.  */
   rtx insn;
 
   aarch64_layout_frame ();
-  original_frame_size = get_frame_size () + cfun->machine->frame.saved_varargs_size;
-  gcc_assert ((!cfun->machine->frame.saved_varargs_size || cfun->stdarg)
-	  && (cfun->stdarg || !cfun->machine->frame.saved_varargs_size));
-  frame_size = (original_frame_size + cfun->machine->frame.saved_regs_size
-		+ crtl->outgoing_args_size);
-  offset = frame_size = AARCH64_ROUND_UP (frame_size,
-	  STACK_BOUNDARY / BITS_PER_UNIT);
 
   if (flag_stack_usage_info)
-current_function_static_stack_size = frame_size;
+current_function_static_stack_size = cfun->machine->frame.frame_size;
+
+  frame_size = cfun->machine->frame.frame_size;
+  offset = cfun->machine->frame.frame_size;
 
-  fp_offset = (offset
-	   - original_frame_size
-	   - cfun->machine->frame.saved_regs_size);
+  fp_offset = cfun->machine->frame.frame_size
+	  - cfun->machine->frame.hard_fp_offset;
 
   /* Store pairs and load pairs have a range only -512 to 504.  */
   if (offset >= 512)
@@ -2148,7 +2154,7 @@ aarch64_expand_prologue (void)
 	 register area.  This will allow the pre-index write-back
 	 store pair instructions to be used for setting up the stack frame
 	 efficiently.  */
-  offset = original_frame_size + cfun->machine->frame.saved_regs_size;
+  offset = cfun->machine->frame.hard_fp_offset;
   if (offset >= 512)
 	offset = cfun->machine->frame.saved_regs_size;
 
@@ -2284,28 +2290,23 @@ aarch64_expand_prologue (void)
 void
 aarch64_expand_epilogue (bool for_sibcall)
 {
-  HOST_WIDE_INT original_frame_size, frame_size, offset;
+  HOST_WIDE_INT frame_size, offset;
   HOST_WIDE_INT fp_offset;
   rtx insn;
   rtx cfa_reg;
 
   aarch64_layout_frame ();
-  original_frame_size = get_frame_size () + cfun->machine->frame.saved_varargs_size;
-  frame_size = (original_frame_size + cfun->machine->frame.saved_regs_size
-		+ crtl->outgoing_args_size);
-  offset = frame_size = AARCH64_ROUND_UP (frame_size,
-	  STACK_BOUNDARY / BITS_PER_UNIT);
 
-  fp_offset = (offset
-	   - original_frame_size
-	   - cfun->machine->frame.saved_regs_size);
+  offset = frame_size = cfun->machine->frame.frame_size;
+  fp_offset = cfun->machine->frame.frame_size
+	  - cfun->machine->frame.hard_fp_offset;
 
   cfa_reg = frame_pointer_needed ? hard_frame_pointer_rtx : stack_pointer_rtx;
 
   /* Store pairs and load pairs have a range only -512 to 504.  */
   if (offset >= 512)
 {
-  offset = original_frame_size + cfun->machine->frame.saved_regs_size;
+  offset = cfun->machine->frame.hard_fp_offset;
   if (offset >= 512)
 	offset = cfun->machine->frame.saved_regs_size;
 
@@ -2487,16 +2488,12 @@ aarch64_expand_epilogue (bool for_sibcall)
 rtx
 aarch64_final_eh_return_addr (void)
 {
-  HOST_WIDE_INT original_frame_size, frame_size, offset, fp_offset;
+  HOST_WIDE_INT fp_offset;
+

Re: [C++ RFH] PR 56961

2014-06-05 Thread Paolo Carlini


Hi,

On 06/05/2014 03:12 PM, Jason Merrill wrote:

> On 06/05/2014 09:05 AM, Richard Biener wrote:
>
>> +   *expr_p = build2 (COMPOUND_EXPR, TREE_TYPE (*expr_p),
>> + op0, build_fold_addr_expr (op1));
>
>
> That seems like a fine approach.

Thanks a lot guys. Therefore I'm going to regtest it and if everything 
goes well commit it with a testcase.


Thanks again,
Paolo.

Re: [C++ RFH] PR 56961

2014-06-05 Thread Richard Biener

On Thu, Jun 5, 2014 at 3:12 PM, Paolo Carlini  wrote:
> Hi,
>
>
> On 06/05/2014 03:12 PM, Jason Merrill wrote:
>>
>> On 06/05/2014 09:05 AM, Richard Biener wrote:
>>>
>>> +   *expr_p = build2 (COMPOUND_EXPR, TREE_TYPE (*expr_p),
>>> + op0, build_fold_addr_expr (op1));
>>
>>
>> That seems like a fine approach.
>
> Thanks a lot guys. Therefore I'm going to regtest it and if everything goes
> well commit it with a testcase.

I think the operands have to be reversed though - the type matches that
of op0.  Sorry ;)
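
For readers less familiar with the tree code: COMPOUND_EXPR behaves like the C comma operator, evaluating its first operand for side effects and yielding the second, which is why the operand that carries the result type (op0 here) has to come last. A small standalone C sketch of that evaluation order follows; the helper names are made up for illustration.

```c
#include <assert.h>

/* Records the order in which the operands were evaluated.  */
int trace[2];
int trace_len;

/* Log ID as "evaluated" and hand back VALUE.  */
int note (int id, int value)
{
  assert (trace_len < 2);
  trace[trace_len++] = id;
  return value;
}

/* The comma operator evaluates the left operand first, discards its
   value, then evaluates the right operand and yields it.  */
int compound_value (void)
{
  trace_len = 0;
  return (note (1, 10), note (2, 20));
}
```

Here compound_value () returns the value of the second operand, after both operands have been evaluated left to right.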

Richard.

> Thanks again,
> Paolo.

Re: patch to fix PR61325

2014-06-05 Thread Christophe Lyon

On 5 June 2014 14:37, James Greenhalgh  wrote:
> On Wed, Jun 04, 2014 at 08:00:51PM +0100, Vladimir Makarov wrote:
>> On 2014-06-03, 6:02 PM, James Greenhalgh wrote:
>> > On Thu, May 29, 2014 at 06:38:22PM +0100, Vladimir Makarov wrote:
>> >>The following patch fixes PR61325.  The details can be found on
>> >>
>> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61325
>> >>
>> >>The patch was bootstrapped and tested on x86/x86-64.
>> >>
>> >>Committed as rev. 211060 to gcc-4.9 branch and as rev.211061 to trunk.
>> >>
>> >> 2014-05-29  Vladimir Makarov  
>> >>
>> >>  PR rtl-optimization/61325
>> >>  * lra-constraints.c (process_address): Rename to
>> >>  process_address_1.
>> >>  (process_address): New function.
>> >>
>> >> 2014-05-29  Vladimir Makarov  
>> >>
>> >>  PR rtl-optimization/61325
>> >>  * gcc.target/aarch64/pr61325.c: New.
>> >
>> > Hi Vlad,
>> >
>> > This patch appears to cause issues where the ARM backend can get stuck in a
>> > seemingly infinite loop.
>> >
>> > Compiling:
>> >
>> > ./gcc.c-torture/compile/unalign-1.c
>> >
>>
>> Sorry for inconvenience.
>>
>> Could you test the following patch
>
> Hi Vlad,
>
> The patch works for me. I've bootstrapped and tested it on
> arm-none-linux-gnueabihf. It has also had a regression run on 
> aarch64-none-elf.
> Christophe, did you get a chance to do any more thorough testing of the patch?
>
> From my perspective, I think this should go in, and be backported to 4.9.
>
> Thanks,
> James
>
>

I did run the tests and saw several regressions. As I wasn't very
confident in those results, I have relaunched my tests; they are still running.

Christophe.

[PATCH][match-and-simplify] Annotate generated source with line directives

2014-06-05 Thread Richard Biener


The following makes genmatch annotate gimple-match.c with (commented
for now) line directives, similar to other generator programs.
This should help associating generated code with parts in match.pd.
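
As a minimal sketch of what such an annotation looks like, the helper below formats either a commented or a live line directive into a buffer. The function name and the commented flag are hypothetical; only the two format strings follow what the patch and the C preprocessor use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write a line directive for FILE:LINE into BUF.  When COMMENTED is
   nonzero the directive is wrapped in a comment, so the compiler and
   debugger keep referring to the generated file.  Returns what
   snprintf returns.  */
int format_line_directive (char *buf, size_t size, int line,
			   const char *file, int commented)
{
  assert (buf != NULL && file != NULL);
  if (commented)
    return snprintf (buf, size, "/* #line %d \"%s\" */\n", line, file);
  return snprintf (buf, size, "#line %d \"%s\"\n", line, file);
}
```

A switch like the commented flag is one possible way to let a developer choose between debugging the generated file and pointing back at the description file.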

Bootstrapped on x86_64-unknown-linux-gnu, applied.

Richard.

2014-06-05  Richard Biener  

* genmatch.c (output_line_directive): New function.
(struct simplify): Add locations for match, ifexpr and result.
(write_nary_simplifiers): Annotate the source with line
directives.
(parse_match_and_simplify): Record locations of match, ifexpr
and result part.
(main): Adjust.

Index: gcc/genmatch.c
===
--- gcc/genmatch.c  (revision 211234)
+++ gcc/genmatch.c  (working copy)
@@ -31,6 +31,52 @@ along with GCC; see the file COPYING3.
 #include "vec.h"
 
 
+/* libccp helpers.  */
+
+static struct line_maps *line_table;
+
+static bool
+#if GCC_VERSION >= 4001
+__attribute__((format (printf, 6, 0)))
+#endif
+error_cb (cpp_reader *, int, int, source_location location,
+ unsigned int, const char *msg, va_list *ap)
+{
+  const line_map *map;
+  linemap_resolve_location (line_table, location, LRK_SPELLING_LOCATION, &map);
+  expanded_location loc = linemap_expand_location (line_table, map, location);
+  fprintf (stderr, "%s:%d:%d error: ", loc.file, loc.line, loc.column);
+  vfprintf (stderr, msg, *ap);
+  fprintf (stderr, "\n");
+  exit (1);
+}
+
+static void
+#if GCC_VERSION >= 4001
+__attribute__((format (printf, 2, 3)))
+#endif
+fatal_at (const cpp_token *tk, const char *msg, ...)
+{
+  va_list ap;
+  va_start (ap, msg);
+  error_cb (NULL, CPP_DL_FATAL, 0, tk->src_loc, 0, msg, &ap);
+  va_end (ap);
+}
+
+static void
+output_line_directive (FILE *f, source_location location)
+{
+  const line_map *map;
+  linemap_resolve_location (line_table, location, LRK_SPELLING_LOCATION, &map);
+  expanded_location loc = linemap_expand_location (line_table, map, location);
+  /* Other gen programs really output line directives here, at least for
+ development it's right now more convenient to have line information
+ from the generated file.  Still keep the directives as comment for now
+ to easily back-point to the meta-description.  */
+  fprintf (f, "/* #line %d \"%s\" */\n", loc.line, loc.file);
+}
+
+
 /* Grammar
 
  capture = '@' number
@@ -247,13 +293,19 @@ e_operation::e_operation (const char *id
 
 struct simplify {
   simplify (const char *name_,
-   struct operand *match_, struct operand *ifexpr_,
-   struct operand *result_)
-  : name (name_), match (match_), ifexpr (ifexpr_), result (result_) {}
+   struct operand *match_, source_location match_location_,
+   struct operand *ifexpr_, source_location ifexpr_location_,
+   struct operand *result_, source_location result_location_)
+  : name (name_), match (match_), match_location (match_location_),
+  ifexpr (ifexpr_), ifexpr_location (ifexpr_location_),
+  result (result_), result_location (result_location_) {}
   const char *name;
   struct operand *match;
+  source_location match_location;
   struct operand *ifexpr;
+  source_location ifexpr_location;
   struct operand *result;
+  source_location result_location;
 };
 
 
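@@ -529,6 +581,7 @@ write_nary_simplifiers (FILE *f, vec
   continue;
   char fail_label[16];
   snprintf (fail_label, 16, "fail%d", label_cnt++);
+  output_line_directive (f, s->match_location);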
   fprintf (f, "  if (code == %s)\n", e->operation->op->id);
   fprintf (f, "{\n");
   fprintf (f, "  tree captures[4] = {};\n");
@@ -540,10 +593,12 @@ write_nary_simplifiers (FILE *f, vec
   }
   if (s->ifexpr)
{
+ output_line_directive (f, s->ifexpr_location);
  fprintf (f, "  if (!(");
  s->ifexpr->gen_gimple_transform (f, fail_label, NULL);
- fprintf (f, ")) goto %s;", fail_label);
+ fprintf (f, ")) goto %s;\n", fail_label);
}
+  output_line_directive (f, s->result_location);
   if (s->result->type == operand::OP_EXPR)
{
  e = static_cast  (s->result);
@@ -659,39 +714,6 @@ write_gimple (FILE *f, vec&
 }
 
 
-/* libccp helpers.  */
-
-static struct line_maps *line_table;
-
-static bool
-#if GCC_VERSION >= 4001
-__attribute__((format (printf, 6, 0)))
-#endif
-error_cb (cpp_reader *, int, int, source_location location,
- unsigned int, const char *msg, va_list *ap)
-{
-  const line_map *map;
-  linemap_resolve_location (line_table, location, LRK_SPELLING_LOCATION, &map);
-  expanded_location loc = linemap_expand_location (line_table, map, location);
-  fprintf (stderr, "%s:%d:%d error: ", loc.file, loc.line, loc.column);
-  vfprintf (stderr, msg, *ap);
-  fprintf (stderr, "\n");
-  exit (1);
-}
-
-static void
-#if GCC_VERSION >= 4001
-__attribute__((format (printf, 2, 3)))
-#endif
-fatal_at (const cpp_token *tk, const char *msg, ...)
-{
-  va_list ap;
-  va_start (ap, msg);
-  error_cb (NULL, CPP_DL_FATAL, 0, tk->src_loc, 0, msg, &ap);
-  va_end (ap);
-}
-
-
 /* Read the next non-whitespace t

Re: [C++ RFH] PR 56961

2014-06-05 Thread Paolo Carlini


Hi,

On 06/05/2014 03:20 PM, Richard Biener wrote:
> I think the operands have to be reversed though - the type matches
> that of op0. Sorry ;)

Something like this, then?

Thanks,
Paolo.

///
Index: cp-gimplify.c
===
--- cp-gimplify.c   (revision 211274)
+++ cp-gimplify.c   (working copy)
@@ -629,19 +629,12 @@ cp_gimplify_expr (tree *expr_p, gimple_seq *pre_p,
 
   Also drop volatile variables on the RHS to avoid infinite
   recursion from gimplify_expr trying to load the value.  */
-   if (!TREE_SIDE_EFFECTS (op1)
-   || (DECL_P (op1) && TREE_THIS_VOLATILE (op1)))
+   if (!TREE_SIDE_EFFECTS (op1))
  *expr_p = op0;
-   else if (TREE_CODE (op1) == MEM_REF
-&& TREE_THIS_VOLATILE (op1))
- {
-   /* Similarly for volatile MEM_REFs on the RHS.  */
-   if (!TREE_SIDE_EFFECTS (TREE_OPERAND (op1, 0)))
- *expr_p = op0;
-   else
- *expr_p = build2 (COMPOUND_EXPR, TREE_TYPE (*expr_p),
-   TREE_OPERAND (op1, 0), op0);
- }
+   else if (TREE_THIS_VOLATILE (op1)
+&& (REFERENCE_CLASS_P (op1) || DECL_P (op1)))
+*expr_p = build2 (COMPOUND_EXPR, TREE_TYPE (*expr_p),
+  build_fold_addr_expr (op1), op0);
else
  *expr_p = build2 (COMPOUND_EXPR, TREE_TYPE (*expr_p),
op0, op1);

Re: [C++ RFH] PR 56961

2014-06-05 Thread Richard Biener

On Thu, Jun 5, 2014 at 3:26 PM, Paolo Carlini  wrote:
> Hi,
>
>
> On 06/05/2014 03:20 PM, Richard Biener wrote:
>>
>> I think the operands have to be reversed though - the type matches that of
>> op0. Sorry ;)
>
> Something like this, then?

Yes.  I suppose it's ok to re-order side-effects lhs, rhs to rhs, lhs?
Otherwise you'd need to do sth like

  op0 = save_expr (op0);
  *expr_p = build2 (COMPOUND_EXPR, TREE_TYPE (*expr_p),
op0,
build2 (COMPOUND_EXPR, TREE_TYPE (*expr_p),
   build_fold_addr_expr (op1), op0));

(which may or may not work or be a good idea with zero-size aggregate op0)

Richard.

> Thanks,
> Paolo.
>
> ///

Re: [PATCH][match-and-simplify] Annotate generated source with line directives

2014-06-05 Thread Richard Earnshaw

On 05/06/14 14:25, Richard Biener wrote:
> 
> The following makes genmatch annotate gimple-match.c with (commented
> for now) line directives, similar to other generator programs.
> This should help associating generated code with parts in match.pd.
> 

I've often found these annotations more of a hindrance than a help
(mostly because when they return to pure boilerplate code there's no
line directive to switch back to the main C file and we end with the
debugger pointing at a random bit of the gen source that's irrelevant).

It would be nice if there were some way to turn this annotation off
(perhaps with a configury option)... or maybe to write the source file
lines as comments in the auto-generated code.

Perhaps there already is one, but if there is, I don't know about it... :-(

R.

> Bootstrapped on x86_64-unknown-linux-gnu, applied.
> 
> Richard.
> 
> 2014-06-05  Richard Biener  
> 
>   * genmatch.c (output_line_directive): New function.
>   (struct simplify): Add locations for match, ifexpr and result.
>   (write_nary_simplifiers): Annotate the source with line
>   directives.
>   (parse_match_and_simplify): Record locations of match, ifexpr
>   and result part.
>   (main): Adjust.
> 
> Index: gcc/genmatch.c
> ===
> --- gcc/genmatch.c(revision 211234)
> +++ gcc/genmatch.c(working copy)
> @@ -31,6 +31,52 @@ along with GCC; see the file COPYING3.
>  #include "vec.h"
>  
>  
> +/* libccp helpers.  */
> +
> +static struct line_maps *line_table;
> +
> +static bool
> +#if GCC_VERSION >= 4001
> +__attribute__((format (printf, 6, 0)))
> +#endif
> +error_cb (cpp_reader *, int, int, source_location location,
> +   unsigned int, const char *msg, va_list *ap)
> +{
> +  const line_map *map;
> +  linemap_resolve_location (line_table, location, LRK_SPELLING_LOCATION, 
> &map);
> +  expanded_location loc = linemap_expand_location (line_table, map, 
> location);
> +  fprintf (stderr, "%s:%d:%d error: ", loc.file, loc.line, loc.column);
> +  vfprintf (stderr, msg, *ap);
> +  fprintf (stderr, "\n");
> +  exit (1);
> +}
> +
> +static void
> +#if GCC_VERSION >= 4001
> +__attribute__((format (printf, 2, 3)))
> +#endif
> +fatal_at (const cpp_token *tk, const char *msg, ...)
> +{
> +  va_list ap;
> +  va_start (ap, msg);
> +  error_cb (NULL, CPP_DL_FATAL, 0, tk->src_loc, 0, msg, &ap);
> +  va_end (ap);
> +}
> +
> +static void
> +output_line_directive (FILE *f, source_location location)
> +{
> +  const line_map *map;
> +  linemap_resolve_location (line_table, location, LRK_SPELLING_LOCATION, 
> &map);
> +  expanded_location loc = linemap_expand_location (line_table, map, 
> location);
> +  /* Other gen programs really output line directives here, at least for
> + development it's right now more convenient to have line information
> + from the generated file.  Still keep the directives as comment for now
> + to easily back-point to the meta-description.  */
> +  fprintf (f, "/* #line %d \"%s\" */\n", loc.line, loc.file);
> +}
> +
> +
>  /* Grammar
>  
>   capture = '@' number
> @@ -247,13 +293,19 @@ e_operation::e_operation (const char *id
>  
>  struct simplify {
>simplify (const char *name_,
> - struct operand *match_, struct operand *ifexpr_,
> - struct operand *result_)
> -  : name (name_), match (match_), ifexpr (ifexpr_), result (result_) {}
> + struct operand *match_, source_location match_location_,
> + struct operand *ifexpr_, source_location ifexpr_location_,
> + struct operand *result_, source_location result_location_)
> +  : name (name_), match (match_), match_location (match_location_),
> +  ifexpr (ifexpr_), ifexpr_location (ifexpr_location_),
> +  result (result_), result_location (result_location_) {}
>const char *name;
>struct operand *match;
> +  source_location match_location;
>struct operand *ifexpr;
> +  source_location ifexpr_location;
>struct operand *result;
> +  source_location result_location;
>  };
>  
>  
> @@ -529,6 +581,7 @@ write_nary_simplifiers (FILE *f, vec   continue;
>char fail_label[16];
>snprintf (fail_label, 16, "fail%d", label_cnt++);
> +  output_line_directive (f, s->match_location);
>fprintf (f, "  if (code == %s)\n", e->operation->op->id);
>fprintf (f, "{\n");
>fprintf (f, "  tree captures[4] = {};\n");
> @@ -540,10 +593,12 @@ write_nary_simplifiers (FILE *f, vec   }
>if (s->ifexpr)
>   {
> +   output_line_directive (f, s->ifexpr_location);
> fprintf (f, "  if (!(");
> s->ifexpr->gen_gimple_transform (f, fail_label, NULL);
> -   fprintf (f, ")) goto %s;", fail_label);
> +   fprintf (f, ")) goto %s;\n", fail_label);
>   }
> +  output_line_directive (f, s->result_location);
>if (s->result->type == operand::OP_EXPR)
>   {
> e = static_cast  (s->result);
> @@ -659,39

Re: [PATCH][match-and-simplify] Annotate generated source with line directives

2014-06-05 Thread Richard Biener

On Thu, 5 Jun 2014, Richard Earnshaw wrote:

> On 05/06/14 14:25, Richard Biener wrote:
> > 
> > The following makes genmatch annotate gimple-match.c with (commented
> > for now) line directives, similar to other generator programs.
> > This should help associating generated code with parts in match.pd.
> > 
> 
> I've often found these annotations more of a hindrance than a help
> (mostly because when they return to pure boilerplate code there's no
> line directive to switch back to the main C file and we end with the
> debugger pointing at a random bit of the gen source that's irrelevant).
> 
> It would be nice if there were some way to turn this annotation off
> (perhaps with a configury option)... or maybe to write the source file
> lines as comments in the auto-generated code.

Yeah, that's what I do right now (because for debugging the generated
code it's not really useful but a hindrance as you say).

> Perhaps there already is one, but if there is, I don't know about it... :-(

Not that I know.

Richard.

> R.
> 
> > Bootstrapped on x86_64-unknown-linux-gnu, applied.
> > 
> > Richard.
> > 
> > 2014-06-05  Richard Biener  
> > 
> > * genmatch.c (output_line_directive): New function.
> > (struct simplify): Add locations for match, ifexpr and result.
> > (write_nary_simplifiers): Annotate the source with line
> > directives.
> > (parse_match_and_simplify): Record locations of match, ifexpr
> > and result part.
> > (main): Adjust.
> > 
> > Index: gcc/genmatch.c
> > ===
> > --- gcc/genmatch.c  (revision 211234)
> > +++ gcc/genmatch.c  (working copy)
> > @@ -31,6 +31,52 @@ along with GCC; see the file COPYING3.
> >  #include "vec.h"
> >  
> >  
> > +/* libccp helpers.  */
> > +
> > +static struct line_maps *line_table;
> > +
> > +static bool
> > +#if GCC_VERSION >= 4001
> > +__attribute__((format (printf, 6, 0)))
> > +#endif
> > +error_cb (cpp_reader *, int, int, source_location location,
> > + unsigned int, const char *msg, va_list *ap)
> > +{
> > +  const line_map *map;
> > +  linemap_resolve_location (line_table, location, LRK_SPELLING_LOCATION, 
> > &map);
> > +  expanded_location loc = linemap_expand_location (line_table, map, 
> > location);
> > +  fprintf (stderr, "%s:%d:%d error: ", loc.file, loc.line, loc.column);
> > +  vfprintf (stderr, msg, *ap);
> > +  fprintf (stderr, "\n");
> > +  exit (1);
> > +}
> > +
> > +static void
> > +#if GCC_VERSION >= 4001
> > +__attribute__((format (printf, 2, 3)))
> > +#endif
> > +fatal_at (const cpp_token *tk, const char *msg, ...)
> > +{
> > +  va_list ap;
> > +  va_start (ap, msg);
> > +  error_cb (NULL, CPP_DL_FATAL, 0, tk->src_loc, 0, msg, &ap);
> > +  va_end (ap);
> > +}
> > +
> > +static void
> > +output_line_directive (FILE *f, source_location location)
> > +{
> > +  const line_map *map;
> > +  linemap_resolve_location (line_table, location, LRK_SPELLING_LOCATION, 
> > &map);
> > +  expanded_location loc = linemap_expand_location (line_table, map, 
> > location);
> > +  /* Other gen programs really output line directives here, at least for
> > + development it's right now more convenient to have line information
> > + from the generated file.  Still keep the directives as comment for now
> > + to easily back-point to the meta-description.  */
> > +  fprintf (f, "/* #line %d \"%s\" */\n", loc.line, loc.file);
> > +}
> > +
> > +
> >  /* Grammar
> >  
> >   capture = '@' number
> > @@ -247,13 +293,19 @@ e_operation::e_operation (const char *id
> >  
> >  struct simplify {
> >simplify (const char *name_,
> > -   struct operand *match_, struct operand *ifexpr_,
> > -   struct operand *result_)
> > -  : name (name_), match (match_), ifexpr (ifexpr_), result (result_) {}
> > +   struct operand *match_, source_location match_location_,
> > +   struct operand *ifexpr_, source_location ifexpr_location_,
> > +   struct operand *result_, source_location result_location_)
> > +  : name (name_), match (match_), match_location (match_location_),
> > +  ifexpr (ifexpr_), ifexpr_location (ifexpr_location_),
> > +  result (result_), result_location (result_location_) {}
> >const char *name;
> >struct operand *match;
> > +  source_location match_location;
> >struct operand *ifexpr;
> > +  source_location ifexpr_location;
> >struct operand *result;
> > +  source_location result_location;
> >  };
> >  
> >  
> > @@ -529,6 +581,7 @@ write_nary_simplifiers (FILE *f, vec > continue;
> >char fail_label[16];
> >snprintf (fail_label, 16, "fail%d", label_cnt++);
> > +  output_line_directive (f, s->match_location);
> >fprintf (f, "  if (code == %s)\n", e->operation->op->id);
> >fprintf (f, "{\n");
> >fprintf (f, "  tree captures[4] = {};\n");
> > @@ -540,10 +593,12 @@ write_nary_simplifiers (FILE *f, vec > }
> >if (s->ifexpr)
> > {
> > + output

[PATCH] Move get_addr_base_and_unit_offset_1() out of tree-dfa.h

2014-06-05 Thread Andrew MacLeod

I'd like to move this rather large inline function out of the header 
file and into the .c file.  The function has the following comment:


/* ??? This is a static inline here to avoid the overhead of the indirect calls
   to VALUEIZE.  But is this overhead really that significant?  And should we
   perhaps just rely on WHOPR to specialize the function?  */


I highly doubt we'd be able to measure any compile time difference by 
not inlining this, however due diligence:


get_addr_base_and_unit_offset_1 () is only called from 2 files.
 - tree-dfa.c : Being moved here, so not an issue.

 - gimple-fold.c : Only called from gimple_fold_stmt_to_constant_1 ().  
This function is called internally only from 
gimple_fold_stmt_to_constant ().  Both functions also take a passed in 
VALUEIZE function pointer and pass it on.


*All* calls to the gimple_fold_stmt_to_constant* functions occur 
*outside* of gimple-fold.c, so there would never be any inlined versions 
that remove the indirect call to VALUEIZE anyway.
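
The call structure described above can be sketched with a toy example. All names are hypothetical and the arithmetic is a placeholder; the point is only that a function which receives the callback as a parameter from its own callers cannot turn the call into a direct one, whether or not the helper is inlined into it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type standing in for tree (*valueize) (tree).  */
typedef int (*toy_valueize_fn) (int);

/* Stand-in for the _1 helper: optionally valueize the index, then
   compute a byte offset.  */
int toy_offset_1 (int index, int elem_size, toy_valueize_fn valueize)
{
  assert (elem_size > 0);
  if (valueize)
    index = valueize (index);
  return index * elem_size;
}

/* Wrapper with a known NULL callback: here inlining could drop the
   indirect call entirely.  */
int toy_offset (int index, int elem_size)
{
  return toy_offset_1 (index, elem_size, NULL);
}

/* Stand-in for a folding routine: the callback comes from its caller,
   so the call through VALUEIZE stays indirect.  */
int toy_fold (int index, int elem_size, toy_valueize_fn valueize)
{
  return toy_offset_1 (index, elem_size, valueize);
}

/* Example callback: pretend SSA name 7 is known to be the constant 3.  */
int toy_known_constant (int name)
{
  return name == 7 ? 3 : name;
}
```

For instance, toy_fold (7, 4, toy_known_constant) valueizes the index to 3 and yields 12, while toy_offset (2, 4) skips the callback and yields 8.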


Bootstrapped on x86_64-unknown-linux-gnu, regressions running.
Assuming no new failures, OK for trunk?

Andrew


	* tree-dfa.h (get_addr_base_and_unit_offset_1): Move from here.
	* tree-dfa.c (get_addr_base_and_unit_offset_1): To here.

Index: tree-dfa.c
===
*** tree-dfa.c	(revision 211144)
--- tree-dfa.c	(working copy)
*************** get_ref_base_and_extent (tree exp, HOST_
*** 664,669 ****
--- 664,808 ----
  /* Returns the base object and a constant BITS_PER_UNIT offset in *POFFSET that
 denotes the starting address of the memory access EXP.
 Returns NULL_TREE if the offset is not constant or any component
+is not BITS_PER_UNIT-aligned.
+VALUEIZE if non-NULL is used to valueize SSA names.  It should return
+its argument or a constant if the argument is known to be constant.  */
+ 
+ tree
+ get_addr_base_and_unit_offset_1 (tree exp, HOST_WIDE_INT *poffset,
+  tree (*valueize) (tree))
+ {
+   HOST_WIDE_INT byte_offset = 0;
+ 
+   /* Compute cumulative byte-offset for nested component-refs and array-refs,
+  and find the ultimate containing object.  */
+   while (1)
+ {
+   switch (TREE_CODE (exp))
+ 	{
+ 	case BIT_FIELD_REF:
+ 	  {
+ 	HOST_WIDE_INT this_off = TREE_INT_CST_LOW (TREE_OPERAND (exp, 2));
+ 	if (this_off % BITS_PER_UNIT)
+ 	  return NULL_TREE;
+ 	byte_offset += this_off / BITS_PER_UNIT;
+ 	  }
+ 	  break;
+ 
+ 	case COMPONENT_REF:
+ 	  {
+ 	tree field = TREE_OPERAND (exp, 1);
+ 	tree this_offset = component_ref_field_offset (exp);
+ 	HOST_WIDE_INT hthis_offset;
+ 
+ 	if (!this_offset
+ 		|| TREE_CODE (this_offset) != INTEGER_CST
+ 		|| (TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (field))
+ 		% BITS_PER_UNIT))
+ 	  return NULL_TREE;
+ 
+ 	hthis_offset = TREE_INT_CST_LOW (this_offset);
+ 	hthis_offset += (TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (field))
+ 			 / BITS_PER_UNIT);
+ 	byte_offset += hthis_offset;
+ 	  }
+ 	  break;
+ 
+ 	case ARRAY_REF:
+ 	case ARRAY_RANGE_REF:
+ 	  {
+ 	tree index = TREE_OPERAND (exp, 1);
+ 	tree low_bound, unit_size;
+ 
+ 	if (valueize
+ 		&& TREE_CODE (index) == SSA_NAME)
+ 	  index = (*valueize) (index);
+ 
+ 	/* If the resulting bit-offset is constant, track it.  */
+ 	if (TREE_CODE (index) == INTEGER_CST
+ 		&& (low_bound = array_ref_low_bound (exp),
+ 		TREE_CODE (low_bound) == INTEGER_CST)
+ 		&& (unit_size = array_ref_element_size (exp),
+ 		TREE_CODE (unit_size) == INTEGER_CST))
+ 	  {
+ 		offset_int woffset
+ 		  = wi::sext (wi::to_offset (index) - wi::to_offset (low_bound),
+ 			  TYPE_PRECISION (TREE_TYPE (index)));
+ 		woffset *= wi::to_offset (unit_size);
+ 		byte_offset += woffset.to_shwi ();
+ 	  }
+ 	else
+ 	  return NULL_TREE;
+ 	  }
+ 	  break;
+ 
+ 	case REALPART_EXPR:
+ 	  break;
+ 
+ 	case IMAGPART_EXPR:
+ 	  byte_offset += TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (exp)));
+ 	  break;
+ 
+ 	case VIEW_CONVERT_EXPR:
+ 	  break;
+ 
+ 	case MEM_REF:
+ 	  {
+ 	tree base = TREE_OPERAND (exp, 0);
+ 	if (valueize
+ 		&& TREE_CODE (base) == SSA_NAME)
+ 	  base = (*valueize) (base);
+ 
+ 	/* Hand back the decl for MEM[&decl, off].  */
+ 	if (TREE_CODE (base) == ADDR_EXPR)
+ 	  {
+ 		if (!integer_zerop (TREE_OPERAND (exp, 1)))
+ 		  {
+ 		offset_int off = mem_ref_offset (exp);
+ 		byte_offset += off.to_short_addr ();
+ 		  }
+ 		exp = TREE_OPERAND (base, 0);
+ 	  }
+ 	goto done;
+ 	  }
+ 
+ 	case TARGET_MEM_REF:
+ 	  {
+ 	tree base = TREE_OPERAND (exp, 0);
+ 	if (valueize
+ 		&& TREE_CODE (base) == SSA_NAME)
+ 	  base = (*valueize) (base);
+ 
+ 	/* Hand back the decl for MEM[&decl, off].  */
+ 	if (TREE_CODE (base) == ADDR_EXPR)
+ 	  {
+ 		if (TMR_INDEX (exp) || TMR_INDEX2 (exp))
+ 		  return NULL_TREE;
+ 		if (!integer_zerop (TMR_OFFSET (exp)))
+ 		  {
+ 		offset_int off = mem_

Re: [PATCH] Move get_addr_base_and_unit_offset_1() out of tree-dfa.h

2014-06-05 Thread Richard Biener

On Thu, Jun 5, 2014 at 3:46 PM, Andrew MacLeod  wrote:
> I'd like to move this rather large inline function out of the header file
> and into the .c file.  The function has the following comment:
>
> /* ??? This is a static inline here to avoid the overhead of the indirect
>    calls to VALUEIZE.  But is this overhead really that significant?  And
>    should we perhaps just rely on WHOPR to specialize the function?  */
>
>
> I highly doubt we'd be able to measure any compile time difference by not
> inlining this, however due diligence:
>
> get_addr_base_and_unit_offset_1 () is only called from 2 files.
>  - tree-dfa.c : Being moved here, so not an issue.
>
>  - gimple-fold.c : Only called from gimple_fold_stmt_to_constant_1 ().  This
> function is called internally only from gimple_fold_stmt_to_constant ().
> Both functions also take a passed in VALUEIZE function pointer and pass it
> on.
>
> *All* calls to the gimple_fold_stmt_to_constant* functions occur *outside*
> of gimple-fold.c, so there would never be any inlined versions that remove
> the indirect call to VALUEIZE anyway.
>
> Bootstrapped on x86_64-unknown-linux-gnu, regressions running.
> Assuming no new failures, OK for trunk?

Ok.

Thanks,
Richard.

> Andrew
>
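
The inlining argument in this thread can be sketched as follows. This is a minimal illustration, not GCC's actual code: `valueize_fn`, `resolve`, `fold_with` and `add_one` are hypothetical names. It shows a helper that calls an optional hook indirectly, and a caller that only forwards a hook it received itself, so inlining the helper into that caller still leaves an indirect call.

```cpp
// Hypothetical stand-in for a VALUEIZE hook: maps a value to a "known" value.
using valueize_fn = int (*)(int);

// Mirrors the shape of the helper under discussion: the hook is optional
// and, when present, is invoked through the function pointer.
int resolve(int index, valueize_fn valueize)
{
  if (valueize)
    index = (*valueize)(index);
  return index;
}

// A caller that merely forwards a hook passed in from elsewhere.  Even if
// resolve() were inlined here, `valueize` is still an unknown pointer in
// this function, so the indirect call cannot be removed.
int fold_with(int index, valueize_fn valueize)
{
  return resolve(index, valueize) * 4;
}

// Example hook, used only to exercise the functions above.
int add_one(int v) { return v + 1; }
```

Only a call site that names a concrete hook, such as `fold_with(5, add_one)`, gives the compiler a chance to specialize, which is the situation the message says never arises inside gimple-fold.c.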

Re: libgo patch committed: Merge from revision 18783 of master

2014-06-05 Thread Ian Lance Taylor

On Thu, Jun 5, 2014 at 3:24 AM, Matthias Klose  wrote:
> Am 05.06.2014 03:28, schrieb Ian Lance Taylor:
>> I have committed a patch to libgo to merge from revision
>> 18783:00cce3a34d7e of the master library.  This revision was committed
>> January 7.  I picked this revision to merge to because the next revision
>> deleted a file that is explicitly merged in by the libgo/merge.sh
>> script.
>>
>> Among other things, this patch changes type descriptors to add a new
>> pointer to a zero value.  In gccgo this is implemented as a common
>> variable, and that requires some changes to the compiler and a small
>> change to go-gcc.cc.
>>
>> As usual the patch is too large to include in this e-mail message.  I've
>> appended the changes to parts of libgo that are more gccgo-specific.
>>
>> Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
>> Committed to mainline.
>
> Is it time to bump the soname on trunk?

Yes, I'll do that when I've merged gccgo all the way up to the Go 1.3
release.

Ian

[GOMP4, COMMITTED] OpenACC present data clause.

2014-06-05 Thread Thomas Schwinge

From: tschwinge 

gcc/
* gimplify.c (gimplify_scan_omp_clauses) :
Don't block OMP_CLAUSE_MAP_FORCE_PRESENT.
gcc/testsuite/
* c-c++-common/goacc/data-clause-duplicate-1.c: Extend.
* c-c++-common/goacc/present-1.c: New file.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@211277 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |  5 +
 gcc/gimplify.c |  1 -
 gcc/testsuite/ChangeLog.gomp   |  5 +
 gcc/testsuite/c-c++-common/goacc/data-clause-duplicate-1.c |  2 ++
 gcc/testsuite/c-c++-common/goacc/present-1.c   | 11 +++
 5 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/present-1.c

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 011fe77..7371aa5 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2014-06-05  Thomas Schwinge  
+
+   * gimplify.c (gimplify_scan_omp_clauses) :
+   Don't block OMP_CLAUSE_MAP_FORCE_PRESENT.
+
 2014-06-04  Thomas Schwinge  
 
* cgraphunit.c (ipa_passes, compile): Handle flag_openacc next to
diff --git gcc/gimplify.c gcc/gimplify.c
index e98e6e5..6eaf6fd 100644
--- gcc/gimplify.c
+++ gcc/gimplify.c
@@ -6014,7 +6014,6 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
*pre_p,
case OMP_CLAUSE_MAP:
  switch (OMP_CLAUSE_MAP_KIND (c))
{
-   case OMP_CLAUSE_MAP_FORCE_PRESENT:
case OMP_CLAUSE_MAP_FORCE_DEALLOC:
case OMP_CLAUSE_MAP_FORCE_DEVICEPTR:
  input_location = OMP_CLAUSE_LOCATION (c);
diff --git gcc/testsuite/ChangeLog.gomp gcc/testsuite/ChangeLog.gomp
index 78882c0..4e0ee28 100644
--- gcc/testsuite/ChangeLog.gomp
+++ gcc/testsuite/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2014-06-05  Thomas Schwinge  
+
+   * c-c++-common/goacc/data-clause-duplicate-1.c: Extend.
+   * c-c++-common/goacc/present-1.c: New file.
+
 2014-03-20  Thomas Schwinge  
 
* c-c++-common/goacc-gomp/nesting-1.c: New file.
diff --git gcc/testsuite/c-c++-common/goacc/data-clause-duplicate-1.c 
gcc/testsuite/c-c++-common/goacc/data-clause-duplicate-1.c
index 4cb3cc2..5c5ab02 100644
--- gcc/testsuite/c-c++-common/goacc/data-clause-duplicate-1.c
+++ gcc/testsuite/c-c++-common/goacc/data-clause-duplicate-1.c
@@ -10,4 +10,6 @@ fun (void)
   /* { dg-error "'fp' appears more than once in map clauses" "" { target *-*-* 
} 9 } */
   /* { dg-message "sorry, unimplemented: data clause not yet implemented" "" { 
target *-*-* } 9 } */
   ;
+#pragma acc data create(fp) present(fp) /* { dg-error "'fp' appears more than 
once in map clauses" } */
+  ;
 }
diff --git gcc/testsuite/c-c++-common/goacc/present-1.c 
gcc/testsuite/c-c++-common/goacc/present-1.c
new file mode 100644
index 000..03ee592
--- /dev/null
+++ gcc/testsuite/c-c++-common/goacc/present-1.c
@@ -0,0 +1,11 @@
+/* { dg-additional-options "-fdump-tree-original" } */
+
+void
+f (char *cp)
+{
+#pragma acc parallel present(cp[7:9])
+  ;
+}
+
+/* { dg-final { scan-tree-dump-times "#pragma acc parallel 
map\\(force_present:\\*\\(cp \\+ 7\\) \\\[len: 9]\\) map\\(alloc:cp \\\[pointer 
assign, bias: 7]\\)" 1 "original" } } */
+/* { dg-final { cleanup-tree-dump "original" } } */
-- 
1.9.1

[GOMP4, COMMITTED] OpenACC deviceptr clause.

2014-06-05 Thread Thomas Schwinge

From: tschwinge 

gcc/c/
* c-typeck.c (handle_omp_array_sections, c_finish_omp_clauses):
Handle OMP_CLAUSE_MAP_FORCE_DEVICEPTR.
gcc/
* gimplify.c (gimplify_scan_omp_clauses)
(gimplify_adjust_omp_clauses): Handle
OMP_CLAUSE_MAP_FORCE_DEVICEPTR.
* omp-low.c (scan_sharing_clauses, lower_oacc_offload)
(lower_omp_target): Likewise.
* tree-core.h (enum omp_clause_map_kind)
: Update comment.
gcc/testsuite/
* c-c++-common/goacc/data-clause-duplicate-1.c: The OpenACC
deviceptr clause is now supported.
* c-c++-common/goacc/deviceptr-1.c: Extend.
* c-c++-common/goacc/deviceptr-2.c: New file.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@211278 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |  8 +++
 gcc/c/ChangeLog.gomp   |  5 ++
 gcc/c/c-typeck.c   |  5 +-
 gcc/gimplify.c |  7 ++-
 gcc/omp-low.c  | 60 +++---
 gcc/testsuite/ChangeLog.gomp   |  5 ++
 .../c-c++-common/goacc/data-clause-duplicate-1.c   |  4 +-
 gcc/testsuite/c-c++-common/goacc/deviceptr-1.c | 22 +++-
 gcc/testsuite/c-c++-common/goacc/deviceptr-2.c | 23 +
 gcc/tree-core.h|  3 +-
 10 files changed, 127 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/deviceptr-2.c

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 7371aa5..88f09b3 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,13 @@
 2014-06-05  Thomas Schwinge  
 
+   * gimplify.c (gimplify_scan_omp_clauses)
+   (gimplify_adjust_omp_clauses): Handle
+   OMP_CLAUSE_MAP_FORCE_DEVICEPTR.
+   * omp-low.c (scan_sharing_clauses, lower_oacc_offload)
+   (lower_omp_target): Likewise.
+   * tree-core.h (enum omp_clause_map_kind)
+   : Update comment.
+
* gimplify.c (gimplify_scan_omp_clauses) :
Don't block OMP_CLAUSE_MAP_FORCE_PRESENT.
 
diff --git gcc/c/ChangeLog.gomp gcc/c/ChangeLog.gomp
index 91978db..1e80031 100644
--- gcc/c/ChangeLog.gomp
+++ gcc/c/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2014-06-05  Thomas Schwinge  
+
+   * c-typeck.c (handle_omp_array_sections, c_finish_omp_clauses):
+   Handle OMP_CLAUSE_MAP_FORCE_DEVICEPTR.
+
 2014-03-20  Thomas Schwinge  
 
* c-parser.c: Update comments.
diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
index c4ba531..839cdf7 100644
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -11747,6 +11747,7 @@ handle_omp_array_sections (tree c)
   OMP_CLAUSE_SIZE (c) = size;
   if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP)
return false;
+  gcc_assert (OMP_CLAUSE_MAP_KIND (c) != OMP_CLAUSE_MAP_FORCE_DEVICEPTR);
   tree c2 = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP);
   OMP_CLAUSE_MAP_KIND (c2) = OMP_CLAUSE_MAP_POINTER;
   if (!c_mark_addressable (t))
@@ -12168,7 +12169,9 @@ c_finish_omp_clauses (tree clauses)
  else if (!c_mark_addressable (t))
remove = true;
  else if (!(OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
-&& OMP_CLAUSE_MAP_KIND (c) == OMP_CLAUSE_MAP_POINTER)
+&& (OMP_CLAUSE_MAP_KIND (c) == OMP_CLAUSE_MAP_POINTER
+|| (OMP_CLAUSE_MAP_KIND (c)
+== OMP_CLAUSE_MAP_FORCE_DEVICEPTR)))
   && !lang_hooks.types.omp_mappable_type (TREE_TYPE (t)))
{
  error_at (OMP_CLAUSE_LOCATION (c),
diff --git gcc/gimplify.c gcc/gimplify.c
index 6eaf6fd..a1b6be6 100644
--- gcc/gimplify.c
+++ gcc/gimplify.c
@@ -6015,7 +6015,6 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
*pre_p,
  switch (OMP_CLAUSE_MAP_KIND (c))
{
case OMP_CLAUSE_MAP_FORCE_DEALLOC:
-   case OMP_CLAUSE_MAP_FORCE_DEVICEPTR:
  input_location = OMP_CLAUSE_LOCATION (c);
  /* TODO.  */
  sorry ("data clause not yet implemented");
@@ -6533,6 +6532,12 @@ gimplify_adjust_omp_clauses (tree *list_p)
   && TREE_CODE (DECL_SIZE (decl)) != INTEGER_CST
   && OMP_CLAUSE_MAP_KIND (c) != OMP_CLAUSE_MAP_POINTER)
{
+ /* For OMP_CLAUSE_MAP_FORCE_DEVICEPTR, we'll never enter here,
+because for these, TREE_CODE (DECL_SIZE (decl)) will always be
+INTEGER_CST.  */
+ gcc_assert (OMP_CLAUSE_MAP_KIND (c)
+ != OMP_CLAUSE_MAP_FORCE_DEVICEPTR);
+
  tree decl2 = DECL_VALUE_EXPR (decl);
  gcc_assert (TREE_CODE (decl2) == INDIRECT_REF);
  decl2 = TREE_OPERAND (decl2, 0);
diff --git gcc/omp-low.c gcc/omp-low.c
index 3e282c0..39f0598 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -1708,6 +1708,18 @@ scan_sharing_cla

[AArch64] Make sure start callee-save offset for D registers aligned

2014-06-05 Thread Jiong Wang


For AArch64, an odd number of core registers may need to be saved.

This small patch ensures we remain 16-byte aligned for subsequent STP writes of
D registers.

OK for trunk?

thanks.

gcc/
  * config/aarch64/aarch64.c (aarch64_layout_frame): Make sure start offset
for vector registers in callee-saved area 16-byte aligned.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index aada704..c4abf1e 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1793,6 +1793,10 @@ aarch64_layout_frame (void)
 	offset += UNITS_PER_WORD;
   }
 
+  /* Align offset to 16-bytes.
+ There may have been an odd num core registers. Ensure we remain
+ 16 byte aligned for subsequent STP writes of D registers.  */
+  offset = AARCH64_ROUND_UP (offset, 2 * UNITS_PER_WORD);
   for (regno = V0_REGNUM; regno <= V31_REGNUM; regno++)
 if (cfun->machine->frame.reg_offset[regno] == SLOT_REQUIRED)
   {
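
The rounding step in the hunk above can be illustrated with a small sketch. The names `round_up`, `units_per_word` and `vector_area_start` are hypothetical, the 8-byte word size is an assumption, and the rounding formula is the usual power-of-two idiom rather than a copy of the AARCH64_ROUND_UP definition.

```cpp
#include <cstdint>

// Assumed size of one core register save slot, in bytes.
constexpr std::int64_t units_per_word = 8;

// Round `value` up to a multiple of `alignment`.
// Assumes `alignment` is a power of two.
constexpr std::int64_t round_up(std::int64_t value, std::int64_t alignment)
{
  return (value + (alignment - 1)) & ~(alignment - 1);
}

// Offset at which the D-register save area would start after
// `n_core_regs` core register slots have been laid out.
constexpr std::int64_t vector_area_start(int n_core_regs)
{
  return round_up(n_core_regs * units_per_word, 2 * units_per_word);
}
```

With three core registers the raw offset is 24 bytes, and the sketch moves the vector area to 32; with four it is already 32 and stays there.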

[Google/4_8] Fix testsuite/gcc.dg/ipa/ipa-sra-6.c

2014-06-05 Thread Teresa Johnson

I just committed the following patch to google/4_8 as 211279. This
fixes a test failure with my backport of tree-sra fix r211180 from
trunk.

It turns out that the bug I fixed affected this test case, but because
the dumping format is slightly different between google/4_8 and both
google/4_9 and trunk, it didn't show up as a test failure for
either google/4_9 or trunk.

Essentially, after my fix, both in google/4_8 and google/4_9 (and
presumably trunk), I am getting more output in the eipa_sra dump
output. Looks like we in fact were previously not properly updating a
recursive call due to this same issue that I was fixing, although it
didn't manifest as an ICE like in the case I fixed. With the fix, we
now properly update a recursive call being optimized by SRA. The test
case is expecting to see exactly one occurrence of "foo " in the dump
output. In google/4_8, one of the new dump lines matches because it
looks like:

Adjusting call (0 -> 2) foo -> foo.isra.0

In google/4_9 and trunk, the additional dump lines don't match because
the node's order number is being printed after the name:

Adjusting call foo/0 -> foo.isra.0/2
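
The difference between the two dump formats can be checked with a small sketch. It is a simplification: `count_occurrences` is a hypothetical helper that does a literal substring count, whereas the testsuite directive matches a pattern against the dump file.

```cpp
#include <cstddef>
#include <string>

// Count non-overlapping literal occurrences of `needle` in `text`.
std::size_t count_occurrences(const std::string &text,
                              const std::string &needle)
{
  if (needle.empty())
    return 0;
  std::size_t count = 0;
  for (std::size_t pos = text.find(needle); pos != std::string::npos;
       pos = text.find(needle, pos + needle.size()))
    ++count;
  return count;
}
```

Applied to the two lines above with the needle "foo " (note the trailing space), the google/4_8 line contributes one match, from "foo ->", while the google/4_9 and trunk line contributes none, since every "foo" there is followed by "/" or ".".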


2014-06-05  Teresa Johnson  

* testsuite/gcc.dg/ipa/ipa-sra-6.c: Update to handle
recent tree-sra.c fix.

Index: testsuite/gcc.dg/ipa/ipa-sra-6.c
===
--- testsuite/gcc.dg/ipa/ipa-sra-6.c (revision 210862)
+++ testsuite/gcc.dg/ipa/ipa-sra-6.c (working copy)
@@ -30,5 +30,5 @@ int main (int argc, char *argv[])
   return foo (&cow, 0);
 }

-/* { dg-final { scan-tree-dump-times "foo " 1 "eipa_sra"  } } */
+/* { dg-final { scan-tree-dump-times "foo " 2 "eipa_sra"  } } */
 /* { dg-final { cleanup-tree-dump "eipa_sra" } } */


Teresa

-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Re: [C++ RFH] PR 56961

2014-06-05 Thread Paolo Carlini


Hi,

On 06/05/2014 03:35 PM, Richard Biener wrote:
> On Thu, Jun 5, 2014 at 3:26 PM, Paolo Carlini  wrote:
>> Hi,
>>
>> On 06/05/2014 03:20 PM, Richard Biener wrote:
>>> I think the operands have to be reversed though - the type matches that of
>>> op0. Sorry ;)
>> Something like this, then?
> Yes.  I suppose it's ok to re-order side-effects lhs, rhs to rhs, lhs?
> Otherwise you'd need to do sth like
>
>    op0 = save_expr (op0);
>    *expr_p = build2 (COMPOUND_EXPR, TREE_TYPE (*expr_p),
>                      op0,
>                      build2 (COMPOUND_EXPR, TREE_TYPE (*expr_p),
>                              build_fold_addr_expr (op1), op0));
>
> (which may or may not work or be a good idea with zero-size aggregate op0)

In any case, I think that we would not regress on this... and, well, we
have plenty of time for further tweaks (this goes to mainline only, of
course). The below passes testing, if nobody has further comments
tomorrow I will commit it.


Thanks again,
Paolo.

//
/cp
2014-06-05  Richard Biener  
Paolo Carlini  

PR c++/56961
* cp-gimplify.c (cp_gimplify_expr, [MODIFY_EXPR]): Rework
handling of empty classes.

/testsuite
2014-06-05  Richard Biener  
Paolo Carlini  

PR c++/56961
* g++.dg/parse/pr56961.C: New.
Index: cp/cp-gimplify.c
===
--- cp/cp-gimplify.c(revision 211274)
+++ cp/cp-gimplify.c(working copy)
@@ -629,19 +629,12 @@ cp_gimplify_expr (tree *expr_p, gimple_seq *pre_p,
 
   Also drop volatile variables on the RHS to avoid infinite
   recursion from gimplify_expr trying to load the value.  */
-   if (!TREE_SIDE_EFFECTS (op1)
-   || (DECL_P (op1) && TREE_THIS_VOLATILE (op1)))
+   if (!TREE_SIDE_EFFECTS (op1))
  *expr_p = op0;
-   else if (TREE_CODE (op1) == MEM_REF
-&& TREE_THIS_VOLATILE (op1))
- {
-   /* Similarly for volatile MEM_REFs on the RHS.  */
-   if (!TREE_SIDE_EFFECTS (TREE_OPERAND (op1, 0)))
- *expr_p = op0;
-   else
- *expr_p = build2 (COMPOUND_EXPR, TREE_TYPE (*expr_p),
-   TREE_OPERAND (op1, 0), op0);
- }
+   else if (TREE_THIS_VOLATILE (op1)
+&& (REFERENCE_CLASS_P (op1) || DECL_P (op1)))
+ *expr_p = build2 (COMPOUND_EXPR, TREE_TYPE (*expr_p),
+   build_fold_addr_expr (op1), op0);
else
  *expr_p = build2 (COMPOUND_EXPR, TREE_TYPE (*expr_p),
op0, op1);
Index: testsuite/g++.dg/parse/pr56961.C
===
--- testsuite/g++.dg/parse/pr56961.C(revision 0)
+++ testsuite/g++.dg/parse/pr56961.C(working copy)
@@ -0,0 +1,16 @@
+// PR c++/56961
+
+struct foo { };
+
+typedef struct
+{
+  volatile foo fields;
+} CSPHandleState;
+ 
+CSPHandleState a;
+
+void fn1 ()
+{
+  CSPHandleState b;
+  b.fields = foo();  // { dg-error "discards qualifiers" }
+}

Re: [PATCH, Fortan] fix initialization of flag_errno_math and flag_associative_math

2014-06-05 Thread Dominique Dhumieres

> Ok for trunk. Thanks!

Please don't rush! The behavior of -fno-signed-zeros -fno-trapping-math
implying associative math was changed (as in reverted) between r165758
(implied associative math) and r165930 (lost associative math). AFAICT
it could be due to r165823. Investigating! I am also looking for the
introduction of *frontend_set*.

TIA

Dominique

[PATCH] Remove some (now) pointless current_loops checks

2014-06-05 Thread Richard Biener


This removes the obvious ones.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2014-06-05  Richard Biener  

* cfgexpand.c (expand_gimple_cond): Remove check for current_loops.
(construct_init_block): Likewise.
(construct_exit_block): Likewise.
(pass_expand::execute): Likewise.
* graphite.c (graphite_transforms): Replace check for current_loops
with a check for > 1 loops.
(pass_graphite_transforms::execute): Adjust.
* ipa-split.c (split_function): Remove check for current_loops.
* omp-low.c (expand_parallel_call): Likewise.
(expand_omp_for_init_counts): Likewise.
(extract_omp_for_update_vars): Likewise.
(expand_omp_for_generic): Likewise.
(expand_omp_sections): Likewise.
(expand_omp_target): Likewise.
* tracer.c (tail_duplicate): Likewise.
(pass_tracer::execute): Likewise.
* trans-mem.c (expand_transaction): Likewise.
* tree-complex.c (expand_complex_div_wide): Likewise.
* tree-eh.c (lower_resx): Likewise.
(cleanup_empty_eh_merge_phis): Likewise.
* tree-predcom.c (run_tree_predictive_commoning): Replace check for
current_loops with a check for > 1 loops.
(pass_predcom::execute): Adjust.
* tree-scalar-evolution.c (scev_reset): Remove check for current_loops.
* tree-ssa-copy.c (copy_prop_visit_phi_node): Likewise.
* tree-ssa-dom.c (pass_phi_only_cprop::execute): Likewise.
* tree-ssa-tail-merge.c (tail_merge_optimize): Likewise.
* tree-ssa-threadupdate.c (thread_through_all_blocks): Likewise.
* tree-switch-conversion.c (process_switch): Likewise.
* tree-tailcall.c (tree_optimize_tail_calls_1): Likewise.
* tree-vrp.c (vrp_visit_phi_node): Likewise.
(execute_vrp): Likewise.
* ubsan.c (ubsan_expand_null_ifn): Likewise.

Index: trunk/gcc/cfgexpand.c
===
*** trunk.orig/gcc/cfgexpand.c  2014-06-03 12:09:30.292420262 +0200
--- trunk/gcc/cfgexpand.c   2014-06-05 14:48:59.157864386 +0200
*** expand_gimple_cond (basic_block bb, gimp
*** 2170,2177 
false_edge->flags |= EDGE_FALLTHRU;
new_bb->count = false_edge->count;
new_bb->frequency = EDGE_FREQUENCY (false_edge);
!   if (current_loops && bb->loop_father)
! add_bb_to_loop (new_bb, bb->loop_father);
new_edge = make_edge (new_bb, dest, 0);
new_edge->probability = REG_BR_PROB_BASE;
new_edge->count = new_bb->count;
--- 2170,2176 
false_edge->flags |= EDGE_FALLTHRU;
new_bb->count = false_edge->count;
new_bb->frequency = EDGE_FREQUENCY (false_edge);
!   add_bb_to_loop (new_bb, bb->loop_father);
new_edge = make_edge (new_bb, dest, 0);
new_edge->probability = REG_BR_PROB_BASE;
new_edge->count = new_bb->count;
*** construct_init_block (void)
*** 5276,5283 
   ENTRY_BLOCK_PTR_FOR_FN (cfun));
init_block->frequency = ENTRY_BLOCK_PTR_FOR_FN (cfun)->frequency;
init_block->count = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
!   if (current_loops && ENTRY_BLOCK_PTR_FOR_FN (cfun)->loop_father)
! add_bb_to_loop (init_block, ENTRY_BLOCK_PTR_FOR_FN (cfun)->loop_father);
if (e)
  {
first_block = e->dest;
--- 5275,5281 
   ENTRY_BLOCK_PTR_FOR_FN (cfun));
init_block->frequency = ENTRY_BLOCK_PTR_FOR_FN (cfun)->frequency;
init_block->count = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
!   add_bb_to_loop (init_block, ENTRY_BLOCK_PTR_FOR_FN (cfun)->loop_father);
if (e)
  {
first_block = e->dest;
*** construct_exit_block (void)
*** 5355,5362 
exit_block = create_basic_block (NEXT_INSN (head), end, prev_bb);
exit_block->frequency = EXIT_BLOCK_PTR_FOR_FN (cfun)->frequency;
exit_block->count = EXIT_BLOCK_PTR_FOR_FN (cfun)->count;
!   if (current_loops && EXIT_BLOCK_PTR_FOR_FN (cfun)->loop_father)
! add_bb_to_loop (exit_block, EXIT_BLOCK_PTR_FOR_FN (cfun)->loop_father);
  
ix = 0;
while (ix < EDGE_COUNT (EXIT_BLOCK_PTR_FOR_FN (cfun)->preds))
--- 5353,5359 
exit_block = create_basic_block (NEXT_INSN (head), end, prev_bb);
exit_block->frequency = EXIT_BLOCK_PTR_FOR_FN (cfun)->frequency;
exit_block->count = EXIT_BLOCK_PTR_FOR_FN (cfun)->count;
!   add_bb_to_loop (exit_block, EXIT_BLOCK_PTR_FOR_FN (cfun)->loop_father);
  
ix = 0;
while (ix < EDGE_COUNT (EXIT_BLOCK_PTR_FOR_FN (cfun)->preds))
*** pass_expand::execute (function *fun)
*** 5821,5828 
timevar_push (TV_POST_EXPAND);
/* We are no longer in SSA form.  */
fun->gimple_df->in_ssa_p = false;
!   if (current_loops)
! loops_state_clear (LOOP_CLOSED_SSA);
  
/* Expansion is used by optimization passes too, set maybe_hot_insn_p
   conservatively to true until they are all profile aware.  */
--- 5818,5824 -

Re: [PATCH, Pointer Bounds Checker 19/x] Support bounds in expand

2014-06-05 Thread Ilya Enkovich

On 04 Jun 16:36, Michael Matz wrote:
> Hi,
> 
> On Mon, 2 Jun 2014, Ilya Enkovich wrote:
> 
> > > There is exactly one place (except for the self-recursive ones) where 
> > > you call the new store_expr with a non-null argument for bounds 
> > > target, and it seems to be only necessary for when some sub-expression 
> > > of the RHS is a call.  Can you somehow arrange to move that handling 
> > > to the single place in expand_assignment() so that you don't need to 
> > > change the signature of store_expr?
> > 
> > I see the only nice way to do it - store_expr should return bounds of 
> > expanded exp. Currently it always return NULL_RTX. Does it look better 
> > than a new argument?
> 
> IMHO it does.  That or introducing a new store_expr_with_bounds (with the 
> new argument) and letting store_expr be a wrapper for that, passing the 
> NULL.  Basically anything that avoids adding a new parameter for most of 
> the existing calls to store_expr.
> 
> 
> Ciao,
> Michael.

Here is an updated version using store_expr_with_bounds and store_expr as a 
wrapper for it.
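
The wrapper arrangement suggested above can be sketched in isolation. All names here (`bounds`, `store_with_bounds`, `store`) are hypothetical and the bodies are placeholders. The point is only the shape: the extended function takes the extra argument, and the old entry point keeps its signature by forwarding a null pointer.

```cpp
// Hypothetical bounds record, standing in for whatever the real code tracks.
struct bounds {
  int lower;
  int upper;
};

// Extended entry point: optionally reports bounds through `bounds_out`.
// The computation is a placeholder for illustration.
int store_with_bounds(int value, bounds *bounds_out)
{
  if (bounds_out)
    *bounds_out = bounds{value, value};
  return value;
}

// Existing entry point: same signature as before, so the many existing
// callers need no change.
int store(int value)
{
  return store_with_bounds(value, nullptr);
}
```

Only the few call sites that care about bounds need to switch to the extended function.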

Bootstrapped and tested on linux-x86_64.

Thanks,
Ilya
--
gcc/

2014-06-05  Ilya Enkovich  

* calls.c: Include tree-chkp.h, rtl-chkp.h, bitmap.h.
(arg_data): Add fields special_slot, pointer_arg and
pointer_offset.
(store_bounds): New.
(emit_call_1): Propagate instrumentation flag for CALL.
(initialize_argument_information): Compute pointer_arg,
pointer_offset and special_slot for pointer bounds arguments.
(finalize_must_preallocate): Preallocate when storing bounds
in bounds table.
(compute_argument_addresses): Skip pointer bounds.
(expand_call): Store bounds into tables separately.  Return
result joined with resulting bounds.
* cfgexpand.c: Include tree-chkp.h, rtl-chkp.h.
(expand_call_stmt): Propagate bounds flag for CALL_EXPR.
(expand_return): Add returned bounds arg.  Handle returned bounds.
(expand_gimple_stmt_1): Adjust to new expand_return signature.
(gimple_expand_cfg): Reset rtx bounds map.
* expr.c: Include tree-chkp.h, rtl-chkp.h.
(expand_assignment): Handle returned bounds.
(store_expr_with_bounds): New.  Replaces store_expr with new bounds
target argument.  Handle bounds returned by calls.
(store_expr): Now wraps store_expr_with_bounds.
* expr.h (store_expr_with_bounds): New.
* function.c: Include tree-chkp.h, rtl-chkp.h.
(bounds_parm_data): New.
(use_register_for_decl): Do not registerize decls used for bounds
stores and loads.
(assign_parms_augmented_arg_list): Add bounds of the result
structure pointer as the second argument.
(assign_parm_find_entry_rtl): Mark bounds are never passed on
the stack.
(assign_parm_is_stack_parm): Likewise.
(assign_parm_load_bounds): New.
(assign_bounds): New.
(assign_parms): Load bounds and determine a location for
returned bounds.
(diddle_return_value_1): New.
(diddle_return_value): Handle returned bounds.
* function.h (rtl_data): Add field for returned bounds.

diff --git a/gcc/calls.c b/gcc/calls.c
index e1dc8eb..5fbbe9f 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -44,11 +44,14 @@ along with GCC; see the file COPYING3.  If not see
 #include "tm_p.h"
 #include "timevar.h"
 #include "sbitmap.h"
+#include "bitmap.h"
 #include "langhooks.h"
 #include "target.h"
 #include "cgraph.h"
 #include "except.h"
 #include "dbgcnt.h"
+#include "tree-chkp.h"
+#include "rtl-chkp.h"

 /* Like PREFERRED_STACK_BOUNDARY but in units of bytes, not bits.  */
 #define STACK_BYTES (PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
@@ -76,6 +79,15 @@ struct arg_data
   /* If REG is a PARALLEL, this is a copy of VALUE pulled into the correct
  form for emit_group_move.  */
   rtx parallel_value;
+  /* If value is passed in neither reg nor stack, this field holds a number
+ of a special slot to be used.  */
+  rtx special_slot;
+  /* For pointer bounds hold an index of parm bounds are bound to.  -1 if
+ there is no such pointer.  */
+  int pointer_arg;
+  /* If pointer_arg refers a structure, then pointer_offset holds an offset
+ of a pointer in this structure.  */
+  int pointer_offset;
   /* If REG was promoted from the actual mode of the argument expression,
  indicates whether the promotion is sign- or zero-extended.  */
   int unsignedp;
@@ -133,6 +145,7 @@ static void emit_call_1 (rtx, tree, tree, tree, 
HOST_WIDE_INT, HOST_WIDE_INT,
 HOST_WIDE_INT, rtx, rtx, int, rtx, int,
 cumulative_args_t);
 static void precompute_register_parameters (int, struct arg_data *, int *);
+static void store_bounds (struct arg_data *, struct arg_data *);
 static int store_one_arg (struct arg_data *, rtx, int, int, int);
 static void store_unaligned_arguments_into_pseudo

[PATCH, x86] Improves x86 permutation expand

2014-06-05 Thread Evgeny Stupachenko

Hi,

The patch passed bootstrap and make check with no new failures.
It gives a ~10% improvement on the test in pr52252 and potentially on pr61403.
Is it OK?

Thanks,
Evgeny

ChangeLog:

2014-06-05  Evgeny Stupachenko  

 * config/i386/i386.c (expand_vec_perm_pblendv): New.
 Permutation expand using pblendv.
 * config/i386/i386.c (ix86_expand_vec_perm_const_1): New
 scheme for permutation expand.

Patch:

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 8827256..e1c8126 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -43185,6 +43185,76 @@ expand_vec_perm_palignr (struct expand_vec_perm_d *d)
   return ok;
 }

+/* A subroutine of ix86_expand_vec_perm_const_1.  Try to simplify
+   the permutation using the SSE4_1 pblendv instruction.  Potentially
+   reduces permutaion from 2 pshufb and or to 1 pshufb and pblendv.  */
+
+static bool
+expand_vec_perm_pblendv (struct expand_vec_perm_d *d)
+{
+  unsigned i, which, nelt = d->nelt;
+  struct expand_vec_perm_d dcopy, dcopy1;
+
+  /* Figure out where permutation elements stay not in their
+ respective lanes.  */
+  for (i = 0, which = 0; i < nelt; ++i)
+{
+  unsigned e = d->perm[i];
+  if (e != i)
+   which |= (e < nelt ? 1 : 2);
+}
+  /* We can pblend the part where elements stay not in their
+ respective lanes only when these elements are all in one
+ half of a permutation.
+ {0 1 8 3 4 5 9 7} is ok as 8, 9 are not at their respective
+ lanes, but both 8 and 9 >= 8
+ {0 1 8 3 4 5 2 7} is not ok as 2 and 8 are not at their
+ respective lanes and 8 >= 8, but 2 not.  */
+  if (which != 1 && which != 2)
+return false;
+
+  /* First we apply one operand permutation to the part where
+ elements stay not in their respective lanes.  */
+  dcopy = *d;
+  if (which == 2)
+dcopy.op0 = dcopy.op1 = d->op1;
+  else
+dcopy.op0 = dcopy.op1 = d->op0;
+  dcopy.one_operand_p = true;
+
+  for (i = 0; i < nelt; ++i)
+{
+  unsigned e = d->perm[i];
+  if (which == 2)
+   dcopy.perm[i] = ((e >= nelt) ? (e - nelt) : e);
+}
+
+  if (!expand_vec_perm_1 (&dcopy))
+return false;
+
+  /* Next we put permuted elements into thier positions.  */
+  dcopy1 = *d;
+  if (which == 2)
+dcopy1.op1 = dcopy.target;
+  else
+dcopy1.op0 = dcopy.target;
+
+  for (i = 0; i < nelt; ++i)
+{
+  unsigned e = d->perm[i];
+  if (which == 2)
+   dcopy1.perm[i] = ((e >= nelt) ? (nelt + i) : e);
+  else
+   dcopy1.perm[i] = ((e < nelt) ? i : e);
+}
+
+  if (!expand_vec_perm_blend (&dcopy1))
+return false;
+
+  return true;
+}
+
 static bool expand_vec_perm_interleave3 (struct expand_vec_perm_d *d);

 /* A subroutine of ix86_expand_vec_perm_builtin_1.  Try to simplify
@@ -44557,6 +44627,9 @@ ix86_expand_vec_perm_const_1 (struct
expand_vec_perm_d *d)
   if (expand_vec_perm_vperm2f128 (d))
 return true;

+  if (expand_vec_perm_pblendv (d))
+return true;
+
   /* Try sequences of three instructions.  */

   if (expand_vec_perm_2vperm2f128_vshuf (d))
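
The eligibility test at the start of expand_vec_perm_pblendv can be tried on the two permutations named in its comment. This is a standalone sketch: `displaced_sources` and `blend_candidate` are hypothetical names, and a `std::vector` stands in for the `perm` array of `struct expand_vec_perm_d`.

```cpp
#include <cstddef>
#include <vector>

// Returns a bit mask describing where displaced elements come from:
// bit 0 is set if some element with perm[i] != i comes from the first
// operand (index < nelt), bit 1 if it comes from the second operand.
unsigned displaced_sources(const std::vector<unsigned> &perm)
{
  const std::size_t nelt = perm.size();
  unsigned which = 0;
  for (std::size_t i = 0; i < nelt; ++i)
    if (perm[i] != i)
      which |= (perm[i] < nelt ? 1u : 2u);
  return which;
}

// The blend approach applies only when all displaced elements come from
// exactly one of the two operands.
bool blend_candidate(const std::vector<unsigned> &perm)
{
  const unsigned which = displaced_sources(perm);
  return which == 1 || which == 2;
}
```

For {0 1 8 3 4 5 9 7} the mask is 2, so the permutation qualifies; for {0 1 8 3 4 5 2 7} both bits are set (mask 3) and the function would bail out.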

Re: std::quoted doesn't respect padding

2014-06-05 Thread Ed Smith-Rowland


On 04/01/2014 07:33 AM, Jonathan Wakely wrote:

[CCing gcc-patches]

On 11/03/14 11:18 -0400, Ed Smith-Rowland wrote:

On 02/14/2014 07:56 PM, Jonathan Wakely wrote:

We need to implement this fix (probably after 4.9 is released though)

http://cplusplus.github.io/LWG/lwg-active.html#2344


Here is a patch (Stage 1 obviously).


A couple of things I didn't notice earlier ...


Index: include/std/iomanip
===
--- include/std/iomanip(revision 208430)
+++ include/std/iomanip(working copy)
@@ -41,6 +41,7 @@

#if __cplusplus >= 201103L
#include 
+#include  // used in quoted.


We really only need  for __cplusplus > 201103L, otherwise we
include it unnecessarily for C++11.



-return __os;
+return __os << __ostr.str();
  }


It should be slightly more efficient to do __os << __ostr.rdbuf() here,
and in the other operator<< overload, since that copies directly from
the stringbuf to __os's own streambuf, rather than creating a
temporary std::string and copying from that.



Sorry for the hiatus...

Here is a new patch with issues fixed.

OK if it passes testing on x86_64-linux?

Ed

2014-06-05  Ed Smith-Rowland  <3dw...@verizon.net>

DR 2344 - std::quoted doesn't respect padding
* include/std/iomanip: Allow for padding in quoted inserters.
* testsuite/27_io/manipulators/standard/char/dr2344.cc: New.
* testsuite/27_io/manipulators/standard/wchar_t/dr2344.cc: New.

Index: include/std/iomanip
===
--- include/std/iomanip (revision 211281)
+++ include/std/iomanip (working copy)
@@ -41,7 +41,10 @@
 
 #if __cplusplus >= 201103L
 #include 
+#if __cplusplus > 201103L
+#include  // used in quoted.
 #endif
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
@@ -342,7 +345,6 @@
 
 /**
  * @brief Struct for delimited strings.
- *The left and right delimiters can be different.
  */
 template
   struct _Quoted_string
@@ -364,8 +366,10 @@
   };
 
 /**
- * @brief Inserter for delimited strings.
- *The left and right delimiters can be different.
+ * @brief Inserter for quoted strings.
+ *
+ *  _GLIBCXX_RESOLVE_LIB_DEFECTS
+ *  DR 2344 quoted()'s interaction with padding is unclear
  */
 template
   auto&
@@ -372,21 +376,24 @@
   operator<<(std::basic_ostream<_CharT, _Traits>& __os,
 const _Quoted_string& __str)
   {
-   __os << __str._M_delim;
+   std::basic_ostringstream<_CharT, _Traits> __ostr;
+   __ostr << __str._M_delim;
for (const _CharT* __c = __str._M_string; *__c; ++__c)
  {
if (*__c == __str._M_delim || *__c == __str._M_escape)
- __os << __str._M_escape;
-   __os << *__c;
+ __ostr << __str._M_escape;
+   __ostr << *__c;
  }
-   __os << __str._M_delim;
+   __ostr << __str._M_delim;
 
-   return __os;
+   return __os << __ostr.rdbuf();
   }
 
 /**
- * @brief Inserter for delimited strings.
- *The left and right delimiters can be different.
+ * @brief Inserter for quoted strings.
+ *
+ *  _GLIBCXX_RESOLVE_LIB_DEFECTS
+ *  DR 2344 quoted()'s interaction with padding is unclear
  */
 template
   auto&
@@ -393,16 +400,17 @@
   operator<<(std::basic_ostream<_CharT, _Traits>& __os,
 const _Quoted_string<_String, _CharT>& __str)
   {
-   __os << __str._M_delim;
+   std::basic_ostringstream<_CharT, _Traits> __ostr;
+   __ostr << __str._M_delim;
for (auto& __c : __str._M_string)
  {
if (__c == __str._M_delim || __c == __str._M_escape)
- __os << __str._M_escape;
-   __os << __c;
+ __ostr << __str._M_escape;
+   __ostr << __c;
  }
-   __os << __str._M_delim;
+   __ostr << __str._M_delim;
 
-   return __os;
+   return __os << __ostr.rdbuf();
   }
 
 /**
Index: testsuite/27_io/manipulators/standard/char/dr2344.cc
===
--- testsuite/27_io/manipulators/standard/char/dr2344.cc(revision 0)
+++ testsuite/27_io/manipulators/standard/char/dr2344.cc(working copy)
@@ -0,0 +1,50 @@
+// { dg-do run }
+// { dg-options "-std=gnu++14" }
+
+// Copyright (C) 2014 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Pu

Re: [PATCH 3/5] introduce the binding oracle

2014-06-05 Thread Tom Tromey

> "Jeff" == Jeff Law  writes:

Jeff> Just a nit.  C-style comment would be appreciated.  It might also help
Jeff> to clarify what "much more sane" really means here.

I made this change locally.
The new comment reads:

  /* Temporarily hide any binding oracle.  Without this, calls to
 debug_tree from the debugger will end up calling into the oracle,
 making for a confusing debug session.  As the oracle isn't needed
 here for normal operation, it's simplest to suppress it.  */

Tom

Re: std::quoted doesn't respect padding

2014-06-05 Thread Jonathan Wakely


On 05/06/14 11:43 -0400, Ed Smith-Rowland wrote:

On 04/01/2014 07:33 AM, Jonathan Wakely wrote:

[CCing gcc-patches]

On 11/03/14 11:18 -0400, Ed Smith-Rowland wrote:

On 02/14/2014 07:56 PM, Jonathan Wakely wrote:

We need to implement this fix (probably after 4.9 is released though)

http://cplusplus.github.io/LWG/lwg-active.html#2344


Here is a patch (Stage 1 obviously).


A couple of things I didn't notice earlier ...


Index: include/std/iomanip
===
--- include/std/iomanip(revision 208430)
+++ include/std/iomanip(working copy)
@@ -41,6 +41,7 @@

#if __cplusplus >= 201103L
#include 
+#include <sstream> // used in quoted.


We really only need <sstream> for __cplusplus > 201103L, otherwise we
include it unnecessarily for C++11.



-return __os;
+return __os << __ostr.str();
 }


It should be slightly more efficient to do __os << __ostr.rdbuf() here,
and in the other operator<< overload, since that copies directly from
the stringbuf to __os's own streambuf, rather than creating a
temporary std::string and copying from that.
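
For reference, the buffering idea behind the DR 2344 fix can be sketched
outside the library.  This is only an illustration, not the libstdc++
code: write_quoted and padded_quoted are made-up names, and the sketch
uses the str() form from the earlier patch revision, so that the width
set on the target stream applies to the whole quoted string rather than
to the opening delimiter alone.

```cpp
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the library inserter: escape and delimit
// `text` into a temporary stream, then insert the result in one piece.
inline std::ostream&
write_quoted(std::ostream& os, const std::string& text,
             char delim = '"', char escape = '\\')
{
  std::ostringstream tmp;
  tmp << delim;
  for (char c : text)
    {
      if (c == delim || c == escape)
        tmp << escape;
      tmp << c;
    }
  tmp << delim;
  // One string insertion, so os.width() pads the complete quoted string.
  return os << tmp.str();
}

// Hypothetical helper that makes the padding visible by filling with '*'.
inline std::string
padded_quoted(const std::string& text, int width)
{
  std::ostringstream out;
  out << std::setfill('*') << std::setw(width);
  write_quoted(out, text);
  return out.str();
}
```

With a width of 8, padded_quoted("ab", 8) pads the four-character quoted
string on the left with four fill characters.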



Sorry for the hiatus...

Here is a new patch with issues fixed.

OK if it passes testing on x86_64-linux?


Great, OK for trunk and 4.9 too, I think.

Thanks!

Re: [PATCH]Code size optimization for the fmul/fdiv and dmul/ddiv function in libgcc

2014-06-05 Thread Richard Earnshaw

On 04/06/14 07:56, Tony Wang wrote:
> Hi there,
> 
> In libgcc, the files ieee754-sf.S and ieee754-df.S contain some function
> pairs that are bundled into one .o file and share the same
> .text section. For example, the libgcc makefile builds fmul and fdiv
> into one .o file, which is archived into libgcc.a. So when a
> user only calls the single-precision multiply function, the fdiv
> function is also linked in, and because fmul and fdiv share the same
> .text section, the linker option --gc-sections or -flto can't remove the
> dead code.
> 
> So the optimization just separates the function pairs (fmul/fdiv and
> dmul/ddiv) into different sections, following the naming pattern of
> -ffunction-sections (.text.__functionname), so that the unused
> sections of fdiv/ddiv can be eliminated with the option --gc-sections
> when users only use fmul/dmul. The solution is to add a conditional
> statement to the macro FUNC_START, which conditionally changes the
> section of a function from .text to .text.__\name when compiling with
> the L_arm_muldivsf3 or L_arm_muldivdf3 macro.
> 
> There are 3 parts: mul, div and common. This patch puts mul and common
> together, so that a user's multiply won't pull in div, but a user's div
> will still pull in mul. This is reasonable because the size of mul is far
> smaller than the size of div.
> 
> ChangeLog changes are:
> 
> ***gcc/libgcc/ChangeLog***
> 
> 2014-05-28  Tony Wang  
> 
> * config/arm/lib1funcs.S (FUNC_START): Add conditional section
> redefine for macro L_arm_muldivsf3 and L_arm_muldivdf3
> 
> Bootstrapped on x86_64-linux-gnu and no regression found in the
> testsuite. Patch is in attachment.
> The code size reduction for Thumb-2 on Cortex-M3 is:
> 1. When the user only uses single-precision floating-point multiply:
> fmul+fdiv => fmul gives a code size reduction of 318 bytes.
> 2. When the user only uses double-precision floating-point multiply:
> dmul+ddiv => dmul gives a code size reduction of 474 bytes.
> 
> BR,
> Tony
> 
> 
> libgcc_mul_div_code_size_reduction.diff
> 
> 
> diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S
> index b617137..0454bc8 100644
> --- a/libgcc/config/arm/lib1funcs.S
> +++ b/libgcc/config/arm/lib1funcs.S
> @@ -419,7 +419,11 @@ SYM (\name):
>  #endif
>  
>  .macro FUNC_START name
> +#if defined (L_arm_muldivsf3) || defined (L_arm_muldivdf3)
> + .section .text.__\name,"ax",%progbits
> +#else
>   .text
> +#endif
>   .globl SYM (__\name)
>   TYPE (__\name)
>   .align 0
> @@ -468,7 +472,11 @@ _L__\name:
>  #define EQUIV .thumb_set
>  #else
>  .macro   ARM_FUNC_START name
> +#if defined (L_arm_muldivsf3) || defined (L_arm_muldivdf3)
> + .section .text.__\name,"ax",%progbits
> +#else
>   .text
> +#endif
>   .globl SYM (__\name)
>   TYPE (__\name)
>   .align 0
> 

I've two concerns about this:

1) the hacky approach to selecting when to use a separate section
2) the possibility that this will create out-of-range branches between
code fragments that cannot be veneered because the labels are untyped.
This is potentially exacerbated by the way GNU LD orders sections with
different names.

Fixing 1) is relatively straight-forward.  Extend the FUNC_START macro
to take an optional argument that controls whether a special section
name is used.

Fixing 2) is harder.  First you must mark all symbols that cross
fragment boundaries as function symbols (they don't have to be global).
 Secondly, you must ensure that r12 (IP) is not live at such points.
This might involve substantial restructuring to the code (I haven't
checked).

R.

Re: libgo patch committed: Merge from revision 18783 of master

2014-06-05 Thread Uros Bizjak

Hello!

> I have committed a patch to libgo to merge from revision
> 18783:00cce3a34d7e of the master library.  This revision was committed
> January 7.  I picked this revision to merge to because the next revision
> deleted a file that is explicitly merged in by the libgo/merge.sh
> script.

crypto/x509 fails on x86 Fedora20 with:

--- FAIL: TestImports (0.00 seconds)
testing.go:228: failed to run x509_test_import.go: exec: "go":
executable file not found in $PATH
FAIL
FAIL: crypto/x509

Uros.

Re: [patch i386]: Fix PR/46219 Generate indirect jump instruction

2014-06-05 Thread Kai Tietz

Thanks for all your hints.  Here is the updated patch

ChangeLog testsuite

2014-06-05  Kai Tietz  

PR target/46219
* gcc.target/i386/sibcall-4.c: Remove xfail.

ChangeLog

2014-06-05  Kai Tietz  
Richard Henderson  

PR target/46219
   * config/i386/predicates.md (memory_nox32_operand): Add memory_operand
checking for !TARGET_X32.
* config/i386/i386.md (UNSPEC_PEEPSIB): New unspec constant.
(sibcall_intern): New define_insn, plus required peepholes.
(sibcall_pop_intern): Likewise.
(sibcall_value_intern): Likewise.
(sibcall_value_pop_intern): Likewise.

Tested for i686-w64-mingw32, x86_64-unknown-linux-gnu.  OK to apply?

Index: config/i386/i386.md
===
--- config/i386/i386.md(revision 211255)
+++ config/i386/i386.md(working copy)
@@ -111,6 +111,7 @@
   UNSPEC_LEA_ADDR
   UNSPEC_XBEGIN_ABORT
   UNSPEC_STOS
+  UNSPEC_PEEPSIB

   ;; For SSE/MMX support:
   UNSPEC_FIX_NOTRUNC
@@ -11382,6 +11383,53 @@
   "* return ix86_output_call_insn (insn, operands[0]);"
   [(set_attr "type" "call")])

+(define_insn "*sibcall_intern"
+  [(call (unspec [(mem:QI (match_operand:W 0 "memory_operand"))]
+   UNSPEC_PEEPSIB)
+ (match_operand 1))]
+  ""
+  "* return ix86_output_call_insn (insn, operands[0]);"
+  [(set_attr "type" "call")])
+
+(define_peephole2
+  [(set (match_operand:DI 0 "register_operand")
+(match_operand:DI 1 "memory_nox32_operand"))
+   (call (mem:QI (match_dup 0))
+ (match_operand 3))]
+  "TARGET_64BIT && SIBLING_CALL_P (peep2_next_insn (1))"
+  [(call (unspec [(mem:QI (match_dup 1))] UNSPEC_PEEPSIB)
+ (match_dup 3))])
+
+(define_peephole2
+  [(set (match_operand:DI 0 "register_operand")
+(match_operand:DI 1 "memory_nox32_operand"))
+   (unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)
+   (call (mem:QI (match_dup 0))
+ (match_operand 3))]
+  "TARGET_64BIT && SIBLING_CALL_P (peep2_next_insn (2))"
+  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)
+   (call (unspec [(mem:QI (match_dup 1))] UNSPEC_PEEPSIB)
+ (match_dup 3))])
+
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand")
+(match_operand:SI 1 "memory_nox32_operand"))
+   (call (mem:QI (match_dup 0))
+ (match_operand 3))]
+  "!TARGET_64BIT && SIBLING_CALL_P (peep2_next_insn (1))"
+  [(call (unspec [(mem:QI (match_dup 1))] UNSPEC_PEEPSIB)
+ (match_dup 3))])
+
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand")
+(match_operand:SI 1 "memory_nox32_operand"))
+   (unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)
+   (call (mem:QI (match_dup 0))
+ (match_operand 3))]
+  "!TARGET_64BIT && SIBLING_CALL_P (peep2_next_insn (2))"
+  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)
+   (call (unspec [(mem:QI (match_dup 1))] UNSPEC_PEEPSIB) (match_dup 3))])
+
 (define_expand "call_pop"
   [(parallel [(call (match_operand:QI 0)
 (match_operand:SI 1))
@@ -11415,6 +11463,43 @@
   "* return ix86_output_call_insn (insn, operands[0]);"
   [(set_attr "type" "call")])

+(define_insn "*sibcall_pop_intern"
+  [(call (unspec [(mem:QI (match_operand:SI 0 "memory_operand"))]
+   UNSPEC_PEEPSIB)
+ (match_operand 1))
+   (set (reg:SI SP_REG)
+(plus:SI (reg:SI SP_REG)
+ (match_operand:SI 2 "immediate_operand" "i")))]
+  "!TARGET_64BIT"
+  "* return ix86_output_call_insn (insn, operands[0]);"
+  [(set_attr "type" "call")])
+
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand")
+(match_operand:SI 1 "memory_nox32_operand"))
+   (parallel [(call (mem:QI (match_dup 0))
+(match_operand 3))
+  (set (reg:SI SP_REG)
+   (match_operand 4))])]
+  "!TARGET_64BIT && SIBLING_CALL_P (peep2_next_insn (1))"
+  [(parallel [(call (unspec [(mem:QI (match_dup 1))] UNSPEC_PEEPSIB)
+(match_dup 3))
+  (set (reg:SI SP_REG) (match_dup 4))])])
+
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand")
+(match_operand:SI 1 "memory_nox32_operand"))
+   (unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)
+   (parallel [(call (mem:QI (match_dup 0))
+(match_operand 3))
+  (set (reg:SI SP_REG)
+   (match_operand 4))])]
+  "!TARGET_64BIT && SIBLING_CALL_P (peep2_next_insn (2))"
+  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)
+   (parallel [(call (unspec [(mem:QI (match_dup 1))] UNSPEC_PEEPSIB)
+(match_dup 3))
+  (set (reg:SI SP_REG) (match_dup 4))])])
+
 ;; Call subroutine, returning value in operand 0

 (define_expand "call_value"
@@ -11457,6 +11542,63 @@
   "* return ix86_output_call_insn (insn, operands[1]);"
   [(set_attr "type" "callv")])

+(define_insn "*sibcall_value_intern"
+  [(set (match_operand 0)
+(call (unspec [(mem:QI (match_operand:W 1 "memory_operand"))]
+UNSPEC_PEEPSIB)
+  (match_operand 2)))]
+  ""
+  "* return ix86_output_call_insn (insn, operands[1]);"
+

Re: [PATCH, x86] Improves x86 permutation expand

2014-06-05 Thread Richard Henderson

On 06/05/2014 08:29 AM, Evgeny Stupachenko wrote:
> +  /* Figure out where permutation elements stay not in their
> + respective lanes.  */
> +  for (i = 0, which = 0; i < nelt; ++i)
> +{
> +  unsigned e = d->perm[i];
> +  if (e != i)
> +   which |= (e < nelt ? 1 : 2);
> +}
> +  /* We can pblend the part where elements stay not in their
> + respective lanes only when these elements are all in one
> + half of a permutation.
> + {0 1 8 3 4 5 9 7} is ok as 8, 9 are not at their respective
> + lanes, but both 8 and 9 >= 8
> + {0 1 8 3 4 5 2 7} is not ok as 2 and 8 are not at their
> + respective lanes and 8 >= 8, but 2 not.  */
> +  if (which != 1 && which != 2)
> +return false;

I was about to suggest that you'd get more success by putting the blend first,
and do the shuffle second.  But I suppose it does cover a few cases that the
other way would miss, e.g.

  { 0 4 7 3 }

because we can't blend 0 and 4 (or 3 and 7) into the same vector.  Whereas the
direction you're trying can't handle

  { 0 6 6 1 }

But that can be implemented with

  { 0 1 2 3 }
  { 4 5 6 7 }
  ---
  { 0 1 6 3 } (pblend)
  ---
  { 0 6 6 1 } (pshufb)

So I guess we should cover these two cases in successive patches.
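
To make the four permutations above easier to check by hand, here is a
small standalone sketch of the `which' test from the quoted hunk.  It is
not the i386 backend code: classify_displaced and blend_candidate are
hypothetical names, and the permutation is passed as a plain vector
instead of the expand_vec_perm_d structure.

```cpp
#include <cstddef>
#include <vector>

// Mirror of the quoted loop: for every element that is not in its own
// lane, set bit value 1 if it comes from the first operand (e < nelt)
// and bit value 2 if it comes from the second operand.
inline unsigned
classify_displaced(const std::vector<unsigned>& perm)
{
  const std::size_t nelt = perm.size();
  unsigned which = 0;
  for (std::size_t i = 0; i < nelt; ++i)
    {
      unsigned e = perm[i];
      if (e != i)
        which |= (e < nelt ? 1u : 2u);
    }
  return which;
}

// The quoted code continues only when all displaced elements come from
// one half of the permutation, i.e. which is exactly 1 or exactly 2.
inline bool
blend_candidate(const std::vector<unsigned>& perm)
{
  unsigned which = classify_displaced(perm);
  return which == 1 || which == 2;
}
```

Under this check { 0 4 7 3 } is accepted (all displaced elements come
from the second operand), while { 0 6 6 1 } is rejected because it mixes
displaced elements from both operands.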

> +  if (!expand_vec_perm_blend (&dcopy1))
> +return false;
> +
> +  return true;

You should avoid doing any work in this function if the ISA isn't enabled.
Don't wait until the last test for blend to fail.  Separate that out from the
start of expand_vec_perm_blend as a subroutine, perhaps.

We should be able to prove that we've got a valid blend as input here, so I'd
be more inclined to write

  ok = expand_vec_perm_blend (&dcopy1);
  gcc_assert (ok);
  return true;

> +  if (!expand_vec_perm_1 (&dcopy))
> +return false;

If we know we have pblend, then we know we have pshufb, so again I don't see
how expand_vec_perm_1 can fail.  Another assert would be good.

There is a point, earlier in the function, where we know whether we're going to
succeed or not.  I believe just after

> +  if (which != 1 && which != 2)
> +return false;

You should add a

  if (d->testing_p)
return true;

at that point.
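
Put together, the control flow suggested above might look roughly like
the following mock.  Everything here is a stand-in for illustration:
perm_request, emit_first_step, emit_second_step and
expand_two_step_perm are hypothetical names, not functions from
i386.c, and the two emit helpers merely count calls.

```cpp
#include <cassert>

// Hypothetical, simplified request: the real code works on
// struct expand_vec_perm_d and checks target ISA flags.
struct perm_request
{
  bool isa_enabled;   // stand-in for the ISA test done up front
  bool testing_p;     // only asking whether expansion would succeed
  unsigned which;     // mask from the displaced-element classification
  int insns_emitted;  // counts emitted steps in this mock
};

inline bool emit_first_step(perm_request& d)
{ ++d.insns_emitted; return true; }

inline bool emit_second_step(perm_request& d)
{ ++d.insns_emitted; return true; }

inline bool
expand_two_step_perm(perm_request& d)
{
  // Do no work at all if the ISA is not enabled.
  if (!d.isa_enabled)
    return false;

  // Last point at which the expansion can be rejected.
  if (d.which != 1 && d.which != 2)
    return false;

  // Success is now known, so a testing-only query can stop here.
  if (d.testing_p)
    return true;

  // From here on failure would indicate a bug, hence asserts.
  bool ok = emit_first_step(d);
  assert(ok);
  ok = emit_second_step(d);
  assert(ok);
  return true;
}
```

The point of the ordering is that a testing_p query returns before
anything is emitted, and the later steps are asserted rather than
checked.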

r~

Re: [patch i386]: Fix PR/46219 Generate indirect jump instruction

2014-06-05 Thread Richard Henderson

On 06/05/2014 09:47 AM, Kai Tietz wrote:
> +(define_insn "*sibcall_intern"
> +  [(call (unspec [(mem:QI (match_operand:W 0 "memory_operand"))]

Probably best to use memory_nox32_operand here (and the other define_insn
patterns) too.


Otherwise ok.


r~

Re: [C++ RFH] PR 56961

2014-06-05 Thread Jason Merrill


On 06/05/2014 09:35 AM, Richard Biener wrote:

I suppose it's ok to re-order side-effects lhs, rhs to rhs, lhs?


Yes.

Jason

C++ PATCH for c++/61343 (missing init for thread_local)

2014-06-05 Thread Jason Merrill

The bug here was that we were setting 
DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P for a variable where the 
explicitly written initializer is constant, but becomes a non-constant 
constructor call.  Fixed thus.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 210aa8e75e0827163fd3041890404d39d485d23d
Author: Jason Merrill 
Date:   Wed Jun 4 13:28:38 2014 -0400

	PR c++/61343
	* decl.c (check_initializer): Maybe clear
	DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 3d4058c..b068df8 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -5856,6 +5856,13 @@ check_initializer (tree decl, tree init, int flags, vec<tree, va_gc> **cleanups)
   if (init && init != error_mark_node)
 init_code = build2 (INIT_EXPR, type, decl, init);
 
+  if (init_code)
+{
+  /* We might have set these in cp_finish_decl.  */
+  DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (decl) = false;
+  TREE_CONSTANT (decl) = false;
+}
+
   if (init_code && DECL_IN_AGGR_P (decl))
 {
   static int explained = 0;
diff --git a/gcc/testsuite/g++.dg/tls/thread_local9.C b/gcc/testsuite/g++.dg/tls/thread_local9.C
new file mode 100644
index 000..c75528a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tls/thread_local9.C
@@ -0,0 +1,23 @@
+// PR c++/61343
+
+// { dg-do run { target c++11 } }
+// { dg-add-options tls }
+// { dg-require-effective-target tls_runtime }
+
+struct Foo {
+  int value;
+
+  Foo() noexcept {
+value = 12;
+  }
+};
+
+static thread_local Foo a{};
+
+static __attribute__((noinline)) void UseA() {
+  if (a.value != 12) __builtin_abort();
+}
+
+int main() {
+  UseA();
+}

C++ PATCH to be more helpful about noexcept/thread_local in C++98 mode

2014-06-05 Thread Jason Merrill

We end up in cp_parser_diagnose_invalid_type_name for uses of noexcept 
and thread_local as well as constexpr, and we can give the same helpful 
message.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit df531541273db6a1f885f48f7a4923bfbc437999
Author: Jason Merrill 
Date:   Wed Jun 4 13:29:01 2014 -0400

	* parser.c (cp_parser_diagnose_invalid_type_name): Give helpful note
	for noexcept and thread_local, too.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 60e6cda..7d574d0 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -2920,6 +2920,13 @@ cp_parser_diagnose_invalid_type_name (cp_parser *parser,
   if (cxx_dialect < cxx11 && id == ridpointers[(int)RID_CONSTEXPR])
 	inform (location, "C++11 %<constexpr%> only available with "
 		"-std=c++11 or -std=gnu++11");
+  else if (cxx_dialect < cxx11 && id == ridpointers[(int)RID_NOEXCEPT])
+	inform (location, "C++11 %<noexcept%> only available with "
+		"-std=c++11 or -std=gnu++11");
+  else if (cxx_dialect < cxx11
+	   && !strcmp (IDENTIFIER_POINTER (id), "thread_local"))
+	inform (location, "C++11 %<thread_local%> only available with "
+		"-std=c++11 or -std=gnu++11");
   else if (processing_template_decl && current_class_type
 	   && TYPE_BINFO (current_class_type))
 	{

Re: libgo patch committed: Merge from revision 18783 of master

2014-06-05 Thread Ian Lance Taylor

On Thu, Jun 5, 2014 at 9:38 AM, Uros Bizjak  wrote:
>
>> I have committed a patch to libgo to merge from revision
>> 18783:00cce3a34d7e of the master library.  This revision was committed
>> January 7.  I picked this revision to merge to because the next revision
>> deleted a file that is explicitly merged in by the libgo/merge.sh
>> script.
>
> crypto/x509 fails on x86 Fedora20 with:
>
> --- FAIL: TestImports (0.00 seconds)
> testing.go:228: failed to run x509_test_import.go: exec: "go":
> executable file not found in $PATH
> FAIL
> FAIL: crypto/x509

Thanks.  I'll fix this in the next import.

Ian

Re: [Patch] PR55189 enable -Wreturn-type by default

2014-06-05 Thread Joseph S. Myers

On Thu, 5 Jun 2014, Sylvestre Ledru wrote:

> > Some of those patches appear to be addressing cases where control appears 
> > to reach the end of a function returning non-void, as opposed to cases 
> > where the return type defaults to int. 
> Do you have an example of the patches you are talking about?

In 0004-Update-gcc-tests-with-warning-return-type-enabled-by.patch the 
very first change is adding such a "return 0;" (as are lots of others).

> You are talking about code like this one (from Jonathan Wakely) ?
> 
> int f(int c)
> {
> if (c)
>return 0;
> function_that_never_returns();
> }

Yes.

> Initially, I implemented -Wmissing-return to manage this case (
> https://gcc.gnu.org/ml/gcc-patches/2014-01/msg00820.html ) but Jason
> suggested to remove that:
> https://gcc.gnu.org/ml/gcc-patches/2014-01/msg01033.html
> (I don't have a strong opinion on the subject).

I think splitting the option like that makes sense.  Compatibility 
indicates that -Wreturn-type and -Wall should still enable 
-Wmissing-return, but only the other pieces of -Wreturn-type should be 
enabled by default, at least for C.  (Enabling -Wimplicit-int by default 
might be a good starting point.)

Also, at least one testsuite change in your patch is wrong.  You add an 
"int" return type to c90-impl-int-1.c, which is explicitly checking the 
implicit int functionality for C90; use of dg-warning there would be more 
appropriate (since the point is that it doesn't give an error with 
-pedantic-errors).  It would probably also be best not to add 
-Wno-return-type in c99-impl-int-1.c.  (Any places where /* { dg-bogus 
"warning" "warning in place of error" } */ in tests causes problems 
because you get a new warning *in addition* to the existing error can have 
that dg-bogus removed and a dg-warning directive for the warning added - 
dg-warning/dg-error used not to distinguish properly between warnings and 
errors, so requiring such dg-bogus directives if you wanted to test the 
difference, but that was fixed a long time ago.)

-- 
Joseph S. Myers
jos...@codesourcery.com

[PR tree-optimization/61289] Fix equivalence invalidation when threading across loop backedge

2014-06-05 Thread Jeff Law



When I wrote the improved support for threading across backedges I tried 
to minimize the cost to invalidate equivalences.  This led to some 
convoluted code to track things which might need invalidation (at PHI 
nodes), then further hacks to invalidate equivalences implied by 
traversal of particular edges, etc.


This bug is another example of an edge equivalence that needs to be 
invalidated.  And like other edge equivalences that need invalidation, 
there's no chance to track it as the equivalency is created outside the 
threading code.


Rather than layer another hack on the existing hacks, I just ripped out 
the hacks and did the more expensive equivalency invalidation.  In 
reality we're only invalidating after following a backedge in the CFG 
and only if we encounter statements that don't produce something 
useful.  So it shouldn't be all that expensive.
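
As a rough model of what "the more expensive invalidation" means, the
sketch below walks every recorded value and drops the ones that refer to
the name being invalidated.  It is deliberately not GCC code: names are
plain indices, kNoValue and the values vector are assumptions standing
in for SSA_NAME_VALUE, and the unwinding stack used by the real
threader is left out.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model: values[i] is the name that name i is currently
// known to be equivalent to, or kNoValue if there is no equivalence.
constexpr int kNoValue = -1;

// Drop every equivalence whose value is `lhs` by scanning all names.
// The cost is linear in the number of names, which is the price paid
// for not tracking the affected names separately.
inline void
invalidate_equivalences(int lhs, std::vector<int>& values)
{
  for (std::size_t i = 0; i < values.size(); ++i)
    if (values[i] == lhs)
      values[i] = kNoValue;
}
```

For example, with values { -1, 0, 0, 1 }, invalidating name 0 clears the
entries for names 1 and 2 and leaves name 3 (equivalent to name 1)
untouched.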


Note the tests may look the same, but they're subtly different in that 
they have different orderings of arguments in an equality test, which is 
important for this testcase.


Bootstrapped and regression tested on x86_64-unknown-linux-gnu and 
applied to the trunk.  Will backport to 4.9 after it bakes a bit on the 
trunk.



Jeff
commit 1afaa5b951f3a5dae5d6c2355c1457c7d175e1c9
Author: Jeff Law 
Date:   Thu Jun 5 12:20:42 2014 -0600

PR tree-optimization/61289
* tree-ssa-threadedge.c (invalidate_equivalences): Remove SRC_MAP and
DST_MAP parameters.   Invalidate by walking all the SSA_NAME_VALUES
looking for those which match LHS.  All callers changed.
(record_temporary_equivalences_from_phis): Remove SRC_MAP and DST_MAP
parameters and code which manipulated them.  All callers changed.
(record_temporary_equivalences_from_stmts_at_dest): Remove SRC_MAP
and DST_MAP parameters.  Simplify invalidation code by just calling
invalidate_equivalences.  All callers changed.
(thread_across_edge): Simplify now that we don't need to maintain
the map of equivalences to invalidate.

PR tree-optimization/61289
* g++.dg/pr61289.C: New test.
* g++.dg/pr61289-2.C: New test.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 4d88dd2..94a30d4 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,17 @@
+2014-06-05  Jeff Law  
+
+   PR tree-optimization/61289
+   * tree-ssa-threadedge.c (invalidate_equivalences): Remove SRC_MAP and
+   DST_MAP parameters.   Invalidate by walking all the SSA_NAME_VALUES
+   looking for those which match LHS.  All callers changed.
+   (record_temporary_equivalences_from_phis): Remove SRC_MAP and DST_MAP
+   parameters and code which manipulated them.  All callers changed.
+   (record_temporary_equivalences_from_stmts_at_dest): Remove SRC_MAP
+   and DST_MAP parameters.  Simplify invalidation code by just calling
+   invalidate_equivalences.  All callers changed.
+   (thread_across_edge): Simplify now that we don't need to maintain
+   the map of equivalences to invalidate.
+
 2014-06-05  Kai Tietz  
Richard Henderson  
 
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 54a4026..5fb5103 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,9 @@
+2014-06-05  Jeff Law  
+
+   PR tree-optimization/61289
+   * g++.dg/pr61289.C: New test.
+   * g++.dg/pr61289-2.C: New test.
+
 2014-06-05  Richard Biener  
Paolo Carlini  
 
diff --git a/gcc/testsuite/g++.dg/pr61289-2.c b/gcc/testsuite/g++.dg/pr61289-2.c
new file mode 100644
index 000..4cc3ebe
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr61289-2.c
@@ -0,0 +1,62 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-exceptions" } */
+struct S
+{
+  inline int fn1 () const { return s; }
+  __attribute__ ((noinline, noclone)) S *fn2 (int);
+  __attribute__ ((noinline, noclone)) void fn3 ();
+  __attribute__ ((noinline, noclone)) static S *fn4 (int);
+  S (int i) : s (i) {}
+  int s;
+};
+
+int a = 0;
+S *b = 0;
+
+S *
+S::fn2 (int i)
+{
+  a++;
+  if (a == 1)
+return b;
+  if (a > 3)
+__builtin_abort ();
+  b = this;
+  return new S (i + s);
+}
+
+S *
+S::fn4 (int i)
+{
+  b = new S (i);
+  return b;
+}
+
+void
+S::fn3 ()
+{
+  delete this;
+}
+
+void
+foo ()
+{
+  S *c = S::fn4 (20);
+  for (int i = 0; i < 2;)
+{
+  S *d = c->fn2 (c->fn1 () + 10);
+  if (c != d)
+{
+  c->fn3 ();
+  c = d;
+  ++i;
+}
+}
+  c->fn3 ();
+}
+
+int
+main ()
+{
+  foo ();
+}
diff --git a/gcc/testsuite/g++.dg/pr61289.C b/gcc/testsuite/g++.dg/pr61289.C
new file mode 100644
index 000..ea7ccea
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr61289.C
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-exceptions" } */
+
+struct S
+{
+  inline int fn1 () const { return s; }
+  __attribute__ ((noinline, noclone)) S *fn2 (int);
+  __attribute__ ((noinline, noclone)) void fn3 ();
+  __attribute__ ((noinline, noclone)) static S *fn4 (int);
+  S (int i)

Re: [patch][gomp4] openacc loops

2014-06-05 Thread Tobias Burnus


Janne Blomqvist wrote:

Fortran does not allow aliasing of dummy arguments,


That's not quite true: It permits aliasing variables (also without 
TARGET or POINTER attribute) – but if you modify one, you may no longer 
access the other, unless they do have the POINTER or TARGET attribute. 
(See below for the formal description.)


so a compiler is allowed to optimize assuming aliasing does not occur. 
The exception is dummy arguments with the POINTER attribute, those can 
alias with other variables having the POINTER or TARGET attributes. So 
an ALLOCATABLE variable can not alias with any other variable, unless 
it has the TARGET attribute.


Well, two variables with TARGET attribute are also permitted to alias.

Tobias

PS: Now the same as above, but using a quote from Fortran 2008:

"12.5.2.13 Restrictions on entities associated with dummy arguments
While an entity is associated with a dummy argument, the following 
restrictions hold.
(1) Action that affects the allocation status of the entity or a 
subobject thereof shall be taken through the dummy argument.
(2) If the allocation status of the entity or a subobject thereof is 
affected through the dummy argument,
then at any time during the invocation and execution of the procedure, 
either before or after the allocation or deallocation, it shall be 
referenced only through the dummy argument.
(3) Action that affects the value of the entity or any subobject of it 
shall be taken only through the dummy argument unless
(a)  the dummy argument has the POINTER attribute or
(b)  the dummy argument has the TARGET attribute, the dummy argument 
does not have INTENT(IN), the dummy argument is a scalar object or an 
assumed-shape array without the CONTIGUOUS attribute, and the actual 
argument is a target other than an array section with a vector subscript.
(4) If the value of the entity or any subobject of it is affected 
through the dummy argument, then at any time during the invocation and 
execution of the procedure, either before or after the definition, it 
may be referenced only through that dummy argument unless
(a) the dummy argument has the POINTER attribute or
(b) the dummy argument has the TARGET attribute, the dummy argument does 
not have INTENT(IN), the dummy argument is a scalar object or an 
assumed-shape array without the CONTIGUOUS attribute, and the actual 
argument is a target other than an array section with a vector subscript."

Re: Fix address space computation in expand_debug_expr

2014-06-05 Thread Jeff Law


On 06/05/14 06:19, Senthil Kumar Selvaraj wrote:



gcc/ChangeLog

2014-06-05  Senthil Kumar Selvaraj  

PR target/52472
* cfgexpand.c (expand_debug_expr): Use address space of nested
 TREE_TYPE for ADDR_EXPR and MEM_REF.

gcc/testsuite/ChangeLog

2014-06-05  Senthil Kumar Selvaraj  

PR target/52472
* gcc.target/avr/pr52472.c: New test.

Thanks.  Installed on your behalf.

Jeff

Re: [PATCH 4/5] add gcc/gdb interface files

2014-06-05 Thread Jeff Law


On 06/04/14 14:39, Tom Tromey wrote:

"Jakub" == Jakub Jelinek  writes:



+GCC_METHOD7 (gcc_decl, build_decl,
+const char */* name */,
+enum gcc_c_symbol_kind /* sym_kind */,
+gcc_type /* sym_type */,
+const char */* substitution_name */,
+gcc_address /* address */,
+const char */* filename */,
+unsigned int /* line_number */)

I must say that I hate the embedded comments in the signatures.
Especially when you end up with something like:


It's not so bad with colorizing but not everybody likes fruit salad.  I
can see how it would be pretty painful without.
Yea, maybe I need different colors, but anytime something colorizes my 
first reaction is to get annoyed because I can't find what I'm looking 
for.  Probably a consequence of my white-on-black terminals not playing 
well with the default colors.


[ OK, not entirely true, when looking at diffs a trailing whitespace 
gets colorized, so I tend not to have those leak through in my own 
patches anymore. ]





Jakub> Why it can't be:
Jakub> GCC_METHOD7 (gcc_decl, build_decl,
Jakub>const char *name,
Jakub>enum gfc_c_symbol_kind sym_kind,
Jakub> ...
Jakub> i.e. provide comments in the form of argument names
Jakub> (sure, you can't use bool for the name of the parameter then...).

It's important that just the types are there.
For example the .def file is used to instantiate C++ templates:

#define GCC_METHOD7(R, N, A, B, C, D, E, F, G) \
   rpc,

Here we can't have a parameter name.
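
To illustrate the constraint with something self-contained (this is not
the actual gcc/gdb interface; DEMO_METHODS, signature, add and
same_text are invented for the example), the same list of methods can
be expanded once as prototypes and once as template arguments.  The
second expansion only works because each macro argument is a bare type:

```cpp
#include <type_traits>

// Hypothetical template that records a method's signature.
template <typename R, typename... Args>
struct signature
{
  using result_type = R;
  static constexpr unsigned arity = sizeof...(Args);
};

// Hypothetical stand-in for the .def file.  Parameter names appear only
// in comments, so every macro argument is just a type.
#define DEMO_METHODS(X)                                  \
  X (int, add, int /* lhs */, int /* rhs */)             \
  X (bool, same_text, const char * /* a */, const char * /* b */)

// Expansion 1: ordinary prototypes.  Named parameters would be fine here.
#define AS_PROTOTYPE(R, N, A, B) R N (A, B);
DEMO_METHODS (AS_PROTOTYPE)
#undef AS_PROTOTYPE

// Expansion 2: template arguments.  "int lhs" in place of "int" would
// not be a valid template argument, so names cannot be part of the list.
#define AS_SIGNATURE(R, N, A, B) using N##_signature = signature<R, A, B>;
DEMO_METHODS (AS_SIGNATURE)
#undef AS_SIGNATURE

static_assert (std::is_same<add_signature::result_type, int>::value,
               "result type comes straight from the list");
```

Writing the first entry as X (int, add, int lhs, int rhs) would still
compile for the prototype expansion but break the template one, which is
the situation described above.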


I chose this commenting approach since it named the parameters, albeit
in comments.

The "/* bool */" comments are there because gdb doesn't have a bool
type, but it still seemed worthwhile to document the intent.


I could drop the names and extend the various introductory comments to
explain argument ordering.  What do you think of that?

Seems reasonable to me.

jeff

Handle MULTILIB_REUSE in auto-generated SYSROOT_SUFFIX_SPEC macro

2014-06-05 Thread Julian Brown

Hi,

The print-sysroot-suffix.sh script that can be used (via the
t-sysroot-suffix makefile fragment) to auto-generate
the SYSROOT_SUFFIX_SPEC macro for non-trivial multilib setups does not
take into account the MULTILIB_REUSE target fragment variable.

I'm not sure of a way to demonstrate how this causes problems with a
vanilla tree, but consider the attached patch
(arm-sysroot-mlib-arrangement-1.diff) intended to create a compiler
with three multilibs:

  .;                      (little-endian, soft float)
  be;@mbig-endian         (big-endian, soft float)
  vfp;@mfloat-abi=softfp  (little-endian, hardware FP)

Notice that we are not building a multilib for the be+vfp combination.
Instead we use the MULTILIB_REUSE macro to make that combination fall
back to using just the soft-float big-endian multilib:

MULTILIB_REUSE = mbig-endian=mbig-endian/mfloat-abi.softfp

But now, compiling code will fail with errors such as:

$ arm-none-linux-gnueabi-gcc hello.c -mbig-endian -mfloat-abi=softfp \
-o hello
../arm-none-linux-gnueabi/bin/ld: 
/path/to/install/arm-none-linux-gnueabi/libc/usr/lib/libc.a(s_signbit.o): 
compiled for a little endian system and target is big endian

Invoking the compiler with -print-sysroot vs. -print-multi-directory
illustrates the problem:

$ arm-none-linux-gnueabi-gcc hello.c -mbig-endian -mfloat-abi=softfp \
-print-sysroot
/path/to/install/arm-none-linux-gnueabi/libc

$ arm-none-linux-gnueabi-gcc hello.c -mbig-endian -mfloat-abi=softfp \
-print-multi-directory
be

What we wanted was for the first command to give the same result that
invoking without -mfloat-abi=softfp does (which was the purpose of the
MULTILIB_REUSE setting):

$ arm-none-linux-gnueabi-gcc hello.c -mbig-endian -print-sysroot
/path/to/install/arm-none-linux-gnueabi/libc/be

but that doesn't work at present. The attached patch fixes that: it's
based on part of CodeSourcery's earlier MULTILIB_ALIASES support
(by Paul Brook originally, I think; I don't think it ever made it
upstream, but it worked quite similarly to MULTILIB_REUSE, which did),
and allows the above multilib arrangement to work correctly.
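
For anyone who finds the sed expressions in the script hard to read, the
rewriting applied to each MULTILIB_REUSE entry can be modelled as below.
This is only a sketch in C++ of the two sed invocations and the echo in
the print-sysroot-suffix.sh hunk; split_reuse_entry and reuse_case_arm
are hypothetical names and nothing like them exists in the tree.

```cpp
#include <algorithm>
#include <string>
#include <utility>

// Model of:  l=`echo $x | sed -e 's/=.*$//' -e 's/\./=/g'`
//            r=`echo $x | sed -e 's/^.*=//' -e 's/\./=/g'`
// The left part ends at the first '=', the right part starts after the
// last '=', and every '.' in either part is turned back into '='.
inline std::pair<std::string, std::string>
split_reuse_entry(const std::string& entry)
{
  std::string left = entry.substr(0, entry.find('='));
  std::string right = entry.substr(entry.rfind('=') + 1);
  std::replace(left.begin(), left.end(), '.', '=');
  std::replace(right.begin(), right.end(), '.', '=');
  return {left, right};
}

// Model of:  echo "/$r/) optstring=\"/$l/\" ;;"
// i.e. the shell case arm that maps the reused option set onto the
// multilib that is actually built.
inline std::string
reuse_case_arm(const std::string& entry)
{
  const std::pair<std::string, std::string> parts = split_reuse_entry(entry);
  return "/" + parts.second + "/) optstring=\"/" + parts.first + "/\" ;;";
}
```

For the MULTILIB_REUSE value used in this example, the generated arm
rewrites the option string /mbig-endian/mfloat-abi=softfp/ to
/mbig-endian/, so the lookup lands on the be multilib.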

OK for mainline? (The ARM bits are for reference only and are not meant
to be committed, of course.)

Thanks,

Julian

ChangeLog

gcc/
* config/print-sysroot-suffix.sh: Handle MULTILIB_REUSE settings.
* config/t-sysroot-suffix (sysroot-suffix.h): Pass MULTILIB_REUSE
to print-sysroot-suffix.sh script.
Index: gcc/config.gcc
===
--- gcc/config.gcc	(revision 210209)
+++ gcc/config.gcc	(working copy)
@@ -1014,7 +1014,9 @@ arm*-*-linux-*)			# ARM GNU/Linux with E
 	;;
 	esac
 	tmake_file="${tmake_file} arm/t-arm arm/t-arm-elf arm/t-bpabi arm/t-linux-eabi"
+	tmake_file="$tmake_file t-sysroot-suffix"
 	tm_file="$tm_file arm/bpabi.h arm/linux-eabi.h arm/aout.h arm/arm.h"
+	tm_file="$tm_file ./sysroot-suffix.h"
 	# Define multilib configuration for arm-linux-androideabi.
 	case ${target} in
 	*-androideabi)
Index: gcc/config/arm/t-linux-eabi
===
--- gcc/config/arm/t-linux-eabi	(revision 210209)
+++ gcc/config/arm/t-linux-eabi	(working copy)
@@ -20,8 +20,15 @@
 # CLEAR_INSN_CACHE in linux-gas.h does not work in Thumb mode.
 # If you set MULTILIB_OPTIONS to a non-empty value you should also set
 # MULTILIB_DEFAULTS in linux-elf.h.
-MULTILIB_OPTIONS	=
-MULTILIB_DIRNAMES	=
+MULTILIB_OPTIONS	= mbig-endian mfloat-abi=softfp
+MULTILIB_DIRNAMES	= be vfp
+MULTILIB_OSDIRNAMES	= mbig-endian=!be mfloat-abi.softfp=!vfp
+MULTILIB_MATCHES	=
+MULTILIB_EXCEPTIONS	=
+
+MULTILIB_REUSE		= mbig-endian=mbig-endian/mfloat-abi.softfp
+
+MULTILIB_REQUIRED	= mbig-endian mfloat-abi=softfp
 
 #MULTILIB_OPTIONS += mcpu=fa606te/mcpu=fa626te/mcpu=fmp626/mcpu=fa726te
 #MULTILIB_DIRNAMES+= fa606te fa626te fmp626 fa726te
Index: gcc/config/print-sysroot-suffix.sh
===
Index: gcc/config/t-sysroot-suffix
===
--- gcc/config/print-sysroot-suffix.sh	(revision 210209)
+++ gcc/config/print-sysroot-suffix.sh	(working copy)
@@ -29,6 +29,7 @@
 #  MULTILIB_OSDIRNAMES \
 #  MULTILIB_OPTIONS \
 #  MULTILIB_MATCHES \
+#  MULTILIB_REUSE
 #  > t-sysroot-suffix.h
 
 # The three options exactly correspond to the variables of the same
@@ -54,6 +55,7 @@ set -e
 dirnames="$1"
 options="$2"
 matches="$3"
+reuse="$4"
 
 cat > print-sysroot-suffix3.sh <<\EOF
 #! /bin/sh
@@ -80,7 +82,14 @@ shift 2
 n="\" \\
 $padding\""
 if [ $# = 0 ]; then
+  case $optstring in
 EOF
+for x in $reuse; do
+  l=`echo $x | sed -e 's/=.*$//' -e 's/\./=/g'`
+  r=`echo $x | sed -e 's/^.*=//' -e 's/\./=/g'`
+  echo "/$r/) optstring=\"/$l/\" ;;" >> print-sysroot-suffix2.sh
+done
+echo "  esac" >> print-sysroot-suffix2.sh
 
 pat=
 for x in $dirnames; do
--- gcc/config/t-sysroot-suff

Re: [PATCH 4/5] add gcc/gdb interface files

2014-06-05 Thread Jakub Jelinek

On Thu, Jun 05, 2014 at 01:23:37PM -0600, Jeff Law wrote:
> >>>+GCC_METHOD7 (gcc_decl, build_decl,
> >>>+   const char */* name */,
> >>>+   enum gcc_c_symbol_kind /* sym_kind */,
> >>>+   gcc_type /* sym_type */,
> >>>+   const char */* substitution_name */,
> >>>+   gcc_address /* address */,
> >>>+   const char */* filename */,
> >>>+   unsigned int /* line_number */)
> 
> [ OK, not entirely true, when looking at diffs a trailing whitespace
> gets colorized, so I tend not to have those leak through in my own
> patches anymore. ]

Another possibility would be to give the macros twice as many arguments
as there are parameters and just throw the odd arguments away when
expanding to the template parameters.  That would mean you write
GCC_METHOD7 (gcc_decl, build_decl,
 const char *, name,
 enum gcc_c_symbol_kind, sym_kind,
 gcc_type, sym_type,
 const char *, substitution_name,
 gcc_address, address,
 const char *, filename,
 unsigned int, line_number);
though it isn't very nice either, guess I'm bikeshedding... ;)
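
To make the idea concrete, here is a small C sketch of a macro that takes
(type, name) pairs and discards the names on expansion. SKETCH_METHOD2 and
sketch_add are invented for this example; the real GCC_METHODn macros
expand differently depending on where the interface file is included.

```c
/* Hypothetical two-parameter analogue of the GCC_METHODn macros: every
   parameter is written as a (type, name) pair, and this expansion keeps
   only the types.  The names serve purely as documentation.  */
#define SKETCH_METHOD2(R, N, T1, N1, T2, N2) \
  R N (T1, T2);

/* Expands to: int sketch_add (int, int);  */
SKETCH_METHOD2 (int, sketch_add,
                int, lhs,
                int, rhs)

/* A definition matching the generated prototype.  */
int
sketch_add (int lhs, int rhs)
{
  return lhs + rhs;
}
```

The cost is the one already noted: the argument list doubles in length, and
nothing checks that the documentary names stay in sync with the definition.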

Jakub

Re: [PATCH 5/5] add libcc1

2014-06-05 Thread Tom Tromey

> "Mike" == Mike Stump writes:

Mike> On May 16, 2014, at 11:48 AM, Tom Tromey wrote:
>> This patch adds the plugin to the gcc tree

Mike> So, this code isn’t as portable as gcc (I can run a native gcc
Mike> bootstrap on my binutils sim simulator for my target, and I’m a newlib
Mike> target), so it needs autoconf to explain if enough features are
Mike> present; sockets I think would be one of the many things that would
Mike> kill my build for example.

If you could enumerate the things you think are necessary to check, I
can arrange for the plugin to not be built if those are not available.

I added a check like this for socketpair to my tree.  I also plan to
deal with the lack of plugin functionality as Joseph pointed out.

I suppose I will add a check for fork.  How about pipe?  select?

Tom

Re: [PATCH] Fix logic for detection of zero initializer (PR c/53119)

2014-06-05 Thread Jeff Law


On 06/04/14 10:47, S. Gilles wrote:

PR c/53119
c/
* c-typeck.c (push_init_level, process_init_element,
pop_init_level): Correct check for zero initialization, move
missing brace warning to respect zero initialization.

* gcc.dg/pr53119.c: New testcase.

Thanks to both of you for the patch and review work.  I've installed the
patch on the trunk.

Jeff

[Google/4_8] Reduce memory overhead of LIPO COMDAT fixups

2014-06-05 Thread Teresa Johnson

(cc'ing a few additional people to help with review as David is out
and I'm not sure Rong is available)

This patch greatly reduces the memory overhead of the new COMDAT fixup
analysis, by changing the second level hash tables to linked lists.
I found that almost none of the second level hash tables contained more
than one entry, but each hash table required a lot of memory overhead.
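
For readers who want the data-structure change in isolation, here is a
minimal C sketch of find-or-prepend on a singly linked list keyed by a
checksum. The struct and function names (sketch_alias_info, sketch_find,
sketch_insert) and the num_aliases counter are simplifications invented for
this example; the real code in dyn-ipa.c keeps a list of checksum_alias
structs per entry rather than a count.

```c
#include <stdlib.h>

/* Hypothetical stand-in for checksum_alias_info: one node per distinct
   cfg_checksum, with a counter in place of the real alias list.  */
struct sketch_alias_info
{
  struct sketch_alias_info *next;
  unsigned int cfg_checksum;
  unsigned int num_aliases;
};

/* Return the node in LIST whose checksum equals CFG_CHECKSUM, or NULL.  */
static struct sketch_alias_info *
sketch_find (struct sketch_alias_info *list, unsigned int cfg_checksum)
{
  for (; list != NULL; list = list->next)
    if (list->cfg_checksum == cfg_checksum)
      return list;
  return NULL;
}

/* Record one more alias for CFG_CHECKSUM and return the (possibly new)
   head of the list.  */
static struct sketch_alias_info *
sketch_insert (struct sketch_alias_info *list, unsigned int cfg_checksum)
{
  struct sketch_alias_info *info = sketch_find (list, cfg_checksum);

  if (info != NULL)
    {
      info->num_aliases++;
      return list;
    }

  info = malloc (sizeof *info);
  if (info == NULL)
    return list;   /* Allocation failed; leave the list unchanged.  */
  info->next = list;
  info->cfg_checksum = cfg_checksum;
  info->num_aliases = 1;
  return info;
}
```

With lists that almost always hold a single node, the linear search costs
next to nothing, while each node needs only one extra pointer instead of a
whole hash table.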

I also now query the fixup type before adding the checksums to the
pointer sets during callgraph building, which would have enabled a workaround
for the memory issue.

Tested with regression tests and internal tests (see ref below for details).

Google ref b/15415042.

2014-06-05  Teresa Johnson  

* dyn-ipa.c (struct lineno_checksum_alias): Replaced pointer set.
(struct checksum_alias_info): Enabled linked list.
(cfg_checksum_get_key): Removed.
(find_cfg_checksum): New function.
(cfg_checksum_insert): Operate on linked list.
(checksum_set_insert): Ditto.
(gcov_build_callgraph): Allow disabling checksum insertion.
(gcov_find_new_ic_target): Operate on linked list.
(gcov_fixup_counters_checksum): Ditto.
(gcov_fixup_counters_lineno): Ditto.
(__gcov_compute_module_groups): Compute fixup type earlier.

Index: dyn-ipa.c
===
--- dyn-ipa.c   (revision 211288)
+++ dyn-ipa.c   (working copy)
@@ -79,16 +79,15 @@ struct dyn_cgraph
   unsigned num_nodes_executed;
   /* used by new algorithm  */
   struct modu_node *modu_nodes;
-  /* Set indexed by lineno_checksum, returns another dyn_pointer_set*,
- indexed by cfg_checksum.  That returns a checksum_alias_info struct.  */
+  /* Set indexed by lineno_checksum, returns a linked list of
+ checksum_alias_info structs.  */
   struct dyn_pointer_set *lineno_pointer_sets;
 };

 /* Struct holding information for functions with the same lineno_checksum.  */
 struct lineno_checksum_alias
 {
-  /* Set indexed by cfg_checksum, holding a checksum_alias_info struct.  */
-  struct dyn_pointer_set *cfg_pointer_set;
+  struct checksum_alias_info *cfg_checksum_list;
   unsigned lineno_checksum;
 };

@@ -96,6 +95,7 @@ struct lineno_checksum_alias
checksums.  */
 struct checksum_alias_info
 {
+  struct checksum_alias_info *next_cfg_checksum;
   struct checksum_alias *alias_list;
   unsigned cfg_checksum;
 };
@@ -205,6 +205,7 @@ pointer_set_create (unsigned (*get_key) (const voi
 static struct dyn_cgraph the_dyn_call_graph;
 static int total_zero_count = 0;
 static int total_insane_count = 0;
+static int fixup_type = 0;

 enum GROUPING_ALGORITHM
 {
@@ -374,14 +375,6 @@ lineno_checksum_get_key (const void *p)
   return ((const struct lineno_checksum_alias *) p)->lineno_checksum;
 }

-/* The cfg_checksum value in P is the key for a cfg_pointer_set.  */
-
-static inline unsigned
-cfg_checksum_get_key (const void *p)
-{
-  return ((const struct checksum_alias_info *) p)->cfg_checksum;
-}
-
 /* Create a new checksum_alias struct for function with GUID, FI_PTR,
and ZERO_COUNTS flag.  Prepends to list NEXT and returns new struct.  */

@@ -398,28 +391,44 @@ new_checksum_alias (gcov_type guid, const struct g
   return alias;
 }

-/* Insert a new checksum_alias struct into pointer set P for function with
+/* Locate the checksum_alias_info in LIST that matches CFG_CHECKSUM.  */
+
+static struct checksum_alias_info *
+find_cfg_checksum (struct checksum_alias_info *list, unsigned cfg_checksum)
+{
+  for (; list; list = list->next_cfg_checksum)
+{
+  if (list->cfg_checksum == cfg_checksum)
+return list;
+}
+  return NULL;
+}
+
+/* Insert a new checksum_alias struct into LIST for function with
CFG_CHECKSUM and associated GUID, FI_PTR, and ZERO_COUNTS flag.  */

-static void
-cfg_checksum_set_insert (struct dyn_pointer_set *p, unsigned cfg_checksum,
- gcov_type guid, const struct gcov_fn_info *fi_ptr,
- int zero_counts)
+static struct checksum_alias_info *
+cfg_checksum_insert (unsigned cfg_checksum, gcov_type guid,
+ const struct gcov_fn_info *fi_ptr, int zero_counts,
+ struct checksum_alias_info *list)
 {
-  struct checksum_alias_info **m = (struct checksum_alias_info **)
-pointer_set_find_or_insert (p, cfg_checksum);
-  if (*m)
+  struct checksum_alias_info *alias_info;
+  alias_info = find_cfg_checksum (list, cfg_checksum);
+  if (alias_info)
 {
-  gcc_assert ((*m)->alias_list);
-  (*m)->alias_list = new_checksum_alias (guid, fi_ptr, zero_counts,
- (*m)->alias_list);
+  gcc_assert (alias_info->alias_list);
+  alias_info->alias_list = new_checksum_alias (guid, fi_ptr, zero_counts,
+   alias_info->alias_list);
+  return list;
 }
   else
 {
-  *m = XNEW (struct checksum_alias_info);
-  (*m)->cfg_checksum = cfg_check

[PATCH, rs6000][trunk, 4.9, 4.8] Fix PR target/61415, long double 128 issues

2014-06-05 Thread Peter Bergner

PR61415 shows a problem for two test cases that should only be tested if the
target supports a 128-bit long double.  In addition, the 128-bit long double
pack and unpack builtins should not be enabled unless the target supports
128-bit long double.  The following patch accomplishes that, as well as
removing the unused (and redundant) builtins __builtin_longdouble_dw0 and
__builtin_longdouble_dw1.
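
The gating idea can be sketched in a few lines of C. This assumes that a
builtin is usable only when every bit in its required mask is also present
in the mask computed for the current target options; the SKETCH_ names and
bit values below are invented for illustration and are not the real
RS6000_BTM_* definitions.

```c
/* Hypothetical mask bits, standing in for RS6000_BTM_HARD_FLOAT and the
   new RS6000_BTM_LDBL128; the values are illustrative only.  */
#define SKETCH_BTM_HARD_FLOAT 0x1u
#define SKETCH_BTM_LDBL128    0x2u

/* Compute the mask of features available for the given option state.  */
static unsigned int
sketch_builtin_mask (int hard_float, int long_double_128)
{
  return ((hard_float ? SKETCH_BTM_HARD_FLOAT : 0u)
          | (long_double_128 ? SKETCH_BTM_LDBL128 : 0u));
}

/* Assumed rule: a builtin is enabled only if all of its REQUIRED bits
   are present in AVAILABLE.  */
static int
sketch_builtin_enabled (unsigned int required, unsigned int available)
{
  return (required & available) == required;
}
```

Under that rule, a builtin requiring both bits is rejected when
-mlong-double-64 clears the long double bit, so the compiler can report it
as invalid rather than trying to expand it.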

Is this ok for trunk assuming my powerpc64le-linux bootstrap and regtesting
are clean?

Is this also ok for the FSF 4.9 and FSF 4.8 branches?  Without the gcc/
changes, we hit an ICE whenever we call __builtin_pack_longdouble and
__builtin_unpack_longdouble when -mlong-double-64 is in effect.

Peter


gcc/
PR target/61415
* config/rs6000/rs6000-builtin.def (BU_MISC_1): Delete.
(BU_MISC_2): Rename to ...
(BU_LDBL128_2): ... this.
* config/rs6000/rs6000.h (RS6000_BTM_LDBL128): New define.
(RS6000_BTM_COMMON): Add RS6000_BTM_LDBL128.
* config/rs6000/rs6000.c (rs6000_builtin_mask_calculate): Handle
RS6000_BTM_LDBL128.
(rs6000_invalid_builtin): Add long double 128-bit builtin support.
(rs6000_builtin_mask_names): Add RS6000_BTM_LDBL128.
* config/rs6000/rs6000.md (unpacktf_0): Remove define_expand.
(unpacktf_1): Likewise.
* doc/extend.texi (__builtin_longdouble_dw0): Remove documentation.
(__builtin_longdouble_dw1): Likewise.
* doc/sourcebuild.texi (longdouble128): Document.

gcc/testsuite/
PR target/61415
* lib/target-supports.exp (check_effective_target_longdouble128): New.
* gcc.target/powerpc/pack02.c: Use it.
* gcc.target/powerpc/tfmode_off.c: Likewise.

Index: gcc/config/rs6000/rs6000-builtin.def
===
--- gcc/config/rs6000/rs6000-builtin.def(revision 211281)
+++ gcc/config/rs6000/rs6000-builtin.def(working copy)
@@ -622,20 +622,13 @@
 | RS6000_BTC_TERNARY), \
CODE_FOR_ ## ICODE) /* ICODE */
 
-/* Miscellaneous builtins.  */
-#define BU_MISC_1(ENUM, NAME, ATTR, ICODE) \
+/* 128-bit long double floating point builtins.  */
+#define BU_LDBL128_2(ENUM, NAME, ATTR, ICODE)  \
   RS6000_BUILTIN_2 (MISC_BUILTIN_ ## ENUM, /* ENUM */  \
"__builtin_" NAME,  /* NAME */  \
-   RS6000_BTM_HARD_FLOAT,  /* MASK */  \
+   (RS6000_BTM_HARD_FLOAT  /* MASK */  \
+| RS6000_BTM_LDBL128), \
(RS6000_BTC_ ## ATTR/* ATTR */  \
-| RS6000_BTC_UNARY),   \
-   CODE_FOR_ ## ICODE) /* ICODE */
-
-#define BU_MISC_2(ENUM, NAME, ATTR, ICODE) \
-  RS6000_BUILTIN_2 (MISC_BUILTIN_ ## ENUM, /* ENUM */  \
-   "__builtin_" NAME,  /* NAME */  \
-   RS6000_BTM_HARD_FLOAT,  /* MASK */  \
-   (RS6000_BTC_ ## ATTR/* ATTR */  \
 | RS6000_BTC_BINARY),  \
CODE_FOR_ ## ICODE) /* ICODE */
 
@@ -1593,10 +1586,8 @@
 BU_DFP_MISC_2 (PACK_TD,"pack_dec128",  CONST,  packtd)
 BU_DFP_MISC_2 (UNPACK_TD,  "unpack_dec128",CONST,  unpacktd)
 
-BU_MISC_2 (PACK_TF,"pack_longdouble",  CONST,  packtf)
-BU_MISC_2 (UNPACK_TF,  "unpack_longdouble",CONST,  unpacktf)
-BU_MISC_1 (UNPACK_TF_0,"longdouble_dw0",   CONST,  unpacktf_0)
-BU_MISC_1 (UNPACK_TF_1,"longdouble_dw1",   CONST,  unpacktf_1)
+BU_LDBL128_2 (PACK_TF, "pack_longdouble",  CONST,  packtf)
+BU_LDBL128_2 (UNPACK_TF,   "unpack_longdouble",CONST,  unpacktf)
 
 BU_P7_MISC_2 (PACK_V1TI,   "pack_vector_int128",   CONST,  packv1ti)
 BU_P7_MISC_2 (UNPACK_V1TI, "unpack_vector_int128", CONST,  unpackv1ti)
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 211281)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -3041,7 +3041,8 @@
  | ((TARGET_CRYPTO)? RS6000_BTM_CRYPTO: 0)
  | ((TARGET_HTM)   ? RS6000_BTM_HTM   : 0)
  | ((TARGET_DFP)   ? RS6000_BTM_DFP   : 0)
- | ((TARGET_HARD_FLOAT)? RS6000_BTM_HARD_FLOAT : 0));
+ | ((TARGET_HARD_FLOAT)? RS6000_BTM_HARD_FLOAT : 0)
+ | ((TARGET_LONG_DOUBLE_128)   ? RS6000_BTM_LDBL128 : 0));
 }
 
 /* Override command line options.  Mostly we process the processor type and

Re: [patch, mips, tree] align microMIPS functions to 16 bits with -Os

2014-06-05 Thread Sandra Loosemore


On 06/05/2014 01:39 AM, Richard Biener wrote:


[snip]

Ok, we definitely need to preserve that (documented) behavior.  I suppose
it also sets DECL_USER_ALIGN.  -falign-functions is probably another
setter of DECL_ALIGN here.

If we add a target hook that may adjust function alignment then it
has to honor any user set alignment, then -falign-functions and
then it may only increase alignment over the default FUNCTION_BOUNDARY.

The point to adjust alignment with the hook may still be output time,
but as we figured it can't simply ignore DECL_ALIGN.


Hmm.  The MIPS tweak we want here is to decrease the alignment for 
certain functions, so I suppose this means we'd have to go about it 
backwards by making FUNCTION_BOUNDARY 16 rather than 32 (at least for 
-Os) and then later increasing it back to 32 for all non-microMIPS 
functions.
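
One way to read the combined proposal is sketched below in C: explicit
requests are left alone, and otherwise the hook starts from a 16-bit
default and only ever raises it. The struct, field and function names are
invented for this example and do not correspond to any existing GCC hook;
in particular, treating a user or -falign-functions value as final is an
assumption about what "honor" means here.

```c
/* Hypothetical per-function state; all alignments are in bits.  */
struct sketch_fn
{
  unsigned int user_align;    /* Explicit user alignment, 0 if none.  */
  unsigned int option_align;  /* From -falign-functions, 0 if unset.  */
  int micromips;              /* Nonzero for microMIPS functions.  */
};

/* Assumed default: FUNCTION_BOUNDARY lowered to 16 (e.g. for -Os).  */
#define SKETCH_FUNCTION_BOUNDARY 16u

static unsigned int
sketch_function_alignment (const struct sketch_fn *fn)
{
  unsigned int align = SKETCH_FUNCTION_BOUNDARY;

  /* Explicit requests win, user alignment first.  */
  if (fn->user_align != 0)
    return fn->user_align;
  if (fn->option_align != 0)
    return fn->option_align;

  /* Otherwise only increase: non-microMIPS code goes back to 32.  */
  if (!fn->micromips)
    align = 32u;
  return align;
}
```

In this model the hook never lowers anything, which is what lets it respect
DECL_ALIGN while still giving microMIPS functions the smaller boundary.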


Richard S., WDYT?  Shall I hack up another patch that does it that way, 
or do you think that's still the wrong way to go about it?


-Sandra

RFA: Small tweak to ira-lives.c:single_reg_class

2014-06-05 Thread Richard Sandiford

I'm about to post a series of patches that reworks the handling of
standard constraints.  As part of that I needed to make single_reg_class
handle "extra" constraints in a similar way to the standard ones.
It's not a particularly worthwhile change in itself -- not enough to justify
this long essay -- but I split it out because it's the only part of the
series that changes codegen.

The function looks like this:

    case 'i':
      if (CONSTANT_P (op)
          || (equiv_const != NULL_RTX && CONSTANT_P (equiv_const)))
        return NO_REGS;
      break;

    case 'n':
      if (CONST_SCALAR_INT_P (op)
          || (equiv_const != NULL_RTX && CONST_SCALAR_INT_P (equiv_const)))
        return NO_REGS;
      break;

    case 's':
      if ((CONSTANT_P (op) && !CONST_SCALAR_INT_P (op))
          || (equiv_const != NULL_RTX
              && CONSTANT_P (equiv_const)
              && !CONST_SCALAR_INT_P (equiv_const)))
        return NO_REGS;
      break;

    case 'I':
    case 'J':
    case 'K':
    case 'L':
    case 'M':
    case 'N':
    case 'O':
    case 'P':
      if ((CONST_INT_P (op)
           && CONST_OK_FOR_CONSTRAINT_P (INTVAL (op), c, constraints))
          || (equiv_const != NULL_RTX
              && CONST_INT_P (equiv_const)
              && CONST_OK_FOR_CONSTRAINT_P (INTVAL (equiv_const),
                                            c, constraints)))
        return NO_REGS;
      break;

    case 'E':
    case 'F':
      if (CONST_DOUBLE_AS_FLOAT_P (op)
          || (GET_CODE (op) == CONST_VECTOR
              && GET_MODE_CLASS (GET_MODE (op)) == MODE_VECTOR_FLOAT)
          || (equiv_const != NULL_RTX
              && (CONST_DOUBLE_AS_FLOAT_P (equiv_const)
                  || (GET_CODE (equiv_const) == CONST_VECTOR
                      && (GET_MODE_CLASS (GET_MODE (equiv_const))
                          == MODE_VECTOR_FLOAT)))))
        return NO_REGS;
      break;

    case 'G':
    case 'H':
      if ((CONST_DOUBLE_AS_FLOAT_P (op)
           && CONST_DOUBLE_OK_FOR_CONSTRAINT_P (op, c, constraints))
          || (equiv_const != NULL_RTX
              && CONST_DOUBLE_AS_FLOAT_P (equiv_const)
              && CONST_DOUBLE_OK_FOR_CONSTRAINT_P (equiv_const,
                                                   c, constraints)))
        return NO_REGS;
      /* ??? what about memory */
    case 'r':
    case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
    case 'h': case 'j': case 'k': case 'l':
    case 'q': case 't': case 'u':
    case 'v': case 'w': case 'x': case 'y': case 'z':
    case 'A': case 'B': case 'C': case 'D':
    case 'Q': case 'R': case 'S': case 'T': case 'U':
    case 'W': case 'Y': case 'Z':
      next_cl = (c == 'r'
                 ? GENERAL_REGS
                 : REG_CLASS_FROM_CONSTRAINT (c, constraints));
      if (cl == NO_REGS
          ? ira_class_singleton[next_cl][GET_MODE (op)] < 0
          : (ira_class_singleton[cl][GET_MODE (op)]
             != ira_class_singleton[next_cl][GET_MODE (op)]))
        return NO_REGS;
      cl = next_cl;
      break;
      [...]
    default:
      return NO_REGS;

So for known constant constraints we check whether OP or its equivalent
constant satisfies the constraint and return NO_REGS if so.  I'd like to
extend this behaviour to the extra constraints, since some targets match
constants (often symbolic or unspec-based constants) there too.

The code that checks next_cl effectively assumes that the constraint is
always a register constraint.  If it's something else, next_cl will be
NO_REGS, which isn't a singleton class, so we'll return NO_REGS regardless
of what type of constraint we're matching or what OP is.  (In principle
this includes register constraints that are disabled on the current subtarget,
since they'll have a next_cl of NO_REGS too.)

In order to handle extra constant constraints as described above,
we'd need to ignore cases where next_cl itself is NO_REGS.  This brings
me on to memory and address constraints...

The comment says:

  /* ??? what about memory */

At the moment we return NO_REGS for target-independent memory constraints
like "m", "o" and "g", because of the default case.  The handling of
next_cl means that we effectively do the same for extra memory constraints,
since next_cl is always NO_REGS for them.  That seems reasonable to me
and I'm not trying to change it here.  The patch just makes the current
choice explicit by checking for extra memory constraints.

Likewise we return NO_REGS for 'p' and (indirectly) for extra address
constraints.  This too makes sense, since I don't think we support
a singleton BASE_REG_CLASS.  Again the patch makes that explicit.

So all in all, the patch o

Re: [Google/4_8] Reduce memory overhead of LIPO COMDAT fixups

2014-06-05 Thread Rong Xu

This patch looks good to me.

-Rong

On Thu, Jun 5, 2014 at 12:45 PM, Teresa Johnson wrote:
> (cc'ing a few additional people to help with review as David is out
> and I'm not sure Rong is available)
>
> This patch greatly reduces the memory overhead of the new COMDAT fixup
> analysis, by changing the second level hash tables to linked lists.
> I found that almost none of the second level hash tables contained more
> than one entry, but each hash table required a lot of memory overhead.
>
> I also now query the fixup type before adding the checksums to the
> pointer sets during callgraph building, which would have enabled a workaround
> for the memory issue.
>
> Tested with regression tests and internal tests (see ref below for details).
>
> Google ref b/15415042.
>
> 2014-06-05  Teresa Johnson  
>
> * dyn-ipa.c (struct lineno_checksum_alias): Replaced pointer set.
> (struct checksum_alias_info): Enabled linked list.
> (cfg_checksum_get_key): Removed.
> (find_cfg_checksum): New function.
> (cfg_checksum_insert): Operate on linked list.
> (checksum_set_insert): Ditto.
> (gcov_build_callgraph): Allow disabling checksum insertion.
> (gcov_find_new_ic_target): Operate on linked list.
> (gcov_fixup_counters_checksum): Ditto.
> (gcov_fixup_counters_lineno): Ditto.
> (__gcov_compute_module_groups): Compute fixup type earlier.
>
> Index: dyn-ipa.c
> ===
> --- dyn-ipa.c   (revision 211288)
> +++ dyn-ipa.c   (working copy)
> @@ -79,16 +79,15 @@ struct dyn_cgraph
>unsigned num_nodes_executed;
>/* used by new algorithm  */
>struct modu_node *modu_nodes;
> -  /* Set indexed by lineno_checksum, returns another dyn_pointer_set*,
> - indexed by cfg_checksum.  That returns a checksum_alias_info struct.  */
> +  /* Set indexed by lineno_checksum, returns a linked list of
> + checksum_alias_info structs.  */
>struct dyn_pointer_set *lineno_pointer_sets;
>  };
>
>  /* Struct holding information for functions with the same lineno_checksum.  
> */
>  struct lineno_checksum_alias
>  {
> -  /* Set indexed by cfg_checksum, holding a checksum_alias_info struct.  */
> -  struct dyn_pointer_set *cfg_pointer_set;
> +  struct checksum_alias_info *cfg_checksum_list;
>unsigned lineno_checksum;
>  };
>
> @@ -96,6 +95,7 @@ struct lineno_checksum_alias
> checksums.  */
>  struct checksum_alias_info
>  {
> +  struct checksum_alias_info *next_cfg_checksum;
>struct checksum_alias *alias_list;
>unsigned cfg_checksum;
>  };
> @@ -205,6 +205,7 @@ pointer_set_create (unsigned (*get_key) (const voi
>  static struct dyn_cgraph the_dyn_call_graph;
>  static int total_zero_count = 0;
>  static int total_insane_count = 0;
> +static int fixup_type = 0;
>
>  enum GROUPING_ALGORITHM
>  {
> @@ -374,14 +375,6 @@ lineno_checksum_get_key (const void *p)
>return ((const struct lineno_checksum_alias *) p)->lineno_checksum;
>  }
>
> -/* The cfg_checksum value in P is the key for a cfg_pointer_set.  */
> -
> -static inline unsigned
> -cfg_checksum_get_key (const void *p)
> -{
> -  return ((const struct checksum_alias_info *) p)->cfg_checksum;
> -}
> -
>  /* Create a new checksum_alias struct for function with GUID, FI_PTR,
> and ZERO_COUNTS flag.  Prepends to list NEXT and returns new struct.  */
>
> @@ -398,28 +391,44 @@ new_checksum_alias (gcov_type guid, const struct g
>return alias;
>  }
>
> -/* Insert a new checksum_alias struct into pointer set P for function with
> +/* Locate the checksum_alias_info in LIST that matches CFG_CHECKSUM.  */
> +
> +static struct checksum_alias_info *
> +find_cfg_checksum (struct checksum_alias_info *list, unsigned cfg_checksum)
> +{
> +  for (; list; list = list->next_cfg_checksum)
> +{
> +  if (list->cfg_checksum == cfg_checksum)
> +return list;
> +}
> +  return NULL;
> +}
> +
> +/* Insert a new checksum_alias struct into LIST for function with
> CFG_CHECKSUM and associated GUID, FI_PTR, and ZERO_COUNTS flag.  */
>
> -static void
> -cfg_checksum_set_insert (struct dyn_pointer_set *p, unsigned cfg_checksum,
> - gcov_type guid, const struct gcov_fn_info *fi_ptr,
> - int zero_counts)
> +static struct checksum_alias_info *
> +cfg_checksum_insert (unsigned cfg_checksum, gcov_type guid,
> + const struct gcov_fn_info *fi_ptr, int zero_counts,
> + struct checksum_alias_info *list)
>  {
> -  struct checksum_alias_info **m = (struct checksum_alias_info **)
> -pointer_set_find_or_insert (p, cfg_checksum);
> -  if (*m)
> +  struct checksum_alias_info *alias_info;
> +  alias_info = find_cfg_checksum (list, cfg_checksum);
> +  if (alias_info)
>  {
> -  gcc_assert ((*m)->alias_list);
> -  (*m)->alias_list = new_checksum_alias (guid, fi_ptr, zero_counts,
> - (*m)->alias_l

Re: [RS6000] PR61300 K&R incoming args

2014-06-05 Thread Jeff Law


On 05/29/14 00:55, Alan Modra wrote:

One of the nice features of the ELFv2 ABI is that stack frames are
smaller compared to ELFv1.  We don't allocate a parameter save area
unless we actually use it.  However, for variable argument lists, we
kept the simple va_list type which is a pointer to the memory location
of the next parameter.  This means calls to variable argument list
functions must allocate the parameter save area, and hence calls to
unprototyped functions must also do so.

The wrinkle with K&R style C functions is that function *definitions*
may be unprototyped.  So when compiling a function body we can't use
!prototype_p() to say we have a parameter save area.  A call in some
other compilation unit might be prototyped and so not allocate a
parameter save area.  Another consequence of unprototyped function
definitions is that the return type and argument types may not be
available on the function type node.  Instead you need to look at the
return and arguments on the function decl.

Now, function.c always passes a decl to REG_PARM_STACK_SPACE, but
calls.c sometimes passes a decl and sometimes a type.  This latter
fact makes it necessary, I think, to define an
INCOMING_REG_PARM_STACK_SPACE used by function.c.  You can't blindly
use a decl from calls.c as that falls foul of C++.

The following implements this.  Bootstrapped and regression tested
powerpc64le-linux and powerpc64-linux all langs (except Ada since I
didn't have gnat installed.)  OK to apply?

PR target/61300
* doc/tm.texi.in (INCOMING_REG_PARM_STACK_SPACE): Document.
* doc/tm.texi: Regenerate.
* function.c (INCOMING_REG_PARM_STACK_SPACE): Provide default.
Use throughout in place of REG_PARM_STACK_SPACE.
* config/rs6000/rs6000.c (rs6000_reg_parm_stack_space): Add
"incoming" param.  Pass to rs6000_function_parms_need_stack.
(rs6000_function_parms_need_stack): Add "incoming" param, ignore
prototype_p when incoming.  Use function decl when incoming
to handle K&R style functions.
* config/rs6000/rs6000.h (REG_PARM_STACK_SPACE): Adjust.
(INCOMING_REG_PARM_STACK_SPACE): Define.

Index: gcc/doc/tm.texi.in
===
--- gcc/doc/tm.texi.in  (revision 210919)
+++ gcc/doc/tm.texi.in  (working copy)
@@ -3499,6 +3499,13 @@ which.
  @c above is overfull.  not sure what to do.  --mew 5feb93  did
  @c something, not sure if it looks good.  --mew 10feb93

+@defmac INCOMING_REG_PARM_STACK_SPACE (@var{fndecl})
+Like @code{REG_PARM_STACK_SPACE}, but for incoming register arguments.
+Define this macro if space guaranteed when compiling a function body
+is different to space required when making a call, a situation that
+can arise with unprototyped functions.
+@end defmac

Might be better to clarify ever-so-slightly with "unprototyped function
definitions" or "K&R style function definitions".  Similarly for the
comment for the definition in rs6000.h.

+#if defined (REG_PARM_STACK_SPACE) && !defined (INCOMING_REG_PARM_STACK_SPACE)
+#define INCOMING_REG_PARM_STACK_SPACE REG_PARM_STACK_SPACE
+#endif

Seems like a reasonable way to simplify this stuff a bit.  Certainly
makes it easier to review.

There's a few other references to REG_PARM_STACK_SPACE in comments; can
you review those and change to INCOMING_REG_PARM_STACK_SPACE if necessary?


I'm certainly not a PPC expert, but I can generally see what you're 
doing in rs6000.c.  I'm going to assume you got it right and if not 
you'll address any issues.


OK to apply.

Thanks for your patience,
Jeff
