date:20140108

Re: [patch][i386] Remove code executed only if reload_in_progress (i.e. never)

2014-01-08 Thread Steven Bosscher

On Wed, Jan 8, 2014 at 10:51 PM, Steven Bosscher wrote:
> Hello Uros, and everyone else,
>
> Now that LRA is always used for the i386 targets, reload_in_progress
> is never set so all code conditional on it is now dead. The attached
> patch removes this code.

Bootstrapped & tested on x86_64-unknown-linux-gnu.

Ciao!
Steven

Re: [Patch, i386] Separate Intel processor with expanded ISA

2014-01-08 Thread Kirill Yukhin

Hello Allan,
On 07 Jan 20:54, Allan Sandfeld Jensen wrote:
> On Sunday 29 December 2013, Allan Sandfeld Jensen wrote:
> > The function dispatcher might currently choose functions declared with
> > target("arch=ivybridge") on a Sandy Bridge CPU. This happens because the
> > function is only detected as sandybridge when generated.
This looks like a bug to me. Could you add a testcase as well?

--
Thanks, K

Re: [PATCH] Fix for PR 59524

2014-01-08 Thread Rainer Orth

Jeff Law  writes:

>> gcc/testsuite/ChangeLog
>> +2014-01-08  Balaji V. Iyer  
>> +
>> +   PR testsuite/59524
>> +   * gcc.dg/cilk-plus/cilk-plus.exp: Make sure the cilk keywords tests
>> +   are run only if the Cilk library is available/enabled.
>> +   * g++.dg/cilk-plus/cilk-plus.exp: Likewise.
>> +   * lib/target-supports.exp (check_libcilkrts_available): New function.
>> +
>>
>> Is this Ok for trunk?
> Yes.

Better implement this as an effective-target keyword (cilkplus_runtime
to abstract from the library name?) so it's easily usable in individual
testcases if necessary, and document that in gcc/doc/sourcebuild.texi.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [RFC] libgcov.c re-factoring and offline profile-tool

2014-01-08 Thread Xinliang David Li

On Wed, Jan 8, 2014 at 2:20 PM, Rong Xu  wrote:
> On Wed, Dec 18, 2013 at 9:28 AM, Xinliang David Li  wrote:

  #ifdef L_gcov_merge_ior
  /* The profile merging function that just adds the counters.  It is given
 -   an array COUNTERS of N_COUNTERS old counters and it reads the same number
 -   of counters from the gcov file.  */
 +   an array COUNTERS of N_COUNTERS old counters.
 +   When SRC==NULL, it reads the same number of counters from the gcov file.
 +   Otherwise, it reads from SRC array.  */
  void
 -__gcov_merge_ior (gcov_type *counters, unsigned n_counters)
 +__gcov_merge_ior (gcov_type *counters, unsigned n_counters,
 +  gcov_type *src, unsigned w __attribute__ ((unused)))
>>>
>>> So the new in-memory variants are introduced for merging tool, while libgcc
>>> use gcov_read_counter interface?
>>> Perhaps we can actually just duplicate the functions to avoid runtime to do
>>> all the scaling and in_mem tests it won't need?
>>
>>
>> I thought about this one a little. How about making the interface
>> change conditionally, but still share the implementation?  The merge
>> function bodies mostly remain unchanged and there is no runtime
>> penalty for libgcov.  The new macros can be shared across most of the
>> mergers.
>>
>> #ifdef IN_PROFILE_TOOL
>> #define GCOV_MERGE_EXTRA_ARGS  gcov_type *src, unsigned w
>> #define GCOV_READ_COUNTER  (*(src++) * w)
>> #else
>> #define GCOV_MERGE_EXTRA_ARGS
>> #define GCOV_READ_COUNTER gcov_read_counter ()
>> #endif
>>
>> __gcov_merge_add (gcov_type *counters, unsigned n_counters,
>>   GCOV_MERGE_EXTRA_ARGS)
>> {
>>
>>  for (; n_counters; counters++, n_counters--)
>>   {
>>   *counters += GCOV_READ_COUNTER ;
>>}
>>
>> }
>>
>> thanks,
>
> Personally I don't think the run time test of in_mem will cause any
> issue. This is in profile dumping, so why do we care about a few more
> cycles here? It won't pollute the profile.
>
> If you really don't like that, we can use the above approach, or I can
> hide the logic in gcov_read_counter(), i.e. overload
> gcov_read_counter() in profile_tool. For that, I will need a new
> global variable SRC and set it before calling the merge function.
> I would prefer to keep weight in _gcov_merge_* argument list.
>
> What do you think?

Or perhaps just define an inline wrapper function to read the counter.
This wrapper function takes src as input; if src is NULL, it calls
gcov_read_counter.
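
To make the inline-wrapper idea concrete, here is a minimal sketch in C. It is
an illustration only, not the patch under review: read_counter and merge_add
are hypothetical names, gcov_type is a local typedef, and gcov_read_counter is
replaced by a stub that reads from a fixed array instead of a gcov file.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for libgcov's counter type (an assumption; the real
   gcov_type is defined by GCC).  */
typedef long long gcov_type;

/* Hypothetical stub for gcov_read_counter (): the real function reads
   the next counter from the open gcov file.  Here it walks a fixed
   array so the sketch is self-contained.  */
static const gcov_type fake_file[] = { 10, 20, 30 };
static size_t fake_file_pos;

static gcov_type
gcov_read_counter (void)
{
  return fake_file[fake_file_pos++];
}

/* The wrapper: if an in-memory source is given, take the next value
   from it and advance it; otherwise fall back to the file reader.  */
static inline gcov_type
read_counter (gcov_type **src)
{
  if (*src != NULL)
    return *(*src)++;
  return gcov_read_counter ();
}

/* A merger in the style of __gcov_merge_add, with the weight W kept
   in the argument list.  */
static void
merge_add (gcov_type *counters, unsigned n_counters,
           gcov_type *src, unsigned w)
{
  for (; n_counters; counters++, n_counters--)
    *counters += read_counter (&src) * w;
}
```

With this shape the merge bodies stay shared between the runtime and the
tool; the cost on the runtime side is one NULL test per counter read.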

David

>
> -Rong
>
>>
>> David
>>
>>>
>>> I would suggest going with libgcov.h changes and cleanups first, with
>>> interface changes next
>>> and the gcov-tool is probably quite obvious at the end?
>>> Do you think you can split the patch this way?
>>>
>>> Thanks and sorry for taking long to review. I should have more time again
>>> now.
>>> Honza

Re: [PATCH] Fix PR49718 : allow no_instrument_function attribute in class member definition/declaration

2014-01-08 Thread Jeff Law


On 01/08/14 02:05, Laurent Alfonsi wrote:

 All,

I was looking at PR49718. I have enclosed a simple fix for this bug report.

2014-01-07  Laurent Alfonsi 

 * c-family/c-common.c (handle_no_instrument_function_attribute): Allow
   no_instrument_function attribute in class member
definition/declaration.


Looking at the implementation of the function attributes, I see no
reason anymore to keep this error message.
Let me know if I missed something.
I have also added a testcase in the enclosed patch.

2014-01-07  Laurent Alfonsi 

 PR c++/49718
 * g++.dg/pr49718.C: New

Isn't the idea here that if we've already generated the function body
(presumably with instrumentation) that a no-instrument attribute
appearing on a later declaration won't do anything useful?


jeff

Re: [PATCH] Final removal of mudflap

2014-01-08 Thread Jeff Law


On 01/01/14 04:28, Ryan Hill wrote:

On Sat, 26 Oct 2013 14:41:01 -0600
Jeff Law  wrote:


Here's the final patch to remove mudflap.  Per the multiple
recommendations, it leaves the options as nops and warns for them.


Can you write something about this for changes.html?

It's been so long since I did anything with our web pages, I'm not
entirely sure of proper procedures anymore.


Gerald, does this look OK?




Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.9/changes.html,v
retrieving revision 1.51
diff -c -p -r1.51 changes.html
*** changes.html 8 Jan 2014 06:38:59 -   1.51
--- changes.html 9 Jan 2014 04:55:07 -
***
*** 15,20 
--- 15,23 
  Caveats
  

+ The mudflap runtime checker has been removed.  The mudflap
+ options remain, but do nothing.
+ 
  Support for a number of older systems and recently
  unmaintained or untested target ports of GCC has been declared
  obsolete in GCC 4.9.  Unless there is activity to revive them, the

Re: [PATCH] Fix PR 59631

2014-01-08 Thread Jeff Law


On 01/07/14 08:14, Iyer, Balaji V wrote:

Hello Everyone,
The attached patch will fix the issue reported in PR 59631. The main 
issue was the usage of Cilk spawn [and _Cilk_sync] with -fcilkplus caused an 
ICE. This patch should fix that. The issue was only reported for C++ but the 
issue exists in C compiler also.  This patch fixes both C and C++. A test case 
is also included.

Is this Ok for trunk?

Here are the ChangeLog entries:
+++ gcc/c/ChangeLog
+2014-01-07  Balaji V. Iyer  
+
+   PR c++/59631
+   * c-parser.c (c_parser_postfix_expression): Replaced consecutive if
+   statements with if-elseif statements.

+++ gcc/testsuite/ChangeLog
+2014-01-07  Balaji V. Iyer  
+
+   PR c++/59631
+   * gcc.dg/cilk-plus/cilk-plus.exp: Removed "-fcilkplus" from flags list.
+   * g++.dg/cilk-plus/cilk-plus.exp: Likewise.
+   * c-c++-common/cilk-plus/CK/spawnee_inline.c: Replaced second dg-option
+   with dg-additional-options.
+   * c-c++-common/cilk-plus/CK/varargs_test.c: Likewise.
+   * c-c++-common/cilk-plus/CK/steal_check.c: Likewise.
+   * c-c++-common/cilk-plus/CK/spawner_inline.c: Likewise.
+   * c-c++-common/cilk-plus/CK/spawning_arg.c: Likewise.
+   * c-c++-common/cilk-plus/CK/invalid_spawns.c: Added a dg-options tag.
+   * c-c++-common/cilk-plus/CK/pr59631.c: New testcase.

+++ gcc/cp/ChangeLog
+2014-01-07  Balaji V. Iyer  
+
+   PR c++/59631
+   * parser.c (cp_parser_postfix_expression): Added a new if-statement
+   and replaced an existing if-statement with else-if statement.
+   Changed an existing error message wording to match the one from the C
+   parser.

Yes, this is fine.  One could easily argue this falls within the Cilk+
maintainership and that you don't need approval for it.


jeff

Re: [RFA][PATCH][middle-end/53623] Improve extension elimination

2014-01-08 Thread Jeff Law


On 01/08/14 16:13, Jakub Jelinek wrote:


Please add a function comment for it (perhaps saying that it is like
single_set but never allows more than one SET).

Ok with that change.

My bad.  Thanks for catching it.

Attaching the installed patch for reference.

Jeff

commit 12e467a657653de484178f0d0d66786a92d62ebe
Author: law 
Date:   Thu Jan 9 04:42:38 2014 +

* ree.c (get_sub_rtx): New function, extracted from...
(merge_def_and_ext): Here.
(combine_reaching_defs): Use get_sub_rtx.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@206454 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index f63918e..e4872f2 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2014-01-08  Jeff Law  
+
+   * ree.c (get_sub_rtx): New function, extracted from...
+   (merge_def_and_ext): Here.
+   (combine_reaching_defs): Use get_sub_rtx.
+
 2014-01-08  Eric Botcazou  
 
* cgraph.h (varpool_variable_node): Do not choke on null node.
diff --git a/gcc/ree.c b/gcc/ree.c
index ec09c7a..1c4f3ad 100644
--- a/gcc/ree.c
+++ b/gcc/ree.c
@@ -580,27 +580,21 @@ make_defs_and_copies_lists (rtx extend_insn, const_rtx set_pat,
   return ret;
 }
 
-/* Merge the DEF_INSN with an extension.  Calls combine_set_extension
-   on the SET pattern.  */
-
-static bool
-merge_def_and_ext (ext_cand *cand, rtx def_insn, ext_state *state)
+/* If DEF_INSN has single SET expression, possibly buried inside
+   a PARALLEL, return the address of the SET expression, else
+   return NULL.  This is similar to single_set, except that
+   single_set allows multiple SETs when all but one is dead.  */
+static rtx *
+get_sub_rtx (rtx def_insn)
 {
-  enum machine_mode ext_src_mode;
-  enum rtx_code code;
-  rtx *sub_rtx;
-  rtx s_expr;
-  int i;
-
-  ext_src_mode = GET_MODE (XEXP (SET_SRC (cand->expr), 0));
-  code = GET_CODE (PATTERN (def_insn));
-  sub_rtx = NULL;
+  enum rtx_code code = GET_CODE (PATTERN (def_insn));
+  rtx *sub_rtx = NULL;
 
   if (code == PARALLEL)
 {
-  for (i = 0; i < XVECLEN (PATTERN (def_insn), 0); i++)
+  for (int i = 0; i < XVECLEN (PATTERN (def_insn), 0); i++)
 {
-  s_expr = XVECEXP (PATTERN (def_insn), 0, i);
+  rtx s_expr = XVECEXP (PATTERN (def_insn), 0, i);
   if (GET_CODE (s_expr) != SET)
 continue;
 
@@ -609,7 +603,7 @@ merge_def_and_ext (ext_cand *cand, rtx def_insn, ext_state *state)
   else
 {
   /* PARALLEL with multiple SETs.  */
-  return false;
+  return NULL;
 }
 }
 }
@@ -618,10 +612,27 @@ merge_def_and_ext (ext_cand *cand, rtx def_insn, ext_state *state)
   else
 {
   /* It is not a PARALLEL or a SET, what could it be ? */
-  return false;
+  return NULL;
 }
 
   gcc_assert (sub_rtx != NULL);
+  return sub_rtx;
+}
+
+/* Merge the DEF_INSN with an extension.  Calls combine_set_extension
+   on the SET pattern.  */
+
+static bool
+merge_def_and_ext (ext_cand *cand, rtx def_insn, ext_state *state)
+{
+  enum machine_mode ext_src_mode;
+  rtx *sub_rtx;
+
+  ext_src_mode = GET_MODE (XEXP (SET_SRC (cand->expr), 0));
+  sub_rtx = get_sub_rtx (def_insn);
+
+  if (sub_rtx == NULL)
+return false;
 
   if (REG_P (SET_DEST (*sub_rtx))
   && (GET_MODE (SET_DEST (*sub_rtx)) == ext_src_mode
@@ -707,8 +718,13 @@ combine_reaching_defs (ext_cand *cand, const_rtx set_pat, ext_state *state)
   /* If there is an overlap between the destination of DEF_INSN and
 CAND->insn, then this transformation is not safe.  Note we have
 to test in the widened mode.  */
+  rtx *dest_sub_rtx = get_sub_rtx (def_insn);
+  if (dest_sub_rtx == NULL
+ || !REG_P (SET_DEST (*dest_sub_rtx)))
+   return false;
+
   rtx tmp_reg = gen_rtx_REG (GET_MODE (SET_DEST (PATTERN (cand->insn))),
-REGNO (SET_DEST (PATTERN (def_insn))));
+REGNO (SET_DEST (*dest_sub_rtx)));
   if (reg_overlap_mentioned_p (tmp_reg, SET_DEST (PATTERN (cand->insn))))
return false;

Re: [Patch, bfin/c6x] Fix ICE for backends that rely on reorder_loops.

2014-01-08 Thread Jeff Law


On 01/07/14 09:07, Bernd Schmidt wrote:

On 01/05/2014 05:10 PM, Teresa Johnson wrote:

On Sun, Jan 5, 2014 at 3:39 AM, Bernd Schmidt wrote:

I have a different patch which I'll submit next week after some more
testing. The assert in cfgrtl is unnecessarily broad and really only needs
to trigger if -freorder-blocks-and-partition; there's nothing wrong with
entering cfglayout after normal bb-reorder.


Currently -freorder-blocks-and-partition is the default for x86. I
assume that hw-doloop is not enabled for any i386 targets, which is
why we haven't seen this?


Precisely.


And will this mean that -freorder-blocks-and-partition cannot be used
for the targets that use hw-doloop? If so, should
-freorder-blocks-and-partition be prevented with a warning for those
targets?


If someone explicitly chooses that option we can turn off the reordering
in hw-doloop. That should happen sufficiently rarely that it isn't a
problem. That's what the patch below does - bootstrapped on x86_64-linux,
tested there and with bfin-elf. Ok?

Yes.  This is fine.

jeff

Re: [PATCH] Fix cfgcleanup regression (PR rtl-optimization/59724)

2014-01-08 Thread Jeff Law


On 01/08/14 15:42, Jakub Jelinek wrote:

On Wed, Jan 08, 2014 at 05:54:55PM +0100, Uros Bizjak wrote:

This caused PR59724 on alpha:

20021116-1.c: In function ‘foo’:
20021116-1.c:31:1: error: NOTE_INSN_BASIC_BLOCK is missing for block 9
  }
  ^
20021116-1.c:31:1: error: insn outside basic block
(jump_insn 94 52 93 9 (return) 20021116-1.c:31 -1
  (nil)
  -> return)


Ugh, indeed.  The problem is that try_head_merge_bb really wants
flow_find_head_matching_sequence to count all (non-note) insns, not
just active insns, because otherwise as in the above testcase we
can have e.g. 2 active insns followed by one non-active, all matching
(flow_find_head_matching_sequence returns 2) and on another edge
just 2 active insns and nothing else matching.  2 == 2, so the caller
thinks it doesn't matter which one is shorter, but we have the insn range
of 3 insns together.

So, this patch just reverts the try_head_merge_bb changes and makes
flow_find_head_matching_sequence behave the old way when called from
try_head_merge_bb, i.e. count all non-note insns, and only when called
from ifcvt.c count just active insns.  Plus the ifcvt.c change ensures
we don't mistakenly call it with stop_after == 0 (which wouldn't actually
stop).
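
The mismatch described above can be reproduced with a toy model. The sketch
below is an illustration under stated assumptions, not GCC code: toy_insn,
toy_kind and count_head_match are hypothetical stand-ins, notes are left out
of the model, and the stop_after handling of the real
flow_find_head_matching_sequence is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for RTL insns (hypothetical; not GCC's types).  */
enum toy_kind { TOY_ACTIVE, TOY_NONACTIVE };

struct toy_insn
{
  enum toy_kind kind;
  int pattern;   /* Two insns "match" when kind and pattern agree.  */
};

/* Count how many leading insns of A and B match pairwise.  With
   ACTIVE_ONLY set, matching non-active insns extend the matched
   range but are not counted.  */
static int
count_head_match (const struct toy_insn *a, size_t na,
                  const struct toy_insn *b, size_t nb,
                  bool active_only)
{
  int count = 0;

  for (size_t i = 0; i < na && i < nb; i++)
    {
      if (a[i].kind != b[i].kind || a[i].pattern != b[i].pattern)
        break;
      if (!active_only || a[i].kind == TOY_ACTIVE)
        count++;
    }
  return count;
}
```

Comparing a block of two active insns plus one non-active insn against an
identical block, and against a block that only shares the two active insns,
gives 2 and 2 when only active insns are counted, but 3 and 2 when every
insn is counted. Only the second pair of numbers tells the caller which
matched range is shorter.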

Bootstrapped/regtested on x86_64-linux and i686-linux, Uros is testing it
on Alpha.  Ok for trunk?

2014-01-08  Jakub Jelinek  

PR rtl-optimization/59724
* ifcvt.c (cond_exec_process_if_block): Don't call
flow_find_head_matching_sequence with 0 longest_match.
* cfgcleanup.c (flow_find_head_matching_sequence): Count even
non-active insns if !stop_after.
(try_head_merge_bb): Revert 2014-01-07 changes.

OK.

Jeff

Re: [PATCH] Fix for PR 59524

2014-01-08 Thread Jeff Law


On 01/08/14 14:16, Iyer, Balaji V wrote:

Hello Everyone,
Attached, please find a patch that will fix the bug mentioned in PR 59524.
The main issue was that Cilk keywords tests are running even when the user 
configured the compiler with --disable-libcilkrts. This patch should fix this 
issue for C and C++. This is tested on x86 and x86_64.

Here are the ChangeLog entries

gcc/testsuite/ChangeLog
+2014-01-08  Balaji V. Iyer  
+
+   PR testsuite/59524
+   * gcc.dg/cilk-plus/cilk-plus.exp: Make sure the cilk keywords tests
+   are run only if the Cilk library is available/enabled.
+   * g++.dg/cilk-plus/cilk-plus.exp: Likewise.
+   * lib/target-supports.exp (check_libcilkrts_available): New function.
+

Is this Ok for trunk?

Yes.

jeff

Re: [GOOGLE] Remove mod_id_to_name map

2014-01-08 Thread Xinliang David Li

Ok.

David

On Wed, Jan 8, 2014 at 10:58 AM, Dehao Chen  wrote:
> This patch removes mod_id_to_name map because the info is already
> there in module_infos. Also, AutoFDO doesn't have access to update
> this map because it's a file-static structure.
>
> Bootstrapped and passed regression test.
>
> OK for google branch?
>
> Thanks,
> Dehao
>
> Index: gcc/coverage.c
> ===
> --- gcc/coverage.c (revision 206366)
> +++ gcc/coverage.c (working copy)
> @@ -615,37 +615,17 @@ reorder_module_groups (const char *imports_file, u
>module_name_tab.dispose ();
>  }
>
> -typedef struct {
> -  unsigned int mod_id;
> -  const char *mod_name;
> -} mod_id_to_name_t;
> -
> -static vec *mod_names;
> -
> -static void
> -record_module_name (unsigned int mod_id, const char *name)
> -{
> -  mod_id_to_name_t t;
> -
> -  t.mod_id = mod_id;
> -  t.mod_name = xstrdup (name);
> -  if (!mod_names)
> -vec_alloc (mod_names, 10);
> -  mod_names->safe_push (t);
> -}
> -
>  /* Return the module name for module with MOD_ID.  */
>
>  const char *
>  get_module_name (unsigned int mod_id)
>  {
>size_t i;
> -  mod_id_to_name_t *elt;
>
> -  for (i = 0; mod_names->iterate (i, &elt); i++)
> +  for (i = 0; i < num_in_fnames; i++)
>  {
> -  if (elt->mod_id == mod_id)
> -return elt->mod_name;
> +  if (module_infos[i]->ident == mod_id)
> +return lbasename (module_infos[i]->source_filename);
>  }
>
>gcc_assert (0);
> @@ -927,9 +907,6 @@ read_counts_file (const char *da_file_name, unsign
>   }
>  }
>
> -  record_module_name (mod_info->ident,
> -  lbasename (mod_info->source_filename));
> -
>if (dump_enabled_p ())
>  {
>dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, input_location,

RE: [PATCH] Fix PR58115

2014-01-08 Thread Bernd Edlinger

Hi,

On Tue, 7 Jan 2014 15:10:20, Richard Biener wrote:
>
> On Tue, Jan 7, 2014 at 1:12 PM, Richard Sandiford
>  wrote:
>> Bernd Edlinger  writes:
 How about this patch for the big comment?

>>>
>>> The comment should say that target_set_current_function()
>>> cannot call target_reinit() because:
>>>
>>> target_reinit()=>lang_dependent_init_target()
>>> =>init_optabs()=>init_all_optabs(this_fn_optabs);
>>>
>>> uses this_fn_optabs which is undefined here.
>>>
>>> However many targets (nios2, rx, i386, rs6000) do exactly that.
>>>
>>> Is there currently any target, that sets this_target_optab in the
>>> target_set_current_function?
>>
>> MIPS :-) (via save_target_globals_default_opts=>save_target_globals)
>>
>> I think other targets need to do the same thing in order for tests
>> like that extended intrinsics_4.c to work. How does this patch look?
>> Tested on x86_64-linux-gnu.
>>
>> I didn't remove save_target_globals_default_opts because there the
>> temporary optimization_current_node also protects a call to init_reg_sets.
>
> Well, if it works the patch is ok. You're way more familiar with the
> details of this machinery.
>
> Richard.
>

I found another test case that still fails with today's trunk:

#include <immintrin.h>

__m256 a[10], b[10], c[10];

void __attribute__((target ("sse2"), optimize (3)))
foo (void)
{
}

void __attribute__((target ("avx"), optimize (3)))
bar (void)
{
  a[0] = _mm256_and_ps (b[0], c[0]);
}

compile with i686-pc-linux-gnu-gcc -O2 -msse2 -mno-avx -S  

The attached patch seems to fix this test case for
targets that do not have SWITCHABLE_TARGET.

What do you think about it?

I think Jakub's patch will fix this case, but I did not try.
However, even if i386 is now clean, there are
still many targets that use target_reinit() in
target_set_current_function.


Bernd.

>> Thanks,
>> Richard
>>
>>
>> gcc/
>> PR target/58115
>> * target-globals.c (save_target_globals): Remove this_fn_optab
>> handling.
>> * toplev.c: Include optabs.h.
>> (target_reinit): Temporarily restore the global options if another
>> set of options are in force.
>>
>> gcc/testsuite/
>> * gcc.target/i386/intrinsics_4.c (bar): New function.
>>
>> Index: gcc/target-globals.c
>> ===
>> --- gcc/target-globals.c 2014-01-02 22:16:03.042278971 +
>> +++ gcc/target-globals.c 2014-01-07 12:08:33.569900970 +
>> @@ -68,7 +68,6 @@ struct target_globals *
>> save_target_globals (void)
>> {
>> struct target_globals *g;
>> - struct target_optabs *saved_this_fn_optabs = this_fn_optabs;
>>
>> g = ggc_alloc_target_globals ();
>> g->flag_state = XCNEW (struct target_flag_state);
>> @@ -88,10 +87,8 @@ save_target_globals (void)
>> g->bb_reorder = XCNEW (struct target_bb_reorder);
>> g->lower_subreg = XCNEW (struct target_lower_subreg);
>> restore_target_globals (g);
>> - this_fn_optabs = this_target_optabs;
>> init_reg_sets ();
>> target_reinit ();
>> - this_fn_optabs = saved_this_fn_optabs;
>> return g;
>> }
>>
>> Index: gcc/toplev.c
>> ===
>> --- gcc/toplev.c 2014-01-07 08:11:43.888058805 +
>> +++ gcc/toplev.c 2014-01-07 12:10:19.448096479 +
>> @@ -78,6 +78,7 @@ Software Foundation; either version 3, o
>> #include "diagnostic-color.h"
>> #include "context.h"
>> #include "pass_manager.h"
>> +#include "optabs.h"
>>
>> #if defined(DBX_DEBUGGING_INFO) || defined(XCOFF_DEBUGGING_INFO)
>> #include "dbxout.h"
>> @@ -1752,6 +1753,23 @@ target_reinit (void)
>> {
>> struct rtl_data saved_x_rtl;
>> rtx *saved_regno_reg_rtx;
>> + tree saved_optimization_current_node;
>> + struct target_optabs *saved_this_fn_optabs;
>> +
>> + /* Temporarily switch to the default optimization node, so that
>> + *this_target_optabs is set to the default, not reflecting
>> + whatever a previous function used for the optimize
>> + attribute. */
>> + saved_optimization_current_node = optimization_current_node;
>> + saved_this_fn_optabs = this_fn_optabs;
>> + if (saved_optimization_current_node != optimization_default_node)
>> + {
>> + optimization_current_node = optimization_default_node;
>> + cl_optimization_restore
>> + (&global_options,
>> + TREE_OPTIMIZATION (optimization_default_node));
>> + }
>> + this_fn_optabs = this_target_optabs;
>>
>> /* Save *crtl and regno_reg_rtx around the reinitialization
>> to allow target_reinit being called even after prepare_function_start. */
>> @@ -1769,7 +1787,16 @@ target_reinit (void)
>> /* Reinitialize lang-dependent parts. */
>> lang_dependent_init_target ();
>>
>> - /* And restore it at the end, as free_after_compilation from
>> + /* Restore the original optimization node. */
>> + if (saved_optimization_current_node != optimization_default_node)
>> + {
>> + optimization_current_node = saved_optimization_current_node;
>> + cl_optimization_restore (&global_options,
>> + TREE_OPTIMIZATION (optimization_current_node));
>> + }
>> + this_fn_optabs = saved_this_fn_optabs;
>> +
>> + /* Re

Re: [RFA][PATCH][middle-end/53623] Improve extension elimination

2014-01-08 Thread Jakub Jelinek

On Wed, Jan 08, 2014 at 04:02:17PM -0700, Jeff Law wrote:
>   * ree.c (get_sub_rtx): New function, extracted from...
>   (merge_def_and_ext): Here.
>   (combine_reaching_defs): Use get_sub_rtx.

> --- a/gcc/ree.c
> +++ b/gcc/ree.c
> @@ -580,27 +580,17 @@ make_defs_and_copies_lists (rtx extend_insn, const_rtx set_pat,
>return ret;
>  }
>  
> -/* Merge the DEF_INSN with an extension.  Calls combine_set_extension
> -   on the SET pattern.  */
> -
> -static bool
> -merge_def_and_ext (ext_cand *cand, rtx def_insn, ext_state *state)
> +static rtx *
> +get_sub_rtx (rtx def_insn)

Please add a function comment for it (perhaps saying that it is like
single_set but never allows more than one SET).

Ok with that change.

Jakub

Re: [RFA][PATCH][middle-end/53623] Improve extension elimination

2014-01-08 Thread Jeff Law


On 01/08/14 01:14, Eric Botcazou wrote:

Committed after private email approval from Jakub.  I made one
additional trivial change (missing whitespace in a comment).


This breaks bootstrap with RTL checking enabled:

/home/eric/svn/gcc/libgcc/config/libbid/bid64_noncomp.c:119:1: internal
compiler error: RTL check: expected code 'set' or 'clobber', have 'parallel'
in combine_reaching_defs, at ree.c:711

There were two issues in that code.  The first assumed the form of
DEF_INSN was (set (dest) (src)), the second assumed that the destination
must be a reg before checking its REGNO.


ree.c already had some code which effectively defined the form that the 
defining insn could take.  It's not quite "single_set", though I'd 
really prefer that be the form in the future.  Anyway, I pulled that 
code out of merge_def_and_ext so that it could also be used by 
combine_reaching_defs.
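
As a rough illustration of the two checks, here is a toy model in C. Every
name in it (toy_rtx, toy_code, toy_single_set_strict, toy_dest_is_usable) is
hypothetical and it does not use GCC's rtx representation. Unlike the real
code, which asserts in that case, a PARALLEL with no SET simply yields NULL
here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for RTL codes and expressions (hypothetical).  */
enum toy_code { TOY_SET, TOY_CLOBBER, TOY_PARALLEL };

struct toy_rtx
{
  enum toy_code code;
  int dest_is_reg;          /* Meaningful for TOY_SET only.  */
  int n_elts;               /* Meaningful for TOY_PARALLEL only.  */
  struct toy_rtx **elts;    /* Elements of a TOY_PARALLEL.  */
};

/* Return the only SET in PAT, looking inside a PARALLEL, or NULL if
   PAT has no SET or more than one.  */
static struct toy_rtx *
toy_single_set_strict (struct toy_rtx *pat)
{
  struct toy_rtx *found = NULL;

  if (pat->code == TOY_SET)
    return pat;
  if (pat->code != TOY_PARALLEL)
    return NULL;

  for (int i = 0; i < pat->n_elts; i++)
    {
      if (pat->elts[i]->code != TOY_SET)
        continue;
      if (found != NULL)
        return NULL;        /* PARALLEL with multiple SETs.  */
      found = pat->elts[i];
    }
  return found;
}

/* Caller-side check: locate the SET first, then confirm that its
   destination is a register before anything reads a register number.  */
static int
toy_dest_is_usable (struct toy_rtx *pat)
{
  struct toy_rtx *set = toy_single_set_strict (pat);
  return set != NULL && set->dest_is_reg;
}
```

The point of sharing one helper is that both callers agree on which insn
forms are acceptable, instead of one of them assuming a bare SET.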


With that I was able to bootstrap & regression test with 
--enable-checking=rtl as well as a normal bootstrap and regression test 
on x86_64-unknown-linux-gnu.


OK for the trunk?


* ree.c (get_sub_rtx): New function, extracted from...
(merge_def_and_ext): Here.
(combine_reaching_defs): Use get_sub_rtx.


diff --git a/gcc/ree.c b/gcc/ree.c
index ec09c7a..b41e891 100644
--- a/gcc/ree.c
+++ b/gcc/ree.c
@@ -580,27 +580,17 @@ make_defs_and_copies_lists (rtx extend_insn, const_rtx set_pat,
   return ret;
 }
 
-/* Merge the DEF_INSN with an extension.  Calls combine_set_extension
-   on the SET pattern.  */
-
-static bool
-merge_def_and_ext (ext_cand *cand, rtx def_insn, ext_state *state)
+static rtx *
+get_sub_rtx (rtx def_insn)
 {
-  enum machine_mode ext_src_mode;
-  enum rtx_code code;
-  rtx *sub_rtx;
-  rtx s_expr;
-  int i;
-
-  ext_src_mode = GET_MODE (XEXP (SET_SRC (cand->expr), 0));
-  code = GET_CODE (PATTERN (def_insn));
-  sub_rtx = NULL;
+  enum rtx_code code = GET_CODE (PATTERN (def_insn));
+  rtx *sub_rtx = NULL;
 
   if (code == PARALLEL)
 {
-  for (i = 0; i < XVECLEN (PATTERN (def_insn), 0); i++)
+  for (int i = 0; i < XVECLEN (PATTERN (def_insn), 0); i++)
 {
-  s_expr = XVECEXP (PATTERN (def_insn), 0, i);
+  rtx s_expr = XVECEXP (PATTERN (def_insn), 0, i);
   if (GET_CODE (s_expr) != SET)
 continue;
 
@@ -609,7 +599,7 @@ merge_def_and_ext (ext_cand *cand, rtx def_insn, ext_state *state)
   else
 {
   /* PARALLEL with multiple SETs.  */
-  return false;
+  return NULL;
 }
 }
 }
@@ -618,10 +608,27 @@ merge_def_and_ext (ext_cand *cand, rtx def_insn, ext_state *state)
   else
 {
   /* It is not a PARALLEL or a SET, what could it be ? */
-  return false;
+  return NULL;
 }
 
   gcc_assert (sub_rtx != NULL);
+  return sub_rtx;
+}
+
+/* Merge the DEF_INSN with an extension.  Calls combine_set_extension
+   on the SET pattern.  */
+
+static bool
+merge_def_and_ext (ext_cand *cand, rtx def_insn, ext_state *state)
+{
+  enum machine_mode ext_src_mode;
+  rtx *sub_rtx;
+
+  ext_src_mode = GET_MODE (XEXP (SET_SRC (cand->expr), 0));
+  sub_rtx = get_sub_rtx (def_insn);
+
+  if (sub_rtx == NULL)
+return false;
 
   if (REG_P (SET_DEST (*sub_rtx))
   && (GET_MODE (SET_DEST (*sub_rtx)) == ext_src_mode
@@ -707,8 +714,13 @@ combine_reaching_defs (ext_cand *cand, const_rtx set_pat, ext_state *state)
   /* If there is an overlap between the destination of DEF_INSN and
 CAND->insn, then this transformation is not safe.  Note we have
 to test in the widened mode.  */
+  rtx *dest_sub_rtx = get_sub_rtx (def_insn);
+  if (dest_sub_rtx == NULL
+ || !REG_P (SET_DEST (*dest_sub_rtx)))
+   return false;
+
   rtx tmp_reg = gen_rtx_REG (GET_MODE (SET_DEST (PATTERN (cand->insn))),
-REGNO (SET_DEST (PATTERN (def_insn))));
+REGNO (SET_DEST (*dest_sub_rtx)));
   if (reg_overlap_mentioned_p (tmp_reg, SET_DEST (PATTERN (cand->insn))))
return false;

Re: [Patch] Regex bracket matcher cache optimization

2014-01-08 Thread Tim Shen

On Wed, Jan 8, 2014 at 5:38 PM, Paolo Carlini  wrote:
> I agree, it's probably fine for now, but please actually attach the patch ;)

Oops sorry >.<


So my plan is to instantiate _Compiler and _Executor instead of user
interfaces like basic_regex or regex_match, because the implementation
may change (say add a new executor) later. Is that Ok?


-- 
Regards,
Tim Shen

commit d9f47e783680a1cab86bd704e67236025cbdff18
Author: tim 
Date:   Mon Jan 6 00:03:41 2014 -0500

2014-01-08  Tim Shen  

* bits/regex_automaton.tcc: Indentation fix.
* bits/regex_compiler.h (__compile_nfa<>(), _Compiler<>,
_RegexTranslator<> _AnyMatcher<>, _CharMatcher<>,
_BracketMatcher<>): Add bool option template parameters and
specializations to make matching more efficient and space saving.
* bits/regex_compiler.tcc: Likewise.

diff --git a/libstdc++-v3/include/bits/regex_automaton.tcc b/libstdc++-v3/include/bits/regex_automaton.tcc
index 7edc67f..e222803 100644
--- a/libstdc++-v3/include/bits/regex_automaton.tcc
+++ b/libstdc++-v3/include/bits/regex_automaton.tcc
@@ -134,9 +134,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _NFA<_TraitsT>::_M_dot(std::ostream& __ostr) const
 {
   __ostr << "digraph _Nfa {\n"
-   "  rankdir=LR;\n";
+   "  rankdir=LR;\n";
   for (size_t __i = 0; __i < this->size(); ++__i)
-(*this)[__i]._M_dot(__ostr, __i);
+   (*this)[__i]._M_dot(__ostr, __i);
   __ostr << "}\n";
   return __ostr;
 }
diff --git a/libstdc++-v3/include/bits/regex_compiler.h b/libstdc++-v3/include/bits/regex_compiler.h
index 4ac67df..b73fe30 100644
--- a/libstdc++-v3/include/bits/regex_compiler.h
+++ b/libstdc++-v3/include/bits/regex_compiler.h
@@ -39,7 +39,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* @{
*/
 
-  template
+  template
 struct _BracketMatcher;
 
   /// Builds an NFA from an input iterator interval.
@@ -63,7 +63,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   typedef typename _ScannerT::_TokenT _TokenT;
   typedef _StateSeq<_TraitsT>_StateSeqT;
   typedef std::stack<_StateSeqT, std::vector<_StateSeqT>> _StackT;
-  typedef _BracketMatcher<_TraitsT>  _BMatcherT;
   typedef std::ctype_CtypeT;
 
   // accepts a specific token or returns false.
@@ -91,20 +90,30 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   bool
   _M_bracket_expression();
 
-  void
-  _M_expression_term(_BMatcherT& __matcher);
+  template
+   void
+   _M_insert_any_matcher_ecma();
 
-  bool
-  _M_range_expression(_BMatcherT& __matcher);
+  template
+   void
+   _M_insert_any_matcher_posix();
 
-  bool
-  _M_collating_symbol(_BMatcherT& __matcher);
+  template
+   void
+   _M_insert_char_matcher();
 
-  bool
-  _M_equivalence_class(_BMatcherT& __matcher);
+  template
+   void
+   _M_insert_character_class_matcher();
 
-  bool
-  _M_character_class(_BMatcherT& __matcher);
+  template
+   void
+   _M_insert_bracket_matcher(bool __neg);
+
+  template
+   void
+   _M_expression_term(_BracketMatcher<_TraitsT, __icase, __collate>&
+  __matcher);
 
   int
   _M_cur_int_value(int __radix);
@@ -148,16 +157,110 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return __compile_nfa(__cfirst, __cfirst + __len, __traits, __flags);
 }
 
-  template
-struct _AnyMatcher
+  // [28.13.14]
+  template
+class _RegexTranslator
 {
-  typedef typename _TraitsT::char_type   _CharT;
+public:
+  typedef typename _TraitsT::char_type   _CharT;
+  typedef typename _TraitsT::string_type _StringT;
+  typedef typename std::conditional<__collate,
+   _StringT,
+   _CharT>::type _StrTransT;
 
   explicit
-  _AnyMatcher(const _TraitsT& __traits)
+  _RegexTranslator(const _TraitsT& __traits)
   : _M_traits(__traits)
   { }
 
+  _CharT
+  _M_translate(_CharT __ch) const
+  {
+   if (__icase)
+ return _M_traits.translate_nocase(__ch);
+   else if (__collate)
+ return _M_traits.translate(__ch);
+   else
+ return __ch;
+  }
+
+  _StrTransT
+  _M_transform(_CharT __ch) const
+  {
+   return _M_transform_impl(__ch, typename integral_constant::type());
+  }
+
+private:
+  _StrTransT
+  _M_transform_impl(_CharT __ch, false_type) const
+  { return __ch; }
+
+  _StrTransT
+  _M_transform_impl(_CharT __ch, true_type) const
+  {
+   _StrTransT __str = _StrTransT(1, _M_translate(__ch));
+   return _M_traits.transform(__str.begin(), __str.end());
+  }
+
+  const _TraitsT& _M_traits;
+};
+
+  template
+class _RegexTranslator<_TraitsT, false, false>
+{
+public:
+  typedef typename _Tr

Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)

2014-01-08 Thread Jakub Jelinek

On Wed, Jan 08, 2014 at 12:19:02PM +0100, Richard Biener wrote:
> On Wed, 8 Jan 2014, Jakub Jelinek wrote:
> 
> > On Wed, Jan 08, 2014 at 12:15:40PM +0100, Richard Biener wrote:
> > > > I start to think this is a too complex transform for stmt folding ...
> > > 
> > > Alternatively do update_call_from_tree (gsi, 
> > > get_or_create_ssa_default_def 
> > > (cfun, create_tmp_var (TREE_TYPE (lhs.
> > 
> > The lhs might not be is_gimple_reg_type though.  What to do in that case?
> 
> In that case you can remove the stmt.

Ok, so like this?  Unfortunately, I haven't been able to construct a
testcase where this would be folded later than during gimplification where
lhs is obviously not a SSA_NAME and we can safely replace the call.

Bootstrapped/regtested on x86_64-linux and i686-linux.

2014-01-08  Jakub Jelinek  

PR tree-optimization/59622
* gimple-fold.c (gimple_fold_call): Fix a typo in message.  Handle
__cxa_pure_virtual similarly to __builtin_unreachable, but replace
the OBJ_TYPE_REF call with the noreturn and add if needed a setter
of the lhs SSA_NAME.  Don't devirtualize for inplace at all.

* g++.dg/opt/pr59622-2.C: New test.
* g++.dg/opt/pr59622-3.C: New test.
* g++.dg/opt/pr59622-4.C: New test.

--- gcc/gimple-fold.c.jj2014-01-08 10:23:24.536443566 +0100
+++ gcc/gimple-fold.c   2014-01-08 17:02:40.356635177 +0100
@@ -1167,7 +1167,7 @@ gimple_fold_call (gimple_stmt_iterator *
  (OBJ_TYPE_REF_EXPR 
(callee)
{
  fprintf (dump_file,
-  "Type inheritnace inconsistent devirtualization of ");
+  "Type inheritance inconsistent devirtualization of ");
  print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
  fprintf (dump_file, " to ");
  print_generic_expr (dump_file, callee, TDF_SLIM);
@@ -1177,26 +1177,46 @@ gimple_fold_call (gimple_stmt_iterator *
  gimple_call_set_fn (stmt, OBJ_TYPE_REF_EXPR (callee));
  changed = true;
}
-  else if (flag_devirtualize && virtual_method_call_p (callee))
+  else if (flag_devirtualize && !inplace && virtual_method_call_p (callee))
{
  bool final;
  vec targets
= possible_polymorphic_call_targets (callee, &final);
  if (final && targets.length () <= 1)
{
+ tree fndecl;
  if (targets.length () == 1)
+   fndecl = targets[0]->decl;
+ else
+   fndecl = builtin_decl_implicit (BUILT_IN_UNREACHABLE);
+
+ /* If fndecl (like __builtin_unreachable or
+__cxa_pure_virtual) takes no arguments, doesn't have
+return value and is noreturn, if the call doesn't have
+lhs or lhs isn't SSA_NAME, replace the call with
+the noreturn call, otherwise insert it before the call
+and replace the call with setting of lhs to default def.  */
+ if (TREE_THIS_VOLATILE (fndecl)
+ && VOID_TYPE_P (TREE_TYPE (TREE_TYPE (fndecl)))
+ && TYPE_ARG_TYPES (TREE_TYPE (fndecl)) == void_list_node)
{
- gimple_call_set_fndecl (stmt, targets[0]->decl);
- changed = true;
-   }
- else if (!inplace)
-   {
- tree fndecl = builtin_decl_implicit (BUILT_IN_UNREACHABLE);
+ tree lhs = gimple_call_lhs (stmt);
  gimple new_stmt = gimple_build_call (fndecl, 0);
  gimple_set_location (new_stmt, gimple_location (stmt));
- gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+ if (lhs && TREE_CODE (lhs) == SSA_NAME)
+   {
+ tree var = create_tmp_var (TREE_TYPE (lhs), NULL);
+ tree def = get_or_create_ssa_default_def (cfun, var);
+ gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+ update_call_from_tree (gsi, def);
+   }
+ else
+   gsi_replace (gsi, new_stmt, true);
  return true;
}
+
+ gimple_call_set_fndecl (stmt, fndecl);
+ changed = true;
}
}
 }
--- gcc/testsuite/g++.dg/opt/pr59622-2.C.jj 2014-01-08 16:42:49.588747876 +0100
+++ gcc/testsuite/g++.dg/opt/pr59622-2.C    2014-01-08 16:42:49.588747876 +0100
@@ -0,0 +1,21 @@
+// PR tree-optimization/59622
+// { dg-do compile }
+// { dg-options "-O2" }
+
+namespace
+{
+  struct A
+  {
+A () {}
+virtual A *bar (int) = 0;
+A *baz (int x) { return bar (x); }
+  };
+}
+
+A *a;
+
+void
+foo ()
+{
+  a->baz (0);
+}
--- gcc/testsuite/g++.dg/opt/pr59622-3.C.jj 2014-01-08 17:00:20.944359961 +0100
+++ gcc/testsuite/g++.dg/opt/pr59622-3.C    2014-01-08 17:00:44.558244745 +0100

[PATCH] Fix cfgcleanup regression (PR rtl-optimization/59724)

2014-01-08 Thread Jakub Jelinek

On Wed, Jan 08, 2014 at 05:54:55PM +0100, Uros Bizjak wrote:
> This caused PR59724 on alpha:
> 
> 20021116-1.c: In function ‘foo’:
> 20021116-1.c:31:1: error: NOTE_INSN_BASIC_BLOCK is missing for block 9
>  }
>  ^
> 20021116-1.c:31:1: error: insn outside basic block
> (jump_insn 94 52 93 9 (return) 20021116-1.c:31 -1
>  (nil)
>  -> return)

Ugh, indeed.  The problem is that try_head_merge_bb really wants
flow_find_head_matching_sequence to count all (non-note) insns, not
just active insns, because otherwise as in the above testcase we
can have e.g. 2 active insns followed by one non-active, all matching
(flow_find_head_matching_sequence returns 2) and on another edge
just 2 active insns and nothing else matching.  2 == 2, so the caller
thinks it doesn't matter which one is shorter, but we have the insn range
of 3 insns together.

So, this patch just reverts the try_head_merge_bb changes and makes
flow_find_head_matching_sequence behave the old way when called from
try_head_merge_bb, i.e. count all non-note insns, and only when called
from ifcvt.c count just active insns.  Plus the ifcvt.c change ensures
we don't mistakenly call it with stop_after == 0 (which wouldn't actually
stop).
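
To make the counting problem concrete, here is a toy model in C. It is only a
sketch under stated assumptions: toy_insn, insn_kind and toy_head_match are
hypothetical names, not GCC's RTL API, and "same kind and same pattern id"
stands in for the real insn comparison.

```c
#include <stddef.h>

/* Hypothetical stand-ins for RTL insns: notes are skipped, active insns
   emit code, inactive ones (think USE/CLOBBER) do not.  */
enum insn_kind { KIND_NOTE, KIND_ACTIVE, KIND_INACTIVE };

struct toy_insn
{
  enum insn_kind kind;
  int pattern;                  /* equal ids model matching patterns */
};

/* Advance I past any notes in SEQ.  */
static size_t
skip_notes (const struct toy_insn *seq, size_t n, size_t i)
{
  while (i < n && seq[i].kind == KIND_NOTE)
    i++;
  return i;
}

/* Count matching insns from the heads of S1 and S2.  If STOP_AFTER is
   zero, count every non-note insn; otherwise count only active insns
   and stop once STOP_AFTER of them have matched.  */
int
toy_head_match (const struct toy_insn *s1, size_t n1,
                const struct toy_insn *s2, size_t n2, int stop_after)
{
  size_t i1 = 0, i2 = 0;
  int ninsns = 0;

  for (;;)
    {
      i1 = skip_notes (s1, n1, i1);
      i2 = skip_notes (s2, n2, i2);
      if (i1 >= n1 || i2 >= n2)
        break;
      if (s1[i1].kind != s2[i2].kind || s1[i1].pattern != s2[i2].pattern)
        break;
      if (!stop_after || s1[i1].kind == KIND_ACTIVE)
        ninsns++;
      i1++, i2++;
      if (stop_after && ninsns == stop_after)
        break;
    }
  return ninsns;
}
```

For a block {active, active, inactive} matched against an identical block,
the all-insn mode reports 3 while the active-only mode reports 2, which is
the same value it reports against a block that shares only the two active
insns.  That is the "2 == 2" ambiguity described above.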

Bootstrapped/regtested on x86_64-linux and i686-linux, Uros is testing it
on Alpha.  Ok for trunk?

2014-01-08  Jakub Jelinek  

PR rtl-optimization/59724
* ifcvt.c (cond_exec_process_if_block): Don't call
flow_find_head_matching_sequence with 0 longest_match.
* cfgcleanup.c (flow_find_head_matching_sequence): Count even
non-active insns if !stop_after.
(try_head_merge_bb): Revert 2014-01-07 changes.

--- gcc/ifcvt.c.jj  2014-01-08 10:23:20.0 +0100
+++ gcc/ifcvt.c 2014-01-08 18:46:17.017715169 +0100
@@ -522,7 +522,10 @@ cond_exec_process_if_block (ce_if_block
  n_insns -= 2 * n_matching;
}

-  if (then_start && else_start)
+  if (then_start
+ && else_start
+ && then_n_insns > n_matching
+ && else_n_insns > n_matching)
{
  int longest_match = MIN (then_n_insns - n_matching,
   else_n_insns - n_matching);
--- gcc/cfgcleanup.c.jj 2014-01-07 08:54:05.772736321 +0100
+++ gcc/cfgcleanup.c2014-01-08 18:41:14.433307914 +0100
@@ -1421,7 +1421,8 @@ flow_find_cross_jump (basic_block bb1, b
 /* Like flow_find_cross_jump, except start looking for a matching sequence from
the head of the two blocks.  Do not include jumps at the end.
If STOP_AFTER is nonzero, stop after finding that many matching
-   instructions.  */
+   instructions.  If STOP_AFTER is zero, count all INSN_P insns, if it is
+   non-zero, only count active insns.  */

 int
 flow_find_head_matching_sequence (basic_block bb1, basic_block bb2, rtx *f1,
@@ -1493,7 +1494,7 @@ flow_find_head_matching_sequence (basic_

  beforelast1 = last1, beforelast2 = last2;
  last1 = i1, last2 = i2;
- if (active_insn_p (i1))
+ if (!stop_after || active_insn_p (i1))
ninsns++;
}

@@ -2408,7 +2409,9 @@ try_head_merge_bb (basic_block bb)
   max_match--;
   if (max_match == 0)
return false;
-  e0_last_head = prev_active_insn (e0_last_head);
+  do
+   e0_last_head = prev_real_insn (e0_last_head);
+  while (DEBUG_INSN_P (e0_last_head));
 }

   if (max_match == 0)
@@ -2428,14 +2431,16 @@ try_head_merge_bb (basic_block bb)
   basic_block merge_bb = EDGE_SUCC (bb, ix)->dest;
   rtx head = BB_HEAD (merge_bb);

-  if (!active_insn_p (head))
-   head = next_active_insn (head);
+  while (!NONDEBUG_INSN_P (head))
+   head = NEXT_INSN (head);
   headptr[ix] = head;
   currptr[ix] = head;

   /* Compute the end point and live information  */
   for (j = 1; j < max_match; j++)
-   head = next_active_insn (head);
+   do
+ head = NEXT_INSN (head);
+   while (!NONDEBUG_INSN_P (head));
   simulate_backwards_to_point (merge_bb, live, head);
   IOR_REG_SET (live_union, live);
 }

Jakub

Fix segfault with weak external symbols

2014-01-08 Thread Eric Botcazou

This is a regression present on the mainline for weak external symbols and 
languages with non-call exceptions:

0xb222df crash_signal
/home/eric/svn/gcc/gcc/toplev.c:337
0x75ed9c symtab_alias_ultimate_target(symtab_node*, availability*)
/home/eric/svn/gcc/gcc/symtab.c:989
0xb69a59 varpool_variable_node
/home/eric/svn/gcc/gcc/cgraph.h:1430
0xb69a59 tree_could_trap_p(tree_node*)
/home/eric/svn/gcc/gcc/tree-eh.c:2691
0xb6a85c stmt_could_throw_1_p
/home/eric/svn/gcc/gcc/tree-eh.c:2751
0xb6a85c stmt_could_throw_p(gimple_statement_base*)
/home/eric/svn/gcc/gcc/tree-eh.c:2780
0xb6d46f lower_eh_constructs_2
/home/eric/svn/gcc/gcc/tree-eh.c:2028
0xb6d46f lower_eh_constructs_1
/home/eric/svn/gcc/gcc/tree-eh.c:2123
0xb6f871 lower_eh_constructs
/home/eric/svn/gcc/gcc/tree-eh.c:2141
0xb6f871 execute
/home/eric/svn/gcc/gcc/tree-eh.c:2193
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

In tree_could_trap_p:

case VAR_DECL:
  /* Assume that accesses to weak vars may trap, unless we know
 they are certainly defined in current TU or in some other
 LTO partition.  */
  if (DECL_WEAK (expr))
{
  struct varpool_node *node;
  if (!DECL_EXTERNAL (expr))
return false;
  node = varpool_variable_node (varpool_get_node (expr), NULL);
  if (node && node->symbol.in_other_partition)
return false;
  return true;
}
  return false;

The problem is that varpool_get_node returns NULL and varpool_variable_node
(and its callee symtab_alias_ultimate_target) chokes on the NULL.  This is
a regression from the 4.8.x series, where the same NULL goes through the
function without a hitch.

Tested on x86_64-suse-linux, applied on the mainline as obvious.


2014-01-08  Eric Botcazou  

* cgraph.h (varpool_variable_node): Do not choke on null node.


2014-01-08  Eric Botcazou  

* gnat.dg/weak2.ad[sb]: New test.


-- 
Eric Botcazou

Index: cgraph.h
===
--- cgraph.h	(revision 206418)
+++ cgraph.h	(working copy)
@@ -1426,8 +1426,12 @@ varpool_variable_node (varpool_node *nod
 {
   varpool_node *n;
 
-  n = dyn_cast <varpool_node> (symtab_alias_ultimate_target (node,
-                                                            availability));
+  if (node)
+    n = dyn_cast <varpool_node> (symtab_alias_ultimate_target (node,
+                                                              availability));
+  else
+    n = NULL;
+
   if (!n && availability)
 *availability = AVAIL_NOT_AVAILABLE;
   return n;
-- { dg-do compile }

package body Weak2 is

   function F return Integer is
   begin
  return Var;
   end;

end Weak2;
package Weak2 is

   Var : Integer;
   pragma Import (Ada, Var, "var_name");
   pragma Weak_External (Var);

   function F return Integer;

end Weak2;

Re: [Patch] Regex bracket matcher cache optimization

2014-01-08 Thread Paolo Carlini


Hi,

On 01/08/2014 11:11 PM, Tim Shen wrote:
> On Wed, Jan 8, 2014 at 5:20 AM, Paolo Carlini  wrote:
>> On 01/08/2014 10:24 AM, Jonathan Wakely wrote:
>>> Ouch! Yes, that's quite a bit slower, and this code is already very
>>> slow to compile.
>
> With this patch (which is based on a-fixed.diff, committed earlier),
> which uses templated member functions instead of templating the whole
> _Compiler, the time consumption is:
> g++ -g -Wall -std=c++11 -g -Wall -std=c++11 -O3 regextest.cc  3.79s
> user 0.14s system 98% cpu 3.981 total
>
> Compared to 4.5s, it's better and probably fine.
>
> Bootstrapped and tested with -m32 and -m64 respectively.

I agree, it's probably fine for now, but please actually attach the patch ;)

Paolo.

Re: [PATCH] Fix PR59471

2014-01-08 Thread Jakub Jelinek

On Tue, Jan 07, 2014 at 03:54:56PM +0100, Richard Biener wrote:
> 2014-01-07  Richard Biener  
> 
>   PR middle-end/59471
>   * gimplify.c (gimplify_expr): Gimplify register-register type
>   VIEW_CONVERT_EXPRs to separate stmts.
> 
>   * gcc.dg/pr59471.c: New testcase.

The testcase fails on i686-linux, because of the ABI warnings.

I've verified following change ICEd without your fix and works with your
fix, bootstrapped/regtested it on x86_64-linux and i686-linux and committed
to trunk as obvious.

2014-01-08  Jakub Jelinek  

PR middle-end/59471
* gcc.dg/pr59471.c (foo): Avoid vector type arguments or return
type, use pointers to vector type instead.

--- gcc/testsuite/gcc.dg/pr59471.c.jj   2014-01-08 10:23:20.0 +0100
+++ gcc/testsuite/gcc.dg/pr59471.c  2014-01-08 17:52:42.0 +0100
@@ -9,8 +9,8 @@ __attribute__ ((__vector_size__ (16)));
 typedef unsigned int uint32x4_t
 __attribute__ ((__vector_size__ (16)));

-uint8x4_t
-foo (uint16x8_t x)
+void
+foo (uint16x8_t *x, uint8x4_t *y)
 {
-  return (uint8x4_t) ((uint32x4_t) x)[0];
+  *y = (uint8x4_t) ((uint32x4_t) (*x))[0];
 }

Jakub

Re: [RFC] libgcov.c re-factoring and offline profile-tool

2014-01-08 Thread Rong Xu

On Fri, Dec 6, 2013 at 6:23 AM, Jan Hubicka  wrote:
>> @@ -325,6 +311,9 @@ static struct gcov_summary all_prg;
>>  #endif
>>  /* crc32 for this program.  */
>>  static gcov_unsigned_t crc32;
>> +/* Use this summary checksum rather the computed one if the value is
>> + *non-zero.  */
>> +static gcov_unsigned_t saved_summary_checksum;
>
> Why do you need to save the checksum? Won't it reset summary back with 
> multiple streaming?

This was for the gcov_tool. The checksum will be recomputed in gcov_exit,
and its value depends on the order of the gcov_info list (the order will
be different after reading from gcda files into memory). The purpose was
to keep the same summary_checksum so that I get identical gcov-dump output.

>
> I would really like to avoid introducing those static vars that are used 
> exclusively
> by gcov_exit.  What about putting them into an gcov_context structure that
> is passed around the functions that was broken out?

With my recent patch that localizes this_prg, we only use 64 more
bytes in bss. Do you still think we have to remove
all these statics?

>

Re: [MIPS, committed] Revert some Octeon BADDU patches

2014-01-08 Thread Eric Botcazou

> This patch just reverts some changes I'd made to the BADDU patterns
> for the infamous (truncate:QI (plus:SI ...)) -> (plus:QI ...)
> simplification. That simplification was limited to CISCy targets for PR
> 58295.
> 
> Tested on mips64-linux-gnu and applied.  It fixes the octeon-baddu-1.c
> failures.

You presumably need to apply it to the 4.8 branch as well.

-- 
Eric Botcazou

Re: [RFC] libgcov.c re-factoring and offline profile-tool

2014-01-08 Thread Rong Xu

On Wed, Dec 18, 2013 at 9:28 AM, Xinliang David Li  wrote:
>>>
>>>  #ifdef L_gcov_merge_ior
>>>  /* The profile merging function that just adds the counters.  It is given
>>> -   an array COUNTERS of N_COUNTERS old counters and it reads the same 
>>> number
>>> -   of counters from the gcov file.  */
>>> +   an array COUNTERS of N_COUNTERS old counters.
>>> +   When SRC==NULL, it reads the same number of counters from the gcov file.
>>> +   Otherwise, it reads from SRC array.  */
>>>  void
>>> -__gcov_merge_ior (gcov_type *counters, unsigned n_counters)
>>> +__gcov_merge_ior (gcov_type *counters, unsigned n_counters,
>>> +  gcov_type *src, unsigned w __attribute__ ((unused)))
>>
>> So the new in-memory variants are introduced for the merging tool, while
>> libgcc uses the gcov_read_counter interface?
>> Perhaps we can just duplicate the functions so that the runtime avoids
>> all the scaling and in_mem tests it won't need?
>
>
> I thought about this one a little. How about making the interface
> change conditionally, but still share the implementation?  The merge
> function bodies mostly remain unchanged and there is no runtime
> penalty for libgcov.  The new macros can be shared across most of the
> mergers.
>
> #ifdef IN_PROFILE_TOOL
> #define GCOV_MERGE_EXTRA_ARGS  gcov_type *src, unsigned w
> #define GCOV_READ_COUNTER  *(src++) * w
> #else
> #define GCOV_MERGE_EXTRA_ARGS
> #define GCOV_READ_COUNTER gcov_read_counter ()
> #endif
>
> __gcov_merge_add (gcov_type *counters, unsigned n_counters,
>   GCOV_MERGE_EXTRA_ARGS)
> {
>
>  for (; n_counters; counters++, n_counters--)
>   {
>   *counters += GCOV_READ_COUNTER ;
>}
>
> }
>
> thanks,

Personally I don't think the run-time test of in_mem will cause any
issue. This is in profile dumping, so why would we care about a few more
cycles here? It won't pollute the profile.

If you really don't like that, we can use the above approach, or I can
hide the logic in gcov_read_counter(), i.e. overload
gcov_read_counter() in profile_tool. For that, I will need a new
global variable SRC and set it before calling the merge function.
I would prefer to keep weight in _gcov_merge_* argument list.
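
For reference, the "extra SRC and weight arguments" option could look
roughly like the sketch below. All names here (toy_gcov_type,
toy_read_counter, toy_set_file, toy_merge_add) are made up for
illustration; toy_read_counter only stands in for gcov_read_counter () by
reading from an in-memory array instead of a gcda file.

```c
#include <stddef.h>

typedef long long toy_gcov_type;

/* Hypothetical stand-in for the gcda read cursor.  */
static const toy_gcov_type *toy_file_pos;

/* Point the fake "file" at DATA.  */
void
toy_set_file (const toy_gcov_type *data)
{
  toy_file_pos = data;
}

/* Stand-in for gcov_read_counter (): return the next stored counter.  */
static toy_gcov_type
toy_read_counter (void)
{
  return *toy_file_pos++;
}

/* Add N_COUNTERS values into COUNTERS.  When SRC is NULL the values come
   from the fake file; otherwise they come from SRC, scaled by weight W.  */
void
toy_merge_add (toy_gcov_type *counters, unsigned n_counters,
               const toy_gcov_type *src, unsigned w)
{
  for (; n_counters; counters++, n_counters--)
    {
      if (src)
        *counters += *src++ * w;
      else
        *counters += toy_read_counter ();
    }
}
```

The SRC test sits inside the loop here, which is the run-time check under
discussion; the macro-based variant quoted above would pick one of the two
branches at compile time instead.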

What do you think?

-Rong

>
> David
>
>>
>> I would suggest going with the libgcov.h changes and cleanups first, with
>> the interface changes next; the gcov-tool is probably quite obvious at
>> the end?
>> Do you think you can split the patch this way?
>>
>> Thanks, and sorry for taking so long to review. I should have more time
>> again now.
>> Honza

[MIPS, committed] Fix all but one gcc.dg/tree-ssa failure

2014-01-08 Thread Richard Sandiford

Some of the tests were failing due to the branch cost and some were
failing due to !LOGICAL_OP_NON_SHORT_CIRCUIT.  I just skipped the
latter, as for ARM Cortex-M.

I'll look at the gcc.dg/tree-ssa/ssa-dom-thread-4.c failure separately.

Tested on mips64-linux-gnu and applied.

Thanks,
Richard


gcc/testsuite/
* gcc.dg/tree-ssa/reassoc-32.c, gcc.dg/tree-ssa/reassoc-33.c,
gcc.dg/tree-ssa/reassoc-34.c, gcc.dg/tree-ssa/reassoc-35.c,
gcc.dg/tree-ssa/reassoc-36.c: Extend -mbranch-cost handling to MIPS.
* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-1.c,
gcc.dg/tree-ssa/ssa-ifcombine-ccmp-4.c,
gcc.dg/tree-ssa/ssa-ifcombine-ccmp-5.c,
gcc.dg/tree-ssa/ssa-ifcombine-ccmp-6.c,
gcc.dg/tree-ssa/vrp87.c, gcc.dg/tree-ssa/forwprop-28.c: Skip for MIPS.

Index: gcc/testsuite/gcc.dg/tree-ssa/reassoc-32.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/reassoc-32.c  2014-01-08 22:11:48.552943720 
+
+++ gcc/testsuite/gcc.dg/tree-ssa/reassoc-32.c  2014-01-08 22:11:50.069956983 
+
@@ -1,7 +1,7 @@
 /* { dg-do run { target { ! "m68k*-*-* mmix*-*-* mep*-*-* bfin*-*-* v850*-*-* 
picochip*-*-* moxie*-*-* cris*-*-* m32c*-*-* fr30*-*-* mcore*-*-* powerpc*-*-* 
xtensa*-*-*"} } } */
 
 /* { dg-options "-O2 -fno-inline -fdump-tree-reassoc1-details" } */
-/* { dg-additional-options "-mbranch-cost=2" { target avr-*-* } } */
+/* { dg-additional-options "-mbranch-cost=2" { target mips*-*-* avr-*-* } } */
 
 
 int test (int a, int b, int c)
Index: gcc/testsuite/gcc.dg/tree-ssa/reassoc-33.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/reassoc-33.c  2014-01-08 22:11:48.553943729 
+
+++ gcc/testsuite/gcc.dg/tree-ssa/reassoc-33.c  2014-01-08 22:11:50.070956992 
+
@@ -1,7 +1,7 @@
 /* { dg-do run { target { ! "m68k*-*-* mmix*-*-* mep*-*-* bfin*-*-* v850*-*-* 
picochip*-*-* moxie*-*-* cris*-*-* m32c*-*-* fr30*-*-* mcore*-*-* powerpc*-*-* 
xtensa*-*-* hppa*-*-*"} } } */
 
 /* { dg-options "-O2 -fno-inline -fdump-tree-reassoc1-details" } */
-/* { dg-additional-options "-mbranch-cost=2" { target avr-*-* } } */
+/* { dg-additional-options "-mbranch-cost=2" { target mips*-*-* avr-*-* } } */
 
 int test (int a, int b, int c)
 {
Index: gcc/testsuite/gcc.dg/tree-ssa/reassoc-34.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/reassoc-34.c  2014-01-08 22:11:48.552943720 
+
+++ gcc/testsuite/gcc.dg/tree-ssa/reassoc-34.c  2014-01-08 22:11:50.070956992 
+
@@ -1,7 +1,7 @@
 /* { dg-do run { target { ! "m68k*-*-* mmix*-*-* mep*-*-* bfin*-*-* v850*-*-* 
picochip*-*-* moxie*-*-* cris*-*-* m32c*-*-* fr30*-*-* mcore*-*-* powerpc*-*-* 
xtensa*-*-* hppa*-*-*"} } } */
 
 /* { dg-options "-O2 -fno-inline -fdump-tree-reassoc1-details" } */
-/* { dg-additional-options "-mbranch-cost=2" { target avr-*-* } } */
+/* { dg-additional-options "-mbranch-cost=2" { target mips*-*-* avr-*-* } } */
 
 int test (int a, int b, int c)
 {
Index: gcc/testsuite/gcc.dg/tree-ssa/reassoc-35.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/reassoc-35.c  2014-01-08 22:11:48.553943729 
+
+++ gcc/testsuite/gcc.dg/tree-ssa/reassoc-35.c  2014-01-08 22:11:50.070956992 
+
@@ -1,7 +1,7 @@
 /* { dg-do run { target { ! "m68k*-*-* mmix*-*-* mep*-*-* bfin*-*-* v850*-*-* 
picochip*-*-* moxie*-*-* cris*-*-* m32c*-*-* fr30*-*-* mcore*-*-* powerpc*-*-* 
xtensa*-*-* hppa*-*-*"} } } */
 
 /* { dg-options "-O2 -fno-inline -fdump-tree-reassoc1-details" } */
-/* { dg-additional-options "-mbranch-cost=2" { target avr-*-* } } */
+/* { dg-additional-options "-mbranch-cost=2" { target mips*-*-* avr-*-* } } */
 
 int test (unsigned int a, int b, int c)
 {
Index: gcc/testsuite/gcc.dg/tree-ssa/reassoc-36.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/reassoc-36.c  2014-01-08 22:11:48.553943729 
+
+++ gcc/testsuite/gcc.dg/tree-ssa/reassoc-36.c  2014-01-08 22:11:50.070956992 
+
@@ -1,7 +1,7 @@
 /* { dg-do run { target { ! "m68k*-*-* mmix*-*-* mep*-*-* bfin*-*-* v850*-*-* 
picochip*-*-* moxie*-*-* cris*-*-* m32c*-*-* fr30*-*-* mcore*-*-* powerpc*-*-* 
xtensa*-*-* hppa*-*-*"} } } */
 
 /* { dg-options "-O2 -fno-inline -fdump-tree-reassoc1-details" } */
-/* { dg-additional-options "-mbranch-cost=2" { target avr-*-* } } */
+/* { dg-additional-options "-mbranch-cost=2" { target mips*-*-* avr-*-* } } */
 
 int test (int a, int b, int c)
 {
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-ccmp-1.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-ccmp-1.c2014-01-08 
22:11:48.552943720 +
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-ccmp-1.c2014-01-08 
22:11:50.070956992 +
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { ! "m68k*-*-* mmix*-*-* mep*-*-* bfin*-*-* 
v850*-*-* picoch

Re: [Patch] Regex bracket matcher cache optimization

2014-01-08 Thread Tim Shen

On Wed, Jan 8, 2014 at 5:20 AM, Paolo Carlini  wrote:
> On 01/08/2014 10:24 AM, Jonathan Wakely wrote:
>> Ouch! Yes, that's quite a bit slower, and this code is already very
>> slow to compile.

With this patch (which is based on a-fixed.diff, committed earlier),
which uses templated member functions instead of templating the whole
_Compiler, the time consumption is:
g++ -g -Wall -std=c++11 -g -Wall -std=c++11 -O3 regextest.cc  3.79s
user 0.14s system 98% cpu 3.981 total

Compared to 4.5s, it's better and probably fine.

Bootstrapped and tested with -m32 and -m64 respectively.

> I only want to add that, besides keeping compile-time under control for
> 4.9.0 - please investigate a bit more along the mentioned lines - we should
> also start experimenting with exporting the instantiations. I don't know
> what the other implementations are doing, but in general it definitely makes
> sense, for compile-time performance too. I think we already said that some
> time ago, but the issue seems more important now. Maybe it's really
> unavoidable if we need template complexity for first class run-time
> performance.

After this patch I plan to instantiate _Compiler and _Executor.


-- 
Regards,
Tim Shen

Re: [RFC] libgcov.c re-factoring and offline profile-tool

2014-01-08 Thread Rong Xu

Here is the patch that addresses Honza's concern about bss increment.
It just makes this_prg a local variable.

Some comments are inlined.

On Fri, Dec 6, 2013 at 6:23 AM, Jan Hubicka  wrote:

> 
> Do you know how the size of libgcov changed with your patch?
> Quick check of current mainline on compiling empty main gives:
>
> jh@gcc10:~/trunk/build/gcc$ cat t.c
> main()
> {
> }
> jh@gcc10:~/trunk/build/gcc$ ./xgcc -B ./ -O2 -fprofile-generate -o a.out-new 
> --static t.c
> jh@gcc10:~/trunk/build/gcc$ gcc -O2 -fprofile-generate -o a.out-old --static 
> t.c
> jh@gcc10:~/trunk/build/gcc$ size a.out-old
>    text    data     bss     dec     hex filename
>  608141    3560   16728  628429   996cd a.out-old
> jh@gcc10:~/trunk/build/gcc$ size a.out-new
>    text    data     bss     dec     hex filename
>  612621    3688   22880  639189   9c0d5 a.out-new
>
> Without profiling I get:
> jh@gcc10:~/trunk/build/gcc$ size a.out-new-no
> jh@gcc10:~/trunk/build/gcc$ size a.out-old-no
>    text    data     bss     dec     hex filename
>  599719    3448   12568  615735   96537 a.out-old-no
>    text    data     bss     dec     hex filename
>  600247    3448   12568  616263   96747 a.out-new-no
>
> Quite big for empty program, but mostly glibc fault, I suppose
> (that won't be an issue for embedded platforms). But anyway
> we increased text size overhead from 8k to 12k, BSS size
> overhead from 4k to 10k and data by another 1k.
>

I think it would be fairer to compare r204729 and r204730. Your
comparison included some other changes in libgcov, such as time_profiler
and indirect_call_profiler_v2.

Using the same empty t.c, for r204729, we have
xur2%208:gcc >> ./xgcc -B ./ -O2 -fprofile-generate --static -o
a.out-r204729 t.c
xur2%209:gcc >> size a.out-r204729
   text    data     bss     dec     hex filename
 803207    6352   15448  825007   c96af a.out-r204729
xur2%210:gcc >> ./xgcc -B ./ -O2 --static -o a.out-r204729-no t.c
xur2%211:gcc >> size a.out-r204729-no
   text    data     bss     dec     hex filename
 790337    6112   11336  807785   c5369 a.out-r204729-no

For r204730, we have
xur2%216:gcc >> ./xgcc -B ./ -O2 -fprofile-generate --static -o
a.out-r204730 t.c
xur2%217:gcc >> size a.out-r204730
   text    data     bss     dec     hex filename
 802919    6384   21592  830895   cadaf a.out-r204730
xur2%218:gcc >> ./xgcc -B ./ -O2  --static -o a.out-r204730-no t.c
xur2%219:gcc >> size a.out-r204730-no
   text    data     bss     dec     hex filename
 790337    6112   11336  807785   c5369 a.out-r204730-no

r204730 actually has a smaller text size with -fprofile-generate (data
grows by 32 bytes). You are right that there is 6kb more bss space due to
the static variables introduced. It is mostly caused by the this_prg object.

With the attached trunk patch that localizes this_prg, we have
xur2%42:fdo >> size a.out-new
   text    data     bss     dec     hex filename
 803479    6456   15512  825447   c9867 a.out-new
xur2%43:fdo >> size a.out-new-no
   text    data     bss     dec     hex filename
 790545    6112   11368  808025   c5459 a.out-new-no

With -m64 we are now using only 64 more bytes of bss.
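
The bss figures above can be cross-checked with a few lines of C. This is
only a sanity-check sketch; size_row, row_dec and bss_growth are
hypothetical helper names, and the numbers are copied from the size output
quoted in this message.

```c
/* One row of `size` output (the dec column is the sum of the three).  */
struct size_row
{
  long text, data, bss;
};

/* Recompute the dec column of a row.  */
long
row_dec (struct size_row r)
{
  return r.text + r.data + r.bss;
}

/* bss growth going from BEFORE to AFTER.  */
long
bss_growth (struct size_row before, struct size_row after)
{
  return after.bss - before.bss;
}

/* -fprofile-generate --static builds of the empty t.c.  */
const struct size_row toy_r204729 = { 803207, 6352, 15448 };
const struct size_row toy_r204730 = { 802919, 6384, 21592 };
const struct size_row toy_localized = { 803479, 6456, 15512 };
```

Going from r204729 to r204730 the bss grows by 6144 bytes, while the build
with this_prg localized grows by 64 bytes relative to r204729.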

Objects size for r204730:
   text    data     bss     dec     hex filename
     57       0       0      57      39 _gcov_average_profiler.o
     66       0       0      66      42 _gcov_dump.o
    516       0       0     516     204 _gcov_execle.o
    476       0       0     476     1dc _gcov_execl.o
    476       0       0     476     1dc _gcov_execlp.o
    108       0       0     108      6c _gcov_execve.o
     98       0       0      98      62 _gcov_execv.o
     98       0       0      98      62 _gcov_execvp.o
    126       0      40     166      a6 _gcov_flush.o
    101       0       0     101      65 _gcov_fork.o
    122       0       0     122      7a _gcov_indirect_call_profiler.o
    178       0      16     194      c2 _gcov_indirect_call_profiler_v2.o
     89       0       0      89      59 _gcov_interval_profiler.o
     52       0       0      52      34 _gcov_ior_profiler.o
    126       0       0     126      7e _gcov_merge_add.o
    242       0       0     242      f2 _gcov_merge_delta.o
    126       0       0     126      7e _gcov_merge_ior.o
    251       0       0     251      fb _gcov_merge_single.o
    156       0       0     156      9c _gcov_merge_time_profile.o
   9252       0    6144   15396    3c24 _gcov.o
    115       0       0     115      73 _gcov_one_value_profiler.o
     69       0       0      69      45 _gcov_pow2_profiler.o
     66       0       0      66      42 _gcov_reset.o
     77       0       8      85      55 _gcov_time_profiler.o

Objects size for r204729:
   text    data     bss     dec     hex filename
     57       0       0      57      39 _gcov_average_profiler.o
     72       0       0      72      48 _gcov_dump.o
    516       0       0     516     204 _gcov_execle.o
    476       0       0     476     1dc _gcov_execl.o
    476       0       0     476     1dc _gcov_execlp.o
    108       0       0     108      6c _gcov_execve.o
     98       0       0      98      62 _gcov_execv.o
     98       0       0      98      62 _gcov_execvp.o
    101       0       0     101      65 _gcov_fork.o
122  0

Re: [patch][i386] Remove code executed only if reload_in_progress (i.e. never)

2014-01-08 Thread Jakub Jelinek

On Wed, Jan 08, 2014 at 10:51:53PM +0100, Steven Bosscher wrote:
> Hello Uros, and everyone else,
> 
> Now that LRA is always used for the i386 targets, reload_in_progress
> is never set so all code conditional on it is now dead. The attached
> patch removes this code.
> 
> Sadly I'm having difficulty testing the patch because I have no access
> to a suitable x86_64 or ix86 box :-) I'll try to test the patch on a
> compile farm machine, but I'm already posting the patch to hear if
> this is still OK for this late stage of the development cycle. It's
> not as if we're going to go back to reload so the code really is dead
> AFAICT, but it's obviously not a bug fix.

While LRA is always on, making it harder to test with reload doesn't seem
like a good idea to me for 4.9.  When some RA issue is reported for these
architectures, one often just patches config/i386/i386.c by hand to enable
reload instead of LRA and tests with that.  With this patch applied, we'd
need to keep a patchset around for that purpose.

>   * i386/i386.c (legitimize_pic_address): Remove never-executed code,
>   reload_in_progress is never set if LRA is used.
>   (legitimize_tls_address): Likewise.
>   (ix86_expand_move): Likewise.
>   (ix86_expand_binary_operator): Likewise.
>   (ix86_expand_unary_operator): Likewise.
>   * i386/predicates.md (index_register_operand): Likewise.

config/ prefix would be needed in the ChangeLog entries.

Jakub

Re: [PATCH,rs6000] Add -maltivec={le,be} options

2014-01-08 Thread Bill Schmidt

On Wed, 2014-01-08 at 16:46 -0500, David Edelsohn wrote:
> On Tue, Jan 7, 2014 at 6:59 PM, Bill Schmidt
>  wrote:
> > On Tue, 2014-01-07 at 22:18 +, Joseph S. Myers wrote:
> >> On Tue, 7 Jan 2014, Bill Schmidt wrote:
> >>
> >> > Yes, sorry for not being more clear.  This is indeed for interpretation
> >> > of element numbers in Altivec intrinsics such as vec_splat, vec_extract,
> >> > vec_insert, and so forth.  By default these will match array element
> >> > order for the target endianness.  But with -maltivec=be for a little
> >> > endian target, we will force use of big-endian element order (matching
> >> > the behavior of the underlying hardware instructions).
> >>
> >> Thanks for the explanation.  I think you should make the .texi
> >> documentation say something more like this.
> >>
> >
> > Sure, I can wordsmith something along those lines.  Thanks for the
> > feedback!
> 
> This patch is okay with the documentation clarification requested by Joseph.
> 
> I also would suggest removing "but may be enabled in the future" from
> the "le" option and limit the comment to ignored on big-endian
> targets.
> 
> Also, please add a comment to -maltivec that it defaults to the native
> endian order.  And for -maltivec=be, please state that this is the
> default for big-endian; for -maltivec=le, please state that this is
> the default for little-endian. It's important to be clear and
> redundant in this type of documentation.
> 
> Thanks, David
> 

OK, thanks very much for the review.  I'll clean up the documentation as
requested this evening.

Thanks,
Bill

[patch][i386] Remove code executed only if reload_in_progress (i.e. never)

2014-01-08 Thread Steven Bosscher

Hello Uros, and everyone else,

Now that LRA is always used for the i386 targets, reload_in_progress
is never set so all code conditional on it is now dead. The attached
patch removes this code.

Sadly I'm having difficulty testing the patch because I have no access
to a suitable x86_64 or ix86 box :-) I'll try to test the patch on a
compile farm machine, but I'm already posting the patch to hear if
this is still OK for this late stage of the development cycle. It's
not as if we're going to go back to reload so the code really is dead
AFAICT, but it's obviously not a bug fix.

Ciao!
Steven

* i386/i386.c (legitimize_pic_address): Remove never-executed code,
reload_in_progress is never set if LRA is used.
(legitimize_tls_address): Likewise.
(ix86_expand_move): Likewise.
(ix86_expand_binary_operator): Likewise.
(ix86_expand_unary_operator): Likewise.
* i386/predicates.md (index_register_operand): Likewise.

Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 206444)
+++ config/i386/i386.c  (working copy)
@@ -13013,11 +13013,7 @@ legitimize_pic_address (rtx orig, rtx reg)
   && ix86_cmodel != CM_SMALL_PIC && gotoff_operand (addr, Pmode))
 {
   rtx tmpreg;
-  /* This symbol may be referenced via a displacement from the PIC
-base address (@GOTOFF).  */
 
-  if (reload_in_progress)
-   df_set_regs_ever_live (PIC_OFFSET_TABLE_REGNUM, true);
   if (GET_CODE (addr) == CONST)
addr = XEXP (addr, 0);
   if (GET_CODE (addr) == PLUS)
@@ -13046,11 +13042,6 @@ legitimize_pic_address (rtx orig, rtx reg)
 }
   else if (!TARGET_64BIT && !TARGET_PECOFF && gotoff_operand (addr, Pmode))
 {
-  /* This symbol may be referenced via a displacement from the PIC
-base address (@GOTOFF).  */
-
-  if (reload_in_progress)
-   df_set_regs_ever_live (PIC_OFFSET_TABLE_REGNUM, true);
   if (GET_CODE (addr) == CONST)
addr = XEXP (addr, 0);
   if (GET_CODE (addr) == PLUS)
@@ -13108,11 +13099,6 @@ legitimize_pic_address (rtx orig, rtx reg)
}
   else
{
- /* This symbol must be referenced via a load from the
-Global Offset Table (@GOT).  */
-
- if (reload_in_progress)
-   df_set_regs_ever_live (PIC_OFFSET_TABLE_REGNUM, true);
  new_rtx = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, addr), UNSPEC_GOT);
  new_rtx = gen_rtx_CONST (Pmode, new_rtx);
  if (TARGET_64BIT)
@@ -13164,8 +13150,6 @@ legitimize_pic_address (rtx orig, rtx reg)
{
  if (!TARGET_64BIT)
{
- if (reload_in_progress)
-   df_set_regs_ever_live (PIC_OFFSET_TABLE_REGNUM, true);
  new_rtx = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, op0),
UNSPEC_GOTOFF);
  new_rtx = gen_rtx_PLUS (Pmode, new_rtx, op1);
@@ -13453,8 +13437,6 @@ legitimize_tls_address (rtx x, enum tls_model mode
}
   else if (flag_pic)
{
- if (reload_in_progress)
-   df_set_regs_ever_live (PIC_OFFSET_TABLE_REGNUM, true);
  pic = pic_offset_table_rtx;
  type = TARGET_ANY_GNU_TLS ? UNSPEC_GOTNTPOFF : UNSPEC_GOTTPOFF;
}
@@ -16644,10 +16626,8 @@ ix86_expand_move (enum machine_mode mode, rtx oper
  /* dynamic-no-pic */
  if (MACHOPIC_INDIRECT)
{
- rtx temp = ((reload_in_progress
-  || ((op0 && REG_P (op0))
-  && mode == Pmode))
- ? op0 : gen_reg_rtx (Pmode));
+ rtx temp = (op0 && REG_P (op0) && mode == Pmode)
+ ? op0 : gen_reg_rtx (Pmode);
  op1 = machopic_indirect_data_reference (op1, temp);
  if (MACHOPIC_PURE)
op1 = machopic_legitimize_pic_address (op1, mode,
@@ -17318,16 +17298,9 @@ ix86_expand_binary_operator (enum rtx_code code, e
  /* Emit the instruction.  */
 
   op = gen_rtx_SET (VOIDmode, dst, gen_rtx_fmt_ee (code, mode, src1, src2));
-  if (reload_in_progress)
-{
-  /* Reload doesn't know about the flags register, and doesn't know that
- it doesn't want to clobber it.  We can only do this with PLUS.  */
-  gcc_assert (code == PLUS);
-  emit_insn (op);
-}
-  else if (reload_completed
-  && code == PLUS
-  && !rtx_equal_p (dst, src1))
+  if (reload_completed
+  && code == PLUS
+  && !rtx_equal_p (dst, src1))
 {
   /* This is going to be an LEA; avoid splitting it later.  */
   emit_insn (op);
@@ -17494,13 +17467,8 @@ ix86_expand_unary_operator (enum rtx_code code, en
   /* Emit the instruction.  */
 
   op = gen_rtx_SET (VOIDmode, dst, gen_rtx_fmt_e (code, mode, src));
-  if (reload_in_progress || code == NOT)
-{
-  /* Reload doesn't know about the flags register, and

Re: [PATCH,rs6000] Add -maltivec={le,be} options

2014-01-08 Thread David Edelsohn

On Tue, Jan 7, 2014 at 6:59 PM, Bill Schmidt
 wrote:
> On Tue, 2014-01-07 at 22:18 +, Joseph S. Myers wrote:
>> On Tue, 7 Jan 2014, Bill Schmidt wrote:
>>
>> > Yes, sorry for not being more clear.  This is indeed for interpretation
>> > of element numbers in Altivec intrinsics such as vec_splat, vec_extract,
>> > vec_insert, and so forth.  By default these will match array element
>> > order for the target endianness.  But with -maltivec=be for a little
>> > endian target, we will force use of big-endian element order (matching
>> > the behavior of the underlying hardware instructions).
>>
>> Thanks for the explanation.  I think you should make the .texi
>> documentation say something more like this.
>>
>
> Sure, I can wordsmith something along those lines.  Thanks for the
> feedback!

This patch is okay with the documentation clarification requested by Joseph.

I also would suggest removing "but may be enabled in the future" from
the "le" option and limit the comment to ignored on big-endian
targets.

Also, please add a comment to -maltivec that it defaults to the native
endian order.  And for -maltivec=be, please state that this is the
default for big-endian; for -maltivec=le, please state that this is
the default for little-endian. It's important to be clear and
redundant in this type of documentation.
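
As a rough illustration of the element-numbering rule being documented, here is a portable C model (not AltiVec code). The function name and its parameters are made up for this sketch, and it assumes that big-endian element order on a little-endian target simply counts elements from the opposite end of the array. It also assumes a request for the non-native order is ignored on big-endian targets.

```c
#include <assert.h>

/* Hypothetical model; not part of GCC or <altivec.h>.
   ELT is the element number passed to an intrinsic such as vec_extract.
   N is the number of elements in the vector.
   TARGET_LE is nonzero for a little-endian target.
   BE_ORDER is nonzero when big-endian element order is requested
   (the -maltivec=be case).
   Returns the index of that element when the vector is viewed as an
   array in memory on the target.  */
int
array_index_for_element (int elt, int n, int target_le, int be_order)
{
  assert (elt >= 0 && elt < n);

  /* Assumption: big-endian numbering on a little-endian target counts
     from the other end of the array.  */
  if (target_le && be_order)
    return n - 1 - elt;

  /* Native order: element numbers match array indices.  */
  return elt;
}
```

Under these assumptions, element 0 of a four-element vector refers to array index 0 by default and to array index 3 with -maltivec=be on a little-endian target.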

Thanks, David

Re: [PATCH,rs6000,committed] Remove duplicates from altivec_overloaded_builtins

2014-01-08 Thread David Edelsohn

On Wed, Jan 8, 2014 at 3:15 PM, Bill Schmidt
 wrote:
> This patch removes a couple of redundant entries I noticed in
> altivec_overloaded_builtins.  Identical entries occur nearby.
>
> Bootstrapped and tested on powerpc64-unknown-linux-gnu with no
> regressions, applied as obvious.
>
> Thanks,
> Bill
>
>
> 2014-01-08  Bill Schmidt  
>
> * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Remove
> two duplicate entries.

Okay, good catch.

Thanks, David

Re: microMIPS jump instructions

2014-01-08 Thread Richard Sandiford

"Moore, Catherine"  writes:
> 2014-01-08  Catherine Moore  
> 
>   gcc/testsuite/
>   * gcc.target/mips/umips-branch-3.c: New test.
>   * gcc.target/mips/umips-branch-4.c: New test.
> 
>   gcc/
>   * config/mips/mips.md (simple_return): Attempt to use JRC for microMIPS.
>   * config/mips/mips.h (MIPS_CALL): Attempt to use JALS for microMIPS.

OK, thanks, but:

> Index: gcc/config/mips/mips.md
> ===
> --- gcc/config/mips/mips.md   (revision 206407)
> +++ gcc/config/mips/mips.md   (working copy)
> @@ -1,5 +1,5 @@
>  ;;  Mips.md   Machine Description for MIPS based processors
> -;;  Copyright (C) 1989-2014 Free Software Foundation, Inc.
> +;;  Copyright (C) 1989-2013 Free Software Foundation, Inc.
>  ;;  Contributed by   A. Lichnewsky, l...@inria.inria.fr
>  ;;  Changes by   Michael Meissner, meiss...@osf.org
>  ;;  64-bit r4000 support by Ian Lance Taylor, i...@cygnus.com, and

please drop this bit.

Richard

[PATCH] Fix for PR 59524

2014-01-08 Thread Iyer, Balaji V

Hello Everyone,
Attached, please find a patch that will fix the bug mentioned in PR 59524. 
The main issue was that Cilk keywords tests are running even when the user 
configured the compiler with --disable-libcilkrts. This patch should fix this 
issue for C and C++. This is tested on x86 and x86_64.

Here are the ChangeLog entries

gcc/testsuite/ChangeLog
+2014-01-08  Balaji V. Iyer  
+
+   PR testsuite/59524
+   * gcc.dg/cilk-plus/cilk-plus.exp: Make sure the cilk keywords tests
+   are run only if the Cilk library is available/enabled.
+   * g++.dg/cilk-plus/cilk-plus.exp: Likewise.
+   * lib/target-supports.exp (check_libcilkrts_available): New function.
+

Is this Ok for trunk?

Thanks,

Balaji V. Iyer.
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 519d472..e0a0e43 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,11 @@
+2014-01-08  Balaji V. Iyer  
+
+   PR testsuite/59524
+   * gcc.dg/cilk-plus/cilk-plus.exp: Make sure the cilk keywords tests
+   are run only if the Cilk library is available/enabled.
+   * g++.dg/cilk-plus/cilk-plus.exp: Likewise.
+   * lib/target-supports.exp (check_libcilkrts_available): New function.
+
 2014-01-07  Yufeng Zhang  
 
* gcc.target/arm/neon/vst1Q_laneu64-1.c: New test.
diff --git a/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp 
b/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp
index e201fd2..b08be25 100644
--- a/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp
+++ b/gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp
@@ -47,9 +47,7 @@ dg-runtest [lsort [glob -nocomplain 
$srcdir/c-c++-common/cilk-plus/AN/*.c]] " -g
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " 
-g -O2 -ftree-vectorize -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " 
-g -O3 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " 
-O3 -ftree-vectorize -fcilkplus -g" " "
-dg-finish
 
-dg-init
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/AN/*.cc]] " 
-fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/AN/*.cc]] " -O0 
-fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/AN/*.cc]] " -O1 
-fcilkplus" " "
@@ -61,25 +59,17 @@ dg-runtest [lsort [glob -nocomplain 
$srcdir/g++.dg/cilk-plus/AN/*.cc]] " -g -O1
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/AN/*.cc]] " -g 
-O2 -ftree-vectorize -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/AN/*.cc]] " -g 
-O3 -fcilkplus" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/AN/*.cc]] " -O3 
-ftree-vectorize -fcilkplus -g" " "
-dg-finish
 
-dg-init
-dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " 
-fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -O1 
-fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -O2 
-fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -O3 
-fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -g 
-fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -g 
-O2 -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " -g 
-O3 -fcilkplus" " "
-dg-finish
+if { [check_libcilkrts_available] } {
+dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " 
-O1 -fcilkplus" " "
+dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " 
-O3 -fcilkplus" " "
+dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " 
-g -fcilkplus" " "
+dg-runtest [lsort [glob -nocomplain $srcdir/g++.dg/cilk-plus/CK/*.cc]] " 
-g -O2 -fcilkplus" " "
 
-dg-init
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " 
-fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " 
-O1 -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " 
-O2 -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " 
-O3 -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " 
-g -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " 
-g -O2 -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " 
-g -O3 -fcilkplus" " "
+dg-runtest [lsort [glob -nocomplain 
$srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O1 -fcilkplus" " "
+dg-runtest [lsort [glob -nocomplain 
$srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O3 -fcilkplus" " "
+dg-runtest [lsort [glob -nocomplain 
$srcdir/c-c++-common/cilk-plus/CK/*.c]] " -g -fcilkplus" " "
+dg-runtest [lsort [glob -nocomplain 
$srcdir/c-c++-common/cilk-plus/CK/*.c]] " -g -O2 -fcilk

microMIPS jump instructions

2014-01-08 Thread Moore, Catherine

Hi Richard,

It looks like the microMIPS implementation is missing support for the JRC 
instruction and also misses an opportunity to generate JALS.
I've attached a patch, plus some new test cases to correct this.  Does this 
look okay to commit?  I'd like to get it in 4.9.

Thanks,
Catherine



Attachment: jrc-jals.cl
Attachment: jrc-jals.patch

Re: [PATCH, AArch64 4/6] soft-fp: Commonize creation of TImode types

2014-01-08 Thread Joseph S. Myers

On Wed, 8 Jan 2014, Richard Henderson wrote:

> diff --git a/libgcc/soft-fp/soft-fp.h b/libgcc/soft-fp/soft-fp.h
> index 696fc86..b54b1ed 100644
> --- a/libgcc/soft-fp/soft-fp.h
> +++ b/libgcc/soft-fp/soft-fp.h
> @@ -237,6 +237,11 @@ typedef int DItype __attribute__ ((mode (DI)));
>  typedef unsigned int UQItype __attribute__ ((mode (QI)));
>  typedef unsigned int USItype __attribute__ ((mode (SI)));
>  typedef unsigned int UDItype __attribute__ ((mode (DI)));
> +#if _FP_W_TYPE_SIZE == 64
> +typedef int TItype __attribute__ ((mode (TI)));
> +typedef unsigned int UTItype __attribute__ ((mode (TI)));
> +#endif

This isn't the right conditional.  _FP_W_TYPE_SIZE is ultimately an 
optimization choice and need not be related to whether any TImode 
functions are being defined using soft-fp, or whether TImode is supported 
at all.  I think the most you can do is have sfp-machine.h define a macro 
to say that TImode should be supported in soft-fp, rather than actually 
defining the types itself.

(If someone were to use soft-fp on hppa64, then they might well use 
_FP_W_TYPE_SIZE == 64, but hppa64 doesn't support TImode.)
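
A minimal sketch of that suggestion, assuming a made-up macro name (_FP_TI_TYPES_WANTED is hypothetical and is not something soft-fp defines). The target header states that TImode types are wanted, and soft-fp.h defines the types only on that basis, independently of _FP_W_TYPE_SIZE. Here __SIZEOF_INT128__ stands in for "this port supports TImode", and ti_type_size is only a demonstration helper.

```c
/* Sketch only.  _FP_TI_TYPES_WANTED is a hypothetical macro name.  */

/* Part 1: what a port's sfp-machine.h might contain.  */
#define _FP_W_TYPE_SIZE 64	/* An optimization choice, nothing more.  */
#ifdef __SIZEOF_INT128__	/* Stand-in for "this port has TImode".  */
# define _FP_TI_TYPES_WANTED 1
#endif

/* Part 2: what soft-fp.h might do with that request.  */
#ifdef _FP_TI_TYPES_WANTED
typedef int TItype __attribute__ ((mode (TI)));
typedef unsigned int UTItype __attribute__ ((mode (TI)));
#endif

/* Demonstration helper: size in bytes of TItype, or 0 when the port
   did not ask for the TImode types.  */
unsigned int
ti_type_size (void)
{
#ifdef _FP_TI_TYPES_WANTED
  return (unsigned int) sizeof (TItype);
#else
  return 0;
#endif
}
```

With this split, a port like the hppa64 example could keep _FP_W_TYPE_SIZE == 64 and simply not define the request macro.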

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH, AArch64 5/6] soft-fp: Define UDWtype for longlong.h

2014-01-08 Thread Joseph S. Myers

soft-fp patches should go first to glibc.

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH,rs6000,committed] Remove duplicates from altivec_overloaded_builtins

2014-01-08 Thread Bill Schmidt

This patch removes a couple of redundant entries I noticed in
altivec_overloaded_builtins.  Identical entries occur nearby.

Bootstrapped and tested on powerpc64-unknown-linux-gnu with no
regressions, applied as obvious.

Thanks,
Bill


2014-01-08  Bill Schmidt  

* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Remove
two duplicate entries.


Index: gcc/config/rs6000/rs6000-c.c
===
--- gcc/config/rs6000/rs6000-c.c(revision 206375)
+++ gcc/config/rs6000/rs6000-c.c(working copy)
@@ -608,10 +608,6 @@ const struct altivec_builtin_types altivec_overloa
 RS6000_BTI_V4SI, RS6000_BTI_V8HI, 0, 0 },
   { ALTIVEC_BUILTIN_VEC_VUPKHSH, ALTIVEC_BUILTIN_VUPKHSH,
 RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V8HI, 0, 0 },
-  { ALTIVEC_BUILTIN_VEC_UNPACKH, P8V_BUILTIN_VUPKHSW,
-RS6000_BTI_V2DI, RS6000_BTI_V4SI, 0, 0 },
-  { ALTIVEC_BUILTIN_VEC_UNPACKH, P8V_BUILTIN_VUPKHSW,
-RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V4SI, 0, 0 },
   { ALTIVEC_BUILTIN_VEC_VUPKHSH, P8V_BUILTIN_VUPKHSW,
 RS6000_BTI_V2DI, RS6000_BTI_V4SI, 0, 0 },
   { ALTIVEC_BUILTIN_VEC_VUPKHSH, P8V_BUILTIN_VUPKHSW,

PING: PATCH: PRs bootstrap/59580/59583: Improve x86 --with-arch/--with-cpu= configure handling

2014-01-08 Thread H.J. Lu

On Mon, Dec 23, 2013 at 6:14 AM, H.J. Lu  wrote:
> On Sun, Dec 22, 2013 at 11:11:12PM +0100, Uros Bizjak wrote:
>
>> Please get someone to review config.gcc changes. They are OK as far as
>> x86 rename is concerned, but I can't review functional changes.
>
> Hi Paolo,
>
> Can you review this config.gcc change?
>
>>
>> > @@ -588,6 +588,22 @@ esac
>> >  # Common C libraries.
>> >  tm_defines="$tm_defines LIBC_GLIBC=1 LIBC_UCLIBC=2 LIBC_BIONIC=3"
>> >
>> > +# 32-bit x86 processors supported by --with-arch=.  Each processor
>> > +# MUST be separated by exactly one space.
>> > +x86_archs="athlon athlon-4 athlon-fx athlon-mp athlon-tbird \
>> > +athlon-xp k6 k6-2 k6-3 geode c3 c3-2 winchip-c6 winchip2 i386 i486 \
>> > +i586 i686 pentium pentium-m pentium-mmx pentium2 pentium3 pentium3m \
>> > +pentium4 pentium4m pentiumpro prescott"
>>
>> Missing "native".
>
> x86_archs contains 32-bit x86 processors.  "native" is allowed for
> 64-bit targets and is included in x86_64_archs.  64-bit processors
> can be used in --with-arch/--with-cpu= for 32-bit targets.
>
> Here is a patch to improve x86 --with-arch/--with-cpu= configure
> handling.  This patch defines 3 variables:
>
> 1. x86_archs: It contains 32-bit x86 processors supported by
> --with-arch=, which aren't allowed for 64-bit targets.
> 2. x86_64_archs: It contains 64-bit x86 processors supported by
> --with-arch=, which are allowed for both 32-bit and 64-bit targets.
> 3. x86_cpus.  It contains x86 processors supported by --with-cpu=,
> which are allowed for both 32-bit and 64-bit targets.
>
> Each processor in those 3 variables is separated by exactly one space.
>
> Instead of checking if a value of --with-arch/--with-cpu= is valid in many
> different places with
>
> case ${val} in
> valid pattern list)
>   OK
>   ;;
> *)
>   error
>   exit 1
>   ;;
> esac
>
> and updating all pattern lists when adding a new processor, this patch
> uses
>
> case " valid processor list separated by exactly one space " in
> *" ${val} "*)
>   OK
>   ;;
> *)
>   error
>   exit 1
>   ;;
> esac
>
> "valid processor list separated by exactly one space" is combination
> of 3 processor variables.  It only needs a separate check for an empty
> value with
>
> if test x${val} != x; then
>   $val isn't empty
> else
>   $val is empty
> fi
>
> With this approach, we only need to add new 32-bit processors to x86_archs
> and new 64-bit processors to x86_64_archs.  They will be supported by
> --with-arch/--with-cpu= automatically.  OK to install?
>
> Thanks.
>
>
> H.J.
> ---
> 2013-12-23   H.J. Lu  
>
> PR bootstrap/59580
> PR bootstrap/59583
> * config.gcc (x86_archs): New variable.
> (x86_64_archs): Likewise.
> (x86_cpus): Likewise.
> Use $x86_archs, $x86_64_archs and $x86_cpus to check valid
> --with-arch/--with-cpu= options.
> Support --with-arch=/--with-cpu={nehalem,westmere,
> sandybridge,ivybridge,haswell,broadwell,bonnell,silvermont}.
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 24dbaf9..51eb2b1 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -588,6 +588,22 @@ esac
>  # Common C libraries.
>  tm_defines="$tm_defines LIBC_GLIBC=1 LIBC_UCLIBC=2 LIBC_BIONIC=3"
>
> +# 32-bit x86 processors supported by --with-arch=.  Each processor
> +# MUST be separated by exactly one space.
> +x86_archs="athlon athlon-4 athlon-fx athlon-mp athlon-tbird \
> +athlon-xp k6 k6-2 k6-3 geode c3 c3-2 winchip-c6 winchip2 i386 i486 \
> +i586 i686 pentium pentium-m pentium-mmx pentium2 pentium3 pentium3m \
> +pentium4 pentium4m pentiumpro prescott"
> +# 64-bit x86 processors supported by --with-arch=.  Each processor
> +# MUST be separated by exactly one space.
> +x86_64_archs="amdfam10 athlon64 athlon64-sse3 barcelona bdver1 bdver2 \
> +bdver3 bdver4 btver1 btver2 k8 k8-sse3 opteron opteron-sse3 nocona \
> +core2 corei7 corei7-avx core-avx-i core-avx2 atom slm nehalem westmere \
> +sandybridge ivybridge haswell broadwell bonnell silvermont x86-64 native"
> +# Additional x86 processors supported by --with-cpu=.  Each processor
> +# MUST be separated by exactly one space.
> +x86_cpus="generic intel"
> +
>  # Common parts for widely ported systems.
>  case ${target} in
>  *-*-darwin*)
> @@ -1392,20 +1408,21 @@ i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu | 
> i[34567]86-*-knetbsd*-gnu | i
> done
> TM_MULTILIB_CONFIG=`echo $TM_MULTILIB_CONFIG | sed 
> 's/^,//'`
> need_64bit_isa=yes
> -   case X"${with_cpu}" in
> -   
> Xgeneric|Xintel|Xatom|Xslm|Xcore2|Xcorei7|Xcorei7-avx|Xnocona|Xx86-64|Xbdver4|Xbdver3|Xbdver2|Xbdver1|Xbtver2|Xbtver1|Xamdfam10|Xbarcelona|Xk8|Xopteron|Xathlon64|Xathlon-fx|Xathlon64-sse3|Xk8-sse3|Xopteron-sse3)
> -   ;;
> -   X)
> +   if test x$with_cpu = x; then
> if test x$with_cpu_64 = x; the

Re: [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs

2014-01-08 Thread Jakub Jelinek

On Wed, Jan 08, 2014 at 07:41:26PM +, Richard Sandiford wrote:
> Jakub Jelinek  writes:
> > 2014-01-08  Jakub Jelinek  
> >
> > * target-globals.c (save_target_globals): Allocate most of the
> > structs using GC in payload of target_globals struct instead
> > of allocating them on the heap.
> 
> Looks good to me FWIW.  I don't know either way about the one-big-blob thing.
> 
> Note that we'll still leak memory when deleting TARGET_OPTION_NODEs
> because target_ira_int and target_lra_int have pointers to heap-allocated
> storage.

Yeah, perhaps that is something to fix incrementally.

But, at least we will not leak ~ 0.5MB per (unique) target attribute
used on some unused function.

Jakub

Re: PR 59137: Incorrect liveness info during dbr_schedule

2014-01-08 Thread Steven Bosscher

On Wed, Jan 8, 2014 at 8:27 PM, Richard Sandiford wrote:
> gcc/
> PR rtl-optimization/59137
> * reorg.c (steal_delay_list_from_target): Call update_block for
> elided insns.
> (steal_delay_list_from_fallthrough, relax_delay_slots): Likewise.
>
> gcc/testsuite/
> PR rtl-optimization/59137
> * gcc.target/mips/pr59137.c: New test.

This is OK for trunk. For release branches I'll defer to the RMs.

Ciao!
Steven

Re: [Patch, bfin/c6x] Fix ICE for backends that rely on reorder_loops.

2014-01-08 Thread Teresa Johnson

On Tue, Jan 7, 2014 at 8:07 AM, Bernd Schmidt  wrote:
> On 01/05/2014 05:10 PM, Teresa Johnson wrote:
>>
>> On Sun, Jan 5, 2014 at 3:39 AM, Bernd Schmidt 
>> wrote:
>>>
>>> I have a different patch which I'll submit next week after some more
>>> testing. The assert in cfgrtl is unnecessarily broad and really only
>>> needs to trigger if -freorder-blocks-and-partition; there's nothing
>>> wrong with entering cfglayout after normal bb-reorder.
>>
>>
>> Currently -freorder-blocks-and-partition is the default for x86. I
>> assume that hw-doloop is not enabled for any i386 targets, which is
>> why we haven't seen this?
>
>
> Precisely.
>
>
>> And will this mean that -freorder-blocks-and-partition cannot be used
>> for the targets that use hw-doloop? If so, should
>> -freorder-blocks-and-partition be prevented with a warning for those
>> targets?
>
>
> If someone explicitly chooses that option we can turn off the reordering in
> hw-doloop. That should happen sufficiently rarely that it isn't a problem.
> That's what the patch below does - bootstraped on x86_64-linux, tested there
> and with bfin-elf. Ok?

Ok, looks good to me.

>
>
>>> I've also tested that Blackfin still benefits from the hw-doloop
>>> reordering code and generates more hardware loops if it's enabled.
>>> So we want to be able to run it at -O2.
>>
>>
>> I looked at hw-doloop briefly and since it seems to be doing some
>> manual bb reordering I guess it can't simply be moved before bbro. It
>> seems like a better long-term solution would be to make bbro
>> hw-doloop-aware as Felix suggested earlier.
>
>
> Maybe. It could be argued that the code in hw-doloop is relevant only for a
> small class of targets so it should only be enabled for them. In any case,
> that's not stage 3 material and two ports are broken...

Ok, that makes sense. Thanks, Teresa

>
>
> Bernd
>



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Re: [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs

2014-01-08 Thread Richard Sandiford

Jakub Jelinek  writes:
> 2014-01-08  Jakub Jelinek  
>
>   * target-globals.c (save_target_globals): Allocate most of the
>   structs using GC in payload of target_globals struct instead
>   of allocating them on the heap.

Looks good to me FWIW.  I don't know either way about the one-big-blob thing.

Note that we'll still leak memory when deleting TARGET_OPTION_NODEs
because target_ira_int and target_lra_int have pointers to heap-allocated
storage.

Thanks,
Richard

Re: Drop -m32 from pr59099.c

2014-01-08 Thread Uros Bizjak

Hello!

>> gcc.target/i386/pr59099.c fails on x86_64-redhat-linux-gnu with
>> --disable-multilib because linking -m32 code is not supported.  The
>> test case passes in 64-bit mode as well.  The other -m32 tests do
>> not use dg-do run, so they do not exhibit this problem.
>>
>> Okay for trunk?
>
> No, this IMHO really should be:
> /* { dg-do run { target { ia32 && fpic } } } */
> /* { dg-options "-O2 -fPIC" } */
>
> All tests in gcc.target/i386 having -m32 (or -m64) in dg-options
> are buggy and should be fixed, either by adding { target ia32 }
> to their dg-do compile or whatever other dg-do they have, or
> adding
> /* { dg-require-effective-target ia32 } */
> line and dropping the -m32 from dg-options.

I have committed the following testsuite patch that removes -m32 from
options. Also, the patch includes a check for the fpic effective target
when -fpic is used.

2014-01-08  Uros Bizjak  

* gcc.target/i386/asm-1.c: Remove dg-options.
* gcc.target/i386/incoming-5.c (dg-options): Remove -m32.
* gcc.target/i386/pr55433.c (dg-options): Ditto.
* gcc.target/i386/pr57848.c (dg-options): Ditto.
* gcc.target/i386/pr59099.c (dg-options): Ditto.
Require fpic effective target.
* gcc.target/i386/pr56246.c (dg-do): Compile for fpic target only.

Tested on x86_64-pc-linux-gnu {,-m32}, will be committed to mainline
in a moment.

Uros.
Index: gcc.target/i386/asm-1.c
===
--- gcc.target/i386/asm-1.c (revision 206436)
+++ gcc.target/i386/asm-1.c (working copy)
@@ -1,6 +1,5 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target ia32 } */
-/* { dg-options "-m32" } */
 
 register unsigned int EAX asm ("r14"); /* { dg-error "register name" } */
 
Index: gcc.target/i386/incoming-5.c
===
--- gcc.target/i386/incoming-5.c(revision 206436)
+++ gcc.target/i386/incoming-5.c(working copy)
@@ -1,6 +1,6 @@
 /* PR middle-end/37009 */
 /* { dg-do compile { target { { ! *-*-darwin* } && ia32 } } } */
-/* { dg-options "-m32 -mincoming-stack-boundary=2 
-mpreferred-stack-boundary=2" } */
+/* { dg-options "-mincoming-stack-boundary=2 -mpreferred-stack-boundary=2" } */
 
 extern void bar (double *);
 
Index: gcc.target/i386/pr55433.c
===
--- gcc.target/i386/pr55433.c   (revision 206436)
+++ gcc.target/i386/pr55433.c   (working copy)
@@ -1,5 +1,5 @@
-/* { dg-do compile {target { *-*-darwin* } } } */
-/* { dg-options "-O1 -m32" } */
+/* { dg-do compile { target { *-*-darwin* } } } */
+/* { dg-options "-O1" } */
 
 typedef unsigned long long tick_t;
 extern int foo(void);
Index: gcc.target/i386/pr56246.c
===
--- gcc.target/i386/pr56246.c   (revision 206436)
+++ gcc.target/i386/pr56246.c   (working copy)
@@ -1,5 +1,5 @@
 /* PR target/56225 */
-/* { dg-do compile { target { ia32 } } } */
+/* { dg-do compile { target { ia32 && fpic } } } */
 /* { dg-options "-O2 -fno-omit-frame-pointer -march=i686 -fpic" } */
 
 void NoBarrier_AtomicExchange (long long *ptr) {
Index: gcc.target/i386/pr57848.c
===
--- gcc.target/i386/pr57848.c   (revision 206436)
+++ gcc.target/i386/pr57848.c   (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O1 -m32" } */
+/* { dg-options "-O1" } */
 
 extern unsigned int __builtin_ia32_crc32si (unsigned int, unsigned int);
 #pragma GCC target("sse4.2")
Index: gcc.target/i386/pr59099.c
===
--- gcc.target/i386/pr59099.c   (revision 206436)
+++ gcc.target/i386/pr59099.c   (working copy)
@@ -1,5 +1,6 @@
 /* { dg-do run } */
-/* { dg-options "-O2 -fPIC -m32" } */
+/* { dg-require-effective-target fpic } */
+/* { dg-options "-O2 -fPIC" } */
 
 void (*pfn)(void);

[MIPS, committed] Revert some Octeon BADDU patches

2014-01-08 Thread Richard Sandiford

This patch just reverts some changes I'd made to the BADDU patterns
for the infamous (truncate:QI (plus:SI ...)) -> (plus:QI ...) simplification.
That simplification was limited to CISCy targets for PR 58295.
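
For readers less familiar with the two RTL shapes involved, here is a small C model of why they describe the same value. It assumes BADDU yields the zero-extended low byte of the sum, which is how the patterns in this patch describe it. The function names are made up for the sketch.

```c
#include <stdint.h>

/* Shape restored by this patch: add in the wide mode, then truncate to
   a byte and zero-extend, i.e. (zero_extend (truncate:QI (plus ...))).  */
uint64_t
baddu_truncate_after (uint64_t a, uint64_t b)
{
  return (uint8_t) (a + b);
}

/* Shape produced by the simplification: truncate the operands first and
   add in QImode, i.e. (zero_extend (plus:QI ...)).  */
uint64_t
baddu_truncate_before (uint64_t a, uint64_t b)
{
  return (uint8_t) ((uint8_t) a + (uint8_t) b);
}
```

Both functions return the sum modulo 256, so the choice between the shapes only affects which RTL the combiner has to match, not the result.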

Tested on mips64-linux-gnu and applied.  It fixes the octeon-baddu-1.c
failures.

Thanks,
Richard


gcc/
Revert:
2012-10-07  Richard Sandiford  

* config/mips/mips.c (mips_truncated_op_cost): New function.
(mips_rtx_costs): Adjust test for BADDU.
* config/mips/mips.md (*baddu_di): Push truncates to operands.

2012-10-02  Richard Sandiford  

* config/mips/mips.md (*baddu_si_eb, *baddu_si_el): Merge into...
(*baddu_si): ...this new pattern.

Index: gcc/config/mips/mips.c
===
--- gcc/config/mips/mips.c  2014-01-02 22:16:09.486330453 +
+++ gcc/config/mips/mips.c  2014-01-08 10:42:17.727013965 +
@@ -3634,17 +3634,6 @@ mips_set_reg_reg_cost (enum machine_mode
 }
 }
 
-/* Return the cost of an operand X that can be trucated for free.
-   SPEED says whether we're optimizing for size or speed.  */
-
-static int
-mips_truncated_op_cost (rtx x, bool speed)
-{
-  if (GET_CODE (x) == TRUNCATE)
-x = XEXP (x, 0);
-  return set_src_cost (x, speed);
-}
-
 /* Implement TARGET_RTX_COSTS.  */
 
 static bool
@@ -4037,13 +4026,12 @@ mips_rtx_costs (rtx x, int code, int out
 case ZERO_EXTEND:
   if (outer_code == SET
  && ISA_HAS_BADDU
+ && (GET_CODE (XEXP (x, 0)) == TRUNCATE
+ || GET_CODE (XEXP (x, 0)) == SUBREG)
  && GET_MODE (XEXP (x, 0)) == QImode
- && GET_CODE (XEXP (x, 0)) == PLUS)
+ && GET_CODE (XEXP (XEXP (x, 0), 0)) == PLUS)
{
- rtx plus = XEXP (x, 0);
- *total = (COSTS_N_INSNS (1)
-   + mips_truncated_op_cost (XEXP (plus, 0), speed)
-   + mips_truncated_op_cost (XEXP (plus, 1), speed));
+ *total = set_src_cost (XEXP (XEXP (x, 0), 0), speed);
  return true;
}
   *total = mips_zero_extend_cost (mode, XEXP (x, 0));
Index: gcc/config/mips/mips.md
===
--- gcc/config/mips/mips.md 2014-01-08 10:29:42.171963087 +
+++ gcc/config/mips/mips.md 2014-01-08 10:38:05.799078793 +
@@ -1312,20 +1312,32 @@ (define_insn_and_split "*addsi3_extended
 
 ;; Combiner patterns for unsigned byte-add.
 
-(define_insn "*baddu_si"
+(define_insn "*baddu_si_eb"
   [(set (match_operand:SI 0 "register_operand" "=d")
 (zero_extend:SI
-(plus:QI (match_operand:QI 1 "register_operand" "d")
- (match_operand:QI 2 "register_operand" "d"]
-  "ISA_HAS_BADDU"
+(subreg:QI
+ (plus:SI (match_operand:SI 1 "register_operand" "d")
+  (match_operand:SI 2 "register_operand" "d")) 3)))]
+  "ISA_HAS_BADDU && BYTES_BIG_ENDIAN"
+  "baddu\\t%0,%1,%2"
+  [(set_attr "alu_type" "add")])
+
+(define_insn "*baddu_si_el"
+  [(set (match_operand:SI 0 "register_operand" "=d")
+(zero_extend:SI
+(subreg:QI
+ (plus:SI (match_operand:SI 1 "register_operand" "d")
+  (match_operand:SI 2 "register_operand" "d")) 0)))]
+  "ISA_HAS_BADDU && !BYTES_BIG_ENDIAN"
   "baddu\\t%0,%1,%2"
   [(set_attr "alu_type" "add")])
 
 (define_insn "*baddu_di"
   [(set (match_operand:GPR 0 "register_operand" "=d")
 (zero_extend:GPR
-(plus:QI (truncate:QI (match_operand:DI 1 "register_operand" "d"))
- (truncate:QI (match_operand:DI 2 "register_operand" "d")]
+(truncate:QI
+ (plus:DI (match_operand:DI 1 "register_operand" "d")
+  (match_operand:DI 2 "register_operand" "d")]
   "ISA_HAS_BADDU && TARGET_64BIT"
   "baddu\\t%0,%1,%2"
   [(set_attr "alu_type" "add")])

PR 59137: Incorrect liveness info during dbr_schedule

2014-01-08 Thread Richard Sandiford

PR 59137 is another case where dbr_schedule gets confused about liveness.
We start out with:

A: $2 = x
B: if $4 == $2 goto L1  [REG_DEAD: $2]
C: if $4 < 0 goto L2
   ...
L1:
D: $2 = y
E: goto L3
L2:
F: $2 = x
G: goto L3
   ...
L3:
   ...
   return $2

We fill G's delay slot in the obvious way:

L2:
G: goto L3
F:   $2 = x

Then we try to "steal" G's delay slot for C.  F is obviously redundant
with A in this context, so we drop it and end with a simple threaded
branch to L3:

A: $2 = x
B: if $4 == $2 goto L1  [REG_DEAD: $2]
C: if $4 < 0 goto L3

The problem is that the REG_DEAD note is no longer accurate, so when
we go on to fill B's delay slot we mistakenly think that we can use D:

A: $2 = x
B: if $4 == $2 goto L3
D:   $2 = y
C: if $4 < 0 goto L3

and so the return value for $4 < 0 changes from x to y.
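
The effect can be modelled in plain C, as a sketch rather than anything taken from the testcase. The function names are made up, r2 and r4 stand for $2 and $4, and the elided "..." fallthrough path is assumed to leave $2 untouched. The delay-slot instruction is modelled as executing whether or not its branch is taken.

```c
/* Control flow before dbr_schedule touches B's delay slot.  */
int
return_value_original (int r4, int x, int y)
{
  int r2 = x;			/* A: $2 = x */
  if (r4 == r2)			/* B: goto L1 */
    {
      r2 = y;			/* D: $2 = y */
      return r2;		/* E: goto L3 */
    }
  if (r4 < 0)			/* C: goto L2 */
    {
      r2 = x;			/* F: $2 = x (redundant with A) */
      return r2;		/* G: goto L3 */
    }
  return r2;			/* Assumed fallthrough: $2 unchanged.  */
}

/* Control flow after F is dropped and D is wrongly placed in B's
   delay slot.  */
int
return_value_miscompiled (int r4, int x, int y)
{
  int r2 = x;			/* A: $2 = x */
  int b_taken = (r4 == r2);	/* B compares before its delay slot runs.  */
  r2 = y;			/* D in B's delay slot: always executed.  */
  if (b_taken)
    return r2;			/* B: goto L3 */
  if (r4 < 0)
    return r2;			/* C: goto L3, F no longer restores x.  */
  return r2;
}
```

For r4 = -1, x = 5, y = 7 the first function returns 5 and the second returns 7, which is the x-to-y change described above.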

reorg's mechanism for handling deleted redundant instructions seems
to be update_block, which adds a USE containing the redundant instruction
just before the place that it was supposed to occur.  The patch therefore
uses update_block in steal_delay_list_from_target.

I went through the other calls to redundant_insn and a few of them
also seem to be missing an update_block.  I don't have testcases
for these though, so it's going to be a matter of opinion whether
adding them or leaving them out is the defensive thing to do.  I'm happy
either way.

(redundant_insn is pretty conservative, so the branch whose delay slot
we're trying to fill can never be the one that makes a delay slot redundant.
It must always be an instruction from before the branch.  So inserting the
(use ...) immediately before the branch should be correct.)

Tested on mips64-linux-gnu.  OK for trunk?  OK for 4.8?

Thanks,
Richard


gcc/
PR rtl-optimization/59137
* reorg.c (steal_delay_list_from_target): Call update_block for
elided insns.
(steal_delay_list_from_fallthrough, relax_delay_slots): Likewise.

gcc/testsuite/
PR rtl-optimization/59137
* gcc.target/mips/pr59137.c: New test.

Index: gcc/reorg.c
===
--- gcc/reorg.c 2014-01-08 18:04:23.420954812 +
+++ gcc/reorg.c 2014-01-08 19:17:12.005446964 +
@@ -1093,6 +1093,7 @@ steal_delay_list_from_target (rtx insn,
   int used_annul = 0;
   int i;
   struct resources cc_set;
+  bool *redundant;
 
   /* We can't do anything if there are more delay slots in SEQ than we
  can handle, or if we don't know that it will be a taken branch.
@@ -1133,6 +1134,7 @@ steal_delay_list_from_target (rtx insn,
 return delay_list;
 #endif
 
+  redundant = XALLOCAVEC (bool, XVECLEN (seq, 0));
   for (i = 1; i < XVECLEN (seq, 0); i++)
 {
   rtx trial = XVECEXP (seq, 0, i);
@@ -1154,7 +1156,8 @@ steal_delay_list_from_target (rtx insn,
 
   /* If this insn was already done (usually in a previous delay slot),
 pretend we put it in our delay slot.  */
-  if (redundant_insn (trial, insn, new_delay_list))
+  redundant[i] = redundant_insn (trial, insn, new_delay_list);
+  if (redundant[i])
continue;
 
   /* We will end up re-vectoring this branch, so compute flags
@@ -1187,6 +1190,12 @@ steal_delay_list_from_target (rtx insn,
return delay_list;
 }
 
+  /* Record the effect of the instructions that were redundant and which
+ we therefore decided not to copy.  */
+  for (i = 1; i < XVECLEN (seq, 0); i++)
+if (redundant[i])
+  update_block (XVECEXP (seq, 0, i), insn);
+
   /* Show the place to which we will be branching.  */
   *pnew_thread = first_active_target_insn (JUMP_LABEL (XVECEXP (seq, 0, 0)));
 
@@ -1250,6 +1259,7 @@ steal_delay_list_from_fallthrough (rtx i
   /* If this insn was already done, we don't need it.  */
   if (redundant_insn (trial, insn, delay_list))
{
+ update_block (trial, insn);
  delete_from_delay_slot (trial);
  continue;
}
@@ -3236,6 +3246,7 @@ relax_delay_slots (rtx first)
 to reprocess this insn.  */
   if (redundant_insn (XVECEXP (pat, 0, 1), delay_insn, 0))
{
+ update_block (XVECEXP (pat, 0, 1), insn);
  delete_from_delay_slot (XVECEXP (pat, 0, 1));
  next = prev_active_insn (next);
  continue;
@@ -3355,6 +3366,7 @@ relax_delay_slots (rtx first)
  && redirect_with_delay_slots_safe_p (delay_insn, target_label,
   insn))
{
+ update_block (XVECEXP (PATTERN (trial), 0, 1), insn);
  reorg_redirect_jump (delay_insn, target_label);
  next = insn;
  continue;
Index: gcc/testsuite/gcc.target/mips/pr59137.c
===
--- /dev/null   2013-12-26 20:29:50.272541227 +
+++ gcc/testsuite/gcc.target/mips/pr59137.c 2014-01-08 19:17:12.006448

Re: PATCH: PR target/59587: cpu_names in i386.c is accessed with wrong index

2014-01-08 Thread H.J. Lu

On Wed, Dec 25, 2013 at 2:32 PM, Uros Bizjak  wrote:
> On Wed, Dec 25, 2013 at 10:31 PM, H.J. Lu  wrote:
>
>>>  cpu_names in i386.c is only used by ix86_function_specific_print 
>>>  which
>>>  accesses it with enum processor_type index. But cpu_names is 
>>>  defined as
>>>  array with enum target_cpu_default index.  This patch adds 
>>>  processor
>>>  names to processor_target_table and uses processor_target_table 
>>>  instead
>>>  of cpu_names.  It removes cpu_names and target_cpu_default.  
>>>  Tested on
>>>  Linux/x86-64.  OK to install?
>>> >>>
>>> >>> Wait a moment,
>>> >>>
>>> >>> it looks to me that TARGET_CPU_DEFAULT has to be synchronized with
>>> >>> const processor_alias_table, so we are able to define various ISA
>>> >>> extensions by selecting TARGET_CPU_*. The TARGET_CPU_DEFAULT can 
>>> >>> then
>>> >>
>>> >> TARGET_CPU_DEFAULT sets the default -mtune=, not -march=.
>>> >>
>>> >>> be used to select extensions in the same way as PROCESSOR_* selects
>>> >>> tuning for certain processor.
>>> >>
>>> >> It has been like this for a long time.  For x86, TARGET_CPU_DEFAULT
>>> >> isn't defined no matter which configure options are used.  We can
>>> >> change config.gcc to set TARGET_CPU_DEFAULT to proper PROCESSOR_XXX 
>>> >> or
>>> >> set it to a string "xxx" for processor "xxx".
>>> >> But GCC driver always passes -march=/-mtune= to toplev.c so that
>>> >> TARGET_CPU_DEFAULT is normally used.
>>> 
>>>  I meant to say "TARGET_CPU_DEFAULT isn't normally used."
>>> 
>>> >
>>> > Let me rethink this a bit, please do not commit the patch.
>>> >
>>> >>>
>>> >>> TARGET_CPU_DEFAULT is left over for 32-bit target before --with-arch=
>>> >>> and --with-cpu= were added.  Today, -mtune=xxx -march=xxx are
>>> >>> always passed to cc1 by GCC driver.  If cc1 is run by hand and
>>> >>> -mtune=xxx -march=xxx aren't passed to cc1, we should do
>>> >>>
>>> >>> 1. For 64-bit, it should be the same as -mtune=generic -march=x86_64
>>> >>> are passed.
>>> >>> 2. For 32-bit, it should be the same as -mtune=cpu -march=cpu are
>>> >>> passed, where "cpu" is the target cpu used to configure GCC,
>>> >>> like i386 in i386-linux, i486 in i486-linux,  But there is no i786
>>> >>> cpu.  i786 is treated as i686.  If SUBTARGET32_DEFAULT_CPU
>>> >>> is defined, it should be the same -mtune=SUBTARGET32_DEFAULT_CPU
>>> >>> -march=SUBTARGET32_DEFAULT_CPU.
>>> >>>
>>> >>> Here is the patch to implement this.
>>> >>
>>> >> Let's do one step at a time. So, let's split the patch back to 
>>> >> target/59587 fix:
>
>> 2013-12-25   H.J. Lu  
>>
>> PR target/59587
>> * config/i386/i386.c (struct ptt): Add a field for processor
>> name.
>> (processor_target_table): Sync with processor_type.  Add
>> processor names.
>> (cpu_names): Removed.
>> (ix86_option_override_internal): Default x_ix86_tune_string
>> to processor_target_table[TARGET_CPU_DEFAULT].name.
>> (ix86_function_specific_print): Assert arch and tune <
>> PROCESSOR_max.  Use processor_target_table to print arch and
>> tune names.
>> * config/i386/i386.h (TARGET_CPU_DEFAULT): Default to
>> PROCESSOR_GENERIC.
>> (target_cpu_default): Removed.
>> (processor_type): Reordered.
>
> OK for mainline and for 4.8 after a few days in mainline.
>
> Thanks,
> Uros.

I am testing this patch.  I will check it into 4.8 branch after
finishing regression test.
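
The class of bug fixed here, a table indexed by one enum while its entries are ordered by another, can be sketched in isolation. The enum values, struct and table below are hypothetical stand-ins, not the real processor_type or processor_target_table; the point is only that storing the name in the table that the enum already indexes removes the second, separately ordered array.

```c
#include <assert.h>

/* Hypothetical stand-in for enum processor_type.  */
enum proc_type { PROC_GENERIC, PROC_I386, PROC_I486, PROC_MAX };

/* Hypothetical stand-in for struct ptt: the name lives next to the
   other per-processor data, so one enum indexes everything.  */
struct proc_entry
{
  const char *name;
  int align_loop;
};

/* Designated initializers tie each entry to its enum value, so the
   table cannot silently drift out of order.  */
static const struct proc_entry proc_table[PROC_MAX] = {
  [PROC_GENERIC] = { "generic", 16 },
  [PROC_I386]    = { "i386", 4 },
  [PROC_I486]    = { "i486", 16 },
};

/* Return the printable name for TYPE.  The assertion mirrors the
   "arch and tune < PROCESSOR_max" check mentioned in the ChangeLog.  */
static const char *
proc_name (enum proc_type type)
{
  assert ((unsigned) type < (unsigned) PROC_MAX);
  return proc_table[type].name;
}
```

The alignment values are made up for illustration; only the indexing scheme matters.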

Thanks.


-- 
H.J.
---
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 6493bb2..f17bf56 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,24 @@
+2014-01-08   H.J. Lu  
+
+ Backport from mainline
+ 2013-12-25   H.J. Lu  
+
+ PR target/59587
+ * config/i386/i386.c (struct ptt): Add a field for processor
+ name.
+ (processor_target_table): Sync with processor_type.  Add
+ processor names.
+ (cpu_names): Removed.
+ (ix86_option_override_internal): Default x_ix86_tune_string
+ to processor_target_table[TARGET_CPU_DEFAULT].name.
+ (ix86_function_specific_print): Assert arch and tune <
+ PROCESSOR_max.  Use processor_target_table to print arch and
+ tune names.
+ * config/i386/i386.h (TARGET_CPU_DEFAULT): Default to
+ PROCESSOR_GENERIC32.
+ (target_cpu_default): Removed.
+ (processor_type): Reordered.
+
 2014-01-08  Uros Bizjak  

  Backport from mainline
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index e03aa72..c06c220 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2409,6 +2409,7 @@ static tree ix86_veclibabi_acml (enum built_in_function, tree, tree);
 /* Processor target table, indexed by processor number */
 struct ptt
 {
+  const char *const name; /* processor name  */
   const struct processor_costs *cost; /* Processor costs */
   const int align_loop; /* Default alignments.  */

[GOOGLE] Remove mod_id_to_name map

2014-01-08 Thread Dehao Chen

This patch removes the mod_id_to_name map because the information is
already available in module_infos. Also, AutoFDO cannot update this map
because it is a file-static structure.
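
As a rough illustration of the replacement lookup, here is a minimal sketch that scans an array of per-module records instead of consulting a separate id-to-name map. The struct, the helper base_name and the function lookup_module_name are hypothetical simplifications; unlike get_module_name in the patch, which asserts on failure, this sketch returns NULL when the id is unknown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the per-module info record.  */
struct mod_info
{
  unsigned int ident;
  const char *source_filename;
};

/* Return the last path component of NAME (a simplified version of what
   lbasename does for POSIX-style paths).  */
static const char *
base_name (const char *name)
{
  const char *slash = strrchr (name, '/');
  return slash ? slash + 1 : name;
}

/* Look up the module with id MOD_ID in INFOS[0..N-1] and return its
   base file name, or NULL if there is no such module.  */
static const char *
lookup_module_name (struct mod_info *const *infos, size_t n,
                    unsigned int mod_id)
{
  assert (infos != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (infos[i]->ident == mod_id)
      return base_name (infos[i]->source_filename);
  return NULL;
}
```

The lookup is linear in the number of modules, which is the same cost as the loop over the removed vector.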

Bootstrapped and passed regression test.

OK for google branch?

Thanks,
Dehao

Index: gcc/coverage.c
===
--- gcc/coverage.c (revision 206366)
+++ gcc/coverage.c (working copy)
@@ -615,37 +615,17 @@ reorder_module_groups (const char *imports_file, u
   module_name_tab.dispose ();
 }

-typedef struct {
-  unsigned int mod_id;
-  const char *mod_name;
-} mod_id_to_name_t;
-
-static vec<mod_id_to_name_t> *mod_names;
-
-static void
-record_module_name (unsigned int mod_id, const char *name)
-{
-  mod_id_to_name_t t;
-
-  t.mod_id = mod_id;
-  t.mod_name = xstrdup (name);
-  if (!mod_names)
-vec_alloc (mod_names, 10);
-  mod_names->safe_push (t);
-}
-
 /* Return the module name for module with MOD_ID.  */

 const char *
 get_module_name (unsigned int mod_id)
 {
   size_t i;
-  mod_id_to_name_t *elt;

-  for (i = 0; mod_names->iterate (i, &elt); i++)
+  for (i = 0; i < num_in_fnames; i++)
 {
-  if (elt->mod_id == mod_id)
-return elt->mod_name;
+  if (module_infos[i]->ident == mod_id)
+return lbasename (module_infos[i]->source_filename);
 }

   gcc_assert (0);
@@ -927,9 +907,6 @@ read_counts_file (const char *da_file_name, unsign
  }
 }

-  record_module_name (mod_info->ident,
-  lbasename (mod_info->source_filename));
-
   if (dump_enabled_p ())
 {
   dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, input_location,

C++ PATCH for c++/59614 (compile hog with lots of templates)

2014-01-08 Thread Jason Merrill

I was forgetting that recursing into template arguments would in turn 
recurse into their template arguments, leading to quadratic behavior. 
So, look at template arguments only once and add any inherited tags to 
the instantiated type.
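
The idea can be modelled with a toy data structure. The sketch below is an assumption-laden simplification, not the code in class.c: toy_type, has_tag and inherit_arg_tags are hypothetical names, and tags are plain integers. Each type copies the tags of its direct template arguments once; because the arguments have already been through the same step, their tag sets already include what they inherited, so no recursion into arguments of arguments is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TAGS 8
#define MAX_ARGS 4

/* Hypothetical toy type node: a set of tag ids plus template arguments.  */
struct toy_type
{
  int tags[MAX_TAGS];
  size_t n_tags;
  const struct toy_type *args[MAX_ARGS];
  size_t n_args;
};

/* Return true if T already carries TAG.  */
static bool
has_tag (const struct toy_type *t, int tag)
{
  for (size_t i = 0; i < t->n_tags; i++)
    if (t->tags[i] == tag)
      return true;
  return false;
}

/* Copy onto T every tag of its direct template arguments that T lacks.
   Only one level of arguments is examined.  */
static void
inherit_arg_tags (struct toy_type *t)
{
  for (size_t i = 0; i < t->n_args; i++)
    for (size_t j = 0; j < t->args[i]->n_tags; j++)
      {
        int tag = t->args[i]->tags[j];
        if (!has_tag (t, tag))
          {
            assert (t->n_tags < MAX_TAGS);
            t->tags[t->n_tags++] = tag;
          }
      }
}
```

The has_tag check plays the role that IDENTIFIER_MARKED plays in the patch: it keeps a tag from being inherited more than once.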


Tested x86_64-pc-linux-gnu, applying to trunk.
commit f97c952a82d54a4cf0fc4583560de78589fa5664
Author: Jason Merrill 
Date:   Tue Jan 7 17:19:20 2014 -0500

	PR c++/59614
	* class.c (abi_tag_data): Add tags field.
	(check_abi_tags): Initialize it.
	(find_abi_tags_r): Support collecting missing tags.
	(mark_type_abi_tags): Don't look at template args.
	(inherit_targ_abi_tags): New.
	(check_bases_and_members): Use it.
	* cp-tree.h (ABI_TAG_IMPLICIT): New.
	* mangle.c (write_abi_tags): Check it.

diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index c961b22..0c3ce47 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -1340,14 +1340,20 @@ struct abi_tag_data
 {
   tree t;
   tree subob;
+  // error_mark_node to get diagnostics; otherwise collect missing tags here
+  tree tags;
 };
 
 static tree
-find_abi_tags_r (tree *tp, int */*walk_subtrees*/, void *data)
+find_abi_tags_r (tree *tp, int *walk_subtrees, void *data)
 {
   if (!OVERLOAD_TYPE_P (*tp))
 return NULL_TREE;
 
+  /* walk_tree shouldn't be walking into any subtrees of a RECORD_TYPE
+ anyway, but let's make sure of it.  */
+  *walk_subtrees = false;
+
   if (tree attributes = lookup_attribute ("abi_tag", TYPE_ATTRIBUTES (*tp)))
 {
      struct abi_tag_data *p = static_cast<struct abi_tag_data*>(data);
@@ -1358,7 +1364,20 @@ find_abi_tags_r (tree *tp, int */*walk_subtrees*/, void *data)
 	  tree id = get_identifier (TREE_STRING_POINTER (tag));
 	  if (!IDENTIFIER_MARKED (id))
 	{
-	  if (TYPE_P (p->subob))
+	  if (p->tags != error_mark_node)
+		{
+		  /* We're collecting tags from template arguments.  */
+		  tree str = build_string (IDENTIFIER_LENGTH (id),
+	   IDENTIFIER_POINTER (id));
+		  p->tags = tree_cons (NULL_TREE, str, p->tags);
+		  ABI_TAG_IMPLICIT (p->tags) = true;
+
+		  /* Don't inherit this tag multiple times.  */
+		  IDENTIFIER_MARKED (id) = true;
+		}
+
+	  /* Otherwise we're diagnosing missing tags.  */
+	  else if (TYPE_P (p->subob))
 		{
 		  warning (OPT_Wabi_tag, "%qT does not have the %E abi tag "
 			   "that base %qT has", p->t, tag, p->subob);
@@ -1397,22 +1416,6 @@ mark_type_abi_tags (tree t, bool val)
 	  IDENTIFIER_MARKED (id) = val;
 	}
 }
-
-  /* Also mark ABI tags from template arguments.  */
-  if (CLASSTYPE_TEMPLATE_INFO (t))
-{
-  tree args = CLASSTYPE_TI_ARGS (t);
-  for (int i = 0; i < TMPL_ARGS_DEPTH (args); ++i)
-	{
-	  tree level = TMPL_ARGS_LEVEL (args, i+1);
-	  for (int j = 0; j < TREE_VEC_LENGTH (level); ++j)
-	{
-	  tree arg = TREE_VEC_ELT (level, j);
-	  if (CLASS_TYPE_P (arg))
-		mark_type_abi_tags (arg, val);
-	}
-	}
-}
 }
 
 /* Check that class T has all the abi tags that subobject SUBOB has, or
@@ -1424,13 +1427,50 @@ check_abi_tags (tree t, tree subob)
   mark_type_abi_tags (t, true);
 
   tree subtype = TYPE_P (subob) ? subob : TREE_TYPE (subob);
-  struct abi_tag_data data = { t, subob };
+  struct abi_tag_data data = { t, subob, error_mark_node };
 
   cp_walk_tree_without_duplicates (&subtype, find_abi_tags_r, &data);
 
   mark_type_abi_tags (t, false);
 }
 
+void
+inherit_targ_abi_tags (tree t)
+{
+  if (CLASSTYPE_TEMPLATE_INFO (t) == NULL_TREE)
+return;
+
+  mark_type_abi_tags (t, true);
+
+  tree args = CLASSTYPE_TI_ARGS (t);
+  struct abi_tag_data data = { t, NULL_TREE, NULL_TREE };
+  for (int i = 0; i < TMPL_ARGS_DEPTH (args); ++i)
+{
+  tree level = TMPL_ARGS_LEVEL (args, i+1);
+  for (int j = 0; j < TREE_VEC_LENGTH (level); ++j)
+	{
+	  tree arg = TREE_VEC_ELT (level, j);
+	  data.subob = arg;
+	  cp_walk_tree_without_duplicates (&arg, find_abi_tags_r, &data);
+	}
+}
+
+  // If we found some tags on our template arguments, add them to our
+  // abi_tag attribute.
+  if (data.tags)
+{
+  tree attr = lookup_attribute ("abi_tag", TYPE_ATTRIBUTES (t));
+  if (attr)
+	TREE_VALUE (attr) = chainon (data.tags, TREE_VALUE (attr));
+  else
+	TYPE_ATTRIBUTES (t)
+	  = tree_cons (get_identifier ("abi_tag"), data.tags,
+		   TYPE_ATTRIBUTES (t));
+}
+
+  mark_type_abi_tags (t, false);
+}
+
 /* Run through the base classes of T, updating CANT_HAVE_CONST_CTOR_P,
and NO_CONST_ASN_REF_P.  Also set flag bits in T based on
properties of the bases.  */
@@ -5431,6 +5471,9 @@ check_bases_and_members (tree t)
   bool saved_nontrivial_dtor;
   tree fn;
 
+  /* Pick up any abi_tags from our template arguments before checking.  */
+  inherit_targ_abi_tags (t);
+
   /* By default, we use const reference arguments and generate default
  constructors.  */
   cant_have_const_ctor = 0;
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index bdae500..96af562f 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -65,6 +65,7 @@ c-common.h, not afte

[PATCH] Fix up ipa-prop caused -fcompare-debug failures (PR ipa/59722)

2014-01-08 Thread Jakub Jelinek

Hi!

The recent ipa_analyze_params_uses changes broke i686-linux bootstrap
with --enable-checking=release; the reduced testcase below shows it.
Obviously we need to ignore debug stmt uses during analysis.
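
A minimal model of the rule, assuming a hypothetical use_kind enum and a UNDESCRIBED_USE marker in place of the real GIMPLE statements and IPA_UNDESCRIBED_USE: call uses are counted, any other real use gives up, and debug uses are skipped so that the result does not depend on whether debug info is being generated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of the statements that use a parameter.  */
enum use_kind { USE_CALL, USE_DEBUG, USE_OTHER };

/* Hypothetical marker playing the role of IPA_UNDESCRIBED_USE.  */
#define UNDESCRIBED_USE (-1)

/* Count the call uses in USES[0..N-1].  Any use that is neither a call
   nor a debug statement makes the result UNDESCRIBED_USE; debug uses
   are ignored entirely.  */
static int
count_controlled_uses (const enum use_kind *uses, size_t n)
{
  int controlled = 0;

  assert (uses != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    switch (uses[i])
      {
      case USE_CALL:
        controlled++;
        break;
      case USE_DEBUG:
        break;
      default:
        return UNDESCRIBED_USE;
      }
  return controlled;
}
```

With this rule a use list gives the same answer with and without its debug entries, which is the property -fcompare-debug checks.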

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk as
obvious.

2014-01-08  Jakub Jelinek  

PR ipa/59722
* ipa-prop.c (ipa_analyze_params_uses): Ignore uses in debug stmts.

* gcc.dg/pr59722.c: New test.

--- gcc/ipa-prop.c.jj   2014-01-06 22:32:17.101586391 +0100
+++ gcc/ipa-prop.c  2014-01-08 16:07:29.203641224 +0100
@@ -2127,8 +2127,11 @@ ipa_analyze_params_uses (struct cgraph_n
  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, ddef)
if (!is_gimple_call (USE_STMT (use_p)))
  {
-   controlled_uses = IPA_UNDESCRIBED_USE;
-   break;
+   if (!is_gimple_debug (USE_STMT (use_p)))
+ {
+   controlled_uses = IPA_UNDESCRIBED_USE;
+   break;
+ }
  }
else
  controlled_uses++;
--- gcc/testsuite/gcc.dg/pr59722.c.jj   2014-01-08 16:06:34.325960016 +0100
+++ gcc/testsuite/gcc.dg/pr59722.c  2014-01-08 16:06:03.0 +0100
@@ -0,0 +1,36 @@
+/* PR ipa/59722 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcompare-debug" } */
+
+extern void abrt (const char *, int) __attribute__((noreturn));
+void baz (int *, int *);
+
+static inline int
+bar (void)
+{
+  return 1;
+}
+
+static inline void
+foo (int *x, int y (void))
+{
+  while (1)
+{
+  int a = 0;
+  if (*x)
+   {
+ baz (x, &a);
+ while (a && !y ())
+   ;
+ break;
+   }
+  abrt ("", 1);
+}
+}
+
+void
+test (int x)
+{
+  foo (&x, bar);
+  foo (&x, bar);
+}

Jakub

[PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs

2014-01-08 Thread Jakub Jelinek

On Wed, Jan 08, 2014 at 01:45:40PM +0100, Jakub Jelinek wrote:
> I'd like to get rid of all the XCNEW calls in target-globals.c as a
> follow-up.

Here it is.  The rationale is both to avoid many separate heap allocations
and if TARGET_OPTION_NODE is no longer needed (all FUNCTION_DECLs
referencing it are e.g. optimized away, say static unused functions)
to avoid leaking memory.
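
The allocation scheme can be sketched outside GCC as follows. This is only an illustration under stated assumptions: the structs are hypothetical, much reduced versions of the real ones, and calloc stands in for the garbage-collected allocator used by the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, much reduced versions of the real structures.  */
struct flag_state { int flags[8]; };
struct regs { char data[64]; };

struct globals
{
  struct flag_state *flag_state;
  struct regs *regs;
};

/* Allocate the header and its sub-structures as one zeroed block.
   The header is the first member, so a pointer to the block is also a
   valid pointer to the header, and releasing the header releases
   everything.  Returns NULL if the allocation fails.  */
static struct globals *
alloc_globals (void)
{
  struct globals_extra
  {
    struct globals g;
    struct flag_state flag_state;
    struct regs regs;
  } *p = calloc (1, sizeof *p);

  if (p == NULL)
    return NULL;
  assert ((void *) &p->g == (void *) p);
  p->g.flag_state = &p->flag_state;
  p->g.regs = &p->regs;
  return &p->g;
}
```

A caller frees the whole group with a single free on the returned pointer; in the patch the collector does the equivalent once nothing references the block.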

Bootstrapped/regtested on x86_64-linux and i686-linux (together
with the i386 SWITCHABLE_TARGET patch).

Looking at the sizes, though: on i686-linux the block is 0x67928 bytes,
for which I think ggc-page.c allocates 0.5MB (acceptable).  On
x86_64-linux the block is 0x83aa8 bytes, only ~15KB too large to fit
into 0.5MB, so I think we allocate 1MB.
So, if we wanted to tune for x86_64, we could leave say
target_regs (size 0x5008) out of the big chunk, make it
GTY((atomic)) and allocate it separately.

Or perhaps do that for other very large structs?  In any case, that doesn't
look like something that probably would need to be retuned for every
release.

The current sizes of the structs (in bytes) are:
                              x86_64    i686
struct target_globals         0x80      0x40
struct target_flag_state      0x20      0x20
struct target_regs            0x5008    0x5008
struct target_hard_regs       0x35c8    0x33f8
struct target_reload          0xef70    0xef70
struct target_expmed          0x180b0   0xf4b0
struct target_optabs          0x4f0     0x4b9
struct target_cfgloop         0x1c      0x1c
struct target_ira             0x9628    0x9620
struct target_ira_int         0x3fca8   0x322e4
struct target_lra_int         0xa718    0x4e70
struct target_builtins        0x268     0x268
struct target_gcse            0x62      0x62
struct target_bb_reorder      0x4       0x4
struct target_lower_subreg    0x24c     0x18c

Perhaps use cut-off of 4KB with current sizes, anything below that
would be allocated in the single block, anything above it separately.
So 7 structs allocated together, 7 separately.
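
The count can be checked mechanically. The sketch below takes the first (larger, x86_64) size of each struct listed above, leaving out target_globals itself, and counts how many fall below a given cut-off; the array and function names are hypothetical, and padding between members is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* x86_64 sizes of the 14 payload structs, in the order listed above
   (target_flag_state first, target_lower_subreg last).  */
static const size_t struct_sizes[] = {
  0x20, 0x5008, 0x35c8, 0xef70, 0x180b0, 0x4f0, 0x1c,
  0x9628, 0x3fca8, 0xa718, 0x268, 0x62, 0x4, 0x24c
};

#define N_STRUCTS (sizeof struct_sizes / sizeof struct_sizes[0])

/* Return the number of structs smaller than CUTOFF bytes and store the
   sum of their sizes in *TOTAL.  */
static size_t
count_below_cutoff (size_t cutoff, size_t *total)
{
  size_t count = 0;

  assert (total != NULL);
  *total = 0;
  for (size_t i = 0; i < N_STRUCTS; i++)
    if (struct_sizes[i] < cutoff)
      {
        count++;
        *total += struct_sizes[i];
      }
  return count;
}
```

With a 4096-byte cut-off this yields 7 small structs totalling 2630 bytes, leaving 7 to be allocated separately.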

2014-01-08  Jakub Jelinek  

* target-globals.c (save_target_globals): Allocate most of the
structs using GC in payload of target_globals struct instead
of allocating them on the heap.

--- gcc/target-globals.c.jj 2014-01-08 10:23:22.0 +0100
+++ gcc/target-globals.c2014-01-08 14:00:13.183231122 +0100
@@ -68,24 +68,43 @@ struct target_globals *
 save_target_globals (void)
 {
   struct target_globals *g;
-
-  g = ggc_alloc_target_globals ();
-  g->flag_state = XCNEW (struct target_flag_state);
-  g->regs = XCNEW (struct target_regs);
+  struct target_globals_extra {
+struct target_globals g;
+struct target_flag_state flag_state;
+struct target_regs regs;
+struct target_hard_regs hard_regs;
+struct target_reload reload;
+struct target_expmed expmed;
+struct target_optabs optabs;
+struct target_cfgloop cfgloop;
+struct target_ira ira;
+struct target_ira_int ira_int;
+struct target_lra_int lra_int;
+struct target_builtins builtins;
+struct target_gcse gcse;
+struct target_bb_reorder bb_reorder;
+struct target_lower_subreg lower_subreg;
+  } *p;
+  p = (struct target_globals_extra *)
+  ggc_internal_cleared_alloc_stat (sizeof (struct target_globals_extra)
+  PASS_MEM_STAT);
+  g = (struct target_globals *) p;
+  g->flag_state = &p->flag_state;
+  g->regs = &p->regs;
   g->rtl = ggc_alloc_cleared_target_rtl ();
-  g->hard_regs = XCNEW (struct target_hard_regs);
-  g->reload = XCNEW (struct target_reload);
-  g->expmed = XCNEW (struct target_expmed);
-  g->optabs = XCNEW (struct target_optabs);
+  g->hard_regs = &p->hard_regs;
+  g->reload = &p->reload;
+  g->expmed = &p->expmed;
+  g->optabs = &p->optabs;
   g->libfuncs = ggc_alloc_cleared_target_libfuncs ();
-  g->cfgloop = XCNEW (struct target_cfgloop);
-  g->ira = XCNEW (struct target_ira);
-  g->ira_int = XCNEW (struct target_ira_int);
-  g->lra_int = XCNEW (struct target_lra_int);
-  g->builtins = XCNEW (struct target_builtins);
-  g->gcse = XCNEW (struct target_gcse);
-  g->bb_reorder = XCNEW (struct target_bb_reorder);
-  g->lower_subreg = XCNEW (struct target_lower_subreg);
+  g->cfgloop = &p->cfgloop;
+  g->ira = &p->ira;
+  g->ira_int = &p->ira_int;
+  g->lra_int = &p->lra_int;
+  g->builtins = &p->builtins;
+  g->gcse = &p->gcse;
+  g->bb_reorder = &p->bb_reorder;
+  g->lower_subreg = &p->lower_subreg;
   restore_target_globals (g);
   init_reg_sets ();
   target_reinit ();


Jakub

[PATCH, AArch64 6/6] aarch64: Define add_ssaaaa, sub_ddmmss, umul_ppmm

2014-01-08 Thread Richard Henderson

We have good support for TImode arithmetic, so no need to do anything
with inline assembly.
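
For readers unfamiliar with the approach, here is a rough stand-alone sketch of double-word arithmetic done in plain C rather than inline assembly. It assumes a 64-bit GCC-compatible compiler that provides the unsigned __int128 extension, which plays the role of UDWtype here; add_dw and mul_dw are hypothetical function versions of the macros, not the macros themselves.

```c
#include <assert.h>
#include <stdint.h>

/* unsigned __int128 is a GCC extension available on 64-bit targets.  */
typedef unsigned __int128 u128;

/* (*sh, *sl) = (ah, al) + (bh, bl), modulo 2^128.  */
static void
add_dw (uint64_t *sh, uint64_t *sl,
        uint64_t ah, uint64_t al, uint64_t bh, uint64_t bl)
{
  assert (sh && sl);
  u128 x = (u128) ah << 64 | al;
  x += (u128) bh << 64 | bl;
  *sh = (uint64_t) (x >> 64);
  *sl = (uint64_t) x;
}

/* (*ph, *pl) = m0 * m1 as a full 128-bit product.  */
static void
mul_dw (uint64_t *ph, uint64_t *pl, uint64_t m0, uint64_t m1)
{
  assert (ph && pl);
  u128 x = (u128) m0 * m1;
  *ph = (uint64_t) (x >> 64);
  *pl = (uint64_t) x;
}
```

Whether the compiler turns this into a carry-propagating add pair and a high-part multiply depends on the target having suitable patterns, which is what the earlier patches in this series add for aarch64.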

include/
* longlong.h [__aarch64__] (add_ssaaaa, sub_ddmmss, umul_ppmm): New.
[__aarch64__] (COUNT_LEADING_ZEROS_0): Define in terms of W_TYPE_SIZE.
---
 include/longlong.h | 28 ++--
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/include/longlong.h b/include/longlong.h
index b4c1f400..1b11fc7 100644
--- a/include/longlong.h
+++ b/include/longlong.h
@@ -123,19 +123,35 @@ extern const UQItype __clz_tab[256] attribute_hidden;
 #endif /* __GNUC__ < 2 */
 
 #if defined (__aarch64__)
+#define add_ssaaaa(sh, sl, ah, al, bh, bl) \
+  do { \
+UDWtype __x = (UDWtype)(UWtype)(ah) << 64 | (UWtype)(al);  \
+__x += (UDWtype)(UWtype)(bh) << 64 | (UWtype)(bl); \
+(sh) = __x >> W_TYPE_SIZE; \
+(sl) = __x; \
+  } while (0)
+#define sub_ddmmss(sh, sl, ah, al, bh, bl) \
+  do { \
+UDWtype __x = (UDWtype)(UWtype)(ah) << 64 | (UWtype)(al);  \
+__x -= (UDWtype)(UWtype)(bh) << 64 | (UWtype)(bl); \
+(sh) = __x >> W_TYPE_SIZE; \
+(sl) = __x; \
+  } while (0)
+#define umul_ppmm(ph, pl, m0, m1)  \
+  do { \
+UDWtype __x = (UDWtype)(UWtype)(m0) * (UWtype)(m1); \
+(ph) = __x >> W_TYPE_SIZE; \
+(pl) = __x; \
+  } while (0)
 
+#define COUNT_LEADING_ZEROS_0   W_TYPE_SIZE
 #if W_TYPE_SIZE == 32
 #define count_leading_zeros(COUNT, X)  ((COUNT) = __builtin_clz (X))
 #define count_trailing_zeros(COUNT, X)   ((COUNT) = __builtin_ctz (X))
-#define COUNT_LEADING_ZEROS_0 32
-#endif /* W_TYPE_SIZE == 32 */
-
-#if W_TYPE_SIZE == 64
+#elif W_TYPE_SIZE == 64
 #define count_leading_zeros(COUNT, X)  ((COUNT) = __builtin_clzll (X))
 #define count_trailing_zeros(COUNT, X)   ((COUNT) = __builtin_ctzll (X))
-#define COUNT_LEADING_ZEROS_0 64
 #endif /* W_TYPE_SIZE == 64 */
-
 #endif /* __aarch64__ */
 
 #if defined (__alpha) && W_TYPE_SIZE == 64
-- 
1.8.4.2

[PATCH, AArch64 4/6] soft-fp: Commonize creation of TImode types

2014-01-08 Thread Richard Henderson

No need to do this over and over for different 64-bit hosts.

libgcc/
* soft-fp/soft-fp.h (TItype, UTItype, TI_BITS): New.
* config/aarch64/sfp-machine.h (TItype, UTItype, TI_BITS): Remove.
* config/i386/64/sfp-machine.h: Likewise.
* config/ia64/sfp-machine.h: Likewise.
* config/tilegx/sfp-machine32.h: Likewise.
* config/tilegx/sfp-machine64.h: Likewise.
---
 libgcc/config/aarch64/sfp-machine.h  | 4 
 libgcc/config/i386/64/sfp-machine.h  | 5 -
 libgcc/config/ia64/sfp-machine.h | 5 -
 libgcc/config/tilegx/sfp-machine32.h | 5 -
 libgcc/config/tilegx/sfp-machine64.h | 5 -
 libgcc/soft-fp/soft-fp.h | 8 
 6 files changed, 8 insertions(+), 24 deletions(-)

diff --git a/libgcc/config/aarch64/sfp-machine.h b/libgcc/config/aarch64/sfp-machine.h
index 61b5f72..5e676be 100644
--- a/libgcc/config/aarch64/sfp-machine.h
+++ b/libgcc/config/aarch64/sfp-machine.h
@@ -28,10 +28,6 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define _FP_WS_TYPEsigned long long
 #define _FP_I_TYPE int
 
-typedef int TItype __attribute__ ((mode (TI)));
-typedef unsigned int UTItype __attribute__ ((mode (TI)));
-#define TI_BITS (__CHAR_BIT__ * (int)sizeof(TItype))
-
 /* The type of the result of a floating point comparison.  This must
match __libgcc_cmp_return__ in GCC for the target.  */
 typedef int __gcc_CMPtype __attribute__ ((mode (__libgcc_cmp_return__)));
diff --git a/libgcc/config/i386/64/sfp-machine.h b/libgcc/config/i386/64/sfp-machine.h
index 1ff94c2..8197536 100644
--- a/libgcc/config/i386/64/sfp-machine.h
+++ b/libgcc/config/i386/64/sfp-machine.h
@@ -3,11 +3,6 @@
 #define _FP_WS_TYPEsigned long long
 #define _FP_I_TYPE long long
 
-typedef int TItype __attribute__ ((mode (TI)));
-typedef unsigned int UTItype __attribute__ ((mode (TI)));
-
-#define TI_BITS (__CHAR_BIT__ * (int)sizeof(TItype))
-
 #define _FP_MUL_MEAT_Q(R,X,Y)  \
   _FP_MUL_MEAT_2_wide(_FP_WFRACBITS_Q,R,X,Y,umul_ppmm)
 
diff --git a/libgcc/config/ia64/sfp-machine.h b/libgcc/config/ia64/sfp-machine.h
index e06bc9a..f7dd928 100644
--- a/libgcc/config/ia64/sfp-machine.h
+++ b/libgcc/config/ia64/sfp-machine.h
@@ -3,11 +3,6 @@
 #define _FP_WS_TYPEsigned long
 #define _FP_I_TYPE long
 
-typedef int TItype __attribute__ ((mode (TI)));
-typedef unsigned int UTItype __attribute__ ((mode (TI)));
-
-#define TI_BITS (__CHAR_BIT__ * (int)sizeof(TItype))
-
 /* The type of the result of a floating point comparison.  This must
match `__libgcc_cmp_return__' in GCC for the target.  */
 typedef int __gcc_CMPtype __attribute__ ((mode (__libgcc_cmp_return__)));
diff --git a/libgcc/config/tilegx/sfp-machine32.h b/libgcc/config/tilegx/sfp-machine32.h
index 31a2032..a921533 100644
--- a/libgcc/config/tilegx/sfp-machine32.h
+++ b/libgcc/config/tilegx/sfp-machine32.h
@@ -3,11 +3,6 @@
 #define _FP_WS_TYPEsigned long
 #define _FP_I_TYPE long
 
-typedef int TItype __attribute__ ((mode (TI)));
-typedef unsigned int UTItype __attribute__ ((mode (TI)));
-
-#define TI_BITS (__CHAR_BIT__ * (int)sizeof(TItype))
-
 /* The type of the result of a floating point comparison.  This must
match `__libgcc_cmp_return__' in GCC for the target.  */
 typedef int __gcc_CMPtype __attribute__ ((mode (__libgcc_cmp_return__)));
diff --git a/libgcc/config/tilegx/sfp-machine64.h b/libgcc/config/tilegx/sfp-machine64.h
index 7cf352e..2586dd5 100644
--- a/libgcc/config/tilegx/sfp-machine64.h
+++ b/libgcc/config/tilegx/sfp-machine64.h
@@ -3,11 +3,6 @@
 #define _FP_WS_TYPEsigned long
 #define _FP_I_TYPE long
 
-typedef int TItype __attribute__ ((mode (TI)));
-typedef unsigned int UTItype __attribute__ ((mode (TI)));
-
-#define TI_BITS (__CHAR_BIT__ * (int)sizeof(TItype))
-
 /* The type of the result of a floating point comparison.  This must
match `__libgcc_cmp_return__' in GCC for the target.  */
 typedef int __gcc_CMPtype __attribute__ ((mode (__libgcc_cmp_return__)));
diff --git a/libgcc/soft-fp/soft-fp.h b/libgcc/soft-fp/soft-fp.h
index 696fc86..b54b1ed 100644
--- a/libgcc/soft-fp/soft-fp.h
+++ b/libgcc/soft-fp/soft-fp.h
@@ -237,6 +237,11 @@ typedef int DItype __attribute__ ((mode (DI)));
 typedef unsigned int UQItype __attribute__ ((mode (QI)));
 typedef unsigned int USItype __attribute__ ((mode (SI)));
 typedef unsigned int UDItype __attribute__ ((mode (DI)));
+#if _FP_W_TYPE_SIZE == 64
+typedef int TItype __attribute__ ((mode (TI)));
+typedef unsigned int UTItype __attribute__ ((mode (TI)));
+#endif
+
 #if _FP_W_TYPE_SIZE == 32
 typedef unsigned int UHWtype __attribute__ ((mode (HI)));
 #elif _FP_W_TYPE_SIZE == 64
@@ -249,6 +254,9 @@ typedef USItype UHWtype;
 
 #define SI_BITS(__CHAR_BIT__ * (int) sizeof (SItype))
 #define DI_BITS(__CHAR_BIT__ * (int) sizeof (DItype))
+#if _FP_W_TYPE_SIZE == 64
+# d

[PATCH, AArch64 5/6] soft-fp: Define UDWtype for longlong.h

2014-01-08 Thread Richard Henderson

The documentation for longlong.h says this type must be defined.
We've gotten away with this because so far longlong.h hasn't
actually used the type.

libgcc/
* soft-fp/soft-fp.h (UDWtype): New define.
---
 libgcc/soft-fp/soft-fp.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/libgcc/soft-fp/soft-fp.h b/libgcc/soft-fp/soft-fp.h
index b54b1ed..8f80ea6 100644
--- a/libgcc/soft-fp/soft-fp.h
+++ b/libgcc/soft-fp/soft-fp.h
@@ -248,6 +248,12 @@ typedef unsigned int UHWtype __attribute__ ((mode (HI)));
 typedef USItype UHWtype;
 #endif
 
+#if _FP_W_TYPE_SIZE == 32
+# define UDWtype   UDItype
+#elif _FP_W_TYPE_SIZE == 64
+# define UDWtype   UTItype
+#endif
+
 #ifndef CMPtype
 # define CMPtype   int
 #endif
-- 
1.8.4.2

[PATCH, AArch64 0/7] TImode and longlong.h improvements

2014-01-08 Thread Richard Henderson

The recent longlong.h patch 

  http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00286.html

reminded me that the other common patterns really ought to be supported
somehow.  We had patterns defining ADDS, ADC, and UMULH, but we didn't
have the proper expanders in place to make use of them.

The final longlong.h patch has nothing that's really aarch64 specific,
but I chickened out in making the generic patterns use builtin double
word arithmetic.  Perhaps some define set in the cpu-specific portion
of the file ought to select this from the final common portion, but
that sort of thing begs the question of large-scale cleanup.


r~


Richard Henderson (6):
  aarch64: Add addti3 and subti3 patterns
  aarch64: Add mulditi3 and umulditi3 patterns
  aarch64: Add multi3 pattern
  soft-fp: Commonize creation of TImode types
  soft-fp: Define UDWtype for longlong.h
  aarch64: Define add_ssaaaa, sub_ddmmss, umul_ppmm

 gcc/config/aarch64/aarch64.md| 89 ++--
 include/longlong.h   | 28 +---
 libgcc/config/aarch64/sfp-machine.h  |  4 --
 libgcc/config/i386/64/sfp-machine.h  |  5 --
 libgcc/config/ia64/sfp-machine.h |  5 --
 libgcc/config/tilegx/sfp-machine32.h |  5 --
 libgcc/config/tilegx/sfp-machine64.h |  5 --
 libgcc/soft-fp/soft-fp.h | 14 ++
 8 files changed, 120 insertions(+), 35 deletions(-)

-- 
1.8.4.2

[PATCH, AArch64 3/6] aarch64: Add multi3 pattern

2014-01-08 Thread Richard Henderson

* config/aarch64/aarch64.md (multi3): New expander.
(madd): Remove leading * from name.
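
In C terms, the sequence the expander emits corresponds roughly to the sketch below: one low multiply, one high-part multiply, and two multiply-adds that fold the cross products into the high word. The function names are hypothetical, and umulh is modelled with the unsigned __int128 GCC extension rather than the UMULH instruction.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;

/* High 64 bits of the unsigned 64x64 product, as UMULH computes.  */
static uint64_t
umulh (uint64_t a, uint64_t b)
{
  return (uint64_t) (((u128) a * b) >> 64);
}

/* (*rh, *rl) = (h1, l1) * (h2, l2), modulo 2^128.  */
static void
mul128 (uint64_t *rh, uint64_t *rl,
        uint64_t h1, uint64_t l1, uint64_t h2, uint64_t l2)
{
  assert (rh && rl);
  uint64_t l0 = l1 * l2;          /* mul   */
  uint64_t h0 = umulh (l1, l2);   /* umulh */
  h0 = h1 * l2 + h0;              /* madd  */
  h0 = l1 * h2 + h0;              /* madd  */
  *rl = l0;
  *rh = h0;
}
```

The h1 * h2 term does not appear because it only affects bits above 2^128.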
---
 gcc/config/aarch64/aarch64.md | 27 ++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 0b3943d..0f76cd1 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1968,7 +1968,7 @@
   [(set_attr "type" "mul")]
 )
 
-(define_insn "*madd"
+(define_insn "madd"
   [(set (match_operand:GPI 0 "register_operand" "=r")
(plus:GPI (mult:GPI (match_operand:GPI 1 "register_operand" "r")
(match_operand:GPI 2 "register_operand" "r"))
@@ -2095,6 +2095,31 @@
   DONE;
 })
 
+;; The default expansion of multi3 using umuldi3_highpart will perform
+;; the additions in an order that fails to combine into two madd insns.
+(define_expand "multi3"
+  [(set (match_operand:TI 0 "register_operand")
+   (mult:TI (match_operand:TI 1 "register_operand")
+(match_operand:TI 2 "register_operand")))]
+  ""
+{
+  rtx l0 = gen_reg_rtx (DImode);
+  rtx l1 = gen_lowpart (DImode, operands[1]);
+  rtx l2 = gen_lowpart (DImode, operands[2]);
+  rtx h0 = gen_reg_rtx (DImode);
+  rtx h1 = gen_highpart (DImode, operands[1]);
+  rtx h2 = gen_highpart (DImode, operands[2]);
+
+  emit_insn (gen_muldi3 (l0, l1, l2));
+  emit_insn (gen_umuldi3_highpart (h0, l1, l2));
+  emit_insn (gen_madddi (h0, h1, l2, h0));
+  emit_insn (gen_madddi (h0, l1, h2, h0));
+
+  emit_move_insn (gen_lowpart (DImode, operands[0]), l0);
+  emit_move_insn (gen_highpart (DImode, operands[0]), h0);
+  DONE;
+})
+
 (define_insn "muldi3_highpart"
   [(set (match_operand:DI 0 "register_operand" "=r")
(truncate:DI
-- 
1.8.4.2

[PATCH, AArch64 2/6] aarch64: Add mulditi3 and umulditi3 patterns

2014-01-08 Thread Richard Henderson

* config/aarch64/aarch64.md (mulditi3): New expander.
---
 gcc/config/aarch64/aarch64.md | 17 +
 1 file changed, 17 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index c4acdfc..0b3943d 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -2078,6 +2078,23 @@
   [(set_attr "type" "mull")]
 )
 
+(define_expand "mulditi3"
+  [(set (match_operand:TI 0 "register_operand")
+   (mult:TI (ANY_EXTEND:TI (match_operand:DI 1 "register_operand"))
+(ANY_EXTEND:TI (match_operand:DI 2 "register_operand"))))]
+  ""
+{
+  rtx low = gen_reg_rtx (DImode);
+  emit_insn (gen_muldi3 (low, operands[1], operands[2]));
+
+  rtx high = gen_reg_rtx (DImode);
+  emit_insn (gen_muldi3_highpart (high, operands[1], operands[2]));
+
+  emit_move_insn (gen_lowpart (DImode, operands[0]), low);
+  emit_move_insn (gen_highpart (DImode, operands[0]), high);
+  DONE;
+})
+
 (define_insn "muldi3_highpart"
   [(set (match_operand:DI 0 "register_operand" "=r")
(truncate:DI
-- 
1.8.4.2

[PATCH, AArch64 1/6] aarch64: Add addti3 and subti3 patterns

2014-01-08 Thread Richard Henderson

* config/aarch64/aarch64.md (addti3, subti3): New expanders.
(add3_compare0): Remove leading * from name.
(add3_carryin): Likewise.
(sub3_compare0): Likewise.
(sub3_carryin): Likewise.
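
The two-instruction shape of these expanders (a flag-setting operation on the low halves followed by a carry-consuming operation on the high halves) can be modelled in portable C. The sketch is illustrative only; add128 and sub128 are hypothetical names and the carry is computed explicitly instead of living in the condition flags.

```c
#include <assert.h>
#include <stdint.h>

/* (*rh, *rl) = (h1, l1) + (h2, l2), modulo 2^128.  */
static void
add128 (uint64_t *rh, uint64_t *rl,
        uint64_t h1, uint64_t l1, uint64_t h2, uint64_t l2)
{
  assert (rh && rl);
  uint64_t low = l1 + l2;        /* like ADDS: add and record carry */
  uint64_t carry = low < l1;     /* wrapped around iff carry out */
  *rl = low;
  *rh = h1 + h2 + carry;         /* like ADC: add with carry in */
}

/* (*rh, *rl) = (h1, l1) - (h2, l2), modulo 2^128.  */
static void
sub128 (uint64_t *rh, uint64_t *rl,
        uint64_t h1, uint64_t l1, uint64_t h2, uint64_t l2)
{
  assert (rh && rl);
  uint64_t borrow = l1 < l2;     /* like SUBS: subtract and record borrow */
  *rl = l1 - l2;
  *rh = h1 - h2 - borrow;        /* like SBC: subtract with borrow in */
}
```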
---
 gcc/config/aarch64/aarch64.md | 45 +++
 1 file changed, 41 insertions(+), 4 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 4e838ee..c4acdfc 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1102,7 +1102,26 @@
(set_attr "simd" "*,*,*,yes")]
 )
 
-(define_insn "*add3_compare0"
+(define_expand "addti3"
+  [(set (match_operand:TI 0 "register_operand" "")
+   (plus:TI (match_operand:TI 1 "register_operand" "")
+(match_operand:TI 2 "register_operand" "")))]
+  ""
+{
+  rtx low = gen_reg_rtx (DImode);
+  emit_insn (gen_adddi3_compare0 (low, gen_lowpart (DImode, operands[1]),
+ gen_lowpart (DImode, operands[2])));
+
+  rtx high = gen_reg_rtx (DImode);
+  emit_insn (gen_adddi3_carryin (high, gen_highpart (DImode, operands[1]),
+gen_highpart (DImode, operands[2])));
+
+  emit_move_insn (gen_lowpart (DImode, operands[0]), low);
+  emit_move_insn (gen_highpart (DImode, operands[0]), high);
+  DONE;
+})
+
+(define_insn "add3_compare0"
   [(set (reg:CC_NZ CC_REGNUM)
(compare:CC_NZ
 (plus:GPI (match_operand:GPI 1 "register_operand" "%r,r,r")
@@ -1386,7 +1405,7 @@
   [(set_attr "type" "alu_ext")]
 )
 
-(define_insn "*add<mode>3_carryin"
+(define_insn "add<mode>3_carryin"
   [(set
 (match_operand:GPI 0 "register_operand" "=r")
 (plus:GPI (geu:GPI (reg:CC CC_REGNUM) (const_int 0))
@@ -1554,8 +1573,26 @@
(set_attr "simd" "*,yes")]
 )
 
+(define_expand "subti3"
+  [(set (match_operand:TI 0 "register_operand" "")
+   (minus:TI (match_operand:TI 1 "register_operand" "")
+ (match_operand:TI 2 "register_operand" "")))]
+  ""
+{
+  rtx low = gen_reg_rtx (DImode);
+  emit_insn (gen_subdi3_compare0 (low, gen_lowpart (DImode, operands[1]),
+ gen_lowpart (DImode, operands[2])));
+
+  rtx high = gen_reg_rtx (DImode);
+  emit_insn (gen_subdi3_carryin (high, gen_highpart (DImode, operands[1]),
+gen_highpart (DImode, operands[2])));
+
+  emit_move_insn (gen_lowpart (DImode, operands[0]), low);
+  emit_move_insn (gen_highpart (DImode, operands[0]), high);
+  DONE;
+})
 
-(define_insn "*sub<mode>3_compare0"
+(define_insn "sub<mode>3_compare0"
   [(set (reg:CC_NZ CC_REGNUM)
(compare:CC_NZ (minus:GPI (match_operand:GPI 1 "register_operand" "r")
  (match_operand:GPI 2 "register_operand" "r"))
@@ -1702,7 +1739,7 @@
   [(set_attr "type" "alu_ext")]
 )
 
-(define_insn "*sub<mode>3_carryin"
+(define_insn "sub<mode>3_carryin"
   [(set
 (match_operand:GPI 0 "register_operand" "=r")
 (minus:GPI (minus:GPI
-- 
1.8.4.2
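
As an illustration of what the addti3 and subti3 expanders in this patch compute, here is a small C model: the low halves are combined first, and the carry (or borrow) out of that step feeds the high halves. It is a sketch only; ti_value, add_ti and sub_ti are hypothetical names, and the explicit comparison stands in for the condition flags that the real compare0/carryin patterns use.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for a TImode value held as two DImode halves.  */
struct ti_value
{
  uint64_t low;
  uint64_t high;
};

/* 128-bit addition: add the low halves, then add the high halves plus
   the carry out of the low addition.  */
static struct ti_value
add_ti (struct ti_value a, struct ti_value b)
{
  struct ti_value r;
  r.low = a.low + b.low;
  uint64_t carry = r.low < a.low;   /* 1 if the low addition wrapped */
  r.high = a.high + b.high + carry;
  return r;
}

/* 128-bit subtraction: subtract the low halves, then subtract the high
   halves minus the borrow out of the low subtraction.  */
static struct ti_value
sub_ti (struct ti_value a, struct ti_value b)
{
  struct ti_value r;
  r.low = a.low - b.low;
  uint64_t borrow = a.low < b.low;  /* 1 if the low subtraction wrapped */
  r.high = a.high - b.high - borrow;
  return r;
}
```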

Re: [PATCH][ARM]Use of vcvt for float to fixed point conversions.

2014-01-08 Thread Christophe Lyon

On 8 January 2014 18:15, Renlin Li  wrote:
> Hi Christophe,
>
> There is a minor issue about this test case. It requires the `float-abi` of
> your target to be either `softfp` or `hard` (to utilize the floating point
> hardware).
> Could you please check whether this solves the problem or not?
>
Indeed I had tried with 'hard' and it's OK. (That's why I said
arm-none-linux-gnueabi as opposed to arm-none-linux-gnueabihf, but I
wasn't clear enough).

Thanks for your upcoming patch :-)

Christophe.

Re: [PATCH] _Cilk_for for C and C++

2014-01-08 Thread Jakub Jelinek

On Tue, Jan 07, 2014 at 10:11:59PM +, Iyer, Balaji V wrote:
>   I used a similar existing one (safelen). Attached, please find 2
> fixed patches for C and C++ along with their changelogs.

But safelen is something completely different, while if I skim
the _Cilk_for docs, the grain is really a chunk size, where the runtime
library performs the scheduling of grain sized chunks, so using
OMP_CLAUSE_SCHEDULE clause with
OMP_CLAUSE_SCHEDULE_KIND (c) = OMP_CLAUSE_SCHEDULE_RUNTIME;
OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (c) = grain_expr;
sounds like what should be used.  OMP_CLAUSE_SAFELEN says what is the
minimal vectorization factor the compiler can assume is safe for
a simd loop.
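
To make the "grain is a chunk size" reading concrete, here is a rough C sketch of an iteration space being handed out in grain-sized chunks. It runs the chunks sequentially and only models the partitioning; a real runtime would distribute them among workers. The names chunk_fn, for_each_chunk and sum_range are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: process iterations [begin, end).  */
typedef void (*chunk_fn) (size_t begin, size_t end, void *data);

/* Split [0, n) into chunks of at most GRAIN iterations and run BODY on
   each one in order.  Returns the number of chunks created.  */
static size_t
for_each_chunk (size_t n, size_t grain, chunk_fn body, void *data)
{
  size_t chunks = 0;
  size_t begin = 0;

  if (grain == 0)
    grain = 1;                      /* avoid an endless loop */

  while (begin < n)
    {
      size_t end = (n - begin < grain) ? n : begin + grain;
      body (begin, end, data);
      chunks++;
      begin = end;
    }
  return chunks;
}

/* Example body: accumulate the iteration indices of one chunk.  */
static void
sum_range (size_t begin, size_t end, void *data)
{
  size_t *total = data;
  for (size_t i = begin; i < end; i++)
    *total += i;
}
```

With n = 10 and a grain of 4 this yields the chunks [0,4), [4,8) and [8,10).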

Jakub

Re: [PATCH][ARM]Use of vcvt for float to fixed point conversions.

2014-01-08 Thread Renlin Li


Hi Christophe,

There is a minor issue with this test case. It requires the `float-abi`
of your target to be either `softfp` or `hard` (to use the
floating-point hardware).

Could you please check whether this solves the problem?

I should add it to the `dg-options` section of the test case; a patch
is on the way.


Thank you for your notification!

Kind regards,
Renlin Li


On 08/01/14 16:43, Christophe Lyon wrote:

Hi Renlin,

The new test you added introduces 2 new FAILs when the target is
arm-none-linux-gnueabi (as opposed to arm-none-linux-gnueabihf).

Christophe.


On 24 December 2013 15:46, Renlin Li  wrote:

Hi,

I just updated my patch according to your suggestion.
Thank you for committing it for me!

All you guys have a nice Xmas break!

Kind regards,
Renlin Li


On 04/12/13 11:23, Ramana Radhakrishnan wrote:

Sorry about the slow response. Been on holiday.

On 20/11/13 16:27, Renlin Li wrote:

Hi all,

This patch will make the arm back-end use vcvt for float to fixed point
conversions when applicable.

Test on arm-none-linux-gnueabi has been done on the model.
Okay for trunk?

+ (define_insn "*combine_vcvtf2i"
+   [(set (match_operand:SI 0 "s_register_operand" "=r")
+   (fix:SI (fix:SF (mult:SF (match_operand:SF 1 "s_register_operand" "t")
+(match_operand 2
+"const_double_vcvt_power_of_two" "Dp")))))]
+   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP3 && !flag_rounding_math"
+   "vcvt%?.s32.f32\\t%1, %1, %v2\;vmov%?\\t%0, %1"
+   [(set_attr "predicable" "yes")
+(set_attr "predicable_short_it" "no")
+(set_attr "ce_count" "2")
+(set_attr "type" "f_cvtf2i")]
+ )
+

You need to set length to 8.


--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/fixed_float_conversion.c
@@ -0,0 +1,15 @@
+/* Check that vcvt is used for fixed and float data conversions.  */
+/* { dg-do compile } */
+/* { dg-options "-O1 -mfpu=vfp3" } */
+/* { dg-require-effective-target arm_vfp_ok } */
+float fixed_to_float(int i)
+{
+return ((float)i / (1 << 16));
+}
+
+int float_to_fixed(float f)
+{
+return ((int)(f*(1 << 16)));
+}
+/* { dg-final { scan-assembler "vcvt.f32.s32" } } */
+/* { dg-final { scan-assembler "vcvt.s32.f32" } } */


GNU coding style for functions.

Ok with those changes.




regards
Ramana



Kind regards,
Renlin Li


gcc/ChangeLog:

2013-11-20  Renlin Li  

* config/arm/arm-protos.h (vfp_const_double_for_bits): Declare.
* config/arm/constraints.md (Dp): Define new constraint.
* config/arm/predicates.md (const_double_vcvt_power_of_two): Define
new predicate.
* config/arm/arm.c (arm_print_operand): Add print for new function.
(vfp3_const_double_for_bits): New function.
* config/arm/vfp.md (combine_vcvtf2i): Define new instruction.

gcc/testsuite/ChangeLog:

2013-11-20  Renlin Li  

* gcc.target/arm/fixed_float_conversion.c: New test case.
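
For reference, the two functions in the quoted test case treat an int as a fixed-point number with 16 fractional bits, so the scale factor is 1 << 16 = 65536. The sketch below repeats them with a few worked values; it is plain C and says nothing about which instructions a given target will emit.

```c
#include <assert.h>

/* Interpret I as a fixed-point value with 16 fractional bits.  */
static float
fixed_to_float (int i)
{
  return (float) i / (1 << 16);
}

/* Convert F to fixed point with 16 fractional bits; the cast truncates
   toward zero.  */
static int
float_to_fixed (float f)
{
  return (int) (f * (1 << 16));
}
```

For example, 1.5 maps to 1.5 * 65536 = 98304, and 98304 maps back to 1.5.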

FW: [PATCH] Fix PR 59631

2014-01-08 Thread Iyer, Balaji V

A small but major typo.

The second sentence should read "...usage of _Cilk_spawn [ and _Cilk_sync] 
*without* -fcilkplus..." instead of "...with -fcilkplus..."
 
I am sorry about this.

Sincerely,

Balaji V. Iyer.

> -----Original Message-----
> From: Iyer, Balaji V
> Sent: Tuesday, January 7, 2014 10:15 AM
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH] Fix PR 59631
> 
> Hello Everyone,
>   The attached patch will fix the issue reported in PR 59631. The main
> issue was the usage of Cilk spawn [and _Cilk_sync] with -fcilkplus caused an
> ICE. This patch should fix that. The issue was only reported for C++ but the
> issue exists in C compiler also.  This patch fixes both C and C++. A test 
> case is
> also included.
> 
> Is this Ok for trunk?
> 
> Here are the ChangeLog entries:
> +++ gcc/c/ChangeLog
> +2014-01-07  Balaji V. Iyer  
> +
> +   PR c++/59631
> +   * c-parser.c (c_parser_postfix_expression): Replaced consecutive if
> +   statements with if-elseif statements.
> 
> +++ gcc/testsuite/ChangeLog
> +2014-01-07  Balaji V. Iyer  
> +
> +   PR c++/59631
> +   * gcc.dg/cilk-plus/cilk-plus.exp: Removed "-fcilkplus" from flags list.
> +   * g++.dg/cilk-plus/cilk-plus.exp: Likewise.
> +   * c-c++-common/cilk-plus/CK/spawnee_inline.c: Replaced second dg-option
> +   with dg-additional-options.
> +   * c-c++-common/cilk-plus/CK/varargs_test.c: Likewise.
> +   * c-c++-common/cilk-plus/CK/steal_check.c: Likewise.
> +   * c-c++-common/cilk-plus/CK/spawner_inline.c: Likewise.
> +   * c-c++-common/cilk-plus/CK/spawning_arg.c: Likewise.
> +   * c-c++-common/cilk-plus/CK/invalid_spawns.c: Added a dg-options tag.
> +   * c-c++-common/cilk-plus/CK/pr59631.c: New testcase.
> 
> +++ gcc/cp/ChangeLog
> +2014-01-07  Balaji V. Iyer  
> +
> +   PR c++/59631
> +   * parser.c (cp_parser_postfix_expression): Added a new if-statement
> +   and replaced an existing if-statement with else-if statement.
> +   Changed an existing error message wording to match the one from the C
> +   parser.
> 
> Thanks,
> 
> Balaji V. Iyer.
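
The ChangeLog above mentions replacing consecutive if statements with an if/else-if chain. The message does not spell out how the ICE arises, so the following is only a guess at the general hazard, based on the diff context: when the first branch already parses the operand during error recovery, a second independent if/else can parse it again. All names here are invented; this is not the parser code.

```c
#include <assert.h>

/* Counts how often the operand would be parsed when the two checks are
   independent if statements.  */
static int
parses_with_separate_ifs (int feature_enabled, int next_is_spawn)
{
  int parses = 0;

  if (!feature_enabled)
    parses++;            /* error recovery parses the operand */

  if (next_is_spawn)
    ;                    /* diagnose consecutive keywords, no parse */
  else
    parses++;            /* normal path parses the operand */

  return parses;
}

/* Same checks as one if / else-if / else chain: exactly one branch runs.  */
static int
parses_with_else_if (int feature_enabled, int next_is_spawn)
{
  int parses = 0;

  if (!feature_enabled)
    parses++;
  else if (next_is_spawn)
    ;
  else
    parses++;

  return parses;
}
```

With the feature disabled, the first version handles the operand twice, while the chained version handles it once.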
Index: gcc/c/c-parser.c
===
--- gcc/c/c-parser.c(revision 206392)
+++ gcc/c/c-parser.c(working copy)
@@ -7500,7 +7500,7 @@
  expr = c_parser_postfix_expression (parser);
  expr.value = error_mark_node;   
}
- if (c_parser_peek_token (parser)->keyword == RID_CILK_SPAWN)
+ else if (c_parser_peek_token (parser)->keyword == RID_CILK_SPAWN)
{
  error_at (loc, "consecutive %<_Cilk_spawn%> keywords "
"are not permitted");
Index: gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp
===
--- gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp(revision 206392)
+++ gcc/testsuite/gcc.dg/cilk-plus/cilk-plus.exp(working copy)
@@ -51,13 +51,13 @@
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -fcilkplus -O3 -std=c99" " "
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -fcilkplus -g -O0 -std=c99" " "
 
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -g -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O1 -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O2 -std=c99 -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O2 -ftree-vectorize -fcilkplus" " "
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O3 -g -fcilkplus" " "
+dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -g " " "
+dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O1 " " "
+dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O2 -std=c99 " " "
+dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O2 -ftree-vectorize " " "
+dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O3 -g " " "
 if { [check_effective_target_lto] } {
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O3 -flto -g -fcilkplus" " "
+dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -O3 -flto -g " " "
 }
 
 dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/SE/*.c]] " -g" " "
Index: gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp
===
--- gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp(revision 206392)
+++ gcc/testsuite/g++.dg/cilk-plus/cilk-plus.exp(working copy)
@@ -74,12 +74,12 @@
 dg-finish
 
 dg-init
-dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/CK/*.c]] " -fcilkplus" " "
-dg-run

Re: [PATCH] Fix ifcvt (PR rtl-optimization/58668)

2014-01-08 Thread Uros Bizjak

Hello!

>> So like this instead?  Bootstrapped/regtested on x86_64-linux and
>> i686-linux.  For 4.8 I'd still prefer the earlier patch though.
>>
>> 2013-12-18  Jakub Jelinek  
>>
>> PR rtl-optimization/58668
>> * cfgcleanup.c (flow_find_cross_jump): Don't count
>> any jumps if dir_p is NULL.  Remove p1 variable, use active_insn_p
>> to determine what is counted.
>> (flow_find_head_matching_sequence): Use active_insn_p to determine
>> what is counted.
>> (try_head_merge_bb): Adjust for the flow_find_head_matching_sequence
>> counting change.
>> * ifcvt.c (count_bb_insns): Use active_insn_p && !JUMP_P to
>> determine what is counted.
>>
>> * gcc.dg/pr58668.c: New test.
>
> This is fine for the trunk. Release manager's call for what they'd prefer on 
> the 4.8 branch.

This caused PR59724 on alpha:

20021116-1.c: In function ‘foo’:
20021116-1.c:31:1: error: NOTE_INSN_BASIC_BLOCK is missing for block 9
 }
 ^
20021116-1.c:31:1: error: insn outside basic block
(jump_insn 94 52 93 9 (return) 20021116-1.c:31 -1
 (nil)
 -> return)

Uros.

Re: [PATCH] Add zero-overhead looping for xtensa backend

2014-01-08 Thread Sterling Augustine

On Wed, Jan 8, 2014 at 8:27 AM, Felix Yang  wrote:
> Hi Sterling,
>
>   This patch implements zero-overhead looping for xtensa backend using
> hw-doloop facility.
>   If OK for trunk, please apply it for me. Thanks.

Hi Felix,

I last worked on zero-overhead loops for Xtensa in the gcc 4.3
timeframe, but when I did, I ran into several problems related to
later optimizations rearranging the code which I didn't have time to
address.

I'm sure much of that experience is completely stale now, but I would
appreciate details of the testing you have done with this patch (in
particular, a description of the different xtensa configurations you
tested it against, especially the ones with and without loop
instructions) before I approve it. Please be sure the assembler can
relax the loops it generates as well. I don't see any particular
problem, but there are many, many gotchas when dealing with xtensa
loop instructions.

It also appears that Tensilica has stopped posting test results for
Xtensa, which makes it difficult to evaluate the quality of this
patch.

Thanks,

Sterling

Re: [PATCH][ARM]Use of vcvt for float to fixed point conversions.

2014-01-08 Thread Christophe Lyon

Hi Renlin,

The new test you added introduces 2 new FAILs when the target is
arm-none-linux-gnueabi (as opposed to arm-none-linux-gnueabihf).

Christophe.


On 24 December 2013 15:46, Renlin Li  wrote:
> Hi,
>
> I just updated my patch according to your suggestion.
> Thank you for committing it for me!
>
> All you guys have a nice Xmas break!
>
> Kind regards,
> Renlin Li
>
>
> On 04/12/13 11:23, Ramana Radhakrishnan wrote:
>>
>> Sorry about the slow response. Been on holiday.
>>
>> On 20/11/13 16:27, Renlin Li wrote:
>>>
>>> Hi all,
>>>
>>> This patch will make the arm back-end use vcvt for float to fixed point
>>> conversions when applicable.
>>>
>>> Test on arm-none-linux-gnueabi has been done on the model.
>>> Okay for trunk?
>>>
>>> + (define_insn "*combine_vcvtf2i"
>>> +   [(set (match_operand:SI 0 "s_register_operand" "=r")
>>> +   (fix:SI (fix:SF (mult:SF (match_operand:SF 1 "s_register_operand" "t")
>>> +(match_operand 2
>>> +"const_double_vcvt_power_of_two" "Dp")))))]
>>> +   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP3 && !flag_rounding_math"
>>> +   "vcvt%?.s32.f32\\t%1, %1, %v2\;vmov%?\\t%0, %1"
>>> +   [(set_attr "predicable" "yes")
>>> +(set_attr "predicable_short_it" "no")
>>> +(set_attr "ce_count" "2")
>>> +(set_attr "type" "f_cvtf2i")]
>>> + )
>>> +
>>
>> You need to set length to 8.
>>
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/arm/fixed_float_conversion.c
>>> @@ -0,0 +1,15 @@
>>> +/* Check that vcvt is used for fixed and float data conversions.  */
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O1 -mfpu=vfp3" } */
>>> +/* { dg-require-effective-target arm_vfp_ok } */
>>> +float fixed_to_float(int i)
>>> +{
>>> +return ((float)i / (1 << 16));
>>> +}
>>> +
>>> +int float_to_fixed(float f)
>>> +{
>>> +return ((int)(f*(1 << 16)));
>>> +}
>>> +/* { dg-final { scan-assembler "vcvt.f32.s32" } } */
>>> +/* { dg-final { scan-assembler "vcvt.s32.f32" } } */
>>
>>
>> GNU coding style for functions.
>>
>> Ok with those changes.
>>
>>
>>
>>
>> regards
>> Ramana
>>
>>
>>> Kind regards,
>>> Renlin Li
>>>
>>>
>>> gcc/ChangeLog:
>>>
>>> 2013-11-20  Renlin Li  
>>>
>>>* config/arm/arm-protos.h (vfp_const_double_for_bits): Declare.
>>>* config/arm/constraints.md (Dp): Define new constraint.
>>>* config/arm/predicates.md (const_double_vcvt_power_of_two): Define
>>>new predicate.
>>>* config/arm/arm.c (arm_print_operand): Add print for new function.
>>>(vfp3_const_double_for_bits): New function.
>>>* config/arm/vfp.md (combine_vcvtf2i): Define new instruction.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2013-11-20  Renlin Li  
>>>
>>>* gcc.target/arm/fixed_float_conversion.c: New test case.
>>>
>

Re: [Patch] libgcov.c re-factoring

2014-01-08 Thread Teresa Johnson

On Wed, Jan 8, 2014 at 6:34 AM, Jan Hubicka  wrote:
>>
>> Actually, I tried changing these two, but gcc_checking_assert is
>> undefined in libgcov.a. Ok to commit without this change?
>
> OK.
> incrementally can you please define gcov_nonruntime_assert that will wind into
> gcc_assert for code within gcc/coverage tools and into nothing for libgcov
> runtime and we can change those offenders to that.

Ok, committed as r206435. Will send the assert patch in a follow-up
later this week.

Teresa

>
> Honza
>>
>> Teresa
>>
>> >
>> > Thanks,
>> > Teresa
>> >
>> >>
>> >> Thanks,
>> >> Honza
>> >
>> >
>> >
>> > --
>> > Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

[PATCH] Add zero-overhead looping for xtensa backend

2014-01-08 Thread Felix Yang

Hi Sterling,

  This patch implements zero-overhead looping for xtensa backend using
hw-doloop facility.
  If OK for trunk, please apply it for me. Thanks.
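
As background for the doloop_end pattern in the patch below: the RTL says "if the counter is not 1, decrement it and jump back to the top of the loop". A C model of that control flow, assuming the counter starts at 1 or more, might look like this. The function name doloop_iterations is made up, and the model ignores how the hardware LOOP instruction actually tracks the count.

```c
#include <assert.h>

/* Model of a loop closed by a doloop_end-style branch.  Returns how many
   times the loop body runs for a given initial COUNT (assumed >= 1).  */
static unsigned long
doloop_iterations (unsigned int count)
{
  unsigned long iterations = 0;

  for (;;)
    {
      iterations++;             /* the loop body */

      if (count != 1)           /* doloop_end: compare against 1 ...  */
        {
          count--;              /* ... decrement ...  */
          continue;             /* ... and branch to the loop top.  */
        }
      break;                    /* fall through when the count reaches 1 */
    }
  return iterations;
}
```

So a counter of N gives exactly N trips through the body, which is the property the hardware loop relies on.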


Index: gcc/ChangeLog
===
--- gcc/ChangeLog(revision 206431)
+++ gcc/ChangeLog(working copy)
@@ -1,3 +1,18 @@
+2014-01-08  Felix Yang  
+
+* config/xtensa/xtensa.c (xtensa_reorg): New.
+(xtensa_reorg_loops): New.
+(xtensa_can_use_doloop_p): New.
+(xtensa_invalid_within_doloop): New.
+(hwloop_optimize): New.
+(hwloop_fail): New.
+(hwloop_pattern_reg): New.
+(xtensa_emit_loop_end): Modified to emit the zero-overhead loop end label.
+(xtensa_doloop_hooks): Define.
+* config/xtensa/xtensa.md (doloop_end): New.
+(zero_cost_loop_start): Rewritten.
+(zero_cost_loop_end): Rewritten.
+
 2014-01-08  Marek Polacek  

 PR middle-end/59669
Index: gcc/config/xtensa/xtensa.md
===
--- gcc/config/xtensa/xtensa.md(revision 206431)
+++ gcc/config/xtensa/xtensa.md(working copy)
@@ -35,6 +35,8 @@
   (UNSPEC_TLS_CALL      9)
   (UNSPEC_TP            10)
   (UNSPEC_MEMW          11)
+  (UNSPEC_LSETUP_START  12)
+  (UNSPEC_LSETUP_END    13)

   (UNSPECV_SET_FP       1)
   (UNSPECV_ENTRY        2)
@@ -1289,6 +1291,8 @@
(set_attr "length""3")])


+;; Hardware loop support.
+
 ;; Define the loop insns used by bct optimization to represent the
 ;; start and end of a zero-overhead loop (in loop.c).  This start
 ;; template generates the loop insn; the end template doesn't generate
@@ -1296,34 +1300,58 @@

 (define_insn "zero_cost_loop_start"
   [(set (pc)
-(if_then_else (eq (match_operand:SI 0 "register_operand" "a")
-  (const_int 0))
-  (label_ref (match_operand 1 "" ""))
-  (pc)))
-   (set (reg:SI 19)
-(plus:SI (match_dup 0) (const_int -1)))]
+(if_then_else (ne (match_operand:SI 2 "nonimmediate_operand" "0")
+  (const_int 1))
+  (label_ref (match_operand 1 "" ""))
+  (pc)))
+   (set (match_operand:SI 0 "nonimmediate_operand" "=a")
+(plus (match_dup 2)
+  (const_int -1)))
+   (unspec [(const_int 0)] UNSPEC_LSETUP_START)]
   ""
-  "loopnez\t%0, %l1"
+  "loop\t%0, %l1_LEND"
   [(set_attr "type" "jump")
    (set_attr "mode" "none")
    (set_attr "length" "3")])

 (define_insn "zero_cost_loop_end"
   [(set (pc)
-(if_then_else (ne (reg:SI 19) (const_int 0))
-  (label_ref (match_operand 0 "" ""))
-  (pc)))
-   (set (reg:SI 19)
-(plus:SI (reg:SI 19) (const_int -1)))]
+(if_then_else (ne (match_operand:SI 2 "nonimmediate_operand" "0")
+  (const_int 1))
+  (label_ref (match_operand 1 "" ""))
+  (pc)))
+   (set (match_operand:SI 0 "nonimmediate_operand" "=a")
+(plus (match_dup 2)
+  (const_int -1)))
+   (unspec [(const_int 0)] UNSPEC_LSETUP_END)]
   ""
 {
-xtensa_emit_loop_end (insn, operands);
-return "";
+  xtensa_emit_loop_end (insn, operands);
+  return "";
 }
   [(set_attr "type" "jump")
    (set_attr "mode" "none")
    (set_attr "length" "0")])

+; operand 0 is the loop count pseudo register
+; operand 1 is the label to jump to at the top of the loop
+(define_expand "doloop_end"
+  [(parallel [(set (pc) (if_then_else
+  (ne (match_operand:SI 0 "" "")
+  (const_int 1))
+  (label_ref (match_operand 1 "" ""))
+  (pc)))
+  (set (match_dup 0)
+   (plus:SI (match_dup 0)
+(const_int -1)))
+  (unspec [(const_int 0)] UNSPEC_LSETUP_END)])]
+  ""
+{
+  /* The loop optimizer doesn't check the predicates... */
+  if (GET_MODE (operands[0]) != SImode)
+FAIL;
+})
+

 ;; Setting a register from a comparison.

Index: gcc/config/xtensa/xtensa.c
===
--- gcc/config/xtensa/xtensa.c(revision 206431)
+++ gcc/config/xtensa/xtensa.c(working copy)
@@ -1,6 +1,7 @@
 /* Subroutines for insn-output.c for Tensilica's Xtensa architecture.
Copyright (C) 2001-2014 Free Software Foundation, Inc.
Contributed by Bob Wilson (bwil...@tensilica.com) at Tensilica.
+   Zero-overhead looping support by Felix Yang (felix.yang0...@gmail.com).

 This file is part of GCC.

@@ -61,8 +62,9 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple.h"
 #include "gimplify.h"
 #include "df.h"
+#include "hw-doloop.h"
+#include "dumpfile.h"

-
 /* Enumeration for all of the relational tests, so that we can build
arrays indexed by the test type, and not worry about the order
of EQ, NE, etc.  */
@@ -186,6 +188,10 @@ static reg_class_t xtensa_secondary_reload (bool,

 static bool constantpool_address_p (con

[Patch, Fortran] PR 58182: [4.9 Regression] ICE with global binding name used as a FUNCTION

2014-01-08 Thread Janus Weil

Hi all,

I just committed an 'obvious' patch for an ICE-on-invalid regression on trunk:

http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=206429

Cheers,
Janus

Re: [PATCH] Don't segv in omp-low.c (PR middle-end/59669)

2014-01-08 Thread Jakub Jelinek

On Wed, Jan 08, 2014 at 04:25:47PM +0100, Marek Polacek wrote:
> Indeed it does.  So like this?
> 
> 2014-01-08  Marek Polacek  
> 
>   PR middle-end/59669
>   * omp-low.c (simd_clone_adjust): Don't crash if def is NULL.
> testsuite/
>   * gcc.dg/gomp/pr59669-1.c: New test.
>   * gcc.dg/gomp/pr59669-2.c: New test.

Yep, thanks.

Jakub

Re: [PATCH] Don't segv in omp-low.c (PR middle-end/59669)

2014-01-08 Thread Marek Polacek

On Wed, Jan 08, 2014 at 04:14:06PM +0100, Jakub Jelinek wrote:
> On Wed, Jan 08, 2014 at 04:09:08PM +0100, Marek Polacek wrote:
> > We can also get NULL for the default definition, so we need to handle that
> > before calling has_zero_uses on it.
> > 
> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> > Looks ok, but there is similar code a few lines above, can you please fix it up
> > and add it to the testcase?
> 
> I'd think
> #pragma omp declare simd uniform(a) aligned(a:32)
> void
> bar (int *a)
> {
> }
> 
> could hit the other spot.

Indeed it does.  So like this?

2014-01-08  Marek Polacek  

PR middle-end/59669
* omp-low.c (simd_clone_adjust): Don't crash if def is NULL.
testsuite/
* gcc.dg/gomp/pr59669-1.c: New test.
* gcc.dg/gomp/pr59669-2.c: New test.

--- gcc/omp-low.c.mp   2014-01-08 13:48:40.353624984 +0100
+++ gcc/omp-low.c   2014-01-08 16:21:06.247268557 +0100
@@ -11537,7 +11537,7 @@ simd_clone_adjust (struct cgraph_node *n
unsigned int alignment = node->simdclone->args[i].alignment;
tree orig_arg = node->simdclone->args[i].orig_arg;
tree def = ssa_default_def (cfun, orig_arg);
-   if (!has_zero_uses (def))
+   if (def && !has_zero_uses (def))
  {
tree fn = builtin_decl_explicit (BUILT_IN_ASSUME_ALIGNED);
gimple_seq seq = NULL;
@@ -11587,7 +11587,7 @@ simd_clone_adjust (struct cgraph_node *n
tree def = ssa_default_def (cfun, orig_arg);
gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (orig_arg))
|| POINTER_TYPE_P (TREE_TYPE (orig_arg)));
-   if (!has_zero_uses (def))
+   if (def && !has_zero_uses (def))
  {
iter1 = make_ssa_name (orig_arg, NULL);
iter2 = make_ssa_name (orig_arg, NULL);
--- gcc/testsuite/gcc.dg/gomp/pr59669-1.c.mp   2014-01-08 13:50:23.710492087 +0100
+++ gcc/testsuite/gcc.dg/gomp/pr59669-1.c   2014-01-08 13:50:54.339622411 +0100
@@ -0,0 +1,9 @@
+/* PR middle-end/59669 */
+/* { dg-do compile } */
+/* { dg-options "-fopenmp" } */
+
+#pragma omp declare simd linear(a)
+void
+foo (int a)
+{
+}
--- gcc/testsuite/gcc.dg/gomp/pr59669-2.c.mp   2014-01-08 16:20:35.553121408 +0100
+++ gcc/testsuite/gcc.dg/gomp/pr59669-2.c   2014-01-08 16:20:54.099210269 +0100
@@ -0,0 +1,9 @@
+/* PR middle-end/59669 */
+/* { dg-do compile } */
+/* { dg-options "-fopenmp" } */
+
+#pragma omp declare simd uniform(a) aligned(a:32)
+void
+bar (int *a)
+{
+}

Marek

Re: [Patch,ARM] crypto intrinsics in AArch32 testsuite fix

2014-01-08 Thread Kyrill Tkachov


On 08/01/14 15:00, Christophe Lyon wrote:

Hi,

Commit 206131 introduced check_effective_target_arm_crypto_ok in
lib/target-supports.exp, to check that the target supports
-mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp.

However, when GCC is configured for target arm-none-linux-gnueabihf, I
can see all the new tests fail:
sysroot-arm-none-linux-gnueabihf/usr/include/gnu/stubs.h:7:29: fatal
error: gnu/stubs-soft.h: No such file or directory

(stubs.h is included via arm_neon.h)

This is because check_effective_target_arm_crypto_ok sample test is
too simple. Making it include arm_neon.h does the trick (and makes the
tests UNSUPPORTED rather than FAIL).

OK?

Christophe.


Hi Christophe,

I believe the best solution here is to figure out the best combination of
-mfloat-abi and -mfpu options, like we do for the NEON options (look, for
example, at check_effective_target_arm_neon_ok_nocache in target-supports.exp).


That way these tests will not add -mfloat-abi=softfp to an 
arm-none-linux-gnueabihf target (which is the root of the problem) and they will 
PASS instead of being just UNSUPPORTED.


I have a patch for that in testing.

Thanks,
Kyrill

Re: [PATCH] Don't segv in omp-low.c (PR middle-end/59669)

2014-01-08 Thread Jakub Jelinek

On Wed, Jan 08, 2014 at 04:09:08PM +0100, Marek Polacek wrote:
> We can also get NULL for the default definition, so we need to handle that
> before calling has_zero_uses on it.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?

Looks ok, but there is similar code a few lines above, can you please fix it up
and add it to the testcase?

I'd think
#pragma omp declare simd uniform(a) aligned(a:32)
void
bar (int *a)
{
}

could hit the other spot.

Jakub

[PATCH] Don't segv in omp-low.c (PR middle-end/59669)

2014-01-08 Thread Marek Polacek

We can also get NULL for the default definition, so we need to handle that
before calling has_zero_uses on it.
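
The shape of the fix is an ordinary short-circuit guard: test the pointer before handing it to a predicate that dereferences it. A stand-alone sketch, with made-up types and names (ssa_def_sketch, has_zero_uses_sketch and needs_handling are not GCC's):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an SSA default definition.  */
struct ssa_def_sketch
{
  unsigned int num_uses;
};

/* Like the real predicate, this dereferences its argument, so it must
   not be called with NULL.  */
static int
has_zero_uses_sketch (const struct ssa_def_sketch *def)
{
  return def->num_uses == 0;
}

/* The guarded condition: "def && !has_zero_uses (def)".  The && operator
   evaluates its right operand only when DEF is non-null.  */
static int
needs_handling (const struct ssa_def_sketch *def)
{
  return def && !has_zero_uses_sketch (def);
}
```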

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2014-01-08  Marek Polacek  

PR middle-end/59669
* omp-low.c (simd_clone_adjust): Don't crash if def is NULL.
testsuite/
* gcc.dg/gomp/pr59669.c: New test.

--- gcc/omp-low.c.mp   2014-01-08 13:48:40.353624984 +0100
+++ gcc/omp-low.c   2014-01-08 13:48:47.780656551 +0100
@@ -11587,7 +11587,7 @@ simd_clone_adjust (struct cgraph_node *n
tree def = ssa_default_def (cfun, orig_arg);
gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (orig_arg))
|| POINTER_TYPE_P (TREE_TYPE (orig_arg)));
-   if (!has_zero_uses (def))
+   if (def && !has_zero_uses (def))
  {
iter1 = make_ssa_name (orig_arg, NULL);
iter2 = make_ssa_name (orig_arg, NULL);
--- gcc/testsuite/gcc.dg/gomp/pr59669.c.mp  2014-01-08 13:50:23.710492087 +0100
+++ gcc/testsuite/gcc.dg/gomp/pr59669.c 2014-01-08 13:50:54.339622411 +0100
@@ -0,0 +1,9 @@
+/* PR middle-end/59669 */
+/* { dg-do compile } */
+/* { dg-options "-fopenmp" } */
+
+#pragma omp declare simd linear(a)
+void
+foo (int a)
+{
+}

Marek

[Patch,ARM] crypto intrinsics in AArch32 testsuite fix

2014-01-08 Thread Christophe Lyon

Hi,

Commit 206131 introduced check_effective_target_arm_crypto_ok in
lib/target-supports.exp, to check that the target supports
-mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp.

However, when GCC is configured for target arm-none-linux-gnueabihf, I
can see all the new tests fail:
sysroot-arm-none-linux-gnueabihf/usr/include/gnu/stubs.h:7:29: fatal
error: gnu/stubs-soft.h: No such file or directory

(stubs.h is included via arm_neon.h)

This is because check_effective_target_arm_crypto_ok sample test is
too simple. Making it include arm_neon.h does the trick (and makes the
tests UNSUPPORTED rather than FAIL).

OK?

Christophe.

2014-01-08  Christophe Lyon  

* lib/target-supports.exp (check_effective_target_arm_crypto_ok):
Include arm_neon.h in sample test.
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index a8910bb..cc10936 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2014-01-08  Christophe Lyon  
+
+   * lib/target-supports.exp (check_effective_target_arm_crypto_ok):
+   Include arm_neon.h in sample test.
+
 2014-01-07  Paolo Carlini  
 
* g++.dg/ext/is_base_of_incomplete-2.C: New.
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 5166679..7b40ccd 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2305,6 +2305,7 @@ proc check_effective_target_arm_unaligned { } {
 proc check_effective_target_arm_crypto_ok {} {
 if { [check_effective_target_arm32] } {
return [check_no_compiler_messages arm_crypto_ok object {
+ #include <arm_neon.h>
  int foo (void)
  {
 __asm__ volatile ("aese.8 q0, q0");

Re: Extend -fstack-protector-strong to cover calls with return slot

2014-01-08 Thread Florian Weimer


On 01/07/2014 02:37 PM, Jakub Jelinek wrote:

On Tue, Jan 07, 2014 at 02:27:04PM +0100, Florian Weimer wrote:

gimplify_modify_expr_rhs, in the CALL_EXPR case:

  if (use_target)
{
  CALL_EXPR_RETURN_SLOT_OPT (*from_p) = 1;
  mark_addressable (*to_p);
}


Yeah, that sets it in some cases too, not in other testcases.

Just look at how the flag is used when actually expanding it:

 if (target && MEM_P (target) && CALL_EXPR_RETURN_SLOT_OPT (exp))
   structure_value_addr = XEXP (target, 0);
 else
   {
 /* For variable-sized objects, we must be called with a target
specified.  If we were to allocate space on the stack here,
we would have no way of knowing when to free it.  */
 rtx d = assign_temp (rettype, 1, 1);
 structure_value_addr = XEXP (d, 0);
 target = 0;
   }


Okay, I'm beginning to understand.  I tried to actually reach the second 
branch, and ended up with PR59711. :)


foo12 in the new C testcase covers it in part without a variable-sized 
object.



so, if it is set, the address of the var on the LHS is passed to the
function as hidden argument, if it is not set, we pass address of
a stack temporary instead.  Both the automatic var and the stack temporary
can overflow, if the callee does something wrong.
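
To illustrate the hidden argument described above: on many ABIs a function returning a large aggregate by value is effectively given a pointer to caller-provided storage (the return slot) and writes its result there. The sketch below shows the source-level function next to a hand-written lowered form. It is conceptual only; the names are invented and the actual calling convention depends on the target.

```c
#include <assert.h>
#include <string.h>

/* An aggregate too large to be returned in registers on typical ABIs.  */
struct big_result
{
  char buf[64];
};

/* What the programmer writes: return the aggregate by value.  */
static struct big_result
make_result (char fill)
{
  struct big_result r;
  memset (r.buf, fill, sizeof r.buf);
  return r;
}

/* Roughly what the callee does after lowering: the caller passes the
   address of the return slot, and the callee writes straight into it.
   A buggy callee writing past the end would clobber the caller's frame.  */
static void
make_result_lowered (struct big_result *return_slot, char fill)
{
  memset (return_slot->buf, fill, sizeof return_slot->buf);
}
```

Whether the slot is the variable on the left-hand side or a stack temporary, it lives in the caller's frame, which is why the caller is the function that needs the protection.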


What about the attached version?  It still does not exactly match your 
original suggestion because gimple_call_lhs (stmt) can be NULL_TREE if 
the result is ignored and this case needs instrumentation, as you 
explained, so I use the function return type in the aggregate_value_p check.


Testing is still under way, but looks good so far.  I'm bootstrapping 
with BOOT_CFLAGS="-O2 -g -fstack-protector-strong" with Ada enabled, for 
additional coverage.


--
Florian Weimer / Red Hat Product Security Team
gcc/

2014-01-08  Florian Weimer  

	* cfgexpand.c (stack_protect_decl_p): New function, extracted from
	expand_used_vars.
	(stack_protect_return_slot_p): New function.
	(expand_used_vars): Call stack_protect_decl_p and
	stack_protect_return_slot_p for -fstack-protector-strong.

gcc/testsuite/

2014-01-08  Florian Weimer  

	* gcc.dg/fstack-protector-strong.c: Add coverage for return slots.
	* g++.dg/fstack-protector-strong.C: Likewise.
	* gcc.target/i386/ssp-strong-reg.c: New file.

Index: gcc/cfgexpand.c
===
--- gcc/cfgexpand.c	(revision 206311)
+++ gcc/cfgexpand.c	(working copy)
@@ -1599,6 +1599,52 @@
   return 0;
 }
 
+/* Check if the current function has local referenced variables that
+   have their addresses taken, contain an array, or are arrays.  */
+
+static bool
+stack_protect_decl_p ()
+{
+  unsigned i;
+  tree var;
+
+  FOR_EACH_LOCAL_DECL (cfun, i, var)
+if (!is_global_var (var))
+  {
+	tree var_type = TREE_TYPE (var);
+	if (TREE_CODE (var) == VAR_DECL
+	&& (TREE_CODE (var_type) == ARRAY_TYPE
+		|| TREE_ADDRESSABLE (var)
+		|| (RECORD_OR_UNION_TYPE_P (var_type)
+		&& record_or_union_type_has_array_p (var_type))))
+	  return true;
+  }
+  return false;
+}
+
+/* Check if the current function has calls that use a return slot.  */
+
+static bool
+stack_protect_return_slot_p ()
+{
+  basic_block bb;
+  
+  FOR_ALL_BB_FN (bb, cfun)
+for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+	 !gsi_end_p (gsi); gsi_next (&gsi))
+  {
+	gimple stmt = gsi_stmt (gsi);
+	/* This assumes that calls to internal-only functions never
+	   use a return slot.  */
+	if (is_gimple_call (stmt)
+	&& !gimple_call_internal_p (stmt)
+	&& aggregate_value_p (TREE_TYPE (gimple_call_fntype (stmt)),
+  gimple_call_fndecl (stmt)))
+	  return true;
+  }
+  return false;
+}
+
 /* Expand all variables used in the function.  */
 
 static rtx
@@ -1669,22 +1715,8 @@
   pointer_map_destroy (ssa_name_decls);
 
   if (flag_stack_protect == SPCT_FLAG_STRONG)
-FOR_EACH_LOCAL_DECL (cfun, i, var)
-  if (!is_global_var (var))
-	{
-	  tree var_type = TREE_TYPE (var);
-	  /* Examine local referenced variables that have their addresses taken,
-	 contain an array, or are arrays.  */
-	  if (TREE_CODE (var) == VAR_DECL
-	  && (TREE_CODE (var_type) == ARRAY_TYPE
-		  || TREE_ADDRESSABLE (var)
-		  || (RECORD_OR_UNION_TYPE_P (var_type)
-		  && record_or_union_type_has_array_p (var_type))))
-	{
-	  gen_stack_protect_signal = true;
-	  break;
-	}
-	}
+  gen_stack_protect_signal
+	= stack_protect_decl_p () || stack_protect_return_slot_p ();
 
   /* At this point all variables on the local_decls with TREE_USED
  set are not associated with any block scope.  Lay them out.  */
Index: gcc/testsuite/g++.dg/fstack-protector-strong.C
===
--- gcc/testsuite/g++.dg/fstack-protector-strong.C	(revision 206311)
+++ gcc/testsuite/g++.dg/fstack-protector-

Re: [Patch] libgcov.c re-factoring

2014-01-08 Thread Jan Hubicka

> 
> Actually, I tried changing these two, but gcc_checking_assert is
> undefined in libgcov.a. Ok to commit without this change?

OK.
Incrementally, can you please define gcov_nonruntime_assert that expands to
gcc_assert for code within gcc/coverage tools and to nothing for the libgcov
runtime, so we can change those offenders to use it?
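
A minimal sketch of what such a macro could look like follows.  It
assumes IN_LIBGCOV is nonzero only when building the runtime library (as
in the gcov-io.c hunk quoted in this thread); sketch_gcc_assert,
sketch_mode and sketch_position are hypothetical stand-ins so that the
fragment compiles without GCC's own headers.

```c
#include <stdlib.h>

/* Assumption: IN_LIBGCOV is nonzero when building the libgcov
   runtime.  Default to the non-runtime configuration so that the
   sketch compiles standalone.  */
#ifndef IN_LIBGCOV
#define IN_LIBGCOV 0
#endif

/* Stand-in for gcc_assert, so no GCC headers are needed.  */
#define sketch_gcc_assert(EXPR) ((void) ((EXPR) ? 0 : (abort (), 0)))

#if IN_LIBGCOV
/* Runtime library: the expression is still parsed and type-checked,
   but never evaluated, so no code is generated for it.  */
#define gcov_nonruntime_assert(EXPR) ((void) (0 && (EXPR)))
#else
/* Compiler and coverage tools: a real assertion.  */
#define gcov_nonruntime_assert(EXPR) sketch_gcc_assert (EXPR)
#endif

/* Hypothetical caller, loosely modelled on gcov_position.  */
static int sketch_mode = 1;

unsigned
sketch_position (unsigned start, unsigned offset)
{
  gcov_nonruntime_assert (sketch_mode > 0);
  return start + offset;
}
```

Keeping the expression inside 0 && (...) rather than dropping it
entirely means variables used only in assertions do not trigger
unused-variable warnings in the runtime build.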

Honza
> 
> Teresa
> 
> >
> > Thanks,
> > Teresa
> >
> >>
> >> Thanks,
> >> Honza
> >
> >
> >
> > --
> > Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
> 
> 
> 
> -- 
> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Re: [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115, take 2)

2014-01-08 Thread Richard Biener

On Wed, Jan 8, 2014 at 1:45 PM, Jakub Jelinek  wrote:
> On Wed, Jan 08, 2014 at 12:32:59PM +0100, Richard Biener wrote:
>> > Either before writing PCH c-common.c could call some tree.c routine that
>> > would traverse the cl_option_hash_table hash table and for every
>> > TARGET_OPTION_NODE in the hash table clear TREE_TARGET_GLOBALS.
>> > Or perhaps some gengtype extension to run some routine before PCH saving
>> > on the tree_target_option structs and clear the globals field in there.
>> > Or use GTY((user)) on tree_target_option, but then dunno how we'd handle 
>> > the
>> > marking of the embedded opts field (and common).
>> > Any ideas?
>>
>> Well, a GTY((skip_pch)) would probably work.  Or move the thing
>> out-of GC land (thus make cl_option_hash_table persistent) and
>> simply GTY((skip)) the pointer completely.  Not sure if we ever
>> collect from it.
>
> Even if the pointer was out of GC land and GTY((skip)), we'd need to clear
> it somewhere during PCH saving, as the containing structure is GC allocated.
>
> I've already implemented in the mean time the variant with the
> htab_traverse, all still reachable TARGET_OPTION_NODE trees should be in that
> hash table.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux (in both cases with
> --enable-checking=yes,rtl and --enable-checking=release, for the
> i686-linux/release checking I had to fix an unrelated compare debug issue
> I'll post when I manage to reduce testcase).
>
> I'd like to get rid of all the XCNEW calls in target-globals.c as a
> follow-up.
>
> As for performance, for --enable-checking=release from very rough check
> on make -j48 bootstrap and make -j48 check times the patch is compile time
> neutral, on e.g. declare-simd-1.C testcase g++ is twice as fast with the
> patch though (~ 0.8 sec without the patch, ~ 0.3 sec with the patch, both
> for x86_64 and i686).
>
> Ok for trunk?

Works for me.  Wait a bit for others to comment though.

Thanks,
Richard.

> 2014-01-07  Jakub Jelinek  
>
> PR target/58115
> * tree-core.h (struct target_globals): New forward declaration.
> (struct tree_target_option): Add globals field.
> * tree.h (TREE_TARGET_GLOBALS): Define.
> (prepare_target_option_nodes_for_pch): New prototype.
> * target-globals.h (struct target_globals): Define even if
> !SWITCHABLE_TARGET.
> * tree.c (prepare_target_option_node_for_pch,
> prepare_target_option_nodes_for_pch): New functions.
> * config/i386/i386.h (SWITCHABLE_TARGET): Define.
> * config/i386/i386.c: Include target-globals.h.
> (ix86_set_current_function): Instead of doing target_reinit
> unconditionally, use save_target_globals_default_opts and
> restore_target_globals.
> c-family/
> * c-pch.c (c_common_write_pch): Call
> prepare_target_option_nodes_for_pch.
>
> --- gcc/tree-core.h.jj  2014-01-07 08:47:24.0 +0100
> +++ gcc/tree-core.h 2014-01-07 16:44:35.591358235 +0100
> @@ -1557,11 +1557,18 @@ struct GTY(()) tree_optimization_option
>struct target_optabs *GTY ((skip)) base_optabs;
>  };
>
> +/* Forward declaration, defined in target-globals.h.  */
> +
> +struct GTY(()) target_globals;
> +
>  /* Target options used by a function.  */
>
>  struct GTY(()) tree_target_option {
>struct tree_common common;
>
> +  /* Target globals for the corresponding target option.  */
> +  struct target_globals *globals;
> +
>/* The optimization options used by the user.  */
>struct cl_target_option opts;
>  };
> --- gcc/tree.h.jj   2014-01-03 11:40:33.0 +0100
> +++ gcc/tree.h  2014-01-07 21:28:15.038061120 +0100
> @@ -2695,9 +2695,14 @@ extern tree build_optimization_node (str
>  #define TREE_TARGET_OPTION(NODE) \
>(&TARGET_OPTION_NODE_CHECK (NODE)->target_option.opts)
>
> +#define TREE_TARGET_GLOBALS(NODE) \
> +  (TARGET_OPTION_NODE_CHECK (NODE)->target_option.globals)
> +
>  /* Return a tree node that encapsulates the target options in OPTS.  */
>  extern tree build_target_option_node (struct gcc_options *opts);
>
> +extern void prepare_target_option_nodes_for_pch (void);
> +
>  #if defined ENABLE_TREE_CHECKING && (GCC_VERSION >= 2007)
>
>  inline tree
> --- gcc/target-globals.h.jj 2014-01-03 11:40:46.0 +0100
> +++ gcc/target-globals.h 2014-01-07 17:08:51.113880947 +0100
> @@ -37,6 +37,7 @@ extern struct target_builtins *this_targ
>  extern struct target_gcse *this_target_gcse;
>  extern struct target_bb_reorder *this_target_bb_reorder;
>  extern struct target_lower_subreg *this_target_lower_subreg;
> +#endif
>
>  struct GTY(()) target_globals {
>struct target_flag_state *GTY((skip)) flag_state;
> @@ -57,6 +58,7 @@ struct GTY(()) target_globals {
>struct target_lower_subreg *GTY((skip)) lower_subreg;
>  };
>
> +#if SWITCHABLE_TARGET
>  extern struct target_globals default_target_globals;
>
>  extern struct target_globals *save_target_globals (void);
> --- gcc/tree.c.jj

Workaround PR59584 on 4.8 "Fix use of stack-pointer-register as a temporary for CRIS"

2014-01-08 Thread Hans-Peter Nilsson

> From: Hans-Peter Nilsson 
> Date: Mon, 23 Dec 2013 23:34:02 +0100

Just as previously done on trunk, I'm going to cover up PR59584
(which was fixed and then exposed on the 4.8 branch) by applying
commit r206187 from trunk below.  Again, the PR bug is an ICE
caused by the combination of expr.c:find_args_size_adjust and
expr.c:fixup_args_size_notes not being able to handle a define_split
matching for the stack-adjustment assignment instruction emitted
by __builtin_stack_restore (the insn that gets the REG_ARGS_SIZE
note).

*This* bug is slightly different but the fix happens to cover up
that bug by not matching the splitter for the stack-pointer; the
destination is used as a temporary, so sp is set to something
unusable as a stack-pointer, ungood.

Tested cris-elf, makes gcc.dg/pr50251.c pass again, will commit
to the 4.8 branch.
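
The idea behind the new predicate can be sketched in plain C as follows.
This is only an illustration: the real cris_nonsp_register_operand is a
define_predicate in config/cris/predicates.md and works on RTL, not on
bare register numbers, and the register numbering used here (stack
pointer as register 14 of 16) is an assumption made for the example.

```c
#include <stdbool.h>

/* Hypothetical register model for illustration only.  */
enum { SKETCH_SP_REGNUM = 14, SKETCH_NUM_REGS = 16 };

/* Stand-in for register_operand: any valid register number.  */
static bool
sketch_register_operand (int regno)
{
  return regno >= 0 && regno < SKETCH_NUM_REGS;
}

/* Stand-in for the new predicate: any register except the stack
   pointer.  A splitter whose destination uses this predicate can no
   longer match when the destination is sp, so sp is never used as a
   temporary.  */
bool
sketch_nonsp_register_operand (int regno)
{
  return sketch_register_operand (regno) && regno != SKETCH_SP_REGNUM;
}
```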

>   PR middle-end/59584
>   * config/cris/predicates.md (cris_nonsp_register_operand):
>   New define_predicate.
>   * config/cris/cris.md: Replace register_operand with
>   cris_nonsp_register_operand for destinations in all
>   define_splits where a register is set more than once.
> 
> Index: gcc/config/cris/cris.md
> ===
> --- gcc/config/cris/cris.md   (revision 206176)
> +++ gcc/config/cris/cris.md   (working copy)
> @@ -758,7 +758,7 @@ (define_split
> (match_operand:SI 1 "const_int_operand" ""))
>(match_operand:SI 2 "register_operand" ""))])
> (match_operand 3 "register_operand" ""))
> - (set (match_operand:SI 4 "register_operand" "")
> + (set (match_operand:SI 4 "cris_nonsp_register_operand" "")
> (plus:SI (mult:SI (match_dup 0)
>   (match_dup 1))
>  (match_dup 2)))])]
> @@ -859,7 +859,7 @@ (define_split
>(match_operand:SI 0 "cris_bdap_operand" "")
>(match_operand:SI 1 "cris_bdap_operand" ""))])
> (match_operand 2 "register_operand" ""))
> - (set (match_operand:SI 3 "register_operand" "")
> + (set (match_operand:SI 3 "cris_nonsp_register_operand" "")
> (plus:SI (match_dup 0) (match_dup 1)))])]
>"reload_completed && reg_overlap_mentioned_p (operands[3], operands[2])"
>[(set (match_dup 4) (match_dup 2))
> @@ -3960,7 +3960,7 @@ (define_expand "casesi"
>  ;; up.
>  
>  (define_split
> -  [(set (match_operand 0 "register_operand" "")
> +  [(set (match_operand 0 "cris_nonsp_register_operand" "")
>   (match_operator
>4 "cris_operand_extend_operator"
>[(match_operand 1 "register_operand" "")
> @@ -3990,7 +3990,7 @@ (define_split
>  ;; Call this op-extend-split-rx=rz
>  
>  (define_split
> -  [(set (match_operand 0 "register_operand" "")
> +  [(set (match_operand 0 "cris_nonsp_register_operand" "")
>   (match_operator
>4 "cris_plus_or_bound_operator"
>[(match_operand 1 "register_operand" "")
> @@ -4018,7 +4018,7 @@ (define_split
>  ;; Call this op-extend-split-swapped
>  
>  (define_split
> -  [(set (match_operand 0 "register_operand" "")
> +  [(set (match_operand 0 "cris_nonsp_register_operand" "")
>   (match_operator
>4 "cris_plus_or_bound_operator"
>[(match_operator
> @@ -4044,7 +4044,7 @@ (define_split
>  ;; bound.  Call this op-extend-split-swapped-rx=rz.
>  
>  (define_split
> -  [(set (match_operand 0 "register_operand" "")
> +  [(set (match_operand 0 "cris_nonsp_register_operand" "")
>   (match_operator
>4 "cris_plus_or_bound_operator"
>[(match_operator
> @@ -4075,7 +4075,7 @@ (define_split
>  ;; Call this op-extend.
>  
>  (define_split
> -  [(set (match_operand 0 "register_operand" "")
> +  [(set (match_operand 0 "cris_nonsp_register_operand" "")
>   (match_operator
>3 "cris_orthogonal_operator"
>[(match_operand 1 "register_operand" "")
> @@ -4099,7 +4099,7 @@ (define_split
>  ;; Call this op-split-rx=rz
>  
>  (define_split
> -  [(set (match_operand 0 "register_operand" "")
> +  [(set (match_operand 0 "cris_nonsp_register_operand" "")
>   (match_operator
>3 "cris_commutative_orth_op"
>[(match_operand 2 "memory_operand" "")
> @@ -4123,7 +4123,7 @@ (define_split
>  ;; Call this op-split-swapped.
>  
>  (define_split
> -  [(set (match_operand 0 "register_operand" "")
> +  [(set (match_operand 0 "cris_nonsp_register_operand" "")
>   (match_operator
>3 "cris_commutative_orth_op"
>[(match_operand 1 "register_operand" "")
> @@ -4146,7 +4146,7 @@ (define_split
>  ;; Call this op-split-swapped-rx=rz.
>  
>  (define_split
> -  [(set (match_operand 0 "register_operand" "")
> +  [(set (match_operand 0 "cris_nonsp_register_operand" "")
>   (match_operator
>3 "cris_orthogonal_operator"
>[(match_operand 2 "memory_operand" "")
> @@ -4555,10 +4555,11 @@ (define_split
>  ;; We're not allowed to generate copies of registers with different mode
>  ;; until after reload; copying pseudos upsets reload.  CVS

Re: [Patch] libgcov.c re-factoring

2014-01-08 Thread Teresa Johnson

On Mon, Jan 6, 2014 at 9:49 AM, Teresa Johnson  wrote:
> On Sun, Jan 5, 2014 at 12:08 PM, Jan Hubicka  wrote:
>>> 2014-01-03  Rong Xu  
>>>
>>> * gcc/gcov-io.c (gcov_var): Move from gcov-io.h.
>>> (gcov_position): Ditto.
>>> (gcov_is_error): Ditto.
>>> (gcov_rewrite): Ditto.
>>> * gcc/gcov-io.h: Refactor. Move gcov_var to gcov-io.h, and libgcov
>>> only part to libgcc/libgcov.h.
>>> * libgcc/libgcov-driver.c: Use libgcov.h.
>>> (buffer_fn_data): Use xmalloc instead of malloc.
>>> (gcov_exit_merge_gcda): Ditto.
>>> * libgcc/libgcov-driver-system.c (allocate_filename_struct): Ditto.
>>> * libgcc/libgcov.h: New common header files for libgcov-*.h.
>>> * libgcc/libgcov-interface.c: Use libgcov.h
>>> * libgcc/libgcov-merge.c: Ditto.
>>> * libgcc/libgcov-profiler.c: Ditto.
>>> * libgcc/Makefile.in: Add dependence to libgcov.h
>>
>> OK, with the licence changes and...
>>>
>>> Index: gcc/gcov-io.c
>>> ===
>>> --- gcc/gcov-io.c   (revision 206100)
>>> +++ gcc/gcov-io.c   (working copy)
>>> @@ -36,6 +36,61 @@ static const gcov_unsigned_t *gcov_read_words (uns
>>>  static void gcov_allocate (unsigned);
>>>  #endif
>>>
>>> +/* Optimum number of gcov_unsigned_t's read from or written to disk.  */
>>> +#define GCOV_BLOCK_SIZE (1 << 10)
>>> +
>>> +GCOV_LINKAGE struct gcov_var
>>> +{
>>> +  FILE *file;
>>> +  gcov_position_t start;   /* Position of first byte of block */
>>> +  unsigned offset; /* Read/write position within the block.  */
>>> +  unsigned length; /* Read limit in the block.  */
>>> +  unsigned overread;   /* Number of words overread.  */
>>> +  int error;   /* < 0 overflow, > 0 disk error.  */
>>> +  int mode;/* < 0 writing, > 0 reading */
>>> +#if IN_LIBGCOV
>>> +  /* Holds one block plus 4 bytes, thus all coverage reads & writes
>>> + fit within this buffer and we always can transfer GCOV_BLOCK_SIZE
>>> + to and from the disk. libgcov never backtracks and only writes 4
>>> + or 8 byte objects.  */
>>> +  gcov_unsigned_t buffer[GCOV_BLOCK_SIZE + 1];
>>> +#else
>>> +  int endian;  /* Swap endianness.  */
>>> +  /* Holds a variable length block, as the compiler can write
>>> + strings and needs to backtrack.  */
>>> +  size_t alloc;
>>> +  gcov_unsigned_t *buffer;
>>> +#endif
>>> +} gcov_var;
>>> +
>>> +/* Save the current position in the gcov file.  */
>>> +static inline gcov_position_t
>>> +gcov_position (void)
>>> +{
>>> +  gcc_assert (gcov_var.mode > 0);
>>> +  return gcov_var.start + gcov_var.offset;
>>> +}
>>> +
>>> +/* Return nonzero if the error flag is set.  */
>>> +static inline int
>>> +gcov_is_error (void)
>>> +{
>>> +  return gcov_var.file ? gcov_var.error : 1;
>>> +}
>>> +
>>> +#if IN_LIBGCOV
>>> +/* Move to beginning of file and initialize for writing.  */
>>> +GCOV_LINKAGE inline void
>>> +gcov_rewrite (void)
>>> +{
>>> +  gcc_assert (gcov_var.mode > 0);
>>
>> I would turn those two asserts into checking asserts so they do not
>> bloat the runtime lib.
>
> Ok, but note that there are a number of other gcc_assert already in
> gcov-io.c (these were the only 2 in gcov-io.h, now moved here). Should
> I go ahead and change all of them in gcov-io.c?

Actually, I tried changing these two, but gcc_checking_assert is
undefined in libgcov.a. Ok to commit without this change?

Teresa

>
> Thanks,
> Teresa
>
>>
>> Thanks,
>> Honza
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Re: [RFA][PATCH][middle-end/53623] Improve extension elimination

2014-01-08 Thread Jeff Law


On 01/08/14 01:14, Eric Botcazou wrote:

Committed after private email approval from Jakub.  I made one
additional trivial change (missing whitespace in a comment).


This breaks bootstrap with RTL checking enabled:

[ ... ]
Thanks.  I'm on it.
jeff

Re: Rb tree node recycling patch

2014-01-08 Thread Paolo Carlini


On 01/08/2014 02:34 PM, Paolo Carlini wrote:

Hi,

On 12/27/2013 07:30 PM, François Dumont wrote:
Note that this patch contains also a cleanup of a useless template 
parameter _Is_pod_comparator on _Rb_tree_impl.
The useless parameter is a remnant of an attempt at exploiting the EBO 
for _Rb_tree_impl. At some point Benjamin got a patch from a 
contributor but then had to quickly revert it just in time for the ABI 
freeze because it didn't work. Everything is recorded in the mailing 
list. Anyway, whatever we do now (more exactly, post 4.9) let's make 
sure we don't break the ABI inadvertently, or, if we actually decide 
to do that, we should reconsider the EBO.


This ChangeLog entry:

2004-03-25  Dhruv Matani  

   * include/bits/stl_tree.h: Introduced a new class _Rb_tree_impl, ...

has the original EBO idea, which in fact we didn't deliver.

Paolo.

Re: Rb tree node recycling patch

2014-01-08 Thread Paolo Carlini


Hi,

On 12/27/2013 07:30 PM, François Dumont wrote:
Note that this patch contains also a cleanup of a useless template 
parameter _Is_pod_comparator on _Rb_tree_impl.
The useless parameter is a remnant of an attempt at exploiting the EBO 
for _Rb_tree_impl. At some point Benjamin got a patch from a contributor 
but then had to quickly revert it just in time for the ABI freeze 
because it didn't work. Everything is recorded in the mailing list. 
Anyway, whatever we do now (more exactly, post 4.9) let's make sure we 
don't break the ABI inadvertently, or, if we actually decide to do that, 
we should reconsider the EBO.


About the node recycling idea itself, we got a closely related Bugzilla. 
Is it *exactly* the same issue, or not? Please double check.


Paolo.

[PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115, take 2)

2014-01-08 Thread Jakub Jelinek

On Wed, Jan 08, 2014 at 12:32:59PM +0100, Richard Biener wrote:
> > Either before writing PCH c-common.c could call some tree.c routine that
> > would traverse the cl_option_hash_table hash table and for every
> > TARGET_OPTION_NODE in the hash table clear TREE_TARGET_GLOBALS.
> > Or perhaps some gengtype extension to run some routine before PCH saving
> > on the tree_target_option structs and clear the globals field in there.
> > Or use GTY((user)) on tree_target_option, but then dunno how we'd handle the
> > marking of the embedded opts field (and common).
> > Any ideas?
> 
> Well, a GTY((skip_pch)) would probably work.  Or move the thing
> out-of GC land (thus make cl_option_hash_table persistent) and
> simply GTY((skip)) the pointer completely.  Not sure if we ever
> collect from it.

Even if the pointer was out of GC land and GTY((skip)), we'd need to clear
it somewhere during PCH saving, as the containing structure is GC allocated.

I've already implemented in the mean time the variant with the
htab_traverse, all still reachable TARGET_OPTION_NODE trees should be in that
hash table.
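
The approach can be sketched outside of GCC as follows.  All names
starting with sketch_ are hypothetical stand-ins rather than GCC's real
types or functions; only the callback convention (slot pointer plus user
data, nonzero return value to continue) follows libiberty's
htab_traverse.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these are GCC's real types.  */
struct sketch_globals
{
  int dummy;
};

struct sketch_option_node
{
  int opts;                        /* persistent data, saved as usual */
  struct sketch_globals *globals;  /* cache only, must not be saved */
};

/* Callback in the style of an htab_traverse callback: it receives a
   slot and returns nonzero to continue the walk.  */
static int
sketch_clear_globals (void **slot, void *data)
{
  struct sketch_option_node *node = (struct sketch_option_node *) *slot;
  (void) data;
  node->globals = NULL;
  return 1;
}

/* Minimal traversal over a table of N slots; empty slots are NULL.  */
static void
sketch_traverse (void **slots, size_t n,
                 int (*callback) (void **, void *), void *data)
{
  for (size_t i = 0; i < n; i++)
    if (slots[i] != NULL && !callback (&slots[i], data))
      break;
}

/* What would run just before the PCH is written: drop every cached
   pointer so that only the persistent part of each node is saved.  */
void
sketch_prepare_nodes_for_pch (void **slots, size_t n)
{
  sketch_traverse (slots, n, sketch_clear_globals, NULL);
}
```

Since the cleared field is only a cache, a node that is used again after
the PCH has been written would simply have its globals recomputed on
demand.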

Bootstrapped/regtested on x86_64-linux and i686-linux (in both cases with
--enable-checking=yes,rtl and --enable-checking=release, for the
i686-linux/release checking I had to fix an unrelated compare debug issue
I'll post when I manage to reduce testcase).

I'd like to get rid of all the XCNEW calls in target-globals.c as a
follow-up.

As for performance, for --enable-checking=release a very rough check of
make -j48 bootstrap and make -j48 check times shows the patch is compile-time
neutral; on e.g. the declare-simd-1.C testcase, though, g++ is more than twice
as fast with the patch (~ 0.8 sec without the patch, ~ 0.3 sec with the patch,
both for x86_64 and i686).

Ok for trunk?

2014-01-07  Jakub Jelinek  

PR target/58115
* tree-core.h (struct target_globals): New forward declaration.
(struct tree_target_option): Add globals field.
* tree.h (TREE_TARGET_GLOBALS): Define.
(prepare_target_option_nodes_for_pch): New prototype.
* target-globals.h (struct target_globals): Define even if
!SWITCHABLE_TARGET.
* tree.c (prepare_target_option_node_for_pch,
prepare_target_option_nodes_for_pch): New functions.
* config/i386/i386.h (SWITCHABLE_TARGET): Define.
* config/i386/i386.c: Include target-globals.h.
(ix86_set_current_function): Instead of doing target_reinit
unconditionally, use save_target_globals_default_opts and
restore_target_globals.
c-family/
* c-pch.c (c_common_write_pch): Call
prepare_target_option_nodes_for_pch.

--- gcc/tree-core.h.jj  2014-01-07 08:47:24.0 +0100
+++ gcc/tree-core.h 2014-01-07 16:44:35.591358235 +0100
@@ -1557,11 +1557,18 @@ struct GTY(()) tree_optimization_option
   struct target_optabs *GTY ((skip)) base_optabs;
 };
 
+/* Forward declaration, defined in target-globals.h.  */
+
+struct GTY(()) target_globals;
+
 /* Target options used by a function.  */
 
 struct GTY(()) tree_target_option {
   struct tree_common common;
 
+  /* Target globals for the corresponding target option.  */
+  struct target_globals *globals;
+
   /* The optimization options used by the user.  */
   struct cl_target_option opts;
 };
--- gcc/tree.h.jj   2014-01-03 11:40:33.0 +0100
+++ gcc/tree.h  2014-01-07 21:28:15.038061120 +0100
@@ -2695,9 +2695,14 @@ extern tree build_optimization_node (str
 #define TREE_TARGET_OPTION(NODE) \
   (&TARGET_OPTION_NODE_CHECK (NODE)->target_option.opts)
 
+#define TREE_TARGET_GLOBALS(NODE) \
+  (TARGET_OPTION_NODE_CHECK (NODE)->target_option.globals)
+
 /* Return a tree node that encapsulates the target options in OPTS.  */
 extern tree build_target_option_node (struct gcc_options *opts);
 
+extern void prepare_target_option_nodes_for_pch (void);
+
 #if defined ENABLE_TREE_CHECKING && (GCC_VERSION >= 2007)
 
 inline tree
--- gcc/target-globals.h.jj 2014-01-03 11:40:46.0 +0100
+++ gcc/target-globals.h 2014-01-07 17:08:51.113880947 +0100
@@ -37,6 +37,7 @@ extern struct target_builtins *this_targ
 extern struct target_gcse *this_target_gcse;
 extern struct target_bb_reorder *this_target_bb_reorder;
 extern struct target_lower_subreg *this_target_lower_subreg;
+#endif
 
 struct GTY(()) target_globals {
   struct target_flag_state *GTY((skip)) flag_state;
@@ -57,6 +58,7 @@ struct GTY(()) target_globals {
   struct target_lower_subreg *GTY((skip)) lower_subreg;
 };
 
+#if SWITCHABLE_TARGET
 extern struct target_globals default_target_globals;
 
 extern struct target_globals *save_target_globals (void);
--- gcc/tree.c.jj   2014-01-03 11:40:33.0 +0100
+++ gcc/tree.c  2014-01-07 21:27:35.590268195 +0100
@@ -11527,6 +11527,28 @@ build_target_option_node (struct gcc_opt
   return t;
 }
 
+/* Reset TREE_TARGET_GLOBALS cache for TARGET_OPTION_NODE.
+   Called through htab_traverse.  */
+
+static int
+prepare_target_option_node_for_pch (vo

Re: [Patch,testsuite] Fix testcases that use bind_pic_locally

2014-01-08 Thread Jakub Jelinek

On Wed, Jan 08, 2014 at 11:49:08AM +, Vidya Praveen wrote:
> On Tue, Jan 07, 2014 at 09:35:54PM +, Mike Stump wrote:
> > On Dec 17, 2013, at 6:06 AM, Vidya Praveen  wrote:
> > > bind_pic_locally is broken for targets that don't pass -fPIC/-fpic by
> > > default [1][2].
> > 
> > Let's give Jakub 2 days to weigh in?  If no objections, Ok, though, do see 
> > about adding documentation for it.  
> 
> Sure. I didn't respin the patch with documentation since I wanted to know
> if the solution is acceptable. If this patch is OK, I'll respin with the
> documentation for bind_pic_locally_ok. 
> 
> > I kinda would like a simpler interface for these two, but?  that can be 
> > follow on work, if someone has a bright idea and some time to implement it.
> > 
> 
> Could you explain what you mean by simpler interface here? 

The simpler interface, as I said earlier, would be just to make sure
/* { dg-add-options bind_pic_locally } */
does the right thing, I really don't believe you've tried hard enough.

It is true dejagnu's default_target_compile has:
if {[board_info $dest exists multilib_flags]} {
append add_flags " [board_info $dest multilib_flags]"
}
last (before just adding -o $destfile; is multilib_flags where the
-fpic/-fPIC comes in, right?), but if say dg-add-options bind_pic_locally
adds the necessary options not to dg-extra-tools-flags, but to some
other variable and say gcc_target_compile (and g++_target_compile)
around the [target_compile ...] invocation e.g. temporarily append
that other variable (if not empty) to board_info's multilib_flags
and afterwards remove it, I don't see why it wouldn't work.
Tcl is quite flexible in this.

Jakub

Re: Rb tree node recycling patch

2014-01-08 Thread Jonathan Wakely

On 27 December 2013 18:30, François Dumont wrote:
> Hi
>
> Here is a patch to add recycling of Rb tree nodes when possible.

The change looks good, but it is not a bug fix, so I don't think it's
suitable for Stage 3.  Please re-submit this after 4.9 is released
when we are in Stage 1 again, thanks.

Re: [Patch,testsuite] Fix testcases that use bind_pic_locally

2014-01-08 Thread Vidya Praveen

On Tue, Jan 07, 2014 at 09:35:54PM +, Mike Stump wrote:
> On Dec 17, 2013, at 6:06 AM, Vidya Praveen  wrote:
> > bind_pic_locally is broken for targets that don't pass -fPIC/-fpic by
> > default [1][2].
> 
> Let's give Jakub 2 days to weigh in?  If no objections, Ok, though, do see 
> about adding documentation for it.  

Sure. I didn't respin the patch with documentation since I wanted to know
if the solution is acceptable. If this patch is OK, I'll respin with the
documentation for bind_pic_locally_ok. 

> I kinda would like a simpler interface for these two, but?  that can be 
> follow on work, if someone has a bright idea and some time to implement it.
> 

Could you explain what you mean by simpler interface here? 

Cheers
VP.

Re: [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115)

2014-01-08 Thread Richard Biener

On Tue, Jan 7, 2014 at 8:39 PM, Jakub Jelinek  wrote:
> On Mon, Jan 06, 2014 at 10:27:06AM +, Richard Sandiford wrote:
>> Of course, IMO, the cleanest fix would be to use switchable targets
>> for i386...
>
> The following patch does that, bootstrapped/regtested on x86_64-linux and
> i686-linux.  The only problem with the patch is PCH,
> +FAIL: 17_intro/headers/c++200x/stdc++.cc (test for excess errors)
> +FAIL: 17_intro/headers/c++200x/stdc++_multiple_inclusion.cc (test for excess 
> errors)
> (both 32-bit and 64-bit regtests), where it ICEs.  I guess the problem is
> that the target globals are allocated partly in GC, partly in heap and
> even if they were allocated completely in GC and GTY(()) marked fully all
> the individual pointed structures, we IMNSHO still don't want it to be
> saved during PCH and restored later, what we have is basically just a cache
> of the target globals.
>
> Dunno what is the best way to handle that though.
> Either before writing PCH c-common.c could call some tree.c routine that
> would traverse the cl_option_hash_table hash table and for every
> TARGET_OPTION_NODE in the hash table clear TREE_TARGET_GLOBALS.
> Or perhaps some gengtype extension to run some routine before PCH saving
> on the tree_target_option structs and clear the globals field in there.
> Or use GTY((user)) on tree_target_option, but then dunno how we'd handle the
> marking of the embedded opts field (and common).
> Any ideas?

Well, a GTY((skip_pch)) would probably work.  Or move the thing
out-of GC land (thus make cl_option_hash_table persistent) and
simply GTY((skip)) the pointer completely.  Not sure if we ever
collect from it.

Richard.

> 2014-01-07  Jakub Jelinek  
>
> PR target/58115
> * tree-core.h (struct target_globals): New forward declaration.
> (struct tree_target_option): Add globals field.
> * tree.h (TREE_TARGET_GLOBALS): Define.
> * target-globals.h (struct target_globals): Define even if
> !SWITCHABLE_TARGET.
> * config/i386/i386.h (SWITCHABLE_TARGET): Define.
> * config/i386/i386.c: Include target-globals.h.
> (ix86_set_current_function): Instead of doing target_reinit
> unconditionally, use save_target_globals_default_opts and
> restore_target_globals.
>
> --- gcc/tree-core.h.jj  2014-01-07 08:47:24.0 +0100
> +++ gcc/tree-core.h 2014-01-07 16:44:35.591358235 +0100
> @@ -1557,11 +1557,18 @@ struct GTY(()) tree_optimization_option
>struct target_optabs *GTY ((skip)) base_optabs;
>  };
>
> +/* Forward declaration, defined in target-globals.h.  */
> +
> +struct GTY(()) target_globals;
> +
>  /* Target options used by a function.  */
>
>  struct GTY(()) tree_target_option {
>struct tree_common common;
>
> +  /* Target globals for the corresponding target option.  */
> +  struct target_globals *globals;
> +
>/* The optimization options used by the user.  */
>struct cl_target_option opts;
>  };
> --- gcc/tree.h.jj   2014-01-03 11:40:33.0 +0100
> +++ gcc/tree.h  2014-01-07 12:55:39.137295100 +0100
> @@ -2695,6 +2695,9 @@ extern tree build_optimization_node (str
>  #define TREE_TARGET_OPTION(NODE) \
>(&TARGET_OPTION_NODE_CHECK (NODE)->target_option.opts)
>
> +#define TREE_TARGET_GLOBALS(NODE) \
> +  (TARGET_OPTION_NODE_CHECK (NODE)->target_option.globals)
> +
>  /* Return a tree node that encapsulates the target options in OPTS.  */
>  extern tree build_target_option_node (struct gcc_options *opts);
>
> --- gcc/target-globals.h.jj 2014-01-03 11:40:46.0 +0100
> +++ gcc/target-globals.h 2014-01-07 17:08:51.113880947 +0100
> @@ -37,6 +37,7 @@ extern struct target_builtins *this_targ
>  extern struct target_gcse *this_target_gcse;
>  extern struct target_bb_reorder *this_target_bb_reorder;
>  extern struct target_lower_subreg *this_target_lower_subreg;
> +#endif
>
>  struct GTY(()) target_globals {
>struct target_flag_state *GTY((skip)) flag_state;
> @@ -57,6 +58,7 @@ struct GTY(()) target_globals {
>struct target_lower_subreg *GTY((skip)) lower_subreg;
>  };
>
> +#if SWITCHABLE_TARGET
>  extern struct target_globals default_target_globals;
>
>  extern struct target_globals *save_target_globals (void);
> --- gcc/config/i386/i386.h.jj   2014-01-06 22:37:19.0 +0100
> +++ gcc/config/i386/i386.h  2014-01-07 12:13:06.480486755 +0100
> @@ -2510,6 +2510,9 @@ extern void debug_dispatch_window (int);
>  #define IX86_HLE_ACQUIRE (1 << 16)
>  #define IX86_HLE_RELEASE (1 << 17)
>
> +/* For switching between functions with different target attributes.  */
> +#define SWITCHABLE_TARGET 1
> +
>  /*
>  Local variables:
>  version-control: t
> --- gcc/config/i386/i386.c.jj   2014-01-06 22:37:19.0 +0100
> +++ gcc/config/i386/i386.c  2014-01-07 16:52:32.597904760 +0100
> @@ -80,6 +80,7 @@ along with GCC; see the file COPYING3.
>  #include "tree-pass.h"
>  #include "context.h"
>  #include "pass_manager.h"
> +#include "target-globals.

Re: [PATCH, 4.8, PR 59610] More optimize guards in ipa-prop.c

2014-01-08 Thread Richard Biener

On Tue, Jan 7, 2014 at 7:27 PM, Martin Jambor  wrote:
> Hi,
>
> I forgot to put the optimize test to the ipa_compute_jump_functions
> when fixing PR 57358 which is where it is most necessary.  This patch
> adds it there and to parm_preserved_before_stmt_p which is also
> reachable through ipa_load_from_parm_agg_1 that is also called from
> outside of jump function computations.
>
> I'm currently bootstrapping and testing the following on x86_64-linux.
> OK for the branch if it passes?  And the testcase for trunk?

Ok.

Thanks,
Richard.

> Thanks,
>
> Martin
>
>
> 2014-01-07  Martin Jambor  
>
> PR ipa/59610
> * ipa-prop.c (ipa_compute_jump_functions): Bail out if not optimizing.
> (parm_preserved_before_stmt_p): Assume modification present when not
> optimizing.
>
> testsuite/
> * gcc.dg/ipa/pr59610.c: New test.
>
> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
> index 47d487d..3788a11 100644
> --- a/gcc/ipa-prop.c
> +++ b/gcc/ipa-prop.c
> @@ -623,16 +623,22 @@ parm_preserved_before_stmt_p (struct 
> param_analysis_info *parm_ainfo,
>if (parm_ainfo && parm_ainfo->parm_modified)
>  return false;
>
> -  gcc_checking_assert (gimple_vuse (stmt) != NULL_TREE);
> -  ao_ref_init (&refd, parm_load);
> -  /* We can cache visited statements only when parm_ainfo is available and 
> when
> - we are looking at a naked load of the whole parameter.  */
> -  if (!parm_ainfo || TREE_CODE (parm_load) != PARM_DECL)
> -visited_stmts = NULL;
> +  if (optimize)
> +{
> +  gcc_checking_assert (gimple_vuse (stmt) != NULL_TREE);
> +  ao_ref_init (&refd, parm_load);
> +  /* We can cache visited statements only when parm_ainfo is available 
> and
> + when we are looking at a naked load of the whole parameter.  */
> +  if (!parm_ainfo || TREE_CODE (parm_load) != PARM_DECL)
> +   visited_stmts = NULL;
> +  else
> +   visited_stmts = &parm_ainfo->parm_visited_statements;
> +  walk_aliased_vdefs (&refd, gimple_vuse (stmt), mark_modified, 
> &modified,
> + visited_stmts);
> +}
>else
> -visited_stmts = &parm_ainfo->parm_visited_statements;
> -  walk_aliased_vdefs (&refd, gimple_vuse (stmt), mark_modified, &modified,
> - visited_stmts);
> +modified = true;
> +
>if (parm_ainfo && modified)
>  parm_ainfo->parm_modified = true;
>return !modified;
> @@ -1466,6 +1472,9 @@ ipa_compute_jump_functions (struct cgraph_node *node,
>  {
>struct cgraph_edge *cs;
>
> +  if (!optimize)
> +return;
> +
>for (cs = node->callees; cs; cs = cs->next_callee)
>  {
>struct cgraph_node *callee = cgraph_function_or_thunk_node (cs->callee,
> diff --git a/gcc/testsuite/gcc.dg/ipa/pr59610.c b/gcc/testsuite/gcc.dg/ipa/pr59610.c
> new file mode 100644
> index 000..fc09334
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/ipa/pr59610.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +struct A { int a; };
> +extern void *y;
> +
> +__attribute__((optimize (0))) void
> +foo (void *p, struct A x)
> +{
> +  foo (y, x);
> +}
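
[Editorial sketch, not part of the quoted patch.] The conservative fallback
in the patch above can be illustrated in isolation: when the function is not
being optimized, no alias walk is attempted and the parameter is simply
assumed to be modified. Everything named toy_* below is a hypothetical
stand-in, not a GCC API; the walker callback plays the role that the
walk_aliased_vdefs query plays in ipa-prop.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the cached per-parameter analysis state.  */
struct toy_parm_info
{
  bool parm_modified;
};

/* Hypothetical stand-in for the alias walk: returns true if some
   statement may modify the parameter.  */
typedef bool (*toy_alias_walk_fn) (void *ctx);

/* Example walkers used to exercise the function.  */
bool toy_walk_finds_nothing (void *ctx) { (void) ctx; return false; }
bool toy_walk_finds_store (void *ctx) { (void) ctx; return true; }

/* Return true if the parameter is known to be unmodified.  When not
   optimizing, skip the walk and assume a modification is present.  */
bool
toy_parm_preserved_p (struct toy_parm_info *info, bool optimize,
                      toy_alias_walk_fn walk, void *ctx)
{
  bool modified;

  /* A cached "modified" answer is final.  */
  if (info && info->parm_modified)
    return false;

  if (optimize)
    {
      assert (walk != NULL);
      modified = walk (ctx);
    }
  else
    modified = true;

  if (info && modified)
    info->parm_modified = true;
  return !modified;
}
```

Note that the pessimistic answer is also cached, so a later query made with
optimization enabled still reports the parameter as modified.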

Re: [PING] [REPOST] Invalid Code when reading from unaligned zero-sized array

2014-01-08 Thread Richard Biener

On Tue, Jan 7, 2014 at 5:31 PM, Bernd Edlinger
 wrote:
> Hello,
>
> Ping...
>
> We still need a decision on how to fix this.
>
> There are two alternative patches:
>
> 1. My latest proposal: http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01675.html
>
> 2. Eric's latest proposal: 
> http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01667.html

Let's go with 1., your patch adjusting how we recurse in expand.  That
seems safer to eventually backport.

For 4.10 we should re-visit this and fix all the backends for those ABI
issues with modes ...

Richard.

>
> Thanks
> Bernd.

Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)

2014-01-08 Thread Richard Biener

On Wed, 8 Jan 2014, Jakub Jelinek wrote:

> On Wed, Jan 08, 2014 at 12:15:40PM +0100, Richard Biener wrote:
> > > I am starting to think this is too complex a transform for stmt folding ...
> > 
> > Alternatively do update_call_from_tree (gsi, get_or_create_ssa_default_def 
> > (cfun, create_tmp_var (TREE_TYPE (lhs.
> 
> The lhs might not be is_gimple_reg_type though.  What to do in that case?

In that case you can remove the stmt.

Richard.

Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)

2014-01-08 Thread Jakub Jelinek

On Wed, Jan 08, 2014 at 12:15:40PM +0100, Richard Biener wrote:
> > I am starting to think this is too complex a transform for stmt folding ...
> 
> Alternatively do update_call_from_tree (gsi, get_or_create_ssa_default_def 
> (cfun, create_tmp_var (TREE_TYPE (lhs.

The lhs might not be is_gimple_reg_type though.  What to do in that case?

Jakub

Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)

2014-01-08 Thread Richard Biener

On Wed, 8 Jan 2014, Richard Biener wrote:

> On Wed, 8 Jan 2014, Jakub Jelinek wrote:
> 
> > On Wed, Jan 08, 2014 at 11:45:28AM +0100, Richard Biener wrote:
> > > I prefer to always do this, not do the fancy insertion-before.  That
> > > would do repeated folding for
> > > 
> > >fold_stmt (gsi);
> > >fold_stmt (gsi);
> > >fold_stmt (gsi);
> > > 
> > > where the last two should be a no-op.
> > 
> > > I don't see how that is possible, at least for the __builtin_unreachable
> > case, because by just setting the fndecl to __builtin_unreachable and
> > keeping the incompatible fntype and bogus arguments for it all the
> > predicates whether it is a valid/suitable builtin call will fail and we
> > don't have a __builtin_unreachable function you could call.
> 
> Well, that just means we need two sets of predicates to check for
> a builtin call.  The __builtin_unreachable code wants to know what
> the callee is, not if that's a "valid" call to it.  But yeah - this
> starts to get confusing :/
> 
> > So at least for builtin we want to make sure it has the right parameters.
> > If the lhs is something we can just initialize to zero, we can replace the
> > > call with zeroing the lhs, but that is not always the case.
> 
> > I am starting to think this is too complex a transform for stmt folding ...

Alternatively do update_call_from_tree (gsi, get_or_create_ssa_default_def 
(cfun, create_tmp_var (TREE_TYPE (lhs.

> > For __cxa_pure_virtual we could just keep the code as is (just with the
> > !inplace addition and spelling fix?), but would need to fix up whatever ICEs
> > during checking on it to honor fntype rather than decl's type.
> 
> Yes.
> 
> So a patch just keeping the targets.length () == 1 case in folding
> with just replacing the fndecl of the call is ok.
> 
> Thanks,
> Richard.
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer

Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)

2014-01-08 Thread Richard Biener

On Wed, 8 Jan 2014, Jakub Jelinek wrote:

> On Wed, Jan 08, 2014 at 11:45:28AM +0100, Richard Biener wrote:
> > I prefer to always do this, not do the fancy insertion-before.  That
> > would do repeated folding for
> > 
> >fold_stmt (gsi);
> >fold_stmt (gsi);
> >fold_stmt (gsi);
> > 
> > where the last two should be a no-op.
> 
> I don't see how that is possible, at least for the __builtin_unreachable
> case, because by just setting the fndecl to __builtin_unreachable and
> keeping the incompatible fntype and bogus arguments for it all the
> predicates whether it is a valid/suitable builtin call will fail and we
> don't have a __builtin_unreachable function you could call.

Well, that just means we need two sets of predicates to check for
a builtin call.  The __builtin_unreachable code wants to know what
the callee is, not if that's a "valid" call to it.  But yeah - this
starts to get confusing :/

> So at least for builtin we want to make sure it has the right parameters.
> If the lhs is something we can just initialize to zero, we can replace the
> call with zeroing the lhs, but that is not always the case.

I am starting to think this is too complex a transform for stmt folding ...

> For __cxa_pure_virtual we could just keep the code as is (just with the
> !inplace addition and spelling fix?), but would need to fix up whatever ICEs
> during checking on it to honor fntype rather than decl's type.

Yes.

So a patch just keeping the targets.length () == 1 case in folding
with just replacing the fndecl of the call is ok.

Thanks,
Richard.
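
[Editorial sketch, not part of the original message.] The outcome agreed
above (fold only the single-target case by replacing the callee, and expect
repeated fold_stmt calls to become no-ops) can be modelled with a toy call
statement. The toy_* types and functions are hypothetical and are not GCC
APIs; the fields of toy_targets mirror the "final" flag and targets.length ()
from the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a call statement.  */
struct toy_call
{
  bool virtual_call;  /* Still an indirect call through a vtable?  */
  int callee;         /* Id of the direct callee, -1 if unknown.  */
};

/* Toy model of the result of a polymorphic call target query.  */
struct toy_targets
{
  bool final;     /* Is the list known to be complete?  */
  size_t length;  /* Number of possible targets.  */
  int first;      /* Id of the first target, valid if length >= 1.  */
};

/* Fold CALL in place.  Only the "complete list with exactly one target"
   case is handled, by replacing the callee.  Returns true if the
   statement changed.  */
bool
toy_fold_call (struct toy_call *call, const struct toy_targets *targets,
               bool flag_devirtualize)
{
  assert (call != NULL && targets != NULL);

  if (!flag_devirtualize || !call->virtual_call)
    return false;
  if (!targets->final || targets->length != 1)
    return false;

  call->callee = targets->first;
  call->virtual_call = false;
  return true;
}
```

Because the fold clears virtual_call, a second and third call on the same
statement report no change, which is the idempotence property asked for in
the review. The zero-target case is deliberately left untouched in this
sketch.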

Re: [Patch AArch64] Implement Vector Permute Support

2014-01-08 Thread James Greenhalgh

On Wed, Jan 08, 2014 at 12:10:13AM +, Andrew Pinski wrote:
> On Tue, Jan 7, 2014 at 4:05 PM, Marcus Shawcroft
>  wrote:
> >
> > Andrew, We know that there are numerous issues with aarch64 BE advsimd 
> > support in GCC.  The aarch64_be support is very much a work in progress.  
> > Tejas sorted out a number of fundamentals with a series of patches in 
> > November, notably in PCS conformance.  There is more to come.  However, 
> > aarch64_be-* support in gcc 4.9 is not going to match the level of quality 
> > for the aarch64-* port.
> 
> 
> Yes, but it should not introduce an ICE while GCC is in stage 3.  This was
> working before because there was no vec_perm pattern.  I am going to
> request that this be reverted soon if it is not fixed (the GCC rules are
> clear here).

Hi Andrew,

I am confused: are you also proposing to revert this patch on the 4.8
branch? The code has been sitting with that assert in place on trunk
for well over a year (note that December 2012 was during 4.8's
stage 3, not 4.9), so there is no regression here.

But, that doesn't absolve me of the fact that this is broken in
a stupid way for big-endian AArch64.

The band-aid, which I can prepare, would be to turn off
vec_perm for BYTES_BIG_ENDIAN targets on the 4.9 and
4.8 branches. This is the most sensible thing to do in the short
term. Naturally, you will lose vectorization of permute operations,
but at least you won't get the ICE or wrong code generation. This
is what the ARM back-end (from which I ported the vec_perm code)
does.

In the longer term you would want to audit the lane-numbering
discrepancies between GCC and our architectural lane-numbers.
We are some way towards that after Tejas' PCS conformance fix,
but as Marcus has said, there is more to come. I should imagine
that in this case you will need to provide a run-time transformation
between the permute mask and an appropriate mask for tbl.

To reiterate, this does not need to be reverted; we'll get a fix out
disabling vec_perm for BYTES_BIG_ENDIAN on 4.8 branch and 4.9.

Thanks,
James

Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)

2014-01-08 Thread Jakub Jelinek

On Wed, Jan 08, 2014 at 11:45:28AM +0100, Richard Biener wrote:
> I prefer to always do this, not do the fancy insertion-before.  That
> would do repeated folding for
> 
>fold_stmt (gsi);
>fold_stmt (gsi);
>fold_stmt (gsi);
> 
> where the last two should be a no-op.

I don't see how that is possible, at least for the __builtin_unreachable
case, because by just setting the fndecl to __builtin_unreachable and
keeping the incompatible fntype and bogus arguments for it all the
predicates whether it is a valid/suitable builtin call will fail and we
don't have a __builtin_unreachable function you could call.
So at least for builtin we want to make sure it has the right parameters.
If the lhs is something we can just initialize to zero, we can replace the
call with zeroing the lhs, but that is not always the case.

For __cxa_pure_virtual we could just keep the code as is (just with the
!inplace addition and spelling fix?), but would need to fix up whatever ICEs
during checking on it to honor fntype rather than decl's type.

Jakub

Re: [PATCH] Fix devirtualization ICE (PR tree-optimization/59622, take 3)

2014-01-08 Thread Richard Biener

On Tue, 7 Jan 2014, Jakub Jelinek wrote:

> Hi!
> 
> On Fri, Jan 03, 2014 at 11:33:50AM +0100, Jakub Jelinek wrote:
> > On Fri, Jan 03, 2014 at 11:24:53AM +0100, Richard Biener wrote:
> 
> Anyway, back to the original patch, so do you prefer something like
> this instead?  I.e. handle only __builtin_unreachable and
> __cxa_pure_virtual specially, and not devirt for fold_stmt_inplace?
> 
> 2014-01-07  Jakub Jelinek  
> 
>   PR tree-optimization/59622
>   * gimple-fold.c (gimple_fold_call): Fix a typo in message.  Handle
>   __cxa_pure_virtual similarly to __builtin_unreachable.  Don't
>   devirtualize for inplace at all.
> 
>   * g++.dg/opt/pr59622-2.C: New test.
> 
> --- gcc/gimple-fold.c.jj  2014-01-03 11:40:57.247320424 +0100
> +++ gcc/gimple-fold.c 2014-01-07 18:15:00.352601812 +0100
> @@ -1167,7 +1167,7 @@ gimple_fold_call (gimple_stmt_iterator *
>   (OBJ_TYPE_REF_EXPR 
> (callee)
>   {
> fprintf (dump_file,
> -"Type inheritnace inconsistent devirtualization of ");
> +"Type inheritance inconsistent devirtualization of ");
> print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
> fprintf (dump_file, " to ");
> print_generic_expr (dump_file, callee, TDF_SLIM);
> @@ -1177,26 +1177,35 @@ gimple_fold_call (gimple_stmt_iterator *
> gimple_call_set_fn (stmt, OBJ_TYPE_REF_EXPR (callee));
> changed = true;
>   }
> -  else if (flag_devirtualize && virtual_method_call_p (callee))
> +  else if (flag_devirtualize && !inplace && virtual_method_call_p (callee))
>   {
> bool final;
> vec targets
>   = possible_polymorphic_call_targets (callee, &final);
> if (final && targets.length () <= 1)
>   {
> +   tree fndecl;
> if (targets.length () == 1)
> + fndecl = targets[0]->decl;
> +   else
> + fndecl = builtin_decl_implicit (BUILT_IN_UNREACHABLE);
> +
> +   /* If fndecl (like __builtin_unreachable or
> +  __cxa_pure_virtual) takes no arguments, doesn't have
> +  return value and is noreturn, just add the call before
> +  stmt and DCE will do its job later on.  */
> +   if (TREE_THIS_VOLATILE (fndecl)
> +   && VOID_TYPE_P (TREE_TYPE (TREE_TYPE (fndecl)))
> +   && TYPE_ARG_TYPES (TREE_TYPE (fndecl)) == void_list_node)
>   {
> -   gimple_call_set_fndecl (stmt, targets[0]->decl);
> -   changed = true;
> - }
> -   else if (!inplace)
> - {
> -   tree fndecl = builtin_decl_implicit (BUILT_IN_UNREACHABLE);
> gimple new_stmt = gimple_build_call (fndecl, 0);
> gimple_set_location (new_stmt, gimple_location (stmt));
> gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
> return true;
>   }
> +
> +   gimple_call_set_fndecl (stmt, fndecl);

I prefer to always do this, not do the fancy insertion-before.  That
would do repeated folding for

   fold_stmt (gsi);
   fold_stmt (gsi);
   fold_stmt (gsi);

where the last two should be a no-op.

Richard.


> +   changed = true;
>   }
>   }
>  }
> --- gcc/testsuite/g++.dg/opt/pr59622-2.C.jj   2014-01-07 18:10:45.435904909 +0100
> +++ gcc/testsuite/g++.dg/opt/pr59622-2.C  2014-01-07 18:10:45.435904909 +0100
> @@ -0,0 +1,21 @@
> +// PR tree-optimization/59622
> +// { dg-do compile }
> +// { dg-options "-O2" }
> +
> +namespace
> +{
> +  struct A
> +  {
> +A () {}
> +virtual A *bar (int) = 0;
> +A *baz (int x) { return bar (x); }
> +  };
> +}
> +
> +A *a;
> +
> +void
> +foo ()
> +{
> +  a->baz (0);
> +}
> 
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer

Re: [PING^2][PATCH][2 of 2] RTL expansion for zero sign extension elimination with VRP

2014-01-08 Thread Richard Biener

On Wed, 8 Jan 2014, Kugan wrote:

> 
> On 07/01/14 23:23, Richard Biener wrote:
> > On Tue, 7 Jan 2014, Kugan wrote:
> 
> [snip]
> 
> 
> > Note that VIEW_CONVERT_EXPR is wrong here.  I think you are
> > handling this wrong still.  From a quick look you want to avoid
> > the actual promotion for
> > 
> >   reg_1 = 
> > 
> > when reg_1 is promoted and thus the target is (subreg:XX N).
> > The RHS has been expanded in XXmode.  Dependent on the value-range
> > of reg_1 you want to set N to a paradoxical subreg of the expanded
> > result.  You can always do that if the reg is zero-extended
> > and else if the MSB is not set for any of the values of reg_1.
> 
> Thanks Richard for the explanation. I just want to double confirm I
> understand you correctly before I attempt to fix it. So let me try this
> for the following example,
> 
> for a gimple stmt of the following form:
> unsigned short _5;
> short int _6;
> _6 = (short int)_5;
> 
> ;; _6 = (short int) _5;
> target = (subreg/s/u:HI (reg:SI 110 [ D.4144 ]) 0)
> temp = (subreg:HI (reg:SI 118) 0)
> 
> So, I must generate the following if it satisfies the other conditions.
> (set (reg:SI 110 [ D.4144 ]) (subreg:SI temp ))
> 
> Is my understanding correct?

I'm no RTL expert in this particular area but yes, I think so.  Not
sure what paradoxical subregs are valid, so somebody else should
comment here.  You could even generate

  (set (reg:SI 110) (reg:SI 118))

iff temp is a SUBREG of a promoted var, as you require that for the
destination as well.

> 
> > I don't see how is_assigned_exp_fit_type reflects this in any way.
> >
> 
> 
> What I tried doing with the patch is:
> 
> (insn 13 12 0 (set (reg:SI 110 [ D.4144 ])
> (zero_extend:SI (subreg:HI (reg:SI 118) 0))) c5.c:8 -1
>  (nil))
> 
> If the values in register (reg:SI 118) fit HI mode (without
> overflowing), I assume that it is not necessary to drop the higher
> bits and zero_extend as done above, and that we can generate the
> following instead.
> 
> (insn 13 12 0 (set (reg:SI 110 [ D.4144 ])
> (((reg:SI 118) 0))) c5.c:8 -1
>  (nil))
> 
> is_assigned_exp_fit_type just checks if the range fits (in the above
> case, the value in reg:SI 118 fits HI mode) and the checks before
> emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp)); checks the
> modes match.
> 
> Is this wrong  or am I missing the whole point?

is_assigned_exp_fit_type is weird - it looks at the value-range of _5,
but as you want to elide the extension from _6 to SImode you want
to look at the value-range of _6.  So, breaking it down and applying
the promotion to GIMPLE it would look like

   unsigned short _5;
   short int _6;
   _6 = (short int)_5;
   _6_7 = (int) _6;

where you want to remove the last line representing the
assignment to (subreg:HI (reg:SI 110)).  Whether you can
do that depends on the value-range of _6, not on the
value-range of _5.  It's also completely independent
of the operation performed on the RHS.

Well.  As far as I understand at least.

Richard.
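
[Editorial sketch, not part of the original message.] The "MSB is not set"
condition mentioned earlier in this thread rests on a simple arithmetic
fact: for a 16-bit value whose top bit is clear, zero-extension and
sign-extension to 32 bits produce the same result, so an explicit extension
adds nothing. The toy_* helpers below are hypothetical illustrations of that
fact on plain integers and make no claim about how the RTL expander or VRP
implement the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Zero-extend a 16-bit value to 32 bits.  */
uint32_t
toy_zext16 (uint16_t v)
{
  return (uint32_t) v;
}

/* Sign-extend a 16-bit value to 32 bits without relying on
   implementation-defined signed conversions.  */
uint32_t
toy_sext16 (uint16_t v)
{
  return (v & 0x8000u) ? (0xffff0000u | v) : (uint32_t) v;
}

/* True when every value in the range [lo, hi] has bit 15 clear, which
   is the case where both extensions agree for the whole range.  */
bool
toy_msb_clear_in_range (uint16_t lo, uint16_t hi)
{
  assert (lo <= hi);
  return hi <= 0x7fffu;
}
```

For example, 0x1234 extends to 0x00001234 either way, while 0x8000 extends
to 0x00008000 (zero) or 0xffff8000 (sign), which is why the range of the
value being promoted, _6 in the example, decides whether the extension can
be dropped.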

Re: [Patch] Regex bracket matcher cache optimization

2014-01-08 Thread Paolo Carlini


On 01/07/2014 08:36 PM, Tim Shen wrote:
> On Tue, Jan 7, 2014 at 4:02 AM, Paolo Carlini  wrote:
>> Ideally, I would suggest committing first the improvements in your previous
>> patch (by the way, thanks for the numbers!) + the pure bug fixes and
>> separate the further performance improvements which have compile-time
>> performance implications (how big?), see if, eg, Jon has something to
>> recommend. Can we do that?
>
> First patch committed. I later found that the second patch "b.diff" is
> based on the committed version (the attach, which fixed the "&&"
> problem);

Not sure I'm following all the past and present tenses ;) but in my old
message I proposed to commit now the *correctness* fixes too, which, I
suppose, are fixes which don't have compile-time performance implications.


Paolo.

Re: reload autoinc fix

2014-01-08 Thread Richard Earnshaw

On 07/01/14 21:06, Andrew Pinski wrote:
> On Tue, Jan 7, 2014 at 12:55 PM, Jeff Law  wrote:
>> On 01/07/14 09:16, Bernd Schmidt wrote:
>>>
>>> This is PR56791. The address inside of an autoinc is reloaded, and the
>>> autoinc is reloaded, but the reload insns are emitted in the wrong order.
>>>
>>> As far as I can tell, this is because find_reloads_address_1 has two
>>> methods of pushing a reload for an autoinc, one of them using the
>>> previously identified type, and the other (better one) using
>>> RELOAD_OTHER. If we previously reloaded an inner part of the address,
>>> the use of RELOAD_OTHER is mismatched and leads to the wrong order of
>>> insns.
>>>
>>> This patch just remembers if we've pushed a reload, and forces the
>>> optimization to be skipped in that case. Bootstrapped and tested on
>>> x86_64-linux (with lra_p disabled but still somewhat pointlessly); John
>>> Anglin said in the PR that it tests ok on PA. Will commit in a few days
>>> if no objections.
>>
>> No objections to the substance of the patch, though I think the comment
>> could be clearer.
> 
> Though my question is: for which target does this matter, since ARM has
> moved away from reload and other targets should do the same?
> 

There's still the chance we will have to move back for this release when
building Thumb1.  Only if we can iron out enough of the bugs/size
regressions will we stick with LRA for that permutation.

R.

[patch] [plugin] Fix PR 59335 plugin build

2014-01-08 Thread Joey Ye

Fix the trunk plugin build by adding missing headers and removing headers
that no longer exist.

Test passed:
- arm-none-eabi build --enable-plugins
- build test plugin 
- x86_64 bootstrap --enable-plugins

OK to trunk?

ChangeLog.gcc
2013-11-19  Joey Ye  

PR plugin/59335
* Makefile.in (tree-cfg.h, tree-into-ssa.h, fold-const.h, gimple-ssa.h,
gimple-iterator.h, varasm.h, context.h): Add missing headers for plugin.
(tree-flow.h, tree-flow-inline.h): Remove as they no longer exist.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 459b1ba..55f1ace 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -882,13 +882,14 @@ TREE_CORE_H = tree-core.h coretypes.h all-tree.def tree.def \
$(VEC_H) treestruct.def $(HASHTAB_H) \
double-int.h alias.h $(SYMTAB_H) $(FLAGS_H) \
$(REAL_H) $(FIXED_VALUE_H)
-TREE_H = tree.h $(TREE_CORE_H)  tree-check.h
+TREE_H = tree.h $(TREE_CORE_H)  tree-check.h tree-cfg.h tree-into-ssa.h
 REGSET_H = regset.h $(BITMAP_H) hard-reg-set.h
 BASIC_BLOCK_H = basic-block.h $(PREDICT_H) $(VEC_H) $(FUNCTION_H) \
cfg-flags.def cfghooks.h
 GIMPLE_H = gimple.h gimple.def gsstruct.def pointer-set.h $(VEC_H) \
$(GGC_H) $(BASIC_BLOCK_H) $(TREE_H) tree-ssa-operands.h \
-   tree-ssa-alias.h $(INTERNAL_FN_H) $(HASH_TABLE_H) is-a.h
+   tree-ssa-alias.h $(INTERNAL_FN_H) $(HASH_TABLE_H) is-a.h \
+   gimple-ssa.h gimple-iterator.h
 GCOV_IO_H = gcov-io.h gcov-iov.h auto-host.h
 RECOG_H = recog.h
 EMIT_RTL_H = emit-rtl.h
@@ -929,7 +930,7 @@ CPP_ID_DATA_H = $(CPPLIB_H) $(srcdir)/../libcpp/include/cpp-id-data.h
 CPP_INTERNAL_H = $(srcdir)/../libcpp/internal.h $(CPP_ID_DATA_H)
 TREE_DUMP_H = tree-dump.h $(SPLAY_TREE_H) $(DUMPFILE_H)
 TREE_PASS_H = tree-pass.h $(TIMEVAR_H) $(DUMPFILE_H)
-TREE_FLOW_H = tree-flow.h tree-flow-inline.h tree-ssa-operands.h \
+TREE_FLOW_H = tree-ssa-operands.h \
$(BITMAP_H) sbitmap.h $(BASIC_BLOCK_H) $(GIMPLE_H) \
$(HASHTAB_H) $(CGRAPH_H) $(IPA_REFERENCE_H) \
tree-ssa-alias.h
@@ -3119,7 +3120,7 @@ PLUGIN_HEADERS = $(TREE_H) $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
   cppdefault.h flags.h $(MD5_H) params.def params.h prefix.h tree-inline.h \
   $(GIMPLE_PRETTY_PRINT_H) realmpfr.h \
   $(IPA_PROP_H) $(TARGET_H) $(RTL_H) $(TM_P_H) $(CFGLOOP_H) $(EMIT_RTL_H) \
-  version.h stringpool.h
+  version.h stringpool.h varasm.h fold-const.h $(CONTEXT_H)
 
 # generate the 'build fragment' b-header-vars
 s-header-vars: Makefile
