Re: [PATCH] Keep REG_INC note in subreg2 pass

2013-10-30 Thread Zhenqiang Chen
On 31 October 2013 00:08, Jeff Law  wrote:
> On 10/30/13 00:09, Zhenqiang Chen wrote:
>>
>> On 30 October 2013 02:47, Jeff Law  wrote:
>>>
>>> On 10/24/13 02:20, Zhenqiang Chen wrote:


 Hi,

 The REG_INC note is lost in the subreg2 pass in resolve_simple_move, which
 might lead to wrong dependences in ira. E.g., in function
 validate_equiv_mem of ira.c, the REG_INC note is checked:

for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
   if ((REG_NOTE_KIND (note) == REG_INC
|| REG_NOTE_KIND (note) == REG_DEAD)
   && REG_P (XEXP (note, 0))
   && reg_overlap_mentioned_p (XEXP (note, 0), memref))
 return 0;

 Without REG_INC note, validate_equiv_mem will return a wrong result.
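
 To make the dependence concrete, here is a minimal, self-contained sketch
 of the check quoted above. It is a toy model, not GCC code: struct
 toy_note, the TOY_* note kinds, and toy_validate_equiv_mem are
 hypothetical stand-ins for the REG_NOTES chain and validate_equiv_mem.
 It only shows why a missing REG_INC note flips the answer.

```c
#include <stddef.h>

/* Toy stand-ins for RTL register notes; these are not GCC's types.  */
enum toy_note_kind { TOY_REG_INC, TOY_REG_DEAD, TOY_REG_EQUAL };

struct toy_note
{
  enum toy_note_kind kind;
  int regno;                /* register the note refers to */
  struct toy_note *next;    /* notes form a singly linked chain */
};

/* Return 1 if a memory reference addressed through ADDR_REGNO can still
   be treated as equivalent after an insn carrying NOTES, 0 otherwise.
   As in the quoted loop, a REG_INC or REG_DEAD note that mentions the
   address register invalidates the equivalence.  */
static int
toy_validate_equiv_mem (const struct toy_note *notes, int addr_regno)
{
  for (const struct toy_note *n = notes; n != NULL; n = n->next)
    if ((n->kind == TOY_REG_INC || n->kind == TOY_REG_DEAD)
        && n->regno == addr_regno)
      return 0;
  return 1;
}
```

 With the note present the function reports 0; if the note has been
 dropped, the same insn is wrongly reported as safe.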

 Refer to https://bugs.launchpad.net/gcc-linaro/+bug/1243022 for more
 detail about a real case in the kernel.

 Bootstrapped with no make check regressions on X86-64 and ARM.

 Is it OK for trunk and 4.8?

 Thanks!
 -Zhenqiang

 ChangeLog:
 2013-10-24  Zhenqiang Chen

   * lower-subreg.c (resolve_simple_move): Copy REG_INC note.

 testsuite/ChangeLog:
 2013-10-24  Zhenqiang Chen

   * gcc.target/arm/lp1243022.c: New test.
>>>
>>>
>>> This clearly handles adding a note when the destination is a MEM with a
>>> side
>>> effect.  What about cases where the side effect is associated with a load
>>> from memory rather than a store to memory?
>>
>>
>> Yes. We should handle load from memory.
>>


 lp1243022.patch


 diff --git a/gcc/lower-subreg.c b/gcc/lower-subreg.c
 index 57b4b3c..e710fa5 100644
 --- a/gcc/lower-subreg.c
 +++ b/gcc/lower-subreg.c
 @@ -1056,6 +1056,22 @@ resolve_simple_move (rtx set, rtx insn)
  mdest = simplify_gen_subreg (orig_mode, dest, GET_MODE (dest),
 0);
  minsn = emit_move_insn (real_dest, mdest);

 +#ifdef AUTO_INC_DEC
 +  /* Copy the REG_INC notes.  */
 +  if (MEM_P (real_dest) && !(resolve_reg_p (real_dest)
 +|| resolve_subreg_p (real_dest)))
 +   {
 + rtx note = find_reg_note (insn, REG_INC, NULL_RTX);
 + if (note)
 +   {
 + if (!REG_NOTES (minsn))
 +   REG_NOTES (minsn) = note;
 + else
 +   add_reg_note (minsn, REG_INC, note);
 +   }
 +   }
 +#endif
>>>
>>>
>>> If MINSN does not have any notes, then this results in MINSN and INSN
>>> sharing the note.  Note carefully that notes are chained (see
>>> implementation
>>> of add_reg_note).  Thus the sharing would result in MINSN and INSN
>>> actually
>>> sharing a chain of notes.  I'm pretty sure that's not what you intended.
>>> I
>>> think you need to always use add_reg_note.
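
The sharing problem described above can be sketched with a toy note chain.
This is an illustration under assumptions, not GCC code: struct toy_note,
struct toy_insn, toy_add_note and toy_count_notes are hypothetical names,
and toy_add_note only mimics the idea of add_reg_note (allocate a fresh
note and link it in front of the insn's existing chain).

```c
#include <assert.h>
#include <stdlib.h>

struct toy_note
{
  int kind;
  int regno;
  struct toy_note *next;
};

struct toy_insn
{
  struct toy_note *notes;   /* head of this insn's note chain */
};

/* Allocate a new note and prepend it to INSN's own chain.  */
static void
toy_add_note (struct toy_insn *insn, int kind, int regno)
{
  struct toy_note *n = malloc (sizeof *n);
  assert (n != NULL);
  n->kind = kind;
  n->regno = regno;
  n->next = insn->notes;
  insn->notes = n;
}

/* Count the notes reachable from INSN.  */
static int
toy_count_notes (const struct toy_insn *insn)
{
  int count = 0;
  for (const struct toy_note *n = insn->notes; n != NULL; n = n->next)
    count++;
  return count;
}
```

Assigning one insn's note pointer to another insn makes both heads point at
the same chain, so the second insn also picks up every note linked behind
the one that was wanted. Calling toy_add_note with just the note's payload
gives the second insn a private one-element chain instead.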
>>
>>
>> Yes. I should use add_reg_note.
>>
>> Here is the updated patch:
>>
>> diff --git a/gcc/lower-subreg.c b/gcc/lower-subreg.c
>> index ebf364f..16dfa62 100644
>> --- a/gcc/lower-subreg.c
>> +++ b/gcc/lower-subreg.c
>> @@ -967,7 +967,20 @@ resolve_simple_move (rtx set, rtx insn)
>> rtx reg;
>>
>> reg = gen_reg_rtx (orig_mode);
>> +
>> +#ifdef AUTO_INC_DEC
>> +  {
>> +   rtx move = emit_move_insn (reg, src);
>> +   if (MEM_P (src))
>> + {
>> +   rtx note = find_reg_note (insn, REG_INC, NULL_RTX);
>> +   if (note)
>> + add_reg_note (move, REG_INC, XEXP (note, 0));
>> + }
>> +  }
>> +#else
>> emit_move_insn (reg, src);
>> +#endif
>> src = reg;
>>   }
>>
>> @@ -1057,6 +1070,16 @@ resolve_simple_move (rtx set, rtx insn)
>>  mdest = simplify_gen_subreg (orig_mode, dest, GET_MODE (dest),
>> 0);
>> minsn = emit_move_insn (real_dest, mdest);
>>
>> +#ifdef AUTO_INC_DEC
>> +  if (MEM_P (real_dest) && !(resolve_reg_p (real_dest)
>> +|| resolve_subreg_p (real_dest)))
>
> Formatting nit.   This should be formatted as
>
> if (MEM_P (real_dest)
> && !(resolve_reg_p (real_dest) || resolve_subreg_p (real_dest)))
>
> If that results in too long of a line, then it should wrap like this:
>
>
> if (MEM_P (real_dest)
> && !(resolve_reg_p (real_dest)
>  || resolve_subreg_p (real_dest)))
>
> OK with that change.  Please install on the trunk.  The 4.8 maintainers have
> the final call for the 4.8 release branch.

Thanks. Patch is committed to trunk@r204247 with the change.

-Zhenqiang


Re: Pre-Patch RFC: proposed changes to option-lookup

2013-10-30 Thread Jeff Law

On 10/30/13 14:39, David Malcolm wrote:

[Sending this to gcc-patches to double-check that the idea is sound
before continuing to work on this large patch. [1] ]

I want to eliminate hidden use of the preprocessor in our code, in favor
of using block caps to signal to people reading the code that macro
magic is happening.

As a specific example, consider this supposedly-simple code:

   static bool
   gate_vrp (void)
   {
 return flag_tree_vrp != 0;
   }

where "flag_tree_vrp" is actually an autogenerated macro to
"global_options.x_flag_tree_vrp"

This is deeply confusing to a newbie - and indeed still to me after two
years of working with GCC's internals, for example, when stepping
through code and trying to query values in gdb.

My idea is to introduce a GCC_OPTION macro, and replace the above with:

   static bool
   gate_vrp (void)
   {
 return GCC_OPTION (flag_tree_vrp) != 0;
   }

thus signaling to humans that macros are present.
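
For illustration only, one way such a macro could be written is sketched
below. The definition of GCC_OPTION here is a guess at the idea, not the
proposed patch, and struct toy_gcc_options is a one-field stand-in for the
autogenerated options structure.

```c
#include <stdbool.h>

/* Stand-in for the autogenerated options structure (one field shown).  */
struct toy_gcc_options
{
  int x_flag_tree_vrp;
};

static struct toy_gcc_options global_options = { 1 };

/* Hypothetical definition: paste the x_ prefix onto the option name so
   the expansion is the explicit global_options member.  */
#define GCC_OPTION(NAME) (global_options.x_##NAME)

static bool
gate_vrp (void)
{
  return GCC_OPTION (flag_tree_vrp) != 0;
}
```

Because the macro expands to an lvalue, it can be used for both reads and
assignments, and the block capitals make the indirection visible at the
point of use.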

Is such a patch likely to be accepted?   Should I try to break the
options up into logical groups e.g. with separate macros for warnings vs
optimizations, or some other scheme?

So what's the advantage of GCC_OPTION over just explicitly referencing 
it via global_options?  (I would agree that GCC_OPTION or an explicit 
reference are better than the magic that happens behind our back with 
the flags right now)


I'm definitely in favor of removing hidden macro magic, so I think we 
want to go forward with something here.  I just want to know why 
GCC_OPTION over the fully explicit version.



Jeff


[RFA][PATCH] Isolate erroneous paths optimization

2013-10-30 Thread Jeff Law


I've incorporated the various suggestions from Marc and Richi, except 
for Richi's to integrate this into jump threading.


I've also made the following changes since the last version:

  1. Added more testcases.

  2. Use infer_nonnull_range, moving it from tree-vrp.c
  into gimple.c.  Minor improvements to infer_nonnull_range
  to make it handle more cases we care about and avoid using
  unnecessary routines from tree-ssa.c (which can now be removed)

  3. Multiple undefined statements in a block are handled in the
  logical way.
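
As a rough source-level picture of what the optimization aims for (the
pass itself works on the GIMPLE CFG by duplicating a block and redirecting
an edge, so this is only an analogy), consider the two hypothetical
functions below. toy_before has a path on which a null pointer is
dereferenced; toy_after shows that path isolated and replaced by a trap.

```c
#include <stddef.h>

/* Before: when FLAG is nonzero, P is NULL at the dereference, so that
   path has undefined behaviour.  */
static int
toy_before (int flag, int *fallback)
{
  int *p = flag ? NULL : fallback;
  return *p;
}

/* After (conceptually): the erroneous path is split off from the main
   control flow and its dereference becomes an unconditional trap.  The
   remaining path is unchanged.  */
static int
toy_after (int flag, int *fallback)
{
  if (flag)
    __builtin_trap ();
  return *fallback;
}
```

Both functions behave identically on the path that is valid in a
conforming program; only the erroneous path differs.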

Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  OK for 
the trunk?


Thanks,
Jeff
* Makefile.in (OBJS): Add gimple-ssa-isolate-paths.o
* common.opt (-fisolate-erroneous-paths): Add option and
documentation.
* gimple-ssa-isolate-paths.c: New file.
* gimple.c (check_loadstore): New function.
(infer_nonnull_range): Moved into gimple.c from tree-vrp.c
Verify OP is in the argument list and the argument corresponding
to OP is a pointer type.  Use operand_equal_p rather than
pointer equality when testing if OP is on the nonnull list.
Use check_loadstore rather than count_ptr_derefs.  Handle
GIMPLE_RETURN statements.
* tree-vrp.c (infer_nonnull_range): Remove.
* gimple.h (infer_nonnull_range): Declare.
(gsi_start_nondebug_after_labels): New function.
* opts.c (default_options_table): Add OPT_fisolate_erroneous_paths.
* passes.def: Add pass_isolate_erroneous_paths.
* timevar.def (TV_ISOLATE_ERRONEOUS_PATHS): New timevar.
* tree-pass.h (make_pass_isolate_erroneous_paths): Declare.
* tree-ssa.c (struct count_ptr_d): Remove.
(count_ptr_derefs, count_uses_and_derefs): Remove.
* tree-ssa.h (count_uses_and_derefs): Remove.



* gcc.dg/pr38984.c: Add -fno-isolate-erroneous-paths.
* gcc.dg/tree-ssa/20030711-3.c: Update expected output.
* gcc.dg/tree-ssa/isolate-1.c: New test.
* gcc.dg/tree-ssa/isolate-2.c: New test.
* gcc.dg/tree-ssa/isolate-3.c: New test.
* gcc.dg/tree-ssa/isolate-4.c: New test.



diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 29609fd..7e9a702 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1233,6 +1233,7 @@ OBJS = \
gimple-fold.o \
gimple-low.o \
gimple-pretty-print.o \
+   gimple-ssa-isolate-paths.o \
gimple-ssa-strength-reduction.o \
gimple-streamer-in.o \
gimple-streamer-out.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index deeb3f2..6db9f56 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2104,6 +2104,12 @@ foptimize-strlen
 Common Report Var(flag_optimize_strlen) Optimization
 Enable string length optimizations on trees
 
+fisolate-erroneous-paths
+Common Report Var(flag_isolate_erroneous_paths) Init(1) Optimization
+Detect paths which trigger erroneous or undefined behaviour.  Isolate those
+paths from the main control flow and turn the statement with erroneous or
+undefined behaviour into a trap.
+
 ftree-loop-distribution
 Common Report Var(flag_tree_loop_distribution) Optimization
 Enable loop distribution on trees
diff --git a/gcc/gimple-ssa-isolate-paths.c b/gcc/gimple-ssa-isolate-paths.c
new file mode 100644
index 000..aa526cc
--- /dev/null
+++ b/gcc/gimple-ssa-isolate-paths.c
@@ -0,0 +1,332 @@
+/* Detect paths through the CFG which can never be executed in a conforming
+   program and isolate them.
+
+   Copyright (C) 2013
+   Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "flags.h"
+#include "basic-block.h"
+#include "gimple.h"
+#include "tree-ssa.h"
+#include "gimple-ssa.h"
+#include "tree-ssa-operands.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "cfgloop.h"
+#include "tree-pass.h"
+
+
+static bool cfg_altered;
+
+/* BB when reached via incoming edge E will exhibit undefined behaviour
+   at STMT.  Isolate and optimize the path which exhibits undefined
+   behaviour.
+
+   Isolation is simple.  Duplicate BB and redirect E to BB'.
+
+   Optimization is simple as well.  Replace STMT in BB' with an
+   unconditional trap and remove all outgoing edges from BB'.
+
+   DUPLICATE is a pre-existing duplicate

ICE with "[PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping"

2013-10-30 Thread Hans-Peter Nilsson
> From: Jakub Jelinek 
> Date: Thu, 31 Oct 2013 00:16:41 +0100

> On Fri, Oct 25, 2013 at 05:19:06PM +0200, Martin Jambor wrote:
> > 2013-10-23  Martin Jambor  
> > 
> > PR rtl-optimization/10474
> > * ira.c (find_moveable_pseudos): Do not calculate dominance info
> > nor df analysis.
> > (interesting_dest_for_shprep): New function.
> > (split_live_ranges_for_shrink_wrap): Likewise.
> > (ira): Calculate dominance info and df analysis. Call
> > split_live_ranges_for_shrink_wrap.
> > 
> > testsuite/
> > * gcc.dg/pr10474.c: New testcase.
> > * gcc.dg/ira-shrinkwrap-prep-1.c: Likewise.
> > * gcc.dg/ira-shrinkwrap-prep-2.c: Likewise.
> 
> Unfortunately this patch breaks i686-linux bootstrap,

It (revision r204205) "also" causes
.

(If I had seen your message Jakub, it might have saved me the
reghunt and a PR.  On the other hand, debugging an ICE is far
more comfortable than a miscompare.)

brgds, H-P


Re: [PATCH][RFA] Improvements to infer_nonnull_range

2013-10-30 Thread Jeff Law

On 10/28/13 23:02, Jeff Law wrote:


Based on a suggestion from Marc, I want to use infer_nonnull_range in
the erroneous path isolation optimization.

In the process of doing that, I found  a few deficiencies in
infer_nonnull_range that need to be addressed.


First, infer_nonnull_range_p doesn't infer ranges from a GIMPLE_RETURN
if the current function is marked as returns_nonnull.  That's fixed in
the relatively obvious way.

Second, infer_nonnull_range_p, when presented with an arglist where the
non-null attribute applies to all pointer arguments, it won't bother to
determine if OP is actually one of the arguments :(  It just assumes
that OP is in the argument list.  Oops.

Third, I want to be able to call infer_nonnull_range with OP being 0B.
That lets me use infer_nonnull_range to look for explicit null pointer
dereferences, explicit uses of null in a return statement in functions
that can't return null and explicit uses of null arguments when
those arguments can't be null.   Sadly, to make that work we need to use
operand_equal_p rather than simple pointer comparisons to see if OP
shows up in STMT.

Finally, for detecting explicit null pointers, infer_nonnull_range calls
count_ptr_derefs.  count_ptr_derefs counts things in two ways.  One with
a FOR_EACH_SSA_TREE_OPERAND, then again with simple walks of the tree
structures.   Not surprisingly if we're looking for an explicit 0B, then
the loop over the operands finds nothing, but walking the tree
structures does. And the checking assert triggers.  This change removes
the assert and instead sets *num_uses_p to a sane value.
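
The second and third points can be sketched with a toy model. Everything
here is hypothetical (struct toy_operand, toy_operand_equal,
toy_infer_nonnull_from_call); it is not the code in tree-vrp.c. It shows
only the two ideas: check that OP really is among the call's arguments,
and compare operands structurally so that two separate representations of
a null constant match.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy operand: either a named value (ID) or the constant null.  */
struct toy_operand
{
  bool is_null_constant;
  int id;
};

/* Structural comparison, playing the role that operand_equal_p plays in
   the description above: two distinct objects that both represent null
   compare equal even though their addresses differ.  */
static bool
toy_operand_equal (const struct toy_operand *a, const struct toy_operand *b)
{
  if (a->is_null_constant || b->is_null_constant)
    return a->is_null_constant && b->is_null_constant;
  return a->id == b->id;
}

/* For a call whose pointer arguments are all required to be non-null,
   return true only if OP actually appears among ARGS.  */
static bool
toy_infer_nonnull_from_call (const struct toy_operand *op,
                             const struct toy_operand *args, size_t nargs)
{
  for (size_t i = 0; i < nargs; i++)
    if (toy_operand_equal (op, &args[i]))
      return true;
  return false;
}
```

With address comparison instead of toy_operand_equal, a query for an
explicit null would miss a null argument that is stored in a different
object.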

I don't have testcases for this stuff that are independent of the
erroneous path isolation optimization.  However, each is triggered by
tests I'll include in the erroneous path isolation patch.

Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  OK for
the trunk?

I'm withdrawing this patch.  While everything works exactly how I 
wanted, I think it's best to rework things a bit and eliminate 
count_ptr_derefs entirely.


That change makes it significantly easier to move infer_nonnull_range 
out of tree-vrp.c and into gimple.c (or wherever Andrew wants it) 
without a significant modularity break.


Jeff



Re: RFC: gimple.[ch] break apart

2013-10-30 Thread Jeff Law

On 10/30/13 21:18, Andrew MacLeod wrote:

Hopefully the other attempts to send this aren't queued up...  in any
case, maybe I can't even attach a .dot file?... So no attachments this
time...

instead, the diagram is here:
http://gcc.gnu.org/wiki/rearch?action=AttachFile&do=view&target=gimple.png

One was queued and went through with attachments :-)
jeff



Re: patch to improve register preferencing in IRA and to *remove regmove* pass

2013-10-30 Thread Jeff Law

On 10/30/13 16:50, Steven Bosscher wrote:

On Tue, Oct 29, 2013 at 4:12 PM, Vladimir Makarov wrote:

   Tomorrow I'd like to commit the following patch.

   The patch removes regmove pass.


I can barely hold my tears... of joy :-)

Attached patch cleans up some left-overs. Nothing to test, really, as
it's just comments and NOPs. OK for trunk?

Of course.

jeff



RFC: gimple.[ch] break apart

2013-10-30 Thread Andrew MacLeod
Hopefully the other attempts to send this aren't queued up...  in any 
case, maybe I can't even attach a .dot file?... So no attachments this 
time...


instead, the diagram is here:
http://gcc.gnu.org/wiki/rearch?action=AttachFile&do=view&target=gimple.png

-

I've made 4 attempts now to split gimple.[ch] into reasonable component
parts, and I've finally found something that I can make work and fits my
plans.

I've attached a diagram to (hopefully :-) clarify things.

The original purpose of gimple.[ch] was to provide gimple statements.
This replaces the tcc_statement tree kind during the gimplification
process. No other tree kinds have been converted to gimple
structs/classes.  That's what the next stage of my project will do.

As a result, any gimple queries regarding types, decls, or expressions
are actually tree queries. They are sprinkled throughout gimple.[ch] and
gimplify.[ch], not to mention tree.[ch] as well as other parts of the
compiler where they happened to be needed.  This has caused various
ordering issues among the inline functions when I tried to split out the
stmt, iterator, and gimplification bits from gimple.[ch].  Not to
mention a lack of an obvious home for some of these functions.

I'd like to move these as I encounter them into a new file,
gimple-decl.[ch].  When I'm working on the other gimple classes, this
will be further split into gimple-decl, gimple-type and gimple-expr as
appropriate but it seems reasonable to create just the one file now to
clump them since there is no other formal organization. So any function
which is actually querying/setting/building a decl, type, or expression
for gimple would go here.

I also want to split out the structure and accessing bits of the gimple
statement structure into gimple-stmt.[ch].  This would be just the
struct decls as well as all the accessing/setting/building functions...

The gimple_stmt_iterators (gsi) themselves also break out into their own
file quite naturally.

I find that gimple_seq does not seem to be a very clearly defined
thing.  Although a gimple_seq is just a typedef for a gimple stmt (I
thought it used to be a container?), it provides some additional
statement queuing functionality.   I.e., you don't have to worry about
next and prev pointers in the stmts you build, you simply queue them up
and attach them where you want.  In that sense, it's a kind of overlay on
top of a gimple-stmt as it provides additional functionality.
gimple_seq also utilizes gimple_stmt_iterators under the covers. They do
not expose the iterators to a function using a gimple_seq but they do
need that knowledge to manage the lists under the covers (thus a dashed
line in the diagram).
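
The queuing behaviour described above can be sketched with a small toy
model. None of these names are GCC's: struct toy_stmt, toy_seq,
toy_seq_add_stmt and toy_seq_last are hypothetical, and the choice to keep
the tail in the first statement's prev pointer is an assumption of this
sketch. The point is only that the caller queues statements and never
touches next/prev itself.

```c
#include <assert.h>
#include <stddef.h>

struct toy_stmt
{
  int code;
  struct toy_stmt *prev;
  struct toy_stmt *next;
};

/* As in the description, a sequence is just a pointer to a statement.  */
typedef struct toy_stmt *toy_seq;

/* Append STMT to *SEQ.  The first statement's prev pointer tracks the
   last statement, so appending needs no walk over the list.  */
static void
toy_seq_add_stmt (toy_seq *seq, struct toy_stmt *stmt)
{
  assert (seq != NULL && stmt != NULL);
  stmt->next = NULL;
  if (*seq == NULL)
    {
      stmt->prev = stmt;    /* a sole statement is also the last one */
      *seq = stmt;
      return;
    }
  struct toy_stmt *last = (*seq)->prev;
  last->next = stmt;
  stmt->prev = last;
  (*seq)->prev = stmt;
}

/* Return the last statement of SEQ, or NULL for an empty sequence.  */
static struct toy_stmt *
toy_seq_last (toy_seq seq)
{
  return seq != NULL ? seq->prev : NULL;
}
```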

It's unclear to me whether gimple_seq's are intended to have a future, or
whether their functionality should be rolled right into statements
themselves.  I believe it may be possible to include gimple-iterator.h
in gimple-stmt.c to provide the implementation without affecting the
inheritance layout, although I haven't actually tried that path.

Or we can treat them as a different layer with their own gimple-seq.[ch]
files.

Or we could combine gimple_seq and gsi routines in the same file, but
that would have the downside of exposing the gsi routines to the
gimplifier, which should have no need of them.   Either of the latter 2
options seem reasonable to me for now.

The remnants of gimple.[ch] would contain various general helper
routines (walkers, etc), much like tree-ssa.[ch] does for ssa.

And finally gimplify.[ch] would contain all the stuff required for the
front ends to generate gimple code.  This is actually a front end
interface.  At the moment it isn't obvious since all the current gimple
code also uses trees and calls the gimplifier frequently. As I push
gimple types and decls into the back end and remove trees, the backend
should simply generate gimple directly.  Gimplify should slowly become
usable only via tree based front ends...  (Thus the dotted line from BE
to gimplify.. it should be removed eventually)

Which means that all the front end files should be including *only*
gimplify.h, and getting everything they need from there. Currently a
number of them include gimple.h which should not be required.

How reasonable or unreasonable does this sound? :-)  I've been tearing
the file apart in different ways and orders, and this seems to be the
most workable solution I have come up with that doesn't involve hacks.

Andrew




Re: patch to improve register preferencing in IRA and to *remove regmove* pass

2013-10-30 Thread Vladimir Makarov

On 10/30/2013, 7:40 PM, David Edelsohn wrote:

Where was this patch bootstrapped? This appears to have broken
bootstrap on PowerPC (Linux and AIX)

/nasfarm/edelsohn/src/src/libgcc/libgcov.c: In function 'gcov_exit':
/nasfarm/edelsohn/src/src/libgcc/libgcov.c:827:1: internal compiler error: in update_costs_from_allocno, at ira-color.c:1334



Sorry for the inconvenience, David.

I tested it thoroughly on x86/x86-64 and ppc a few days ago but it 
looks like I used LRA.  PPC was not so interesting for me as this patch 
affects mostly two-op insn architectures.  The problem is specific to 
reload and occurs when we reassign a pseudo from the reload pass.


I've just committed the patch fixing this.

Sorry again.



patch to fix PR58933 (ppc bootstrap)

2013-10-30 Thread Vladimir Makarov

The following patch fixes http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58933.

The problem occurs for reload based targets.

Successfully bootstrapped on ppc64.

Committed as rev. 204245.

2013-10-30  Vladimir Makarov  

PR bootstrap/58933
* ira-color.c (update_costs_from_copies): Add new parameter.  Use
it for calling update_costs_from_allocno.
(assign_hard_reg): Call restore_costs_from_copies only for
!retry_p.  Pass new argument to update_costs_from_copies.
(color_pass): Pass new argument to update_costs_from_copies.
(ira_mark_allocation_change): Ditto.


Index: ira-color.c
===
--- ira-color.c (revision 204244)
+++ ira-color.c (working copy)
@@ -1217,7 +1217,7 @@ static struct update_cost_queue_elem *up
 static int update_cost_check;
 
 /* Allocate and initialize data necessary for function
-   update_costs_from_copiess.  */
+   update_costs_from_copies.  */
 static void
 initiate_cost_update (void)
 {
@@ -1399,16 +1399,16 @@ update_costs_from_prefs (ira_allocno_t a
 /* Update (decrease if DECR_P) the cost of allocnos connected to
ALLOCNO through copies to increase chances to remove some copies as
the result of subsequent assignment.  ALLOCNO was just assigned to
-   a hard register.  */
+   a hard register.  Record cost updates if RECORD_P is true.  */
 static void
-update_costs_from_copies (ira_allocno_t allocno, bool decr_p)
+update_costs_from_copies (ira_allocno_t allocno, bool decr_p, bool record_p)
 {
   int hard_regno;
 
   hard_regno = ALLOCNO_HARD_REGNO (allocno);
   ira_assert (hard_regno >= 0 && ALLOCNO_CLASS (allocno) != NO_REGS);
   start_update_cost ();
-  update_costs_from_allocno (allocno, hard_regno, 1, decr_p, true);
+  update_costs_from_allocno (allocno, hard_regno, 1, decr_p, record_p);
 }
 
 /* Restore costs of allocnos connected to ALLOCNO by copies as it was
@@ -1849,11 +1849,12 @@ assign_hard_reg (ira_allocno_t a, bool r
   for (i = hard_regno_nregs[best_hard_regno][mode] - 1; i >= 0; i--)
allocated_hardreg_p[best_hard_regno + i] = true;
 }
-  restore_costs_from_copies (a);
+  if (! retry_p)
+restore_costs_from_copies (a);
   ALLOCNO_HARD_REGNO (a) = best_hard_regno;
   ALLOCNO_ASSIGNED_P (a) = true;
   if (best_hard_regno >= 0)
-update_costs_from_copies (a, true);
+update_costs_from_copies (a, true, ! retry_p);
   ira_assert (ALLOCNO_CLASS (a) == aclass);
   /* We don't need updated costs anymore: */
   ira_free_allocno_updated_costs (a);
@@ -2942,7 +2943,7 @@ color_pass (ira_loop_tree_node_t loop_tr
ALLOCNO_HARD_REGNO (subloop_allocno) = hard_regno;
ALLOCNO_ASSIGNED_P (subloop_allocno) = true;
if (hard_regno >= 0)
- update_costs_from_copies (subloop_allocno, true);
+ update_costs_from_copies (subloop_allocno, true, true);
/* We don't need updated costs anymore: */
ira_free_allocno_updated_costs (subloop_allocno);
  }
@@ -2986,7 +2987,7 @@ color_pass (ira_loop_tree_node_t loop_tr
  ALLOCNO_HARD_REGNO (subloop_allocno) = hard_regno;
  ALLOCNO_ASSIGNED_P (subloop_allocno) = true;
  if (hard_regno >= 0)
-   update_costs_from_copies (subloop_allocno, true);
+   update_costs_from_copies (subloop_allocno, true, true);
  /* We don't need updated costs anymore: */
  ira_free_allocno_updated_costs (subloop_allocno);
}
@@ -3002,7 +3003,7 @@ color_pass (ira_loop_tree_node_t loop_tr
  ALLOCNO_HARD_REGNO (subloop_allocno) = hard_regno;
  ALLOCNO_ASSIGNED_P (subloop_allocno) = true;
  if (hard_regno >= 0)
-   update_costs_from_copies (subloop_allocno, true);
+   update_costs_from_copies (subloop_allocno, true, true);
  /* We don't need updated costs anymore: */
  ira_free_allocno_updated_costs (subloop_allocno);
}
@@ -3983,7 +3984,7 @@ ira_mark_allocation_change (int regno)
   ? ALLOCNO_CLASS_COST (a)
   : ALLOCNO_HARD_REG_COSTS (a)
 [ira_class_hard_reg_index[aclass][old_hard_regno]]);
-  update_costs_from_copies (a, false);
+  update_costs_from_copies (a, false, false);
 }
   ira_overall_cost -= cost;
   ALLOCNO_HARD_REGNO (a) = hard_regno;
@@ -3998,7 +3999,7 @@ ira_mark_allocation_change (int regno)
   ? ALLOCNO_CLASS_COST (a)
   : ALLOCNO_HARD_REG_COSTS (a)
 [ira_class_hard_reg_index[aclass][hard_regno]]);
-  update_costs_from_copies (a, true);
+  update_costs_from_copies (a, true, false);
 }
   else
 /* Reload changed class of the allocno.  */


Re: patch to improve register preferencing in IRA and to *remove regmove* pass

2013-10-30 Thread Vladimir Makarov

On 10/30/2013, 6:50 PM, Steven Bosscher wrote:

On Tue, Oct 29, 2013 at 4:12 PM, Vladimir Makarov wrote:

   Tomorrow I'd like to commit the following patch.

   The patch removes regmove pass.

I can barely hold my tears... of joy :-)

Attached patch cleans up some left-overs. Nothing to test, really, as
it's just comments and NOPs. OK for trunk?


Yes, sure.  Sorry, I missed this.  I am not a target maintainer but the 
changes are pretty obvious.  Thanks, Steven.




Re: [PATCH][ubsan] Add VLA bound instrumentation

2013-10-30 Thread Jason Merrill

On 10/30/2013 12:15 PM, Marek Polacek wrote:

On Wed, Oct 30, 2013 at 11:56:25AM -0400, Jason Merrill wrote:

Saving 'size' here doesn't help since it's already been used above.
Could you use itype instead of size here?


I already experimented with that and I think I can't, since we call
the finish_expr_stmt too soon, which results in:

 int x = 1;
 int a[0:(sizetype) SAVE_EXPR ];

 <>;
 < <= 0)
   {
 __builtin___ubsan_handle_vla_bound_not_positive (&*.Lubsan_data0, (unsigned 
long) SAVE_EXPR );
   }
 else
   {
 0
   }, (void) SAVE_EXPR ; >;
   ssizetype D.2143;
 <;


Ah, looks like you're getting an unfortunate interaction with 
stabilize_vla_size, which is replacing the contents of the SAVE_EXPR 
with a reference to a variable that isn't initialized yet.  Perhaps we 
should move the stabilize_vla_size call into compute_array_index_type, too.



and that ICEs in gimplify_var_or_parm_decl, presumably because the
if (SAVE_EXPR  <= 0) { ... } should be emitted *after* that
cleanup_point.  When we generated the C++1y check in cp_finish_decl,
we emitted the check after the cleanup_point, and everything was OK.
I admit I don't understand the cleanup_points very much and I don't
know exactly where they are coming from, because normally I don't see
them coming out of C FE. :)


You can ignore the cleanup_points; they just wrap every full-expression.

Jason



Re: [PATCH] Introducing SAD (Sum of Absolute Differences) operation to GCC vectorizer.

2013-10-30 Thread Ramana Radhakrishnan
On Thu, Oct 31, 2013 at 12:29 AM, Cong Hou  wrote:
> On Tue, Oct 29, 2013 at 4:49 PM, Ramana Radhakrishnan
>  wrote:
>> Cong,
>>
>> Please don't do the following.
>>
>>>+++ b/gcc/testsuite/gcc.dg/vect/
>> vect-reduc-sad.c
>> @@ -0,0 +1,54 @@
>> +/* { dg-require-effective-target sse2 { target { i?86-*-* x86_64-*-* } } } 
>> */
>>
>> you are adding a test to gcc.dg/vect - It's a common directory
>> containing tests that need to run on multiple architectures and such
>> tests should be keyed by the feature they enable which can be turned
>> on for ports that have such an instruction.
>>
>> The correct way of doing this is to key this on the feature something
>> like dg-require-effective-target vect_sad_char . And define the
>> equivalent routine in testsuite/lib/target-supports.exp and enable it
>> for sse2 for the x86 port. If in doubt look at
>> check_effective_target_vect_int and a whole family of such functions
>> in testsuite/lib/target-supports.exp
>>
>> This makes life easy for other port maintainers who want to turn on
>> this support. And for bonus points please update the testcase writing
>> wiki page with this information if it isn't already there.
>>
>
> OK, I will likely move the test case to gcc.target/i386 as currently
> only SSE2 provides SAD instruction. But your suggestion also helps!

Sorry, no - I really don't like that approach, if the test remains in
the common directory keyed off as I suggested, it makes life easier
when turning this on in other ports as adding this pattern in the port
would take this test from being UNSUPPORTED->XPASS and keeps
gcc.dg/vect reasonably up to date with respect to testing the features
of the vectorizer and in touch with the way in which the tests in
gcc.dg/vect have been written till date.

I think Neon has an equivalent instruction called vaba but I will have
to check in the morning when I get back to my machine.


regards
Ramana


>
>6  abs_diff = ABS_EXPR ;
>>>  [S7  abs_diff = (TYPE2) abs_diff;  #optional]
>>>  S8  sum_1 = abs_diff + sum_0;
>>>
>>>where 'TYPE1' is at least double the size of type 'type', and 'TYPE2' is 
>>> the
>>>same size of 'TYPE1' or bigger. This is a special case of a reduction
>>>computation.
>>>
>>> For SSE2, type is char, and TYPE1 and TYPE2 are int.
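
For readers following the pattern quoted above, a scalar reference for the
reduction might look like the sketch below. It is an illustration, not the
test case from the patch: toy_sad_u8 is a hypothetical name, and treating
the char elements as unsigned is an assumption of this example.

```c
#include <stdlib.h>

/* Sum of absolute differences over N byte elements.  The differences are
   widened to int before abs (the TYPE1 step) and accumulated in an int
   (the TYPE2 step), matching "type is char, TYPE1 and TYPE2 are int".  */
static int
toy_sad_u8 (const unsigned char *a, const unsigned char *b, int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += abs ((int) a[i] - (int) b[i]);
  return sum;
}
```

A loop of this shape is what the pattern recognizer would need to match
before mapping it onto an instruction such as psadbw.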
>>>
>>>
>>> In order to express this new operation, a new expression SAD_EXPR is
>>> introduced in tree.def, and the corresponding entry in optabs is
>>> added. The patch also added the "define_expand" for SSE2 and AVX2
>>> platforms for i386.
>>>
>>> The patch is pasted below and also attached as a text file (in which
>>> you can see tabs). Bootstrap and make check got passed on x86. Please
>>> give me your comments.
>>>
>>>
>>>
>>> thanks,
>>> Cong
>>>
>>>
>>>
>>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>>> index 8a38316..d528307 100644
>>> --- a/gcc/ChangeLog
>>> +++ b/gcc/ChangeLog
>>> @@ -1,3 +1,23 @@
>>> +2013-10-29  Cong Hou  
>>> +
>>> + * tree-vect-patterns.c (vect_recog_sad_pattern): New function for SAD
>>> + pattern recognition.
>>> + (type_conversion_p): PROMOTION is true if it's a type promotion
>>> + conversion, and false otherwise.  Return true if the given expression
>>> + is a type conversion one.
>>> + * tree-vectorizer.h: Adjust the number of patterns.
>>> + * tree.def: Add SAD_EXPR.
>>> + * optabs.def: Add sad_optab.
>>> + * cfgexpand.c (expand_debug_expr): Add SAD_EXPR case.
>>> + * expr.c (expand_expr_real_2): Likewise.
>>> + * gimple-pretty-print.c (dump_ternary_rhs): Likewise.
>>> + * gimple.c (get_gimple_rhs_num_ops): Likewise.
>>> + * optabs.c (optab_for_tree_code): Likewise.
>>> + * tree-cfg.c (estimate_operator_cost): Likewise.
>>> + * tree-ssa-operands.c (get_expr_operands): Likewise.
>>> + * tree-vect-loop.c (get_initial_def_for_reduction): Likewise.
>>> + * config/i386/sse.md: Add SSE2 and AVX2 expand for SAD.
>>> +
>>>  2013-10-14  David Malcolm  
>>>
>>>   * dumpfile.h (gcc::dump_manager): New class, to hold state
>>> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
>>> index 7ed29f5..9ec761a 100644
>>> --- a/gcc/cfgexpand.c
>>> +++ b/gcc/cfgexpand.c
>>> @@ -2730,6 +2730,7 @@ expand_debug_expr (tree exp)
>>>   {
>>>   case COND_EXPR:
>>>   case DOT_PROD_EXPR:
>>> + case SAD_EXPR:
>>>   case WIDEN_MULT_PLUS_EXPR:
>>>   case WIDEN_MULT_MINUS_EXPR:
>>>   case FMA_EXPR:
>>> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
>>> index c3f6c94..ca1ab70 100644
>>> --- a/gcc/config/i386/sse.md
>>> +++ b/gcc/config/i386/sse.md
>>> @@ -6052,6 +6052,40 @@
>>>DONE;
>>>  })
>>>
>>> +(define_expand "sadv16qi"
>>> +  [(match_operand:V4SI 0 "register_operand")
>>> +   (match_operand:V16QI 1 "register_operand")
>>> +   (match_operand:V16QI 2 "register_operand")
>>> +   (match_operand:V4SI 3 "register_operand")]
>>> +  "TARGET_SSE2"
>>> +{
>>> +  rtx t1 = gen_reg_rtx (V2DImode);
>>> +  rtx t2 = gen_reg_rtx (V4SImode);
>>> +  emit_insn (gen_sse2_psadbw (t1, operands[1], operands[2]));
>>> +  convert_move (t2, t1, 0);
>>> +  emit_ins

Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2

2013-10-30 Thread Sandra Loosemore

On 10/29/2013 02:51 AM, Bernd Edlinger wrote:


On Mon, 28 Oct 2013 21:29:24, Sandra Loosemore wrote:

On 10/28/2013 03:20 AM, Bernd Edlinger wrote:

I have attached an update to your patch, that should
a) fix the recursion problem.
b) restrict the -fstrict-volatile-bitfields to not violate the C++ memory model.


Here's a new version of the update patch.


Alternatively, if strict_volatile_bitfield_p returns false but
flag_strict_volatile_bitfields > 0, then always force to word_mode and
change the -fstrict-volatile-bitfields documentation to indicate that's
the fallback if the insertion/extraction cannot be done in the declared
mode, rather than claiming that it tries to do the same thing as if
-fstrict-volatile-bitfields were not enabled at all.


I decided that this approach was more expedient, after all.
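
As a rough illustration of the memory-model restriction, here is a Python sketch of the region test that the expmed.c hunk below adds to strict_volatile_bitfield_p. The function name is made up for this example; the arithmetic simply mirrors the comparison as posted, and it makes no claim about whether BITREGION_END is meant to be inclusive.

```python
def access_fits_bitregion(bitnum, modesize, bitregion_start, bitregion_end):
    """Hypothetical model of the posted check (not GCC code).

    Returns True when a MODESIZE-bit, naturally aligned access containing
    bit BITNUM stays inside [bitregion_start, bitregion_end], or when no
    region is given (bitregion_end == 0).
    """
    if bitregion_end == 0:
        # No bit region supplied: the memory-model restriction does not apply.
        return True
    # Start of the aligned MODESIZE-bit unit that contains BITNUM.
    access_start = bitnum - bitnum % modesize
    access_end = access_start + modesize
    # Same two comparisons as in the patch.
    if access_start < bitregion_start or access_end > bitregion_end:
        return False
    return True
```

With a 32-bit field mode, bit 40 sits in the unit starting at bit 32, so a region of 32..64 is accepted while a region of 32..48 is rejected and the new fallback path would be taken.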

I've tested this patch (in conjunction with my already-approved but 
not-yet-applied patch) on mainline for arm-none-eabi, x86_64-linux-gnu, 
and mips-linux-gnu.  I also backported the entire series to GCC 4.8 and 
tested there on arm-none-eabi and x86_64-linux-gnu.  OK to apply?


-Sandra
2013-10-30  Bernd Edlinger 
	Sandra Loosemore  

	PR middle-end/23623
	PR middle-end/48784
	PR middle-end/56341
	PR middle-end/56997

	gcc/
	* expmed.c (strict_volatile_bitfield_p): Add bitregion_start
	and bitregion_end parameters.  Test for compliance with C++
	memory model.
	(store_bit_field): Adjust call to strict_volatile_bitfield_p.
	Add fallback logic for cases where -fstrict-volatile-bitfields
	is supposed to apply, but cannot.
	(extract_bit_field): Likewise.
	* doc/invoke.texi (Code Gen Options): Better document fallback for
	-fstrict-volatile-bitfields.

	gcc/testsuite/
	* gcc.dg/pr23623.c: Update to test interaction with C++
	memory model.
diff -u gcc/doc/invoke.texi gcc/doc/invoke.texi
--- gcc/doc/invoke.texi	(working copy)
+++ gcc/doc/invoke.texi	(working copy)
@@ -21659,7 +21659,8 @@
 In some cases, such as when the @code{packed} attribute is applied to a 
 structure field, it may not be possible to access the field with a single
 read or write that is correctly aligned for the target machine.  In this
-case GCC falls back to generating multiple accesses rather than code that 
+case GCC falls back to generating either multiple accesses
+or an access in a larger mode, rather than code that 
 will fault or truncate the result at run time.
 
 The default value of this option is determined by the application binary
diff -u gcc/testsuite/gcc.dg/pr23623.c gcc/testsuite/gcc.dg/pr23623.c
--- gcc/testsuite/gcc.dg/pr23623.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr23623.c	(revision 0)
@@ -8,16 +8,19 @@
 extern struct
 {
   unsigned int b : 1;
+  unsigned int : 31;
 } bf1;
 
 extern volatile struct
 {
   unsigned int b : 1;
+  unsigned int : 31;
 } bf2;
 
 extern struct
 {
   volatile unsigned int b : 1;
+  volatile unsigned int : 31;
 } bf3;
 
 void writeb(void)
diff -u gcc/expmed.c gcc/expmed.c
--- gcc/expmed.c	(working copy)
+++ gcc/expmed.c	(working copy)
@@ -416,12 +416,17 @@
 }
 
 /* Return true if -fstrict-volatile-bitfields applies an access of OP0
-   containing BITSIZE bits starting at BITNUM, with field mode FIELDMODE.  */
+   containing BITSIZE bits starting at BITNUM, with field mode FIELDMODE.
+   Return false if the access would touch memory outside the range
+   BITREGION_START to BITREGION_END for conformance to the C++ memory
+   model.  */
 
 static bool
 strict_volatile_bitfield_p (rtx op0, unsigned HOST_WIDE_INT bitsize,
 			unsigned HOST_WIDE_INT bitnum,
-			enum machine_mode fieldmode)
+			enum machine_mode fieldmode,
+			unsigned HOST_WIDE_INT bitregion_start,
+			unsigned HOST_WIDE_INT bitregion_end)
 {
   unsigned HOST_WIDE_INT modesize = GET_MODE_BITSIZE (fieldmode);
 
@@ -448,6 +453,12 @@
 	  && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
 return false;
 
+  /* Check for cases where the C++ memory model applies.  */
+  if (bitregion_end != 0
+  && (bitnum - bitnum % modesize < bitregion_start
+	  || bitnum - bitnum % modesize + modesize > bitregion_end))
+return false;
+
   return true;
 }
 
@@ -904,7 +915,8 @@
 		 rtx value)
 {
   /* Handle -fstrict-volatile-bitfields in the cases where it applies.  */
-  if (strict_volatile_bitfield_p (str_rtx, bitsize, bitnum, fieldmode))
+  if (strict_volatile_bitfield_p (str_rtx, bitsize, bitnum, fieldmode,
+  bitregion_start, bitregion_end))
 {
 
   /* Storing any naturally aligned field can be done with a simple
@@ -923,6 +935,14 @@
 	store_fixed_bit_field (str_rtx, bitsize, bitnum, 0, 0, value);
   return;
 }
+  else if (MEM_P (str_rtx)
+	   && MEM_VOLATILE_P (str_rtx)
+	   && flag_strict_volatile_bitfields > 0)
+/* This is a case where -fstrict-volatile-bitfields doesn't apply
+   because we can't do a single access in the declared mode of the field.
+   Since the incoming STR_RTX has already been adjusted to that mode,
+   fall back to word mode for sub

Re: [PATCH] Introducing SAD (Sum of Absolute Differences) operation to GCC vectorizer.

2013-10-30 Thread Cong Hou
On Tue, Oct 29, 2013 at 4:49 PM, Ramana Radhakrishnan
 wrote:
> Cong,
>
> Please don't do the following.
>
> >>+++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-sad.c
> @@ -0,0 +1,54 @@
> +/* { dg-require-effective-target sse2 { target { i?86-*-* x86_64-*-* } } } */
>
> you are adding a test to gcc.dg/vect - It's a common directory
> containing tests that need to run on multiple architectures and such
> tests should be keyed by the feature they enable which can be turned
> on for ports that have such an instruction.
>
> The correct way of doing this is to key this on the feature something
> like dg-require-effective-target vect_sad_char . And define the
> equivalent routine in testsuite/lib/target-supports.exp and enable it
> for sse2 for the x86 port. If in doubt look at
> check_effective_target_vect_int and a whole family of such functions
> in testsuite/lib/target-supports.exp
>
> This makes life easy for other port maintainers who want to turn on
> this support. And for bonus points please update the testcase writing
> wiki page with this information if it isn't already there.
>

OK, I will likely move the test case to gcc.target/i386, since currently
only SSE2 provides a SAD instruction. But your suggestion also helps!


> You are also missing documentation updates for SAD_EXPR, md.texi for
> the new standard pattern name. Shouldn't it be called sad<mode>4
> really?
>


I will add the documentation for the new operation SAD_EXPR.

I use sad<mode>, simply following udot_prod<mode>, as those two
operations are quite similar:

 OPTAB_D (udot_prod_optab, "udot_prod$I$a")


thanks,
Cong


>
> regards
> Ramana
>
>
>
>
>
> On Tue, Oct 29, 2013 at 10:23 PM, Cong Hou  wrote:
>> Hi
>>
>> SAD (Sum of Absolute Differences) is a common and important algorithm
>> in image processing and other areas. SSE2 even introduced a new
>> instruction PSADBW for it. A SAD loop can be greatly accelerated by
>> this instruction after being vectorized. This patch introduced a new
>> operation SAD_EXPR and a SAD pattern recognizer in vectorizer.
>>
>> The pattern of SAD is shown below:
>>
>>  unsigned type x_t, y_t;
>>  signed TYPE1 diff, abs_diff;
>>  TYPE2 sum = init;
>>loop:
>>  sum_0 = phi <init, sum_1>
>>  S1  x_t = ...
>>  S2  y_t = ...
>>  S3  x_T = (TYPE1) x_t;
>>  S4  y_T = (TYPE1) y_t;
>>  S5  diff = x_T - y_T;
>>  S6  abs_diff = ABS_EXPR <diff>;
>>  [S7  abs_diff = (TYPE2) abs_diff;  #optional]
>>  S8  sum_1 = abs_diff + sum_0;
>>
>>where 'TYPE1' is at least double the size of type 'type', and 'TYPE2' is
>>the same size of 'TYPE1' or bigger. This is a special case of a reduction
>>computation.
>>
>> For SSE2, type is char, and TYPE1 and TYPE2 are int.
>>
>>
>> In order to express this new operation, a new expression SAD_EXPR is
>> introduced in tree.def, and the corresponding entry in optabs is
>> added. The patch also added the "define_expand" for SSE2 and AVX2
>> platforms for i386.
>>
>> The patch is pasted below and also attached as a text file (in which
>> you can see tabs). Bootstrap and make check got passed on x86. Please
>> give me your comments.
>>
>>
>>
>> thanks,
>> Cong
>>
>>
>>
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index 8a38316..d528307 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,23 @@
>> +2013-10-29  Cong Hou  
>> +
>> + * tree-vect-patterns.c (vect_recog_sad_pattern): New function for SAD
>> + pattern recognition.
>> + (type_conversion_p): PROMOTION is true if it's a type promotion
>> + conversion, and false otherwise.  Return true if the given expression
>> + is a type conversion one.
>> + * tree-vectorizer.h: Adjust the number of patterns.
>> + * tree.def: Add SAD_EXPR.
>> + * optabs.def: Add sad_optab.
>> + * cfgexpand.c (expand_debug_expr): Add SAD_EXPR case.
>> + * expr.c (expand_expr_real_2): Likewise.
>> + * gimple-pretty-print.c (dump_ternary_rhs): Likewise.
>> + * gimple.c (get_gimple_rhs_num_ops): Likewise.
>> + * optabs.c (optab_for_tree_code): Likewise.
>> + * tree-cfg.c (estimate_operator_cost): Likewise.
>> + * tree-ssa-operands.c (get_expr_operands): Likewise.
>> + * tree-vect-loop.c (get_initial_def_for_reduction): Likewise.
>> + * config/i386/sse.md: Add SSE2 and AVX2 expand for SAD.
>> +
>>  2013-10-14  David Malcolm  
>>
>>   * dumpfile.h (gcc::dump_manager): New class, to hold state
>> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
>> index 7ed29f5..9ec761a 100644
>> --- a/gcc/cfgexpand.c
>> +++ b/gcc/cfgexpand.c
>> @@ -2730,6 +2730,7 @@ expand_debug_expr (tree exp)
>>   {
>>   case COND_EXPR:
>>   case DOT_PROD_EXPR:
>> + case SAD_EXPR:
>>   case WIDEN_MULT_PLUS_EXPR:
>>   case WIDEN_MULT_MINUS_EXPR:
>>   case FMA_EXPR:
>> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
>> index c3f6c94..ca1ab70 100644
>> --- a/gcc/config/i386/sse.md
>> +++ b/gcc/config/i386/sse.md
>> @@ -6052,6 +6052,40 @@
>>DONE;
>>  })
>>
>> +(define_expand "sadv16qi"
>> +  [(match_operand:V4SI 0 "register_operand")
>> +   (match_operand:V16QI

[rx] fix no-argument builtins

2013-10-30 Thread DJ Delorie

Apparently, there's a subtle difference between "function that takes
no argument" and "function that takes void" :-P

Committed.

* config/rx/rx.c (ADD_RX_BUILTIN0): New macro, used for builtins
that take no arguments.

Index: config/rx/rx.c
===
--- config/rx/rx.c  (revision 204234)
+++ config/rx/rx.c  (working copy)
@@ -2270,12 +2270,20 @@ enum rx_builtin
 
 static GTY(()) tree rx_builtins[(int) RX_BUILTIN_max];
 
 static void
 rx_init_builtins (void)
 {
+#define ADD_RX_BUILTIN0(UC_NAME, LC_NAME, RET_TYPE)\
+   rx_builtins[RX_BUILTIN_##UC_NAME] = \
+   add_builtin_function ("__builtin_rx_" LC_NAME,  \
+   build_function_type_list (RET_TYPE##_type_node, \
+ NULL_TREE),   \
+   RX_BUILTIN_##UC_NAME,   \
+   BUILT_IN_MD, NULL, NULL_TREE)
+
 #define ADD_RX_BUILTIN1(UC_NAME, LC_NAME, RET_TYPE, ARG_TYPE)  \
rx_builtins[RX_BUILTIN_##UC_NAME] = \
add_builtin_function ("__builtin_rx_" LC_NAME,  \
build_function_type_list (RET_TYPE##_type_node, \
  ARG_TYPE##_type_node, \
  NULL_TREE),   \
@@ -2300,32 +2308,32 @@ rx_init_builtins (void)
  ARG_TYPE2##_type_node,\
  ARG_TYPE3##_type_node,\
  NULL_TREE),   \
RX_BUILTIN_##UC_NAME,   \
BUILT_IN_MD, NULL, NULL_TREE)
 
-  ADD_RX_BUILTIN1 (BRK, "brk", void,  void);
+  ADD_RX_BUILTIN0 (BRK, "brk", void);
   ADD_RX_BUILTIN1 (CLRPSW,  "clrpsw",  void,  integer);
   ADD_RX_BUILTIN1 (SETPSW,  "setpsw",  void,  integer);
   ADD_RX_BUILTIN1 (INT, "int", void,  integer);
   ADD_RX_BUILTIN2 (MACHI,   "machi",   void,  intSI, intSI);
   ADD_RX_BUILTIN2 (MACLO,   "maclo",   void,  intSI, intSI);
   ADD_RX_BUILTIN2 (MULHI,   "mulhi",   void,  intSI, intSI);
   ADD_RX_BUILTIN2 (MULLO,   "mullo",   void,  intSI, intSI);
-  ADD_RX_BUILTIN1 (MVFACHI, "mvfachi", intSI, void);
-  ADD_RX_BUILTIN1 (MVFACMI, "mvfacmi", intSI, void);
+  ADD_RX_BUILTIN0 (MVFACHI, "mvfachi", intSI);
+  ADD_RX_BUILTIN0 (MVFACMI, "mvfacmi", intSI);
   ADD_RX_BUILTIN1 (MVTACHI, "mvtachi", void,  intSI);
   ADD_RX_BUILTIN1 (MVTACLO, "mvtaclo", void,  intSI);
-  ADD_RX_BUILTIN1 (RMPA,"rmpa",void,  void);
+  ADD_RX_BUILTIN0 (RMPA,"rmpa",void);
   ADD_RX_BUILTIN1 (MVFC,"mvfc",intSI, integer);
   ADD_RX_BUILTIN2 (MVTC,"mvtc",void,  integer, integer);
   ADD_RX_BUILTIN1 (MVTIPL,  "mvtipl",  void,  integer);
   ADD_RX_BUILTIN1 (RACW,"racw",void,  integer);
   ADD_RX_BUILTIN1 (ROUND,   "round",   intSI, float);
   ADD_RX_BUILTIN1 (REVW,"revw",intSI, intSI);
-  ADD_RX_BUILTIN1 (WAIT,"wait",void,  void);
+  ADD_RX_BUILTIN0 (WAIT,"wait",void);
 }
 
 /* Return the RX builtin for CODE.  */
 
 static tree
 rx_builtin_decl (unsigned code, bool initialize_p ATTRIBUTE_UNUSED)


Re: [PATCH] Introducing SAD (Sum of Absolute Differences) operation to GCC vectorizer.

2013-10-30 Thread Cong Hou
On Wed, Oct 30, 2013 at 4:27 AM, Richard Biener  wrote:
> On Tue, 29 Oct 2013, Cong Hou wrote:
>
>> Hi
>>
>> SAD (Sum of Absolute Differences) is a common and important algorithm
>> in image processing and other areas. SSE2 even introduced a new
>> instruction PSADBW for it. A SAD loop can be greatly accelerated by
>> this instruction after being vectorized. This patch introduced a new
>> operation SAD_EXPR and a SAD pattern recognizer in vectorizer.
>>
>> The pattern of SAD is shown below:
>>
>>  unsigned type x_t, y_t;
>>  signed TYPE1 diff, abs_diff;
>>  TYPE2 sum = init;
>>loop:
>>  sum_0 = phi <init, sum_1>
>>  S1  x_t = ...
>>  S2  y_t = ...
>>  S3  x_T = (TYPE1) x_t;
>>  S4  y_T = (TYPE1) y_t;
>>  S5  diff = x_T - y_T;
>>  S6  abs_diff = ABS_EXPR <diff>;
>>  [S7  abs_diff = (TYPE2) abs_diff;  #optional]
>>  S8  sum_1 = abs_diff + sum_0;
>>
>>where 'TYPE1' is at least double the size of type 'type', and 'TYPE2' is
>>the same size of 'TYPE1' or bigger. This is a special case of a reduction
>>computation.
>>
>> For SSE2, type is char, and TYPE1 and TYPE2 are int.
>>
>>
>> In order to express this new operation, a new expression SAD_EXPR is
>> introduced in tree.def, and the corresponding entry in optabs is
>> added. The patch also added the "define_expand" for SSE2 and AVX2
>> platforms for i386.
>>
>> The patch is pasted below and also attached as a text file (in which
>> you can see tabs). Bootstrap and make check got passed on x86. Please
>> give me your comments.
>
> Apart from the testcase comment made earlier
>
> +++ b/gcc/tree-cfg.c
> @@ -3797,6 +3797,7 @@ verify_gimple_assign_ternary (gimple stmt)
>return false;
>
>  case DOT_PROD_EXPR:
> +case SAD_EXPR:
>  case REALIGN_LOAD_EXPR:
>/* FIXME.  */
>return false;
>
> please add proper verification of the operand types.

OK.

>
> +/* Widening sad (sum of absolute differences).
> +   The first two arguments are of type t1 which should be unsigned
> integer.
> +   The third argument and the result are of type t2, such that t2 is at
> least
> +   twice the size of t1. SAD_EXPR(arg1,arg2,arg3) is equivalent to:
> +   tmp1 = WIDEN_MINUS_EXPR (arg1, arg2);
> +   tmp2 = ABS_EXPR (tmp1);
> +   arg3 = PLUS_EXPR (tmp2, arg3);   */
> +DEFTREECODE (SAD_EXPR, "sad_expr", tcc_expression, 3)
>
> WIDEN_MINUS_EXPR doesn't exist so you have to explain on its
> operation (it returns a signed wide difference?).  Why should
> the first two arguments be unsigned?  I cannot see a good reason
> to require that (other than that maybe the x86 target only has
> support for widened unsigned difference?).  So if you want to
> make that restriction maybe change the name to SADU_EXPR
> (sum of absolute differences of unsigned)?
>
> I suppose you tried introducing WIDEN_MINUS_EXPR instead and
> letting combine do its work, avoiding the very special optab?

I may have used the wrong representation here. I think the behavior of
"WIDEN_MINUS_EXPR" in SAD is different from the general one. SAD
usually works on unsigned integers (see
http://en.wikipedia.org/wiki/Sum_of_absolute_differences), and before
getting the difference between two unsigned integers, they are
promoted to bigger signed integers. And the result of (int)(char)(1) -
(int)(char)(-1) is different from (int)(unsigned char)(1) -
(int)(unsigned char)(-1). So we cannot implement SAD using
WIDEN_MINUS_EXPR.
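
To make the arithmetic concrete, here is a small Python sketch of the two promotions. The helper names are invented for the example, and it assumes plain char is a signed 8-bit type (in C that is implementation-defined).

```python
def as_signed8(value):
    """Reinterpret the low 8 bits of VALUE as a signed char (assumed signed)."""
    value &= 0xFF
    return value - 256 if value >= 128 else value


def as_unsigned8(value):
    """Reinterpret the low 8 bits of VALUE as an unsigned char."""
    return value & 0xFF


# (int)(char)(1) - (int)(char)(-1): both operands are sign-extended.
signed_diff = as_signed8(1) - as_signed8(-1)

# (int)(unsigned char)(1) - (int)(unsigned char)(-1): -1 becomes 255.
unsigned_diff = as_unsigned8(1) - as_unsigned8(-1)
```

Here signed_diff is 2 while unsigned_diff is -254, so the absolute differences (2 versus 254) disagree, which is why the signedness of the narrow operands has to be part of the operation's definition.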

Also, the SSE2 instruction PSADBW requires the operands to be
unsigned 8-bit integers.
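
For reference, a scalar Python model of what the 128-bit PSADBW computes: one sum of absolute differences per 8-byte half, over unsigned bytes. The function name and the list-of-ints representation are choices made for this sketch only.

```python
def psadbw(a, b):
    """Model of 128-bit PSADBW on two lists of 16 unsigned byte values.

    Returns two sums, one for bytes 0..7 and one for bytes 8..15.
    """
    if len(a) != 16 or len(b) != 16:
        raise ValueError("expected two 16-element vectors")
    sums = []
    for lane in range(2):
        lo = lane * 8
        # Sum of absolute differences over the 8 bytes of this half.
        sums.append(sum(abs(x - y) for x, y in zip(a[lo:lo + 8], b[lo:lo + 8])))
    return sums
```

The two partial sums match the V2DImode temporary used in the sadv16qi expander; adding them together gives the SAD of the full 16-byte vectors.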

I will remove the improper description as you pointed out.



thanks,
Cong


>
> Thanks,
> Richard.
>
>>
>>
>> thanks,
>> Cong
>>
>>
>>
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index 8a38316..d528307 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,23 @@
>> +2013-10-29  Cong Hou  
>> +
>> + * tree-vect-patterns.c (vect_recog_sad_pattern): New function for SAD
>> + pattern recognition.
>> + (type_conversion_p): PROMOTION is true if it's a type promotion
>> + conversion, and false otherwise.  Return true if the given expression
>> + is a type conversion one.
>> + * tree-vectorizer.h: Adjust the number of patterns.
>> + * tree.def: Add SAD_EXPR.
>> + * optabs.def: Add sad_optab.
>> + * cfgexpand.c (expand_debug_expr): Add SAD_EXPR case.
>> + * expr.c (expand_expr_real_2): Likewise.
>> + * gimple-pretty-print.c (dump_ternary_rhs): Likewise.
>> + * gimple.c (get_gimple_rhs_num_ops): Likewise.
>> + * optabs.c (optab_for_tree_code): Likewise.
>> + * tree-cfg.c (estimate_operator_cost): Likewise.
>> + * tree-ssa-operands.c (get_expr_operands): Likewise.
>> + * tree-vect-loop.c (get_initial_def_for_reduction): Likewise.
>> + * config/i386/sse.md: Add SSE2 and AVX2 expand for SAD.
>> +
>>  2013-10-14  David Malcolm  
>>
>>   * dumpfile.h (gcc::dump_manager): New class, to hold state
>> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
>> index 7ed29f5..9ec761a 100644
>> --- a/gcc/cfgexpand.c
>> +++ b/gcc/cfgexpand.c
>

[patch] make the libstdc++ pretty printers compatible with both Python 2 and Python3

2013-10-30 Thread Matthias Klose
Starting with gdb 7.6, gdb can be linked with both Python 2.x and Python 3.x.
Therefore the pretty printers should be compatible with both Python versions.
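
A minimal, standalone sketch of the compatibility idiom the patch relies on (the class name and its contents are hypothetical, not taken from printers.py): define __next__ for Python 3 and alias next to it for Python 2, and rebind map/zip to their lazy variants on Python 2 only.

```python
import sys

if sys.version_info[0] < 3:
    # Python 2 only: make map/zip lazy, as the Python 3 builtins already are.
    import itertools
    map = itertools.imap
    zip = itertools.izip


class ChildIterator(object):
    """Hypothetical iterator in the style of the printers' child iterators."""

    def __init__(self, items):
        self.items = list(items)
        self.count = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.count == len(self.items):
            raise StopIteration
        count = self.count
        self.count = self.count + 1
        return ('[%d]' % count, self.items[count])

    # Python 2 calls next(); Python 3 calls __next__().  Alias one to the other.
    next = __next__
```

Calling the builtin next(it) rather than it.next() works on both versions, which is why the patch also rewrites self.rbiter.next() as next(self.rbiter).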

This patch should be backported to 4.7 and 4.8 as well.

Ok for the trunk?

  Matthias


* python/libstdcxx/v6/printers.py: Make pretty printers compatible with
both Python 2.x and 3.x.

Index: python/libstdcxx/v6/printers.py
===
--- python/libstdcxx/v6/printers.py (revision 204231)
+++ python/libstdcxx/v6/printers.py (working copy)
@@ -15,8 +15,14 @@
 # You should have received a copy of the GNU General Public License
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
 
+# Keep this module compatible with both Python 2 and Python 3
+
 import gdb
-import itertools
+import sys
+if sys.version_info[0] < 3:
+    import itertools
+    map = itertools.imap
+    zip = itertools.izip
 import re
 
 # Try to use the new-style pretty-printing if available.
@@ -51,7 +57,7 @@
 # anything fancier here.
 field = typ.fields()[0]
 if not field.is_base_class:
-raise ValueError, "Cannot find type %s::%s" % (str(orig), name)
+raise ValueError("Cannot find type %s::%s" % (str(orig), name))
 typ = field.type
 
 class SharedPointerPrinter:
@@ -97,7 +103,7 @@
 def __iter__(self):
 return self
 
-def next(self):
+def __next__(self):
 if self.base == self.head:
 raise StopIteration
 elt = self.base.cast(self.nodetype).dereference()
@@ -105,6 +111,7 @@
 count = self.count
 self.count = self.count + 1
 return ('[%d]' % count, elt['_M_data'])
+next = __next__
 
 def __init__(self, typename, val):
 self.typename = typename
@@ -144,7 +151,7 @@
 def __iter__(self):
 return self
 
-def next(self):
+def __next__(self):
 if self.base == 0:
 raise StopIteration
 elt = self.base.cast(self.nodetype).dereference()
@@ -152,6 +159,7 @@
 count = self.count
 self.count = self.count + 1
 return ('[%d]' % count, elt['_M_data'])
+next = __next__
 
 def __init__(self, typename, val):
 self.val = val
@@ -198,7 +206,7 @@
 def __iter__(self):
 return self
 
-def next(self):
+def __next__(self):
 count = self.count
 self.count = self.count + 1
 if self.bitvec:
@@ -220,6 +228,7 @@
 elt = self.item.dereference()
 self.item = self.item + 1
 return ('[%d]' % count, elt)
+next = __next__
 
 def __init__(self, typename, val):
 self.typename = typename
@@ -276,20 +285,20 @@
 # Set the actual head to the first pair.
 self.head  = self.head.cast (nodes[0].type)
 elif len (nodes) != 0:
-raise ValueError, "Top of tuple tree does not consist of a single node."
+raise ValueError("Top of tuple tree does not consist of a single node.")
 self.count = 0
 
 def __iter__ (self):
 return self
 
-def next (self):
+def __next__ (self):
 nodes = self.head.type.fields ()
 # Check for further recursions in the inheritance tree.
 if len (nodes) == 0:
 raise StopIteration
 # Check that this iteration has an expected structure.
 if len (nodes) != 2:
-raise ValueError, "Cannot parse more than 2 nodes in a tuple tree."
+raise ValueError("Cannot parse more than 2 nodes in a tuple tree.")
 
 # - Left node is the next recursion parent.
 # - Right node is the actual class contained in the tuple.
@@ -309,6 +318,7 @@
 return ('[%d]' % self.count, impl)
 else:
 return ('[%d]' % self.count, impl['_M_head_impl'])
+next = __next__
 
 def __init__ (self, typename, val):
 self.typename = typename
@@ -353,7 +363,7 @@
 def __len__(self):
 return int (self.size)
 
-def next(self):
+def __next__(self):
 if self.count == self.size:
 raise StopIteration
 result = self.node
@@ -374,6 +384,7 @@
 node = parent
 self.node = node
 return result
+next = __next__
 
 # This is a pretty printer for std::_Rb_tree_iterator (which is
 # std::map::iterator), and has nothing to do with the RbtreeIterator
@@ -414,9 +425,9 @@
 def __iter__(self):
 return self
 
-def next(self):
+def __next__(self):
 if self.count % 2 == 0:
-n = self.rbiter.next()
+n = next(self.rbiter)
 n = n.cast(self.type).dereference

Go testsuite patch committed: Remove empty directory

2013-10-30 Thread Ian Lance Taylor
I removed the empty directory
gcc/testsuite/go.test/test/fixedbugs/bug479.dir.

Ian


Re: [PATCH, rs6000] Correct handling of multiply high-part for little endian

2013-10-30 Thread David Edelsohn
On Wed, Oct 30, 2013 at 6:55 PM, Bill Schmidt
 wrote:
> Hi,
>
> When working around the peculiar little-endian semantics of the vperm
> instruction, our usual fix is to complement the permute control vector
> and swap the order of the two vector input operands, so that we get a
> double-wide vector in the proper order.  We don't want to swap the
> operands when we are expanding a mult_highpart operation, however, as
> the two input operands are not to be interpreted as a double-wide
> vector.  Instead they represent odd and even elements, and swapping the
> operands gets the odd and even elements reversed in the final result.
>
> The permute for this case is generated by target-neutral code in
> optabs.c: expand_mult_highpart ().  We obviously can't change that code
> directly.  However, we can redirect the logic from the "case 2" method
> to target-specific code by implementing expansions for the
> umul<mode>3_highpart and smul<mode>3_highpart operations.  I've done
> this, with the expansions acting exactly as expand_mult_highpart does
> today, with the exception that it swaps the input operands to the call
> to expand_vec_perm when we are generating little-endian code.  We will
> later swap them back to their original position in the code in rs6000.c:
> altivec_expand_vec_perm_const_le ().
>
> The change has no intended effect when generating big-endian code.
>
> Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no new
> regressions.  This fixes the gcc.dg/vect/pr51581-4.c test failure for
> little endian.  Ok for trunk?
>
> Thanks,
> Bill
>
>
> 2013-10-30  Bill Schmidt  
>
> * config/rs6000/rs6000-protos.h (altivec_expand_mul_highpart): New
> prototype.
> * config/rs6000/rs6000.c (altivec_expand_mul_highpart): New.
> * config/rs6000/altivec.md (umul3_highpart): New.
> (smul_3_highpart): New.

I really do not like duplicating this code.  I think that you need to
explore with the community the possibility of including a hook in the
general code to handle the strangeness of PPC LE vector semantics.

This is asking for problems if the generic code is updated / modified / fixed.

- David


Re: [RFC/CFT] auto-wipe dump files [was: Re: [committed] Fix up bb-slp-31.c testcase]

2013-10-30 Thread Mike Stump
On Oct 30, 2013, at 2:41 AM, Bernhard Reutner-Fischer wrote:
>> I've noticed that this testcase doesn't clean up after itself.

> This was nagging me last weekend.. ;)
> What about automating this?

So, the idea sounds very nice.

One thing that I worry about is the testing speed hit for people (test cases) 
that don't need cleanups.  I don't know the speed hit of the code, so, don't 
know how necessary it is to try and go faster.

I was thinking the presence of a scan-tree-dump, would set a bit that said, do 
a scan-tree-dump style cleanup.

The common code then does, if cleanups needed, do cleanups

The idea, most test cases don't do this, and don't need the big cleanup routine 
to fire.  A scan-tree-dump would setup a cleanup tree dumps flags, and then in 
the big cleanup routine, you have:

do cleanups()
{
if (need tree cleanups) do tree cleanups();
if (need rtl cleanups) do rtl cleanups();
}

this way, we avoid randomly doing cleanups for things we don't need them for, 
and avoid even asking if we need any cleanups, as we can have a global flag 
that says if we need any extra, special cleanups.
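
The flag idea can be sketched as follows. The testsuite itself is DejaGnu/Tcl, so this Python version is only an illustration of the control flow; the class, method names and log strings are all hypothetical.

```python
class CleanupTracker(object):
    """Hypothetical model: scan directives record which cleanups are needed."""

    def __init__(self):
        self.needed = set()   # kinds of dump files a test asked to scan
        self.log = []         # stands in for actually deleting files

    def note_scan(self, kind):
        # Called by a scan-tree-dump / scan-rtl-dump style directive.
        self.needed.add(kind)

    def do_cleanups(self):
        if not self.needed:
            # Fast path: most tests never scan dumps, so do nothing at all.
            return
        if 'tree' in self.needed:
            self.log.append('tree dumps removed')
        if 'rtl' in self.needed:
            self.log.append('rtl dumps removed')
        self.needed.clear()
```

A test with no scan directives pays only for the single emptiness check; a test that called note_scan('tree') gets exactly the tree-dump cleanup and nothing else.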

So, all that would be bad to do, if the speed hit is small…  Can you collect
with/without numbers and post them?  If you can, include user, sys and elapsed.
You can run a subset of one testsuite, say, dg.exp, as representative.

Re: patch to improve register preferencing in IRA and to *remove regmove* pass

2013-10-30 Thread David Edelsohn
Where was this patch bootstrapped? This appears to have broken
bootstrap on PowerPC (Linux and AIX)

/nasfarm/edelsohn/src/src/libgcc/libgcov.c: In function 'gcov_exit':
/nasfarm/edelsohn/src/src/libgcc/libgcov.c:827:1: internal compiler error: in update_costs_from_allocno, at ira-color.c:1334

- David


Re: [PATCH] arm: emit neon alignment hints for 32/16-bit loads/stores

2013-10-30 Thread Ramana Radhakrishnan
Mans,

Can you please follow the guidelines as in
http://gcc.gnu.org/contribute.html ? Notably what's missing in your
submission here is

1. A ChangeLog entry - well, I'll create one for you (see below).
2. A note on how this was tested and what impact this has on any
testcase that you have.
3. A covering note describing the change.

This is a small enough patch that we should be able to take this under
the 10 line rule, however if you intend to contribute to GCC regularly
we need to check that your contributions are covered by a copyright
assignment with the FSF. If you aren't covered by this please start
this process soon.

regards
Ramana


  Mans Rullgard  

PR target/58847
* config/arm/arm.c (arm_print_operand): Handle alignment for 2
and 4 byte sizes.
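
For readers skimming the patch, the selection logic after the change can be sketched in Python as below. The function name is invented, and the condition guarding the 128-bit case is NOT visible in the quoted hunk, so that first branch is an assumption made only to keep the sketch complete.

```python
def neon_align_bits(memsize, align):
    """Hypothetical model of the alignment-hint choice in arm_print_operand.

    MEMSIZE and ALIGN are in bytes; the result is the hint in bits
    (0 means no hint is printed).
    """
    # ASSUMPTION: the real condition for this branch is outside the hunk.
    if memsize >= 16 and align % 16 == 0:
        return 128
    if memsize >= 8 and align % 8 == 0:
        return 64
    # The two cases added by the patch.
    if memsize == 4 and align % 4 == 0:
        return 32
    if memsize == 2 and align % 2 == 0:
        return 16
    return 0
```

So a 4-byte access known to be 4-byte aligned now gets a 32-bit hint, whereas the same access with only 2-byte alignment still gets none.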


On Wed, Oct 30, 2013 at 8:31 PM, Mans Rullgard  wrote:
> ---
>  gcc/config/arm/arm.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 8c9897e..8183a8e 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -21247,6 +21247,10 @@ arm_print_operand (FILE *stream, rtx x, int code)
>   align_bits = 128;
> else if (memsize >= 8 && (align % 8) == 0)
>   align_bits = 64;
> +   else if (memsize == 4 && (align % 4) == 0)
> + align_bits = 32;
> +   else if (memsize == 2 && (align % 2) == 0)
> + align_bits = 16;
> else
>   align_bits = 0;
>
> --
> 1.8.4
>


Re: [PATCH, committed] libcilkrts - Add check for availability of alloca.h (Bug Bootstrap/58918)

2013-10-30 Thread Jakub Jelinek
On Wed, Oct 30, 2013 at 04:08:19PM -0700, Mike Stump wrote:
> On Oct 30, 2013, at 3:49 PM, "Iyer, Balaji V"  wrote:
> > Yes, the library is compiled by GCC, but it is also used by LLVM and ICC. 
> > So, we would like to keep the same code base for all.
> 
> Ah, dual use…  that would explain it.  Certainly in that case the config hair 
> is desirable, I mean, serves a purpose.  :-)  Thanks.

Though, it would be better to put it into some header instead of
duplicating it in all the places that need alloca.

Jakub


Re: [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping

2013-10-30 Thread Jakub Jelinek
On Fri, Oct 25, 2013 at 05:19:06PM +0200, Martin Jambor wrote:
> 2013-10-23  Martin Jambor  
> 
>   PR rtl-optimization/10474
>   * ira.c (find_moveable_pseudos): Do not calculate dominance info
>   nor df analysis.
>   (interesting_dest_for_shprep): New function.
>   (split_live_ranges_for_shrink_wrap): Likewise.
>   (ira): Calculate dominance info and df analysis. Call
>   split_live_ranges_for_shrink_wrap.
> 
> testsuite/
>   * gcc.dg/pr10474.c: New testcase.
>   * gcc.dg/ira-shrinkwrap-prep-1.c: Likewise.
>   * gcc.dg/ira-shrinkwrap-prep-2.c: Likewise.

Unfortunately this patch breaks i686-linux bootstrap,
in r204204 compare passes, while in r204205 I'm getting .bad_compare
gcc/fortran/module.o differs
gcc/ipa.o differs
gcc/go/gogo.o differs
gcc/go/statements.o differs
Most likely combine.o is miscompiled, but haven't verified that yet.

The way I'm configuring this on x86_64-linux is:
mkdir ~/hbin
cat > ~/hbin/as <<\EOF2
#!/bin/sh
exec /usr/bin/as --32 "$@"
EOF2
cat > ~/hbin/g++ <<\EOF2
#!/bin/sh
exec /usr/bin/g++ -m32 "$@"
EOF2
cat > ~/hbin/gcc <<\EOF2
#!/bin/sh
exec /usr/bin/gcc -m32 "$@"
EOF2
cat > ~/hbin/ld <<\EOF2
#!/bin/sh
case "$*" in
  --version) cat <<\EOF
GNU ld version 2.20.52.0.1-10.fc17 20100131
Copyright 2012 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later
version.
This program has absolutely no warranty.
EOF
  exit 0;;
esac
exec /usr/bin/ld -m elf_i386 -L /usr/lib/ "$@"
EOF2
chmod 755 ~/hbin/*
PATH=~/hbin:$PATH i386 ../configure --enable-languages=all,obj-c++,lto,go \
  --enable-checking=yes,rtl
PATH=~/hbin:$PATH i386 make -j48

Jakub


Re: [PATCH, committed] libcilkrts - Add check for availability of alloca.h (Bug Bootstrap/58918)

2013-10-30 Thread Mike Stump
On Oct 30, 2013, at 3:49 PM, "Iyer, Balaji V"  wrote:
> Yes, the library is compiled by GCC, but it is also used by LLVM and ICC. So, 
> we would like to keep the same code base for all.

Ah, dual use…  that would explain it.  Certainly in that case the config hair 
is desirable, I mean, serves a purpose.  :-)  Thanks.

Re: patch to improve register preferencing in IRA and to *remove regmove* pass

2013-10-30 Thread Marc Glisse

On Wed, 30 Oct 2013, Steven Bosscher wrote:


Attached patch cleans up some left-overs. Nothing to test, really, as
it's just comments and NOPs. OK for trunk?



-would be better to run a full CSE/propogation pass on it through, or
-re-run regmove, but that has not yet been attempted.
+would be better to run a full CSE/propogation pass on it through, bu
+but that has not yet been attempted.


bu?
through -> though?

--
Marc Glisse


Re: Pre-Patch RFC: proposed changes to option-lookup

2013-10-30 Thread Joseph S. Myers
On Wed, 30 Oct 2013, David Malcolm wrote:

> My idea is to introduce a GCC_OPTION macro, and replace the above with:
> 
>   static bool
>   gate_vrp (void)
>   {
> return GCC_OPTION (flag_tree_vrp) != 0;
>   }

That's only slightly shorter than the full expansion using global_options; 
I'd prefer using the full expansion.

(Of course the ideal is to use explicit options pointers instead of 
global_options.  For example, if such a pointer were associated with the 
current function, it might make function-specific options handling a bit 
less fragile.)
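
The difference between the two styles can be shown with a toy Python model. Everything here (the Options class and both gate functions) is hypothetical and only mirrors the shape of the C code under discussion.

```python
class Options(object):
    """Toy stand-in for a gcc options structure."""

    def __init__(self, flag_tree_vrp=0):
        self.flag_tree_vrp = flag_tree_vrp


# Single process-wide instance, analogous to global_options.
global_options = Options(flag_tree_vrp=1)


def gate_vrp_global():
    # Full expansion through the global: the dependency is hidden in the body.
    return global_options.flag_tree_vrp != 0


def gate_vrp_explicit(opts):
    # Explicit pointer style: the caller decides which option set applies,
    # for example one associated with the current function.
    return opts.flag_tree_vrp != 0
```

With the explicit form, a per-function option set can be passed in directly instead of temporarily swapping the global.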

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH, rs6000] Correct handling of multiply high-part for little endian

2013-10-30 Thread Bill Schmidt
Hi,

When working around the peculiar little-endian semantics of the vperm
instruction, our usual fix is to complement the permute control vector
and swap the order of the two vector input operands, so that we get a
double-wide vector in the proper order.  We don't want to swap the
operands when we are expanding a mult_highpart operation, however, as
the two input operands are not to be interpreted as a double-wide
vector.  Instead they represent odd and even elements, and swapping the
operands gets the odd and even elements reversed in the final result.

The permute for this case is generated by target-neutral code in
optabs.c: expand_mult_highpart ().  We obviously can't change that code
directly.  However, we can redirect the logic from the "case 2" method
to target-specific code by implementing expansions for the
umul<mode>3_highpart and smul<mode>3_highpart operations.  I've done
this, with the expansions acting exactly as expand_mult_highpart does
today, with the exception that it swaps the input operands to the call
to expand_vec_perm when we are generating little-endian code.  We will
later swap them back to their original position in the code in rs6000.c:
altivec_expand_vec_perm_const_le ().

The change has no intended effect when generating big-endian code.
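
As an illustration of the lane selection described above, here is a scalar
sketch of the unsigned case for eight 16-bit lanes on little endian.  It models
only the intended result of the even/odd widening multiplies followed by the
permute; the vperm-specific operand swap and control-vector complement are not
modelled.  The names `narrow_vec`, `as_narrow_le` and `umul_highpart_model` are
made up for this sketch.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using narrow_vec = std::array<std::uint16_t, 8>;

// View four 32-bit products as eight 16-bit lanes in little-endian lane
// order: lane 2k holds the low half of product k, lane 2k+1 the high half.
static narrow_vec
as_narrow_le (const std::array<std::uint32_t, 4> &wide)
{
  narrow_vec out{};
  for (int k = 0; k < 4; ++k)
    {
      out[2 * k] = static_cast<std::uint16_t> (wide[k] & 0xffff);
      out[2 * k + 1] = static_cast<std::uint16_t> (wide[k] >> 16);
    }
  return out;
}

narrow_vec
umul_highpart_model (const narrow_vec &a, const narrow_vec &b)
{
  const int nunits = 8;
  std::array<std::uint32_t, 4> even{}, odd{};
  for (int k = 0; k < 4; ++k)
    {
      even[k] = std::uint32_t (a[2 * k]) * b[2 * k];          // even lanes
      odd[k] = std::uint32_t (a[2 * k + 1]) * b[2 * k + 1];   // odd lanes
    }
  const narrow_vec m1 = as_narrow_le (even);
  const narrow_vec m2 = as_narrow_le (odd);

  narrow_vec result{};
  for (int i = 0; i < nunits; ++i)
    {
      // Same selector formula as in the patch, with !BYTES_BIG_ENDIAN == 1.
      const int sel = 1 + (i & ~1) + ((i & 1) ? nunits : 0);
      assert (sel >= 0 && sel < 2 * nunits);
      // Selector values below nunits pick from m1, the rest from m2.
      result[i] = sel < nunits ? m1[sel] : m2[sel - nunits];
    }
  return result;
}
```

Each result lane i ends up holding the high 16 bits of a[i] * b[i], which is
why m1 and m2 must be treated as even/odd halves and not as one double-wide
vector.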

Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no new
regressions.  This fixes the gcc.dg/vect/pr51581-4.c test failure for
little endian.  Ok for trunk?

Thanks,
Bill


2013-10-30  Bill Schmidt  

* config/rs6000/rs6000-protos.h (altivec_expand_mul_highpart): New
prototype.
* config/rs6000/rs6000.c (altivec_expand_mul_highpart): New.
* config/rs6000/altivec.md (umul<mode>3_highpart): New.
(smul<mode>3_highpart): New.


Index: gcc/config/rs6000/rs6000-protos.h
===
--- gcc/config/rs6000/rs6000-protos.h   (revision 204192)
+++ gcc/config/rs6000/rs6000-protos.h   (working copy)
@@ -58,6 +58,7 @@ extern void rs6000_expand_vector_extract (rtx, rtx
 extern bool altivec_expand_vec_perm_const (rtx op[4]);
 extern void altivec_expand_vec_perm_le (rtx op[4]);
 extern bool rs6000_expand_vec_perm_const (rtx op[4]);
+extern bool altivec_expand_mul_highpart (rtx op[3], bool);
 extern void rs6000_expand_extract_even (rtx, rtx, rtx);
 extern void rs6000_expand_interleave (rtx, rtx, rtx, bool);
 extern void build_mask64_2_operands (rtx, rtx *);
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 204192)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -29249,6 +29249,58 @@ rs6000_do_expand_vec_perm (rtx target, rtx op0, rt
 emit_move_insn (target, x);
 }
 
+/* Expand an Altivec multiply high-part.  The logic matches the 
+   general logic in optabs.c:expand_mult_highpart, but swaps the
+   inputs for little endian.  Note that we will swap them again
+   during the permute; this is the one case where we don't want
+   the operands swapped, as if we do we get the even and odd values
+   reversed.  */
+
+bool
+altivec_expand_mul_highpart (rtx operands[3], bool uns_p)
+{
+  struct expand_operand eops[3];
+  rtx m1, m2, perm, tmp;
+  int i;
+
+  optab tab1 = uns_p ? vec_widen_umult_even_optab : vec_widen_smult_even_optab;
+  optab tab2 = uns_p ? vec_widen_umult_odd_optab : vec_widen_smult_odd_optab;
+  enum machine_mode mode = GET_MODE (operands[0]);
+  enum insn_code icode = optab_handler (tab1, mode);
+  int nunits = GET_MODE_NUNITS (mode);
+  enum machine_mode wmode = insn_data[icode].operand[0].mode;
+  rtvec v = rtvec_alloc (nunits);
+
+  create_output_operand (&eops[0], gen_reg_rtx (wmode), wmode);
+  create_input_operand (&eops[1], operands[1], mode);
+  create_input_operand (&eops[2], operands[2], mode);
+  expand_insn (icode, 3, eops);
+  m1 = gen_lowpart (mode, eops[0].value);
+
+  create_output_operand (&eops[0], gen_reg_rtx (wmode), wmode);
+  create_input_operand (&eops[1], operands[1], mode);
+  create_input_operand (&eops[2], operands[2], mode);
+  expand_insn (optab_handler (tab2, mode), 3, eops);
+  m2 = gen_lowpart (mode, eops[0].value);
+
+  for (i = 0; i < nunits; ++i)
+RTVEC_ELT (v, i) = GEN_INT (!BYTES_BIG_ENDIAN + (i & ~1)
+   + ((i & 1) ? nunits : 0));
+
+  perm = gen_rtx_CONST_VECTOR (mode, v);
+
+  if (!BYTES_BIG_ENDIAN) {
+tmp = m1;
+m1 = m2;
+m2 = tmp;
+  }
+
+  perm = expand_vec_perm (mode, m1, m2, perm, NULL_RTX);
+  emit_move_insn (operands[0], perm);
+
+  return true;
+}
+
 /* Expand an extract even operation.  */
 
 void
Index: gcc/config/rs6000/altivec.md
===
--- gcc/config/rs6000/altivec.md(revision 204192)
+++ gcc/config/rs6000/altivec.md(working copy)
@@ -1401,6 +1401,30 @@
 FAIL;
 })
 
+(define_expand "umul<mode>3_highpart"
+  [(match_operand:VIshort 0 "register_operand" "")
+   (match_operand:VIshort 1 "register_o

Re: C++14 digit separators..

2013-10-30 Thread Ed Smith-Rowland

On 10/28/2013 09:44 AM, Jason Merrill wrote:

On 10/28/2013 09:10 AM, Joseph S. Myers wrote:

On Sun, 27 Oct 2013, Ed Smith-Rowland wrote:


Here is an implementation for C++14 digit separators (single quote).


I tend to think that such features should come with a test that the
feature is not enabled for language / standard versions for which it
shouldn't be.  That is, something like

#define m(x) 0
int i = m(1'2)+(3'4);


Good idea.  Other than that, the patch looks good to me.

Jason



OK,

I had to add some more checks in libcpp.
After reading the standard a few times I figured out that the magic words are
"digit separators": there must be a digit on both sides.

I also put in the suggested check above.
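
To make the "digit on both sides" rule concrete, here is a simplified sketch
for plain decimal digit sequences.  It is not the libcpp code from the patch;
the function name `valid_digit_separators` is hypothetical, and radix
prefixes, decimal points, exponents and suffixes are deliberately left out.

```cpp
#include <cctype>
#include <string>

// In C++14 the separators do not change the value of a literal.
static_assert (1'000'000 == 1000000, "separators do not change the value");

// Returns true if every ' in a decimal digit sequence has a digit
// immediately before and after it.
bool
valid_digit_separators (const std::string &digits)
{
  if (digits.empty ())
    return false;
  for (std::string::size_type i = 0; i < digits.size (); ++i)
    {
      const unsigned char c = static_cast<unsigned char> (digits[i]);
      if (std::isdigit (c))
        continue;
      if (c != '\'')
        return false;
      // A separator may not start or end the sequence ...
      if (i == 0 || i + 1 == digits.size ())
        return false;
      // ... and must sit between two digits, which also rules out
      // adjacent separators.
      if (!std::isdigit (static_cast<unsigned char> (digits[i - 1]))
          || !std::isdigit (static_cast<unsigned char> (digits[i + 1])))
        return false;
    }
  return true;
}
```

Under this model "1'000'000" is accepted, while "1''0", "'10" and "10'" are
rejected.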

bootstrapped and fully tested on x86_64-linux.

Still OK?

Ed

Index: libcpp/include/cpplib.h
===
--- libcpp/include/cpplib.h (revision 204189)
+++ libcpp/include/cpplib.h (working copy)
@@ -437,6 +437,9 @@
   /* Nonzero for C++ 2014 Standard binary constants.  */
   unsigned char binary_constants;
 
+  /* Nonzero for C++ 2014 Standard digit separators.  */
+  unsigned char digit_separators;
+
   /* Holds the name of the target (execution) character set.  */
   const char *narrow_charset;
 
Index: libcpp/internal.h
===
--- libcpp/internal.h   (revision 204189)
+++ libcpp/internal.h   (working copy)
@@ -59,6 +59,8 @@
 || (((prevc) == 'p' || (prevc) == 'P') \
 && CPP_OPTION (pfile, extended_numbers
 
+#define DIGIT_SEP(c) ((c) == '\'' && CPP_OPTION (pfile, digit_separators))
+
 #define CPP_OPTION(PFILE, OPTION) ((PFILE)->opts.OPTION)
 #define CPP_BUFFER(PFILE) ((PFILE)->buffer)
 #define CPP_BUF_COLUMN(BUF, CUR) ((CUR) - (BUF)->line_base)
Index: libcpp/expr.c
===
--- libcpp/expr.c   (revision 204189)
+++ libcpp/expr.c   (working copy)
@@ -394,6 +394,7 @@
   unsigned int max_digit, result, radix;
   enum {NOT_FLOAT = 0, AFTER_POINT, AFTER_EXPON} float_flag;
   bool seen_digit;
+  bool seen_digit_sep;
 
   if (ud_suffix)
 *ud_suffix = NULL;
@@ -408,6 +409,7 @@
   max_digit = 0;
   radix = 10;
   seen_digit = false;
+  seen_digit_sep = false;
 
   /* First, interpret the radix.  */
   if (*str == '0')
@@ -416,16 +418,27 @@
   str++;
 
   /* Require at least one hex digit to classify it as hex.  */
-  if ((*str == 'x' || *str == 'X')
- && (str[1] == '.' || ISXDIGIT (str[1])))
+  if (*str == 'x' || *str == 'X')
{
- radix = 16;
- str++;
+ if (str[1] == '.' || ISXDIGIT (str[1]))
+   {
+ radix = 16;
+ str++;
+   }
+ else if (DIGIT_SEP (str[1]))
+   SYNTAX_ERROR_AT (virtual_location,
+"digit separator after base indicator");
}
-  else if ((*str == 'b' || *str == 'B') && (str[1] == '0' || str[1] == '1'))
+  else if (*str == 'b' || *str == 'B')
{
- radix = 2;
- str++;
+ if (str[1] == '0' || str[1] == '1')
+   {
+ radix = 2;
+ str++;
+   }
+ else if (DIGIT_SEP (str[1]))
+   SYNTAX_ERROR_AT (virtual_location,
+"digit separator after base indicator");
}
 }
 
@@ -436,13 +449,24 @@
 
   if (ISDIGIT (c) || (ISXDIGIT (c) && radix == 16))
{
+ seen_digit_sep = false;
  seen_digit = true;
  c = hex_value (c);
  if (c > max_digit)
max_digit = c;
}
+  else if (DIGIT_SEP (c))
+   {
+ if (seen_digit_sep)
+   SYNTAX_ERROR_AT (virtual_location, "adjacent digit separators");
+ seen_digit_sep = true;
+   }
   else if (c == '.')
{
+ if (seen_digit_sep || DIGIT_SEP (*str))
+   SYNTAX_ERROR_AT (virtual_location,
+"digit separator adjacent to decimal point");
+ seen_digit_sep = false;
  if (float_flag == NOT_FLOAT)
float_flag = AFTER_POINT;
  else
@@ -452,6 +476,9 @@
   else if ((radix <= 10 && (c == 'e' || c == 'E'))
   || (radix == 16 && (c == 'p' || c == 'P')))
{
+ if (seen_digit_sep || DIGIT_SEP (*str))
+   SYNTAX_ERROR_AT (virtual_location,
+"digit separator adjacent to exponent");
  float_flag = AFTER_EXPON;
  break;
}
@@ -463,6 +490,10 @@
}
 }
 
+  if (seen_digit_sep && float_flag != AFTER_EXPON)
+SYNTAX_ERROR_AT (virtual_location,
+"digit separator outside digit sequence");
+
   /* The suffix may be for decimal fixed-point constants without exponent.  */
   if (radix != 16 && float_flag == NOT_FLOAT)
 {
@@ -520,16 +551,28 @@
 
  /* Exponent is decimal, even if string is a hex float.  */
  if (!ISDIGI

Re: patch to improve register preferencing in IRA and to *remove regmove* pass

2013-10-30 Thread Steven Bosscher
On Tue, Oct 29, 2013 at 4:12 PM, Vladimir Makarov wrote:
>   Tomorrow I'd like commit the following patch.
>
>   The patch removes regmove pass.

I can barely hold my tears... of joy :-)

Attached patch cleans up some left-overs. Nothing to test, really, as
it's just comments and NOPs. OK for trunk?

Ciao!
Steven
* gcse.c (pre_delete): Remove references to regmove from comments.
* recog.c (validate_replace_rtx_1): Likewise.
* config/rl78/rl78.c: Likewise.
* config/v850/v850.h: Likewise, and remove unused ENABLE_REGMOVE_PASS.
* common/config/m32r/m32r-common.c: Don't manipulate OPT_fregmove.
* common/config/mmix/mmix-common.c: Likewise.

Index: gcse.c
===
--- gcse.c  (revision 204231)
+++ gcse.c  (working copy)
@@ -2535,7 +2535,7 @@ gcse_emit_move_after (rtx dest, rtx src, rtx insn)
 /* Delete redundant computations.
Deletion is done by changing the insn to copy the `reaching_reg' of
the expression into the result of the SET.  It is left to later passes
-   (cprop, cse2, flow, combine, regmove) to propagate the copy or eliminate it.
+   to propagate the copy or eliminate it.
 
Return nonzero if a change is made.  */
 
Index: recog.c
===
--- recog.c (revision 204231)
+++ recog.c (working copy)
@@ -726,7 +726,7 @@ validate_replace_rtx_1 (rtx *loc, rtx from, rtx to
   /* Call ourself recursively to perform the replacements.
  We must not replace inside already replaced expression, otherwise we
  get infinite recursion for replacements like (reg X)->(subreg (reg X))
- done by regmove, so we must special case shared ASM_OPERANDS.  */
+ so we must special case shared ASM_OPERANDS.  */
 
   if (GET_CODE (x) == PARALLEL)
 {
@@ -762,6 +762,7 @@ validate_replace_rtx_1 (rtx *loc, rtx from, rtx to
   if (num_changes == prev_changes)
 return;
 
+  /* ??? The regmove is no more, so is this aberration still necessary?  */
   /* Allow substituted expression to have different mode.  This is used by
  regmove to change mode of pseudo register.  */
   if (fmt[0] == 'e' && GET_MODE (XEXP (x, 0)) != VOIDmode)
Index: config/rl78/rl78.c
===
--- config/rl78/rl78.c  (revision 204231)
+++ config/rl78/rl78.c  (working copy)
@@ -1894,8 +1894,8 @@ post-reload optimizers could operate on the real r
 tried that there were some issues building the target libraries.
 
 During devirtualization, a simple register move optimizer is run.  It
-would be better to run a full CSE/propogation pass on it through, or
-re-run regmove, but that has not yet been attempted.
+would be better to run a full CSE/propogation pass on it through, bu
+but that has not yet been attempted.
 
  */
 #define DEBUG_ALLOC 0
Index: config/v850/v850.h
===
--- config/v850/v850.h  (revision 204231)
+++ config/v850/v850.h  (working copy)
@@ -954,10 +954,6 @@ extern tree GHS_current_section_names [(int) COUNT
 
 #define FILE_ASM_OP "\t.file\n"
 
-/* Enable the register move pass to improve code.  */
-#define ENABLE_REGMOVE_PASS
-
-
 /* Implement ZDA, TDA, and SDA */
 
 #define EP_REGNUM 30   /* ep register number */
Index: common/config/m32r/m32r-common.c
===
--- common/config/m32r/m32r-common.c(revision 204231)
+++ common/config/m32r/m32r-common.c(working copy)
@@ -29,7 +29,6 @@
 static const struct default_options m32r_option_optimization_table[] =
   {
 { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
-{ OPT_LEVELS_1_PLUS, OPT_fregmove, NULL, 1 },
 { OPT_LEVELS_NONE, 0, NULL, 0 }
   };
 
Index: common/config/mmix/mmix-common.c
===
--- common/config/mmix/mmix-common.c(revision 204231)
+++ common/config/mmix/mmix-common.c(working copy)
@@ -28,7 +28,6 @@ along with GCC; see the file COPYING3.  If not see
 
 static const struct default_options mmix_option_optimization_table[] =
   {
-{ OPT_LEVELS_1_PLUS, OPT_fregmove, NULL, 1 },
 { OPT_LEVELS_2_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
 { OPT_LEVELS_NONE, 0, NULL, 0 }
   };


RE: [PATCH, committed] libcilkrts - Add check for availability of alloca.h (Bug Bootstrap/58918)

2013-10-30 Thread Iyer, Balaji V


> -Original Message-
> From: Mike Stump [mailto:mikest...@comcast.net]
> Sent: Wednesday, October 30, 2013 6:48 PM
> To: Iyer, Balaji V
> Cc: 'ger...@pfeifer.com'; Jeff Law; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH, committed] libcilkrts - Add check for availability of alloca.h
> (Bug Bootstrap/58918)
> 
> On Oct 30, 2013, at 3:40 PM, "Iyer, Balaji V"  wrote:
> > This attached patch will check for usage of alloca.h before using it.  The
> > change is entirely in libcilkrts and I am committing this change as it is
> > pretty obvious.
> 
> Uh, no?  Usually runtimes are compiled by the built compiler?!  This built
> compiler _is_ by definition, gcc.  So, one can do:
> 
> #define alloca __builtin_alloca
> 
> safely and be done with it.  So, net result, don't need any config hair.  Is
> the library not compiled by gcc?

Yes, the library is compiled by GCC, but it is also used by LLVM and ICC. So, 
we would like to keep the same code base for all.
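
A small sketch of what a single code base for several compilers could look
like.  HAVE_ALLOCA_H is the macro that autoconf's AC_FUNC_ALLOCA defines; the
PORTABLE_ALLOCA wrapper and the `sum_first` function are hypothetical names
used only for this illustration and are not taken from libcilkrts.

```cpp
#include <cassert>
#include <cstddef>

#if defined(HAVE_ALLOCA_H)
# include <alloca.h>
# define PORTABLE_ALLOCA(n) alloca (n)
#elif defined(__GNUC__)
# define PORTABLE_ALLOCA(n) __builtin_alloca (n)
#else
# error "no alloca available in this sketch"
#endif

// Sums 0..n-1 using a stack-allocated scratch buffer.
int
sum_first (int n)
{
  assert (n > 0 && n <= 1024);  // keep the stack allocation small
  int *buf = static_cast<int *> (
    PORTABLE_ALLOCA (static_cast<std::size_t> (n) * sizeof (int)));
  int total = 0;
  for (int i = 0; i < n; ++i)
    buf[i] = i;
  for (int i = 0; i < n; ++i)
    total += buf[i];
  return total;
}
```

The configure check decides the first branch; the builtin is only a fallback
for compilers that define __GNUC__.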

Thanks,

Balaji V. Iyer.


Re: [PATCH, committed] libcilkrts - Add check for availability of alloca.h (Bug Bootstrap/58918)

2013-10-30 Thread Mike Stump
On Oct 30, 2013, at 3:40 PM, "Iyer, Balaji V"  wrote:
>   This attached patch will check for usage of alloca.h before using it. 
> The change is entirely in libcilkrts and I am committing this change as it is 
> pretty obvious. 

Uh, no?  Usually runtimes are compiled by the built compiler?!  This built 
compiler _is_ by definition, gcc.  So, one can do:

#define alloca __builtin_alloca

safely and be done with it.  So, net result, don't need any config hair.  Is 
the library not compiled by gcc?

RE: [PATCH, committed] libcilkrts - Add check for availability of alloca.h (Bug Bootstrap/58918)

2013-10-30 Thread Iyer, Balaji V
Err... I meant to say "...*availability* of alloca.h before using it..."

> -Original Message-
> From: Iyer, Balaji V
> Sent: Wednesday, October 30, 2013 6:41 PM
> To: 'ger...@pfeifer.com'
> Cc: Jeff Law; 'gcc-patches@gcc.gnu.org'
> Subject: [PATCH, committed] libcilkrts - Add check for availability of alloca.h
> (Bug Bootstrap/58918)
> 
> Hello Everyone,
>   This attached patch will check for usage of alloca.h before using it.  The
> change is entirely in libcilkrts and I am committing this change as it is
> pretty obvious (the change was lifted from the autoconf manual example).
> 
> Thanks,
> 
> Balaji V. Iyer.


Re: [PATCH][ubsan] Add VLA bound instrumentation

2013-10-30 Thread Mike Stump
On Oct 30, 2013, at 3:15 PM, Marek Polacek  wrote:
> I had a quick look at the CLEANUP_STMT and cp-tree.def says
> "A CLEANUP_STMT marks the point at which a declaration is fully
> constructed.", while doc says
> "Used to represent an action that should take place upon exit from the
> enclosing scope.  Typically, these actions are calls to destructors for
> local objects."  Huh?  So, how come it e.g. initializes variables, and on
> the other hand it should run dtors?  I'm baffled (but it's too late for me
> to think clearly ;)).

The dtors only run after the ctors run.  We mark the spot where the ctors 
finish as the _start_ of the region for which we have to clean up.  Really, the 
cleanup has nothing to do with ctors.  You can have dtors without any ctors, 
or ctors without any dtors.

{
  decl d;
  s;
}

transforms into:

<-  start of lifetime of the storage for d
ctor(d)
<-  start of lifetime of the fully constructed object d
s;
<-  end of lifetime of fully constructed object d
dtor(d)
<-  end of the storage of d

CLEANUP_STMT documents when the region protected by the cleanup starts.  One 
way to describe where that region starts is: at the end of the ctors, if any, 
else after the storage is allocated.  In the above, that is the second <- spot.

Now, in the trees, the above is decl d; ctors; CLEANUP_STMT (s, dtors, d).

s is the region for which the cleanups are active.  dtors is the cleanup to 
perform on transfer out of that region, and d is the decl related to the 
actions in dtors.
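
The lifetime sequence above can be observed from plain C++.  This is only a
sketch at the source level, not a model of the trees; the `decl` struct and
`run_block` function are names invented for the illustration.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// A local object that records when it is constructed and destroyed.
struct decl
{
  std::vector<std::string> &log;
  explicit decl (std::vector<std::string> &l) : log (l)
  { log.push_back ("ctor(d)"); }
  ~decl () { log.push_back ("dtor(d)"); }
};

// Runs "{ decl d; s; }" and records the order of events.  If THROW_IN_S
// is true, the statement s leaves the protected region by throwing.
std::vector<std::string>
run_block (bool throw_in_s)
{
  std::vector<std::string> log;
  try
    {
      decl d (log);          // region protected by the cleanup starts here
      log.push_back ("s");
      if (throw_in_s)
        throw std::runtime_error ("leaving s early");
    }                        // normal exit: dtor(d) runs here
  catch (const std::runtime_error &)
    {
      log.push_back ("caught");
    }
  return log;
}
```

Either way the region is left, by falling off the end or by an exception,
dtor(d) runs once, after s and before anything outside the block.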

[PATCH, committed] libcilkrts - Add check for availability of alloca.h (Bug Bootstrap/58918)

2013-10-30 Thread Iyer, Balaji V
Hello Everyone, 
This attached patch will check for usage of alloca.h before using it. 
The change is entirely in libcilkrts and I am committing this change as it is 
pretty obvious (the change was lifted from the autoconf manual example). 

Thanks,

Balaji V. Iyer.
Index: libcilkrts/configure
===
--- libcilkrts/configure(revision 204231)
+++ libcilkrts/configure(working copy)
@@ -171,6 +171,7 @@
   as_lineno_2=";as_suggested=$as_suggested$LINENO;as_suggested=$as_suggested" 
as_lineno_2a=\$LINENO
   eval 'test \"x\$as_lineno_1'\$as_run'\" != \"x\$as_lineno_2'\$as_run'\" &&
   test \"x\`expr \$as_lineno_1'\$as_run' + 1\`\" = 
\"x\$as_lineno_2'\$as_run'\"' || exit 1
+test \$(( 1 + 1 )) = 2 || exit 1
 
   test -n \"\${ZSH_VERSION+set}\${BASH_VERSION+set}\" || (
 
ECHO='\\'
@@ -178,8 +179,7 @@
 ECHO=\$ECHO\$ECHO\$ECHO\$ECHO\$ECHO\$ECHO
 PATH=/empty FPATH=/empty; export PATH FPATH
 test \"X\`printf %s \$ECHO\`\" = \"X\$ECHO\" \\
-  || test \"X\`print -r -- \$ECHO\`\" = \"X\$ECHO\" ) || exit 1
-test \$(( 1 + 1 )) = 2 || exit 1"
+  || test \"X\`print -r -- \$ECHO\`\" = \"X\$ECHO\" ) || exit 1"
   if (eval "$as_required") 2>/dev/null; then :
   as_have_required=yes
 else
@@ -607,7 +607,6 @@
 toolexeclibdir
 toolexecdir
 CXXCPP
-CPP
 OTOOL64
 OTOOL
 LIPO
@@ -622,8 +621,6 @@
 DUMPBIN
 LD
 FGREP
-EGREP
-GREP
 SED
 LIBTOOL
 MAC_LINKER_SCRIPT_FALSE
@@ -631,6 +628,10 @@
 LINUX_LINKER_SCRIPT_FALSE
 LINUX_LINKER_SCRIPT_TRUE
 config_dir
+EGREP
+GREP
+CPP
+ALLOCA
 multi_basedir
 am__fastdepCC_FALSE
 am__fastdepCC_TRUE
@@ -1614,37 +1615,6 @@
 
 } # ac_fn_c_try_link
 
-# ac_fn_c_check_header_compile LINENO HEADER VAR INCLUDES
-# ---
-# Tests whether HEADER exists and can be compiled using the include files in
-# INCLUDES, setting the cache variable VAR accordingly.
-ac_fn_c_check_header_compile ()
-{
-  as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack
-  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2" >&5
-$as_echo_n "checking for $2... " >&6; }
-if { as_var=$3; eval "test \"\${$as_var+set}\" = set"; }; then :
-  $as_echo_n "(cached) " >&6
-else
-  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
-/* end confdefs.h.  */
-$4
-#include <$2>
-_ACEOF
-if ac_fn_c_try_compile "$LINENO"; then :
-  eval "$3=yes"
-else
-  eval "$3=no"
-fi
-rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
-fi
-eval ac_res=\$$3
-  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5
-$as_echo "$ac_res" >&6; }
-  eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;}
-
-} # ac_fn_c_check_header_compile
-
 # ac_fn_c_try_cpp LINENO
 # --
 # Try to preprocess conftest.$ac_ext, and return whether this succeeded.
@@ -1682,48 +1652,6 @@
 
 } # ac_fn_c_try_cpp
 
-# ac_fn_c_try_run LINENO
-# --
-# Try to link conftest.$ac_ext, and return whether this succeeded. Assumes
-# that executables *can* be run.
-ac_fn_c_try_run ()
-{
-  as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack
-  if { { ac_try="$ac_link"
-case "(($ac_try" in
-  *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;;
-  *) ac_try_echo=$ac_try;;
-esac
-eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\""
-$as_echo "$ac_try_echo"; } >&5
-  (eval "$ac_link") 2>&5
-  ac_status=$?
-  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
-  test $ac_status = 0; } && { ac_try='./conftest$ac_exeext'
-  { { case "(($ac_try" in
-  *\"* | *\`* | *\\*) ac_try_echo=\$ac_try;;
-  *) ac_try_echo=$ac_try;;
-esac
-eval ac_try_echo="\"\$as_me:${as_lineno-$LINENO}: $ac_try_echo\""
-$as_echo "$ac_try_echo"; } >&5
-  (eval "$ac_try") 2>&5
-  ac_status=$?
-  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
-  test $ac_status = 0; }; }; then :
-  ac_retval=0
-else
-  $as_echo "$as_me: program exited with status $ac_status" >&5
-   $as_echo "$as_me: failed program was:" >&5
-sed 's/^/| /' conftest.$ac_ext >&5
-
-   ac_retval=$ac_status
-fi
-  rm -rf conftest.dSYM conftest_ipa8_conftest.oo
-  eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;}
-  return $ac_retval
-
-} # ac_fn_c_try_run
-
 # ac_fn_c_check_func LINENO FUNC VAR
 # --
 # Tests whether FUNC exists, setting the cache variable VAR accordingly
@@ -1791,6 +1719,79 @@
 
 } # ac_fn_c_check_func
 
+# ac_fn_c_try_run LINENO
+# --
+# Try to link conftest.$ac_ext, and return whether this succeeded. Assumes
+# that executables *can* be run.
+ac_fn_c_try_run ()
+{
+  as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_line

Re: [PATCH C++/testsuite] Remove pchtest check objects and compile with current tool

2013-10-30 Thread Mike Stump
On Oct 30, 2013, at 3:14 PM, Bernhard Reutner-Fischer  
wrote:
> On 30 October 2013 22:47, Mike Stump  wrote:
>> 
>> Was there a significant purpose for the added C++ comment?  If not, can you 
>> remove that?  If so, can you explain?
> 
> grep -A9 "CONTENTS is" gcc/testsuite/lib/target-supports.exp
> # Assume by default that CONTENTS is C code.
> # Otherwise, code should contain:
> # "// C++" for c++,
> # "! Fortran" for Fortran code,
> # "/* ObjC", for ObjC
> # "// ObjC++" for ObjC++
> # and "// Go" for Go
> # If the tool is ObjC/ObjC++ then we overide the extension to .m/.mm to
> # allow for ObjC/ObjC++ specific flags.
> proc check_compile {basename type contents args} {

Ah, but this is why I asked for a significant purpose?  The language of the 
file selects the options (flags) allowed.  The language is set in your code.  I 
think it was part of trying different ways to fix it, but, it turned out to be 
neither necessary nor sufficient in the end.

Re: [PATCH][ubsan] Add VLA bound instrumentation

2013-10-30 Thread Marek Polacek
Thanks Mike.

I had a quick look at the CLEANUP_STMT and cp-tree.def says
"A CLEANUP_STMT marks the point at which a declaration is fully
constructed.", while doc says
"Used to represent an action that should take place upon exit from the
enclosing scope.  Typically, these actions are calls to destructors for
local objects."  Huh?  So, how come it e.g. initializes variables, and on
the other hand it should run dtors?  I'm baffled (but it's too late for me
to think clearly ;)).

Marek


Re: [PATCH C++/testsuite] Remove pchtest check objects and compile with current tool

2013-10-30 Thread Bernhard Reutner-Fischer
On 30 October 2013 22:47, Mike Stump  wrote:
> On Oct 30, 2013, at 2:56 AM, Bernhard Reutner-Fischer  
> wrote:
>> - set result [check_compile pchtest object "int i;" "-x c-header"]
>> + set result [check_compile pchtest object "$chk_type" "$chk_lang"]
>
> the patch uses chk_type, but, I can't find where it is being set?

hmz yea, that should read $chk_content
>
> Was there a significant purpose for the added C++ comment?  If not, can you 
> remove that?  If so, can you explain?

grep -A9 "CONTENTS is" gcc/testsuite/lib/target-supports.exp
# Assume by default that CONTENTS is C code.
# Otherwise, code should contain:
# "// C++" for c++,
# "! Fortran" for Fortran code,
# "/* ObjC", for ObjC
# "// ObjC++" for ObjC++
# and "// Go" for Go
# If the tool is ObjC/ObjC++ then we overide the extension to .m/.mm to
# allow for ObjC/ObjC++ specific flags.
proc check_compile {basename type contents args} {
>
> Last question I have is the remove-build-file primitive.  I'm wondering on a 
> canadian cross, are the files left over on the build machine, the host 
> machine or both the build machine and the host machine?

I don't really remember, I haven't run canadian cross tests on remote
boxes in ages, TBH.

> I see people use remote_file build delete …, file_on_host delete and 
> remove-build-file.  Some folks even use the plain file delete.  I'd hate to 
> guess which one you need, it hurts my brain.  I think remove-build-file is 
> safe; just don't know if it is best.

remove-build-file certainly wipes it from everywhere so seems the safe bet.
But yes, for this specific pchtest.o's one could refine the delete to
the appropriate build or host. I would think that using plain delete
is wrong everywhere though.

> Anyone else want to weigh in?


[v3 patch] enable commented out tests

2013-10-30 Thread Jonathan Wakely
GCC allows arithmetic on void pointers so std::atomic<void*> does too,
but the VERIFY checks in this test were commented out, probably
because they failed due to using sizeof(void*) when they should have
used sizeof(void), which is 1.

2013-10-30  Jonathan Wakely  

* testsuite/29_atomics/atomic/operators/pointer_partial_void.cc: Fix
and enable VERIFY tests.

Tested x86_64-linux, committed to trunk.
commit 0224e88cccd5a4d4bf1f972a292bfc9e58d57382
Author: Jonathan Wakely 
Date:   Wed Oct 30 17:57:50 2013 +

* testsuite/29_atomics/atomic/operators/pointer_partial_void.cc: Fix
and enable VERIFY tests.

diff --git a/libstdc++-v3/testsuite/29_atomics/atomic/operators/pointer_partial_void.cc b/libstdc++-v3/testsuite/29_atomics/atomic/operators/pointer_partial_void.cc
index fa936a1..3a4377f 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic/operators/pointer_partial_void.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic/operators/pointer_partial_void.cc
@@ -42,28 +42,28 @@ int main(void)
   a++;
   void* vp3(a);
   dist = reinterpret_cast(vp2) - reinterpret_cast(vp3);
-  // VERIFY ( std::abs(dist) == sizeof(void*));
+  VERIFY ( std::abs(dist) == 1 );
 
   // operator--
   void* vp4(a);
   a--;
   void* vp5(a);
   dist = reinterpret_cast(vp4) - reinterpret_cast(vp5);
-  // VERIFY ( std::abs(dist) == sizeof(void*));
+  VERIFY ( std::abs(dist) == 1 );
 
   // operator+=
   void* vp6(a);
   a+=n;
   void* vp7(a);
   dist = reinterpret_cast(vp6) - reinterpret_cast(vp7);
-  // VERIFY ( std::abs(dist) == sizeof(void*) * n);
+  VERIFY ( std::abs(dist) == n );
 
   // operator-=
   void* vp8(a);
   a-=n;
   void* vp9(a);
   dist = reinterpret_cast(vp8) - reinterpret_cast(vp9);
-  //VERIFY ( std::abs(dist) == sizeof(void*) * n);
+  VERIFY ( std::abs(dist) == n );
 
   return 0;
 }


Re: Pre-Patch RFC: proposed changes to option-lookup

2013-10-30 Thread Steven Bosscher
On Wed, Oct 30, 2013 at 9:39 PM, David Malcolm wrote:

> I want to eliminate hidden use of the preprocessor in our code, in favor
> of using block caps to signal to people reading the code that macro
> magic is happening.

Good idea. In the past this kind of change would be sort-of
controversial (large number of changes for small gain) but if the
whole source code base is being overhauled anyway, it's a good time to
get rid of this black magic at the same time...


> My idea is to introduce a GCC_OPTION macro, and replace the above with:
>
>   static bool
>   gate_vrp (void)
>   {
>     return GCC_OPTION (flag_tree_vrp) != 0;
>   }
>
> thus signaling to humans that macros are present.
>
> Is such a patch likely to be accepted?   Should I try to break the
> options up into logical groups e.g. with separate macros for warnings vs
> optimizations, or some other scheme?

Why not just expose the different "classes" of options in some struct or object?

struct {
  unsigned tree_vrp : 1;
  ...
} opt_flags;

struct {
  ...
} warn_flags;

static bool gate_vrp (void) { return opt_flags.tree_vrp != 0; }

This would also make streaming the flags settings easier for LTO.

Ciao!
Steven


Re: [PATCH C++/testsuite] Remove pchtest check objects and compile with current tool

2013-10-30 Thread Mike Stump
On Oct 30, 2013, at 2:56 AM, Bernhard Reutner-Fischer  
wrote:
> - set result [check_compile pchtest object "int i;" "-x c-header"]
> + set result [check_compile pchtest object "$chk_type" "$chk_lang"]

the patch uses chk_type, but, I can't find where it is being set?

Was there a significant purpose for the added C++ comment?  If not, can you 
remove that?  If so, can you explain?

Last question I have is the remove-build-file primitive.  I'm wondering on a 
canadian cross, are the files left over on the build machine, the host machine 
or both the build machine and the host machine?
I see people use remote_file build delete …, file_on_host delete and 
remove-build-file.  Some folks even use the plain file delete.  I'd hate to 
guess which one you need, it hurts my brain.  I think remove-build-file is 
safe; just don't know if it is best.  Anyone else want to weigh in?

Pre-Patch RFC: proposed changes to option-lookup

2013-10-30 Thread David Malcolm
[Sending this to gcc-patches to double-check that the idea is sound
before continuing to work on this large patch. [1] ]

I want to eliminate hidden use of the preprocessor in our code, in favor
of using block caps to signal to people reading the code that macro
magic is happening.

As a specific example, consider this supposedly-simple code:

  static bool
  gate_vrp (void)
  {
    return flag_tree_vrp != 0;
  }

where "flag_tree_vrp" is actually an autogenerated macro to
"global_options.x_flag_tree_vrp"

This is deeply confusing to a newbie - and indeed still to me after two
years of working with GCC's internals, for example, when stepping
through code and trying to query values in gdb.

My idea is to introduce a GCC_OPTION macro, and replace the above with:

  static bool
  gate_vrp (void)
  {
    return GCC_OPTION (flag_tree_vrp) != 0;
  }

thus signaling to humans that macros are present.
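
For illustration, one way such a macro could be defined.  This is an
assumption, not taken from a posted patch: the `gcc_options` struct below is a
one-field stand-in for the autogenerated structure, and the token-pasting
definition of GCC_OPTION is just one possibility.

```cpp
// Hypothetical stand-in for the autogenerated options structure.
struct gcc_options
{
  int x_flag_tree_vrp;
};

gcc_options global_options = { 1 };

// Assumed definition: paste the x_ prefix onto the option name and look
// the field up in global_options.
#define GCC_OPTION(NAME) (global_options.x_##NAME)

static bool
gate_vrp (void)
{
  // Expands to: return (global_options.x_flag_tree_vrp) != 0;
  return GCC_OPTION (flag_tree_vrp) != 0;
}
```

The block-caps name at the use site is the only visible difference from the
current code; the generated field access is the same.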

Is such a patch likely to be accepted?   Should I try to break the
options up into logical groups e.g. with separate macros for warnings vs
optimizations, or some other scheme?

Thanks
Dave
[1] fwiw, not-yet-working version of script to create patch can be seen
at:
https://github.com/davidmalcolm/gcc-refactoring-scripts/blob/master/refactor_options.py
https://github.com/davidmalcolm/gcc-refactoring-scripts/blob/master/test_refactor_options.py




Re: Testsuite / Cilk Plus: Include library path in compile flags in gcc.dg/cilk-plus/cilk-plus.exp

2013-10-30 Thread Joseph S. Myers
On Wed, 30 Oct 2013, Tobias Burnus wrote:

> On the other hand, one could use the existence of libcilkrts* as detected by
> the patch to decide whether to link or not: If the library is there, one can
> link – if not found, it is unlikely to work (unless it is, e.g. found in
> /usr/lib).

The way to detect it is to try to link with it, rather than just looking 
for it in the build tree.  If you just look for it in the build tree and 
assume it's missing if not found there, you break installed testing.  See 
Andrew's recent patch fixing this for asan/ubsan testing (or what I did 
for tests linking with libatomic on the C11-atomic branch).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: RFA [reload]: Fix PR other/58545

2013-10-30 Thread Jeff Law

On 10/03/13 14:23, Joern Rennecke wrote:


2013-10-02  Joern Rennecke

PR other/58545
* reload1.c (update_eliminables_and_spill): New function, broken
out of reload.
(reload): Use it.  Check for frame size change after frame
size alignment, and call update_eliminables_and_spill first
if continue-ing.

With a testcase included, this is OK.

I still think we can get extra alignments, but the impact of them is 
reduced since we can use those slots for other things.


jeff


[PATCH] arm: emit neon alignment hints for 32/16-bit loads/stores

2013-10-30 Thread Mans Rullgard
---
 gcc/config/arm/arm.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 8c9897e..8183a8e 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -21247,6 +21247,10 @@ arm_print_operand (FILE *stream, rtx x, int code)
  align_bits = 128;
else if (memsize >= 8 && (align % 8) == 0)
  align_bits = 64;
+   else if (memsize == 4 && (align % 4) == 0)
+ align_bits = 32;
+   else if (memsize == 2 && (align % 2) == 0)
+ align_bits = 16;
else
  align_bits = 0;
 
-- 
1.8.4



Re: [PATCH][ubsan] Add VLA bound instrumentation

2013-10-30 Thread Mike Stump
On Oct 30, 2013, at 9:15 AM, Marek Polacek  wrote:
> I admit I don't understand the cleanup_points very much and I don't
> know exactly where they are coming from

So, here is the mental model…  and how it is related to the standard.  C++ 
mandates that destructors for objects and temporary objects run no sooner than 
a certain place, and no later than another place.  In the implementation, we 
choose a single point to run them, and use a cleanup point as the embodiment of 
when destructors run.  For example:

cleanup (a + cleanup (b - c))

means generate this:

a
b
c
-
dtors for things related to b-c
+
dtors for things related to a+ (b-c)

that's it.  Pretty simple.  Now, cute little details, once you get past the 
simplicity, would be things like, if you run the cleanups for b-c, at the first 
dtor line above, do you also run those same things at the lower point?  That 
answer is no, they only run once.  If one takes an exception out of that 
region, does the cleanup action run?  That answer is yes.  Lots of other 
possible questions like this, all with fairly simple, easy to understand 
answers.  Just ask.

Now, some advanced topics…  So, one thing you discover, if you _add_ a cleanup 
point into an expression, it will run those actions sooner than they would have 
run, if you had not.  One cannot meet the requirements of the language standard 
and just arbitrarily add cleanup points.  However, constructs beyond the 
language standard, say ({ s1; s2; s3; }) + b;, one discovers that the 
implementation is free to decide if there is a cleanup point for ({ }) or not.  
The language standard places no requirements on such code, and this is why we 
can decide.

decl cleanups are strongly related to these sorts of cleanups, but lie just 
outside (enclosing).  I'll note their existence for completeness.  See 
CLEANUP_STMT for these.
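
To make the "cleanup (a + cleanup (b - c))" picture concrete, here is a minimal sketch that relies only on the standard rule that temporaries are destroyed at the end of their full-expression.  Noisy, trace, inner_difference and outer_sum are hypothetical names made up for illustration; the sketch shows the observable ordering, not GCC's internal cleanup-point representation.  Putting b - c into its own function gives it its own full-expression, which plays the role of the inner cleanup point.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Records the order of events so it can be inspected afterwards.
static std::vector<std::string> trace;

// Hypothetical type whose construction and destruction are logged.
struct Noisy {
  std::string name;
  int value;
  Noisy(std::string n, int v) : name(std::move(n)), value(v) {
    trace.push_back("ctor " + name);
  }
  Noisy(const Noisy &) = delete;
  ~Noisy() { trace.push_back("dtor " + name); }
};

// b - c is a full-expression of its own here, so the temporaries for
// b and c are destroyed before this function returns (the inner cleanup).
int inner_difference() {
  int diff = Noisy("b", 7).value - Noisy("c", 2).value;
  trace.push_back("inner done");
  return diff;
}

// The temporary for a lives until the end of the outer full-expression,
// so its destructor runs after everything belonging to b - c.
int outer_sum() {
  trace.clear();
  int sum = Noisy("a", 10).value + inner_difference();
  trace.push_back("outer done");
  return sum;
}
```

Calling outer_sum() and then reading trace shows the destructors for b and c before "inner done", and the destructor for a as the last event before "outer done", whichever operand of + the compiler evaluates first.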

Re: [PATCH] decide edge's hotness when there is profile info

2013-10-30 Thread Teresa Johnson
On Fri, Oct 18, 2013 at 2:17 PM, Jan Hubicka  wrote:
>> Here is the patch updated to use the new parameter from r203830.
>> Passed bootstrap and regression tests.
>>
>> 2013-10-18  Jan Hubicka  
>> Teresa Johnson  
>>
>> * predict.c (handle_missing_profiles): New function.
>> (counts_to_freqs): Don't overwrite estimated frequencies
>> when function has no profile counts.
>> * predict.h (handle_missing_profiles): Declare.
>> * tree-profile.c (tree_profiling): Invoke handle_missing_profiles.
>>
>> Index: predict.c
>> ===
>> --- predict.c   (revision 203830)
>> +++ predict.c   (working copy)
>> @@ -2758,6 +2758,40 @@ estimate_loops (void)
>>BITMAP_FREE (tovisit);
>>  }
>>
>
> You need block comment. It should explain the problem of COMDATs and how they 
> are handled.
> It is not an obvious thing.

Done.

>
>> +void
>> +handle_missing_profiles (void)
>> +{
>> +  struct cgraph_node *node;
>> +  int unlikely_count_fraction = PARAM_VALUE (UNLIKELY_BB_COUNT_FRACTION);
> extra line
>> +  /* See if 0 count function has non-0 count callers.  In this case we
>> + lost some profile.  Drop its function profile to PROFILE_GUESSED.  */
>> +  FOR_EACH_DEFINED_FUNCTION (node)
>> +{
>> +  struct cgraph_edge *e;
>> +  gcov_type call_count = 0;
>> +  struct function *fn = DECL_STRUCT_FUNCTION (node->symbol.decl);
> Extra line
>> +  if (node->count)
>> +continue;
>> +  for (e = node->callers; e; e = e->next_caller)
>> +call_count += e->count;
> What about checking if the sum is way off even for non-0 counts.
> I.e. for case where function was inlined to some calls but not to others?  In
> that case we may want to scale up the counts (with some reasonable care for
> capping)

In this patch I am not changing any counts, so I am leaving this one
for follow-on work (even for the routines missing counts completely
like these, we don't apply any counts, just mark them as guessed. I
have a follow-on patch to send once this goes in that does apply
counts to these 0-count routines only when we decide to inline as we
discussed).

>
> Also I think the code really should propagate - i.e. have simple worklist and
> look for functions that are COMDAT and are called by other COMDAT functions
> where we decided to drop the profile.  Having non-trivial chains of comdats is
> common thing.

Done.
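
For readers following along, the propagation idea can be sketched with a toy call graph.  This is only an illustration under stated assumptions: ToyFunction, ToyCall and drop_missing_profiles are hypothetical stand-ins, not GCC's cgraph types and not the code in the patch.  The first loop seeds the worklist with 0-count functions that have non-zero incoming call counts; the second loop follows chains of 0-count COMDAT callees.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a call-graph node (hypothetical, not cgraph_node).
struct ToyFunction {
  bool comdat;   // may have been linked in from another object file
  long count;    // profile count; 0 means no profile was seen
  bool dropped;  // profile has been dropped to "guessed"
};

// Toy stand-in for a call edge with its profile count.
struct ToyCall {
  std::size_t caller;
  std::size_t callee;
  long count;
};

// Returns the indices of the functions whose profile was dropped,
// in the order in which they were dropped.
std::vector<std::size_t>
drop_missing_profiles(std::vector<ToyFunction> &fns,
                      const std::vector<ToyCall> &calls) {
  std::vector<std::size_t> worklist;
  std::vector<std::size_t> order;

  // Seed: 0-count functions that are reached by non-zero call edges.
  for (std::size_t i = 0; i < fns.size(); ++i) {
    if (fns[i].count != 0)
      continue;
    long call_count = 0;
    for (const ToyCall &c : calls)
      if (c.callee == i)
        call_count += c.count;
    if (call_count > 0) {
      fns[i].dropped = true;
      worklist.push_back(i);
      order.push_back(i);
    }
  }

  // Propagate: 0-count COMDAT callees of an already dropped function
  // are dropped as well, so chains of COMDATs are handled.
  while (!worklist.empty()) {
    std::size_t cur = worklist.back();
    worklist.pop_back();
    for (const ToyCall &c : calls) {
      if (c.caller != cur)
        continue;
      ToyFunction &callee = fns[c.callee];
      if (callee.comdat && callee.count == 0 && !callee.dropped) {
        callee.dropped = true;
        worklist.push_back(c.callee);
        order.push_back(c.callee);
      }
    }
  }
  return order;
}
```

The dropped flag doubles as the visited marker, so each function enters the worklist at most once even when the call graph has cycles.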

>
> What about outputting user visible warning/error on the inconsistency when
> no COMDAT function is involved same way as we do for BB profile?

Done. This one caught the fact that we have this situation for "extern
template" functions as well when I tested on cpu2006. I added in a
check to ignore those when issuing the warning (they are not emitted
thus don't get any profile counts).

Updated patch below.

Bootstrapped and tested on x86-64-unknown-linux-gnu. Also tested on
profiledbootstrap and profile-use build of SPEC cpu2006. Ok for trunk?

Thanks,
Teresa

2013-10-30  Teresa Johnson  

* predict.c (drop_profile): New function.
(handle_missing_profiles): Ditto.
(counts_to_freqs): Don't overwrite estimated frequencies
when function has no profile counts.
* predict.h (handle_missing_profiles): Declare.
* tree-profile.c (tree_profiling): Invoke handle_missing_profiles.

Index: predict.c
===
--- predict.c   (revision 204213)
+++ predict.c   (working copy)
@@ -2765,6 +2765,107 @@ estimate_loops (void)
   BITMAP_FREE (tovisit);
 }

+/* Drop the profile for NODE to guessed, and update its frequency based on
+   whether it is expected to be HOT.  */
+
+static void
+drop_profile (struct cgraph_node *node, bool hot)
+{
+  struct function *fn = DECL_STRUCT_FUNCTION (node->decl);
+
+  if (dump_file)
+fprintf (dump_file,
+ "Dropping 0 profile for %s/%i. %s based on calls.\n",
+ cgraph_node_name (node), node->order,
+ hot ? "Function is hot" : "Function is normal");
+  /* We only expect to miss profiles for functions that are reached
+ via non-zero call edges in cases where the function may have
+ been linked from another module or library (COMDATs and extern
+ templates). See the comments below for handle_missing_profiles.  */
+  if (!DECL_COMDAT (node->decl) && !DECL_EXTERNAL (node->decl))
+warning (0,
+ "Missing counts for called function %s/%i",
+ cgraph_node_name (node), node->order);
+
+  profile_status_for_function (fn)
+  = (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT);
+  node->frequency
+  = hot ? NODE_FREQUENCY_HOT : NODE_FREQUENCY_NORMAL;
+}
+
+/* In the case of COMDAT routines, multiple object files will contain the same
+   function and the linker will select one for the binary. In that case
+   all the other copies from the profile instrument binary will be missing
+   profi

Re: [PATCH 1/n] Add conditional compare support

2013-10-30 Thread Richard Henderson
> +/* RCODE0, RCODE1 and a valid return value should be enum rtx_code.
> +   TCODE should be enum tree_code.
> +   Check whether two compares are a valid combination in the target to generate
> +   a conditional compare.  If valid, return the new compare after combination.
> +   */
> +DEFHOOK
> +(legitimize_cmp_combination,
> + "This target hook returns the dominance compare if the two compares are\n\
> +a valid combination.  This target hook is required only when the target\n\
> +supports conditional compare, such as ARM.",
> + int, (int rcode0, int rcode1, int tcode),
> + default_legitimize_cmp_combination)
> +
> +/* RCODE0, RCODE1 and a valid return value should be enum rtx_code.
> +   TCODE should be enum tree_code.
> +   Check whether two compares are a valid combination in the target to generate
> +   a conditional compare.  If valid, return the new compare after combination.
> +   The difference from legitimize_cmp_combination is that its first compare is
> +   the result of a previous conditional compare, which leads to more constraints
> +   on it, since there is no way to swap the two compares.  */
> +DEFHOOK
> +(legitimize_ccmp_combination,
> + "This target hook returns the dominance compare if the two compares are\n\
> +a valid combination.  This target hook is required only when the target\n\
> +supports conditional compare, such as ARM.",
> + int, (int rcode0, int rcode1, int tcode),
> + default_legitimize_ccmp_combination)
> +

Why do these hooks still exist?  They should be redundant with ...

> +/* CMP0 and CMP1 are two compares.  USED_AS_CC_P indicates whether the target
> +   is used as CC or not.  TCODE should be enum tree_code.
> +   The hook will return a conditional compare RTX if all checks are OK.  */
> +DEFHOOK
> +(gen_ccmp_with_cmp_cmp,
> + "This target hook returns a conditional compare RTX if the two compares are\n\
> +a valid combination.  This target hook is required only when the target\n\
> +supports conditional compare, such as ARM.",
> + rtx, (gimple cmp0, gimple cmp1, int tcode, bool used_as_cc_p),
> + default_gen_ccmp_with_cmp_cmp)
> +
> +/* CC is the result of a previous conditional compare.  CMP1 is a compare.
> +   USED_AS_CC_P indicates whether the target is used as CC or not.
> +   TCODE should be enum tree_code.
> +   The hook will return a conditional compare rtx if all checks are OK.  */
> +DEFHOOK
> +(gen_ccmp_with_ccmp_cmp,
> + "This target hook returns a conditional compare RTX if the CC and CMP1 are\n\
> +a valid combination.  This target hook is required only when the target\n\
> +supports conditional compare, such as ARM.",
> + rtx, (rtx cc, gimple cmp1, int tcode, bool used_as_cc_p),
> + default_gen_ccmp_with_ccmp_cmp)
> +

... these.

Why in the world are you passing down gimple to the backends?  The
expand_operands done therein should have been done in expr.c.

The hooks are still not what I suggested, particularly gen_ccmp_with_cmp_cmp is
what I was trying to avoid, passing down two initial compares like that.

To be 100% explicit this time, I think the hooks should be

DEFHOOK
(gen_ccmp_first,
 "This function emits a comparison insn for the first of a sequence of\n\
conditional comparisons.  It returns a comparison expression appropriate\n\
for passing to @code{gen_ccmp_next} or to @code{cbranch_optab}.",
 rtx, (int code, rtx op0, rtx op1),
 NULL)

DEFHOOK
(gen_ccmp_next,
 "This function emits a conditional comparison within a sequence of\n\
conditional comparisons.  The @code{prev} expression is the result of a\n\
prior call to @code{gen_ccmp_first} or @code{gen_ccmp_next}.  It may return\n\
@code{NULL} if the combination of @code{prev} and this comparison is\n\
not supported, otherwise the result must be appropriate for passing to\n\
@code{gen_ccmp_next} or @code{cbranch_optab}.",
 rtx, (rtx prev, int code, rtx op0, rtx op1),
 NULL)

All of your existing tests for HAVE_ccmp should be replaced with

  if (targetm.gen_ccmp_first == NULL)
    return; /* No ccmp supported. */
  gcc_checking_assert(targetm.gen_ccmp_next != NULL);


r~


Re: RFA [reload]: Fix PR other/58545

2013-10-30 Thread Jeff Law

On 10/11/13 08:40, Joern Rennecke wrote:
> On 11 October 2013 04:53, Jeff Law  wrote:
>> With your change it seems to me that we do a single round of spilling &
>> caller-save setup, then align the stack, then restart.  The net result being
>> we align the stack a lot more often.
>
> Yes, but AFAICT, that should not result in more space being used,
> because assign_stack_local uses ASLK_RECORD_PAD, so
> assign_stack_local_1 will record any space
> added for alignment purposes with add_frame_space, so it can be used
> for a subsequent spill of suitable size.

You're right -- I wasn't aware of Bernd's work to reuse alignment 
paddings for real objects.  It was added a few months after I'd been 
poking at problems in that loop of reload.  Let me look at this stuff again.


jeff



Re: [PATCH] Vectorizing abs(char/short/int) on x86.

2013-10-30 Thread Cong Hou
I have run check_GNU_style.sh on my patch.

The patch is submitted. Thank you for your comments and help on this patch!



thanks,
Cong


On Wed, Oct 30, 2013 at 11:13 AM, Uros Bizjak  wrote:
> On Wed, Oct 30, 2013 at 7:01 PM, Cong Hou  wrote:
>
>>>> I found my problem: I put DONE outside of if not inside. You are
>>>> right. I have updated my patch.
>>>
>>> OK, great that we put things in order ;)
>>>
>>> Does this patch need some extra middle-end functionality? I was not
>>> able to vectorize char and short part of your patch.
>>
>>
>> In the original patch, I converted abs() on short and char values to
>> their own types by removing type casts. That is, originally char_val1
>> = abs(char_val2) will be converted to char_val1 = (char) abs((int)
>> char_val2) in the frontend, and I would like to convert it back to
>> char_val1 = abs(char_val2). But after several discussions, it seems
>> this conversion has some problems such as overflow concerns, and I
>> thereby removed that part.
>>
>> Now you should still be able to vectorize abs(char) and abs(short) but
>> with packing and unpacking. Later I will consider to write pattern
>> recognizer for abs(char) and abs(short) and then the expand on
>> abs(char)/abs(short) in this patch will be used during vectorization.
>
> OK, this seems reasonable. We already have "unused" SSSE3 8/16 bit abs
> pattern, so I think we can commit SSE2 expanders, even if they will be
> unused for now. The proposed recognizer will benefit SSE2 as well as
> existing SSSE3 patterns.
>
>>> Regarding the testcase - please put it to gcc.target/i386/ directory.
>>> There is nothing generic in the test, as confirmed by target-dependent
>>> scan test. You will find plenty of examples in the mentioned
>>> directory. I'd suggest to split the testcase in three files, and to
>>> simplify it to something like the testcase with global variables I
>>> used earlier.
>>
>>
>> I have done it. The test case is split into three for s8/s16/s32 in
>> gcc.target/i386.
>
> OK.
>
> The patch is OK for mainline, but please check formatting and
> whitespace before the patch is committed.
>
> Thanks,
> Uros.


Re: PATCH to use -Wno-format during stage1

2013-10-30 Thread Ian Lance Taylor
On Wed, Oct 30, 2013 at 8:47 AM, Jason Merrill  wrote:
> I find -Wformat warnings about unknown % codes when building with an older
> GCC to be mildly annoying noise; this patch avoids them by passing
> -Wno-format during stage 1.
>
> Tested x86_64-pc-linux-gnu.  Is this OK for trunk?  Do you have another
> theory of how this should work?

Thank you!

Ian


Two tiny C++ cleanup PATCHes

2013-10-30 Thread Jason Merrill

1) We want to skip anonymous structs when pushing member cleanups, too.
2) We shouldn't create a static variable for a compound literal in e.g. 
sizeof.


I don't think either of these will actually affect compiler output, 
they're just cleanups I noticed while working on other things.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit beac93e0a45d47f6a896d9f3f27eed224ad51ee9
Author: Jason Merrill 
Date:   Tue Oct 22 16:32:07 2013 -0400

	* init.c (push_base_cleanups): Check ANON_AGGR_TYPE_P.

diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index 78ea986..82b3cae 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -4130,7 +4130,7 @@ push_base_cleanups (void)
 	  || TREE_CODE (member) != FIELD_DECL
 	  || DECL_ARTIFICIAL (member))
 	continue;
-  if (ANON_UNION_TYPE_P (this_type))
+  if (ANON_AGGR_TYPE_P (this_type))
 	continue;
   if (type_build_dtor_call (this_type))
 	{
commit d643f3c28260e0295e6ac3dd0b4d80e26b00ea3f
Author: Jason Merrill 
Date:   Tue Oct 22 16:33:38 2013 -0400

	* semantics.c (finish_compound_literal): Don't create a static variable
	inside cp_unevaluated_operand.

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 052746c..a54123e 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -2501,6 +2501,7 @@ finish_compound_literal (tree type, tree compound_literal,
   if ((!at_function_scope_p () || CP_TYPE_CONST_P (type))
   && TREE_CODE (type) == ARRAY_TYPE
   && !TYPE_HAS_NONTRIVIAL_DESTRUCTOR (type)
+  && !cp_unevaluated_operand
   && initializer_constant_valid_p (compound_literal, type))
 {
   tree decl = create_temporary_var (type);


[v3 patch] Extend smart ptr assertions to reject void*

2013-10-30 Thread Jonathan Wakely
Because of the GNU extension that allows sizeof(void) we fail to
reject ill-formed programs. This patch fixes that.
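
As a minimal sketch of why the extra check is needed: when sizeof(void) is accepted as an extension, a "sizeof(T) > 0" assertion alone no longer rejects T = void, so an explicit is_void test is added in front of it.  The names checked_delete and deletable_pointee below are hypothetical and only illustrate the pattern; they are not the libstdc++ code.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: true when a pointer to T may be deleted through
// a default-style deleter, i.e. when T is not void.
template<typename T>
constexpr bool deletable_pointee() {
  return !std::is_void<T>::value;
}

// Hypothetical deleter mirroring the pattern in the patch: reject void
// explicitly, then reject incomplete types via sizeof.
template<typename T>
struct checked_delete {
  void operator()(T *ptr) const {
    static_assert(!std::is_void<T>::value,
                  "can't delete pointer to void");
    static_assert(sizeof(T) > 0,
                  "can't delete pointer to incomplete type");
    delete ptr;
  }
};
```

Naming checked_delete<void> is still fine; the assertions only fire once its operator() is actually instantiated, which is the same point at which the new tests expect their errors.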

2013-10-30  Jonathan Wakely  

* include/bits/shared_ptr_base.h (__shared_ptr): Assert non-void pointer.
* include/bits/unique_ptr.h (default_delete): Likewise.
* include/backward/auto_ptr.h (__shared_ptr(auto_ptr&&)): Likewise.
* testsuite/20_util/shared_ptr/cons/58839.cc: Do not use
default_delete.
* testsuite/20_util/shared_ptr/cons/void_neg.cc: New.
* testsuite/20_util/default_delete/void_neg.cc: New.
* testsuite/20_util/shared_ptr/cons/43820_neg.cc: Adjust line numbers.
* testsuite/20_util/unique_ptr/assign/48635_neg.cc: Likewise.

Tested x86_64-linux, committed to trunk.
commit 5b0ffcc57b7126016eda484e56eda3eba0a0ec90
Author: Jonathan Wakely 
Date:   Wed Oct 30 17:50:39 2013 +

* include/bits/shared_ptr_base.h (__shared_ptr): Assert non-void pointer.
* include/bits/unique_ptr.h (default_delete): Likewise.
* testsuite/20_util/shared_ptr/cons/58839.cc: Do not use
default_delete.
* testsuite/20_util/shared_ptr/cons/void_neg.cc: New.
* testsuite/20_util/default_delete/void_neg.cc: New.
* testsuite/20_util/shared_ptr/cons/43820_neg.cc: Adjust line numbers.
* testsuite/20_util/unique_ptr/assign/48635_neg.cc: Likewise.

diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h
index 91b6367..cf90d7a 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -775,6 +775,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : _M_ptr(__p), _M_refcount(__p)
{
  __glibcxx_function_requires(_ConvertibleConcept<_Tp1*, _Tp*>)
+ static_assert( !is_void<_Tp>::value, "incomplete type" );
  static_assert( sizeof(_Tp1) > 0, "incomplete type" );
  __enable_shared_from_this_helper(_M_refcount, __p, __p);
}
diff --git a/libstdc++-v3/include/bits/unique_ptr.h b/libstdc++-v3/include/bits/unique_ptr.h
index c6c9a5a..bfe40ec 100644
--- a/libstdc++-v3/include/bits/unique_ptr.h
+++ b/libstdc++-v3/include/bits/unique_ptr.h
@@ -69,6 +69,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   void
   operator()(_Tp* __ptr) const
   {
+   static_assert(!is_void<_Tp>::value,
+ "can't delete pointer to incomplete type");
static_assert(sizeof(_Tp)>0,
  "can't delete pointer to incomplete type");
delete __ptr;
diff --git a/libstdc++-v3/testsuite/20_util/default_delete/void_neg.cc b/libstdc++-v3/testsuite/20_util/default_delete/void_neg.cc
new file mode 100644
index 000..79786cb
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/default_delete/void_neg.cc
@@ -0,0 +1,30 @@
+// { dg-options "-std=gnu++11" }
+// { dg-do compile }
+
+// Copyright (C) 2013 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// 20.8.1.1 Default deleters [util.ptr.dltr]
+
+#include <memory>
+
+void test01()
+{
+  std::default_delete<void> d;
+  d(nullptr);   // { dg-error "here" }
+  // { dg-error "incomplete" "" { target *-*-* } 72 }
+}
diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc
index fd2a677..db3fcac 100644
--- a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc
@@ -32,7 +32,7 @@ void test01()
 {
   X* px = 0;
   std::shared_ptr<X> p1(px);   // { dg-error "here" }
-  // { dg-error "incomplete" "" { target *-*-* } 778 }
+  // { dg-error "incomplete" "" { target *-*-* } 779 }
 
   std::shared_ptr<X> p9(ap());  // { dg-error "here" }
   // { dg-error "incomplete" "" { target *-*-* } 307 }
diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/58839.cc b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/58839.cc
index 6ad2564..f78a07f 100644
--- a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/58839.cc
+++ b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/58839.cc
@@ -22,8 +22,12 @@
 
 // libstdc++/58839
 
+struct D {
+  void operator()(void*) const noexcept { }
+};
+
 void test01()
 {
-  std::unique_ptr<void> y;
+  std::unique_ptr<void, D> y;
   std::shared_ptr<void> x = std::move(y);
 }
diff --git a/libstdc++-v3/t

Re: [PATCH, MPX, 2/X] Pointers Checker [5/25] Tree and gimple ifaces

2013-10-30 Thread Jeff Law

On 10/30/13 12:12, Ilya Enkovich wrote:
>>> GIMPLE layout depending on flag_check_pointer_bounds sounds like
>>> a recipe for disaster if you consider TUs compiled with and TUs
>>> compiled without and LTO.  Or if you consider using optimized
>>> attribute with that flag.
>>
>> Sorry, I don't follow.  Can you elaborate please.
>
> I suppose the possible problem here is when we run LTO compiler
> without -fcheck-pointer-bounds and give instrumented code as input.
> gimple_call_nobnd_arg would work wrong for instrumented code.
> Actually there are other places in subsequent patches which assume
> that flag_check_pointer_bounds is 1 if we have instrumented code.

OK, I can see how that would be problematical.  I'm not entirely sure 
how you're going to avoid that problem with the argument passing scheme 
you've built.


At the least,  I think an error message would be appropriate if you 
encounter instrumented code and -fcheck-pointer-bounds isn't on.



Jeff



Re: Testsuite / Cilk Plus: Include library path in compile flags in gcc.dg/cilk-plus/cilk-plus.exp

2013-10-30 Thread Jeff Law

On 10/30/13 04:34, Tobias Burnus wrote:
> Without that patch, which I have copied from asan-dg.exp, I get tons of
> failures because "ld" cannot find libcilkrts.
>
> OK for committal?
>
> Tobias
>
> cilk.diff
>
> 2013-10-30  Tobias Burnus
>
> * gcc.dg/cilk-plus/cilk-plus.exp: Add the libcilkrts library
> path to the compile flags.

Yea, seems good to me.  g++.dg/cilk-plus/cilk-plus.exp may have the same 
problem (haven't looked).


jeff



Re: Testsuite / Cilk Plus: Include library path in compile flags in gcc.dg/cilk-plus/cilk-plus.exp

2013-10-30 Thread Jeff Law

On 10/30/13 13:27, Iyer, Balaji V wrote:

* * *

Actually, I was wondering whether -fcilkplus should always automatically link
libcilkrts - akin to -fopenmp which links libgomp. Currently, one has to 
specify it
manually.*


Yes, I would like that to happen. Do you have any pointers about how to do that?

LINK_SPEC and its friends.

jeff



Re: Testsuite / Cilk Plus: Include library path in compile flags in gcc.dg/cilk-plus/cilk-plus.exp

2013-10-30 Thread Tobias Burnus

Iyer, Balaji V wrote:

* * *

Actually, I was wondering whether -fcilkplus should always automatically link
libcilkrts - akin to -fopenmp which links libgomp. Currently, one has to 
specify it
manually.*

Yes, I would like that to happen. Do you have any pointers about how to do that?


Well, if you need some special libraries like librt, see:
libgomp/libgomp.spec.in
libgomp/configure.ac (search for "link_gomp").

Otherwise, simply have a look at gcc/gcc.c – search for "fopenmp" (which 
adds -pthread).


Tobias


Re: [C++ Patch] PR 58581

2013-10-30 Thread Jason Merrill

OK.

Jason


RE: Testsuite / Cilk Plus: Include library path in compile flags in gcc.dg/cilk-plus/cilk-plus.exp

2013-10-30 Thread Iyer, Balaji V
> * * *
> 
> Actually, I was wondering whether -fcilkplus should always automatically link
> libcilkrts - akin to -fopenmp which links libgomp. Currently, one has to 
> specify it
> manually.*

Yes, I would like that to happen. Do you have any pointers about how to do that?


Re: Testsuite / Cilk Plus: Include library path in compile flags in gcc.dg/cilk-plus/cilk-plus.exp

2013-10-30 Thread Tobias Burnus

Iyer, Balaji V wrote:
What I ideally wanted to do with my testsuite files was that I want 
all the Cilk keywords test to compile no matter what the architecture 
is, but it should only run in certain architectures where the runtime 
is enabled (this is known statically and thus the testsuite doesn't 
have to do anything to figure it out.). Can someone please tell me how 
do I do this?



Which is a bit orthogonal to my patch, which helps that the tests pass 
on systems which are supported. – Thus, let's start with the question 
what do you think of that patch?


On the other hand, one could use the existence of libcilkrts* as 
detected by the patch to decide whether to link or not: If the library 
is there, one can link – if not found, it is unlikely to work (unless it 
is, e.g. found in /usr/lib).


* * *

Regarding compile vs. run: I think one possibility would be to have no 
"dg-do" in the files and simply change the default to compile or run, 
depending whether the architecture is supported. On the other hand, that 
can be confusing as an explicit "dg-do run" will break it on some systems.


* * *

Actually, I was wondering whether -fcilkplus should always automatically 
link libcilkrts – akin to -fopenmp which links libgomp. Currently, one 
has to specify it manually.*


Or are the features which do not need libcilkplus common enough that one 
doesn't always want to link it?


[For OpenMP, GCC will have -fopenmp-simd [patch posted, 1], which 
doesn't link libgomp. I could imagine that one would like to have Cilk 
Plus' "#pragma simd" and the array syntax without enabling threads 
(cilk_sync, cilk_spawn, cilk_for, reducers – and thus libcilkrts linkage).]



Tobias

* libgomp is handled via spec files – including one which adds "librt" 
when libgomp is linked.


[1] -fopenmp-simd: http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02275.html


Re: [PATCH][RFC] fix reload causing ICE in subreg_get_info on m68k (PR58369)

2013-10-30 Thread Jeff Law

On 10/18/13 14:17, Mikael Pettersson wrote:

  > Thanks Mikael.  My only concern is the lack of adjustment when the value
  > found was already a SUBREG.
  >
  > ie, let's assume rld[r].in_reg was something like
  > (subreg:XF (reg:DF) 0)
  >
  > and our target is (reg:DF)
  >
  > In this case it seems to me we still want to compute the subreg offset,
  > right?
  >
  > jeff

Thanks Jeff.  Sorry about the delay, but it took me a while to work
out the details of the various cases (I was afraid of having to do
something like simplify_subreg, but the cases in reload are simpler),
and then to re-test on my various targets.
No worries.  This stuff is complex and taking the time to thoroughly 
analyze the cases and test is warranted and greatly appreciated.







Let the reloadee be rld[r].in_reg, outermode be its mode, and innermode
be the mode of the hard reg in last_reg.

The trivial cases are when the reloadee is a plain REG or a SUBREG of a
hard reg.  For these, reload virtually forms a normal lowpart subreg of
last_reg, and subreg_lowpart_offset (outermode, innermode) computes the
endian-correct offset for subreg_regno_offset.  This is exactly what my
previous patch did.

Right.




In remaining cases the reloadee is a SUBREG of a pseudo.  Let outer_offset
be its BYTE, and middlemode be the mode of its REG.

Another simple case is when the reloadee is paradoxical.  Then outer_offset
is zero (by convention), and reload should just form a normal lowpart
subreg as in the trivial cases.  Even though the reloadee is paradoxical,
this subreg will fit thanks to the mode size check on lines 6546-6547.

Agreed.





If the reloadee is a normal lowpart SUBREG, then again reload should just
form a normal lowpart subreg as in the trivial cases.  (But see below.)

The tricky case is when the reloadee is a normal but not lowpart SUBREG.
We get the correct offset for reload's virtual subreg by adding outer_offset
to the lowpart offset of middlemode and innermode.  This works for both
big-endian and little-endian.  It also handles normal lowpart SUBREGs,
so we don't need to check for lowpart vs non-lowpart normal SUBREGs.

Sounds right.
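
The case analysis above can be checked with a small worked model.  The sketch below is an assumption-laden simplification: modes are represented only by their size in bytes, word and byte endianness are assumed to agree, and ToySubreg, lowpart_offset and reload_subreg_offset are hypothetical names that merely mirror the shape of the patch, not GCC's own functions.

```cpp
#include <cassert>

// Toy stand-in for a SUBREG of a pseudo: its byte offset and the size
// in bytes of the mode of the register it wraps (the "middle" mode).
struct ToySubreg {
  int byte;
  int middle_size;
};

// Simplified lowpart offset: on a big-endian target the low part sits
// at the end of the wider value, on a little-endian target at offset 0.
int lowpart_offset(int outer_size, int inner_size, bool big_endian) {
  int difference = inner_size - outer_size;
  return (difference > 0 && big_endian) ? difference : 0;
}

// Mirrors the three cases described above: no subreg (plain lowpart),
// paradoxical subreg (plain lowpart), normal subreg (its own byte
// offset plus the lowpart offset of the middle mode in the inner mode).
int reload_subreg_offset(int outer_size, const ToySubreg *subreg,
                         int inner_size, bool big_endian) {
  if (!subreg)
    return lowpart_offset(outer_size, inner_size, big_endian);

  if (subreg->byte == 0 && outer_size > subreg->middle_size)
    return lowpart_offset(outer_size, inner_size, big_endian);

  return subreg->byte
         + lowpart_offset(subreg->middle_size, inner_size, big_endian);
}
```

With a 4-byte outer mode, an 8-byte middle mode and a 12-byte inner mode on a big-endian target, a lowpart subreg (byte 4) yields 4 + 4 = 8, the same as the plain lowpart offset of 4 bytes in 12, which is why lowpart and non-lowpart normal subregs need no separate handling.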




Tested with trunk and 4.8 on {m68k,sparc64,powerpc64,x86_64}-linux-gnu
and armv5tel-linux-gnueabi.  No regressions.

Ok for trunk?

gcc/

2013-10-18  Mikael Pettersson  

PR rtl-optimization/58369
* reload1.c (compute_reload_subreg_offset): Define.
(choose_reload_regs): Use it to pass endian-correct
offset to subreg_regno_offset.
I fixed a few trivial whitespace issues and added a testcase.  Installed 
onto the trunk on your behalf.


Thanks for your patience,

Jeff

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index ef40a43..e3d2abd 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2013-10-18  Mikael Pettersson  
+
+   PR rtl-optimization/58369
+   * reload1.c (compute_reload_subreg_offset): New function.
+   (choose_reload_regs): Use it to pass endian-correct
+   offset to subreg_regno_offset.
+
 2013-10-30  Tobias Burnus  
 
PR other/33426
diff --git a/gcc/reload1.c b/gcc/reload1.c
index d56c554..b62b047 100644
--- a/gcc/reload1.c
+++ b/gcc/reload1.c
@@ -6371,6 +6371,37 @@ replaced_subreg (rtx x)
 }
 #endif
 
+/* Compute the offset to pass to subreg_regno_offset, for a pseudo of
+   mode OUTERMODE that is available in a hard reg of mode INNERMODE.
+   SUBREG is non-NULL if the pseudo is a subreg whose reg is a pseudo,
+   otherwise it is NULL.  */
+
+static int
+compute_reload_subreg_offset (enum machine_mode outermode,
+ rtx subreg,
+ enum machine_mode innermode)
+{
+  int outer_offset;
+  enum machine_mode middlemode;
+
+  if (!subreg)
+return subreg_lowpart_offset (outermode, innermode);
+
+  outer_offset = SUBREG_BYTE (subreg);
+  middlemode = GET_MODE (SUBREG_REG (subreg));
+
+  /* If SUBREG is paradoxical then return the normal lowpart offset
+ for OUTERMODE and INNERMODE.  Our caller has already checked
+ that OUTERMODE fits in INNERMODE.  */
+  if (outer_offset == 0
+  && GET_MODE_SIZE (outermode) > GET_MODE_SIZE (middlemode))
+return subreg_lowpart_offset (outermode, innermode);
+
+  /* SUBREG is normal, but may not be lowpart; return OUTER_OFFSET
+ plus the normal lowpart offset for MIDDLEMODE and INNERMODE.  */
+  return outer_offset + subreg_lowpart_offset (middlemode, innermode);
+}
+
 /* Assign hard reg targets for the pseudo-registers we must reload
into hard regs for this insn.
Also output the instructions to copy them in and out of the hard regs.
@@ -6508,6 +6539,7 @@ choose_reload_regs (struct insn_chain *chain)
  int byte = 0;
  int regno = -1;
  enum machine_mode mode = VOIDmode;
+ rtx subreg = NULL_RTX;
 
  if (rld[r].in == 0)
;
@@ -6528,7 +6560,10 @@ choose_reload_regs (struct insn_chain *chain)
  if (regno < FIRST

[C++ Patch] PR 58581

2013-10-30 Thread Paolo Carlini

Hi,

to resolve this simple ICE we only have to check the return value of 
mark_used, like we do in a number of other places. Tested x86_64-linux.


Thanks!
Paolo.

/
/cp
2013-10-30  Paolo Carlini  

PR c++/58581
* call.c (build_over_call): Check return value of mark_used.

/testsuite
2013-10-30  Paolo Carlini  

PR c++/58581
* g++.dg/cpp0x/deleted1.C: New.
Index: cp/call.c
===
--- cp/call.c   (revision 204219)
+++ cp/call.c   (working copy)
@@ -7112,8 +7112,9 @@ build_over_call (struct z_candidate *cand, int fla
mark_versions_used (fn);
 }
 
-  if (!already_used)
-mark_used (fn);
+  if (!already_used
+  && !mark_used (fn))
+return error_mark_node;
 
   if (DECL_VINDEX (fn) && (flags & LOOKUP_NONVIRTUAL) == 0
   /* Don't mess with virtual lookup in fold_non_dependent_expr; virtual
Index: testsuite/g++.dg/cpp0x/deleted1.C
===
--- testsuite/g++.dg/cpp0x/deleted1.C   (revision 0)
+++ testsuite/g++.dg/cpp0x/deleted1.C   (working copy)
@@ -0,0 +1,6 @@
+// PR c++/58581
+// { dg-do compile { target c++11 } }
+
+template<typename T> int foo(T) noexcept(T()) = delete;
+
+int i = foo(0);   // { dg-error "deleted" }


Re: [PATCH, MPX, 2/X] Pointers Checker [5/25] Tree and gimple ifaces

2013-10-30 Thread Ilya Enkovich
2013/10/30 Jeff Law :
> On 10/30/13 04:34, Ilya Enkovich wrote:
>>
>> On 30 Oct 10:26, Richard Biener wrote:
>>>
>>>
>>> Ick - you enlarge all return statements?  But you don't set the
>>> actual value? So why allocate it with 2 ops in the first place??
>>
>>
>> When return does not return bounds it has operand with zero value
>> similar to case when it does not return value. What is the difference
>> then?
>
> In general, when someone proposes a change in the size of tree, rtl or
> gimple nodes, it's a "yellow flag" that something may need further
> investigation.
>
> In this specific instance, I could trivially predict how that additional
> field would be used and a GIMPLE_RETURN isn't terribly important from a size
> standpoint, so I didn't call it out.
>
>
>
>> Returns instrumentation. We add new operand to return statement to
>> hold returned bounds and instrumentation pass is responsible to fill
>> this operand with correct bounds
>
> Exactly what I expected.
>
>
>>
>> Unfortunately patch has been already installed.  Should we uninstall
>> it?  If not, then here is patch for documentation.
>
> I think we're OK for now.  If Richi wants it out, he'll say so explicitly.
>
>
>
>>
>> Thanks, Ilya --
>>
>> gcc/
>>
>> 2013-10-30  Ilya Enkovich  
>>
>> * doc/gimple.texi (gimple_call_num_nobnd_args): New.
>> (gimple_call_nobnd_arg): New. (gimple_return_retbnd): New.
>> (gimple_return_set_retbnd): New. (gimple_call_get_nobnd_arg_index):
>> New.
>
> Can you also fixup the GIMPLE_RETURN documentation in gimple.texi.  It needs
> a minor update after these changes.

I could not find anything but accessors for GIMPLE_RETURN in
gimple.texi. And new accessors are in my doc patch already.

Ilya
>
> jeff
>


Re: C++ PATCH to deal with trivial but non-callable [cd]tors

2013-10-30 Thread Eric Botcazou
> > +/* Return whether DECL, a method of a C++ TYPE, is trivial, that is to say
> > +   doesn't do anything for the objects of TYPE.  */
> > +
> > +static bool
> > +is_trivial_method (const_tree decl, const_tree type)
> > +{
> > +  if (cpp_check (decl, IS_CONSTRUCTOR) && !TYPE_NEEDS_CONSTRUCTING (type))
> > +    return true;
> 
> This will tell you whether decl is a constructor for a type with some
> non-trivial constructor, but not whether decl itself is non-trivial.

I think that's good enough though, in practice we only need to eliminate 
constructors/destructors for POD types.  As soon as there is one non-trivial 
method, the game is essentially over.

> I think a good way to check for any non-trivial methods would be to
> check trivial_type_p in the front end and then see if there are any
> !DECL_ARTIFICIAL decls in TYPE_METHODS.

That sounds interesting indeed, thanks for the tip.  I was initially reluctant 
to call into the front-end because of side-effects, but the various predicates 
in tree.c seem fine in this respect.

-- 
Eric Botcazou


Re: [wide-int] Update main comment

2013-10-30 Thread Richard Sandiford
Kenneth Zadeck  writes:
> On 10/30/2013 07:01 AM, Richard Sandiford wrote:
>> Kenneth Zadeck  writes:
>>> On 10/29/2013 06:37 PM, Richard Sandiford wrote:
>>>> This patch tries to update the main wide_int comment to reflect the current
>>>> implementation.
>>>>
>>>> - bitsizetype is TImode on x86_64 and others, so I don't think it's
>>>> necessarily true that all offset_ints are signed.  (widest_int are
>>>> though.)
>>> i am wondering if this is too conservative an interpretation.  I
>>> believe that they are ti mode because that is the next thing after di
>>> mode and so they wanted to accommodate the 3 extra bits. Certainly there
>>> is no x86 that is able to address more than 64 bits.
>> Right, but my point is that it's a different case from widest_int.
>> It'd be just as valid to do bitsizetype arithmetic using wide_int
>> rather than offset_int, and those wide_ints would have precision 128,
>> just like the offset_ints.  And I wouldn't really say that those wide_ints
>> were fundamentally signed in any way.  Although the tree layer might "know"
>> that X upper bits of the bitsizetype are always signs, the tree-wide_int
>> interface treats them in the same way as any other 128-bit type.
>>
>> Maybe I'm just being pedantic, but I think offset_int would only be like
>> widest_int if bitsizetype had precision 67 or whatever.  Then we could
>> say that both offset_int and widest_int must be wider than any inputs,
>> meaning that there's at least one leading sign bit.
> this was of course what mike and i wanted, but we could not really 
> figure out how to pull it off.
> in particular, we could not find any existing reliable marker in the 
> targets to say what the width of the widest pointer on any 
> implementation.   We actually used the number 68 rather than 67 because 
> we assumed 64 for the widest pointer on any existing platform, 3 bits 
> for the bits and 1 bit for the sign.

Ah yeah, 68 would be better for signed types.
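
The 67 versus 68 figures are plain arithmetic. A minimal sketch, assuming 8-bit bytes and a 64-bit widest pointer as stated above; the helper name is made up for illustration and is not part of the wide-int code:

```c
/* Hypothetical helper, not part of GCC: precision needed to hold a bit
   offset on a target whose widest pointer has POINTER_BITS bits.  */
unsigned
bit_offset_precision (unsigned pointer_bits, int is_signed)
{
  /* Scaling a byte address by 8 bits per byte (an assumption) adds
     3 low-order bits.  */
  unsigned precision = pointer_bits + 3;

  /* A signed representation needs one further bit for the sign.  */
  if (is_signed)
    precision += 1;

  return precision;
}
```

With a 64-bit pointer this gives 67 for an unsigned offset and 68 once a sign bit is included.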

Is the patch OK while we still have 128-bit bitsizetypes though?
I agree the current comment would be right if we ever did switch
to sub-128 bitsizes.

Thanks,
Richard


Re: [PATCH] Vectorizing abs(char/short/int) on x86.

2013-10-30 Thread Uros Bizjak
On Wed, Oct 30, 2013 at 7:01 PM, Cong Hou  wrote:

>>> I found my problem: I put DONE outside of if not inside. You are
>>> right. I have updated my patch.
>>
>> OK, great that we put things in order ;)
>>
>> Does this patch need some extra middle-end functionality? I was not
>> able to vectorize char and short part of your patch.
>
>
> In the original patch, I converted abs() on short and char values to
> their own types by removing type casts. That is, originally char_val1
> = abs(char_val2) will be converted to char_val1 = (char) abs((int)
> char_val2) in the frontend, and I would like to convert it back to
> char_val1 = abs(char_val2). But after several discussions, it seems
> this conversion has some problems such as overflow concerns, and I
> thereby removed that part.
>
> Now you should still be able to vectorize abs(char) and abs(short) but
> with packing and unpacking. Later I will consider writing a pattern
> recognizer for abs(char) and abs(short), and then the expanders for
> abs(char)/abs(short) in this patch will be used during vectorization.

OK, this seems reasonable. We already have "unused" SSSE3 8/16 bit abs
pattern, so I think we can commit SSE2 expanders, even if they will be
unused for now. The proposed recognizer will benefit SSE2 as well as
existing SSSE3 patterns.
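
For reference, the three strategies the SSE2 expander relies on (shift/xor/subtract for 32-bit, signed max for 16-bit, unsigned min for 8-bit) can be checked one element at a time in scalar C. This is only a sketch of the identities, not the expander itself; the function names are invented here, and it assumes that right-shifting a negative value is an arithmetic shift and that narrowing conversions wrap, both of which are implementation-defined in C but hold for GCC:

```c
#include <stdint.h>

/* 32-bit: ((x >> 31) ^ x) - (x >> 31).  The subtraction is done in
   unsigned arithmetic so that INT32_MIN wraps instead of overflowing.  */
int32_t
abs32_shift_xor (int32_t x)
{
  int32_t sign = x >> 31;              /* 0 for x >= 0, -1 for x < 0 */
  return (int32_t) ((uint32_t) (x ^ sign) - (uint32_t) sign);
}

/* 16-bit: max (x, -x), the operation PMAXSW performs per element.  */
int16_t
abs16_smax (int16_t x)
{
  int16_t neg = (int16_t) (0u - (uint16_t) x);   /* wrapping negate */
  return x > neg ? x : neg;
}

/* 8-bit: min ((unsigned char) x, (unsigned char) -x), the operation
   PMINUB performs per element.  */
int8_t
abs8_umin (int8_t x)
{
  uint8_t ux = (uint8_t) x;
  uint8_t uneg = (uint8_t) (0u - ux);            /* wrapping negate */
  return (int8_t) (ux < uneg ? ux : uneg);
}
```

In each width the most negative value maps to itself, which is the usual wrapping behaviour of a vector abs.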

>> Regarding the testcase - please put it to gcc.target/i386/ directory.
>> There is nothing generic in the test, as confirmed by target-dependent
>> scan test. You will find plenty of examples in the mentioned
>> directory. I'd suggest to split the testcase in three files, and to
>> simplify it to something like the testcase with global variables I
>> used earlier.
>
>
> I have done it. The test case is split into three for s8/s16/s32 in
> gcc.target/i386.

OK.

The patch is OK for mainline, but please check formatting and
whitespace before the patch is committed.

Thanks,
Uros.


Re: [PATCH, MPX, 2/X] Pointers Checker [5/25] Tree and gimple ifaces

2013-10-30 Thread Ilya Enkovich
On 30 Oct 11:40, Jeff Law wrote:
> On 10/30/13 04:48, Richard Biener wrote:
> >foo (int * p, unsigned int size)
> >{
> >__bound_tmp.0;
> >   long unsigned int D.2239;
> >   long unsigned int _2;
> >   sizetype _6;
> >   int * _7;
> >
> >   :
> >   __bound_tmp.0_4 = __builtin_ia32_arg_bnd (p_3(D));
> >
> >   :
> >   _2 = (long unsigned int) size_1(D);
> >   __builtin_ia32_bndcl (__bound_tmp.0_4, p_3(D));
> >   _6 = _2 + 18446744073709551615;
> >   _7 = p_3(D) + _6;
> >   __builtin_ia32_bndcu (__bound_tmp.0_4, _7);
> >   access_and_store (p_3(D), __bound_tmp.0_4, size_1(D));
> >
> >so it seems there is now a mismatch between DECL_ARGUMENTS
> >and the GIMPLE call stmt arguments.  How (if) did you amend
> >the GIMPLE stmt verifier for this?
> Effectively the bounds are passed "on the side".
> 
> >
> >How does regular code deal with this which may expect matching
> >to DECL_ARGUMENTS?  In fact interleaving the additional
> >arguments sounds very error-prone for existing code - I'd have
> >appended all bound args at the end.  Also you unconditionally
> >claim all pointer arguments have a bound - that looks like bad
> >design as well.  Why didn't you add a flag to the relevant
> >PARM_DECL (and then, what do you do for indirect calls?).
> You can't actually interleave them -- that results in MPX and normal
> code not being able to interact.   Passing the bound at the end
> doesn't really work either -- varargs and the desire to pass some of
> the bounds around in bound registers.
> 
> 
> >
> >/* Return the number of arguments used by call statement GS
> >ignoring bound ones.  */
> >
> >static inline unsigned
> >gimple_call_num_nobnd_args (const_gimple gs)
> >{
> >   unsigned num_args = gimple_call_num_args (gs);
> >   unsigned res = num_args;
> >   for (unsigned n = 0; n < num_args; n++)
> > if (POINTER_BOUNDS_P (gimple_call_arg (gs, n)))
> >   res--;
> >   return res;
> >}
> >
> >the choice means that gimple_call_num_nobnd_args is not O(1).
> Yes, but I don't see that's terribly problematical.
> 
> 
> >
> >/* Return INDEX's call argument ignoring bound ones.  */
> >static inline tree
> >gimple_call_nobnd_arg (const_gimple gs, unsigned index)
> >{
> >   /* No bound args may exist if pointers checker is off.  */
> >   if (!flag_check_pointer_bounds)
> > return gimple_call_arg (gs, index);
> >   return gimple_call_arg (gs, gimple_call_get_nobnd_arg_index (gs, index));
> >}
> >
> >GIMPLE layout depending on flag_check_pointer_bounds sounds
> >like a recipe for disaster if you consider TUs compiled with and
> >TUs compiled without and LTO.  Or if you consider using
> >optimized attribute with that flag.
> Sorry, I don't follow.  Can you elaborate please.

I suppose the possible problem here is when we run the LTO compiler without 
-fcheck-pointer-bounds and give instrumented code as input. 
gimple_call_nobnd_arg would work wrong for instrumented code. Actually there 
are other places in subsequent patches which assume that 
flag_check_pointer_bounds is 1 if we have instrumented code. 
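
A toy model may make the hazard concrete. Everything below is a stand-in invented for illustration (struct toy_arg, the toy_* functions and the check_bounds parameter are not GCC interfaces); it only mimics the shape of the quoted accessors, with is_bound playing the role of POINTER_BOUNDS_P:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one call argument.  */
struct toy_arg
{
  int value;
  bool is_bound;
};

/* Count the arguments that are not bounds.  Linear in the number of
   arguments, like the quoted gimple_call_num_nobnd_args.  */
unsigned
toy_num_nobnd_args (const struct toy_arg *args, unsigned num_args)
{
  unsigned res = num_args;
  for (unsigned n = 0; n < num_args; n++)
    if (args[n].is_bound)
      res--;
  return res;
}

/* Return the INDEX-th argument that is not a bound, or NULL if there is
   none.  When CHECK_BOUNDS is false the call is assumed to carry no
   bound arguments and INDEX is used directly, mirroring the flag test
   in the quoted gimple_call_nobnd_arg.  */
const struct toy_arg *
toy_nobnd_arg (const struct toy_arg *args, unsigned num_args,
               unsigned index, bool check_bounds)
{
  if (!check_bounds)
    return index < num_args ? &args[index] : NULL;

  for (unsigned n = 0; n < num_args; n++)
    {
      if (args[n].is_bound)
        continue;
      if (index == 0)
        return &args[n];
      index--;
    }
  return NULL;
}
```

For an argument list shaped like (p, bounds-of-p, size), asking for non-bound argument 1 yields size when check_bounds is true, but yields the bounds argument when it is false. That is the mismatch an LTO link without the flag would run into on instrumented input.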

Ilya

> 
> >I hope the reviewers that approved the patch will work with you to
> >address the above issues.  I can't be everywhere.
> Obviously I will.
> 
> jeff
> 


Re: [PATCH] Vectorizing abs(char/short/int) on x86.

2013-10-30 Thread Cong Hou
Also, as the current expanders for abs() on 8/16-bit integers are not used
at all, should I comment them out temporarily for now? Later I can uncomment
them once I have finished the pattern recognizer.



thanks,
Cong


On Wed, Oct 30, 2013 at 10:22 AM, Uros Bizjak  wrote:
> On Wed, Oct 30, 2013 at 6:01 PM, Cong Hou  wrote:
>> I found my problem: I put DONE outside of if not inside. You are
>> right. I have updated my patch.
>
> OK, great that we put things in order ;)
>
> Does this patch need some extra middle-end functionality? I was not
> able to vectorize char and short part of your patch.
>
> Regarding the testcase - please put it to gcc.target/i386/ directory.
> There is nothing generic in the test, as confirmed by target-dependent
> scan test. You will find plenty of examples in the mentioned
> directory. I'd suggest to split the testcase in three files, and to
> simplify it to something like the testcase with global variables I
> used earlier.
>
> Modulo testcase, the patch is OK otherwise, but middle-end parts
> should be committed first.
>
> Thanks,
> Uros.


Re: [PATCH] Vectorizing abs(char/short/int) on x86.

2013-10-30 Thread Cong Hou
On Wed, Oct 30, 2013 at 10:22 AM, Uros Bizjak  wrote:
> On Wed, Oct 30, 2013 at 6:01 PM, Cong Hou  wrote:
>> I found my problem: I put DONE outside of if not inside. You are
>> right. I have updated my patch.
>
> OK, great that we put things in order ;)
>
> Does this patch need some extra middle-end functionality? I was not
> able to vectorize char and short part of your patch.


In the original patch, I converted abs() on short and char values to
their own types by removing type casts. That is, originally char_val1
= abs(char_val2) will be converted to char_val1 = (char) abs((int)
char_val2) in the frontend, and I would like to convert it back to
char_val1 = abs(char_val2). But after several discussions, it seems
this conversion has some problems such as overflow concerns, and I
thereby removed that part.
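
A small scalar sketch of the widening form described above, under the assumption that signed char is 8 bits wide. The function names are invented for illustration. It shows one likely reading of the overflow concern: the int computation can never overflow for a char operand, whereas the result for the most negative value does not fit back into the narrow type:

```c
#include <limits.h>
#include <stdlib.h>

/* The form the front end is described as producing for
   char_val1 = abs (char_val2): widen to int, take abs, narrow back.  */
signed char
abs_via_int (signed char x)
{
  return (signed char) abs ((int) x);
}

/* Nonzero when |x| is representable in signed char, i.e. when doing
   the abs directly in the narrow type could not overflow.  */
int
abs_fits_in_schar (signed char x)
{
  return abs ((int) x) <= SCHAR_MAX;
}
```

Only SCHAR_MIN fails the second check; for every other input the two forms agree.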

Now you should still be able to vectorize abs(char) and abs(short) but
with packing and unpacking. Later I will consider writing a pattern
recognizer for abs(char) and abs(short), and then the expanders for
abs(char)/abs(short) in this patch will be used during vectorization.


>
> Regarding the testcase - please put it to gcc.target/i386/ directory.
> There is nothing generic in the test, as confirmed by target-dependent
> scan test. You will find plenty of examples in the mentioned
> directory. I'd suggest to split the testcase in three files, and to
> simplify it to something like the testcase with global variables I
> used earlier.


I have done it. The test case is split into three for s8/s16/s32 in
gcc.target/i386.


Thank you!

Cong



diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8a38316..84c7ab5 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2013-10-22  Cong Hou  
+
+ PR target/58762
+ * config/i386/i386-protos.h (ix86_expand_sse2_abs): New function.
+ * config/i386/i386.c (ix86_expand_sse2_abs): New function.
+ * config/i386/sse.md: Add SSE2 support to abs (8/16/32-bit-int).
+
 2013-10-14  David Malcolm  

  * dumpfile.h (gcc::dump_manager): New class, to hold state
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 3ab2f3a..ca31224 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -238,6 +238,7 @@ extern void ix86_expand_mul_widen_evenodd (rtx,
rtx, rtx, bool, bool);
 extern void ix86_expand_mul_widen_hilo (rtx, rtx, rtx, bool, bool);
 extern void ix86_expand_sse2_mulv4si3 (rtx, rtx, rtx);
 extern void ix86_expand_sse2_mulvxdi3 (rtx, rtx, rtx);
+extern void ix86_expand_sse2_abs (rtx, rtx);

 /* In i386-c.c  */
 extern void ix86_target_macros (void);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 02cbbbd..71905fc 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -41696,6 +41696,53 @@ ix86_expand_sse2_mulvxdi3 (rtx op0, rtx op1, rtx op2)
gen_rtx_MULT (mode, op1, op2));
 }

+void
+ix86_expand_sse2_abs (rtx op0, rtx op1)
+{
+  enum machine_mode mode = GET_MODE (op0);
+  rtx tmp0, tmp1;
+
+  switch (mode)
+{
+  /* For 32-bit signed integer X, the best way to calculate the absolute
+ value of X is (((signed) X >> (W-1)) ^ X) - ((signed) X >> (W-1)).  */
+  case V4SImode:
+ tmp0 = expand_simple_binop (mode, ASHIFTRT, op1,
+GEN_INT (GET_MODE_BITSIZE
+ (GET_MODE_INNER (mode)) - 1),
+NULL, 0, OPTAB_DIRECT);
+ if (tmp0)
+  tmp1 = expand_simple_binop (mode, XOR, op1, tmp0,
+  NULL, 0, OPTAB_DIRECT);
+ if (tmp0 && tmp1)
+  expand_simple_binop (mode, MINUS, tmp1, tmp0,
+   op0, 0, OPTAB_DIRECT);
+ break;
+
+  /* For 16-bit signed integer X, the best way to calculate the absolute
+ value of X is max (X, -X), as SSE2 provides the PMAXSW insn.  */
+  case V8HImode:
+ tmp0 = expand_unop (mode, neg_optab, op1, NULL_RTX, 0);
+ if (tmp0)
+  expand_simple_binop (mode, SMAX, op1, tmp0, op0, 0,
+   OPTAB_DIRECT);
+ break;
+
+  /* For 8-bit signed integer X, the best way to calculate the absolute
+ value of X is min ((unsigned char) X, (unsigned char) (-X)),
+ as SSE2 provides the PMINUB insn.  */
+  case V16QImode:
+ tmp0 = expand_unop (mode, neg_optab, op1, NULL_RTX, 0);
+ if (tmp0)
+  expand_simple_binop (V16QImode, UMIN, op1, tmp0, op0, 0,
+   OPTAB_DIRECT);
+ break;
+
+  default:
+ break;
+}
+}
+
 /* Expand an insert into a vector register through pinsr insn.
Return true if successful.  */

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index c3f6c94..46e1df4 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -8721,7 +8721,7 @@
(set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)"))
(set_attr "mode" "DI")])

-(define_insn "abs<mode>2"
+(define_insn "*abs<mode>2"
   [(set (match_operand:VI124_AVX2_48_AVX512F 0 "register_operand" "=v")
  (abs:VI124_AVX2_48_AVX512F
   (match_operand:VI124_AVX2_48_AVX512F 1 "nonimmediate_operand" "vm")))]
@@ -8733,6 +8733,19 @@
(set_attr "prefix" "maybe_vex")
(set_attr "mode" "")])

+(define_expand "abs<mode>2"
+  [(set (match_operand:VI124_AVX2_48_A

Re: RFA: patch to fix PR58785 (an ARM LRA crash)

2013-10-30 Thread Richard Earnshaw (home)
On 30 Oct 2013, at 08:16, "Vladimir Makarov"  wrote:

> The following patch fixes:
> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58785
> 
> LRA chooses constraint 'm' for const_int operand.  It means that the 
> const_int should be placed in memory but it does not happen as preferred 
> reload class hook returns LO_REGS for class NO_REGS which is result of 
> LRA choosing 'm'.  I don't know why the reload pass needs such a value, but it 
> should return NO_REGS IMHO, as that results in far fewer reload insns.
> 
> Is this patch ok to commit to the trunk?
> 
> 2013-10-30  Vladimir Makarov  
> 
> PR target/58785
> * config/arm/arm.c (arm_preferred_reload_class): Don't return
> LO_REGS for NO_REGS for LRA.
> 
> 2013-10-30  Vladimir Makarov  
> 
> PR target/58785
> * gcc.target/arm/pr58785.c: New.
> 
> 

We've been suspicious of this hunk of code for a while now.  One reading of the 
manual suggests that p_r_c can only return a subset of rclass, not a different 
class.  On that basis, lo_regs as a result should only be returned when rclass 
is general_regs, even for traditional reload.
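
To illustrate the "subset of rclass" reading, here is a toy model with register classes as bit masks. The enum values and function names are invented for this sketch and are not the ARM port's definitions:

```c
/* Hypothetical register classes modelled as bit masks over a toy
   register file of 16 registers.  */
enum toy_reg_class
{
  TOY_NO_REGS = 0x0000,       /* empty set */
  TOY_LO_REGS = 0x00ff,       /* the low 8 registers */
  TOY_GENERAL_REGS = 0xffff   /* all 16 registers */
};

/* Nonzero if every register in SUB is also in SUPER.  */
int
toy_class_subset_p (enum toy_reg_class sub, enum toy_reg_class super)
{
  return ((unsigned) sub & ~(unsigned) super) == 0;
}

/* A hook that only ever narrows RCLASS: prefer the low registers when
   asked about the general ones, otherwise hand RCLASS back unchanged.
   In particular NO_REGS stays NO_REGS.  */
enum toy_reg_class
toy_preferred_reload_class (enum toy_reg_class rclass)
{
  if (rclass == TOY_GENERAL_REGS)
    return TOY_LO_REGS;
  return rclass;
}
```

Under this model, returning the low-register class for an empty rclass would violate the subset rule, since a non-empty set is never a subset of the empty one.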

R.

Re: [RFC PATCH] For TARGET_AVX use *mov<mode>_internal for misaligned loads

2013-10-30 Thread Uros Bizjak
On Wed, Oct 30, 2013 at 6:42 PM, Uros Bizjak  wrote:
> On Wed, Oct 30, 2013 at 12:11 PM, Jakub Jelinek  wrote:
>
>>> > Yesterday I've noticed that for AVX which allows unaligned operands in
>>> > AVX arithmetics instructions we still don't combine unaligned loads with 
>>> > the
>>> > AVX arithmetics instructions.  So say for -O2 -mavx -ftree-vectorize
>>>
>>> This is actually PR 47754 that fell below radar for some reason...
>>
>> Apparently yes.
>>
>>> > the patch attempts to avoid gen_lowpart on the non-MEM lhs of the 
>>> > unaligned
>>> > loads, which usually means combine will fail, by doing the load into a
>>> > temporary pseudo in that case and then doing a pseudo to pseudo move with
>>> > gen_lowpart on the rhs (which will be merged soon after into following
>>> > instructions).
>>>
>>> Is this similar to PR44141? There were similar problems with V4SFmode
>>> subregs, so combine was not able to merge load to the arithmetic insn.
>>
>> From the work on the vectorization last year I remember many cases where
>> subregs (even equal size) on the LHS of instructions prevented combiner or
>> other RTL optimizations from doing its job.  I believe I've changed some
>> easy places that did that completely unnecessarily, but certainly have not
>> gone through all the code to look for other places where this is done.
>>
>> Perhaps let's hack up a checking pass that will after expansion walk the
>> whole IL and complain about same sized subregs on the LHS of insns, then do
>> make check with it for a couple of ISAs (-msse2, -msse4, -mavx, -mavx2 e.g.)?
>>
>>> > I'll bootstrap/regtest this on x86_64-linux and i686-linux, unfortunately 
>>> > my
>>> > bootstrap/regtest server isn't AVX capable.
>>>
>>> I can bootstrap the patch later today on IvyBridge with
>>> --with-arch=core-avx-i --with-cpu=core-avx-i --with-fpmath=avx.
>>
>> That would be greatly appreciated, thanks.
>
> The bootstrap and regression test was OK for x86_64-linux-gnu {,-m32}.
>
> The failures in the attached report are either pre-existing or benign
> due to core-avx-i default to AVX.

I was referring to the *runtime* failures here.

> Please also mention PR44141 in the ChangeLog entry.

Oops, this should be PR47754.

Thanks,
Uros.


Re: [RFC PATCH] For TARGET_AVX use *mov<mode>_internal for misaligned loads

2013-10-30 Thread Uros Bizjak
On Wed, Oct 30, 2013 at 12:11 PM, Jakub Jelinek  wrote:

>> > Yesterday I've noticed that for AVX which allows unaligned operands in
>> > AVX arithmetics instructions we still don't combine unaligned loads with 
>> > the
>> > AVX arithmetics instructions.  So say for -O2 -mavx -ftree-vectorize
>>
>> This is actually PR 47754 that fell below radar for some reason...
>
> Apparently yes.
>
>> > the patch attempts to avoid gen_lowpart on the non-MEM lhs of the unaligned
>> > loads, which usually means combine will fail, by doing the load into a
>> > temporary pseudo in that case and then doing a pseudo to pseudo move with
>> > gen_lowpart on the rhs (which will be merged soon after into following
>> > instructions).
>>
>> Is this similar to PR44141? There were similar problems with V4SFmode
>> subregs, so combine was not able to merge load to the arithmetic insn.
>
> From the work on the vectorization last year I remember many cases where
> subregs (even equal size) on the LHS of instructions prevented combiner or
> other RTL optimizations from doing its job.  I believe I've changed some
> easy places that did that completely unnecessarily, but certainly have not
> gone through all the code to look for other places where this is done.
>
> Perhaps let's hack up a checking pass that will after expansion walk the
> whole IL and complain about same sized subregs on the LHS of insns, then do
> make check with it for a couple of ISAs (-msse2, -msse4, -mavx, -mavx2 e.g.)?
>
>> > I'll bootstrap/regtest this on x86_64-linux and i686-linux, unfortunately 
>> > my
>> > bootstrap/regtest server isn't AVX capable.
>>
>> I can bootstrap the patch later today on IvyBridge with
>> --with-arch=core-avx-i --with-cpu=core-avx-i --with-fpmath=avx.
>
> That would be greatly appreciated, thanks.

The bootstrap and regression test was OK for x86_64-linux-gnu {,-m32}.

The failures in the attached report are either pre-existing or benign
due to core-avx-i default to AVX.

Please also mention PR44141 in the ChangeLog entry.

Uros.


mail-report.log.gz
Description: GNU Zip compressed data


Re: [RFC PATCH] For TARGET_AVX use *mov<mode>_internal for misaligned loads

2013-10-30 Thread Jakub Jelinek
On Wed, Oct 30, 2013 at 09:17:04AM -0700, Richard Henderson wrote:
> On 10/30/2013 02:47 AM, Jakub Jelinek wrote:
> > 2013-10-30  Jakub Jelinek  
> > 
> > * config/i386/i386.c (ix86_avx256_split_vector_move_misalign): If
> > op1 is misaligned_operand, just use *mov<mode>_internal insn
> > rather than UNSPEC_LOADU load.
> > (ix86_expand_vector_move_misalign): Likewise (for TARGET_AVX only).
> > Avoid gen_lowpart on op0 if it isn't MEM.
> 
> Ok.

Testing revealed some testsuite failures, due to either trying to match
insn names in -dp dump or counting specific FMA insns, where with the
patch there are changes like:
-   vmovupd 0(%r13,%rax), %ymm0
-   vfmadd231pd %ymm1, %ymm2, %ymm0
+   vmovapd %ymm2, %ymm0
+   vfmadd213pd 0(%r13,%rax), %ymm1, %ymm0

So, here is updated patch with those testsuite changes and added PR line
to ChangeLog.  I'll wait for Uros' testresults.
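
For anyone wanting to look at this locally, a loop of roughly the following shape is the kind of code affected. It is a hypothetical example written for this note, not one of the testsuite files listed below, and which instructions a given compiler emits for it has not been checked here:

```c
/* c[i] += a[i] * b[i] with no alignment guarantee on any pointer, so a
   vectorized version has to use unaligned accesses.  Compiling with
   -O2 -mavx -ftree-vectorize (plus -mfma for FMA) and reading the
   assembly should show whether the load is folded into the arithmetic
   instruction.  */
void
fma_loop (double *restrict c, const double *restrict a,
          const double *restrict b, int n)
{
  for (int i = 0; i < n; i++)
    c[i] += a[i] * b[i];
}
```

Calling it on pointers offset by one element from an aligned array is an easy way to exercise the misaligned case.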

2013-10-30  Jakub Jelinek  

PR target/47754
* config/i386/i386.c (ix86_avx256_split_vector_move_misalign): If
op1 is misaligned_operand, just use *mov<mode>_internal insn
rather than UNSPEC_LOADU load.
(ix86_expand_vector_move_misalign): Likewise (for TARGET_AVX only).
Avoid gen_lowpart on op0 if it isn't MEM.

* gcc.target/i386/avx256-unaligned-load-1.c: Adjust scan-assembler
and scan-assembler-not regexps.
* gcc.target/i386/avx256-unaligned-load-2.c: Likewise.
* gcc.target/i386/avx256-unaligned-load-3.c: Likewise.
* gcc.target/i386/avx256-unaligned-load-4.c: Likewise.
* gcc.target/i386/l_fma_float_1.c: Expect vf{,n}m{add,sub}213*p*
instead of vf{,n}m{add,sub}231*p*.
* gcc.target/i386/l_fma_float_3.c: Likewise.
* gcc.target/i386/l_fma_double_1.c: Likewise.
* gcc.target/i386/l_fma_double_3.c: Likewise.

--- gcc/config/i386/i386.c.jj   2013-10-30 08:15:38.0 +0100
+++ gcc/config/i386/i386.c  2013-10-30 10:20:22.684708729 +0100
@@ -16560,6 +16560,12 @@ ix86_avx256_split_vector_move_misalign (
  r = gen_rtx_VEC_CONCAT (GET_MODE (op0), r, m);
  emit_move_insn (op0, r);
}
+  /* Normal *mov<mode>_internal pattern will handle
+unaligned loads just fine if misaligned_operand
+is true, and without the UNSPEC it can be combined
+with arithmetic instructions.  */
+  else if (misaligned_operand (op1, GET_MODE (op1)))
+   emit_insn (gen_rtx_SET (VOIDmode, op0, op1));
   else
emit_insn (load_unaligned (op0, op1));
 }
@@ -16634,7 +16640,7 @@ ix86_avx256_split_vector_move_misalign (
 void
 ix86_expand_vector_move_misalign (enum machine_mode mode, rtx operands[])
 {
-  rtx op0, op1, m;
+  rtx op0, op1, orig_op0 = NULL_RTX, m;
   rtx (*load_unaligned) (rtx, rtx);
   rtx (*store_unaligned) (rtx, rtx);
 
@@ -16647,7 +16653,16 @@ ix86_expand_vector_move_misalign (enum m
{
case MODE_VECTOR_INT:
case MODE_INT:
- op0 = gen_lowpart (V16SImode, op0);
+ if (GET_MODE (op0) != V16SImode)
+   {
+ if (!MEM_P (op0))
+   {
+ orig_op0 = op0;
+ op0 = gen_reg_rtx (V16SImode);
+   }
+ else
+   op0 = gen_lowpart (V16SImode, op0);
+   }
  op1 = gen_lowpart (V16SImode, op1);
  /* FALLTHRU */
 
@@ -16676,6 +16691,8 @@ ix86_expand_vector_move_misalign (enum m
emit_insn (store_unaligned (op0, op1));
  else
gcc_unreachable ();
+ if (orig_op0)
+   emit_move_insn (orig_op0, gen_lowpart (GET_MODE (orig_op0), op0));
  break;
 
default:
@@ -16692,12 +16709,23 @@ ix86_expand_vector_move_misalign (enum m
{
case MODE_VECTOR_INT:
case MODE_INT:
- op0 = gen_lowpart (V32QImode, op0);
+ if (GET_MODE (op0) != V32QImode)
+   {
+ if (!MEM_P (op0))
+   {
+ orig_op0 = op0;
+ op0 = gen_reg_rtx (V32QImode);
+   }
+ else
+   op0 = gen_lowpart (V32QImode, op0);
+   }
  op1 = gen_lowpart (V32QImode, op1);
  /* FALLTHRU */
 
case MODE_VECTOR_FLOAT:
  ix86_avx256_split_vector_move_misalign (op0, op1);
+ if (orig_op0)
+   emit_move_insn (orig_op0, gen_lowpart (GET_MODE (orig_op0), op0));
  break;
 
default:
@@ -16709,15 +16737,30 @@ ix86_expand_vector_move_misalign (enum m
 
   if (MEM_P (op1))
 {
+  /* Normal *mov<mode>_internal pattern will handle
+unaligned loads just fine if misaligned_operand
+is true, and without the UNSPEC it can be combined
+with arithmetic instructions.  */
+  if (TARGET_AVX
+ && (GET_MODE_CLASS (mode) == MODE_VECTOR_INT
+ || GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT)
+ && misaligned_operand (op1, GET_MODE (op1)))
+   emit_insn (gen_rtx_SET (VOIDmode, op0

Re: [PATCH, MPX, 2/X] Pointers Checker [5/25] Tree and gimple ifaces

2013-10-30 Thread Jeff Law

On 10/30/13 04:48, Richard Biener wrote:

> foo (int * p, unsigned int size)
> {
> __bound_tmp.0;
>    long unsigned int D.2239;
>    long unsigned int _2;
>    sizetype _6;
>    int * _7;
>
>    :
>    __bound_tmp.0_4 = __builtin_ia32_arg_bnd (p_3(D));
>
>    :
>    _2 = (long unsigned int) size_1(D);
>    __builtin_ia32_bndcl (__bound_tmp.0_4, p_3(D));
>    _6 = _2 + 18446744073709551615;
>    _7 = p_3(D) + _6;
>    __builtin_ia32_bndcu (__bound_tmp.0_4, _7);
>    access_and_store (p_3(D), __bound_tmp.0_4, size_1(D));
>
> so it seems there is now a mismatch between DECL_ARGUMENTS
> and the GIMPLE call stmt arguments.  How (if) did you amend
> the GIMPLE stmt verifier for this?

Effectively the bounds are passed "on the side".

> How does regular code deal with this which may expect matching
> to DECL_ARGUMENTS?  In fact interleaving the additional
> arguments sounds very error-prone for existing code - I'd have
> appended all bound args at the end.  Also you unconditionally
> claim all pointer arguments have a bound - that looks like bad
> design as well.  Why didn't you add a flag to the relevant
> PARM_DECL (and then, what do you do for indirect calls?).

You can't actually interleave them -- that results in MPX and normal
code not being able to interact.  Passing the bound at the end doesn't
really work either -- varargs and the desire to pass some of the bounds
around in bound registers.

> /* Return the number of arguments used by call statement GS
> ignoring bound ones.  */
>
> static inline unsigned
> gimple_call_num_nobnd_args (const_gimple gs)
> {
>    unsigned num_args = gimple_call_num_args (gs);
>    unsigned res = num_args;
>    for (unsigned n = 0; n < num_args; n++)
>      if (POINTER_BOUNDS_P (gimple_call_arg (gs, n)))
>        res--;
>    return res;
> }
>
> the choice means that gimple_call_num_nobnd_args is not O(1).

Yes, but I don't see that's terribly problematical.

> /* Return INDEX's call argument ignoring bound ones.  */
> static inline tree
> gimple_call_nobnd_arg (const_gimple gs, unsigned index)
> {
>    /* No bound args may exist if pointers checker is off.  */
>    if (!flag_check_pointer_bounds)
>      return gimple_call_arg (gs, index);
>    return gimple_call_arg (gs, gimple_call_get_nobnd_arg_index (gs, index));
> }
>
> GIMPLE layout depending on flag_check_pointer_bounds sounds
> like a recipe for disaster if you consider TUs compiled with and
> TUs compiled without and LTO.  Or if you consider using
> optimized attribute with that flag.

Sorry, I don't follow.  Can you elaborate please.

> I hope the reviewers that approved the patch will work with you to
> address the above issues.  I can't be everywhere.

Obviously I will.

jeff



Re: [Patch, C, C++] Accept GCC ivdep for 'do' and 'while', and for C++11's range-based loops

2013-10-30 Thread Jason Merrill

OK.

Jason


Re: Testsuite / Cilk Plus: Include library path in compile flags in gcc.dg/cilk-plus/cilk-plus.exp

2013-10-30 Thread Jeff Law

On 10/30/13 11:16, Iyer, Balaji V wrote:




>> -----Original Message-----
>> From: Joseph Myers [mailto:jos...@codesourcery.com]
>> Sent: Wednesday, October 30, 2013 1:15 PM
>> To: Jeff Law
>> Cc: Iyer, Balaji V; Tobias Burnus; gcc patches
>> Subject: Re: Testsuite / Cilk Plus: Include library path in compile flags in
>> gcc.dg/cilk-plus/cilk-plus.exp
>>
>> On Wed, 30 Oct 2013, Jeff Law wrote:
>>
>>> /* { dg-do compile } */
>>> /* { dg-do run { target i?86-*-* x86_64-*-* } } */
>>
>> But with an effective-target keyword cilkplusrts or similar, rather than
>> hardcoding the same list of targets in lots of places, please.
>>
>
> Wow, I didn't know you could do that. Do you have an example where it is done
> that I can model after?

Look at testsuite/lib/target-supports.exp for check_*.  Once you see one or
two, you can search in testsuite/gcc.dg for examples.


jeff


Re: [PATCH] Vectorizing abs(char/short/int) on x86.

2013-10-30 Thread Uros Bizjak
On Wed, Oct 30, 2013 at 6:01 PM, Cong Hou  wrote:
> I found my problem: I put DONE outside of if not inside. You are
> right. I have updated my patch.

OK, great that we put things in order ;)

Does this patch need some extra middle-end functionality? I was not
able to vectorize char and short part of your patch.

Regarding the testcase - please put it to gcc.target/i386/ directory.
There is nothing generic in the test, as confirmed by target-dependent
scan test. You will find plenty of examples in the mentioned
directory. I'd suggest to split the testcase in three files, and to
simplify it to something like the testcase with global variables I
used earlier.

Modulo testcase, the patch is OK otherwise, but middle-end parts
should be committed first.

Thanks,
Uros.


C++ PATCH to C++1y VLA of 0 length

2013-10-30 Thread Jason Merrill
At the Chicago meeting the EWG agreed that we don't need to throw on 
0-length VLAs.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 1e328cbd26bfb641db8e218e4a4c32fc1a9a8d9d
Author: Jason Merrill 
Date:   Fri Oct 25 06:15:01 2013 -0400

	* decl.c (cp_finish_decl): Never throw for VLA bound == 0.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 1e92f2a..476d559 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6404,11 +6404,7 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
   /* If the VLA bound is larger than half the address space, or less
 	 than zero, throw std::bad_array_length.  */
   tree max = convert (ssizetype, TYPE_MAX_VALUE (TYPE_DOMAIN (type)));
-  /* C++1y says we should throw for length <= 0, but we have
-	 historically supported zero-length arrays.  Let's treat that as an
-	 extension to be disabled by -std=c++NN.  */
-  int lower = flag_iso ? 0 : -1;
-  tree comp = build2 (LT_EXPR, boolean_type_node, max, ssize_int (lower));
+  tree comp = build2 (LT_EXPR, boolean_type_node, max, ssize_int (-1));
   comp = build3 (COND_EXPR, void_type_node, comp,
 		 throw_bad_array_length (), void_zero_node);
   finish_expr_stmt (comp);
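
The effect of the one-line change above can be modelled in plain C. This is only a sketch under stated assumptions: the real code builds trees, whereas here the bound is an ordinary `long`, and the helper names (`throws_for_bound`, `throws_before_patch`, `throws_after_patch`) are made up for illustration. For a VLA of N elements the domain maximum is N - 1, and that value is compared against the threshold passed to ssize_int.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the comparison built in cp_finish_decl: the domain of an
   N-element array is [0, N-1], so the value tested is max = N - 1.
   LOWER plays the role of the argument to ssize_int.  */
static bool throws_for_bound(long n, long lower)
{
  long max = n - 1;
  return max < lower;
}

/* Before the patch the threshold depended on flag_iso.  */
bool throws_before_patch(long n, bool flag_iso)
{
  return throws_for_bound(n, flag_iso ? 0 : -1);
}

/* After the patch the threshold is always -1, so only negative bounds
   (max < -1) are rejected and a zero-length VLA is accepted.  */
bool throws_after_patch(long n)
{
  return throws_for_bound(n, -1);
}

/* Spot checks for the model.  */
void check_vla_sketch(void)
{
  assert(throws_before_patch(0, true));    /* strict mode used to throw */
  assert(!throws_before_patch(0, false));  /* extension mode did not */
  assert(!throws_after_patch(0));          /* now never throws for 0 */
  assert(throws_after_patch(-1));          /* negative bound still throws */
}
```

In this model a bound larger than half the address space shows up as a negative value after conversion to the signed type, which is why the single "less than" test covers both cases named in the comment.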


Re: Testsuite / Cilk Plus: Include library path in compile flags in gcc.dg/cilk-plus/cilk-plus.exp

2013-10-30 Thread Jeff Law

On 10/30/13 11:12, Iyer, Balaji V wrote:

-Original Message-
From: Jeff Law [mailto:l...@redhat.com]
Sent: Wednesday, October 30, 2013 1:08 PM
To: Iyer, Balaji V; Tobias Burnus; gcc patches
Subject: Re: Testsuite / Cilk Plus: Include library path in compile flags in
gcc.dg/cilk-plus/cilk-plus.exp

On 10/30/13 09:09, Iyer, Balaji V wrote:

Hello Everyone, What I ideally wanted to do with my testsuite files
was that I want all the Cilk keywords test to compile no matter what
the architecture is, but it should only run in certain architectures
where the runtime is enabled (this is known statically and thus the
testsuite doesn't have to do anything to figure it out.). Can someone
please tell me how do I do this?


I can't recall a similar situation off the top of my head.  Are you using the dg
framework?  Can you have multiple dg-do directives?

ie,
/* { dg-do compile } */
/* { dg-do run { target i?86-*-* x86_64-*-* } } */



Yes, I am using the dg-framework.

I tried that a couple of months back, and from what it looked like, it was replacing 
the compile command with the run command, and then on non-x86 architectures, it 
was just ignoring the file...

But, will try it again to see if it works.

Hmm, if that's the case, you may need distinct tests.  Judicious use of 
#include files would avoid unnecessary duplication.


jeff


[Patch] Fix canadian cross build on systems with no fenv.h

2013-10-30 Thread Steve Ellcey

I ran into a build problem while doing a canadian cross build of GCC.
I was building on linux to create a Windows (mingw) GCC that generates
code for mips-mti-elf.

The mips-mti-elf target uses newlib for its system headers and libraries
and the headers do not include a fenv.h header file.  However, when doing
a canadian cross build libstdc++ is built with the C++ compiler that runs
on linux (the build system), not the C++ that was just built for Windows
(the host system).  The problem is that the libstdc++ configure script
(in GLIBCXX_CHECK_C99_TR1) is checking for the fenv.h using C++ (not C)
and the C++ compiler running on linux does have an fenv.h header because
the latest libstdc++ builds always create this header for C++ regardless
of whether there is one on the system or not.  This results in the newly
created libstdc++ library having a fenv.h header that tries to include the
system fenv.h header file which does not exist and the build fails.

My fix for this is to explicitly check for fenv.h and complex.h in the
libstdc++ configure.ac script before calling GLIBCXX_CHECK_C99_TR1.
This makes the configure script check for these headers using the C
compiler instead of the C++ compiler and when GLIBCXX_CHECK_C99_TR1
is run it uses that information (saved in an autoconf variable) to 
correctly ascertain that fenv.h does not exist.

I could put these checks in GLIBCXX_CHECK_C99_TR1 if that were considered
preferable, it would just have to be done before we call 'AC_LANG_CPLUSPLUS'.

Tested with both my canadian cross build and a standard cross build
targeting mips-mti-elf.

OK for checkin?

Steve Ellcey
sell...@mips.com


2013-10-30  Steve Ellcey  

* configure.ac: Add header checks for fenv.h and complex.h.
* configure: Regenerate.


diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac
index dd13b01..22fc840 100644
--- a/libstdc++-v3/configure.ac
+++ b/libstdc++-v3/configure.ac
@@ -195,6 +195,12 @@ GLIBCXX_CHECK_S_ISREG_OR_S_IFREG
 AC_CHECK_HEADERS(sys/uio.h)
 GLIBCXX_CHECK_WRITEV
 
+# Check for fenv.h and complex.h before GLIBCXX_CHECK_C99_TR1
+# so that the check is done with the C compiler (not C++).
+# Checking with C++ can break a canadian cross build if either
+# file does not exist in C but does in C++.
+AC_CHECK_HEADERS(fenv.h complex.h)
+
 # For C99 support to TR1.
 GLIBCXX_CHECK_C99_TR1
 



RE: Testsuite / Cilk Plus: Include library path in compile flags in gcc.dg/cilk-plus/cilk-plus.exp

2013-10-30 Thread Iyer, Balaji V


> -Original Message-
> From: Joseph Myers [mailto:jos...@codesourcery.com]
> Sent: Wednesday, October 30, 2013 1:15 PM
> To: Jeff Law
> Cc: Iyer, Balaji V; Tobias Burnus; gcc patches
> Subject: Re: Testsuite / Cilk Plus: Include library path in compile flags in
> gcc.dg/cilk-plus/cilk-plus.exp
> 
> On Wed, 30 Oct 2013, Jeff Law wrote:
> 
> > /* { dg-do compile } */
> > /* { dg-do run { target i?86-*-* x86_64-*-* } } */
> 
> But with an effective-target keyword cilkplusrts or similar, rather than
> hardcoding the same list of targets in lots of places, please.
> 

Wow, I didn't know you could do that. Do you have an example where it is done 
that I can model after?

> --
> Joseph S. Myers
> jos...@codesourcery.com


Re: Testsuite / Cilk Plus: Include library path in compile flags in gcc.dg/cilk-plus/cilk-plus.exp

2013-10-30 Thread Joseph S. Myers
On Wed, 30 Oct 2013, Jeff Law wrote:

> /* { dg-do compile } */
> /* { dg-do run { target i?86-*-* x86_64-*-* } } */

But with an effective-target keyword cilkplusrts or similar, rather than 
hardcoding the same list of targets in lots of places, please.

-- 
Joseph S. Myers
jos...@codesourcery.com


RE: Testsuite / Cilk Plus: Include library path in compile flags in gcc.dg/cilk-plus/cilk-plus.exp

2013-10-30 Thread Iyer, Balaji V
> -Original Message-
> From: Jeff Law [mailto:l...@redhat.com]
> Sent: Wednesday, October 30, 2013 1:08 PM
> To: Iyer, Balaji V; Tobias Burnus; gcc patches
> Subject: Re: Testsuite / Cilk Plus: Include library path in compile flags in
> gcc.dg/cilk-plus/cilk-plus.exp
> 
> On 10/30/13 09:09, Iyer, Balaji V wrote:
> > Hello Everyone, What I ideally wanted to do with my testsuite files
> > was that I want all the Cilk keywords test to compile no matter what
> > the architecture is, but it should only run in certain architectures
> > where the runtime is enabled (this is known statically and thus the
> > testsuite doesn't have to do anything to figure it out.). Can someone
> > please tell me how do I do this?
> 
> I can't recall a similar situation off the top of my head.  Are you using the 
> dg
> framework?  Can you have multiple dg-do directives?
> 
> ie,
> /* { dg-do compile } */
> /* { dg-do run { target i?86-*-* x86_64-*-* } } */
> 

Yes, I am using the dg-framework.

I tried that a couple of months back, and from what it looked like, it was replacing 
the compile command with the run command, and then on non-x86 architectures, it 
was just ignoring the file...

But, will try it again to see if it works.

> ?
> 
> That'd be my best guess.
> 
> jeff



Re: Testsuite / Cilk Plus: Include library path in compile flags in gcc.dg/cilk-plus/cilk-plus.exp

2013-10-30 Thread Jeff Law

On 10/30/13 09:09, Iyer, Balaji V wrote:

Hello Everyone, What I ideally wanted to do with my testsuite files
was that I want all the Cilk keywords test to compile no matter what
the architecture is, but it should only run in certain architectures
where the runtime is enabled (this is known statically and thus the
testsuite doesn't have to do anything to figure it out.). Can someone
please tell me how do I do this?


I can't recall a similar situation off the top of my head.  Are you 
using the dg framework?  Can you have multiple dg-do directives?


ie,
/* { dg-do compile } */
/* { dg-do run { target i?86-*-* x86_64-*-* } } */

?

That'd be my best guess.

jeff



Re: [PATCH] Vectorizing abs(char/short/int) on x86.

2013-10-30 Thread Cong Hou
Forgot to attach the patch file.



thanks,
Cong


On Wed, Oct 30, 2013 at 10:01 AM, Cong Hou  wrote:
> I found my problem: I put DONE outside of if not inside. You are
> right. I have updated my patch.
>
> I appreciate your comment and test on it!
>
>
> thanks,
> Cong
>
>
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 8a38316..84c7ab5 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,10 @@
> +2013-10-22  Cong Hou  
> +
> + PR target/58762
> + * config/i386/i386-protos.h (ix86_expand_sse2_abs): New function.
> + * config/i386/i386.c (ix86_expand_sse2_abs): New function.
> + * config/i386/sse.md: Add SSE2 support to abs (8/16/32-bit-int).
> +
>  2013-10-14  David Malcolm  
>
>   * dumpfile.h (gcc::dump_manager): New class, to hold state
> diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
> index 3ab2f3a..ca31224 100644
> --- a/gcc/config/i386/i386-protos.h
> +++ b/gcc/config/i386/i386-protos.h
> @@ -238,6 +238,7 @@ extern void ix86_expand_mul_widen_evenodd (rtx,
> rtx, rtx, bool, bool);
>  extern void ix86_expand_mul_widen_hilo (rtx, rtx, rtx, bool, bool);
>  extern void ix86_expand_sse2_mulv4si3 (rtx, rtx, rtx);
>  extern void ix86_expand_sse2_mulvxdi3 (rtx, rtx, rtx);
> +extern void ix86_expand_sse2_abs (rtx, rtx);
>
>  /* In i386-c.c  */
>  extern void ix86_target_macros (void);
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 02cbbbd..71905fc 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -41696,6 +41696,53 @@ ix86_expand_sse2_mulvxdi3 (rtx op0, rtx op1, rtx op2)
> gen_rtx_MULT (mode, op1, op2));
>  }
>
> +void
> +ix86_expand_sse2_abs (rtx op0, rtx op1)
> +{
> +  enum machine_mode mode = GET_MODE (op0);
> +  rtx tmp0, tmp1;
> +
> +  switch (mode)
> +{
> +  /* For 32-bit signed integer X, the best way to calculate the absolute
> + value of X is (((signed) X >> (W-1)) ^ X) - ((signed) X >> (W-1)).  */
> +  case V4SImode:
> + tmp0 = expand_simple_binop (mode, ASHIFTRT, op1,
> +GEN_INT (GET_MODE_BITSIZE
> + (GET_MODE_INNER (mode)) - 1),
> +NULL, 0, OPTAB_DIRECT);
> + if (tmp0)
> +  tmp1 = expand_simple_binop (mode, XOR, op1, tmp0,
> +  NULL, 0, OPTAB_DIRECT);
> + if (tmp0 && tmp1)
> +  expand_simple_binop (mode, MINUS, tmp1, tmp0,
> +   op0, 0, OPTAB_DIRECT);
> + break;
> +
> +  /* For 16-bit signed integer X, the best way to calculate the absolute
> + value of X is max (X, -X), as SSE2 provides the PMAXSW insn.  */
> +  case V8HImode:
> + tmp0 = expand_unop (mode, neg_optab, op1, NULL_RTX, 0);
> + if (tmp0)
> +  expand_simple_binop (mode, SMAX, op1, tmp0, op0, 0,
> +   OPTAB_DIRECT);
> + break;
> +
> +  /* For 8-bit signed integer X, the best way to calculate the absolute
> + value of X is min ((unsigned char) X, (unsigned char) (-X)),
> + as SSE2 provides the PMINUB insn.  */
> +  case V16QImode:
> + tmp0 = expand_unop (mode, neg_optab, op1, NULL_RTX, 0);
> + if (tmp0)
> +  expand_simple_binop (V16QImode, UMIN, op1, tmp0, op0, 0,
> +   OPTAB_DIRECT);
> + break;
> +
> +  default:
> + break;
> +}
> +}
> +
>  /* Expand an insert into a vector register through pinsr insn.
> Return true if successful.  */
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index c3f6c94..46e1df4 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -8721,7 +8721,7 @@
> (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p 
> (insn)"))
> (set_attr "mode" "DI")])
>
> -(define_insn "abs<mode>2"
> +(define_insn "*abs<mode>2"
>[(set (match_operand:VI124_AVX2_48_AVX512F 0 "register_operand" "=v")
>   (abs:VI124_AVX2_48_AVX512F
>(match_operand:VI124_AVX2_48_AVX512F 1 "nonimmediate_operand" "vm")))]
> @@ -8733,6 +8733,19 @@
> (set_attr "prefix" "maybe_vex")
> (set_attr "mode" "")])
>
> +(define_expand "abs<mode>2"
> +  [(set (match_operand:VI124_AVX2_48_AVX512F 0 "register_operand")
> + (abs:VI124_AVX2_48_AVX512F
> +  (match_operand:VI124_AVX2_48_AVX512F 1 "nonimmediate_operand")))]
> +  "TARGET_SSE2"
> +{
> +  if (!TARGET_SSSE3)
> +{
> +  ix86_expand_sse2_abs (operands[0], operands[1]);
> +  DONE;
> +}
> +})
> +
>  (define_insn "abs<mode>2"
>[(set (match_operand:MMXMODEI 0 "register_operand" "=y")
>   (abs:MMXMODEI
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 075d071..cf5b942 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2013-10-22  Cong Hou  
> +
> + PR target/58762
> + * gcc.dg/vect/pr58762.c: New test.
> +
>  2013-10-14  Tobias Burnus  
>
>   PR fortran/58658
> diff --git a/gcc/testsuite/gcc.dg/vect/pr58762.c
> b/gcc/testsuite/gcc.dg/vect/pr58762.c
> new file mode 100644
> index 000..6468d0a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/pr58762.c
> @@ -0,0 +1,28 @@
> +/* { dg-require-effective-target vect_int } */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ftree-vectorize" } */
> +
> +void test1 (char* a, char

Re: [PATCH] Vectorizing abs(char/short/int) on x86.

2013-10-30 Thread Cong Hou
I found my problem: I put DONE outside of if not inside. You are
right. I have updated my patch.

I appreciate your comment and test on it!


thanks,
Cong



diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8a38316..84c7ab5 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2013-10-22  Cong Hou  
+
+ PR target/58762
+ * config/i386/i386-protos.h (ix86_expand_sse2_abs): New function.
+ * config/i386/i386.c (ix86_expand_sse2_abs): New function.
+ * config/i386/sse.md: Add SSE2 support to abs (8/16/32-bit-int).
+
 2013-10-14  David Malcolm  

  * dumpfile.h (gcc::dump_manager): New class, to hold state
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 3ab2f3a..ca31224 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -238,6 +238,7 @@ extern void ix86_expand_mul_widen_evenodd (rtx,
rtx, rtx, bool, bool);
 extern void ix86_expand_mul_widen_hilo (rtx, rtx, rtx, bool, bool);
 extern void ix86_expand_sse2_mulv4si3 (rtx, rtx, rtx);
 extern void ix86_expand_sse2_mulvxdi3 (rtx, rtx, rtx);
+extern void ix86_expand_sse2_abs (rtx, rtx);

 /* In i386-c.c  */
 extern void ix86_target_macros (void);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 02cbbbd..71905fc 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -41696,6 +41696,53 @@ ix86_expand_sse2_mulvxdi3 (rtx op0, rtx op1, rtx op2)
gen_rtx_MULT (mode, op1, op2));
 }

+void
+ix86_expand_sse2_abs (rtx op0, rtx op1)
+{
+  enum machine_mode mode = GET_MODE (op0);
+  rtx tmp0, tmp1;
+
+  switch (mode)
+{
+  /* For 32-bit signed integer X, the best way to calculate the absolute
+ value of X is (((signed) X >> (W-1)) ^ X) - ((signed) X >> (W-1)).  */
+  case V4SImode:
+ tmp0 = expand_simple_binop (mode, ASHIFTRT, op1,
+GEN_INT (GET_MODE_BITSIZE
+ (GET_MODE_INNER (mode)) - 1),
+NULL, 0, OPTAB_DIRECT);
+ if (tmp0)
+  tmp1 = expand_simple_binop (mode, XOR, op1, tmp0,
+  NULL, 0, OPTAB_DIRECT);
+ if (tmp0 && tmp1)
+  expand_simple_binop (mode, MINUS, tmp1, tmp0,
+   op0, 0, OPTAB_DIRECT);
+ break;
+
+  /* For 16-bit signed integer X, the best way to calculate the absolute
+ value of X is max (X, -X), as SSE2 provides the PMAXSW insn.  */
+  case V8HImode:
+ tmp0 = expand_unop (mode, neg_optab, op1, NULL_RTX, 0);
+ if (tmp0)
+  expand_simple_binop (mode, SMAX, op1, tmp0, op0, 0,
+   OPTAB_DIRECT);
+ break;
+
+  /* For 8-bit signed integer X, the best way to calculate the absolute
+ value of X is min ((unsigned char) X, (unsigned char) (-X)),
+ as SSE2 provides the PMINUB insn.  */
+  case V16QImode:
+ tmp0 = expand_unop (mode, neg_optab, op1, NULL_RTX, 0);
+ if (tmp0)
+  expand_simple_binop (V16QImode, UMIN, op1, tmp0, op0, 0,
+   OPTAB_DIRECT);
+ break;
+
+  default:
+ break;
+}
+}
+
 /* Expand an insert into a vector register through pinsr insn.
Return true if successful.  */

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index c3f6c94..46e1df4 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -8721,7 +8721,7 @@
(set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)"))
(set_attr "mode" "DI")])

-(define_insn "abs<mode>2"
+(define_insn "*abs<mode>2"
   [(set (match_operand:VI124_AVX2_48_AVX512F 0 "register_operand" "=v")
  (abs:VI124_AVX2_48_AVX512F
   (match_operand:VI124_AVX2_48_AVX512F 1 "nonimmediate_operand" "vm")))]
@@ -8733,6 +8733,19 @@
(set_attr "prefix" "maybe_vex")
(set_attr "mode" "")])

+(define_expand "abs<mode>2"
+  [(set (match_operand:VI124_AVX2_48_AVX512F 0 "register_operand")
+ (abs:VI124_AVX2_48_AVX512F
+  (match_operand:VI124_AVX2_48_AVX512F 1 "nonimmediate_operand")))]
+  "TARGET_SSE2"
+{
+  if (!TARGET_SSSE3)
+{
+  ix86_expand_sse2_abs (operands[0], operands[1]);
+  DONE;
+}
+})
+
 (define_insn "abs<mode>2"
   [(set (match_operand:MMXMODEI 0 "register_operand" "=y")
  (abs:MMXMODEI
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 075d071..cf5b942 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2013-10-22  Cong Hou  
+
+ PR target/58762
+ * gcc.dg/vect/pr58762.c: New test.
+
 2013-10-14  Tobias Burnus  

  PR fortran/58658
diff --git a/gcc/testsuite/gcc.dg/vect/pr58762.c
b/gcc/testsuite/gcc.dg/vect/pr58762.c
new file mode 100644
index 000..6468d0a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr58762.c
@@ -0,0 +1,28 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+void test1 (char* a, char* b)
+{
+  int i;
+  for (i = 0; i < 1; ++i)
+a[i] = abs (b[i]);
+}
+
+void test2 (short* a, short* b)
+{
+  int i;
+  for (i = 0; i < 1; ++i)
+a[i] = abs (b[i]);
+}
+
+void test3 (int* a, int* b)
+{
+  int i;
+  for (i = 0; i < 1; ++i)
+a[i] = abs (b[i]);
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect"
+   { target i?86-*-* x86_64-*-* } } } */
+/
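
The three strategies described in the comments of ix86_expand_sse2_abs can be checked on scalars. The following is a sketch, not the vector expansion itself: the function names are invented for illustration, and the sign mask is built with a comparison so the code does not depend on how a compiler shifts negative values. The most negative value of each type is left out of the checks because its absolute value is not representable.

```c
#include <assert.h>
#include <stdint.h>

/* V4SImode strategy: with s = x >> 31 (all ones for negative x, zero
   otherwise), |x| = (x ^ s) - s.  Done in uint32_t so the arithmetic
   wraps instead of overflowing.  */
int32_t abs32_shift_xor_sub(int32_t x)
{
  uint32_t s = x < 0 ? UINT32_MAX : 0u;
  uint32_t u = (uint32_t) x;
  return (int32_t) ((u ^ s) - s);
}

/* V8HImode strategy: |x| = max (x, -x), i.e. a negate followed by one
   signed maximum (PMAXSW).  -x is computed in int and then narrowed.  */
int16_t abs16_smax(int16_t x)
{
  int16_t neg = (int16_t) -x;
  return x > neg ? x : neg;
}

/* V16QImode strategy: view x and -x as unsigned bytes and take the
   minimum (PMINUB).  The non-negative one of the pair has the smaller
   unsigned encoding.  */
int8_t abs8_umin(int8_t x)
{
  uint8_t a = (uint8_t) x;
  uint8_t b = (uint8_t) (0u - a);
  return (int8_t) (a < b ? a : b);
}

/* Compare all three helpers with a reference computed in int.  */
void check_abs_sketch(void)
{
  for (int v = -127; v <= 127; ++v)
    {
      int want = v < 0 ? -v : v;
      assert(abs32_shift_xor_sub(v) == want);
      assert(abs16_smax((int16_t) v) == want);
      assert(abs8_umin((int8_t) v) == want);
    }
}
```

The 8-bit case is the least obvious one: for x = -5 the two bytes are 251 and 5, and the unsigned minimum picks 5.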

Re: [PATCH, MPX, 2/X] Pointers Checker [5/25] Tree and gimple ifaces

2013-10-30 Thread Jeff Law

On 10/30/13 04:34, Ilya Enkovich wrote:

On 30 Oct 10:26, Richard Biener wrote:


Ick - you enlarge all return statements?  But you don't set the
actual value? So why allocate it with 2 ops in the first place??


When return does not return bounds it has operand with zero value
similar to case when it does not return value. What is the difference
then?

In general, when someone proposes a change in the size of tree, rtl or 
gimple nodes, it's a "yellow flag" that something may need further 
investigation.


In this specific instance, I could trivially predict how that additional 
field would be used and a GIMPLE_RETURN isn't terribly important from a 
size standpoint, so I didn't call it out.




Returns instrumentation. We add new operand to return statement to
hold returned bounds and instrumentation pass is responsible to fill
this operand with correct bounds

Exactly what I expected.



Unfortunately patch has been already installed.  Should we uninstall
it?  If not, then here is patch for documentation.

I think we're OK for now.  If Richi wants it out, he'll say so explicitly.




Thanks, Ilya --

gcc/

2013-10-30  Ilya Enkovich  

* doc/gimple.texi (gimple_call_num_nobnd_args): New.
(gimple_call_nobnd_arg): New. (gimple_return_retbnd): New.
(gimple_return_set_retbnd): New. (gimple_call_get_nobnd_arg_index):
New.

Can you also fix up the GIMPLE_RETURN documentation in gimple.texi?  It 
needs a minor update after these changes.


jeff



Re: C++ PATCH to deal with trivial but non-callable [cd]tors

2013-10-30 Thread Jason Merrill

On 10/30/2013 06:14 AM, Eric Botcazou wrote:

+/* Return whether DECL, a method of a C++ TYPE, is trivial, that is to say
+   doesn't do anything for the objects of TYPE.  */
+
+static bool
+is_trivial_method (const_tree decl, const_tree type)
+{
+  if (cpp_check (decl, IS_CONSTRUCTOR) && !TYPE_NEEDS_CONSTRUCTING (type))
+return true;


This will tell you whether decl is a constructor for a type with some 
non-trivial constructor, but not whether decl itself is non-trivial.


I think a good way to check for any non-trivial methods would be to 
check trivial_type_p in the front end and then see if there are any 
!DECL_ARTIFICIAL decls in TYPE_METHODS.


Jason



Re: [RFC PATCH] For TARGET_AVX use *mov_internal for misaligned loads

2013-10-30 Thread Richard Henderson
On 10/30/2013 02:47 AM, Jakub Jelinek wrote:
> 2013-10-30  Jakub Jelinek  
> 
>   * config/i386/i386.c (ix86_avx256_split_vector_move_misalign): If
>   op1 is misaligned_operand, just use *mov_internal insn
>   rather than UNSPEC_LOADU load.
>   (ix86_expand_vector_move_misalign): Likewise (for TARGET_AVX only).
>   Avoid gen_lowpart on op0 if it isn't MEM.

Ok.


r~


Re: [RFA][PATCH] Minor fix to aliasing machinery

2013-10-30 Thread Marc Glisse

On Wed, 30 Oct 2013, Richard Biener wrote:


Btw, get_addr_base_and_unit_offset may also return an offsetted
MEM_REF (from &MEM [p_3, 17] for example).  As we are interested in
pointers this could be handled by not requiring a memory reference
but extracting the base address and offset, covering more cases.


I tried the attached patch, and it almost worked, except for one fortran 
testcase (widechar_intrinsics_10.f90):


! { dg-do run }
! { dg-options "-fbackslash" }

  implicit none
  character(kind=1,len=3) :: s1(3)

  s1 = [ "abc", "def", "ghi" ]

  if (any (cshift (s1, -1) /= [ s1(3), s1(1:2) ])) call abort

end


we end up with a double array_ref, one of variable index, and 
get_ref_base_and_extent signals that by returning as size the size of the 
whole declaration (the double-array). However, 
ao_ref_init_from_ptr_and_size happily ignores the last two parameters of 
get_ref_base_and_extent and continues as if nothing was wrong (I don't 
know how to easily check that something went wrong either). Whether we 
think this is a good/bad patch for memcpy, we have a latent bug that the 
recent patches are making more visible.
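
Once both accesses are expressed as a base pointer plus an offset, the must-kill question discussed here reduces to an interval containment test. The sketch below is a toy model with made-up names (`toy_ref`, `toy_write_kills_ref`); it works in bytes rather than the bits used by ao_ref, and it treats max_size == -1 as "extent unknown", mirroring the ref->max_size == -1 test in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy access descriptor; offsets and sizes are in bytes.  */
struct toy_ref
{
  const void *base;
  long offset;
  long max_size;   /* -1 means the extent could not be determined */
};

/* Return true only if a memset-style write of LEN bytes starting at
   DEST_BASE + DEST_OFFSET is guaranteed to overwrite all of REF.  */
bool toy_write_kills_ref(const void *dest_base, long dest_offset, long len,
                         const struct toy_ref *ref)
{
  if (ref->max_size == -1 || len < 0)
    return false;              /* cannot constrain the access */
  if (dest_base != ref->base)
    return false;              /* different bases prove nothing */
  return dest_offset <= ref->offset
         && ref->offset + ref->max_size <= dest_offset + len;
}

/* Mirrors alias-26.c, assuming an 8-byte long: stores at p[0] and p[4]
   followed by a 100-byte memset from p.  */
void check_kill_sketch(void)
{
  long buf[16];
  struct toy_ref first = { buf, 0, 8 };     /* *p = 42 */
  struct toy_ref fifth = { buf, 32, 8 };    /* p[4] = 42 */
  struct toy_ref late = { buf, 96, 8 };     /* would end at byte 104 */
  struct toy_ref unknown = { buf, 0, -1 };  /* variable index */

  assert(toy_write_kills_ref(buf, 0, 100, &first));
  assert(toy_write_kills_ref(buf, 0, 100, &fifth));
  assert(!toy_write_kills_ref(buf, 0, 100, &late));
  assert(!toy_write_kills_ref(buf, 0, 100, &unknown));
}
```

The last check corresponds to the Fortran case above: when the extent is not known exactly, the only safe answer is "not killed", which is why ignoring the size outputs of get_ref_base_and_extent is a problem.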


--
Marc Glisse
Index: testsuite/gcc.dg/tree-ssa/alias-26.c
===
--- testsuite/gcc.dg/tree-ssa/alias-26.c(revision 0)
+++ testsuite/gcc.dg/tree-ssa/alias-26.c(working copy)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized" } */
+
+void f (long *p) {
+  *p = 42;
+  p[4] = 42;
+  __builtin_memset (p, 0, 100);
+}
+
+/* { dg-final { scan-tree-dump-not "= 42" "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */

Property changes on: testsuite/gcc.dg/tree-ssa/alias-26.c
___
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Index: tree-ssa-alias.c
===
--- tree-ssa-alias.c(revision 204199)
+++ tree-ssa-alias.c(working copy)
@@ -1982,23 +1982,24 @@ stmt_may_clobber_ref_p (gimple stmt, tre
   return stmt_may_clobber_ref_p_1 (stmt, &r);
 }
 
 /* If STMT kills the memory reference REF return true, otherwise
return false.  */
 
 static bool
 stmt_kills_ref_p_1 (gimple stmt, ao_ref *ref)
 {
   /* For a must-alias check we need to be able to constrain
- the access properly.  */
-  ao_ref_base (ref);
-  if (ref->max_size == -1)
+ the access properly.
+ FIXME: except for BUILTIN_FREE.  */
+  if (!ao_ref_base (ref)
+  || ref->max_size == -1)
 return false;
 
   if (gimple_has_lhs (stmt)
   && TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME
   /* The assignment is not necessarily carried out if it can throw
 and we can catch it in the current function where we could inspect
 the previous value.
 ???  We only need to care about the RHS throwing.  For aggregate
 assignments or similar calls and non-call exceptions the LHS
 might throw as well.  */
@@ -2071,37 +2072,47 @@ stmt_kills_ref_p_1 (gimple stmt, ao_ref
  case BUILT_IN_MEMPCPY:
  case BUILT_IN_MEMMOVE:
  case BUILT_IN_MEMSET:
  case BUILT_IN_MEMCPY_CHK:
  case BUILT_IN_MEMPCPY_CHK:
  case BUILT_IN_MEMMOVE_CHK:
  case BUILT_IN_MEMSET_CHK:
{
  tree dest = gimple_call_arg (stmt, 0);
  tree len = gimple_call_arg (stmt, 2);
- tree base = NULL_TREE;
- HOST_WIDE_INT offset = 0;
+ tree rbase = ref->base;
+ HOST_WIDE_INT roffset = ref->offset;
  if (!host_integerp (len, 0))
return false;
- if (TREE_CODE (dest) == ADDR_EXPR)
-   base = get_addr_base_and_unit_offset (TREE_OPERAND (dest, 0),
- &offset);
- else if (TREE_CODE (dest) == SSA_NAME)
-   base = dest;
- if (base
- && base == ao_ref_base (ref))
+ ao_ref dref;
+ ao_ref_init_from_ptr_and_size (&dref, dest, len);
+ tree base = ao_ref_base (&dref);
+ HOST_WIDE_INT offset = dref.offset;
+ if (!base)
+   return false;
+ if (TREE_CODE (base) == MEM_REF)
+   {
+ if (TREE_CODE (rbase) != MEM_REF)
+   return false;
+ // Compare pointers.
+ offset += BITS_PER_UNIT
+   * TREE_INT_CST_LOW (TREE_OPERAND (base, 1));
+ roffset += BITS_PER_UNIT
+* TREE_INT_CST_LOW (TREE_OPERAND (rbase, 1));
+ base = TREE_OPERAND (base, 0);
+ rbase = TREE_OPERAND (rbase, 0);
+   }
+ if (base == rbase)
   

Re: [PATCH][ubsan] Add VLA bound instrumentation

2013-10-30 Thread Marek Polacek
On Wed, Oct 30, 2013 at 11:56:25AM -0400, Jason Merrill wrote:
> On 10/30/2013 10:52 AM, Marek Polacek wrote:
> >+ if ((flag_sanitize & SANITIZE_VLA)
> >+ && !processing_template_decl
> 
> You don't need to check processing_template_decl; the template case
> was already handled above.

Right, removed.
 
> >+ tree x = cp_save_expr (size);
> >+ x = build2 (COMPOUND_EXPR, TREE_TYPE (x),
> >+ ubsan_instrument_vla (input_location, x), x);
> >+ finish_expr_stmt (x);
> 
> Saving 'size' here doesn't help since it's already been used above.
> Could you use itype instead of size here?

I already experimented with that and I think I can't, since we call
the finish_expr_stmt too soon, which results in:

int x = 1;
int a[0:(sizetype) SAVE_EXPR ];
  
<>;
< <= 0)
  {   
__builtin___ubsan_handle_vla_bound_not_positive (&*.Lubsan_data0, 
(unsigned long) SAVE_EXPR );
  }   
else
  {   
0   
  }, (void) SAVE_EXPR ; >;
  ssizetype D.2143;
<;

and that ICEs in gimplify_var_or_parm_decl, presumably because the
if (SAVE_EXPR  <= 0) { ... } should be emitted *after* that
cleanup_point.  When we generated the C++1y check in cp_finish_decl,
we emitted the check after the cleanup_point, and everything was OK.
I admit I don't understand the cleanup_points very much and I don't
know exactly where they are coming from, because normally I don't see
them coming out of C FE. :)  Thanks.

Marek


Re: PATCH to use -Wno-format during stage1

2013-10-30 Thread Paolo Bonzini
Il 30/10/2013 16:47, Jason Merrill ha scritto:
> I find -Wformat warnings about unknown % codes when building with an
> older GCC to be mildly annoying noise; this patch avoids them by passing
> -Wno-format during stage 1.
> 
> Tested x86_64-pc-linux-gnu.  Is this OK for trunk?

Ok.



Re: [PATCH, MPX, 2/X] Pointers Checker [5/25] Tree and gimple ifaces

2013-10-30 Thread Jeff Law

On 10/30/13 03:26, Richard Biener wrote:

diff --git a/gcc/gimple.c b/gcc/gimple.c
index 3ddceb9..20f6010 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -174,7 +174,7 @@ gimple_build_with_ops_stat (enum gimple_code code, unsigned 
subcode,
  gimple
  gimple_build_return (tree retval)
  {
-  gimple s = gimple_build_with_ops (GIMPLE_RETURN, ERROR_MARK, 1);
+  gimple s = gimple_build_with_ops (GIMPLE_RETURN, ERROR_MARK, 2);


Ick - you enlarge all return statements?  But you don't set the actual value?
So why allocate it with 2 ops in the first place??

[Seems I completely missed that MPX changes "gimple" and the design
document that was posted somewhere??]

I'd assumed that'd be in a follow-up patch and probably would be bounds 
on the return value or NULL if there were no bounds.


I'm not terribly concerned about the enlarging of the return statements.  On 
the other hand, if something enlarged a GIMPLE_ASSIGN, then I would have 
called it out for further discussion.





Bah.

Where is the update to gimple.texi and tree.texi?

Good catch.  Ilya, please send a patch to update the docs.





Re: [PATCH] Keep REG_INC note in subreg2 pass

2013-10-30 Thread Jeff Law

On 10/30/13 00:09, Zhenqiang Chen wrote:

On 30 October 2013 02:47, Jeff Law  wrote:

On 10/24/13 02:20, Zhenqiang Chen wrote:


Hi,

REG_INC note is lost in subreg2 pass when resolve_simple_move, which
might lead to wrong dependence for ira. e.g. In function
validate_equiv_mem of ira.c, it checks REG_INC note:

  for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
    if ((REG_NOTE_KIND (note) == REG_INC
         || REG_NOTE_KIND (note) == REG_DEAD)
        && REG_P (XEXP (note, 0))
        && reg_overlap_mentioned_p (XEXP (note, 0), memref))
      return 0;

Without REG_INC note, validate_equiv_mem will return a wrong result.

Refer to https://bugs.launchpad.net/gcc-linaro/+bug/1243022 for more
detail about a real case in kernel.

Bootstrap and no make check regression on X86-64 and ARM.

Is it OK for trunk and 4.8?

Thanks!
-Zhenqiang

ChangeLog:
2013-10-24  Zhenqiang Chen

  * lower-subreg.c (resolve_simple_move): Copy REG_INC note.

testsuite/ChangeLog:
2013-10-24  Zhenqiang Chen

  * gcc.target/arm/lp1243022.c: New test.


This clearly handles adding a note when the destination is a MEM with a side
effect.  What about cases where the side effect is associated with a load
from memory rather than a store to memory?


Yes. We should handle load from memory.




lp1243022.patch


diff --git a/gcc/lower-subreg.c b/gcc/lower-subreg.c
index 57b4b3c..e710fa5 100644
--- a/gcc/lower-subreg.c
+++ b/gcc/lower-subreg.c
@@ -1056,6 +1056,22 @@ resolve_simple_move (rtx set, rtx insn)
 mdest = simplify_gen_subreg (orig_mode, dest, GET_MODE (dest), 0);
 minsn = emit_move_insn (real_dest, mdest);

+#ifdef AUTO_INC_DEC
+  /* Copy the REG_INC notes.  */
+  if (MEM_P (real_dest) && !(resolve_reg_p (real_dest)
+|| resolve_subreg_p (real_dest)))
+   {
+ rtx note = find_reg_note (insn, REG_INC, NULL_RTX);
+ if (note)
+   {
+ if (!REG_NOTES (minsn))
+   REG_NOTES (minsn) = note;
+ else
+   add_reg_note (minsn, REG_INC, note);
+   }
+   }
+#endif


If MINSN does not have any notes, then this results in MINSN and INSN
sharing the note.  Note carefully that notes are chained (see implementation
of add_reg_note).  Thus the sharing would result in MINSN and INSN actually
sharing a chain of notes.  I'm pretty sure that's not what you intended.  I
think you need to always use add_reg_note.
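
The sharing problem described above can be reproduced with an ordinary linked list. This is a toy model only: `toy_note` and the `toy_*` functions are hypothetical stand-ins, not GCC's rtx representation. It assumes, as the paragraph says, that notes form a chain and that adding a note allocates a fresh node at the head of the chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum toy_kind { TOY_REG_INC, TOY_REG_DEAD };

/* Toy stand-in for one element of a REG_NOTES chain.  */
struct toy_note
{
  enum toy_kind kind;
  int reg;                 /* stands in for XEXP (note, 0) */
  struct toy_note *next;   /* stands in for XEXP (note, 1) */
};

/* Prepend a freshly allocated note, in the spirit of add_reg_note.  */
void toy_add_note(struct toy_note **chain, enum toy_kind kind, int reg)
{
  struct toy_note *n = malloc(sizeof *n);
  assert(n != NULL);
  n->kind = kind;
  n->reg = reg;
  n->next = *chain;
  *chain = n;
}

size_t toy_chain_length(const struct toy_note *chain)
{
  size_t len = 0;
  for (; chain != NULL; chain = chain->next)
    ++len;
  return len;
}

void toy_free_chain(struct toy_note *chain)
{
  while (chain != NULL)
    {
      struct toy_note *next = chain->next;
      free(chain);
      chain = next;
    }
}

/* Report how many notes the new insn ends up with when the found note
   is assigned directly (shared) and when only its payload is copied.  */
void toy_compare_strategies(size_t *shared_len, size_t *copied_len)
{
  struct toy_note *insn_notes = NULL;
  toy_add_note(&insn_notes, TOY_REG_DEAD, 7);
  toy_add_note(&insn_notes, TOY_REG_INC, 3);   /* INC(3) -> DEAD(7) */

  struct toy_note *found = insn_notes;   /* what a note lookup returns */

  struct toy_note *minsn_shared = found;  /* like REG_NOTES (minsn) = note */
  *shared_len = toy_chain_length(minsn_shared);

  struct toy_note *minsn_copied = NULL;   /* like add_reg_note with XEXP (note, 0) */
  toy_add_note(&minsn_copied, found->kind, found->reg);
  *copied_len = toy_chain_length(minsn_copied);

  toy_free_chain(minsn_copied);
  toy_free_chain(insn_notes);
}
```

In the shared case the new insn silently inherits the unrelated DEAD note as well, and any later edit to either chain affects both insns. Copying only the payload keeps the two chains independent.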


Yes. I should use add_reg_note.

Here is the updated patch:

diff --git a/gcc/lower-subreg.c b/gcc/lower-subreg.c
index ebf364f..16dfa62 100644
--- a/gcc/lower-subreg.c
+++ b/gcc/lower-subreg.c
@@ -967,7 +967,20 @@ resolve_simple_move (rtx set, rtx insn)
rtx reg;

reg = gen_reg_rtx (orig_mode);
+
+#ifdef AUTO_INC_DEC
+  {
+   rtx move = emit_move_insn (reg, src);
+   if (MEM_P (src))
+ {
+   rtx note = find_reg_note (insn, REG_INC, NULL_RTX);
+   if (note)
+ add_reg_note (move, REG_INC, XEXP (note, 0));
+ }
+  }
+#else
emit_move_insn (reg, src);
+#endif
src = reg;
  }

@@ -1057,6 +1070,16 @@ resolve_simple_move (rtx set, rtx insn)
 mdest = simplify_gen_subreg (orig_mode, dest, GET_MODE (dest), 0);
minsn = emit_move_insn (real_dest, mdest);

+#ifdef AUTO_INC_DEC
+  if (MEM_P (real_dest) && !(resolve_reg_p (real_dest)
+|| resolve_subreg_p (real_dest)))

Formatting nit.   This should be formatted as

  if (MEM_P (real_dest)
      && !(resolve_reg_p (real_dest) || resolve_subreg_p (real_dest)))

If that results in too long of a line, then it should wrap like this:

  if (MEM_P (real_dest)
      && !(resolve_reg_p (real_dest)
           || resolve_subreg_p (real_dest)))

OK with that change.  Please install on the trunk.  The 4.8 maintainers 
have the final call for the 4.8 release branch.


Thanks,
Jeff



Re: [PATCH v2 3/6] Split symtab_node declarations onto multiple lines

2013-10-30 Thread David Malcolm
On Tue, 2013-09-10 at 15:36 +0200, Jan Hubicka wrote:
> > Amongst other things, the rename_symtab.py script converts
> > "symtab_node" to "symtab_node *".
> > 
> > This will lead to broken code on declarations that declare
> > more than one variable (only the first would get a "*"), so split
> > up such declarations.
> > 
> > gcc/
> > * cgraphunit.c (analyze_functions): Split symtab_node
> > declarations onto multiple lines to make things easier
> > for rename_symtab.py.
> > 
> > * symtab.c (symtab_dissolve_same_comdat_group_list): Likewise.
> > (symtab_semantically_equivalent_p): Likewise.
> > 
> > gcc/lto
> > * lto-symtab.c (lto_symtab_merge_decls_2): Split symtab_node
> > declarations onto multiple lines to make things easier for
> > rename_symtab.py.
> > (lto_symtab_merge_decls_1): Likewise.
> > (lto_symtab_merge_symbols_1): Likewise.
> 
> OK
Thanks; committed to trunk as r204216.
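
For readers unfamiliar with the pitfall, a minimal sketch in plain C follows. It assumes that the old name was a typedef for a pointer type; struct symtab_node_base, symtab_node_ptr and both functions are illustrative names only, not the declarations touched by the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real symtab node structure.  */
struct symtab_node_base
{
  int order;
};

/* Old style (assumed): the typedef is itself a pointer type, so one
   declaration can introduce several pointers at once.  */
typedef struct symtab_node_base *symtab_node_ptr;

static int
sum_orders_old_style (symtab_node_ptr first_arg, symtab_node_ptr second_arg)
{
  symtab_node_ptr first = first_arg, second = second_arg;
  assert (first != NULL && second != NULL);
  return first->order + second->order;
}

/* New style: a textual "T" -> "T *" rewrite of "T first, second;" would
   yield "T *first, second;", where SECOND is a struct, not a pointer.
   Splitting into one declaration per line keeps both as pointers.  */
static int
sum_orders_split (struct symtab_node_base *first_arg,
                  struct symtab_node_base *second_arg)
{
  struct symtab_node_base *first = first_arg;
  struct symtab_node_base *second = second_arg;
  assert (first != NULL && second != NULL);
  return first->order + second->order;
}
```

Both functions compute the same result; only the second form keeps its meaning when the type name is mechanically replaced by the name followed by a star.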




Re: [RFA][PATCH] Minor fix to aliasing machinery

2013-10-30 Thread Jeff Law

On 10/30/13 03:34, Richard Biener wrote:


 * tree-ssa-alias.c (stmt_kills_ref_p_1): Handle case where
 ao_ref_base returns a MEM_REF.

 * gcc.dg/tree-ssa/alias-26.c: New test.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/alias-26.c
b/gcc/testsuite/gcc.dg/tree-ssa/alias-26.c
new file mode 100644
index 000..b5625b8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/alias-26.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized" } */
+
+void f (long *p) {
+  *p = 42;
+  p[4] = 42;
+  __builtin_memset (p, 0, 100);
+}
+
+/* { dg-final { scan-tree-dump-not "= 42" "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
+
diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index 4db83bd..5120e72 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -2079,6 +2079,7 @@ stmt_kills_ref_p_1 (gimple stmt, ao_ref *ref)
   tree dest = gimple_call_arg (stmt, 0);
   tree len = gimple_call_arg (stmt, 2);
   tree base = NULL_TREE;
+ tree ref_base;
   HOST_WIDE_INT offset = 0;
   if (!host_integerp (len, 0))
 return false;
@@ -2087,8 +2088,11 @@ stmt_kills_ref_p_1 (gimple stmt, ao_ref *ref)
   &offset);
   else if (TREE_CODE (dest) == SSA_NAME)
 base = dest;
+ ref_base = ao_ref_base (ref);
   if (base
- && base == ao_ref_base (ref))
+ && ((TREE_CODE (ref_base) == MEM_REF
+  && base == TREE_OPERAND (ref_base, 0))


That's not sufficient - ref_base may have an offset, so for correctness
you have to check that integer_zerop (TREE_OPERAND (ref_base, 1)).
But this now looks convoluted and somewhat backward, and still
does not catch all cases (including the def-stmt lookup recently
added to ao_ref_from_ptr_and_size).

So how do you want to proceed?  I'm not really up for burning through
this code right now and trying to sort out how it ought to work.


Perhaps check in the test (xfailed) and wait for someone with the
interest and time to push this through to completion?


jeff
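
As a rough illustration of why the offset matters, here is a sketch in plain C of the condition under which a memset-style store covers a reference. struct simple_ref and memset_kills_ref_p are hypothetical and far simpler than GCC's ao_ref and stmt_kills_ref_p_1; the sketch also accepts any in-range offset, which is more general than the base-only comparison in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified memory reference: SIZE bytes starting
   OFFSET bytes past BASE.  Not GCC's ao_ref.  */
struct simple_ref
{
  const void *base;
  long offset;
  long size;
};

/* Return true only if a memset-like call writing LEN bytes at DEST is
   known to overwrite every byte of REF.  Returning false is always the
   safe answer, so a differing base pointer simply gives up.  */
static bool
memset_kills_ref_p (const void *dest, long len, const struct simple_ref *ref)
{
  assert (ref != NULL);
  if (len < 0 || ref->size < 0)
    return false;
  if (dest != ref->base)
    return false;
  /* REF must lie entirely inside [0, LEN); written so that the
     arithmetic cannot overflow.  */
  return ref->offset >= 0 && ref->offset <= len - ref->size;
}
```

With an 8-byte long, the two stores in the test case sit at byte offsets 0 and 32, both inside the 100 bytes that the memset writes, so both count as killed. A reference at offset 96 does not, because its last bytes fall outside the written range. Comparing base pointers alone cannot make that distinction.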



Re: [PATCH][ubsan] Add VLA bound instrumentation

2013-10-30 Thread Jason Merrill

On 10/30/2013 10:52 AM, Marek Polacek wrote:

+ if ((flag_sanitize & SANITIZE_VLA)
+ && !processing_template_decl


You don't need to check processing_template_decl; the template case was 
already handled above.



+ tree x = cp_save_expr (size);
+ x = build2 (COMPOUND_EXPR, TREE_TYPE (x),
+ ubsan_instrument_vla (input_location, x), x);
+ finish_expr_stmt (x);


Saving 'size' here doesn't help since it's already been used above. 
Could you use itype instead of size here?


Jason
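
For context, the user-visible property such instrumentation guards is that a VLA bound must be positive. The following plain C sketch spells that check out by hand; vla_bound_is_valid and sum_first are hypothetical names, and this is not the code that ubsan_instrument_vla generates.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: a VLA bound is acceptable only if positive.  */
static bool
vla_bound_is_valid (long n)
{
  return n > 0;
}

/* Sum 1..N using a VLA as scratch space, or return -1 when N is not a
   usable bound.  The bound is checked before the array is declared.  */
static long
sum_first (long n)
{
  if (!vla_bound_is_valid (n))
    return -1;

  /* From here on the bound is known to be positive.  */
  assert (n > 0);
  long buf[n];
  long total = 0;
  for (long i = 0; i < n; i++)
    {
      buf[i] = i + 1;
      total += buf[i];
    }
  return total;
}
```

The check sits before the declaration because evaluating a VLA declarator with a non-positive bound is undefined behavior; the instrumentation discussed in this thread has to be ordered the same way relative to the size expression.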



PATCH to use -Wno-format during stage1

2013-10-30 Thread Jason Merrill
I find -Wformat warnings about unknown % codes when building with an 
older GCC to be mildly annoying noise; this patch avoids them by passing 
-Wno-format during stage 1.


Tested x86_64-pc-linux-gnu.  Is this OK for trunk?  Do you have another 
theory of how this should work?


commit c40b06619fc9ef74e4d4d8b299a6c77c6fb63df5
Author: Jason Merrill 
Date:   Mon Oct 28 16:45:05 2013 -0400

/
	* Makefile.tpl (STAGE1_CONFIGURE_FLAGS): Pass
	--disable-build-format-warnings.
gcc/
	* configure.ac (loose_warn): Add -Wno-format if
	--disable-build-format-warnings.

diff --git a/Makefile.in b/Makefile.in
index 572b3d0..e0ba784 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -498,8 +498,10 @@ STAGE1_LANGUAGES = @stage1_languages@
 #   the last argument when conflicting --enable arguments are passed.
 # * Likewise, we force-disable coverage flags, since the installed
 #   compiler probably has never heard of them.
+# * We also disable -Wformat, since older GCCs don't understand newer %s.
 STAGE1_CONFIGURE_FLAGS = --disable-intermodule $(STAGE1_CHECKING) \
-	  --disable-coverage --enable-languages="$(STAGE1_LANGUAGES)"
+	  --disable-coverage --enable-languages="$(STAGE1_LANGUAGES)" \
+	  --disable-build-format-warnings
 
 STAGEprofile_CFLAGS = $(STAGE2_CFLAGS) -fprofile-generate
 STAGEprofile_TFLAGS = $(STAGE2_TFLAGS)
diff --git a/Makefile.tpl b/Makefile.tpl
index 3e187e1..65d070b 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -451,8 +451,10 @@ STAGE1_LANGUAGES = @stage1_languages@
 #   the last argument when conflicting --enable arguments are passed.
 # * Likewise, we force-disable coverage flags, since the installed
 #   compiler probably has never heard of them.
+# * We also disable -Wformat, since older GCCs don't understand newer %s.
 STAGE1_CONFIGURE_FLAGS = --disable-intermodule $(STAGE1_CHECKING) \
-	  --disable-coverage --enable-languages="$(STAGE1_LANGUAGES)"
+	  --disable-coverage --enable-languages="$(STAGE1_LANGUAGES)" \
+	  --disable-build-format-warnings
 
 STAGEprofile_CFLAGS = $(STAGE2_CFLAGS) -fprofile-generate
 STAGEprofile_TFLAGS = $(STAGE2_TFLAGS)
diff --git a/gcc/configure b/gcc/configure
index 1e7bcb6..ea91906 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -875,6 +875,7 @@ with_demangler_in_ld
 with_gnu_as
 with_as
 enable_largefile
+enable_build_format_warnings
 enable_werror_always
 enable_checking
 enable_coverage
@@ -1569,6 +1570,8 @@ Optional Features:
   for creating source tarballs for users without
   texinfo bison or flex
   --disable-largefile omit support for large files
+  --disable-build-format-warnings
+  don't use -Wformat while building GCC
   --enable-werror-always  enable -Werror despite compiler version
   --enable-checking[=LIST]
   enable expensive run-time checks. With LIST, enable
@@ -6270,9 +6273,22 @@ fi
 # * C++11 narrowing conversions in { }
 # So, we only use -pedantic if we can disable those warnings.
 
+# In stage 1, disable -Wformat warnings from old GCCs about new % codes
+# Check whether --enable-build-format-warnings was given.
+if test "${enable_build_format_warnings+set}" = set; then :
+  enableval=$enable_build_format_warnings;
+else
+  enable_build_format_warnings=yes
+fi
+
+if test $enable_build_format_warnings = no; then :
+  wf_opt=-Wno-format
+else
+  wf_opt=
+fi
 loose_warn=
 save_CFLAGS="$CFLAGS"
-for real_option in -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual; do
+for real_option in -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual $wf_opt; do
   # Do the check with the no- prefix removed since gcc silently
   # accepts any -Wno-* option on purpose
   case $real_option in
@@ -17897,7 +17913,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 17900 "configure"
+#line 17916 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -18003,7 +18019,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18006 "configure"
+#line 18022 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 5e686db..3d3b26b 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -326,8 +326,14 @@ GCC_STDINT_TYPES
 # * C++11 narrowing conversions in { }
 # So, we only use -pedantic if we can disable those warnings.
 
+# In stage 1, disable -Wformat warnings from old GCCs about new % codes
+AC_ARG_ENABLE(build-format-warnings,
+  AS_HELP_STRING([--disable-build-format-warnings],[don't use -Wformat while building GCC]),
+  [],[enable_build_format_warnings=yes])
+AS_IF([test $enable_build_format_warnings = no],
+  [wf_opt=-Wno-format],[wf_opt=])
 ACX_PROG_CC_WARNING_OPTS(
-	m4_quote(m4_do([-W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual])), [loose_warn])
+	m4_quote(m4_do([-W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual 

Re: [PATCH] Do not append " *INTERNAL* " to the decl name

2013-10-30 Thread Jason Merrill

On 10/29/2013 01:37 PM, Dehao Chen wrote:

If we're actually emitting the name now, we need to give it a name different
from the complete constructor.  I suppose it makes sense to go with C4/D4 as
in the decloning patch,


Shall we do it in a separate patch? And I suppose binutils also needs
to be updated for C4/D4?


In the same patch, please.  And yes, the demangler will need to be updated.

Jason



