Re: [TREE-SSA-CCP] Issue warning when folding condition

2016-08-18 Thread Kugan Vivekanandarajah
On 19 August 2016 at 12:09, Kugan Vivekanandarajah
 wrote:
> The testcase pr33738.C for warning fails with early-vrp patch. The
> reason is, with early-vrp ccp2 is folding the comparison that used to
> be folded in simplify_stmt_for_jump_threading. Since early-vrp does
> not perform jump-threading is not optimized there.
>
> Attached patch adds this warning to tree-ssa-ccp.c. We might also run
> into some other similar issues in the future.

Sorry, I attached the wrong patch (with typo). Here is the correct one.

Thanks,
Kugan

>
> Bootstrapped and regression tested on x86_64-linux-gnu with no new 
> regressions.
>
> Is this OK for trunk?
>
> Thanks,
> Kugan
>
> gcc/ChangeLog:
>
> 2016-08-18  Kugan Vivekanandarajah  
>
> * tree-ssa-ccp.c (ccp_fold_stmt): If the comparison is being folded
> and the operand on the LHS is being compared against a constant
> value that is outside of type limit, issue warning.
From 942f0e331f7a92ac46cee8793aac07af74e541e6 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Thu, 18 Aug 2016 20:29:38 +1000
Subject: [PATCH 7/7] Add warn_type_limits to ccp

---
 gcc/tree-ssa-ccp.c | 41 +
 1 file changed, 41 insertions(+)

diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index 5d5386e..fce76ce 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -142,6 +142,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "stor-layout.h"
 #include "optabs-query.h"
+#include "diagnostic-core.h"
+#include "intl.h"
 
 
 /* Possible lattice values.  */
@@ -2147,7 +2149,11 @@ ccp_fold_stmt (gimple_stmt_iterator *gsi)
 case GIMPLE_COND:
   {
 	gcond *cond_stmt = as_a  (stmt);
+	tree lhs = gimple_cond_lhs (stmt);
+	tree rhs = gimple_cond_rhs (stmt);
 	ccp_prop_value_t val;
+	wide_int min, max, rhs_val;
+	bool warn_limit = false;
 	/* Statement evaluation will handle type mismatches in constants
 	   more gracefully than the final propagation.  This allows us to
 	   fold more conditionals here.  */
@@ -2165,6 +2171,41 @@ ccp_fold_stmt (gimple_stmt_iterator *gsi)
 	fprintf (dump_file, "\n");
 	  }
 
+	/* If the comparison is being folded and the operand on the LHS
+	   is being compared against a constant value that is outside of
+	   the natural range of LHSs type, then the predicate will
+	   always fold regardless of the value of LHS.  If -Wtype-limits
+	   was specified, emit a warning.  */
+	if (warn_type_limits
+	&& INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+	&& (rhs = get_constant_value (rhs))
+	&& (TREE_CODE (rhs) == INTEGER_CST))
+	  {
+	rhs_val = rhs;
+	min = TYPE_MIN_VALUE (TREE_TYPE (lhs));
+	max = TYPE_MAX_VALUE (TREE_TYPE (lhs));
+	warn_limit = true;
+	  }
+
+	if (warn_limit
+	&& (wi::cmp (rhs_val, min, TYPE_SIGN (TREE_TYPE (lhs))) == -1
+		|| wi::cmp (max, rhs_val, TYPE_SIGN (TREE_TYPE (lhs))) == -1))
+	  {
+	location_t location;
+
+	if (!gimple_has_location (stmt))
+	  location = input_location;
+	else
+	  location = gimple_location (stmt);
+
+	warning_at (location, OPT_Wtype_limits,
+			integer_zerop (val.value)
+			? G_("comparison always false "
+			 "due to limited range of data type")
+			: G_("comparison always true "
+			 "due to limited range of data type"));
+	  }
+
 	if (integer_zerop (val.value))
 	  gimple_cond_make_false (cond_stmt);
 	else
-- 
2.7.4



[TREE-SSA-CCP] Issue warning when folding condition

2016-08-18 Thread Kugan Vivekanandarajah
The testcase pr33738.C for warning fails with early-vrp patch. The
reason is, with early-vrp ccp2 is folding the comparison that used to
be folded in simplify_stmt_for_jump_threading. Since early-vrp does
not perform jump-threading is not optimized there.

Attached patch adds this warning to tree-ssa-ccp.c. We might also run
into some other similar issues in the future.

Bootstrapped and regression tested on x86_64-linux-gnu with no new regressions.

Is this OK for trunk?

Thanks,
Kugan

gcc/ChangeLog:

2016-08-18  Kugan Vivekanandarajah  

* tree-ssa-ccp.c (ccp_fold_stmt): If the comparison is being folded
and the operand on the LHS is being compared against a constant
value that is outside of type limit, issue warning.
From 62152b5499233c2136dc388fd0322bab999f0cb4 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Thu, 18 Aug 2016 20:29:38 +1000
Subject: [PATCH 7/7] Add warn_type_limits to ccp

---
 gcc/tree-ssa-ccp.c | 41 +
 1 file changed, 41 insertions(+)

diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index 5d5386e..3c32045 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -142,6 +142,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "stor-layout.h"
 #include "optabs-query.h"
+#include "diagnostic-core.h"
+#include "intl.h"
 
 
 /* Possible lattice values.  */
@@ -2147,7 +2149,11 @@ ccp_fold_stmt (gimple_stmt_iterator *gsi)
 case GIMPLE_COND:
   {
 	gcond *cond_stmt = as_a  (stmt);
+	tree lhs = gimple_cond_lhs (stmt);
+	tree rhs = gimple_cond_rhs (stmt);
 	ccp_prop_value_t val;
+	wide_int min, max, rhs_val;
+	bool warn_limit = false;
 	/* Statement evaluation will handle type mismatches in constants
 	   more gracefully than the final propagation.  This allows us to
 	   fold more conditionals here.  */
@@ -2165,6 +2171,41 @@ ccp_fold_stmt (gimple_stmt_iterator *gsi)
 	fprintf (dump_file, "\n");
 	  }
 
+	/* If the comparison is being folded and the operand on the LHS
+	   is being compared against a constant value that is outside of
+	   the natural range of LHSs type, then the predicate will
+	   always fold regardless of the value of LHS.  If -Wtype-limits
+	   was specified, emit a warning.  */
+	if (warn_type_limits
+	&& INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+	&& (rhs = get_constant_value (rhs))
+	&& (TREE-code (rhs) == INTEGER_CST))
+	  {
+	rhs_val = rhs;
+	min = TYPE_MIN_VALUE (TREE_TYPE (lhs));
+	max = TYPE_MAX_VALUE (TREE_TYPE (lhs));
+	warn_limit = true;
+	  }
+
+	if (warn_limit
+	&& (wi::cmp (rhs_val, min, TYPE_SIGN (TREE_TYPE (lhs))) == -1
+		|| wi::cmp (max, rhs_val, TYPE_SIGN (TREE_TYPE (lhs))) == -1))
+	  {
+	location_t location;
+
+	if (!gimple_has_location (stmt))
+	  location = input_location;
+	else
+	  location = gimple_location (stmt);
+
+	warning_at (location, OPT_Wtype_limits,
+			integer_zerop (val.value)
+			? G_("comparison always false "
+			 "due to limited range of data type")
+			: G_("comparison always true "
+			 "due to limited range of data type"));
+	  }
+
 	if (integer_zerop (val.value))
 	  gimple_cond_make_false (cond_stmt);
 	else
-- 
2.7.4



Re: [PATCH] Restrict jump threading statement simplifier to scalar types (PR71077)

2016-08-18 Thread Patrick Palka
On Thu, 18 Aug 2016, Richard Biener wrote:

> On August 18, 2016 8:25:18 PM GMT+02:00, Patrick Palka  
> wrote:
> >In comment #5 Yuri reports that r235653 introduces a runtime failure
> >for
> >176.gcc which I guess is caused by the combining step in
> >simplify_control_stmt_condition_1() not behaving properly on operands
> >of
> >type VECTOR_TYPE.  I'm a bit stumped as to why it mishandles
> >VECTOR_TYPEs because the logic should be generic enough to support them
> >as well.  But it was confirmed that restricting the combining step to
> >operands of scalar type fixes the runtime failure so here is a patch
> >that does this.  Does this look OK to commit after bootstrap +
> >regtesting on x86_64-pc-linux-gnu?
> 
> Hum, I'd rather understand what is going wrong.  Can you at least isolate a 
> testcase?
> 
> Richard.

I don't have access to the SPEC benchmarks unfortunately.  Maybe Yuri
can isolate a test case?

But I think I found a theoretical bug which may or may not coincide with
the bug that Yuri is observing.  The part of the combining step that may
provide wrong results for VECTOR_TYPEs is the one that simplifies the
conditional (A & B) != 0 to true when given that A != 0 and B != 0 and
given that their TYPE_PRECISION is 1.

The TYPE_PRECISION test was intended to succeed only on scalars, but
IIUC it accidentally succeeds on one-dimensional vectors too.  So we may
be wrongly simplifying X & Y != <0> to true given that e.g.  X == <8>
and Y == <2>.  So this simplification should probably be restricted to
integral types like so:

diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index 170e456..b8c8b70 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -648,14 +648,17 @@ simplify_control_stmt_condition_1 (edge e,
  if (res1 != NULL_TREE && res2 != NULL_TREE)
{
  if (rhs_code == BIT_AND_EXPR
+ && INTEGRAL_TYPE_P (TREE_TYPE (op0))
  && TYPE_PRECISION (TREE_TYPE (op0)) == 1
  && integer_nonzerop (res1)
  && integer_nonzerop (res2))
-- 
2.9.3.650.g20ba99f

Hope this makes sense.

> 
> >gcc/ChangeLog:
> >
> > PR tree-optimization/71077
> > * tree-ssa-threadedge.c (simplify_control_stmt_condition_1):
> > Perform the combining step only if the operands have an integral
> > or a pointer type.
> >---
> > gcc/tree-ssa-threadedge.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> >diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
> >index 170e456..a97c00c 100644
> >--- a/gcc/tree-ssa-threadedge.c
> >+++ b/gcc/tree-ssa-threadedge.c
> >@@ -577,6 +577,9 @@ simplify_control_stmt_condition_1 (edge e,
> >   if (handle_dominating_asserts
> >   && (cond_code == EQ_EXPR || cond_code == NE_EXPR)
> >   && TREE_CODE (op0) == SSA_NAME
> >+  /* ??? Vector types are mishandled here.  */
> >+  && (INTEGRAL_TYPE_P (TREE_TYPE (op0))
> >+  || POINTER_TYPE_P (TREE_TYPE (op0)))
> >   && integer_zerop (op1))
> > {
> >   gimple *def_stmt = SSA_NAME_DEF_STMT (op0);
> 
> 
> 


Re: [PATCH] aarch64: Add split-stack initial support

2016-08-18 Thread Adhemerval Zanella


On 08/08/2016 07:58, Jiong Wang wrote:
> 
> Adhemerval Zanella writes:
> 
>>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>>> index e56398a..2cf239f 100644
>>> --- a/gcc/config/aarch64/aarch64.c
>>> +++ b/gcc/config/aarch64/aarch64.c
>>> @@ -3227,6 +3227,34 @@ aarch64_expand_prologue (void)
>>> +  emit_move_insn (x11, GEN_INT (hard_fp_offset));
>>> +  emit_insn (gen_add3_insn (x10, x29, x11));
>>> +  jump = gen_rtx_IF_THEN_ELSE (VOIDmode,
>>> +  gen_rtx_GEU (VOIDmode, cc_reg,
>>> +   const0_rtx),
>>> +  gen_rtx_LABEL_REF (VOIDmode, not_more),
>>> +  pc_rtx);
>>> +  jump = emit_jump_insn (gen_rtx_SET (pc_rtx, jump));
>>> +  JUMP_LABEL (jump) = not_more;
>>> +  LABEL_NUSES (not_more) += 1;
>>> +  emit_move_insn (x10, x28);
>>> +  emit_label (not_more);
>>> +}
>>>  }
> 
> This part needs rebase, there are major changes in AArch64 prologue code
> recently.
> 

Right, I see that 'hard_fp_offset' is not defined locally anymore.

>>>  
>>>  /* Return TRUE if we can use a simple_return insn.
>>> @@ -3303,6 +3331,7 @@ aarch64_expand_epilogue (bool for_sibcall)
>>>offset = offset - fp_offset;
>>>  }
>>>  
>>> +
> 
> Unncessary new line.
>

Ack.

 
>>> +
>>> +  /* Load __private_ss from TCB.  */
>>> +  ssvalue = gen_rtx_REG (Pmode, R9_REGNUM);
>>> +  emit_insn (gen_aarch64_load_tp_hard (ssvalue));
>>> +  mem = gen_rtx_MEM (Pmode, plus_constant (Pmode, ssvalue, psso));
>>> +  emit_move_insn (ssvalue, mem);
>>> +
>>> +  temp = gen_rtx_REG (Pmode, R10_REGNUM);
>>> +
>>> +  /* Always emit two insns to calculate the requested stack, so the linker
>>> + can edit them when adjusting size for calling non-split-stack code.  
>>> */
>>> +  ninsn = aarch64_internal_mov_immediate (temp, GEN_INT (-frame_size), 
>>> true,
>>> + Pmode);
>>> +  gcc_assert (ninsn == 1 || ninsn == 2);
>>> +  if (ninsn == 1)
>>> +emit_insn (gen_nop ());
> 
> there will be trouble to linker if the following add is scheduled before
> the nop?

I theory yes, although I haven't see gcc splitting it.  Which would be the
correct way to tie the nop generation to be emitted after the mov immediate?
 
>>> +
>>> +   # Set up for a call to the target function.
>>> +   #ldpx29, x30, [x28, STACKFRAME_BASE]
>>> +   ldr x30, [x28, STACKFRAME_BASE + 8]
>>> +   ldp x0, x1, [x28, STACKFRAME_BASE + 16]
>>> +   ldp x2, x3, [x28, STACKFRAME_BASE + 32]
>>> +   ldp x4, x5, [x28, STACKFRAME_BASE + 48]
>>> +   ldp x6, x7, [x28, STACKFRAME_BASE + 64]
>>> +   add x9, x30, 8
>>> +   cmp x30, x9
> 
> Can you explain why do we need this "cmp" before jumping to target
> function?

This is due the function prologue addition for var args handling:

  mov x11, 
  sub sp, sp, 
  add x10, x29, x11
  b.csfunction:
  mov x10, x28

If __morestack is called it will use the the 'b.cs' to setup the
correct var arg pointer.


Below it the last iteration patch, however I now seeing some similar issue
s390 hit when building libgo:

../../../gcc-git/libgo/go/syscall/socket_linux.go:90:1: error: flow control 
insn inside a basic block
(jump_insn 90 89 91 14 (set (pc)
(if_then_else (geu (reg:CC 66 cc)
(const_int 0 [0]))
(label_ref 92)
(pc))) ../../../gcc-git/libgo/go/syscall/socket_linux.go:90 -1
 (nil)
 -> 92)
../../../gcc-git/libgo/go/syscall/socket_linux.go:90:1: internal compiler 
error: in rtl_verify_bb_insns, at cfgrtl.c:2658
0xac35af _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)

It shows only with -O2, which I think it due how the block is reorganized
internally and regarding the pseudo-return instruction inserted by split-stack.
I am still debugging the issue and how to proper fix it, so if you have any
advice I open to suggestions.




diff --git a/gcc/common/config/aarch64/aarch64-common.c 
b/gcc/common/config/aarch64/aarch64-common.c
index 08e7959..01c3239 100644
--- a/gcc/common/config/aarch64/aarch64-common.c
+++ b/gcc/common/config/aarch64/aarch64-common.c
@@ -106,6 +106,21 @@ aarch64_handle_option (struct gcc_options *opts,
 }
 }
 
+/* -fsplit-stack uses a TCB field available on glibc-2.25.  GLIBC also
+   exports symbol, __tcb_private_ss, to signal it has the field available
+   on TCB allocation.  This aims to prevent binaries linked against newer
+   GLIBC to run on non-supported ones.  */
+
+static bool
+aarch64_supports_split_stack (bool report ATTRIBUTE_UNUSED,
+ struct gcc_options *opts ATTRIBUTE_UNUSED)
+{
+  return true;
+}
+
+#undef TARGET_SUPPORTS_SPLIT_STACK
+#define TARGET_SUPPORTS_SPLIT_STACK aarch64_supports_split_stack
+
 struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER;
 
 /* An ISA extension in the co-processor and main 

Re: Repeated use of the OpenACC routine directive

2016-08-18 Thread Cesar Philippidis
On 08/16/2016 06:05 PM, Thomas Schwinge wrote:

> commit bffb0ee6c0a83b8c85cd919e1172086b51fdc452
> Author: tschwinge 
> Date:   Wed Aug 17 00:55:02 2016 +
> 
> Repeated use of the C/C++ OpenACC routine directive
> 
>   gcc/
>   * omp-low.c (verify_oacc_routine_clauses): Change formal
>   parameters.  Add checking if already marked as an accelerator
>   routine.  Adjust all users.
>   gcc/c/
>   * c-parser.c (c_finish_oacc_routine): Rework checking if already
>   marked as an accelerator routine.
>   gcc/cp/
>   * parser.c (cp_finalize_oacc_routine): Rework checking if already
>   marked as an accelerator routine.
>   gcc/testsuite/
>   * c-c++-common/goacc/oaccdevlow-routine.c: Update.

This test case is missing in both gomp4 and the git diff --stat below.

Cesar

>   * c-c++-common/goacc/routine-5.c: Likewise.
>   * c-c++-common/goacc/routine-level-of-parallelism-1.c: Likewise.
>   * c-c++-common/goacc/routine-level-of-parallelism-2.c: New file.
> 
>   gcc/testsuite/
>   * c-c++-common/goacc/routine-1.c: Update.
>   * c-c++-common/goacc/routine-2.c: Likewise.
>   * c-c++-common/goacc/routine-nohost-1.c: Likewise.
>   * g++.dg/goacc/routine-2.C: Likewise.
>   * c-c++-common/goacc/routine-nohost-2.c: New file.
> 
> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@239521 
> 138bc75d-0d04-0410-961f-82ee72b054a4
> ---
>  gcc/ChangeLog.gomp |   4 +
>  gcc/c/ChangeLog.gomp   |   3 +
>  gcc/c/c-parser.c   |  42 ++--
>  gcc/cp/ChangeLog.gomp  |   3 +
>  gcc/cp/parser.c|  51 +++--
>  gcc/omp-low.c  | 146 -
>  gcc/omp-low.h  |   2 +-
>  gcc/testsuite/ChangeLog.gomp   |  11 +
>  gcc/testsuite/c-c++-common/goacc/routine-1.c   |  51 -
>  gcc/testsuite/c-c++-common/goacc/routine-2.c   | 143 +
>  gcc/testsuite/c-c++-common/goacc/routine-5.c   |  46 +---
>  .../goacc/routine-level-of-parallelism-1.c | 233 
> ++---
>  .../goacc/routine-level-of-parallelism-2.c |  73 +++
>  .../c-c++-common/goacc/routine-nohost-1.c  |  20 ++
>  .../c-c++-common/goacc/routine-nohost-2.c  |  97 +
>  gcc/testsuite/g++.dg/goacc/routine-2.C |   9 +
>  16 files changed, 816 insertions(+), 118 deletions(-)



[PR fortran/60774] Patch

2016-08-18 Thread Steve Kargl
If no one objects, I would like to commit the
following patch on Saturday.

Note, Bud Davis found the location of the problem for
free-form code and proposed a fixed.  I've modified his
proposed patch and applied a similar change for fixed-form.

Previously, gfortran would issue a warning and reject a
statement label on a line by itself (leading to and ICE).
The patch will cause an error to be emitted and the invalid
line of code is accepted.

Also note, label_3.f90 and empty_label.f90 are essentially
identical code.  I intend to delete label_3.f90 when I 
commit, but here I've included the diff for label_3.f90.

2016-08-20  Steven G. Kargl  
Bud Davis  

* parse.c (next_free,next_fixed): Issue error for statement label
without a statement.

2016-08-20  Steven G. Kargl  
* gfortran.dg/empty_label.f: Adjust test for new error message.
* gfortran.dg/empty_label.f90: Ditto.
* gfortran.dg/empty_label_typedecl.f90: Ditto.
* gfortran.dg/label_3.f90: Ditto.
* gfortran.dg/warnings_are_errors_1.f90: Remove invlid statement label.

Index: gcc/fortran/parse.c
===
--- gcc/fortran/parse.c (revision 239543)
+++ gcc/fortran/parse.c (working copy)
@@ -1071,13 +1071,8 @@ next_free (void)
}
 
  if (gfc_match_eos () == MATCH_YES)
-   {
- gfc_warning_now (0, "Ignoring statement label in empty statement "
-  "at %L", _locus);
- gfc_free_st_label (gfc_statement_label);
- gfc_statement_label = NULL;
- return ST_NONE;
-   }
+   gfc_error_now ("Statement label without statement at %L",
+  _locus);
}
 }
   else if (c == '!')
@@ -1333,8 +1328,7 @@ next_fixed (void)
 
 blank_line:
   if (digit_flag)
-gfc_warning_now (0, "Ignoring statement label in empty statement at %L",
-_locus);
+gfc_error_now ("Statement label without statement at %L", _locus);
 
   gfc_current_locus.lb->truncated = 0;
   gfc_advance_line ();
Index: gcc/testsuite/gfortran.dg/empty_label.f
===
--- gcc/testsuite/gfortran.dg/empty_label.f (revision 239543)
+++ gcc/testsuite/gfortran.dg/empty_label.f (working copy)
@@ -1,7 +1,4 @@
 C { dg-do compile }
-C { dg-options "-Werror -fmax-errors=1" }
-100   ! { dg-error "empty statement" }
+100   ! { dg-error "Statement label without statement" }
   end
-subroutine foo ! Not checked ...
-end function ! ... but an error
-C { dg-excess-errors "warnings being treated as errors" }
+
Index: gcc/testsuite/gfortran.dg/empty_label.f90
===
--- gcc/testsuite/gfortran.dg/empty_label.f90   (revision 239543)
+++ gcc/testsuite/gfortran.dg/empty_label.f90   (working copy)
@@ -1,7 +1,3 @@
 ! { dg-do compile }
-! { dg-options "-Werror -fmax-errors=1" }
-100   ! { dg-error "empty statement" }
+100   ! { dg-error "Statement label without statement" }
 end
-subroutine foo ! Not checked ...
-end function ! ... but an error
-! { dg-excess-errors "warnings being treated as errors" }
Index: gcc/testsuite/gfortran.dg/empty_label_typedecl.f90
===
--- gcc/testsuite/gfortran.dg/empty_label_typedecl.f90  (revision 239543)
+++ gcc/testsuite/gfortran.dg/empty_label_typedecl.f90  (working copy)
@@ -1,8 +1,6 @@
 ! { dg-do compile }
-! { dg-options "-Werror" }
 subroutine s
   type t
-  1 ! { dg-error "empty statement" }
+  1 ! { dg-error "Statement label without statement" }
   end type
 end subroutine
-! { dg-excess-errors "warnings being treated as errors" }
Index: gcc/testsuite/gfortran.dg/label_3.f90
===
--- gcc/testsuite/gfortran.dg/label_3.f90   (revision 239543)
+++ gcc/testsuite/gfortran.dg/label_3.f90   (working copy)
@@ -1,5 +1,5 @@
 ! { dg-do compile }
 ! PR fortran/25756.
 ! This used to ICE due to the space after the label.
-1 ! { dg-warning "Ignoring statement label in empty statement" }
+1 ! { dg-error "Statement label without statement" }
 end
Index: gcc/testsuite/gfortran.dg/warnings_are_errors_1.f90
===
--- gcc/testsuite/gfortran.dg/warnings_are_errors_1.f90 (revision 239543)
+++ gcc/testsuite/gfortran.dg/warnings_are_errors_1.f90 (working copy)
@@ -20,8 +20,6 @@
 1234  complex :: cplx ! { dg-error "defined but cannot be used" }
   cplx = 20.
 
-! gfc_warning_now:
- 1 ! { dg-error "Ignoring statement label in empty statement" }
end
 ! { dg-final { output-exists-not } }
 ! { dg-excess-errors "warnings being treated as errors" }

-- 
Steve


Re: [Patch] Implement std::experimental::variant

2016-08-18 Thread Tim Shen
Tested on x86_64-linux-gnu and checked in as r239590.

Thanks!


-- 
Regards,
Tim Shen


Re: [PATCH] Add source information to -fverbose-asm

2016-08-18 Thread Jeff Law

On 08/12/2016 01:13 PM, David Malcolm wrote:

On Thu, 2016-08-11 at 20:00 -0600, Sandra Loosemore wrote:

On 08/11/2016 02:34 PM, David Malcolm wrote:

I sometimes find myself scouring assembler output from the compiler
and trying to figure out which instructions correspond to which
lines of source code; I believe this is a common activity for some
end-users.

The following patch adds a new -fasm-show-source option, which
emits comments into the generated asm showing the pertinent
line of source code, whenever it changes.  It uses the same logic
as debug_hooks->source_line for tracking this (for handling
line-based breakpoints).

An example can be seen in the invoke.texi part of the patch.  As
noted there, it's aimed at end-users, rather than gcc developers.
The example shows a relatively short function; the option is
likely to be much more useful for longer functions.

I think it would further improve usability if this option were
enabled
by default when the final output is .s (either via -S, or by "-o
foo.s").
Ideas on how to implement that (in the driver) would be welcome - I
started looking at the spec-handling code, but thought I'd post the
idea here first, before diving in too deeply.

Successfully bootstrapped on x86_64-pc-linux-gnu; adds
2 PASS results to gcc.sum.

Thoughts?  OK for trunk as-is?


Why not extend the existing -fverbose-asm to do this?  E.g.
-fverbose-asm=source, or something like that.  Otherwise you need to
cross-reference the documentation for the two options and explain how
they interact (or don't interact, as the case may be).

-Sandra


Thanks.

With the patch as-is, if I pass both -fverbose-asm and
-fasm-show-source, I get the following:

# test.c:7:   int total = 0;
xorl%eax, %eax  # 
# test.c:9:   for (i = 0; i < n; i++)
xorl%edx, %edx  # i
.L2:
# test.c:9:   for (i = 0; i < n; i++)
cmpl%edi, %edx  # n, i
jge .L5 #,
# test.c:10: total += i * i;
movl%edx, %ecx  # i, tmp92
imull   %edx, %ecx  # i, tmp92
# test.c:9:   for (i = 0; i < n; i++)
incl%edx# i
# test.c:10: total += i * i;
addl%ecx, %eax  # tmp92, 
jmp .L2 #
.L5:
# test.c:13: }
ret
.cfi_endproc

I find the above pleasing, as it shows both the source, and the
variable names associated with the asm insn arguments.  The source
line information works well with jump-to-source, in Emacs, at least.

-fverbose-asm also adds some metadata to the top of the dump, that I
wasn't interested in from a see-the-source point of view, but I can live
with that.

Currently -fverbose-asm is documented in common.opt as:
"Add extra commentary to assembler output"

and in invoke.texi as:
  @opindex fverbose-asm
  Put extra commentary information in the generated assembly code to
  make it more readable.  This option is generally only of use to those
  who actually need to read the generated assembly code (perhaps while
  debugging the compiler itself).

  @option{-fno-verbose-asm}, the default, causes the
  extra information to be omitted and is useful when comparing two
  assembler files.

Given that the precise output format for -fverbose-asm isn't
documented, and it already can be thought of as our option for
"make the asm readable please", my preference would be to extend it
to print the source lines, without adding any "=source"
complications: if the user was interested in the relationship of ins
arguments to source expressions, they're likely also interested in
seeing the corresponding source lines.

Following is a rewritten version of the patch that does this, adding
a description of the output to invoke.texi (along with a caveat that
one should not try to parse the comments).

OK for trunk if it survives bootstrap testing?

As mentioned before, I'd also like to make this more discoverable,
by automatically adding the option in the driver if the user has
specified asm output (via -S or via "-o something.s") - is this
latter idea also reasonable?

Dave

gcc/ChangeLog:
* doc/invoke.texi (fverbose-asm): Note that source code lines
are emitted, and provide an example.
* final.c (asm_show_source): New function.
(final_scan_insn): Call asm_show_source.

gcc/testsuite/ChangeLog:
* gcc.dg/verbose-asm-2.c: New test case.
I don't think we have any kind of requirement to maintain a format for 
-fverbose-asm.  So I think adding the new stuff under the -fverbose-asm 
flag is the way to go.


Whether or not to add it to -S or not is a distinct question and I don't 
think the answer is as clear cut...


OK on the updated patch.

jeff



Re: [PATCH] DWARF: remove pessimistic DWARF version checks for imported entities

2016-08-18 Thread Jeff Law

On 08/18/2016 03:33 AM, Pierre-Marie de Rodat wrote:

Hello,

A check in dwarf2out_imported_module_or_decl prevents valid strict
DWARF2 constructs such as DW_TAG_imported_declaration from being emitted
in dwarf2out_imported_module_or_decl_1.

The latter already protects the emission of newer DWARF tags with
appropriate checks, so the one in the former is redundant and
pessimistic.  This function is already called from places like
process_scope_var, which are not protected anyway.

This patch removes the check in dwarf2out_imported_module_or_decl so
that tags like DW_TAG_imported_declaration are emitted even in strict
DWARF2 mode.

Bootstrapped and regtested on x86_64-linux, no regression.  I also
checked that the new testcase fails on mainline.  Ok to commit?
Thank you in advance!

gcc/

* dwarf2out.c (dwarf2out_imported_module_or_decl): Remove
pessimistic DWARF version check.

gcc/testsuite/

* gnat.dg/debug7.adb, gnat.dg/debug7.ads: New testcase.


Jakub -- you want to take a looksie at this one?  Jason is on PTO for 
the next couple weeks.  I'm not well versed enough to verify that 
d_i_m_o_d_1 has all the right checks or not.



Jeff



Re: [Patch, testsuite] Skip tests that expect 4 byte alignment for avr

2016-08-18 Thread Jeff Law

On 08/16/2016 11:55 PM, Senthil Kumar Selvaraj wrote:


Jeff Law writes:


On 08/11/2016 01:40 AM, Senthil Kumar Selvaraj wrote:

Hi,

  The below patch adds the AVR target to the list of targets that don't
  have natural_alignment_32. It also skips ipa/propalign-*.c
  tests (which expect 4 byte alignment), if both
  natural_alignment_32 and natural_alignment_64 are false.

  Is this the right way to fix this? Ok to commit?

Regards
Senthil

gcc/testsuite/ChangeLog:

2016-08-11  Senthil Kumar Selvaraj  

* gcc.dg/ipa/propalign-1.c: Skip for targets with !natural_alignment_32
and !natural_alignment_64.
* gcc.dg/ipa/propalign-2.c: Likewise.
* gcc.dg/ipa/propalign-3.c: Likewise.
* gcc.dg/ipa/propalign-4.c: Likewise.
* gcc.dg/ipa/propalign-5.c: Likewise.
* lib/target-supports.exp
(check_effective_target_natural_alignment_32): Add avr-*-*.

Do you need to add an avr case to
check_effective_target_natural_alignment_64 as well?


The 64 bit version is written to return true only for lp64 and spu-*-*,
so I didn't have to do anything there.


Have you tested this on anything other than avr?



Yes, x86_64-pc-linux passes these tests, like it did before my patch.

OK for the trunk.  Please install.

Thanks,
jeff


Re: Fwd: [PATCH, doc/ARM] Remove false affirmation that Thumb cannot use an FPU

2016-08-18 Thread Sandra Loosemore

On 08/18/2016 04:05 AM, Richard Earnshaw (lists) wrote:


I think it's probably best to just drop the entire parenthetical
subcluase.  This is documentation of how to use MULTILIB_EXCEPTIONS not
precise documentation on what needs to be done on ARM.

In fact, it might be better to just rewrite the whole section based on a
theoretical machine that has two ISAs, one which can support
floating-point and one which can't.  You then get back to essentially
the same text as we had originally but anonymised and future proofed.


Yes, this is a good idea.  Or, carrying it even farther, you could make 
the whole thing hypothetical, like "Suppose a target supports flags 
@option{-mfoo} and @option{-mbar} that are mutually exclusive"


-Sandra



Re: [PATCH] Restrict jump threading statement simplifier to scalar types (PR71077)

2016-08-18 Thread Jeff Law

On 08/18/2016 01:22 PM, Richard Biener wrote:

On August 18, 2016 8:25:18 PM GMT+02:00, Patrick Palka  
wrote:

In comment #5 Yuri reports that r235653 introduces a runtime failure
for
176.gcc which I guess is caused by the combining step in
simplify_control_stmt_condition_1() not behaving properly on operands
of
type VECTOR_TYPE.  I'm a bit stumped as to why it mishandles
VECTOR_TYPEs because the logic should be generic enough to support them
as well.  But it was confirmed that restricting the combining step to
operands of scalar type fixes the runtime failure so here is a patch
that does this.  Does this look OK to commit after bootstrap +
regtesting on x86_64-pc-linux-gnu?


Hum, I'd rather understand what is going wrong.  Can you at least isolate a 
testcase?
Agreed.  And a fix like this ought to include that isolated test in the 
regression suite.


jeff



Re: Early jump threading

2016-08-18 Thread Jeff Law

On 08/11/2016 08:02 AM, Jan Hubicka wrote:

Hi,
this patch adds early jump threading pass.  Jump threading is one of most
common cases where estimated profile becomes corrupted, because the branches
are predicted independently beforehand. This patch simply adds early mode to
jump threader that does not permit code growth and thus only win/win threads
are done before profile is constructed.

Naturally this also improves IPA decisions because code size/time is estimated
more acurately.

It is not very cool to add another pass and the jump threader is already
run 5 times. I think incrementally we could drop one of late threaders at least.
I tried to measure compile time effects but they are in wash. Moreover the patch
pays for itself in cc1plus code size:

Before patch to tweak size estimates: 27779964
Current mainline: 27748900
With patch applied:   27716173

So despite adding few functions the code size effect is positive which I think
is quite nice.

Given the fact that jump threading seems quite lightweit, I wonder why it is
guarded by flag_expensive_optimizations? Is it expensive in terms of code
size?

The effectivity of individual threading passes is as follows (for tramp3d)

  mainline  with patch
pass  thread count profile mismatches   thread countprofile mismatch
early   525
1 1853 1900 316 337
2 4812  4   288
So the real question here is what did VRP1 threading do between the 
standalone thread1/thread2 passes.  ie, it may look tempting to 
eliminate the standalone thread2 pass, but threading done by VRP1 is 
expected to be pushed into the thread{1,2} passes.  So removing thread2 
is premature at this point.






The patch distorts testusite somewhat, in most cases we only want to update
dump files or disable early threading:

+XPASS: gcc.dg/uninit-15.c  (test for warnings, line 13)
+XPASS: gcc.dg/uninit-15.c  (test for warnings, line 23)
+FAIL: gcc.dg/uninit-15.c  (test for warnings, line 24)
This seems like a step backwards to me.  We're warning about "j", but in 
the context in which its used, we really ought to be warning about "i". 
And we lost a later warning.






+FAIL: gcc.dg/tree-ssa/pr68198.c scan-tree-dump-times thread1 "Registering FSM" 
1

Moved into early-threading, so adjusting test for that seems appropriate.



+FAIL: gcc.dg/tree-ssa/pr69196-1.c scan-tree-dump thread1 "FSM did not thread around 
loop and would copy too many statements"
I'm not sure how to best check this one.  Threading this test too 
aggressively results in something like a 2X bloat in the resulting code 
on the sparc.






+FAIL: gcc.dg/tree-ssa/ssa-dom-thread-2b.c scan-tree-dump-times thread1 "Jumps 
threaded: 1" 1
Moved into early threading, so adjusting the test for that seems 
appropriate to me.




+FAIL: gcc.dg/tree-ssa/ssa-thread-13.c scan-tree-dump thread1 "FSM"
Moved into early threading, so adjusting the test for that seems 
appropriate to me.




+FAIL: gcc.dg/tree-ssa/vrp01.c scan-tree-dump-times vrp1 "Folding predicate p_.*to 
1" 1
So the optimization moved into fre1 as a result of early jump threading 
which is good. We should probably check for that, even if the test is 
badly mis-named now.


But this is showing one of the cases where the backwards threader is 
deficient.  It's not using the conditional as an implied set.  It's a 
known limitation and I'm hoping Andrew's work makes that an easier 
problem to solve.



+FAIL: gcc.dg/tree-ssa/vrp56.c scan-tree-dump thread1 "Jumps threaded: 1"
Moved into early threading, so adjusting the test for that seems 
appropriate to me.



+FAIL: gcc.dg/tree-ssa/vrp92.c scan-tree-dump vrp1 "res_.: [1, 1]"
Some results adjustment seems to be in order here.  I think verifying 
that early threading finds the obvious jump threads, then vrp is able to 
eliminate the 2nd conditional and the result is a collapsed return i for 
the entire function.




This testcase is the now probably unnecesary heuristics (it may still be
relevant in cases we do not thread because of size bounds but its effectivity
may not be worth the maintenance cost):

+FAIL: g++.dg/predict-loop-exit-1.C  -std=gnu++11  scan-tree-dump-times profile_estimate 
"extra loop exit heuristics of edge[^:]*:" 2
+FAIL: g++.dg/predict-loop-exit-1.C  -std=gnu++11  scan-tree-dump-times profile_estimate 
"loop exit heuristics of edge[^:]*:" 3
+FAIL: g++.dg/predict-loop-exit-1.C  -std=gnu++14  scan-tree-dump-times profile_estimate 
"extra loop exit heuristics of edge[^:]*:" 2
+FAIL: g++.dg/predict-loop-exit-1.C  -std=gnu++14  scan-tree-dump-times profile_estimate 
"loop exit heuristics of edge[^:]*:" 3
+FAIL: g++.dg/predict-loop-exit-1.C  -std=gnu++98  scan-tree-dump-times profile_estimate 
"extra loop exit heuristics of edge[^:]*:" 2
+FAIL: g++.dg/predict-loop-exit-1.C  -std=gnu++98  

Re: [patch, Fortran] Fix PR 71902

2016-08-18 Thread Thomas Koenig

Hello Mikael,


This doesn't look correct, what about substrings following component or
array references?


You're right; corrected in the attached patch.  I have also added a test
case for this.


PS: What about the original fix, wasn't it a dependency
problem/inaccuracy after all?


The main error was using the allocatable attribute on the symbol that
we were dealing with a deferred length.  Changing the test to check for
expr->ts.deferred is the right thing to do.

There are still some some overly pessimistic assumptions in dependency
handling.  However, I am not sure that this causes actual performance
issues.

Regards

Thomas

2016-08-18  Thomas Koenig  

PR fortran/71902
* frontend-passes.c (realloc_string_callback):  Check for deferred
on the expression instead for allocatable on the symbol.  Name 
temporary

variable "realloc_string".

2016-08-18  Thomas Koenig  

PR fortran/71902
* gfortran.dg/dependency_47.f90:  New test.
* gfortran.dg/dependency_48.f90:  New test.

Index: frontend-passes.c
===
--- frontend-passes.c	(Revision 239488)
+++ frontend-passes.c	(Arbeitskopie)
@@ -164,6 +164,8 @@ realloc_string_callback (gfc_code **c, int *walk_s
   gfc_expr *expr1, *expr2;
   gfc_code *co = *c;
   gfc_expr *n;
+  gfc_ref *ref;
+  bool found_substr;
 
   if (co->op != EXEC_ASSIGN)
 return 0;
@@ -170,7 +172,7 @@ realloc_string_callback (gfc_code **c, int *walk_s
 
   expr1 = co->expr1;
   if (expr1->ts.type != BT_CHARACTER || expr1->rank != 0
-  || !expr1->symtree->n.sym->attr.allocatable)
+  || !expr1->ts.deferred)
 return 0;
 
   expr2 = gfc_discard_nops (co->expr2);
@@ -177,6 +179,18 @@ realloc_string_callback (gfc_code **c, int *walk_s
   if (expr2->expr_type != EXPR_VARIABLE)
 return 0;
 
+  found_substr = false;
+  for (ref = expr2->ref; ref; ref = ref->next)
+{
+  if (ref->type == REF_SUBSTRING)
+	{
+	  found_substr = true;
+	  break;
+	}
+}
+  if (!found_substr)
+return 0;
+
   if (!gfc_check_dependency (expr1, expr2, true))
 return 0;
 
@@ -190,7 +204,7 @@ realloc_string_callback (gfc_code **c, int *walk_s
   current_code = c;
   inserted_block = NULL;
   changed_statement = NULL;
-  n = create_var (expr2, "trim");
+  n = create_var (expr2, "realloc_string");
   co->expr2 = n;
   return 0;
 }
! { dg-do compile }
! { dg-options "-fdump-tree-original" }
! PR fortran/71902 - make sure that component references are followed
! for dependency analysis.
program main
  type foo
 character(len=:), allocatable :: x
  end type foo
  type(foo) :: a
  a%x = 'asdf'
  a%x = a%x(2:3)
  print *,a%x
end program main
! { dg-final { scan-tree-dump-times "__var_1" 4 "original" } }
! { dg-do compile }
! Make sure there is only one instance of a temporary variable here.
! { dg-options "-fdump-tree-original" }

SUBROUTINE prtdata(ilen)
  INTEGER :: ilen
  character(len=ilen), allocatable :: cline(:)
  allocate(cline(2))
  cline(1) = 'a'
  cline(1)(2:3) = cline(1)(1:2)
  cline(2) = cline(1)
  print *,c
END SUBROUTINE prtdata
! { dg-final { scan-tree-dump-not "__var_" "original" } }


Re: [PATCH] Restrict jump threading statement simplifier to scalar types (PR71077)

2016-08-18 Thread Richard Biener
On August 18, 2016 8:25:18 PM GMT+02:00, Patrick Palka  
wrote:
>In comment #5 Yuri reports that r235653 introduces a runtime failure
>for
>176.gcc which I guess is caused by the combining step in
>simplify_control_stmt_condition_1() not behaving properly on operands
>of
>type VECTOR_TYPE.  I'm a bit stumped as to why it mishandles
>VECTOR_TYPEs because the logic should be generic enough to support them
>as well.  But it was confirmed that restricting the combining step to
>operands of scalar type fixes the runtime failure so here is a patch
>that does this.  Does this look OK to commit after bootstrap +
>regtesting on x86_64-pc-linux-gnu?

Hum, I'd rather understand what is going wrong.  Can you at least isolate a 
testcase?

Richard.

>gcc/ChangeLog:
>
>   PR tree-optimization/71077
>   * tree-ssa-threadedge.c (simplify_control_stmt_condition_1):
>   Perform the combining step only if the operands have an integral
>   or a pointer type.
>---
> gcc/tree-ssa-threadedge.c | 3 +++
> 1 file changed, 3 insertions(+)
>
>diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
>index 170e456..a97c00c 100644
>--- a/gcc/tree-ssa-threadedge.c
>+++ b/gcc/tree-ssa-threadedge.c
>@@ -577,6 +577,9 @@ simplify_control_stmt_condition_1 (edge e,
>   if (handle_dominating_asserts
>   && (cond_code == EQ_EXPR || cond_code == NE_EXPR)
>   && TREE_CODE (op0) == SSA_NAME
>+  /* ??? Vector types are mishandled here.  */
>+  && (INTEGRAL_TYPE_P (TREE_TYPE (op0))
>+|| POINTER_TYPE_P (TREE_TYPE (op0)))
>   && integer_zerop (op1))
> {
>   gimple *def_stmt = SSA_NAME_DEF_STMT (op0);




[committed] Allow calling diagnostic_show_locus without a diagnostic_info

2016-08-18 Thread David Malcolm
Much of diagnostic-show-locus.c currently expects a diagnostic_info *,
but it only uses the rich_location and the diagnostic_t.

Change the signature of diagnostic_show_locus from:

  void
  diagnostic_show_locus (diagnostic_context *,
 const diagnostic_info *);

to:

  void
  diagnostic_show_locus (diagnostic_context *,
 rich_location *richloc,
 diagnostic_t diagnostic_kind);

so that it can be used for things other than diagnostics.

Use this flexibility to add selftests for diagnostic_show_locus.

Successfully bootstrapped on x86_64-pc-linux-gnu.
Verified stage1 -fself-test on powerpc-ibm-aix7.1.3.0 (gcc111).

Committed to trunk as r239586.

gcc/c-family/ChangeLog:
* c-opts.c (c_diagnostic_finalizer): Update for change to
diagnostic_show_locus.

gcc/ChangeLog:
* diagnostic-show-locus.c (colorizer::colorizer): Replace diagnostic
param with diagnostic_kind.
(class colorizer): Similarly replace field m_diagnostic with
m_diagnostic_kind.
(colorizer::colorizer): Replace diagnostic
param with diagnostic_kind.
(colorizer::begin_state): Update for above field change.
(layout::layout): Replace diagnostic param with rich_location *
and diagnostic_kind.
(diagnostic_show_locus): Replace diagnostic param with richloc
and diagnostic_kind.
(class selftest::test_diagnostic_context): New class.
(selftest::test_diagnostic_show_locus_unknown_location): New
function.
(selftest::test_one_liner_simple_caret): New function.
(selftest::test_one_liner_caret_and_range): New function.
(selftest::test_one_liner_multiple_carets_and_ranges): New
function.
(selftest::test_one_liner_fixit_remove): New function.
(selftest::test_one_liner_fixit_replace): New function.
(selftest::test_diagnostic_show_locus_one_liner): New function.
(selftest::diagnostic_show_locus_c_tests): Call the new test
functions.
* diagnostic.c (diagnostic_initialize): Initialize
colorize_source_p, show_ruler_p and parseable_fixits_p.
(default_diagnostic_finalizer): Update for change to
diagnostic_show_locus.
(diagnostic_append_note): Likewise.
* diagnostic.h (diagnostic_show_locus): Replace
const diagnostic_info * param with location * and diagnostic_t.

gcc/fortran/ChangeLog:
* error.c (gfc_diagnostic_starter): Update for change to
diagnostic_show_locus.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
(custom_diagnostic_finalizer): Update for change to
diagnostic_show_locus.
---
 gcc/c-family/c-opts.c  |   2 +-
 gcc/diagnostic-show-locus.c| 243 +++--
 gcc/diagnostic.c   |   7 +-
 gcc/diagnostic.h   |   4 +-
 gcc/fortran/error.c|   2 +-
 .../plugin/diagnostic_plugin_test_show_locus.c |   2 +-
 6 files changed, 236 insertions(+), 24 deletions(-)

diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 574718a..e83944c 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -164,7 +164,7 @@ static void
 c_diagnostic_finalizer (diagnostic_context *context,
diagnostic_info *diagnostic)
 {
-  diagnostic_show_locus (context, diagnostic);
+  diagnostic_show_locus (context, diagnostic->richloc, diagnostic->kind);
   /* By default print macro expansion contexts in the diagnostic
  finalizer -- for tokens resulting from macro expansion.  */
   virt_loc_aware_diagnostic_finalizer (context, diagnostic);
diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 49f7f11..4498f7c 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -74,7 +74,7 @@ class colorizer
 {
  public:
   colorizer (diagnostic_context *context,
-const diagnostic_info *diagnostic);
+diagnostic_t diagnostic_kind);
   ~colorizer ();
 
   void set_range (int range_idx) { set_state (range_idx); }
@@ -90,7 +90,7 @@ class colorizer
   static const int STATE_NORMAL_TEXT = -1;
 
   diagnostic_context *m_context;
-  const diagnostic_info *m_diagnostic;
+  diagnostic_t m_diagnostic_kind;
   int m_current_state;
   const char *m_caret_cs;
   const char *m_caret_ce;
@@ -187,7 +187,8 @@ class layout
 {
  public:
   layout (diagnostic_context *context,
- const diagnostic_info *diagnostic);
+ rich_location *richloc,
+ diagnostic_t diagnostic_kind);
 
   int get_num_line_spans () const { return m_line_spans.length (); }
   const line_span *get_line_span (int idx) const { return _line_spans[idx]; }
@@ -239,9 +240,9 @@ class layout
different kinds of things we might need to print.  */
 
 colorizer::colorizer (diagnostic_context 

Re: Implement -Wimplicit-fallthrough (take 3)

2016-08-18 Thread Manuel López-Ibáñez

On 18/08/16 18:07, David Malcolm wrote:


This isn't quite the way that fix-its are meant to be used, in my mind,
at least: the insertion text is supposed to be something that could be
literally inserted into the code (e.g. by an IDE) i.e. it should be a
code fragment, rather than a message to the user.


I added this info to

https://gcc.gnu.org/wiki/DiagnosticsGuidelines#Fix-it_hints




+static void
+maybe_warn_implicit_fallthrough (gimple_seq seq)
+{
+  if (!warn_implicit_fallthrough || lang_GNU_Fortran ())
+return;
+



Does this warning make sense if !(lang_GNU_C() || lang_GNU_CXX()) ?

Cheers,

Manuel.


[PATCH] Restrict jump threading statement simplifier to scalar types (PR71077)

2016-08-18 Thread Patrick Palka
In comment #5 Yuri reports that r235653 introduces a runtime failure for
176.gcc which I guess is caused by the combining step in
simplify_control_stmt_condition_1() not behaving properly on operands of
type VECTOR_TYPE.  I'm a bit stumped as to why it mishandles
VECTOR_TYPEs because the logic should be generic enough to support them
as well.  But it was confirmed that restricting the combining step to
operands of scalar type fixes the runtime failure so here is a patch
that does this.  Does this look OK to commit after bootstrap +
regtesting on x86_64-pc-linux-gnu?

gcc/ChangeLog:

PR tree-optimization/71077
* tree-ssa-threadedge.c (simplify_control_stmt_condition_1):
Perform the combining step only if the operands have an integral
or a pointer type.
---
 gcc/tree-ssa-threadedge.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index 170e456..a97c00c 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -577,6 +577,9 @@ simplify_control_stmt_condition_1 (edge e,
   if (handle_dominating_asserts
   && (cond_code == EQ_EXPR || cond_code == NE_EXPR)
   && TREE_CODE (op0) == SSA_NAME
+  /* ??? Vector types are mishandled here.  */
+  && (INTEGRAL_TYPE_P (TREE_TYPE (op0))
+ || POINTER_TYPE_P (TREE_TYPE (op0)))
   && integer_zerop (op1))
 {
   gimple *def_stmt = SSA_NAME_DEF_STMT (op0);
-- 
2.9.3.650.g20ba99f



Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

2016-08-18 Thread Jeff Law

On 08/11/2016 08:13 PM, Martin Sebor wrote:

Attached is an updated patch with changes addressing most of
the suggestions and comments I received.

The main changes/improvements are the following:

 *  New option, -fprintf-return-value, enables the sprintf return
value optimization.  It's disabled by default in this version
but I'm hoping to enable in the final version of the patch.
I added it in this version mainly to be able to verify the
correctness of the pass by comparing the sprintf return value
computed by the pass to the expected result returned from
the same libc call.  (Before I had to rely solely on results
hardcoded in the tests).

 *  With the -fdump-tree-printf-return-value the pass generates
simple dump output to indicate what has been optimized and
what hasn't.

 *  Floating point formatting relies on mpfr_snprintf to determine
the size of the output for known values.  This lets the pass
avoid duplicating the tricky floating point math done by sprintf
and getting it wrong.

 *  New target hooks remove hardcoding target-specific assumptions
about libc implementation-specific details (%p format and printf
floating point rounding mode).

What's missing is integration with David's latest location range
changes.  I plan to work on it next but this seemed like a good
stopping point to get feedback.

There also are still opportunities for improvements to the string
length computation (now via gimple-fold.c's get_maxval_strlen,
thanks to Jakub) to improve checking of %s directives.  But since
that's a general-purpose function that's used outside this pass
it will be a separate patch.
My biggest concern with this iteration is the tight integration between 
the optimization and warning.  We generally avoid that kind of tight 
integration such that enabling the warning does not affect the 
optimization and vice-versa.


So ISTM you have to do the analysis if the optimization or warning has 
been requested.  Then you conditionalize whether or not the warnings are 
emitted by their flag and the optimization based on its flag.


I understand you're going to have some further work to do because of 
conflicts with David's patches.  With that in mind, I'd suggest a bit of 
carving things up so things can start moving forward.



Patch #1.  All the fixes to static buffer sizes that were inspired by 
your warning.  These are all approved and can go in immediately.


Patch #2. Improvement to __builtin_object_size to handle 
POINTER_PLUS_EXPR on arrays.  This is something that stands on it own 
and ought to be reviewable quickly and doesn't really belong in the 
bowels of the warning/optimization patch you're developing.


Patch #3. Core infrastructure and possibly the warning.  The reason I 
say possibly the warning is they may be intertwined enough that 
separating them makes more work than it saves.  I think the warning bits 
are largely ready to go and may just need twiddling due to conflicts 
with David's work.


Patch #4. The optimizations you've got now which I'll want to take 
another look at.  Other than the overly tight integration with the 
warning, I don't see anything inherently wrong, but I would like to take 
another look at those once #1-#3 are done and dusted.


Patch #5 and beyond: Further optimization work.







+
+static void
+compute_format_length (const pass_sprintf_length::call_info ,
+  format_result*res,
+  const char   *cvtbeg,
+  size_t   cvtlen,
+  size_t   offset,
+  const conversion_spec,
+  tree arg)
Needs function comment.  Please don't bother lining up arguments like 
that.  Just one space between type and its name.






+/* Given a suitable result RES of a call to a formatted output function
+   described by INFO, substitute the result for the return value of
+   the call.  The result is suitable if the number of bytes it represents
+   is known and exact.  */
+
+static void
+try_substitute_return_value (gimple_stmt_iterator  gsi,
+const pass_sprintf_length::call_info  ,
+const format_result  )
Single space between the type and the variable name.  I only happened to 
see this one because I was going to make a comment about this function, 
so please check elsewhere in your new code.



+{
+  tree lhs = gimple_get_lhs (info.callstmt);
+  if (lhs && res.bounded && res.number_chars < HOST_WIDE_INT_MAX)
+{
+  /* Replace the left-hand side of the call with the constant
+result of the formatting function minus 1 for the terminating
+NUL which the functions' return value does not include.  */
+  gimple_call_set_lhs (info.callstmt, NULL_TREE);
+  tree cst = 

[PR fortran/61318] Patch

2016-08-18 Thread Steve Kargl
I intend to commit the following patch on Saturday.
It improves the reporting on where an error occurs.
Without the patch, one gets

% gfc7 -c gcc/testsuite/gfortran.dg/pr61318.f90 
gcc/testsuite/gfortran.dg/pr61318.f90:19:6:

   use gbl_message
  1
Error: Type mismatch in argument 'message' at (1); \
   passed INTEGER(4) to CHARACTER(1)


which points to the module where the function is declared.
With the patch, I get

% gfc7 -c gcc/testsuite/gfortran.dg/pr61318.f90
gcc/testsuite/gfortran.dg/pr61318.f90:21:78:

   call gagout(seve%e,'Some string')
   1
Error: Type mismatch in argument 'message' at (1); \
   passed INTEGER(4) to CHARACTER(1)

which points to the line where the error actually occurs.

2016-08-20  Steven G. Kargl  

PR fortran/61318
* interface.c (compare_paramete): Use a better for error message.

2016-08-20  Steven G. Kargl  
PR fortran/61318
* gfortran.dg/pr61318.f90: New test.


Index: gcc/fortran/interface.c
===
--- gcc/fortran/interface.c (revision 239543)
+++ gcc/fortran/interface.c (working copy)
@@ -2146,7 +2146,7 @@ compare_parameter (gfc_symbol *formal, g
 {
   if (where)
gfc_error ("Type mismatch in argument %qs at %L; passed %s to %s",
-  formal->name, >where, gfc_typename (>ts),
+  formal->name, where, gfc_typename (>ts),
   gfc_typename (>ts));
   return 0;
 }
Index: gcc/testsuite/gfortran.dg/pr61318.f90
===
--- gcc/testsuite/gfortran.dg/pr61318.f90   (nonexistent)
+++ gcc/testsuite/gfortran.dg/pr61318.f90   (working copy)
@@ -0,0 +1,23 @@
+! { dg-do compile }
+!
+module gbl_message
+  type :: mytype
+integer(kind=4) :: e
+  end type mytype
+  type(mytype), parameter :: seve = mytype(1)
+end module gbl_message
+
+module gbl_interfaces
+  interface
+subroutine gagout(message)
+  character(len=*), intent(in) :: message
+end subroutine gagout
+  end interface
+end module gbl_interfaces
+
+program test
+  use gbl_message
+  use gbl_interfaces
+  call gagout(seve%e,'Some string') ! { dg-error "Type mismatch in argument" }
+end program test
+! { dg-final { cleanup-modules "gbl_interfaces gbl_message" } }
-- 
Steve


Re: [PATCH v3] S/390: Add splitter for "and" with complement.

2016-08-18 Thread Dominik Vogt
Version 5 of the patch with the splitter for -O0/-O1 removed, as
discussed internally.  Regression tested on s390x biarch and s390.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* config/s390/s390.md ("*andc_split"): New splitter for and with
complement.
gcc/testsuite/ChangeLog

* gcc.target/s390/md/andc-splitter-1.c: New test case.
* gcc.target/s390/md/andc-splitter-2.c: Likewise.
>From 480a4599ccc170c01da1c012090407d9581a6915 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Mon, 14 Mar 2016 17:48:17 +0100
Subject: [PATCH] S/390: Add splitter for "and" with complement.

Force splitting of logical operator expressions ...  with three operands, a
register destination and a memory operand because there are no instructions for
that and combine results in inefficient code.
---
 gcc/config/s390/s390.md| 27 ++
 gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c | 61 ++
 gcc/testsuite/gcc.target/s390/md/andc-splitter-2.c | 61 ++
 3 files changed, 149 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/md/andc-splitter-2.c

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index f8c61a8..295e3f7 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -7262,6 +7262,33 @@
(set_attr "z10prop" "z10_super_E1,z10_super,*")])
 
 ;
+; And with complement
+;
+; c = ~b & a = (b & a) ^ a
+
+(define_insn_and_split "*andc_split_"
+  [(set (match_operand:GPR 0 "nonimmediate_operand" "")
+   (and:GPR (not:GPR (match_operand:GPR 1 "nonimmediate_operand" ""))
+(match_operand:GPR 2 "general_operand" "")))
+   (clobber (reg:CC CC_REGNUM))]
+  "! reload_completed && s390_logical_operator_ok_p (operands)"
+  "#"
+  "&& 1"
+  [
+  (parallel
+   [(set (match_dup 3) (and:GPR (match_dup 1) (match_dup 2)))
+   (clobber (reg:CC CC_REGNUM))])
+  (parallel
+   [(set (match_dup 0) (xor:GPR (match_dup 3) (match_dup 2)))
+   (clobber (reg:CC CC_REGNUM))])]
+{
+  if (reg_overlap_mentioned_p (operands[0], operands[2]))
+operands[3] = gen_reg_rtx (mode);
+  else
+operands[3] = operands[0];
+})
+
+;
 ; Block and (NC) patterns.
 ;
 
diff --git a/gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c 
b/gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c
new file mode 100644
index 000..ed78921
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c
@@ -0,0 +1,61 @@
+/* Machine description pattern tests.  */
+
+/* { dg-do run { target { lp64 } } } */
+/* { dg-options "-mzarch -save-temps -dP" } */
+/* Skip test if -O0 is present on the command line:
+
+{ dg-skip-if "" { *-*-* } { "-O0" } { "" } }
+
+   Skip test if the -O option is missing from the command line
+{ dg-skip-if "" { *-*-* } { "*" } { "-O*" } }
+*/
+
+__attribute__ ((noinline))
+unsigned long andc_vv(unsigned long a, unsigned long b)
+{ return ~b & a; }
+/* { dg-final { scan-assembler ":15 .\* \{\\*anddi3\}" } } */
+/* { dg-final { scan-assembler ":15 .\* \{\\*xordi3\}" } } */
+
+__attribute__ ((noinline))
+unsigned long andc_pv(unsigned long *a, unsigned long b)
+{ return ~b & *a; }
+/* { dg-final { scan-assembler ":21 .\* \{\\*anddi3\}" } } */
+/* { dg-final { scan-assembler ":21 .\* \{\\*xordi3\}" } } */
+
+__attribute__ ((noinline))
+unsigned long andc_vp(unsigned long a, unsigned long *b)
+{ return ~*b & a; }
+/* { dg-final { scan-assembler ":27 .\* \{\\*anddi3\}" } } */
+/* { dg-final { scan-assembler ":27 .\* \{\\*xordi3\}" } } */
+
+__attribute__ ((noinline))
+unsigned long andc_pp(unsigned long *a, unsigned long *b)
+{ return ~*b & *a; }
+/* { dg-final { scan-assembler ":33 .\* \{\\*anddi3\}" } } */
+/* { dg-final { scan-assembler ":33 .\* \{\\*xordi3\}" } } */
+
+/* { dg-final { scan-assembler-times "\tngr\?k\?\t" 4 } } */
+/* { dg-final { scan-assembler-times "\txgr\?\t" 4 } } */
+
+int
+main (void)
+{
+  unsigned long a = 0xc00cllu;
+  unsigned long b = 0x500allu;
+  unsigned long e = 0x8004llu;
+  unsigned long c;
+
+  c = andc_vv (a, b);
+  if (c != e)
+__builtin_abort ();
+  c = andc_pv (, b);
+  if (c != e)
+__builtin_abort ();
+  c = andc_vp (a, );
+  if (c != e)
+__builtin_abort ();
+  c = andc_pp (, );
+  if (c != e)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/s390/md/andc-splitter-2.c 
b/gcc/testsuite/gcc.target/s390/md/andc-splitter-2.c
new file mode 100644
index 000..d88da4d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/md/andc-splitter-2.c
@@ -0,0 +1,61 @@
+/* Machine description pattern tests.  */
+
+/* { dg-do run } */
+/* { dg-options "-save-temps -dP" } */
+/* Skip test if -O0 is present on the command line:
+
+{ dg-skip-if "" { *-*-* } { "-O0" } { "" } }
+
+   Skip test if the -O option is missing from the command line
+{ dg-skip-if "" { *-*-* } { 

Re: Implement -Wimplicit-fallthrough (take 3): add gcc_fallthrough

2016-08-18 Thread Marek Polacek
On Thu, Aug 18, 2016 at 12:28:18PM -0500, Segher Boessenkool wrote:
> On Thu, Aug 18, 2016 at 03:59:32PM +0200, Marek Polacek wrote:
> > * eload1.c (elimination_effects): Likewise.
> 
> So we are deleting reload one char at a time now?  :-)

If only!  ;)

I fixed the typo in my local copy, thanks.

Marek


[PR tree-optimization/71691] Fix unswitching in presence of maybe-undefined SSA_NAMEs

2016-08-18 Thread Jeff Law


So to recap, the problem with this BZ is we have a maybe-undefined 
SSA_NAME in the IL.  That maybe-undefined name is used as a condition 
inside a loop and we unswitch that condition.


tree-ssa-loop-unswitch.c already tries to avoid doing that, but uses the 
optimistic ssa_undefined_value_p function.


Essentially ssa_undefined_value_p just checks to see if the SSA_NAME is 
set from a real statement.  It doesn't look at the operands of that 
statement to see if they're undefined, nor does it walk the use-def 
chains of the RHS operands.  In short, it's totally inappropriate for 
unswitching's needs.


This patch introduces a new class where we can ask if a particular 
SSA_NAME is defined or may be undefined.  Only the latter interface is 
currently used and I wouldn't object if we wanted to avoid the former 
interface until we needed it (it's just a trivial bitmap test, so we're 
not losing any real knowledge of how to implement it).


We walk the CFG in dominator order.  For each block we walk the PHIs and 
mark the LHS as defined IFF all the RHS arguments are defined.  Then we 
walk the statements and mark their LHSs as defined IIFF all the RHS 
arguments are defined.


This gives us a conservative solution to the may be undefined question. 
We do not try to keep this information up-to-date as statements or CFG 
are updated -- queries on newly added SSA_NAMEs always return 
may-be-undefined.


This information is ephemeral and not kept up-to-date.  We perform the 
analysis in the class's constructor and tear down the resulting bitmap 
in the class's destructor.



Bootstrapped and regression tested on x86_64-linux-gnu.

Ok for the trunk?


Jeff

PR tree-optimization/71691
* Makefile.in (OBJS): Add tree-ssa-defined-or-undefined.
* tree-ssa-defined-or-undefined.h: New file.
* tree-ssa-defined-or-undefined.c: New file.
* tree-ssa-loop-unswitch.c: Include tree-ssa-defined-or-undefined.h
(tree_ssa_unswitch_loops): Create a defined_or_undefined class instance
and pass it to tree_unswitch_single_loop.
(tree_unswitch_single_loop): Pass instance down through recursive calls
and into tree_may_unswitch_on.
(tree_may_unswitch_on): Use defined_or_undefined instance rather than
ssa_undefined_value_p.

PR tree-optimization/71691
* gcc.c-torture/execute/pr71691.c: New test.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 7a0160f..9e881b8 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1491,6 +1491,7 @@ OBJS = \
tree-ssa-coalesce.o \
tree-ssa-copy.o \
tree-ssa-dce.o \
+   tree-ssa-defined-or-undefined.o \
tree-ssa-dom.o \
tree-ssa-dse.o \
tree-ssa-forwprop.o \
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr71691.c 
b/gcc/testsuite/gcc.c-torture/execute/pr71691.c
new file mode 100644
index 000..2c5dbb6
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr71691.c
@@ -0,0 +1,45 @@
+char b;
+short f;
+unsigned e;
+int g = 20;
+
+void
+foo ()
+{
+  int l, h;
+  for (l = 0; l <= 7; l++)
+{
+  int j = 38;
+  if (g)
+   h = 0;
+  for (; h <= 7; h++)
+   {
+ int i, k = b % (j % 4);
+ g = f;
+ for (;;)
+   {
+ j = 6 || b;
+ if (e)
+   {
+ for (; j; --j)
+   if (k)
+ __builtin_printf ("%d", 9);
+ if (i)
+   __builtin_printf ("%d", j);
+   }
+ if (l)
+   continue;
+ break;
+   }
+ i = f || b;
+   }
+}
+}
+
+int
+main ()
+{
+  foo ();
+  exit (0);
+}
+
diff --git a/gcc/tree-ssa-defined-or-undefined.c 
b/gcc/tree-ssa-defined-or-undefined.c
new file mode 100644
index 000..50de443
--- /dev/null
+++ b/gcc/tree-ssa-defined-or-undefined.c
@@ -0,0 +1,128 @@
+/* Simple class for classifying SSA_NAMEs that are either fully defined
+   or possibly undefined.
+
+   This is meant to support conservative analysis for optimization purposes,
+   not for generating warnings.  The analysis for generating warnings is
+   deeper to avoid generation of false positives.
+
+   Copyright (C) 2001-2016 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include 

Re: Implement -Wimplicit-fallthrough (take 3): add gcc_fallthrough

2016-08-18 Thread Segher Boessenkool
On Thu, Aug 18, 2016 at 03:59:32PM +0200, Marek Polacek wrote:
>   * eload1.c (elimination_effects): Likewise.

So we are deleting reload one char at a time now?  :-)


Segher


Re: Implement -Wimplicit-fallthrough (take 3)

2016-08-18 Thread David Malcolm
On Thu, 2016-08-18 at 15:50 +0200, Marek Polacek wrote:
> Now that all various switch fallthrough bugfixes and adjustments were
> committed, and this patch has shrunk considerably, I'm presenting the
> latest
> version.  The changes from the last version are not huge; we don't
> warn for a
> fall through to a user-defined label anymore, and I made some tiny
> changes
> regarding parsing attributes in the C FE, as requested by Joseph.
> 
> This patch is accompanied by another patch that merely adds various
> gcc_fallthroughs, in places where a FALLTHRU comment wouldn't work.
> 
> It's been tested on powerpc64le-unknown-linux-gnu, aarch64-linux-gnu,
> and x86_64-redhat-linux.
> 
> 2016-08-18  Marek Polacek  
>   Jakub Jelinek  
> 
>   PR c/7652

[...]
 
> diff --git gcc/gcc/gimplify.c gcc/gcc/gimplify.c
> index 1e43dbb..1925263 100644
> --- gcc/gcc/gimplify.c
> +++ gcc/gcc/gimplify.c

[...]

> + if (warned_p)
> +   {
> + rich_location richloc (line_table, gimple_location
> (next));
> + richloc.add_fixit_insert (gimple_location (next),
> +   "insert '__attribute__ "
> +   "((fallthrough));' to
> silence "
> +   "this warning");
> + richloc.add_fixit_insert (gimple_location (next),
> +   "insert 'break;' to avoid
> "
> +   "fall-through");
> + inform_at_rich_loc (, "here");
> +   }

This isn't quite the way that fix-its are meant to be used, in my mind,
at least: the insertion text is supposed to be something that could be
literally inserted into the code (e.g. by an IDE) i.e. it should be a
code fragment, rather than a message to the user.

Here's an idea of how the above could look:

  if (warned_p)
{
  /* Suggestion one: add "__attribute__ ((fallthrough));".  */
  rich_location richloc_attr (line_table, gimple_location (next));
  richloc_attr.add_fixit_insert (gimple_location (next),
 "__attribute__ ((fallthrough));");
  inform_at_rich_loc (_attr, "insert %qs to silence this warning",
  "__attribute__ ((fallthrough));")

  /* Suggestion two: add "break;".  */
  rich_location richloc_break (line_table, gimple_location (next));
  richloc_break.add_fixit_insert (gimple_location (next),
  "break;");
  inform_at_rich_loc (_break, "insert %qs to avoid fall-through",
  "break;");
}


(and using %qs for quoting the language elements in the messages
themselves).

There doesn't seem to be any test coverage in the patch for the fix-it
hints.

The easiest way to do this is to create a test case with:

/* { dg-options "-fdiagnostics-show-caret" } */

and then add:

 /* { dg-begin-multiline-output "" }
QUOTE OF EXPECTED CARET+FIXIT OUTPUT, omitting any trailing deja-gnu
directives
 { dg-end-multiline-output "" } */

so that we can directly verify that the results look sane.

BTW, is there some way to break up warn_implicit_fallthrough_r, maybe
moving the GIMPLE_LABEL handling to a subroutine? (and perhaps the
suggestion-handling above could live in its own subroutine, etc).

[...]

Hope this is constructive
Dave



Re: Fwd: [PATCH] genmultilib: improve error reporting for MULTILIB_REUSE

2016-08-18 Thread Jeff Law

On 08/10/2016 09:51 AM, Thomas Preudhomme wrote:



*** gcc/ChangeLog ***

2016-08-01  Thomas Preud'homme  

* doc/fragments.texi (MULTILIB_REUSE): Mention that only options in
MULTILIB_OPTIONS should be used.  Small wording fixes.
* genmultilib: Memorize set of all option combinations in
combination_space.  Detect if RHS of MULTILIB_REUSE uses an
option not
found in MULTILIB_OPTIONS by checking if option set is listed in
combination_space.  Output new and existing error message to
stderr.

[ snip ]



+A reuse rule is comprised of two parts connected by equality sign.  The left
+part is the option set used to build multilib and the right part is the option
+set that will reuse this multilib.  Both part should only use options specified

"Both part" -> "Both parts" I think here.

OK with that change.

jeff




Re: [PATCH] Fix warning breaking profiled bootstrap

2016-08-18 Thread Jeff Law

On 08/11/2016 09:28 PM, Andi Kleen wrote:


If sym1 results in a return value that is some useful tree and inv1
is true and cst1 is true via this call:


The only way for get_single_symbol to return a non NULL tree
is to hit the return at the end -- and that always initializes
inv and neg.

Right.



And when the return is NULL the && prevents evaluating
inv or neg.
Consider the case where sym1 results in a non-null return value (and 
initializes neg1/inv1), but sym2 results in a null return value, leaving 
neg2/inv2 undefined, but cst2 is can still be true (ADDR_EXPR with an 
invariant address comes to mind).


Thus we can get into these statements:


  tree cst = cst1 ? val1 : val2;
  tree inv = cst1 ? inv2 : inv1;


Note carefully how they test cst1 and depending on its value, they may 
read val2 or inv2.


Jeff



[committed] selftest.h: add class line_table_test

2016-08-18 Thread David Malcolm
input.c has a fixture class for running each selftest with a fresh
line_table, and logic for looping over various interesting line_table
test cases.

This patch exposes the above in selftest.h so that such
location-handling tests can be written in other files, renaming the
class from temp_line_table to line_table_test.

Also, the patch moves the stored line table ptr from being a member of
the test class to being a global GC-root, to avoid it being collected
if the GC runs during such a test.

Successfully bootstrapped on x86_64-pc-linux-gnu.

Committed to trunk as r239580.

gcc/ChangeLog:
* input.c (saved_line_table): New global.
(class selftest::temp_line_table): Rename to line_table_test and
move declaration to selftest.h, and drop field m_old_line_table.
(selftest::temp_line_table::temp_line_table): Rename ctor to...
(selftest::line_table_test::line_table_test): ...this.  Add a
default ctor.  Store current value of line_table within
saved_line_table.
(selftest::temp_line_table::~temp_line_table): Rename dtor to...
(selftest::line_table_test::~line_table_test): ...this, and
restore line_table from the saved_line_table, rather than
m_old_line_table.
(selftest::test_accessing_ordinary_linemaps): Update for above
renaming.
(selftest::test_lexer): Likewise.
(struct selftest::lexer_test): Likewise.
(selftest::lexer_test::lexer_test): Likewise.
(selftest::input_c_tests): Move the looping over test cases from
here into...
(selftest::for_each_line_table_case): New function.
* input.h (saved_line_table): New decl.
* selftest.h (struct selftest::line_table_case): New forward decl.
(class selftest::line_table_test): New class, moved here from
selftest::temp_line_table in input.c, and renamed.
(selftest::for_each_line_table_case): New decl.
---
 gcc/input.c| 126 +
 gcc/input.h|   1 +
 gcc/selftest.h |  42 +++
 3 files changed, 117 insertions(+), 52 deletions(-)

diff --git a/gcc/input.c b/gcc/input.c
index 76a3307..2dcecfb 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -114,6 +114,13 @@ location_t input_location = UNKNOWN_LOCATION;
 
 struct line_maps *line_table;
 
+/* A stashed copy of "line_table" for use by selftest::line_table_test.
+   This needs to be a global so that it can be a GC root, and thus
+   prevent the stashed copy from being garbage-collected if the GC runs
+   during a line_table_test.  */
+
+struct line_maps *saved_line_table;
+
 static fcache *fcache_tab;
 static const size_t fcache_tab_size = 16;
 static const size_t fcache_buffer_size = 4 * 1024;
@@ -1591,8 +1598,8 @@ assert_loceq (const char *exp_filename, int exp_linenum, 
int exp_colnum,
 ASSERT_EQ (exp_colnum, LOCATION_COLUMN (loc));
 }
 
-/* Various selftests in this file involve constructing a line table
-   and one or more line maps within it.
+/* Various selftests involve constructing a line table and one or more
+   line maps within it.
 
For maximum test coverage we want to run these tests with a variety
of situations:
@@ -1618,29 +1625,35 @@ struct line_table_case
   int m_base_location;
 };
 
-/* A class for overriding the global "line_table" within a selftest,
-   restoring its value afterwards.  */
+/* Constructor.  Store the old value of line_table, and create a new
+   one, using sane defaults.  */
 
-class temp_line_table
+line_table_test::line_table_test ()
 {
- public:
-  temp_line_table (const line_table_case &);
-  ~temp_line_table ();
-
- private:
-  line_maps *m_old_line_table;
-};
+  gcc_assert (saved_line_table == NULL);
+  saved_line_table = line_table;
+  line_table = ggc_alloc ();
+  linemap_init (line_table, BUILTINS_LOCATION);
+  gcc_assert (saved_line_table->reallocator);
+  line_table->reallocator = saved_line_table->reallocator;
+  gcc_assert (saved_line_table->round_alloc_size);
+  line_table->round_alloc_size = saved_line_table->round_alloc_size;
+  line_table->default_range_bits = 0;
+}
 
 /* Constructor.  Store the old value of line_table, and create a new
one, using the sitation described in CASE_.  */
 
-temp_line_table::temp_line_table (const line_table_case _)
-  : m_old_line_table (line_table)
+line_table_test::line_table_test (const line_table_case _)
 {
+  gcc_assert (saved_line_table == NULL);
+  saved_line_table = line_table;
   line_table = ggc_alloc ();
   linemap_init (line_table, BUILTINS_LOCATION);
-  line_table->reallocator = m_old_line_table->reallocator;
-  line_table->round_alloc_size = m_old_line_table->round_alloc_size;
+  gcc_assert (saved_line_table->reallocator);
+  line_table->reallocator = saved_line_table->reallocator;
+  gcc_assert (saved_line_table->round_alloc_size);
+  line_table->round_alloc_size = saved_line_table->round_alloc_size;
   line_table->default_range_bits = 

Re: [PATCH 2/2][v4] Drop excess size used for run time allocated stack variables.

2016-08-18 Thread Jeff Law

On 07/26/2016 09:53 AM, Dominik Vogt wrote:

Finally a patch that works and is simple.  Bootstrapped and
regression tested on s390, s390x biarch and x86_64.  The new patch
exploits the known alignment of (stack pointer +
STACK_DYNAMIC_OFFSET) as described earlier (see below).  I think
that is the right way to get rid of the extra allocation.  It
took a long time to understand the problem.

As the patch triggers a bug in the fortran compiler, the
der_type.f90 test case may fail on some targets if this patch is
used without the fortran fix that I've posted in another thread.

(The patch also contains a fix for a typo in a comment in the
patched function.)

See ChangeLog for a full description of the new patch.

Since the patch is all new, we're not going to commit it without a
new OK.

I like this one much better :-)

OK.

Thanks for your patience,
JEff



Re: C PATCH to disallow invalid arguments to __atomic_* (PR c/71514)

2016-08-18 Thread Joseph Myers
On Thu, 18 Aug 2016, Marek Polacek wrote:

> As discussed in the PR, for __atomic_* functions we shouldn't accept
> pointer-to-function and pointer-to-VLA as arguments.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Set -fprofile-update=atomic when -pthread is present

2016-08-18 Thread Richard Biener
On August 18, 2016 5:54:49 PM GMT+02:00, Jakub Jelinek  wrote:
>On Thu, Aug 18, 2016 at 08:51:31AM -0700, Andi Kleen wrote:
>> > I'd prefer to make updates atomic in multi-threaded applications.
>> > The best proxy we have for that is -pthread.
>> > 
>> > Is it slower, most definitely, but odds are we're giving folks
>> > garbage data otherwise, which in many ways is even worse.
>> 
>> It will likely be catastrophically slower in some cases. 
>> 
>> Catastrophically as in too slow to be usable.
>> 
>> An atomic instruction is a lot more expensive than a single
>increment. Also
>> they sometimes are really slow depending on the state of the machine.
>
>Can't we just have thread-local copies of all the counters (perhaps
>using
>__thread pointer as base) and just atomically merge at thread
>termination?

I suggested that as well but of course it'll have its own class of issues 
(short lived threads, so we need to somehow re-use counters from terminated 
threads, large number of threads and thus using too much memory for the 
counters)

Richard.

>   Jakub




Re: [PATCH] Set -fprofile-update=atomic when -pthread is present

2016-08-18 Thread Richard Biener
On August 18, 2016 5:51:31 PM GMT+02:00, Andi Kleen  wrote:
>> I'd prefer to make updates atomic in multi-threaded applications.
>> The best proxy we have for that is -pthread.
>> 
>> Is it slower, most definitely, but odds are we're giving folks
>> garbage data otherwise, which in many ways is even worse.
>
>It will likely be catastrophically slower in some cases. 
>
>Catastrophically as in too slow to be usable.
>
>An atomic instruction is a lot more expensive than a single increment.
>Also
>they sometimes are really slow depending on the state of the machine.

The important part is to optimize increments in loops - sth we have special 
ways to do for the regular counters.  OTOH if we can delay instrumenting to 
late enough we can instrument loops optimized in the first place.

Richard.

>-Andi




Re: [PATCH] Set -fprofile-update=atomic when -pthread is present

2016-08-18 Thread Jakub Jelinek
On Thu, Aug 18, 2016 at 08:51:31AM -0700, Andi Kleen wrote:
> > I'd prefer to make updates atomic in multi-threaded applications.
> > The best proxy we have for that is -pthread.
> > 
> > Is it slower, most definitely, but odds are we're giving folks
> > garbage data otherwise, which in many ways is even worse.
> 
> It will likely be catastrophically slower in some cases. 
> 
> Catastrophically as in too slow to be usable.
> 
> An atomic instruction is a lot more expensive than a single increment. Also
> they sometimes are really slow depending on the state of the machine.

Can't we just have thread-local copies of all the counters (perhaps using
__thread pointer as base) and just atomically merge at thread termination?

Jakub


Re: [PATCH] Set -fprofile-update=atomic when -pthread is present

2016-08-18 Thread Jeff Law

On 08/18/2016 09:51 AM, Andi Kleen wrote:

I'd prefer to make updates atomic in multi-threaded applications.
The best proxy we have for that is -pthread.

Is it slower, most definitely, but odds are we're giving folks
garbage data otherwise, which in many ways is even worse.


It will likely be catastrophically slower in some cases.

Catastrophically as in too slow to be usable.

An atomic instruction is a lot more expensive than a single increment. Also
they sometimes are really slow depending on the state of the machine.

And for those cases there's a way to override.

The default should be set for correctness.

jeff


Re: [PATCH] Set -fprofile-update=atomic when -pthread is present

2016-08-18 Thread Andi Kleen
> I'd prefer to make updates atomic in multi-threaded applications.
> The best proxy we have for that is -pthread.
> 
> Is it slower, most definitely, but odds are we're giving folks
> garbage data otherwise, which in many ways is even worse.

It will likely be catastrophically slower in some cases. 

Catastrophically as in too slow to be usable.

An atomic instruction is a lot more expensive than a single increment. Also
they sometimes are really slow depending on the state of the machine.

-Andi


Re: [PATCH] Add a TARGET_GEN_MEMSET_VALUE hook

2016-08-18 Thread Uros Bizjak
On Thu, Aug 18, 2016 at 4:53 PM, H.J. Lu  wrote:
> On Wed, Aug 17, 2016 at 11:21 PM, Uros Bizjak  wrote:
>> On Wed, Aug 17, 2016 at 10:11 PM, H.J. Lu  wrote:
>>> builtin_memset_gen_str returns a register used for memset, which only
>>> supports integer registers.  But a target may use vector registers in
>>> memmset.  This patch adds a TARGET_GEN_MEMSET_VALUE hook to duplicate
>>> QImode value to mode derived from STORE_MAX_PIECES, which can be used
>>> with vector instructions.  The default hook is the same as the original
>>> builtin_memset_gen_str.  A target can override it to support vector
>>> instructions for STORE_MAX_PIECES.
>>>
>>> Tested on x86-64 and i686.  Any comments?
>>
>> It looks to me you have attached an older version of the patch,
>> STORE_MAX_PIECES change in i386.h is already in the mainline.
>>
>
> Did you mean MOVE_MAX_PIECES?  There is no STORE_MAX_PIECES
> in i386.h

Uh, yes. Sorry for the confusion.

Uros.


Re: [PATCH] Spelling suggestions for misspelled preprocessor directives

2016-08-18 Thread Marek Polacek
On Thu, Aug 18, 2016 at 11:28:02AM -0400, David Malcolm wrote:
> This patch allows the preprocessor to offer suggestions for misspelled
> directives, taking us from e.g.:
> 
> test.c:5:2: error: invalid preprocessing directive #endfi
>  #endfi
>   ^
> 
> to:
> 
> test.c:5:2: error: invalid preprocessing directive #endfi; did you mean 
> #endif?
>  #endfi
>   ^
>   endif
> 
> I can self-approve all of it apart from the changes to c-family.
> 
> Successfully bootstrapped on x86_64-pc-linux-gnu.
> 
> OK for trunk?
> 
> gcc/c-family/ChangeLog:
>   * c-common.c: Include "spellcheck.h".
>   (cb_get_suggestion): New function.
>   * c-common.h (cb_get_suggestion): New decl.
>   * c-lex.c (init_c_lex): Initialize cb->get_suggestion to
>   cb_get_suggestion.
> 
> gcc/testsuite/ChangeLog:
>   * gcc.dg/cpp/misspelled-directive-1.c: New testcase.
>   * gcc.dg/cpp/misspelled-directive-2.c: New testcase.
> 
> libcpp/ChangeLog:
>   * directives.c (directive_names): New array.
>   (_cpp_handle_directive): Offer spelling suggestions for misspelled
>   directives.
>   * errors.c (cpp_diagnostic_at_richloc): New function.
>   (cpp_error_at_richloc): New function.
>   * include/cpplib.h (struct cpp_callbacks): Add field
>   "get_suggestion".
>   (cpp_error_at_richloc): New decl.
> ---
>  gcc/c-family/c-common.c   | 17 ++
>  gcc/c-family/c-common.h   |  5 +++
>  gcc/c-family/c-lex.c  |  1 +
>  gcc/testsuite/gcc.dg/cpp/misspelled-directive-1.c | 12 +++
>  gcc/testsuite/gcc.dg/cpp/misspelled-directive-2.c | 21 
>  libcpp/directives.c   | 41 
> +--
>  libcpp/errors.c   | 36 
>  libcpp/include/cpplib.h   |  8 +
>  8 files changed, 139 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/cpp/misspelled-directive-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/cpp/misspelled-directive-2.c
> 
> diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
> index 569f000..a8ef595 100644
> --- a/gcc/c-family/c-common.c
> +++ b/gcc/c-family/c-common.c
> @@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "opts.h"
>  #include "gimplify.h"
>  #include "substring-locations.h"
> +#include "spellcheck.h"
>  
>  cpp_reader *parse_in;/* Declared in c-pragma.h.  */
>  
> @@ -12934,6 +12935,22 @@ cb_get_source_date_epoch (cpp_reader *pfile 
> ATTRIBUTE_UNUSED)
>return (time_t) epoch;
>  }
>  
> +/* Callback for libcpp for offering spelling suggestions for misspelled
> +   directives.  GOAL is an unrecognized string; CANDIDATES is a
> +   NULL-terminated array of candidate strings.  Return the closest
> +   match to GOAL within CANDIDATES, or NULL if none are good
> +   suggestions.  */
> +
> +const char *
> +cb_get_suggestion (cpp_reader *, const char *goal,
> +const char * const *candidates)

Should be "const char *const" (i.e. no space).

> +{
> +  best_match bm (goal);
> +  while (*candidates)
> +bm.consider (*(candidates++));

I don't think you need the parens here.

> +  return bm.get_best_meaningful_candidate ();
> +}
> +
>  /* Check and possibly warn if two declarations have contradictory
> attributes, such as always_inline vs. noinline.  */
>  
> diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
> index 61f9ced..a3da631 100644
> --- a/gcc/c-family/c-common.h
> +++ b/gcc/c-family/c-common.h
> @@ -1110,6 +1110,11 @@ extern time_t cb_get_source_date_epoch (cpp_reader 
> *pfile);
> __TIME__ can store.  */
>  #define MAX_SOURCE_DATE_EPOCH HOST_WIDE_INT_C (253402300799)
>  
> +/* Callback for libcpp for offering spelling suggestions for misspelled
> +   directives.  */
> +extern const char *cb_get_suggestion (cpp_reader *, const char *,
> +   const char * const *);
> +

Same here.

Otherwise the c-family/ changes LGTM.

Marek


Re: [PATCH] Add a TARGET_GEN_MEMSET_VALUE hook

2016-08-18 Thread H.J. Lu
On Thu, Aug 18, 2016 at 1:18 AM, Richard Biener
 wrote:
> On Wed, Aug 17, 2016 at 10:11 PM, H.J. Lu  wrote:
>> builtin_memset_gen_str returns a register used for memset, which only
>> supports integer registers.  But a target may use vector registers in
>> memmset.  This patch adds a TARGET_GEN_MEMSET_VALUE hook to duplicate
>> QImode value to mode derived from STORE_MAX_PIECES, which can be used
>> with vector instructions.  The default hook is the same as the original
>> builtin_memset_gen_str.  A target can override it to support vector
>> instructions for STORE_MAX_PIECES.
>>
>> Tested on x86-64 and i686.  Any comments?
>>
>> H.J.
>> ---
>> gcc/
>>
>> * builtins.c (builtin_memset_gen_str): Call targetm.gen_memset_value.
>> (default_gen_memset_value): New function.
>> * target.def (gen_memset_value): New hook.
>> * targhooks.c: Inclue "expmed.h" and "builtins.h".
>> (default_gen_memset_value): New function.
>
> I see default_gen_memset_value in builtins.c but it belongs here.
>
>> * targhooks.h (default_gen_memset_value): New prototype.
>> * config/i386/i386.c (ix86_gen_memset_value): New function.
>> (TARGET_GEN_MEMSET_VALUE): New.
>> * config/i386/i386.h (STORE_MAX_PIECES): Likewise.
>> * doc/tm.texi.in: Add TARGET_GEN_MEMSET_VALUE hook.
>> * doc/tm.texi: Updated.
>>

Like this?

H.J.
From bb2d5a4f79dd79a5b3772987d51123547b9319fa Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Sun, 6 Mar 2016 06:38:21 -0800
Subject: [PATCH] Add the TARGET_GEN_MEMSET_VALUE hook

builtin_memset_gen_str returns a register used for memset.  But it only
supports integer registers.  But a target may vector registers in memmset.
This patch adds the TARGET_GEN_MEMSET_VALUE hook to duplicate QImode value
to mode specified by STORE_MAX_PIECES with vector instructions.  The default
hook is the same as the original builtin_memset_gen_str.  A target can
override it to support a larger STORE_MAX_PIECES.

gcc/

	* builtins.c (c_readstr): Make it global.
	(builtin_memset_gen_str): Call targetm.gen_memset_value.
	* builtins.h: Add c_readstr.
	* target.def (gen_memset_value): New hook.
	* targhooks.c: Inclue "expmed.h" and "builtins.h".
	(default_gen_memset_value): New function.
	* targhooks.h (default_gen_memset_value): New prototype.
	* config/i386/i386.c (ix86_gen_memset_value): New function.
	(TARGET_GEN_MEMSET_VALUE): New.
	* config/i386/i386.h (STORE_MAX_PIECES): Likewise.
	* doc/tm.texi.in: Add TARGET_GEN_MEMSET_VALUE hook.
	* doc/tm.texi: Updated.

gcc/testsuite/

	* gcc.target/i386/pieces-memset-1.c: New test.
	* gcc.target/i386/pieces-memset-2.c: Likewise.
	* gcc.target/i386/pieces-memset-3.c: Likewise.
	* gcc.target/i386/pieces-memset-4.c: Likewise.
	* gcc.target/i386/pieces-memset-5.c: Likewise.
	* gcc.target/i386/pieces-memset-6.c: Likewise.
	* gcc.target/i386/pieces-memset-7.c: Likewise.
	* gcc.target/i386/pieces-memset-8.c: Likewise.
	* gcc.target/i386/pieces-memset-9.c: Likewise.
	* gcc.target/i386/pieces-memset-10.c: Likewise.
	* gcc.target/i386/pieces-memset-11.c: Likewise.
	* gcc.target/i386/pieces-memset-12.c: Likewise.
	* gcc.target/i386/pieces-memset-13.c: Likewise.
	* gcc.target/i386/pieces-memset-14.c: Likewise.
	* gcc.target/i386/pieces-memset-15.c: Likewise.
	* gcc.target/i386/pieces-memset-16.c: Likewise.
	* gcc.target/i386/pieces-memset-17.c: Likewise.
	* gcc.target/i386/pieces-memset-18.c: Likewise.
	* gcc.target/i386/pieces-memset-19.c: Likewise.
	* gcc.target/i386/pieces-memset-20.c: Likewise.
	* gcc.target/i386/pieces-memset-21.c: Likewise.
	* gcc.target/i386/pieces-memset-22.c: Likewise.
	* gcc.target/i386/pieces-memset-23.c: Likewise.
	* gcc.target/i386/pieces-memset-24.c: Likewise.
	* gcc.target/i386/pieces-memset-25.c: Likewise.
	* gcc.target/i386/pieces-memset-26.c: Likewise.
	* gcc.target/i386/pieces-memset-27.c: Likewise.
	* gcc.target/i386/pieces-memset-28.c: Likewise.
	* gcc.target/i386/pieces-memset-29.c: Likewise.
	* gcc.target/i386/pieces-memset-30.c: Likewise.
	* gcc.target/i386/pieces-memset-31.c: Likewise.
	* gcc.target/i386/pieces-memset-32.c: Likewise.
	* gcc.target/i386/pieces-memset-33.c: Likewise.
	* gcc.target/i386/pieces-memset-34.c: Likewise.
	* gcc.target/i386/pieces-memset-35.c: Likewise.
	* gcc.target/i386/pieces-memset-36.c: Likewise.
	* gcc.target/i386/pieces-memset-37.c: Likewise.
	* gcc.target/i386/pieces-memset-38.c: Likewise.
	* gcc.target/i386/pieces-memset-39.c: Likewise.
	* gcc.target/i386/pieces-memset-40.c: Likewise.
	* gcc.target/i386/pieces-memset-41.c: Likewise.
	* gcc.target/i386/pieces-memset-42.c: Likewise.
	* gcc.target/i386/pieces-memset-43.c: Likewise.
	* gcc.target/i386/pieces-memset-44.c: Likewise.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 03a0dc8..3706660 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -89,7 +89,6 @@ builtin_info_type builtin_info[(int)END_BUILTINS];
 /* Non-zero if 

Re: [PATCH, PR70895] Add copy mapping for reductions on OpenACC loop directives

2016-08-18 Thread Chung-Lin Tang
On 2016/8/16 7:06 PM, Thomas Schwinge wrote:
> Hi!
> 
> On Mon, 15 Aug 2016 19:25:48 +0800, Chung-Lin Tang  
> wrote:
>> per the discussion on the bugzilla PR page, reductions on OpenACC loop
>> directives will automatically get a copy clause mapping on an enclosing
>> parallel construct (unless bounded by a local variable or an explicit
>> firstprivate clause).
>>
>> There is also a patch for libgomp testsuite cases. Asides from the
>> fortran case which now needs explicit firstprivate clauses to work,
>> other C/C++ cases have been adjusted to remove explicit copy clauses.
>> (I have not exhaustively searched everywhere to eliminate them though)
>>
>> This has been tested using gomp-4_0-branch, which is based on GCC 6,
>> which is what this PR was originally filed for.
>>
>> I will be committing this soon for gomp-4_0-branch,
>> is this okay for gcc-6-branch and trunk as well?
> 
> On Mon, 15 Aug 2016 15:23:14 +0200, Jakub Jelinek  wrote:
>> The gimplify.c change is ok for trunk and 6.3 (after 6.2 is released).
>> As for the testsuite, I'll leave it to Thomas/Nathan on what they prefer,
>> I'd think that having explicit clauses in e.g. half of the testcases and
>> implicit ones in the other half wouldn't hurt, so that both are tested
>> enough.
> 
> ACK, but from a quick scan it seems as if there's still sufficient
> coverage remaining with explicit usage.
> 
> What I'd like to see changed/added, though, is some libgomp.oacc-fortran
> test coverage of the new implicit copy clauses, and a handful of C/C++ as
> well as Fortran tree-scanning tests in gcc/testsuite/ -- basically, to
> document the expected behavior.

I've added the kernels case assertion you mentioned below, and committed the
gimplify patch to trunk, gcc-6-branch, and gomp-4_0-branch.
I've also added some of those testing items you mentioned, see the attached 
updated
testsuite patches.

>> +  /* For reductions clauses in OpenACC loop directives, by default create a
>> + copy clause on the enclosing parallel construct for carrying back the
>> + results.  */
>> +  if (ctx->region_type == ORT_ACC && (flags & GOVD_REDUCTION))
> 
> Should this be "ctx->region_type & ORT_ACC" instead of "=="?  I suppose
> the same thing also applies to OpenACC nested parallelism:
> 
> #pragma acc parallel
>   {
> [...]
> #pragma acc parallel reduction([...])
> {
>   [...]
> 
> ..., which we're not supporting right now, but that'll make it easier to
> spot once adding such support.

I'm ignoring this issue for now; this patch only deals with acc loop directives,
hence "== ORT_ACC". For Cesar's code path that deals with adding copy clauses 
for
parallel construct reductions, see the case in gimple_adjust_omp_clauses().
IMHO, to infer how we'll deal with nested parallelism right now is too long a 
shot.

> I suppose we can also run into this for ORT_ACC_KERNELS (if not, should
> mark/document that with gcc_unreachable or some such); per OpenACC 2.0a,
> 2.5.2 Kernels Construct, "a scalar variable referenced in the kernels
> construct that does not appear in a data clause for the construct or any
> enclosing data construct will be treated as if it appeared in a copy
> clause", so we should assert that this already is a GOVD_MAP.

Fair enough, I've added that in the gimplify patch.

> Can we also run into this for ORT_ACC_DATA and ORT_ACC_HOST_DATA, but
> nothing needs to be done for these?

Don't we have other mechanisms for enforcing proper nesting? data/host_data
constructs are supposed to be host code. They should not be directly enclosing 
loop directives,
as what we're concerned here.

The attached are the final patches I committed.

Thanks,
Chung-Lin

2016-08-18  Chung-Lin Tang  

PR middle-end/70895
gcc/
* gimplify.c (omp_add_variable): Adjust/add variable mapping on
enclosing parallel construct for reduction variables on OpenACC loop
directives.

gcc/testsuite/
* gfortran.dg/goacc/loop-tree-1.f90: Add gimple scan-tree-dump test.
* c-c++-common/goacc/reduction-1.c: Likewise.
* c-c++-common/goacc/reduction-2.c: Likewise.
* c-c++-common/goacc/reduction-3.c: Likewise.
* c-c++-common/goacc/reduction-4.c: Likewise.

libgomp/
* testsuite/libgomp.oacc-fortran/reduction-7.f90: Add explicit
firstprivate clauses.
* testsuite/libgomp.oacc-fortran/reduction-6.f90: Remove explicit
copy clauses.
* testsuite/libgomp.oacc-c-c++-common/reduction-7.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-cplx-flt.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-flt.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/collapse-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/collapse-4.c: Likewise.
* 

[PATCH] Spelling suggestions for misspelled preprocessor directives

2016-08-18 Thread David Malcolm
This patch allows the preprocessor to offer suggestions for misspelled
directives, taking us from e.g.:

test.c:5:2: error: invalid preprocessing directive #endfi
 #endfi
  ^

to:

test.c:5:2: error: invalid preprocessing directive #endfi; did you mean #endif?
 #endfi
  ^
  endif

I can self-approve all of it apart from the changes to c-family.

Successfully bootstrapped on x86_64-pc-linux-gnu.

OK for trunk?

gcc/c-family/ChangeLog:
* c-common.c: Include "spellcheck.h".
(cb_get_suggestion): New function.
* c-common.h (cb_get_suggestion): New decl.
* c-lex.c (init_c_lex): Initialize cb->get_suggestion to
cb_get_suggestion.

gcc/testsuite/ChangeLog:
* gcc.dg/cpp/misspelled-directive-1.c: New testcase.
* gcc.dg/cpp/misspelled-directive-2.c: New testcase.

libcpp/ChangeLog:
* directives.c (directive_names): New array.
(_cpp_handle_directive): Offer spelling suggestions for misspelled
directives.
* errors.c (cpp_diagnostic_at_richloc): New function.
(cpp_error_at_richloc): New function.
* include/cpplib.h (struct cpp_callbacks): Add field
"get_suggestion".
(cpp_error_at_richloc): New decl.
---
 gcc/c-family/c-common.c   | 17 ++
 gcc/c-family/c-common.h   |  5 +++
 gcc/c-family/c-lex.c  |  1 +
 gcc/testsuite/gcc.dg/cpp/misspelled-directive-1.c | 12 +++
 gcc/testsuite/gcc.dg/cpp/misspelled-directive-2.c | 21 
 libcpp/directives.c   | 41 +--
 libcpp/errors.c   | 36 
 libcpp/include/cpplib.h   |  8 +
 8 files changed, 139 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/cpp/misspelled-directive-1.c
 create mode 100644 gcc/testsuite/gcc.dg/cpp/misspelled-directive-2.c

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 569f000..a8ef595 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "opts.h"
 #include "gimplify.h"
 #include "substring-locations.h"
+#include "spellcheck.h"
 
 cpp_reader *parse_in;  /* Declared in c-pragma.h.  */
 
@@ -12934,6 +12935,22 @@ cb_get_source_date_epoch (cpp_reader *pfile 
ATTRIBUTE_UNUSED)
   return (time_t) epoch;
 }
 
+/* Callback for libcpp for offering spelling suggestions for misspelled
+   directives.  GOAL is an unrecognized string; CANDIDATES is a
+   NULL-terminated array of candidate strings.  Return the closest
+   match to GOAL within CANDIDATES, or NULL if none are good
+   suggestions.  */
+
+const char *
+cb_get_suggestion (cpp_reader *, const char *goal,
+  const char * const *candidates)
+{
+  best_match bm (goal);
+  while (*candidates)
+bm.consider (*(candidates++));
+  return bm.get_best_meaningful_candidate ();
+}
+
 /* Check and possibly warn if two declarations have contradictory
attributes, such as always_inline vs. noinline.  */
 
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 61f9ced..a3da631 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1110,6 +1110,11 @@ extern time_t cb_get_source_date_epoch (cpp_reader 
*pfile);
__TIME__ can store.  */
 #define MAX_SOURCE_DATE_EPOCH HOST_WIDE_INT_C (253402300799)
 
+/* Callback for libcpp for offering spelling suggestions for misspelled
+   directives.  */
+extern const char *cb_get_suggestion (cpp_reader *, const char *,
+ const char * const *);
+
 extern GTY(()) string_concat_db *g_string_concat_db;
 
 /* libcpp can calculate location information about a range of characters
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index 4c7e385..c904ee6 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -81,6 +81,7 @@ init_c_lex (void)
   cb->read_pch = c_common_read_pch;
   cb->has_attribute = c_common_has_attribute;
   cb->get_source_date_epoch = cb_get_source_date_epoch;
+  cb->get_suggestion = cb_get_suggestion;
 
   /* Set the debug callbacks if we can use them.  */
   if ((debug_info_level == DINFO_LEVEL_VERBOSE
diff --git a/gcc/testsuite/gcc.dg/cpp/misspelled-directive-1.c 
b/gcc/testsuite/gcc.dg/cpp/misspelled-directive-1.c
new file mode 100644
index 000..f79670a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/misspelled-directive-1.c
@@ -0,0 +1,12 @@
+#ifndef SOME_GUARD /* { dg-error "unterminated" } */
+
+#if 1
+/* Typo here: "endfi" should have been "endif".  */
+#endfi /* { dg-error "invalid preprocessing directive #endfi; did you mean 
#endif?" } */
+
+int make_non_empty;
+
+/* Another transposition typo:  */
+#deifne FOO /* { dg-error "invalid preprocessing directive #deifne; did you 
mean #define?" } */ 
+
+#endif /* #ifndef SOME_GUARD */
diff --git 

Re: [PATCH] Add a TARGET_GEN_MEMSET_VALUE hook

2016-08-18 Thread H.J. Lu
On Wed, Aug 17, 2016 at 11:21 PM, Uros Bizjak  wrote:
> On Wed, Aug 17, 2016 at 10:11 PM, H.J. Lu  wrote:
>> builtin_memset_gen_str returns a register used for memset, which only
>> supports integer registers.  But a target may use vector registers in
>> memmset.  This patch adds a TARGET_GEN_MEMSET_VALUE hook to duplicate
>> QImode value to mode derived from STORE_MAX_PIECES, which can be used
>> with vector instructions.  The default hook is the same as the original
>> builtin_memset_gen_str.  A target can override it to support vector
>> instructions for STORE_MAX_PIECES.
>>
>> Tested on x86-64 and i686.  Any comments?
>
> It looks to me you have attached an older version of the patch,
> STORE_MAX_PIECES change in i386.h is already in the mainline.
>

Did you mean MOVE_MAX_PIECES?  There is no STORE_MAX_PIECES
in i386.h

H.J.


Re: [PATCH] Generalize overriding mechanism for debug info output options defaults

2016-08-18 Thread Jeff Law

On 08/18/2016 08:01 AM, Pierre-Marie de Rodat wrote:

On 08/18/2016 11:40 AM, Jakub Jelinek wrote:

I believe that is the preferred way, rather than macros.
The macros are yet another thing that would need to be undone if we ever
start supporting multiple targets in the same binary.
I don't see any advantages of introducing the macros.


Ah, I understand, thank you. We debated a bit internally about whether
one was better than the other, so thank you for the rationale!

By the way, are there “official” plans to support multiple targets in
the same binary at some point, by the way? It’s the first time I hear
about it (that would be nice!).
It's pie in the sky right now, but it's something many of us want to see 
happen.  To that end we're trying to steer away from certain problematic 
idioms (like using macros to conditionalize behavior) towards schemes 
that would work in that world (hooks).


One good way to start thinking about this stuff is what would need to 
change to support a world where the target (or front-end) were a DSO and 
replaceable.


jeff




Re: [PATCH] DWARF: do not emit DW_TAG_variable to materialize DWARF procedures

2016-08-18 Thread Pierre-Marie de Rodat

On 08/18/2016 02:39 PM, Richard Biener wrote:

Ok.


Thank you for the quick review! This is commited, now (revision 239575).

--
Pierre-Marie de Rodat


Re: [PATCH] DWARF: space-optimize loc. descr. for integer literals on 32-bit targets

2016-08-18 Thread Pierre-Marie de Rodat

On 08/18/2016 01:29 PM, Trevor Saunders wrote:

HOST_BITS_PER_WIDE_INT is now always 64, so you don't need that check.


Good point, thank you! I’ll adapt the patch.

--
Pierre-Marie de Rodat


Re: [patch] New libstdc++ docs on testing and library versioning

2016-08-18 Thread Jonathan Wakely

On 07/08/16 23:36 -0600, Sandra Loosemore wrote:

On 08/04/2016 06:37 PM, Jonathan Wakely wrote:

I've been working on some changes to the libstdc++ manual recently.

A lot of the changes are just using DocBook markup with more semantic
meaning (e.g.  or  instead of using  for
everything that should use a monospace font) but there are some
changes to content too.

I've added a new subsection documenting the steps needed to bump the
library version when adding new symbols, and rewritten the section on
writing testcases. The main reason for the latter is to encourage the
use of { dg-do run { target c++11 } } rather than the { dg-options
"-std=gnu++11" } approach used until now. I've also started
documenting the libstdc++-specific dg-require-SUPPORT directives
available for our tests.

[snip]


I tried to look over this patch and my eyes started glazing over

I suggest trying to separate the markup changes (possibly multiple 
patches, for different kinds of changes) from the new content.  From 
my perspective, I think formatting cleanups and minor copy-editing can 
go in under the obvious patch rule as long as you are satisfied with 
the way it looks and have self-reviewed the patch to make sure you 
didn't accidentally delete some chunk of text or introduce some other 
unintended change.


The new material should be reviewed by someone knowledgeable enough to 
catch problems with the content.


I've committed the changes as four patches, attached.

1 only changes markup.

   Improve markup in libstdc++ manual
   
   * doc/xml/manual/build_hacking.xml: Improve markup.

   * doc/xml/manual/test.xml: Likewise. Change section title from "Test"
   to "Testing".
   * doc/xml/faq.xml: Change link text to "Testing".


2 adds the new section to the configure hacking docs.

   Document libstdc++.so versioning in manual
   
   * doc/xml/manual/build_hacking.xml: New section on shared library

   versioning.

3 is mostly markup, but documents some more makefile targets from the
testsuite sub-dir.

   Improve documentation of libstdc++ test targets
   
   * doc/xml/manual/test.xml: Improve documentation of test targets.

   Document new-abi-baseline, check-debug, and check-parallel targets.


4 rewrites the docs on writing tests.

   Expand libstdc++ docs on testing
   
   * doc/xml/manual/test.xml (test.run.permutations): Expand section.

   (test.new_tests): Rewrite section.
   (tests.dg.directives): New section.
   * doc/html/*: Regenerate.


The result of 2 is at:
https://gcc.gnu.org/onlinedocs/libstdc++/manual/appendix_porting.html#build_hacking.configure.version

And the result of 4 starts at:
https://gcc.gnu.org/onlinedocs/libstdc++/manual/test.html#test.run.permutations

The previous version was identical to:
https://gcc.gnu.org/onlinedocs/gcc-6.1.0/libstdc++/manual/manual/test.html#test.run.permutations

If anyone wants to review that content I'll happily make changes, but
I thought I'd go ahead and commit it. It's much easier to read the
HTML in a browser than to read patches against the Docbook XML.


commit 1305655217385a4b315236c83a0da7ba4e81ca3a
Author: Jonathan Wakely 
Date:   Thu Aug 18 14:04:24 2016 +0100

Improve markup in libstdc++ manual

* doc/xml/manual/build_hacking.xml: Improve markup.
* doc/xml/manual/test.xml: Likewise. Change section title from "Test"
to "Testing".
* doc/xml/faq.xml: Change link text to "Testing".

diff --git a/libstdc++-v3/doc/xml/faq.xml b/libstdc++-v3/doc/xml/faq.xml
index b24ee22..9466231 100644
--- a/libstdc++-v3/doc/xml/faq.xml
+++ b/libstdc++-v3/doc/xml/faq.xml
@@ -328,7 +328,7 @@
 performance testing. Please consult the 
 http://www.w3.org/1999/xlink; 
xlink:href="http://gcc.gnu.org/install/test.html;>testing
 documentation for GCC and
-Test in the libstdc++
+Testing in the libstdc++
 manual for more details.
  
 
diff --git a/libstdc++-v3/doc/xml/manual/build_hacking.xml 
b/libstdc++-v3/doc/xml/manual/build_hacking.xml
index 1b789d3..0bcd879 100644
--- a/libstdc++-v3/doc/xml/manual/build_hacking.xml
+++ b/libstdc++-v3/doc/xml/manual/build_hacking.xml
@@ -93,7 +93,7 @@ in the build directory starts the build process. The 
all targ
 
   
 Regenerate all generated files by using the command 
-autoreconf at the top level of the libstdc++ source
+autoreconf at the top level of the libstdc++ source
 directory.
   
 
@@ -108,10 +108,10 @@ in the build directory starts the build process. The 
all targ
 
 
   
-Until that glorious day when we can use AC_TRY_LINK with a
-cross-compiler, we have to hardcode the results of what the tests
+Until that glorious day when we can use AC_TRY_LINK
+with a cross-compiler, we have to hardcode the results of what the tests
 would have shown if they could be run.  So we have an inflexible
-mess like crossconfig.m4.
+mess like crossconfig.m4.
   
 
   
@@ 

Re: Implement -Wimplicit-fallthrough (take 3)

2016-08-18 Thread Jakub Jelinek
On Thu, Aug 18, 2016 at 03:50:07PM +0200, Marek Polacek wrote:
> +case GIMPLE_BIND:
> +  {
> + gbind *bind = as_a  (stmt);
> + return last_stmt_in_scope (
> +  gimple_seq_last_stmt (gimple_bind_body (bind)));
> +  }
> +
> +case GIMPLE_TRY:
> +  {
> + gtry *try_stmt = as_a  (stmt);
> + return last_stmt_in_scope (
> +  gimple_seq_last_stmt (gimple_try_eval (try_stmt)));

Just a minor formatting detail.
stmt = gimple_seq_last_stmt (gimple_try_eval (try_stmt));
return last_stmt_in_scope (stmt);
and similarly above might be nicer.  Or do the tail recursion by hand?
No need to repost for that.

Jakub


Re: [PATCH] Generalize overriding mechanism for debug info output options defaults

2016-08-18 Thread Pierre-Marie de Rodat

On 08/18/2016 11:40 AM, Jakub Jelinek wrote:

I believe that is the preferred way, rather than macros.
The macros are yet another thing that would need to be undone if we ever
start supporting multiple targets in the same binary.
I don't see any advantages of introducing the macros.


Ah, I understand, thank you. We debated a bit internally about whether 
one was better than the other, so thank you for the rationale!


By the way, are there “official” plans to support multiple targets in 
the same binary at some point, by the way? It’s the first time I hear 
about it (that would be nice!).


--
Pierre-Marie de Rodat


Implement -Wimplicit-fallthrough (take 3): add gcc_fallthrough

2016-08-18 Thread Marek Polacek
Tested on powerpc64le-unknown-linux-gnu, aarch64-linux-gnu,
and x86_64-redhat-linux.

2016-08-18  Marek Polacek  

PR c/7652
* Makefile.in (insn-attrtab.o-warn, insn-dfatab.o-warn,
insn-latencytab.o-warn, insn-output.o-warn, insn-emit.o-warn): Add
-Wno-switch-fallthrough.
* builtins.c (expand_builtin_int_roundingfn_2): Add gcc_fallthrough.
(expand_builtin): Likewise.
* config/rs6000/rs6000.c (rs6000_builtin_vectorized_libmass): Likewise.
* convert.c (convert_to_real_1): Likewise.
(convert_to_integer_1): Likewise.
* final.c (output_alternate_entry_point): Likewise.
* genattrtab.c (make_canonical): Likewise.
(write_test_expr): Likewise.
* genpreds.c (validate_exp): Likewise.
* gimple-ssa-strength-reduction.c
(find_candidates_dom_walker::before_dom_children): Likewise.
* godump.c (go_format_type): Likewise.
* eload1.c (elimination_effects): Likewise.
* resource.c (mark_referenced_resources): Likewise.
(mark_set_resources): Likewise.
* tree-ssa-loop-ivopts.c (find_deriving_biv_for_expr): Likewise.
* varasm.c (output_addressed_constants): Likewise.
gcc/c-family/
* c-common.c (resolve_overloaded_builtin): Add gcc_fallthrough.
gcc/c/
* c-decl.c (pop_scope): Add gcc_fallthrough.
* c-typeck.c (composite_type): Likewise.
gcc/gcc/cp/
* error.c (dump_type): Add gcc_fallthrough.
(dump_decl): Likewise.
(dump_expr): Likewise.
* parser.c (cp_parser_storage_class_specifier_opt): Likewise.
(cp_parser_skip_to_end_of_template_parameter_list): Likewise.
(cp_parser_cache_defarg): Likewise.
(cp_parser_omp_for_cond): Likewise.
* semantics.c (finish_decltype_type): Likewise.
* typeck.c (structural_comptypes): Likewise.
(cp_build_binary_op): Likewise.
(cp_build_modify_expr): Likewise.
gcc/fortran/
* arith.c (eval_intrinsic): Add gcc_fallthrough.
* frontend-passes.c (optimize_op): Likewise.
(gfc_expr_walker): Likewise.
* parse.c (next_fixed): Likewise.
* primary.c (match_variable): Likewise.
* trans-array.c: Likewise.
* trans-expr.c (flatten_array_ctors_without_strlen): Likewise.
libstdc++-v3/
* libsupc++/hash_bytes.cc: Add [[gnu::fallthrough]].

--- gcc/gcc/Makefile.in
+++ gcc/gcc/Makefile.in
@@ -218,6 +218,11 @@ libgcov-merge-tool.o-warn = -Wno-error
 gimple-match.o-warn = -Wno-unused
 generic-match.o-warn = -Wno-unused
 dfp.o-warn = -Wno-strict-aliasing
+insn-attrtab.o-warn = -Wno-implicit-fallthrough
+insn-dfatab.o-warn = -Wno-implicit-fallthrough
+insn-latencytab.o-warn = -Wno-implicit-fallthrough
+insn-output.o-warn = -Wno-implicit-fallthrough
+insn-emit.o-warn = -Wno-implicit-fallthrough
 
 # All warnings have to be shut off in stage1 if the compiler used then
 # isn't gcc; configure determines that.  WARN_CFLAGS will be either
--- gcc/gcc/builtins.c
+++ gcc/gcc/builtins.c
@@ -2587,7 +2587,7 @@ expand_builtin_int_roundingfn_2 (tree exp, rtx target)
 {
 CASE_FLT_FN (BUILT_IN_IRINT):
   fallback_fn = BUILT_IN_LRINT;
-  /* FALLTHRU */
+  gcc_fallthrough ();
 CASE_FLT_FN (BUILT_IN_LRINT):
 CASE_FLT_FN (BUILT_IN_LLRINT):
   builtin_optab = lrint_optab;
@@ -2595,7 +2595,7 @@ expand_builtin_int_roundingfn_2 (tree exp, rtx target)
 
 CASE_FLT_FN (BUILT_IN_IROUND):
   fallback_fn = BUILT_IN_LROUND;
-  /* FALLTHRU */
+  gcc_fallthrough ();
 CASE_FLT_FN (BUILT_IN_LROUND):
 CASE_FLT_FN (BUILT_IN_LLROUND):
   builtin_optab = lround_optab;
@@ -5902,6 +5902,7 @@ expand_builtin (tree exp, rtx target, rtx subtarget, 
machine_mode mode,
 CASE_FLT_FN (BUILT_IN_ILOGB):
   if (! flag_unsafe_math_optimizations)
break;
+  gcc_fallthrough ();
 CASE_FLT_FN (BUILT_IN_ISINF):
 CASE_FLT_FN (BUILT_IN_FINITE):
 case BUILT_IN_ISFINITE:
--- gcc/gcc/c-family/c-common.c
+++ gcc/gcc/c-family/c-common.c
@@ -11528,6 +11528,7 @@ resolve_overloaded_builtin (location_t loc, tree 
function,
gcc_unreachable ();
}
/* Fallthrough to the normal processing.  */
+   gcc_fallthrough ();
   }
 case BUILT_IN_ATOMIC_EXCHANGE_N:
 case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_N:
@@ -11536,6 +11537,7 @@ resolve_overloaded_builtin (location_t loc, tree 
function,
   {
fetch_op = false;
/* Fallthrough to further processing.  */
+   gcc_fallthrough ();
   }
 case BUILT_IN_ATOMIC_ADD_FETCH_N:
 case BUILT_IN_ATOMIC_SUB_FETCH_N:
@@ -11552,6 +11554,7 @@ resolve_overloaded_builtin (location_t loc, tree 
function,
   {
 orig_format = false;
/* Fallthru for parameter processing.  */
+   gcc_fallthrough ();
   }
 case BUILT_IN_SYNC_FETCH_AND_ADD_N:
 case BUILT_IN_SYNC_FETCH_AND_SUB_N:
--- gcc/gcc/c/c-decl.c
+++ gcc/gcc/c/c-decl.c
@@ 

Implement -Wimplicit-fallthrough (take 3)

2016-08-18 Thread Marek Polacek
Now that all various switch fallthrough bugfixes and adjustments were
committed, and this patch has shrunk considerably, I'm presenting the latest
version.  The changes from the last version are not huge; we don't warn for a
fall through to a user-defined label anymore, and I made some tiny changes
regarding parsing attributes in the C FE, as requested by Joseph.

This patch is accompanied by another patch that merely adds various
gcc_fallthroughs, in places where a FALLTHRU comment wouldn't work.

It's been tested on powerpc64le-unknown-linux-gnu, aarch64-linux-gnu,
and x86_64-redhat-linux.

2016-08-18  Marek Polacek  
Jakub Jelinek  

PR c/7652
gcc/
* common.opt (Wimplicit-fallthrough): New option.
* doc/extend.texi: Document statement attributes and the fallthrough
attribute.
* doc/invoke.texi: Document -Wimplicit-fallthrough.
* gimple.h (gimple_call_internal_p): New function.
* gimplify.c (struct gimplify_ctx): Add in_switch_expr.
(struct label_entry): New struct.
(find_label_entry): New function.
(case_label_p): New function.
(last_stmt_in_scope): New function.
(warn_implicit_fallthrough_r): New function.
(maybe_warn_implicit_fallthrough): New function.
(expand_FALLTHROUGH_r): New function.
(expand_FALLTHROUGH): New function.
(gimplify_switch_expr): Call maybe_warn_implicit_fallthrough and
expand_FALLTHROUGH for the innermost GIMPLE_SWITCH.
(gimplify_label_expr): New function.
(gimplify_case_label_expr): Set location.
(gimplify_expr): Call gimplify_label_expr.
* internal-fn.c (expand_FALLTHROUGH): New function.
* internal-fn.def (FALLTHROUGH): New internal function.
* system.h (gcc_fallthrough): Define.
* tree-core.h: Add FALLTHROUGH_LABEL_P comment.
* tree.h (FALLTHROUGH_LABEL_P): Define.
gcc/c-family/
* c-common.c (c_common_attribute_table): Add fallthrough attribute.
(handle_fallthrough_attribute): New function.
gcc/c/
* c-parser.c (struct c_token): Add flags field.
(c_lex_one_token): Pass it to c_lex_with_flags.
(c_parser_declaration_or_fndef): Turn __attribute__((fallthrough));
into IFN_FALLTHROUGH.
(c_parser_label): Set FALLTHROUGH_LABEL_P on labels.
(c_parser_statement_after_labels): Handle RID_ATTRIBUTE.
gcc/cp/
* constexpr.c (cxx_eval_internal_function): Handle IFN_FALLTHROUGH.
(potential_constant_expression_1): Likewise.
* parser.c (cp_parser_primary_expression): Handle RID_ATTRIBUTE.
(cp_parser_statement): Handle fallthrough attribute.
(cp_parser_label_for_labeled_statement): Set FALLTHROUGH_LABEL_P on
labels.
(cp_parser_std_attribute): Handle fallthrough attribute.
(cp_parser_check_std_attribute): Detect duplicated fallthrough
attribute.
gcc/testsuite/
* c-c++-common/Wimplicit-fallthrough-1.c: New test.
* c-c++-common/Wimplicit-fallthrough-10.c: New test.
* c-c++-common/Wimplicit-fallthrough-11.c: New test.
* c-c++-common/Wimplicit-fallthrough-12.c: New test.
* c-c++-common/Wimplicit-fallthrough-13.c: New test.
* c-c++-common/Wimplicit-fallthrough-14.c: New test.
* c-c++-common/Wimplicit-fallthrough-15.c: New test.
* c-c++-common/Wimplicit-fallthrough-16.c: New test.
* c-c++-common/Wimplicit-fallthrough-17.c: New test.
* c-c++-common/Wimplicit-fallthrough-18.c: New test.
* c-c++-common/Wimplicit-fallthrough-19.c: New test.
* c-c++-common/Wimplicit-fallthrough-2.c: New test.
* c-c++-common/Wimplicit-fallthrough-3.c: New test.
* c-c++-common/Wimplicit-fallthrough-4.c: New test.
* c-c++-common/Wimplicit-fallthrough-5.c: New test.
* c-c++-common/Wimplicit-fallthrough-6.c: New test.
* c-c++-common/Wimplicit-fallthrough-7.c: New test.
* c-c++-common/Wimplicit-fallthrough-8.c: New test.
* c-c++-common/Wimplicit-fallthrough-9.c: New test.
* c-c++-common/attr-fallthrough-1.c: New test.
* c-c++-common/attr-fallthrough-2.c: New test.
* g++.dg/cpp0x/fallthrough1.C: New test.
* g++.dg/cpp0x/fallthrough2.C: New test.
* g++.dg/cpp1z/fallthrough1.C: New test.
libcpp/
* include/cpplib.h (PREV_FALLTHROUGH): Define.
* internal.h (CPP_FALLTHRU): Define.
* lex.c (fallthrough_comment_p): New function.
(_cpp_lex_direct): Set PREV_FALLTHROUGH on tokens succeeding a falls
through comment.

diff --git gcc/gcc/c-family/c-common.c gcc/gcc/c-family/c-common.c
index 51b6ca9..86fdb3a 100644
--- gcc/gcc/c-family/c-common.c
+++ gcc/gcc/c-family/c-common.c
@@ -395,6 +395,7 @@ static tree handle_designated_init_attribute (tree *, tree, 
tree, int, bool *);
 static tree handle_bnd_variable_size_attribute (tree *, 

Re: [PATCH] Implement -fdiagnostics-parseable-fixits

2016-08-18 Thread Eelis

On 2016-06-22 05:25, David Malcolm wrote:

Clang has an option -fdiagnostics-parseable-fixits, which emits a
machine-readable version of the fixits, for use by IDEs.  (The only
IDE I know of that supports this format is Xcode [1]; does anyone
know of other IDEs supporting this?  I'd be very happy if someone
implemented Emacs support for it, or could point me at it).

This patch implements the option for gcc.  It seems prudent to be
compatible with Clang here, so I've deliberately used the same option
name and attempted to emulate the output format,


Thanks a lot for this! I now use this in geordi (http://eelis.net/geordi). 
Geordi already supported Clang's fix-its, and thanks to your taking care to use 
the same format, enabling it for GCC was a matter of adding the flag.

Cheers!


Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope

2016-08-18 Thread Jakub Jelinek
On Thu, May 12, 2016 at 04:12:21PM +0200, Martin Liška wrote:
> --- a/gcc/asan.c
> +++ b/gcc/asan.c
> @@ -243,6 +243,11 @@ static unsigned HOST_WIDE_INT asan_shadow_offset_value;
>  static bool asan_shadow_offset_computed;
>  static vec sanitized_sections;
>  
> +/* Set of variable declarations that are going to be guarded by
> +   use-after-scope sanitizer.  */
> +
> +static hash_set asan_handled_variables (13);

Doesn't this introduce yet another global ctor?  If yes (and we
unfortunately have already way too many), I'd strongly prefer to avoid that,
use pointer to hash_set or something similar.

> +/* Depending on POISON flag, emit a call to poison (or unpoison) stack memory
> +   allocated for local variables, localted in OFFSETS.  LENGTH is number
> +   of OFFSETS, BASE is the register holding the stack base,
> +   against which OFFSETS array offsets are relative to.  BASE_OFFSET 
> represents
> +   an offset requested by alignment and similar stuff.  */
> +
> +static void
> +asan_poison_stack_variables (rtx shadow_base, rtx base,
> +  HOST_WIDE_INT base_offset,
> +  HOST_WIDE_INT *offsets, int length,
> +  tree *decls, bool poison)
> +{
> +  if (asan_sanitize_use_after_scope ())
> +for (int l = length - 2; l > 0; l -= 2)
> +  {

I think this is unfortunate, it leads to:
movl$-235802127, 2147450880(%rax)
movl$-185335552, 2147450884(%rax)
movl$-202116109, 2147450888(%rax)
movb$-8, 2147450884(%rax)
movb$-8, 2147450885(%rax)
(e.g. on use-after-scope-1.c).
The asan_emit_stack_protection function already walks all the
entries in the offsets array in both of the
  for (l = length; l; l -= 2)
loops, so please handle the initial poisoning and final unpoisoning there
as well.  The goal is that for variables that you want poison-after-scope
at the start of the function (btw, I've noticed that current SVN LLVM
doesn't bother with it and thus doesn't track "use before scope" (before the
scope is entered for the first time, maybe we shouldn't either, that would
catch only compiler bugs rather than user code bugs, right?)) have 0xf8
on all corresponding bytes including the one that would otherwise have 0x01
through 0x07.  When unpoisoning at the end of the function, again you should
combine that with unpoisoning of the red zone and partial zone bytes plus
the last 0x01 through 0x07, etc.

Plus, as I've mentioned before, it would be nice to optimize - for ASAN_MARK
unpoison appearing strictly before (i.e. dominating) the first (non-shadow) 
memory read
or write in the function (explicit or possible through function calls etc.)
you really don't need to unpoison (depending on whether we follow LLVM as
mentioned above then it can be removed without anything, or the decl needs
to be somehow marked and tell asan_emit_stack_protection it shouldn't poison
it at the start), and for ASAN_MARK poisoning appearing after the last
load/store in the function (post dominating those, you don't care about
noreturn though) you can combine that (remove the ASAN_MARK) with letting
asan_emit_stack_protection know it doesn't need to unpoison.

> + char c = poison ? ASAN_STACK_MAGIC_USE_AFTER_SCOPE : 0;
> + for (unsigned i = 0; i < shadow_size; ++i)
> +   {
> + emit_move_insn (var_mem, gen_int_mode (c, QImode));
> + var_mem = adjust_address (var_mem, QImode, 1);

When you combine it with the loop, you can also use the infrastructure to
handle it 4 bytes at a time.

Another thing I've noticed is that the inline expansion of
__asan_unpoison_stack_memory you emit looks buggy.
In use-after-scope-1.c I see:
  _9 = (unsigned long) _char;
  _10 = _9 >> 3;
  _11 = _10 + 2147450880;
  _12 = (signed char *) _11;
  MEM[(short int *)_12] = 0;

That would be fine only for 16 byte long my_char, but we have instead 9 byte
one.  So I believe in that case we need to store
0x00, 0x01 bytes, for little endian thus 0x0100.  You could use for it
a function similarly to asan_shadow_cst, just build INTEGER_CST rather than
CONST_INT out of it.  In general, poisioning is storing 0xf8 to all affected
shadow bytes, unpoisioning should restore the state what we would emit
without use-after-scope sanitization, which is all but the last byte 0, and
the last byte 0 only if the var size is a multiple of 8, otherwise number
of valid bytes (1-7).

As for the option, it seems clang uses now
-fsanitize-address-use-after-scope option, while I don't like that much, if
they have already released some version with that option or if they are
unwilling to change, I'd go with their option.

> + if (flag_stack_reuse != SR_NONE
> + && flag_openacc
> + && oacc_declare_returns != NULL)

This actually looks like preexisting OpenACC bug, I doubt the OpenACC
behavior should depend on -fstack-reuse= setting.

+  bool unpoison_var = asan_poisoned_variables.contains 

[committed] Evict selftest tempfiles from the diagnostics file cache

2016-08-18 Thread David Malcolm
Selftests can use class selftest::temp_source_file to write out files
for testing input-handling, and the files are unlinked in the dtor.

This leads to stale entries in input.c's cache of file content, which
could lead to errors if a temporary filename gets reused during a run
of the selftests.

We don't normally expect files to be "deleted from under us", so
special-case this by adding a special way for temp_source_file's
dtor to purge any cache entries referring to it.

Successfully bootstrapped on x86_64-pc-linux-gnu.

Committed to trunk as r239570.

gcc/ChangeLog:
* input.c (diagnostics_file_cache_forcibly_evict_file): New
function.
* input.h (diagnostics_file_cache_forcibly_evict_file): New
declaration.
* selftest.c (selftest::temp_source_file::~temp_source_file):
Evict m_filename from the diagnostic file cache.
---
 gcc/input.c| 26 ++
 gcc/input.h|  2 ++
 gcc/selftest.c |  1 +
 3 files changed, 29 insertions(+)

diff --git a/gcc/input.c b/gcc/input.c
index 172d13a..76a3307 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -249,6 +249,32 @@ lookup_file_in_cache_tab (const char *file_path)
   return r;
 }
 
+/* Purge any mention of FILENAME from the cache of files used for
+   printing source code.  For use in selftests when working
+   with tempfiles.  */
+
+void
+diagnostics_file_cache_forcibly_evict_file (const char *file_path)
+{
+  gcc_assert (file_path);
+
+  fcache *r = lookup_file_in_cache_tab (file_path);
+  if (!r)
+/* Not found.  */
+return;
+
+  r->file_path = NULL;
+  if (r->fp)
+fclose (r->fp);
+  r->fp = NULL;
+  r->nb_read = 0;
+  r->line_start_idx = 0;
+  r->line_num = 0;
+  r->line_record.truncate (0);
+  r->use_count = 0;
+  r->total_lines = 0;
+}
+
 /* Return the file cache that has been less used, recently, or the
first empty one.  If HIGHEST_USE_COUNT is non-null,
*HIGHEST_USE_COUNT is set to the highest use count of the entries
diff --git a/gcc/input.h b/gcc/input.h
index c17e440..0f187c7 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -95,6 +95,8 @@ void dump_location_info (FILE *stream);
 
 void diagnostics_file_cache_fini (void);
 
+void diagnostics_file_cache_forcibly_evict_file (const char *file_path);
+
 struct GTY(()) string_concat
 {
   string_concat (int num, location_t *locs);
diff --git a/gcc/selftest.c b/gcc/selftest.c
index 0a7192e..d25f5c0 100644
--- a/gcc/selftest.c
+++ b/gcc/selftest.c
@@ -111,6 +111,7 @@ selftest::temp_source_file::temp_source_file (const 
location ,
 selftest::temp_source_file::~temp_source_file ()
 {
   unlink (m_filename);
+  diagnostics_file_cache_forcibly_evict_file (m_filename);
   free (m_filename);
 }
 
-- 
1.8.5.3



Re: [Patch] Implement std::experimental::variant

2016-08-18 Thread Jonathan Wakely

On 05/08/16 22:45 -0700, Tim Shen wrote:

On Fri, Aug 5, 2016 at 4:08 AM, Jonathan Wakely  wrote:

I think these all need to use ::new (__ptr) with qualification, see
below.


Done. I didn't add a testcase for this. Do you think that we need one/some?


No, I don't think we need a test. I doubt anybody will try to remove
the qualification, and if they do somebody who understands the need
for it can explain it before the patch gets committed :-)


+  // Stores a void alternative, until void becomes a regular type.



Personally I hope it won't become a regular type :-)


Sorry :)

Do you mind to share your insights?


My impression of the proposal was that making it regular solves a few
problems, but introduces some other incompatibilities. If we were
designing a new language it might be a reasonable approach, but I
wouldn't want to introduce such a different object model based on what
we have today.




(The decval() case is OK as that won't do ADL).


Done for all of them, for consistency.




+  template
+struct __gen_vtable_impl<_Array_type, tuple<_First, _Rest...>,
+tuple<_Args...>>



Do you actually need to use std::tuple here, or would something much
more lightweight be OK too?


I actually just need a template to hold all alternatives. How about
forward declaring tuple, and not including the header?

I was totally unaware of the cost of , and tried to use
tuple_cat() on a bunch of function calls:
(void)tuple_cat(foo<_Types>()...) where foo returns tuple<>.

But not we have fold expression \o/! So I can directly write:
(foo<_Types>, ...)

which is perfect.


Nice.



For example we have std::tr2::__reflection_typelist in
. Or it looks like this could just use something even
simpler:
 template struct __type_list

What we really need is to standardize Eric Niebler's metapgroamming
library, or Peter Dimov's one, or *anything* that gives us a set of
nice tools for doing this stuff. std::tuple is a very heavyweight type
to use for simple type lists.


I don't need the whole pack of metaprogramming tools here, lucky me. ;)



It doesn't look like you're using any members of std::tuple, so maybe
it won't actually instantiate any of the base classes or member
functions, which is what would be inefficient.


+  template
+_Tp& get(variant<_Types...>& __v)



Please add 'inline' to these one-line functions. All the get and
get_if overloads are tiny.


What's the difference between non-inline function templates and inline
function templates? At some point you may already explained that to
me, but I'm still confused.


In practical terms, there's no difference. Function templates and
inline functions both get compiled to weak symbols, which are allowed
to have multiple definitions. Duplicate definitions will be discarded
by the linker.

That's an implementation detail related to how they are handled by the
compiler.  In formal C++ language terms they are still different
things. A function template is not an inline function unless it is
declared 'inline'. My preference is to add 'inline' if it's a small
function that would typically be marked inline if it wasn't a
template.

If the compiler ever uses the presence of the 'inline' keyword as a
hint for inlining (rather than just as the trigger for generating a
weak symbol) then putting it there would make a difference.


Good news! This compiles now! I learned the technique from Anthony
Williams's implementation, whose code also compiles, but it requires a
close-to-trunk gcc, which implements
"...for unions, at least one non-static data member is of non-volatile
literal type, ...".

Also added it as a test.


Great.


Please verify the implementation by looking at _Uninitialized and
_Variant_storage.




However, I think we could commit this for now as it's 99% complete.
What do you think?

I am concerned that if we commit this implementation now, and *don't*
get a 100% conforming rewrite before we want to declare C++17 support
stable and non-experimental, then we'd have to introduce an
incompatible change. That could be done by replacing std::variant with
std::_V2::variant, so it would still be possible.

If you want to propose this for trunk please make the fixes noted
above and send a new patch, CCing gcc-patches.

Thanks!




I also moved the tests from experiemntal/variant to 20_util/variant.

Bootstrapped and tested on x86_64-linux-gnu.


OK for trunk, thanks!




Re: [PATCH] DWARF: do not emit DW_TAG_variable to materialize DWARF procedures

2016-08-18 Thread Richard Biener
On Thu, Aug 18, 2016 at 11:33 AM, Pierre-Marie de Rodat
 wrote:
> Hello,
>
> For -gdwarf-3 and newer, the DWARF back-end sometimes generates DWARF
> procedures to factorize complex location descriptions.  DWARF procedures
> can be materialized as DW_TAG_dwarf_procedure DIEs, but actually any DIE
> that can hold a DW_AT_location attribute is also accepted.
>
> Unlike what I thought at some point, the DW_TAG_dwarf_procedure tag was
> introduced in the DWARFv3 standard, not the DWARFv4 one, so the back-end
> can always emit DW_TAG_dwarf_procedure DIEs, as this simplifies code and
> prevents the types pruning pass from missing a DWARF procedure.
>
> Boostrapped and regtested on x86_64-linux: no regression.  Ok to commit?
> Thank you in advance!

Ok.

Richard.

> gcc/
>
> * dwarf2out.c (copy_dwarf_procedure): Remove obsolete comment.
> (new_dwarf_proc_die): Emit DW_TAG_dwarf_procedure DIEs even for
> -gdwarf-3.
> (function_to_dwarf_procedure): Update comment.
> ---
>  gcc/dwarf2out.c | 16 ++--
>  1 file changed, 2 insertions(+), 14 deletions(-)
>
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index 0fdab9a..98f08b7 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -7747,9 +7747,6 @@ copy_dwarf_procedure (dw_die_ref die,
>   comdat_type_node *type_node,
>   hash_map _dwarf_procs)
>  {
> -  /* We do this for COMDAT section, which is DWARFv4 specific, so
> - DWARF procedure are always DW_TAG_dwarf_procedure DIEs (unlike
> - DW_TAG_variable in DWARFv3).  */
>gcc_assert (die->die_tag == DW_TAG_dwarf_procedure);
>
>/* DWARF procedures are not supposed to have children...  */
> @@ -15332,22 +15329,15 @@ static dw_die_ref
>  new_dwarf_proc_die (dw_loc_descr_ref location, tree fndecl,
> dw_die_ref parent_die)
>  {
> -  const bool dwarf_proc_supported = dwarf_version >= 4;
>dw_die_ref dwarf_proc_die;
>
>if ((dwarf_version < 3 && dwarf_strict)
>|| location == NULL)
>  return NULL;
>
> -  dwarf_proc_die  = new_die (dwarf_proc_supported
> -? DW_TAG_dwarf_procedure
> -: DW_TAG_variable,
> -parent_die,
> -fndecl);
> +  dwarf_proc_die = new_die (DW_TAG_dwarf_procedure, parent_die, fndecl);
>if (fndecl)
>  equate_decl_number_to_die (fndecl, dwarf_proc_die);
> -  if (!dwarf_proc_supported)
> -add_AT_flag (dwarf_proc_die, DW_AT_artificial, 1);
>add_AT_loc (dwarf_proc_die, DW_AT_location, location);
>return dwarf_proc_die;
>  }
> @@ -15673,9 +15663,7 @@ function_to_dwarf_procedure (tree fndecl)
>if (dwarf_proc_die != NULL)
>  return dwarf_proc_die;
>
> -  /* DWARF procedures are available starting with the DWARFv3 standard, but
> - it's the DWARFv4 standard that introduces the DW_TAG_dwarf_procedure
> - DIE.  */
> +  /* DWARF procedures are available starting with the DWARFv3 standard.  */
>if (dwarf_version < 3 && dwarf_strict)
>  return NULL;
>
> --
> 2.9.3
>


Re: [v3 PATCH] Implement the latest proposed resolution of LWG 2756

2016-08-18 Thread Jonathan Wakely

On 09/08/16 02:22 +0300, Ville Voutilainen wrote:

   Implement the latest proposed resolution of LWG 2756.
   * include/std/optional (Optional_base(const _Tp&))
   (Optional_base(_Tp&&), using _Base::_Base): Remove.
   (optional(nullopt_t)): New.
   (optional(_Up&&)): Invoke base directly with in_place
   rather than creating a temporary, add default template
   argument, change constraints.
   (optional(const optional<_Up>&)): Invoke base directly
   with in_place, remove unnecessary constraints.
   (optional(optional<_Up>&& __t)): Likewise.
   (optional(in_place_t, _Args&&...)): New.
   (optional(in_place_t, initializer_list<_Up>, _Args&&...)): Likewise.
   (operator=(_Up&&)): Add default template argument, change constraints.
   (operator=(const optional<_Up>&)): Put is_same first in the
   constraints.
   (operator=(optional<_Up>&&)): Likewise.
   * testsuite/20_util/optional/assignment/5.cc: Add a test to
   verify assignment from something that can't be perfect-forwarded.
   * testsuite/20_util/optional/cons/value.cc: Add tests to verify
   that a nested optional is disengaged when constructed
   from a disengaged element type,and to verify that assignments
   from an engaged element type engage the optional.


OK for trunk.




Re: [PATCH] DWARF: space-optimize loc. descr. for integer literals on 32-bit targets

2016-08-18 Thread Trevor Saunders
On Thu, Aug 18, 2016 at 11:33:06AM +0200, Pierre-Marie de Rodat wrote:
> Hello,
> 
> This enhances location description generation so that the generated
> opcodes for integer literals are as space-efficient when HOST_WIDE_INT
> is 64-bits large than when it's 32-bits large. In particular, this
> reduces the size of the opcodes generated to produce big unsigned
> literals using small literal integers instead.
> 
> Bootstrapped and regtested on x86-linux (no regression).  I also checked
> that the new testcase fails with mainline.  Ok to commit?  Thank you in
> advance!
> 
> gcc/
> 
>   * dwarf2out.c (int_loc_descriptor): Generate opcodes for another
>   equivalent 32-bit constant (modulo 2**32) when that yields
>   smaller instructions.
>   (size_of_int_loc_descriptor): Update accordingly.
> ---
>  gcc/dwarf2out.c  | 31 ++-
>  gcc/testsuite/gnat.dg/debug8.adb | 29 +
>  2 files changed, 55 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gnat.dg/debug8.adb
> 
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index 0fdab9a..f175ea1 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -11954,20 +11954,36 @@ int_loc_descriptor (HOST_WIDE_INT i)
>   /* DW_OP_const1u X DW_OP_litY DW_OP_shl takes just 4 bytes,
>  while DW_OP_const4u is 5 bytes.  */
>   return int_shift_loc_descriptor (i, HOST_BITS_PER_WIDE_INT - clz - 8);
> +
> +  else if (HOST_BITS_PER_WIDE_INT > 32

HOST_BITS_PER_WIDE_INT is now always 64, so you don't need that check.

Trev


Re: [PATCH] Set -fprofile-update=atomic when -pthread is present

2016-08-18 Thread Nathan Sidwell

On 08/17/16 23:15, Jeff Law wrote:

On 08/12/2016 07:31 AM, Martin Liška wrote:

On 08/09/2016 09:03 PM, Andi Kleen wrote:

It could potentially make things a lot slower. I don't think it's a good
idea to do this by default.

-Andi


Ok, alternative can be a warning in the driver that would inform a user
that combining -pthread and -fprofile-update=single can lead to profile
corruption.
My first attempt with option handling gcc.c was not successful, I'll try it 
once.

Is it reasonable approach?

I'd prefer to make updates atomic in multi-threaded applications.  The best
proxy we have for that is -pthread.

Is it slower, most definitely, but odds are we're giving folks garbage data
otherwise, which in many ways is even worse.


True, and if someone cares they can always override the new default behaviour.

nathan


C PATCH to disallow invalid arguments to __atomic_* (PR c/71514)

2016-08-18 Thread Marek Polacek
As discussed in the PR, for __atomic_* functions we shouldn't accept
pointer-to-function and pointer-to-VLA as arguments.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-08-18  Marek Polacek  

PR c/71514
* c-common.c (get_atomic_generic_size): Disallow pointer-to-function
and pointer-to-VLA.

* gcc.dg/pr71514.c: New test.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index d413146..22e3844 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -11081,6 +11081,20 @@ get_atomic_generic_size (location_t loc, tree function,
function);
  return 0;
}
+  else if (TYPE_SIZE_UNIT (TREE_TYPE (type))
+  && TREE_CODE ((TYPE_SIZE_UNIT (TREE_TYPE (type
+ != INTEGER_CST)
+   {
+ error_at (loc, "argument %d of %qE must be a pointer to a constant "
+   "size type", x + 1, function);
+ return 0;
+   }
+  else if (FUNCTION_POINTER_TYPE_P (type))
+   {
+ error_at (loc, "argument %d of %qE must not be a pointer to a "
+   "function", x + 1, function);
+ return 0;
+   }
   tree type_size = TYPE_SIZE_UNIT (TREE_TYPE (type));
   size = type_size ? tree_to_uhwi (type_size) : 0;
   if (size != size_0)
diff --git gcc/testsuite/gcc.dg/pr71514.c gcc/testsuite/gcc.dg/pr71514.c
index e69de29..8bfa805 100644
--- gcc/testsuite/gcc.dg/pr71514.c
+++ gcc/testsuite/gcc.dg/pr71514.c
@@ -0,0 +1,23 @@
+/* PR c/71514 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+void
+foo ()
+{
+}
+
+int a, b;
+
+void
+fn1 (void)
+{
+  __atomic_exchange (, , , __ATOMIC_RELAXED); /* { dg-error "must not 
be a pointer to a function" } */
+}
+
+void
+fn2 (int n)
+{
+  int arr[n];
+  __atomic_exchange (, , , __ATOMIC_RELAXED); /* { dg-error "must be a 
pointer to a constant size type" } */
+}

Marek


Re: Fwd: [PATCH, doc/ARM] Remove false affirmation that Thumb cannot use an FPU

2016-08-18 Thread Thomas Preudhomme

Hi Sandra,

On 18/08/16 07:00, Sandra Loosemore wrote:

On 08/11/2016 04:31 AM, Thomas Preudhomme wrote:


diff --git a/gcc/doc/fragments.texi b/gcc/doc/fragments.texi
index
b6d8541c8ca820fa732363a05221e2cd4d1251c2..abf4e128671bb4751d21f24bb69625593d3c839e
100644
--- a/gcc/doc/fragments.texi
+++ b/gcc/doc/fragments.texi
@@ -117,12 +117,15 @@ specified, there are combinations that should not be
built.  In that
 case, set @code{MULTILIB_EXCEPTIONS} to be all of the switch exceptions
 in shell case syntax that should not be built.

-For example the ARM processor cannot execute both hardware floating
-point instructions and the reduced size THUMB instructions at the same
-time, so there is no need to build libraries with both of these
-options enabled.  Therefore @code{MULTILIB_EXCEPTIONS} is set to:
+For example on ARM targets @option{-mfloat-abi=soft} requests to use a
+softfloat implementation for floating-point operations.  Therefore, it
+does not make sense to find both @option{-mfloat-abi=soft} and an
+@option{mfpu} option on the command line so @code{MULTILIB_EXCEPTIONS}
+could contain the following exception (assuming that
+@option{-mfloat-abi} comes after in @code{MULTILIB_OPTIONS} and given
+that @option{-mfloat-abi=soft} is the default value):
 @smallexample
-*mthumb/*mhard-float*
+*mfpu=*
 @end smallexample

 @findex MULTILIB_REQUIRED


This version still has a lot of copy-editing issues.  I suggest rewriting as:

  For example, on ARM targets @option{-mfloat-abi=soft} requests use of
  software floating-point operations, so it
  does not make sense to build libraries with both
  @option{-mfloat-abi=soft} and an @option{-mfpu} option.
  @code{MULTILIB_EXCEPTIONS} could contain the following exception


Thanks for the help. That sounds better indeed. Future readers thank you ;-)



but here I get stuck in suggesting a rewrite, because I can't parse this part at
all to figure out what you're trying to say:

  (assuming that
  @option{-mfloat-abi} comes after in @code{MULTILIB_OPTIONS} and given
  that @option{-mfloat-abi=soft} is the default value):

"comes after in"?  Comes after what?  If the order is important here, the
documentation should explain why instead of just "assuming" things about it.


Actually I made a mistake in that example. What I meant is that a wildchar also 
match slashes so one must consider the order of options in MULTILIB_OPTIONS when 
understanding what option sets will a MULTILIB_EXCEPTION rule match.


Thinking about this, MULTILIB_EXCEPTIONS and MULTILIB_REQUIRED don't specify the 
syntax of the directives (option separated by slashes) and the important of the 
order of operations in MULTILIB_OPTIONS. The documentation for MULTILIB_REUSE is 
a better example for that.


I think that Richard makes a good case for going away altogher from ARM as an 
example to both keep things simple and be more future proof. I'll rewrite that 
patch to do this and specify the syntax of MULTILIB_EXCEPTIONS and 
MULTILIB_REQUIRED. Do you prefer these two aspects to be in 2 different patches?




-Sandra the confused



My mistake really.

Best regards,

Thomas


Re: Fwd: [PATCH, doc/ARM] Remove false affirmation that Thumb cannot use an FPU

2016-08-18 Thread Richard Earnshaw (lists)
On 18/08/16 07:00, Sandra Loosemore wrote:
> On 08/11/2016 04:31 AM, Thomas Preudhomme wrote:
> 
>> diff --git a/gcc/doc/fragments.texi b/gcc/doc/fragments.texi
>> index
>> b6d8541c8ca820fa732363a05221e2cd4d1251c2..abf4e128671bb4751d21f24bb69625593d3c839e
>> 100644
>> --- a/gcc/doc/fragments.texi
>> +++ b/gcc/doc/fragments.texi
>> @@ -117,12 +117,15 @@ specified, there are combinations that should
>> not be built.  In that
>>  case, set @code{MULTILIB_EXCEPTIONS} to be all of the switch exceptions
>>  in shell case syntax that should not be built.
>>
>> -For example the ARM processor cannot execute both hardware floating
>> -point instructions and the reduced size THUMB instructions at the same
>> -time, so there is no need to build libraries with both of these
>> -options enabled.  Therefore @code{MULTILIB_EXCEPTIONS} is set to:
>> +For example on ARM targets @option{-mfloat-abi=soft} requests to use a
>> +softfloat implementation for floating-point operations.  Therefore, it
>> +does not make sense to find both @option{-mfloat-abi=soft} and an
>> +@option{mfpu} option on the command line so @code{MULTILIB_EXCEPTIONS}
>> +could contain the following exception (assuming that
>> +@option{-mfloat-abi} comes after in @code{MULTILIB_OPTIONS} and given
>> +that @option{-mfloat-abi=soft} is the default value):
>>  @smallexample
>> -*mthumb/*mhard-float*
>> +*mfpu=*
>>  @end smallexample
>>
>>  @findex MULTILIB_REQUIRED
> 
> This version still has a lot of copy-editing issues.  I suggest
> rewriting as:
> 
>   For example, on ARM targets @option{-mfloat-abi=soft} requests use of
>   software floating-point operations, so it
>   does not make sense to build libraries with both
>   @option{-mfloat-abi=soft} and an @option{-mfpu} option.
>   @code{MULTILIB_EXCEPTIONS} could contain the following exception
> 
> but here I get stuck in suggesting a rewrite, because I can't parse this
> part at all to figure out what you're trying to say:
> 
>   (assuming that
>   @option{-mfloat-abi} comes after in @code{MULTILIB_OPTIONS} and given
>   that @option{-mfloat-abi=soft} is the default value):
> 
> "comes after in"?  Comes after what?  If the order is important here,
> the documentation should explain why instead of just "assuming" things
> about it.
> 
> -Sandra the confused
> 

I think it's probably best to just drop the entire parenthetical
subcluase.  This is documentation of how to use MULTILIB_EXCEPTIONS not
precise documentation on what needs to be done on ARM.

In fact, it might be better to just rewrite the whole section based on a
theoretical machine that has two ISAs, one which can support
floating-point and one which can't.  You then get back to essentially
the same text as we had originally but anonymised and future proofed.

R.



[PATCH] Fix PR77282

2016-08-18 Thread Richard Biener

The following fixes PR77282, tested on x86_64-unknown-linux-gnu, applied
as obvious.

Richard.

2016-08-18  Richard Biener  

PR tree-optimization/77282
* tree-ssa-pre.c (eliminate_dom_walker::before_dom_children):
When doing auto-parallelizing also prevent use of PHIs that
carry dependences across loop backedges.

Index: gcc/tree-ssa-pre.c
===
--- gcc/tree-ssa-pre.c  (revision 239564)
+++ gcc/tree-ssa-pre.c  (working copy)
@@ -4270,7 +4270,7 @@ eliminate_dom_walker::before_dom_childre
  if (sprime
  && TREE_CODE (sprime) == SSA_NAME
  && do_pre
- && flag_tree_loop_vectorize
+ && (flag_tree_loop_vectorize || flag_tree_parallelize_loops)
  && loop_outer (b->loop_father)
  && has_zero_uses (sprime)
  && bitmap_bit_p (inserted_exprs, SSA_NAME_VERSION (sprime))


[PATCH] Remove unintended dg-options directive

2016-08-18 Thread Jonathan Wakely

I don't think I meant for this test to use -std=gnu++0x when I added
it, that probably came from copying 20_util/function/cons/addressof.cc

It certainly works OK with C++98, which is how tr1::function is most
likely to be used.

* testsuite/tr1/3_function_objects/function/10.cc: Remove unintended
dg-options directive.

Tested x86_64-linux, committed to trunk.


commit 24ddeed88b16e02ed33612c11727783aadef3c6a
Author: Jonathan Wakely 
Date:   Thu Aug 18 10:45:56 2016 +0100

Remove unintended dg-options directive

* testsuite/tr1/3_function_objects/function/10.cc: Remove unintended
dg-options directive.

diff --git a/libstdc++-v3/testsuite/tr1/3_function_objects/function/10.cc 
b/libstdc++-v3/testsuite/tr1/3_function_objects/function/10.cc
index b09ca86..2c1edaf 100644
--- a/libstdc++-v3/testsuite/tr1/3_function_objects/function/10.cc
+++ b/libstdc++-v3/testsuite/tr1/3_function_objects/function/10.cc
@@ -15,7 +15,6 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-options "-std=gnu++11" }
 // { dg-do compile }
 
 #include 


Re: [PATCH 3/4][Ada,DJGPP] Ada support for DJGPP

2016-08-18 Thread Arnaud Charlet
> 2016-07-30 Andris Pavenis 
> 
> * ada/gcc-interface/Makefile.in (LIBGNAT_TARGET_PAIRS): Define for
> DJGPP target
>   (EH_MECHANISM): Define to -gcc for DJGPP
> * ada/system-djgpp.ads: New file
> 
> Andris

> +++ b/gcc/ada/system-djgpp.ads
> @@ -0,0 +1,148 @@
> +--
> 
> +--  
> --
> +--GNAT RUN-TIME COMPONENTS  
> --
> +--  
> --
> +--   S Y S T E M
> --
> +--  
> --
> +-- S p e c  
> --
> +--(DJGPP Version) --

Wrong formatting here.

> +--  
> --
> +--  Copyright (C) 1992-2015, Free Software Foundation, Inc. 
> --

Wrong copyright here.


Re: [PATCH 2/4][Ada,DJGPP] Ada support for DJGPP

2016-08-18 Thread Arnaud Charlet
> This patch (2nd of 4) includes various changes to Ada related C files
> required for DJGPP support
> 
> ChangeLog entry:
> 
> 2016-07-30 Andris Pavenis 
> 
> * ada/ctrl_c.c: Do not use macro SA_RESTART for DJGPP.
> 
> * ada/gsocket.h: Do not support sockets for DJGPP.
> 
> * ada/init.c (timestruct_t): Define for DJGPP.
>   (nanosleep): Implement for DJGPP using usleep().

Why do you need to define nanosleep() here? I suspect this is no longer
needed. If it is, would be good to know why.

Arno


Re: [PATCH] Generalize overriding mechanism for debug info output options defaults

2016-08-18 Thread Jakub Jelinek
On Thu, Aug 18, 2016 at 11:32:45AM +0200, Pierre-Marie de Rodat wrote:
> Currently, the VxWorks target overrides the defaults for debug info
> output options (DWARF version, strictness) in a target-specific options
> hook.  This patch creates macros so that these defaults can be overriden

I believe that is the preferred way, rather than macros.
The macros are yet another thing that would need to be undone if we ever
start supporting multiple targets in the same binary.
I don't see any advantages of introducing the macros.

> gcc/
>   * defaults.h (DWARF_STRICT_DEFAULT, DWARF_VERSION_DEFAULT): New
>   macros.
>   * common.opt (-gdwarf-, -gno-strict-dwarf): Update to use macros
>   for default values.

Jakub


[PATCH] DWARF: remove pessimistic DWARF version checks for imported entities

2016-08-18 Thread Pierre-Marie de Rodat
Hello,

A check in dwarf2out_imported_module_or_decl prevents valid strict
DWARF2 constructs such as DW_TAG_imported_declaration from being emitted
in dwarf2out_imported_module_or_decl_1.

The latter already protects the emission of newer DWARF tags with
appropriate checks, so the one in the former is redundant and
pessimistic.  This function is already called from places like
process_scope_var, which are not protected anyway.

This patch removes the check in dwarf2out_imported_module_or_decl so
that tags like DW_TAG_imported_declaration are emitted even in strict
DWARF2 mode.

Bootstrapped and regtested on x86_64-linux, no regression.  I also
checked that the new testcase fails on mainline.  Ok to commit?
Thank you in advance!

gcc/

* dwarf2out.c (dwarf2out_imported_module_or_decl): Remove
pessimistic DWARF version check.

gcc/testsuite/

* gnat.dg/debug7.adb, gnat.dg/debug7.ads: New testcase.
---
 gcc/dwarf2out.c  |  3 ---
 gcc/testsuite/gnat.dg/debug7.adb | 10 ++
 gcc/testsuite/gnat.dg/debug7.ads |  4 
 3 files changed, 14 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gnat.dg/debug7.adb
 create mode 100644 gcc/testsuite/gnat.dg/debug7.ads

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 0fdab9a..7bc0378 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -24034,9 +24034,6 @@ dwarf2out_imported_module_or_decl (tree decl, tree 
name, tree context,
   && !should_emit_struct_debug (context, DINFO_USAGE_DIR_USE))
 return;
 
-  if (!(dwarf_version >= 3 || !dwarf_strict))
-return;
-
   scope_die = get_context_die (context);
 
   if (child)
diff --git a/gcc/testsuite/gnat.dg/debug7.adb b/gcc/testsuite/gnat.dg/debug7.adb
new file mode 100644
index 000..98230ba
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/debug7.adb
@@ -0,0 +1,10 @@
+-- { dg-do compile }
+-- { dg-options "-cargs -g -gdwarf-2 -gstrict-dwarf -dA" }
+-- { dg-final { scan-assembler "DW_TAG_imported_decl" } }
+
+package body Debug7 is
+   function Next (I : Integer) return Integer is
+   begin
+  return I + 1;
+   end Next;
+end Debug7;
diff --git a/gcc/testsuite/gnat.dg/debug7.ads b/gcc/testsuite/gnat.dg/debug7.ads
new file mode 100644
index 000..047d4a6
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/debug7.ads
@@ -0,0 +1,4 @@
+package Debug7 is
+   function Next (I : Integer) return Integer;
+   function Renamed_Next (I : Integer) return Integer renames Next;
+end Debug7;
-- 
2.9.3



[PATCH] DWARF: do not emit DW_TAG_variable to materialize DWARF procedures

2016-08-18 Thread Pierre-Marie de Rodat
Hello,

For -gdwarf-3 and newer, the DWARF back-end sometimes generates DWARF
procedures to factorize complex location descriptions.  DWARF procedures
can be materialized as DW_TAG_dwarf_procedure DIEs, but actually any DIE
that can hold a DW_AT_location attribute is also accepted.

Unlike what I thought at some point, the DW_TAG_dwarf_procedure tag was
introduced in the DWARFv3 standard, not the DWARFv4 one, so the back-end
can always emit DW_TAG_dwarf_procedure DIEs, as this simplifies code and
prevents the types pruning pass from missing a DWARF procedure.

Boostrapped and regtested on x86_64-linux: no regression.  Ok to commit?
Thank you in advance!

gcc/

* dwarf2out.c (copy_dwarf_procedure): Remove obsolete comment.
(new_dwarf_proc_die): Emit DW_TAG_dwarf_procedure DIEs even for
-gdwarf-3.
(function_to_dwarf_procedure): Update comment.
---
 gcc/dwarf2out.c | 16 ++--
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 0fdab9a..98f08b7 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -7747,9 +7747,6 @@ copy_dwarf_procedure (dw_die_ref die,
  comdat_type_node *type_node,
  hash_map _dwarf_procs)
 {
-  /* We do this for COMDAT section, which is DWARFv4 specific, so
- DWARF procedure are always DW_TAG_dwarf_procedure DIEs (unlike
- DW_TAG_variable in DWARFv3).  */
   gcc_assert (die->die_tag == DW_TAG_dwarf_procedure);
 
   /* DWARF procedures are not supposed to have children...  */
@@ -15332,22 +15329,15 @@ static dw_die_ref
 new_dwarf_proc_die (dw_loc_descr_ref location, tree fndecl,
dw_die_ref parent_die)
 {
-  const bool dwarf_proc_supported = dwarf_version >= 4;
   dw_die_ref dwarf_proc_die;
 
   if ((dwarf_version < 3 && dwarf_strict)
   || location == NULL)
 return NULL;
 
-  dwarf_proc_die  = new_die (dwarf_proc_supported
-? DW_TAG_dwarf_procedure
-: DW_TAG_variable,
-parent_die,
-fndecl);
+  dwarf_proc_die = new_die (DW_TAG_dwarf_procedure, parent_die, fndecl);
   if (fndecl)
 equate_decl_number_to_die (fndecl, dwarf_proc_die);
-  if (!dwarf_proc_supported)
-add_AT_flag (dwarf_proc_die, DW_AT_artificial, 1);
   add_AT_loc (dwarf_proc_die, DW_AT_location, location);
   return dwarf_proc_die;
 }
@@ -15673,9 +15663,7 @@ function_to_dwarf_procedure (tree fndecl)
   if (dwarf_proc_die != NULL)
 return dwarf_proc_die;
 
-  /* DWARF procedures are available starting with the DWARFv3 standard, but
- it's the DWARFv4 standard that introduces the DW_TAG_dwarf_procedure
- DIE.  */
+  /* DWARF procedures are available starting with the DWARFv3 standard.  */
   if (dwarf_version < 3 && dwarf_strict)
 return NULL;
 
-- 
2.9.3



[PATCH] DWARF: space-optimize loc. descr. for integer literals on 32-bit targets

2016-08-18 Thread Pierre-Marie de Rodat
Hello,

This enhances location description generation so that the generated
opcodes for integer literals are as space-efficient when HOST_WIDE_INT
is 64-bits large than when it's 32-bits large. In particular, this
reduces the size of the opcodes generated to produce big unsigned
literals using small literal integers instead.

Bootstrapped and regtested on x86-linux (no regression).  I also checked
that the new testcase fails with mainline.  Ok to commit?  Thank you in
advance!

gcc/

* dwarf2out.c (int_loc_descriptor): Generate opcodes for another
equivalent 32-bit constant (modulo 2**32) when that yields
smaller instructions.
(size_of_int_loc_descriptor): Update accordingly.
---
 gcc/dwarf2out.c  | 31 ++-
 gcc/testsuite/gnat.dg/debug8.adb | 29 +
 2 files changed, 55 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gnat.dg/debug8.adb

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 0fdab9a..f175ea1 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -11954,20 +11954,36 @@ int_loc_descriptor (HOST_WIDE_INT i)
/* DW_OP_const1u X DW_OP_litY DW_OP_shl takes just 4 bytes,
   while DW_OP_const4u is 5 bytes.  */
return int_shift_loc_descriptor (i, HOST_BITS_PER_WIDE_INT - clz - 8);
+
+  else if (HOST_BITS_PER_WIDE_INT > 32
+  && DWARF2_ADDR_SIZE == 4 && i > 0x7fff
+  && size_of_int_loc_descriptor ((HOST_WIDE_INT) (int32_t) i)
+ <= 4)
+   {
+ /* As i >= 2**31, the double cast above will yield a negative number.
+Since wrapping is defined in DWARF expressions we can output big
+positive integers as small negative ones, regardless of the size
+of host wide ints.
+
+Here, since the evaluator will handle 32-bit values and since i >=
+2**31, we know it's going to be interpreted as a negative literal:
+store it this way if we can do better than 5 bytes this way.  */
+ return int_loc_descriptor ((HOST_WIDE_INT) (int32_t) i);
+   }
   else if (HOST_BITS_PER_WIDE_INT == 32 || i <= 0x)
op = DW_OP_const4u;
+
+  /* Past this point, i >= 0x1 and thus DW_OP_constu will take at
+least 6 bytes: see if we can do better before falling back to it.  */
   else if (clz + ctz >= HOST_BITS_PER_WIDE_INT - 8
   && clz + 8 + 255 >= HOST_BITS_PER_WIDE_INT)
-   /* DW_OP_const1u X DW_OP_const1u Y DW_OP_shl takes just 5 bytes,
-  while DW_OP_constu of constant >= 0x1 takes at least
-  6 bytes.  */
+   /* DW_OP_const1u X DW_OP_const1u Y DW_OP_shl takes just 5 bytes.  */
return int_shift_loc_descriptor (i, HOST_BITS_PER_WIDE_INT - clz - 8);
   else if (clz + ctz >= HOST_BITS_PER_WIDE_INT - 16
   && clz + 16 + (size_of_uleb128 (i) > 5 ? 255 : 31)
  >= HOST_BITS_PER_WIDE_INT)
/* DW_OP_const2u X DW_OP_litY DW_OP_shl takes just 5 bytes,
-  DW_OP_const2u X DW_OP_const1u Y DW_OP_shl takes 6 bytes,
-  while DW_OP_constu takes in this case at least 6 bytes.  */
+  DW_OP_const2u X DW_OP_const1u Y DW_OP_shl takes 6 bytes.  */
return int_shift_loc_descriptor (i, HOST_BITS_PER_WIDE_INT - clz - 16);
   else if (clz + ctz >= HOST_BITS_PER_WIDE_INT - 32
   && clz + 32 + 31 >= HOST_BITS_PER_WIDE_INT
@@ -12192,6 +12208,11 @@ size_of_int_loc_descriptor (HOST_WIDE_INT i)
   && clz + 8 + 31 >= HOST_BITS_PER_WIDE_INT)
return size_of_int_shift_loc_descriptor (i, HOST_BITS_PER_WIDE_INT
- clz - 8);
+  else if (HOST_BITS_PER_WIDE_INT > 32
+  && DWARF2_ADDR_SIZE == 4 && i > 0x7fff
+  && size_of_int_loc_descriptor ((HOST_WIDE_INT) (int32_t) i)
+ <= 4)
+   return size_of_int_loc_descriptor ((HOST_WIDE_INT) (int32_t) i);
   else if (HOST_BITS_PER_WIDE_INT == 32 || i <= 0x)
return 5;
   s = size_of_uleb128 ((unsigned HOST_WIDE_INT) i);
diff --git a/gcc/testsuite/gnat.dg/debug8.adb b/gcc/testsuite/gnat.dg/debug8.adb
new file mode 100644
index 000..fabcc22
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/debug8.adb
@@ -0,0 +1,29 @@
+-- { dg-do compile }
+-- { dg-options "-cargs -g -fgnat-encodings=minimal -dA" }
+-- { dg-final { scan-assembler-not "DW_OP_const4u" } }
+-- { dg-final { scan-assembler-not "DW_OP_const8u" } }
+
+--  The DW_AT_byte_size attribute DWARF expression for the
+--  DW_TAG_structure_type DIE that describes Rec_Type contains the -4u literal.
+--  Check that it is not created using an inefficient encoding (DW_OP_const1s
+--  is expected).
+
+procedure Debug8 is
+
+   type Rec_Type (I : Integer) is record
+  B : Boolean;
+  case I is
+ when 0 =>
+null;
+ when 1 .. 10 =>
+C : Character;
+ 

[PATCH] Generalize overriding mechanism for debug info output options defaults

2016-08-18 Thread Pierre-Marie de Rodat
Hello,

Currently, the VxWorks target overrides the defaults for debug info
output options (DWARF version, strictness) in a target-specific options
hook.  This patch creates macros so that these defaults can be overriden
in config headers.  In particular, this will make it easier to have
different default for different VxWorks versions.

No behavior change, bootstrapped and regtested on x86_64-linux.  We (at
AdaCore) also have this patch in our branches for more than a year.  Ok
to commit?  Thank you in advance!

gcc/
* defaults.h (DWARF_STRICT_DEFAULT, DWARF_VERSION_DEFAULT): New
macros.
* common.opt (-gdwarf-, -gno-strict-dwarf): Update to use macros
for default values.
---
 gcc/common.opt   | 4 ++--
 gcc/config/vxworks.c | 8 
 gcc/config/vxworks.h | 8 
 gcc/defaults.h   | 8 
 4 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 65a9762..23d3576 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2736,7 +2736,7 @@ Common JoinedOrMissing Negative(gdwarf-)
 Generate debug information in default version of DWARF format.
 
 gdwarf-
-Common Joined UInteger Var(dwarf_version) Init(4) Negative(gstabs)
+Common Joined UInteger Var(dwarf_version) Init(DWARF_VERSION_DEFAULT) 
Negative(gstabs)
 Generate debug information in DWARF v2 (or later) format.
 
 ggdb
@@ -2780,7 +2780,7 @@ Common JoinedOrMissing Negative(gvms)
 Generate debug information in extended STABS format.
 
 gno-strict-dwarf
-Common RejectNegative Var(dwarf_strict,0) Init(0)
+Common RejectNegative Var(dwarf_strict,0) Init(DWARF_STRICT_DEFAULT)
 Emit DWARF additions beyond selected version.
 
 gstrict-dwarf
diff --git a/gcc/config/vxworks.c b/gcc/config/vxworks.c
index ce2c170..229ed93 100644
--- a/gcc/config/vxworks.c
+++ b/gcc/config/vxworks.c
@@ -143,12 +143,4 @@ vxworks_override_options (void)
   /* PIC is only supported for RTPs.  */
   if (flag_pic && !TARGET_VXWORKS_RTP)
 error ("PIC is only supported for RTPs");
-
-  /* Default to strict dwarf-2 to prevent potential difficulties observed with
- non-gdb debuggers on extensions > 2.  */
-  if (!global_options_set.x_dwarf_strict)
-dwarf_strict = 1;
-
-  if (!global_options_set.x_dwarf_version)
-dwarf_version = 2;
 }
diff --git a/gcc/config/vxworks.h b/gcc/config/vxworks.h
index f356926..ba07274 100644
--- a/gcc/config/vxworks.h
+++ b/gcc/config/vxworks.h
@@ -138,3 +138,11 @@ extern void vxworks_asm_out_destructor (rtx symbol, int 
priority);
 
 /* The diab linker does not handle .gnu_attribute sections.  */
 #undef HAVE_AS_GNU_ATTRIBUTE
+
+/* Default to strict dwarf-2 with all GNAT encodings to prevent potential
+   difficulties observed with non-gdb debuggers on extensions > 2.  */
+#undef DWARF_STRICT_DEFAULT
+#define DWARF_STRICT_DEFAULT 1
+
+#undef DWARF_VERSION_DEFAULT
+#define DWARF_VERSION_DEFAULT 2
diff --git a/gcc/defaults.h b/gcc/defaults.h
index af8fe91..3658ae6 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -1485,6 +1485,14 @@ see the files COPYING3 and COPYING.RUNTIME respectively. 
 If not, see
 
 #endif /* GCC_INSN_FLAGS_H  */
 
+#ifndef DWARF_STRICT_DEFAULT
+#define DWARF_STRICT_DEFAULT 0
+#endif
+
+#ifndef DWARF_VERSION_DEFAULT
+#define DWARF_VERSION_DEFAULT 4
+#endif
+
 #ifndef DWARF_GNAT_ENCODINGS_DEFAULT
 #define DWARF_GNAT_ENCODINGS_DEFAULT DWARF_GNAT_ENCODINGS_GDB
 #endif
-- 
2.9.3



[PATCH][Aarch64][gcc] Fix vld2/3/4 on big endian systems

2016-08-18 Thread Tamar Christina
Hi all,

This fixes a bug in the vector load functions in which they load the
vector in the wrong order for big endian systems. This patch flips the
order conditionally in the vec_concats.

No testcase given because plenty of existing tests for vld functions.
Ran regression tests on aarch64_be-none-elf and aarch64-none-elf.
Vldx tests now pass on aarch64_be-none-elf and no regressions on both.

Ok for trunk?

I do not have commit rights so if ok can someone apply it for me?

Thanks,
Tamar

gcc/
2016-08-16  Tamar Christina  

* gcc/config/aarch64/aarch64-simd.md
(aarch64_ld2_dreg_le): New.
(aarch64_ld2_dreg_be): New.
(aarch64_ld2_dreg): Removed.
(aarch64_ld3_dreg_le): New.
(aarch64_ld3_dreg_be): New.
(aarch64_ld3_dreg): Removed.
(aarch64_ld4_dreg_le): New.
(aarch64_ld4_dreg_be): New.
(aarch64_ld4_dreg): Removed.
(aarch64_ld): Wrapper around _le, _be.diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index feb5e96..7d3dfe8 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4861,7 +4861,7 @@
   DONE;
 })
 
-(define_insn "aarch64_ld2_dreg"
+(define_insn "aarch64_ld2_dreg_le"
   [(set (match_operand:OI 0 "register_operand" "=w")
 	(subreg:OI
 	  (vec_concat:
@@ -4874,12 +4874,30 @@
 	 (unspec:VD [(match_dup 1)]
 			UNSPEC_LD2)
 	 (vec_duplicate:VD (const_int 0 0))]
-  "TARGET_SIMD"
+  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
   "ld2\\t{%S0. - %T0.}, %1"
   [(set_attr "type" "neon_load2_2reg")]
 )
 
-(define_insn "aarch64_ld2_dreg"
+(define_insn "aarch64_ld2_dreg_be"
+  [(set (match_operand:OI 0 "register_operand" "=w")
+	(subreg:OI
+	  (vec_concat:
+	(vec_concat:
+	 (vec_duplicate:VD (const_int 0))
+	 (unspec:VD
+		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
+		UNSPEC_LD2))
+	(vec_concat:
+	 (vec_duplicate:VD (const_int 0))
+	 (unspec:VD [(match_dup 1)]
+			UNSPEC_LD2))) 0))]
+  "TARGET_SIMD && BYTES_BIG_ENDIAN"
+  "ld2\\t{%S0. - %T0.}, %1"
+  [(set_attr "type" "neon_load2_2reg")]
+)
+
+(define_insn "aarch64_ld2_dreg_le"
   [(set (match_operand:OI 0 "register_operand" "=w")
 	(subreg:OI
 	  (vec_concat:
@@ -4892,12 +4910,30 @@
 	 (unspec:DX [(match_dup 1)]
 			UNSPEC_LD2)
 	 (const_int 0))) 0))]
-  "TARGET_SIMD"
+  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
+  "ld1\\t{%S0.1d - %T0.1d}, %1"
+  [(set_attr "type" "neon_load1_2reg")]
+)
+
+(define_insn "aarch64_ld2_dreg_be"
+  [(set (match_operand:OI 0 "register_operand" "=w")
+	(subreg:OI
+	  (vec_concat:
+	(vec_concat:
+	 (const_int 0)
+	 (unspec:DX
+		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
+		UNSPEC_LD2))
+	(vec_concat:
+	 (const_int 0)
+	 (unspec:DX [(match_dup 1)]
+			UNSPEC_LD2))) 0))]
+  "TARGET_SIMD && BYTES_BIG_ENDIAN"
   "ld1\\t{%S0.1d - %T0.1d}, %1"
   [(set_attr "type" "neon_load1_2reg")]
 )
 
-(define_insn "aarch64_ld3_dreg"
+(define_insn "aarch64_ld3_dreg_le"
   [(set (match_operand:CI 0 "register_operand" "=w")
 	(subreg:CI
 	 (vec_concat:
@@ -4915,12 +4951,35 @@
 	 (unspec:VD [(match_dup 1)]
 			UNSPEC_LD3)
 	 (vec_duplicate:VD (const_int 0 0))]
-  "TARGET_SIMD"
+  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
   "ld3\\t{%S0. - %U0.}, %1"
   [(set_attr "type" "neon_load3_3reg")]
 )
 
-(define_insn "aarch64_ld3_dreg"
+(define_insn "aarch64_ld3_dreg_be"
+  [(set (match_operand:CI 0 "register_operand" "=w")
+	(subreg:CI
+	 (vec_concat:
+	  (vec_concat:
+	(vec_concat:
+	 (vec_duplicate:VD (const_int 0))
+	 (unspec:VD
+		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
+		UNSPEC_LD3))
+	(vec_concat:
+	 (vec_duplicate:VD (const_int 0))
+	 (unspec:VD [(match_dup 1)]
+			UNSPEC_LD3)))
+	  (vec_concat:
+	 (vec_duplicate:VD (const_int 0))
+	 (unspec:VD [(match_dup 1)]
+			UNSPEC_LD3))) 0))]
+  "TARGET_SIMD && BYTES_BIG_ENDIAN"
+  "ld3\\t{%S0. - %U0.}, %1"
+  [(set_attr "type" "neon_load3_3reg")]
+)
+
+(define_insn "aarch64_ld3_dreg_le"
   [(set (match_operand:CI 0 "register_operand" "=w")
 	(subreg:CI
 	 (vec_concat:
@@ -4938,12 +4997,35 @@
 	 (unspec:DX [(match_dup 1)]
 			UNSPEC_LD3)
 	 (const_int 0))) 0))]
-  "TARGET_SIMD"
+  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
+  "ld1\\t{%S0.1d - %U0.1d}, %1"
+  [(set_attr "type" "neon_load1_3reg")]
+)
+
+(define_insn "aarch64_ld3_dreg_be"
+  [(set (match_operand:CI 0 "register_operand" "=w")
+	(subreg:CI
+	 (vec_concat:
+	  (vec_concat:
+	(vec_concat:
+	 (const_int 0)
+	 (unspec:DX
+		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
+		UNSPEC_LD3))
+	(vec_concat:
+	 (const_int 0)
+	 (unspec:DX [(match_dup 1)]
+			UNSPEC_LD3)))
+	  (vec_concat:
+	 (const_int 0)
+	 (unspec:DX [(match_dup 1)]
+			UNSPEC_LD3))) 0))]
+  "TARGET_SIMD && BYTES_BIG_ENDIAN"
   "ld1\\t{%S0.1d - %U0.1d}, %1"
   [(set_attr "type" "neon_load1_3reg")]
 )
 
-(define_insn 

Re: [wwwdocs] Improve example at https://gcc.gnu.org/gcc-6/porting_to.html#flifetime-dse

2016-08-18 Thread Jonathan Wakely

On 16/08/16 05:16 +, Bernd Edlinger wrote:

Hi Jonathan,


I think this would be an improvement, although I still can't get the
assertion to fail:


Probably because the memory is still initialized to zero,
when it is used for the first time.

Try this:

#include 
#include 
#include 

struct A
{
A() {}

void* operator new(size_t s)
{
  void* ptr = malloc(s);
  memset(ptr, 0xFF, s);
  return ptr;
}

void operator delete(void* ptr) { free(ptr); }

int value;
};

int main()
{
A* a =  new A;
assert(a->value == -1); /* Use of uninitialized value */
delete a;
}


I've committed Bernd's improved example that demonstrates the new
optimization.

Thanks for everyone's input.

Index: gcc-6/porting_to.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/porting_to.html,v
retrieving revision 1.24
diff -u -r1.24 porting_to.html
--- gcc-6/porting_to.html	15 Aug 2016 17:32:29 -	1.24
+++ gcc-6/porting_to.html	18 Aug 2016 08:58:33 -
@@ -357,29 +357,25 @@
 
 struct A
 {
-  A () {}
-  void *operator new (size_t s)
-  {
-void *ptr = malloc (s);
-memset (ptr, 0, s);
-return ptr;
-  }
+ A() {}
 
-  int value;
-};
+ void* operator new(size_t s)
+ {
+   void* ptr = malloc(s);
+   memset(ptr, 0xFF, s);
+   return ptr;
+ }
 
-A *
-__attribute__ ((noinline))
-build (void)
-{
-  return new A ();
-}
+ void operator delete(void* ptr) { free(ptr); }
+
+ int value;
+};
 
 int main()
 {
-  A *a =  build ();
-  assert (a-value == 0); /* Use of uninitialized value */
-  free (a);
+ A* a =  new A;
+ assert(a-value == -1); // Use of uninitialized value
+ delete a;
 }
 
 


[PATCH] S/390: Improve result verification in test case vec-genmask-1.c.

2016-08-18 Thread Dominik Vogt
THe attached patch improves checking of teh results of the
subtests "a" and "f".  As they share the same "vone" instruction,
the duplicate scan-assembler-times was bogus.  Moved "f" to a
separate function to fix this.  Also double check that no extra
"vgmf" instructions are used.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/testsuite/ChangeLog

* gcc.target/s390/zvector/vec-genmask-1.c: Improve result verification.
>From 6750201c2b594bbbdce9abb14a3e1944806c425d Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Thu, 18 Aug 2016 09:52:40 +0100
Subject: [PATCH] S/390: Improve result verification in test case
 vec-genmask-1.c.

---
 .../gcc.target/s390/zvector/vec-genmask-1.c  | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-genmask-1.c 
b/gcc/testsuite/gcc.target/s390/zvector/vec-genmask-1.c
index 745c1ed..0e57b8d 100644
--- a/gcc/testsuite/gcc.target/s390/zvector/vec-genmask-1.c
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-genmask-1.c
@@ -14,11 +14,19 @@ foo ()
   c = vec_genmasks_32 (31, 31);
   d = vec_genmasks_32 (5, 5);
   e = vec_genmasks_32 (31, 0);
+}
+
+int
+bar ()
+{
+  /* Needs to be in a separate function so that the vone from "a" is not reused
+ for "f".  */
   f = vec_genmasks_32 (6, 5);
 }
-/* { dg-final { scan-assembler-times "vone" 1 } } */
-/* { dg-final { scan-assembler-times "vgmf\t%v.*,0,0" 1 } } */
-/* { dg-final { scan-assembler-times "vgmf\t%v.*,31,31" 1 } } */
-/* { dg-final { scan-assembler-times "vgmf\t%v.*,5,5" 1 } } */
-/* { dg-final { scan-assembler-times "vgmf\t%v.*,31,0" 1 } } */
-/* { dg-final { scan-assembler-times "vone" 1 } } */
+
+/* a + f: { dg-final { scan-assembler-times "vone" 2 } } */
+/* b: { dg-final { scan-assembler-times "vgmf\t%v.*,0,0" 1 } } */
+/* c: { dg-final { scan-assembler-times "vgmf\t%v.*,31,31" 1 } } */
+/* d: { dg-final { scan-assembler-times "vgmf\t%v.*,5,5" 1 } } */
+/* e: { dg-final { scan-assembler-times "vgmf\t%v.*,31,0" 1 } } */
+/* b - e: { dg-final { scan-assembler-times "vgmf" 4 } } */
-- 
2.3.0



Re: [PATCH] [GCC] Don't use section anchors for declarations that don't fit in a single anchor range

2016-08-18 Thread James Greenhalgh
On Wed, Aug 17, 2016 at 09:09:07PM -0600, Jeff Law wrote:
> On 08/17/2016 02:23 AM, Richard Biener wrote:
> >On Tue, Aug 16, 2016 at 6:06 PM, Jeff Law  wrote:
> >>On 08/16/2016 08:01 AM, Tamar Christina wrote:
> >>>
> >>>
> >>>Hi All,
> >>>
> >>>This patch turns off the usage of section anchors for
> >>>declarations that do not fit in a single anchor range.
> >>>A large enough object will use the full anchor range
> >>>and also force the use of another anchor pointer.
> >>>
> >>>By not using an anchor for large objects more globals
> >>>can share the same anchor.
> >>>
> >>>The patch has been benchmarked using Spec2000 and
> >>>the impact on performance is negligible, however some files
> >>>using large arrays showed a appreciable reduction in amount of
> >>>instructions in the assembly file.
> >>>
> >>>Regression tests were run on aarch64-none-elf and no regressions.
> >>>
> >>>Ok for trunk?
> >>>
> >>>2016-08-16  Tamar Christina  
> >>>Ramana Radhakrishnan  
> >>>
> >>>* gcc/varasm.c
> >>>(default_use_anchors_for_symbol_p): Reject too large decls.
> >>
> >>Do you have to worry about DECL_SIZE being non-constant here or are those
> >>filtered out earlier?
> >
> >Globals can't have variable size.
> Duh.  You're obviously correct.
> 
> Patch is OK for the trunk.

Tamar's not yet got commit rights, so I've committed this on his behalf as
r239561.

Thanks for the review!

James



[PATCH] S/390: Fix insv-1.c test with -m31.

2016-08-18 Thread Dominik Vogt
The atteched patch fixes the S/390 test case insv-1.c with -m31.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/testsuite/ChangeLog

* gcc.target/s390/insv-1.c: Fix test when running with -m31.
>From 4021fe9c4703e29b9446a24584ebd998c591d7ed Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Fri, 12 Aug 2016 17:20:55 +0100
Subject: [PATCH] S/390: Fix insv-1.c test with -m31.

---
 gcc/testsuite/gcc.target/s390/insv-1.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.target/s390/insv-1.c 
b/gcc/testsuite/gcc.target/s390/insv-1.c
index 8d464f5..020b5d8 100644
--- a/gcc/testsuite/gcc.target/s390/insv-1.c
+++ b/gcc/testsuite/gcc.target/s390/insv-1.c
@@ -110,9 +110,9 @@ foo4c (unsigned long a, unsigned long b)
 
 /* The functions foo3, foo4, foo3b, foo4b no longer use risbg but rosbg 
instead.
 
-   On s390x, four risbg go away and four new ones appear in other functions ...
- { dg-final { scan-assembler-times "risbg" 6 { target { s390x-*-* } } } }
+   On 64 bit, four risbg go away and four new ones appear in other functions
+ { dg-final { scan-assembler-times "risbg" 6 { target { lp64 } } } }
 
-   but not on s390.
- { dg-final { scan-assembler-times "risbg" 2 { target { s390-*-* } } } }
+   ... but not on 31 bit.
+ { dg-final { scan-assembler-times "risbg" 2 { target { ! lp64 } } } }
 */
-- 
2.3.0



Re: [PATCH] Add a TARGET_GEN_MEMSET_VALUE hook

2016-08-18 Thread Richard Biener
On Wed, Aug 17, 2016 at 10:11 PM, H.J. Lu  wrote:
> builtin_memset_gen_str returns a register used for memset, which only
> supports integer registers.  But a target may use vector registers in
> memmset.  This patch adds a TARGET_GEN_MEMSET_VALUE hook to duplicate
> QImode value to mode derived from STORE_MAX_PIECES, which can be used
> with vector instructions.  The default hook is the same as the original
> builtin_memset_gen_str.  A target can override it to support vector
> instructions for STORE_MAX_PIECES.
>
> Tested on x86-64 and i686.  Any comments?
>
> H.J.
> ---
> gcc/
>
> * builtins.c (builtin_memset_gen_str): Call targetm.gen_memset_value.
> (default_gen_memset_value): New function.
> * target.def (gen_memset_value): New hook.
> * targhooks.c: Inclue "expmed.h" and "builtins.h".
> (default_gen_memset_value): New function.

I see default_gen_memset_value in builtins.c but it belongs here.

> * targhooks.h (default_gen_memset_value): New prototype.
> * config/i386/i386.c (ix86_gen_memset_value): New function.
> (TARGET_GEN_MEMSET_VALUE): New.
> * config/i386/i386.h (STORE_MAX_PIECES): Likewise.
> * doc/tm.texi.in: Add TARGET_GEN_MEMSET_VALUE hook.
> * doc/tm.texi: Updated.
>
> gcc/testsuite/
>
> * gcc.target/i386/pieces-memset-1.c: New test.
> * gcc.target/i386/pieces-memset-2.c: Likewise.
> * gcc.target/i386/pieces-memset-3.c: Likewise.
> * gcc.target/i386/pieces-memset-4.c: Likewise.
> * gcc.target/i386/pieces-memset-5.c: Likewise.
> * gcc.target/i386/pieces-memset-6.c: Likewise.
> * gcc.target/i386/pieces-memset-7.c: Likewise.
> * gcc.target/i386/pieces-memset-8.c: Likewise.
> * gcc.target/i386/pieces-memset-9.c: Likewise.
> * gcc.target/i386/pieces-memset-10.c: Likewise.
> * gcc.target/i386/pieces-memset-11.c: Likewise.
> * gcc.target/i386/pieces-memset-12.c: Likewise.
> * gcc.target/i386/pieces-memset-13.c: Likewise.
> * gcc.target/i386/pieces-memset-14.c: Likewise.
> * gcc.target/i386/pieces-memset-15.c: Likewise.
> * gcc.target/i386/pieces-memset-16.c: Likewise.
> * gcc.target/i386/pieces-memset-17.c: Likewise.
> * gcc.target/i386/pieces-memset-18.c: Likewise.
> * gcc.target/i386/pieces-memset-19.c: Likewise.
> * gcc.target/i386/pieces-memset-20.c: Likewise.
> * gcc.target/i386/pieces-memset-21.c: Likewise.
> * gcc.target/i386/pieces-memset-22.c: Likewise.
> * gcc.target/i386/pieces-memset-23.c: Likewise.
> * gcc.target/i386/pieces-memset-24.c: Likewise.
> * gcc.target/i386/pieces-memset-25.c: Likewise.
> * gcc.target/i386/pieces-memset-26.c: Likewise.
> * gcc.target/i386/pieces-memset-27.c: Likewise.
> * gcc.target/i386/pieces-memset-28.c: Likewise.
> * gcc.target/i386/pieces-memset-29.c: Likewise.
> * gcc.target/i386/pieces-memset-30.c: Likewise.
> * gcc.target/i386/pieces-memset-31.c: Likewise.
> * gcc.target/i386/pieces-memset-32.c: Likewise.
> * gcc.target/i386/pieces-memset-33.c: Likewise.
> * gcc.target/i386/pieces-memset-34.c: Likewise.
> * gcc.target/i386/pieces-memset-35.c: Likewise.
> * gcc.target/i386/pieces-memset-36.c: Likewise.
> * gcc.target/i386/pieces-memset-37.c: Likewise.
> * gcc.target/i386/pieces-memset-38.c: Likewise.
> * gcc.target/i386/pieces-memset-39.c: Likewise.
> * gcc.target/i386/pieces-memset-40.c: Likewise.
> * gcc.target/i386/pieces-memset-41.c: Likewise.
> * gcc.target/i386/pieces-memset-42.c: Likewise.
> * gcc.target/i386/pieces-memset-43.c: Likewise.
> * gcc.target/i386/pieces-memset-44.c: Likewise.
> ---
>  gcc/builtins.c   | 32 
>  gcc/config/i386/i386.c   | 38 
> 
>  gcc/config/i386/i386.h   | 13 
>  gcc/doc/tm.texi  |  7 +
>  gcc/doc/tm.texi.in   |  2 ++
>  gcc/target.def   |  9 ++
>  gcc/targhooks.h  |  1 +
>  gcc/testsuite/gcc.target/i386/pieces-memset-1.c  | 12 
>  gcc/testsuite/gcc.target/i386/pieces-memset-10.c | 12 
>  gcc/testsuite/gcc.target/i386/pieces-memset-11.c | 12 
>  gcc/testsuite/gcc.target/i386/pieces-memset-12.c | 12 
>  gcc/testsuite/gcc.target/i386/pieces-memset-13.c | 12 
>  gcc/testsuite/gcc.target/i386/pieces-memset-14.c | 12 
>  gcc/testsuite/gcc.target/i386/pieces-memset-15.c | 12 
>  gcc/testsuite/gcc.target/i386/pieces-memset-16.c | 12 
>  gcc/testsuite/gcc.target/i386/pieces-memset-17.c | 12 
>  

[PATCH] Make clone materialization an IPA pass

2016-08-18 Thread Richard Biener

The following patch makes it possible to add statistic counters to
update-ssa.  Clone materialization ends up updating SSA form from
a context with current_pass being NULL - wrapping materialize_all_clones
into a pass fixes this.

Bootstrap / regtest running on x86_64-unknown-linux-gnu, ok for trunk?

(simple-ipa-pass was the simplest but not sure if it is the most efficient
given I remember we somehow pull in all bodies for this?)

Thanks,
Richard.

2016-08-18  Richard Biener  

* tree-pass.h (make_pass_materialize_all_clones): Declare.
* ipa.c (pass_data_materialize_all_clones, pass_materialize_all_clones,
make_pass_materialize_all_clones): New simple IPA pass encapsulating
clone materialization.
* passes.def (all_late_ipa_passes): Start with
pass_materialize_all_clones.
* cgraphunit.c (symbol_table::compile): Remove call to
materialize_all_clones.
* tree-into-ssa.c: Include statistics.h.
(update_ssa): Count number of times we do incremental/rewrite
SSA update.

Index: gcc/tree-pass.h
===
*** gcc/tree-pass.h (revision 239560)
--- gcc/tree-pass.h (working copy)
*** extern ipa_opt_pass_d *make_pass_ipa_pro
*** 504,509 
--- 504,511 
  extern ipa_opt_pass_d *make_pass_ipa_cdtor_merge (gcc::context *ctxt);
  extern ipa_opt_pass_d *make_pass_ipa_single_use (gcc::context *ctxt);
  extern ipa_opt_pass_d *make_pass_ipa_comdats (gcc::context *ctxt);
+ extern simple_ipa_opt_pass *make_pass_materialize_all_clones (gcc::context *
+ ctxt);
  
  extern gimple_opt_pass *make_pass_cleanup_cfg_post_optimizing (gcc::context
   *ctxt);
Index: gcc/ipa.c
===
*** gcc/ipa.c   (revision 239560)
--- gcc/ipa.c   (working copy)
*** make_pass_ipa_single_use (gcc::context *
*** 1443,1445 
--- 1443,1486 
  {
return new pass_ipa_single_use (ctxt);
  }
+ 
+ /* Materialize all clones.  */
+ 
+ namespace {
+ 
+ const pass_data pass_data_materialize_all_clones =
+ {
+   SIMPLE_IPA_PASS, /* type */
+   "materialize-all-clones", /* name */
+   OPTGROUP_NONE, /* optinfo_flags */
+   TV_IPA_OPT, /* tv_id */
+   0, /* properties_required */
+   0, /* properties_provided */
+   0, /* properties_destroyed */
+   0, /* todo_flags_start */
+   0, /* todo_flags_finish */
+ };
+ 
+ class pass_materialize_all_clones : public simple_ipa_opt_pass
+ {
+ public:
+   pass_materialize_all_clones (gcc::context *ctxt)
+ : simple_ipa_opt_pass (pass_data_materialize_all_clones, ctxt)
+   {}
+ 
+   /* opt_pass methods: */
+   virtual unsigned int execute (function *)
+ {
+   symtab->materialize_all_clones ();
+   return 0;
+ }
+ 
+ }; // class pass_materialize_all_clones
+ 
+ } // anon namespace
+ 
+ simple_ipa_opt_pass *
+ make_pass_materialize_all_clones (gcc::context *ctxt)
+ {
+   return new pass_materialize_all_clones (ctxt);
+ }
Index: gcc/passes.def
===
*** gcc/passes.def  (revision 239560)
--- gcc/passes.def  (working copy)
*** along with GCC; see the file COPYING3.
*** 167,172 
--- 167,173 
   passes are executed after partitioning and thus see just parts of the
   compiled unit.  */
INSERT_PASSES_AFTER (all_late_ipa_passes)
+   NEXT_PASS (pass_materialize_all_clones);
NEXT_PASS (pass_ipa_pta);
NEXT_PASS (pass_dispatcher_calls);
NEXT_PASS (pass_omp_simd_clone);
Index: gcc/cgraphunit.c
===
*** gcc/cgraphunit.c(revision 239560)
--- gcc/cgraphunit.c(working copy)
*** symbol_table::compile (void)
*** 2435,2441 
  fprintf (stderr, "Assembling functions:\n");
symtab_node::checking_verify_symtab_nodes ();
  
-   materialize_all_clones ();
bitmap_obstack_initialize (NULL);
execute_ipa_pass_list (g->get_passes ()->all_late_ipa_passes);
bitmap_obstack_release (NULL);
--- 2438,2443 
Index: gcc/tree-into-ssa.c
===
*** gcc/tree-into-ssa.c (revision 239560)
--- gcc/tree-into-ssa.c (working copy)
*** along with GCC; see the file COPYING3.
*** 37,42 
--- 37,43 
  #include "tree-dfa.h"
  #include "tree-ssa.h"
  #include "domwalk.h"
+ #include "statistics.h"
  
  #define PERCENT(x,y) ((float)(x) * 100.0 / (float)(y))
  
*** update_ssa (unsigned update_flags)
*** 3248,3253 
--- 3249,3256 
   OLD_SSA_NAMES.  */
if (bitmap_first_set_bit (new_ssa_names) >= 0)
  {
+   statistics_counter_event (cfun, "Incremental SSA update", 1);
+ 
prepare_names_to_update (insert_phi_p);
  
/* If all the names in 

Re: [PATCH] Fix up i386 dependencies on i386-builtin-types.inc

2016-08-18 Thread Uros Bizjak
On Thu, Aug 18, 2016 at 9:39 AM, Jakub Jelinek  wrote:
> Hi!
>
> While looking if something doesn't need to be updated for the
> i386-builtin.def addition (I think we are ok, this file doesn't need to be
> shipped for plugins, as it is only included in i386.c, the first time
> i386.c is built no matter what and next time we'll have it listed in
> .deps/i386.Po), I've noticed IMHO incorrect dependency.
> i386-builtin-types.inc is included only by i386.c, not by i386-c.c, and
> while the second time it will be listed in .deps/i386.Po, I think if
> somebody is unlucky enough that i386.o is compiled before i386-c.o, then
> i386-builtin-types.inc might not be compiled yet.
>
> As an experiment,
> rm -f i386{,-c}.o .deps/i386{,-c}.Po s-i386-bt i386-builtin-types.inc
> make i386.o
> fails to compile without this patch.  i386-c.o apparently comes before
> i386.o in dependencies of cc1, so even with make -j1 it is built before
> that.  If I add sleep 4s into the s-i386-bt: rules before move-if-change
> then even make -j16 build fails after the above rm, so it is all about good
> timing.
>
> Ok for trunk if testing passes?

Uh, yes. This is also OK for backports.

Thanks,
Uros.

> 2016-08-18  Jakub Jelinek  
>
> * config/i386/t-i386 (i386-c.o): Don't depend on
> i386-builtin-types.inc.
> (i386.o): Depend on i386-builtin-types.inc.
>
> --- gcc/config/i386/t-i386.jj   2016-01-04 14:55:56.0 +0100
> +++ gcc/config/i386/t-i386  2016-08-18 09:23:28.523058356 +0200
> @@ -19,10 +19,12 @@
>  OPTIONS_H_EXTRA += $(srcdir)/config/i386/stringop.def
>  TM_H += $(srcdir)/config/i386/x86-tune.def
>
> -i386-c.o: $(srcdir)/config/i386/i386-c.c i386-builtin-types.inc
> +i386-c.o: $(srcdir)/config/i386/i386-c.c
>   $(COMPILE) $<
>   $(POSTCOMPILE)
>
> +i386.o: i386-builtin-types.inc
> +
>  i386-builtin-types.inc: s-i386-bt ; @true
>  s-i386-bt: $(srcdir)/config/i386/i386-builtin-types.awk \
>$(srcdir)/config/i386/i386-builtin-types.def
>
> Jakub


[PATCH] Fix up i386 dependencies on i386-builtin-types.inc

2016-08-18 Thread Jakub Jelinek
Hi!

While looking if something doesn't need to be updated for the
i386-builtin.def addition (I think we are ok, this file doesn't need to be
shipped for plugins, as it is only included in i386.c, the first time
i386.c is built no matter what and next time we'll have it listed in
.deps/i386.Po), I've noticed IMHO incorrect dependency.
i386-builtin-types.inc is included only by i386.c, not by i386-c.c, and
while the second time it will be listed in .deps/i386.Po, I think if
somebody is unlucky enough that i386.o is compiled before i386-c.o, then
i386-builtin-types.inc might not be compiled yet.

As an experiment,
rm -f i386{,-c}.o .deps/i386{,-c}.Po s-i386-bt i386-builtin-types.inc
make i386.o
fails to compile without this patch.  i386-c.o apparently comes before
i386.o in dependencies of cc1, so even with make -j1 it is built before
that.  If I add sleep 4s into the s-i386-bt: rules before move-if-change
then even make -j16 build fails after the above rm, so it is all about good
timing.

Ok for trunk if testing passes?

2016-08-18  Jakub Jelinek  

* config/i386/t-i386 (i386-c.o): Don't depend on
i386-builtin-types.inc.
(i386.o): Depend on i386-builtin-types.inc.

--- gcc/config/i386/t-i386.jj   2016-01-04 14:55:56.0 +0100
+++ gcc/config/i386/t-i386  2016-08-18 09:23:28.523058356 +0200
@@ -19,10 +19,12 @@
 OPTIONS_H_EXTRA += $(srcdir)/config/i386/stringop.def
 TM_H += $(srcdir)/config/i386/x86-tune.def
 
-i386-c.o: $(srcdir)/config/i386/i386-c.c i386-builtin-types.inc
+i386-c.o: $(srcdir)/config/i386/i386-c.c
  $(COMPILE) $<
  $(POSTCOMPILE)
 
+i386.o: i386-builtin-types.inc
+
 i386-builtin-types.inc: s-i386-bt ; @true
 s-i386-bt: $(srcdir)/config/i386/i386-builtin-types.awk \
   $(srcdir)/config/i386/i386-builtin-types.def

Jakub


[PATCH] Update virtual SSA in if-conversion

2016-08-18 Thread Richard Biener

The following patch avoids rewriting virtual SSA after each if-converted
loop (possibly).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2016-08-18  Richard Biener  

* ssa-iterators.h (ssa_vuse_operand): New inline.
* tree-if-conv.c (ifc_temp_var): Update virtual operand.
(predicate_all_scalar_phis): Use remove_phi_node to remove
phi nodes predicated.  Delay removing virtual PHIs.
(predicate_mem_writes): Update virtual operands.
(combine_blocks): Likewise.  Propagate out remaining virtual PHIs.
(tree_if_conversion): Do not rewrite virtual SSA form.
* tree-phinodes.c (release_phi_node): Make static.
* tree-phinodes.h (release_phi_node): Remove.

Index: gcc/ssa-iterators.h
===
--- gcc/ssa-iterators.h (revision 239529)
+++ gcc/ssa-iterators.h (working copy)
@@ -699,6 +699,15 @@ single_ssa_use_operand (gimple *stmt, in
   return NULL_USE_OPERAND_P;
 }
 
+/* Return the single virtual use operand in STMT if present.  Otherwise
+   return NULL.  */
+static inline use_operand_p
+ssa_vuse_operand (gimple *stmt)
+{
+  if (! gimple_vuse (stmt))
+return NULL_USE_OPERAND_P;
+  return USE_OP_PTR (gimple_use_ops (stmt));
+}
 
 
 /* If there is a single operand in STMT matching FLAGS, return it.  Otherwise
Index: gcc/tree-if-conv.c
===
--- gcc/tree-if-conv.c  (revision 239529)
+++ gcc/tree-if-conv.c  (working copy)
@@ -326,6 +326,7 @@ ifc_temp_var (tree type, tree expr, gimp
 {
   tree new_name = make_temp_ssa_name (type, NULL, "_ifc_");
   gimple *stmt = gimple_build_assign (new_name, expr);
+  gimple_set_vuse (stmt, gimple_vuse (gsi_stmt (*gsi)));
   gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
   return new_name;
 }
@@ -1946,12 +1947,14 @@ predicate_all_scalar_phis (struct loop *
   while (!gsi_end_p (phi_gsi))
{
  phi = phi_gsi.phi ();
- predicate_scalar_phi (phi, );
- release_phi_node (phi);
- gsi_next (_gsi);
+ if (virtual_operand_p (gimple_phi_result (phi)))
+   gsi_next (_gsi);
+ else
+   {
+ predicate_scalar_phi (phi, );
+ remove_phi_node (_gsi, false);
+   }
}
-
-  set_phi_nodes (bb, NULL);
 }
 }
 
@@ -2218,11 +2221,18 @@ predicate_mem_writes (loop_p loop)
  = gimple_build_call_internal (IFN_MASK_LOAD, 3, addr,
ptr, mask);
gimple_call_set_lhs (new_stmt, lhs);
+   gimple_set_vuse (new_stmt, gimple_vuse (stmt));
  }
else
- new_stmt
-   = gimple_build_call_internal (IFN_MASK_STORE, 4, addr, ptr,
- mask, rhs);
+ {
+   new_stmt
+ = gimple_build_call_internal (IFN_MASK_STORE, 4, addr, ptr,
+ mask, rhs);
+   gimple_set_vuse (new_stmt, gimple_vuse (stmt));
+   gimple_set_vdef (new_stmt, gimple_vdef (stmt));
+   SSA_NAME_DEF_STMT (gimple_vdef (new_stmt)) = new_stmt;
+ }
+
gsi_replace (, new_stmt, true);
  }
else if (gimple_vdef (stmt))
@@ -2361,6 +2371,20 @@ combine_blocks (struct loop *loop)
 }
 
   merge_target_bb = loop->header;
+
+  /* Get at the virtual def valid for uses starting at the first block
+ we merge into the header.  Without a virtual PHI the loop has the
+ same virtual use on all stmts.  */
+  gphi *vphi = get_virtual_phi (loop->header);
+  tree last_vdef = NULL_TREE;
+  if (vphi)
+{
+  last_vdef = gimple_phi_result (vphi);
+  for (gimple_stmt_iterator gsi = gsi_start_bb (loop->header);
+  ! gsi_end_p (gsi); gsi_next ())
+   if (gimple_vdef (gsi_stmt (gsi)))
+ last_vdef = gimple_vdef (gsi_stmt (gsi));
+}
   for (i = 1; i < orig_loop_num_nodes; i++)
 {
   gimple_stmt_iterator gsi;
@@ -2371,6 +2395,24 @@ combine_blocks (struct loop *loop)
   if (bb == exit_bb || bb == loop->latch)
continue;
 
+  /* We release virtual PHIs late because we have to propagate them
+ out using the current VUSE.  The def might be the one used
+after the loop.  */
+  vphi = get_virtual_phi (bb);
+  if (vphi)
+   {
+ imm_use_iterator iter;
+ use_operand_p use_p;
+ gimple *use_stmt;
+ FOR_EACH_IMM_USE_STMT (use_stmt, iter, gimple_phi_result (vphi))
+   {
+ FOR_EACH_IMM_USE_ON_STMT (use_p, iter)
+   SET_USE (use_p, last_vdef);
+   }
+ gsi = gsi_for_stmt (vphi); 
+ remove_phi_node (, true);
+   }
+
   /* Make stmts member of loop->header and clear range info from all stmts
 in BB which is now no longer executed 

Re: [PATCH] Speed up ix86_expand_builtin

2016-08-18 Thread Uros Bizjak
On Wed, Aug 17, 2016 at 4:17 PM, Jakub Jelinek  wrote:
> On Tue, Aug 16, 2016 at 09:21:57PM +0200, Uros Bizjak wrote:
>> The idea is indeed good, but please leave full names in the *.def
>> file. We can change them later, if need arises.
>
> Ok, here it is.
> i386-builtin.def is basically moved the bdesc_* definitions, except that
>   { ... },
> is replaced with
> BDESC (...)
> and comments reindented.  The first entry in each bdesc_* array
> uses BDESC_FIRST macro instead of BDESC, and there is BDESC_END after the
> last one.  I've managed to do it without gaps in between enumerator values
> for the boundaries, and with the goal that additions of BDESC entries at the
> end of bdesc_* sections will be much more common than adding new bdesc_*
> arrays, so e.g. didn't want a BDESC_LAST macro that would need to be changed
> to BDESC and added for some new entry every time something is appended.
> Also, I've tried to make the real builtin IX86_BUILTIN_* values to always
> precede the IX86_BUILTIN__BDESC_*_{FIRST,LAST} aliases, so that in the
> debugger one can see IX86_BUILTIN_COMIEQSS rather than
> IX86_BUILTIN__BDESC_COMI_FIRST etc.
>
> Full patch is attached xz compressed, included below is the same patch with
> lots of - or + lines replaced with ... where it doesn't contain anything
> interesting for the review.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2016-08-17  Jakub Jelinek  
>
> * config/i386/i386.c (enum ix86_builtins): Remove IX86_BUILTIN_*
> codes that appear in bdesc_* arrays, instead include i386-builtin.def
> twice to define those.
> (bdesc_comi, bdesc_pcmpestr, bdesc_pcmpistr, bdesc_special_args,
> bdesc_args, bdesc_round_args, bdesc_mpx, bdesc_mpx_const,
> bdesc_multi_arg): Define by including i386-builtin.def the third time.
> * config/i386/i386-builtin.def: New file.

OK.

Thanks,
Uros.

> --- gcc/config/i386/i386.c.jj   2016-08-16 21:27:02.0 +0200
> +++ gcc/config/i386/i386.c  2016-08-17 12:43:20.255605518 +0200
> @@ -30661,2415 +30661,41 @@ enum ix86_builtins
>IX86_BUILTIN_READ_FLAGS,
>IX86_BUILTIN_WRITE_FLAGS,
>
> -  /* All the remaining builtins are tracked in bdesc_* arrays.
> ...
> -  IX86_BUILTIN_VPERMIL2PS256,
> -  IX86_BUILTIN__BDESC_MULTI_ARG_LAST = IX86_BUILTIN_VPERMIL2PS256,
> +  /* All the remaining builtins are tracked in bdesc_* arrays in
> + i386-builtin.def.  Don't add any IX86_BUILTIN_* enumerators after
> + this point.  */
> +#define BDESC(mask, icode, name, code, comparison, flag) \
> +  code,
> +#define BDESC_FIRST(kind, kindu, mask, icode, name, code, comparison, flag) \
> +  code,  
>   \
> +  IX86_BUILTIN__BDESC_##kindu##_FIRST = code,
> +#define BDESC_END(kind, next_kind)
> +
> +#include "i386-builtin.def"
> +
> +#undef BDESC
> +#undef BDESC_FIRST
> +#undef BDESC_END
> +
> +  IX86_BUILTIN_MAX,
> +
> +  IX86_BUILTIN__BDESC_MAX_FIRST = IX86_BUILTIN_MAX,
> +
> +  /* Now just the aliases for bdesc_* start/end.  */
> +#define BDESC(mask, icode, name, code, comparison, flag)
> +#define BDESC_FIRST(kind, kindu, mask, icode, name, code, comparison, flag)
> +#define BDESC_END(kind, next_kind) \
> +  IX86_BUILTIN__BDESC_##kind##_LAST\
> += IX86_BUILTIN__BDESC_##next_kind##_FIRST - 1,
> +
> +#include "i386-builtin.def"
> +
> +#undef BDESC
> +#undef BDESC_FIRST
> +#undef BDESC_END
>
> -  IX86_BUILTIN_MAX
> +  /* Just to make sure there is no comma after the last enumerator.  */
> +  IX86_BUILTIN__BDESC_MAX_LAST = IX86_BUILTIN__BDESC_MAX_FIRST
>  };
>
>  /* Table for the ix86 builtin decls.  */
> @@ -33236,2475 +30862,6 @@ struct builtin_description
>const int flag;
>  };
>
> -static const struct builtin_description bdesc_comi[] =
> -{
> -  { OPTION_MASK_ISA_SSE, CODE_FOR_sse_comi, "__builtin_ia32_comieq", 
> IX86_BUILTIN_COMIEQSS, UNEQ, 0 },
> ...
> -
> -/* FMA4 and XOP.  */
>  #define MULTI_ARG_4_DF2_DI_I   V2DF_FTYPE_V2DF_V2DF_V2DI_INT
>  #define MULTI_ARG_4_DF2_DI_I1  V4DF_FTYPE_V4DF_V4DF_V4DI_INT
>  #define MULTI_ARG_4_SF2_SI_I   V4SF_FTYPE_V4SF_V4SF_V4SI_INT
> @@ -35758,199 +30915,20 @@ static const struct builtin_description
>  #define MULTI_ARG_1_QI_SI  V4SI_FTYPE_V16QI
>  #define MULTI_ARG_1_QI_HI  V8HI_FTYPE_V16QI
>
> -static const struct builtin_description bdesc_multi_arg[] =
> -{
> ...
> -  { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vpermil2v8sf3, 
> "__builtin_ia32_vpermil2ps256", IX86_BUILTIN_VPERMIL2PS256, UNKNOWN, 
> (int)MULTI_ARG_4_SF2_SI_I1 },
> -
> +#define BDESC(mask, icode, name, code, comparison, flag) \
> +  { mask, icode, name, code, comparison, flag },
> +#define BDESC_FIRST(kind, kindu, mask, icode, name, code, comparison, flag) \
> +static const struct builtin_description bdesc_##kind[] =   \
> +{

Re: [PATCH] Add a TARGET_GEN_MEMSET_VALUE hook

2016-08-18 Thread Uros Bizjak
On Wed, Aug 17, 2016 at 10:11 PM, H.J. Lu  wrote:
> builtin_memset_gen_str returns a register used for memset, which only
> supports integer registers.  But a target may use vector registers in
> memmset.  This patch adds a TARGET_GEN_MEMSET_VALUE hook to duplicate
> QImode value to mode derived from STORE_MAX_PIECES, which can be used
> with vector instructions.  The default hook is the same as the original
> builtin_memset_gen_str.  A target can override it to support vector
> instructions for STORE_MAX_PIECES.
>
> Tested on x86-64 and i686.  Any comments?

It looks to me you have attached an older version of the patch,
STORE_MAX_PIECES change in i386.h is already in the mainline.

(The patch needs middle-end review first).

Uros.

> H.J.
> ---
> gcc/
>
> * builtins.c (builtin_memset_gen_str): Call targetm.gen_memset_value.
> (default_gen_memset_value): New function.
> * target.def (gen_memset_value): New hook.
> * targhooks.c: Inclue "expmed.h" and "builtins.h".
> (default_gen_memset_value): New function.
> * targhooks.h (default_gen_memset_value): New prototype.
> * config/i386/i386.c (ix86_gen_memset_value): New function.
> (TARGET_GEN_MEMSET_VALUE): New.
> * config/i386/i386.h (STORE_MAX_PIECES): Likewise.
> * doc/tm.texi.in: Add TARGET_GEN_MEMSET_VALUE hook.
> * doc/tm.texi: Updated.
>
> gcc/testsuite/
>
> * gcc.target/i386/pieces-memset-1.c: New test.
> * gcc.target/i386/pieces-memset-2.c: Likewise.
> * gcc.target/i386/pieces-memset-3.c: Likewise.
> * gcc.target/i386/pieces-memset-4.c: Likewise.
> * gcc.target/i386/pieces-memset-5.c: Likewise.
> * gcc.target/i386/pieces-memset-6.c: Likewise.
> * gcc.target/i386/pieces-memset-7.c: Likewise.
> * gcc.target/i386/pieces-memset-8.c: Likewise.
> * gcc.target/i386/pieces-memset-9.c: Likewise.
> * gcc.target/i386/pieces-memset-10.c: Likewise.
> * gcc.target/i386/pieces-memset-11.c: Likewise.
> * gcc.target/i386/pieces-memset-12.c: Likewise.
> * gcc.target/i386/pieces-memset-13.c: Likewise.
> * gcc.target/i386/pieces-memset-14.c: Likewise.
> * gcc.target/i386/pieces-memset-15.c: Likewise.
> * gcc.target/i386/pieces-memset-16.c: Likewise.
> * gcc.target/i386/pieces-memset-17.c: Likewise.
> * gcc.target/i386/pieces-memset-18.c: Likewise.
> * gcc.target/i386/pieces-memset-19.c: Likewise.
> * gcc.target/i386/pieces-memset-20.c: Likewise.
> * gcc.target/i386/pieces-memset-21.c: Likewise.
> * gcc.target/i386/pieces-memset-22.c: Likewise.
> * gcc.target/i386/pieces-memset-23.c: Likewise.
> * gcc.target/i386/pieces-memset-24.c: Likewise.
> * gcc.target/i386/pieces-memset-25.c: Likewise.
> * gcc.target/i386/pieces-memset-26.c: Likewise.
> * gcc.target/i386/pieces-memset-27.c: Likewise.
> * gcc.target/i386/pieces-memset-28.c: Likewise.
> * gcc.target/i386/pieces-memset-29.c: Likewise.
> * gcc.target/i386/pieces-memset-30.c: Likewise.
> * gcc.target/i386/pieces-memset-31.c: Likewise.
> * gcc.target/i386/pieces-memset-32.c: Likewise.
> * gcc.target/i386/pieces-memset-33.c: Likewise.
> * gcc.target/i386/pieces-memset-34.c: Likewise.
> * gcc.target/i386/pieces-memset-35.c: Likewise.
> * gcc.target/i386/pieces-memset-36.c: Likewise.
> * gcc.target/i386/pieces-memset-37.c: Likewise.
> * gcc.target/i386/pieces-memset-38.c: Likewise.
> * gcc.target/i386/pieces-memset-39.c: Likewise.
> * gcc.target/i386/pieces-memset-40.c: Likewise.
> * gcc.target/i386/pieces-memset-41.c: Likewise.
> * gcc.target/i386/pieces-memset-42.c: Likewise.
> * gcc.target/i386/pieces-memset-43.c: Likewise.
> * gcc.target/i386/pieces-memset-44.c: Likewise.
> ---
>  gcc/builtins.c   | 32 
>  gcc/config/i386/i386.c   | 38 
> 
>  gcc/config/i386/i386.h   | 13 
>  gcc/doc/tm.texi  |  7 +
>  gcc/doc/tm.texi.in   |  2 ++
>  gcc/target.def   |  9 ++
>  gcc/targhooks.h  |  1 +
>  gcc/testsuite/gcc.target/i386/pieces-memset-1.c  | 12 
>  gcc/testsuite/gcc.target/i386/pieces-memset-10.c | 12 
>  gcc/testsuite/gcc.target/i386/pieces-memset-11.c | 12 
>  gcc/testsuite/gcc.target/i386/pieces-memset-12.c | 12 
>  gcc/testsuite/gcc.target/i386/pieces-memset-13.c | 12 
>  gcc/testsuite/gcc.target/i386/pieces-memset-14.c | 12 
>  gcc/testsuite/gcc.target/i386/pieces-memset-15.c | 12 
>  gcc/testsuite/gcc.target/i386/pieces-memset-16.c | 12 
>  

Re: [PATCH] PR target/72839: Increase MOVE_RATIO to 17 for Lakemont

2016-08-18 Thread Uros Bizjak
On Wed, Aug 17, 2016 at 8:26 PM, H.J. Lu  wrote:
> Larger MOVE_RATIO will always make code faster.  17 is the number with
> smaller code sizes for Lakemont.
>
> Tested on x86-64.  OK for trunk?

OK, I assume that the patch is based on some benchmark.

Thanks,
Uros.

>
> H.J.
> ---
> gcc/
>
> PR target/72839
> * config/i386/i386.c (lakemont_cost): Set MOVE_RATIO to 17.
>
> gcc/testsuite/
>
> PR target/72839
> * gcc.target/i386/pr72839.c: New test.
> ---
>  gcc/config/i386/i386.c  |  2 +-
>  gcc/testsuite/gcc.target/i386/pr72839.c | 17 +
>  2 files changed, 18 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr72839.c
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 3805817..8fe3821 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -426,7 +426,7 @@ struct processor_costs lakemont_cost = {
>COSTS_N_INSNS (3),   /* cost of movsx */
>COSTS_N_INSNS (2),   /* cost of movzx */
>8,   /* "large" insn */
> -  9,   /* MOVE_RATIO */
> +  17,  /* MOVE_RATIO */
>6,/* cost for loading QImode using movzbl 
> */
>{2, 4, 2},   /* cost of loading integer registers
>in QImode, HImode and SImode.
> diff --git a/gcc/testsuite/gcc.target/i386/pr72839.c 
> b/gcc/testsuite/gcc.target/i386/pr72839.c
> new file mode 100644
> index 000..ea724f7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr72839.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target ia32 } */
> +/* { dg-options "-O2 -mtune=lakemont" } */
> +
> +extern char *strcpy (char *, const char *);
> +
> +void
> +foo (char *s)
> +{
> +  strcpy (s,
> + "12345678123456781234567812345678123456781234567812345678"
> + "1234567");
> +}
> +
> +/* { dg-final { scan-assembler-times "movl\[ \\t\]+\\$\[0-9\]+, 
> \[0-9\]*\\(%\[^,\]+\\)" 16 } } */
> +/* { dg-final { scan-assembler-not "rep movsl" } } */
> +/* { dg-final { scan-assembler-not "rep movsb" } } */
> --
> 2.7.4
>


Re: Fwd: [PATCH, doc/ARM] Remove false affirmation that Thumb cannot use an FPU

2016-08-18 Thread Sandra Loosemore

On 08/11/2016 04:31 AM, Thomas Preudhomme wrote:


diff --git a/gcc/doc/fragments.texi b/gcc/doc/fragments.texi
index 
b6d8541c8ca820fa732363a05221e2cd4d1251c2..abf4e128671bb4751d21f24bb69625593d3c839e
 100644
--- a/gcc/doc/fragments.texi
+++ b/gcc/doc/fragments.texi
@@ -117,12 +117,15 @@ specified, there are combinations that should not be 
built.  In that
 case, set @code{MULTILIB_EXCEPTIONS} to be all of the switch exceptions
 in shell case syntax that should not be built.

-For example the ARM processor cannot execute both hardware floating
-point instructions and the reduced size THUMB instructions at the same
-time, so there is no need to build libraries with both of these
-options enabled.  Therefore @code{MULTILIB_EXCEPTIONS} is set to:
+For example on ARM targets @option{-mfloat-abi=soft} requests to use a
+softfloat implementation for floating-point operations.  Therefore, it
+does not make sense to find both @option{-mfloat-abi=soft} and an
+@option{mfpu} option on the command line so @code{MULTILIB_EXCEPTIONS}
+could contain the following exception (assuming that
+@option{-mfloat-abi} comes after in @code{MULTILIB_OPTIONS} and given
+that @option{-mfloat-abi=soft} is the default value):
 @smallexample
-*mthumb/*mhard-float*
+*mfpu=*
 @end smallexample

 @findex MULTILIB_REQUIRED


This version still has a lot of copy-editing issues.  I suggest 
rewriting as:


  For example, on ARM targets @option{-mfloat-abi=soft} requests use of
  software floating-point operations, so it
  does not make sense to build libraries with both
  @option{-mfloat-abi=soft} and an @option{-mfpu} option.
  @code{MULTILIB_EXCEPTIONS} could contain the following exception

but here I get stuck in suggesting a rewrite, because I can't parse this 
part at all to figure out what you're trying to say:


  (assuming that
  @option{-mfloat-abi} comes after in @code{MULTILIB_OPTIONS} and given
  that @option{-mfloat-abi=soft} is the default value):

"comes after in"?  Comes after what?  If the order is important here, 
the documentation should explain why instead of just "assuming" things 
about it.


-Sandra the confused