Re: [PATCH] PR fortran/91513 -- Fix poorly worded error message

2019-10-11 Thread Steve Kargl
On Fri, Oct 11, 2019 at 09:56:15PM -0700, Steve Kargl wrote:
> In PR fortran/91513, Damian Rouson points out that the Fortran
> does contain the words "impure variable".

Geez. That is a messed up sentence.

In PR fortran/91513, Damian Rouson points out that the Fortran
*standard* does *not* contain the words "impure variable.

(It's late, and I'm tired!)

>  Damian and I spent
> an afternoon together a few weeks ago where I gave Damian a
> world wind view of how I see gcc/fortran and go about debugging
> problems.  Damian and I both have full schedules, but I was
> hoping he would submit the patch to the list.  My time of 
> fixing gfortran bugs is running out, so ...
> 
> OK to commit?
> 
> 2019-09-29  Damian Rouson  
> 
>   PR fortran/91513
>   * resolve.c (resolve_ordinary_assign): Improved error message.
> 
> 2019-09-29  Damian Rouson  
> 
>   PR fortran/91513
>   * gfortran.dg/impure_assignment_2.f90: Update dg-error regex.
> 
> -- 
> Steve

> Index: gcc/fortran/resolve.c
> ===
> --- gcc/fortran/resolve.c (revision 276269)
> +++ gcc/fortran/resolve.c (working copy)
> @@ -10774,9 +10774,12 @@ resolve_ordinary_assign (gfc_code *code, 
> gfc_namespace
>   "component in a PURE procedure",
>   >where);
> else
> - gfc_error ("The impure variable at %L is assigned to "
> - "a derived type variable with a POINTER "
> - "component in a PURE procedure (12.6)",
> +   /* F2008, C1283 (4).  */
> + gfc_error ("In a pure subprogram an INTENT(IN) dummy argument "
> + "shall not be used as the expr at %L of an intrinsic "
> + "assignment statement in which the variable is of a "
> + "derived type if the derived type has a pointer "
> + "component at any level of component selection.",
>   >where);
> return rval;
>   }
> Index: gcc/testsuite/gfortran.dg/impure_assignment_2.f90
> ===
> --- gcc/testsuite/gfortran.dg/impure_assignment_2.f90 (revision 276269)
> +++ gcc/testsuite/gfortran.dg/impure_assignment_2.f90 (working copy)
> @@ -40,7 +40,7 @@ CONTAINS
>PURE FUNCTION give_next3(node)
>   TYPE(node_type), intent(in) :: node
>   TYPE(node_type) :: give_next
> - give_next = node ! { dg-error "impure variable" }
> + give_next = node ! { dg-error "pure subprogram" }
>END FUNCTION
>  END MODULE pr20863
>  


-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow


[PATCH] PR fortran/91513 -- Fix poorly worded error message

2019-10-11 Thread Steve Kargl
In PR fortran/91513, Damian Rouson points out that the Fortran
does contain the words "impure variable".  Damian and I spent
an afternoon together a few weeks ago where I gave Damian a
world wind view of how I see gcc/fortran and go about debugging
problems.  Damian and I both have full schedules, but I was
hoping he would submit the patch to the list.  My time of 
fixing gfortran bugs is running out, so ...

OK to commit?

2019-09-29  Damian Rouson  

PR fortran/91513
* resolve.c (resolve_ordinary_assign): Improved error message.

2019-09-29  Damian Rouson  

PR fortran/91513
* gfortran.dg/impure_assignment_2.f90: Update dg-error regex.

-- 
Steve
Index: gcc/fortran/resolve.c
===
--- gcc/fortran/resolve.c	(revision 276269)
+++ gcc/fortran/resolve.c	(working copy)
@@ -10774,9 +10774,12 @@ resolve_ordinary_assign (gfc_code *code, gfc_namespace
 			"component in a PURE procedure",
 			>where);
 	  else
-	gfc_error ("The impure variable at %L is assigned to "
-			"a derived type variable with a POINTER "
-			"component in a PURE procedure (12.6)",
+	  /* F2008, C1283 (4).  */
+	gfc_error ("In a pure subprogram an INTENT(IN) dummy argument "
+			"shall not be used as the expr at %L of an intrinsic "
+			"assignment statement in which the variable is of a "
+			"derived type if the derived type has a pointer "
+			"component at any level of component selection.",
 			>where);
 	  return rval;
 	}
Index: gcc/testsuite/gfortran.dg/impure_assignment_2.f90
===
--- gcc/testsuite/gfortran.dg/impure_assignment_2.f90	(revision 276269)
+++ gcc/testsuite/gfortran.dg/impure_assignment_2.f90	(working copy)
@@ -40,7 +40,7 @@ CONTAINS
   PURE FUNCTION give_next3(node)
  TYPE(node_type), intent(in) :: node
  TYPE(node_type) :: give_next
- give_next = node ! { dg-error "impure variable" }
+ give_next = node ! { dg-error "pure subprogram" }
   END FUNCTION
 END MODULE pr20863
 


[PATCH] rs6000: -flto forgets 'no-vsx' function attributes (PR target/70010)

2019-10-11 Thread Jiufu Guo
Hi,

As expected in the PR, when a function is marked with no-vsx, we would
assume user has good reason to disable VSX code generation for the function.  
To avoid VSX code generation, this function should not be inlined into VSX
function. 

In previous implementation, function with non-vsx is treated as with subset
features of target with vsx, even programer set no-vsx attribute.  Following
patch prevents no-vsx function to be inlined into vsx function.

Bootstrapped/regtested on ppc64le-linux, ok for trunk?

Thanks,
Jiufu Guo

[gcc]
2019-10-12  Jiufu Guo  

PR target/70010
* config/rs6000/rs6000.c (rs6000_can_inline_p): Check 'vsx' feature
for caller and callee

[gcc/testsuite]
2019-10-12  Jiufu Guo  

PR target/70010
* gcc.target/powerpc/inline_error.c: New test

---
 gcc/config/rs6000/rs6000.c  | 14 +-
 gcc/testsuite/gcc.target/powerpc/inline_error.c | 17 +
 2 files changed, 26 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/inline_error.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index d1434a9..c38dc87 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -23978,11 +23978,15 @@ rs6000_can_inline_p (tree caller, tree callee)
   struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree);
   struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
 
-  /* Callee's options should a subset of the caller's, i.e. a vsx function
-can inline an altivec function but a non-vsx function can't inline a
-vsx function.  */
-  if ((caller_opts->x_rs6000_isa_flags & callee_opts->x_rs6000_isa_flags)
- == callee_opts->x_rs6000_isa_flags)
+  /* Callee's options should a subset of the caller's. In addition,
+vsx function should not inline function with non-vsx by which
+programmer may intend to disable VSX code generation.  */
+  if ((caller_opts->x_rs6000_isa_flags & OPTION_MASK_VSX)
+ && (callee_opts->x_rs6000_isa_flags & OPTION_MASK_VSX) == 0)
+   ret = false;
+  else if ((caller_opts->x_rs6000_isa_flags
+   & callee_opts->x_rs6000_isa_flags)
+  == callee_opts->x_rs6000_isa_flags)
ret = true;
 }
 
diff --git a/gcc/testsuite/gcc.target/powerpc/inline_error.c 
b/gcc/testsuite/gcc.target/powerpc/inline_error.c
new file mode 100644
index 000..78870db
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/inline_error.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -flto -mvsx" } */
+
+vector int c, a, b;
+
+static inline void __attribute__ ((__always_inline__, target ("no-vsx")))
+foo () /* { dg-error "inlining failed in call to .* target specific option 
mismatch" } */
+{
+  c = a + b;
+}
+
+int
+main ()
+{
+  foo (); /* { dg-message "called from here" } */
+  c = a + b;
+}
-- 
2.7.4



Go patch commited: Mangle dots in pkgpath

2019-10-11 Thread Ian Lance Taylor
The Go frontend needs to mangle dots to avoid problems with
-fgo-pkgpath=a.0.  That will confuse the name mangling, which assumes
that names entering the mangling cannot contain arbitrary dot
characters.  We don't need to mangle other characters; go_encode_id
will handle them.  This patch fixes the problem.  It fixes
https://golang.org/issue/33871.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 276594)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-ddfb845fad1f2e8b84383f262ed5ea5be7b3e35a
+f174fdad69cad42309984dfa108d80f2ae8a9f78
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/go-encode-id.cc
===
--- gcc/go/gofrontend/go-encode-id.cc   (revision 276382)
+++ gcc/go/gofrontend/go-encode-id.cc   (working copy)
@@ -253,3 +253,16 @@ go_mangle_struct_tag(const std::string&
 }
   return ret;
 }
+
+// Encode a package path.
+
+std::string
+go_mangle_pkgpath(const std::string& pkgpath)
+{
+  std::string s = pkgpath;
+  for (size_t i = s.find('.');
+   i != std::string::npos;
+   i = s.find('.', i + 1))
+s.replace(i, 1, ".x2e"); // 0x2e is the ASCII encoding for '.'
+  return s;
+}
Index: gcc/go/gofrontend/go-encode-id.h
===
--- gcc/go/gofrontend/go-encode-id.h(revision 276382)
+++ gcc/go/gofrontend/go-encode-id.h(working copy)
@@ -34,4 +34,12 @@ go_selectively_encode_id(const std::stri
 extern std::string
 go_mangle_struct_tag(const std::string& tag);
 
+// Encode a package path.  A package path can contain any arbitrary
+// character, including '.'.  go_encode_id expects that any '.' will
+// be inserted by name mangling in a controlled manner.  So first
+// translate any '.' using the same .x encoding as used by
+// go_mangle_struct_tag.
+extern std::string
+go_mangle_pkgpath(const std::string& pkgpath);
+
 #endif // !defined(GO_ENCODE_ID_H)
Index: gcc/go/gofrontend/gogo.cc
===
--- gcc/go/gofrontend/gogo.cc   (revision 276579)
+++ gcc/go/gofrontend/gogo.cc   (working copy)
@@ -298,7 +298,7 @@ void
 Gogo::set_pkgpath(const std::string& arg)
 {
   go_assert(!this->pkgpath_set_);
-  this->pkgpath_ = arg;
+  this->pkgpath_ = go_mangle_pkgpath(arg);
   this->pkgpath_set_ = true;
   this->pkgpath_from_option_ = true;
 }
@@ -396,7 +396,8 @@ Gogo::set_package_name(const std::string
{
  if (!this->prefix_from_option_)
this->prefix_ = "go";
- this->pkgpath_ = this->prefix_ + '.' + package_name;
+ this->pkgpath_ = (go_mangle_pkgpath(this->prefix_) + '.'
+   + package_name);
  this->pkgpath_symbol_ = (Gogo::pkgpath_for_symbol(this->prefix_) + '.'
   + Gogo::pkgpath_for_symbol(package_name));
}


Re: [PATCH][ARM] Switch to default sched pressure algorithm

2019-10-11 Thread Ramana Radhakrishnan
On Fri, Oct 11, 2019 at 10:42 PM Wilco Dijkstra  wrote:
>
> Hi,
>
>  > the defaults for v7-a are still to use the
>  > Cortex-A8 scheduler
>
> I missed that part, but that's a serious bug btw - Cortex-A8 is 15 years old 
> now so
> way beyond obsolete. Even Cortex-A53 is ancient now, but it has an accurate 
> scheduler
> that performs surprisingly well on both in-order and out-of-order 
> implementations.

On armv8-a we do use cortex-a53 as the default scheduler in the AArch32 backend.

regards
Ramana

>
> Cheers,
> Wilco


Re: [PATCH][ARM] Switch to default sched pressure algorithm

2019-10-11 Thread Ramana Radhakrishnan
On Fri, Oct 11, 2019 at 6:19 PM Wilco Dijkstra  wrote:
>
> Hi Ramana,
>
> > Can you see what happens with the Cortex-A8 or Cortex-A9 schedulers to
> > spread the range across some v7-a CPUs as well ? While they aren't that 
> > popular today I
> > would suggest you look at them because the defaults for v7-a are still to 
> > use the
> > Cortex-A8 scheduler and the Cortex-A9 scheduler might well also get used in 
> > places given
> > the availability of hardware.
>
> The results are practically identical to Cortex-A53 and A57 - there is a huge 
> codesize win
> across the board on SPEC2006, there isn't a single benchmark that is larger 
> (ie. more
> spilling).
>
> > I'd be happy to move this forward if you could show if there is no 
> > *increase* in spills
> > for the same range of benchmarks that you are doing for the Cortex-A8 and 
> > Cortex-A9
> > schedulers.
>
> There certainly isn't. I don't think results like these could be any more 
> one-sided, it's a
> significant win for every single benchmark, both for codesize and performance!
>

Ok go ahead - please be sensitive to testsuite regressions.

Ramana


> What isn't clear is whether something has gone horribly wrong in the 
> scheduler which
> could be fixed/reverted, but as it is right now I can't see it being useful 
> at all. This means
> we should also reevaluate whether pressure scheduling now hurts AArch64 too.
>
> Cheers,
> Wilco


[PATCH] PR fortran/90297 -- Remove code with no functional effect

2019-10-11 Thread Steve Kargl
The patch is fairly self-explanatory.  OK to commit?

2019-10-11  Steven G. Kargl  

PR fortran/90297
* resolve.c (resolve_typebound_function): Remove code with no
functional effect.

-- 
Steve
Index: gcc/fortran/resolve.c
===
--- gcc/fortran/resolve.c	(revision 276899)
+++ gcc/fortran/resolve.c	(working copy)
@@ -6548,21 +6548,6 @@ resolve_typebound_function (gfc_expr* e)
   overridable = !e->value.compcall.tbp->non_overridable;
   if (expr && expr->ts.type == BT_CLASS && e->value.compcall.name)
 {
-  /* If the base_object is not a variable, the corresponding actual
-	 argument expression must be stored in e->base_expression so
-	 that the corresponding tree temporary can be used as the base
-	 object in gfc_conv_procedure_call.  */
-  if (expr->expr_type != EXPR_VARIABLE)
-	{
-	  gfc_actual_arglist *args;
-
-	  for (args= e->value.function.actual; args; args = args->next)
-	{
-	  if (expr == args->expr)
-		expr = args->expr;
-	}
-	}
-
   /* Since the typebound operators are generic, we have to ensure
 	 that any delays in resolution are corrected and that the vtab
 	 is present.  */


Re: [C++ Patch] Remove most uses of in_system_header_at

2019-10-11 Thread Jakub Jelinek
On Fri, Oct 11, 2019 at 11:48:27PM +0200, Paolo Carlini wrote:
> --- decl.c(revision 276903)
> +++ decl.c(working copy)
> @@ -9328,7 +9328,6 @@ grokfndecl (tree ctype,
>   }
> /* 17.6.3.3.5  */
> if (suffix[0] != '_'
> -   && !in_system_header_at (location)
> && !current_function_decl && !(friendp && !funcdef_flag))
>   warning_at (location, OPT_Wliteral_suffix,
>   "literal operator suffixes not preceded by %<_%>"

This should make no functional difference.

> @@ -10036,8 +10035,6 @@ compute_array_index_type_loc (location_t name_loc,
>  indicated by the state of complain), so that
>  another substitution can be found.  */
>   return error_mark_node;
> -   else if (in_system_header_at (input_location))
> - /* Allow them in system headers because glibc uses them.  */;
> else if (name)
>   pedwarn (loc, OPT_Wpedantic,
>"ISO C++ forbids zero-size array %qD", name);

But this one is unclear, in_system_header_at is testing a different location
from what will be used by pedwarn, so if input_location is in system header
and loc is not, it didn't pedwarn before and now it will.
Similarly various other spots in the patch.  I haven't tried to check in
detail what exactly we want in each case, all I want to say is that some
cases are obvious, other cases are not.

Jakub


Support decimal floating-point constants in C2x

2019-10-11 Thread Joseph Myers
ISO C2x adds decimal floating point as an optional standard feature.
This patch accordingly makes GCC accept DFP constants (DF, DD, DL, df,
dd, dl suffixes) in strict C2X mode, with a pedwarn-if-pedantic for
older standards and a warning with -Wc11-c2x-compat even in C2x mode
(which in turn requires -Wc11-c2x-compat to be newly passed through to
libcpp).

Bootstrapped with no regressions on x86_64-pc-linux-gnu.  Applied to 
mainline.

gcc/c-family:
2019-10-11  Joseph Myers  

* c.opt (Wc11-c2x-compat): Add CPP(cpp_warn_c11_c2x_compat)
CppReason(CPP_W_C11_C2X_COMPAT).

gcc/testsuite:
2019-10-11  Joseph Myers  

* gcc.dg/dfp/c11-constants-1.c, gcc.dg/dfp/c11-constants-2.c,
gcc.dg/dfp/c2x-constants-1.c, gcc.dg/dfp/c2x-constants-2.c: New
tests.
* gcc.dg/dfp/constants-pedantic.c: Use -std=gnu17 explicitly.
Update expected diagnostics.

libcpp:
2019-10-11  Joseph Myers  

* include/cpplib.h (struct cpp_options): Add dfp_constants and
cpp_warn_c11_c2x_compat.
(enum cpp_warning_reason): Add CPP_W_C11_C2X_COMPAT.
* init.c (struct lang_flags): Add dfp_constants.
(lang_defaults): Set dfp_constants to 1 for GNUC2X and STDC2X and
0 for other languages.
(cpp_set_lang): Set dfp_constants from language.
(cpp_create_reader): Set cpp_warn_c11_c2x_compat to -1.
* expr.c (interpret_float_suffix): Mention DFP constants as C2X in
comment.
(cpp_classify_number): Do not diagnose DFP constants for languages
setting dfp_constants, unless cpp_warn_c11_c2x_compat.

Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt  (revision 276896)
+++ gcc/c-family/c.opt  (working copy)
@@ -367,7 +367,7 @@ C ObjC C++ ObjC++ CPP(warn_builtin_macro_redefined
 Warn when a built-in preprocessor macro is undefined or redefined.
 
 Wc11-c2x-compat
-C ObjC Var(warn_c11_c2x_compat) Init(-1) Warning
+C ObjC CPP(cpp_warn_c11_c2x_compat) CppReason(CPP_W_C11_C2X_COMPAT) 
Var(warn_c11_c2x_compat) Init(-1) Warning
 Warn about features not present in ISO C11, but present in ISO C2X.
 
 Wc90-c99-compat
Index: gcc/testsuite/gcc.dg/dfp/c11-constants-1.c
===
--- gcc/testsuite/gcc.dg/dfp/c11-constants-1.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/dfp/c11-constants-1.c  (working copy)
@@ -0,0 +1,13 @@
+/* Test that DFP constants are diagnosed in C11 mode: -pedantic.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c11 -pedantic" } */
+
+int a = (int) 1.1DF; /* { dg-warning "C2X feature" } */
+int b = (int) 2.df; /* { dg-warning "C2X feature" } */
+int c = (int) .33DD; /* { dg-warning "C2X feature" } */
+int d = (int) 2e1dd; /* { dg-warning "C2X feature" } */
+int e = (int) .3e2DL; /* { dg-warning "C2X feature" } */
+int f = (int) 4.5e3dl; /* { dg-warning "C2X feature" } */
+int g = (int) 5.e0DF; /* { dg-warning "C2X feature" } */
+int h = (int) 1e+2df; /* { dg-warning "C2X feature" } */
+int i = (int) 1000e-3DL; /* { dg-warning "C2X feature" } */
Index: gcc/testsuite/gcc.dg/dfp/c11-constants-2.c
===
--- gcc/testsuite/gcc.dg/dfp/c11-constants-2.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/dfp/c11-constants-2.c  (working copy)
@@ -0,0 +1,13 @@
+/* Test that DFP constants are diagnosed in C11 mode: -pedantic-errors.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
+
+int a = (int) 1.1DF; /* { dg-error "C2X feature" } */
+int b = (int) 2.df; /* { dg-error "C2X feature" } */
+int c = (int) .33DD; /* { dg-error "C2X feature" } */
+int d = (int) 2e1dd; /* { dg-error "C2X feature" } */
+int e = (int) .3e2DL; /* { dg-error "C2X feature" } */
+int f = (int) 4.5e3dl; /* { dg-error "C2X feature" } */
+int g = (int) 5.e0DF; /* { dg-error "C2X feature" } */
+int h = (int) 1e+2df; /* { dg-error "C2X feature" } */
+int i = (int) 1000e-3DL; /* { dg-error "C2X feature" } */
Index: gcc/testsuite/gcc.dg/dfp/c2x-constants-1.c
===
--- gcc/testsuite/gcc.dg/dfp/c2x-constants-1.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/dfp/c2x-constants-1.c  (working copy)
@@ -0,0 +1,13 @@
+/* Test that DFP constants are accepted in C2X mode.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x -pedantic-errors" } */
+
+int a = (int) 1.1DF;
+int b = (int) 2.df;
+int c = (int) .33DD;
+int d = (int) 2e1dd;
+int e = (int) .3e2DL;
+int f = (int) 4.5e3dl;
+int g = (int) 5.e0DF;
+int h = (int) 1e+2df;
+int i = (int) 1000e-3DL;
Index: gcc/testsuite/gcc.dg/dfp/c2x-constants-2.c
===
--- gcc/testsuite/gcc.dg/dfp/c2x-constants-2.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/dfp/c2x-constants-2.c  (working copy)
@@ -0,0 +1,13 @@
+/* Test that DFP constants are accepted in C2X mode: compat warnings.  */
+/* { dg-do compile } */

Re: [C++ Patch] One more DECL_SOURCE_LOCATION

2019-10-11 Thread Jason Merrill

On 10/11/19 6:58 AM, Paolo Carlini wrote:

Hi,

another straightforward DECL_SOURCE_LOCATION (TYPE_MAIN_DECL consistent 
with the one I used in build_anon_union_vars. Tested x86_64-linux.


Thanks, Paolo.

/

OK.

Jason



Re: [PATCH] V5, #6 of 15: Make vector load/store instruction length correct with prefixed addresses

2019-10-11 Thread Segher Boessenkool
Hi!

On Wed, Oct 09, 2019 at 04:26:20PM -0400, Michael Meissner wrote:
> --- gcc/config/rs6000/vsx.md  (revision 276713)
> +++ gcc/config/rs6000/vsx.md  (working copy)
> @@ -1149,10 +1149,14 @@ (define_insn "vsx_mov_64bit"
> "vecstore,  vecload,   vecsimple, mffgpr,mftgpr,load,
>  store, load,  store, *, vecsimple, 
> vecsimple,
>  vecsimple, *, *, vecstore,  vecload")
> -   (set_attr "length"
> +   (set_attr "non_prefixed_length"
> "*, *, *, 8, *, 8,
>  8, 8, 8, 8, *, *,
>  *, 20,8, *, *")
> +   (set_attr "prefixed_length"
> +   "*, *, *, 8, *, 20,
> +20,20,20,8, *, *,
> +*, 20,8, *, *")

Alternative 13 has non_prefixed_length 20, I wonder what insns that
generates?

Other than that, looks good afaics.


Segher


[C++ Patch] Remove most uses of in_system_header_at

2019-10-11 Thread Paolo Carlini

Hi,

as promised, this removes most uses of the predicate. I left alone the 
two occurrences in decl.c which guard permerrors, not warnings, and two 
more in parser.c which should be a (minor) optimization (particularly 
that in cp_parser_cast_expression).


Tested x86_64-linux.

Thanks, Paolo.

///

2019-10-11  Paolo Carlini  

* decl.c (grokfndecl): Remove redundant use of in_system_header_at.
(compute_array_index_type_loc): Likewise.
(grokdeclarator): Likewise.
* error.c (cp_printer): Likewise.
* lambda.c (add_default_capture): Likewise.
* parser.c (cp_parser_primary_expression): Likewise.
(cp_parser_selection_statement): Likewise.
(cp_parser_toplevel_declaration): Likewise.
(cp_parser_enumerator_list): Likewise.
(cp_parser_using_declaration): Likewise.
(cp_parser_member_declaration): Likewise.
(cp_parser_exception_specification_opt): Likewise.
(cp_parser_std_attribute_spec): Likewise.
* pt.c (do_decl_instantiation): Likewise.
(do_type_instantiation): Likewise.
* typeck.c (cp_build_unary_op): Likewise.
Index: decl.c
===
--- decl.c  (revision 276903)
+++ decl.c  (working copy)
@@ -9328,7 +9328,6 @@ grokfndecl (tree ctype,
}
  /* 17.6.3.3.5  */
  if (suffix[0] != '_'
- && !in_system_header_at (location)
  && !current_function_decl && !(friendp && !funcdef_flag))
warning_at (location, OPT_Wliteral_suffix,
"literal operator suffixes not preceded by %<_%>"
@@ -10036,8 +10035,6 @@ compute_array_index_type_loc (location_t name_loc,
   indicated by the state of complain), so that
   another substitution can be found.  */
return error_mark_node;
- else if (in_system_header_at (input_location))
-   /* Allow them in system headers because glibc uses them.  */;
  else if (name)
pedwarn (loc, OPT_Wpedantic,
 "ISO C++ forbids zero-size array %qD", name);
@@ -11037,7 +11034,7 @@ grokdeclarator (const cp_declarator *declarator,
}
   /* Don't pedwarn if the alternate "__intN__" form has been used instead
 of "__intN".  */
-  else if (!int_n_alt && pedantic && ! in_system_header_at 
(input_location))
+  else if (!int_n_alt && pedantic)
pedwarn (declspecs->locations[ds_type_spec], OPT_Wpedantic,
 "ISO C++ does not support %<__int%d%> for %qs",
 int_n_data[declspecs->int_n_idx].bitsize, name);
@@ -12695,10 +12692,7 @@ grokdeclarator (const cp_declarator *declarator,
else
  {
/* Array is a flexible member.  */
-   if (in_system_header_at (input_location))
- /* Do not warn on flexible array members in system
-headers because glibc uses them.  */;
-   else if (name)
+   if (name)
  pedwarn (id_loc, OPT_Wpedantic,
   "ISO C++ forbids flexible array member %qs", name);
else
Index: error.c
===
--- error.c (revision 276903)
+++ error.c (working copy)
@@ -4317,10 +4317,7 @@ cp_printer (pretty_printer *pp, text_info *text, c
 void
 maybe_warn_cpp0x (cpp0x_warn_str str)
 {
-  if ((cxx_dialect == cxx98) && !in_system_header_at (input_location))
-/* We really want to suppress this warning in system headers,
-   because libstdc++ uses variadic templates even when we aren't
-   in C++0x mode. */
+  if (cxx_dialect == cxx98)
 switch (str)
   {
   case CPP0X_INITIALIZER_LISTS:
Index: lambda.c
===
--- lambda.c(revision 276903)
+++ lambda.c(working copy)
@@ -697,8 +697,7 @@ add_default_capture (tree lambda_stack, tree id, t
   /* Warn about deprecated implicit capture of this via [=].  */
   if (cxx_dialect >= cxx2a
  && this_capture_p
- && LAMBDA_EXPR_DEFAULT_CAPTURE_MODE (lambda) == CPLD_COPY
- && !in_system_header_at (LAMBDA_EXPR_LOCATION (lambda)))
+ && LAMBDA_EXPR_DEFAULT_CAPTURE_MODE (lambda) == CPLD_COPY)
{
  if (warning_at (LAMBDA_EXPR_LOCATION (lambda), OPT_Wdeprecated,
  "implicit capture of %qE via %<[=]%> is deprecated "
Index: parser.c
===
--- parser.c(revision 276903)
+++ parser.c(working copy)
@@ -5364,8 +5364,7 @@ cp_parser_primary_expression (cp_parser *parser,
  {
expr = cp_parser_fold_expression (parser, expr);
if (expr != error_mark_node
-   && cxx_dialect < cxx17
-   && !in_system_header_at (input_location))
+   

Re: [PATCH] V5, #5 of 15: Support -fstack-protect and large stack frames

2019-10-11 Thread Segher Boessenkool
On Wed, Oct 09, 2019 at 04:22:06PM -0400, Michael Meissner wrote:
> This patch allows -fstack-protect to work with large stack frames.

I don't understand; please explain.

What I see is a workaround for not properly handling prefixed addresses
in the stack protect code (by forcing the addresses to not be prefixed at
expand time).

> +rtx
> +make_memory_non_prefixed (rtx mem)
> +{
> +  gcc_assert (MEM_P (mem));
> +
> +  rtx old_addr = XEXP (mem, 0);
> +  if (address_is_prefixed (old_addr, GET_MODE (mem), NON_PREFIXED_DEFAULT))
> +{
> +  rtx new_addr;
> +
> +  if (GET_CODE (old_addr) == PLUS
> +   && (REG_P (XEXP (old_addr, 0)) || SUBREG_P (XEXP (old_addr, 0)))

How can you ever have a subreg in an address?  One in Pmode even?

> +   && CONST_INT_P (XEXP (old_addr, 1)))
> + {
> +   rtx tmp_reg = force_reg (Pmode, XEXP (old_addr, 1));
> +   new_addr = gen_rtx_PLUS (Pmode, XEXP (old_addr, 0), tmp_reg);
> + }
> +  else
> + new_addr = force_reg (Pmode, old_addr);
> +
> +  mem = change_address (mem, VOIDmode, new_addr);

replace_equiv_address ?

> +(define_expand "stack_protect_setdi"
> +  [(parallel [(set (match_operand:DI 0 "memory_operand")
> +(unspec:DI [(match_operand:DI 1 "memory_operand")]
> +UNSPEC_SP_SET))
> +   (set (match_scratch:DI 2)
> +(const_int 0))])]
> +  "TARGET_64BIT"
> +{
> +  if (TARGET_PREFIXED_ADDR)
> +{
> +  operands[0] = make_memory_non_prefixed (operands[0]);
> +  operands[1] = make_memory_non_prefixed (operands[1]);
> +}
> +})

It shouldn't be terribly hard to make the define_insns just *work* with
prefixed insns, instead?  Is there any reason we are sure these memory
references will not turn into something that needs prefixed insns, after
expand?  It seems fragile like this.

> +@item em
> +A memory operand that does not contain a prefixed address.

"A memory operand that does not require a prefixed instruction"?


Segher


Re: [PATCH][ARM] Switch to default sched pressure algorithm

2019-10-11 Thread Wilco Dijkstra
Hi,

 > the defaults for v7-a are still to use the
 > Cortex-A8 scheduler

I missed that part, but that's a serious bug btw - Cortex-A8 is 15 years old 
now so
way beyond obsolete. Even Cortex-A53 is ancient now, but it has an accurate 
scheduler
that performs surprisingly well on both in-order and out-of-order 
implementations.

Cheers,
Wilco

Re: [PATCH] V5, #4 of 15: Update predicates

2019-10-11 Thread Segher Boessenkool
On Wed, Oct 09, 2019 at 04:07:50PM -0400, Michael Meissner wrote:
> It also adds two new predicates that disallow prefixed memory that will used 
> in
> patches #5 and #7.

Then you should have really introduced them in *those* patches.

> 2019-10-08  Michael Meissner  
> 
>   * config/rs6000/predicates.md (lwa_operand): Allow using PLWA to
>   generate sign extend with odd offsets.

I don't understand what this means.  "Odd offsets" isn't correct, in any
case?

> +  /* The LWA instruction uses the DS-form format where the bottom two bits of
> + the offset must be 0.  The prefixed PLWA does not have this
> + restriction.  */
> +  if (address_is_prefixed (addr, DImode, NON_PREFIXED_DS))
> +return true;

DImode?

> +;; Return 1 if op is a memory operand that is not prefixed.
> +(define_predicate "non_prefixed_memory"
> +  (match_code "mem")
> +{
> +  if (!memory_operand (op, mode))
> +return false;
> +
> +  return !address_is_prefixed (XEXP (op, 0), mode, NON_PREFIXED_DEFAULT);
> +})

This one is fine.

> +;; Return 1 if op is either a register operand or a memory operand that does
> +;; not use a prefixed address.
> +(define_predicate "reg_or_non_prefixed_memory"
> +  (match_code "reg,subreg,mem")
> +{
> +  return (gpc_reg_operand (op, mode) || non_prefixed_memory (op, mode));
> +})

This never allows subreg.


Segher


Re: [PATCH] V5, #3 of 15: Deal with prefixed instructions in rs6000_insn_cost

2019-10-11 Thread Segher Boessenkool
Hi!

On Wed, Oct 09, 2019 at 04:03:16PM -0400, Michael Meissner wrote:
> The basic problem is if there is no explicit cost predicate, rs6000_insn_cost
> uses the instruction size to figure out how many instructions are present, and
> make the cost a fact on that.  Since prefixed instructions are 12 bytes within
> GCC (to deal with the implicit NOP), if we did not do this change, the
> optimizers would try to save registers from prefixed loads because they 
> thought
> the load was more expensive.

Maybe we should just have an attribute that says how many insns this is?
You can get rid of many prefixed_length and non_prefixed_length attributes
that way, too.

> +  int cost;
> +  int length = get_attr_length (insn);
> +  int n = length / 4;
> +
> +  /* How many real instructions are generated for this insn?  This is 
> slightly

What is a "real" instruction?  Machine instruction?

> + different from the length attribute, in that the length attribute counts
> + the number of bytes.  With prefixed instructions, we don't want to 
> count a
> + prefixed instruction (length 12 bytes including possible NOP) as taking 
> 3
> + instructions, but just one.  */
> +  if (length >= 12 && get_attr_prefixed (insn) == PREFIXED_YES)
> +{
> +  /* Single prefixed instruction.  */
> +  if (length == 12)
> + n = 1;
> +
> +  /* A normal instruction and a prefixed instruction (16) or two back
> +  to back prefixed instructions (20).  */
> +  else if (length == 16 || length == 20)
> + n = 2;
> +
> +  /* Guess for larger instruction sizes.  */
> +  else
> + n = 2 + (length - 20) / 4;

That's a pretty bad estimate.

Can you look at non_prefixed_size, will that help?


Segher


C++ PATCH to add test for c++/92070

2019-10-11 Thread Marek Polacek
Also fixed by r276906.

Tested on x86_64-linux, applying to trunk.

2019-10-11  Marek Polacek  

PR c++/92070 - bogus error with -fchecking=2.
* g++.dg/expr/cond17.C: New test.

diff --git gcc/testsuite/g++.dg/expr/cond17.C gcc/testsuite/g++.dg/expr/cond17.C
new file mode 100644
index 000..1999c376dd1
--- /dev/null
+++ gcc/testsuite/g++.dg/expr/cond17.C
@@ -0,0 +1,11 @@
+// PR c++/92070 - bogus error with -fchecking=2.
+// { dg-additional-options "-fchecking=2" }
+
+struct a;
+struct b {
+  static a c();
+};
+struct a : b {};
+template  struct d {
+  void e() { 0 ? b() : b::c(); }
+};


C++ PATCH for c++/92062 - ODR-use ignored for static member of class template

2019-10-11 Thread Marek Polacek
has_value_dependent_address wasn't stripping location wrappers so it
gave the wrong answer for "" in the static_assert.  That led us to
thinking that the expression isn't instantiation-dependent, and we
skipped static initialization of A<0>::x.

This patch adds stripping so that has_value_dependent_address gives the
same answer as it used to before the location wrappers addition.

Bootstrapped/regtested on x86_64-linux, ok for trunk and 9?

2019-10-11  Marek Polacek  

PR c++/92062 - ODR-use ignored for static member of class template.
* pt.c (has_value_dependent_address): Strip location wrappers.

* g++.dg/cpp0x/constexpr-odr1.C: New test.
* g++.dg/cpp0x/constexpr-odr2.C: New test.

diff --git gcc/cp/pt.c gcc/cp/pt.c
index 84464436991..521d0c56002 100644
--- gcc/cp/pt.c
+++ gcc/cp/pt.c
@@ -6542,6 +6542,8 @@ check_valid_ptrmem_cst_expr (tree type, tree expr,
 static bool
 has_value_dependent_address (tree op)
 {
+  STRIP_ANY_LOCATION_WRAPPER (op);
+
   /* We could use get_inner_reference here, but there's no need;
  this is only relevant for template non-type arguments, which
  can only be expressed as   */
diff --git gcc/testsuite/g++.dg/cpp0x/constexpr-odr1.C 
gcc/testsuite/g++.dg/cpp0x/constexpr-odr1.C
new file mode 100644
index 000..cf3f95f0565
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-odr1.C
@@ -0,0 +1,19 @@
+// PR c++/92062 - ODR-use ignored for static member of class template.
+// { dg-do run { target c++11 } }
+
+template struct A {
+  static const bool x;
+  static_assert(, ""); // odr-uses A<...>::x
+};
+
+int g;
+
+template
+const bool A::x = (g = 42, false);
+
+void f(A<0>) {}// A<0> must be complete, so is instantiated
+int main()
+{
+  if (g != 42)
+__builtin_abort ();
+}
diff --git gcc/testsuite/g++.dg/cpp0x/constexpr-odr2.C 
gcc/testsuite/g++.dg/cpp0x/constexpr-odr2.C
new file mode 100644
index 000..0927488e569
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-odr2.C
@@ -0,0 +1,19 @@
+// PR c++/92062 - ODR-use ignored for static member of class template.
+// { dg-do run { target c++11 } }
+
+template struct A {
+  static const bool x;
+  enum { force_instantiation =! }; // odr-uses A<...>::x
+};
+
+int g;
+
+template
+const bool A::x = (g = 42, false);
+
+void f(A<0>) {}// A<0> must be complete, so is instantiated
+int main()
+{
+  if (g != 42)
+__builtin_abort ();
+}


Meet Visitors of IOT Solutions World Congress 2019

2019-10-11 Thread Darby Howe
Greeting,

I was wondering if you be interested in acquiring the list of visitors who
are going to attend the show IOT Solutions World Congress on BARCELONA -
29-31 OCTOBER 2019
Total visitors' counts: 16,000

Each record in the list contains:

Company name, contact name, job titles, phone number, verified e-mail
address, Industry, etc.

Please let me know if you are interested in purchasing the list of 16,000
contacts so that I can send you pricing for review

Awaiting your response

Best regards
Darby Howe


To be removed from my emails, please reply STOP.

 



Re: C++ PATCH for c++/92049 - extra error with -fchecking=2

2019-10-11 Thread Jason Merrill

On 10/10/19 8:08 PM, Marek Polacek wrote:

The concepts merge brought this bit

@@ -26326,9 +26559,9 @@ build_non_dependent_expr (tree expr)
  unexpected recursive instantiations.  */
&& !parsing_nsdmi ()
/* Don't do this during concept expansion either and for
- the same reason.  */
-  && !expanding_concept ())
-fold_non_dependent_expr (expr, tf_none);
+the same reason.  */
+  && !parsing_constraint_expression_p ())
+fold_non_dependent_expr (expr);


Weird.



STRIP_ANY_LOCATION_WRAPPER (expr);

(which I'm not finding in the ChangeLog).  Dropping tf_none means that
fold_non_dependent_expr will use tf_warning_or_error by default, and in
this test that causes an error:

   template struct cond;

   template struct S {
 void f(int i) {
   cond<__builtin_constant_p(i)>();
 }
   };

   S<1> s;

where it complains that cond is incomplete.  Which it is, but we're not
actually instantiating the function f, so we need not issue an error.  It's
broken a bunch of tests if --enable-checking=extra is in effect.  This patch
brings that tf_none back.  We will still complain if we do instantiate f.

Bootstrapped/regtested on x86_64-linux, ok for trunk?


OK, thanks.

Jason


Re: [PATCH] Fix constexpr-dtor3.C FAIL on arm

2019-10-11 Thread Jason Merrill

On 10/11/19 4:20 AM, Jakub Jelinek wrote:

On Thu, Oct 10, 2019 at 05:38:21PM -0400, Jason Merrill wrote:

FAIL: g++.dg/cpp2a/constexpr-dtor3.C  -std=c++2a (test for excess errors)
Excess errors:
/gcc/testsuite/g++.dg/cpp2a/constexpr-dtor3.C:152:12:   in 'constexpr'
expansion of '(& w13)->W7::~W7()'



This also seems unrelated, but I'll take a look.


This is solely a location_t difference, comparing x86_64 and arm output
  constexpr-dtor3.C: At global scope:
  constexpr-dtor3.C:156:23:   in ‘constexpr’ expansion of ‘f4()’
-constexpr-dtor3.C:156:24:   in ‘constexpr’ expansion of ‘(& w13)->W7::~W7()’
+constexpr-dtor3.C:152:12:   in ‘constexpr’ expansion of ‘(& w13)->W7::~W7()’
  constexpr-dtor3.C:88:34: error: inline assembly is not a constant expression
 88 |   constexpr ~W7 () { if (w == 5) asm (""); w = 3; } // { dg-error "inline 
assembly is not a constant expression" }
|  ^~~
  constexpr-dtor3.C:88:34: note: only unevaluated inline assembly is allowed in 
a ‘constexpr’ function in C++2a

constexpr int
f4 ()
{
   W7 w13 = 5;
   return 0;
}

constexpr int x4 = f4 ();   // { dg-message "in 'constexpr' expansion of" }

Line 152 is the W7 w13 = 5; line, line 156 the x4 = f4 () line.
 From warning POV, the arm locations are nicer, but see below.

The difference is in cxx_maybe_build_cleanup:

   /* build_delete sets the location of the destructor call to the
  current location, even though the destructor is going to be
  called later, at the end of the current scope.  This can lead to
  a "jumpy" behavior for users of debuggers when they step around
  the end of the block.  So let's unset the location of the
  destructor call instead.  */
   protected_set_expr_location (cleanup, UNKNOWN_LOCATION);

On x86_64 and most other targets, cleanup here (if non-NULL) is the
CALL_EXPR, as destructor return type is void, but on arm, as the dtor return
type is some pointer, the CALL_EXPR is wrapped into a NOP_EXPR to void.
protected_set_expr_location then on x86_64 clears the CALL_EXPR location,
but on arm only NOP_EXPR location.

The following patch (totally untested) should fix that.

For the warning location, perhaps we could special case destructor calls
in push_cx_call_context (to offset the intentional clearing of location for
debugging purposes), if they don't have location set, don't use
input_location for them, but try to pick DECL_SOURCE_LOCATION for the
variable being destructed?


Expanding the CLEANUP_EXPR of a CLEANUP_STMT could use the EXPR_LOCATION 
of the CLEANUP_STMT.  Or the EXPR_LOCATION of *jump_target, if suitable.


Jason


Attendees of IOT Solutions World Congress 2019

2019-10-11 Thread Nadia Dach
Greeting,

I was wondering if you be interested in acquiring the list of visitors who
are going to attend the show IOT Solutions World Congress on BARCELONA -
29-31 OCTOBER 2019
Total visitors' counts: 16,000

Each record in the list contains:

Company name, contact name, job titles, phone number, verified e-mail
address, Industry, etc.

Please let me know if you are interested in purchasing the list of 16,000
contacts so that I can send you pricing for review

Awaiting your response

Best regards
Nadia Dach


To be removed from my emails, please reply STOP.

 



Re: [PATCH][ARM] Tweak HONOR_REG_ALLOC_ORDER

2019-10-11 Thread Ramana Radhakrishnan
On Fri, Oct 11, 2019 at 3:52 PM Wilco Dijkstra  wrote:
>
> Hi Ramana,
>
> > My only question would be whether it's more suitable to use
> > optimize_function_for_size_p(cfun) instead as IIRC that gives us a
> > chance with lto rather than the global optimize_size.
>
> Yes that is even better and that defaults to optimize_size if cfun isn't
> set. I've committed this:
>
> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> index 
> 8b67c9c3657b312be223ab60c01969958baa9ed3..5fad1e5bcc2bc448489fdc8239c676246bbc8879
>  100644
> --- a/gcc/config/arm/arm.h
> +++ b/gcc/config/arm/arm.h
> @@ -1068,9 +1068,8 @@ extern int arm_regs_in_sequence[];
>  /* Use different register alloc ordering for Thumb.  */
>  #define ADJUST_REG_ALLOC_ORDER arm_order_regs_for_local_alloc ()
>
> -/* Tell IRA to use the order we define rather than messing it up with its
> -   own cost calculations.  */
> -#define HONOR_REG_ALLOC_ORDER 1
> +/* Tell IRA to use the order we define when optimizing for size.  */
> +#define HONOR_REG_ALLOC_ORDER optimize_function_for_size_p (cfun)

I'd be happy with a patch that goes and looks at more such uses in the
backend in your copious free time. hint hint.

R

>
>  /* Interrupt functions can only use registers that have already been
> saved by the prologue, even if they would normally be


Re: [PATCH][ARM] Enable arm_legitimize_address for Thumb-2

2019-10-11 Thread Ramana Radhakrishnan
On Fri, Oct 11, 2019 at 5:17 PM Wilco Dijkstra  wrote:
>
> Hi Ramana,
>
> >On Mon, Sep 9, 2019 at 6:03 PM Wilco Dijkstra  wrote:
> >>
> >> Currently arm_legitimize_address doesn't handle Thumb-2 at all, resulting 
> >> in
> >> inefficient code.  Since Thumb-2 supports similar address offsets use the 
> >> Arm
> >> legitimization code for Thumb-2 to get significant codesize and performance
> >> gains.  SPECINT2006 shows 0.4% gain on Cortex-A57, while SPECFP improves 
> >> 0.2%.
> >
> > What were the sort of code size gains  ? It did end up piquing my
> > curiosity as to how we missed something so basic.  For instance ldr
>
> h264ref is 1% smaller, various other benchmarks a few hundred bytes to 1KB
> smaller (~0.1%).
>
> > r0, [r0, #-4080] is valid in Arm state but not in Thumb2. Thus if
> >  there was an illegitimate address given here, would we end up
> > producing plus (r0, -4080) ? Yeah a simple testcase doesn't work out.
> > Scratching my head a bit , it's late at night.
>
> If the proposed offsets are not legal, GCC tries something different that is
> legal, hence there is no requirement to only propose legal splits. In any
> case we don't optimize the negative range even on Arm - the offsets are
> always split into 4KB ranges (not 8KB as would be possible).
>
> The negative offset splitting doesn't appear to have changed on Thumb-2
> so there is apparently a different mechanism that deals with negative offsets.
> So only positive offsets are improved with my patch:
>
> int f(int *p) { return p[1025] + p[1026]; }
>
> before:
> movwr3, #4104
> movwr2, #4100
> ldr r2, [r0, r2]
> ldr r0, [r0, r3]
> add r0, r0, r2
> bx  lr
>
> after:
> add r3, r0, #4096
> ldrdr0, r3, [r3, #4]
> add r0, r0, r3
> bx  lr
>
> > Orthogonally it looks like you can clean up the MINUS handling here
> > and in legitimate_address_p , I'm not sure what the status of LRA with
> > MINUS is either and thus we should now look to clean this up or look
> > to turn this on and see what happens. However that's a subject of a
> > future patch.
>
> Well there are lots of cases that aren't handled correctly or optimally
> yet - I'd copy the way I wrote aarch64_legitimize_address_displacement,
> but that's for a future patch indeed.
>
> > For the record, bootstrap with Thumb2  presumably and the testruns were 
> > clean ?
>
> Yes, at the time I ran them.

Yeah it dropped about 2660 bytes out of a Thumb2 bootstrap - so it
seems worth while.

Ok please apply but as usual watch out for any fallout from this.

regards
Ramana
>
> Cheers,
> Wilco


[C++ PATCH] Preserve the location of explicitly defaulted functions.

2019-10-11 Thread Jason Merrill
While working on operator<=> I noticed that a diagnostic referring to an
explicitly defaulted function was using the location of the class, rather than
the location of the declaration.  We change DECL_SOURCE_LOCATION of implicitly
declared functions to indicate where they were first needed, but I don't want
to lose the location of actual declarations.

Tested x86_64-pc-linux-gnu, applying to trunk.

* decl2.c (mark_used): Don't clobber DECL_SOURCE_LOCATION on
explicitly defaulted functions.
* method.c (synthesize_method): Likewise.
---
 gcc/cp/decl2.c  | 3 ++-
 gcc/cp/method.c | 7 ---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index ee198cdf5ce..6d5e973b487 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -5588,7 +5588,8 @@ mark_used (tree decl, tsubst_flags_t complain)
   /* Remember the current location for a function we will end up
 synthesizing.  Then we can inform the user where it was
 required in the case of error.  */
-  DECL_SOURCE_LOCATION (decl) = input_location;
+  if (DECL_ARTIFICIAL (decl))
+   DECL_SOURCE_LOCATION (decl) = input_location;
 
   /* Synthesizing an implicitly defined member function will result in
 garbage collection.  We must treat this situation as if we were
diff --git a/gcc/cp/method.c b/gcc/cp/method.c
index 01bf534aef2..73a01147ff9 100644
--- a/gcc/cp/method.c
+++ b/gcc/cp/method.c
@@ -892,7 +892,7 @@ synthesize_method (tree fndecl)
 
   /* Reset the source location, we might have been previously
  deferred, and thus have saved where we were first needed.  */
-  if (!DECL_INHERITED_CTOR (fndecl))
+  if (DECL_ARTIFICIAL (fndecl) && !DECL_INHERITED_CTOR (fndecl))
 DECL_SOURCE_LOCATION (fndecl)
   = DECL_SOURCE_LOCATION (TYPE_NAME (DECL_CONTEXT (fndecl)));
 
@@ -953,8 +953,9 @@ synthesize_method (tree fndecl)
   pop_deferring_access_checks ();
 
   if (error_count != errorcount || warning_count != warningcount + werrorcount)
-inform (input_location, "synthesized method %qD first required here",
-   fndecl);
+if (DECL_ARTIFICIAL (fndecl))
+  inform (input_location, "synthesized method %qD first required here",
+ fndecl);
 }
 
 /* Build a reference to type TYPE with cv-quals QUALS, which is an

base-commit: 6dc6cd99d8bb4313585a1d9b93fc7a43fb2016a9
-- 
2.21.0



Re: Type representation in CTF and DWARF

2019-10-11 Thread Indu Bhagat




On 10/11/2019 04:23 AM, Richard Biener wrote:

Thanks for your pointers.

CTF does not encode location information. So, I used early exit in the
add_src_coords_attributes to avoid generation of location info (file, line,
column). To answer Richard's question, CTF does have type debug info
for function declarations and the argument types. So I think with these
changes, both CTF and DWARF generation will emit debug info for the same set of
types and decl.

Compile with -g -gdwarf-like-ctf and use dwz -o   (using
dwz compiled from the master branch) on the generated binaries:

(coreutils-0.22)
   .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | .ctf 
(uncompressed) | ratio (.ctf/(D1+D2+0.5*D4))
ls   30616   |1136   |21098   | 26240   
| 0.62
pwd  10734   |788|10433   | 13929   
| 0.83
groups 10706 |811|10249   | 13378   
| 0.80

(emacs-26.3)
   .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | .ctf 
(uncompressed) | ratio (.ctf/(D1+D2+0.5*D4))
emacs-26.3.1 674657  |6402   |   273963   |   273910
| 0.33

I chose to account for 50% of .debug_str because at this point, it will be
unfair to not account for them. Actually, one could even argue that upto 70%
of the .debug_str are names of entities. CTF section sizes do include the CTF
string tables.

Across coreutils, I see a geomean of 0.73 (ratio of
.ctf/(.debug_info + .debug_abbrev + 50% of .debug_str)). So, with the
"-gdwarf-like-ctf code stubs" and dwz, DWARF continues to have a larger
footprint than CTF (with 50% of .debug_str accounted for).

I'm not convinced this "improvement" in size is worth maintainig another
debug-info format much less since it lacks desirable features right now
and thus evaluation is tricky.

At least you can improve dwarf size considerably with a low amount of work.

I suspect another factor where dwarf is bigger compared to CTF is that dwarf
is recording typedef names as well as qualified type variants.  But maybe
CTF just has a more compact representation for the bits it actually implements.

Richard.


CTF represents typedefs and qualified type variants. They are included in the
the .ctf section sizes above.

Indu



Re: [PATCH] PR fortran/89943 -- Duplicate BIND(c) allowed (?)

2019-10-11 Thread Steve Kargl
PING.

On Fri, Oct 04, 2019 at 03:26:53PM -0700, Steve Kargl wrote:
> The attached patch allows the declaration of a BIND(C)
> module function or module subroutine to appear in a
> submodule (see testcases).  Regression test was clean.
> OK to commit?
> 
> Before a rubber stamped 'OK'.  I do NOT use submodules.
> A submodule user needs to pipe up on the validity of
> the patch.
> 
> 
> 2019-10-04  Steven G. Kargl  
> 
>   PR fortran/89943
>   decl.c (gfc_match_function_decl): Ignore duplicate BIND(C) for function
>   declaration in submodule.
>   (gfc_match_entry): Use temporary for locus, which allows removal of
>   one gfc_error_now().
>   (gfc_match_subroutine): Ignore duplicate BIND(C) for subroutine
>   declaration in submodule.
> 
> 2019-10-04  Steven G. Kargl  
> 
>   PR fortran/89943
>   * gfortran.dg/pr89943_1.f90: New test.
>   * gfortran.dg/pr89943_2.f90: Ditto.
> 
> -- 
> Steve

> Index: gcc/fortran/decl.c
> ===
> --- gcc/fortran/decl.c(revision 276601)
> +++ gcc/fortran/decl.c(working copy)
> @@ -7259,13 +7259,16 @@ gfc_match_function_decl (void)
>if (sym->attr.is_bind_c == 1)
>  {
>sym->attr.is_bind_c = 0;
> -  if (sym->old_symbol != NULL)
> -gfc_error_now ("BIND(C) attribute at %L can only be used for "
> -   "variables or common blocks",
> -   &(sym->old_symbol->declared_at));
> -  else
> -gfc_error_now ("BIND(C) attribute at %L can only be used for "
> -   "variables or common blocks", _current_locus);
> +
> +  if (gfc_state_stack->previous
> +   && gfc_state_stack->previous->state != COMP_SUBMODULE)
> + {
> +   locus loc;
> +   loc = sym->old_symbol != NULL
> + ? sym->old_symbol->declared_at : gfc_current_locus;
> +   gfc_error_now ("BIND(C) attribute at %L can only be used for "
> +  "variables or common blocks", );
> + }
>  }
>  
>if (found_match != MATCH_YES)
> @@ -7517,16 +7520,16 @@ gfc_match_entry (void)
>   not allowed for procedures.  */
>if (entry->attr.is_bind_c == 1)
>  {
> +  locus loc;
> +
>entry->attr.is_bind_c = 0;
> -  if (entry->old_symbol != NULL)
> -gfc_error_now ("BIND(C) attribute at %L can only be used for "
> -   "variables or common blocks",
> -   &(entry->old_symbol->declared_at));
> -  else
> -gfc_error_now ("BIND(C) attribute at %L can only be used for "
> -   "variables or common blocks", _current_locus);
> -}
>  
> +  loc = entry->old_symbol != NULL
> + ? entry->old_symbol->declared_at : gfc_current_locus; 
> +  gfc_error_now ("BIND(C) attribute at %L can only be used for "
> +  "variables or common blocks", );
> + }
> +
>/* Check what next non-whitespace character is so we can tell if there
>   is the required parens if we have a BIND(C).  */
>old_loc = gfc_current_locus;
> @@ -7725,13 +7728,16 @@ gfc_match_subroutine (void)
>if (sym->attr.is_bind_c == 1)
>  {
>sym->attr.is_bind_c = 0;
> -  if (sym->old_symbol != NULL)
> -gfc_error_now ("BIND(C) attribute at %L can only be used for "
> -   "variables or common blocks",
> -   &(sym->old_symbol->declared_at));
> -  else
> -gfc_error_now ("BIND(C) attribute at %L can only be used for "
> -   "variables or common blocks", _current_locus);
> +
> +  if (gfc_state_stack->previous
> +   && gfc_state_stack->previous->state != COMP_SUBMODULE)
> + {
> +   locus loc;
> +   loc = sym->old_symbol != NULL
> + ? sym->old_symbol->declared_at : gfc_current_locus;
> +   gfc_error_now ("BIND(C) attribute at %L can only be used for "
> +  "variables or common blocks", );
> + }
>  }
>  
>/* C binding names are not allowed for internal procedures.  */
> Index: gcc/testsuite/gfortran.dg/pr89943_1.f90
> ===
> --- gcc/testsuite/gfortran.dg/pr89943_1.f90   (nonexistent)
> +++ gcc/testsuite/gfortran.dg/pr89943_1.f90   (working copy)
> @@ -0,0 +1,31 @@
> +! { dg-do compile }
> +! PR fortran/89943
> +! Code contributed by Alberto Luaces  
> +module Foo_mod
> +
> +   implicit none
> +
> +   interface
> +  module subroutine runFoo4C(ndim) bind(C, name="runFoo")
> + use, intrinsic :: iso_c_binding
> + implicit none
> + integer(c_int32_t) , intent(in) :: ndim
> +  end subroutine runFoo4C
> +   end interface
> +
> +   contains
> +
> +end module Foo_mod
> +
> +submodule(Foo_mod) Foo_smod
> +
> +   contains
> +
> +  module subroutine runFoo4C(ndim) bind(C, name="runFoo")
> + use, intrinsic :: iso_c_binding
> + implicit none
> + 

Re: [PATCH][AArch64] Fix symbol offset limit

2019-10-11 Thread Wilco Dijkstra
Hi Richard,

> If global_char really is a char then isn't that UB?  

No why? We can do all kinds of arithmetic based on pointers, either using
pointer types or converted to uintptr_t. Note that the optimizer actually 
creates
these expressions, for example arr[N-x] can be evaluated as ([0] + N) - x.
So this remains legal even if N is well outside the bounds of the array.

> I guess we still have to compile it without error though...

*THIS* a million times... Any non-trivial application relies on UB behaviour 
with
100% certainty. But we still have to compile it correctly!

>> To avoid this, limit the offset to +/-1MB so that the symbol needs to be 
>> within a
>> 3.9GB offset from its references.  For the tiny code model use a 64KB 
>> offset, allowing
>> most of the 1MB range for code/data between the symbol and its 
>> references.
>
> These new values seem a bit magical.  Would it work for the original
> testcase to use FORCE_TO_MEM if !offset_within_block_p (x, offset)?

No - the testcases fail with that. It also reduces codequality by not allowing
commonly used offsets as part of the symbol relocation.

So how would you want to make the offsets less magical? There isn't a clear 
limitation
like for MOVW/MOVT relocations (which are limited to +-32KB), so it just is an 
arbitrary
decision which allows this optimization for all frequently used offsets but 
avoids relocation
failures caused by large offsets. The values I used were chosen so that 
codequality isn't
affected, but it is practically impossible to trigger relocation errors.

And they can always be changed again if required - this is not meant to be the 
final
AArch64 patch ever!

Cheers,
Wilco 

Re: [Darwin, machopic 4/n, committed] Arrange to indirect IVARs when needed.

2019-10-11 Thread Mike Stump
On Oct 10, 2019, at 12:25 PM, Iain Sandoe  wrote:
> 
> Objective C V2 (m64) IVAR offset refs from Apple GCC-4.x have an indirection
> for m64 code on PPC (which is the only 64b user for Mach-O PIC).  Apple GCC
> 4.x places the indirections in the .data section, however this seems to have
> been unintentional - and we are placing the indirections in the non-lazy
> symbol pointers section as usual.
> 
> If Mike can recall any reason that they should be in the .data section, we can
> revise that - but testing hasn’t revealed any issues so far.

Don't recall any reason.

Support _Decimal* keywords for C2x

2019-10-11 Thread Joseph Myers
ISO C2x adds decimal floating point as an optional standard feature.
This patch accordingly makes GCC accept the _Decimal* keywords in
strict C2x mode, using pedwarn_c11 to get a warning for
-Wc11-c2x-compat.  (Constants, where the pedwarn is in libcpp, will be
dealt with separately.)

The _Decimal* keywords are marked with D_EXT, meaning they are not
considered keywords at all in strict conformance modes.  This is
contrary to the normal practice for most implementation-namespace
keywords, which are accepted in all standards modes but with
appropriate pedwarns for older standards.  This patch removes D_EXT
from those keywords so they are accepted as keywords for all
standards, consequently removing the gcc.dg/dfp/keywords-ignored-c99.c
test that is no longer valid.

(A new D_C2X for keywords will still be needed if any new keywords get
added that aren't in the implementation namespace for older standards;
there are proposals for such keywords, albeit as predefined macros
that might not actually need new keywords in the compiler in all
cases.  If the DFP keywords end up as decimal32 etc., of course
appropriate compiler and testcase changes will be needed, and a
version of keywords-ignored-c99.c would make sense again with such
spellings.)

Bootstrapped with no regressions on x86_64-pc-linux-gnu.  Applied to 
mainline.

gcc/c:
2019-10-11  Joseph Myers  

* c-decl.c (declspecs_add_type): Use pedwarn_c11 for DFP types.

gcc/c-family:
2019-10-11  Joseph Myers  

* c-common.c (c_common_reswords): Do not use D_EXT for _Decimal32,
_Decimal64 and _Decimal128.

gcc/testsuite:
2019-10-11  Joseph Myers  

* gcc.dg/dfp/c11-keywords-1.c, gcc.dg/dfp/c11-keywords-2.c,
gcc.dg/dfp/c2x-keywords-1.c, gcc.dg/dfp/c2x-keywords-2.c: New
tests.
* gcc.dg/dfp/keywords-ignored-c99.c: Remove test.
* gcc.dg/dfp/constants-c99.c, gcc.dg/dfp/keywords-c89.c,
gcc.dg/dfp/keywords-c99.c: Use -pedantic-errors.

Index: gcc/c/c-decl.c
===
--- gcc/c/c-decl.c  (revision 276895)
+++ gcc/c/c-decl.c  (working copy)
@@ -10959,8 +10959,9 @@ declspecs_add_type (location_t loc, struct c_decls
error_at (loc,
  ("decimal floating-point not supported "
   "for this target"));
- pedwarn (loc, OPT_Wpedantic,
-  "ISO C does not support decimal floating-point");
+ pedwarn_c11 (loc, OPT_Wpedantic,
+  "ISO C does not support decimal floating-point "
+  "before C2X");
  return specs;
case RID_FRACT:
case RID_ACCUM:
Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c (revision 276895)
+++ gcc/c-family/c-common.c (working copy)
@@ -351,9 +351,9 @@ const struct c_common_resword c_common_reswords[]
   { "_Float32x",RID_FLOAT32X,  D_CONLY },
   { "_Float64x",RID_FLOAT64X,  D_CONLY },
   { "_Float128x",   RID_FLOAT128X, D_CONLY },
-  { "_Decimal32",   RID_DFLOAT32,  D_CONLY | D_EXT },
-  { "_Decimal64",   RID_DFLOAT64,  D_CONLY | D_EXT },
-  { "_Decimal128",  RID_DFLOAT128, D_CONLY | D_EXT },
+  { "_Decimal32",   RID_DFLOAT32,  D_CONLY },
+  { "_Decimal64",   RID_DFLOAT64,  D_CONLY },
+  { "_Decimal128",  RID_DFLOAT128, D_CONLY },
   { "_Fract",   RID_FRACT, D_CONLY | D_EXT },
   { "_Accum",   RID_ACCUM, D_CONLY | D_EXT },
   { "_Sat", RID_SAT,   D_CONLY | D_EXT },
Index: gcc/testsuite/gcc.dg/dfp/c11-keywords-1.c
===
--- gcc/testsuite/gcc.dg/dfp/c11-keywords-1.c   (nonexistent)
+++ gcc/testsuite/gcc.dg/dfp/c11-keywords-1.c   (working copy)
@@ -0,0 +1,7 @@
+/* Test that _Decimal* keywords diagnosed in C11 mode: -pedantic.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c11 -pedantic" } */
+
+_Decimal32 d32; /* { dg-warning "ISO C does not support" } */
+_Decimal64 d64; /* { dg-warning "ISO C does not support" } */
+_Decimal128 d128; /* { dg-warning "ISO C does not support" } */
Index: gcc/testsuite/gcc.dg/dfp/c11-keywords-2.c
===
--- gcc/testsuite/gcc.dg/dfp/c11-keywords-2.c   (nonexistent)
+++ gcc/testsuite/gcc.dg/dfp/c11-keywords-2.c   (working copy)
@@ -0,0 +1,7 @@
+/* Test that _Decimal* keywords diagnosed in C11 mode: -pedantic-errors.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
+
+_Decimal32 d32; /* { dg-error "ISO C does not support" } */
+_Decimal64 d64; /* { dg-error "ISO C does not support" } */
+_Decimal128 d128; /* { dg-error "ISO C does not support" } */
Index: gcc/testsuite/gcc.dg/dfp/c2x-keywords-1.c
===
--- 

Re: [PATCH][ARM] Switch to default sched pressure algorithm

2019-10-11 Thread Wilco Dijkstra
Hi Ramana, 

> Can you see what happens with the Cortex-A8 or Cortex-A9 schedulers to
> spread the range across some v7-a CPUs as well ? While they aren't that 
> popular today I
> would suggest you look at them because the defaults for v7-a are still to use 
> the
> Cortex-A8 scheduler and the Cortex-A9 scheduler might well also get used in 
> places given
> the availability of hardware.

The results are practically identical to Cortex-A53 and A57 - there is a huge 
codesize win
across the board on SPEC2006, there isn't a single benchmark that is larger 
(ie. more
spilling).

> I'd be happy to move this forward if you could show if there is no *increase* 
> in spills
> for the same range of benchmarks that you are doing for the Cortex-A8 and 
> Cortex-A9
> schedulers.

There certainly isn't. I don't think results like these could be any more 
one-sided, it's a
significant win for every single benchmark, both for codesize and performance!

What isn't clear is whether something has gone horribly wrong in the scheduler 
which
could be fixed/reverted, but as it is right now I can't see it being useful at 
all. This means
we should also reevaluate whether pressure scheduling now hurts AArch64 too.

Cheers,
Wilco

New Finnish PO file for 'gcc' (version 9.1.0)

2019-10-11 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Finnish team of translators.  The file is available at:

https://translationproject.org/latest/gcc/fi.po

(This file, 'gcc-9.1.0.fi.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [PATCH] Fix PR c++/92024

2019-10-11 Thread Bernd Edlinger
On 10/11/19 6:31 PM, Jason Merrill wrote:
> On 10/10/19 2:06 PM, Bernd Edlinger wrote:
>> On 10/10/19 7:49 PM, Jason Merrill wrote:
>>> On 10/10/19 10:42 AM, Bernd Edlinger wrote:
 Hi,

 this fixes a crash when -Wshadow=compatible-local is
 enabled in the testcase g++.dg/parse/crash68.C
>>>
>>> Why does that flag cause this crash?
>>>
>>
>> gcc/cp/name-lookup.c:
>>
>>    if (warn_shadow)
>>  warning_code = OPT_Wshadow;
>>    else if (warn_shadow_local)
>>  warning_code = OPT_Wshadow_local;
>>    else if (warn_shadow_compatible_local
>>     && (same_type_p (TREE_TYPE (old), TREE_TYPE (decl))
>>     || (!dependent_type_p (TREE_TYPE (decl))
>>     && !dependent_type_p (TREE_TYPE (old))
>>     /* If the new decl uses auto, we don't yet know
>>    its type (the old type cannot be using auto
>>    at this point, without also being
>>    dependent).  This is an indication we're
>>    (now) doing the shadow checking too
>>    early.  */
>>     && !type_uses_auto (TREE_TYPE (decl))
>>     && can_convert (TREE_TYPE (old), TREE_TYPE (decl),
>>     tf_none
>>  warning_code = OPT_Wshadow_compatible_local;
>>
>> if -Wshadow=compatible-local is used, the can_convert function crashes
>> in instantiate_class_template_1.
> 
> Right, checking can_convert is problematic here, as it can cause template 
> instantiations that change the semantics of the program.  Or, in this case, 
> crash.
> 

Ah, alright.

I think shadowing one type with another type of the same name is always a no no.
How about always warning (and not asking for can_convert) when either decl is a 
TYPE_DECL?

like this:

Index: gcc/cp/name-lookup.c
===
--- gcc/cp/name-lookup.c(Revision 276886)
+++ gcc/cp/name-lookup.c(Arbeitskopie)
@@ -2741,8 +2741,7 @@
  return;
}
 
-  /* If '-Wshadow=compatible-local' is specified without other
--Wshadow= flags, we will warn only when the type of the
+  /* -Wshadow=compatible-local warns only when the type of the
 shadowing variable (DECL) can be converted to that of the
 shadowed parameter (OLD_LOCAL). The reason why we only check
 if DECL's type can be converted to OLD_LOCAL's type (but not the
@@ -2750,29 +2749,29 @@
 parameter, more than often they would use the variable
 thinking (mistakenly) it's still the parameter. It would be
 rare that users would use the variable in the place that
-expects the parameter but thinking it's a new decl.  */
+expects the parameter but thinking it's a new decl.
+If either object is a TYPE_DECL -Wshadow=compatible-local
+warns regardless of whether one of the types involved
+is a subclass of the other, since that is never okay.  */
 
   enum opt_code warning_code;
-  if (warn_shadow)
-   warning_code = OPT_Wshadow;
-  else if (warn_shadow_local)
-   warning_code = OPT_Wshadow_local;
-  else if (warn_shadow_compatible_local
-  && (same_type_p (TREE_TYPE (old), TREE_TYPE (decl))
-  || (!dependent_type_p (TREE_TYPE (decl))
-  && !dependent_type_p (TREE_TYPE (old))
-  /* If the new decl uses auto, we don't yet know
- its type (the old type cannot be using auto
- at this point, without also being
- dependent).  This is an indication we're
- (now) doing the shadow checking too
- early.  */
-  && !type_uses_auto (TREE_TYPE (decl))
-  && can_convert (TREE_TYPE (old), TREE_TYPE (decl),
-  tf_none
+  if (same_type_p (TREE_TYPE (old), TREE_TYPE (decl))
+ || TREE_CODE (decl) == TYPE_DECL
+ || TREE_CODE (old) == TYPE_DECL
+ || (!dependent_type_p (TREE_TYPE (decl))
+ && !dependent_type_p (TREE_TYPE (old))
+ /* If the new decl uses auto, we don't yet know
+its type (the old type cannot be using auto
+at this point, without also being
+dependent).  This is an indication we're
+(now) doing the shadow checking too
+early.  */
+ && !type_uses_auto (TREE_TYPE (decl))
+ && can_convert (TREE_TYPE (old), TREE_TYPE (decl),
+ tf_none)))
warning_code = OPT_Wshadow_compatible_local;
   else
-   return;
+   warning_code = OPT_Wshadow_local;
 
   const char *msg;
   if (TREE_CODE (old) == PARM_DECL)


Bernd.


Re: [C++ Patch] Remove RROTATE_EXPR and LROTATE_EXPR handling

2019-10-11 Thread Paolo Carlini

Hi,

On 11/10/19 18:25, Jason Merrill wrote:

On 10/10/19 2:33 PM, Paolo Carlini wrote:

Hi,

On 10/10/19 19:57, Jason Merrill wrote:

On 10/10/19 1:53 PM, Jakub Jelinek wrote:

On Thu, Oct 10, 2019 at 06:12:02PM +0200, Paolo Carlini wrote:
while working on cp_build_binary_op I noticed that the testsuite 
wasn't
exercising the warnings in case RROTATE_EXPR / LROTATE_EXPR, even 
more the
code handling those tree codes seemed completely unused. Turned 
out that the

C front-end doesn't handle those tree codes at all: I'm coming to the
conclusion that the C++ front-end bits too are now obsolete and 
may be
removed, because only the middle-end generates those codes in 
order to
implement optimizations. Anything I'm missing? Any additional 
testing?


I guess it depends on where.
fold_binary_loc certainly has code to create {L,R}ROTATE_EXPR,
just look at
unsigned foo (unsigned x)
{
   return (x << 3) + (x >> (__SIZEOF_INT__ * __CHAR_BIT__ - 3));
}

unsigned bar (unsigned x, unsigned y)
{
   return (x << y) | (x >> (__SIZEOF_INT__ * __CHAR_BIT__ - y));
}
and the *.original dump.
The cp_build_binary_op case is unlikely to ever trigger, unless 
we'd rerun

it on cp_folded trees.
cxx_eval_constant_expression is unlikely, because recently we've 
switched to
performing constexpr evaluation on pre-cp_folded bodies, not sure 
if we

never encounter folded trees though.
cp_fold itself depends on whether we ever reprocess the already folded
trees, I'd be afraid we could.


True, and the failure mode there is silent.  Let's leave the codes 
in that switch.


Ok, thanks Jason and Jakub for the additional information.

While I give this more thought and maybe manage to come up with a 
testcase triggering the warning, shall we simply pass the location to 
those two warnings too - cannot hurt, AFAICS?


I meant let's omit the changes to cp_fold, the rest of the patch is 
still OK.


Nice, thanks, I'll do that.

Paolo.



[PATCH] S/390: Run %a0:DI splitters only after reload

2019-10-11 Thread Ilya Leoshkevich
Bootstrapped and regtested on s390x-redhat-linux.

gcc/ChangeLog:

2019-10-10  Ilya Leoshkevich  

* config/s390/s390.md: Run %a0:DI splitters only after reload.

gcc/testsuite/ChangeLog:

2019-10-10  Ilya Leoshkevich  

* gcc.target/s390/load-thread-pointer-once.c: New test.
---
 gcc/config/s390/s390.md | 13 ++---
 .../gcc.target/s390/load-thread-pointer-once.c  | 12 
 2 files changed, 22 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/load-thread-pointer-once.c

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 4f7bde6616b..1e6439d5fd6 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -1860,10 +1860,17 @@
   *,*,yes")
 ])
 
+; Splitters for loading/storing TLS pointers from/to %a0:DI.
+; Do this only during split2, which runs after reload. At the point when split1
+; runs, some of %a0:DI occurrences might be nested inside other rtxes and thus
+; not matched. As a result, only some occurrences will be split, which will
+; prevent CSE. At the point when split2 runs, reload will have ensured that no
+; nested references exist.
+
 (define_split
   [(set (match_operand:DI 0 "register_operand" "")
 (match_operand:DI 1 "register_operand" ""))]
-  "TARGET_ZARCH && ACCESS_REG_P (operands[1])"
+  "TARGET_ZARCH && ACCESS_REG_P (operands[1]) && reload_completed"
   [(set (match_dup 2) (match_dup 3))
(set (match_dup 0) (ashift:DI (match_dup 0) (const_int 32)))
(set (strict_low_part (match_dup 2)) (match_dup 4))]
@@ -1873,7 +1880,7 @@
 (define_split
   [(set (match_operand:DI 0 "register_operand" "")
 (match_operand:DI 1 "register_operand" ""))]
-  "TARGET_ZARCH && ACCESS_REG_P (operands[0])
+  "TARGET_ZARCH && ACCESS_REG_P (operands[0]) && reload_completed
&& dead_or_set_p (insn, operands[1])"
   [(set (match_dup 3) (match_dup 2))
(set (match_dup 1) (lshiftrt:DI (match_dup 1) (const_int 32)))
@@ -1884,7 +1891,7 @@
 (define_split
   [(set (match_operand:DI 0 "register_operand" "")
 (match_operand:DI 1 "register_operand" ""))]
-  "TARGET_ZARCH && ACCESS_REG_P (operands[0])
+  "TARGET_ZARCH && ACCESS_REG_P (operands[0]) && reload_completed
&& !dead_or_set_p (insn, operands[1])"
   [(set (match_dup 3) (match_dup 2))
(set (match_dup 1) (rotate:DI (match_dup 1) (const_int 32)))
diff --git a/gcc/testsuite/gcc.target/s390/load-thread-pointer-once.c 
b/gcc/testsuite/gcc.target/s390/load-thread-pointer-once.c
new file mode 100644
index 000..21a5bfaa732
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/load-thread-pointer-once.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+__thread void *foo;
+
+void *bar()
+{
+  return (foo = __builtin_thread_pointer());
+}
+
+/* { dg-final { scan-assembler-times {\n\tear\t} 2 { target { lp64 } } } } */
+/* { dg-final { scan-assembler-times {\n\tear\t} 1 { target { ! lp64 } } } } */
-- 
2.23.0



Re: [PATCH] bring -Warray-bounds closer to -Wstringop-overflow (PR91647, 91463, 91679)

2019-10-11 Thread Martin Sebor

On 9/10/19 4:35 PM, Jeff Law wrote:

On 9/6/19 1:27 PM, Martin Sebor wrote:

Recent enhancements to -Wstringop-overflow improved the warning
to the point that it detects a superset of the problems -Warray-
bounds is intended detect in character accesses.  Because both
warnings detect overlapping sets of problems, and because the IL
they work with tends to change in subtle ways from target to
targer, tests designed to verify one or the other sometimes fail
with a target where the warning isn't robust enough to detect
the problem given the IL representation.

To reduce these test suite failures the attached patch extends
-Warray-bounds to handle some of the same problems -Wstringop-
overflow does, pecifically, out-of-bounds accesses to array
members of structs, including zero-length arrays and flexible
array members of defined objects.

In the process of testing the enhancement I realized that
the recently added component_size() function doesn't work as
intended for non-character array members (see below).  The patch
corrects this by reverting back to the original implementation
of the function until the better/simpler solution can be put in
place as mentioned below.

Tested on x86_64-linux.

Martin


[*] component_size() happens to work for char arrays because those
are transformed to STRING_CSTs, but not for arrays that are not.
E.g., in

   struct S { int64_t i; int16_t j; int16_t a[]; }
 s = { 0, 0, { 1, 0 } };

unless called with type set to int16_t[2], fold_ctor_reference
will return s.a[0] rather than all of s.a.  But set type to
int16_t[2] we would need to know that s.a's initializer has two
elements, and that's just what we're using fold_ctor_reference
to find out.

I think this could probably be made to work somehow by extending
useless_type_conversion_p to handle this case as special somehow,
but it doesn't seem worth the effort given that there should be
an easier way to do it as you noted below.

Given the above, the long term solution should be to rely on
DECL_SIZE_UNIT(decl) - TYPE_SIZE_UNIT(decl_type) as Richard
suggested in the review of its initial implementation.
Unfortunately, because of bugs in both the C and C++ front ends
(I just opened PR 65403 with the details) the simple formula
doesn't give the right answers either.  So until the bugs are
fixed, the patch reverts back to the original loopy solution.
It's no more costly than the current fold_ctor_reference
approach.

...


So no concerns with the patch itself, just the fallout you mentioned in
a follow-up message.  Ideally we'd have glibc and the kernel fixed
before this goes in, but I'd settle for just getting glibc fixed since
we have more influence there.


Half of the issues there were due to a bug in the warning.  The rest
are caused by Glibc's use of interior zero-length arrays to access
subsequent members.  It works in simple cases but it's very brittle
because GCC assumes that even such members don't alias. If it's meant
to be a supported feature then aliasing would have to be changed to
take it into account.  But I'd rather encourage projects to move away
from these dangerous hacks and towards cleaner, safer code.

I've fixed the bug in the attached patch.  The rest can be suppressed
by replacing the zero-length arrays with flexible array members but
that's just trading one misuse for another.  If the code can't be
changed to avoid this (likely not an option since the arrays are in
visible in the public API) I think the best way to deal with them is
to suppress them by #pragma GCC diagnostic ignored.  I opened BZ 25097
in Glibc Bugzilla to track this.


Out of curiosity are the kernel issues you raised due to flexible arrays
or just cases where we're doing a better job on normal objects?  I'd be
a bit surprised to find flexible arrays in the kernel.


I don't think I've come across any flexible arrays in the kernel.

The patch triggers 94 instances of -Warray-bounds (60 of which
are for distinct code) in 21 .c files.  I haven't looked at all
of them but some of the patterns I noticed are:

1) Intentionally using an interior zero-length array to access
   (e.g., memset) one or more subsequent members. E.g.,
   _dbgp_external_startup in drivers/usb/early/ehci-dbgp.c and
   quite a few others.  This is pretty pervasive but seems easily
   avoidable.

2) Overwriting a member array with more data (e.g., function
   cxio_rdev_open in
   drivers/infiniband/hw/cxgb3/cxio_hal.c or in function
   pk_probe in drivers/hid/hid-prodikeys.c).  At first glance
   some of these look like bugs but with stuff obscured by macros
   and no comments it's hard to tell.

3) Uses of the container_of() macro to access one member given
   the address of another.  This is undefined (and again breaks
   the aliasing rules) but the macro is used all over the place
   in the kernel.  I count over 15,000 references to it.

4) Uses of one-element arrays as members of other one-element
   arrays (in include/scsi/fc/fc_ms.h).  Was this ever meant
   to 

Re: [PATCH] Fix PR c++/92024

2019-10-11 Thread Jason Merrill

On 10/10/19 2:06 PM, Bernd Edlinger wrote:

On 10/10/19 7:49 PM, Jason Merrill wrote:

On 10/10/19 10:42 AM, Bernd Edlinger wrote:

Hi,

this fixes a crash when -Wshadow=compatible-local is
enabled in the testcase g++.dg/parse/crash68.C


Why does that flag cause this crash?



gcc/cp/name-lookup.c:

   if (warn_shadow)
 warning_code = OPT_Wshadow;
   else if (warn_shadow_local)
 warning_code = OPT_Wshadow_local;
   else if (warn_shadow_compatible_local
&& (same_type_p (TREE_TYPE (old), TREE_TYPE (decl))
|| (!dependent_type_p (TREE_TYPE (decl))
&& !dependent_type_p (TREE_TYPE (old))
/* If the new decl uses auto, we don't yet know
   its type (the old type cannot be using auto
   at this point, without also being
   dependent).  This is an indication we're
   (now) doing the shadow checking too
   early.  */
&& !type_uses_auto (TREE_TYPE (decl))
&& can_convert (TREE_TYPE (old), TREE_TYPE (decl),
tf_none
 warning_code = OPT_Wshadow_compatible_local;

if -Wshadow=compatible-local is used, the can_convert function crashes
in instantiate_class_template_1.


Right, checking can_convert is problematic here, as it can cause 
template instantiations that change the semantics of the program.  Or, 
in this case, crash.


Jason



Re: [C++ Patch] Remove RROTATE_EXPR and LROTATE_EXPR handling

2019-10-11 Thread Jason Merrill

On 10/10/19 2:33 PM, Paolo Carlini wrote:

Hi,

On 10/10/19 19:57, Jason Merrill wrote:

On 10/10/19 1:53 PM, Jakub Jelinek wrote:

On Thu, Oct 10, 2019 at 06:12:02PM +0200, Paolo Carlini wrote:

while working on cp_build_binary_op I noticed that the testsuite wasn't
exercising the warnings in case RROTATE_EXPR / LROTATE_EXPR, even 
more the
code handling those tree codes seemed completely unused. Turned out 
that the

C front-end doesn't handle those tree codes at all: I'm coming to the
conclusion that the C++ front-end bits too are now obsolete and may be
removed, because only the middle-end generates those codes in order to
implement optimizations. Anything I'm missing? Any additional testing?


I guess it depends on where.
fold_binary_loc certainly has code to create {L,R}ROTATE_EXPR,
just look at
unsigned foo (unsigned x)
{
   return (x << 3) + (x >> (__SIZEOF_INT__ * __CHAR_BIT__ - 3));
}

unsigned bar (unsigned x, unsigned y)
{
   return (x << y) | (x >> (__SIZEOF_INT__ * __CHAR_BIT__ - y));
}
and the *.original dump.
The cp_build_binary_op case is unlikely to ever trigger, unless we'd 
rerun

it on cp_folded trees.
cxx_eval_constant_expression is unlikely, because recently we've 
switched to

performing constexpr evaluation on pre-cp_folded bodies, not sure if we
never encounter folded trees though.
cp_fold itself depends on whether we ever reprocess the already folded
trees, I'd be afraid we could.


True, and the failure mode there is silent.  Let's leave the codes in 
that switch.


Ok, thanks Jason and Jakub for the additional information.

While I give this more thought and maybe manage to come up with a 
testcase triggering the warning, shall we simply pass the location to 
those two warnings too - cannot hurt, AFAICS?


I meant let's omit the changes to cp_fold, the rest of the patch is 
still OK.


Jason


Re: [PATCH][ARM] Enable arm_legitimize_address for Thumb-2

2019-10-11 Thread Wilco Dijkstra
Hi Ramana,

>On Mon, Sep 9, 2019 at 6:03 PM Wilco Dijkstra  wrote:
>>
>> Currently arm_legitimize_address doesn't handle Thumb-2 at all, resulting in
>> inefficient code.  Since Thumb-2 supports similar address offsets use the Arm
>> legitimization code for Thumb-2 to get significant codesize and performance
>> gains.  SPECINT2006 shows 0.4% gain on Cortex-A57, while SPECFP improves 
>> 0.2%.
>
> What were the sort of code size gains  ? It did end up piquing my
> curiosity as to how we missed something so basic.  For instance ldr

h264ref is 1% smaller, various other benchmarks a few hundred bytes to 1KB
smaller (~0.1%).

> r0, [r0, #-4080] is valid in Arm state but not in Thumb2. Thus if
>  there was an illegitimate address given here, would we end up
> producing plus (r0, -4080) ? Yeah a simple testcase doesn't work out.
> Scratching my head a bit , it's late at night.

If the proposed offsets are not legal, GCC tries something different that is
legal, hence there is no requirement to only propose legal splits. In any
case we don't optimize the negative range even on Arm - the offsets are
always split into 4KB ranges (not 8KB as would be possible).

The negative offset splitting doesn't appear to have changed on Thumb-2
so there is apparently a different mechanism that deals with negative offsets.
So only positive offsets are improved with my patch:

int f(int *p) { return p[1025] + p[1026]; }

before:
movwr3, #4104
movwr2, #4100
ldr r2, [r0, r2]
ldr r0, [r0, r3]
add r0, r0, r2
bx  lr

after:
add r3, r0, #4096
ldrdr0, r3, [r3, #4]
add r0, r0, r3
bx  lr

> Orthogonally it looks like you can clean up the MINUS handling here
> and in legitimate_address_p , I'm not sure what the status of LRA with
> MINUS is either and thus we should now look to clean this up or look
> to turn this on and see what happens. However that's a subject of a
> future patch.

Well there are lots of cases that aren't handled correctly or optimally
yet - I'd copy the way I wrote aarch64_legitimize_address_displacement,
but that's for a future patch indeed.

> For the record, bootstrap with Thumb2  presumably and the testruns were clean 
> ?

Yes, at the time I ran them.

Cheers,
Wilco

Re: [C++ Patch] One more DECL_SOURCE_LOCATION

2019-10-11 Thread Marek Polacek
On Fri, Oct 11, 2019 at 05:54:43PM +0200, Paolo Carlini wrote:
> Hi,
> 
> On 11/10/19 17:52, Jakub Jelinek wrote:
> > On Fri, Oct 11, 2019 at 05:34:16PM +0200, Paolo Carlini wrote:
> > > Hi again...
> > > 
> > > On 11/10/19 17:30, Paolo Carlini wrote:
> > > > Oh nice, I wasn't aware of that, to be honest, probably we should audit
> > > > the front-end for more such redundant uses.
> > > ... and I can confirm that we have *many*. If we agree that removing all 
> > > of
> > > them is the way to go I can do that in a follow up patch.
> > If they just guard a pedwarn/warning/warning_at etc., that's fine, if they
> > guard also some code that needs to compute something for the diagnostics,
> > it might not be redundant.

Yeah, much like with warn_foo guards.

> Indeed. We got many of the very straightforward ones ;)

Sounds like a nice cleanup!

--
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA


Re: [C++ Patch] One more DECL_SOURCE_LOCATION

2019-10-11 Thread Paolo Carlini

Hi,

On 11/10/19 17:52, Jakub Jelinek wrote:

On Fri, Oct 11, 2019 at 05:34:16PM +0200, Paolo Carlini wrote:

Hi again...

On 11/10/19 17:30, Paolo Carlini wrote:

Oh nice, I wasn't aware of that, to be honest, probably we should audit
the front-end for more such redundant uses.

... and I can confirm that we have *many*. If we agree that removing all of
them is the way to go I can do that in a follow up patch.

If they just guard a pedwarn/warning/warning_at etc., that's fine, if they
guard also some code that needs to compute something for the diagnostics,
it might not be redundant.


Indeed. We got many of the very straightforward ones ;)

Paolo.



[PATCH] Implement header for C++20

2019-10-11 Thread Jonathan Wakely

There are currently no tests for [concepts.compare], but they will be
added ASAP.

* include/Makefile.am: Add new header.
* include/Makefile.in: Regenerate.
* include/precompiled/stdc++.h: Include .
* include/std/concepts: New header for C++20.
* include/std/version (__cpp_lib_concepts): Define.
* scripts/create_testsuite_files: Look for test files in new std
directory.
* testsuite/libstdc++-dg/conformance.exp: Likewise.
* testsuite/std/concepts/concepts.callable/invocable.cc: New test.
* testsuite/std/concepts/concepts.callable/regular_invocable.cc: New
test.
* testsuite/std/concepts/concepts.callable/relation.cc: New test.
* testsuite/std/concepts/concepts.callable/strictweakorder.cc: New
test.
* testsuite/std/concepts/concepts.lang/concept.arithmetic/
floating_point.cc: New test.
* testsuite/std/concepts/concepts.lang/concept.arithmetic/integral.cc:
New test.
* testsuite/std/concepts/concepts.lang/concept.arithmetic/
signed_integral.cc: New test.
* testsuite/std/concepts/concepts.lang/concept.arithmetic/
unsigned_integral.cc: New test.
* testsuite/std/concepts/concepts.lang/concept.assignable/1.cc: New
test.
* testsuite/std/concepts/concepts.lang/concept.common/1.cc: New test.
* testsuite/std/concepts/concepts.lang/concept.commonref/1.cc: New
test.
* testsuite/std/concepts/concepts.lang/concept.constructible/1.cc:
New test.
* testsuite/std/concepts/concepts.lang/concept.convertible/1.cc:
New test.
* testsuite/std/concepts/concepts.lang/concept.copyconstructible/1.cc:
New test.
* testsuite/std/concepts/concepts.lang/concept.defaultconstructible/
1.cc: New test.
* testsuite/std/concepts/concepts.lang/concept.derived/1.cc:
New test.
* testsuite/std/concepts/concepts.lang/concept.destructible/1.cc:
New test.
* testsuite/std/concepts/concepts.lang/concept.moveconstructible/1.cc:
New test.
* testsuite/std/concepts/concepts.lang/concept.same/1.cc:
New test.
* testsuite/std/concepts/concepts.lang/concept.swappable/swap.cc:
New test.
* testsuite/std/concepts/concepts.lang/concept.swappable/swappable.cc:
New test.
* testsuite/std/concepts/concepts.lang/concept.swappable/
swappable_with.cc: New test.
* testsuite/std/concepts/concepts.object/copyable.cc: New test.
* testsuite/std/concepts/concepts.object/movable.cc: New test.
* testsuite/std/concepts/concepts.object/regular.cc: New test.
* testsuite/std/concepts/concepts.object/semiregular.cc: New test.

Tested x86_64-linux, committed to trunk.

commit a27d35a1cf5d2c8b0609006b67a561f54d1f0e73
Author: Jonathan Wakely 
Date:   Fri Oct 11 16:52:30 2019 +0100

Implement  header for C++20

There are currently no tests for [concepts.compare], but they will be
added ASAP.

* include/Makefile.am: Add new header.
* include/Makefile.in: Regenerate.
* include/precompiled/stdc++.h: Include .
* include/std/concepts: New header for C++20.
* include/std/version (__cpp_lib_concepts): Define.
* scripts/create_testsuite_files: Look for test files in new std
directory.
* testsuite/libstdc++-dg/conformance.exp: Likewise.
* testsuite/std/concepts/concepts.callable/invocable.cc: New test.
* testsuite/std/concepts/concepts.callable/regular_invocable.cc: New
test.
* testsuite/std/concepts/concepts.callable/relation.cc: New test.
* testsuite/std/concepts/concepts.callable/strictweakorder.cc: New
test.
* testsuite/std/concepts/concepts.lang/concept.arithmetic/
floating_point.cc: New test.
* 
testsuite/std/concepts/concepts.lang/concept.arithmetic/integral.cc:
New test.
* testsuite/std/concepts/concepts.lang/concept.arithmetic/
signed_integral.cc: New test.
* testsuite/std/concepts/concepts.lang/concept.arithmetic/
unsigned_integral.cc: New test.
* testsuite/std/concepts/concepts.lang/concept.assignable/1.cc: New
test.
* testsuite/std/concepts/concepts.lang/concept.common/1.cc: New 
test.
* testsuite/std/concepts/concepts.lang/concept.commonref/1.cc: New
test.
* testsuite/std/concepts/concepts.lang/concept.constructible/1.cc:
New test.
* testsuite/std/concepts/concepts.lang/concept.convertible/1.cc:
New test.
* 
testsuite/std/concepts/concepts.lang/concept.copyconstructible/1.cc:
New test.
* 

Re: [C++ Patch] One more DECL_SOURCE_LOCATION

2019-10-11 Thread Jakub Jelinek
On Fri, Oct 11, 2019 at 05:34:16PM +0200, Paolo Carlini wrote:
> Hi again...
> 
> On 11/10/19 17:30, Paolo Carlini wrote:
> > Oh nice, I wasn't aware of that, to be honest, probably we should audit
> > the front-end for more such redundant uses.
> 
> ... and I can confirm that we have *many*. If we agree that removing all of
> them is the way to go I can do that in a follow up patch.

If they just guard a pedwarn/warning/warning_at etc., that's fine, if they
guard also some code that needs to compute something for the diagnostics,
it might not be redundant.

Jakub


[PATCH] Use __is_same_as for std::is_same and std::is_same_v

2019-10-11 Thread Jonathan Wakely

By using the built-in we don't need to match a partial specialization
for std::is_same and don't need to instantiate std::is_same at all for
uses of std::is_same_v.

* include/std/type_traits (is_same): Replace partial specialization
by using __is_same_as built-in in primary template.
(is_same_v): Use __is_same_as built-in instead of instantiating the
is_same trait.

Tested x86_64-linux, committed to trunk.

commit ee208e530893aeb12b6a9edb1563a4e8ecaea887
Author: Jonathan Wakely 
Date:   Wed Sep 11 21:23:31 2019 +0100

Use __is_same_as for std::is_same and std::is_same_v

By using the built-in we don't need to match a partial specialization
for std::is_same and don't need to instantiate std::is_same at all for
uses of std::is_same_v.

* include/std/type_traits (is_same): Replace partial specialization
by using __is_same_as built-in in primary template.
(is_same_v): Use __is_same_as built-in instead of instantiating the
is_same trait.

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index dc8a019324d..4de5daa9f06 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1388,13 +1388,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Type relations.
 
   /// is_same
-  template
+  template
 struct is_same
-: public false_type { };
-
-  template
-struct is_same<_Tp, _Tp>
-: public true_type { };
+: public integral_constant
+{ };
 
   /// is_base_of
   template
@@ -3158,7 +3155,7 @@ template 
 template 
   inline constexpr size_t extent_v = extent<_Tp, _Idx>::value;
 template 
-  inline constexpr bool is_same_v = is_same<_Tp, _Up>::value;
+  inline constexpr bool is_same_v = __is_same_as(_Tp, _Up);
 template 
   inline constexpr bool is_base_of_v = is_base_of<_Base, _Derived>::value;
 template 


Re: [PATCH] PR fortran/91714 -- Mangled type-spec

2019-10-11 Thread Steve Kargl
On Fri, Oct 11, 2019 at 05:57:04PM +0300, Janne Blomqvist wrote:
> On Sat, Sep 28, 2019 at 10:16 PM Steve Kargl
>  wrote:
> >
> > The attached patch issues errors for a few mangled type-specs.
> > It has been regression tested on x86_64-*-freebsd.  OK to commit?
> >
> > 2019-09-28  Steven G. Kargl  
> >
> > PR fortran/91714
> > * decl.c (gfc_match_decl_type_spec):  Issue errors for a few
> > mangled types.
> >
> > 2019-09-28  Steven G. Kargl  
> >
> > PR fortran/91714
> > * gfortran.dg/dec_type_print_3.f90: Update dg-error regex.
> > * gfortran.dg/pr91714.f90: New test.
> >
> > A daunting task awaits a brave soul as gfc_match_decl_type_spec
> > is a minefield for bugs.  This is dues to the fact the the functions
> > has grown by adding kludge upon kludge upon kludge.  The first
> > thing to fix is the broken parsing that follows from the
> > matching of 'type('.
> >
> > --
> > Steve
> 
> Ok.
> 

This was committed as

r276270 | kargl | 2019-09-29 09:19:58 -0700 (Sun, 29 Sep 2019) | 12 lines

Again, I was getting a backlog of patches, and committed this
as somewhat obvious.

I will repeat that gfc_match_decl_type_spec has become a difficult
nightmare to work through.  The complication comes from the new
TYPE(INTRINSIC-TYPE-SPEC) feature of either F2008 or F2018.  In
developing this patch, I found that something like

   type(character(len=1) 

is accepted as valid, but it is missing the closing ')'.  I cannot
find why the closing ')' in character is used twice! 

A long range project would be to split the declaration matching into
gfc_match_type_decl_stmt and gfc_match_intrinsic_type_decl.  Then, we
can refactor gfc_match_decl_type_spec as

gfc_match_decl_type_spec
   gfc_match_type_decl_stmt
   gfc_match_intrinsic_type_decl 
   

There are complication because TYPE is overloaded via TYPE(*)
TYPE(derive-type), and a nonstandard TYPE for IO.

-- 
Steve


Re: [C++ Patch] One more DECL_SOURCE_LOCATION

2019-10-11 Thread Paolo Carlini

Hi again...

On 11/10/19 17:30, Paolo Carlini wrote:
Oh nice, I wasn't aware of that, to be honest, probably we should 
audit the front-end for more such redundant uses.


... and I can confirm that we have *many*. If we agree that removing all 
of them is the way to go I can do that in a follow up patch.


Thanks, Paolo.



Re: [PATCH] PR fortran/91943 -- More BOZ fallout

2019-10-11 Thread Steve Kargl
On Fri, Oct 11, 2019 at 05:55:45PM +0300, Janne Blomqvist wrote:
> On Tue, Oct 1, 2019 at 1:23 AM Steve Kargl
>  wrote:
> >
> > A BOZ cannot be an actual argument in a subroutine or function
> > reference except those intrinsic function listed in the Fortran
> > standard.  The attach patch checks for invalid BOZ.  Built and
> > tested on x86_64-*-freebsd.  OK to commit?
> >
> > 2019-09-30  Steven G. Kargl  
> >
> > PR fortran/91943
> > * match.c (gfc_match_call): BOZ cannot be an actual argument in
> > a subroutine reference.
> > * resolve.c (resolve_function): BOZ cannot be an actual argument in
> > a function reference.
> >
> > 2019-09-30  Steven G. Kargl  
> >
> > PR fortran/91943
> > gfortran.dg/pr91943.f90
> >
> > --
> > Steve
> 
> Ok, though I believe your other BOZ patch that I just reviewed applies
> on top of this one?
> 

Yes.  This patch was applied in

r276471 | kargl | 2019-10-02 10:01:30 -0700 (Wed, 02 Oct 2019) | 13 lines

I had a growing stack of patches interfering/interacting with
each other.  As I completely changed how BOZ are handled, I
figure that this one could be committed without a review.

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow


[SLP] SLP vectorization: vectorize vector constructors

2019-10-11 Thread Joel Hutton
Hi Richard,

Thanks for your help, I've reworked my SLP RFC based on your feedback.
 > I think a better place for the loop searching for CONSTRUCTORs is
 > vect_slp_analyze_bb_1 where I'd put it before the check you remove,
 > and I'd simply append found CONSTRUCTORs to the grouped_stores
 > array
I've moved this check into a separate function and called it from 
vect_slp_analyze_bb_1

 > The fixup you do in vectorizable_operation doesn't
 > belong there either, I'd add a new field to the SLP instance
 > structure refering to the CONSTRUCTOR stmt and do the fixup
 > in vect_schedule_slp_instance instead where you can simply
 > replace the CONSTRUCTOR with the vectorized SSA name then.

Done.

 > +   /* Check that the constructor elements are unique.  */
 > +   FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (rhs), i, val)
 > + {
 > +   tree prev_val;
 > +   int j;
 > +   FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (rhs), j,
 > prev_val)
 > +   {
 > + if (val == prev_val && i!=j)
 >
 > why's that necessary? (it looks incomplete, also doesn't catch
 > [duplicate] constants)

The thinking was that there was no benefit in vectorizing a constructor of
duplicates, or a vector of constants, although now you mention it that 
thinking
may be flawed. I've removed it

 > You miss to check that CONSTRUCTOR_NELTS == TYPE_VECTOR_SUBPARTS
 > (we can have omitted trailing zeros).
...
 > What happens if you have a vector constructor that is twice
 > as large as the machine supports?  The vectorizer will happily
 > produce a two vector SSA name vectorized result but your
 > CONSTRUCTOR replacement doesn't work here.  I think this should
 > be made work correctly (not give up on that case).

I've reworked the patch to account for this, by checking that the 
vectorized version
has one vectorized stmt at the root of the SLP tree. I'm not sure how to 
handle
a vector constructor twice as large as the machine supports, as far as I 
can see,
when only analyzing a within a basic block, the SSA name of the 
constructor has to
be maintained.


Currently SLP vectorization can build SLP trees starting from reductions or
from group stores. This patch adds a third starting point: vector 
constructors.

For the following test case (compiled with -O3):

char g_d[1024], g_s1[1024], g_s2[1024];
void test_loop(void)
{
   char d = g_d, s1 = g_s1, *s2 = g_s2;
   for ( int y = 0; y < 128; y++ )

   {
     for ( int x = 0; x < 16; x++ )
   d[x] = s1[x] + s2[x];
     d += 16;
   }
}

before patch:
test_loop:
.LFB0:
     .cfi_startproc
     adrp    x0, g_s1
     adrp    x2, g_s2
     add x3, x0, :lo12:g_s1
     add x4, x2, :lo12:g_s2
     ldrb    w7, [x2, #:lo12:g_s2]
     ldrb    w1, [x0, #:lo12:g_s1]
     adrp    x0, g_d
     ldrb    w6, [x4, 1]
     add x0, x0, :lo12:g_d
     ldrb    w5, [x3, 1]
     add w1, w1, w7
     fmov    s0, w1
     ldrb    w7, [x4, 2]
     add w5, w5, w6
     ldrb    w1, [x3, 2]
     ldrb    w6, [x4, 3]
     add x2, x0, 2048
     ins v0.b[1], w5
     add w1, w1, w7
     ldrb    w7, [x3, 3]
     ldrb    w5, [x4, 4]
     add w7, w7, w6
     ldrb    w6, [x3, 4]
     ins v0.b[2], w1
     ldrb    w8, [x4, 5]
     add w6, w6, w5
     ldrb    w5, [x3, 5]
     ldrb    w9, [x4, 6]
     add w5, w5, w8
     ldrb    w1, [x3, 6]
     ins v0.b[3], w7
     ldrb    w8, [x4, 7]
     add w1, w1, w9
     ldrb    w11, [x3, 7]
     ldrb    w7, [x4, 8]
     add w11, w11, w8
     ldrb    w10, [x3, 8]
     ins v0.b[4], w6
     ldrb    w8, [x4, 9]
     add w10, w10, w7
     ldrb    w9, [x3, 9]
     ldrb    w7, [x4, 10]
     add w9, w9, w8
     ldrb    w8, [x3, 10]
     ins v0.b[5], w5
     ldrb    w6, [x4, 11]
     add w8, w8, w7
     ldrb    w7, [x3, 11]
     ldrb    w5, [x4, 12]
     add w7, w7, w6
     ldrb    w6, [x3, 12]
     ins v0.b[6], w1
     ldrb    w12, [x4, 13]
     add w6, w6, w5
     ldrb    w5, [x3, 13]
     ldrb    w1, [x3, 14]
     add w5, w5, w12
     ldrb    w13, [x4, 14]
     ins v0.b[7], w11
     ldrb    w12, [x4, 15]
     add w4, w1, w13
     ldrb    w1, [x3, 15]
     add w1, w1, w12
     ins v0.b[8], w10
     ins v0.b[9], w9
     ins v0.b[10], w8
     ins v0.b[11], w7
     ins v0.b[12], w6
     ins v0.b[13], w5
     ins v0.b[14], w4
     ins v0.b[15], w1
     .p2align 3,,7
.L2:
     str q0, [x0], 16
     cmp x2, x0
     bne .L2
     ret
     .cfi_endproc
.LFE0:

After patch:

test_loop:
.LFB0:
     .cfi_startproc
     adrp    x3, g_s1
     adrp    x2, g_s2
  

[PATCH] PR libstdc++/92059 fix several bugs in tr2::dynamic_bitset

2019-10-11 Thread Jonathan Wakely

PR libstdc++/92059
* include/tr2/dynamic_bitset (__dynamic_bitset_base): Define all
special member functions as defaulted. Add noexcept to most members.
(__dynamic_bitset_base(size_t, unsigned long long, const _Alloc&)):
Mask off unwanted bits in the __val parameter. Avoid undefined left
shifts.
(__dynamic_bitset_base::_M_assign): Remove.
(__dynamic_bitset_base::_M_do_reset): Use std::fill.
(__dynamic_bitset_base::_M_are_all_aux): Avoid integer promotion when
block_type has lower rank than int.
(dynamic_bitset): Add noexcept to most members. Use injected-class-name
in return types and parameter types.
(dynamic_bitset::_M_Nb): Add default member initializer.
(dynamic_bitset(), dynamic_bitset(const dynamic_bitset&)): Define as
defaulted.
(dynamic_bitset(dynamic_bitset&&)): Clear source object after move.
(dynamic_bitset::operator=(const dynamic_bitset&)): Define as
defaulted.
(dynamic_bitset::operator=(dynamic_bitset&&)): Add noexcept-specifier.
Define without using swap, to propagate allocator correctly.
(dynamic_bitset(const char*, const _Alloc&)): Use strlen.
(dynamic_bitset::_M_do_sanitize, dynamic_bitset::_M_do_fill): Use
casts to avoid unwanted integer promotions.
(dynamic_bitset::_M_copy_from_ptr): Rearrange template parameters and
add default template arguments and default argument to simplify usage.
(dynamic_bitset::_M_copy_from_string): Adjust call to _M_copy_from_ptr.
(operator==(const dynamic_bitset&, const dynamic_bitset&))
(operator<(const dynamic_bitset&, const dynamic_bitset&)): Use _M_Nb.
* include/tr2/dynamic_bitset.tcc (dynamic_bitset::_M_copy_from_ptr):
Adjust template parameters to match declaration.
* testsuite/tr2/dynamic_bitset/cmp.cc: New test.
* testsuite/tr2/dynamic_bitset/cons.cc: New test.
* testsuite/tr2/dynamic_bitset/copy.cc: New test.
* testsuite/tr2/dynamic_bitset/move.cc: New test.
* testsuite/tr2/dynamic_bitset/pr92059.cc: New test.

Tested x86_64-linux. Committed to trunk.

This seems worth backporting too, as the type is now no longer totally
broken (e.g. it is now copy constructible, and the move constructor no
longer breaks the class invariant).


commit 18ed132a6b119bc2c23078848a8981ba96067a15
Author: redi 
Date:   Fri Oct 11 15:29:55 2019 +

PR libstdc++/92059 fix several bugs in tr2::dynamic_bitset

PR libstdc++/92059
* include/tr2/dynamic_bitset (__dynamic_bitset_base): Define all
special member functions as defaulted. Add noexcept to most members.
(__dynamic_bitset_base(size_t, unsigned long long, const _Alloc&)):
Mask off unwanted bits in the __val parameter. Avoid undefined left
shifts.
(__dynamic_bitset_base::_M_assign): Remove.
(__dynamic_bitset_base::_M_do_reset): Use std::fill.
(__dynamic_bitset_base::_M_are_all_aux): Avoid integer promotion 
when
block_type has lower rank than int.
(dynamic_bitset): Add noexcept to most members. Use 
injected-class-name
in return types and parameter types.
(dynamic_bitset::_M_Nb): Add default member initializer.
(dynamic_bitset(), dynamic_bitset(const dynamic_bitset&)): Define as
defaulted.
(dynamic_bitset(dynamic_bitset&&)): Clear source object after move.
(dynamic_bitset::operator=(const dynamic_bitset&)): Define as
defaulted.
(dynamic_bitset::operator=(dynamic_bitset&&)): Add 
noexcept-specifier.
Define without using swap, to propagate allocator correctly.
(dynamic_bitset(const char*, const _Alloc&)): Use strlen.
(dynamic_bitset::_M_do_sanitize, dynamic_bitset::_M_do_fill): Use
casts to avoid unwanted integer promotions.
(dynamic_bitset::_M_copy_from_ptr): Rearrange template parameters 
and
add default template arguments and default argument to simplify 
usage.
(dynamic_bitset::_M_copy_from_string): Adjust call to 
_M_copy_from_ptr.
(operator==(const dynamic_bitset&, const dynamic_bitset&))
(operator<(const dynamic_bitset&, const dynamic_bitset&)): Use 
_M_Nb.
* include/tr2/dynamic_bitset.tcc (dynamic_bitset::_M_copy_from_ptr):
Adjust template parameters to match declaration.
* testsuite/tr2/dynamic_bitset/cmp.cc: New test.
* testsuite/tr2/dynamic_bitset/cons.cc: New test.
* testsuite/tr2/dynamic_bitset/copy.cc: New test.
* testsuite/tr2/dynamic_bitset/move.cc: New test.
* testsuite/tr2/dynamic_bitset/pr92059.cc: New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@276890 

Re: [C++ Patch] One more DECL_SOURCE_LOCATION

2019-10-11 Thread Paolo Carlini

Hi,

On 11/10/19 17:23, Marek Polacek wrote:

I mean the latter; pedwarn will check for diagnostic_report_warnings_p so
the pedwarn will not trigger in a system header unless -Wsystem-headers
even without that check.


Oh nice, I wasn't aware of that, to be honest, probably we should audit 
the front-end for more such redundant uses. Anyway, consider it dropped 
in this case, I'm re-running the testsuite to be safe.


Thanks, Paolo.



[PATCH] Avoid warnings in

2019-10-11 Thread Jonathan Wakely

* include/bits/charconv.h (__to_chars_len): Avoid -Wsign-compare
warnings.

Tested x86_64-linux. Committed to trunk.
commit 0839fa95cf5775f0022b35cda01bb0d6c27ca08a
Author: Jonathan Wakely 
Date:   Fri Oct 11 16:10:24 2019 +0100

Avoid warnings in 

* include/bits/charconv.h (__to_chars_len): Avoid -Wsign-compare
warnings.

diff --git a/libstdc++-v3/include/bits/charconv.h 
b/libstdc++-v3/include/bits/charconv.h
index a5b6be567bc..7c0922fec21 100644
--- a/libstdc++-v3/include/bits/charconv.h
+++ b/libstdc++-v3/include/bits/charconv.h
@@ -50,16 +50,16 @@ namespace __detail
   static_assert(is_unsigned<_Tp>::value, "implementation bug");
 
   unsigned __n = 1;
-  const int __b2 = __base  * __base;
-  const int __b3 = __b2 * __base;
-  const int __b4 = __b3 * __base;
+  const unsigned __b2 = __base  * __base;
+  const unsigned __b3 = __b2 * __base;
+  const unsigned long __b4 = __b3 * __base;
   for (;;)
{
- if (__value < __base) return __n;
+ if (__value < (unsigned)__base) return __n;
  if (__value < __b2) return __n + 1;
  if (__value < __b3) return __n + 2;
  if (__value < __b4) return __n + 3;
- __value /= (unsigned)__b4;
+ __value /= __b4;
  __n += 4;
}
 }


Re: [PATCH] add __has_builtin (PR 66970)

2019-10-11 Thread Martin Sebor

Ping: https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00062.html

On 10/1/19 11:16 AM, Martin Sebor wrote:

Attached is an implementation of the __has_builtin special
preprocessor operator/macro analogous to __has_attribute and
(hopefully) compatible with the synonymous Clang feature (I
couldn't actually find tests for it in the Clang test suite
but if someone points me at them I'll verify it).

Tested on x86_64-linux.

Martin

PS I couldn't find an existing API to test whether a reserved
symbol like __builtin_offsetof is a function-like built-in so
I hardwired the tests for C and C++ into the new names_builtin_p
functions.  I don't like this very much because the next time
such an operator is added there is nothing to remind us to update
the functions.  Adding a flag to the c_common_reswords array would
solve the problem but at the expense of a linear search through
it.  Does anyone have a suggestion for how to do this better?




Re: [C++ Patch] One more DECL_SOURCE_LOCATION

2019-10-11 Thread Marek Polacek
On Fri, Oct 11, 2019 at 05:14:33PM +0200, Paolo Carlini wrote:
> Hi,
> 
> On 11/10/19 15:37, Marek Polacek wrote:
> > 
> > > Index: cp/decl.c
> > > ===
> > > --- cp/decl.c (revision 276845)
> > > +++ cp/decl.c (working copy)
> > > @@ -4992,7 +4992,8 @@ check_tag_decl (cp_decl_specifier_seq *declspecs,
> > > if (TREE_CODE (declared_type) != UNION_TYPE
> > > && !in_system_header_at (input_location))
> > > - pedwarn (input_location, OPT_Wpedantic, "ISO C++ prohibits anonymous 
> > > structs");
> > > + pedwarn (DECL_SOURCE_LOCATION (TYPE_MAIN_DECL (declared_type)),
> > > +  OPT_Wpedantic, "ISO C++ prohibits anonymous structs");
> > The change looks fine but the in_system_header_at line can be dropped, 
> > right?
> 
> What do you mean, exactly? Do you mean that, more correctly, we should use
> DECL_SOURCE_LOCATION for it too or that in fact is and was already
> completely redundant? I agree with the former, I didn't bother changing it
> (likely in a couple of other places too) because in practice input_location
> should work fine anyway as far as noticing that we are in system header is
> concerned and is simpler. If you mean the latter, I'm not sure, I don't
> really see why...

I mean the latter; pedwarn will check for diagnostic_report_warnings_p so
the pedwarn will not trigger in a system header unless -Wsystem-headers
even without that check.

I know you're just changing the location so you don't need to drop it.

I wouldn't worry about the former, I guess it would only make a difference
when the { comes from a system header and the } doesn't.

--
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA


Re: Fix unchecked use of tree_to_uhwi in tree-ssa-strlen.c

2019-10-11 Thread Martin Sebor

On 10/11/19 8:47 AM, Richard Sandiford wrote:

r273783 introduced an unchecked use of tree_to_uhwi.  This is
tested by the SVE ACLE patches, but could potentially trigger
in non-SVE cases too.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?


I wasn't able to come up with a C test case that triggers the bug
but the fix seems obviously safe and correct to me.  Thanks for
handling it!

Martin



Richard


2019-10-11  Richard Sandiford  

gcc/
* tree-ssa-strlen.c (count_nonzero_bytes): Check tree_fits_uhwi_p
before using tree_to_uhwi.

Index: gcc/tree-ssa-strlen.c
===
--- gcc/tree-ssa-strlen.c   2019-10-11 15:43:51.127514545 +0100
+++ gcc/tree-ssa-strlen.c   2019-10-11 15:46:11.718524445 +0100
@@ -4026,10 +4026,10 @@ count_nonzero_bytes (tree exp, unsigned
  
/* The size of the MEM_REF access determines the number of bytes.  */

tree type = TREE_TYPE (exp);
-  if (tree typesize = TYPE_SIZE_UNIT (type))
-   nbytes = tree_to_uhwi (typesize);
-  else
+  tree typesize = TYPE_SIZE_UNIT (type);
+  if (!typesize || !tree_fits_uhwi_p (typesize))
return false;
+  nbytes = tree_to_uhwi (typesize);
  
/* Handle MEM_REF = SSA_NAME types of assignments.  */

return count_nonzero_bytes (arg, offset, nbytes, lenrange, nulterm,





Re: [C++ Patch] One more DECL_SOURCE_LOCATION

2019-10-11 Thread Paolo Carlini

Hi,

On 11/10/19 15:37, Marek Polacek wrote:



Index: cp/decl.c
===
--- cp/decl.c   (revision 276845)
+++ cp/decl.c   (working copy)
@@ -4992,7 +4992,8 @@ check_tag_decl (cp_decl_specifier_seq *declspecs,
  
if (TREE_CODE (declared_type) != UNION_TYPE

  && !in_system_header_at (input_location))
-   pedwarn (input_location, OPT_Wpedantic, "ISO C++ prohibits anonymous 
structs");
+   pedwarn (DECL_SOURCE_LOCATION (TYPE_MAIN_DECL (declared_type)),
+OPT_Wpedantic, "ISO C++ prohibits anonymous structs");

The change looks fine but the in_system_header_at line can be dropped, right?


What do you mean, exactly? Do you mean that, more correctly, we should 
use DECL_SOURCE_LOCATION for it too or that in fact is and was already 
completely redundant? I agree with the former, I didn't bother changing 
it (likely in a couple of other places too) because in practice 
input_location should work fine anyway as far as noticing that we are in 
system header is concerned and is simpler. If you mean the latter, I'm 
not sure, I don't really see why...


Paolo.



Re: [PATCH] PR fortran/91714 -- Mangled type-spec

2019-10-11 Thread Janne Blomqvist
On Sat, Sep 28, 2019 at 10:16 PM Steve Kargl
 wrote:
>
> The attached patch issues errors for a few mangled type-specs.
> It has been regression tested on x86_64-*-freebsd.  OK to commit?
>
> 2019-09-28  Steven G. Kargl  
>
> PR fortran/91714
> * decl.c (gfc_match_decl_type_spec):  Issue errors for a few
> mangled types.
>
> 2019-09-28  Steven G. Kargl  
>
> PR fortran/91714
> * gfortran.dg/dec_type_print_3.f90: Update dg-error regex.
> * gfortran.dg/pr91714.f90: New test.
>
> A daunting task awaits a brave soul as gfc_match_decl_type_spec
> is a minefield for bugs.  This is dues to the fact the the functions
> has grown by adding kludge upon kludge upon kludge.  The first
> thing to fix is the broken parsing that follows from the
> matching of 'type('.
>
> --
> Steve

Ok.

-- 
Janne Blomqvist


[PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)

2019-10-11 Thread Stam Markianos-Wright
Hi all,

This is a patch for an issue where the compiler was generating a 
conditional branch in Thumb2, which was too far for b{cond} to handle.

This was originally reported at binutils:
https://sourceware.org/bugzilla/show_bug.cgi?id=24991

And then raised for GCC:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91816


As can be seen here:

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489c/Cihfddaf.html

the range of a 32-bit Thumb B{cond} is +/-1MB.

This is now checked for in arm.md and an unconditional branch is 
generated if the jump would be greater than 1MB.

New test has been written that checks this for: beq (if (a)), bne (if 
(a==1))

Patch bootstrapped and regression tested on arm-none-linux-gnueabihf, 
however, on my native Aarch32 setup the test times out when run as part 
of a big "make check-gcc" regression, but not when run individually.

Patch also regression tested on arm-none-eabi, arm-none-linux-gnueabi 
with no issues.

Also, I don't have commit rights yet, so could someone commit it on my 
behalf?

Thanks,
Stam Markianos-Wright



gcc/ChangeLog:

2019-10-11  Stamatis Markianos-Wright 

* config/arm/arm.md: Update b for Thumb2 range checks.
* config/arm/arm.c: New function arm_gen_far_branch.
* config/arm/arm-protos.h: New function arm_gen_far_branch
prototype.

gcc/testsuite/ChangeLog:

2019-10-11  Stamatis Markianos-Wright 

* testsuite/gcc.target/arm/pr91816.c: New test.
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index f995974f9bb..1dce333d1c3 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -570,4 +570,7 @@ void arm_parse_option_features (sbitmap, const cpu_arch_option *,
 
 void arm_initialize_isa (sbitmap, const enum isa_feature *);
 
+const char * arm_gen_far_branch (rtx *, int,const char * , const char *);
+
+
 #endif /* ! GCC_ARM_PROTOS_H */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 39e1a1ef9a2..1a693d2ddca 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -32139,6 +32139,31 @@ arm_run_selftests (void)
 }
 } /* Namespace selftest.  */
 
+
+/* Generate code to enable conditional branches in functions over 1 MiB.  */
+const char *
+arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
+			const char * branch_format)
+{
+  rtx_code_label * tmp_label = gen_label_rtx ();
+  char label_buf[256];
+  char buffer[128];
+  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
+			CODE_LABEL_NUMBER (tmp_label));
+  const char *label_ptr = arm_strip_name_encoding (label_buf);
+  rtx dest_label = operands[pos_label];
+  operands[pos_label] = tmp_label;
+
+  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
+  output_asm_insn (buffer, operands);
+
+  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, label_ptr);
+  operands[pos_label] = dest_label;
+  output_asm_insn (buffer, operands);
+  return "";
+}
+
+
 #undef TARGET_RUN_TARGET_SELFTESTS
 #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
 #endif /* CHECKING_P */
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index f861c72ccfc..634fd0a59da 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -6686,9 +6686,16 @@
 ;; And for backward branches we have 
 ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 4).
 ;;
+;; In 16-bit Thumb these ranges are:
 ;; For a 'b'   pos_range = 2046, neg_range = -2048 giving (-2040->2048).
 ;; For a 'b' pos_range = 254,  neg_range = -256  giving (-250 ->256).
 
+;; In 32-bit Thumb these ranges are:
+;; For a 'b'   +/- 16MB is not checked for.
+;; For a 'b' pos_range = 1048574,  neg_range = -1048576  giving
+;; (-1048568 -> 1048576).
+
+
 (define_expand "cbranchsi4"
   [(set (pc) (if_then_else
 	  (match_operator 0 "expandable_comparison_operator"
@@ -6947,22 +6954,42 @@
 		  (pc)))]
   "TARGET_32BIT"
   "*
-  if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
-{
-  arm_ccfsm_state += 2;
-  return \"\";
-}
-  return \"b%d1\\t%l0\";
+ if (arm_ccfsm_state == 1 || arm_ccfsm_state == 2)
+  {
+	arm_ccfsm_state += 2;
+	return \"\";
+  }
+ switch (get_attr_length (insn))
+  {
+	// Thumb2 16-bit b{cond}
+	case 2:
+
+	// Thumb2 32-bit b{cond}
+	case 4: return \"b%d1\\t%l0\";break;
+
+	// Thumb2 b{cond} out of range.  Use unconditional branch.
+	case 8: return arm_gen_far_branch \
+		(operands, 0, \"Lbcond\", \"b%D1\t\");
+	break;
+
+	// A32 b{cond}
+	default: return \"b%d1\\t%l0\";
+  }
   "
   [(set_attr "conds" "use")
(set_attr "type" "branch")
(set (attr "length")
-	(if_then_else
-	   (and (match_test "TARGET_THUMB2")
-		(and (ge (minus (match_dup 0) (pc)) (const_int -250))
-		 (le (minus (match_dup 0) (pc)) (const_int 256
-	   (const_int 2)
-	   (const_int 4)))]
+	(if_then_else (match_test "TARGET_THUMB2")
+	(if_then_else (and (ge (minus (match_dup 0) (pc)) (const_int -250))
+	(le (minus 

Re: [PATCH] PR fortran/91943 -- More BOZ fallout

2019-10-11 Thread Janne Blomqvist
On Tue, Oct 1, 2019 at 1:23 AM Steve Kargl
 wrote:
>
> A BOZ cannot be an actual argument in a subroutine or function
> reference except those intrinsic function listed in the Fortran
> standard.  The attach patch checks for invalid BOZ.  Built and
> tested on x86_64-*-freebsd.  OK to commit?
>
> 2019-09-30  Steven G. Kargl  
>
> PR fortran/91943
> * match.c (gfc_match_call): BOZ cannot be an actual argument in
> a subroutine reference.
> * resolve.c (resolve_function): BOZ cannot be an actual argument in
> a function reference.
>
> 2019-09-30  Steven G. Kargl  
>
> PR fortran/91943
> gfortran.dg/pr91943.f90
>
> --
> Steve

Ok, though I believe your other BOZ patch that I just reviewed applies
on top of this one?

-- 
Janne Blomqvist


Re: [PATCH] PR fortran/92018 -- BOZ the gift that keeps giving

2019-10-11 Thread Janne Blomqvist
On Wed, Oct 9, 2019 at 2:14 AM Steve Kargl
 wrote:
>
> Tested on x86_64-*-freebsd.  OK to commit?
>
> A BOZ literal constant can be an actual argument in a
> very limited number of intrinsic subprograms.  For those
> intrinsics subprograms, the BOZ literal constant is converted
> either during checking (see check.c) or simplification
> (see simplify.c).  In resolve.c (resolve_function), I added
> code that would walk the actual argument list to check for a
> BOZ, but that code was restricted to functions with the EXTERNAL
> attribute.
>
> The new testcase, pr92018.f90, demonstrates a situation
> when neither the INTRINSIC and EXTERNAL attribute is set,
> and the actual argument list contains BOZ.  This led to
> an ICE.  The patch removes the previous restriction, and
> so the actual arguments for all functions are checked.
> This works except it pointed to a deficiency in the checking
> routines.  If something was rejected, (e.g., IAND(Z'12',Z34')),
> the BOZ were passed onto resolve_function() and run-on errors
> were reported.  To avoid these additional error messages, I have
> added the reset_boz() function, which converts a rejected
> BOZ to a default integer kind 0.
>
> 2019-10-09  Steven G. Kargl  
>
> PF fortran/92018
> * check.c (reset_boz): New function.
> (illegal_boz_arg, boz_args_check, gfc_check_complex, gfc_check_float,
> gfc_check_transfer): Use it.
> (gfc_check_dshift): Use reset_boz, and re-arrange the checking to
> help suppress possible run-on errors.
> (gfc_check_and): Restore checks for valid argument types.  Use
> reset_boz, and re-arrange the checking to help suppress possible
> run-on errors.
> * resolve.c (resolve_function): Actual arguments cannot be BOZ in
> a function reference.
>
> 2019-10-09  Steven G. Kargl  
>
> PF fortran/92018
> * gfortran.dg/gnu_logical_2.f90: Update dg-error regex.
> * gfortran.dg/pr81509_2.f90: Ditto.
> * gfortran.dg/pr92018.f90: New test.
>
> --
> Steve

Ok.

-- 
Janne Blomqvist


Re: [PATCH][ARM] Tweak HONOR_REG_ALLOC_ORDER

2019-10-11 Thread Wilco Dijkstra
Hi Ramana,

> My only question would be whether it's more suitable to use
> optimize_function_for_size_p(cfun) instead as IIRC that gives us a
> chance with lto rather than the global optimize_size.

Yes that is even better and that defaults to optimize_size if cfun isn't
set. I've committed this:

diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 
8b67c9c3657b312be223ab60c01969958baa9ed3..5fad1e5bcc2bc448489fdc8239c676246bbc8879
 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -1068,9 +1068,8 @@ extern int arm_regs_in_sequence[];
 /* Use different register alloc ordering for Thumb.  */
 #define ADJUST_REG_ALLOC_ORDER arm_order_regs_for_local_alloc ()
 
-/* Tell IRA to use the order we define rather than messing it up with its
-   own cost calculations.  */
-#define HONOR_REG_ALLOC_ORDER 1
+/* Tell IRA to use the order we define when optimizing for size.  */
+#define HONOR_REG_ALLOC_ORDER optimize_function_for_size_p (cfun)
 
 /* Interrupt functions can only use registers that have already been
saved by the prologue, even if they would normally be


Re: [PATCH] PR fortran/92019 -- BOZ cannot be an array index

2019-10-11 Thread Janne Blomqvist
On Wed, Oct 9, 2019 at 11:26 PM Steve Kargl
 wrote:
>
> The attach patch fixes an ICE caused by the use of a bOZ as
> an array index.  Tested on x86_64-*-freebsd.  OK to commit?
>
> 2019-10-09  Steven G. Kargl  
>
> PR fortran/92019
> * array.c (match_subscript): BOZ cannot be an array subscript.
>
> 2019-10-09  Steven G. Kargl  
>
> PR fortran/92019
> * gfortran.dg/pr92019.f90: New test.
>
> --
> Steve

Ok.

-- 
Janne Blomqvist


Fix unchecked use of tree_to_uhwi in tree-ssa-strlen.c

2019-10-11 Thread Richard Sandiford
r273783 introduced an unchecked use of tree_to_uhwi.  This is
tested by the SVE ACLE patches, but could potentially trigger
in non-SVE cases too.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


2019-10-11  Richard Sandiford  

gcc/
* tree-ssa-strlen.c (count_nonzero_bytes): Check tree_fits_uhwi_p
before using tree_to_uhwi.

Index: gcc/tree-ssa-strlen.c
===
--- gcc/tree-ssa-strlen.c   2019-10-11 15:43:51.127514545 +0100
+++ gcc/tree-ssa-strlen.c   2019-10-11 15:46:11.718524445 +0100
@@ -4026,10 +4026,10 @@ count_nonzero_bytes (tree exp, unsigned
 
   /* The size of the MEM_REF access determines the number of bytes.  */
   tree type = TREE_TYPE (exp);
-  if (tree typesize = TYPE_SIZE_UNIT (type))
-   nbytes = tree_to_uhwi (typesize);
-  else
+  tree typesize = TYPE_SIZE_UNIT (type);
+  if (!typesize || !tree_fits_uhwi_p (typesize))
return false;
+  nbytes = tree_to_uhwi (typesize);
 
   /* Handle MEM_REF = SSA_NAME types of assignments.  */
   return count_nonzero_bytes (arg, offset, nbytes, lenrange, nulterm,


[committed] Relax store_bit_field call in store_expr

2019-10-11 Thread Richard Sandiford
store_bit_field already takes a poly_uint64 size, so we can relax the
INTVAL to rtx_to_poly_int64.  This is tested by the SVE ACLE patches.

Tested on aarch64-linux-gnu and x86_64-linux-gnu, applied as obvious.

Richard


2019-10-11  Richard Sandiford  

gcc/
* expr.c (store_expr): Use rtx_to_poly_int64 rather than
INTVAL when calling store_bit_field.

Index: gcc/expr.c
===
--- gcc/expr.c  2019-10-06 16:37:30.845042970 +0100
+++ gcc/expr.c  2019-10-11 15:44:02.755432660 +0100
@@ -5790,7 +5790,8 @@ store_expr (tree exp, rtx target, int ca
copy_blkmode_from_reg (target, temp, TREE_TYPE (exp));
  else
store_bit_field (target,
-INTVAL (expr_size (exp)) * BITS_PER_UNIT,
+rtx_to_poly_int64 (expr_size (exp))
+* BITS_PER_UNIT,
 0, 0, 0, GET_MODE (temp), temp, reverse);
}
  else


Add a constant_range_value_p function (PR 92033)

2019-10-11 Thread Richard Sandiford
The range-tracking code has a pretty hard-coded assumption that
is_gimple_min_invariant is equivalent to "INTEGER_CST or invariant
ADDR_EXPR".  It seems better to add a predicate specifically for
that rather than contiually fight cases in which it can't handle
other invariants.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


2019-10-11  Richard Sandiford  

gcc/
PR tree-optimization/92033
* tree-vrp.h (constant_range_value_p): Declare.
* tree-vrp.c (constant_range_value_p): New function.
(value_range_base::symbolic_p, value_range_base::singleton_p)
(get_single_symbol, compare_values_warnv, intersect_ranges)
(value_range_base::normalize_symbolics): Use it instead of
is_gimple_min_invariant.
(simplify_stmt_for_jump_threading): Likewise.
* vr-values.c (symbolic_range_based_on_p, valid_value_p): Likewise.
(vr_values::op_with_constant_singleton_value_range): Likewise.
(vr_values::extract_range_from_binary_expr): Likewise.
(vr_values::extract_range_from_unary_expr): Likewise.
(vr_values::extract_range_from_cond_expr): Likewise.
(vr_values::extract_range_from_comparison): Likewise.
(vr_values::extract_range_from_assignment): Likewise.
(vr_values::adjust_range_with_scev, vrp_valueize): Likewise.
(vr_values::vrp_visit_assignment_or_call): Likewise.
(vr_values::vrp_evaluate_conditional): Likewise.
(vr_values::simplify_bit_ops_using_ranges): Likewise.
(test_for_singularity): Likewise.
(vr_values::simplify_cond_using_ranges_1): Likewise.

Index: gcc/tree-vrp.h
===
--- gcc/tree-vrp.h  2019-10-08 09:23:31.282533990 +0100
+++ gcc/tree-vrp.h  2019-10-11 15:41:20.380576059 +0100
@@ -284,6 +284,7 @@ value_range_base::supports_type_p (tree
   return false;
 }
 
+extern bool constant_range_value_p (const_tree);
 extern void register_edge_assert_for (tree, edge, enum tree_code,
  tree, tree, vec &);
 extern bool stmt_interesting_for_vrp (gimple *);
Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c  2019-10-08 09:23:31.282533990 +0100
+++ gcc/tree-vrp.c  2019-10-11 15:41:20.380576059 +0100
@@ -78,6 +78,18 @@ ranges_from_anti_range (const value_rang
for still active basic-blocks.  */
 static sbitmap *live;
 
+/* Return true if VALUE is considered constant for range tracking.
+   This is stricter than is_gimple_min_invariant and should be
+   used instead of it in range-related code.  */
+
+bool
+constant_range_value_p (const_tree value)
+{
+  return (TREE_CODE (value) == INTEGER_CST
+ || (TREE_CODE (value) == ADDR_EXPR
+ && is_gimple_invariant_address (value)));
+}
+
 void
 value_range::set_equiv (bitmap equiv)
 {
@@ -273,8 +285,8 @@ value_range_base::symbolic_p () const
 {
   return (!varying_p ()
  && !undefined_p ()
- && (!is_gimple_min_invariant (m_min)
- || !is_gimple_min_invariant (m_max)));
+ && (!constant_range_value_p (m_min)
+ || !constant_range_value_p (m_max)));
 }
 
 /* NOTE: This is not the inverse of symbolic_p because the range
@@ -388,7 +400,7 @@ value_range_base::singleton_p (tree *res
 }
   if (m_kind == VR_RANGE
   && vrp_operand_equal_p (min (), max ())
-  && is_gimple_min_invariant (min ()))
+  && constant_range_value_p (min ()))
 {
   if (result)
 *result = min ();
@@ -953,13 +965,13 @@ get_single_symbol (tree t, bool *neg, tr
   || TREE_CODE (t) == POINTER_PLUS_EXPR
   || TREE_CODE (t) == MINUS_EXPR)
 {
-  if (is_gimple_min_invariant (TREE_OPERAND (t, 0)))
+  if (constant_range_value_p (TREE_OPERAND (t, 0)))
{
  neg_ = (TREE_CODE (t) == MINUS_EXPR);
  inv_ = TREE_OPERAND (t, 0);
  t = TREE_OPERAND (t, 1);
}
-  else if (is_gimple_min_invariant (TREE_OPERAND (t, 1)))
+  else if (constant_range_value_p (TREE_OPERAND (t, 1)))
{
  neg_ = false;
  inv_ = TREE_OPERAND (t, 1);
@@ -1106,8 +1118,8 @@ compare_values_warnv (tree val1, tree va
  TYPE_SIGN (TREE_TYPE (val1)));
 }
 
-  const bool cst1 = is_gimple_min_invariant (val1);
-  const bool cst2 = is_gimple_min_invariant (val2);
+  const bool cst1 = constant_range_value_p (val1);
+  const bool cst2 = constant_range_value_p (val2);
 
   /* If one is of the form '[-]NAME + CST' and the other is constant, then
  it might be possible to say something depending on the constants.  */
@@ -5785,7 +5797,7 @@ intersect_ranges (enum value_range_kind
  correct estimate unless VR1 is a constant singleton range
  in which case we choose that.  */
   if (vr1type == VR_RANGE
-  && is_gimple_min_invariant (vr1min)
+  && constant_range_value_p (vr1min)
   && 

Add expr_callee_abi

2019-10-11 Thread Richard Sandiford
This turned out to be useful for the SVE PCS support, and is a natural
tree-level analogue of insn_callee_abi.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


2019-10-11  Richard Sandiford  

gcc/
* function-abi.h (expr_callee_abi): Declare.
* function-abi.cc (expr_callee_abi): New function.

Index: gcc/function-abi.h
===
--- gcc/function-abi.h  2019-09-30 17:39:33.514597856 +0100
+++ gcc/function-abi.h  2019-10-11 15:38:54.141605718 +0100
@@ -315,5 +315,6 @@ call_clobbered_in_region_p (unsigned int
 extern const predefined_function_abi _abi (const_tree);
 extern function_abi fndecl_abi (const_tree);
 extern function_abi insn_callee_abi (const rtx_insn *);
+extern function_abi expr_callee_abi (const_tree);
 
 #endif
Index: gcc/function-abi.cc
===
--- gcc/function-abi.cc 2019-09-30 17:39:33.514597856 +0100
+++ gcc/function-abi.cc 2019-10-11 15:38:54.141605718 +0100
@@ -229,3 +229,32 @@ insn_callee_abi (const rtx_insn *insn)
 
   return default_function_abi;
 }
+
+/* Return the ABI of the function called by CALL_EXPR EXP.  Return the
+   default ABI for erroneous calls.  */
+
+function_abi
+expr_callee_abi (const_tree exp)
+{
+  gcc_assert (TREE_CODE (exp) == CALL_EXPR);
+
+  if (tree fndecl = get_callee_fndecl (exp))
+return fndecl_abi (fndecl);
+
+  tree callee = CALL_EXPR_FN (exp);
+  if (callee == error_mark_node)
+return default_function_abi;
+
+  tree type = TREE_TYPE (callee);
+  if (type == error_mark_node)
+return default_function_abi;
+
+  if (POINTER_TYPE_P (type))
+{
+  type = TREE_TYPE (type);
+  if (type == error_mark_node)
+   return default_function_abi;
+}
+
+  return fntype_abi (type);
+}


Re: Ping: [C][C++] Avoid exposing internal details in aka types

2019-10-11 Thread Marek Polacek
On Fri, Oct 11, 2019 at 03:14:09PM +0100, Richard Sandiford wrote:
> Marek Polacek  writes:
> > On Thu, Oct 10, 2019 at 08:00:53PM +0100, Richard Sandiford wrote:
> >> Ping
> >> 
> >> Richard Sandiford  writes:
> >> > The current aka diagnostics can sometimes leak internal details that
> >> > seem more likely to be distracting than useful.  E.g. on aarch64:
> >> >
> >> >   void f (va_list *va) { *va = 1; }
> >> >
> >> > gives:
> >> >
> >> >   incompatible types when assigning to type ‘va_list’ {aka ‘__va_list’} 
> >> > from type ‘int’
> >> >
> >> > where __va_list isn't something the user is expected to know about.
> >> > A similar thing happens for C++ on the arm_neon.h-based:
> >> >
> >> >   float x;
> >> >   int8x8_t y = x;
> >> >
> >> > which gives:
> >> >
> >> >   cannot convert ‘float’ to ‘int8x8_t’ {aka ‘__Int8x8_t’} in 
> >> > initialization
> >> >
> >> > This is accurate -- and __Int8x8_t is defined by the AArch64 PCS --
> >> > but it's not going to be meaningful to most users.
> >
> > Agreed.
> >
> >> +/* Return true if it is worth exposing the DECL_ORIGINAL_TYPE of TYPE to
> >> +   the user in diagnostics, false if it would be better to use TYPE 
> >> itself.
> >> +   TYPE is known to satisfy typedef_variant_p.  */
> >> +
> >> +bool
> >> +user_facing_original_type_p (const_tree type)
> >> +{
> >> +  gcc_assert (typedef_variant_p (type));
> >> +  tree decl = TYPE_NAME (type);
> >> +
> >> +  /* Look through any typedef in "user" code.  */
> >> +  if (!DECL_IN_SYSTEM_HEADER (decl) && !DECL_IS_BUILTIN (decl))
> >> +return true;
> >> +
> >> +  /* If the original type is also named and is in the user namespace,
> >> + assume it too is a user-facing type.  */
> >> +  tree orig_type = DECL_ORIGINAL_TYPE (decl);
> >> +  if (tree orig_id = TYPE_IDENTIFIER (orig_type))
> >> +{
> >> +  const char *name = IDENTIFIER_POINTER (orig_id);
> >> +  if (name[0] != '_' || (name[1] != '_' && !ISUPPER (name[1])))
> >> +  return true;
> >
> > This looks like name_reserved_for_implementation_p.
> >
> > The rest looks fine to me!
> 
> Ah, nice!  I'd looked for a helper but missed that one.
> 
> Here's just the C parts, with that change.  Tested on aarch64-linux-gnu
> and x86_64-linux-gnu.  OK to install?

Ok, thanks!

--
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA


Re: [RFH][libgcc] fp-bit bit ordering (PR 78804)

2019-10-11 Thread Oleg Endo
On Thu, 2019-10-03 at 19:34 -0600, Jeff Law wrote:
> 
> So probably the most interesting target for this test is v850-elf as
> it's got a reasonably well functioning simulator, hard and soft FP
> targets, little endian, and I'm familiar with its current set of
> failures.
> 
> I can confirm that your patch makes no difference in the test results
> (which includes execution results).
> 
> In fact, there haven't been any problems on any target in my tester
> that
> I can tie back to this change.
> 
> At this point I'd say let's go for it.
> 

Thanks, Jeff.  I'll commit it to trunk if there are no further
objections some time next week.

It's a latent issue, not a regression.  OK for the open branches, too? 
Any opinions on that?

Cheers,
Oleg



Re: Ping: [C][C++] Avoid exposing internal details in aka types

2019-10-11 Thread Richard Sandiford
Richard Sandiford  writes:
> Marek Polacek  writes:
>> On Thu, Oct 10, 2019 at 08:00:53PM +0100, Richard Sandiford wrote:
>>> Ping
>>> 
>>> Richard Sandiford  writes:
>>> > The current aka diagnostics can sometimes leak internal details that
>>> > seem more likely to be distracting than useful.  E.g. on aarch64:
>>> >
>>> >   void f (va_list *va) { *va = 1; }
>>> >
>>> > gives:
>>> >
>>> >   incompatible types when assigning to type ‘va_list’ {aka ‘__va_list’} 
>>> > from type ‘int’
>>> >
>>> > where __va_list isn't something the user is expected to know about.
>>> > A similar thing happens for C++ on the arm_neon.h-based:
>>> >
>>> >   float x;
>>> >   int8x8_t y = x;
>>> >
>>> > which gives:
>>> >
>>> >   cannot convert ‘float’ to ‘int8x8_t’ {aka ‘__Int8x8_t’} in 
>>> > initialization
>>> >
>>> > This is accurate -- and __Int8x8_t is defined by the AArch64 PCS --
>>> > but it's not going to be meaningful to most users.
>>
>> Agreed.
>>
>>> +/* Return true if it is worth exposing the DECL_ORIGINAL_TYPE of TYPE to
>>> +   the user in diagnostics, false if it would be better to use TYPE itself.
>>> +   TYPE is known to satisfy typedef_variant_p.  */
>>> +
>>> +bool
>>> +user_facing_original_type_p (const_tree type)
>>> +{
>>> +  gcc_assert (typedef_variant_p (type));
>>> +  tree decl = TYPE_NAME (type);
>>> +
>>> +  /* Look through any typedef in "user" code.  */
>>> +  if (!DECL_IN_SYSTEM_HEADER (decl) && !DECL_IS_BUILTIN (decl))
>>> +return true;
>>> +
>>> +  /* If the original type is also named and is in the user namespace,
>>> + assume it too is a user-facing type.  */
>>> +  tree orig_type = DECL_ORIGINAL_TYPE (decl);
>>> +  if (tree orig_id = TYPE_IDENTIFIER (orig_type))
>>> +{
>>> +  const char *name = IDENTIFIER_POINTER (orig_id);
>>> +  if (name[0] != '_' || (name[1] != '_' && !ISUPPER (name[1])))
>>> +   return true;
>>
>> This looks like name_reserved_for_implementation_p.
>>
>> The rest looks fine to me!
>
> Ah, nice!  I'd looked for a helper but missed that one.
>
> Here's just the C parts, with that change.

And here are the C++ parts, on top of that.  I should probably have
split them from the outset.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard

[Comment from the original covering note, in case it helps:

strip_typedefs had:

  /* Explicitly get the underlying type, as TYPE_MAIN_VARIANT doesn't
 strip typedefs with attributes.  */
  result = TYPE_MAIN_VARIANT (DECL_ORIGINAL_TYPE (TYPE_NAME (t)));
  result = strip_typedefs (result);

Applying TYPE_MAIN_VARIANT predates the strip_typedefs call, with the
comment originally contrasting with plain:

  result = TYPE_MAIN_VARIANT (t);

But the recursive call to strip_typedefs will apply TYPE_MAIN_VARIANT,
so it doesn't seem necessary to do it here too.  I think there was also
a missing "remove_attributes" argument, since wrapping something in a
typedef shouldn't change which attributes get removed.]

2019-10-11  Richard Sandiford  

gcc/cp/
* cp-tree.h (STF_USER_VISIBLE): New constant.
(strip_typedefs, strip_typedefs_expr): Take a flags argument.
* tree.c (strip_typedefs, strip_typedefs_expr): Likewise,
updating mutual calls accordingly.  When STF_USER_VISIBLE is true,
only look through typedefs if user_facing_original_type_p.
* error.c (dump_template_bindings, type_to_string): Pass
STF_USER_VISIBLE to strip_typedefs.
(dump_type): Likewise, unless pp_c_flag_gnu_v3 is set.

gcc/testsuite/
* g++.dg/diagnostic/aka5.h: New test.
* g++.dg/diagnostic/aka5a.C: Likewise.
* g++.dg/diagnostic/aka5b.C: Likewise.
* g++.target/aarch64/diag_aka_1.C: Likewise.

Index: gcc/cp/cp-tree.h
===
--- gcc/cp/cp-tree.h2019-10-08 09:23:43.0 +0100
+++ gcc/cp/cp-tree.h2019-10-11 15:12:12.544911643 +0100
@@ -5696,6 +5696,13 @@ #define TFF_NO_OMIT_DEFAULT_TEMPLATE_ARG
 #define TFF_NO_TEMPLATE_BINDINGS   (1 << 13)
 #define TFF_POINTER(1 << 14)
 
+/* These constants can be used as bit flags to control strip_typedefs.
+
+   STF_USER_VISIBLE: use heuristics to try to avoid stripping user-facing
+   aliases of internal details.  This is intended for diagnostics,
+   where it should (for example) give more useful "aka" types.  */
+const unsigned int STF_USER_VISIBLE = 1U;
+
 /* Returns the TEMPLATE_DECL associated to a TEMPLATE_TEMPLATE_PARM
node.  */
 #define TEMPLATE_TEMPLATE_PARM_TEMPLATE_DECL(NODE) \
@@ -7254,8 +7261,10 @@ extern int zero_init_p   
(const_tree);
 extern bool check_abi_tag_redeclaration(const_tree, const_tree,
 const_tree);
 extern bool check_abi_tag_args (tree, tree);
-extern tree strip_typedefs (tree, bool * = NULL);

Re: [PATCH] Fix parser to recognize operator?:

2019-10-11 Thread Marek Polacek
On Fri, Oct 11, 2019 at 04:06:43PM +0200, Matthias Kretz wrote:
> This is a minor bugfix for improved error reporting. Overloading ?: is just 
> as 
> disallowed as it is without this change.

Thanks.  Can you provide a testcase that shows why this change makes sense?
That testcase then should be part of the patch submission.

> 2019-10-11  Matthias Kretz  
> 
>   * gcc/cp/parser.c (cp_parser_operator): Parse operator?: as an
> attempt to overload the conditional operator. Then
> grok_op_properties can print its useful "ISO C++ prohibits
> overloading operator ?:" message instead of the cryptic error
> message about a missing type-specifier before '?' token.
> 
> diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
> index 3ee8da7db94..73385cb3dcb 100644
> --- a/gcc/cp/parser.c
> +++ b/gcc/cp/parser.c
> @@ -15502,6 +15502,15 @@ cp_parser_operator (cp_parser* parser, location_t 
> start_loc)
>op = COMPONENT_REF;
>break;
>  
> +case CPP_QUERY:
> +  op = COND_EXPR;
> +  /* Consume the `?'.  */
> +  cp_lexer_consume_token (parser->lexer);
> +  /* Look for the matching `:'.  */
> +  cp_parser_require (parser, CPP_COLON, RT_COLON);
> +  consumed = true;
> +  break;
> +
>  case CPP_OPEN_PAREN:
>{
>  /* Consume the `('.  */

--
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA


Re: Ping: [C][C++] Avoid exposing internal details in aka types

2019-10-11 Thread Richard Sandiford
Marek Polacek  writes:
> On Thu, Oct 10, 2019 at 08:00:53PM +0100, Richard Sandiford wrote:
>> Ping
>> 
>> Richard Sandiford  writes:
>> > The current aka diagnostics can sometimes leak internal details that
>> > seem more likely to be distracting than useful.  E.g. on aarch64:
>> >
>> >   void f (va_list *va) { *va = 1; }
>> >
>> > gives:
>> >
>> >   incompatible types when assigning to type ‘va_list’ {aka ‘__va_list’} 
>> > from type ‘int’
>> >
>> > where __va_list isn't something the user is expected to know about.
>> > A similar thing happens for C++ on the arm_neon.h-based:
>> >
>> >   float x;
>> >   int8x8_t y = x;
>> >
>> > which gives:
>> >
>> >   cannot convert ‘float’ to ‘int8x8_t’ {aka ‘__Int8x8_t’} in initialization
>> >
>> > This is accurate -- and __Int8x8_t is defined by the AArch64 PCS --
>> > but it's not going to be meaningful to most users.
>
> Agreed.
>
>> +/* Return true if it is worth exposing the DECL_ORIGINAL_TYPE of TYPE to
>> +   the user in diagnostics, false if it would be better to use TYPE itself.
>> +   TYPE is known to satisfy typedef_variant_p.  */
>> +
>> +bool
>> +user_facing_original_type_p (const_tree type)
>> +{
>> +  gcc_assert (typedef_variant_p (type));
>> +  tree decl = TYPE_NAME (type);
>> +
>> +  /* Look through any typedef in "user" code.  */
>> +  if (!DECL_IN_SYSTEM_HEADER (decl) && !DECL_IS_BUILTIN (decl))
>> +return true;
>> +
>> +  /* If the original type is also named and is in the user namespace,
>> + assume it too is a user-facing type.  */
>> +  tree orig_type = DECL_ORIGINAL_TYPE (decl);
>> +  if (tree orig_id = TYPE_IDENTIFIER (orig_type))
>> +{
>> +  const char *name = IDENTIFIER_POINTER (orig_id);
>> +  if (name[0] != '_' || (name[1] != '_' && !ISUPPER (name[1])))
>> +return true;
>
> This looks like name_reserved_for_implementation_p.
>
> The rest looks fine to me!

Ah, nice!  I'd looked for a helper but missed that one.

Here's just the C parts, with that change.  Tested on aarch64-linux-gnu
and x86_64-linux-gnu.  OK to install?

Richard


2019-10-11  Richard Sandiford  

gcc/c-family/
* c-common.h (user_facing_original_type_p): Declare.
* c-common.c: Include c-spellcheck.h.
(user_facing_original_type_p): New function.

gcc/c/
* c-objc-common.c (useful_aka_type_p): Replace with...
(get_aka_type): ...this new function.  Given the original type,
decide which aka type to print (if any).  Only look through typedefs
if user_facing_original_type_p.
(print_type): Update accordingly.

gcc/testsuite/
* gcc.dg/diag-aka-5.h: New test.
* gcc.dg/diag-aka-5a.c: Likewise.
* gcc.dg/diag-aka-5b.c: Likewise.
* gcc.target/aarch64/diag_aka_1.c (f): Expect an aka to be printed
for myvec.

Index: gcc/c-family/c-common.h
===
--- gcc/c-family/c-common.h 2019-10-08 09:23:43.0 +0100
+++ gcc/c-family/c-common.h 2019-10-11 15:12:08.552939853 +0100
@@ -1063,6 +1063,7 @@ extern tree builtin_type_for_size (int,
 extern void c_common_mark_addressable_vec (tree);
 
 extern void set_underlying_type (tree);
+extern bool user_facing_original_type_p (const_tree);
 extern void record_types_used_by_current_var_decl (tree);
 extern vec *make_tree_vector (void);
 extern void release_tree_vector (vec *);
Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c 2019-10-08 09:23:43.0 +0100
+++ gcc/c-family/c-common.c 2019-10-11 15:12:08.552939853 +0100
@@ -48,6 +48,7 @@ #define GCC_C_COMMON_C
 #include "gimplify.h"
 #include "substring-locations.h"
 #include "spellcheck.h"
+#include "c-spellcheck.h"
 #include "selftest.h"
 
 cpp_reader *parse_in;  /* Declared in c-pragma.h.  */
@@ -7713,6 +7714,52 @@ set_underlying_type (tree x)
 }
 }
 
+/* Return true if it is worth exposing the DECL_ORIGINAL_TYPE of TYPE to
+   the user in diagnostics, false if it would be better to use TYPE itself.
+   TYPE is known to satisfy typedef_variant_p.  */
+
+bool
+user_facing_original_type_p (const_tree type)
+{
+  gcc_assert (typedef_variant_p (type));
+  tree decl = TYPE_NAME (type);
+
+  /* Look through any typedef in "user" code.  */
+  if (!DECL_IN_SYSTEM_HEADER (decl) && !DECL_IS_BUILTIN (decl))
+return true;
+
+  /* If the original type is also named and is in the user namespace,
+ assume it too is a user-facing type.  */
+  tree orig_type = DECL_ORIGINAL_TYPE (decl);
+  if (tree orig_id = TYPE_IDENTIFIER (orig_type))
+if (!name_reserved_for_implementation_p (IDENTIFIER_POINTER (orig_id)))
+  return true;
+
+  switch (TREE_CODE (orig_type))
+{
+/* Don't look through to an anonymous vector type, since the syntax
+   we use for them in diagnostics isn't real C or C++ syntax.
+   And if ORIG_TYPE is named but in the implementation namespace,
+   TYPE is likely to be 

[PATCH] Fix parser to recognize operator?:

2019-10-11 Thread Matthias Kretz
This is a minor bugfix for improved error reporting. Overloading ?: is just as 
disallowed as it is without this change.

2019-10-11  Matthias Kretz  

* gcc/cp/parser.c (cp_parser_operator): Parse operator?: as an
  attempt to overload the conditional operator. Then
  grok_op_properties can print its useful "ISO C++ prohibits
  overloading operator ?:" message instead of the cryptic error
  message about a missing type-specifier before '?' token.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 3ee8da7db94..73385cb3dcb 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -15502,6 +15502,15 @@ cp_parser_operator (cp_parser* parser, location_t 
start_loc)
   op = COMPONENT_REF;
   break;
 
+case CPP_QUERY:
+  op = COND_EXPR;
+  /* Consume the `?'.  */
+  cp_lexer_consume_token (parser->lexer);
+  /* Look for the matching `:'.  */
+  cp_parser_require (parser, CPP_COLON, RT_COLON);
+  consumed = true;
+  break;
+
 case CPP_OPEN_PAREN:
   {
 /* Consume the `('.  */

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtzzentrum für Schwerionenforschung https://gsi.de
 SIMD easy and portable https://github.com/VcDevel/Vc
──

Re: [RFH][libgcc] fp-bit bit ordering (PR 78804)

2019-10-11 Thread Oleg Endo
Hi Bernd,

On Sun, 2019-09-29 at 08:49 +, Bernd Edlinger wrote:
> 
> But I cannot tell if the bitfield access generates
> more efficient code or identical code than the
> original variant when no ms bitfields are used.
> That needs closer inspection of the generated
> assembler code, a simple bootstrap / regtest will
> probably not be sufficient.

Thanks for your comment.

The bitfields in this case are used for packing and unpacking the FP
values.  So I gave it a try on RX and SH as examples (with trunk
compiler).  I've extracted the unpack function for a "double" value and
compared the generated assembly code.  Please see the attached file.

The bitfield code on RX is equivalent.  On SH bitfields is slightly
worse.  It gets *a lot* worse when the struct is passed by
reference/pointer, then things get completely out of control.

Theoretically, if the bitfields and the shifts-and-ands extract/pack
the same bits from the underlying base integer value, the resulting
code should be the same in both cases (see RX).  But it depends on what
the backend does, or tries to do (see SH).

> But my thought is if the -mms-bitfields option has such
> an impact on this structure, then it would be good if there
> was a built-in define that can be used to adjust to and/or
> diagnose the problem at compile time.

I think it'd be better to add static_assert on the expected sizes of
the various float value structs.


> 
> I think that is missing right now, but wouldn't it be nice to have
> a define like __MS_BITFIELD_LAYOUT__ ?

Honestly, I'm not a big fan of the whole MS bitfield thing.  On RX it's
been put there long time ago for compatibility with some other
compiler.  I'm not sure if it's relevant at all anymore.  So I don't
want to add any more to the pile ...

Cheers,
Oleg

typedef unsigned long long fractype;

struct unpacked_double
{
  fractype fraction;
  int exp;
  int sign;
};

struct  __attribute__ ((packed))
bits_little_endian 
{
  fractype fraction:52 __attribute__ ((packed));
  unsigned int exp:11 __attribute__ ((packed));
  unsigned int sign:1 __attribute__ ((packed));
};

static_assert (sizeof (bits_little_endian) == 8, "");
static_assert (sizeof (long long) == 8, "");


// 

unpacked_double
unpack_d_bitfields (bits_little_endian val)
{
  fractype fraction = val.fraction;
  int exp = val.exp;
  int sign = val.sign;

  return { fraction, exp, sign };
}

/*
RX -O2

__Z18unpack_d_bitfieldsRK18bits_little_endian:
	mov.L	r2, r4	 ; 4	[c=4 l=2]  *movsi_internal/4
	shlr	#20, r2, r3	 ; 9	[c=8 l=3]  lshrsi3/2
	add	#-16, r0	 ; 58	[c=4 l=3]  addsi3_internal/3
	mov.L	#0xf, r2	 ; 57	[c=4 l=5]  *movsi_internal/2
	and	r4, r2	 ; 43	[c=4 l=2]  andsi3/0
	and	#0x7ff, r3	 ; 44	[c=4 l=4]  andsi3/3
	shlr	#31, r4	 ; 45	[c=8 l=2]  lshrsi3/1
	rtsd	#16		 ; 61	[c=4 l=2]  deallocate_and_return

SH -O2

__Z18unpack_d_bitfieldsRK18bits_little_endian:
.LFB0:
	.cfi_startproc
	.cfi_startproc
	add	#-8,r15	! 85	[c=4 l=2]  *addsi3/0
	.cfi_def_cfa_offset 8
	mov	r2,r0	! 2	[c=4 l=2]  movsi_ie/1
	mov.l	.L3,r3	! 32	[c=10 l=2]  movsi_ie/0
	mov	#-21,r2	! 82	[c=4 l=2]  movsi_ie/2
	mov	r5,r1	! 81	[c=4 l=2]  movsi_ie/1
	add	r1,r1	! 16	[c=4 l=2]  ashlsi3_k/0
	shld	r2,r1	! 71	[c=4 l=2]  lshrsi3_d
	mov	r5,r2	! 83	[c=4 l=2]  movsi_ie/1
	shll	r2		! 72	[c=0 l=2]  shll
	mov.l	r4,@r0	! 44	[c=4 l=2]  movsi_ie/8
	movt	r2		! 73	[c=4 l=2]  movt
	mov.l	r1,@(8,r0)	! 35	[c=4 l=2]  movsi_ie/8
	and	r3,r5	! 33	[c=4 l=2]  *andsi_compact/3
	mov.l	r2,@(12,r0)	! 36	[c=4 l=2]  movsi_ie/8
	mov.l	r5,@(4,r0)	! 45	[c=4 l=2]  movsi_ie/8
	rts			! 91	[c=0 l=2]  *return_i
	add	#8,r15	! 90	[c=4 l=2]  *addsi3/0
	.cfi_endproc

*/

// 

unpacked_double
unpack_d_no_bitfields (long long val)
{
  fractype fraction = val & fractype)1) << 52) - 1);
  int exp = ((int)(val >> 52)) & ((1 << 11) - 1);
  int sign = ((int)(val >> (52 + 11))) & 1;

  return { fraction, exp, sign };
}

/*
RX -O2
	mov.L	r2, r4	 ; 4	[c=4 l=2]  *movsi_internal/4
	shar	#20, r2, r3	 ; 13	[c=8 l=3]  ashrsi3/2
	add	#-16, r0	 ; 56	[c=4 l=3]  addsi3_internal/3
	mov.L	#0xf, r2	 ; 55	[c=4 l=5]  *movsi_internal/2
	and	r4, r2	 ; 42	[c=4 l=2]  andsi3/0
	and	#0x7ff, r3	 ; 43	[c=4 l=4]  andsi3/3
	shlr	#31, r4	 ; 44	[c=8 l=2]  lshrsi3/1
	rtsd	#16		 ; 59	[c=4 l=2]  deallocate_and_return

SH -O2
	mov	r2,r0	! 2	[c=4 l=2]  movsi_ie/1
	mov.l	.L6,r1	! 10	[c=10 l=2]  movsi_ie/0
	mov.l	r4,@r2	! 9	[c=4 l=2]  movsi_ie/8
	and	r5,r1	! 11	[c=4 l=2]  *andsi_compact/3
	mov.l	r1,@(4,r2)	! 12	[c=4 l=2]  movsi_ie/8
	mov	#-21,r2	! 69	[c=4 l=2]  movsi_ie/2
	mov	r5,r1	! 68	[c=4 l=2]  movsi_ie/1
	shll	r5		! 59	[c=0 l=2]  shll
	add	r1,r1	! 18	[c=4 l=2]  ashlsi3_k/0
	shld	r2,r1	! 58	[c=4 l=2]  lshrsi3_d
	movt	r5		! 60	[c=4 l=2]  movt
	mov.l	r1,@(8,r0)	! 20	[c=4 l=2]  movsi_ie/8
	rts			! 74	[c=0 l=2]  *return_i
	mov.l	r5,@(12,r0)	! 28	[c=4 l=2]  movsi_ie/8
.L7:
	.align 2
.L6:
	.long	1048575

*/



Re: [C++ Patch] One more DECL_SOURCE_LOCATION

2019-10-11 Thread Marek Polacek
On Fri, Oct 11, 2019 at 12:58:41PM +0200, Paolo Carlini wrote:
> Hi,
> 
> another straightforward DECL_SOURCE_LOCATION (TYPE_MAIN_DECL consistent with
> the one I used in build_anon_union_vars. Tested x86_64-linux.
> 
> Thanks, Paolo.
> 
> /

> /cp
> 2019-10-11  Paolo Carlini  
> 
>   * decl.c (check_tag_decl): Use DECL_SOURCE_LOCATION.
> 
> /testsuite
> 2019-10-11  Paolo Carlini  
> 
>   * g++.dg/cpp0x/constexpr-union5.C: Test location(s) too.
>   * g++.dg/diagnostic/bitfld2.C: Likewise.
>   * g++.dg/ext/anon-struct1.C: Likewise.
>   * g++.dg/ext/anon-struct6.C: Likewise.
>   * g++.dg/ext/flexary19.C: Likewise.
>   * g++.dg/ext/flexary9.C: Likewise.
>   * g++.dg/template/error17.C: Likewise.

> Index: cp/decl.c
> ===
> --- cp/decl.c (revision 276845)
> +++ cp/decl.c (working copy)
> @@ -4992,7 +4992,8 @@ check_tag_decl (cp_decl_specifier_seq *declspecs,
>  
>if (TREE_CODE (declared_type) != UNION_TYPE
> && !in_system_header_at (input_location))
> - pedwarn (input_location, OPT_Wpedantic, "ISO C++ prohibits anonymous 
> structs");
> + pedwarn (DECL_SOURCE_LOCATION (TYPE_MAIN_DECL (declared_type)),
> +  OPT_Wpedantic, "ISO C++ prohibits anonymous structs");

The change looks fine but the in_system_header_at line can be dropped, right?

Marek


Re: [patch,Fortran] PR 92050 - fix ICE with -fcheck=all

2019-10-11 Thread Steve Kargl
On Fri, Oct 11, 2019 at 12:17:49PM +0200, Tobias Burnus wrote:
> Checking produces extra code (unsurprisingly); this code needs to end up 
> in the program…
> 
> Bootstrapped on x86-64_gnu-linux. OK for the trunk?
> 

OK.

--
Steve


[PATCH] Re-instantiate more redundant store removal in FRE

2019-10-11 Thread Richard Biener


I've spent some time robustifying the original idea (fixing some
latent wrong-code on the way...).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2019-10-11  Richard Biener  

PR tree-optimization/90883
PR tree-optimization/91091
* tree-ssa-sccvn.c (vn_reference_lookup_3): Use correct
alias-sets both for recording VN table entries and continuing
walking after translating through copies.  Handle same-sized
reads from SSA names by returning the plain SSA name.
(eliminate_dom_walker::eliminate_stmt): Properly handle
non-size precision stores in redundant store elimination.

* gcc.dg/torture/20191011-1.c: New testcase.
* gcc.dg/tree-ssa/ssa-fre-82.c: Likewise.
* gcc.dg/tree-ssa/ssa-fre-83.c: Likewise.
* gcc.dg/tree-ssa/redundant-assign-zero-1.c: Disable FRE.
* gcc.dg/tree-ssa/redundant-assign-zero-2.c: Likewise.

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 276858)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -1877,8 +1877,10 @@
   if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file,
 "Successfully combined %u partial definitions\n", ndefs);
+  /* ???  If we track partial defs alias-set we could use that if it
+ is the same for all.  Use zero for now.  */
   return vn_reference_lookup_or_insert_for_pieces
-   (first_vuse, vr->set, vr->type, vr->operands, val);
+   (first_vuse, 0, vr->type, vr->operands, val);
 }
   else
 {
@@ -2333,7 +2335,7 @@
 
   /* If we are looking for redundant stores do not create new hashtable
  entries from aliasing defs with made up alias-sets.  */
-  if (*disambiguate_only > TR_TRANSLATE || !data->tbaa_p)
+  if (*disambiguate_only > TR_TRANSLATE)
 return (void *)-1;
 
   /* If we cannot constrain the size of the reference we cannot
@@ -2449,7 +2451,7 @@
return (void *)-1;
}
  return vn_reference_lookup_or_insert_for_pieces
-  (vuse, vr->set, vr->type, vr->operands, val);
+  (vuse, 0, vr->type, vr->operands, val);
}
   /* For now handle clearing memory with partial defs.  */
   else if (known_eq (ref->size, maxsize)
@@ -2499,7 +2501,7 @@
{
  tree val = build_zero_cst (vr->type);
  return vn_reference_lookup_or_insert_for_pieces
- (vuse, vr->set, vr->type, vr->operands, val);
+ (vuse, get_alias_set (lhs), vr->type, vr->operands, val);
}
  else if (known_eq (ref->size, maxsize)
   && maxsize.is_constant ()
@@ -2614,7 +2616,7 @@
 
  if (val)
return vn_reference_lookup_or_insert_for_pieces
-   (vuse, vr->set, vr->type, vr->operands, val);
+ (vuse, get_alias_set (lhs), vr->type, vr->operands, val);
}
}
  else if (ranges_known_overlap_p (offseti, maxsizei, offset2i, size2i))
@@ -2672,23 +2674,26 @@
 according to endianness.  */
  && (! INTEGRAL_TYPE_P (vr->type)
  || known_eq (ref->size, TYPE_PRECISION (vr->type)))
- && multiple_p (ref->size, BITS_PER_UNIT)
- && (! INTEGRAL_TYPE_P (TREE_TYPE (def_rhs))
- || type_has_mode_precision_p (TREE_TYPE (def_rhs
+ && multiple_p (ref->size, BITS_PER_UNIT))
{
- gimple_match_op op (gimple_match_cond::UNCOND,
- BIT_FIELD_REF, vr->type,
- vn_valueize (def_rhs),
- bitsize_int (ref->size),
- bitsize_int (offset - offset2));
- tree val = vn_nary_build_or_lookup ();
- if (val
- && (TREE_CODE (val) != SSA_NAME
- || ! SSA_NAME_OCCURS_IN_ABNORMAL_PHI (val)))
+ if (known_eq (ref->size, size2))
+   return vn_reference_lookup_or_insert_for_pieces
+   (vuse, get_alias_set (lhs), vr->type, vr->operands,
+SSA_VAL (def_rhs));
+ else if (! INTEGRAL_TYPE_P (TREE_TYPE (def_rhs))
+  || type_has_mode_precision_p (TREE_TYPE (def_rhs)))
{
- vn_reference_t res = vn_reference_lookup_or_insert_for_pieces
- (vuse, vr->set, vr->type, vr->operands, val);
- return res;
+ gimple_match_op op (gimple_match_cond::UNCOND,
+ BIT_FIELD_REF, vr->type,
+ vn_valueize (def_rhs),
+ bitsize_int (ref->size)

Re: [PATCH] Cleanup parameter of vectorizable_live_operation

2019-10-11 Thread Richard Biener
On Fri, 11 Oct 2019, Bernd Edlinger wrote:

> Hi Richard,
> 
> I became aware of this while looking at the -Wshadow=compatible-local 
> warnings.
> The function vectorizable_live_operation uses a parameter called "vec_stmt"
> that is shadowed by something also called "vec_stmt".  But in this case,
> the vec_stmt actually only used as a boolean, i.e. pointer is NULL or not.
> 
> This changes the parameter vec_stmt to vec_stmt_p, and propagates that
> change to can_vectorize_live_stmts.
> 
> 
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?

OK.  Some more refactoring is in order here but it's an improvement.

Thanks,
Richard.


Re: [PATCH 1/2][vect]PR 88915: Vectorize epilogues when versioning loops

2019-10-11 Thread Richard Biener
On Thu, 10 Oct 2019, Andre Vieira (lists) wrote:

> Hi,
> 
> After all the discussions and respins I now believe this patch is close to
> what we envisioned.
> 
> This patch achieves two things when vect-epilogues-nomask=1:
> 1) It analyzes the original loop for each supported vector size and saves this
> analysis per loop, as well as the vector sizes we know we can vectorize the
> loop for.
> 2) When loop versioning it uses the 'skip_vector' code path to vectorize the
> epilogue, and uses the lowest versioning threshold between the main and
> epilogue's.
> 
> As side effects of this patch I also changed ensure_base_align to only update
> the alignment if the new alignment is lower than the current one.  This
> function already did that if the object was a symbol, now it behaves this way
> for any object.
> 
> I bootstrapped this patch with both vect-epilogues-nomask turned on and off on
> x86_64 (AVX512) and aarch64.  Regression tests looked good.
> 
> Is this OK for trunk?

+
+  /* Keep track of vector sizes we know we can vectorize the epilogue 
with.  */
+  vector_sizes epilogue_vsizes;
 };

please don't enlarge struct loop, instead track this somewhere
in the vectorizer (in loop_vinfo?  I see you already have
epilogue_vinfos there - so the loop_vinfo simply lacks 
convenient access to the vector_size?)  I don't see any
use that could be trivially adjusted to look at a loop_vinfo
member instead.

For the vect_update_inits_of_drs this means that we'd possibly
do less CSE.  Not sure if really an issue.

You use LOOP_VINFO_EPILOGUE_P sometimes and sometimes
LOOP_VINFO_ORIG_LOOP_INFO, please change predicates to
LOOP_VINFO_EPILOGUE_P.

@@ -2466,15 +2461,62 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree 
niters, tree nitersm1,
   else
 niters_prolog = build_int_cst (type, 0);
 
+  loop_vec_info epilogue_vinfo = NULL;
+  if (vect_epilogues)
+{ 
...
+   vect_epilogues = false;
+}
+

I don't understand what all this does - it clearly needs a comment.
Maybe the overall comment of the function should be amended with
an overview of how we handle [multiple] epilogue loop vectorization?

+
+  if (epilogue_any_upper_bound && prolog_peeling >= 0)
+   {
+ epilog->any_upper_bound = true;
+ epilog->nb_iterations_upper_bound = eiters + 1;
+   }
+

comment missing.  How can prolog_peeling be < 0?  We likely
didn't set the upper bound because we don't know it in the
case we skipped the vector loop (skip_vector)?  So make sure
to not introduce wrong-code issues here - maybe do this
optimization as followup?

 class loop *
-vect_loop_versioning (loop_vec_info loop_vinfo,
- unsigned int th, bool check_profitability,
- poly_uint64 versioning_threshold)
+vect_loop_versioning (loop_vec_info loop_vinfo)
 { 
   class loop *loop = LOOP_VINFO_LOOP (loop_vinfo), *nloop;
   class loop *scalar_loop = LOOP_VINFO_SCALAR_LOOP (loop_vinfo);
@@ -2988,10 +3151,15 @@ vect_loop_versioning (loop_vec_info loop_vinfo,
   bool version_align = LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT 
(loop_vinfo);
   bool version_alias = LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo);
   bool version_niter = LOOP_REQUIRES_VERSIONING_FOR_NITERS (loop_vinfo);
+  poly_uint64 versioning_threshold
+= LOOP_VINFO_VERSIONING_THRESHOLD (loop_vinfo);
   tree version_simd_if_cond
 = LOOP_REQUIRES_VERSIONING_FOR_SIMD_IF_COND (loop_vinfo);
+  unsigned th = LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo);

-  if (check_profitability)
+  if (th >= vect_vf_for_cost (loop_vinfo)
+  && !LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
+  && !ordered_p (th, versioning_threshold))
 cond_expr = fold_build2 (GE_EXPR, boolean_type_node, 
scalar_loop_iters,
 build_int_cst (TREE_TYPE (scalar_loop_iters),
th - 1));

split out this refactoring - preapproved.

@@ -1726,7 +1729,13 @@ vect_analyze_loop_costing (loop_vec_info 
loop_vinfo)
   return 0;
 }

-  HOST_WIDE_INT estimated_niter = estimated_stmt_executions_int (loop);
+  HOST_WIDE_INT estimated_niter = -1;
+
+  if (LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo))
+estimated_niter
+  = vect_vf_for_cost (LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo)) - 1;
+  if (estimated_niter == -1)
+estimated_niter = estimated_stmt_executions_int (loop);
   if (estimated_niter == -1)
 estimated_niter = likely_max_stmt_executions_int (loop);
   if (estimated_niter != -1

it's clearer if the old code is completely in a else {} path
even though vect_vf_for_cost - 1 should never be -1.

+/* Decides whether we need to create an epilogue loop to handle
+   remaining scalar iterations and sets PEELING_FOR_NITERS accordingly.  
*/
+  
+void  
+determine_peel_for_niter (loop_vec_info loop_vinfo)
+{   
+  

extra vertical space

+  unsigned HOST_WIDE_INT const_vf;
+  HOST_WIDE_INT max_niter 

if it's a 1:1 copy outlined then split it out - preapproved
(so further 

[PATCH] Cleanup parameter of vectorizable_live_operation

2019-10-11 Thread Bernd Edlinger
Hi Richard,

I became aware of this while looking at the -Wshadow=compatible-local warnings.
The function vectorizable_live_operation uses a parameter called "vec_stmt"
that is shadowed by something also called "vec_stmt".  But in this case,
the vec_stmt actually only used as a boolean, i.e. pointer is NULL or not.

This changes the parameter vec_stmt to vec_stmt_p, and propagates that
change to can_vectorize_live_stmts.


Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.
2019-10-11  Bernd Edlinger  

	* tree-vect-loop.c (vect_analyze_loop_operations): Adjust call to
	vectorizable_live_operation.
	(vectorizable_live_operation): Adjust parameters.
	* tree-vect-stmts.c (vect_init_vector,
	vect_gen_widened_results_half): Fix typo in function comment.
	(can_vectorize_live_stmts): Adjust function comment.
	Adjust parameters.  Adjust call to vectorizable_live_operation.
	(vect_analyze_stmt): Adjust call to can_vectorize_live_stmts.
	(vect_transform_stmt): Adjust function comment.  Adjust call to
	can_vectorize_live_stmts.
	* tree-vectorizer.h (vectorizable_live_operation): Adjust parameters.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c	(revision 276843)
+++ gcc/tree-vect-loop.c	(working copy)
@@ -1566,7 +1566,7 @@ vect_analyze_loop_operations (loop_vec_info loop_v
 	  && STMT_VINFO_LIVE_P (stmt_info)
 	  && !PURE_SLP_STMT (stmt_info))
 	ok = vectorizable_live_operation (stmt_info, NULL, NULL, NULL,
-	  -1, NULL, _vec);
+	  -1, false, _vec);
 
   if (!ok)
 	return opt_result::failure_at (phi,
@@ -7628,9 +7628,9 @@ vectorizable_induction (stmt_vec_info stmt_info,
 
 bool
 vectorizable_live_operation (stmt_vec_info stmt_info,
-			 gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED,
+			 gimple_stmt_iterator *gsi,
 			 slp_tree slp_node, slp_instance slp_node_instance,
-			 int slp_index, stmt_vec_info *vec_stmt,
+			 int slp_index, bool vec_stmt_p,
 			 stmt_vector_for_cost *)
 {
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
@@ -7652,7 +7652,7 @@ vectorizable_live_operation (stmt_vec_info stmt_in
  validity so just trigger the transform here.  */
   if (STMT_VINFO_REDUC_DEF (vect_orig_stmt (stmt_info)))
 {
-  if (!vec_stmt)
+  if (!vec_stmt_p)
 	return true;
   if (slp_node)
 	{
@@ -7721,7 +7721,7 @@ vectorizable_live_operation (stmt_vec_info stmt_in
 	}
 }
 
-  if (!vec_stmt)
+  if (!vec_stmt_p)
 {
   /* No transformation required.  */
   if (LOOP_VINFO_CAN_FULLY_MASK_P (loop_vinfo))
Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c	(revision 276843)
+++ gcc/tree-vect-stmts.c	(working copy)
@@ -1449,7 +1449,7 @@ vect_init_vector_1 (stmt_vec_info stmt_vinfo, gimp
Insert a new stmt (INIT_STMT) that initializes a new variable of type
TYPE with the value VAL.  If TYPE is a vector type and VAL does not have
vector type a vector with all elements equal to VAL is created first.
-   Place the initialization at BSI if it is not NULL.  Otherwise, place the
+   Place the initialization at GSI if it is not NULL.  Otherwise, place the
initialization at the loop preheader.
Return the DEF of INIT_STMT.
It will be used in the vectorization of STMT_INFO.  */
@@ -4484,7 +4484,7 @@ vectorizable_simd_clone_call (stmt_vec_info stmt_i
 
Create a vector stmt whose code, type, number of arguments, and result
variable are CODE, OP_TYPE, and VEC_DEST, and its arguments are
-   VEC_OPRND0 and VEC_OPRND1.  The new vector stmt is to be inserted at BSI.
+   VEC_OPRND0 and VEC_OPRND1.  The new vector stmt is to be inserted at GSI.
In the case that CODE is a CALL_EXPR, this means that a call to DECL
needs to be created (DECL is a function-decl of a target-builtin).
STMT_INFO is the original scalar stmt that we are vectorizing.  */
@@ -10474,12 +10474,12 @@ vectorizable_comparison (stmt_vec_info stmt_info,
 /* If SLP_NODE is nonnull, return true if vectorizable_live_operation
can handle all live statements in the node.  Otherwise return true
if STMT_INFO is not live or if vectorizable_live_operation can handle it.
-   GSI and VEC_STMT are as for vectorizable_live_operation.  */
+   GSI and VEC_STMT_P are as for vectorizable_live_operation.  */
 
 static bool
 can_vectorize_live_stmts (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
 			  slp_tree slp_node, slp_instance slp_node_instance,
-			  stmt_vec_info *vec_stmt,
+			  bool vec_stmt_p,
 			  stmt_vector_for_cost *cost_vec)
 {
   if (slp_node)
@@ -10491,7 +10491,7 @@ can_vectorize_live_stmts (stmt_vec_info stmt_info,
 	  if (STMT_VINFO_LIVE_P (slp_stmt_info)
 	  && !vectorizable_live_operation (slp_stmt_info, gsi, slp_node,
 	   slp_node_instance, i,
-	   vec_stmt, cost_vec))
+	   vec_stmt_p, cost_vec))
 	return 

Re: [PATCH 1/2][GCC][RFC][middle-end]: Add SLP pattern matcher.

2019-10-11 Thread Richard Biener
On Tue, 8 Oct 2019, Tamar Christina wrote:

> Hi Richi,
> 
> Thanks for the review, I've added some comments inline.
> 
> The 10/07/2019 12:15, Richard Biener wrote:
> > On Tue, 1 Oct 2019, Tamar Christina wrote:
> > 
> > > Hi All,
> > > 
> > > This adds a framework to allow pattern matchers to be written at based on 
> > > the
> > > SLP tree.  The difference between this one and the one in 
> > > tree-vect-patterns is
> > > that this matcher allows the matching of an arbitrary number of parallel
> > > statements and replacing of an arbitrary number of children or statements.
> > > 
> > > Any relationship created by the SLP pattern matcher will be undone if SLP 
> > > fails.
> > > 
> > > The pattern matcher can also cancel all permutes depending on what the 
> > > pattern
> > > requested it to do.  As soon as one pattern requests the permutes to be
> > > cancelled all permutes are cancelled.
> > > 
> > > Compared to the previous pattern matcher this one will work for an 
> > > arbitrary
> > > group size and will match at any arbitrary node in the SLP tree.  The only
> > > requirement is that the entire node is matched or rejected.
> > > 
> > > vect_build_slp_tree_1 is a bit more lenient in what it accepts as 
> > > "compatible
> > > operations" but the matcher cannot be because in cases where you match 
> > > the order
> > > of the operands may be changed.  So all operands must be changed or none.
> > > 
> > > Furthermore the matcher relies on canonization of the operations inside 
> > > the
> > > SLP tree and on the fact that floating math operations are not 
> > > commutative.
> > > This means that matching a pattern does not need to try all alternatives 
> > > or
> > > combinations and that arguments will always be in the same order if it's 
> > > the
> > > same operation.
> > > 
> > > The pattern matcher also ignored uninteresting nodes such as type casts, 
> > > loads
> > > and stores.  Doing so is essential to keep the runtime down.
> > > 
> > > Each matcher is allowed a post condition that can be run to perform any 
> > > changes
> > > to the SLP tree as needed before the patterns are created and may also 
> > > abort
> > > the creation of the patterns.
> > > 
> > > When a pattern is matched it is not immediately created but instead it is
> > > deferred until all statements in the node have been analyzed.  Only if 
> > > all post
> > > conditions are true, and all statements will be replaced will the 
> > > patterns be
> > > created in batch.  This allows us to not have to undo any work if the 
> > > pattern
> > > fails but also makes it so we traverse the tree only once.
> > > 
> > > When a new pattern is created it is a marked as a pattern to the 
> > > statement it is
> > > replacing and be marked as used in the current SLP scope.  If SLP fails 
> > > then
> > > relationship is undone and the relevancy restored.
> > > 
> > > Each pattern matcher can detect any number of pattern it wants.  The only
> > > constraint is that the optabs they produce must all have the same arity.
> > > 
> > > The pattern matcher supports instructions that have no scalar form as they
> > > are added as pattern statements to the stmt.  The BB is left untouched and
> > > so the scalar loop is untouched.
> > > 
> > > Bootstrapped on aarch64-none-linux-gnu and no issues.
> > > No regression testing done yet.
> > 
> > If you split out the introduction of SLP_TREE_REF_COUNT you can commit
> > that right now (sorry for being too lazy there...).
> > 
> 
> I'll split those off :)
> 
> > One overall comment - you do pattern matching after SLP tree
> > creation (good) but still do it before the whole SLP graph is
> > created (bad).  Would it be possible to instead do it as a separate
> > phase in vect_analyze_slp, looping over all instances (the instances
> > present entries into the single unified SLP graph now), avoiding
> > to visit "duplicates"?
> > 
> 
> It should be, the only issue I can see is that build SLP may fail because of
> an unsupported permute, or because it can use load lanes.  If I'm 
> understanding
> it correctly you wouldn't get SLP vectorization in those cases so then the 
> matching
> can't work? So it would limit it a it more.

True.  As said later ideally the pattern matching woudl happen before
or rather as part of SLP building so that the operations then actually
match (and we then don't need that plus/minus SLP_TREE_TWO_OPERATORS
handling).  At the moment it sits in a quite awkward place which
may contribute to its complexity.

> > In patch 1/2 I see references (mostly in variable names) to
> > "complex" which is IMHO too specific.
> > 
> 
> Sorry, missed those during my cleanup.
> 
> > I'd also think that a useful first pattern to support would be
> > what we do with SLP_TREE_TWO_OPERATORS, where code generation
> > inserts extra blends.  I have yet to dive into the complex pattern
> > details to see if that's feasible or if you maybe re-use that...
> 
> Oh, I hadn't thought of that. I'll take a 

[PATCH] [MIPS] Replace insert with insve for floating-point values

2019-10-11 Thread Mihailo Stojanovic
Currently, when a function argument of type double gets loaded into a
vector register on a 32-bit target, it is firstly reloaded into two
general purpose registers, and then loaded into a vector register using
two insert.w instructions.

This patch swaps the two insert.w instructions with one insve.d
instruction, which operates on 64-bit floating point registers, so the
value can be reloaded into a FPR. This is done by adding another
alternative of constraints for msa_insert_ pattern, which
covers the case of a floating-point input value.

gcc/ChangeLog:

* config/mips/mips-msa.md (msa_insert_): Add an
alternative which covers the floating-point input value. Also
forbid the split of insert.d pattern for floating-point values.

gcc/testsuite/ChangeLog:

* gcc.target/mips/msa-insert-split.c: New test.
---
 gcc/config/mips/mips-msa.md  | 11 ---
 gcc/testsuite/gcc.target/mips/msa-insert-split.c | 16 
 2 files changed, 24 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/msa-insert-split.c

diff --git a/gcc/config/mips/mips-msa.md b/gcc/config/mips/mips-msa.md
index 929646d..628423d 100644
--- a/gcc/config/mips/mips-msa.md
+++ b/gcc/config/mips/mips-msa.md
@@ -436,14 +436,17 @@
 })
 
 (define_insn "msa_insert_"
-  [(set (match_operand:MSA 0 "register_operand" "=f")
+  [(set (match_operand:MSA 0 "register_operand" "=f,f")
(vec_merge:MSA
  (vec_duplicate:MSA
-   (match_operand: 1 "reg_or_0_operand" "dJ"))
- (match_operand:MSA 2 "register_operand" "0")
+   (match_operand: 1 "reg_or_0_operand" "dJ,f"))
+ (match_operand:MSA 2 "register_operand" "0,0")
  (match_operand 3 "const__operand" "")))]
   "ISA_HAS_MSA"
 {
+  if (which_alternative == 1)
+return "insve.\t%w0[%y3],%w1[0]";
+
   if (!TARGET_64BIT && (mode == V2DImode || mode == V2DFmode))
 return "#";
   else
@@ -462,6 +465,8 @@
   "reload_completed && ISA_HAS_MSA && !TARGET_64BIT"
   [(const_int 0)]
 {
+  if (REG_P (operands[1]) && FP_REG_P (REGNO (operands[1])))
+FAIL;
   mips_split_msa_insert_d (operands[0], operands[2], operands[3], operands[1]);
   DONE;
 })
diff --git a/gcc/testsuite/gcc.target/mips/msa-insert-split.c 
b/gcc/testsuite/gcc.target/mips/msa-insert-split.c
new file mode 100644
index 000..50f3b8a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/msa-insert-split.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-mfp64 -mhard-float -mmsa" } */
+/* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" } } */
+
+typedef double v2f64 __attribute__ ((vector_size (16)));
+
+void foo (double* arr, v2f64* vec)
+{
+  v2f64 v;
+  v[0] = arr[0];
+  v[1] = arr[1];
+  *vec = v;
+}
+
+/* { dg-final { scan-assembler-not "insert.w" } } */
+/* { dg-final { scan-assembler-times "insve.d" 2 } } */
-- 
2.7.4



Re: Avid ggc_alloc and push_cfun during LTO streaming

2019-10-11 Thread Jan Hubicka
> On Fri, Oct 11, 2019 at 11:03 AM Jan Hubicka  wrote:
> >
> > Hi,
> > this patch prevents tree creation druing WPA stream out (to avoid
> > touching pages and triggering COW).  It fixes the following
> >  - gimple streamer produces MEM_REF wrappings for global decls.
> >This is to preserve the type of access and is not necessary for
> >WPA->LTRANS streaming when decls ar eno longer going to be merged.
> >  - we renumber stmt uids during streaming WPA summaries
> >  - loop optimizer is initialized in output_function.
> > After testing the patch I noticed that output_function does one extra
> > renumbering of stmts. This seems quite broken and I will fix it
> > incrementally.
> >
> > Bootstrapped/regtested x86_64-linux, comitted.
> 
> Huh.  Why do we stream function bodies at WPA time at all?
> We should already have input sections we can copy/remap?
> 
> That is, why does gcc_assert (!flag_wpa) in output_function trip?

Because we produce new bodies (such as merged ctors) and also bodies
read by ipa-ICF to memory are then streamed form in-memory copies rather
than the pickled sections.

The overall idea is to keep things flexble enough so IPA passes can
produce new functions and also load bodies to memory and modify them.
The second is useful for development where one can first get pass
working modifying at WPA time and then work on optimization summaries.
I also think we may end up with passes that does more involved
modifications on small portions of program (perhaps matric reorg could)
I am also using this to play with re-running early opts at WPA time and
seeing how many optimizations are missed by not early
inlining/propagating dead code via nothrow etc etc.

Honza


Re: Type representation in CTF and DWARF

2019-10-11 Thread Jakub Jelinek
On Fri, Oct 11, 2019 at 01:23:12PM +0200, Richard Biener wrote:
> > (coreutils-0.22)
> >   .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | .ctf 
> > (uncompressed) | ratio (.ctf/(D1+D2+0.5*D4))
> > ls   30616   |1136   |21098   | 26240   
> > | 0.62
> > pwd  10734   |788|10433   | 13929   
> > | 0.83
> > groups 10706 |811|10249   | 13378   
> > | 0.80
> >
> > (emacs-26.3)
> >   .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | .ctf 
> > (uncompressed) | ratio (.ctf/(D1+D2+0.5*D4))
> > emacs-26.3.1 674657  |6402   |   273963   |   273910
> > | 0.33
> >
> > I chose to account for 50% of .debug_str because at this point, it will be
> > unfair to not account for them. Actually, one could even argue that upto 70%
> > of the .debug_str are names of entities. CTF section sizes do include the 
> > CTF
> > string tables.
> >
> > Across coreutils, I see a geomean of 0.73 (ratio of
> > .ctf/(.debug_info + .debug_abbrev + 50% of .debug_str)). So, with the
> > "-gdwarf-like-ctf code stubs" and dwz, DWARF continues to have a larger
> > footprint than CTF (with 50% of .debug_str accounted for).
> 
> I'm not convinced this "improvement" in size is worth maintainig another
> debug-info format much less since it lacks desirable features right now
> and thus evaluation is tricky.
> 
> At least you can improve dwarf size considerably with a low amount of work.
> 
> I suspect another factor where dwarf is bigger compared to CTF is that dwarf
> is recording typedef names as well as qualified type variants.  But maybe
> CTF just has a more compact representation for the bits it actually 
> implements.

Does CTF record automatic variables in functions, or just global variables?
If only the latter, it would be fair to also disable addition of local
variable DIEs, lexical blocks.  Does CTF record inline functions?  Again, if
not, it would be fair to not emit that either in .debug_info.
-gno-record-gcc-switches so that the compiler command line is not encoded in
the debug info (unless it is in CTF).
DWARF is highly extensible format, what exactly is and is not emitted is
something that consumers can choose.
Yes, DWARF can be large, but mainly because it provides a lot of
information, the actual representation has been designed with size concerns
in mind and newer versions of the standard keep improving that too.

Jakub


Re: Avid ggc_alloc and push_cfun during LTO streaming

2019-10-11 Thread Richard Biener
On Fri, Oct 11, 2019 at 11:03 AM Jan Hubicka  wrote:
>
> Hi,
> this patch prevents tree creation druing WPA stream out (to avoid
> touching pages and triggering COW).  It fixes the following
>  - gimple streamer produces MEM_REF wrappings for global decls.
>This is to preserve the type of access and is not necessary for
>WPA->LTRANS streaming when decls ar eno longer going to be merged.
>  - we renumber stmt uids during streaming WPA summaries
>  - loop optimizer is initialized in output_function.
> After testing the patch I noticed that output_function does one extra
> renumbering of stmts. This seems quite broken and I will fix it
> incrementally.
>
> Bootstrapped/regtested x86_64-linux, comitted.

Huh.  Why do we stream function bodies at WPA time at all?
We should already have input sections we can copy/remap?

That is, why does gcc_assert (!flag_wpa) in output_function trip?

Richard.

>
> * gimple-streamer-out.c (output_gimple_stmt): Add explicit function
> parameter.
> * lto-streamer-out.c: Include tree-dfa.h.
> (output_cfg): Do not use cfun.
> (lto_prepare_function_for_streaming): New.
> (output_function): Do not push cfun; do not initialize loop optimizer.
> * lto-streamer.h (lto_prepare_function_for_streaming): Declare.
> * passes.c (ipa_write_summaries): Use it.
> (ipa_write_optimization_summaries): Do not modify bodies.
> * tree-dfa.c (renumber_gimple_stmt_uids): Add function parameter.
> * tree.dfa.h (renumber_gimple_stmt_uids): Update prototype.
> * tree-ssa-dse.c (pass_dse::execute): Update use of
> renumber_gimple_stmt_uids.
> * tree-ssa-math-opts.c (pass_optimize_widening_mul::execute): 
> Likewise.
>
> * lto.c (lto_wpa_write_files): Prepare all bodies for streaming.
> Index: gimple-streamer-out.c
> ===
> --- gimple-streamer-out.c   (revision 276850)
> +++ gimple-streamer-out.c   (working copy)
> @@ -57,7 +57,7 @@ output_phi (struct output_block *ob, gph
>  /* Emit statement STMT on the main stream of output block OB.  */
>
>  static void
> -output_gimple_stmt (struct output_block *ob, gimple *stmt)
> +output_gimple_stmt (struct output_block *ob, struct function *fn, gimple 
> *stmt)
>  {
>unsigned i;
>enum gimple_code code;
> @@ -80,7 +80,7 @@ output_gimple_stmt (struct output_block
>  as_a  (stmt)),
>1);
>bp_pack_value (, gimple_has_volatile_ops (stmt), 1);
> -  hist = gimple_histogram_value (cfun, stmt);
> +  hist = gimple_histogram_value (fn, stmt);
>bp_pack_value (, hist != NULL, 1);
>bp_pack_var_len_unsigned (, stmt->subcode);
>
> @@ -139,7 +139,7 @@ output_gimple_stmt (struct output_block
>  so that we do not have to deal with type mismatches on
>  merged symbols during IL read in.  The first operand
>  of GIMPLE_DEBUG must be a decl, not MEM_REF, though.  */
> - if (op && (i || !is_gimple_debug (stmt)))
> + if (!flag_wpa && op && (i || !is_gimple_debug (stmt)))
> {
>   basep = 
>   if (TREE_CODE (*basep) == ADDR_EXPR)
> @@ -147,7 +147,7 @@ output_gimple_stmt (struct output_block
>   while (handled_component_p (*basep))
> basep = _OPERAND (*basep, 0);
>   if (VAR_P (*basep)
> - && !auto_var_in_fn_p (*basep, current_function_decl)
> + && !auto_var_in_fn_p (*basep, fn->decl)
>   && !DECL_REGISTER (*basep))
> {
>   bool volatilep = TREE_THIS_VOLATILE (*basep);
> @@ -228,7 +228,7 @@ output_bb (struct output_block *ob, basi
>   print_gimple_stmt (streamer_dump_file, stmt, 0, TDF_SLIM);
> }
>
> - output_gimple_stmt (ob, stmt);
> + output_gimple_stmt (ob, fn, stmt);
>
>   /* Emit the EH region holding STMT.  */
>   region = lookup_stmt_eh_lp_fn (fn, stmt);
> Index: lto/lto.c
> ===
> --- lto/lto.c   (revision 276850)
> +++ lto/lto.c   (working copy)
> @@ -304,6 +304,13 @@ lto_wpa_write_files (void)
>
>timevar_push (TV_WHOPR_WPA_IO);
>
> +  cgraph_node *node;
> +  /* Do body modifications needed for streaming before we fork out
> + worker processes.  */
> +  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
> +if (gimple_has_body_p (node->decl))
> +  lto_prepare_function_for_streaming (node);
> +
>/* Generate a prefix for the LTRANS unit files.  */
>blen = strlen (ltrans_output_list);
>temp_filename = (char *) xmalloc (blen + sizeof ("2147483648.o"));
> Index: lto-streamer-out.c
> ===
> --- lto-streamer-out.c  (revision 276850)
> +++ lto-streamer-out.c  (working copy)
> @@ -43,6 +43,7 @@ along with GCC; see the file COPYING3.
>  

Re: [PR47785] COLLECT_AS_OPTIONS

2019-10-11 Thread Richard Biener
On Fri, Oct 11, 2019 at 6:15 AM Kugan Vivekanandarajah
 wrote:
>
> Hi Richard,
> Thanks for the review.
>
> On Wed, 2 Oct 2019 at 20:41, Richard Biener  
> wrote:
> >
> > On Wed, Oct 2, 2019 at 10:39 AM Kugan Vivekanandarajah
> >  wrote:
> > >
> > > Hi,
> > >
> > > As mentioned in the PR, attached patch adds COLLECT_AS_OPTIONS for
> > > passing assembler options specified with -Wa, to the link-time driver.
> > >
> > > The proposed solution only works for uniform -Wa options across all
> > > TUs. As mentioned by Richard Biener, supporting non-uniform -Wa flags
> > > would require either adjusting partitioning according to flags or
> > > emitting multiple object files  from a single LTRANS CU. We could
> > > consider this as a follow up.
> > >
> > > Bootstrapped and regression tests on  arm-linux-gcc. Is this OK for trunk?
> >
> > While it works for your simple cases it is unlikely to work in practice 
> > since
> > your implementation needs the assembler options be present at the link
> > command line.  I agree that this might be the way for people to go when
> > they face the issue but then it needs to be documented somewhere
> > in the manual.
> >
> > That is, with COLLECT_AS_OPTION (why singular?  I'd expected
> > COLLECT_AS_OPTIONS) available to cc1 we could stream this string
> > to lto_options and re-materialize it at link time (and diagnose mismatches
> > even if we like).
> OK. I will try to implement this. So the idea is if we provide
> -Wa,options as part of the lto compile, this should be available
> during link time. Like in:
>
> arm-linux-gnueabihf-gcc -march=armv7-a -mthumb -O2 -flto
> -Wa,-mimplicit-it=always,-mthumb -c test.c
> arm-linux-gnueabihf-gcc  -flto  test.o
>
> I am not sure where should we stream this. Currently, cl_optimization
> has all the optimization flag provided for compiler and it is
> autogenerated and all the flags are integer values. Do you have any
> preference or example where this should be done.

In lto_write_options, I'd simply append the contents of COLLECT_AS_OPTIONS
(with -Wa, prepended to each of them), then recover them in lto-wrapper
for each TU and pass them down to the LTRANS compiles (if they agree
for all TUs, otherwise I'd warn and drop them).

Richard.

> Thanks,
> Kugan
>
>
>
> >
> > Richard.
> >
> > > Thanks,
> > > Kugan
> > >
> > >
> > > gcc/ChangeLog:
> > >
> > > 2019-10-02  kugan.vivekanandarajah  
> > >
> > > PR lto/78353
> > > * gcc.c (putenv_COLLECT_AS_OPTION): New to set COLLECT_AS_OPTION in env.
> > > (driver::main): Call putenv_COLLECT_AS_OPTION.
> > > * lto-wrapper.c (run_gcc): use COLLECT_AS_OPTION from env.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > 2019-10-02  kugan.vivekanandarajah  
> > >
> > > PR lto/78353
> > > * gcc.target/arm/pr78353-1.c: New test.
> > > * gcc.target/arm/pr78353-2.c: New test.


Re: [PATCH][AArch64] Set SLOW_BYTE_ACCESS

2019-10-11 Thread Richard Biener
On Thu, Oct 10, 2019 at 8:06 PM Richard Sandiford
 wrote:
>
> Wilco Dijkstra  writes:
> > ping
> >
> > Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
> > bitfields by their declared type, which results in better codegeneration on 
> > practically
> > any target.
>
> The name is confusing, but the documentation looks accurate to me:
>
> Define this macro as a C expression which is nonzero if accessing less
> than a word of memory (i.e.@: a @code{char} or a @code{short}) is no
> faster than accessing a word of memory, i.e., if such access
> require more than one instruction or if there is no difference in cost
> between byte and (aligned) word loads.
>
> When this macro is not defined, the compiler will access a field by
> finding the smallest containing object; when it is defined, a fullword
> load will be used if alignment permits.  Unless bytes accesses are
> faster than word accesses, using word accesses is preferable since it
> may eliminate subsequent memory access if subsequent accesses occur to
> other fields in the same word of the structure, but to different bytes.
>
> > I'm thinking we should completely remove all trace of SLOW_BYTE_ACCESS
> > from GCC as it's confusing and useless.
>
> I disagree.  Some targets can optimise single-bit operations when the
> container is a byte, for example.

There's also less chance of store-to-load forwarding issues when _not_
forcing word accesses.  Because I think that we do not use the
same macro to change code generation for writes (where a larger access
requires a read-modify-write cycle).

But this also means that a decision with no information on context is
going to be flawed.  Still generally I'd do smaller reads if they are not
significantly more expensive on the target but larger writes (with the
same constraint) just for the STLF issue.

Unfortunately the macro docs don't say if it is applicable to reads or writes
or both...

> > OK for commit until we get rid of it?
> >
> > ChangeLog:
> > 2017-11-17  Wilco Dijkstra  
> >
> > gcc/
> > * config/aarch64/aarch64.h (SLOW_BYTE_ACCESS): Set to 1.
> > --
> > diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> > index 
> > 056110afb228fb919e837c04aa5e5552a4868ec3..d8f4d129a02fb89eb00d256aba8c4764d6026078
> >  100644
> > --- a/gcc/config/aarch64/aarch64.h
> > +++ b/gcc/config/aarch64/aarch64.h
> > @@ -769,14 +769,9 @@ typedef struct
> > if given data not on the nominal alignment.  */
> >  #define STRICT_ALIGNMENTTARGET_STRICT_ALIGN
> >
> > -/* Define this macro to be non-zero if accessing less than a word of
> > -   memory is no faster than accessing a word of memory, i.e., if such
> > -   accesses require more than one instruction or if there is no
> > -   difference in cost.
> > -   Although there's no difference in instruction count or cycles,
> > -   in AArch64 we don't want to expand to a sub-word to a 64-bit access
> > -   if we don't have to, for power-saving reasons.  */
> > -#define SLOW_BYTE_ACCESS   0
> > +/* Contrary to all documentation, this enables wide bitfield accesses,
> > +   which results in better code when accessing multiple bitfields.  */
> > +#define SLOW_BYTE_ACCESS   1
> >
> >  #define NO_FUNCTION_CSE 1
>
> I agree this makes sense from a performance point of view, and I think
> the existing comment is admitting that AArch64 has the properties that
> would normally cause us to set SLOW_BYTE_ACCESS to 1.  But the comment
> is claiming that there's a power-saving benefit to leaving it off.
>
> It seems like a weak argument though.  Bitfields are used when several
> values are packed into the same integer, so there's a high likelihood
> we'll need the whole integer anyway.  Avoiding the redundancies described
> in the documention should if anything help with power usage.
>
> Maybe the main concern was using a 64-bit access when a 32-bit one
> would do, since 32-bit bitfield containers are the most common.  But the:
>
>  && GET_MODE_ALIGNMENT (mode) <= align
>
> condition in get_best_mode should avoid that unless the 64-bit
> access is naturally aligned.  (See the big comment above for the
> pros and cons of this.)
>
> So I think we should change the macro value unless anyone can back up the
> power-saving claim.  Let's wait a week (more) to see if anyone objects.
>
> The comment change isn't OK though.  Please keep the first paragraph
> and just reword the second to say that's why we set the value to 1.
>
> Thanks,
> Richard


Re: Type representation in CTF and DWARF

2019-10-11 Thread Richard Biener
On Fri, Oct 11, 2019 at 1:06 AM Indu Bhagat  wrote:
>
>
>
> On 10/09/2019 12:49 AM, Jakub Jelinek wrote:
> > On Wed, Oct 09, 2019 at 09:41:09AM +0200, Richard Biener wrote:
> >> There's a mechanism to get type (and decl - I suppose CTF also
> >> contains debug info
> >> for function declarations not only its type?) info as part of early
> >> debug generation.
> >> The attached "hack" simply mangles dwarf2out to output this early info as 
> >> the
> >> only debug info (only verified on a small .c file).  We still have things 
> >> like
> >> file, line and column numbers for entities (not sure if CTF has those).
> >>
> >> It should be possible to "hide" the hack behind a -gdwarf-like-ctf or 
> >> similar.
> >> I guess -g0.5 isn't desirable and we've taken both -g0 and -g1 already...
> >> (and -g1 doesn't include types but just decls).
> > Yeah.  And if location info isn't in CTF, you can as well add an early
> > return in add_src_coords_attributes, like it has one for UNKNOWN_LOCATION
> > already.  Or if it is there, but just file/line and not column, you can use
> > -gno-column-info.  As has been mentioned earlier, you can use dwz utility
> > post-linking instead of -fdebug-types-section.
> >
> >   Jakub
>
> Thanks for your pointers.
>
> CTF does not encode location information. So, I used early exit in the
> add_src_coords_attributes to avoid generation of location info (file, line,
> column). To answer Richard's question, CTF does have type debug info
> for function declarations and the argument types. So I think with these
> changes, both CTF and DWARF generation will emit debug info for the same set 
> of
> types and decl.
>
> Compile with -g -gdwarf-like-ctf and use dwz -o   (using
> dwz compiled from the master branch) on the generated binaries:
>
> (coreutils-0.22)
>   .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | .ctf 
> (uncompressed) | ratio (.ctf/(D1+D2+0.5*D4))
> ls   30616   |1136   |21098   | 26240 
>   | 0.62
> pwd  10734   |788|10433   | 13929 
>   | 0.83
> groups 10706 |811|10249   | 13378 
>   | 0.80
>
> (emacs-26.3)
>   .debug_info(D1) | .debug_abbrev(D2) | .debug_str(D4) | .ctf 
> (uncompressed) | ratio (.ctf/(D1+D2+0.5*D4))
> emacs-26.3.1 674657  |6402   |   273963   |   273910  
>   | 0.33
>
> I chose to account for 50% of .debug_str because at this point, it will be
> unfair to not account for them. Actually, one could even argue that upto 70%
> of the .debug_str are names of entities. CTF section sizes do include the CTF
> string tables.
>
> Across coreutils, I see a geomean of 0.73 (ratio of
> .ctf/(.debug_info + .debug_abbrev + 50% of .debug_str)). So, with the
> "-gdwarf-like-ctf code stubs" and dwz, DWARF continues to have a larger
> footprint than CTF (with 50% of .debug_str accounted for).

I'm not convinced this "improvement" in size is worth maintainig another
debug-info format much less since it lacks desirable features right now
and thus evaluation is tricky.

At least you can improve dwarf size considerably with a low amount of work.

I suspect another factor where dwarf is bigger compared to CTF is that dwarf
is recording typedef names as well as qualified type variants.  But maybe
CTF just has a more compact representation for the bits it actually implements.

Richard.

> Indu
>


Re: Remove splay tree form ipa-reference.c

2019-10-11 Thread Richard Biener
On Thu, Oct 10, 2019 at 9:25 PM Jan Hubicka  wrote:
>
> Hi,
> ipa-reference uses splay tree to map DECL_UIDs to trees.  On other
> places we use hash-maps which are more sutiable.

A simple hash-table/jhash-set does it as well if the UID is the same
as DECL_UID.
You can use the tree_decl_hash hash-traits for that.  decl lookup
can then be done like

tree_decl_minimal d;
d.uid = uid;
find_slot_with_hash (, uid)

or so.  There must be existing users that got converted from old-style
hashes to new templated ones but I can't find them right now :/

Richard.

> Bootstrapped/regtested x86_64-linux, comitted.
>
> Honza
> * ipa-reference.c: Do not include splay-tree.h
> (reference_vars_to_consider): Turn to hash map.
> (get_static_name, ipa_init, analyze_function, propagate,
> stream_out_bitmap, ipa_reference_write_optimization_summary,
> ipa_reference_write_optimization_summary): Update.
> Index: ipa-reference.c
> ===
> --- ipa-reference.c (revision 276849)
> +++ ipa-reference.c (working copy)
> @@ -46,7 +46,6 @@ along with GCC; see the file COPYING3.
>  #include "cgraph.h"
>  #include "data-streamer.h"
>  #include "calls.h"
> -#include "splay-tree.h"
>  #include "ipa-utils.h"
>  #include "ipa-reference.h"
>  #include "symbol-summary.h"
> @@ -92,9 +91,11 @@ struct ipa_reference_vars_info_d
>
>  typedef struct ipa_reference_vars_info_d *ipa_reference_vars_info_t;
>
> -/* This splay tree contains all of the static variables that are
> +/* This map contains all of the static variables that are
> being considered by the compilation level alias analysis.  */
> -static splay_tree reference_vars_to_consider;
> +typedef hash_map, tree>
> +reference_vars_to_consider_t;
> +static reference_vars_to_consider_t *reference_vars_to_consider;
>
>  /* Set of all interesting module statics.  A bit is set for every module
> static we are considering.  This is added to the local info when asm
> @@ -272,9 +273,7 @@ is_proper_for_analysis (tree t)
>  static const char *
>  get_static_name (int index)
>  {
> -  splay_tree_node stn =
> -splay_tree_lookup (reference_vars_to_consider, index);
> -  return fndecl_name ((tree)(stn->value));
> +  return fndecl_name (*reference_vars_to_consider->get (index));
>  }
>
>  /* Dump a set of static vars to FILE.  */
> @@ -416,7 +415,7 @@ ipa_init (void)
>ipa_init_p = true;
>
>if (dump_file)
> -reference_vars_to_consider = splay_tree_new (splay_tree_compare_ints, 0, 
> 0);
> +reference_vars_to_consider = new reference_vars_to_consider_t(251);
>
>bitmap_obstack_initialize (_info_obstack);
>bitmap_obstack_initialize (_summary_obstack);
> @@ -476,9 +475,8 @@ analyze_function (struct cgraph_node *fn
>   && bitmap_set_bit (all_module_statics, ipa_reference_var_uid (var)))
> {
>   if (dump_file)
> -   splay_tree_insert (reference_vars_to_consider,
> -  ipa_reference_var_uid (var),
> -  (splay_tree_value)var);
> +   reference_vars_to_consider->put (ipa_reference_var_uid (var),
> +   var);
> }
>switch (ref->use)
> {
> @@ -898,7 +896,7 @@ propagate (void)
>  }
>
>if (dump_file)
> -splay_tree_delete (reference_vars_to_consider);
> +delete reference_vars_to_consider;
>reference_vars_to_consider = NULL;
>return remove_p ? TODO_remove_functions : 0;
>  }
> @@ -968,8 +966,7 @@ stream_out_bitmap (struct lto_simple_out
>  return;
>EXECUTE_IF_AND_IN_BITMAP (bits, ltrans_statics, 0, index, bi)
>  {
> -  tree decl = (tree)splay_tree_lookup (reference_vars_to_consider,
> -  index)->value;
> +  tree decl = *reference_vars_to_consider->get (index);
>lto_output_var_decl_index (ob->decl_state, ob->main_stream, decl);
>  }
>  }
> @@ -987,7 +984,7 @@ ipa_reference_write_optimization_summary
>auto_bitmap ltrans_statics;
>int i;
>
> -  reference_vars_to_consider = splay_tree_new (splay_tree_compare_ints, 0, 
> 0);
> +  reference_vars_to_consider = new reference_vars_to_consider_t (251);
>
>/* See what variables we are interested in.  */
>for (i = 0; i < lto_symtab_encoder_size (encoder); i++)
> @@ -1001,9 +998,8 @@ ipa_reference_write_optimization_summary
> {
>   tree decl = vnode->decl;
>   bitmap_set_bit (ltrans_statics, ipa_reference_var_uid (decl));
> - splay_tree_insert (reference_vars_to_consider,
> -ipa_reference_var_uid (decl),
> -(splay_tree_value)decl);
> + reference_vars_to_consider->put
> +(ipa_reference_var_uid (decl), decl);
>   ltrans_statics_bitcount ++;
> }
>  }
> @@ -1045,7 +1041,7 @@ ipa_reference_write_optimization_summary
>   }
>}
>

Re: [SH][committed] Fix PR 88630

2019-10-11 Thread Oleg Endo
On Fri, 2019-10-11 at 00:36 +0900, Oleg Endo wrote:
> Hi,
> 
> When we did the refactoring of SH's FPSCR handling back in GCC 5, we
> missed one thing regarding ST-40, it seems.
> 
> The attached patch fixes the issue as confirmed on the real
> hardware. 
> Also tested on sh-sim with
> make -k check RUNTESTFLAGS="--target_board=sh-sim\{-m2/-ml,-m2/-mb,-
> m2a/-mb,-m4/-ml,-m4/-mb}"
> 
> 
> Committed to trunk, GCC 9, GCC 8 as r276809, r276825, r276837.

I forgot about GCC 7.  Committed there as well as r276877.

Cheers,
Oleg



Re: Correctly release ipa-reference summarries

2019-10-11 Thread Richard Biener
On Thu, Oct 10, 2019 at 9:21 PM Jan Hubicka  wrote:
>
> Hi,
> this patch fixes code removing summaries in ipa-reference.  As a memory
> leak it may make sense to backport this to release branches.

Please do so.

Richard.

> Honza
>
> * ipa-reference.c (propagate): Fix releasing of IPA summaries.
> Index: ipa-reference.c
> ===
> --- ipa-reference.c (revision 276707)
> +++ ipa-reference.c (working copy)
> @@ -891,15 +889,14 @@ propagate (void)
>
>bitmap_obstack_release (_info_obstack);
>
> -  if (ipa_ref_var_info_summaries == NULL)
> +  if (ipa_ref_var_info_summaries != NULL)
>  {
>delete ipa_ref_var_info_summaries;
>ipa_ref_var_info_summaries = NULL;
>  }
>
> -  ipa_ref_var_info_summaries = NULL;
>if (dump_file)
>  splay_tree_delete (reference_vars_to_consider);
>reference_vars_to_consider = NULL;
>return remove_p ? TODO_remove_functions : 0;
>  }


[C++ Patch] One more DECL_SOURCE_LOCATION

2019-10-11 Thread Paolo Carlini

Hi,

another straightforward DECL_SOURCE_LOCATION (TYPE_MAIN_DECL consistent 
with the one I used in build_anon_union_vars. Tested x86_64-linux.


Thanks, Paolo.

/
/cp
2019-10-11  Paolo Carlini  

* decl.c (check_tag_decl): Use DECL_SOURCE_LOCATION.

/testsuite
2019-10-11  Paolo Carlini  

* g++.dg/cpp0x/constexpr-union5.C: Test location(s) too.
* g++.dg/diagnostic/bitfld2.C: Likewise.
* g++.dg/ext/anon-struct1.C: Likewise.
* g++.dg/ext/anon-struct6.C: Likewise.
* g++.dg/ext/flexary19.C: Likewise.
* g++.dg/ext/flexary9.C: Likewise.
* g++.dg/template/error17.C: Likewise.
Index: cp/decl.c
===
--- cp/decl.c   (revision 276845)
+++ cp/decl.c   (working copy)
@@ -4992,7 +4992,8 @@ check_tag_decl (cp_decl_specifier_seq *declspecs,
 
   if (TREE_CODE (declared_type) != UNION_TYPE
  && !in_system_header_at (input_location))
-   pedwarn (input_location, OPT_Wpedantic, "ISO C++ prohibits anonymous 
structs");
+   pedwarn (DECL_SOURCE_LOCATION (TYPE_MAIN_DECL (declared_type)),
+OPT_Wpedantic, "ISO C++ prohibits anonymous structs");
 }
 
   else
Index: testsuite/g++.dg/cpp0x/constexpr-union5.C
===
--- testsuite/g++.dg/cpp0x/constexpr-union5.C   (revision 276845)
+++ testsuite/g++.dg/cpp0x/constexpr-union5.C   (working copy)
@@ -23,16 +23,16 @@ SA((a.i == 42));
 
 struct B
 {
-  struct {
+  struct { // { dg-warning "10:ISO C\\+\\+ prohibits 
anonymous struct" }
 int h;
-struct {
+struct {   // { dg-warning "12:ISO C\\+\\+ prohibits 
anonymous struct" }
   union {
unsigned char i;
int j;
   };
   int k;
-}; // { dg-warning "anonymous struct" }
-  };   // { dg-warning "anonymous struct" }
+};
+  };
   int l;
 
   constexpr B(): h(1), i(2), k(3), l(4) {}
Index: testsuite/g++.dg/diagnostic/bitfld2.C
===
--- testsuite/g++.dg/diagnostic/bitfld2.C   (revision 276845)
+++ testsuite/g++.dg/diagnostic/bitfld2.C   (working copy)
@@ -6,4 +6,4 @@ template struct A
   struct {} : 2;   // { dg-error "expected ';' after struct" "expected" }
 };
 // { dg-error "ISO C.. forbids declaration" "declaration" { target *-*-* } 6 }
-// { dg-error "ISO C.. prohibits anonymous" "anonymous" { target *-*-* } 6 }
+// { dg-error "10:ISO C.. prohibits anonymous" "anonymous" { target *-*-* } 6 }
Index: testsuite/g++.dg/ext/anon-struct1.C
===
--- testsuite/g++.dg/ext/anon-struct1.C (revision 276845)
+++ testsuite/g++.dg/ext/anon-struct1.C (working copy)
@@ -19,7 +19,7 @@ char testD[sizeof(C::D) == sizeof(A) ? 1 : -1];
 
 /* GNU extension.  */
 struct E {
-  struct { char z; };  /* { dg-error "prohibits anonymous structs" } */
+  struct { char z; };  /* { dg-error "10:ISO C\\+\\+ prohibits 
anonymous structs" } */
   char e;
 };
 
@@ -45,6 +45,6 @@ char testH[sizeof(H) == 2 * sizeof(A) ? 1 : -1];
 
 /* Make sure __extension__ gets turned back off.  */
 struct I {
-  struct { char z; };  /* { dg-error "prohibits anonymous structs" } */
+  struct { char z; };  /* { dg-error "10:ISO C\\+\\+ prohibits 
anonymous structs" } */
   char i;
 };
Index: testsuite/g++.dg/ext/anon-struct6.C
===
--- testsuite/g++.dg/ext/anon-struct6.C (revision 276845)
+++ testsuite/g++.dg/ext/anon-struct6.C (working copy)
@@ -3,8 +3,8 @@
 struct A
 {
   struct
-  {
+  { // { dg-error "3:ISO C\\+\\+ prohibits anonymous structs" }
 struct { static int i; }; // { dg-error "prohibits anonymous 
structs|non-static data members|unnamed class" }
 void foo() { i; } // { dg-error "public non-static data" }
-  }; // { dg-error "prohibits anonymous structs" }
+  };
 };
Index: testsuite/g++.dg/ext/flexary19.C
===
--- testsuite/g++.dg/ext/flexary19.C(revision 276845)
+++ testsuite/g++.dg/ext/flexary19.C(working copy)
@@ -146,13 +146,13 @@ struct S16
 {
   int i;
 
-  struct {  // { dg-warning "invalid use" }
+  struct {  // { dg-warning "10:ISO C\\+\\+ prohibits anonymous 
struct|invalid use" }
 // A flexible array as a sole member of an anonymous struct is
 // rejected with an error in C mode but emits just a pedantic
 // warning in C++.  Other than excessive pedantry there is no
 // reason to reject it.
 int a[];
-  };// { dg-warning "anonymous struct" }
+  };
 };
 
 struct S17
@@ -177,9 +177,9 @@ struct S19
 {
   int i;
 
-  struct {  // { dg-warning "invalid use" }
+  struct {  // { dg-warning "10:ISO C\\+\\+ prohibits anonymous 

Re: [PATCH] Fix PR92046

2019-10-11 Thread Richard Biener
On Fri, 11 Oct 2019, Rainer Orth wrote:

> Hi Christophe,
> 
> > On Thu, 10 Oct 2019 at 16:01, Richard Biener  wrote:
> >
> >>
> >> The following fixes a few param adjustments that are made based on
> >> per-function adjustable flags by moving the adjustments to their
> >> users.  Semantics change in some minor ways but that's allowed
> >> for --params.
> >>
> >> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
> >>
> >> Hi,
> >
> > This generates several regressions.
> > On aarch64:
> > FAIL:  gcc.target/aarch64/vect_fp16_1.c scan-assembler-times
> > fadd\tv[0-9]+.8h 2
> >
> > on arm-linux-gnueabihf:
> > FAIL: gcc.dg/vect/vect-align-1.c -flto -ffat-lto-objects
> >  scan-tree-dump-times vect "vectorized 1 loops" 1
> > FAIL: gcc.dg/vect/vect-align-1.c scan-tree-dump-times vect "vectorized 1
> > loops" 1
> > FAIL: gcc.dg/vect/vect-align-2.c -flto -ffat-lto-objects
> >  scan-tree-dump-times vect "vectorized 1 loops" 1
> > FAIL: gcc.dg/vect/vect-align-2.c scan-tree-dump-times vect "vectorized 1
> > loops" 1
> >
> > on armeb-linux-gnueabihf, many (316) like:
> > FAIL: gcc.dg/vect/O3-vect-pr34223.c scan-tree-dump-times vect "vectorized 1
> > loops" 1
> > FAIL: gcc.dg/vect/fast-math-pr35982.c scan-tree-dump-times vect "vectorized
> > 1 loops" 1
> >
> > still on armeb-linux-gnueabihf:
> > g++.dg/vect/pr33426-ivdep-2.cc  -std=c++14  (test for warnings, line )
> > g++.dg/vect/pr33426-ivdep-2.cc  -std=c++17  (test for warnings, line )
> > g++.dg/vect/pr33426-ivdep-2.cc  -std=c++2a  (test for warnings, line )
> > g++.dg/vect/pr33426-ivdep-2.cc  -std=c++98  (test for warnings, line )
> > g++.dg/vect/pr33426-ivdep-3.cc(test for warnings, line )
> > g++.dg/vect/pr33426-ivdep-4.cc(test for warnings, line )
> > g++.dg/vect/pr33426-ivdep.cc  -std=c++14  (test for warnings, line )
> > g++.dg/vect/pr33426-ivdep.cc  -std=c++17  (test for warnings, line )
> > g++.dg/vect/pr33426-ivdep.cc  -std=c++2a  (test for warnings, line )
> > g++.dg/vect/pr33426-ivdep.cc  -std=c++98  (test for warnings, line )
> >
> > gfortran.dg/vect/no-vfa-pr32377.f90   -O   scan-tree-dump-times vect
> > "vectorized 2 loops" 1
> > gfortran.dg/vect/pr19049.f90   -O   scan-tree-dump-times vect
> > "vectorized 1 loops" 1
> > gfortran.dg/vect/pr32377.f90   -O   scan-tree-dump-times vect
> > "vectorized 2 loops" 1
> > gfortran.dg/vect/vect-2.f90   -O   scan-tree-dump-times vect
> > "vectorized 3 loops" 1
> > gfortran.dg/vect/vect-3.f90   -O   scan-tree-dump-times vect "Alignment
> > of access forced using versioning" 3
> > gfortran.dg/vect/vect-4.f90   -O   scan-tree-dump-times vect "accesses
> > have the same alignment." 1
> > gfortran.dg/vect/vect-4.f90   -O   scan-tree-dump-times vect
> > "vectorized 1 loops" 1
> > gfortran.dg/vect/vect-5.f90   -O   scan-tree-dump-times vect "Alignment
> > of access forced using versioning." 2
> > gfortran.dg/vect/vect-5.f90   -O   scan-tree-dump-times vect
> > "vectorized 1 loops" 1
> 
> that's PR tree-optimization/92066, also seen on sparc, powerpc64, and
> ia64.

Hmm, OK.  There's one obvious bug fixed below, other than that I have
to investigate in more detail.

Committed as obvious.

Richard.

2019-10-11  Richard Biener  

PR tree-optimization/92066
PR tree-optimization/92046
* tree-vect-data-refs.c (vect_enhance_data_refs_alignment):
Fix bogus cost model check.

Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   (revision 276858)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -2179,7 +2179,7 @@
   do_versioning
 = (optimize_loop_nest_for_speed_p (loop)
&& !loop->inner /* FORNOW */
-   && flag_vect_cost_model > VECT_COST_MODEL_CHEAP);
+   && flag_vect_cost_model != VECT_COST_MODEL_CHEAP);
 
   if (do_versioning)
 {


Re: [PATCH] Fix PR92046

2019-10-11 Thread Rainer Orth
Hi Christophe,

> On Thu, 10 Oct 2019 at 16:01, Richard Biener  wrote:
>
>>
>> The following fixes a few param adjustments that are made based on
>> per-function adjustable flags by moving the adjustments to their
>> users.  Semantics change in some minor ways but that's allowed
>> for --params.
>>
>> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
>>
>> Hi,
>
> This generates several regressions.
> On aarch64:
> FAIL:  gcc.target/aarch64/vect_fp16_1.c scan-assembler-times
> fadd\tv[0-9]+.8h 2
>
> on arm-linux-gnueabihf:
> FAIL: gcc.dg/vect/vect-align-1.c -flto -ffat-lto-objects
>  scan-tree-dump-times vect "vectorized 1 loops" 1
> FAIL: gcc.dg/vect/vect-align-1.c scan-tree-dump-times vect "vectorized 1
> loops" 1
> FAIL: gcc.dg/vect/vect-align-2.c -flto -ffat-lto-objects
>  scan-tree-dump-times vect "vectorized 1 loops" 1
> FAIL: gcc.dg/vect/vect-align-2.c scan-tree-dump-times vect "vectorized 1
> loops" 1
>
> on armeb-linux-gnueabihf, many (316) like:
> FAIL: gcc.dg/vect/O3-vect-pr34223.c scan-tree-dump-times vect "vectorized 1
> loops" 1
> FAIL: gcc.dg/vect/fast-math-pr35982.c scan-tree-dump-times vect "vectorized
> 1 loops" 1
>
> still on armeb-linux-gnueabihf:
> g++.dg/vect/pr33426-ivdep-2.cc  -std=c++14  (test for warnings, line )
> g++.dg/vect/pr33426-ivdep-2.cc  -std=c++17  (test for warnings, line )
> g++.dg/vect/pr33426-ivdep-2.cc  -std=c++2a  (test for warnings, line )
> g++.dg/vect/pr33426-ivdep-2.cc  -std=c++98  (test for warnings, line )
> g++.dg/vect/pr33426-ivdep-3.cc(test for warnings, line )
> g++.dg/vect/pr33426-ivdep-4.cc(test for warnings, line )
> g++.dg/vect/pr33426-ivdep.cc  -std=c++14  (test for warnings, line )
> g++.dg/vect/pr33426-ivdep.cc  -std=c++17  (test for warnings, line )
> g++.dg/vect/pr33426-ivdep.cc  -std=c++2a  (test for warnings, line )
> g++.dg/vect/pr33426-ivdep.cc  -std=c++98  (test for warnings, line )
>
> gfortran.dg/vect/no-vfa-pr32377.f90   -O   scan-tree-dump-times vect
> "vectorized 2 loops" 1
> gfortran.dg/vect/pr19049.f90   -O   scan-tree-dump-times vect
> "vectorized 1 loops" 1
> gfortran.dg/vect/pr32377.f90   -O   scan-tree-dump-times vect
> "vectorized 2 loops" 1
> gfortran.dg/vect/vect-2.f90   -O   scan-tree-dump-times vect
> "vectorized 3 loops" 1
> gfortran.dg/vect/vect-3.f90   -O   scan-tree-dump-times vect "Alignment
> of access forced using versioning" 3
> gfortran.dg/vect/vect-4.f90   -O   scan-tree-dump-times vect "accesses
> have the same alignment." 1
> gfortran.dg/vect/vect-4.f90   -O   scan-tree-dump-times vect
> "vectorized 1 loops" 1
> gfortran.dg/vect/vect-5.f90   -O   scan-tree-dump-times vect "Alignment
> of access forced using versioning." 2
> gfortran.dg/vect/vect-5.f90   -O   scan-tree-dump-times vect
> "vectorized 1 loops" 1

that's PR tree-optimization/92066, also seen on sparc, powerpc64, and
ia64.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[patch,Fortran] PR 92050 - fix ICE with -fcheck=all

2019-10-11 Thread Tobias Burnus
Checking produces extra code (unsurprisingly); this code needs to end up 
in the program…


Bootstrapped on x86-64_gnu-linux. OK for the trunk?

Tobias

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 965ab7786a1..65238ff623d 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -7031,8 +7031,11 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 	gfc_allocate_lang_decl (result);
   GFC_DECL_SAVED_DESCRIPTOR (result) = parmse.expr;
   gfc_free_expr (class_expr);
-  gcc_assert (parmse.pre.head == NULL_TREE
-		  && parmse.post.head == NULL_TREE);
+  /* -fcheck= can add diagnostic code, which has to be placed before
+	 the call. */
+  if (parmse.pre.head != NULL)
+	  gfc_add_expr_to_block (>pre, parmse.pre.head);
+  gcc_assert (parmse.post.head == NULL_TREE);
 }
 
   /* Follow the function call with the argument post block.  */
diff --git a/gcc/testsuite/gfortran.dg/pr92050.f90 b/gcc/testsuite/gfortran.dg/pr92050.f90
new file mode 100644
index 000..64193878d8f
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr92050.f90
@@ -0,0 +1,53 @@
+! { dg-do run }
+! { dg-options "-fcheck=all" }
+! { dg-shouldfail "above upper bound" }
+!
+! PR fortran/92050
+!
+!
+module buggy
+  implicit none (type, external)
+
+  type :: par
+  contains
+procedure, public :: fun => fun_par
+  end type par
+
+  type comp
+class(par), allocatable :: p
+  end type comp
+
+  type foo
+type(comp), allocatable :: m(:)
+  end type foo
+
+contains
+
+  function fun_par(this)
+class(par) :: this
+integer:: fun_par(1)
+fun_par = 42
+  end function fun_par
+
+  subroutine update_foo(this)
+class(foo) :: this
+write(*,*) this%m(1)%p%fun()
+  end subroutine update_foo
+
+  subroutine bad_update_foo(this)
+class(foo) :: this
+write(*,*) this%m(2)%p%fun()
+  end subroutine bad_update_foo
+end module buggy
+
+program main
+  use buggy
+  implicit none (type, external)
+  type(foo) :: x
+  allocate(x%m(1))
+  allocate(x%m(1)%p)
+  call update_foo(x)
+  call bad_update_foo(x)
+end program main
+
+! { dg-output "At line 39 of file .*pr92050.f90.*Fortran runtime error: Index '2' of dimension 1 of array 'this%m' above upper bound of 1" }


Re: [PATCH] Fix PR92046

2019-10-11 Thread Christophe Lyon
On Thu, 10 Oct 2019 at 16:01, Richard Biener  wrote:

>
> The following fixes a few param adjustments that are made based on
> per-function adjustable flags by moving the adjustments to their
> users.  Semantics change in some minor ways but that's allowed
> for --params.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
>
> Hi,

This generates several regressions.
On aarch64:
FAIL:  gcc.target/aarch64/vect_fp16_1.c scan-assembler-times
fadd\tv[0-9]+.8h 2

on arm-linux-gnueabihf:
FAIL: gcc.dg/vect/vect-align-1.c -flto -ffat-lto-objects
 scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/vect-align-1.c scan-tree-dump-times vect "vectorized 1
loops" 1
FAIL: gcc.dg/vect/vect-align-2.c -flto -ffat-lto-objects
 scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/vect-align-2.c scan-tree-dump-times vect "vectorized 1
loops" 1

on armeb-linux-gnueabihf, many (316) like:
FAIL: gcc.dg/vect/O3-vect-pr34223.c scan-tree-dump-times vect "vectorized 1
loops" 1
FAIL: gcc.dg/vect/fast-math-pr35982.c scan-tree-dump-times vect "vectorized
1 loops" 1

still on armeb-linux-gnueabihf:
g++.dg/vect/pr33426-ivdep-2.cc  -std=c++14  (test for warnings, line )
g++.dg/vect/pr33426-ivdep-2.cc  -std=c++17  (test for warnings, line )
g++.dg/vect/pr33426-ivdep-2.cc  -std=c++2a  (test for warnings, line )
g++.dg/vect/pr33426-ivdep-2.cc  -std=c++98  (test for warnings, line )
g++.dg/vect/pr33426-ivdep-3.cc(test for warnings, line )
g++.dg/vect/pr33426-ivdep-4.cc(test for warnings, line )
g++.dg/vect/pr33426-ivdep.cc  -std=c++14  (test for warnings, line )
g++.dg/vect/pr33426-ivdep.cc  -std=c++17  (test for warnings, line )
g++.dg/vect/pr33426-ivdep.cc  -std=c++2a  (test for warnings, line )
g++.dg/vect/pr33426-ivdep.cc  -std=c++98  (test for warnings, line )

gfortran.dg/vect/no-vfa-pr32377.f90   -O   scan-tree-dump-times vect
"vectorized 2 loops" 1
gfortran.dg/vect/pr19049.f90   -O   scan-tree-dump-times vect
"vectorized 1 loops" 1
gfortran.dg/vect/pr32377.f90   -O   scan-tree-dump-times vect
"vectorized 2 loops" 1
gfortran.dg/vect/vect-2.f90   -O   scan-tree-dump-times vect
"vectorized 3 loops" 1
gfortran.dg/vect/vect-3.f90   -O   scan-tree-dump-times vect "Alignment
of access forced using versioning" 3
gfortran.dg/vect/vect-4.f90   -O   scan-tree-dump-times vect "accesses
have the same alignment." 1
gfortran.dg/vect/vect-4.f90   -O   scan-tree-dump-times vect
"vectorized 1 loops" 1
gfortran.dg/vect/vect-5.f90   -O   scan-tree-dump-times vect "Alignment
of access forced using versioning." 2
gfortran.dg/vect/vect-5.f90   -O   scan-tree-dump-times vect
"vectorized 1 loops" 1

Christophe


Richard.
>
> 2019-10-10  Richard Biener  
>
> PR middle-end/92046
> * opts.c (finish_options): Do not influence global --params
> from options that are adjustable per function.
> * tree-vect-data-refs.c (vect_enhance_data_refs_alignment):
> Apply --param adjustment based on active cost-model.
> * tree-ssa-phiopt.c (cond_if_else_store_replacement): Disable
> further store-sinking when vectorization or if-conversion
> are not enabled.
>
> Index: gcc/opts.c
> ===
> --- gcc/opts.c  (revision 276795)+++ gcc/opts.c (working copy)
> +++ gcc/opts.c  (working copy)
> @@ -1123,24 +1123,6 @@ finish_options (struct gcc_options *opts
>&& !opts_set->x_flag_reorder_functions)
>  opts->x_flag_reorder_functions = 1;
>
> -  /* Tune vectorization related parametees according to cost model.  */
> -  if (opts->x_flag_vect_cost_model == VECT_COST_MODEL_CHEAP)
> -{
> -  maybe_set_param_value (PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS,
> -6, opts->x_param_values, opts_set->x_param_values);
> -  maybe_set_param_value (PARAM_VECT_MAX_VERSION_FOR_ALIGNMENT_CHECKS,
> -0, opts->x_param_values, opts_set->x_param_values);
> -  maybe_set_param_value (PARAM_VECT_MAX_PEELING_FOR_ALIGNMENT,
> -0, opts->x_param_values, opts_set->x_param_values);
> -}
> -
> -  /* Set PARAM_MAX_STORES_TO_SINK to 0 if either vectorization or
> if-conversion
> - is disabled.  */
> -  if ((!opts->x_flag_tree_loop_vectorize &&
> !opts->x_flag_tree_slp_vectorize)
> -   || !opts->x_flag_tree_loop_if_convert)
> -maybe_set_param_value (PARAM_MAX_STORES_TO_SINK, 0,
> -   opts->x_param_values,
> opts_set->x_param_values);
> -
>/* The -gsplit-dwarf option requires -ggnu-pubnames.  */
>if (opts->x_dwarf_split_debug_info)
>  opts->x_debug_generate_pub_sections = 2;
> Index: gcc/tree-vect-data-refs.c
> ===
> --- gcc/tree-vect-data-refs.c   (revision 276795)
> +++ gcc/tree-vect-data-refs.c   (working copy)
> @@ -2075,6 +2075,8 @@ vect_enhance_data_refs_alignment (loop_v
>  {
>   

[Ada] Assorted gigi tweaks

2019-10-11 Thread Eric Botcazou
Tested on x86_64-suse-linux, applied on the mainline.


2019-10-11  Eric Botcazou  

* gcc-interface/decl.c (elaborate_reference_1): Specifically deal with
pointer displacement.

* gcc-interface/decl.c (components_to_record): Use proper name.

* gcc-interface/trans.c (Sloc_to_locus): Use standard types.

-- 
Eric BotcazouIndex: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 276873)
+++ gcc-interface/decl.c	(working copy)
@@ -6782,6 +6782,15 @@ elaborate_reference_1 (tree ref, void *d
 		   elaborate_reference_1 (TREE_OPERAND (ref, 0), data),
 		   TREE_OPERAND (ref, 1), NULL_TREE);
 
+  /* If this is the displacement of a pointer, elaborate the pointer and then
+ displace the result.  The actual purpose here is to drop the location on
+ the expression, which may be problematic if replicated on references.  */
+  if (TREE_CODE (ref) == POINTER_PLUS_EXPR
+  && TREE_CODE (TREE_OPERAND (ref, 1)) == INTEGER_CST)
+return build2 (POINTER_PLUS_EXPR, TREE_TYPE (ref),
+		   elaborate_reference_1 (TREE_OPERAND (ref, 0), data),
+		   TREE_OPERAND (ref, 1));
+
   sprintf (suffix, "EXP%d", ++er->n);
   return
 elaborate_expression_1 (ref, er->entity, suffix, er->definition, false);
@@ -7923,7 +7932,7 @@ components_to_record (Node_Id gnat_compo
&& !(Is_Packed (gnat_record_type)
 	? has_non_packed_fixed_size_field
 	: Optimize_Alignment_Space (gnat_record_type))
-   && !debug__debug_flag_dot_r);
+   && !Debug_Flag_Dot_R);
   const bool w_reorder
 = (Convention (gnat_record_type) == Convention_Ada
&& Warn_On_Questionable_Layout
Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 276864)
+++ gcc-interface/trans.c	(working copy)
@@ -10848,8 +10848,8 @@ Sloc_to_locus (Source_Ptr Sloc, location
 }
 
   Source_File_Index file = Get_Source_File_Index (Sloc);
-  Logical_Line_Number line = Get_Logical_Line_Number (Sloc);
-  Column_Number column = (clear_column ? 0 : Get_Column_Number (Sloc));
+  Line_Number_Type line = Get_Logical_Line_Number (Sloc);
+  Column_Number_Type column = (clear_column ? 0 : Get_Column_Number (Sloc));
   line_map_ordinary *map = LINEMAPS_ORDINARY_MAP_AT (line_table, file - 1);
 
   /* We can have zero if pragma Source_Reference is in effect.  */


Re: Implement ggc_trim

2019-10-11 Thread Richard Biener
On October 11, 2019 9:03:53 AM GMT+02:00, Jan Hubicka  wrote:
>Hi,
>this patch adds ggc_trim that releases free pages used by GGC memory
>to system.  This is useful to reduce memory footprint of WPA streaming:
>WPA streaming ought to not use any more GGC memory (patches in testing
>for that) and trimming the memory makes it available to fork
>used
>by stream out machinery.
>
>I collected some stats for cc1 for both GGC and heap (using mallinfo).
>Memory footprints are as follows:
>
>After streaming in global stream: 123MB GGC;  25MB of heap.
>After streaming in callgraph: 228MB GGC;  45MB of heap.
>After streaming in summaries: 373MB GGC; 126MB of heap.
>After symbol merging   : 348MB GGC; 130MB of heap.
>After IPA-ICF  : 501MB GGC; 160MB of heap. (this is all ICF)
>After IPA-CP   : 528MB GGC; 163MB of heap.
>After IPA-SRA  : 532MB GGC; 163MB of heap.
>After Inline   : 644MB GGC; 173MB of heap
>   This is after collecting of 118MB of
>   garbage and returning 740k to system
>   by madvise_dontneed
>After ipa-reference: 644MB GGC; 370MB of heap
>   I checked this all goes into the
>   bitmaps; I have WIP patch for that
>After releasing summariess : 431MB GGC; 383MB of heap
>   Trim releases 43MB by unmap
>   and 321MB by madvise_dontneed
>
>At least i learnt new fact about ipa-reference consuming  200MB of
>memory which was not obvious from our detailed mem stats.
>
>I think the lowest hanging fruit after this patch is to add
>malloc_madvise which further reduces footpring and fix ipa-reference.
>Hopefully Martin will do a bit about ipa-icf.
>
>I will dig into what inliner does but it produces a lot of clones so I
>think it is mostly clone and summary duplication. Perhaps we can avoid
>copying some of summaries for inline clones.
>
>In TOP I see about 900MB instead of 1.4GB before WPA streaming starts
>with both ggc_trim and madvise.
>
>Note that I also tried to hack ggc_free to recognize free pages but at
>least in simple implementation it is a loss since it makes ggc_alloc
>more expensive (it needs to bring pages back and add into freelists)
>which hurts stream-in performance.
>
>I think sweeping once per WPA is no problem, it is definitly less than
>1% of WPA time.
>
>Bootstrapped/regtested x86_64-linux, OK?

Ok.

Richard. 

>   * ggc-page.c (release_pages): Output statistics when !quiet_flag.
>   (ggc_collect): Dump later to not interfere with release_page dump.
>   (ggc_trim): New function.
>   * ggc-none.c (ggc_trim): New.
>
>   * lto.c (lto_wpa_write_files): Call ggc_trim.
>Index: ggc-page.c
>===
>--- ggc-page.c (revision 276707)
>+++ ggc-page.c (working copy)
>@@ -529,7 +529,6 @@ static void clear_page_group_in_use (pag
> #endif
> static struct page_entry * alloc_page (unsigned);
> static void free_page (struct page_entry *);
>-static void release_pages (void);
> static void clear_marks (void);
> static void sweep_pages (void);
> static void ggc_recalculate_in_use_p (page_entry *);
>@@ -1016,6 +1015,8 @@ free_page (page_entry *entry)
> static void
> release_pages (void)
> {
>+  size_t n1 = 0;
>+  size_t n2 = 0;
> #ifdef USING_MADVISE
>   page_entry *p, *start_p;
>   char *start;
>@@ -1061,6 +1062,7 @@ release_pages (void)
>   else
> G.free_pages = p;
>   G.bytes_mapped -= mapped_len;
>+n1 += len;
> continue;
> }
>   prev = newprev;
>@@ -1092,6 +1094,7 @@ release_pages (void)
>/* Don't count those pages as mapped to not touch the garbage collector
>  unnecessarily. */
>   G.bytes_mapped -= len;
>+  n2 += len;
>   while (start_p != p)
> {
>   start_p->discarded = true;
>@@ -1124,6 +1127,7 @@ release_pages (void)
>   }
> 
>   munmap (start, len);
>+  n1 += len;
>   G.bytes_mapped -= len;
> }
> 
>@@ -1152,10 +1156,20 @@ release_pages (void)
>   *gp = g->next;
>   G.bytes_mapped -= g->alloc_size;
>   free (g->allocation);
>+  n1 += g->alloc_size;
>   }
> else
>   gp = >next;
> #endif
>+  if (!quiet_flag && (n1 || n2))
>+{
>+  fprintf (stderr, " {GC");
>+  if (n1)
>+  fprintf (stderr, " released %luk", (unsigned long)(n1 / 1024));
>+  if (n2)
>+  fprintf (stderr, " madv_dontneed %luk", (unsigned long)(n2 / 1024));
>+  fprintf (stderr, "}");
>+}
> }
> 
> /* This table provides a fast way to determine ceil(log_2(size)) for
>@@ -2178,19 +2192,22 @@ ggc_collect (void)
> return;
> 
>   timevar_push (TV_GC);
>-  if (!quiet_flag)
>-fprintf (stderr, " {GC %luk -> ", (unsigned long) G.allocated /
>1024);
>   if (GGC_DEBUG_LEVEL >= 2)
> fprintf (G.debug_file, 

[Ada] Accept size for record with oversized component clause

2019-10-11 Thread Eric Botcazou
This makes the compiler accept again an exact size clause for a record type 
that contains both a component subject to a representation clause that gives 
it a size greater than its nominal size and another component that is aliased.

Tested on x86_64-suse-linux, applied on the mainline.


2019-10-11  Eric Botcazou  

* gcc-interface/decl.c (gnat_to_gnu_field): Adjust again the packing
for a field without strict alignment and with an oversized clause.


2019-10-11  Eric Botcazou  

* gnat.dg/specs/size_clause4.ads: New test.

-- 
Eric BotcazouIndex: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 276866)
+++ gcc-interface/decl.c	(working copy)
@@ -7229,10 +7229,17 @@ gnat_to_gnu_field (Entity_Id gnat_field,
 	  && INTEGRAL_TYPE_P (TREE_TYPE (TYPE_FIELDS (gnu_field_type
 	gnu_field_type = TREE_TYPE (TYPE_FIELDS (gnu_field_type));
 
+  orig_field_type = gnu_field_type;
   gnu_field_type
 	= make_type_from_size (gnu_field_type, gnu_size,
 			   Has_Biased_Representation (gnat_field));
 
+  /* If the type has been extended, we may need to cap the alignment.  */
+  if (!needs_strict_alignment
+	  && gnu_field_type != orig_field_type
+	  && tree_int_cst_lt (TYPE_SIZE (orig_field_type), gnu_size))
+	packed = adjust_packed (gnu_field_type, gnu_record_type, packed);
+
   orig_field_type = gnu_field_type;
   gnu_field_type = maybe_pad_type (gnu_field_type, gnu_size, 0, gnat_field,
    false, false, definition, true);
-- { dg-do compile }

with System;

package Size_Clause4 is

  type Rec is record
A1 : System.Address;  
A2 : System.Address;  
I  : aliased Integer; 
  end record;
   
  for Rec use record
A1 at 0  range 0 .. 63; 
A2 at 8  range 0 .. 63;
I  at 16 range 0 .. 31;  
  end record;
  for Rec'Size use 160;

end Size_Clause4;


Re: [PATCH v4 6/7] S/390: Use signaling FP comparison instructions

2019-10-11 Thread Ilya Leoshkevich
> Am 07.10.2019 um 18:19 schrieb Andreas Krebbel :
> 
> On 01.10.19 15:27, Ilya Leoshkevich wrote:
>> dg-torture.exp=inf-compare-1.c is failing, because (qNaN > +Inf)
>> comparison is compiled to CDB instruction, which does not signal an
>> invalid operation exception. KDB should have been used instead.
>> 
>> This patch introduces a new CCmode and a new pattern in order to
>> generate signaling instructions in this and similar cases.
>> 
>> gcc/ChangeLog:
>> 
>> 2019-08-09  Ilya Leoshkevich  
>> 
>>  PR target/77918
>>  * config/s390/2827.md: Add new opcodes.
>>  * config/s390/2964.md: Likewise.
>>  * config/s390/3906.md: Likewise.
>>  * config/s390/8561.md: Likewise.
>>  * config/s390/s390-builtins.def (s390_vfchesb): Use
>>  the new vec_cmpgev4sf_quiet_nocc.
>>  (s390_vfchedb): Use the new vec_cmpgev2df_quiet_nocc.
>>  (s390_vfchsb): Use the new vec_cmpgtv4sf_quiet_nocc.
>>  (s390_vfchdb): Use the new vec_cmpgtv2df_quiet_nocc.
>>  (vec_cmplev4sf): Use the new vec_cmplev4sf_quiet_nocc.
>>  (vec_cmplev2df): Use the new vec_cmplev2df_quiet_nocc.
>>  (vec_cmpltv4sf): Use the new vec_cmpltv4sf_quiet_nocc.
>>  (vec_cmpltv2df): Use the new vec_cmpltv2df_quiet_nocc.
>>  * config/s390/s390-modes.def (CCSFPS): New mode.
>>  * config/s390/s390.c (s390_match_ccmode_set): Support CCSFPS.
>>  (s390_select_ccmode): Return CCSFPS for LT, LE, GT, GE and LTGT.
>>  (s390_branch_condition_mask): Reuse CCS for CCSFPS.
>>  (s390_expand_vec_compare): Use non-signaling patterns where
>>  necessary.
>>  (s390_reverse_condition): Support CCSFPS.
>>  * config/s390/s390.md (*cmp_ccsfps): New pattern.
>>  * config/s390/vector.md: (VFCMP_HW_OP): Remove.
>>  (asm_fcmp_op): Likewise.
>>  (*smaxv2df3_vx): Use pattern for quiet comparison.
>>  (*sminv2df3_vx): Likewise.
>>  (*vec_cmp_nocc): Remove.
>>  (*vec_cmpeq_quiet_nocc): New pattern.
>>  (vec_cmpgt_quiet_nocc): Likewise.
>>  (vec_cmplt_quiet_nocc): New expander.
>>  (vec_cmpge_quiet_nocc): New pattern.
>>  (vec_cmple_quiet_nocc): New expander.
>>  (*vec_cmpeq_signaling_nocc): New pattern.
>>  (*vec_cmpgt_signaling_nocc): Likewise.
>>  (*vec_cmpgt_signaling_finite_nocc): Likewise.
>>  (*vec_cmpge_signaling_nocc): Likewise.
>>  (*vec_cmpge_signaling_finite_nocc): Likewise.
>>  (vec_cmpungt): New expander.
>>  (vec_cmpunge): Likewise.
>>  (vec_cmpuneq): Use quiet patterns.
>>  (vec_cmpltgt): Allow only on z14+.
>>  (vec_cmpordered): Use quiet patterns.
>>  (vec_cmpunordered): Likewise.
>>  (VEC_CMP_EXPAND): Add ungt and unge.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> 2019-08-09  Ilya Leoshkevich  
>> 
>>  * gcc.target/s390/vector/vec-scalar-cmp-1.c: Adjust
>>  expectations.
> 
> ...
> 
>> diff --git a/gcc/config/s390/s390-modes.def b/gcc/config/s390/s390-modes.def
>> index 7b7c1141449..2d9cd9b5945 100644
>> --- a/gcc/config/s390/s390-modes.def
>> +++ b/gcc/config/s390/s390-modes.def
>> @@ -52,6 +52,8 @@ CCS:  EQ  LT   GT  UNORDERED  
>> (LTGFR, LTGR, LTR, ICM/Y,
>>ADB/R, AEB/R, SDB/R, 
>> SEB/R,
>>SRAG, SRA, SRDA)
>> CCSR: EQ  GT   LT  UNORDERED  (CGF/R, CH/Y)
>> +CCSFPS: EQGT   LT  UNORDERED  (KEB/R, KDB/R, KXBR, 
>> KDTR,
>> +   KXTR, WFK)
> 
> GT and LT need to be swapped.
> 
> Ok with that change. Thanks!

I've swapped GT and LT and committed this as

https://gcc.gnu.org/viewcvs/gcc?view=revision=276871



Re: [PATCH v4 7/7] S/390: Test signaling FP comparison instructions

2019-10-11 Thread Ilya Leoshkevich
> Am 07.10.2019 um 18:27 schrieb Andreas Krebbel :
> 
> On 01.10.19 15:27,  wrote:
>> gcc/testsuite/ChangeLog:
>> 
>> 2019-08-09  Ilya Leoshkevich  
>> 
>>  PR target/77918
>>  * gcc.target/s390/s390.exp: Enable Fortran tests.
>>  * gcc.target/s390/zvector/autovec-double-quiet-eq.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-quiet-ge.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-quiet-gt.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-quiet-le.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-quiet-lt.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-quiet-ordered.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-quiet-uneq.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-quiet-unordered.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-signaling-eq-z13-finite.c: New 
>> test.
>>  * gcc.target/s390/zvector/autovec-double-signaling-eq-z13.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-signaling-eq.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-signaling-ge-z13-finite.c: New 
>> test.
>>  * gcc.target/s390/zvector/autovec-double-signaling-ge-z13.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-signaling-ge.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-signaling-gt-z13-finite.c: New 
>> test.
>>  * gcc.target/s390/zvector/autovec-double-signaling-gt-z13.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-signaling-gt.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-signaling-le-z13-finite.c: New 
>> test.
>>  * gcc.target/s390/zvector/autovec-double-signaling-le-z13.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-signaling-le.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-signaling-lt-z13-finite.c: New 
>> test.
>>  * gcc.target/s390/zvector/autovec-double-signaling-lt-z13.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-signaling-lt.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-signaling-ltgt-z13-finite.c: 
>> New test.
>>  * gcc.target/s390/zvector/autovec-double-signaling-ltgt-z13.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-signaling-ltgt.c: New test.
>>  * gcc.target/s390/zvector/autovec-double-smax-z13.F90: New test.
>>  * gcc.target/s390/zvector/autovec-double-smax.F90: New test.
>>  * gcc.target/s390/zvector/autovec-double-smin-z13.F90: New test.
>>  * gcc.target/s390/zvector/autovec-double-smin.F90: New test.
>>  * gcc.target/s390/zvector/autovec-float-quiet-eq.c: New test.
>>  * gcc.target/s390/zvector/autovec-float-quiet-ge.c: New test.
>>  * gcc.target/s390/zvector/autovec-float-quiet-gt.c: New test.
>>  * gcc.target/s390/zvector/autovec-float-quiet-le.c: New test.
>>  * gcc.target/s390/zvector/autovec-float-quiet-lt.c: New test.
>>  * gcc.target/s390/zvector/autovec-float-quiet-ordered.c: New test.
>>  * gcc.target/s390/zvector/autovec-float-quiet-uneq.c: New test.
>>  * gcc.target/s390/zvector/autovec-float-quiet-unordered.c: New test.
>>  * gcc.target/s390/zvector/autovec-float-signaling-eq.c: New test.
>>  * gcc.target/s390/zvector/autovec-float-signaling-ge.c: New test.
>>  * gcc.target/s390/zvector/autovec-float-signaling-gt.c: New test.
>>  * gcc.target/s390/zvector/autovec-float-signaling-le.c: New test.
>>  * gcc.target/s390/zvector/autovec-float-signaling-lt.c: New test.
>>  * gcc.target/s390/zvector/autovec-float-signaling-ltgt.c: New test.
>>  * gcc.target/s390/zvector/autovec-fortran.h: New test.
>>  * gcc.target/s390/zvector/autovec-long-double-signaling-ge.c: New test.
>>  * gcc.target/s390/zvector/autovec-long-double-signaling-gt.c: New test.
>>  * gcc.target/s390/zvector/autovec-long-double-signaling-le.c: New test.
>>  * gcc.target/s390/zvector/autovec-long-double-signaling-lt.c: New test.
>>  * gcc.target/s390/zvector/autovec.h: New test.
> 
> Do these tests work on 32 bit? We need -mzarch to make the vector 
> instructions available there.
> 
> Ok, if clean on 32 and 64 bit.

I've added -mzarch, tweaked a few expectations, and committed this as

https://gcc.gnu.org/viewcvs/gcc?view=revision=276872


Avid ggc_alloc and push_cfun during LTO streaming

2019-10-11 Thread Jan Hubicka
Hi,
this patch prevents tree creation druing WPA stream out (to avoid
touching pages and triggering COW).  It fixes the following
 - gimple streamer produces MEM_REF wrappings for global decls.
   This is to preserve the type of access and is not necessary for
   WPA->LTRANS streaming when decls ar eno longer going to be merged.
 - we renumber stmt uids during streaming WPA summaries
 - loop optimizer is initialized in output_function.
After testing the patch I noticed that output_function does one extra
renumbering of stmts. This seems quite broken and I will fix it
incrementally.

Bootstrapped/regtested x86_64-linux, comitted.

* gimple-streamer-out.c (output_gimple_stmt): Add explicit function
parameter.
* lto-streamer-out.c: Include tree-dfa.h.
(output_cfg): Do not use cfun.
(lto_prepare_function_for_streaming): New.
(output_function): Do not push cfun; do not initialize loop optimizer.
* lto-streamer.h (lto_prepare_function_for_streaming): Declare.
* passes.c (ipa_write_summaries): Use it.
(ipa_write_optimization_summaries): Do not modify bodies.
* tree-dfa.c (renumber_gimple_stmt_uids): Add function parameter.
* tree.dfa.h (renumber_gimple_stmt_uids): Update prototype.
* tree-ssa-dse.c (pass_dse::execute): Update use of
renumber_gimple_stmt_uids.
* tree-ssa-math-opts.c (pass_optimize_widening_mul::execute): Likewise.

* lto.c (lto_wpa_write_files): Prepare all bodies for streaming.
Index: gimple-streamer-out.c
===
--- gimple-streamer-out.c   (revision 276850)
+++ gimple-streamer-out.c   (working copy)
@@ -57,7 +57,7 @@ output_phi (struct output_block *ob, gph
 /* Emit statement STMT on the main stream of output block OB.  */
 
 static void
-output_gimple_stmt (struct output_block *ob, gimple *stmt)
+output_gimple_stmt (struct output_block *ob, struct function *fn, gimple *stmt)
 {
   unsigned i;
   enum gimple_code code;
@@ -80,7 +80,7 @@ output_gimple_stmt (struct output_block
 as_a  (stmt)),
   1);
   bp_pack_value (, gimple_has_volatile_ops (stmt), 1);
-  hist = gimple_histogram_value (cfun, stmt);
+  hist = gimple_histogram_value (fn, stmt);
   bp_pack_value (, hist != NULL, 1);
   bp_pack_var_len_unsigned (, stmt->subcode);
 
@@ -139,7 +139,7 @@ output_gimple_stmt (struct output_block
 so that we do not have to deal with type mismatches on
 merged symbols during IL read in.  The first operand
 of GIMPLE_DEBUG must be a decl, not MEM_REF, though.  */
- if (op && (i || !is_gimple_debug (stmt)))
+ if (!flag_wpa && op && (i || !is_gimple_debug (stmt)))
{
  basep = 
  if (TREE_CODE (*basep) == ADDR_EXPR)
@@ -147,7 +147,7 @@ output_gimple_stmt (struct output_block
  while (handled_component_p (*basep))
basep = _OPERAND (*basep, 0);
  if (VAR_P (*basep)
- && !auto_var_in_fn_p (*basep, current_function_decl)
+ && !auto_var_in_fn_p (*basep, fn->decl)
  && !DECL_REGISTER (*basep))
{
  bool volatilep = TREE_THIS_VOLATILE (*basep);
@@ -228,7 +228,7 @@ output_bb (struct output_block *ob, basi
  print_gimple_stmt (streamer_dump_file, stmt, 0, TDF_SLIM);
}
 
- output_gimple_stmt (ob, stmt);
+ output_gimple_stmt (ob, fn, stmt);
 
  /* Emit the EH region holding STMT.  */
  region = lookup_stmt_eh_lp_fn (fn, stmt);
Index: lto/lto.c
===
--- lto/lto.c   (revision 276850)
+++ lto/lto.c   (working copy)
@@ -304,6 +304,13 @@ lto_wpa_write_files (void)
 
   timevar_push (TV_WHOPR_WPA_IO);
 
+  cgraph_node *node;
+  /* Do body modifications needed for streaming before we fork out
+ worker processes.  */
+  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
+if (gimple_has_body_p (node->decl))
+  lto_prepare_function_for_streaming (node);
+
   /* Generate a prefix for the LTRANS unit files.  */
   blen = strlen (ltrans_output_list);
   temp_filename = (char *) xmalloc (blen + sizeof ("2147483648.o"));
Index: lto-streamer-out.c
===
--- lto-streamer-out.c  (revision 276850)
+++ lto-streamer-out.c  (working copy)
@@ -43,6 +43,7 @@ along with GCC; see the file COPYING3.
 #include "debug.h"
 #include "omp-offload.h"
 #include "print-tree.h"
+#include "tree-dfa.h"
 
 
 static void lto_write_tree (struct output_block*, tree, bool);
@@ -1893,7 +1894,7 @@ output_cfg (struct output_block *ob, str
 
   streamer_write_hwi (ob, -1);
 
-  bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+  bb = ENTRY_BLOCK_PTR_FOR_FN (fn);
   while (bb->next_bb)
 {
   streamer_write_hwi (ob, bb->next_bb->index);
@@ -1902,9 +1903,6 @@ 

  1   2   >