[PATCH] Fix free_lang_data on asm stmts (PR lto/91572)

2019-08-30 Thread Jakub Jelinek
Hi!

The following testcase ICEs, because on the inline asm LTO streaming streams
the constraint strings ("g" in this case), including their type, but the
fld type discovery doesn't see that type and so we end up streaming const
char type turned into its own main variant.

The strings for asm are in TREE_PURPOSE of the TREE_LIST args.
walk_tree doesn't walk TREE_PURPOSE though.  Tried to change that, but it
breaks way too much, tried to walk TREE_PURPOSE of TREE_LIST just for the
fld walking (find_decls_types_r), but that doesn't work either, most of the
TREE_PURPOSE we do not want to walk, usually it contains C++ default
arguments which fld clears.  So, this directed patch walks the TREE_PURPOSE
solely for the asm stmt arguments.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-08-31  Jakub Jelinek  

PR lto/91572
* tree.c (find_decls_types_in_node): Also walk TREE_PURPOSE of
GIMPLE_ASM TREE_LIST operands.

* g++.dg/lto/pr91572_0.C: New test.

--- gcc/tree.c.jj   2019-08-29 10:22:06.337702323 +0200
+++ gcc/tree.c  2019-08-29 11:07:16.120107950 +0200
@@ -6142,6 +6142,13 @@ find_decls_types_in_node (struct cgraph_
{
  tree arg = gimple_op (stmt, i);
  find_decls_types (arg, fld);
+ /* find_decls_types doesn't walk TREE_PURPOSE of TREE_LISTs,
+which we need for asm stmts.  */
+ if (arg
+ && TREE_CODE (arg) == TREE_LIST
+ && TREE_PURPOSE (arg)
+ && gimple_code (stmt) == GIMPLE_ASM)
+   find_decls_types (TREE_PURPOSE (arg), fld);
}
}
 }
--- gcc/testsuite/g++.dg/lto/pr91572_0.C.jj 2019-08-28 18:13:47.718349087 
+0200
+++ gcc/testsuite/g++.dg/lto/pr91572_0.C2019-08-28 18:13:41.695436342 
+0200
@@ -0,0 +1,12 @@
+// PR lto/91572
+// { dg-lto-do link }
+// { dg-lto-options { { -O -fPIC -flto } } }
+// { dg-require-effective-target shared }
+// { dg-require-effective-target fpic }
+// { dg-extra-ld-options "-shared" }
+
+void foo (char);
+namespace N {
+  class A { A (); };
+  A::A () { asm ("" : : "g" (0)); }
+}

Jakub


Go patch committed: Check for notinheap struct at each struct field

2019-08-30 Thread Ian Lance Taylor
This patch to the Go frontend checks for a notinheap struct at each
struct field.  When generating write barriers, we were only checking
for a notinheap struct at the outermost struct.  That mishandled the
case of setting a pointer to a notinheap struct as a field of another
struct that not notinheap.  This caused an invalid write barrier error
when building the 1.13 version of the runtime.  Bootstrapped and ran
Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 275239)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-289d94b9e6303ec74649d1f08d418300f2b4d0fd
+3b8a505824abb2a69f4c04c555a4ba29ab8b102b
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/wb.cc
===
--- gcc/go/gofrontend/wb.cc (revision 274890)
+++ gcc/go/gofrontend/wb.cc (working copy)
@@ -733,6 +733,31 @@ Gogo::assign_needs_write_barrier(
  && !lhs->type()->points_to()->in_heap())
return false;
 
+  // For a struct assignment, we don't need a write barrier if all
+  // the field types can not be in the heap.
+  Struct_type* st = lhs->type()->struct_type();
+  if (st != NULL)
+   {
+ bool in_heap = false;
+ const Struct_field_list* fields = st->fields();
+ for (Struct_field_list::const_iterator p = fields->begin();
+  p != fields->end();
+  p++)
+   {
+ Type* ft = p->type();
+ if (!ft->has_pointer())
+   continue;
+ if (!ft->in_heap())
+   continue;
+ if (ft->points_to() != NULL && !ft->points_to()->in_heap())
+   continue;
+ in_heap = true;
+ break;
+   }
+ if (!in_heap)
+   return false;
+   }
+
   Field_reference_expression* fre = lhs->field_reference_expression();
   if (fre != NULL)
{
@@ -788,31 +813,6 @@ Gogo::assign_needs_write_barrier(
   && this->is_nonwb_pointer(ue->operand(), nonwb_pointers))
 return false;
 
-  // For a struct assignment, we don't need a write barrier if all the
-  // pointer types can not be in the heap.
-  Struct_type* st = lhs->type()->struct_type();
-  if (st != NULL)
-{
-  bool in_heap = false;
-  const Struct_field_list* fields = st->fields();
-  for (Struct_field_list::const_iterator p = fields->begin();
-  p != fields->end();
-  p++)
-   {
- Type* ft = p->type();
- if (!ft->has_pointer())
-   continue;
- if (!ft->in_heap())
-   continue;
- if (ft->points_to() != NULL && !ft->points_to()->in_heap())
-   continue;
- in_heap = true;
- break;
-   }
-  if (!in_heap)
-   return false;
-}
-
   // Write barrier needed in other cases.
   return true;
 }


Go patch committed: Support and use single argument go:linkname

2019-08-30 Thread Ian Lance Taylor
The gc compiler has started permitting go:linkname comments with a
single argument to mean that a function should be externally visible
outside the package.  This patch implements this in the Go frontend.

We also change the libgo runtime package to use it, rather than
repeating the name just to export a function.

We also remove a couple of unnecessary go:linkname comments on declarations.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 275238)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-80403eb9e95c9642ebabb4d7c43deedaa763211f
+289d94b9e6303ec74649d1f08d418300f2b4d0fd
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/gogo.cc
===
--- gcc/go/gofrontend/gogo.cc   (revision 275227)
+++ gcc/go/gofrontend/gogo.cc   (working copy)
@@ -2531,9 +2531,22 @@ Gogo::add_linkname(const std::string& go
   if (no == NULL)
 go_error_at(loc, "%s is not defined", go_name.c_str());
   else if (no->is_function())
-no->func_value()->set_asm_name(ext_name);
+{
+  if (ext_name.empty())
+   no->func_value()->set_is_exported_by_linkname();
+  else
+   no->func_value()->set_asm_name(ext_name);
+}
   else if (no->is_function_declaration())
-no->func_declaration_value()->set_asm_name(ext_name);
+{
+  if (ext_name.empty())
+   go_error_at(loc,
+   ("//% missing external name "
+"for declaration of %s"),
+   go_name.c_str());
+  else
+   no->func_declaration_value()->set_asm_name(ext_name);
+}
   else
 go_error_at(loc,
("%s is not a function; "
@@ -5465,7 +5478,8 @@ Function::Function(Function_type* type,
 calls_recover_(false), is_recover_thunk_(false), has_recover_thunk_(false),
 calls_defer_retaddr_(false), is_type_specific_function_(false),
 in_unique_section_(false), export_for_inlining_(false),
-is_inline_only_(false), is_referenced_by_inline_(false)
+is_inline_only_(false), is_referenced_by_inline_(false),
+is_exported_by_linkname_(false)
 {
 }
 
@@ -6220,6 +6234,11 @@ Function::get_or_make_decl(Gogo* gogo, N
   if (this->is_referenced_by_inline_)
flags |= Backend::function_is_visible;
 
+  // A go:linkname directive can be used to force a function to be
+  // visible.
+  if (this->is_exported_by_linkname_)
+   flags |= Backend::function_is_visible;
+
   // If a function calls the predeclared recover function, we
   // can't inline it, because recover behaves differently in a
   // function passed directly to defer.  If this is a recover
Index: gcc/go/gofrontend/gogo.h
===
--- gcc/go/gofrontend/gogo.h(revision 274890)
+++ gcc/go/gofrontend/gogo.h(working copy)
@@ -547,7 +547,9 @@ class Gogo
   { this->file_block_names_[name] = location; }
 
   // Add a linkname, from the go:linkname compiler directive.  This
-  // changes the externally visible name of go_name to be ext_name.
+  // changes the externally visible name of GO_NAME to be EXT_NAME.
+  // If EXT_NAME is the empty string, GO_NAME is unchanged, but the
+  // symbol is made publicly visible.
   void
   add_linkname(const std::string& go_name, bool is_exported,
   const std::string& ext_name, Location location);
@@ -1359,6 +1361,11 @@ class Function
   set_asm_name(const std::string& asm_name)
   { this->asm_name_ = asm_name; }
 
+  // Mark this symbol as exported by a linkname directive.
+  void
+  set_is_exported_by_linkname()
+  { this->is_exported_by_linkname_ = true; }
+
   // Return the pragmas for this function.
   unsigned int
   pragmas() const
@@ -1706,6 +1713,9 @@ class Function
   // True if this function is referenced from an inlined body that
   // will be put into the export data.
   bool is_referenced_by_inline_ : 1;
+  // True if we should make this function visible to other packages
+  // because of a go:linkname directive.
+  bool is_exported_by_linkname_ : 1;
 };
 
 // A snapshot of the current binding state.
Index: gcc/go/gofrontend/lex.cc
===
--- gcc/go/gofrontend/lex.cc(revision 274614)
+++ gcc/go/gofrontend/lex.cc(working copy)
@@ -1996,7 +1996,7 @@ Lex::skip_cpp_comment()
 
  while (ps < pend && *ps != ' ' && *ps != '\t')
++ps;
- if (ps < pend)
+ if (ps <= pend)
go_name = std::string(pg, ps - pg);
  while (ps < pend && (*ps == ' ' || *ps == '\t'))
++ps;
@@ -2015,8 +2015,8 @@ Lex::skip_cpp_comment()
  ext_name.clear();
}
}
-   

Go patch committed: Don't report runtime escapes if we've seen errors

2019-08-30 Thread Ian Lance Taylor
If we get errors during a Go compilation, we skip the escape analysis
pass.  If we are compiling the runtime package, we report an error if
a bound method expression escapes.  The effect is that if we get an
error while compiling the runtime package, we would report confusing
and meaningless errors about bound method expressions escaping.

This patch stops doing that.

Bootstrapped and ran Go tests on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 275237)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-2444eb1e8c697531f8beb403679e4ab00b16dbf5
+80403eb9e95c9642ebabb4d7c43deedaa763211f
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 274998)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -7948,7 +7948,9 @@ Bound_method_expression::do_flatten(Gogo
   Node* n = Node::make_node(this);
   if ((n->encoding() & ESCAPE_MASK) == Node::ESCAPE_NONE)
 ret->heap_expression()->set_allocate_on_stack();
-  else if (gogo->compiling_runtime() && gogo->package_name() == "runtime")
+  else if (gogo->compiling_runtime()
+  && gogo->package_name() == "runtime"
+  && !saw_errors())
 go_error_at(loc, "%s escapes to heap, not allowed in runtime",
 n->ast_format(gogo).c_str());
 


Re: Go patch committed: Provide index information on bounds check failure

2019-08-30 Thread Ian Lance Taylor
On Thu, Aug 29, 2019 at 1:50 PM Andreas Schwab  wrote:
>
> On Aug 28 2019, Ian Lance Taylor  wrote:
>
> > This patch to the Go frontend and libgo changes the panic message
> > reported for an out of bounds index or slice operation to include the
> > invalid values.
>
> This breaks aarch64/-mabi=ilp32.
>
> aarch64-suse-linux/ilp32/libgo/archive/tar/check-testlog:
>
> /usr/aarch64-suse-linux/bin/ld: _gotest_.o: in function 
> `archive..z2ftar.Reader.next':
> /opt/gcc/gcc-20190829/Build/aarch64-suse-linux/ilp32/libgo/gotest1086/test/reader.go:72:
>  undefined reference to `runtime.goPanicExtendSliceAlen'

We should probably use a different GOARCH value for that build, such
as arm64p32.  But for now I'm committing this patch which will fix
both problems by just always building panic32.go.  Bootstrapped and
tested on x86_64-pc-linux-gnu.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 275227)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-11fd9208f8545e882f945d3ed86fcc33abf1a61b
+2444eb1e8c697531f8beb403679e4ab00b16dbf5
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/runtime/panic32.go
===
--- libgo/go/runtime/panic32.go (revision 274998)
+++ libgo/go/runtime/panic32.go (working copy)
@@ -2,8 +2,6 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
-// +build 386 amd64p32 arm mips mipsle m68k nios2 sh shbe
-
 package runtime
 
 import _ "unsafe" // for go:linkname


Re: [PATCH, V3, #4 of 10], Add general prefixed/pcrel support

2019-08-30 Thread Alan Modra
On Fri, Aug 30, 2019 at 11:35:11AM -0500, Segher Boessenkool wrote:
> > - unsigned HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
> > - return val + 0x8000 >= 0x1 - (TARGET_POWERPC64 ? 8 : 12);
> > + HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
> > + HOST_WIDE_INT extra = TARGET_POWERPC64 ? 8 : 12;
> 
> The 8 vs. 12 could use a comment (yes, I know it was there already).  Do you
> know what this is about, why it is 8 and 12?

"extra" here covers the increase in offset needed to access the memory
using multiple registers.  For example, when loading a TImode mem to
gprs you will load at offset+0 and offset+8 when powerpc64, and
offset+0, offset+4, offset+8, and offset+12 when powerpc32.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8.

2019-08-30 Thread Alan Modra
On Fri, Aug 30, 2019 at 09:42:06AM +0200, Uros Bizjak wrote:
> On Fri, Aug 30, 2019 at 9:22 AM Richard Biener
>  wrote:
> >
> > On Thu, Aug 29, 2019 at 9:54 AM Uros Bizjak  wrote:
> > >
> > > On Wed, Aug 28, 2019 at 5:12 PM Uros Bizjak  wrote:
> > > >
> > > > Attached patch improves costing for STV shifts and corrects reject
> > > > condition for out of range shift count operands.
> > > >
> > > > 2019-08-28  Uroš Bizjak  
> > > >
> > > > * config/i386/i386-features.c
> > > > (general_scalar_chain::compute_convert_gain):
> > > > Correct cost for double-word shifts.
> > > > (general_scalar_to_vector_candidate_p): Reject count operands
> > > > greater or equal to mode bitsize.
> > > >
> > > > Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
> > > >
> > > > Committed to mainline SVN.
> > >
> > > Ouch... I mixed up patches and actually committed the patch that
> > > removes maximum from cost of sse<->int moves.
> > >
> > > I can leave the patch for a day, so we can see the effects of the cost
> > > change, and if the patch creates problems, I'll revert it.
> >
> > Regresses gromacs and namd quite a bit on Haswell, also perl in SPEC 2000.
> > I guess we should try understand why rather than reverting immediately
> > (I'd leave it in at least another few days to get more testers pick up the
> > rev.).
> 
> The correct patch is actually [1] (I have updated the subject):
> 
> 2019-08-28  Uroš Bizjak  
> 
> * config/i386/i386.c (ix86_register_move_cost): Do not
> limit the cost of moves to/from XMM register to minimum 8.
> 
> There is no technical reason to limit the costs to minimum 8 anymore,
> and several targets (e.g skylake) also claim that moves between SSE
> and general regs are as cheap as moves between general regs. However,
> these values were never benchmarked properly due to the mentioned
> limitation and apparently cause some unwanted secondary effects
> (lowering the clock).
> 
> I agree with your proposal to leave the change in the tree some more
> time. At the end, we could raise the costs for all targets to 8 to
> effectively revert to the previous state.
> 
> [1] https://gcc.gnu.org/ml/gcc-patches/2019-08/msg02009.html

Those SPEC regressions sound similar to what I saw when trying to give
proper costing to moves between general regs and vsx regs on Power9.
rs6000.c:rs6000_ira_change_pseudo_allocno_class is the hack I used to
work around bad register allocation decisions due to poor register
pressure calculation.

-- 
Alan Modra
Australia Development Lab, IBM


[Committed] PR fortran/91587 -- A syntax error should occur

2019-08-30 Thread Steve Kargl
This is obvious.

2019-08-30  Steven G. Kargl  

PR fortran/91587
* io.c (match_filepos): MATCH_ERROR should branch to a syntax error.

2019-08-30  Steven G. Kargl  

PR fortran/91587
* gfortran.dg/pr91587.f90: New test.

svn commit gcc/fortran/io.c gcc/fortran/ChangeLog gcc/testsuite/ChangeLog \
gcc/testsuite/gfortran.dg/pr91587.f90


Index: gcc/fortran/io.c
===
--- gcc/fortran/io.c(revision 275048)
+++ gcc/fortran/io.c(working copy)
@@ -2845,7 +2845,7 @@ match_filepos (gfc_statement st, gfc_exec_op op)
 
   m = match_file_element (fp);
   if (m == MATCH_ERROR)
-goto done;
+goto syntax;
   if (m == MATCH_NO)
 {
   m = gfc_match_expr (&fp->unit);
Index: gcc/testsuite/gfortran.dg/pr91587.f90
===
--- gcc/testsuite/gfortran.dg/pr91587.f90   (nonexistent)
+++ gcc/testsuite/gfortran.dg/pr91587.f90   (working copy)
@@ -0,0 +1,12 @@
+! { dg-do compile }
+! PR fortran/91587
+! Code contributed by Gerhard Steinmetz
+program p
+   backspace(err=!)  ! { dg-error "Syntax error in" }
+   flush(err=!)  ! { dg-error "Syntax error in" }
+   rewind(err=!) ! { dg-error "Syntax error in" }
+end
+
+subroutine bar   ! An other matcher runs, and gives a different error.
+   endfile(err=!)! { dg-error "Expecting END" }
+end

-- 
Steve


Re: [PATCH 2/2] gcc/riscv: Add a mechanism to remove some calls to _riscv_save_0

2019-08-30 Thread Jim Wilson
On Sun, Aug 25, 2019 at 9:40 AM Jim Wilson  wrote:
> The problem with the linux toolchain support for -msave-restore is
> that we forgot to add a version script to export the symbols.  I made
> the obvious change to add a RISC-V specific libgcc version script, and
> now everything in the g++ and gfortran testsuites are linking, but the
> execution tests are all timing out.  So there is still something
> wrong.  I need to do some more work here.

Following up on this, I found a linker bug where it was emitting error
messages for non-pic code in shared libraries, but wasn't exiting with
an error, so bad shared libraries were being produced without killing
the build.  I wrote and committed a binutils patch to fix that.  The
underlying problem is that the save/restore functions were always
called as non-pic, even for a shared library.  But fixing that
produced shared libraries that failed again, which turned out because
the save/restore functions use the alternate link register t0/x5 which
is clobbered by plts, so we can't call them in shared libraries at
all.  I wrote and committed a gcc patch to disable -msave-restore when
-fpic is used, and emit a warning message if the user explicitly
turned on -msave-restore.  Since we can't use the save/restore
functions in shared libraries, we don't need to export them or support
pic calls to them, and I dropped those patches.  With these changes
I'm now able to do -msave-restore testing with a linux toolchain.
Testing with and without your patch for a riscv64-linux toolchain, I
see 11 extra gcc failures, 2 extra g++ failures, and 8 extra gfortran
failures.  I didn't look at the details.  I suspect that most of these
are failing for the same reason, and there is only a couple of minor
bugs that need to be fixed to make your patch work.

Jim


Re: [ PATCH ] C++20

2019-08-30 Thread JeanHeyd Meneide
Ahem -- we were supposed to use the 20 version of the constexpr macro,
not the base one.
I will note that, for some reason, the default constructor was already
constexpr, so we don't change that one!

On Fri, Aug 30, 2019 at 5:11 PM JeanHeyd Meneide
 wrote:
>
> On Fri, Aug 30, 2019 at 3:41 PM Jonathan Wakely  wrote:
> >
> > On 30/08/19 15:22 -0400, JeanHeyd Meneide wrote:
> > >This patch implements  as it currently exists in the C++20 Working 
> > >Draft.
> >
> > Nice!
> >
> >
> > >Notes:
> > >- __gnu_cxx::__normal_iterator is not fully constexpr, so its not used here
> >
> > I'd prefer to make __normal_iterator constexpr, and use it.
> > It needs to be constexpr anyway for string and vector.
> > ...
>
> Alright, thank you for the feedback. I fixed it up!
>
> Tested x86_64-pc-linux-gnu.
>
> 2019-08-30  JeanHeyd "ThePhD" Meneide  
>
> * include/std/span: Implement the entirety of span.
> * include/bits/stl_iterator.h: __gnu_cxx::__normal_iterator C> is now constexpr-qualified for C++11+.
>* include/bits/range_access.h: Add __adl_* versions of access 
> functions.
> * testsuite/23_containers/span/everything.cc: constexpr and
> non-constexpr tests.
> * include/Makefile.in: Add span to install.
> * include/Makefile.am: Likewise
diff --git a/.gitignore b/.gitignore
index d9d3967a12c..fd116c362ac 100644
--- a/.gitignore
+++ b/.gitignore
@@ -57,3 +57,7 @@ REVISION
 /mpc*
 /gmp*
 /isl*
+
+# ignore sprinkled in dev files that keep popping up
+.vscode/
+.vs/
diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 3fe80f32cc4..b8b786d9260 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -68,6 +68,7 @@ std_headers = \
${std_srcdir}/scoped_allocator \
${std_srcdir}/set \
${std_srcdir}/shared_mutex \
+   ${std_srcdir}/span \
${std_srcdir}/sstream \
${std_srcdir}/stack \
${std_srcdir}/stdexcept \
diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
index b675d356cd4..cd1e9df5482 100644
--- a/libstdc++-v3/include/Makefile.in
+++ b/libstdc++-v3/include/Makefile.in
@@ -412,6 +412,7 @@ std_headers = \
${std_srcdir}/scoped_allocator \
${std_srcdir}/set \
${std_srcdir}/shared_mutex \
+   ${std_srcdir}/span \
${std_srcdir}/sstream \
${std_srcdir}/stack \
${std_srcdir}/stdexcept \
diff --git a/libstdc++-v3/include/bits/range_access.h 
b/libstdc++-v3/include/bits/range_access.h
index d1e74711433..3acaebadcf1 100644
--- a/libstdc++-v3/include/bits/range_access.h
+++ b/libstdc++-v3/include/bits/range_access.h
@@ -318,6 +318,72 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #endif // C++17
 
+#if __cplusplus > 201703L
+  // "why are these in namespace std:: and not __gnu_cxx:: ?"
+  // because if we don't put them here it's impossible to 
+  // have implicit ADL with "using std::begin/end/size/data;".
+  template 
+constexpr auto
+__adl_begin(_Container& __cont) noexcept(noexcept(begin(__cont)))
+{ return begin(__cont); }
+  
+  template 
+constexpr auto
+__adl_end(_Container& __cont) noexcept(noexcept(end(__cont)))
+{ return end(__cont); }
+  
+  template 
+constexpr auto
+__adl_cbegin(_Container& __cont) noexcept(noexcept(cbegin(__cont)))
+{ return cbegin(__cont); }
+  
+  template 
+constexpr auto
+__adl_cend(_Container& __cont) noexcept(noexcept(cend(__cont)))
+{ return cend(__cont); }
+
+  template 
+constexpr auto
+__adl_rbegin(_Container& __cont) noexcept(noexcept(rbegin(__cont)))
+{ return rbegin(__cont); }
+  
+  template 
+constexpr auto
+__adl_rend(_Container& __cont) noexcept(noexcept(rend(__cont)))
+{ return rend(__cont); }
+  
+  template 
+constexpr auto
+__adl_crbegin(_Container& __cont) noexcept(noexcept(crbegin(__cont)))
+{ return crbegin(__cont); }
+  
+  template 
+constexpr auto
+__adl_crend(_Container& __cont) noexcept(noexcept(crend(__cont)))
+{ return crend(__cont); }
+  
+  template 
+constexpr auto
+__adl_data(_Container& __cont) noexcept(noexcept(data(__cont)))
+{ return data(__cont); }
+  
+  template 
+constexpr auto
+__adl_cdata(_Container& __cont) noexcept(noexcept(cdata(__cont)))
+{ return cdata(__cont); }
+  
+  template 
+constexpr auto
+__adl_size(_Container& __cont) noexcept(noexcept(size(__cont)))
+{ return size(__cont); }
+  
+  template 
+constexpr auto
+__adl_empty(_Container& __cont) noexcept(noexcept(empty(__cont)))
+{ return empty(__cont); }
+  
+#endif // C++20
+
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 
diff --git a/libstdc++-v3/include/bits/stl_iterator.h 
b/libstdc++-v3/include/bits/stl_iterator.h
index 8ab0d72b0c2..f6a6060f2e3 100644
--- a/libstdc++-v3/include/bits/stl_iterator.h
+++ b/libstdc++-v3/include/bits/stl_iterator.h
@@ -799,55 +799,58 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   typedef type

[PATCH] RISC-V: Disable -msave-restore for shared libraries.

2019-08-30 Thread Jim Wilson
This was noticed while trying to test -msave-restore support.  The
save/restore routines use the alternate return register t0/x5 which is
clobbered by the PLT header, so we can't use them in shared libraries.
This patch disables -msave-restore when -fpic (and -mplt), and emits a
warning if the user explicitly turned on -msave-restore.

Tested with a riscv64-linux cross compiler build and make check with the
option -msave-restore hardwired on.  There are thousands of failures without
the patch.  With the patch I get a reasonable number of failures.  Also
tested by hand to verify I get the warning when the user specifies the option.

Committed.

Jim

gcc/
* config/riscv/riscv.c (riscv_option_override): If -msave-restore
and -fpic and -mplt then disable -msave-restore and warn.
---
 gcc/config/riscv/riscv.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 9b16a1eb9c2..1e7528f1cb9 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -4636,6 +4636,16 @@ riscv_option_override (void)
 error ("%<-mriscv-attribute%> RISC-V ELF attribute requires GNU as 2.32"
   " [%<-mriscv-attribute%>]");
 #endif
+
+  /* The save-restore routines use t0 which is clobbered by the plt header,
+ so we can't use them when building shared libraries.  */
+  if (TARGET_SAVE_RESTORE && flag_pic && TARGET_PLT)
+{
+  target_flags &= ~MASK_SAVE_RESTORE;
+  if (target_flags_explicit & MASK_SAVE_RESTORE)
+   warning (0, "%<-msave-restore%> disabled; not supported with PLT "
+"based shared libraries");
+}
 }
 
 /* Implement TARGET_CONDITIONAL_REGISTER_USAGE.  */
-- 
2.17.1



Go patch committed: Permit anonymous and empty fields in C header

2019-08-30 Thread Ian Lance Taylor
This patch to the Go frontend permits putting structs with anonymous
and empty fields in the C header file runtime.inc that is used to
build the C runtime code.  This is required for upcoming 1.13 support,
as the m struct has picked up an anonymous field.

Doing this lets the C header contain all the type descriptor structs,
so start using those in the C code.  This cuts the number of copies of
type descriptor definitions from 3 to 2.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 275010)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-db738935c77443840994e5a9f77e619e67a4c43a
+11fd9208f8545e882f945d3ed86fcc33abf1a61b
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/gogo.cc
===
--- gcc/go/gofrontend/gogo.cc   (revision 274998)
+++ gcc/go/gofrontend/gogo.cc   (working copy)
@@ -5238,11 +5238,11 @@ Gogo::write_c_header()
   // package they are mostly types defined by mkrsysinfo.sh based
   // on the C system header files.  We don't need to translate
   // types to C and back to Go.  But do accept the special cases
-  // _defer and _panic.
+  // _defer, _panic, and _type.
   std::string name = Gogo::unpack_hidden_name(no->name());
   if (name[0] == '_'
  && (name[1] < 'A' || name[1] > 'Z')
- && (name != "_defer" && name != "_panic"))
+ && (name != "_defer" && name != "_panic" && name != "_type"))
continue;
 
   if (no->is_type() && no->type_value()->struct_type() != NULL)
Index: gcc/go/gofrontend/types.cc
===
--- gcc/go/gofrontend/types.cc  (revision 274935)
+++ gcc/go/gofrontend/types.cc  (working copy)
@@ -6777,8 +6777,6 @@ Struct_type::can_write_to_c_header(
p != fields->end();
++p)
 {
-  if (p->is_anonymous())
-   return false;
   if (!this->can_write_type_to_c_header(p->type(), requires, declare))
return false;
   if (Gogo::message_name(p->field_name()) == "_")
@@ -6847,6 +6845,9 @@ Struct_type::can_write_type_to_c_header(
  }
if (t->struct_type() != NULL)
  {
+   // We will accept empty struct fields, but not print them.
+   if (t->struct_type()->total_field_count() == 0)
+ return true;
requires->push_back(no);
return t->struct_type()->can_write_to_c_header(requires, declare);
  }
@@ -6871,6 +6872,12 @@ Struct_type::write_to_c_header(std::ostr
p != fields->end();
++p)
 {
+  // Skip fields that are empty struct types.  The C code can't
+  // refer to them anyhow.
+  if (p->type()->struct_type() != NULL
+ && p->type()->struct_type()->total_field_count() == 0)
+   continue;
+
   os << '\t';
   this->write_field_to_c_header(os, p->field_name(), p->type());
   os << ';' << std::endl;
Index: libgo/go/reflect/makefunc_ffi_c.c
===
--- libgo/go/reflect/makefunc_ffi_c.c   (revision 274169)
+++ libgo/go/reflect/makefunc_ffi_c.c   (working copy)
@@ -3,7 +3,6 @@
 // license that can be found in the LICENSE file.
 
 #include "runtime.h"
-#include "go-type.h"
 
 #ifdef USE_LIBFFI
 
Index: libgo/mkruntimeinc.sh
===
--- libgo/mkruntimeinc.sh   (revision 274998)
+++ libgo/mkruntimeinc.sh   (working copy)
@@ -29,6 +29,12 @@ do
   sed -e '/struct '${TYPE}' {/,/^}/s/^.*$//' runtime.inc.tmp2 > 
runtime.inc.tmp3;
   mv runtime.inc.tmp3 runtime.inc.tmp2
 done
-sed -e 's/sigset/sigset_go/' runtime.inc.tmp2 > ${OUT}
+sed -e 's/sigset/sigset_go/' runtime.inc.tmp2 > runtime.inc.tmp3
+mv runtime.inc.tmp3 runtime.inc.tmp2
+
+# Make all the fields of type structs const.
+sed -e '/struct .*type {/,/^}/ s/  \(.*;\)/const \1/' < 
runtime.inc.tmp2 > runtime.inc.tmp3
+mv -f runtime.inc.tmp3 ${OUT}
+
 rm -f runtime.inc.tmp2 runtime.inc.tmp3
 exit 0
Index: libgo/runtime/go-construct-map.c
===
--- libgo/runtime/go-construct-map.c(revision 274169)
+++ libgo/runtime/go-construct-map.c(working copy)
@@ -9,20 +9,17 @@
 #include 
 
 #include "runtime.h"
-#include "go-type.h"
 
-extern void *makemap (const struct __go_map_type *, intgo hint,
- void *)
+extern void *makemap (const struct maptype *, intgo hint, void *)
   __asm__ (GOSYM_PREFIX "runtime.makemap");
 
-extern void *mapassign (const struct __go_map_type *, void *hmap,
-   const void *key)
+extern void *mapassign (const struct maptype *, void *hmap, const void *key)
   __asm__ (G

Re: [PATCH] or1k: Fix issue with set_got clobbering r9

2019-08-30 Thread Stafford Horne
On Fri, Aug 30, 2019 at 08:21:56AM -0700, Richard Henderson wrote:
> LGTM.

Thank you.
 
> On 8/30/19 2:31 AM, Stafford Horne wrote:
> > Hello, any comments on this?
> > 
> > If nothing I will commit in a few days.
> > 
> > On Thu, Aug 22, 2019 at 08:44:04PM +0900, Stafford Horne wrote:
> >> When compiling glibc we found that the GOT register was being allocated
> >> r9 when the instruction was still set_got_tmp.  That caused set_got to
> >> clobber r9.  We cannot simply say set_got_tmp clobbers r9 as this is the
> >> reason for having the temporary set_got_tmp.
> >>
> >> Fix by using a register class constraint that does not allow r9 during
> >> register allocation.
> >>
> >> gcc/ChangeLog:
> >>
> >>* config/or1k/constraints.md (t): New constraint.
> >>* config/or1k/or1k.h (GOT_REGS): New register class.
> >>* config/or1k/or1k.md (set_got_tmp, set_got): Use t contraint.
> >> ---
> >>  gcc/config/or1k/constraints.md | 4 
> >>  gcc/config/or1k/or1k.h | 3 +++
> >>  gcc/config/or1k/or1k.md| 4 ++--
> >>  3 files changed, 9 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/gcc/config/or1k/constraints.md 
> >> b/gcc/config/or1k/constraints.md
> >> index 8cac7eb5329..ba330c6b8c2 100644
> >> --- a/gcc/config/or1k/constraints.md
> >> +++ b/gcc/config/or1k/constraints.md
> >> @@ -25,6 +25,7 @@
> >>  ; We use:
> >>  ;  c - sibcall registers
> >>  ;  d - double pair base registers (excludes r0, r30 and r31 which 
> >> overflow)
> >> +;  t - got address registers (excludes r9 is clobbered by set_got)
> > 
> > I will changee this to (... r9 which is clobbered ...)
> > 
> >>  ;  I - constant signed 16-bit
> >>  ;  K - constant unsigned 16-bit
> >>  ;  M - constant signed 16-bit shifted left 16-bits (l.movhi)
> >> @@ -36,6 +37,9 @@
> >>  (define_register_constraint "d" "DOUBLE_REGS"
> >>"Registers which can be used for double reg pairs.")
> >>  
> >> +(define_register_constraint "t" "GOT_REGS"
> >> +  "Registers which can be used to store the Global Offset Table (GOT) 
> >> address.")
> >> +
> >>  ;; Immediates
> >>  (define_constraint "I"
> >>"A signed 16-bit immediate in the range -32768 to 32767."
> >> diff --git a/gcc/config/or1k/or1k.h b/gcc/config/or1k/or1k.h
> >> index 2b29e62fdd3..4c32607bac1 100644
> >> --- a/gcc/config/or1k/or1k.h
> >> +++ b/gcc/config/or1k/or1k.h
> >> @@ -190,6 +190,7 @@ enum reg_class
> >>NO_REGS,
> >>SIBCALL_REGS,
> >>DOUBLE_REGS,
> >> +  GOT_REGS,
> >>GENERAL_REGS,
> >>FLAG_REGS,
> >>ALL_REGS,
> >> @@ -202,6 +203,7 @@ enum reg_class
> >>"NO_REGS",  \
> >>"SIBCALL_REGS", \
> >>"DOUBLE_REGS",  \
> >> +  "GOT_REGS", \
> >>"GENERAL_REGS", \
> >>"FLAG_REGS",\
> >>"ALL_REGS" }
> >> @@ -215,6 +217,7 @@ enum reg_class
> >>  { { 0x, 0x }, \
> >>{ SIBCALL_REGS_MASK,   0 }, \
> >>{ 0x7f7e, 0x }, \
> >> +  { 0xfdff, 0x }, \
> >>{ 0x, 0x0003 }, \
> >>{ 0x, 0x0004 }, \
> >>{ 0x, 0x0007 }  \
> >> diff --git a/gcc/config/or1k/or1k.md b/gcc/config/or1k/or1k.md
> >> index cee11d078cc..36bcee336ab 100644
> >> --- a/gcc/config/or1k/or1k.md
> >> +++ b/gcc/config/or1k/or1k.md
> >> @@ -706,7 +706,7 @@
> >>  ;; set_got pattern below.  This works because the set_got_tmp insn is the
> >>  ;; first insn in the stream and that it isn't moved during RA.
> >>  (define_insn "set_got_tmp"
> >> -  [(set (match_operand:SI 0 "register_operand" "=r")
> >> +  [(set (match_operand:SI 0 "register_operand" "=t")
> >>(unspec_volatile:SI [(const_int 0)] UNSPECV_SET_GOT))]
> >>""
> >>  {
> >> @@ -715,7 +715,7 @@
> >>  
> >>  ;; The insn to initialize the GOT.
> >>  (define_insn "set_got"
> >> -  [(set (match_operand:SI 0 "register_operand" "=r")
> >> +  [(set (match_operand:SI 0 "register_operand" "=t")
> >>(unspec:SI [(const_int 0)] UNSPEC_SET_GOT))
> >> (clobber (reg:SI LR_REGNUM))]
> >>""
> >> -- 
> >> 2.21.0
> >>
> 


Re: [ PATCH ] C++20

2019-08-30 Thread JeanHeyd Meneide
On Fri, Aug 30, 2019 at 3:41 PM Jonathan Wakely  wrote:
>
> On 30/08/19 15:22 -0400, JeanHeyd Meneide wrote:
> >This patch implements  as it currently exists in the C++20 Working 
> >Draft.
>
> Nice!
>
>
> >Notes:
> >- __gnu_cxx::__normal_iterator is not fully constexpr, so its not used here
>
> I'd prefer to make __normal_iterator constexpr, and use it.
> It needs to be constexpr anyway for string and vector.
> ...

Alright, thank you for the feedback. I fixed it up!

Tested x86_64-pc-linux-gnu.

2019-08-30  JeanHeyd "ThePhD" Meneide  

* include/std/span: Implement the entirety of span.
* include/bits/stl_iterator.h: __gnu_cxx::__normal_iterator is now constexpr-qualified for C++11+.
   * include/bits/range_access.h: Add __adl_* versions of access functions.
* testsuite/23_containers/span/everything.cc: constexpr and
non-constexpr tests.
* include/Makefile.in: Add span to install.
* include/Makefile.am: Likewise
diff --git a/.gitignore b/.gitignore
index d9d3967a12c..fd116c362ac 100644
--- a/.gitignore
+++ b/.gitignore
@@ -57,3 +57,7 @@ REVISION
 /mpc*
 /gmp*
 /isl*
+
+# ignore sprinkled in dev files that keep popping up
+.vscode/
+.vs/
diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 3fe80f32cc4..b8b786d9260 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -68,6 +68,7 @@ std_headers = \
${std_srcdir}/scoped_allocator \
${std_srcdir}/set \
${std_srcdir}/shared_mutex \
+   ${std_srcdir}/span \
${std_srcdir}/sstream \
${std_srcdir}/stack \
${std_srcdir}/stdexcept \
diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
index b675d356cd4..cd1e9df5482 100644
--- a/libstdc++-v3/include/Makefile.in
+++ b/libstdc++-v3/include/Makefile.in
@@ -412,6 +412,7 @@ std_headers = \
${std_srcdir}/scoped_allocator \
${std_srcdir}/set \
${std_srcdir}/shared_mutex \
+   ${std_srcdir}/span \
${std_srcdir}/sstream \
${std_srcdir}/stack \
${std_srcdir}/stdexcept \
diff --git a/libstdc++-v3/include/bits/range_access.h 
b/libstdc++-v3/include/bits/range_access.h
index d1e74711433..3acaebadcf1 100644
--- a/libstdc++-v3/include/bits/range_access.h
+++ b/libstdc++-v3/include/bits/range_access.h
@@ -318,6 +318,72 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #endif // C++17
 
+#if __cplusplus > 201703L
+  // "why are these in namespace std:: and not __gnu_cxx:: ?"
+  // because if we don't put them here it's impossible to 
+  // have implicit ADL with "using std::begin/end/size/data;".
+  template 
+constexpr auto
+__adl_begin(_Container& __cont) noexcept(noexcept(begin(__cont)))
+{ return begin(__cont); }
+  
+  template 
+constexpr auto
+__adl_end(_Container& __cont) noexcept(noexcept(end(__cont)))
+{ return end(__cont); }
+  
+  template 
+constexpr auto
+__adl_cbegin(_Container& __cont) noexcept(noexcept(cbegin(__cont)))
+{ return cbegin(__cont); }
+  
+  template 
+constexpr auto
+__adl_cend(_Container& __cont) noexcept(noexcept(cend(__cont)))
+{ return cend(__cont); }
+
+  template 
+constexpr auto
+__adl_rbegin(_Container& __cont) noexcept(noexcept(rbegin(__cont)))
+{ return rbegin(__cont); }
+  
+  template 
+constexpr auto
+__adl_rend(_Container& __cont) noexcept(noexcept(rend(__cont)))
+{ return rend(__cont); }
+  
+  template 
+constexpr auto
+__adl_crbegin(_Container& __cont) noexcept(noexcept(crbegin(__cont)))
+{ return crbegin(__cont); }
+  
+  template 
+constexpr auto
+__adl_crend(_Container& __cont) noexcept(noexcept(crend(__cont)))
+{ return crend(__cont); }
+  
+  template 
+constexpr auto
+__adl_data(_Container& __cont) noexcept(noexcept(data(__cont)))
+{ return data(__cont); }
+  
+  template 
+constexpr auto
+__adl_cdata(_Container& __cont) noexcept(noexcept(cdata(__cont)))
+{ return cdata(__cont); }
+  
+  template 
+constexpr auto
+__adl_size(_Container& __cont) noexcept(noexcept(size(__cont)))
+{ return size(__cont); }
+  
+  template 
+constexpr auto
+__adl_empty(_Container& __cont) noexcept(noexcept(empty(__cont)))
+{ return empty(__cont); }
+  
+#endif // C++20
+
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 
diff --git a/libstdc++-v3/include/bits/stl_iterator.h 
b/libstdc++-v3/include/bits/stl_iterator.h
index 8ab0d72b0c2..4d432da0c6b 100644
--- a/libstdc++-v3/include/bits/stl_iterator.h
+++ b/libstdc++-v3/include/bits/stl_iterator.h
@@ -803,51 +803,51 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   : _M_current(_Iterator()) { }
 
   explicit
-  __normal_iterator(const _Iterator& __i) _GLIBCXX_NOEXCEPT
+  _GLIBCXX_CONSTEXPR __normal_iterator(const _Iterator& __i) 
_GLIBCXX_NOEXCEPT
   : _M_current(__i) { }
 
   // Allow iterator to const_iterator conversion
   template
-__normal_iterator(const 

Re: [PATCH] Optimize to_chars

2019-08-30 Thread Jonathan Wakely

On 30/08/19 22:46 +0300, Antony Polukhin wrote:

пт, 30 авг. 2019 г. в 19:01, Jonathan Wakely :
<...>

Have you tried comparing the improved code to libc++'s implementation?
I believe they use precomputed arrays of digits, but they use larger
arrays that allow 4 bytes to be written at once, which is considerably
faster (and those precomputed arrays libe in libc++.so not in the
header). Would we be better off keeping the precomputed arrays and
expanding them to do 4-byte writes?


This would not do good for bases 2, 8 and 16. For power of two bases
there is no costly `mod` or `div` instructions, only bit operations.
By increasing the digits table size the cache misses become more
likely.


OK, thanks. I'll try benchmarking your improved code against the
libc++ version next week.



Re: [PATCH] Optimize to_chars

2019-08-30 Thread Antony Polukhin
пт, 30 авг. 2019 г. в 19:01, Jonathan Wakely :
<...>
> Have you tried comparing the improved code to libc++'s implementation?
> I believe they use precomputed arrays of digits, but they use larger
> arrays that allow 4 bytes to be written at once, which is considerably
> faster (and those precomputed arrays libe in libc++.so not in the
> header). Would we be better off keeping the precomputed arrays and
> expanding them to do 4-byte writes?

This would not do good for bases 2, 8 and 16. For power of two bases
there is no costly `mod` or `div` instructions, only bit operations.
By increasing the digits table size the cache misses become more
likely.

For base 10... A few decades ago it was definitely a good idea to have
a big array of digits. Nowadays compilers know how to replace `__val /
10` and `__val % 10` with much cheaper multiplications and bit magic.
This is not as cheap as bit operations for power of two bases, but
some cache misses could nullify the advantage.

I need to experiment with base 10. There are ways to compress the
digits table, but it requires benchmarking. Because of that this patch
does not touch the __detail::__to_chars_10.

>
> Since we don't have a patch to do that, I think I'll commit yours. We
> can always go back to precomputed arrays later if somebody does that
> work.
>
> My only comments are on the changelog:
>
> >Changelog:
> >* include/std/charconv (__detail::__to_chars_8,
> >__detail::__to_chars_16): Replace array of precomputed digits
>
> When the list of changed functions is split across lines it should be
> like this:
>
> * include/std/charconv (__detail::__to_chars_8)
> (__detail::__to_chars_16): Replace array of precomputed digits
>
> i.e close the parentheses before the line break, and reopen on the
> next line.
>
> >with arithmetic operations to avoid CPU cache misses. Remove
> >zero termination from array of digits to allow symbol merge with
> >generic implementation of __detail::__to_chars. Replace final
> >offsets with constants. Use __detail::__to_chars_len_2 instead
> >of a generic __detail::__to_chars_len.
> >* include/std/charconv (__detail::__to_chars): Remove
>
> Don't repeat the asterisk and filename for later changes in the same
> file, i.e.
>
> (__detail::__to_chars): Remove zero termination from array of digits.
> (__detail::__to_chars_2): Leading digit is always '1'.
>
> There's no changelog entry for the changes to __to_chars_len_8 and
> __to_chars_len_16.
>
> Also, please don't include the ChangeLog diff in the patch, because it
> just makes it hard to apply the patch (the ChangeLog part will almost
> always fail to apply because somebody else will have modified the
> ChangeLog file since you created the patch, and so that hunk won't
> apply. The ChangeLog text should be sent as a separate (plain text)
> attachment, or just in the email body (as you did above).

Thanks! I'll take it into account next time.

-- 
Best regards,
Antony Polukhin


Re: [ PATCH ] C++20

2019-08-30 Thread Jonathan Wakely

On 30/08/19 15:22 -0400, JeanHeyd Meneide wrote:

This patch implements  as it currently exists in the C++20 Working Draft.


Nice!



Notes:
- __gnu_cxx::__normal_iterator is not fully constexpr, so its not used here


I'd prefer to make __normal_iterator constexpr, and use it.
It needs to be constexpr anyway for string and vector.


- P1394 might be slated to end up in C++20 per National Body Comments.
Therefore, an early implementation is left in an
implementation-defined _GLIBCXX_P1394 define.

2019-08-30  JeanHeyd "ThePhD" Meneide  

   * include/std/span: Implement the entirety of span.
   * include/bits/range_access.h: Add __adl_* versions of access functions.
   * testsuite/23_containers/span/everything.cc: constexpr and
non-constexpr tests.
   * include/Makefile.in: Add span to install.
   * include/Makefile.am: Likewise



+++ b/libstdc++-v3/include/std/span
@@ -0,0 +1,549 @@
+// Components for manipulating non-owning sequences of objects -*- C++ -*-
+
+// Copyright (C) 2019-2019 Free Software Foundation, Inc.


Just 2019 please, not 2019-2019.


+// WARNING: they forgot this feature test macro
+// get on someone's back about it in Belfast!!!


Please use FIXME: instead of WARNING: (for consistency with the rest
of the sources, so people grepping for FIXME: can find this).

The new feature test macro should be in  too.

There's no need to qualify ::std::true_type and ::std::size_t when
within namespace std already. There's no ADL for type names, and
normal unqualified lookup will find the right ones (and is easier to
read).


+  static_assert(
+_Count == ::std::dynamic_extent || _Extent == ::std::dynamic_extent || 
_Count <= _Extent,
+"bad span length");


There are a number of lines that are too long, they need to be broken
before 80 columns.

Our static_assert messages should be stated as the positive condition
that is being asserted. So the diagnostic reads like
"assertion failed: thing being asserted"

So "bad span length" makes it look like we asserted the length is bad,
but actually it was good. I prefer to write something saying "X must
be true", e.g. "count must be equal to dynamic_extent, or less than
the span's extent".

Do we need to check _Extent == ::std::dynamic_extent here, give nthat
if it's true then _Count <= _Extent will be true as well?


+  std::vector value{ 0 };


Your new testcase uses std::vector without including .




Re: [PATCH] Optimize to_chars

2019-08-30 Thread Jonathan Wakely

On 30/08/19 11:03 -0600, Martin Sebor wrote:

On 8/30/19 8:27 AM, Antony Polukhin wrote:

Bunch of micro optimizations for std::to_chars:
* For base == 8 replacing the lookup in __digits table with arithmetic
computations leads to a same CPU cycles for a loop (exchanges two
movzx with 3 bit ops https://godbolt.org/z/RTui7m ). However this
saves 129 bytes of data and totally avoids a chance of cache misses on
__digits.
* For base == 16 replacing the lookup in __digits table with
arithmetic computations leads to a few additional instructions, but
totally avoids a chance of cache misses on __digits (- ~9 cache misses
for worst case) and saves 513 bytes of const data.
* Replacing __first[pos] and __first[pos - 1] with __first[1] and
__first[0] on final iterations saves ~2% of code size.
* Removing trailing '\0' from arrays of digits allows the linker to
merge the symbols (so that "0123456789abcdefghijklmnopqrstuvwxyz" and
"0123456789abcdef" could share the same address). This improves data
locality and reduces binary sizes.
* Using __detail::__to_chars_len_2 instead of a generic
__detail::__to_chars_len makes the operation O(1) instead of O(N). It
also makes the code two times shorter ( https://godbolt.org/z/Peq_PG)
.

In sum: this significantly reduces the size of a binary (for about
4KBs only for base-8 conversion https://godbolt.org/z/WPKijS ), deals
with latency (CPU cache misses) without changing the iterations count
and without adding costly instructions into the loops.


Would it make sense to move some of this code into GCC as
a built-in so that it could also be used by GCC to expand
some strtol and sprintf calls?


Makes sense, although we'd still need it in libstdc++ until Clang and
EDG implement the same built-in.





[ PATCH ] C++20

2019-08-30 Thread JeanHeyd Meneide
This patch implements  as it currently exists in the C++20 Working Draft.

Notes:
- __gnu_cxx::__normal_iterator is not fully constexpr, so its not used here
- P1394 might be slated to end up in C++20 per National Body Comments.
Therefore, an early implementation is left in an
implementation-defined _GLIBCXX_P1394 define.

2019-08-30  JeanHeyd "ThePhD" Meneide  

* include/std/span: Implement the entirety of span.
* include/bits/range_access.h: Add __adl_* versions of access functions.
* testsuite/23_containers/span/everything.cc: constexpr and
non-constexpr tests.
* include/Makefile.in: Add span to install.
* include/Makefile.am: Likewise


ThePhD.span.patch
Description: Binary data


Re: [PATCH, V3, #5 of 10], Make -mpcrel default on little endian Linux systems

2019-08-30 Thread Segher Boessenkool
On Mon, Aug 26, 2019 at 05:07:25PM -0400, Michael Meissner wrote:
> +/* By default enable support for pc-relative and numeric prefixed addressing 
> on
> +   the 'future' system, unless it is overriden at build time.  */
> +#ifndef TARGET_PREFIXED_ADDR_DEFAULT
> +#define TARGET_PREFIXED_ADDR_DEFAULT 1
> +#endif
> +
> +#if !defined (TARGET_PCREL_DEFAULT) && TARGET_PREFIXED_ADDR_DEFAULT
> +#define TARGET_PCREL_DEFAULT 1
> +#endif

Spelling ("overridden").

How can it be overridden at build time?

How can it be defined already, when linux64.h is included?  Don't put in
guards against things that cannot happen.


> +  if (TARGET_FUTURE)
> +{
> +  bool explicit_prefixed = ((rs6000_isa_flags_explicit
> +  & OPTION_MASK_PREFIXED_ADDR) != 0);
> +  bool explicit_pcrel = ((rs6000_isa_flags_explicit
> +   & OPTION_MASK_PCREL) != 0);
> +
> +  /* Prefixed addressing requires 64-bit registers.  */

Does it?  Don't disable things just because you do not want to think
about if and how to support them.  Be much more exact in the comment here
if you do have a reason to disable it here.

> +  if (!TARGET_POWERPC64)
> + {
> +   if (TARGET_PCREL && explicit_pcrel)
> + error ("%qs requires %qs", "-mpcrel", "-m64");

TARGET_POWERPC64 is -mpowerpc64.  -m64 is TARGET_64BIT.

> +  /* Enable defaults if desired.  */
> +  else
> + {
> +   if (!explicit_prefixed
> +   && (TARGET_PREFIXED_ADDR_DEFAULT
> +   || TARGET_PCREL
> +   || TARGET_PCREL_DEFAULT))
> + rs6000_isa_flags |= OPTION_MASK_PREFIXED_ADDR;
> +
> +   if (!explicit_pcrel && TARGET_PCREL_DEFAULT
> +   && TARGET_CMODEL == CMODEL_MEDIUM)
> + rs6000_isa_flags |= OPTION_MASK_PCREL;
> + }

Should these be the other way around?


Segher


Re: [PATCH] Optimize to_chars

2019-08-30 Thread Martin Sebor

On 8/30/19 8:27 AM, Antony Polukhin wrote:

Bunch of micro optimizations for std::to_chars:
* For base == 8 replacing the lookup in __digits table with arithmetic
computations leads to a same CPU cycles for a loop (exchanges two
movzx with 3 bit ops https://godbolt.org/z/RTui7m ). However this
saves 129 bytes of data and totally avoids a chance of cache misses on
__digits.
* For base == 16 replacing the lookup in __digits table with
arithmetic computations leads to a few additional instructions, but
totally avoids a chance of cache misses on __digits (- ~9 cache misses
for worst case) and saves 513 bytes of const data.
* Replacing __first[pos] and __first[pos - 1] with __first[1] and
__first[0] on final iterations saves ~2% of code size.
* Removing trailing '\0' from arrays of digits allows the linker to
merge the symbols (so that "0123456789abcdefghijklmnopqrstuvwxyz" and
"0123456789abcdef" could share the same address). This improves data
locality and reduces binary sizes.
* Using __detail::__to_chars_len_2 instead of a generic
__detail::__to_chars_len makes the operation O(1) instead of O(N). It
also makes the code two times shorter ( https://godbolt.org/z/Peq_PG)
.

In sum: this significantly reduces the size of a binary (for about
4KBs only for base-8 conversion https://godbolt.org/z/WPKijS ), deals
with latency (CPU cache misses) without changing the iterations count
and without adding costly instructions into the loops.


Would it make sense to move some of this code into GCC as
a built-in so that it could also be used by GCC to expand
some strtol and sprintf calls?

Martin



Changelog:
 * include/std/charconv (__detail::__to_chars_8,
 __detail::__to_chars_16): Replace array of precomputed digits
 with arithmetic operations to avoid CPU cache misses. Remove
 zero termination from array of digits to allow symbol merge with
 generic implementation of __detail::__to_chars. Replace final
 offsets with constants. Use __detail::__to_chars_len_2 instead
 of a generic __detail::__to_chars_len.
 * include/std/charconv (__detail::__to_chars): Remove
 zero termination from array of digits.
 * include/std/charconv (__detail::__to_chars_2): Leading digit
 is always '1'.





Re: [PATCH 2/2] rs6000: Use unspec_volatile for darn (PR91481)

2019-08-30 Thread Segher Boessenkool
On Thu, Aug 22, 2019 at 07:20:48PM +, Segher Boessenkool wrote:
> Every call to darn should deliver a *new* random number; such calls
> shouldnot be CSEd together.  So they should be unspec_volatile, not
> plain unspec.
> 
> Tested on powerpc64-linux {-m32,-m64}.  (New testcase coming soon).
> Committing to trunk, and will backport to 9, 8, 7 as well.

Done now, all three patches (including the testcase).


Segher


Re: [PATCH, V3, #4 of 10], Add general prefixed/pcrel support

2019-08-30 Thread Segher Boessenkool
Hi!

(Please split off paddi to a separate patch?)

On Mon, Aug 26, 2019 at 04:43:37PM -0400, Michael Meissner wrote:
>   (prefixed_paddi_p): Fix thinkos in last patch.

Do that separately please.  Don't hide this in another patch like this.

Hrm, this is not in this patch at all?  Fix the changelog, then :-)

> --- gcc/config/rs6000/predicates.md   (revision 274870)
> +++ gcc/config/rs6000/predicates.md   (working copy)
> @@ -839,7 +839,8 @@ (define_special_predicate "indexed_addre
>  (define_predicate "add_operand"
>(if_then_else (match_code "const_int")
>  (match_test "satisfies_constraint_I (op)
> -  || satisfies_constraint_L (op)")
> +  || satisfies_constraint_L (op)
> +  || satisfies_constraint_eI (op)")
>  (match_operand 0 "gpc_reg_operand")))
>  
>  ;; Return 1 if the operand is either a non-special register, or 0, or -1.
> @@ -852,7 +853,8 @@ (define_predicate "adde_operand"
>  (define_predicate "non_add_cint_operand"
>(and (match_code "const_int")
> (match_test "!satisfies_constraint_I (op)
> - && !satisfies_constraint_L (op)")))
> + && !satisfies_constraint_L (op)
> + && !satisfies_constraint_eI (op)")))

(define_predicate "non_add_cint_operand"
  (and (match_code "const_int")
   (not (match_operand 0 "add_operand"

?  You can do that *now*, and it is pre-approved.  (This could use a better
name btw., I always have to look up what it means; a longer name is fine as
well of course, it is used only once or so).

> @@ -933,6 +935,13 @@ (define_predicate "lwa_operand"
>  return false;
>  
>addr = XEXP (inner, 0);
> +
> +  /* The LWA instruction uses the DS-form format where the bottom two bits of
> + the offset must be 0.  The prefixed PLWA does not have this
> + restriction.  */
> +  if (prefixed_local_addr_p (addr, mode, TRAD_INSN_DS))
> +return true;

Why does the decision whether something is a valid prefixed lwa_operand
need to know the non-prefixed lwa is a DS-form instruction?

And "local" is a head-scratcher for this condition, too.

> +;; Return 1 if op is a memory operand that is not prefixed.
> +(define_predicate "non_prefixed_mem_operand"
> +  (match_code "mem")
> +{
> +  if (!memory_operand (op, mode))
> +return false;
> +
> +  return !prefixed_local_addr_p (XEXP (op, 0), GET_MODE (op),
> +  TRAD_INSN_DEFAULT);
> +})

Use match_operand for the first condition please (and then match_test for
the second?)

This does make it seem like we need a prefixed_local_mem_p as well?  So
that we need neither that XEXP nor that GET_MODE.

> @@ -5735,6 +5735,10 @@ num_insns_constant_gpr (HOST_WIDE_INT va
>  && (value >> 31 == -1 || value >> 31 == 0))
>  return 1;
>  
> +  /* PADDI can support up to 34 bit signed integers.  */
> +  else if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (value))
> +return 1;

Write this earlier, together with the 16BIT one?

> @@ -6905,6 +6909,7 @@ rs6000_adjust_vec_address (rtx scalar_re
>rtx element_offset;
>rtx new_addr;
>bool valid_addr_p;
> +  bool pcrel_p = TARGET_PCREL && pcrel_local_address (addr, Pmode);

This is used 159 lines later.  Please refactor things.  That would make
a separate patch *before* this one.

> +  /* Optimize pc-relative addresses.  */
> +  else if (pcrel_p)
> +{
> +  if (CONST_INT_P (element_offset))
> + {
> +   rtx addr2 = addr;

This var needs a better name and/or comments.  Or maybe just factoring.

> @@ -7007,9 +7050,8 @@ rs6000_adjust_vec_address (rtx scalar_re
>  
>/* If we have a PLUS, we need to see whether the particular register class
>   allows for D-FORM or X-FORM addressing.  */
> -  if (GET_CODE (new_addr) == PLUS)
> +  if (GET_CODE (new_addr) == PLUS || pcrel_p)

That second condition needs a comment.

> @@ -7609,7 +7675,7 @@ mem_operand_ds_form (rtx op, machine_mod
> causes a wrap, so test only the low 16 bits.  */
>  offset = ((offset & 0x) ^ 0x8000) - 0x8000;
>  
> -  return offset + 0x8000 < 0x1u - extra;
> +  return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);

Please do all these things first too, as a separate patch.

> -  offset += 0x8000;
> -  return offset < 0x1 - extra;
> +  if (TARGET_PREFIXED_ADDR)
> +return SIGNED_34BIT_OFFSET_EXTRA_P (offset, extra);
> +  else
> +return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);

So this you could just do the 16BIT first, and then *this* patch will add
the 34BIT thing, in an easy-to-read patch.

> +{
> +  /* There is no prefixed version of the load/store with update.  */
> +  return !prefixed_local_addr_p (XEXP (x, 1), mode, TRAD_INSN_DEFAULT);
> +}

If you pass the actual MEM, the prefixed_local_mem_p function can return
false, itself.

> -   unsigned HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
> -   return val + 0x8000 >= 0x1 - (TARGET_POWERPC64 ? 8 : 12);
> +   HOST_WIDE_INT val = INTVAL (

Re: [PATCH] Fix thinko in early bail out in tree-switch-conversion.

2019-08-30 Thread Jeff Law
On 8/30/19 2:41 AM, Martin Liška wrote:
> Hi.
> 
> Thanks to Jakub, the patch addresses one memory leak in
> bit_test_cluster::find_bit_tests (allocation of output).
> And moreover, it implements proper guard when clustering
> is not successful. In such situation we want a quick bail out.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2019-08-29  Martin Liska  
> 
>   * tree-switch-conversion.c (jump_table_cluster::find_jump_tables):
>   Bail out when we'll end up with the same number of clusters as
>   at the beginning.
>   (bit_test_cluster::find_bit_tests): Likewise for bit tests.
>   (jump_table_cluster::can_be_handled): Remove the guard
>   as it's already handled in ::is_enabled.  Allocate output
>   after early bail out.
> ---
>  gcc/tree-switch-conversion.c | 18 --
>  1 file changed, 8 insertions(+), 10 deletions(-)
> 
> 
OK
jeff


Re: [PATCH] PR libstdc++/89164 enforce constraints for uninitialized algos

2019-08-30 Thread Jonathan Wakely

On 30/08/19 14:54 +0100, Jonathan Wakely wrote:

The memmove optimizations for std::uninitialized_copy/fill/_n will
compile even if the type is not copy constructible, because std::copy
doesn't require copy construction to work. But the uninitialized
algorithms do require it.

This adds explicit static assertions to ensure we don't allow ill-formed
initializations.

PR libstdc++/89164
* include/bits/stl_algobase.h (__copy_move): Give descriptive names
to template parameters.
* include/bits/stl_uninitialized.h (uninitialized_copy)
(uninitialized_fill, uninitialized_fill_n): Add static assertions to
diagnose invalid uses.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc:
Adjust expected error.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/89164.cc:
New test.
* testsuite/20_util/specialized_algorithms/uninitialized_copy_n/
89164.cc: New test.
* testsuite/20_util/specialized_algorithms/uninitialized_fill/89164.cc:
New test.
* testsuite/20_util/specialized_algorithms/uninitialized_fill_n/
89164.cc: New test.
* testsuite/23_containers/vector/cons/89164.cc: New test.
* testsuite/23_containers/vector/cons/89164_c++17.cc: New test.

This is the patch sent in
https://gcc.gnu.org/ml/libstdc++/2019-02/msg00034.html but with the
dg-error directives in the tests fixed.

Tested x86_64-linux, committed to trunk.


Oops, looks like what I tested didn't have the new test checked in.
This fixes it.


commit 01e0d7b3957c4bd8cdac86e0e4f16135dc61d51c
Author: Jonathan Wakely 
Date:   Fri Aug 30 17:18:17 2019 +0100

Fix errors in new test

* testsuite/23_containers/vector/cons/89164_c++17.cc: Fix errors.

diff --git a/libstdc++-v3/testsuite/23_containers/vector/cons/89164_c++17.cc b/libstdc++-v3/testsuite/23_containers/vector/cons/89164_c++17.cc
index 8b3afba4867..db7d8d5c850 100644
--- a/libstdc++-v3/testsuite/23_containers/vector/cons/89164_c++17.cc
+++ b/libstdc++-v3/testsuite/23_containers/vector/cons/89164_c++17.cc
@@ -22,6 +22,12 @@
 
 // PR libstdc++/89164
 
+struct X
+{
+  X() = default;
+  X(const X&) = delete;
+};
+
 void test01()
 {
   X x[1];


Re: [PATCH] Optimize to_chars

2019-08-30 Thread Jonathan Wakely

On 30/08/19 17:01 +0100, Jonathan Wakely wrote:

On 30/08/19 17:27 +0300, Antony Polukhin wrote:

Bunch of micro optimizations for std::to_chars:
* For base == 8 replacing the lookup in __digits table with arithmetic
computations leads to a same CPU cycles for a loop (exchanges two
movzx with 3 bit ops https://godbolt.org/z/RTui7m ). However this
saves 129 bytes of data and totally avoids a chance of cache misses on
__digits.
* For base == 16 replacing the lookup in __digits table with
arithmetic computations leads to a few additional instructions, but
totally avoids a chance of cache misses on __digits (- ~9 cache misses
for worst case) and saves 513 bytes of const data.
* Replacing __first[pos] and __first[pos - 1] with __first[1] and
__first[0] on final iterations saves ~2% of code size.
* Removing trailing '\0' from arrays of digits allows the linker to
merge the symbols (so that "0123456789abcdefghijklmnopqrstuvwxyz" and
"0123456789abcdef" could share the same address). This improves data
locality and reduces binary sizes.
* Using __detail::__to_chars_len_2 instead of a generic
__detail::__to_chars_len makes the operation O(1) instead of O(N). It
also makes the code two times shorter ( https://godbolt.org/z/Peq_PG)
.

In sum: this significantly reduces the size of a binary (for about
4KBs only for base-8 conversion https://godbolt.org/z/WPKijS ), deals
with latency (CPU cache misses) without changing the iterations count
and without adding costly instructions into the loops.


This is great, thanks.

Have you tried comparing the improved code to libc++'s implementation?
I believe they use precomputed arrays of digits, but they use larger
arrays that allow 4 bytes to be written at once, which is considerably
faster (and those precomputed arrays libe in libc++.so not in the
header). Would we be better off keeping the precomputed arrays and
expanding them to do 4-byte writes?

Since we don't have a patch to do that, I think I'll commit yours. We
can always go back to precomputed arrays later if somebody does that
work.

My only comments are on the changelog:


Changelog:
  * include/std/charconv (__detail::__to_chars_8,
  __detail::__to_chars_16): Replace array of precomputed digits


When the list of changed functions is split across lines it should be
like this:

  * include/std/charconv (__detail::__to_chars_8)
  (__detail::__to_chars_16): Replace array of precomputed digits

i.e close the parentheses before the line break, and reopen on the
next line.


  with arithmetic operations to avoid CPU cache misses. Remove
  zero termination from array of digits to allow symbol merge with
  generic implementation of __detail::__to_chars. Replace final
  offsets with constants. Use __detail::__to_chars_len_2 instead
  of a generic __detail::__to_chars_len.
  * include/std/charconv (__detail::__to_chars): Remove


Don't repeat the asterisk and filename for later changes in the same
file, i.e.

  (__detail::__to_chars): Remove zero termination from array of digits.
  (__detail::__to_chars_2): Leading digit is always '1'.

There's no changelog entry for the changes to __to_chars_len_8 and
__to_chars_len_16.


Oh, there's no separate __to_chars_len_16 function anyway, and you did
mention it as "Use __detail::__to_chars_len_2 instead ..." - sorry!

I think we might as well inline __to_chars_len_8 into __to_chars_8,
there's not much benefit to having a separate function. I'll do that
as a follow up patch.





Re: [PATCH] Optimize to_chars

2019-08-30 Thread Jonathan Wakely

On 30/08/19 17:27 +0300, Antony Polukhin wrote:

Bunch of micro optimizations for std::to_chars:
* For base == 8 replacing the lookup in __digits table with arithmetic
computations leads to a same CPU cycles for a loop (exchanges two
movzx with 3 bit ops https://godbolt.org/z/RTui7m ). However this
saves 129 bytes of data and totally avoids a chance of cache misses on
__digits.
* For base == 16 replacing the lookup in __digits table with
arithmetic computations leads to a few additional instructions, but
totally avoids a chance of cache misses on __digits (- ~9 cache misses
for worst case) and saves 513 bytes of const data.
* Replacing __first[pos] and __first[pos - 1] with __first[1] and
__first[0] on final iterations saves ~2% of code size.
* Removing trailing '\0' from arrays of digits allows the linker to
merge the symbols (so that "0123456789abcdefghijklmnopqrstuvwxyz" and
"0123456789abcdef" could share the same address). This improves data
locality and reduces binary sizes.
* Using __detail::__to_chars_len_2 instead of a generic
__detail::__to_chars_len makes the operation O(1) instead of O(N). It
also makes the code two times shorter ( https://godbolt.org/z/Peq_PG)
.

In sum: this significantly reduces the size of a binary (for about
4KBs only for base-8 conversion https://godbolt.org/z/WPKijS ), deals
with latency (CPU cache misses) without changing the iterations count
and without adding costly instructions into the loops.


This is great, thanks.

Have you tried comparing the improved code to libc++'s implementation?
I believe they use precomputed arrays of digits, but they use larger
arrays that allow 4 bytes to be written at once, which is considerably
faster (and those precomputed arrays libe in libc++.so not in the
header). Would we be better off keeping the precomputed arrays and
expanding them to do 4-byte writes?

Since we don't have a patch to do that, I think I'll commit yours. We
can always go back to precomputed arrays later if somebody does that
work.

My only comments are on the changelog:


Changelog:
   * include/std/charconv (__detail::__to_chars_8,
   __detail::__to_chars_16): Replace array of precomputed digits


When the list of changed functions is split across lines it should be
like this:

   * include/std/charconv (__detail::__to_chars_8)
   (__detail::__to_chars_16): Replace array of precomputed digits

i.e close the parentheses before the line break, and reopen on the
next line.


   with arithmetic operations to avoid CPU cache misses. Remove
   zero termination from array of digits to allow symbol merge with
   generic implementation of __detail::__to_chars. Replace final
   offsets with constants. Use __detail::__to_chars_len_2 instead
   of a generic __detail::__to_chars_len.
   * include/std/charconv (__detail::__to_chars): Remove


Don't repeat the asterisk and filename for later changes in the same
file, i.e.

   (__detail::__to_chars): Remove zero termination from array of digits.
   (__detail::__to_chars_2): Leading digit is always '1'.

There's no changelog entry for the changes to __to_chars_len_8 and
__to_chars_len_16.

Also, please don't include the ChangeLog diff in the patch, because it
just makes it hard to apply the patch (the ChangeLog part will almost
always fail to apply because somebody else will have modified the
ChangeLog file since you created the patch, and so that hunk won't
apply. The ChangeLog text should be sent as a separate (plain text)
attachment, or just in the email body (as you did above).

I'll test this and commit it, thanks!




Re: [committed] Suppress warnings for mips/r10k-cache-barrier-9.c

2019-08-30 Thread Richard Sandiford
Jeff Law  writes:
> The improvements in our ability to issue diagnostics for out of bounds
> accesses to trailing arrays is causing r10k-cache-barrier-9.c to fail on
> MIPS targets.
>
> This is a bit of an odd test.  It purposefully does out of bounds
> accesses.  Some accesses cross sub-object boundaries, others leave the
> object entirely.  Apparently the MIPS backend is supposed to detect this
> stuff and issue some kind of cache barrier instruction.
>
> Additionally mips.exp overrides dg-options and makes it harder to pass
> additional options, like -Wno-.  You can use "-w", but not
> "-Wno-<...>".  We can't really use dg-warning markers because we don't
> get warnings at all optimization levels.
>
> One might argue the test is bogus, but I'm going to assume it serves
> some useful purpose.   I'm adding "-w" to suppress warnings.

Huh, yeah.  Certainly can't remember now why this seemed worth testing,
but I agree this patch is the best fix.

Richard

> Installing on the trunk,
>
> JEff
>
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 0c5ec5325f5..5044002c09c 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,7 @@
> +2019-08-30  Jeff Law  
> +
> + * gcc.target/mips/r10k-cache-barrier-9.c: Suppress warnings.
> +
>  2019-08-30  Martin Jambor  
>  
>   tree-optimization/91579
> diff --git a/gcc/testsuite/gcc.target/mips/r10k-cache-barrier-9.c 
> b/gcc/testsuite/gcc.target/mips/r10k-cache-barrier-9.c
> index 2f83968aad6..2516b663ca1 100644
> --- a/gcc/testsuite/gcc.target/mips/r10k-cache-barrier-9.c
> +++ b/gcc/testsuite/gcc.target/mips/r10k-cache-barrier-9.c
> @@ -1,4 +1,4 @@
> -/* { dg-options "-mr10k-cache-barrier=store -G8" } */
> +/* { dg-options "-mr10k-cache-barrier=store -G8 -w" } */
>  
>  /* Test that out-of-range stores to components of static objects
> are protected by a cache barrier.  */


[Ada] Saturate the sizes for the target in -gnatR output

2019-08-30 Thread Eric Botcazou
This changes the saturation from the host to the target for the large sizes 
displayed in the the -gnatR output.

Tested on x86_64-suse-linux, applied on the mainline.


2019-08-30  Eric Botcazou  

* gcc-interface/decl.c (maybe_saturate_size): New function.
(gnat_to_gnu_entity): Invoke it on the Esize of types before sending
it for back-annotations.
* gcc-interface/trans.c: Fix typo.

-- 
Eric BotcazouIndex: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 275198)
+++ gcc-interface/decl.c	(working copy)
@@ -232,6 +232,7 @@ static tree build_position_list (tree, bool, tree,
 static vec build_subst_list (Entity_Id, Entity_Id, bool);
 static vec build_variant_list (tree, vec,
 	 vec);
+static tree maybe_saturate_size (tree);
 static tree validate_size (Uint, tree, Entity_Id, enum tree_code, bool, bool);
 static void set_rm_size (Uint, tree, Entity_Id);
 static unsigned int validate_alignment (Uint, Entity_Id, unsigned int);
@@ -4327,9 +4328,10 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gn
 	{
 	  tree gnu_size = TYPE_SIZE (gnu_type);
 
-	  /* If the size is self-referential, annotate the maximum value.  */
+	  /* If the size is self-referential, annotate the maximum value
+	 after saturating it, if need be, to avoid a No_Uint value.  */
 	  if (CONTAINS_PLACEHOLDER_P (gnu_size))
-	gnu_size = max_size (gnu_size, true);
+	gnu_size = maybe_saturate_size (max_size (gnu_size, true));
 
 	  /* If we are just annotating types and the type is tagged, the tag
 	 and the parent components are not generated by the front-end so
@@ -4365,7 +4367,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gn
 		  gnu_size = size_binop (PLUS_EXPR, gnu_size, offset);
 		}
 
-	  gnu_size = round_up (gnu_size, align);
+	  gnu_size = maybe_saturate_size (round_up (gnu_size, align));
 	  Set_Esize (gnat_entity, annotate_value (gnu_size));
 
 	  /* Tagged types are Strict_Alignment so RM_Size = Esize.  */
@@ -8723,6 +8725,19 @@ build_variant_list (tree qual_union_type, vec::create_ggc (512);
 }
 
-/* Destroy data structures of the decl.c module.  */
+/* Destroy the data structures of the decl.c module.  */
 
 void
 destroy_gnat_decl (void)
Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 275198)
+++ gcc-interface/trans.c	(working copy)
@@ -8790,7 +8790,7 @@ gnat_to_gnu (Node_Id gnat_node)
 
5. If this is a reference to an unconstrained array which is used as the
 	  prefix of an attribute reference that requires an lvalue, return the
-	  result unmodified because we want return the original bounds.
+	  result unmodified because we want to return the original bounds.
 
6. Finally, if the type of the result is already correct.  */
 


Re: [PATCH v2 0/9] S/390: Use signaling FP comparison instructions

2019-08-30 Thread Ilya Leoshkevich
> Am 30.08.2019 um 16:40 schrieb Ilya Leoshkevich :
> 
>> Am 30.08.2019 um 09:12 schrieb Richard Biener :
>> 
>> On Thu, Aug 29, 2019 at 5:39 PM Ilya Leoshkevich  wrote:
>>> 
 Am 22.08.2019 um 15:45 schrieb Ilya Leoshkevich :
 
 Bootstrap and regtest running on x86_64-redhat-linux and
 s390x-redhat-linux.
 
 This patch series adds signaling FP comparison support (both scalar and
 vector) to s390 backend.
>>> 
>>> I'm running into a problem on ppc64 with this patch, and it would be
>>> great if someone could help me figure out the best way to resolve it.
>>> 
>>> vector36.C test is failing because gimplifier produces the following
>>> 
>>> _5 = _4 > { 2.0e+0, 2.0e+0, 2.0e+0, 2.0e+0 };
>>> _6 = VEC_COND_EXPR <_5, { -1, -1, -1, -1 }, { 0, 0, 0, 0 }>;
>>> 
>>> from
>>> 
>>> VEC_COND_EXPR < (*b > { 2.0e+0, 2.0e+0, 2.0e+0, 2.0e+0 }) ,
>>> { -1, -1, -1, -1 } ,
>>> { 0, 0, 0, 0 } >
>>> 
>>> Since the comparison tree code is now hidden behind a temporary, my code
>>> does not have anything to pass to the backend.  The reason for creating
>>> a temporary is that the comparison can trap, and so the following check
>>> in gimplify_expr fails:
>>> 
>>> if (gimple_seq_empty_p (internal_post) && (*gimple_test_f) (*expr_p))
>>>   goto out;
>>> 
>>> gimple_test_f is is_gimple_condexpr, and it eventually calls
>>> operation_could_trap_p (GT).
>>> 
>>> My current solution is to simply state that backend does not support
>>> SSA_NAME in vector comparisons, however, I don't like it, since it may
>>> cause performance regressions due to having to fall back to scalar
>>> comparisons.
>>> 
>>> I was thinking about two other possible solutions:
>>> 
>>> 1. Change the gimplifier to allow trapping vector comparisons.  That's
>>>  a bit complicated, because tree_could_throw_p checks not only for
>>>  floating point traps, but also e.g. for array index out of bounds
>>>  traps.  So I would have to create a tree_could_throw_p version which
>>>  disregards specific kinds of traps.
>>> 
>>> 2. Change expand_vector_condition to follow SSA_NAME_DEF_STMT and use
>>>  its tree_code instead of SSA_NAME.  The potential problem I see with
>>>  this is that there appears to be no guarantee that _5 will be inlined
>>>  into _6 at a later point.  So if we say that we don't need to fall
>>>  back to scalar comparisons based on availability of vector >
>>>  instruction and inlining does not happen, then what's actually will
>>>  be required is vector selection (vsel on S/390), which might not be
>>>  available in general case.
>>> 
>>> What would be a better way to proceed here?
>> 
>> On GIMPLE there isn't a good reason to split out trapping comparisons
>> from [VEC_]COND_EXPR - the gimplifier does this for GIMPLE_CONDs
>> where it is important because we'd have no way to represent EH info
>> when not done.  It might be a bit awkward to preserve EH across RTL
>> expansion though in case the [VEC_]COND_EXPR are not expanded
>> as a single pattern, but I'm not sure.
> 
> Ok, so I'm testing the following now - for the problematic test that
> helped:
> 
> diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
> index b0c9f9b671a..940aa394769 100644
> --- a/gcc/gimple-expr.c
> +++ b/gcc/gimple-expr.c
> @@ -602,17 +602,33 @@ is_gimple_lvalue (tree t)
> || TREE_CODE (t) == BIT_FIELD_REF);
> }
> 
> -/*  Return true if T is a GIMPLE condition.  */
> +/* Helper for is_gimple_condexpr and is_possibly_trapping_gimple_condexpr.  
> */
> 
> -bool
> -is_gimple_condexpr (tree t)
> +static bool
> +is_gimple_condexpr_1 (tree t, bool allow_traps)
> {
>   return (is_gimple_val (t) || (COMPARISON_CLASS_P (t)
> - && !tree_could_throw_p (t)
> + && (allow_traps || !tree_could_throw_p (t))
>   && is_gimple_val (TREE_OPERAND (t, 0))
>   && is_gimple_val (TREE_OPERAND (t, 1;
> }
> 
> +/*  Return true if T is a GIMPLE condition.  */
> +
> +bool
> +is_gimple_condexpr (tree t)
> +{
> +  return is_gimple_condexpr_1 (t, false);
> +}
> +
> +/* Like is_gimple_condexpr, but allow the T to trap.  */
> +
> +bool
> +is_possibly_trapping_gimple_condexpr (tree t)
> +{
> +  return is_gimple_condexpr_1 (t, true);
> +}
> +
> /* Return true if T is a gimple address.  */
> 
> bool
> diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
> index 1ad1432bd17..20546ca5b99 100644
> --- a/gcc/gimple-expr.h
> +++ b/gcc/gimple-expr.h
> @@ -41,6 +41,7 @@ extern void gimple_cond_get_ops_from_tree (tree, enum 
> tree_code *, tree *,
>  tree *);
> extern bool is_gimple_lvalue (tree);
> extern bool is_gimple_condexpr (tree);
> +extern bool is_possibly_trapping_gimple_condexpr (tree);
> extern bool is_gimple_address (const_tree);
> extern bool is_gimple_invariant_address (const_tree);
> extern bool is_gimple_ip_invariant_address (const_tree);
> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
> in

[Ada] Add warning for explicit by-reference mechanism

2019-08-30 Thread Eric Botcazou
This instructs gigi to issue a warning when an explicit by-reference mechanism 
specified by means of pragma Export_Function cannot be honored.

Tested on x86_64-suse-linux, applied on the mainline.


2019-08-30  Eric Botcazou  

* gcc-interface/ada-tree.h (DECL_FORCED_BY_REF_P): New macro.
* gcc-interface/decl.c (gnat_to_gnu_param): Set it on parameters
whose mechanism was forced to by-reference.
* gcc-interface/trans.c (Call_to_gnu): Do not issue a warning about a
misaligned actual parameter if it is based on a CONSTRUCTOR.  Remove
obsolete warning for users of Starlet.  Issue a warning if a temporary
is make around the call for a parameter with DECL_FORCED_BY_REF_P set.
(addressable_p): Return true for REAL_CST and ADDR_EXPR.

-- 
Eric BotcazouIndex: gcc-interface/ada-tree.h
===
--- gcc-interface/ada-tree.h	(revision 275062)
+++ gcc-interface/ada-tree.h	(working copy)
@@ -482,6 +482,9 @@ do {		   \
value of a function call or 'reference to a function call.  */
 #define DECL_RETURN_VALUE_P(NODE) DECL_LANG_FLAG_5 (VAR_DECL_CHECK (NODE))
 
+/* Nonzero in a PARM_DECL if its mechanism was forced to by-reference.  */
+#define DECL_FORCED_BY_REF_P(NODE) DECL_LANG_FLAG_5 (PARM_DECL_CHECK (NODE))
+
 /* In a FIELD_DECL corresponding to a discriminant, contains the
discriminant number.  */
 #define DECL_DISCRIMINANT_NUMBER(NODE) DECL_INITIAL (FIELD_DECL_CHECK (NODE))
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 275196)
+++ gcc-interface/decl.c	(working copy)
@@ -5208,6 +5208,7 @@ gnat_to_gnu_param (Entity_Id gnat_param, tree gnu_
   bool ro_param = in_param && !Address_Taken (gnat_param);
   bool by_return = false, by_component_ptr = false;
   bool by_ref = false;
+  bool forced_by_ref = false;
   bool restricted_aliasing_p = false;
   location_t saved_location = input_location;
   tree gnu_param;
@@ -5235,7 +5236,11 @@ gnat_to_gnu_param (Entity_Id gnat_param, tree gnu_
   /* Or else, see if a Mechanism was supplied that forced this parameter
  to be passed one way or another.  */
   else if (mech == Default || mech == By_Copy || mech == By_Reference)
-;
+forced_by_ref
+  = (mech == By_Reference
+	 && !foreign
+	 && !TYPE_IS_BY_REFERENCE_P (gnu_param_type)
+	 && !Is_Aliased (gnat_param));
 
   /* Positive mechanism means by copy for sufficiently small parameters.  */
   else if (mech > 0)
@@ -5368,6 +5373,7 @@ gnat_to_gnu_param (Entity_Id gnat_param, tree gnu_
   gnu_param = create_param_decl (gnu_param_name, gnu_param_type);
   TREE_READONLY (gnu_param) = ro_param || by_ref || by_component_ptr;
   DECL_BY_REF_P (gnu_param) = by_ref;
+  DECL_FORCED_BY_REF_P (gnu_param) = forced_by_ref;
   DECL_BY_COMPONENT_PTR_P (gnu_param) = by_component_ptr;
   DECL_POINTS_TO_READONLY_P (gnu_param)
 = (ro_param && (by_ref || by_component_ptr));
Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 275197)
+++ gcc-interface/trans.c	(working copy)
@@ -5257,30 +5257,20 @@ Call_to_gnu (Node_Id gnat_node, tree *gnu_result_t
 
 	  /* Do not issue warnings for CONSTRUCTORs since this is not a copy
 	 but sort of an instantiation for them.  */
-	  if (TREE_CODE (gnu_name) == CONSTRUCTOR)
+	  if (TREE_CODE (remove_conversions (gnu_name, true)) == CONSTRUCTOR)
 	;
 
-	  /* If the type is passed by reference, a copy is not allowed.  */
-	  else if (TYPE_IS_BY_REFERENCE_P (gnu_formal_type))
+	  /* If the formal is passed by reference, a copy is not allowed.  */
+	  else if (TYPE_IS_BY_REFERENCE_P (gnu_formal_type)
+		   || Is_Aliased (gnat_formal))
 	post_error ("misaligned actual cannot be passed by reference",
 		gnat_actual);
 
-	  /* For users of Starlet we issue a warning because the interface
-	 apparently assumes that by-ref parameters outlive the procedure
-	 invocation.  The code still will not work as intended, but we
-	 cannot do much better since low-level parts of the back-end
-	 would allocate temporaries at will because of the misalignment
-	 if we did not do so here.  */
-	  else if (Is_Valued_Procedure (Entity (Name (gnat_node
-	{
-	  post_error
-		("?possible violation of implicit assumption", gnat_actual);
-	  post_error_ne
-		("?made by pragma Import_Valued_Procedure on &", gnat_actual,
-		 Entity (Name (gnat_node)));
-	  post_error_ne ("?because of misalignment of &", gnat_actual,
-			 gnat_formal);
-	}
+	  /* If the mechanism was forced to by-ref, a copy is not allowed but
+	 we issue only a warning because this case is not strict Ada.  */
+	  else if (DECL_FORCED_BY_REF_P (gnu_formal))
+	post_error ("misaligned actual cannot be passed by reference??",
+			gnat_actual);
 
 	  /* If the actual type of the object is already the

Re: [PATCH] or1k: Fix issue with set_got clobbering r9

2019-08-30 Thread Richard Henderson
LGTM.

On 8/30/19 2:31 AM, Stafford Horne wrote:
> Hello, any comments on this?
> 
> If nothing I will commit in a few days.
> 
> On Thu, Aug 22, 2019 at 08:44:04PM +0900, Stafford Horne wrote:
>> When compiling glibc we found that the GOT register was being allocated
>> r9 when the instruction was still set_got_tmp.  That caused set_got to
>> clobber r9.  We cannot simply say set_got_tmp clobbers r9 as this is the
>> reason for having the temporary set_got_tmp.
>>
>> Fix by using a register class constraint that does not allow r9 during
>> register allocation.
>>
>> gcc/ChangeLog:
>>
>>  * config/or1k/constraints.md (t): New constraint.
>>  * config/or1k/or1k.h (GOT_REGS): New register class.
>>  * config/or1k/or1k.md (set_got_tmp, set_got): Use t contraint.
>> ---
>>  gcc/config/or1k/constraints.md | 4 
>>  gcc/config/or1k/or1k.h | 3 +++
>>  gcc/config/or1k/or1k.md| 4 ++--
>>  3 files changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/config/or1k/constraints.md b/gcc/config/or1k/constraints.md
>> index 8cac7eb5329..ba330c6b8c2 100644
>> --- a/gcc/config/or1k/constraints.md
>> +++ b/gcc/config/or1k/constraints.md
>> @@ -25,6 +25,7 @@
>>  ; We use:
>>  ;  c - sibcall registers
>>  ;  d - double pair base registers (excludes r0, r30 and r31 which overflow)
>> +;  t - got address registers (excludes r9 is clobbered by set_got)
> 
> I will changee this to (... r9 which is clobbered ...)
> 
>>  ;  I - constant signed 16-bit
>>  ;  K - constant unsigned 16-bit
>>  ;  M - constant signed 16-bit shifted left 16-bits (l.movhi)
>> @@ -36,6 +37,9 @@
>>  (define_register_constraint "d" "DOUBLE_REGS"
>>"Registers which can be used for double reg pairs.")
>>  
>> +(define_register_constraint "t" "GOT_REGS"
>> +  "Registers which can be used to store the Global Offset Table (GOT) 
>> address.")
>> +
>>  ;; Immediates
>>  (define_constraint "I"
>>"A signed 16-bit immediate in the range -32768 to 32767."
>> diff --git a/gcc/config/or1k/or1k.h b/gcc/config/or1k/or1k.h
>> index 2b29e62fdd3..4c32607bac1 100644
>> --- a/gcc/config/or1k/or1k.h
>> +++ b/gcc/config/or1k/or1k.h
>> @@ -190,6 +190,7 @@ enum reg_class
>>NO_REGS,
>>SIBCALL_REGS,
>>DOUBLE_REGS,
>> +  GOT_REGS,
>>GENERAL_REGS,
>>FLAG_REGS,
>>ALL_REGS,
>> @@ -202,6 +203,7 @@ enum reg_class
>>"NO_REGS",\
>>"SIBCALL_REGS",   \
>>"DOUBLE_REGS",\
>> +  "GOT_REGS",   \
>>"GENERAL_REGS",   \
>>"FLAG_REGS",  \
>>"ALL_REGS" }
>> @@ -215,6 +217,7 @@ enum reg_class
>>  { { 0x, 0x },   \
>>{ SIBCALL_REGS_MASK,   0 },   \
>>{ 0x7f7e, 0x },   \
>> +  { 0xfdff, 0x },   \
>>{ 0x, 0x0003 },   \
>>{ 0x, 0x0004 },   \
>>{ 0x, 0x0007 }\
>> diff --git a/gcc/config/or1k/or1k.md b/gcc/config/or1k/or1k.md
>> index cee11d078cc..36bcee336ab 100644
>> --- a/gcc/config/or1k/or1k.md
>> +++ b/gcc/config/or1k/or1k.md
>> @@ -706,7 +706,7 @@
>>  ;; set_got pattern below.  This works because the set_got_tmp insn is the
>>  ;; first insn in the stream and that it isn't moved during RA.
>>  (define_insn "set_got_tmp"
>> -  [(set (match_operand:SI 0 "register_operand" "=r")
>> +  [(set (match_operand:SI 0 "register_operand" "=t")
>>  (unspec_volatile:SI [(const_int 0)] UNSPECV_SET_GOT))]
>>""
>>  {
>> @@ -715,7 +715,7 @@
>>  
>>  ;; The insn to initialize the GOT.
>>  (define_insn "set_got"
>> -  [(set (match_operand:SI 0 "register_operand" "=r")
>> +  [(set (match_operand:SI 0 "register_operand" "=t")
>>  (unspec:SI [(const_int 0)] UNSPEC_SET_GOT))
>> (clobber (reg:SI LR_REGNUM))]
>>""
>> -- 
>> 2.21.0
>>



[Ada] Minor tweak for coverage

2019-08-30 Thread Eric Botcazou
Tested on x86_64-suse-linux, applied on the mainline.


2019-08-30  Eric Botcazou  

* gcc-interface/trans.c (gnat_to_gnu): Do not set the location on an
expression used for a tag.

-- 
Eric BotcazouIndex: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 275191)
+++ gcc-interface/trans.c	(working copy)
@@ -8727,10 +8727,16 @@ gnat_to_gnu (Node_Id gnat_node)
 	set_gnu_expr_location_from_node (gnu_result, gnat_node);
 }
 
-  /* Set the location information on the result if it's not a simple name.
+  /* Set the location information on the result if it's not a simple name
+ or something that contains a simple name, for example a tag, because
+ we don"t want all the references to get the location of the first use.
  Note that we may have no result if we tried to build a CALL_EXPR node
  to a procedure with no side-effects and optimization is enabled.  */
-  else if (kind != N_Identifier && gnu_result && EXPR_P (gnu_result))
+  else if (kind != N_Identifier
+	   && !(kind == N_Selected_Component
+		&& Chars (Selector_Name (gnat_node)) == Name_uTag)
+	   && gnu_result
+	   && EXPR_P (gnu_result))
 set_gnu_expr_location_from_node (gnu_result, gnat_node);
 
   /* If we're supposed to return something of void_type, it means we have


[Ada] Allow tighter packing for component with variant part

2019-08-30 Thread Eric Botcazou
This lifts an old restriction pertaining to the layout of components of record 
types whose nominal subtype contains a variant part: they couldn't be packed.

Tested on x86_64-suse-linux, applied on the mainline.


2019-08-30  Eric Botcazou  

* gcc-interface/gigi.h (aggregate_type_contains_array_p): Declare.
* gcc-interface/decl.c (gnat_to_gnu_entity) : For an
extension, test Has_Record_Rep_Clause instead of Has_Specified_Layout.
(adjust_packed): Return 0 if the type of the field is an aggregate
type that contains (or is) a self-referential array.
(type_has_variable_size): Delete.
* gcc-interface/utils.c (inish_record_type): Constify a variable.
(aggregate_type_contains_array_p): Add parameter SELF_REFERENTIAL.
: Pass it in the recursive call.
: If it is true, return true only if the array type is
self-referential.
(create_field_decl): Streamline the setting of the alignment on the
field.  Pass false to aggregate_type_contains_array_p.


2019-08-30  Eric Botcazou  

* gnat.dg/pack24.adb: New test.

-- 
Eric BotcazouIndex: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 275188)
+++ gcc-interface/decl.c	(working copy)
@@ -202,7 +202,6 @@ static void prepend_one_attribute_pragma (struct a
 static void prepend_attributes (struct attrib **, Entity_Id);
 static tree elaborate_expression (Node_Id, Entity_Id, const char *, bool, bool,
   bool);
-static bool type_has_variable_size (tree);
 static tree elaborate_expression_1 (tree, Entity_Id, const char *, bool, bool);
 static tree elaborate_expression_2 (tree, Entity_Id, const char *, bool, bool,
 unsigned int);
@@ -2953,10 +2952,13 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gn
 	  : 0;
 	const bool has_align = Known_Alignment (gnat_entity);
 	const bool has_discr = Has_Discriminants (gnat_entity);
-	const bool has_rep = Has_Specified_Layout (gnat_entity);
 	const bool is_extension
 	  = (Is_Tagged_Type (gnat_entity)
 	 && Nkind (record_definition) == N_Derived_Type_Definition);
+	const bool has_rep
+	  = is_extension
+	? Has_Record_Rep_Clause (gnat_entity)
+	: Has_Specified_Layout (gnat_entity);
 	const bool is_unchecked_union = Is_Unchecked_Union (gnat_entity);
 	bool all_rep = has_rep;
 
@@ -6865,11 +6867,13 @@ choices_to_gnu (tree gnu_operand, Node_Id gnat_cho
 static int
 adjust_packed (tree field_type, tree record_type, int packed)
 {
-  /* If the field contains an item of variable size, we cannot pack it
- because we cannot create temporaries of non-fixed size in case
- we need to take the address of the field.  See addressable_p and
- the notes on the addressability issues for further details.  */
-  if (type_has_variable_size (field_type))
+  /* If the field contains an array with self-referential size, we'd better
+ not pack it because this would misalign it and, therefore, cause large
+ temporaries to be created in case we need to take the address of the
+ field.  See addressable_p and the notes on the addressability issues
+ for further details.  */
+  if (AGGREGATE_TYPE_P (field_type)
+  && aggregate_type_contains_array_p (field_type, true))
 return 0;
 
   /* In the other cases, we can honor the packing.  */
@@ -7274,31 +7278,6 @@ components_need_strict_alignment (Node_Id componen
   return false;
 }
 
-/* Return true if TYPE is a type with variable size or a padding type with a
-   field of variable size or a record that has a field with such a type.  */
-
-static bool
-type_has_variable_size (tree type)
-{
-  tree field;
-
-  if (!TREE_CONSTANT (TYPE_SIZE (type)))
-return true;
-
-  if (TYPE_IS_PADDING_P (type)
-  && !TREE_CONSTANT (DECL_SIZE (TYPE_FIELDS (type
-return true;
-
-  if (!RECORD_OR_UNION_TYPE_P (type))
-return false;
-
-  for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
-if (type_has_variable_size (TREE_TYPE (field)))
-  return true;
-
-  return false;
-}
-
 /* Return true if FIELD is an artificial field.  */
 
 static bool
Index: gcc-interface/gigi.h
===
--- gcc-interface/gigi.h	(revision 275174)
+++ gcc-interface/gigi.h	(working copy)
@@ -835,6 +835,11 @@ extern tree get_base_type (tree type);
in bits.  If we don't know anything about the alignment, return 0.  */
 extern unsigned int known_alignment (tree exp);
 
+/* Return true if TYPE, an aggregate type, contains (or is) an array.
+   If SELF_REFERENTIAL is true, then an additional requirement on the
+   array is that it be self-referential.  */
+extern bool aggregate_type_contains_array_p (tree type, bool self_referential);
+
 /* Return true if VALUE is a multiple of FACTOR. FACTOR must be a power
of 2.  */
 extern bool value_factor_p (tree value, unsigned HOST_WIDE_INT factor);
Index: gcc-interface/utils.c

[committed] Suppress warnings for mips/r10k-cache-barrier-9.c

2019-08-30 Thread Jeff Law

The improvements in our ability to issue diagnostics for out of bounds
accesses to trailing arrays is causing r10k-cache-barrier-9.c to fail on
MIPS targets.

This is a bit of an odd test.  It purposefully does out of bounds
accesses.  Some accesses cross sub-object boundaries, others leave the
object entirely.  Apparently the MIPS backend is supposed to detect this
stuff and issue some kind of cache barrier instruction.

Additionally mips.exp overrides dg-options and makes it harder to pass
additional options, like -Wno-.  You can use "-w", but not
"-Wno-<...>".  We can't really use dg-warning markers because we don't
get warnings at all optimization levels.

One might argue the test is bogus, but I'm going to assume it serves
some useful purpose.   I'm adding "-w" to suppress warnings.

Installing on the trunk,

JEff
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 0c5ec5325f5..5044002c09c 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2019-08-30  Jeff Law  
+
+   * gcc.target/mips/r10k-cache-barrier-9.c: Suppress warnings.
+
 2019-08-30  Martin Jambor  
 
tree-optimization/91579
diff --git a/gcc/testsuite/gcc.target/mips/r10k-cache-barrier-9.c 
b/gcc/testsuite/gcc.target/mips/r10k-cache-barrier-9.c
index 2f83968aad6..2516b663ca1 100644
--- a/gcc/testsuite/gcc.target/mips/r10k-cache-barrier-9.c
+++ b/gcc/testsuite/gcc.target/mips/r10k-cache-barrier-9.c
@@ -1,4 +1,4 @@
-/* { dg-options "-mr10k-cache-barrier=store -G8" } */
+/* { dg-options "-mr10k-cache-barrier=store -G8 -w" } */
 
 /* Test that out-of-range stores to components of static objects
are protected by a cache barrier.  */


Re: [PATCH] Sanitizing the middle-end interface to the back-end for strict alignment

2019-08-30 Thread Christophe Lyon
On Thu, 29 Aug 2019 at 23:26, Bernd Edlinger  wrote:
>
> On 8/29/19 11:08 AM, Christophe Lyon wrote:
> > On Thu, 29 Aug 2019 at 10:58, Kyrill Tkachov
> >  wrote:
> >>
> >> Hi Bernd,
> >>
> >> On 8/28/19 10:36 PM, Bernd Edlinger wrote:
> >>> On 8/28/19 2:07 PM, Christophe Lyon wrote:
>  Hi,
> 
>  This patch causes an ICE when building libgcc's unwind-arm.o
>  when configuring GCC:
>  --target  arm-none-linux-gnueabihf --with-mode thumb --with-cpu
>  cortex-a15 --with-fpu neon-vfpv4:
> 
>  The build works for the same target, but --with-mode arm --with-cpu
>  cortex a9 --with-fpu vfp
> 
>  In file included from
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/config/arm/unwind-arm.c:144:
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/unwind-arm-common.inc:
>  In function 'get_eit_entry':
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/unwind-arm-common.inc:245:29:
>  warning: cast discards 'const' qualifier from pointer target type
>  [-Wcast-qual]
> 245 |   ucbp->pr_cache.ehtp = (_Unwind_EHT_Header 
>  *)&eitp->content;
> | ^
>  during RTL pass: expand
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/unwind-arm-common.inc:
>  In function 'unwind_phase2_forced':
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/unwind-arm-common.inc:319:18:
>  internal compiler error: in gen_movdi, at config/arm/arm.md:5235
> 319 |   saved_vrs.core = entry_vrs->core;
> |   ~~~^
>  0x126530f gen_movdi(rtx_def*, rtx_def*)
>   
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.md:5235
>  0x896d92 insn_gen_fn::operator()(rtx_def*, rtx_def*) const
>   
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/recog.h:318
>  0x896d92 emit_move_insn_1(rtx_def*, rtx_def*)
>   
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:3694
>  0x897083 emit_move_insn(rtx_def*, rtx_def*)
>   
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:3790
>  0xfc25d6 gen_cpymem_ldrd_strd(rtx_def**)
>   
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:14582
>  0x126a1f1 gen_cpymemqi(rtx_def*, rtx_def*, rtx_def*, rtx_def*)
>   
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.md:6688
>  0xb0bc08 maybe_expand_insn(insn_code, unsigned int, expand_operand*)
>   
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/optabs.c:7440
>  0x89ba1e emit_block_move_via_cpymem
>   
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:1808
>  0x89ba1e emit_block_move_hints(rtx_def*, rtx_def*, rtx_def*,
>  block_op_methods, unsigned int, long, unsigned long, unsigned long,
>  unsigned long, bool, bool*)
>   
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:1627
>  0x89c383 emit_block_move(rtx_def*, rtx_def*, rtx_def*, block_op_methods)
>   
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:1667
>  0x89fb4e store_expr(tree_node*, rtx_def*, int, bool, bool)
>   
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:5845
>  0x88c1f9 store_field
>   
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:7149
>  0x8a0c22 expand_assignment(tree_node*, tree_node*, bool)
>   
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:5304
>  0x761964 expand_gimple_stmt_1
>   
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:3779
>  0x761964 expand_gimple_stmt
>   
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:3875
>  0x768583 expand_gimple_basic_block
>   
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:5915
>  0x76abc6 execute
>   
>  /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:6538
> 
>  Christophe
> 
> >>> Okay, sorry for the breakage.
> >>>
> >>> What is happening in gen_cpymem_ldrd_strd is of course against the rules:
> >>>
> >>> It uses emit_move_insn on only 4-byte aligned DI-mode memory operands.
> >>>
> >>> I have a patch for this, which is able to fix the libgcc build on a 
> >>> cross, but have no
> >>> possibility to bootstrap the affected target.
> >>>
> >>> Could you please help?
> >>
> >> Well it's good that the sanitisation is catching the bugs!
> >>
>
> Yes, more than expected, though ;)
>
> >> Bootstrapping this patch I get another assert with the backtrace:
> >
>

Re: [ARM/FDPIC v5 02/21] [ARM] FDPIC: Handle arm*-*-uclinuxfdpiceabi in configure scripts

2019-08-30 Thread Richard Sandiford
Christophe Lyon  writes:
> On Fri, 30 Aug 2019 at 11:00, Richard Sandiford
>  wrote:
>>
>> Christophe Lyon  writes:
>> > @@ -785,7 +785,7 @@ case ${target} in
>> >esac
>> >tmake_file="t-slibgcc"
>> >case $target in
>> > -*-*-linux* | frv-*-*linux* | *-*-kfreebsd*-gnu | 
>> > *-*-kopensolaris*-gnu)
>> > +*-*-linux* | frv-*-*linux* | *-*-kfreebsd*-gnu | 
>> > *-*-kopensolaris*-gnu  | *-*-uclinuxfdpiceabi)
>> >:;;
>> >  *-*-gnu*)
>> >native_system_header_dir=/include
>>
>> I don't think this is necessary, since this target will never match the
>> following *-*-gnu*) stanza anyway.
> OK (I thought it was clearer to add the fdpic config where we already
> have linux that would not match)

I think the idea is to match pure GNU systems only in the second stanza
(i.e. GNU/Hurd).  So we need the first stanza to exclude hybrid-GNU
systems like GNU/Linux, GNU/Solaris, GNU/FreeBSD, etc.

Since uclinuxfdpiceabi isn't a GNU-based system, I don't think it
needs to appear at all.

>> > diff --git a/libtool.m4 b/libtool.m4
>> > index 8966762..64e507a 100644
>> > --- a/libtool.m4
>> > +++ b/libtool.m4
>> > @@ -3734,7 +3739,7 @@ m4_if([$1], [CXX], [
>> >   ;;
>> >   esac
>> >   ;;
>> > -  linux* | k*bsd*-gnu | kopensolaris*-gnu)
>> > +  linux* | k*bsd*-gnu | kopensolaris*-gnu | uclinux*)
>> >   case $cc_basename in
>> > KCC*)
>> >   # KAI C++ Compiler
>>
>> Is this needed?  It seems to be in the !GCC branch of an if/else.
> I must admit I didn't test this case. I thought it was needed because
> this target does not match "linux*", in case someone tries to compile
> with another compiler...
>
>
>>
>> If it is needed, the default:
>>
>> _LT_TAGVAR(lt_prog_compiler_can_build_shared, $1)=no
>>
>> seems correct for non-FDPIC uclinux.
>>
> So, either use uclinuxfdpiceabi above, or do nothing and do not try to
> support other compilers?

Yeah.  I think the latter's better, since in this context we only
need libtool.m4 to support building with GCC.  The decision might
be different for upstream libtool, but do any commercial compilers
support Arm FDPIC yet?

>> > @@ -4032,7 +4037,7 @@ m4_if([$1], [CXX], [
>> >_LT_TAGVAR(lt_prog_compiler_static, $1)='-non_shared'
>> >;;
>> >
>> > -linux* | k*bsd*-gnu | kopensolaris*-gnu)
>> > +linux* | k*bsd*-gnu | kopensolaris*-gnu | uclinux*)
>> >case $cc_basename in
>> ># old Intel for x86_64 which still supported -KPIC.
>> >ecc*)
>>
>> Same here.
>>
>> > @@ -5946,7 +5951,7 @@ if test "$_lt_caught_CXX_error" != yes; then
>> >  _LT_TAGVAR(inherit_rpath, $1)=yes
>> >  ;;
>> >
>> > -  linux* | k*bsd*-gnu | kopensolaris*-gnu)
>> > +  linux* | k*bsd*-gnu | kopensolaris*-gnu | uclinuxfdpiceabi)
>> >  case $cc_basename in
>> >KCC*)
>> >   # Kuck and Associates, Inc. (KAI) C++ Compiler
>>
>> Here too the code seems to be dealing specifically with non-GCC compilers.
>>
>> > @@ -6598,7 +6603,7 @@ interix[[3-9]]*)
>> >_LT_TAGVAR(postdeps,$1)=
>> >;;
>> >
>> > -linux*)
>> > +linux* | uclinux*)
>> >case `$CC -V 2>&1 | sed 5q` in
>> >*Sun\ C*)
>> >  # Sun C++ 5.9
>>
>> Here too.  (It only seems to do anything for Sun's C compiler.)
>>
>> The fewer hunks we have to maintain downstream the better :-)
>>
> Sure.
>
> I thought safer/cleaner to prepare the cases for non-GCC compilers, I
> guess it's better not to add that until proven useful?

Yeah, I think so.  I guess it depends on your POV.  To me, it seems
cleaner to add uclinux* and uclinuxfdpiceabi only where we know there's
a specific need, since that's also how we decide which of uclinux* and
uclinuxfdpiceabi to use.

Thanks,
Richard


[Ada] Fix minor inaccuracy in lvalue_required_p

2019-08-30 Thread Eric Botcazou
It can only result in a missed early optimization.

Tested on x86_64-suse-linux, applied on the mainline.


2019-08-30  Eric Botcazou  

* gcc-interface/trans.c (lvalue_required_p) : Adjust GNU_TYPE
in the recursive call.
: Likewise.

-- 
Eric BotcazouIndex: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 275187)
+++ gcc-interface/trans.c	(working copy)
@@ -873,12 +873,14 @@ lvalue_required_p (Node_Id gnat_node, tree gnu_typ
   if (Prefix (gnat_parent) != gnat_node)
 	return 0;
 
-  return lvalue_required_p (gnat_parent, gnu_type, constant,
-address_of_constant);
+  return lvalue_required_p (gnat_parent,
+get_unpadded_type (Etype (gnat_parent)),
+constant, address_of_constant);
 
 case N_Selected_Component:
-  return lvalue_required_p (gnat_parent, gnu_type, constant,
-address_of_constant);
+  return lvalue_required_p (gnat_parent,
+get_unpadded_type (Etype (gnat_parent)),
+constant, address_of_constant);
 
 case N_Object_Renaming_Declaration:
   /* We need to preserve addresses through a renaming.  */


[Ada] Fix crash on unconstrained array with convention C

2019-08-30 Thread Eric Botcazou
This happens when the array has multiple dimensions.

Tested on x86_64-suse-linux, applied on the mainline.


2019-08-30  Eric Botcazou  

* gcc-interface/utils.c (build_template): Deal with parameters
passed by pointer to component of multi-dimensional arrays.

-- 
Eric BotcazouIndex: gcc-interface/utils.c
===
--- gcc-interface/utils.c	(revision 275062)
+++ gcc-interface/utils.c	(working copy)
@@ -3953,15 +3953,12 @@ build_template (tree template_type, tree array_typ
 	  && TYPE_HAS_ACTUAL_BOUNDS_P (array_type)))
 bound_list = TYPE_ACTUAL_BOUNDS (array_type);
 
-  /* First make the list for a CONSTRUCTOR for the template.  Go down the
- field list of the template instead of the type chain because this
- array might be an Ada array of arrays and we can't tell where the
- nested arrays stop being the underlying object.  */
-
-  for (field = TYPE_FIELDS (template_type); field;
-   (bound_list
-	? (bound_list = TREE_CHAIN (bound_list))
-	: (array_type = TREE_TYPE (array_type))),
+  /* First make the list for a CONSTRUCTOR for the template.  Go down
+ the field list of the template instead of the type chain because
+ this array might be an Ada array of array and we can't tell where
+ the nested array stop being the underlying object.  */
+  for (field = TYPE_FIELDS (template_type);
+   field;
field = DECL_CHAIN (DECL_CHAIN (field)))
 {
   tree bounds, min, max;
@@ -3968,12 +3965,18 @@ build_template (tree template_type, tree array_typ
 
   /* If we have a bound list, get the bounds from there.  Likewise
 	 for an ARRAY_TYPE.  Otherwise, if expr is a PARM_DECL with
-	 DECL_BY_COMPONENT_PTR_P, use the bounds of the field in the template.
-	 This will give us a maximum range.  */
+	 DECL_BY_COMPONENT_PTR_P, use the bounds of the field in the
+	 template, but this will only give us a maximum range.  */
   if (bound_list)
-	bounds = TREE_VALUE (bound_list);
+	{
+	  bounds = TREE_VALUE (bound_list);
+	  bound_list = TREE_CHAIN (bound_list);
+	}
   else if (TREE_CODE (array_type) == ARRAY_TYPE)
-	bounds = TYPE_INDEX_TYPE (TYPE_DOMAIN (array_type));
+	{
+	  bounds = TYPE_INDEX_TYPE (TYPE_DOMAIN (array_type));
+	  array_type = TREE_TYPE (array_type);
+	}
   else if (expr && TREE_CODE (expr) == PARM_DECL
 	   && DECL_BY_COMPONENT_PTR_P (expr))
 	bounds = TREE_TYPE (field);


Re: [PATCH v2 0/9] S/390: Use signaling FP comparison instructions

2019-08-30 Thread Ilya Leoshkevich
> Am 30.08.2019 um 09:12 schrieb Richard Biener :
> 
> On Thu, Aug 29, 2019 at 5:39 PM Ilya Leoshkevich  wrote:
>> 
>>> Am 22.08.2019 um 15:45 schrieb Ilya Leoshkevich :
>>> 
>>> Bootstrap and regtest running on x86_64-redhat-linux and
>>> s390x-redhat-linux.
>>> 
>>> This patch series adds signaling FP comparison support (both scalar and
>>> vector) to s390 backend.
>> 
>> I'm running into a problem on ppc64 with this patch, and it would be
>> great if someone could help me figure out the best way to resolve it.
>> 
>> vector36.C test is failing because gimplifier produces the following
>> 
>>  _5 = _4 > { 2.0e+0, 2.0e+0, 2.0e+0, 2.0e+0 };
>>  _6 = VEC_COND_EXPR <_5, { -1, -1, -1, -1 }, { 0, 0, 0, 0 }>;
>> 
>> from
>> 
>>  VEC_COND_EXPR < (*b > { 2.0e+0, 2.0e+0, 2.0e+0, 2.0e+0 }) ,
>>  { -1, -1, -1, -1 } ,
>>  { 0, 0, 0, 0 } >
>> 
>> Since the comparison tree code is now hidden behind a temporary, my code
>> does not have anything to pass to the backend.  The reason for creating
>> a temporary is that the comparison can trap, and so the following check
>> in gimplify_expr fails:
>> 
>>  if (gimple_seq_empty_p (internal_post) && (*gimple_test_f) (*expr_p))
>>goto out;
>> 
>> gimple_test_f is is_gimple_condexpr, and it eventually calls
>> operation_could_trap_p (GT).
>> 
>> My current solution is to simply state that backend does not support
>> SSA_NAME in vector comparisons, however, I don't like it, since it may
>> cause performance regressions due to having to fall back to scalar
>> comparisons.
>> 
>> I was thinking about two other possible solutions:
>> 
>> 1. Change the gimplifier to allow trapping vector comparisons.  That's
>>   a bit complicated, because tree_could_throw_p checks not only for
>>   floating point traps, but also e.g. for array index out of bounds
>>   traps.  So I would have to create a tree_could_throw_p version which
>>   disregards specific kinds of traps.
>> 
>> 2. Change expand_vector_condition to follow SSA_NAME_DEF_STMT and use
>>   its tree_code instead of SSA_NAME.  The potential problem I see with
>>   this is that there appears to be no guarantee that _5 will be inlined
>>   into _6 at a later point.  So if we say that we don't need to fall
>>   back to scalar comparisons based on availability of vector >
>>   instruction and inlining does not happen, then what's actually will
>>   be required is vector selection (vsel on S/390), which might not be
>>   available in general case.
>> 
>> What would be a better way to proceed here?
> 
> On GIMPLE there isn't a good reason to split out trapping comparisons
> from [VEC_]COND_EXPR - the gimplifier does this for GIMPLE_CONDs
> where it is important because we'd have no way to represent EH info
> when not done.  It might be a bit awkward to preserve EH across RTL
> expansion though in case the [VEC_]COND_EXPR are not expanded
> as a single pattern, but I'm not sure.

Ok, so I'm testing the following now - for the problematic test that
helped:

diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
index b0c9f9b671a..940aa394769 100644
--- a/gcc/gimple-expr.c
+++ b/gcc/gimple-expr.c
@@ -602,17 +602,33 @@ is_gimple_lvalue (tree t)
  || TREE_CODE (t) == BIT_FIELD_REF);
 }
 
-/*  Return true if T is a GIMPLE condition.  */
+/* Helper for is_gimple_condexpr and is_possibly_trapping_gimple_condexpr.  */
 
-bool
-is_gimple_condexpr (tree t)
+static bool
+is_gimple_condexpr_1 (tree t, bool allow_traps)
 {
   return (is_gimple_val (t) || (COMPARISON_CLASS_P (t)
-   && !tree_could_throw_p (t)
+   && (allow_traps || !tree_could_throw_p (t))
&& is_gimple_val (TREE_OPERAND (t, 0))
&& is_gimple_val (TREE_OPERAND (t, 1;
 }
 
+/*  Return true if T is a GIMPLE condition.  */
+
+bool
+is_gimple_condexpr (tree t)
+{
+  return is_gimple_condexpr_1 (t, false);
+}
+
+/* Like is_gimple_condexpr, but allow the T to trap.  */
+
+bool
+is_possibly_trapping_gimple_condexpr (tree t)
+{
+  return is_gimple_condexpr_1 (t, true);
+}
+
 /* Return true if T is a gimple address.  */
 
 bool
diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
index 1ad1432bd17..20546ca5b99 100644
--- a/gcc/gimple-expr.h
+++ b/gcc/gimple-expr.h
@@ -41,6 +41,7 @@ extern void gimple_cond_get_ops_from_tree (tree, enum 
tree_code *, tree *,
   tree *);
 extern bool is_gimple_lvalue (tree);
 extern bool is_gimple_condexpr (tree);
+extern bool is_possibly_trapping_gimple_condexpr (tree);
 extern bool is_gimple_address (const_tree);
 extern bool is_gimple_invariant_address (const_tree);
 extern bool is_gimple_ip_invariant_address (const_tree);
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index daa0b71c191..4e6256390c0 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -12973,6 +12973,7 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, 
gimple_seq *post_p,
   else if (gimple_test_f == is

[Ada] Fix infinite recursion on dynamic record type with -gnatR4

2019-08-30 Thread Eric Botcazou
An oversight when -gnatR4 was invented.

Tested on x86_64-suse-linux, applied on the mainline.


2019-08-30  Eric Botcazou  

* gcc-interface/decl.c (annotate_value) : Inline the call
also if List_Representation_Info is greater than 3.

-- 
Eric BotcazouIndex: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 275174)
+++ gcc-interface/decl.c	(working copy)
@@ -8398,7 +8398,7 @@ annotate_value (tree gnu_size)
   /* In regular mode, inline back only if symbolic annotation is requested
 	 in order to avoid memory explosion on big discriminated record types.
 	 But not in ASIS mode, as symbolic annotation is required for DDA.  */
-  if (List_Representation_Info == 3 || type_annotate_only)
+  if (List_Representation_Info >= 3 || type_annotate_only)
 	{
 	  tree t = maybe_inline_call_in_expr (gnu_size);
 	  return t ? annotate_value (t) : No_Uint;


Re: [ARM/FDPIC v5 02/21] [ARM] FDPIC: Handle arm*-*-uclinuxfdpiceabi in configure scripts

2019-08-30 Thread Christophe Lyon
On Fri, 30 Aug 2019 at 11:00, Richard Sandiford
 wrote:
>
> Christophe Lyon  writes:
> > @@ -785,7 +785,7 @@ case ${target} in
> >esac
> >tmake_file="t-slibgcc"
> >case $target in
> > -*-*-linux* | frv-*-*linux* | *-*-kfreebsd*-gnu | *-*-kopensolaris*-gnu)
> > +*-*-linux* | frv-*-*linux* | *-*-kfreebsd*-gnu | *-*-kopensolaris*-gnu 
> >  | *-*-uclinuxfdpiceabi)
> >:;;
> >  *-*-gnu*)
> >native_system_header_dir=/include
>
> I don't think this is necessary, since this target will never match the
> following *-*-gnu*) stanza anyway.
OK (I thought it was clearer to add the fdpic config where we already
have linux that would not match)

>
> > diff --git a/libtool.m4 b/libtool.m4
> > index 8966762..64e507a 100644
> > --- a/libtool.m4
> > +++ b/libtool.m4
> > @@ -3734,7 +3739,7 @@ m4_if([$1], [CXX], [
> >   ;;
> >   esac
> >   ;;
> > -  linux* | k*bsd*-gnu | kopensolaris*-gnu)
> > +  linux* | k*bsd*-gnu | kopensolaris*-gnu | uclinux*)
> >   case $cc_basename in
> > KCC*)
> >   # KAI C++ Compiler
>
> Is this needed?  It seems to be in the !GCC branch of an if/else.
I must admit I didn't test this case. I thought it was needed because
this target does not match "linux*", in case someone tries to compile
with another compiler...


>
> If it is needed, the default:
>
> _LT_TAGVAR(lt_prog_compiler_can_build_shared, $1)=no
>
> seems correct for non-FDPIC uclinux.
>
So, either use uclinuxfdpiceabi above, or do nothing and do not try to
support other compilers?

> > @@ -4032,7 +4037,7 @@ m4_if([$1], [CXX], [
> >_LT_TAGVAR(lt_prog_compiler_static, $1)='-non_shared'
> >;;
> >
> > -linux* | k*bsd*-gnu | kopensolaris*-gnu)
> > +linux* | k*bsd*-gnu | kopensolaris*-gnu | uclinux*)
> >case $cc_basename in
> ># old Intel for x86_64 which still supported -KPIC.
> >ecc*)
>
> Same here.
>
> > @@ -5946,7 +5951,7 @@ if test "$_lt_caught_CXX_error" != yes; then
> >  _LT_TAGVAR(inherit_rpath, $1)=yes
> >  ;;
> >
> > -  linux* | k*bsd*-gnu | kopensolaris*-gnu)
> > +  linux* | k*bsd*-gnu | kopensolaris*-gnu | uclinuxfdpiceabi)
> >  case $cc_basename in
> >KCC*)
> >   # Kuck and Associates, Inc. (KAI) C++ Compiler
>
> Here too the code seems to be dealing specifically with non-GCC compilers.
>
> > @@ -6598,7 +6603,7 @@ interix[[3-9]]*)
> >_LT_TAGVAR(postdeps,$1)=
> >;;
> >
> > -linux*)
> > +linux* | uclinux*)
> >case `$CC -V 2>&1 | sed 5q` in
> >*Sun\ C*)
> >  # Sun C++ 5.9
>
> Here too.  (It only seems to do anything for Sun's C compiler.)
>
> The fewer hunks we have to maintain downstream the better :-)
>
Sure.

I thought safer/cleaner to prepare the cases for non-GCC compilers, I
guess it's better not to add that until proven useful?

Thanks,

Christophe


> Richard


Re: [ARM/FDPIC v5 06/21] [ARM] FDPIC: Add support for c++ exceptions

2019-08-30 Thread Jonathan Wakely

On 30/08/19 10:02 +0100, Kyrill Tkachov wrote:
As Richard mentioned in an earlier post the generic libgcc and 
libstdc++ changes will need approval from the relevant maintainers.


CC'ing the libstdc++ list and the libgcc maintainer.


The libstdc++ change is OK for trunk.



On 5/15/19 1:39 PM, Christophe Lyon wrote:

The main difference with existing support is that function addresses
are function descriptor addresses instead. This means that all code
dealing with function pointers now has to cope with function
descriptors instead.

For the same reason, Linux kernel helpers can no longer be called by
dereferencing their address, so we implement wrappers that directly
call the kernel helpers.

When restoring a function address, we also have to restore the FDPIC
register value (r9).

2019-XX-XX  Christophe Lyon  
    Mickaël Guêné 

    gcc/
    * ginclude/unwind-arm-common.h (unwinder_cache): Add reserved5
    field.
    (FDPIC_REGNUM): New define.

    libgcc/
    * config/arm/linux-atomic.c (__kernel_cmpxchg): Add FDPIC support.
    (__kernel_dmb): Likewise.
    (__fdpic_cmpxchg): New function.
    (__fdpic_dmb): New function.
    * config/arm/unwind-arm.h (FDPIC_REGNUM): New define.
    (gnu_Unwind_Find_got): New function.
    (_Unwind_decode_typeinfo_ptr): Add FDPIC support.
    * unwind-arm-common.inc (UCB_PR_GOT): New.
    (funcdesc_t): New struct.
    (get_eit_entry): Add FDPIC support.
    (unwind_phase2): Likewise.
    (unwind_phase2_forced): Likewise.
    (__gnu_Unwind_RaiseException): Likewise.
    (__gnu_Unwind_Resume): Likewise.
    (__gnu_Unwind_Backtrace): Likewise.
    * unwind-pe.h (read_encoded_value_with_base): Likewise.

    libstdc++/
    * libsupc++/eh_personality.cc (get_ttype_entry): Add FDPIC
    support.

Change-Id: I64b81cfaf390a05f2fd121f44ba1912cb4b47cae

diff --git a/gcc/ginclude/unwind-arm-common.h 
b/gcc/ginclude/unwind-arm-common.h

index 6df783e..d4eb03e 100644
--- a/gcc/ginclude/unwind-arm-common.h
+++ b/gcc/ginclude/unwind-arm-common.h
@@ -91,7 +91,7 @@ extern "C" {
   _uw reserved2;  /* Personality routine address */
   _uw reserved3;  /* Saved callsite address */
   _uw reserved4;  /* Forced unwind stop arg */
- _uw reserved5;
+ _uw reserved5;  /* Personality routine GOT value in FDPIC 
mode.  */

 }
   unwinder_cache;
   /* Propagation barrier cache (valid after phase 1): */
diff --git a/libgcc/config/arm/linux-atomic.c 
b/libgcc/config/arm/linux-atomic.c

index 06a6d46..565f829 100644
--- a/libgcc/config/arm/linux-atomic.c
+++ b/libgcc/config/arm/linux-atomic.c
@@ -25,11 +25,62 @@ see the files COPYING3 and COPYING.RUNTIME 
respectively.  If not, see


 /* Kernel helper for compare-and-exchange.  */
 typedef int (__kernel_cmpxchg_t) (int oldval, int newval, int *ptr);
-#define __kernel_cmpxchg (*(__kernel_cmpxchg_t *) 0x0fc0)
+
+#define STR(X) #X
+#define XSTR(X) STR(X)
+
+#define KERNEL_CMPXCHG 0x0fc0
+
+#if __FDPIC__
+/* Non-FDPIC ABIs call __kernel_cmpxchg directly by dereferencing its
+   address, but under FDPIC we would generate a broken call
+   sequence. That's why we have to implement __kernel_cmpxchg and
+   __kernel_dmb here: this way, the FDPIC call sequence works.  */
+#define __kernel_cmpxchg __fdpic_cmpxchg
+#else
+#define __kernel_cmpxchg (*(__kernel_cmpxchg_t *) KERNEL_CMPXCHG)
+#endif

 /* Kernel helper for memory barrier.  */
 typedef void (__kernel_dmb_t) (void);
-#define __kernel_dmb (*(__kernel_dmb_t *) 0x0fa0)
+
+#define KERNEL_DMB 0x0fa0
+
+#if __FDPIC__
+#define __kernel_dmb __fdpic_dmb
+#else
+#define __kernel_dmb (*(__kernel_dmb_t *) KERNEL_DMB)
+#endif
+
+#if __FDPIC__
+static int __fdpic_cmpxchg (int oldval, int newval, int *ptr)
+{
+  int result;
+
+  asm volatile (
+   "ldr    ip, 1f\n\t"
+   "bx ip\n\t"
+   "1:\n\t"
+   ".word " XSTR(KERNEL_CMPXCHG) "\n\t"
+   : "=r" (result)
+   : "r" (oldval) , "r" (newval), "r" (ptr)
+   : "r3", "memory");
+  /* The result is actually returned by the kernel helper, we need
+ this to avoid a warning.  */
+  return result;
+}
+
+static void __fdpic_dmb (void)
+{
+  asm volatile (
+   "ldr    ip, 1f\n\t"
+   "bx ip\n\t"
+   "1:\n\t"
+   ".word " XSTR(KERNEL_DMB) "\n\t"
+   );
+}
+
+#endif

 /* Note: we implement byte, short and int versions of atomic 
operations using
    the above kernel helpers; see linux-atomic-64bit.c for "long 
long" (64-bit)
diff --git a/libgcc/config/arm/unwind-arm.h 
b/libgcc/config/arm/unwind-arm.h

index 43c5379..2bf320a 100644
--- a/libgcc/config/arm/unwind-arm.h
+++ b/libgcc/config/arm/unwind-arm.h
@@ -33,9 +33,33 @@
 /* Use IP as a scratch register within the personality routine.  */
 #define UNWIND_POINTER_REG 12

+#define FDPIC_REGNUM 9
+
+#define

Re: [ARM/FDPIC v5 02/21] [ARM] FDPIC: Handle arm*-*-uclinuxfdpiceabi in configure scripts

2019-08-30 Thread Jonathan Wakely

On 29/08/19 16:54 +0200, Christophe Lyon wrote:

On 12/07/2019 08:49, Richard Sandiford wrote:

Christophe Lyon  writes:

The new arm-uclinuxfdpiceabi target behaves pretty much like
arm-linux-gnueabi. In order the enable the same set of features, we
have to update several configure scripts that generally match targets
like *-*-linux*: in most places, we add *-uclinux* where there is
already *-linux*, or uclinux* when there is already linux*.

In gcc/config.gcc and libgcc/config.host we use *-*-uclinuxfdpiceabi
because there is already a different behaviour for *-*uclinux* target.

In libtool.m4, we use uclinuxfdpiceabi in cases where ELF shared
libraries support is required, as uclinux does not guarantee that.

2019-XX-XX  Christophe Lyon  

config/
* futex.m4: Handle *-uclinux*.
* tls.m4 (GCC_CHECK_TLS): Likewise.

gcc/
* config.gcc: Handle *-*-uclinuxfdpiceabi.

libatomic/
* configure.tgt: Handle arm*-*-uclinux*.
* configure: Regenerate.

libgcc/
* config.host: Handle *-*-uclinuxfdpiceabi.

libitm/
* configure.tgt: Handle *-*-uclinux*.
* configure: Regenerate.

libstdc++-v3/
* acinclude.m4: Handle uclinux*.
* configure: Regenerate.
* configure.host: Handle uclinux*

* libtool.m4: Handle uclinux*.


Has the libtool.m4 patch been submitted to upstream libtool?
I think this is supposed to be handled by submitting there first
and then cherry-picking into gcc, so that the change isn't lost
by a future import.


I added a comment to libtool.m4 about this.


[...]

diff --git a/config/tls.m4 b/config/tls.m4
index 1a5fc59..a487aa4 100644
--- a/config/tls.m4
+++ b/config/tls.m4
@@ -76,7 +76,7 @@ AC_DEFUN([GCC_CHECK_TLS], [
  dnl Shared library options may depend on the host; this check
  dnl is only known to be needed for GNU/Linux.
  case $host in
-   *-*-linux*)
+   *-*-linux* | -*-uclinux*)
  LDFLAGS="-shared -Wl,--no-undefined $LDFLAGS"
  ;;
  esac


Is this right for all uclinux targets?

I don't think so, now restricted to -*-uclinuxfdpic*




diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 84258d8..cb0fdc5 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4


It'd probably be worth splitting out the libstdc++-v3 bits and
submitting them separately, cc:ing libstd...@gcc.gnu.org.  But...


I've now split the patch into two parts (both attached here)



@@ -1404,7 +1404,7 @@ AC_DEFUN([GLIBCXX_ENABLE_LIBSTDCXX_TIME], [
 ac_has_nanosleep=yes
 ac_has_sched_yield=yes
 ;;
-  gnu* | linux* | kfreebsd*-gnu | knetbsd*-gnu)
+  gnu* | linux* | kfreebsd*-gnu | knetbsd*-gnu | uclinux*)
 AC_MSG_CHECKING([for at least GNU libc 2.17])
 AC_TRY_COMPILE(
   [#include ],


is this the right thing to do?  It seems odd to be testing the glibc
version for uclibc.

Do you want to support multiple possible settings of
ac_has_clock_monotonic and ac_has_clock_realtime?  Or could you just
hard-code the values, given particular baseline assumptions about the
version of uclibc etc.?  Hard-coding would then make


@@ -1526,7 +1526,7 @@ AC_DEFUN([GLIBCXX_ENABLE_LIBSTDCXX_TIME], [
   if test x"$ac_has_clock_monotonic" != x"yes"; then
 case ${target_os} in
-  linux*)
+  linux* | uclinux*)
AC_MSG_CHECKING([for clock_gettime syscall])
AC_TRY_COMPILE(
  [#include 


...this redundant.


Right, now fixed.


@@ -2415,7 +2415,7 @@ AC_DEFUN([GLIBCXX_ENABLE_CLOCALE], [
   # Default to "generic".
   if test $enable_clocale_flag = auto; then
 case ${target_os} in
-  linux* | gnu* | kfreebsd*-gnu | knetbsd*-gnu)
+  linux* | gnu* | kfreebsd*-gnu | knetbsd*-gnu | uclinux*)
enable_clocale_flag=gnu
;;
   darwin*)


This too seems to be choosing a glibc setting for a uclibc target.

Indeed.




@@ -2661,7 +2661,7 @@ AC_DEFUN([GLIBCXX_ENABLE_ALLOCATOR], [
   # Default to "new".
   if test $enable_libstdcxx_allocator_flag = auto; then
 case ${target_os} in
-  linux* | gnu* | kfreebsd*-gnu | knetbsd*-gnu)
+  linux* | gnu* | kfreebsd*-gnu | knetbsd*-gnu | uclinux*)
enable_libstdcxx_allocator_flag=new
;;
   *)


The full case is:

  # Probe for host-specific support if no specific model is specified.
  # Default to "new".
  if test $enable_libstdcxx_allocator_flag = auto; then
case ${target_os} in
  linux* | gnu* | kfreebsd*-gnu | knetbsd*-gnu)
enable_libstdcxx_allocator_flag=new
;;
  *)
enable_libstdcxx_allocator_flag=new
;;
esac
  fi

which looks a bit redundant :-)


Right :-)

Thanks,

Christophe



Thanks,
Richard
.






From 81c84839b8f004b7b52317850f27f58e05bec6ad Mon Sep 17 00:00:00 2001
From: Christophe Lyon 
Date: Fri, 4 May 2018 15:11:35 +
Subject: [ARM/FDPIC v6 02/24] [ARM] FDPIC: Han

[PATCH] Optimize to_chars

2019-08-30 Thread Antony Polukhin
Bunch of micro optimizations for std::to_chars:
* For base == 8 replacing the lookup in __digits table with arithmetic
computations leads to a same CPU cycles for a loop (exchanges two
movzx with 3 bit ops https://godbolt.org/z/RTui7m ). However this
saves 129 bytes of data and totally avoids a chance of cache misses on
__digits.
* For base == 16 replacing the lookup in __digits table with
arithmetic computations leads to a few additional instructions, but
totally avoids a chance of cache misses on __digits (- ~9 cache misses
for worst case) and saves 513 bytes of const data.
* Replacing __first[pos] and __first[pos - 1] with __first[1] and
__first[0] on final iterations saves ~2% of code size.
* Removing trailing '\0' from arrays of digits allows the linker to
merge the symbols (so that "0123456789abcdefghijklmnopqrstuvwxyz" and
"0123456789abcdef" could share the same address). This improves data
locality and reduces binary sizes.
* Using __detail::__to_chars_len_2 instead of a generic
__detail::__to_chars_len makes the operation O(1) instead of O(N). It
also makes the code two times shorter ( https://godbolt.org/z/Peq_PG)
.

In sum: this significantly reduces the size of a binary (for about
4KBs only for base-8 conversion https://godbolt.org/z/WPKijS ), deals
with latency (CPU cache misses) without changing the iterations count
and without adding costly instructions into the loops.

Changelog:
* include/std/charconv (__detail::__to_chars_8,
__detail::__to_chars_16): Replace array of precomputed digits
with arithmetic operations to avoid CPU cache misses. Remove
zero termination from array of digits to allow symbol merge with
generic implementation of __detail::__to_chars. Replace final
offsets with constants. Use __detail::__to_chars_len_2 instead
of a generic __detail::__to_chars_len.
* include/std/charconv (__detail::__to_chars): Remove
zero termination from array of digits.
* include/std/charconv (__detail::__to_chars_2): Leading digit
is always '1'.

-- 
Best regards,
Antony Polukhin
diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog
index eaa6f74..35706d0 100644
--- a/libstdc++-v3/ChangeLog
+++ b/libstdc++-v3/ChangeLog
@@ -1,3 +1,17 @@
+2019-08-30  Antony Polukhin  
+
+   * include/std/charconv (__detail::__to_chars_8,
+   __detail::__to_chars_16): Replace array of precomputed digits
+   with arithmetic operations to avoid CPU cache misses. Remove
+   zero termination from array of digits to allow symbol merge with
+   generic implementation of __detail::__to_chars. Replace final
+   offsets with constants. Use __detail::__to_chars_len_2 instead
+   of a generic __detail::__to_chars_len.
+   * include/std/charconv (__detail::__to_chars): Remove
+   zero termination from array of digits.
+   * include/std/charconv (__detail::__to_chars_2): Leading digit
+   is always '1'.
+
 2019-08-29  Jonathan Wakely  
 
PR libstdc++/91067
diff --git a/libstdc++-v3/include/std/charconv 
b/libstdc++-v3/include/std/charconv
index 53aa63e..4e94c39 100644
--- a/libstdc++-v3/include/std/charconv
+++ b/libstdc++-v3/include/std/charconv
@@ -131,7 +131,7 @@ namespace __detail
: 1u;
}
   else
-   return __to_chars_len(__value, 8);
+   return (__to_chars_len_2(__value) + 2) / 3;
 }
 
   // Generic implementation for arbitrary bases.
@@ -155,8 +155,12 @@ namespace __detail
 
   unsigned __pos = __len - 1;
 
-  static constexpr char __digits[]
-   = "0123456789abcdefghijklmnopqrstuvwxyz";
+  static constexpr char __digits[] = {
+   '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
+   'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
+   'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't',
+   'u', 'v', 'w', 'x', 'y', 'z'
+  };
 
   while (__val >= __base)
{
@@ -181,7 +185,7 @@ namespace __detail
 
   to_chars_result __res;
 
-  const unsigned __len = __to_chars_len(__val, 0x10);
+  const unsigned __len = (__to_chars_len_2(__val) + 3) / 4;
 
   if (__builtin_expect((__last - __first) < __len, 0))
{
@@ -190,32 +194,30 @@ namespace __detail
  return __res;
}
 
-  static constexpr char __digits[513] =
-   "000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f"
-   "202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f"
-   "404142434445464748494a4b4c4d4e4f505152535455565758595a5b5c5d5e5f"
-   "606162636465666768696a6b6c6d6e6f707172737475767778797a7b7c7d7e7f"
-   "808182838485868788898a8b8c8d8e8f909192939495969798999a9b9c9d9e9f"
-   "a0a1a2a3a4a5a6a7a8a9aaabacadaeafb0b1b2b3b4b5b6b7b8b9babbbcbdbebf"
-   "c0c1c2c3c4c5c6c7c8c9cacbcccdcecfd0d1d2d3d4d5d6d7d8d9dadbdcdddedf"
-   "e0e1e2e3e4e5e6e7e8e9eaebecedeeeff0f1f2f3f4f5f6f7f8f9fafbfcfdfeff";
+  static constexpr char __digits[] = {
+   '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
+   'a',

[Ada] Add assertion for Size attribute et al.

2019-08-30 Thread Eric Botcazou
The type to which, or the type of the object to which, it is applied must be 
frozen when the Size attribute, or a variant thereof, is processed.

Tested on x86_64-suse-linux, applied on the mainline.


2019-08-30  Eric Botcazou  

* gcc-interface/trans.c (Attribute_to_gnu) : Add assertion.

-- 
Eric BotcazouIndex: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 275174)
+++ gcc-interface/trans.c	(working copy)
@@ -2351,6 +2351,9 @@ Attribute_to_gnu (Node_Id gnat_node, tree *gnu_res
 	gnu_type = TREE_TYPE (DECL_CHAIN (TYPE_FIELDS (gnu_type)));
 	}
 
+  /* The type must be frozen at this point.  */
+  gcc_assert (COMPLETE_TYPE_P (gnu_type));
+
   /* If we're looking for the size of a field, return the field size.  */
   if (TREE_CODE (gnu_prefix) == COMPONENT_REF)
 	gnu_result = DECL_SIZE (TREE_OPERAND (gnu_prefix, 1));


Re: [PATCH v2 0/9] S/390: Use signaling FP comparison instructions

2019-08-30 Thread Ilya Leoshkevich
> Am 30.08.2019 um 09:16 schrieb Richard Biener :
> 
> On Fri, Aug 30, 2019 at 9:12 AM Richard Biener
>  wrote:
>> 
>> On Thu, Aug 29, 2019 at 5:39 PM Ilya Leoshkevich  wrote:
>>> 
 Am 22.08.2019 um 15:45 schrieb Ilya Leoshkevich :
 
 Bootstrap and regtest running on x86_64-redhat-linux and
 s390x-redhat-linux.
 
 This patch series adds signaling FP comparison support (both scalar and
 vector) to s390 backend.
>>> 
>>> I'm running into a problem on ppc64 with this patch, and it would be
>>> great if someone could help me figure out the best way to resolve it.
>>> 
>>> vector36.C test is failing because gimplifier produces the following
>>> 
>>>  _5 = _4 > { 2.0e+0, 2.0e+0, 2.0e+0, 2.0e+0 };
>>>  _6 = VEC_COND_EXPR <_5, { -1, -1, -1, -1 }, { 0, 0, 0, 0 }>;
>>> 
>>> from
>>> 
>>>  VEC_COND_EXPR < (*b > { 2.0e+0, 2.0e+0, 2.0e+0, 2.0e+0 }) ,
>>>  { -1, -1, -1, -1 } ,
>>>  { 0, 0, 0, 0 } >
>>> 
>>> Since the comparison tree code is now hidden behind a temporary, my code
>>> does not have anything to pass to the backend.  The reason for creating
>>> a temporary is that the comparison can trap, and so the following check
>>> in gimplify_expr fails:
>>> 
>>>  if (gimple_seq_empty_p (internal_post) && (*gimple_test_f) (*expr_p))
>>>goto out;
>>> 
>>> gimple_test_f is is_gimple_condexpr, and it eventually calls
>>> operation_could_trap_p (GT).
>>> 
>>> My current solution is to simply state that backend does not support
>>> SSA_NAME in vector comparisons, however, I don't like it, since it may
>>> cause performance regressions due to having to fall back to scalar
>>> comparisons.
>>> 
>>> I was thinking about two other possible solutions:
>>> 
>>> 1. Change the gimplifier to allow trapping vector comparisons.  That's
>>>   a bit complicated, because tree_could_throw_p checks not only for
>>>   floating point traps, but also e.g. for array index out of bounds
>>>   traps.  So I would have to create a tree_could_throw_p version which
>>>   disregards specific kinds of traps.
>>> 
>>> 2. Change expand_vector_condition to follow SSA_NAME_DEF_STMT and use
>>>   its tree_code instead of SSA_NAME.  The potential problem I see with
>>>   this is that there appears to be no guarantee that _5 will be inlined
>>>   into _6 at a later point.  So if we say that we don't need to fall
>>>   back to scalar comparisons based on availability of vector >
>>>   instruction and inlining does not happen, then what's actually will
>>>   be required is vector selection (vsel on S/390), which might not be
>>>   available in general case.
>>> 
>>> What would be a better way to proceed here?
>> 
>> On GIMPLE there isn't a good reason to split out trapping comparisons
>> from [VEC_]COND_EXPR - the gimplifier does this for GIMPLE_CONDs
>> where it is important because we'd have no way to represent EH info
>> when not done.  It might be a bit awkward to preserve EH across RTL
>> expansion though in case the [VEC_]COND_EXPR are not expanded
>> as a single pattern, but I'm not sure.
>> 
>> To go this route you'd have to split the is_gimple_condexpr check
>> I guess and eventually users turning [VEC_]COND_EXPR into conditional
>> code (do we have any?) have to be extra careful then.
> 
> Oh, btw - the fact that we have an expression embedded in [VEC_]COND_EXPR
> is something that bothers me for quite some time already and it makes
> things like VN awkward and GIMPLE fincky.  We've discussed alternatives
> to dead with the simplest being moving the comparison out to a separate
> stmt and others like having four operand [VEC_]COND_{EQ,NE,...}_EXPR
> codes or simply treating {EQ,NE,...}_EXPR as quarternary on GIMPLE
> with either optional 3rd and 4th operand (defaulting to 
> boolean_true/false_node)
> or always explicit ones (and thus dropping [VEC_]COND_EXPR).
> 
> What does LLVM do here?

For

void f(long long * restrict w, double * restrict x, double * restrict y, int n)
{
for (int i = 0; i < n; i++)
w[i] = x[i] == y[i] ? x[i] : y[i];
}

LLVM does

  %26 = fcmp oeq <2 x double> %21, %25
  %27 = extractelement <2 x i1> %26, i32 0
  %28 = select <2 x i1> %26, <2 x double> %21, <2 x double> %25

So they have separate operations for comparisons and ternary operator
(fcmp + select).


Backports to 7.5 (part 2)

2019-08-30 Thread Jakub Jelinek
Hi!

I've backported 97 commits from trunk to 7.5, bootstrapped/regtested them on
x86_64-linux and i686-linux and committed.
Here is the second half of them.

Jakub
2019-08-30  Jakub Jelinek  

Backported from mainline
2019-02-20  Jakub Jelinek  

PR middle-end/88074
PR middle-end/89415
* toplev.c (do_compile): Double the emin/emax exponents to workaround
buggy mpc_norm.

2019-02-19  Richard Biener  

PR middle-end/88074
* toplev.c (do_compile): Initialize mpfr's exponent range
based on available float modes.

2019-02-20  Jakub Jelinek  

PR middle-end/88074
PR middle-end/89415
* gcc.dg/pr88074-2.c: New test.

2019-02-19  Richard Biener  

PR middle-end/88074
* gcc.dg/pr88074.c: New testcase.

--- gcc/toplev.c(revision 270711)
+++ gcc/toplev.c(revision 270712)
@@ -1981,6 +1981,36 @@ do_compile ()
else
  int_n_enabled_p[i] = false;
 
+  /* Initialize mpfrs exponent range.  This is important to get
+ underflow/overflow in a reasonable timeframe.  */
+  machine_mode mode;
+  int min_exp = -1;
+  int max_exp = 1;
+  for (mode = GET_CLASS_NARROWEST_MODE (MODE_FLOAT);
+  mode != VOIDmode;
+  mode = GET_MODE_WIDER_MODE (mode))
+   if (SCALAR_FLOAT_MODE_P (mode))
+ {
+   const real_format *fmt = REAL_MODE_FORMAT (mode);
+   if (fmt)
+ {
+   /* fmt->emin - fmt->p + 1 should be enough but the
+  back-and-forth dance in real_to_decimal_for_mode we
+  do for checking fails due to rounding effects then.  */
+   if ((fmt->emin - fmt->p) < min_exp)
+ min_exp = fmt->emin - fmt->p;
+   if (fmt->emax > max_exp)
+ max_exp = fmt->emax;
+ }
+ }
+  /* E.g. mpc_norm assumes it can square a number without bothering with
+with range scaling, so until that is fixed, double the minimum
+and maximum exponents, plus add some buffer for arithmetics
+on the squared numbers.  */
+  if (mpfr_set_emin (2 * (min_exp - 1))
+ || mpfr_set_emax (2 * (max_exp + 1)))
+   sorry ("mpfr not configured to handle all float modes");
+
   /* Set up the back-end if requested.  */
   if (!no_backend)
backend_init ();
--- gcc/testsuite/gcc.dg/pr88074.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/pr88074.c  (revision 270712)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+
+#include 
+
+int main()
+{
+  _Complex double x;
+  __real x = 3.091e+8;
+  __imag x = -4.045e+8;
+  /* This used to spend huge amounts of compile-time inside mpc.  */
+  volatile _Complex double y = ctan (x);
+  return 0;
+}
--- gcc/testsuite/gcc.dg/pr88074-2.c(nonexistent)
+++ gcc/testsuite/gcc.dg/pr88074-2.c(revision 270712)
@@ -0,0 +1,17 @@
+/* PR middle-end/88074 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-add-options float128 } */
+/* { dg-require-effective-target float128 } */
+/* { dg-final { scan-tree-dump-not "link_error " "optimized" } } */
+
+extern void link_error (void);
+int
+main ()
+{
+  if (((__FLT128_MAX__ * 0.5 + __FLT128_MAX__ * 0.5i)
+   / (__FLT128_MAX__ * 0.25 + __FLT128_MAX__ * 0.25i))
+  != (_Complex _Float128) 2)
+link_error ();
+  return 0;
+}
2019-08-30  Jakub Jelinek  

Backported from mainline
2019-02-20  Jakub Jelinek  
David Malcolm  

PR middle-end/89091
* fold-const.c (decode_field_reference): Return NULL_TREE if
lang_hooks.types.type_for_size returns NULL.  Check it before
overwriting *exp_.  Use return NULL_TREE instead of return 0.

* gcc.dg/torture/pr89091.c: New test.

--- gcc/fold-const.c(revision 270712)
+++ gcc/fold-const.c(revision 270713)
@@ -4239,7 +4239,7 @@ decode_field_reference (location_t loc,
  There are problems with FP fields since the type_for_size call
  below can fail for, e.g., XFmode.  */
   if (! INTEGRAL_TYPE_P (TREE_TYPE (exp)))
-return 0;
+return NULL_TREE;
 
   /* We are interested in the bare arrangement of bits, so strip everything
  that doesn't affect the machine mode.  However, record the type of the
@@ -4255,7 +4255,7 @@ decode_field_reference (location_t loc,
   exp = TREE_OPERAND (exp, 0);
   STRIP_NOPS (exp); STRIP_NOPS (and_mask);
   if (TREE_CODE (and_mask) != INTEGER_CST)
-   return 0;
+   return NULL_TREE;
 }
 
   poly_int64 poly_bitsize, poly_bitpos;
@@ -4271,7 +4271,11 @@ decode_field_reference (location_t loc,
   || (! AGGREGATE_TYPE_P (TREE_TYPE (inner))
  && compare_tree_int (TYPE_SIZE (TREE_TYPE (inner)),
   *pbitpos + *pbitsize) < 0))
-return 0;
+return NULL_TREE;
+
+  unsigned_type = lang_hooks.types.type_for_siz

Backports to 7.5 (part 1)

2019-08-30 Thread Jakub Jelinek
Hi!

I've backported 97 commits from trunk to 7.5, bootstrapped/regtested them on
x86_64-linux and i686-linux and committed.
Here is the first half of them, second half will follow in another mail.

Jakub
2019-08-30  Jakub Jelinek  

Backported from mainline
2018-10-19  Jakub Jelinek  

PR middle-end/85488
PR middle-end/87649
* omp-low.c (check_omp_nesting_restrictions): Diagnose ordered without
depend closely nested inside of loop with ordered clause with
a parameter.

* c-c++-common/gomp/doacross-2.c: New test.
* c-c++-common/gomp/sink-3.c: Expect another error during error
recovery.

--- gcc/omp-low.c   (revision 265800)
+++ gcc/omp-low.c   (revision 265801)
@@ -2835,14 +2835,25 @@ check_omp_nesting_restrictions (gimple *
  case GIMPLE_OMP_FOR:
if (gimple_omp_for_kind (ctx->stmt) == GF_OMP_FOR_KIND_TASKLOOP)
  goto ordered_in_taskloop;
-   if (omp_find_clause (gimple_omp_for_clauses (ctx->stmt),
-OMP_CLAUSE_ORDERED) == NULL)
+   tree o;
+   o = omp_find_clause (gimple_omp_for_clauses (ctx->stmt),
+OMP_CLAUSE_ORDERED);
+   if (o == NULL)
  {
error_at (gimple_location (stmt),
  "% region must be closely nested inside "
  "a loop region with an % clause");
return false;
  }
+   if (OMP_CLAUSE_ORDERED_EXPR (o) != NULL_TREE
+   && omp_find_clause (c, OMP_CLAUSE_DEPEND) == NULL_TREE)
+ {
+   error_at (gimple_location (stmt),
+ "% region without % clause may "
+ "not be closely nested inside a loop region with "
+ "an % clause with a parameter");
+   return false;
+ }
return true;
  case GIMPLE_OMP_TARGET:
if (gimple_omp_target_kind (ctx->stmt)
--- gcc/testsuite/c-c++-common/gomp/sink-3.c(revision 265800)
+++ gcc/testsuite/c-c++-common/gomp/sink-3.c(revision 265801)
@@ -14,7 +14,7 @@ foo ()
   for (i=0; i < 100; ++i)
 {
 #pragma omp ordered depend(sink:poo-1,paa+1) /* { dg-error 
"poo.*declared.*paa.*declared" } */
-bar(&i);
+bar(&i);/* { dg-error "may not be closely 
nested" "" { target *-*-* } .-1 } */
 #pragma omp ordered depend(source)
 }
 }
--- gcc/testsuite/c-c++-common/gomp/doacross-2.c(nonexistent)
+++ gcc/testsuite/c-c++-common/gomp/doacross-2.c(revision 265801)
@@ -0,0 +1,49 @@
+/* PR middle-end/87649 */
+
+void
+foo (void)
+{
+  int i;
+  #pragma omp for ordered(1)
+  for (i = 0; i < 64; i++)
+{
+  #pragma omp ordered  /* { dg-error "'ordered' region 
without 'depend' clause may not be closely nested inside a loop region with an 
'ordered' clause with a parameter" } */
+  ;
+}
+  #pragma omp for ordered(1)
+  for (i = 0; i < 64; i++)
+{
+  #pragma omp ordered threads  /* { dg-error "'ordered' region 
without 'depend' clause may not be closely nested inside a loop region with an 
'ordered' clause with a parameter" } */
+  ;
+}
+}
+
+void
+bar (void)
+{
+  int i;
+  #pragma omp for ordered
+  for (i = 0; i < 64; i++)
+{
+  #pragma omp ordered depend(source)   /* { dg-error "'ordered' 
construct with 'depend' clause must be closely nested inside a loop with 
'ordered' clause with a parameter" } */
+  #pragma omp ordered depend(sink: i - 1)  /* { dg-error "'ordered' 
construct with 'depend' clause must be closely nested inside a loop with 
'ordered' clause with a parameter" } */
+}
+  #pragma omp for
+  for (i = 0; i < 64; i++)
+{
+  #pragma omp ordered depend(source)   /* { dg-error "'ordered' 
construct with 'depend' clause must be closely nested inside a loop with 
'ordered' clause with a parameter" } */
+  #pragma omp ordered depend(sink: i - 1)  /* { dg-error "'ordered' 
construct with 'depend' clause must be closely nested inside a loop with 
'ordered' clause with a parameter" } */
+}
+  #pragma omp for
+  for (i = 0; i < 64; i++)
+{
+  #pragma omp ordered  /* { dg-error "'ordered' region 
must be closely nested inside a loop region with an 'ordered' clause" } */
+  ;
+}
+  #pragma omp for
+  for (i = 0; i < 64; i++)
+{
+  #pragma omp ordered threads  /* { dg-error "'ordered' region 
must be closely nested inside a loop region with an 'ordered' clause" } */
+  ;
+}
+}
2019-08-30  Jakub Jelinek  

Backported from mainline
2018-10-20  Jakub Jelinek  

PR middle-end/87647
* varasm.c (decode_addr_const): Handle COMPOUND_LITERAL_EXPR.

* gcc.c-torture/compile/pr87647.c: New test.

--- gcc/varasm.c(revision 265801)
+++ gcc/varasm.

Re: C++ PATCH for P1152R4: Deprecating some uses of volatile (PR c++/91361)

2019-08-30 Thread Marek Polacek
On Thu, Aug 29, 2019 at 03:14:36PM -0600, Martin Sebor wrote:
> On 8/28/19 5:56 PM, Marek Polacek wrote:
> > --- gcc/doc/invoke.texi
> > +++ gcc/doc/invoke.texi
> > @@ -3516,6 +3516,19 @@ result in a call to @code{terminate}.
> >   Disable the warning about the case when a conversion function converts an
> >   object to the same type, to a base class of that type, or to void; such
> >   a conversion function will never be called.
> > +
> > +@item -Wvolatile @r{(C++ and Objective-C++ only)}
> > +@opindex Wvolatile
> > +@opindex Wno-volatile
> > +Warn about deprecated uses of the volatile qualifier.  This includes 
> > postfix
> > +and prefix @code{++} and @code{--} expressions of volatile-qualified types,
> > +using simple assignments where the left operand is a volatile-qualified
> > +non-class type for their value, compound assignments where the left operand
> > +is a volatile-qualified non-class type, volatile-qualified function return
> > +type, volatile-qualified parameter type, and structured bindings of a
> > +volatile-qualified type.  This usage was deprecated in C++20.
> 
> Just a minor thing: Since the text uses volatile as a keyword
> (as opposed to an adjective) it should be @code{volatile},
> analogously how it's quoted in warnings.

Oh right, thanks.  Fixed with:

2019-08-30  Marek Polacek  

* doc/invoke.texi (-Wvolatile): Use @code for volatile.

diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi
index aa9886ee09f..44a8801e558 100644
--- gcc/doc/invoke.texi
+++ gcc/doc/invoke.texi
@@ -3520,13 +3520,14 @@ a conversion function will never be called.
 @item -Wvolatile @r{(C++ and Objective-C++ only)}
 @opindex Wvolatile
 @opindex Wno-volatile
-Warn about deprecated uses of the volatile qualifier.  This includes postfix
-and prefix @code{++} and @code{--} expressions of volatile-qualified types,
-using simple assignments where the left operand is a volatile-qualified
-non-class type for their value, compound assignments where the left operand
-is a volatile-qualified non-class type, volatile-qualified function return
-type, volatile-qualified parameter type, and structured bindings of a
-volatile-qualified type.  This usage was deprecated in C++20.
+Warn about deprecated uses of the @code{volatile} qualifier.  This includes
+postfix and prefix @code{++} and @code{--} expressions of
+@code{volatile}-qualified types, using simple assignments where the left
+operand is a @code{volatile}-qualified non-class type for their value,
+compound assignments where the left operand is a @code{volatile}-qualified
+non-class type, @code{volatile}-qualified function return type,
+@code{volatile}-qualified parameter type, and structured bindings of a
+@code{volatile}-qualified type.  This usage was deprecated in C++20.
 
 Enabled by default with @option{-std=c++2a}.
 @end table


[PATCH] PR libstdc++/89164 enforce constraints for uninitialized algos

2019-08-30 Thread Jonathan Wakely

The memmove optimizations for std::uninitialized_copy/fill/_n will
compile even if the type is not copy constructible, because std::copy
doesn't require copy construction to work. But the uninitialized
algorithms do require it.

This adds explicit static assertions to ensure we don't allow ill-formed
initializations.

PR libstdc++/89164
* include/bits/stl_algobase.h (__copy_move): Give descriptive names
to template parameters.
* include/bits/stl_uninitialized.h (uninitialized_copy)
(uninitialized_fill, uninitialized_fill_n): Add static assertions to
diagnose invalid uses.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc:
Adjust expected error.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/89164.cc:
New test.
* testsuite/20_util/specialized_algorithms/uninitialized_copy_n/
89164.cc: New test.
* testsuite/20_util/specialized_algorithms/uninitialized_fill/89164.cc:
New test.
* testsuite/20_util/specialized_algorithms/uninitialized_fill_n/
89164.cc: New test.
* testsuite/23_containers/vector/cons/89164.cc: New test.
* testsuite/23_containers/vector/cons/89164_c++17.cc: New test.

This is the patch sent in
https://gcc.gnu.org/ml/libstdc++/2019-02/msg00034.html but with the
dg-error directives in the tests fixed.

Tested x86_64-linux, committed to trunk.


commit dde7756ade2ca2d9d886a4da66cdea8ddd493294
Author: Jonathan Wakely 
Date:   Fri Aug 30 14:46:42 2019 +0100

PR libstdc++/89164 enforce constraints for uninitialized algos

The memmove optimizations for std::uninitialized_copy/fill/_n will
compile even if the type is not copy constructible, because std::copy
doesn't require copy construction to work. But the uninitialized
algorithms do require it.

This adds explicit static assertions to ensure we don't allow ill-formed
initializations.

PR libstdc++/89164
* include/bits/stl_algobase.h (__copy_move): Give descriptive names
to template parameters.
* include/bits/stl_uninitialized.h (uninitialized_copy)
(uninitialized_fill, uninitialized_fill_n): Add static assertions to
diagnose invalid uses.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc:
Adjust expected error.
* 
testsuite/20_util/specialized_algorithms/uninitialized_copy/89164.cc:
New test.
* testsuite/20_util/specialized_algorithms/uninitialized_copy_n/
89164.cc: New test.
* 
testsuite/20_util/specialized_algorithms/uninitialized_fill/89164.cc:
New test.
* testsuite/20_util/specialized_algorithms/uninitialized_fill_n/
89164.cc: New test.
* testsuite/23_containers/vector/cons/89164.cc: New test.
* testsuite/23_containers/vector/cons/89164_c++17.cc: New test.

diff --git a/libstdc++-v3/include/bits/stl_algobase.h 
b/libstdc++-v3/include/bits/stl_algobase.h
index 36bb9ccb777..98d324827ed 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -359,7 +359,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // (2) If we're using random access iterators, then write the loop as
   // a for loop with an explicit count.
 
-  template
+  template
 struct __copy_move
 {
   template
diff --git a/libstdc++-v3/include/bits/stl_uninitialized.h 
b/libstdc++-v3/include/bits/stl_uninitialized.h
index b29395cb7c0..cf42cbd9c76 100644
--- a/libstdc++-v3/include/bits/stl_uninitialized.h
+++ b/libstdc++-v3/include/bits/stl_uninitialized.h
@@ -130,9 +130,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if __cplusplus < 201103L
   const bool __assignable = true;
 #else
-  // trivial types can have deleted assignment
+  // Trivial types can have deleted copy constructor, but the std::copy
+  // optimization that uses memmove would happily "copy" them anyway.
+  static_assert(is_constructible<_ValueType2, decltype(*__first)>::value,
+ "result type must be constructible from value type of input range");
+
   typedef typename iterator_traits<_InputIterator>::reference _RefType1;
   typedef typename iterator_traits<_ForwardIterator>::reference _RefType2;
+  // Trivial types can have deleted assignment, so using std::copy
+  // would be ill-formed. Require assignability before using std::copy:
   const bool __assignable = is_assignable<_RefType2, _RefType1>::value;
 #endif
 
@@ -197,7 +203,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if __cplusplus < 201103L
   const bool __assignable = true;
 #else
-  // trivial types can have deleted assignment
+  // Trivial types can have deleted copy constructor, but the std::fill
+  // optimization that uses memmove would happily "copy" them anyway.
+  static_assert(is_constructible<_ValueType, c

[Ada] Remove range check generation code in gigi

2019-08-30 Thread Eric Botcazou
The generation of range checks is now entirely done by the front-end proper so 
the code to that effect in gigi is obsolete and can be removed, but the patch 
implements a watchdog mechanism to make sure no check is lost in the process.

Tested on x86_64-suse-linux, applied on the mainline.


2019-08-30  Eric Botcazou  

* gcc-interface/gigi.h (gigi_checking_assert): New macro.
* gcc-interface/decl.c (gnat_to_gnu_entity) :
Remove redundant test and adjust comments.  Minor tweaks.
* gcc-interface/trans.c (Call_to_gnu): Do not generate range checks,
instead assert that the Do_Range_Check flag is not set.  Adjust call
to convert_with_check.
(gnat_to_gnu): Likewise.
(assoc_to_constructor): Likewise.
(pos_to_constructor): Likewise.  Remove GNAT_COMPONENT_TYPE parameter.
(emit_range_check): Delete.
(convert_with_check): Remove RANGE_P parameter and adjust.  Do a single
overflow check for modular types.

-- 
Eric BotcazouIndex: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 275062)
+++ gcc-interface/decl.c	(working copy)
@@ -447,13 +447,13 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gn
  If we are not defining it, it must be a type or an entity that is defined
  elsewhere or externally, otherwise we should have defined it already.  */
   gcc_assert (definition
-	  || type_annotate_only
 	  || is_type
 	  || kind == E_Discriminant
 	  || kind == E_Component
 	  || kind == E_Label
 	  || (kind == E_Constant && Present (Full_View (gnat_entity)))
-	  || Is_Public (gnat_entity));
+	  || Is_Public (gnat_entity)
+	  || type_annotate_only);
 
   /* Get the name of the entity and set up the line number and filename of
  the original definition for use in any decl we make.  Make sure we do
@@ -1758,34 +1758,29 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gn
 
 case E_Modular_Integer_Type:
   {
-	/* For modular types, make the unsigned type of the proper number
-	   of bits and then set up the modulus, if required.  */
-	tree gnu_modulus, gnu_high = NULL_TREE;
-
 	/* Packed Array Impl. Types are supposed to be subtypes only.  */
 	gcc_assert (!Is_Packed_Array_Impl_Type (gnat_entity));
 
+	/* For modular types, make the unsigned type of the proper number
+	   of bits and then set up the modulus, if required.  */
 	gnu_type = make_unsigned_type (esize);
 
-	/* Get the modulus in this type.  If it overflows, assume it is because
-	   it is equal to 2**Esize.  Note that there is no overflow checking
-	   done on unsigned type, so we detect the overflow by looking for
-	   a modulus of zero, which is otherwise invalid.  */
-	gnu_modulus = UI_To_gnu (Modulus (gnat_entity), gnu_type);
+	/* Get the modulus in this type.  If the modulus overflows, assume
+	   that this is because it was equal to 2**Esize.  Note that there
+	   is no overflow checking done on unsigned types, so we detect the
+	   overflow by looking for a modulus of zero, which is invalid.  */
+	tree gnu_modulus = UI_To_gnu (Modulus (gnat_entity), gnu_type);
 
+	/* If the modulus is not 2**Esize, then this also means that the upper
+	   bound of the type, i.e. modulus - 1, is not maximal, so we create an
+	   extra subtype to carry it and set the modulus on the base type.  */
 	if (!integer_zerop (gnu_modulus))
 	  {
+	TYPE_NAME (gnu_type) = create_concat_name (gnat_entity, "UMT");
 	TYPE_MODULAR_P (gnu_type) = 1;
 	SET_TYPE_MODULUS (gnu_type, gnu_modulus);
-	gnu_high = fold_build2 (MINUS_EXPR, gnu_type, gnu_modulus,
-build_int_cst (gnu_type, 1));
-	  }
-
-	/* If the upper bound is not maximal, make an extra subtype.  */
-	if (gnu_high
-	&& !tree_int_cst_equal (gnu_high, TYPE_MAX_VALUE (gnu_type)))
-	  {
-	TYPE_NAME (gnu_type) = create_concat_name (gnat_entity, "UMT");
+	tree gnu_high = fold_build2 (MINUS_EXPR, gnu_type, gnu_modulus,
+	 build_int_cst (gnu_type, 1));
 	gnu_type
 	  = create_extra_subtype (gnu_type, TYPE_MIN_VALUE (gnu_type),
   gnu_high);
@@ -2987,8 +2982,8 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gn
 		|| Present (Record_Extension_Part (record_definition)))
 	  record_definition = Record_Extension_Part (record_definition);
 
-	gcc_assert (type_annotate_only
-			|| Present (Parent_Subtype (gnat_entity)));
+	gcc_assert (Present (Parent_Subtype (gnat_entity))
+		|| type_annotate_only);
 	  }
 
 	/* Make a node for the record.  If we are not defining the record,
Index: gcc-interface/gigi.h
===
--- gcc-interface/gigi.h	(revision 275062)
+++ gcc-interface/gigi.h	(working copy)
@@ -6,7 +6,7 @@
  *  *
  *  C Header File   *
  * 

Re: [PATCH] Fix unused malloc return value warning

2019-08-30 Thread Jonathan Wakely

On 29/08/19 21:54 +0200, François Dumont wrote:

Hi

    I am having this warning:

/home/fdt/dev/gcc/git/libstdc++-v3/testsuite/util/testsuite_performance.h:170: 
attention: ignoring return value of « void* malloc(size_t) » declared 
with attribute « warn_unused_result » [-Wunused-result]

  170 |   malloc(0); // Needed for some implementations.

    Ok to fix it with attached patch ?


OK for trunk.

    It seems trivial but I wonder if I shouldn't keep the malloc 
returned pointer and free it properly ?


It's not causing any problems (it's only the testsuite) so let's not
worry about it.

    Or maybe just remove the malloc cause there is not clear comment 
explaining why it's needed and I haven't found much in SVN audit 
trail.


The comment says it's needed, so let's assume that's true.


    * testsuite_files/util/testsuite_performance.h
    (resource_counter::start): Ignore unused malloc(0) result.

François




diff --git a/libstdc++-v3/testsuite/util/testsuite_performance.h 
b/libstdc++-v3/testsuite/util/testsuite_performance.h
index 556c78159be..8abc77cf31a 100644
--- a/libstdc++-v3/testsuite/util/testsuite_performance.h
+++ b/libstdc++-v3/testsuite/util/testsuite_performance.h
@@ -167,7 +167,7 @@ namespace __gnu_test
{
  if (getrusage(who, &rusage_begin) != 0 )
memset(&rusage_begin, 0, sizeof(rusage_begin));
-  malloc(0); // Needed for some implementations.
+  void* p __attribute__((unused)) = malloc(0); // Needed for some 
implementations.
  allocation_begin = mallinfo();
}






[preprocessor] Popping "" file names

2019-08-30 Thread Nathan Sidwell
On the modules branch, I needed a bunch of tests checking the 
interaction of #include nesting and the declarations therein.  Doing 
this with the current testsuite infrastructure is quite awkward.


Using
  #  "name" [12]
directives works just fine, /except/ in the unnesting case.  There we 
need "name" to match the name of the main file.  But we don't know that, 
because it's /some/directory/src/gcc//mytest.C


Hence this change to the directive.  when unnesting, if the given 
filename is "", we simply use the filename we already expect to be 
popped to.


I've wanted this occasionally in the past, but my current requirements 
pushed me to implement this.


1) if the popped-to filename is actually "", the right thing happens 
anyway.  So existing sources using this form of line control continue to 
be accepted.


2) the IS_MAIN_FILE test about mismatches isn't needed -- if we're in 
the main file, we'll have a NULL 'from' linemap pointer.


3) We do not document this form of the line control.  Only the 
standard-mandated '#line' form, and that doesn't permit the trailing 
push/pop/system-header flags.


I intend to commit this next week, to allow for comments.

nathan
--
Nathan Sidwell
2019-08-30  Nathan Sidwell  

	New # semantics for popping to "" name.
	libcpp/
	* directives.c (do_linemarker): Popping to "" name means get the
	name from the include stack..

Index: libcpp/directives.c
===
--- libcpp/directives.c	(revision 275032)
+++ libcpp/directives.c	(working copy)
@@ -1085,7 +1085,15 @@ do_linemarker (cpp_reader *pfile)
   const line_map_ordinary *from
 	= linemap_included_from_linemap (line_table, map);
-  if (MAIN_FILE_P (map)
-	  || (from
-	  && filename_cmp (ORDINARY_MAP_FILE_NAME (from), new_file) != 0))
+
+  if (!from)
+	/* Not nested.  */;
+  else if (!new_file[0])
+	/* Leaving to "" means fill in the popped-to name.  */
+	new_file = ORDINARY_MAP_FILE_NAME (from);
+  else if (filename_cmp (ORDINARY_MAP_FILE_NAME (from), new_file) != 0)
+	/* It's the wrong name, Grommit!  */
+	from = NULL;
+
+  if (!from)
 	{
 	  cpp_warning (pfile, CPP_W_NONE,
@@ -1095,4 +1103,5 @@ do_linemarker (cpp_reader *pfile)
 	}
 }
+
   /* Compensate for the increment in linemap_add that occurs in
  _cpp_do_file_change.  We're currently at the start of the line
Index: gcc/testsuite/c-c++-common/cpp/line-1.c
===
--- gcc/testsuite/c-c++-common/cpp/line-1.c	(revision 0)
+++ gcc/testsuite/c-c++-common/cpp/line-1.c	(working copy)
@@ -0,0 +1,20 @@
+/* { dg-do preprocess } */
+/* { dg-additional-options -Wno-pedantic } */
+
+main-1 __FILE__
+
+# 7 "inner.h" 1
+inner-1 __FILE__
+# 9 "inside.h" 1
+inside-1 __FILE__
+# 11 "" 2
+inner-2 __FILE__
+#13 "" 2
+main-2 __FILE__
+
+
+/* { dg-final { scan-file line-1.i "main-1 \"\[^\n]*line-1.c\"\n" } } */
+/* { dg-final { scan-file line-1.i "main-2 \"\[^\n]*line-1.c\"\n" } } */
+/* { dg-final { scan-file line-1.i "inner-1 \"inner.h\"\n" } } */
+/* { dg-final { scan-file line-1.i "inner-2 \"inner.h\"\n" } } */
+/* { dg-final { scan-file line-1.i "inside-1 \"inside.h\"\n" } } */


Re: [PATCH] Add .pd extension to c_exts.

2019-08-30 Thread Richard Biener
On Fri, Aug 30, 2019 at 2:31 PM Alexander Monakov  wrote:
>
>
>
> On Fri, 30 Aug 2019, Richard Biener wrote:
>
> > On Fri, Aug 30, 2019 at 12:58 PM Martin Liška  wrote:
> > >
> > > Hi.
> > >
> > > I would like to add .pd to c_exts so that one
> > > can have correctly set tab width, etc.
> >
> > But then it will auto-indent with too much spaces, no?
>
> I think it's fine, the script does
>
>   setlocal cindent
>   setlocal cinoptions=>4,n-2,{2,^-2,:2,=2,g0,f0,h2,p4,t0,+2,(0,u0,w1,m0
>
> so it's a bit smarter than indenting everything by 2 spaces :)
>
> So +1 from me for the patch, but note that now we have a contrib/vim-gcc-dev
> "plugin" directory, it might be more natural to add gcc-match indent rules
> there.

Yeah, that would be better than claiming .pd is C ...

Richard.

>
> Alexander


Re: [PATCH] Add .pd extension to c_exts.

2019-08-30 Thread Alexander Monakov


On Fri, 30 Aug 2019, Richard Biener wrote:

> On Fri, Aug 30, 2019 at 12:58 PM Martin Liška  wrote:
> >
> > Hi.
> >
> > I would like to add .pd to c_exts so that one
> > can have correctly set tab width, etc.
> 
> But then it will auto-indent with too much spaces, no?

I think it's fine, the script does

  setlocal cindent
  setlocal cinoptions=>4,n-2,{2,^-2,:2,=2,g0,f0,h2,p4,t0,+2,(0,u0,w1,m0

so it's a bit smarter than indenting everything by 2 spaces :)

So +1 from me for the patch, but note that now we have a contrib/vim-gcc-dev
"plugin" directory, it might be more natural to add gcc-match indent rules
there.

Alexander

Re: [PATCH] Add .pd extension to c_exts.

2019-08-30 Thread Martin Liška
On 8/30/19 2:09 PM, Richard Biener wrote:
> But then it will auto-indent with too much spaces, no?

It's doesn't understand the lisp-like notation
(foo
 (bar
  (baz

so no auto-indentation is happening.

Martin


Re: [PATCH] Add .pd extension to c_exts.

2019-08-30 Thread Richard Biener
On Fri, Aug 30, 2019 at 12:58 PM Martin Liška  wrote:
>
> Hi.
>
> I would like to add .pd to c_exts so that one
> can have correctly set tab width, etc.

But then it will auto-indent with too much spaces, no?

Richard.

> Ready for trunk?
> Thanks,
> Martin
>
> contrib/ChangeLog:
>
> 2019-08-30  Martin Liska  
>
> * vimrc: Add .pd extension.
> ---
>  contrib/vimrc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
>


Re: [PATCH][AArch64] Vectorize MULH(R)S patterns with SVE2 instructions

2019-08-30 Thread Richard Sandiford
Thanks for doing this.  The patch looks good, so this review is mostly a
list of very minor formatting comments, sorry.

Yuliang Wang  writes:
> 2019-08-22  Yuliang Wang  
>

Please add a line here pointing at the PR:

PR tree-optimization/89386

The commit hooks pick this up automatically and link the commit to the
bugzilla ticket.  (The PR was filed for SSSE3, but the target-independent
bits are still needed there.)

> * config/aarch64/aarch64-sve2.md: support for SVE2 instructions 
> [S/U]MULL[T/B] + [R]SHRN[T/B] and MULHRS pattern variants

Unfortunately the changelog format is pretty strict here.  Lines have
to be 80 chars or shorter, indented by tabs, and each pattern, function,
variable or type needs to be listed individually regardless of how
useful that seems.  So I think this should be something like:

* config/aarch64/aarch64-sve2.md (mull)
(shrnb, shrnt, mulhs3): New patterns.

(See below for why the "*" patterns aren't listed.)

> * config/aarch64/iterators.md: iterators and attributes for above

Here too the iterators need to be listed:

* config/aarch64/iterators.md (UNSPEC_SMULLB, UNSPEC_SMULLT)
(UNSPEC_UMULLB, UNSPEC_UMULLT, UNSPEC_SHRNB, UNSPEC_SHRNT)
(UNSPEC_RSHRNB, UNSPEC_RSHRNT, UNSPEC_SMULHS, UNSPEC_SMULHRS)
UNSPEC_UMULHS, UNSPEC_UMULHRS): New unspecs.
(MULLBT, SHRNB, SHRNT, MULHRS): New int iterators.
(su, r): Handle the unspecs above.
(bt): New int attribute.

> * internal-fn.def: internal functions for MULH[R]S patterns

* internal-fn.def (IFN_MULHS, IFN_MULHRS): New internal functions.

> * optabs.def: optabs definitions for above and sign variants

* optabs.def (smulhs_optab, smulhrs_optab, umulhs_optab)
(umulhrs_optab): New optabs.

> * tree-vect-patterns.c (vect_recog_multhi_pattern): pattern 
> recognition function for MULHRS

* tree-vect-patterns.c (vect_recog_multhi_pattern): New function.
(vect_vect_recog_func_ptrs): Add it.

> * gcc.target/aarch64/sve2/mulhrs_1.c: new test for all variants

Just:

* gcc.target/aarch64/sve2/mulhrs_1.c: New test.

(Sorry that this is so finicky.  I'm just the messenger. :-))

> diff --git a/gcc/config/aarch64/aarch64-sve2.md 
> b/gcc/config/aarch64/aarch64-sve2.md
> index 
> 2334e5a7b7dc524bbd1f4d0a48ba5cd991970118..51783604ad8f83eb1d070c133009ed41a2a0252d
>  100644
> --- a/gcc/config/aarch64/aarch64-sve2.md
> +++ b/gcc/config/aarch64/aarch64-sve2.md
> @@ -63,3 +63,89 @@
> movprfx\t%0, %2\;h\t%0., %1/m, %0., 
> %3."
>[(set_attr "movprfx" "*,yes")]
>  )
> +
> +;; Multiply long top / bottom

Very minor, but: GCC comments traditionally end with "." even if they're
not full sentences.

> +(define_insn "*mull"
> +  [(set (match_operand: 0 "register_operand" "=w")
> +(unspec: [(match_operand:SVE_BHSI 1 "register_operand" "w")
> +   (match_operand:SVE_BHSI 2 "register_operand" "w")]
> +  MULLBT))]
> +  "TARGET_SVE2"
> +  "mull\t%0., %1., %2."
> +)
> +
> +(define_expand "mull"
> +  [(set (match_operand: 0 "register_operand")
> +(unspec: [(match_operand:SVE_BHSI 1 "register_operand")
> +   (match_operand:SVE_BHSI 2 "register_operand")]
> +  MULLBT))]
> +  "TARGET_SVE2"
> +  ""
> +)

It's more usual to put the define_expand first.  But in this case we
don't need separate define_expands, we can just make the define_insn the
named pattern itself:

;; Multiply long top / bottom.
(define_insn "mull"
  [(set (match_operand: 0 "register_operand" "=w")
(unspec: [(match_operand:SVE_BHSI 1 "register_operand" "w")
 (match_operand:SVE_BHSI 2 "register_operand" "w")]
MULLBT))]
  "TARGET_SVE2"
  "mull\t%0., %1., %2."
)

> +
> +;; (Rounding) Right shift narrow bottom
> +(define_insn "*shrnb"
> +  [(set (match_operand:SVE_BHSI 0 "register_operand" "=w")
> +   (unspec:SVE_BHSI [(match_operand: 1 "register_operand" "w")
> +   (match_operand 2 
> "aarch64_simd_shift_imm_offset_" "i")]

This is more a personal thing, but I think it's better to leave out
the "i" (just remove it rather than replace it with "") when the operand
doesn't in fact accept all the things that "i" allows.

> +  SHRNB))]
> +  "TARGET_SVE2"
> +  "shrnb\t%0., %1., #%2"
> +)
> +
> +(define_expand "shrnb"
> +  [(set (match_operand:SVE_BHSI 0 "register_operand")
> +  (unspec:SVE_BHSI [(match_operand: 1 "register_operand")
> +   (match_operand 2 
> "aarch64_simd_shift_imm_offset_")]
> +  SHRNB))]
> +  "TARGET_SVE2"
> +  ""
> +)
> +
> +;; (Rounding) Right shift narrow top
> +(define_insn "*shrnt"
> +  [(set (match_operand:SVE_BHSI 0 "register_operand" "=w")
> +   (unspec:SVE_BHSI [(match_operand:SVE_BHSI 1 "register_operand" "%0")

"%" indicates that operand 1 is commutative with operand 2, which isn't
t

[C++ PATCH] vtable decl marking

2019-08-30 Thread Nathan Sidwell
I noticed that DECL_VTABLE_OR_VTT_P and DECL_VIRTUAL_P have the same 
semantics for VAR_DECLs.  We set them both at the same time, and the C++ 
FE tests the former, but (I presume) the optimizers test the latter.


It doesn't seem to matter that we set DECL_VIRTUAL_P for the VTT.

The attached patch survives bootstraping, so I'll install it next week, 
if there's no objections.


nathan

--
Nathan Sidwell
2019-08-30  Nathan Sidwell  

	* cp-tree.h (DECL_VTABLE_OR_VTT_P): Forward to DECL_VIRTUAL_P.

Index: gcc/cp/cp-tree.h
===
--- gcc/cp/cp-tree.h	(revision 275032)
+++ gcc/cp/cp-tree.h	(working copy)
@@ -463,5 +463,4 @@ extern GTY(()) tree cp_global_trees[CPTI
   LOOKUP_FOUND_P (in RECORD_TYPE, UNION_TYPE, NAMESPACE_DECL)
5: IDENTIFIER_VIRTUAL_P (in IDENTIFIER_NODE)
-  DECL_VTABLE_OR_VTT_P (in VAR_DECL)
   FUNCTION_RVALUE_QUALIFIED (in FUNCTION_TYPE, METHOD_TYPE)
   CALL_EXPR_REVERSE_ARGS (in CALL_EXPR, AGGR_INIT_EXPR)
@@ -3255,6 +3254,8 @@ struct GTY(()) lang_decl {
 #define DECL_TINFO_P(NODE) TREE_LANG_FLAG_4 (VAR_DECL_CHECK (NODE))
 
-/* 1 iff VAR_DECL node NODE is virtual table or VTT.  */
-#define DECL_VTABLE_OR_VTT_P(NODE) TREE_LANG_FLAG_5 (VAR_DECL_CHECK (NODE))
+/* 1 iff VAR_DECL node NODE is virtual table or VTT.  We forward to
+   DECL_VIRTUAL_P from the common code, as that has the semantics we
+   need.  But we want a more descriptive name.  */
+#define DECL_VTABLE_OR_VTT_P(NODE) DECL_VIRTUAL_P (VAR_DECL_CHECK (NODE))
 
 /* 1 iff FUNCTION_TYPE or METHOD_TYPE has a ref-qualifier (either & or &&). */


[PATCH] Add .pd extension to c_exts.

2019-08-30 Thread Martin Liška
Hi.

I would like to add .pd to c_exts so that one
can have correctly set tab width, etc.

Ready for trunk?
Thanks,
Martin

contrib/ChangeLog:

2019-08-30  Martin Liska  

* vimrc: Add .pd extension.
---
 contrib/vimrc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/contrib/vimrc b/contrib/vimrc
index 7c0c5878c63..a9ec46bf52f 100644
--- a/contrib/vimrc
+++ b/contrib/vimrc
@@ -32,7 +32,7 @@ function! SetStyle()
 return
   endif
   let l:ext = fnamemodify(l:fname, ":e")
-  let l:c_exts = ['c', 'h', 'cpp', 'cc', 'C', 'H', 'def', 'java']
+  let l:c_exts = ['c', 'h', 'cpp', 'cc', 'C', 'H', 'def', 'java', 'pd']
   if index(l:c_exts, l:ext) != -1
 setlocal cindent
 setlocal tabstop=8



Re: [SVE] PR86753

2019-08-30 Thread Richard Biener
On Wed, Aug 28, 2019 at 11:02 AM Richard Sandiford
 wrote:
>
> Prathamesh Kulkarni  writes:
> > On Tue, 27 Aug 2019 at 21:14, Richard Sandiford
> >  wrote:
> >>
> >> Richard should have the final say, but some comments...
> >>
> >> Prathamesh Kulkarni  writes:
> >> > diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> >> > index 1e2dfe5d22d..862206b3256 100644
> >> > --- a/gcc/tree-vect-stmts.c
> >> > +++ b/gcc/tree-vect-stmts.c
> >> > @@ -1989,17 +1989,31 @@ check_load_store_masking (loop_vec_info 
> >> > loop_vinfo, tree vectype,
> >> >
> >> >  static tree
> >> >  prepare_load_store_mask (tree mask_type, tree loop_mask, tree vec_mask,
> >> > -  gimple_stmt_iterator *gsi)
> >> > +  gimple_stmt_iterator *gsi, tree mask,
> >> > +  cond_vmask_map_type *cond_to_vec_mask)
> >>
> >> "scalar_mask" might be a better name.  But maybe we should key off the
> >> vector mask after all, now that we're relying on the code having no
> >> redundancies.
> >>
> >> Passing the vinfo would be better than passing the cond_vmask_map_type
> >> directly.
> >>
> >> >  {
> >> >gcc_assert (useless_type_conversion_p (mask_type, TREE_TYPE 
> >> > (vec_mask)));
> >> >if (!loop_mask)
> >> >  return vec_mask;
> >> >
> >> >gcc_assert (TREE_TYPE (loop_mask) == mask_type);
> >> > +
> >> > +  tree *slot = 0;
> >> > +  if (cond_to_vec_mask)
> >>
> >> The pointer should never be null in this context.
> > Disabling check for NULL results in segfault with cond_arith_4.c because we
> > reach prepare_load_store_mask via vect_schedule_slp, called from
> > here in vect_transform_loop:
> >  /* Schedule the SLP instances first, then handle loop vectorization
> >  below.  */
> >   if (!loop_vinfo->slp_instances.is_empty ())
> > {
> >   DUMP_VECT_SCOPE ("scheduling SLP instances");
> >   vect_schedule_slp (loop_vinfo);
> > }
> >
> > which is before bb processing loop.
>
> We want this optimisation to be applied to SLP too though.  Especially
> since non-SLP will be going away at some point.
>
> But as Richard says, the problem with SLP is that the statements aren't
> traversed in block order, so I guess we can't do the on-the-fly
> redundancy elimination there...

And the current patch AFAICS can generate wrong SSA for this reason.

> Maybe an alternative would be to record during the analysis phase which
> scalar conditions need which loop masks.  Statements that need a loop
> mask currently do:
>
>   vect_record_loop_mask (loop_vinfo, masks, ncopies, vectype);
>
> If we also pass the scalar condition, we can maintain a hash_set of
>  pairs, representing the conditions that have
> loop masks applied at some point in the vectorised code.  The COND_EXPR
> code can use that set to decide whether to apply the loop mask or not.

Yeah, that sounds better.

Note that I don't like the extra "helpers" in fold-const.c/h, they do not look
useful in general so put them into vectorizer private code.  The decomposing
also doesn't look too nice, instead prepare_load_store_mask could get
such decomposed representation - possibly quite natural with the suggestion
from Richard above.

Richard.

> Trying to avoid duplicate ANDs with the loop mask would then become a
> separate follow-on change.  Not sure whether it's worth it on its own.
>
> Thanks,
> Richard


Re: [PATCH v2 0/9] S/390: Use signaling FP comparison instructions

2019-08-30 Thread Richard Biener
On Fri, Aug 30, 2019 at 12:06 PM Segher Boessenkool
 wrote:
>
> On Fri, Aug 30, 2019 at 09:12:11AM +0200, Richard Biener wrote:
> > On GIMPLE there isn't a good reason to split out trapping comparisons
> > from [VEC_]COND_EXPR - the gimplifier does this for GIMPLE_CONDs
> > where it is important because we'd have no way to represent EH info
> > when not done.  It might be a bit awkward to preserve EH across RTL
> > expansion though in case the [VEC_]COND_EXPR are not expanded
> > as a single pattern, but I'm not sure.
>
> The *language* specifies which comparisons trap on unordered and which
> do not.  (Hopefully it is similar for all or this is going to be a huge
> mess :-) )  So Gimple needs to at least keep track of this.  There also
> are various optimisations that can be done -- two signaling comparisons
> with the same arguments can be folded to just one, for example -- so it
> seems to me you want to represent this in Gimple, never mind if you do
> EH for them or just a magical trap or whatever.

The issue with GIMPLE_CONDs is that we can't have EH edges out
of blocks ending in them so to represent the compare-and-jump
single GIMPLE insn raising exceptions we need to split the actually
trapping compare away.

For [VEC_]COND_EXPR we'd need to say the whole expression
can throw/trap if the embedded comparison can.  On GIMPLE this
is enough precision but we of course have to handle it that way
then.  Currently operation_could_trap* do not consider [VEC_]COND_EXPR
trapping nor does tree_could_trap_p.  stmt_could_throw_1_p looks
fine by means of falling back to checking individual operands and
then running into the embedded comparison.  But we do have
operation_could_trap* callers which would need to be adjusted
on the GIMPLE side.

Richard.

>
> Segher


Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

2019-08-30 Thread Martin Liška
@Richi: PING^1

On 7/16/19 8:35 AM, Li Jia He wrote:
> 
> 
> On 2019/7/2 4:51 PM, Richard Biener wrote:
>> On Tue, 2 Jul 2019, Richard Biener wrote:
>>
>>> On Tue, 2 Jul 2019, Li Jia He wrote:
>>>


 On 2019/7/1 3:30 PM, Richard Biener wrote:
> On Fri, 28 Jun 2019, Andrew Pinski wrote:
>
>> On Thu, Jun 27, 2019 at 9:55 PM Li Jia He  wrote:
>>>
>>>
>>>
>>> On 2019/6/27 11:48 PM, Jeff Law wrote:
 On 6/27/19 12:11 AM, Li Jia He wrote:
> Hi,
>
> According to the optimizable case described by Qi Feng on
> issue 88784, we can combine the cases into the following:
>
> 1. x >  y  &&  x != XXX_MIN  -->   x > y
> 2. x >  y  &&  x == XXX_MIN  -->   false
> 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
>
> 4. x <  y  &&  x != XXX_MAX  -->   x < y
> 5. x <  y  &&  x == XXX_MAX  -->   false
> 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
>
> 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
> 8. x <= y  ||  x != XXX_MIN  -->   true
> 9. x <= y  ||  x == XXX_MIN  -->   x <= y
>
> 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
> 11. x >= y  ||  x != XXX_MAX  -->   true
> 12. x >= y  ||  x == XXX_MAX  -->   x >= y
>
> Note: XXX_MIN represents the minimum value of type x.
>  XXX_MAX represents the maximum value of type x.
>
> Here we don't need to care about whether the operation is
> signed or unsigned.  For example, in the below equation:
>
> 'x >  y  &&  x != XXX_MIN  -->   x > y'
>
> If the x type is signed int and XXX_MIN is INT_MIN, we can
> optimize it to 'x > y'.  However, if the type of x is unsigned
> int and XXX_MIN is 0, we can still optimize it to 'x > y'.
>
> The regression testing for the patch was done on GCC mainline on
>
>    powerpc64le-unknown-linux-gnu (Power 9 LE)
>
> with no regressions.  Is it OK for trunk ?
>
> Thanks,
> Lijia He
>
> gcc/ChangeLog
>
> 2019-06-27  Li Jia He  
>    Qi Feng  
>
>    PR middle-end/88784
>    * gimple-fold.c (and_comparisons_contain_equal_operands): New
> function.
>    (and_comparisons_1): Use
> and_comparisons_contain_equal_operands.
>    (or_comparisons_contain_equal_operands): New function.
>    (or_comparisons_1): Use or_comparisons_contain_equal_operands.
 Would this be better done via match.pd?  ISTM this transformation
 would
 be well suited for that framework.
>>>
>>> Hi, Jeff
>>>
>>> I did this because of the following test case:
>>> `
>>> _Bool comp(unsigned x, unsigned y)
>>> {
>>>  return x > y && x != 0;
>>> }
>>> `
>>> The gimple file dumped on the power platform is:
>>> `
>>> comp (unsigned int x, unsigned int y)
>>> {
>>>  _Bool D.2837;
>>>  int iftmp.0;
>>>
>>>  if (x > y) goto ; else goto ;
>>>  :
>>>  if (x != 0) goto ; else goto ;
>>>  :
>>>  iftmp.0 = 1;
>>>  goto ;
>>>  :
>>>  iftmp.0 = 0;
>>>  :
>>>  D.2837 = (_Bool) iftmp.0;
>>>  return D.2837;
>>> }
>>> `
>>> However, the gimple file dumped on x86 is
>>> `
>>> comp (unsigned int x, unsigned int y)
>>> {
>>>  _Bool D.2837;
>>>
>>>  _1 = x > y;
>>>  _2 = x != 0;
>>>  _3 = _1 & _2;
>>>  _4 = (int) _3;
>>>  D.2837 = (_Bool) _4;
>>>  return D.2837;
>>> }
>>> `
>>>
>>> The reason for the inconsistency between these two behaviors is param
>>> logical-op-non-short-circuit.  If we add the pattern to the match.pd
>>> file, we can only optimize the situation in which the statement is in
>>> the same basic block (logical-op-non-short-circuit=1, x86).  But for
>>> a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
>>> can't handle this situation.
>>>
>>> Another reason is that I found out maybe_fold_and_comparisons and
>>> maybe_fold_or_comparisons are not only called by ifcombine pass but
>>> also by reassoc pass. Using this method can basically unify param
>>> logical-op-non-short-circuit=0 or 1.
>>
>>
>> As mentioned before ifcombine pass should be using gimple-match
>> instead of fold_build.  Try converting ifcombine over to gimple-match
>> infrastructure and add these to match.pd.
>
> Yes, I mentioned that in the PR.  The issue is that at the moment
> to combine x > y with x <= y you'd have to build GENERIC trees
> for both or temporary GIMPLE assign with a SSA def (and then feed
> that into the GENERIC o

[PATCH 3/3][MSP430] Use default_elf_select_section to select sections for data where possible

2019-08-30 Thread Jozef Lawrynowicz
With the "noinit" attribute now handled generically, direct assignment of
data with the "noinit" attribute to the ".noinit" attribute can be removed from
the msp430 back end, and default_elf_select_section can be used instead.

default_elf_select_section can also be used for SECCAT_RODATA_MERGE_* sections,
to enable the linker to merge constant data.
>From 2d2bc7f11b7d5bfc918351a5963b041f69aac673 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Wed, 28 Aug 2019 17:54:22 +0100
Subject: [PATCH 3/3] MSP430: Use default_elf_select_section to determine
 sections for data where possible

gcc/ChangeLog:

2019-08-29  Jozef Lawrynowicz  

	* config/msp430/msp430.c (msp430_init_sections): Remove handling of the
	noinit section.
	(msp430_select_section): Handle decls with the "noinit" attribute with
	default_elf_select_section.
	Handle SECCAT_RODATA_MERGE_* section types with
	default_elf_select_section.
	Add comments about handling of unsupported section types.
	(msp430_section_type_flags): Remove handling of the noinit section.

---
 gcc/config/msp430/msp430.c | 81 +++---
 1 file changed, 50 insertions(+), 31 deletions(-)

diff --git a/gcc/config/msp430/msp430.c b/gcc/config/msp430/msp430.c
index 1b2e9683a94..d409ea15df2 100644
--- a/gcc/config/msp430/msp430.c
+++ b/gcc/config/msp430/msp430.c
@@ -1785,7 +1785,6 @@ gen_prefix (tree decl)
   return NULL;
 }
 
-static section * noinit_section;
 static section * persist_section;
 
 #undef  TARGET_ASM_INIT_SECTIONS
@@ -1794,8 +1793,6 @@ static section * persist_section;
 static void
 msp430_init_sections (void)
 {
-  noinit_section = get_unnamed_section (0, output_section_asm_op,
-	".section .noinit,\"aw\"");
   persist_section = get_unnamed_section (0, output_section_asm_op,
 	 ".section .persistent,\"aw\"");
 }
@@ -1806,6 +1803,10 @@ msp430_init_sections (void)
 static section *
 msp430_select_section (tree decl, int reloc, unsigned HOST_WIDE_INT align)
 {
+  const char * prefix;
+  const char * sec_name;
+  const char * base_sec_name;
+
   gcc_assert (decl != NULL_TREE);
 
   if (TREE_CODE (decl) == STRING_CST
@@ -1821,51 +1822,71 @@ msp430_select_section (tree decl, int reloc, unsigned HOST_WIDE_INT align)
   && is_interrupt_func (decl))
 return get_section (".lowtext", SECTION_CODE | SECTION_WRITE , decl);
 
-  const char * prefix = gen_prefix (decl);
-  if (prefix == NULL)
-{
-  if (TREE_CODE (decl) == FUNCTION_DECL)
-	return text_section;
-  /* FIXME: ATTR_NOINIT is handled generically in
-	 default_elf_select_section.  */
-  else if (has_attr (ATTR_NOINIT, decl))
-	return noinit_section;
-  else if (has_attr (ATTR_PERSIST, decl))
-	return persist_section;
-  else
-	return default_select_section (decl, reloc, align);
-}
+  if (has_attr (ATTR_PERSIST, decl))
+return persist_section;
+
+  /* ATTR_NOINIT is handled generically.  */
+  if (has_attr (ATTR_NOINIT, decl))
+return default_elf_select_section (decl, reloc, align);
+
+  prefix = gen_prefix (decl);
 
-  const char * sec;
   switch (categorize_decl_for_section (decl, reloc))
 {
-case SECCAT_TEXT:   sec = ".text";   break;
-case SECCAT_DATA:   sec = ".data";   break;
-case SECCAT_BSS:sec = ".bss";break;
-case SECCAT_RODATA: sec = ".rodata"; break;
+case SECCAT_TEXT:
+  if (!prefix)
+	return text_section;
+  base_sec_name = ".text";
+  break;
+case SECCAT_DATA:
+  if (!prefix)
+	return data_section;
+  base_sec_name = ".data";
+  break;
+case SECCAT_BSS:
+  if (!prefix)
+	return bss_section;
+  base_sec_name = ".bss";
+  break;
+case SECCAT_RODATA:
+  if (!prefix)
+	return readonly_data_section;
+  base_sec_name = ".rodata";
+  break;
 
+/* Enable merging of constant data by the GNU linker using
+   default_elf_select_section and therefore enabling creation of
+   sections with the SHF_MERGE flag.  */
 case SECCAT_RODATA_MERGE_STR:
 case SECCAT_RODATA_MERGE_STR_INIT:
 case SECCAT_RODATA_MERGE_CONST:
+  return default_elf_select_section (decl, reloc, align);
+
+/* The sections listed below are are not supported for MSP430.
+   They should not be generated, but in case they are, we use
+   default_select_section so they get placed in sections
+   the msp430 assembler and linker understand.  */
+/* "small data" sections are not supported.  */
 case SECCAT_SRODATA:
-case SECCAT_DATA_REL:
-case SECCAT_DATA_REL_LOCAL:
-case SECCAT_DATA_REL_RO:
-case SECCAT_DATA_REL_RO_LOCAL:
 case SECCAT_SDATA:
 case SECCAT_SBSS:
+/* Thread-local storage (TLS) is not supported.  */
 case SECCAT_TDATA:
 case SECCAT_TBSS:
+/* Sections used by a dynamic linker are not supported.  */
+case SECCAT_DATA_REL:
+case SECCAT_DATA_REL_LOCAL:
+case SECCAT_DATA_REL_RO:
+case SECCAT_DATA_REL_RO_LOCAL:
   return default_select_section (decl, reloc, align);
 

[PATCH 2/3][MSP430] Setup exclusion tables for function and data attributes

2019-08-30 Thread Jozef Lawrynowicz
The attached patch removes hard-coded warnings from msp430 attribute handlers,
and replaces them with exclusion rules specified in the attribute_spec table.

Where msp430 attributes conflict with generic attributes, hard-coded warnings
are still necessary.
>From c6def571a47df5394ca9f5c5c4918f0ffcf97375 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Wed, 28 Aug 2019 17:44:16 +0100
Subject: [PATCH 2/3] MSP430: Setup exclusion tables for function and data
 attributes

gcc/ChangeLog:

2019-08-29  Jozef Lawrynowicz  

	* config/msp430/msp430.c (msp430_attr): Remove warnings about
	conflicting msp430-specific attributes.
	(msp430_section_attr): Likewise.
	Add warnings about conflicts with generic "noinit" and "section"
	attributes.
	Fix grammar in -mlarge error message.
	(msp430_data_attr): Rename to msp430_persist_attr.
	Add warnings about conflicts with generic "noinit" and "section"
	attributes.
	Add warning for when variable is not initialized.
	Chain conditionals which prevent the attribute being added.
	(ATTR_EXCL): New helper.
	(attr_reent_exclusions): New exclusion table.
	(attr_naked_exclusions): Likewise.
	(attr_crit_exclusions): Likewise.
	(attr_lower_exclusions): Likewise.
	(attr_upper_exclusions): Likewise.
	(attr_either_exclusions): Likewise.
	(attr_persist_exclusions): Likewise.
	(msp430_attribute_table): Update with exclusion rules.
	(msp430_output_aligned_decl_common): Don't output common symbol if decl
	has a section.

gcc/testsuite/ChangeLog:

2019-08-29  Jozef Lawrynowicz  

	* gcc.target/msp430/data-attributes-2.c: New test.
	* gcc.target/msp430/function-attributes-4.c: Update dg-warning
	strings.
	* gcc.target/msp430/region-attribute-misuse.c: Likewise.

---
 gcc/config/msp430/msp430.c| 199 +++---
 .../gcc.target/msp430/data-attributes-2.c |  51 +
 .../gcc.target/msp430/function-attributes-4.c |  27 +--
 .../msp430/region-attribute-misuse.c  |   6 +-
 4 files changed, 189 insertions(+), 94 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/msp430/data-attributes-2.c

diff --git a/gcc/config/msp430/msp430.c b/gcc/config/msp430/msp430.c
index d54a3749437..1b2e9683a94 100644
--- a/gcc/config/msp430/msp430.c
+++ b/gcc/config/msp430/msp430.c
@@ -1345,35 +1345,19 @@ msp430_attr (tree * node,
 	}
   if (is_critical_func (* node))
 	{
+	  /* We always ignore the critical attribute when interrupt and
+	 critical are used together.  */
 	  warning (OPT_Wattributes,
 		   "critical attribute has no effect on interrupt functions");
 	  DECL_ATTRIBUTES (*node) = remove_attribute (ATTR_CRIT,
 		  DECL_ATTRIBUTES (* node));
 	}
 }
-  else if (TREE_NAME_EQ (name, ATTR_REENT))
-{
-  if (is_naked_func (* node))
-	message = "naked functions cannot be reentrant";
-  else if (is_critical_func (* node))
-	message = "critical functions cannot be reentrant";
-}
   else if (TREE_NAME_EQ (name, ATTR_CRIT))
 {
-  if (is_naked_func (* node))
-	message = "naked functions cannot be critical";
-  else if (is_reentrant_func (* node))
-	message = "reentrant functions cannot be critical";
-  else if (is_interrupt_func ( *node))
+  if (is_interrupt_func ( *node))
 	message = "critical attribute has no effect on interrupt functions";
 }
-  else if (TREE_NAME_EQ (name, ATTR_NAKED))
-{
-  if (is_critical_func (* node))
-	message = "critical functions cannot be naked";
-  else if (is_reentrant_func (* node))
-	message = "reentrant functions cannot be naked";
-}
 
   if (message)
 {
@@ -1396,32 +1380,15 @@ msp430_section_attr (tree * node,
 
   const char * message = NULL;
 
-  if (TREE_NAME_EQ (name, ATTR_UPPER))
-{
-  if (has_attr (ATTR_LOWER, * node))
-	message = "already marked with 'lower' attribute";
-  else if (has_attr (ATTR_EITHER, * node))
-	message = "already marked with 'either' attribute";
-  else if (! msp430x)
-	message = "upper attribute needs a 430X cpu";
-}
-  else if (TREE_NAME_EQ (name, ATTR_LOWER))
-{
-  if (has_attr (ATTR_UPPER, * node))
-	message = "already marked with 'upper' attribute";
-  else if (has_attr (ATTR_EITHER, * node))
-	message = "already marked with 'either' attribute";
-}
-  else
-{
-  gcc_assert (TREE_NAME_EQ (name, ATTR_EITHER));
-
-  if (has_attr (ATTR_LOWER, * node))
-	message = "already marked with 'lower' attribute";
-  else if (has_attr (ATTR_UPPER, * node))
-	message = "already marked with 'upper' attribute";
-}
-
+  /* The "noinit" and "section" attributes are handled generically, so we
+ cannot set up additional target-specific attribute exclusions using the
+ existing mechanism.  */
+  if (has_attr (ATTR_NOINIT, * node))
+message = G_("ignoring attribute %qE because it conflicts with "
+		 "attribute %");
+  else if (has_attr ("section", * node))
+message = G_("ignoring attribute %qE because it conflicts with "
+		 "attribute %");
   /* It does not make sense to use up

[PATCH 1/3] Implement TARGET_HANDLE_GENERIC_ATTRIBUTE

2019-08-30 Thread Jozef Lawrynowicz
The attached patch adds a new target hook "TARGET_HANDLE_GENERIC_ATTRIBUTE"
which enables a back end to perform additional processing of an attribute that
is normally handled by a front end.

So far only the "section" and "noinit" attribute make use of this hook, as the
msp430 back end requires additional attribute conflict checking to be performed
on these generic attributes.
>From e693da709114df378e2ea8b1d3729b105c99a495 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Wed, 28 Aug 2019 14:09:20 +0100
Subject: [PATCH 1/3] Implement TARGET_HANDLE_GENERIC_ATTRIBUTE

gcc/ChangeLog:

2019-08-29  Jozef Lawrynowicz  

	* config/msp430/msp430.c (TARGET_HANDLE_GENERIC_ATTRIBUTE): Define.
	(msp430_handle_generic_attribute): New function.
	* doc/tm.texi: Regenerate.
	* doc/tm.texi.in: Add TARGET_HANDLE_GENERIC_ATTRIBUTE.
	* hooks.c (hook_tree_treeptr_tree_tree_int_boolptr_null): New.
	* hooks.h (hook_tree_treeptr_tree_tree_int_boolptr_null): New.
	* target.def: Define new hook TARGET_HANDLE_GENERIC_ATTRIBUTE.

gcc/c-family/ChangeLog:

2019-08-29  Jozef Lawrynowicz  

	* c-attribs.c (handle_section_attribute): Call the
	handle_generic_attribute target hook after performing target
	independent processing.
	(handle_noinit_attribute): Likewise.
---
 gcc/c-family/c-attribs.c   | 39 +++--
 gcc/config/msp430/msp430.c | 40 ++
 gcc/doc/tm.texi|  8 
 gcc/doc/tm.texi.in |  2 ++
 gcc/hooks.c|  6 ++
 gcc/hooks.h|  1 +
 gcc/target.def | 11 +++
 7 files changed, 97 insertions(+), 10 deletions(-)

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 820c83fa3b9..e572c6502bb 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -1875,6 +1875,7 @@ handle_section_attribute (tree *node, tree ARG_UNUSED (name), tree args,
 			  int ARG_UNUSED (flags), bool *no_add_attrs)
 {
   tree decl = *node;
+  tree res = NULL_TREE;
 
   if (!targetm_common.have_named_sections)
 {
@@ -1922,12 +1923,20 @@ handle_section_attribute (tree *node, tree ARG_UNUSED (name), tree args,
   goto fail;
 }
 
-  set_decl_section_name (decl, TREE_STRING_POINTER (TREE_VALUE (args)));
-  return NULL_TREE;
+  res = targetm.handle_generic_attribute (node, name, args, flags,
+	  no_add_attrs);
+
+  /* If the back end confirms the attribute can be added then continue onto
+ final processing.  */
+  if (!(* no_add_attrs))
+{
+  set_decl_section_name (decl, TREE_STRING_POINTER (TREE_VALUE (args)));
+  return res;
+}
 
 fail:
   *no_add_attrs = true;
-  return NULL_TREE;
+  return res;
 }
 
 /* If in c++-11, check if the c++-11 alignment constraint with respect
@@ -2249,6 +2258,7 @@ handle_noinit_attribute (tree * node,
 		  bool *no_add_attrs)
 {
   const char *message = NULL;
+  tree res = NULL_TREE;
 
   gcc_assert (DECL_P (*node));
   gcc_assert (args == NULL);
@@ -2271,14 +2281,23 @@ handle_noinit_attribute (tree * node,
   *no_add_attrs = true;
 }
   else
-/* If this var is thought to be common, then change this.  Common
-   variables are assigned to sections before the backend has a
-   chance to process them.  Do this only if the attribute is
-   valid.  */
-if (DECL_COMMON (*node))
-  DECL_COMMON (*node) = 0;
+{
+  res = targetm.handle_generic_attribute (node, name, args, flags,
+	  no_add_attrs);
+  /* If the back end confirms the attribute can be added then continue onto
+	 final processing.  */
+  if (!(* no_add_attrs))
+	{
+	  /* If this var is thought to be common, then change this.  Common
+	 variables are assigned to sections before the backend has a
+	 chance to process them.  Do this only if the attribute is
+	 valid.  */
+	  if (DECL_COMMON (* node))
+	DECL_COMMON (* node) = 0;
+	}
+}
 
-  return NULL_TREE;
+  return res;
 }
 
 
diff --git a/gcc/config/msp430/msp430.c b/gcc/config/msp430/msp430.c
index c5cf7044aef..d54a3749437 100644
--- a/gcc/config/msp430/msp430.c
+++ b/gcc/config/msp430/msp430.c
@@ -1518,6 +1518,46 @@ const struct attribute_spec msp430_attribute_table[] =
 { NULL,		0, 0, false, false, false, false, NULL,  NULL }
   };
 
+#undef TARGET_HANDLE_GENERIC_ATTRIBUTE
+#define TARGET_HANDLE_GENERIC_ATTRIBUTE msp430_handle_generic_attribute
+
+tree
+msp430_handle_generic_attribute (tree * node,
+ tree   name,
+ tree   args ATTRIBUTE_UNUSED,
+ intflags ATTRIBUTE_UNUSED,
+ bool * no_add_attrs)
+
+{
+  const char * message = NULL;
+
+  if (!(TREE_NAME_EQ (name, ATTR_NOINIT) || TREE_NAME_EQ (name, "section")))
+return NULL_TREE;
+
+  /* The front end has set up an exclusion between the "noinit" and "section"
+ attributes.  */
+  if (has_attr (ATTR_LOWER, * node))
+message = G_("ignoring attribute %qE because it conflicts with "
+		 "attribute %");
+  else if (has_attr (ATTR_UPPER, * node))
+message = G_("ign

[PATCH 0/3] MSP430: Improve attribute handling

2019-08-30 Thread Jozef Lawrynowicz
The following series of patches improves the handling of msp430-specific
attributes by making use of generic mechanisms for performing common
tasks (i.e. handling attribute conflicts, putting data objects in sections).

The patches also transition the msp430 back end to fully use the generic
handling of the "noinit" attribute.

Successfully bootstrapped and regtested on x86_64-pc-linux-gnu.
Successfully regtested for msp430-elf.

As a further sanity test I built GCC for arm-eabi and ran
execute.exp=noinit-attribute.c to confirm the noinit attribute still
works as expected for ARM.

Ok for trunk?

Jozef Lawrynowicz (3):
  Implement TARGET_HANDLE_GENERIC_ATTRIBUTE
  MSP430: Setup exclusion tables for function and data attributes
  MSP430: Use default_elf_select_section to determine sections for data
where possible

 gcc/c-family/c-attribs.c  |  39 ++-
 gcc/config/msp430/msp430.c| 320 --
 gcc/doc/tm.texi   |   8 +
 gcc/doc/tm.texi.in|   2 +
 gcc/hooks.c   |   6 +
 gcc/hooks.h   |   1 +
 gcc/target.def|  11 +
 .../gcc.target/msp430/data-attributes-2.c |  51 +++
 .../gcc.target/msp430/function-attributes-4.c |  27 +-
 .../msp430/region-attribute-misuse.c  |   6 +-
 10 files changed, 336 insertions(+), 135 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/msp430/data-attributes-2.c


Re: [PATCH v2 0/9] S/390: Use signaling FP comparison instructions

2019-08-30 Thread Segher Boessenkool
On Fri, Aug 30, 2019 at 09:12:11AM +0200, Richard Biener wrote:
> On GIMPLE there isn't a good reason to split out trapping comparisons
> from [VEC_]COND_EXPR - the gimplifier does this for GIMPLE_CONDs
> where it is important because we'd have no way to represent EH info
> when not done.  It might be a bit awkward to preserve EH across RTL
> expansion though in case the [VEC_]COND_EXPR are not expanded
> as a single pattern, but I'm not sure.

The *language* specifies which comparisons trap on unordered and which
do not.  (Hopefully it is similar for all or this is going to be a huge
mess :-) )  So Gimple needs to at least keep track of this.  There also
are various optimisations that can be done -- two signaling comparisons
with the same arguments can be folded to just one, for example -- so it
seems to me you want to represent this in Gimple, never mind if you do
EH for them or just a magical trap or whatever.


Segher


Re: [PATCH V6 05/11] bpf: new GCC port

2019-08-30 Thread Jose E. Marchesi


> This patch adds a port for the Linux kernel eBPF architecture to GCC.
>
> ChangeLog:
>
>   * configure.ac: Support for bpf-*-* targets.
>   * configure: Regenerate.
>
> contrib/ChangeLog:
>
>   * config-list.mk (LIST): Disable go in bpf-*-* targets.
>
> gcc/ChangeLog:
>
>   * config.gcc: Support for bpf-*-* targets.
>   * common/config/bpf/bpf-common.c: New file.
>   * config/bpf/t-bpf: Likewise.
>   * config/bpf/predicates.md: Likewise.
>   * config/bpf/constraints.md: Likewise.
>   * config/bpf/bpf.opt: Likewise.
>   * config/bpf/bpf.md: Likewise.
>   * config/bpf/bpf.h: Likewise.
>   * config/bpf/bpf.c: Likewise.
>   * config/bpf/bpf-protos.h: Likewise.
>   * config/bpf/bpf-opts.h: Likewise.
>   * config/bpf/bpf-helpers.h: Likewise.
>   * config/bpf/bpf-helpers.def: Likewise.

Looks good to me, thanks.

Thanks to you for the throughful, super useful review.

Jeff also had detailed comments, so please wait for an ack from him
too.

Sure.


Re: [PATCH] or1k: Fix issue with set_got clobbering r9

2019-08-30 Thread Stafford Horne
Hello, any comments on this?

If nothing I will commit in a few days.

On Thu, Aug 22, 2019 at 08:44:04PM +0900, Stafford Horne wrote:
> When compiling glibc we found that the GOT register was being allocated
> r9 when the instruction was still set_got_tmp.  That caused set_got to
> clobber r9.  We cannot simply say set_got_tmp clobbers r9 as this is the
> reason for having the temporary set_got_tmp.
> 
> Fix by using a register class constraint that does not allow r9 during
> register allocation.
> 
> gcc/ChangeLog:
> 
>   * config/or1k/constraints.md (t): New constraint.
>   * config/or1k/or1k.h (GOT_REGS): New register class.
>   * config/or1k/or1k.md (set_got_tmp, set_got): Use t contraint.
> ---
>  gcc/config/or1k/constraints.md | 4 
>  gcc/config/or1k/or1k.h | 3 +++
>  gcc/config/or1k/or1k.md| 4 ++--
>  3 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/or1k/constraints.md b/gcc/config/or1k/constraints.md
> index 8cac7eb5329..ba330c6b8c2 100644
> --- a/gcc/config/or1k/constraints.md
> +++ b/gcc/config/or1k/constraints.md
> @@ -25,6 +25,7 @@
>  ; We use:
>  ;  c - sibcall registers
>  ;  d - double pair base registers (excludes r0, r30 and r31 which overflow)
> +;  t - got address registers (excludes r9 is clobbered by set_got)

I will changee this to (... r9 which is clobbered ...)

>  ;  I - constant signed 16-bit
>  ;  K - constant unsigned 16-bit
>  ;  M - constant signed 16-bit shifted left 16-bits (l.movhi)
> @@ -36,6 +37,9 @@
>  (define_register_constraint "d" "DOUBLE_REGS"
>"Registers which can be used for double reg pairs.")
>  
> +(define_register_constraint "t" "GOT_REGS"
> +  "Registers which can be used to store the Global Offset Table (GOT) 
> address.")
> +
>  ;; Immediates
>  (define_constraint "I"
>"A signed 16-bit immediate in the range -32768 to 32767."
> diff --git a/gcc/config/or1k/or1k.h b/gcc/config/or1k/or1k.h
> index 2b29e62fdd3..4c32607bac1 100644
> --- a/gcc/config/or1k/or1k.h
> +++ b/gcc/config/or1k/or1k.h
> @@ -190,6 +190,7 @@ enum reg_class
>NO_REGS,
>SIBCALL_REGS,
>DOUBLE_REGS,
> +  GOT_REGS,
>GENERAL_REGS,
>FLAG_REGS,
>ALL_REGS,
> @@ -202,6 +203,7 @@ enum reg_class
>"NO_REGS", \
>"SIBCALL_REGS",\
>"DOUBLE_REGS", \
> +  "GOT_REGS",\
>"GENERAL_REGS",\
>"FLAG_REGS",   \
>"ALL_REGS" }
> @@ -215,6 +217,7 @@ enum reg_class
>  { { 0x, 0x },\
>{ SIBCALL_REGS_MASK,   0 },\
>{ 0x7f7e, 0x },\
> +  { 0xfdff, 0x },\
>{ 0x, 0x0003 },\
>{ 0x, 0x0004 },\
>{ 0x, 0x0007 } \
> diff --git a/gcc/config/or1k/or1k.md b/gcc/config/or1k/or1k.md
> index cee11d078cc..36bcee336ab 100644
> --- a/gcc/config/or1k/or1k.md
> +++ b/gcc/config/or1k/or1k.md
> @@ -706,7 +706,7 @@
>  ;; set_got pattern below.  This works because the set_got_tmp insn is the
>  ;; first insn in the stream and that it isn't moved during RA.
>  (define_insn "set_got_tmp"
> -  [(set (match_operand:SI 0 "register_operand" "=r")
> +  [(set (match_operand:SI 0 "register_operand" "=t")
>   (unspec_volatile:SI [(const_int 0)] UNSPECV_SET_GOT))]
>""
>  {
> @@ -715,7 +715,7 @@
>  
>  ;; The insn to initialize the GOT.
>  (define_insn "set_got"
> -  [(set (match_operand:SI 0 "register_operand" "=r")
> +  [(set (match_operand:SI 0 "register_operand" "=t")
>   (unspec:SI [(const_int 0)] UNSPEC_SET_GOT))
> (clobber (reg:SI LR_REGNUM))]
>""
> -- 
> 2.21.0
> 


Re: [PATCH] Sanitizing the middle-end interface to the back-end for strict alignment

2019-08-30 Thread Kyrill Tkachov

Hi Bernd,

On 8/29/19 10:26 PM, Bernd Edlinger wrote:

On 8/29/19 11:08 AM, Christophe Lyon wrote:

On Thu, 29 Aug 2019 at 10:58, Kyrill Tkachov
 wrote:

Hi Bernd,

On 8/28/19 10:36 PM, Bernd Edlinger wrote:

On 8/28/19 2:07 PM, Christophe Lyon wrote:

Hi,

This patch causes an ICE when building libgcc's unwind-arm.o
when configuring GCC:
--target  arm-none-linux-gnueabihf --with-mode thumb --with-cpu
cortex-a15 --with-fpu neon-vfpv4:

The build works for the same target, but --with-mode arm --with-cpu
cortex a9 --with-fpu vfp

In file included from
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/config/arm/unwind-arm.c:144:
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/unwind-arm-common.inc:
In function 'get_eit_entry':
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/unwind-arm-common.inc:245:29:
warning: cast discards 'const' qualifier from pointer target type
[-Wcast-qual]
245 |   ucbp->pr_cache.ehtp = (_Unwind_EHT_Header *)&eitp->content;
| ^
during RTL pass: expand
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/unwind-arm-common.inc:
In function 'unwind_phase2_forced':
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/unwind-arm-common.inc:319:18:
internal compiler error: in gen_movdi, at config/arm/arm.md:5235
319 |   saved_vrs.core = entry_vrs->core;
|   ~~~^
0x126530f gen_movdi(rtx_def*, rtx_def*)
  
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.md:5235
0x896d92 insn_gen_fn::operator()(rtx_def*, rtx_def*) const
  
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/recog.h:318
0x896d92 emit_move_insn_1(rtx_def*, rtx_def*)
  
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:3694
0x897083 emit_move_insn(rtx_def*, rtx_def*)
  
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:3790
0xfc25d6 gen_cpymem_ldrd_strd(rtx_def**)
  
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:14582
0x126a1f1 gen_cpymemqi(rtx_def*, rtx_def*, rtx_def*, rtx_def*)
  
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.md:6688
0xb0bc08 maybe_expand_insn(insn_code, unsigned int, expand_operand*)
  
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/optabs.c:7440
0x89ba1e emit_block_move_via_cpymem
  
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:1808
0x89ba1e emit_block_move_hints(rtx_def*, rtx_def*, rtx_def*,
block_op_methods, unsigned int, long, unsigned long, unsigned long,
unsigned long, bool, bool*)
  
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:1627
0x89c383 emit_block_move(rtx_def*, rtx_def*, rtx_def*, block_op_methods)
  
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:1667
0x89fb4e store_expr(tree_node*, rtx_def*, int, bool, bool)
  
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:5845
0x88c1f9 store_field
  
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:7149
0x8a0c22 expand_assignment(tree_node*, tree_node*, bool)
  
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:5304
0x761964 expand_gimple_stmt_1
  
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:3779
0x761964 expand_gimple_stmt
  
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:3875
0x768583 expand_gimple_basic_block
  
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:5915
0x76abc6 execute
  
/tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:6538

Christophe


Okay, sorry for the breakage.

What is happening in gen_cpymem_ldrd_strd is of course against the rules:

It uses emit_move_insn on only 4-byte aligned DI-mode memory operands.

I have a patch for this, which is able to fix the libgcc build on a cross, but 
have no
possibility to bootstrap the affected target.

Could you please help?

Well it's good that the sanitisation is catching the bugs!


Yes, more than expected, though ;)


Bootstrapping this patch I get another assert with the backtrace:

Thanks for the additional testing, Kyrill!

FWIW, my original report was with a failure to just build GCC for
cortex-a15. I later got the reports of testing cross-toolchains, and
saw other problems on cortex-a9 for instance.
But I guess, you have noticed them with your bootstrap?
on arm-linux-gnueabi
gcc.target/arm/aapcs/align4.c (internal compiler error)
gcc.target/arm/aapcs/align_rec4.c (internal compiler error)


This appears to be yet unknown middle-end bug (not fixed by current patch)

$ arm-linux-gnueabihf-gcc align4.c
during RTL pass: expand
In file included from align4.c:22:
align4.c: In function 'testfunc':
abitest.h:73:42: internal compiler error: in gen

Re: [ARM/FDPIC v5 01/21] [ARM] FDPIC: Add -mfdpic option support

2019-08-30 Thread Richard Sandiford
Christophe Lyon  writes:
> On 16/07/2019 12:11, Richard Sandiford wrote:
>> [This isn't really something that should be reviewed under global
>> reviewership, but if it's either that or nothing, I'll do it anyway...]
>> 
>> Christophe Lyon  writes:
>>> 2019-XX-XX  Christophe Lyon  
>>> Mickaël Guêné  
>>>
>>> gcc/
>>> * config/arm/arm.opt: Add -mfdpic option.
>>> * doc/invoke.texi: Add documentation for -mfdpic.
>>>
>>> Change-Id: I0eabd1d11c9406fd4a43c4333689ebebbfcc4fe8
>>>
>>> diff --git a/gcc/config/arm/arm.opt b/gcc/config/arm/arm.opt
>>> index 9067d49..2ed3bd5 100644
>>> --- a/gcc/config/arm/arm.opt
>>> +++ b/gcc/config/arm/arm.opt
>>> @@ -306,3 +306,7 @@ Cost to assume for a branch insn.
>>>   mgeneral-regs-only
>>>   Target Report RejectNegative Mask(GENERAL_REGS_ONLY) Save
>>>   Generate code which uses the core registers only (r0-r14).
>>> +
>>> +mfdpic
>>> +Target Report Mask(FDPIC)
>>> +Enable Function Descriptor PIC mode.
>>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>>> index 29585cf..805d7cc 100644
>>> --- a/gcc/doc/invoke.texi
>>> +++ b/gcc/doc/invoke.texi
>>> @@ -703,7 +703,8 @@ Objective-C and Objective-C++ Dialects}.
>>>   -mrestrict-it @gol
>>>   -mverbose-cost-dump @gol
>>>   -mpure-code @gol
>>> --mcmse}
>>> +-mcmse @gol
>>> +-mfdpic}
>>>   
>>>   @emph{AVR Options}
>>>   @gccoptlist{-mmcu=@var{mcu}  -mabsdata  -maccumulate-args @gol
>>> @@ -17912,6 +17913,23 @@ MOVT instruction.
>>>   Generate secure code as per the "ARMv8-M Security Extensions: 
>>> Requirements on
>>>   Development Tools Engineering Specification", which can be found on
>>>   
>>> @url{http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/ECM0359818_armv8m_security_extensions_reqs_on_dev_tools_1_0.pdf}.
>>> +
>>> +@item -mfdpic
>>> +@itemx -mno-fdpic
>>> +@opindex mfdpic
>>> +@opindex mno-fdpic
>>> +Select the FDPIC ABI, which uses function descriptors to represent
>> 
>> Maybe "64-bit function descriptors"?  Just a suggestion, might not be useful.
>> 
>> OK with that change, thanks.
>
> OK, here is a new version, where I added a few words to explain that -static
> is not supported.
>
> Thanks,
> Christophe
>
>> 
>> Richard
>> 
>>> +pointers to functions.  When the compiler is configured for
>>> +@code{arm-*-uclinuxfdpiceabi} targets, this option is on by default
>>> +and implies @option{-fPIE} if none of the PIC/PIE-related options is
>>> +provided.  On other targets, it only enables the FDPIC-specific code
>>> +generation features, and the user should explicitly provide the
>>> +PIC/PIE-related options as needed.
>>> +
>>> +The opposite @option{-mno-fdpic} option is useful (and required) to
>>> +build the Linux kernel using the same (@code{arm-*-uclinuxfdpiceabi})
>>> +toolchain as the one used to build the userland programs.
>>> +
>>>   @end table
>>>   
>>>   @node AVR Options
>> .
>> 
>
> From c936684e2b77ff5716bd8b67c617dcad088c72e0 Mon Sep 17 00:00:00 2001
> From: Christophe Lyon 
> Date: Thu, 8 Feb 2018 10:44:32 +0100
> Subject: [ARM/FDPIC v6 01/24] [ARM] FDPIC: Add -mfdpic option support
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> 2019-XX-XX  Christophe Lyon  
>   Micka«l Guªn©  
>
>   gcc/
>   * config/arm/arm.opt: Add -mfdpic option.
>   * doc/invoke.texi: Add documentation for -mfdpic.
>
> Change-Id: I05b98d6ae87c2b3fc04dd7fba415c730accdf33e
>
> diff --git a/gcc/config/arm/arm.opt b/gcc/config/arm/arm.opt
> index 9067d49..2ed3bd5 100644
> --- a/gcc/config/arm/arm.opt
> +++ b/gcc/config/arm/arm.opt
> @@ -306,3 +306,7 @@ Cost to assume for a branch insn.
>  mgeneral-regs-only
>  Target Report RejectNegative Mask(GENERAL_REGS_ONLY) Save
>  Generate code which uses the core registers only (r0-r14).
> +
> +mfdpic
> +Target Report Mask(FDPIC)
> +Enable Function Descriptor PIC mode.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 29585cf..b77fa06 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -703,7 +703,8 @@ Objective-C and Objective-C++ Dialects}.
>  -mrestrict-it @gol
>  -mverbose-cost-dump @gol
>  -mpure-code @gol
> --mcmse}
> +-mcmse @gol
> +-mfdpic}
>  
>  @emph{AVR Options}
>  @gccoptlist{-mmcu=@var{mcu}  -mabsdata  -maccumulate-args @gol
> @@ -17912,6 +17913,27 @@ MOVT instruction.
>  Generate secure code as per the "ARMv8-M Security Extensions: Requirements on
>  Development Tools Engineering Specification", which can be found on
>  
> @url{http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/ECM0359818_armv8m_security_extensions_reqs_on_dev_tools_1_0.pdf}.
> +
> +@item -mfdpic
> +@itemx -mno-fdpic
> +@opindex mfdpic
> +@opindex mno-fdpic
> +Select the FDPIC ABI, which uses 64-bit function descriptors to
> +represent pointers to functions.  When the compiler is configured for
> +@code{arm-*-uclinuxfdpiceabi} targets, this option is on by default
> +and implies @option{-fPIE} if none of the PIC/PIE-related options is
> +provided.  On other targets, it o

Re: [ARM/FDPIC v5 03/21] [ARM] FDPIC: Force FDPIC related options unless -mno-fdpic is provided

2019-08-30 Thread Richard Sandiford
Christophe Lyon  writes:
> 2019-XX-XX  Christophe Lyon  
>   Micka«l Guªn© 
>
>   gcc/
>   * config.gcc: Handle arm*-*-uclinuxfdpiceabi.
>   * config/arm/bpabi.h (TARGET_FDPIC_ASM_SPEC): New.
>   (SUBTARGET_EXTRA_ASM_SPEC): Use TARGET_FDPIC_ASM_SPEC.
>   * config/arm/linux-eabi.h (FDPIC_CC1_SPEC): New.
>   (CC1_SPEC): Use FDPIC_CC1_SPEC.
>   (MUSL_DYNAMIC_LINKER): Add -fdpic suffix when needed.
>   * config/arm/uclinuxfdpiceabi.h: New file.
>
>   libsanitizer/
>   * configure.tgt (arm*-*-*fdpiceabi): Sanitizers are
>   unsupported in this configuration.
>
> Change-Id: I74ac1fbb2e809e864d2b0acce66b173e76bcf92b
>
> diff --git a/gcc/config/arm/uclinuxfdpiceabi.h 
> b/gcc/config/arm/uclinuxfdpiceabi.h
> new file mode 100644
> index 000..2d0c04b
> --- /dev/null
> +++ b/gcc/config/arm/uclinuxfdpiceabi.h
> @@ -0,0 +1,54 @@
> +/* Configuration file for ARM GNU/Linux FDPIC EABI targets.
> +   Copyright (C) 2018 Free Software Foundation, Inc.

Copyright year should be/include 2019.

OK with that change if no Arm maintainer objects before the rest
of the patch series is approved.

> 2019-XX-XX  Christophe Lyon  
>
>   gcc/testsuite/
>   * lib/target-supports.exp (check_effective_target_static): Disable
>   for ARM FDPIC target.

OK, thanks.

Richard


Re: [ARM/FDPIC v5 06/21] [ARM] FDPIC: Add support for c++ exceptions

2019-08-30 Thread Kyrill Tkachov
As Richard mentioned in an earlier post the generic libgcc and libstdc++ 
changes will need approval from the relevant maintainers.


CC'ing the libstdc++ list and the libgcc maintainer.

On 5/15/19 1:39 PM, Christophe Lyon wrote:

The main difference with existing support is that function addresses
are function descriptor addresses instead. This means that all code
dealing with function pointers now has to cope with function
descriptors instead.

For the same reason, Linux kernel helpers can no longer be called by
dereferencing their address, so we implement wrappers that directly
call the kernel helpers.

When restoring a function address, we also have to restore the FDPIC
register value (r9).

2019-XX-XX  Christophe Lyon  
    Mickaël Guêné 

    gcc/
    * ginclude/unwind-arm-common.h (unwinder_cache): Add reserved5
    field.
    (FDPIC_REGNUM): New define.

    libgcc/
    * config/arm/linux-atomic.c (__kernel_cmpxchg): Add FDPIC support.
    (__kernel_dmb): Likewise.
    (__fdpic_cmpxchg): New function.
    (__fdpic_dmb): New function.
    * config/arm/unwind-arm.h (FDPIC_REGNUM): New define.
    (gnu_Unwind_Find_got): New function.
    (_Unwind_decode_typeinfo_ptr): Add FDPIC support.
    * unwind-arm-common.inc (UCB_PR_GOT): New.
    (funcdesc_t): New struct.
    (get_eit_entry): Add FDPIC support.
    (unwind_phase2): Likewise.
    (unwind_phase2_forced): Likewise.
    (__gnu_Unwind_RaiseException): Likewise.
    (__gnu_Unwind_Resume): Likewise.
    (__gnu_Unwind_Backtrace): Likewise.
    * unwind-pe.h (read_encoded_value_with_base): Likewise.

    libstdc++/
    * libsupc++/eh_personality.cc (get_ttype_entry): Add FDPIC
    support.

Change-Id: I64b81cfaf390a05f2fd121f44ba1912cb4b47cae

diff --git a/gcc/ginclude/unwind-arm-common.h 
b/gcc/ginclude/unwind-arm-common.h

index 6df783e..d4eb03e 100644
--- a/gcc/ginclude/unwind-arm-common.h
+++ b/gcc/ginclude/unwind-arm-common.h
@@ -91,7 +91,7 @@ extern "C" {
   _uw reserved2;  /* Personality routine address */
   _uw reserved3;  /* Saved callsite address */
   _uw reserved4;  /* Forced unwind stop arg */
- _uw reserved5;
+ _uw reserved5;  /* Personality routine GOT value in FDPIC 
mode.  */

 }
   unwinder_cache;
   /* Propagation barrier cache (valid after phase 1): */
diff --git a/libgcc/config/arm/linux-atomic.c 
b/libgcc/config/arm/linux-atomic.c

index 06a6d46..565f829 100644
--- a/libgcc/config/arm/linux-atomic.c
+++ b/libgcc/config/arm/linux-atomic.c
@@ -25,11 +25,62 @@ see the files COPYING3 and COPYING.RUNTIME 
respectively.  If not, see


 /* Kernel helper for compare-and-exchange.  */
 typedef int (__kernel_cmpxchg_t) (int oldval, int newval, int *ptr);
-#define __kernel_cmpxchg (*(__kernel_cmpxchg_t *) 0x0fc0)
+
+#define STR(X) #X
+#define XSTR(X) STR(X)
+
+#define KERNEL_CMPXCHG 0x0fc0
+
+#if __FDPIC__
+/* Non-FDPIC ABIs call __kernel_cmpxchg directly by dereferencing its
+   address, but under FDPIC we would generate a broken call
+   sequence. That's why we have to implement __kernel_cmpxchg and
+   __kernel_dmb here: this way, the FDPIC call sequence works.  */
+#define __kernel_cmpxchg __fdpic_cmpxchg
+#else
+#define __kernel_cmpxchg (*(__kernel_cmpxchg_t *) KERNEL_CMPXCHG)
+#endif

 /* Kernel helper for memory barrier.  */
 typedef void (__kernel_dmb_t) (void);
-#define __kernel_dmb (*(__kernel_dmb_t *) 0x0fa0)
+
+#define KERNEL_DMB 0x0fa0
+
+#if __FDPIC__
+#define __kernel_dmb __fdpic_dmb
+#else
+#define __kernel_dmb (*(__kernel_dmb_t *) KERNEL_DMB)
+#endif
+
+#if __FDPIC__
+static int __fdpic_cmpxchg (int oldval, int newval, int *ptr)
+{
+  int result;
+
+  asm volatile (
+   "ldr    ip, 1f\n\t"
+   "bx ip\n\t"
+   "1:\n\t"
+   ".word " XSTR(KERNEL_CMPXCHG) "\n\t"
+   : "=r" (result)
+   : "r" (oldval) , "r" (newval), "r" (ptr)
+   : "r3", "memory");
+  /* The result is actually returned by the kernel helper, we need
+ this to avoid a warning.  */
+  return result;
+}
+
+static void __fdpic_dmb (void)
+{
+  asm volatile (
+   "ldr    ip, 1f\n\t"
+   "bx ip\n\t"
+   "1:\n\t"
+   ".word " XSTR(KERNEL_DMB) "\n\t"
+   );
+}
+
+#endif

 /* Note: we implement byte, short and int versions of atomic 
operations using
    the above kernel helpers; see linux-atomic-64bit.c for "long long" 
(64-bit)
diff --git a/libgcc/config/arm/unwind-arm.h 
b/libgcc/config/arm/unwind-arm.h

index 43c5379..2bf320a 100644
--- a/libgcc/config/arm/unwind-arm.h
+++ b/libgcc/config/arm/unwind-arm.h
@@ -33,9 +33,33 @@
 /* Use IP as a scratch register within the personality routine.  */
 #define UNWIND_POINTER_REG 12

+#define FDPIC_REGNUM 9
+
+#define STR(x) #x
+#define XSTR(x) STR(x)
+
 #ifdef __cplusplus
 extern "C" {
 #endif
+_Unwind_Pt

Re: [ARM/FDPIC v5 02/21] [ARM] FDPIC: Handle arm*-*-uclinuxfdpiceabi in configure scripts

2019-08-30 Thread Richard Sandiford
Christophe Lyon  writes:
> @@ -785,7 +785,7 @@ case ${target} in
>esac
>tmake_file="t-slibgcc"
>case $target in
> -*-*-linux* | frv-*-*linux* | *-*-kfreebsd*-gnu | *-*-kopensolaris*-gnu)
> +*-*-linux* | frv-*-*linux* | *-*-kfreebsd*-gnu | *-*-kopensolaris*-gnu  
> | *-*-uclinuxfdpiceabi)
>:;;
>  *-*-gnu*)
>native_system_header_dir=/include

I don't think this is necessary, since this target will never match the
following *-*-gnu*) stanza anyway.

> diff --git a/libtool.m4 b/libtool.m4
> index 8966762..64e507a 100644
> --- a/libtool.m4
> +++ b/libtool.m4
> @@ -3734,7 +3739,7 @@ m4_if([$1], [CXX], [
>   ;;
>   esac
>   ;;
> -  linux* | k*bsd*-gnu | kopensolaris*-gnu)
> +  linux* | k*bsd*-gnu | kopensolaris*-gnu | uclinux*)
>   case $cc_basename in
> KCC*)
>   # KAI C++ Compiler

Is this needed?  It seems to be in the !GCC branch of an if/else.

If it is needed, the default:

_LT_TAGVAR(lt_prog_compiler_can_build_shared, $1)=no

seems correct for non-FDPIC uclinux.

> @@ -4032,7 +4037,7 @@ m4_if([$1], [CXX], [
>_LT_TAGVAR(lt_prog_compiler_static, $1)='-non_shared'
>;;
>  
> -linux* | k*bsd*-gnu | kopensolaris*-gnu)
> +linux* | k*bsd*-gnu | kopensolaris*-gnu | uclinux*)
>case $cc_basename in
># old Intel for x86_64 which still supported -KPIC.
>ecc*)

Same here.

> @@ -5946,7 +5951,7 @@ if test "$_lt_caught_CXX_error" != yes; then
>  _LT_TAGVAR(inherit_rpath, $1)=yes
>  ;;
>  
> -  linux* | k*bsd*-gnu | kopensolaris*-gnu)
> +  linux* | k*bsd*-gnu | kopensolaris*-gnu | uclinuxfdpiceabi)
>  case $cc_basename in
>KCC*)
>   # Kuck and Associates, Inc. (KAI) C++ Compiler

Here too the code seems to be dealing specifically with non-GCC compilers.

> @@ -6598,7 +6603,7 @@ interix[[3-9]]*)
>_LT_TAGVAR(postdeps,$1)=
>;;
>  
> -linux*)
> +linux* | uclinux*)
>case `$CC -V 2>&1 | sed 5q` in
>*Sun\ C*)
>  # Sun C++ 5.9

Here too.  (It only seems to do anything for Sun's C compiler.)

The fewer hunks we have to maintain downstream the better :-)

Richard


Re: [PATCH 0/3] mklog improvements

2019-08-30 Thread Martin Liška
PING^1

On 8/13/19 9:49 AM, Martin Liska wrote:
> Hi.
> 
> I'm sending format independent changes to mklog that should
> improve the script. It addresses couple of improvement
> mentioned here:
> https://gcc.gnu.org/ml/gcc-patches/2019-08/msg00031.html
> 
> Martin
> 
> Martin Liska (3):
>   Use argparse.ArgumentParser for mklog.
>   mklog: parse PR references from new test files
>   mklog: Do not print changed functions in testsuite
> 
>  contrib/mklog | 98 ---
>  1 file changed, 47 insertions(+), 51 deletions(-)
> 



Re: [PATCH] Come up with json::integer_number and use it in GCOV.

2019-08-30 Thread Martin Liška
PING^2

On 8/26/19 2:34 PM, Martin Liška wrote:
> PING^1
> 
> On 8/13/19 1:51 PM, Martin Liška wrote:
>> On 8/2/19 2:40 PM, David Malcolm wrote:
>>> Something that occurred to me reading the updated patch: maybe it would
>>> make things easier to have utility member functions of json::object to
>>> implicitly make the child, e.g.:
>>>
>>> void
>>> json::object::set (const char *key, long v)
>>> {
>>>set (key, new json::integer_number (v));
>>> }
>>>
>>> so that all those calls can be just:
>>>
>>>   obj->set ("line", exploc.line);
>>>   obj->set ("column", exploc.column);
>>>
>>> etc (assuming overloading is unambiguous).
>>>
>>> But that's probably orthogonal to this patch.
>>
>> Looks good to me. It's a candidate for a follow up patch.
>>
>>>
>>>
 And I changed all occurrences of float_number with integer_number
 as you suggested.
>>> Thanks.
>>>
 I'm currently testing the updated patch.
 Martin
>>> The updated patch looks good to me, but technically I'm not a reviewer
>>> for these files.
>>
>> Sure, I hope @Jakub or @Richi can approve me that?
>> Thanks,
>> Martin
>>
>>>
>>> Dave
>>
> 



Re: [PATCH] Generalized predicate/condition for parameter reference in IPA (PR ipa/91088)

2019-08-30 Thread Feng Xue OS
> I was merely suggesting a better comment describing the function you are
> introducing.
Oh. I know.  Good wording. 

> Thinking about it a bit more, I think you simply do not want to ever
> push the extra VIEW_CONVERT_EXPR to the vector and in
> evaluate_conditions_for_known_args always do a fold_convert to the
> desired type (similarly like we do today).
Yes. It is, but at cost of creating an extra memory slot.

Feng

Re: [PATCH] Generalized predicate/condition for parameter reference in IPA (PR ipa/91088)

2019-08-30 Thread Martin Jambor
Hi,

On Fri, Aug 30 2019, Feng Xue OS wrote:
>> (It's a bad idea to make ChangeLog entries part of the patch, it won't
>> apply to anyone, not even to you nowadays. )
> Got it. Will not include this kind of info in later patches.
>
>> I understand describing these things is difficult, but flatten is
>> strange way to describe what the function does.  What about somthing
>> like the following?
>> 
>> Analyze EXPR if it represents a series of simple operations performed on
>> a function parameter and return true if so.  FBI, STMT, INDEX_P, SIZE_P
>> and AGGPOS have the same meaning like in
>> unmodified_parm_or_parm_agg_item.  Operations on the parameter are
>> recorded to PARAM_OPS_P if it is not NULL.
> Operations should be recorded in some place, and this is why PARAM_OPS_P
> is used. Not quite understand this point.

I was merely suggesting a better comment describing the function you are
introducing.

>
>>> +   /* Find use of parameter, add a convert operation to describe
>>> +  result type, which may not be same as parameter type.  */
>>> +   eval_op.val_is_rhs = false;
>>> +   eval_op.val = NULL_TREE;
>>> +   eval_op.code = VIEW_CONVERT_EXPR;
>>> +   eval_op.type = TREE_TYPE (expr);
>>> +
>>> +   vec_safe_insert (*param_ops_p, 0, eval_op);
>
>> If we get here in the first iteration of the loop, could we not insert
>> anything into the vector and handle such cases in
>> evaluate_conditions_for_known_args like we do today (well, with
>> fold_convert might be better)?  It could save quite some memory and it
>> is important to try keep the memory footprint down in IPA summaries.
> Here is a little trick to make code of folding in 
> evaluate_conditions_for_known_args ()
> be simple. It does consume some memory for most cases. Will consider other way
> and remove this.

Thinking about it a bit more, I think you simply do not want to ever
push the extra VIEW_CONVERT_EXPR to the vector and in
evaluate_conditions_for_known_args always do a fold_convert to the
desired type (similarly like we do today).

Thanks,

Martin

>
>> Also, I think you want a parameter to limit the maximum length of
>> param_ops_p, at some point someone will come with some crazy
>> machine-generated code that will create huge vectors.
> Yes. Exactly.
>
> Thanks,
>
> Martin


[PATCH] Fix thinko in early bail out in tree-switch-conversion.

2019-08-30 Thread Martin Liška
Hi.

Thanks to Jakub, the patch addresses one memory leak in
bit_test_cluster::find_bit_tests (allocation of output).
And moreover, it implements proper guard when clustering
is not successful. In such situation we want a quick bail out.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2019-08-29  Martin Liska  

* tree-switch-conversion.c (jump_table_cluster::find_jump_tables):
Bail out when we'll end up with the same number of clusters as
at the beginning.
(bit_test_cluster::find_bit_tests): Likewise for bit tests.
(jump_table_cluster::can_be_handled): Remove the guard
as it's already handled in ::is_enabled.  Allocate output
after early bail out.
---
 gcc/tree-switch-conversion.c | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)


diff --git a/gcc/tree-switch-conversion.c b/gcc/tree-switch-conversion.c
index 776db77f53a..988e923d7e8 100644
--- a/gcc/tree-switch-conversion.c
+++ b/gcc/tree-switch-conversion.c
@@ -1213,14 +1213,14 @@ jump_table_cluster::find_jump_tables (vec &clusters)
 }
 
   /* No result.  */
-  if (min[l].m_count == INT_MAX)
+  if (min[l].m_count == l)
 return clusters.copy ();
 
   vec output;
   output.create (4);
 
   /* Find and build the clusters.  */
-  for (int end = l;;)
+  for (unsigned int end = l;;)
 {
   int start = min[end].m_start;
 
@@ -1257,11 +1257,9 @@ jump_table_cluster::can_be_handled (const vec &clusters,
  make a sequence of conditional branches instead of a dispatch.
 
  The definition of "much bigger" depends on whether we are
- optimizing for size or for speed.  */
-  if (!flag_jump_tables)
-return false;
+ optimizing for size or for speed.
 
-  /* For algorithm correctness, jump table for a single case must return
+ For algorithm correctness, jump table for a single case must return
  true.  We bail out in is_beneficial if it's called just for
  a single case.  */
   if (start == end)
@@ -1311,9 +1309,6 @@ jump_table_cluster::is_beneficial (const vec &,
 vec
 bit_test_cluster::find_bit_tests (vec &clusters)
 {
-  vec output;
-  output.create (4);
-
   unsigned l = clusters.length ();
   auto_vec min;
   min.reserve (l + 1);
@@ -1336,9 +1331,12 @@ bit_test_cluster::find_bit_tests (vec &clusters)
 }
 
   /* No result.  */
-  if (min[l].m_count == INT_MAX)
+  if (min[l].m_count == l)
 return clusters.copy ();
 
+  vec output;
+  output.create (4);
+
   /* Find and build the clusters.  */
   for (unsigned end = l;;)
 {



[PATCH] Consider also negative edges in cycle detection.

2019-08-30 Thread Martin Liška
Hi.

The patch is enhancement of r271117 where I started detecting zero
cycles. We also need consider negative edges.

Patch survives gcov.exp and I'll install it after proper testing
next week.

Thanks,
Martin

gcc/ChangeLog:

2019-08-30  Martin Liska  

* gcov.c (path_contains_zero_cycle_arc): Rename to ...
(path_contains_zero_or_negative_cycle_arc): ... this and handle
also negative edges.
(circuit): Handle also negative edges as they can happen
in some situations.
---
 gcc/gcov.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)


diff --git a/gcc/gcov.c b/gcc/gcov.c
index c65b7153765..f4e65ee46da 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -730,10 +730,10 @@ unblock (const block_info *u, block_vector_t &blocked,
 /* Return true when PATH contains a zero cycle arc count.  */
 
 static bool
-path_contains_zero_cycle_arc (arc_vector_t &path)
+path_contains_zero_or_negative_cycle_arc (arc_vector_t &path)
 {
   for (unsigned i = 0; i < path.size (); i++)
-if (path[i]->cs_count == 0)
+if (path[i]->cs_count <= 0)
   return true;
   return false;
 }
@@ -759,7 +759,7 @@ circuit (block_info *v, arc_vector_t &path, block_info *start,
 {
   block_info *w = arc->dst;
   if (w < start
-	  || arc->cs_count == 0
+	  || arc->cs_count <= 0
 	  || !linfo.has_block (w))
 	continue;
 
@@ -770,7 +770,7 @@ circuit (block_info *v, arc_vector_t &path, block_info *start,
 	  handle_cycle (path, count);
 	  loop_found = true;
 	}
-  else if (!path_contains_zero_cycle_arc (path)
+  else if (!path_contains_zero_or_negative_cycle_arc (path)
 	   &&  find (blocked.begin (), blocked.end (), w) == blocked.end ())
 	loop_found |= circuit (w, path, start, blocked, block_lists, linfo,
 			   count);
@@ -785,7 +785,7 @@ circuit (block_info *v, arc_vector_t &path, block_info *start,
   {
 	block_info *w = arc->dst;
 	if (w < start
-	|| arc->cs_count == 0
+	|| arc->cs_count <= 0
 	|| !linfo.has_block (w))
 	  continue;
 



Re: [ARM/FDPIC v5 09/21] [ARM] FDPIC: Add support for taking address of nested function

2019-08-30 Thread Kyrill Tkachov



On 8/29/19 4:36 PM, Christophe Lyon wrote:

On 31/07/2019 16:44, Christophe Lyon wrote:

On Tue, 16 Jul 2019 at 14:42, Kyrill Tkachov
 wrote:



On 7/16/19 12:18 PM, Kyrill Tkachov wrote:

Hi Christophe

On 5/15/19 1:39 PM, Christophe Lyon wrote:

In FDPIC mode, the trampoline generated to support pointers to nested
functions looks like:

    .word trampoline address
    .word trampoline GOT address
    ldr    r12, [pc, #8]
    ldr    r9, [pc, #8]
    ldr   pc, [pc, #8]
    .word static chain value
    .word GOT address
    .word function's address

because in FDPIC function pointers are actually pointers to function
descriptors, we have to actually generate a function descriptor for
the trampoline.

2019-XX-XX  Christophe Lyon 
 Mickaël Guêné 

 gcc/
 * config/arm/arm.c (arm_asm_trampoline_template): Add FDPIC
 support.
 (arm_trampoline_init): Likewise.
 (arm_trampoline_init): Likewise.
 * config/arm/arm.h (TRAMPOLINE_SIZE): Likewise.

Change-Id: Idc4d5f629ae4f8d79bdf9623517481d524a0c144

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 40e3f3b..99d13bf 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3976,13 +3976,50 @@ arm_warn_func_return (tree decl)
 .word static chain value
 .word function's address
 XXX FIXME: When the trampoline returns, r8 will be 
clobbered.  */

+/* In FDPIC mode, the trampoline looks like:
+  .word trampoline address
+  .word trampoline GOT address
+  ldr    r12, [pc, #8] ; #4 for Thumb2
+  ldr    r9,  [pc, #8] ; #4 for Thumb2
+  ldr   pc,  [pc, #8] ; #4 for Thumb2
+  .word static chain value
+  .word GOT address
+  .word function's address
+*/



I think this comment is not right for Thumb2.

These load instructionshave 32-bit encodings, even in Thumb2 (they use
high registers).


Andre and Wilco pointed out to me offline that the offset should be #4
for Arm mode.

The Arm ARM at E1.2.3 says:

PC, the program counter

* When executing an A32 instruction, PC reads as the address of the
current instruction plus 8.

* When executing a T32 instruction, PC reads as the address of the
current instruction plus 4.



Yes, it looks like the code is right, and the comment is wrong:
- offset 8 for thumb2 mode
- offset 4 for arm mode


Here is the updated version

Ok with a fixed ChangeLog (it currently mentions arm_trampoline_init 
twice but doesn't mention arm_trampoline_adjust_address)


Thanks,

Kyrill



Thanks,

Christophe


Thanks,

Kyrill




Also, please merge this comment with the one above (no separate /**/)



  static void
  arm_asm_trampoline_template (FILE *f)
  {
    fprintf (f, "\t.syntax unified\n");

-  if (TARGET_ARM)
+  if (TARGET_FDPIC)
+    {
+  /* The first two words are a function descriptor pointing 
to the

+    trampoline code just below.  */
+  if (TARGET_ARM)
+   fprintf (f, "\t.arm\n");
+  else if (TARGET_THUMB2)
+   fprintf (f, "\t.thumb\n");
+  else
+   /* Only ARM and Thumb-2 are supported.  */
+   gcc_unreachable ();
+
+  assemble_aligned_integer (UNITS_PER_WORD, const0_rtx);
+  assemble_aligned_integer (UNITS_PER_WORD, const0_rtx);
+  /* Trampoline code which sets the static chain register but 
also

+    PIC register before jumping into real code. */
+  asm_fprintf (f, "\tldr\t%r, [%r, #%d]\n",
+  STATIC_CHAIN_REGNUM, PC_REGNUM,
+  TARGET_THUMB2 ? 8 : 4);
+  asm_fprintf (f, "\tldr\t%r, [%r, #%d]\n",
+  PIC_OFFSET_TABLE_REGNUM, PC_REGNUM,
+  TARGET_THUMB2 ? 8 : 4);
+  asm_fprintf (f, "\tldr\t%r, [%r, #%d]\n",
+  PC_REGNUM, PC_REGNUM,
+  TARGET_THUMB2 ? 8 : 4);



As above, I think the offset should be 8 for both Arm and Thumb2.

Thanks,

Kyrill



+  assemble_aligned_integer (UNITS_PER_WORD, const0_rtx);
+    }
+  else if (TARGET_ARM)
  {
    fprintf (f, "\t.arm\n");
    asm_fprintf (f, "\tldr\t%r, [%r, #0]\n", STATIC_CHAIN_REGNUM,
PC_REGNUM);
@@ -4023,12 +4060,40 @@ arm_trampoline_init (rtx m_tramp, tree 
fndecl,

rtx chain_value)
    emit_block_move (m_tramp, assemble_trampoline_template (),
 GEN_INT (TRAMPOLINE_SIZE), BLOCK_OP_NORMAL);

-  mem = adjust_address (m_tramp, SImode, TARGET_32BIT ? 8 : 12);
-  emit_move_insn (mem, chain_value);
+  if (TARGET_FDPIC)
+    {
+  rtx funcdesc = XEXP (DECL_RTL (fndecl), 0);
+  rtx fnaddr = gen_rtx_MEM (Pmode, funcdesc);
+  rtx gotaddr = gen_rtx_MEM (Pmode, plus_constant (Pmode,
funcdesc, 4));
+  /* The function start address is at offset 8, but in Thumb 
mode

+    we want bit 0 set to 1 to indicate Thumb-ness, hence 9
+    below.  */
+  rtx trampoline_code_start
+   = plus_constant (Pmode, XEXP (m_tr

Re: [PATCH] Generalized predicate/condition for parameter reference in IPA (PR ipa/91088)

2019-08-30 Thread Feng Xue OS
> (It's a bad idea to make ChangeLog entries part of the patch, it won't
> apply to anyone, not even to you nowadays. )
Got it. Will not include this kind of info in later patches.

> I understand describing these things is difficult, but flatten is
> strange way to describe what the function does.  What about somthing
> like the following?
> 
> Analyze EXPR if it represents a series of simple operations performed on
> a function parameter and return true if so.  FBI, STMT, INDEX_P, SIZE_P
> and AGGPOS have the same meaning like in
> unmodified_parm_or_parm_agg_item.  Operations on the parameter are
> recorded to PARAM_OPS_P if it is not NULL.
Operations should be recorded in some place, and this is why PARAM_OPS_P
is used. Not quite understand this point.

>> +   /* Find use of parameter, add a convert operation to describe
>> +  result type, which may not be same as parameter type.  */
>> +   eval_op.val_is_rhs = false;
>> +   eval_op.val = NULL_TREE;
>> +   eval_op.code = VIEW_CONVERT_EXPR;
>> +   eval_op.type = TREE_TYPE (expr);
>> +
>> +   vec_safe_insert (*param_ops_p, 0, eval_op);

> If we get here in the first iteration of the loop, could we not insert
> anything into the vector and handle such cases in
> evaluate_conditions_for_known_args like we do today (well, with
> fold_convert might be better)?  It could save quite some memory and it
> is important to try keep the memory footprint down in IPA summaries.
Here is a little trick to make code of folding in 
evaluate_conditions_for_known_args ()
be simple. It does consume some memory for most cases. Will consider other way
and remove this.

> Also, I think you want a parameter to limit the maximum length of
> param_ops_p, at some point someone will come with some crazy
> machine-generated code that will create huge vectors.
Yes. Exactly.

Thanks,

Martin

Re: [ARM/FDPIC v5 05/21] [ARM] FDPIC: Fix __do_global_dtors_aux and frame_dummy generation

2019-08-30 Thread Richard Sandiford
Christophe Lyon  writes:
> On 12/07/2019 08:06, Richard Sandiford wrote:
>> Christophe Lyon  writes:
>>> In FDPIC, we need to make sure __do_global_dtors_aux and frame_dummy
>>> are referenced by their address, not by pointers to the function
>>> descriptors.
>>>
>>> 2019-XX-XX  Christophe Lyon  
>>> Mickaël Guêné 
>>>
>>> * libgcc/crtstuff.c: Add support for FDPIC.
>>>
>>> Change-Id: I0bc4b1232fbf3c69068fb23a1b9cafc895d141b1
>>>
>>> diff --git a/libgcc/crtstuff.c b/libgcc/crtstuff.c
>>> index 4927a9f..159b461 100644
>>> --- a/libgcc/crtstuff.c
>>> +++ b/libgcc/crtstuff.c
>>> @@ -429,9 +429,18 @@ __do_global_dtors_aux (void)
>>>   #ifdef FINI_SECTION_ASM_OP
>>>   CRT_CALL_STATIC_FUNCTION (FINI_SECTION_ASM_OP, __do_global_dtors_aux)
>>>   #elif defined (FINI_ARRAY_SECTION_ASM_OP)
>>> +#if defined(__FDPIC__)
>>> +__asm__(
>>> +"   .section .fini_array\n"
>>> +"   .align 2\n"
>>> +"   .word __do_global_dtors_aux\n"
>>> +);
>>> +asm (TEXT_SECTION_ASM_OP);
>>> +#else /* defined(__FDPIC__) */
>>>   static func_ptr __do_global_dtors_aux_fini_array_entry[]
>>> __attribute__ ((__used__, section(".fini_array"), 
>>> aligned(sizeof(func_ptr
>>> = { __do_global_dtors_aux };
>>> +#endif /* defined(__FDPIC__) */
>>>   #else /* !FINI_SECTION_ASM_OP && !FINI_ARRAY_SECTION_ASM_OP */
>>>   static void __attribute__((used))
>>>   __do_global_dtors_aux_1 (void)
>> 
>> It'd be good to avoid hard-coding the pointer size.  Would it work to do:
>> 
>> __asm__("\t.equ\.t__do_global_dtors_aux_alias, __do_global_dtors_aux\n");
>> extern char __do_global_dtors_aux_alias;
>> static void *__do_global_dtors_aux_fini_array_entry[]
>> __attribute__ ((__used__, section(".fini_array"), aligned(sizeof(void 
>> *
>> = { &__do_global_dtors_aux_alias };
>> 
>> ?  Similarly for the init_array.
>> 
> OK, done.
>
>> AFAICT this and 02/21 are the only patches that aren't Arm-specific,
>> is that right?
>> 
>> Thanks,
>> Richard
>> .
>> 
>
> From ea0eee1ddeddef92277ae68eac4af28994c2902c Mon Sep 17 00:00:00 2001
> From: Christophe Lyon 
> Date: Thu, 8 Feb 2018 11:12:52 +0100
> Subject: [ARM/FDPIC v6 05/24] [ARM] FDPIC: Fix __do_global_dtors_aux and
>  frame_dummy generation
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> In FDPIC, we need to make sure __do_global_dtors_aux and frame_dummy
> are referenced by their address, not by pointers to the function
> descriptors.
>
> 2019-XX-XX  Christophe Lyon  
>   Micka«l Guªn© 
>
>   libgcc/
>   * libgcc/crtstuff.c: Add support for FDPIC.

OK, thanks.

Richard

>
> Change-Id: I0bc4b1232fbf3c69068fb23a1b9cafc895d141b1
>
> diff --git a/libgcc/crtstuff.c b/libgcc/crtstuff.c
> index 4927a9f..6659039 100644
> --- a/libgcc/crtstuff.c
> +++ b/libgcc/crtstuff.c
> @@ -429,9 +429,17 @@ __do_global_dtors_aux (void)
>  #ifdef FINI_SECTION_ASM_OP
>  CRT_CALL_STATIC_FUNCTION (FINI_SECTION_ASM_OP, __do_global_dtors_aux)
>  #elif defined (FINI_ARRAY_SECTION_ASM_OP)
> +#if defined(__FDPIC__)
> +__asm__("\t.equ\t__do_global_dtors_aux_alias, __do_global_dtors_aux\n");
> +extern char __do_global_dtors_aux_alias;
> +static void *__do_global_dtors_aux_fini_array_entry[]
> +__attribute__ ((__used__, section(".fini_array"), aligned(sizeof(void *
> + = { &__do_global_dtors_aux_alias };
> +#else /* defined(__FDPIC__) */
>  static func_ptr __do_global_dtors_aux_fini_array_entry[]
>__attribute__ ((__used__, section(".fini_array"), 
> aligned(sizeof(func_ptr
>= { __do_global_dtors_aux };
> +#endif /* defined(__FDPIC__) */
>  #else /* !FINI_SECTION_ASM_OP && !FINI_ARRAY_SECTION_ASM_OP */
>  static void __attribute__((used))
>  __do_global_dtors_aux_1 (void)
> @@ -473,9 +481,17 @@ frame_dummy (void)
>  #ifdef __LIBGCC_INIT_SECTION_ASM_OP__
>  CRT_CALL_STATIC_FUNCTION (__LIBGCC_INIT_SECTION_ASM_OP__, frame_dummy)
>  #else /* defined(__LIBGCC_INIT_SECTION_ASM_OP__) */
> +#if defined(__FDPIC__)
> +__asm__("\t.equ\t__frame_dummy_alias, frame_dummy\n");
> +extern char __frame_dummy_alias;
> +static void *__frame_dummy_init_array_entry[]
> +__attribute__ ((__used__, section(".init_array"), aligned(sizeof(void *
> + = { &__frame_dummy_alias };
> +#else /* defined(__FDPIC__) */
>  static func_ptr __frame_dummy_init_array_entry[]
>__attribute__ ((__used__, section(".init_array"), 
> aligned(sizeof(func_ptr
>= { frame_dummy };
> +#endif /* defined(__FDPIC__) */
>  #endif /* !defined(__LIBGCC_INIT_SECTION_ASM_OP__) */
>  #endif /* USE_EH_FRAME_REGISTRY || USE_TM_CLONE_REGISTRY */


Re: [PATCH V6 05/11] bpf: new GCC port

2019-08-30 Thread Richard Sandiford
jema...@gnu.org (Jose E. Marchesi) writes:
> This patch adds a port for the Linux kernel eBPF architecture to GCC.
>
> ChangeLog:
>
>   * configure.ac: Support for bpf-*-* targets.
>   * configure: Regenerate.
>
> contrib/ChangeLog:
>
>   * config-list.mk (LIST): Disable go in bpf-*-* targets.
>
> gcc/ChangeLog:
>
>   * config.gcc: Support for bpf-*-* targets.
>   * common/config/bpf/bpf-common.c: New file.
>   * config/bpf/t-bpf: Likewise.
>   * config/bpf/predicates.md: Likewise.
>   * config/bpf/constraints.md: Likewise.
>   * config/bpf/bpf.opt: Likewise.
>   * config/bpf/bpf.md: Likewise.
>   * config/bpf/bpf.h: Likewise.
>   * config/bpf/bpf.c: Likewise.
>   * config/bpf/bpf-protos.h: Likewise.
>   * config/bpf/bpf-opts.h: Likewise.
>   * config/bpf/bpf-helpers.h: Likewise.
>   * config/bpf/bpf-helpers.def: Likewise.

Looks good to me, thanks.  Jeff also had detailed comments, so please
wait for an ack from him too.

Richard


Re: [PATCH] Setup predicate for switch default case in IPA (PR ipa/91089)

2019-08-30 Thread Feng Xue OS
That's good. Thanks for your comments.

Feng


From: Martin Jambor 
Sent: Thursday, August 29, 2019 11:00 PM
To: Feng Xue OS; gcc-patches@gcc.gnu.org; Jan Hubicka
Subject: Re: [PATCH] Setup predicate for switch default case in IPA (PR 
ipa/91089)

Hi,

On Fri, Jul 12 2019, Feng Xue OS wrote:
> IPA does not construct executability predicate for default case of switch 
> statement.
> So execution cost of default case is not properly evaluated in IPA-cp, this 
> might
> prevent function clone for function containing switch statement, if certain 
> non-default
> case is proved to be executed after constant propagation.
>
> This patch is composed to deduce predicate for default case, if it turns out 
> to be a
> relative simple one, for example, we can try to merge case range, and use
> comparison upon range bounds, and also range analysis information to simplify
> predicate.
>

I have read through the patch and it looks OK to me but I cannot approve
it, you have to ping Honza for that.  Since you decided to use the value
range info, it would be nice if you could also add a testcase where it
plays a role.  Also, please don't post changelog entries as a part of
the patch, it basically guarantees it will not apply for anybody, not
even for you when you update your trunk.

Thanks for working on this,

Martin


> Feng
>
> 
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 3d92250b520..4de2f568990 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,10 @@
> +2019-07-12  Feng Xue  
> +
> + PR ipa/91089
> + * ipa-fnsummary.c (set_switch_stmt_execution_predicate): Add predicate
> + for switch default case using range analysis information.
> + * params.def (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS): New.
> +
>  2019-07-11  Sunil K Pandey  
>


Re: [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8.

2019-08-30 Thread Uros Bizjak
On Fri, Aug 30, 2019 at 9:22 AM Richard Biener
 wrote:
>
> On Thu, Aug 29, 2019 at 9:54 AM Uros Bizjak  wrote:
> >
> > On Wed, Aug 28, 2019 at 5:12 PM Uros Bizjak  wrote:
> > >
> > > Attached patch improves costing for STV shifts and corrects reject
> > > condition for out of range shift count operands.
> > >
> > > 2019-08-28  Uroš Bizjak  
> > >
> > > * config/i386/i386-features.c
> > > (general_scalar_chain::compute_convert_gain):
> > > Correct cost for double-word shifts.
> > > (general_scalar_to_vector_candidate_p): Reject count operands
> > > greater or equal to mode bitsize.
> > >
> > > Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
> > >
> > > Committed to mainline SVN.
> >
> > Ouch... I mixed up patches and actually committed the patch that
> > removes maximum from cost of sse<->int moves.
> >
> > I can leave the patch for a day, so we can see the effects of the cost
> > change, and if the patch creates problems, I'll revert it.
>
> Regresses gromacs and namd quite a bit on Haswell, also perl in SPEC 2000.
> I guess we should try understand why rather than reverting immediately
> (I'd leave it in at least another few days to get more testers pick up the
> rev.).

The correct patch is actually [1] (I have updated the subject):

2019-08-28  Uroš Bizjak  

* config/i386/i386.c (ix86_register_move_cost): Do not
limit the cost of moves to/from XMM register to minimum 8.

There is no technical reason to limit the costs to minimum 8 anymore,
and several targets (e.g skylake) also claim that moves between SSE
and general regs are as cheap as moves between general regs. However,
these values were never benchmarked properly due to the mentioned
limitation and apparently cause some unwanted secondary effects
(lowering the clock).

I agree with your proposal to leave the change in the tree some more
time. At the end, we could raise the costs for all targets to 8 to
effectively revert to the previous state.

[1] https://gcc.gnu.org/ml/gcc-patches/2019-08/msg02009.html

Uros.


Re: [PR 91579] Avoid creating redundant PHI nodes in tail-call pass

2019-08-30 Thread Richard Biener
On Thu, Aug 29, 2019 at 3:56 PM Martin Jambor  wrote:
>
> Hi,
>
> On Thu, Aug 29 2019, Richard Biener wrote:
> > On Thu, Aug 29, 2019 at 11:04 AM Martin Jambor  wrote:
> >>
> >> Hi,
> >>
> >> when turning a tail-recursive call into a loop, the tail-call pass
> >> creates a phi node for each gimple_reg function parameter that has any
> >> use at all, even when the value passed to the original call is the same
> >> as the received one, when it is the parameter's default definition.
> >> This results in a redundant phi node in which one argument is the
> >> default definition and all others are the LHS of the same phi node.  See
> >> the Bugzilla entry for an example.  These phi nodes in turn confuses
> >> ipa-prop.c which cannot skip them and may not create a pass-through jump
> >> function when it should.
> >>
> >> Fixed by the following patch which just adds a bitmap to remember where
> >> there are non-default-defs passed to a tail-recursive call and then
> >> creates phi nodes only for such parameters.  It has passed bootstrap and
> >> testing on x86_64-linux.
> >>
> >> OK for trunk?
> >
> > OK.  Eventually arg_needs_copy_p can be elided completely
> > and pre-computed into the bitmap so we can just check
> > the positional?  And rename the bitmap to arg_need_copy itself.
>
> Like this?  Bootstrapped and tested on x86_64-linux.

Yes.  This is OK.

Thanks,
Richard.

> Thanks,
>
> Martin
>
>
> 2019-08-29  Martin Jambor  
>
> tree-optimization/91579
> * tree-tailcall.c (tailr_arg_needs_copy): New variable.
> (find_tail_calls): Allocate tailr_arg_needs_copy and set its bits as
> appropriate.
> (arg_needs_copy_p): Removed.
> (eliminate_tail_call): Test tailr_arg_needs_copy instead of calling
> arg_needs_copy_p.
> (tree_optimize_tail_calls_1): Likewise.  Free tailr_arg_needs_copy.
>
> testsuite/
> * gcc.dg/tree-ssa/pr91579.c: New test.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/pr91579.c | 22 
>  gcc/tree-tailcall.c | 48 +
>  2 files changed, 47 insertions(+), 23 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr91579.c
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr91579.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr91579.c
> new file mode 100644
> index 000..ee752be1a85
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr91579.c
> @@ -0,0 +1,22 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-tailr1" } */
> +
> +typedef long unsigned int size_t;
> +typedef int (*compare_t)(const void *, const void *);
> +
> +int partition (void *base, size_t nmemb, size_t size, compare_t cmp);
> +
> +void
> +my_qsort (void *base, size_t nmemb, size_t size, compare_t cmp)
> +{
> +  int pt;
> +  if (nmemb > 1)
> +{
> +  pt = partition (base, nmemb, size, cmp);
> +  my_qsort (base, pt + 1, size, cmp);
> +  my_qsort ((void*)((char*) base + (pt + 1) * size),
> +   nmemb - pt - 1, size, cmp);
> +}
> +}
> +
> +/* { dg-final { scan-tree-dump-not "cmp\[^\r\n\]*PHI" "tailr1" } } */
> diff --git a/gcc/tree-tailcall.c b/gcc/tree-tailcall.c
> index a4b563efd73..4824a5e650f 100644
> --- a/gcc/tree-tailcall.c
> +++ b/gcc/tree-tailcall.c
> @@ -126,6 +126,11 @@ struct tailcall
> accumulator.  */
>  static tree m_acc, a_acc;
>
> +/* Bitmap with a bit for each function parameter which is set to true if we
> +   have to copy the parameter for conversion of tail-recursive calls.  */
> +
> +static bitmap tailr_arg_needs_copy;
> +
>  static bool optimize_tail_call (struct tailcall *, bool);
>  static void eliminate_tail_call (struct tailcall *);
>
> @@ -727,6 +732,18 @@ find_tail_calls (basic_block bb, struct tailcall **ret)
>   gimple_stmt_iterator mgsi = gsi_for_stmt (stmt);
>   gsi_move_before (&mgsi, &gsi);
> }
> +  if (!tailr_arg_needs_copy)
> +   tailr_arg_needs_copy = BITMAP_ALLOC (NULL);
> +  for (param = DECL_ARGUMENTS (current_function_decl), idx = 0;
> +  param;
> +  param = DECL_CHAIN (param), idx++)
> +   {
> + tree ddef, arg = gimple_call_arg (call, idx);
> + if (is_gimple_reg (param)
> + && (ddef = ssa_default_def (cfun, param))
> + && (arg != ddef))
> +   bitmap_set_bit (tailr_arg_needs_copy, idx);
> +   }
>  }
>
>nw = XNEW (struct tailcall);
> @@ -905,25 +922,6 @@ decrease_profile (basic_block bb, profile_count count)
>  }
>  }
>
> -/* Returns true if argument PARAM of the tail recursive call needs to be 
> copied
> -   when the call is eliminated.  */
> -
> -static bool
> -arg_needs_copy_p (tree param)
> -{
> -  tree def;
> -
> -  if (!is_gimple_reg (param))
> -return false;
> -
> -  /* Parameters that are only defined but never used need not be copied.  */
> -  def = ssa_default_def (cfun, param);
> -  if (!def)
> -return false;
> -
> -  return true;
> -}
> -
>  /* Eliminates tail call described by

Re: [PATCH] use fallback location for warning (PR 91599)

2019-08-30 Thread Richard Biener
On Fri, Aug 30, 2019 at 3:58 AM Martin Sebor  wrote:
>
> warning_at() calls with the %G directive rely on the gimple statement
> for both their location and the inlining context.  When the statement
> is not associated with a location, the warning doesn't point at any
> line even if the location argument passed to the call was valid.
> The attached patch changes the percent_G_percent handler to fall back
> on the provided location in that case, and the recently added warning
> for char assignments to pass to the function a fallback location if
> the statement doesn't have one.
>
> Tested on x86_64-linux.

OK unless David has any further comments.

Richard.

>
> Martin


Re: [PATCH] correct MEM_REF bounds checking of arrays (PR 91584)

2019-08-30 Thread Richard Biener
On Fri, Aug 30, 2019 at 12:36 AM Martin Sebor  wrote:
>
> The -Warray-bounds enhancement I added to GCC 9 causes false
> positives in languages like Fortran whose first array element
> is at a non-zero index.  The attached patch has the function
> responsible for the warning normalize the array bounds to
> always start at zero to avoid these false positives.
>
> Tested on x86_64-linux.

OK.

Thanks,
Richard.

> Martin


Re: [PR91598] Improve autoprefetcher heuristic in haifa-sched.c

2019-08-30 Thread Richard Biener
On Thu, Aug 29, 2019 at 7:36 PM Alexander Monakov  wrote:
>
> On Thu, 29 Aug 2019, Maxim Kuvyrkov wrote:
>
> > >> r1 = [rb + 0]
> > >> 
> > >> r2 = [rb + 8]
> > >> 
> > >> r3 = [rb + 16]
> > >> 
> > >>
> > >> which, apparently, cortex-a53 autoprefetcher doesn't recognize.  This
> > >> schedule happens because r2= load gets lower priority than the
> > >> "irrelevant"  due to the above patch.
> > >>
> > >> If we think about it, the fact that "r1 = [rb + 0]" can be scheduled
> > >> means that true dependencies of all similar base+offset loads are
> > >> resolved.  Therefore, for autoprefetcher-friendly schedule we should
> > >> prioritize memory reads before "irrelevant" instructions.
> > >
> > > But isn't there also max number of load issues in a fetch window to 
> > > consider?
> > > So interleaving arithmetic with loads might be profitable.
> >
> > It appears that cores with autoprefetcher hardware prefer loads and stores
> > bundled together, not interspersed with other instructions to occupy the 
> > rest
> > of CPU units.
>
> Let me point out that the motivating example has a bigger effect in play:
>
> (1) r1 = [rb + 0]
> (2) 
> (3) r2 = [rb + 8]
> (4) 
> (5) r3 = [rb + 16]
> (6) 
>
> here Cortex-A53, being an in-order core, cannot issue the load at (3) until
> after the load at (1) has completed, because the use at (2) depends on it.
> The good schedule allows the three loads to issue in a pipelined fashion.

OK, so with dispatch/issue issues I was thinking of scheduling independent
work like AGU ops inbetween the loads.  It might be that some in-order
cores like to see two adjacent loads to fire auto-prefetching but any such
heuristic should probably be very sub-architecture specific.

> So essentially the main issue is not a hardware peculiarity, but rather the
> bad schedule being totally wrong (it could only make sense if loads had 
> 1-cycle
> latency, which they do not).
>
> I think this highlights how implementing this autoprefetch heuristic via the
> dfa_lookahead_guard interface looks questionable in the first place, but the
> patch itself makes sense to me.
>
> Alexander


Re: [PATCH, i386]: Improve STV conversion of shifts

2019-08-30 Thread Richard Biener
On Thu, Aug 29, 2019 at 9:54 AM Uros Bizjak  wrote:
>
> On Wed, Aug 28, 2019 at 5:12 PM Uros Bizjak  wrote:
> >
> > Attached patch improves costing for STV shifts and corrects reject
> > condition for out of range shift count operands.
> >
> > 2019-08-28  Uroš Bizjak  
> >
> > * config/i386/i386-features.c
> > (general_scalar_chain::compute_convert_gain):
> > Correct cost for double-word shifts.
> > (general_scalar_to_vector_candidate_p): Reject count operands
> > greater or equal to mode bitsize.
> >
> > Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
> >
> > Committed to mainline SVN.
>
> Ouch... I mixed up patches and actually committed the patch that
> removes maximum from cost of sse<->int moves.
>
> I can leave the patch for a day, so we can see the effects of the cost
> change, and if the patch creates problems, I'll revert it.

Regresses gromacs and namd quite a bit on Haswell, also perl in SPEC 2000.
I guess we should try understand why rather than reverting immediately
(I'd leave it in at least another few days to get more testers pick up the
rev.).

Richard.

> Sorry for the mixup,
> Uros.


Re: [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8.

2019-08-30 Thread Hongtao Liu
On Fri, Aug 30, 2019 at 2:18 PM Uros Bizjak  wrote:
>
> On Fri, Aug 30, 2019 at 2:08 AM Hongtao Liu  wrote:
> >
> > On Fri, Aug 30, 2019 at 2:09 AM Uros Bizjak  wrote:
> > >
> > > 2019-08-28  Uroš Bizjak  
> > >
> > > * config/i386/i386.c (ix86_register_move_cost): Do not
> > > limit the cost of moves to/from XMM register to minimum 8.
> > >
> > > Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
> > >
> > > Actually committed as r274994 with the wrong ChangeLog.
> > >
> > > Uros.
> >
> > There is 11% regression in 548.exchange_r of SPEC2017.
> >
> > Reason for the regression:
> > For 548.exchange_r, a lot of movements between gpr and xmm are
> > generated as expected,
> > and it reduced  clocksticks by 3%.
>
> This is OK, and expected from the patch.
>
> > But  however maybe too many xmm registers are used,
> > a frequency reduction issue is triggered(average frequency reduced by 13%).
> > So totally it takes more time.
>
> This is a secondary effect that is currently not modelled by the compiler.
>
> However, I expected that SSE <-> int moves in x86-tune-cost.h will
> have to be retuned. Up to now, both directions were limited to minimum
> 8, so any value lower than 8 was ignored. However, minimum was set to
> work-around certain limitation in reload, which is not needed anymore.
>
> You can simply set the values of SSE <-> int moves to 8 (which is an
> arbitrary value!) to restore the previous behaviour, but I think that
> a more precise cost value should be determined, probably a different
> one for each direction. But until register pressure effects are
> modelled, any artificially higher value will represent a workaround
> and not the true reg-reg move cost.
>
> Uros.

Yes,  we're still working on skylake_cost model.

-- 
BR,
Hongtao


Re: [PATCH v2 0/9] S/390: Use signaling FP comparison instructions

2019-08-30 Thread Richard Biener
On Fri, Aug 30, 2019 at 9:12 AM Richard Biener
 wrote:
>
> On Thu, Aug 29, 2019 at 5:39 PM Ilya Leoshkevich  wrote:
> >
> > > Am 22.08.2019 um 15:45 schrieb Ilya Leoshkevich :
> > >
> > > Bootstrap and regtest running on x86_64-redhat-linux and
> > > s390x-redhat-linux.
> > >
> > > This patch series adds signaling FP comparison support (both scalar and
> > > vector) to s390 backend.
> >
> > I'm running into a problem on ppc64 with this patch, and it would be
> > great if someone could help me figure out the best way to resolve it.
> >
> > vector36.C test is failing because gimplifier produces the following
> >
> >   _5 = _4 > { 2.0e+0, 2.0e+0, 2.0e+0, 2.0e+0 };
> >   _6 = VEC_COND_EXPR <_5, { -1, -1, -1, -1 }, { 0, 0, 0, 0 }>;
> >
> > from
> >
> >   VEC_COND_EXPR < (*b > { 2.0e+0, 2.0e+0, 2.0e+0, 2.0e+0 }) ,
> >   { -1, -1, -1, -1 } ,
> >   { 0, 0, 0, 0 } >
> >
> > Since the comparison tree code is now hidden behind a temporary, my code
> > does not have anything to pass to the backend.  The reason for creating
> > a temporary is that the comparison can trap, and so the following check
> > in gimplify_expr fails:
> >
> >   if (gimple_seq_empty_p (internal_post) && (*gimple_test_f) (*expr_p))
> > goto out;
> >
> > gimple_test_f is is_gimple_condexpr, and it eventually calls
> > operation_could_trap_p (GT).
> >
> > My current solution is to simply state that backend does not support
> > SSA_NAME in vector comparisons, however, I don't like it, since it may
> > cause performance regressions due to having to fall back to scalar
> > comparisons.
> >
> > I was thinking about two other possible solutions:
> >
> > 1. Change the gimplifier to allow trapping vector comparisons.  That's
> >a bit complicated, because tree_could_throw_p checks not only for
> >floating point traps, but also e.g. for array index out of bounds
> >traps.  So I would have to create a tree_could_throw_p version which
> >disregards specific kinds of traps.
> >
> > 2. Change expand_vector_condition to follow SSA_NAME_DEF_STMT and use
> >its tree_code instead of SSA_NAME.  The potential problem I see with
> >this is that there appears to be no guarantee that _5 will be inlined
> >into _6 at a later point.  So if we say that we don't need to fall
> >back to scalar comparisons based on availability of vector >
> >instruction and inlining does not happen, then what's actually will
> >be required is vector selection (vsel on S/390), which might not be
> >available in general case.
> >
> > What would be a better way to proceed here?
>
> On GIMPLE there isn't a good reason to split out trapping comparisons
> from [VEC_]COND_EXPR - the gimplifier does this for GIMPLE_CONDs
> where it is important because we'd have no way to represent EH info
> when not done.  It might be a bit awkward to preserve EH across RTL
> expansion though in case the [VEC_]COND_EXPR are not expanded
> as a single pattern, but I'm not sure.
>
> To go this route you'd have to split the is_gimple_condexpr check
> I guess and eventually users turning [VEC_]COND_EXPR into conditional
> code (do we have any?) have to be extra careful then.

Oh, btw - the fact that we have an expression embedded in [VEC_]COND_EXPR
is something that bothers me for quite some time already and it makes
things like VN awkward and GIMPLE fincky.  We've discussed alternatives
to dead with the simplest being moving the comparison out to a separate
stmt and others like having four operand [VEC_]COND_{EQ,NE,...}_EXPR
codes or simply treating {EQ,NE,...}_EXPR as quarternary on GIMPLE
with either optional 3rd and 4th operand (defaulting to boolean_true/false_node)
or always explicit ones (and thus dropping [VEC_]COND_EXPR).

What does LLVM do here?

Richard.

> Richard.
>
> >


Re: [PATCH v2 0/9] S/390: Use signaling FP comparison instructions

2019-08-30 Thread Richard Biener
On Thu, Aug 29, 2019 at 5:39 PM Ilya Leoshkevich  wrote:
>
> > Am 22.08.2019 um 15:45 schrieb Ilya Leoshkevich :
> >
> > Bootstrap and regtest running on x86_64-redhat-linux and
> > s390x-redhat-linux.
> >
> > This patch series adds signaling FP comparison support (both scalar and
> > vector) to s390 backend.
>
> I'm running into a problem on ppc64 with this patch, and it would be
> great if someone could help me figure out the best way to resolve it.
>
> vector36.C test is failing because gimplifier produces the following
>
>   _5 = _4 > { 2.0e+0, 2.0e+0, 2.0e+0, 2.0e+0 };
>   _6 = VEC_COND_EXPR <_5, { -1, -1, -1, -1 }, { 0, 0, 0, 0 }>;
>
> from
>
>   VEC_COND_EXPR < (*b > { 2.0e+0, 2.0e+0, 2.0e+0, 2.0e+0 }) ,
>   { -1, -1, -1, -1 } ,
>   { 0, 0, 0, 0 } >
>
> Since the comparison tree code is now hidden behind a temporary, my code
> does not have anything to pass to the backend.  The reason for creating
> a temporary is that the comparison can trap, and so the following check
> in gimplify_expr fails:
>
>   if (gimple_seq_empty_p (internal_post) && (*gimple_test_f) (*expr_p))
> goto out;
>
> gimple_test_f is is_gimple_condexpr, and it eventually calls
> operation_could_trap_p (GT).
>
> My current solution is to simply state that backend does not support
> SSA_NAME in vector comparisons, however, I don't like it, since it may
> cause performance regressions due to having to fall back to scalar
> comparisons.
>
> I was thinking about two other possible solutions:
>
> 1. Change the gimplifier to allow trapping vector comparisons.  That's
>a bit complicated, because tree_could_throw_p checks not only for
>floating point traps, but also e.g. for array index out of bounds
>traps.  So I would have to create a tree_could_throw_p version which
>disregards specific kinds of traps.
>
> 2. Change expand_vector_condition to follow SSA_NAME_DEF_STMT and use
>its tree_code instead of SSA_NAME.  The potential problem I see with
>this is that there appears to be no guarantee that _5 will be inlined
>into _6 at a later point.  So if we say that we don't need to fall
>back to scalar comparisons based on availability of vector >
>instruction and inlining does not happen, then what's actually will
>be required is vector selection (vsel on S/390), which might not be
>available in general case.
>
> What would be a better way to proceed here?

On GIMPLE there isn't a good reason to split out trapping comparisons
from [VEC_]COND_EXPR - the gimplifier does this for GIMPLE_CONDs
where it is important because we'd have no way to represent EH info
when not done.  It might be a bit awkward to preserve EH across RTL
expansion though in case the [VEC_]COND_EXPR are not expanded
as a single pattern, but I'm not sure.

To go this route you'd have to split the is_gimple_condexpr check
I guess and eventually users turning [VEC_]COND_EXPR into conditional
code (do we have any?) have to be extra careful then.

Richard.

>