date:20190701

[PATCH, committed] PR 91048 Open ~/.mklog in string mode

2019-07-01 Thread Janne Blomqvist

Hi,

another leftover from the py3 conversion. Committed r272921 as obvious.

2019-07-02  Janne Blomqvist  

PR other/91048
* mklog (read_user_info): Open ~/.mklog in string mode.

Index: mklog
===
--- mklog(revision 272920)
+++ mklog(working copy)
@@ -100,7 +100,7 @@ EMAIL = ...
   mklog_conf = os.path.expanduser('~/.mklog')
   if os.path.exists(mklog_conf):
 attrs = {}
-f = open(mklog_conf, 'rb')
+f = open(mklog_conf)
 for s in f:
   if cache.match(r'^\s*([a-zA-Z0-9_]+)\s*=\s*(.*?)\s*$', s):
 attrs[cache.group(1)] = cache.group(2)


-- 
Janne Blomqvist

RE: Use ODR for canonical types construction in LTO

2019-07-01 Thread JiangNing OS

Hi,

FYI. This patch works for my application LTO build on aarch64.

Thanks,
-Jiangning

> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org 
> On Behalf Of Jan Hubicka
> Sent: Monday, July 1, 2019 8:22 PM
> To: Christophe Lyon 
> Cc: Eric Botcazou ; gcc Patches  patc...@gcc.gnu.org>; Richard Biener ; d...@dcepelik.cz;
> Martin Liška 
> Subject: Re: Use ODR for canonical types construction in LTO
> 
> Hi,
> this patch fixes the ICE. Problem is that va_arg type is pre-streamed and thus
> at stream-in time we have main variant constructed by LTO FE which
> is !CXX_ODR_P while vairants are ones comming from C++ FE which are ODR.
> It is safe to drop the flags here since we only care about main variants
> anyway.
> 
> I am testing the patch on aarch64 now.
> 
> Honza
> 
> Index: lto/lto-common.c
> 
> ===
> --- lto/lto-common.c  (revision 272846)
> +++ lto/lto-common.c  (working copy)
> @@ -568,8 +568,17 @@ lto_register_canonical_types_for_odr_typ
> 
>/* Register all remaining types.  */
>FOR_EACH_VEC_ELT (*types_to_register, i, t)
> -if (!TYPE_CANONICAL (t))
> -  gimple_register_canonical_type (t);
> +{
> +  /* For pre-streamed types like va-arg it is possible that main variant
> +  is !CXX_ODR_P while the variant (which is streamed) is.
> +  Copy CXX_ODR_P to make type verifier happy.  This is safe because
> +  in canonical type calculation we only consider main variants.
> +  However we can not change this flag before streaming is finished
> +  to not affect tree merging.  */
> +  TYPE_CXX_ODR_P (t) = TYPE_CXX_ODR_P (TYPE_MAIN_VARIANT (t));
> +  if (!TYPE_CANONICAL (t))
> +gimple_register_canonical_type (t);
> +}
>  }
> 
>

Re: [PATCH] integrate sprintf pass into strlen (PR 83431)

2019-07-01 Thread Martin Sebor


Attached is a more complete solution that fully resolves the bug
report by avoiding a warning in cases like:

  char a[32], b[8];

  void f (void)
  {
if (strlen (a) < sizeof b - 2)
  snprintf (b, sizeof b, "b=%s", a); // no -Wformat-truncation
  }

It does that by having get_range_strlen_dynamic use the EVRP
information for non-constant strlen calls: EVRP has recorded
that the result is less sizeof b - 2 and so the function
returns this limited range of lengths to snprintf which can
then avoid warning.  It also improves the checking and can
find latent bugs it missed before (like the off-by-one in
print-rtl.c).

Besides improving the accuracy of the -Wformat-overflow and
truncation warnings this can also result in better code.
So far this only benefits snprintf but there may be other
opportunities to string functions as well (e.g., strcmp or
memcmp).

Jeff, I looked into your question/suggestion for me last week
when we spoke, to introduce some sort of a recursion limit for
get_range_strlen_dynamic.  It's easily doable but before we go
down that path I did some testing to see how bad it can get and
to compare it with get_range_strlen.  Here are the results for
a few packages.  The dept is the maximum recursion depth, success
and fail are the numbers of successful and failed calls to
the function:

  binutils-gdb:
  depth   success fail
get_range_strlen:   319  830221621
get_range_strlen_dynamic:41  1503  161
  gcc:
get_range_strlen:46  721111365
get_range_strlen_dynamic:23 10272   12
  glibc:
get_range_strlen:76  284011422
get_range_strlen_dynamic:51  1186   46
  elfutils:
get_range_strlen:33  1198 2516
get_range_strlen_dynamic:31   685   36
  kernel:
get_range_strlen:25  529911698
get_range_strlen_dynamic:31  9911  122

Except the first case of get_range_strlen (I haven't looked into
it yet), it doesn't seem too bad, and with just one exception it's
better than get_range_strlen.  Let me know if you think it's worth
adding a parameter (I assume we'd use it for both functions) and
what to set it to.

On 6/11/19 5:26 PM, Martin Sebor wrote:

The sprintf and strlen passes both work with strings but
run independently of one another and don't share state.  As
a result, lengths of strings dynamically created by functions
that are available to the strlen pass are not available to
sprintf.  Conversely, lengths of strings formatted by
the sprintf functions are not made available to the strlen
pass.  The result is less efficient code, poor diagnostics,
and ultimately less than optimal user experience.

The attached patch is the first step toward rectifying this
design problem.  It integrates the two passes into one and
exposes the string length data managed by the strlen pass to
the sprintf "module."  (It does not expose any sprintf data
to the strlen pass yet.)

The sprintf pass invocations in passes.def have been replaced
with those of strlen.  The first "early" invocation is only
effective for the sprintf module to enable warnings without
optimization.  The second invocation is "late" and enables
both warnings and the sprintf and strlen optimizations unless
explicitly disabled via -fno-optimize-strlen.

Since the strlen optimization is the second invocation of
the pass tests that scan the strlen dump have been adjusted
to look for the "strlen2" dump file.

The changes in the patch are mostly mechanical.  The one new
"feature" worth calling out is the get_range_strlen_dynamic
function.  It's analogous to get_range_strlen in gimple-fold.c
except that it makes use of the strlen "dynamically" obtained
string length info.  In cases when the info is not available
the former calls the latter.

The other new functions in tree-ssa-strlen.c are
check_and_optimize_call and handle_integral_assign: I added
them only to keep the check_and_optimize_stmt function from
getting excessively long and hard to follow.  Otherwise,
the code in the functions is unchanged.

There are a number of further enhancements to consider as
the next steps:
  *  enhance the strlen module to determine string length range
     information from integer variable to which it was previously
     assigned (necessary to fully resolve pr83431 and pr90625),
  *  make the sprintf/snprintf string length results directly
     available to the strlen module,
  *  enhance the sprintf module to optimize snprintf(0, 0, fmt)
     calls with fmt strings consisting solely of plain characters
     and %s directives to series of strlen calls on the arguments,
  *  and more.

Martin


PR tree-optimization/83431 - -Wformat-truncation may incorrectly report truncation

gcc/ChangeLog:

	PR c++/83431
	* gimple-ssa-sprintf.c (pass_data_sprintf_length): Remove object.
	(sprintf_dom_walker): Remove class.
	(get_int_range): Make argument const.

Go patch committed: refactor Exports to encapsulate type refs map

2019-07-01 Thread Ian Lance Taylor

This Go frontend patch by Than McIntosh refactors the Export class to
encapsulate the type refs map.  This convert the Export::type_refs map
from a static object to a field contained (indirectly, via an impl
class) in Export itself, for better encapsulation and to be able to
reclaim its memory when exporting is done.  No change in compiler
functionality.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 272666)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-d3d0f3c5bbe9d272178d55bdb907b07c188800e1
+1e042a49d6f2e95d371301aa7b911522dc5877f4
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/export.cc
===
--- gcc/go/gofrontend/export.cc (revision 272608)
+++ gcc/go/gofrontend/export.cc (working copy)
@@ -41,14 +41,6 @@ const char Export::v2_magic[Export::magi
 
 const int Export::checksum_len;
 
-// Constructor.
-
-Export::Export(Stream* stream)
-  : stream_(stream), type_index_(1), packages_()
-{
-  go_assert(Export::checksum_len == Go_sha1_helper::checksum_len);
-}
-
 // Type hash table operations, treating aliases as distinct.
 
 class Type_hash_alias_identical
@@ -80,14 +72,31 @@ class Type_alias_identical
   }
 };
 
-// Mapping from Type objects to a constant index.  This would be nicer
-// as a field in Export, but then export.h would have to #include
-// types.h.
-
+// Mapping from Type objects to a constant index.
 typedef Unordered_map_hash(const Type*, int, Type_hash_alias_identical,
-  Type_alias_identical) Type_refs;
+   Type_alias_identical) Type_refs;
+
+// Implementation object for class Export.  Hidden implementation avoids
+// having to #include types.h in export.h, or use a static map.
+
+struct Export_impl {
+  Type_refs type_refs;
+};
 
-static Type_refs type_refs;
+// Constructor.
+
+Export::Export(Stream* stream)
+: stream_(stream), type_index_(1), packages_(), impl_(new Export_impl)
+{
+  go_assert(Export::checksum_len == Go_sha1_helper::checksum_len);
+}
+
+// Destructor.
+
+Export::~Export()
+{
+  delete this->impl_;
+}
 
 // A traversal class to collect functions and global variables
 // referenced by inlined functions.
@@ -635,7 +644,7 @@ Export::set_type_index(Type* type)
   type = type->forwarded();
 
   std::pair ins =
-type_refs.insert(std::make_pair(type, 0));
+this->impl_->type_refs.insert(std::make_pair(type, 0));
   if (!ins.second)
 {
   // We've already seen this type.
@@ -1011,8 +1020,8 @@ Export::write_types(int unexported_type_
 {
   // Map from type index to type.
   std::vector types(static_cast(this->type_index_));
-  for (Type_refs::const_iterator p = type_refs.begin();
-   p != type_refs.end();
+  for (Type_refs::const_iterator p = this->impl_->type_refs.begin();
+   p != this->impl_->type_refs.end();
++p)
 {
   if (p->second >= 0)
@@ -1152,8 +1161,8 @@ int
 Export::type_index(const Type* type)
 {
   type = type->forwarded();
-  Type_refs::const_iterator p = type_refs.find(type);
-  go_assert(p != type_refs.end());
+  Type_refs::const_iterator p = this->impl_->type_refs.find(type);
+  go_assert(p != this->impl_->type_refs.end());
   int index = p->second;
   go_assert(index != 0);
   return index;
@@ -1231,7 +1240,7 @@ Export::register_builtin_type(Gogo* gogo
   Named_object* named_object = gogo->lookup_global(name);
   go_assert(named_object != NULL && named_object->is_type());
   std::pair ins =
-type_refs.insert(std::make_pair(named_object->type_value(), code));
+this->impl_->type_refs.insert(std::make_pair(named_object->type_value(), 
code));
   go_assert(ins.second);
 
   // We also insert the underlying type.  We can see the underlying
@@ -1239,7 +1248,7 @@ Export::register_builtin_type(Gogo* gogo
   // fails--we expect duplications here, and it doesn't matter when
   // they occur.
   Type* real_type = named_object->type_value()->real_type();
-  type_refs.insert(std::make_pair(real_type, code));
+  this->impl_->type_refs.insert(std::make_pair(real_type, code));
 }
 
 // Class Export::Stream.
Index: gcc/go/gofrontend/export.h
===
--- gcc/go/gofrontend/export.h  (revision 272608)
+++ gcc/go/gofrontend/export.h  (working copy)
@@ -22,6 +22,7 @@ class Import_init_set;
 class Backend;
 class Temporary_statement;
 class Unnamed_label;
+struct Export_impl;
 
 // Codes used for the builtin types.  These are all negative to make
 // them easily distinct from the codes assigned by Export::write_type.
@@ -121,6 +122,7 @@ class Export : public String_dump
   };
 
   Export(Stream*);
+  ~Export();
 
   // Size of export data magic string (which includes version numb

Re: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)

2019-07-01 Thread Gaius Mulley

Segher Boessenkool  writes:

> Hi Gaius,
>
> On Fri, Jun 14, 2019 at 02:09:48PM +0100, Gaius Mulley wrote:
>> here is version two of the patches which introduce Modula-2 into the
>> GCC trunk.  The patches include:
>> 
>>   (*)  a patch to allow all front ends to register a lang spec function.
>>(included are patches for all front ends to provide an empty
>> callback function).
>>   (*)  patch diffs to allow the Modula-2 front end driver to be
>>built using GCC Makefile and friends.
>> 
>> The compressed tarball includes:
>> 
>>   (*)  gcc/m2  (compiler driver and lang-spec stuff for Modula-2).
>>Including the need for registering lang spec functions.
>>   (*)  gcc/testsuite/gm2  (a Modula-2 dejagnu test to ensure that
>>the gm2 driver is built and can understands --version).
>
> I built on pwoerpc64-linux, with the patch and the tarball.
>
> I first need this patch, because srcdir is an absolute path for me:
>
> ===
> diff --git a/gcc/m2/Make-lang.in b/gcc/m2/Make-lang.in
> index e2d5098..a423a9e 100644
> --- a/gcc/m2/Make-lang.in
> +++ b/gcc/m2/Make-lang.in
> @@ -71,13 +71,13 @@ m2/gm2config.h:
>  export AR ; \
>  RANLIB=`echo $(RANLIB_FOR_TARGET) | sed -e "s/^ //"` ; \
>  export RANLIB ; \
> -$(SHELL) -c '../$(srcdir)/m2/configure --srcdir=../$(srcdir)/m2 
> --t
> +$(SHELL) -c '$(srcdir)/m2/configure --srcdir=$(srcdir)/m2 
> --target=
>  else \
> -$(SHELL) -c '../$(srcdir)/m2/configure --srcdir=../$(srcdir)/m2 
> --t
> +$(SHELL) -c '$(srcdir)/m2/configure --srcdir=$(srcdir)/m2 
> --target=
>  fi
>  
>  m2/gm2version.c: m2/gm2version.h
> - cd m2 ; bash ../$(srcdir)/m2/tools-src/makeversion -p ../$(srcdir)
> + cd m2 ; bash $(srcdir)/m2/tools-src/makeversion -p $(srcdir)
>  
>  # Build hooks.
>  
> ===
>
> (This patch might not be correct, but it works for me to get things to
> build, at least).
>
> But then I still get build failures: it tries to run xgcc when it hasn't
> been built yet.  ("it" == "something", I didn't keep logs, sorry).
>
> I let it run overnight with -j1, and it finished.  The testsuite is
> running now :-)
>
>
> Segher

Hi Segher,

many thanks for the patch and highlighting the relative vs absolute
srcdir build.  I've fixed the makeversion with a change to the script
(directory option added).  I'll work on the configure line and explore a
solution.  Yes -j1 is definitely an overnight task!


regards,
Gaius

Re: [PATCH, Ada, Darwin, PPC] PPC Darwin has stack check probes.

2019-07-01 Thread Eric Botcazou

> 2019-07-01  Iain Sandoe  
> 
>   * libgnat/system-darwin-ppc.ads: Set Stack_Check_Probes True for
>   PPC Darwin.

OK, thanks.

-- 
Eric Botcazou

[PATCH] @signbit2_dm

2019-07-01 Thread Segher Boessenkool

One more I forgot to send before.


Segher


2019-07-01  Segher Boessenkool  

* config/rs6000/rs6000.md (signbit2_dm): Make this a
parameterized name.
(signbit2): Use that name.  Simplify.

---
 gcc/config/rs6000/rs6000.md | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 9ab9ceb..9e81df9 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4857,13 +4857,7 @@ (define_expand "signbit2"
   rtx tmp = gen_reg_rtx (DImode);
   rtx dest_di = gen_lowpart (DImode, dest);
 
-  if (mode == KFmode)
-   emit_insn (gen_signbitkf2_dm (tmp, src));
-  else if (mode == TFmode)
-   emit_insn (gen_signbittf2_dm (tmp, src));
-  else
-   gcc_unreachable ();
-
+  emit_insn (gen_signbit2_dm (mode, tmp, src));
   emit_insn (gen_lshrdi3 (dest_di, tmp, GEN_INT (63)));
   DONE;
 }
@@ -4893,7 +4887,7 @@ (define_expand "signbit2"
 ;; After register allocation, if the _Float128 had originally been in GPRs, the
 ;; split allows the post reload phases to eliminate the move, and do the shift
 ;; directly with the register that contains the signbit.
-(define_insn_and_split "signbit2_dm"
+(define_insn_and_split "@signbit2_dm"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r")
(unspec:DI [(match_operand:SIGNBIT 1 "gpc_reg_operand" "wa,r")]
   UNSPEC_SIGNBIT))]
-- 
1.8.3.1

Re: [PATCH][ARM] Add support for "noinit" attribute

2019-07-01 Thread Sandra Loosemore


On 6/13/19 9:13 AM, Christophe Lyon wrote:

@@ -7131,6 +7132,18 @@ given via attribute argument.
 
 @end table
 
+@node ARM Variable Attributes

+@subsection ARM Variable Attributes
+
+@table @code
+@item noinit
+@cindex @code{noinit} variable attribute, ARM
+Any data with the @code{noinit} attribute will not be initialised by
+the C runtime startup code, or the program loader.  Not initialising
+data in this way can reduce program startup times.
+
+@end table
+
 @node AVR Variable Attributes
 @subsection AVR Variable Attributes
 


Minor nit:  please use American spelling:  initialized, initializing.

-Sandra

Re: [PATCH] Prepare for prefixed instructions on PowerPC

2019-07-01 Thread Segher Boessenkool

Hi Mike,

Sorry I missed this patch :-(

On Thu, Jun 27, 2019 at 04:18:00PM -0400, Michael Meissner wrote:
> As we discussed off-line earlier, I changed all of the "4" lengths to be "*",
> even for instruction alternatives that would not be subject to being changed 
> to
> be a prefixed instruction (i.e. the sign extend instruction "extsw"'s length 
> is
> set to "*", even though it would not be a prefixed instruction).

"*" means "use the default for this attribute", which often is nicer to
see than "4".  For example, "8" stands out more in a sea of "*"s.

Usually "*" is the insns defined as "normal" alternatives, and "8" or
"12" etc. are split.

> @@ -7231,7 +7231,7 @@ (define_insn "*movcc_internal1"
>(const_string "mtjmpr")
>(const_string "load")
>(const_string "store")])
> -   (set_attr "length" "4,4,12,4,4,8,4,4,4,4,4,4")])
> +   (set_attr "length" "*,*,12,*,*,8,*,*,*,*,*,*")])

In this case, the "12" and "8" are actually defined as one insn in the
template, with some "\;".  Luckily there aren't many of those.

> @@ -7385,8 +7385,8 @@ (define_insn "*mov_softfloat"
>   *,  *, *, *")
>  
> (set_attr "length"
> - "4,  4, 4, 4, 4, 4,
> - 4,  4, 8, 4")])
> + "*,  *, *, *, *, *,
> + *,  *, 8, *")])

[ That last line should start with a tab as well. ]

The entry before the 8 is split as well.  Maybe that should be "4", to
stand out?  I don't know what works better; your choice.

> @@ -7696,8 +7696,8 @@ (define_insn "*mov_softfloat64"
>   *,   *,  *")
>  
> (set_attr "length"
> -"4,   4,  4,  4,  4,  8,
> - 12,  16, 4")])
> +"*,   *,  *,  *,  *,  8,
> + 12,  16, *")])

Same for the last entry here.

> @@ -8760,10 +8760,10 @@ (define_insn "*movdi_internal32"
>vecsimple")
> (set_attr "size" "64")
> (set_attr "length"
> - "8, 8, 8, 4, 4, 4,
> -  16,4, 4, 4, 4, 4,
> -  4, 4, 4, 4, 4, 8,
> -  4")
> + "8, 8, 8, *, *, *,
> +  16,*, *, *, *, *,
> +  *, *, *, *, *, 8,
> +  *")

And the last here.

> @@ -8853,11 +8853,11 @@ (define_insn "*movdi_internal64"
>  mftgpr,mffgpr")
> (set_attr "size" "64")
> (set_attr "length"
> -   "4, 4, 4, 4, 4,  20,
> -4, 4, 4, 4, 4,  4,
> -4, 4, 4, 4, 4,  4,
> -4, 8, 4, 4, 4,  4,
> -4, 4")
> +   "*, *, *, *, *,  20,
> +*, *, *, *, *,  *,
> +*, *, *, *, *,  *,
> +*, 8, *, *, *,  *,
> +*, *")

And two of the entries here.

> @@ -1150,9 +1150,9 @@ (define_insn "vsx_mov_64bit"
>  store, load,  store, *, vecsimple, 
> vecsimple,
>  vecsimple, *, *, vecstore,  vecload")
> (set_attr "length"
> -   "4, 4, 4, 8, 4, 8,
> -8, 8, 8, 8, 4, 4,
> -4, 20,8, 4, 4")
> +   "*, *, *, 8, *, 8,
> +8, 8, 8, 8, *, *,
> +*, 20,8, *, *")

No idea which ones are split here :-)  None of the * and all other would
be nice, but who knows :-)


Okay for trunk, maybe with some "4" if you agree that has value.  Thanks!
And again, sorry I missed this patch.


Segher

[PATCH, Ada, Darwin, PPC] PPC Darwin has stack check probes.

2019-07-01 Thread Iain Sandoe

On PPC, Darwin uses the same code as other parts of the port, I suspect
that the False was a historical relic.

(perhaps I could have applied this with my Darwin maintainer’s hat, or as 
obvious,
 but wasn’t sure for the Ada sub-tree)

OK for trunk?
Thanks
Iain

2019-07-01  Iain Sandoe  

* libgnat/system-darwin-ppc.ads: Set Stack_Check_Probes True for
PPC Darwin.

diff --git a/gcc/ada/libgnat/system-darwin-ppc.ads 
b/gcc/ada/libgnat/system-darwin-ppc.ads
index d314b66..9adc2de 100644
--- a/gcc/ada/libgnat/system-darwin-ppc.ads
+++ b/gcc/ada/libgnat/system-darwin-ppc.ads
@@ -158,7 +158,7 @@ private
Preallocated_Stacks   : constant Boolean := False;
Signed_Zeros  : constant Boolean := True;
Stack_Check_Default   : constant Boolean := False;
-   Stack_Check_Probes: constant Boolean := False;
+   Stack_Check_Probes: constant Boolean := True;
Stack_Check_Limits: constant Boolean := False;
Support_Aggregates: constant Boolean := True;
Support_Atomic_Primitives : constant Boolean := Word_Size = 64;

[PATCH, i386]: Also use and split insns with SSE operands for 32bit MMX targets

2019-07-01 Thread Uros Bizjak

Attached patch uses and splits a couple of instructions with SSE
operands for 32bit MMX targets.

2019-07-01  Uroš Bizjak  

* config/i386/i386.md ("isa" attribute): Add sse_noavx.
("enabled" attribute): Handle sse_noavx isa attribute.
* config/i386/mmx.md (*vec_dupv2sf): Add "isa" attribute.
Use TARGET_SSE && SSE_REGNO_P in split condition.
(*vec_dupv2sf): Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index c362a72411a2..c401deb5eb00 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -794,7 +794,7 @@
 
 ;; Used to control the "enabled" attribute on a per-instruction basis.
 (define_attr "isa" "base,x64,x64_sse2,x64_sse4,x64_sse4_noavx,x64_avx,nox64,
-   sse2,sse2_noavx,sse3,sse4,sse4_noavx,avx,noavx,
+   sse_noavx,sse2,sse2_noavx,sse3,sse4,sse4_noavx,avx,noavx,
avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,noavx512f,
avx512bw,noavx512bw,avx512dq,noavx512dq,
avx512vl,noavx512vl,x64_avx512dq,x64_avx512bw"
@@ -820,6 +820,8 @@
   (symbol_ref "TARGET_64BIT && TARGET_AVX512BW")
 (eq_attr "isa" "nox64") (symbol_ref "!TARGET_64BIT")
 (eq_attr "isa" "sse2") (symbol_ref "TARGET_SSE2")
+(eq_attr "isa" "sse_noavx")
+  (symbol_ref "TARGET_SSE && !TARGET_AVX")
 (eq_attr "isa" "sse2_noavx")
   (symbol_ref "TARGET_SSE2 && !TARGET_AVX")
 (eq_attr "isa" "sse3") (symbol_ref "TARGET_SSE3")
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 70413349390c..9ff86ba0eecc 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -590,12 +590,16 @@
punpckldq\t%0, %0
#
#"
-  "TARGET_MMX_WITH_SSE && reload_completed"
+  "TARGET_SSE && reload_completed
+   && SSE_REGNO_P (REGNO (operands[0]))"
   [(set (match_dup 0)
(vec_duplicate:V4SF (match_dup 1)))]
-  "operands[0] = lowpart_subreg (V4SFmode, operands[0],
-GET_MODE (operands[0]));"
-  [(set_attr "mmx_isa" "native,sse_noavx,avx")
+{
+  operands[0] = lowpart_subreg (V4SFmode, operands[0],
+   GET_MODE (operands[0]));
+}
+  [(set_attr "isa" "*,sse_noavx,avx")
+   (set_attr "mmx_isa" "native,*,*")
(set_attr "type" "mmxcvt,ssemov,ssemov")
(set_attr "mode" "DI,TI,TI")])
 
@@ -1560,12 +1564,16 @@
#
#
#"
-  "TARGET_MMX_WITH_SSE && reload_completed"
+  "TARGET_SSE && reload_completed
+   && SSE_REGNO_P (REGNO (operands[0]))"
   [(set (match_dup 0)
(vec_duplicate:V4SI (match_dup 1)))]
-  "operands[0] = lowpart_subreg (V4SImode, operands[0],
-GET_MODE (operands[0]));"
-  [(set_attr "mmx_isa" "native,sse_noavx,avx,avx")
+{
+  operands[0] = lowpart_subreg (V4SImode, operands[0],
+   GET_MODE (operands[0]));
+}
+  [(set_attr "isa" "*,sse_noavx,avx,avx")
+   (set_attr "mmx_isa" "native,*,*,*")
(set_attr "type" "mmxcvt,ssemov,ssemov,ssemov")
(set_attr "mode" "DI,TI,TI,TI")])

Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)

2019-07-01 Thread Martin Sebor


On 7/1/19 10:33 AM, Richard Biener wrote:

On Mon, Jul 1, 2019 at 4:55 PM Martin Sebor 
wrote:>

[Adding gcc-patches]

Richard, do you have any further comments or is the revised patch 
good to commit?


No further comments from my side - it's good to commit.


After running a full bootstrap with the patch with the static_assert
I found that the I didn't fully understand the KeyId type/compare_type
requirements.  Like value_type, this type too can be a non-POD type.
It just needs a suitable Traits (AKA Descriptor) class.

I've updated the comments to reflect that and removed
the static_assert and checked in the original version of the change
with better comments in r272893.

Sorry about that hiccup.

Martin



Richard.


Martin

On 6/25/19 2:30 PM, Martin Sebor wrote:

On 6/25/19 3:53 AM, Jonathan Wakely wrote:

On 24/06/19 19:42 +0200, Richard Biener wrote:

On Mon, Jun 24, 2019 at 4:35 PM Martin Sebor
 wrote:


On 6/24/19 6:11 AM, Richard Biener wrote:

On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor


wrote:


On 6/21/19 6:06 AM, Richard Biener wrote:

On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor


wrote:


Bug 90923 shows that even though GCC hash-table
based containers like hash_map can be instantiated
on types with user-defined ctors and dtors they
invoke the dtors of such types without invoking the
corresponding ctors.

It was thanks to this bug that I spent a day
debugging

"interesting"

miscompilations during GCC bootstrap (in fairness,
it was that and bug 90904 about auto_vec copy
assignment/construction also being hosed even for
POD types).

The attached patch corrects the hash_map and
hash_set templates to invoke the ctors of the
elements they insert and makes them (hopefully)
safe to use with non-trivial user-defined types.


Hum.  I am worried about the difference of assignment
vs.

construction

in ::put()

+  bool ins = hash_entry::is_empty (*e); +
if (ins) +   { + e->m_key = k; +
new ((void *) &e->m_value) Value (v); +   } +
else +   e->m_value = v;

why not invoke the dtor on the old value and then the
ctor again?


It wouldn't work for self-assignment:

Value &y = m.get_or_insert (key); m.put (key, y);

The usual rule of thumb for writes into containers is
to use construction when creating a new element and
assignment when replacing the value of an existing
element.

Which reminds me that the hash containers, despite
being copy- constructible (at least for POD types, they
aren't for user- defined types), also aren't safe for
assignment even for PODs. I opened bug 90959 for this.
Until the assignment is fixed I made it inaccessibe in
the patch (I have fixed the copy ctor to DTRT for
non-PODs).


How is an empty hash_entry constructed?


In hash_table::find_slot_with_hash simply by finding an
empty slot and returning a pointer to it.  The memory
for the slot is marked "empty" by calling the
Traits::mark_empty() function.

The full type of hash_map is actually

hash_mapsimple_hashmap_traits, 
Value>


and simple_hashmap_traits delegates it to
default_hash_traits whose mark_empty() just clears the
void*, leaving the Value part uninitialized.  That
makes sense because we don't want to call ctors for
empty entries.  I think the questions one might ask if
one were to extend the design are: a) what class should
invoke the ctor/assignment and b) should it do it 
directly or via the traits?



::remove() doesn't seem to invoke the dtor either,
instead it relies on the traits::remove function?


Yes.  There is no Traits::construct or assign or copy.
We could add them but I'm not sure I see to what end
(there could be use cases, I just don't know enough
about these classes to think of any).

Attached is an updated patch with the additional minor
fixes mentioned above.

Martin

PS I think we would get much better results by
adopting the properly designed and tested standard
library containers than by spending time trying to
improve the design of these legacy classes.  For simple
uses that don't need to integrate with the GC machinery
the standard containers should be fine (plus, it'd
provide us with greater motivation to improve them and
the code GCC emits for their uses).  Unfortunately, to
be able to use the hash-based containers we would need
to upgrade to C++ 11.  Isn't it time yet?


I don't think so.  I'm also not sure if C++11 on its own
is desirable or if it should be C++14 or later at that
point.  SLES 12 has GCC 4.8 as host compiler (but also
GCC 7+ optionally), SLES 15 has GCC 7. SLES 11 already
struggles currently (GCC 4.3) but I'd no longer consider
that important enough.

Note any such change to libstdc++ containers should be
complete and come with both code-size and compile-time
and memory-usage measurements (both of GCC and other apps
of course).


Can I go ahead and commit the patch?


I think we need to document the requirements on Value classes
better.

@@ -177,7 +185,10 @@ public: INSERT); bool ins =
Traits::is_empty (*e); if (ins) -   e->m_key = k; +
{

[PATCH 10/12] @abs2_hw

2019-07-01 Thread Segher Boessenkool

2019-07-01  Segher Boessenkool  

* config/rs6000/rs6000.md (abs2_hw): Make this a parameterized
name.
(abs2): Use that name.  Simplify.

---
 gcc/config/rs6000/rs6000.md | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 27fdc4f..974f0b1 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -8117,12 +8117,7 @@ (define_expand "abs2"
 {
   if (TARGET_FLOAT128_HW)
{
- if (mode == TFmode)
-   emit_insn (gen_abstf2_hw (operands[0], operands[1]));
- else if (mode == KFmode)
-   emit_insn (gen_abskf2_hw (operands[0], operands[1]));
- else
-   FAIL;
+ emit_insn (gen_abs2_hw (mode, operands[0], operands[1]));
  DONE;
}
   else if (TARGET_FLOAT128_TYPE)
@@ -13908,7 +13903,7 @@ (define_insn "@neg2_hw"
(set_attr "size" "128")])
 
 
-(define_insn "abs2_hw"
+(define_insn "@abs2_hw"
   [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
(abs:IEEE128
 (match_operand:IEEE128 1 "altivec_register_operand" "v")))]
-- 
1.8.3.1

[PATCH 11/12] @ieee_128bit_vsx_neg2

2019-07-01 Thread Segher Boessenkool

2019-07-01  Segher Boessenkool  

* config/rs6000/rs6000.md (ieee_128bit_vsx_neg2): Make this a
parameterized name.
(neg2): Use that name.  Simplify.

---
 gcc/config/rs6000/rs6000.md | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 974f0b1..86acaae 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -8070,14 +8070,8 @@ (define_expand "neg2"
   if (TARGET_FLOAT128_HW)
emit_insn (gen_neg2_hw (mode, operands[0], operands[1]));
   else if (TARGET_FLOAT128_TYPE)
-   {
- if (mode == TFmode)
-   emit_insn (gen_ieee_128bit_vsx_negtf2 (operands[0], operands[1]));
- else if (mode == KFmode)
-   emit_insn (gen_ieee_128bit_vsx_negkf2 (operands[0], operands[1]));
- else
-   gcc_unreachable ();
-   }
+   emit_insn (gen_ieee_128bit_vsx_neg2 (mode,
+operands[0], operands[1]));
   else
{
  rtx libfunc = optab_libfunc (neg_optab, mode);
@@ -8189,7 +8183,7 @@ (define_expand "ieee_128bit_negative_zero"
 ;; twiddle the sign bit.  Later GCSE passes can then combine multiple uses of
 ;; neg/abs to create the constant just once.
 
-(define_insn_and_split "ieee_128bit_vsx_neg2"
+(define_insn_and_split "@ieee_128bit_vsx_neg2"
   [(set (match_operand:IEEE128 0 "register_operand" "=wa")
(neg:IEEE128 (match_operand:IEEE128 1 "register_operand" "wa")))
(clobber (match_scratch:V16QI 2 "=v"))]
-- 
1.8.3.1

[PATCH 12/12] @ieee_128bit_vsx_abs2

2019-07-01 Thread Segher Boessenkool

2019-07-01  Segher Boessenkool  

* config/rs6000/rs6000.md (ieee_128bit_vsx_abs2): Make this a
parameterized name.
(abs2): Use that name.  Simplify.

---
 gcc/config/rs6000/rs6000.md | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 86acaae..9e81df9 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -8116,12 +8116,8 @@ (define_expand "abs2"
}
   else if (TARGET_FLOAT128_TYPE)
{
- if (mode == TFmode)
-   emit_insn (gen_ieee_128bit_vsx_abstf2 (operands[0], operands[1]));
- else if (mode == KFmode)
-   emit_insn (gen_ieee_128bit_vsx_abskf2 (operands[0], operands[1]));
- else
-   FAIL;
+ emit_insn (gen_ieee_128bit_vsx_abs2 (mode,
+  operands[0], operands[1]));
  DONE;
}
   else
@@ -8212,7 +8208,7 @@ (define_insn "*ieee_128bit_vsx_neg2_internal"
   [(set_attr "type" "veclogical")])
 
 ;; IEEE 128-bit absolute value
-(define_insn_and_split "ieee_128bit_vsx_abs2"
+(define_insn_and_split "@ieee_128bit_vsx_abs2"
   [(set (match_operand:IEEE128 0 "register_operand" "=wa")
(abs:IEEE128 (match_operand:IEEE128 1 "register_operand" "wa")))
(clobber (match_scratch:V16QI 2 "=v"))]
-- 
1.8.3.1

[PATCH 09/12] @neg2_hw

2019-07-01 Thread Segher Boessenkool

2019-07-01  Segher Boessenkool  

* config/rs6000/rs6000.md (neg2_hw): Make this a parameterized
name.
(neg2): Use that name.  Simplify.

---
 gcc/config/rs6000/rs6000.md | 11 ++-
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 5b3e458..27fdc4f 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -8068,14 +8068,7 @@ (define_expand "neg2"
   if (FLOAT128_IEEE_P (mode))
 {
   if (TARGET_FLOAT128_HW)
-   {
- if (mode == TFmode)
-   emit_insn (gen_negtf2_hw (operands[0], operands[1]));
- else if (mode == KFmode)
-   emit_insn (gen_negkf2_hw (operands[0], operands[1]));
- else
-   gcc_unreachable ();
-   }
+   emit_insn (gen_neg2_hw (mode, operands[0], operands[1]));
   else if (TARGET_FLOAT128_TYPE)
{
  if (mode == TFmode)
@@ -13905,7 +13898,7 @@ (define_insn "copysign3_soft"
   [(set_attr "type" "veccomplex")
(set_attr "length" "8")])
 
-(define_insn "neg2_hw"
+(define_insn "@neg2_hw"
   [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
(neg:IEEE128
 (match_operand:IEEE128 1 "altivec_register_operand" "v")))]
-- 
1.8.3.1

[PATCH 05/12] @ctr

2019-07-01 Thread Segher Boessenkool

2019-07-01  Segher Boessenkool  

* config/rs6000/rs6000.md (ctr): Make this a parameterized name.
(doloop_end): Use that name.  Simplify.

---
 gcc/config/rs6000/rs6000.md | 18 +-
 1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index d665316..381f140 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -12566,22 +12566,14 @@ (define_expand "doloop_end"
(use (match_operand 1))]; label
   ""
 {
-  if (TARGET_64BIT)
-{
-  if (GET_MODE (operands[0]) != DImode)
-   FAIL;
-  emit_jump_insn (gen_ctrdi (operands[0], operands[1]));
-}
-  else
-{
-  if (GET_MODE (operands[0]) != SImode)
-   FAIL;
-  emit_jump_insn (gen_ctrsi (operands[0], operands[1]));
-}
+  if (GET_MODE (operands[0]) != Pmode)
+FAIL;
+
+  emit_jump_insn (gen_ctr (Pmode, operands[0], operands[1]));
   DONE;
 })
 
-(define_expand "ctr"
+(define_expand "@ctr"
   [(parallel [(set (pc)
   (if_then_else (ne (match_operand:P 0 "register_operand")
 (const_int 1))
-- 
1.8.3.1

[PATCH 08/12] @extenddf2

2019-07-01 Thread Segher Boessenkool

2019-07-01  Segher Boessenkool  

* config/rs6000/rs6000.md (extenddf2): Make this a parameterized
name.
(floatsi2): Use that name.  Simplify.

---
 gcc/config/rs6000/rs6000.md | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 3235eb2..5b3e458 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7775,7 +7775,7 @@ (define_insn_and_split "*mov_softfloat"
(const_string "8")
(const_string "16"))])])
 
-(define_expand "extenddf2"
+(define_expand "@extenddf2"
   [(set (match_operand:FLOAT128 0 "gpc_reg_operand")
(float_extend:FLOAT128 (match_operand:DF 1 "gpc_reg_operand")))]
   "TARGET_HARD_FLOAT && TARGET_LONG_DOUBLE_128"
@@ -7922,12 +7922,7 @@ (define_expand "floatsi2"
 {
   rtx tmp = gen_reg_rtx (DFmode);
   expand_float (tmp, op1, false);
-  if (mode == TFmode)
-   emit_insn (gen_extenddftf2 (op0, tmp));
-  else if (mode == IFmode)
-   emit_insn (gen_extenddfif2 (op0, tmp));
-  else
-   gcc_unreachable ();
+  emit_insn (gen_extenddf2 (mode, op0, tmp));
   DONE;
 }
 })
-- 
1.8.3.1

[PATCH 06/12] @eh_set_lr_

2019-07-01 Thread Segher Boessenkool

2019-07-01  Segher Boessenkool  

* config/rs6000/rs6000.md (eh_set_lr_): Make this a parameterized
name.
(eh_return): Use that name.  Simplify.

---
 gcc/config/rs6000/rs6000.md | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 381f140..881efe1 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -13184,15 +13184,12 @@ (define_expand "eh_return"
   [(use (match_operand 0 "general_operand"))]
   ""
 {
-  if (TARGET_32BIT)
-emit_insn (gen_eh_set_lr_si (operands[0]));
-  else
-emit_insn (gen_eh_set_lr_di (operands[0]));
+  emit_insn (gen_eh_set_lr (Pmode, operands[0]));
   DONE;
 })
 
 ; We can't expand this before we know where the link register is stored.
-(define_insn_and_split "eh_set_lr_"
+(define_insn_and_split "@eh_set_lr_"
   [(unspec_volatile [(match_operand:P 0 "register_operand" "r")] UNSPECV_EH_RR)
(clobber (match_scratch:P 1 "=&b"))]
   ""
-- 
1.8.3.1

[PATCH 07/12] @extenddf2_{fprs,vsx}

2019-07-01 Thread Segher Boessenkool

2019-07-01  Segher Boessenkool  

* config/rs6000/rs6000.md (extenddf2_fprs): Make this a
parameterized name.
(extenddf2_vsx): Make this a parameterized name.
(extenddf2): Use those names.  Simplify.

---
 gcc/config/rs6000/rs6000.md | 23 ++-
 1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 881efe1..3235eb2 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7783,31 +7783,20 @@ (define_expand "extenddf2"
   if (FLOAT128_IEEE_P (mode))
 rs6000_expand_float128_convert (operands[0], operands[1], false);
   else if (TARGET_VSX)
-{
-  if (mode == TFmode)
-   emit_insn (gen_extenddftf2_vsx (operands[0], operands[1]));
-  else if (mode == IFmode)
-   emit_insn (gen_extenddfif2_vsx (operands[0], operands[1]));
-  else
-   gcc_unreachable ();
-}
-   else
+emit_insn (gen_extenddf2_vsx (mode, operands[0], operands[1]));
+  else
 {
   rtx zero = gen_reg_rtx (DFmode);
   rs6000_emit_move (zero, CONST0_RTX (DFmode), DFmode);
 
-  if (mode == TFmode)
-   emit_insn (gen_extenddftf2_fprs (operands[0], operands[1], zero));
-  else if (mode == IFmode)
-   emit_insn (gen_extenddfif2_fprs (operands[0], operands[1], zero));
-  else
-   gcc_unreachable ();
+  emit_insn (gen_extenddf2_fprs (mode,
+operands[0], operands[1], zero));
 }
   DONE;
 })
 
 ;; Allow memory operands for the source to be created by the combiner.
-(define_insn_and_split "extenddf2_fprs"
+(define_insn_and_split "@extenddf2_fprs"
   [(set (match_operand:IBM128 0 "gpc_reg_operand" "=d,d,&d")
(float_extend:IBM128
 (match_operand:DF 1 "nonimmediate_operand" "d,m,d")))
@@ -7826,7 +7815,7 @@ (define_insn_and_split "extenddf2_fprs"
   operands[4] = simplify_gen_subreg (DFmode, operands[0], mode, lo_word);
 })
 
-(define_insn_and_split "extenddf2_vsx"
+(define_insn_and_split "@extenddf2_vsx"
   [(set (match_operand:IBM128 0 "gpc_reg_operand" "=d,d")
(float_extend:IBM128
 (match_operand:DF 1 "nonimmediate_operand" "wa,m")))]
-- 
1.8.3.1

[PATCH 04/12] @indirect_jump_nospec

2019-07-01 Thread Segher Boessenkool

2019-07-01  Segher Boessenkool  

* config/rs6000/rs6000.md (indirect_jump_nospec): Make this a
parameterized name.
(indirect_jump): Use that name.  Simplify.

---
 gcc/config/rs6000/rs6000.md | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index ca8b0c0..d665316 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -12410,10 +12410,7 @@ (define_expand "indirect_jump"
 {
   if (!rs6000_speculate_indirect_jumps) {
 rtx ccreg = gen_reg_rtx (CCmode);
-if (Pmode == DImode)
-  emit_jump_insn (gen_indirect_jumpdi_nospec (operands[0], ccreg));
-else
-  emit_jump_insn (gen_indirect_jumpsi_nospec (operands[0], ccreg));
+emit_jump_insn (gen_indirect_jump_nospec (Pmode, operands[0], ccreg));
 DONE;
   }
 })
@@ -12425,7 +12422,7 @@ (define_insn "*indirect_jump"
   "b%T0"
   [(set_attr "type" "jmpreg")])
 
-(define_insn "indirect_jump_nospec"
+(define_insn "@indirect_jump_nospec"
   [(set (pc) (match_operand:P 0 "register_operand" "c,*l"))
(clobber (match_operand:CC 1 "cc_reg_operand" "=y,y"))]
   "!rs6000_speculate_indirect_jumps"
-- 
1.8.3.1

[PATCH 03/12] @abs2_internal

2019-07-01 Thread Segher Boessenkool

2019-07-01  Segher Boessenkool  

* config/rs6000/rs6000.md (abs2_internal): Make this a
parameterized name.
(abs2): Use that name.  Simplify.

---
 gcc/config/rs6000/rs6000.md | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 48ead5e..ca8b0c0 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -8163,17 +8163,12 @@ (define_expand "abs2"
 }
 
   label = gen_label_rtx ();
-  if (mode == TFmode)
-emit_insn (gen_abstf2_internal (operands[0], operands[1], label));
-  else if (mode == IFmode)
-emit_insn (gen_absif2_internal (operands[0], operands[1], label));
-  else
-FAIL;
+  emit_insn (gen_abs2_internal (mode, operands[0], operands[1], label));
   emit_label (label);
   DONE;
 })
 
-(define_expand "abs2_internal"
+(define_expand "@abs2_internal"
   [(set (match_operand:IBM128 0 "gpc_reg_operand")
(match_operand:IBM128 1 "gpc_reg_operand"))
(set (match_dup 3) (match_dup 5))
-- 
1.8.3.1

[PATCH 02/12] @fix_truncsi2_fprs

2019-07-01 Thread Segher Boessenkool

2019-07-01  Segher Boessenkool  

* config/rs6000/rs6000.md (fix_truncsi2_fprs): Make this a
parameterized name.
(fix_truncsi2): Use that name.  Simplify.

---
 gcc/config/rs6000/rs6000.md | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 63823c4..48ead5e 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7969,17 +7969,13 @@ (define_expand "fix_truncsi2"
 {
   if (FLOAT128_IEEE_P (mode))
rs6000_expand_float128_convert (op0, op1, false);
-  else if (mode == TFmode)
-   emit_insn (gen_fix_trunctfsi2_fprs (op0, op1));
-  else if (mode == IFmode)
-   emit_insn (gen_fix_truncifsi2_fprs (op0, op1));
   else
-   gcc_unreachable ();
+   emit_insn (gen_fix_truncsi2_fprs (mode, op0, op1));
   DONE;
 }
 })
 
-(define_expand "fix_truncsi2_fprs"
+(define_expand "@fix_truncsi2_fprs"
   [(parallel [(set (match_operand:SI 0 "gpc_reg_operand")
   (fix:SI (match_operand:IBM128 1 "gpc_reg_operand")))
  (clobber (match_dup 2))
-- 
1.8.3.1

[PATCH 01/12] @neg2

2019-07-01 Thread Segher Boessenkool

2019-07-01  Segher Boessenkool  

* config/rs6000/rs6000.md (neg2): Make this a parameterized name.
(allocate_stack): Use that name.  Simplify.

---
 gcc/config/rs6000/rs6000.md | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index d0d272a..63823c4 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -2249,7 +2249,7 @@ (define_insn "subf3_carry_in_xx"
   [(set_attr "type" "add")])
 
 
-(define_insn "neg2"
+(define_insn "@neg2"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
(neg:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")))]
   ""
@@ -9810,10 +9810,7 @@ (define_expand "allocate_stack"
 {
   operands[1] = force_reg (Pmode, operands[1]);
   neg_op0 = gen_reg_rtx (Pmode);
-  if (TARGET_32BIT)
-   emit_insn (gen_negsi2 (neg_op0, operands[1]));
-  else
-   emit_insn (gen_negdi2 (neg_op0, operands[1]));
+  emit_insn (gen_neg2 (Pmode, neg_op0, operands[1]));
 }
   else
 neg_op0 = GEN_INT (-INTVAL (operands[1]));
-- 
1.8.3.1

[PATCH 00/12] rs6000: Use parameterised names

2019-07-01 Thread Segher Boessenkool

This series makes the rs6000 backend use parameterised names.  This
means adding an "@" to the start of pattern names, removing the mode
from the gen_* names where you call them, and adding an extra mode
parameter to those calls (as the first argument).

The abs and neg patterns used to call FAIL for some unexpected modes,
they no longer do.  It should have maybe used gcc_unreachable there
instead.  These patches also remove such gcc_unreachable.

Tested all on powerpc64-linux {-m32,-m64} and on p9 powerpc64le-linux.
Committing to trunk.


Segher


Segher Boessenkool (12):
  @neg2
  @fix_truncsi2_fprs
  @abs2_internal
  @indirect_jump_nospec
  @ctr
  @eh_set_lr_
  @extenddf2_{fprs,vsx}
  @extenddf2
  @neg2_hw
  @abs2_hw
  @ieee_128bit_vsx_neg2
  @ieee_128bit_vsx_abs2

 gcc/config/rs6000/rs6000.md | 130 +++-
 1 file changed, 33 insertions(+), 97 deletions(-)

-- 
1.8.3.1

Re: [PATCH] Make lto-dump dump symtab callgraph in graphviz format

2019-07-01 Thread Giuliano Belinassi

Hi, Liška

I incorporated all your suggestions to the patch, but before sending the
v2 I want to discuss some errors reported by the style script.

>=== ERROR type #1: blocks of 8 spaces should be replaced with tabs (16
>error(s)) ===
>gcc/lto/lto-dump.c:263:22:"  -list [options]   Dump the
>symbol list.\n"
>gcc/lto/lto-dump.c:264:18:"-demangle   Dump the
>demangled output.\n"
>gcc/lto/lto-dump.c:265:22:"-defined-only   Dump only the
>defined symbols.\n"
>gcc/lto/lto-dump.c:266:21:"-print-valueDump initial
>values of the variables.\n"
>gcc/lto/lto-dump.c:267:19:"-name-sort  Sort the
>symbols alphabetically.\n"
>gcc/lto/lto-dump.c:268:19:"-size-sort  Sort the
>symbols according to size.\n"
>gcc/lto/lto-dump.c:269:22:"-reverse-sort   Dump the
>symbols in reverse order.\n"
>gcc/lto/lto-dump.c:270:15:"  -symbol=  Dump the
>details of specific symbol.\n"
>gcc/lto/lto-dump.c:271:15:"  -objects  Dump the
>details of LTO objects.\n"
>gcc/lto/lto-dump.c:272:17:"  -callgraphDump the
>callgraph in graphviz format.\n"
>gcc/lto/lto-dump.c:273:18:"  -type-stats   Dump
>statistics of tree types.\n"
>gcc/lto/lto-dump.c:274:18:"  -tree-stats   Dump
>statistics of trees.\n"
>gcc/lto/lto-dump.c:275:20:"  -gimple-stats Dump
>statistics of gimple statements.\n"
>gcc/lto/lto-dump.c:276:18:"  -dump-body=   Dump the
>specific gimple body.\n"
>gcc/lto/lto-dump.c:277:19:"  -dump-level=  Deciding the
>optimization level of body.\n"
>gcc/lto/lto-dump.c:278:12:"  -help Display the
>dump tool help.\n";
>
Is this correct? I think that replacing these spaces with tabs will do
bad things in some terminals.

>=== ERROR type #2: there should be exactly one space between function
>name and parenthesis (1 error(s)) ===
>gcc/lto/lang.opt:131:11:LTODump Var(flag_dump_callgraph)

I just followed the pattern in that file. Unless everything there is
wrong, the style in the patch is correct.

>
>=== ERROR type #3: there should be no space before a left square bracket
>(2 error(s)) ===
>gcc/lto/lto-dump.c:261:21:"Usage: lto-dump [OPTION]... SUB_COMMAND
>[OPTION]...\n\n"
>gcc/lto/lto-dump.c:263:13:"  -list [options]   Dump the
>symbol list.\n"
>
>=== ERROR type #4: trailing operator (1 error(s)) ===
>gcc/lto/lto-dump.c:260:18:  const char *msg =

I think doing it in this way is cleaner than the previous version, where
there was a printf for each line. Therefore this change is less noisy
because I just write the string with keeping in mind with what will be
displayed in the terminal, and then print it.  Of course this is my point
of view and it is completely subjective.

Any other style error suggested by the script were fixed.

On 07/01, Martin Liška wrote:
> On 7/1/19 2:56 PM, Giuliano Belinassi wrote:
> > Hi,
> > 
> > On 07/01, Martin Jambor wrote:
> >> On Sat, Jun 29 2019, Giuliano Belinassi wrote:
> >>> This patch makes lto-dump dump the symbol table callgraph in graphviz
> >>> format.
> >> -ENOPATCH
> > Sorry,
> > I forgot the most important. Here it is.
> 
> Hello.
> 
> I like the intention! Please try to use:
> $ git diff HEAD > /tmp/patch && ~/Programming/gcc/contrib/check_GNU_style.py 
> /tmp/patch
> 
> which will show you some coding style issues.
> 
> > 
> >> Martin
> >>
> >>> I've not added any test to this because I couldn't find a way to call
> >>> lto-dump from the testsuite. Also, any feedback with regard to how can
> >>> I improve this is welcome.
> >>>
> >>> gcc/ChangeLog
> >>> 2019-06-29  Giuliano Belinassi  
> >>>
> >>> * cgraph.c (dump_graphviz): New function
> >>> * cgraph.h (dump_graphviz): New function
> >>> * symtab.c (dump_graphviz): New function
> >>> * varpool.c (dump_graphviz): New function
> >>>
> >>> gcc/lto/ChangeLog
> >>> 2019-06-29  Giuliano Belinassi  
> >>>
> >>> * lang.opt (flag_dump_callgraph): New flag
> >>> * lto-dump.c (dump_symtab_graphviz): New function
> >>> * lto-dump.c (dump_tool_help): New option
> >>> * lto-dump.c (lto_main): New option
> > Giuliano
> > 
> > ===
> > 
> > diff --git gcc/cgraph.c gcc/cgraph.c
> > index 28019aba434..55f4ee0bdaa 100644
> > 
> > --- gcc/cgraph.c
> > 
> > +++ gcc/cgraph.c
> > 
> > @@ -2204,6 +2204,22 @@
> > 
> >  cgraph_node::dump (FILE *f)
> > 
> > }
> > }
> > +/* Dump call graph node to file F in graphviz format. */
> > +
> > +void
> > +cgraph_node::dump_graphviz (FILE *f)
> > +{
> > + cgraph_edge *edge;
> > +
> > + for (edge = callees; edge; edge = edge->next_callee)
> > + {
> > + cgraph_node *callee = edge->callee;
> > +
> > + fprintf(f, "\t\"%s\" -> \"%s\"\n", name (), callee->name ());
> > + }
> > +}
> > +
> > +
> > /* Dump ca

[committed] Fix longjmp expander on hppa

2019-07-01 Thread John David Anglin

With the recent update to setjmp, we now need to restore the hard frame pointer
directly from the saved frame pointer.  We don't need to adjust for the offset 
between
the virtual_stack_vars_rtx and the hard frame pointer.

Tested on hppa-unknwon-linux-gnu, hppa2.0w-hp-hpux11.11 and hppa64-hp-hpux11.11.

Committed to trunk.

Dave
-- 
John David Anglin  dave.ang...@bell.net

2019-07-01  Wilco Dijkstra  
John David Anglin  setjmp

PR target/90963
* config/pa/pa.md (builtin_longjmp): Restore hard_frame_pointer_rtx
using saved frame pointer.

Index: config/pa/pa.md
===
--- config/pa/pa.md (revision 272852)
+++ config/pa/pa.md (working copy)
@@ -8700,9 +8700,7 @@
  restoring the gp.  */
   emit_move_insn (pv, lab);

-  /* Restore the stack and frame pointers.  The virtual_stack_vars_rtx
- is saved instead of the hard_frame_pointer_rtx in the save area.
- We need to adjust for the offset between these two values.  */
+  /* Restore the stack and frame pointers.  */
   fp = copy_to_reg (fp);
   emit_stack_restore (SAVE_NONLOCAL, stack);

@@ -8710,7 +8708,7 @@
   emit_insn (gen_blockage ());
   emit_clobber (hard_frame_pointer_rtx);
   emit_clobber (frame_pointer_rtx);
-  emit_move_insn (hard_frame_pointer_rtx, plus_constant (Pmode, fp, -8));
+  emit_move_insn (hard_frame_pointer_rtx, fp);

   emit_use (hard_frame_pointer_rtx);
   emit_use (stack_pointer_rtx);

Re: [PING][AArch64] Use scvtf fbits option where appropriate

2019-07-01 Thread James Greenhalgh

On Wed, Jun 26, 2019 at 10:35:00AM +0100, Joel Hutton wrote:
> Ping, plus minor rework (mostly non-functional changes)
> 
> gcc/ChangeLog:
> 
> 2019-06-12  Joel Hutton  
> 
>  * config/aarch64/aarch64-protos.h (aarch64_fpconst_pow2_recip): New 
> prototype
>  * config/aarch64/aarch64.c (aarch64_fpconst_pow2_recip): New function
>  * config/aarch64/aarch64.md 
> (*aarch64_cvtf2_mult): New pattern

Cool; I learned a new instruction!

>  (*aarch64_cvtf2_mult): New pattern
>  * config/aarch64/constraints.md (Dt): New constraint
>  * config/aarch64/predicates.md (aarch64_fpconst_pow2_recip): New 
> predicate
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-06-12  Joel Hutton  
> 
>  * gcc.target/aarch64/fmul_scvtf_1.c: New test.

This testcase will fail on ILP32 targets where unsigned long will still
live in a 'w' register.

Thanks,
James

Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)

2019-07-01 Thread Richard Biener

On Mon, Jul 1, 2019 at 4:55 PM Martin Sebor  wrote:>
> [Adding gcc-patches]
>
> Richard, do you have any further comments or is the revised patch
> good to commit?

No further comments from my side - it's good to commit.

Richard.

> Martin
>
> On 6/25/19 2:30 PM, Martin Sebor wrote:
> > On 6/25/19 3:53 AM, Jonathan Wakely wrote:
> >> On 24/06/19 19:42 +0200, Richard Biener wrote:
> >>> On Mon, Jun 24, 2019 at 4:35 PM Martin Sebor  wrote:
> 
>  On 6/24/19 6:11 AM, Richard Biener wrote:
>  > On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor 
>  wrote:
>  >>
>  >> On 6/21/19 6:06 AM, Richard Biener wrote:
>  >>> On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor 
>  wrote:
>  
>   Bug 90923 shows that even though GCC hash-table based containers
>   like hash_map can be instantiated on types with user-defined ctors
>   and dtors they invoke the dtors of such types without invoking
>   the corresponding ctors.
>  
>   It was thanks to this bug that I spent a day debugging
>  "interesting"
>   miscompilations during GCC bootstrap (in fairness, it was that and
>   bug 90904 about auto_vec copy assignment/construction also being
>   hosed even for POD types).
>  
>   The attached patch corrects the hash_map and hash_set templates
>   to invoke the ctors of the elements they insert and makes them
>   (hopefully) safe to use with non-trivial user-defined types.
>  >>>
>  >>> Hum.  I am worried about the difference of assignment vs.
>  construction
>  >>> in ::put()
>  >>>
>  >>> +  bool ins = hash_entry::is_empty (*e);
>  >>> +  if (ins)
>  >>> +   {
>  >>> + e->m_key = k;
>  >>> + new ((void *) &e->m_value) Value (v);
>  >>> +   }
>  >>> +  else
>  >>> +   e->m_value = v;
>  >>>
>  >>> why not invoke the dtor on the old value and then the ctor again?
>  >>
>  >> It wouldn't work for self-assignment:
>  >>
>  >> Value &y = m.get_or_insert (key);
>  >> m.put (key, y);
>  >>
>  >> The usual rule of thumb for writes into containers is to use
>  >> construction when creating a new element and assignment when
>  >> replacing the value of an existing element.
>  >>
>  >> Which reminds me that the hash containers, despite being copy-
>  >> constructible (at least for POD types, they aren't for user-
>  >> defined types), also aren't safe for assignment even for PODs.
>  >> I opened bug 90959 for this.  Until the assignment is fixed
>  >> I made it inaccessibe in the patch (I have fixed the copy
>  >> ctor to DTRT for non-PODs).
>  >>
>  >>> How is an empty hash_entry constructed?
>  >>
>  >> In hash_table::find_slot_with_hash simply by finding an empty
>  >> slot and returning a pointer to it.  The memory for the slot
>  >> is marked "empty" by calling the Traits::mark_empty() function.
>  >>
>  >> The full type of hash_map is actually
>  >>
>  >> hash_map  >>  Value,
>  >>  simple_hashmap_traits,
>  >>Value>
>  >>
>  >> and simple_hashmap_traits delegates it to default_hash_traits
>  >> whose mark_empty() just clears the void*, leaving the Value
>  >> part uninitialized.  That makes sense because we don't want
>  >> to call ctors for empty entries.  I think the questions one
>  >> might ask if one were to extend the design are: a) what class
>  >> should invoke the ctor/assignment and b) should it do it
>  >> directly or via the traits?
>  >>
>  >>> ::remove() doesn't
>  >>> seem to invoke the dtor either, instead it relies on the
>  >>> traits::remove function?
>  >>
>  >> Yes.  There is no Traits::construct or assign or copy.  We
>  >> could add them but I'm not sure I see to what end (there could
>  >> be use cases, I just don't know enough about these classes to
>  >> think of any).
>  >>
>  >> Attached is an updated patch with the additional minor fixes
>  >> mentioned above.
>  >>
>  >> Martin
>  >>
>  >> PS I think we would get much better results by adopting
>  >> the properly designed and tested standard library containers
>  >> than by spending time trying to improve the design of these
>  >> legacy classes.  For simple uses that don't need to integrate
>  >> with the GC machinery the standard containers should be fine
>  >> (plus, it'd provide us with greater motivation to improve
>  >> them and the code GCC emits for their uses).  Unfortunately,
>  >> to be able to use the hash-based containers we would need to
>  >> upgrade to C++ 11.  Isn't it time yet?
>  >
>  > I don't think so.  I'm also not sure if C++11 on its own is desirable
>  > or if it should

Re: [PATCH][AArch64] Remove constraint strings from define_expand constructs in the back end

2019-07-01 Thread James Greenhalgh

On Mon, Jun 24, 2019 at 04:33:40PM +0100, Dennis Zhang wrote:
> Hi,
> 
> A number of AArch64 define_expand patterns have specified constraints 
> for their operands. But the constraint strings are ignored at expand 
> time and are therefore redundant/useless. We now avoid specifying 
> constraints in new define_expands, but we should clean up the existing 
> define_expand definitions.
> 
> For example, the constraint "=w" is removed in the following case:
> (define_expand "sqrt2"
>[(set (match_operand:GPF_F16 0 "register_operand" "=w")
> The "" marks with an empty constraint in define_expand are removed as well.
> 
> The patch is tested with the build configuration of 
> --target=aarch64-none-linux-gnu, and it passes gcc/testsuite.

This is OK for trunk.

Thanks,
James

> gcc/ChangeLog:
> 
> 2019-06-21  Dennis Zhang  
> 
>   * config/aarch64/aarch64-simd.md: Remove redundant constraints
>   from define_expand.
>   * config/aarch64/aarch64-sve.md: Likewise.
>   * config/aarch64/aarch64.md: Likewise.
>   * config/aarch64/atomics.md: Likewise.

Re: RFA: fix PR66726 regression for min/max/conversion combo from PR71016 patch

2019-07-01 Thread Richard Biener

On Mon, Jul 1, 2019 at 4:05 PM Joern Wolfgang Rennecke
 wrote:
>
> [Apologies if this is a duplicate, I'm unsure if my previous mail was
> delivered]
> On 01/07/19 12:38, Richard Biener wrote:
> > On Mon, Jul 1, 2019 at 1:22 PM Joern Wolfgang Rennecke
> >  wrote:
> >> The heuristic introduced for PR71016 prevents recognizing a max / min
> >> combo like it is used for
> >> saturation when followed by a conversion.
> >> The attached patch refines the heuristic to allow this case. Regression
> >> tested on x86_64-pc-linux-gnu .
> > Few style nits:
> >
> >...
> >
> > also please check that 'lhs' is equal to gimple_assing_rhs1 (arg0_def_stmt)
> > otherwise you'd also allow MIN/MAX unrelated to the conversion detected.
>
> Like the attached patch?

Yes - minor nit:

+ if (!operand_equal_p
+(lhs, gimple_assign_rhs1 (arg0_def_stmt), 0))
+   return NULL;

you can use

   if (lhs != gimple_assign_rhs1 (arg0_def_stmt))
 return NULL;

since SSA names are shared tree nodes.

OK with that change.

> >
> > On x86 I see the MIN_EXPR is already detected by GENERIC folding,
> > I wonder if that is required or if we can handle the case without in one
> > phiopt pass invocation as well.
> >
> tree_ssa_phiopt_worker is supposed to handle this by handling nested
> COND_EXPRs
> from the inside out:
>
> /* Search every basic block for COND_EXPR we may be able to optimize.
>
>   We walk the blocks in order that guarantees that a block with
>   a single predecessor is processed before the predecessor.
>   This ensures that we collapse inner ifs before visiting the
>   outer ones, and also that we do not try to visit a removed
>   block.  */
>bb_order = single_pred_before_succ_order ();
>
> However, I have no idea how to construct a testcase for this with the
> gimple folding in place.
>
> #define SAT(x) (x < 0 ? 0 : (x > 255 ? 255 : x))
>
> void
> foo (unsigned char *p, int i, int m)
> {
>*p = (m > SAT (i) ? m : SAT (i));
> }
>
> or the equivalent for MIN_EXPR, *.c.004.original already has only one
> conditional left.

Hmm.  The testcase in your patch should be equal to

void
foo (unsigned char *p, int i)
{
  // tem1 = MAX ((unsinged char)MIN (i, 255), 0)
  unsigned char tem1;
  if (i >= 0)
{
  // tem2 = MIN (i, 255)
  int tem2;
  if (i < 255)
tem2 = i;
  else
tem2 = 255;
  // tem1 = (unsigned char) tem2;
  tem1 = tem2;
}
  else
tem1 = 0;
  *p = tem1;
}

but for this I don't see any MIN/MAX recognition :/  Anyhow - probably
a stupid mistake on my side ;)

Richard.

>

[SPARC] Fix PR middle-end/64242

2019-07-01 Thread Eric Botcazou

SPARC defines a nonlocal_goto pattern to which the same adjustment needs to be 
applied as in the middle-end.

Tested on SPARC/Solaris, applied on mainline and 9 branch.


2019-07-01  Eric Botcazou  

PR middle-end/64242
* config/sparc/sparc.md (nonlocal_goto): Restore frame pointer last.
Add frame clobber and schedule blockage.

-- 
Eric BotcazouIndex: config/sparc/sparc.md
===
--- config/sparc/sparc.md	(revision 272633)
+++ config/sparc/sparc.md	(working copy)
@@ -7381,7 +7381,7 @@ (define_expand "nonlocal_goto"
   ""
 {
   rtx i7 = gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM);
-  rtx r_label = copy_to_reg (operands[1]);
+  rtx r_label = operands[1];
   rtx r_sp = adjust_address (operands[2], Pmode, 0);
   rtx r_fp = operands[3];
   rtx r_i7 = adjust_address (operands[2], Pmode, GET_MODE_SIZE (Pmode));
@@ -7394,9 +7394,18 @@ (define_expand "nonlocal_goto"
   emit_clobber (gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (VOIDmode)));
   emit_clobber (gen_rtx_MEM (BLKmode, hard_frame_pointer_rtx));
 
-  /* Restore frame pointer for containing function.  */
-  emit_move_insn (hard_frame_pointer_rtx, r_fp);
+  r_label = copy_to_reg (r_label);
+
+  /* Restore the frame pointer and stack pointer.  We must use a
+ temporary since the setjmp buffer may be a local.  */
+  r_fp = copy_to_reg (r_fp);
   emit_stack_restore (SAVE_NONLOCAL, r_sp);
+  r_i7 = copy_to_reg (r_i7);
+
+  /* Ensure the frame pointer move is not optimized.  */
+  emit_insn (gen_blockage ());
+  emit_clobber (hard_frame_pointer_rtx);
+  emit_move_insn (hard_frame_pointer_rtx, r_fp);
   emit_move_insn (i7, r_i7);
 
   /* USE of hard_frame_pointer_rtx added for consistency;
@@ -7405,8 +7414,7 @@ (define_expand "nonlocal_goto"
   emit_use (stack_pointer_rtx);
   emit_use (i7);
 
-  emit_jump_insn (gen_indirect_jump (r_label));
-  emit_barrier ();
+  emit_indirect_jump (r_label);
   DONE;
 })

Re: [PATCH][arm/AArch64] Assume unhandled NEON types are neon_arith_basic types when scheduling for Cortex-A5

2019-07-01 Thread James Greenhalgh

On Mon, Jul 01, 2019 at 04:13:40PM +0100, Kyrill Tkachov wrote:
> Hi all,
> 
> Some scheduling descriptions, like the Cortex-A57 one, are reused for 
> multiple -mcpu options.
> Sometimes those other -mcpu cores support more architecture features 
> than the Armv8-A Cortex-A57.
> For example, the Cortex-A75 and Cortex-A76 support Armv8.2-A as well as 
> the Dot Product instructions.
> These Dot Product instructions have the neon_dot and neon_dot_q 
> scheduling type, but that type is not
> handled in cortex-a57.md, since the Cortex-A57 itself doesn't need to 
> care about these instructions.
> 
> But if we just ignore the neon_dot(_q) type at scheduling we get really 
> terrible codegen when compiling
> for -mcpu=cortex-a76, for example, because the scheduler just pools all 
> the UDOT instructions at the end
> of the basic block, since it doesn't assume anything about their behaviour.
> 
> This patch ameliorates the situation somewhat by telling the Cortex-A57 
> scheduling model to treat any
> insn that doesn't get assigned a cortex_a57_neon_type but is actually a 
> is_neon_type instruction as
> a simple neon_arith_basic instruction. This allows us to treat 
> post-Armv8-A SIMD instructions more sanely
> without having to model each of them explicitly in cortex-a57.md.
> 
> Bootstrapped and tested on arm-none-linux-gnueabihf and 
> aarch64-none-linux-gnu.
> 
> Ok for trunk from an aarch64 perspective?

OK.

Thansk,
James

> 
> Thanks,
> Kyrill
>

Re: [PATCH][ARM] Add support for "noinit" attribute

2019-07-01 Thread Kyrill Tkachov


Hi Christophe,

On 6/13/19 4:13 PM, Christophe Lyon wrote:

Hi,

Similar to what already exists for TI msp430 or in TI compilers for
arm, this patch adds support for "noinit" attribute for arm. It's very
similar to the corresponding code in GCC for msp430.

It is useful for embedded targets where the user wants to keep the
value of some data when the program is restarted: such variables are
not zero-initialized.It is mostly a helper/shortcut to placing
variables in a dedicated section.

It's probably desirable to add the following chunk to the GNU linker:
diff --git a/ld/emulparams/armelf.sh b/ld/emulparams/armelf.sh
index 272a8bc..9555cec 100644
--- a/ld/emulparams/armelf.sh
+++ b/ld/emulparams/armelf.sh
@@ -10,7 +10,19 @@ OTHER_TEXT_SECTIONS='*(.glue_7t) *(.glue_7)
*(.vfp11_veneer) *(.v4_bx)'
 OTHER_BSS_SYMBOLS="${CREATE_SHLIB+PROVIDE (}__bss_start__ =
.${CREATE_SHLIB+)};"
 OTHER_BSS_END_SYMBOLS="${CREATE_SHLIB+PROVIDE (}_bss_end__ =
.${CREATE_SHLIB+)}; ${CREATE_SHLIB+PROVIDE (}__bss_end__ =
.${CREATE_SHLIB+)};"
 OTHER_END_SYMBOLS="${CREATE_SHLIB+PROVIDE (}__end__ = 
.${CREATE_SHLIB+)};"
-OTHER_SECTIONS='.note.gnu.arm.ident 0 : { KEEP 
(*(.note.gnu.arm.ident)) }'

+OTHER_SECTIONS='
+.note.gnu.arm.ident 0 : { KEEP (*(.note.gnu.arm.ident)) }
+  /* This section contains data that is not initialised during load
+ *or* application reset.  */
+   .noinit (NOLOAD) :
+   {
+ . = ALIGN(2);
+ PROVIDE (__noinit_start = .);
+ *(.noinit)
+ . = ALIGN(2);
+ PROVIDE (__noinit_end = .);
+   }
+'

so that the noinit section has the "NOLOAD" flag.

I'll submit that part separately to the binutils project if OK.

OK?



This is mostly ok by me, with a few code comments inline.

I wonder whether this is something we could implement for all targets in 
the midend, but this would require linker script support for the target 
to be effective...


Thanks,

Kyrill


Thanks,

Christophe


diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index e3e71ea..332c41b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -151,6 +151,7 @@ static tree arm_handle_notshared_attribute (tree *, tree, 
tree, int, bool *);
 #endif
 static tree arm_handle_cmse_nonsecure_entry (tree *, tree, tree, int, bool *);
 static tree arm_handle_cmse_nonsecure_call (tree *, tree, tree, int, bool *);
+static tree arm_data_attr (tree *, tree, tree, int, bool *);
 static void arm_output_function_epilogue (FILE *);
 static void arm_output_function_prologue (FILE *);
 static int arm_comp_type_attributes (const_tree, const_tree);
@@ -375,7 +376,8 @@ static const struct attribute_spec arm_attribute_table[] =
 arm_handle_cmse_nonsecure_entry, NULL },
   { "cmse_nonsecure_call", 0, 0, true, false, false, true,
 arm_handle_cmse_nonsecure_call, NULL },
-  { NULL, 0, 0, false, false, false, false, NULL, NULL }
+  { "noinit", 0, 0, true, false, false, false, arm_data_attr, NULL },
+  { NULL, 0, 0, false, false, false, false, NULL, NULL },
 };
 
 /* Initialize the GCC target structure.  */

@@ -808,6 +810,10 @@ static const struct attribute_spec arm_attribute_table[] =
 
 #undef TARGET_CONSTANT_ALIGNMENT

 #define TARGET_CONSTANT_ALIGNMENT arm_constant_alignment
+
+#undef  TARGET_ASM_SELECT_SECTION
+#define TARGET_ASM_SELECT_SECTION arm_select_section
+
 
 /* Obstack for minipool constant handling.  */

 static struct obstack minipool_obstack;
@@ -7150,6 +7156,47 @@ arm_handle_cmse_nonsecure_call (tree *node, tree name,
   return NULL_TREE;
 }
 
+/* Called when the noinit attribute is used. Check whether the

+   attribute is allowed here and add the attribute to the variable
+   decl tree or otherwise issue a diagnostic. This function checks
+   NODE is of the expected type and issues diagnostics otherwise using
+   NAME.  If it is not of the expected type *NO_ADD_ATTRS will be set
+   to true.  */
+
+static tree
+arm_data_attr (tree * node,
+ tree   name,
+ tree   args ATTRIBUTE_UNUSED,
+ intflags ATTRIBUTE_UNUSED,
+ bool * no_add_attrs ATTRIBUTE_UNUSED)
+{

no_add_attrs is set in this function, so should not be ATTRIBUTE_UNUSED?
Arguably args is also checked against NULL, so it's technically not unused 
either.

+  const char * message = NULL;
+
+  gcc_assert (DECL_P (* node));

The house style doesn't have a space after '*'. Same elsewhere in the patch.

+  gcc_assert (args == NULL);
+
+  if (TREE_CODE (* node) != VAR_DECL)
+message = G_("%qE attribute only applies to variables");
+
+  /* Check that it's possible for the variable to have a section.  */
+  if ((TREE_STATIC (* node) || DECL_EXTERNAL (* node) || in_lto_p)
+  && DECL_SECTION_NAME (* node))
+message = G_("%qE attribute cannot be applied to variables with specific 
sections");
+
+  /* If this var is thought to be common, then change this.  Common variables
+ are assigned to sections before the backend has a chance to process them. 
 */
+  if (DECL_COMMON (* node))
+DECL_CO

Re: [PATCH 2/2] gdbhooks.py: extend vec support in pretty printers

2019-07-01 Thread David Malcolm

On Mon, 2019-07-01 at 13:07 +0300, Vladislav Ivanishin wrote:
> This change is threefold:
> - enable pretty printing of vec<>, not just vec<>*
> - generalize 'vec<(\S+), (\S+), (\S+)>' regex, which is limiting
> - extend to work for vl_ptr layout (only vl_embed was supported)
> 
> The motivating example for all three is a vector of vectors in
> tree-ssa-uninit.c:
> 
> typedef vec pred_chain;
> typedef vec pred_chain_union;
> 
> (This predictably expands to
>   vec, va_heap, vl_ptr>
> in gdb, if we use strip_typedefs() on it -- otherwise it's just
> pred_chain_union.  The first patch of the series takes care of that.)
> 
> This example features "direct" vecs themselves rather than pointers
> to
> vecs, which is not a rare case; in fact, a quick grep showed that the
> number of declarations of direct vecs is about the same as that of
> pointers to vec.
> 
> The GDB's Python API seems to do dereferencing automatically (i.e. it
> detects whether to use '->', or '.' for field access), allowing to
> use
> the same code to access fields from a struct directly as well as
> through
> a pointer.
> 
> This makes it possible to reuse VecPrinter (meant for vec<> *) for
> plain
> vec<>, but perhaps it would be beneficial to handle all pointers in a
> generic way e.g. print the address and then print the dereferenced
> object, letting the appropriate pretty printer to pick it up and do
> its
> thing, if one is provided for the type.
> 
> The output I have for direct vecs now:
> 
> (gdb) p *bb->succs
> $94 = @0x776569b0 = { 5)>,  0x776548a0 (2 -> 3)>}
> 
> Compare that to
> 
> (gdb) p bb->succs
> $70 = 0x776569b0 = { 5)>,  0x776548a0 (2 -> 3)>}
> 
> Hope the choice of the '@' sign to represent the address of an object
> taken by the pretty printer is not confusing?  (GDB itself uses this
> notation for references, like so: (edge_def *&) @0x77656c38.)

Thanks for the patch.

I like the notation idea.

[...]

> diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py
> index b036704b86a..a7e665cadb9 100644
> --- a/gcc/gdbhooks.py
> +++ b/gcc/gdbhooks.py
> @@ -128,12 +128,18 @@ and here's a length 1 vec:
>(gdb) p bb->succs
>$19 = 0x70428bb8 = { EXIT)>}
>  
> -You cannot yet use array notation [] to access the elements within the
> -vector: attempting to do so instead gives you the vec itself (for vec[0]),
> -or a (probably) invalid cast to vec<> for the memory after the vec (for
> -vec[1] onwards).
> +Direct instances of vec<> are also supported (their addresses are
> +prefixed with the '@' sign).
>  
> -Instead (for now) you must access m_vecdata:
> +In case you have a vec<> pointer, to use array notation [] to access the
> +elements within the vector you must first dereference the pointer, just
> +as you do in C:
> +  (gdb) p (*bb->succs)[0]
> +  $50 = (edge_def *&) @0x77656c38:  10)>
> +  (gdb) p bb->succs[0][0]
> +  $51 = (edge_def *&) @0x77656c38:  10)>

That's a nice improvement.

If the user erroneously attempts, say:

  (gdb) p bb->succs[2]

are they still accessing whatever's in memory after the vector pointer?
If so, I think the comment needs to retain some kind of warning about
this, as it's a nasty "gotcha".

> +Another option is to access m_vecdata:
>(gdb) p bb->preds->m_vecdata[0]
>$20 =  5)>
>(gdb) p bb->preds->m_vecdata[1]

[...]

> @@ -441,6 +447,10 @@ class VecPrinter:
>  #-ex "up" -ex "p bb->preds"
>  def __init__(self, gdbval):
>  self.gdbval = gdbval
> +if self.gdbval.type.code == gdb.TYPE_CODE_PTR:
> +self.ptr = intptr(self.gdbval)
> +else:
> +self.ptr = None

I think this could use a comment.

If we have a NULL "vec *", then presumably self.ptr will be 0...

>  def display_hint (self):
>  return 'array'
> @@ -448,14 +458,25 @@ class VecPrinter:
>  def to_string (self):
>  # A trivial implementation; prettyprinting the contents is done
>  # by gdb calling the "children" method below.
> -return '0x%x' % intptr(self.gdbval)
> +if self.ptr:
> +return '0x%x' % self.ptr
> +else:
> +return '@0x%x' % intptr(self.gdbval.address)

...and if we have a NULL vec *, with self.ptr as 0, it will take the
"else" path here, which I think is meant for the non-pointer case.

Should the condition be "if self.ptr is not None"?

>  def children (self):
> -if intptr(self.gdbval) == 0:
> +if self.ptr == 0:
>  return
> -m_vecpfx = self.gdbval['m_vecpfx']
> +m_vec = self.gdbval
> +# There are 2 kinds of vec layouts: vl_embed, and vl_ptr.  The latter
> +# is implemented with the former stored in the m_vec field.  Sadly,
> +# template_argument(2) won't work: `gdb.error: No type named 
> vl_embed`.
> +try:
> +m_vec = m_vec['m_vec']
> +except:
> +pass

I don't like the use of naked "except"; I always worry it might
s

Re: Incremental LTO linking part 7: documentation

2019-07-01 Thread Sandra Loosemore


On 5/26/19 11:35 AM, Gerald Pfeifer wrote:

On Tue, 8 May 2018, Jan Hubicka wrote:

this patch adds documentation of -flinker-output.

* doc/invoke.texi (-flinker-output): Document


I found a follow-up patch to this in a local tree that had been
sitting there for a while, had a another look, and now committed
it.

Sandra, if you could have another pass that would be really good;
I'm sure you'll be able to further improve.


I've checked in the attached patch, which I've also had sitting around 
for a few weeks now.  I kept getting sidetracked when working on this... 
 first I noticed that we were not consistently spelling "link-time 
optimization" like that, then I noticed that terms like "WPA" were not 
defined in the user documentation, there are some LTO-related options 
that are listed in the option summary in the user manual but only 
documented in the internals manual, etc.  I've finally decided I should 
just commit this piece as-is to get it out of my tree; it's an 
incremental improvement, anyway.


-Sandra
2019-07-01  Sandra Loosemore  

	gcc/
	* doc/invoke.texi (Link Options): Further editorial changes to
	-flinker-output docs.
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 272886)
+++ gcc/doc/invoke.texi	(working copy)
@@ -6007,8 +6007,9 @@ Warn about types with virtual methods wh
 if the type were declared with the C++11 @code{final} specifier, 
 or, if possible,
 declared in an anonymous namespace. This allows GCC to more aggressively
-devirtualize the polymorphic calls. This warning is more effective with link
-time optimization, where the information about the class hierarchy graph is
+devirtualize the polymorphic calls. This warning is more effective with 
+link-time optimization, 
+where the information about the class hierarchy graph is
 more complete.
 
 @item -Wsuggest-final-methods
@@ -13190,49 +13191,49 @@ Options}.
 
 @item -flinker-output=@var{type}
 @opindex flinker-output
-This option controls code generation of the link time optimizer.  By
+This option controls code generation of the link-time optimizer.  By
 default the linker output is automatically determined by the linker
 plugin.  For debugging the compiler and if incremental linking with a 
 non-LTO object file is desired, it may be useful to control the type
 manually.
 
-If @var{type} is @samp{exec} code generation produces a static
+If @var{type} is @samp{exec}, code generation produces a static
 binary. In this case @option{-fpic} and @option{-fpie} are both
 disabled.
 
-If @var{type} is @samp{dyn} code generation produces a shared
+If @var{type} is @samp{dyn}, code generation produces a shared
 library.  In this case @option{-fpic} or @option{-fPIC} is preserved,
 but not enabled automatically.  This allows to build shared libraries
-without position independent code on architectures where this is
+without position-independent code on architectures where this is
 possible, i.e.@: on x86.
 
-If @var{type} is @samp{pie} code generation produces an @option{-fpie}
+If @var{type} is @samp{pie}, code generation produces an @option{-fpie}
 executable. This results in similar optimizations as @samp{exec}
 except that @option{-fpie} is not disabled if specified at compilation
 time.
 
-If @var{type} is @samp{rel} the compiler assumes that incremental linking is
+If @var{type} is @samp{rel}, the compiler assumes that incremental linking is
 done.  The sections containing intermediate code for link-time optimization are
 merged, pre-optimized, and output to the resulting object file. In addition, if
-@option{-ffat-lto-objects} is specified the binary code is produced for future
-non-LTO linking. The object file produced by incremental linking will be smaller
+@option{-ffat-lto-objects} is specified, binary code is produced for future
+non-LTO linking. The object file produced by incremental linking is smaller
 than a static library produced from the same object files.  At link time the
-result of incremental linking will also load faster to compiler than a static
+result of incremental linking also loads faster than a static
 library assuming that the majority of objects in the library are used.
 
 Finally @samp{nolto-rel} configures the compiler for incremental linking where
-code generation is forced, a final binary is produced and the intermediate
+code generation is forced, a final binary is produced, and the intermediate
 code for later link-time optimization is stripped. When multiple object files
-are linked together the resulting code will be optimized better than with
-link-time optimizations disabled (for example, cross-module inlining will
-happen), most of benefits of whole program optimizations are however lost. 
+are linked together the resulting code is better optimized than with
+link-time optimizations disabled (for example, cross-module inlining 
+happens), but most of benefits of whole program optimizations are lost.

Re: [PATCH 1/2] gdbhooks.py: use strip_typedefs to simplify matching type names

2019-07-01 Thread David Malcolm

On Mon, 2019-07-01 at 12:50 +0300, Vladislav Ivanishin wrote:
> Hi!
> 
> GDB's Python API provides strip_typedefs method that can be
> instrumental
> for writing DRY code.  Using it at least partially obviates the need
> for
> the add_printer_for_types method we have in gdbhooks.py (it takes a
> list
> of typenames to match and is usually used to deal with typedefs).
> 
> I think, the code can be simplified further (there's still
> ['tree_node *', 'const tree_node *'], which is a little awkward) if
> we
> deal with pointer types in a uniform fashion (I'll touch on this in
> the
> description of the second patch).  But that can be improved
> separately.
> 
> The gimple_statement_cond, etc. part has been dysfunctional for a
> while
> (namely since gimple-classes-v2-option-3 branch was merged).  I
> updated
> it to use the new classes' names.  That seems to work (though it
> doesn't
> output much info anyway: pretty
> vs. 
>(gphi *) 0x778c0200
> in the raw version).
> 
> I changed the name passed to pp.add_printer_for_types() for
> scalar_mode
> and friends -- so they all share the same name now -- but I don't
> believe the name is used in any context where it would matter.

IIRC there's a gdb command to tell you what the registered pretty-
printers are; I think the name is only ever used in that context (or
maybe for fine-grained control of pretty-printing) - and I don't know
of anyone who uses that.  I don't think changing the name matters.

> This is just a clean up of gdbhooks.py.  OK to commit?

The only area I wasn't sure about were the removal hunks relating to
machine modes: is that all covered now by the looking-through-typedefs?

Otherwise, looks good to me.  I assume you've tested the patch by
debugging with it.

Thanks
Dave

Re: [PATCH][arm/AArch64] Assume unhandled NEON types are neon_arith_basic types when scheduling for Cortex-A57

2019-07-01 Thread Kyrill Tkachov


Something somewhere cut off the subject line: it should say Cortex-A57.

Sorry about that.

Kyrill

On 7/1/19 4:13 PM, Kyrill Tkachov wrote:

Hi all,

Some scheduling descriptions, like the Cortex-A57 one, are reused for
multiple -mcpu options.
Sometimes those other -mcpu cores support more architecture features
than the Armv8-A Cortex-A57.
For example, the Cortex-A75 and Cortex-A76 support Armv8.2-A as well as
the Dot Product instructions.
These Dot Product instructions have the neon_dot and neon_dot_q
scheduling type, but that type is not
handled in cortex-a57.md, since the Cortex-A57 itself doesn't need to
care about these instructions.

But if we just ignore the neon_dot(_q) type at scheduling we get really
terrible codegen when compiling
for -mcpu=cortex-a76, for example, because the scheduler just pools all
the UDOT instructions at the end
of the basic block, since it doesn't assume anything about their 
behaviour.


This patch ameliorates the situation somewhat by telling the Cortex-A57
scheduling model to treat any
insn that doesn't get assigned a cortex_a57_neon_type but is actually a
is_neon_type instruction as
a simple neon_arith_basic instruction. This allows us to treat
post-Armv8-A SIMD instructions more sanely
without having to model each of them explicitly in cortex-a57.md.

Bootstrapped and tested on arm-none-linux-gnueabihf and
aarch64-none-linux-gnu.

Ok for trunk from an aarch64 perspective?

Thanks,
Kyrill

2019-01-07  Kyrylo Tkachov  

 * config/arm/cortex-a57.md (cortex_a57_neon_type): Use 
neon_arith_basic

 for is_neon_type instructions that have not already been categorized.

[PATCH][arm/AArch64] Assume unhandled NEON types are neon_arith_basic types when scheduling for Cortex-A5

2019-07-01 Thread Kyrill Tkachov


Hi all,

Some scheduling descriptions, like the Cortex-A57 one, are reused for 
multiple -mcpu options.
Sometimes those other -mcpu cores support more architecture features 
than the Armv8-A Cortex-A57.
For example, the Cortex-A75 and Cortex-A76 support Armv8.2-A as well as 
the Dot Product instructions.
These Dot Product instructions have the neon_dot and neon_dot_q 
scheduling type, but that type is not
handled in cortex-a57.md, since the Cortex-A57 itself doesn't need to 
care about these instructions.


But if we just ignore the neon_dot(_q) type at scheduling we get really 
terrible codegen when compiling
for -mcpu=cortex-a76, for example, because the scheduler just pools all 
the UDOT instructions at the end

of the basic block, since it doesn't assume anything about their behaviour.

This patch ameliorates the situation somewhat by telling the Cortex-A57 
scheduling model to treat any
insn that doesn't get assigned a cortex_a57_neon_type but is actually a 
is_neon_type instruction as
a simple neon_arith_basic instruction. This allows us to treat 
post-Armv8-A SIMD instructions more sanely

without having to model each of them explicitly in cortex-a57.md.

Bootstrapped and tested on arm-none-linux-gnueabihf and 
aarch64-none-linux-gnu.


Ok for trunk from an aarch64 perspective?

Thanks,
Kyrill

2019-01-07  Kyrylo Tkachov  

    * config/arm/cortex-a57.md (cortex_a57_neon_type): Use neon_arith_basic
    for is_neon_type instructions that have not already been categorized.

diff --git a/gcc/config/arm/cortex-a57.md b/gcc/config/arm/cortex-a57.md
index 6ba4f5711cf4b1127ff6cc2e59f1fa9dbd25a9b1..d1ea2aa1edbec1981c68d2e54c5ae6dfe0fec944 100644
--- a/gcc/config/arm/cortex-a57.md
+++ b/gcc/config/arm/cortex-a57.md
@@ -236,7 +236,12 @@ (define_attr "cortex_a57_neon_type"
 			   neon_store1_4reg, neon_store1_4reg_q,\
 			   neon_store1_one_lane, neon_store1_one_lane_q,\
 			   neon_store2_one_lane, neon_store2_one_lane_q")
-	(const_string "neon_store_complex")]
+	(const_string "neon_store_complex")
+;; If it doesn't match any of the above that we want to treat specially but is
+;; still a NEON type, treat it as a basic NEON type.  This is better than
+;; dropping it on the floor and making no assumptions about it whatsoever.
+	  (eq_attr "is_neon_type" "yes")
+	(const_string "neon_arith_basic")]
 	  (const_string "unknown")))
 
 ;; The Cortex-A57 core is modelled as a triple issue pipeline that has

Re: [Committed] S/390: Fix vector shift count operand

2019-07-01 Thread Andreas Krebbel

On 01.07.19 17:01, Marek Polacek wrote:
> On Mon, Jul 01, 2019 at 04:58:09PM +0200, Andreas Krebbel wrote:
>> We currently use subst definitions to handle the different variants of shift
>> count operands. Unfortunately, in the vector shift pattern the shift count
>> operand is used directly. Without it being adjusted for the 'subst' variants 
>> the
>> displacement value is omitted resulting in a wrong shift count being applied.
>>
>> This patch needs to be applied to older branches as well.
>>
>> Bootstrapped and regression tested on s390x (IBM z14).
>> Committed to mainline
>>
>> gcc/ChangeLog:
>>
>> 2019-07-01  Andreas Krebbel  
>>
>>  * config/s390/vector.md:
> 
> This CL entry is blank.

I've fixed this in the changelog file already.

Andreas

Re: [Committed] S/390: Fix vector shift count operand

2019-07-01 Thread Marek Polacek

On Mon, Jul 01, 2019 at 04:58:09PM +0200, Andreas Krebbel wrote:
> We currently use subst definitions to handle the different variants of shift
> count operands. Unfortunately, in the vector shift pattern the shift count
> operand is used directly. Without it being adjusted for the 'subst' variants 
> the
> displacement value is omitted resulting in a wrong shift count being applied.
> 
> This patch needs to be applied to older branches as well.
> 
> Bootstrapped and regression tested on s390x (IBM z14).
> Committed to mainline
> 
> gcc/ChangeLog:
> 
> 2019-07-01  Andreas Krebbel  
> 
>   * config/s390/vector.md:

This CL entry is blank.

Marek

[Committed] S/390: Fix vector shift count operand

2019-07-01 Thread Andreas Krebbel

We currently use subst definitions to handle the different variants of shift
count operands. Unfortunately, in the vector shift pattern the shift count
operand is used directly. Without it being adjusted for the 'subst' variants the
displacement value is omitted resulting in a wrong shift count being applied.

This patch needs to be applied to older branches as well.

Bootstrapped and regression tested on s390x (IBM z14).
Committed to mainline

gcc/ChangeLog:

2019-07-01  Andreas Krebbel  

* config/s390/vector.md:

gcc/testsuite/ChangeLog:

2019-07-01  Andreas Krebbel  

* gcc.target/s390/vector/vec-shift-2.c: New test.
---
 gcc/config/s390/vector.md  |  2 +-
 gcc/testsuite/gcc.target/s390/vector/vec-shift-2.c | 24 ++
 2 files changed, 25 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-shift-2.c

diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index a2c1012..140ef47 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -981,7 +981,7 @@
(VEC_SHIFTS:VI (match_operand:VI 1 "register_operand"   "v")
   (match_operand:SI 2 "nonmemory_operand" "an")))]
   "TARGET_VX"
-  "\t%v0,%v1,%Y2"
+  "\t%v0,%v1,"
   [(set_attr "op_type" "VRS")])
 
 ; Shift each element by corresponding vector element
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-shift-2.c 
b/gcc/testsuite/gcc.target/s390/vector/vec-shift-2.c
new file mode 100644
index 000..c7a1d93
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-shift-2.c
@@ -0,0 +1,24 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mzarch -march=z13 --save-temps" } */
+
+/* { dg-final { scan-assembler-times "veslf" 1 } } */
+
+typedef __attribute__((vector_size(16))) signed int v4si;
+
+v4si __attribute__((noinline,noclone))
+shift_left_by_scalar (v4si in, int shift_count)
+{
+  return in << (3 + shift_count);
+}
+
+int
+main ()
+{
+  v4si a = { 1, 2, 3, 4 };
+  v4si result = shift_left_by_scalar (a, 1);
+
+  if (result[1] != 32)
+__builtin_abort ();
+
+  return 0;
+}
-- 
2.7.4

[PATCH] rs6000: Improve indexed addressing

2019-07-01 Thread Segher Boessenkool

The function rs6000_force_indexed_or_indirect_mem makes a memory
operand suitable for indexed (or indirect) addressing.  If the memory
address isn't yet valid, it loads the whole thing into a register to
make it valid.  That isn't optimal.  This changes it to load an
address that is the sum of two things into two registers instead.
This results in lower latency code, and if inside loops, a constant
term can be moved outside the loop.

Tested on powerpc64-linux {-m32,-m64} and on powerpc64le-linux (a Power9).
Committing.


Segher


2019-07-01  Segher Boessenkool  

* config/rs6000/rs6000.c (rs6000_force_indexed_or_indirect_mem):
Load both operands of a PLUS into registers separately.

---
 gcc/config/rs6000/rs6000.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 5e80673..f59f3a9 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -32100,7 +32100,16 @@ rs6000_force_indexed_or_indirect_mem (rtx x)
  addr = reg;
}
 
-  x = replace_equiv_address (x, force_reg (Pmode, addr));
+  if (GET_CODE (addr) == PLUS)
+   {
+ rtx op0 = XEXP (addr, 0);
+ rtx op1 = XEXP (addr, 1);
+ op0 = force_reg (Pmode, op0);
+ op1 = force_reg (Pmode, op1);
+ x = replace_equiv_address (x, gen_rtx_PLUS (Pmode, op0, op1));
+   }
+  else
+   x = replace_equiv_address (x, force_reg (Pmode, addr));
 }
 
   return x;
-- 
1.8.3.1

Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)

2019-07-01 Thread Martin Sebor


[Adding gcc-patches]

Richard, do you have any further comments or is the revised patch
good to commit?

Martin

On 6/25/19 2:30 PM, Martin Sebor wrote:

On 6/25/19 3:53 AM, Jonathan Wakely wrote:

On 24/06/19 19:42 +0200, Richard Biener wrote:

On Mon, Jun 24, 2019 at 4:35 PM Martin Sebor  wrote:


On 6/24/19 6:11 AM, Richard Biener wrote:
> On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor  
wrote:

>>
>> On 6/21/19 6:06 AM, Richard Biener wrote:
>>> On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor  
wrote:


 Bug 90923 shows that even though GCC hash-table based containers
 like hash_map can be instantiated on types with user-defined ctors
 and dtors they invoke the dtors of such types without invoking
 the corresponding ctors.

 It was thanks to this bug that I spent a day debugging 
"interesting"

 miscompilations during GCC bootstrap (in fairness, it was that and
 bug 90904 about auto_vec copy assignment/construction also being
 hosed even for POD types).

 The attached patch corrects the hash_map and hash_set templates
 to invoke the ctors of the elements they insert and makes them
 (hopefully) safe to use with non-trivial user-defined types.
>>>
>>> Hum.  I am worried about the difference of assignment vs. 
construction

>>> in ::put()
>>>
>>> +  bool ins = hash_entry::is_empty (*e);
>>> +  if (ins)
>>> +   {
>>> + e->m_key = k;
>>> + new ((void *) &e->m_value) Value (v);
>>> +   }
>>> +  else
>>> +   e->m_value = v;
>>>
>>> why not invoke the dtor on the old value and then the ctor again?
>>
>> It wouldn't work for self-assignment:
>>
>> Value &y = m.get_or_insert (key);
>> m.put (key, y);
>>
>> The usual rule of thumb for writes into containers is to use
>> construction when creating a new element and assignment when
>> replacing the value of an existing element.
>>
>> Which reminds me that the hash containers, despite being copy-
>> constructible (at least for POD types, they aren't for user-
>> defined types), also aren't safe for assignment even for PODs.
>> I opened bug 90959 for this.  Until the assignment is fixed
>> I made it inaccessibe in the patch (I have fixed the copy
>> ctor to DTRT for non-PODs).
>>
>>> How is an empty hash_entry constructed?
>>
>> In hash_table::find_slot_with_hash simply by finding an empty
>> slot and returning a pointer to it.  The memory for the slot
>> is marked "empty" by calling the Traits::mark_empty() function.
>>
>> The full type of hash_map is actually
>>
>> hash_map>  Value,
>>  simple_hashmap_traits,
>>    Value>
>>
>> and simple_hashmap_traits delegates it to default_hash_traits
>> whose mark_empty() just clears the void*, leaving the Value
>> part uninitialized.  That makes sense because we don't want
>> to call ctors for empty entries.  I think the questions one
>> might ask if one were to extend the design are: a) what class
>> should invoke the ctor/assignment and b) should it do it
>> directly or via the traits?
>>
>>> ::remove() doesn't
>>> seem to invoke the dtor either, instead it relies on the
>>> traits::remove function?
>>
>> Yes.  There is no Traits::construct or assign or copy.  We
>> could add them but I'm not sure I see to what end (there could
>> be use cases, I just don't know enough about these classes to
>> think of any).
>>
>> Attached is an updated patch with the additional minor fixes
>> mentioned above.
>>
>> Martin
>>
>> PS I think we would get much better results by adopting
>> the properly designed and tested standard library containers
>> than by spending time trying to improve the design of these
>> legacy classes.  For simple uses that don't need to integrate
>> with the GC machinery the standard containers should be fine
>> (plus, it'd provide us with greater motivation to improve
>> them and the code GCC emits for their uses).  Unfortunately,
>> to be able to use the hash-based containers we would need to
>> upgrade to C++ 11.  Isn't it time yet?
>
> I don't think so.  I'm also not sure if C++11 on its own is desirable
> or if it should be C++14 or later at that point.  SLES 12 has GCC 4.8
> as host compiler (but also GCC 7+ optionally), SLES 15 has GCC 7.
> SLES 11 already struggles currently (GCC 4.3) but I'd no longer
> consider that important enough.
>
> Note any such change to libstdc++ containers should be complete
> and come with both code-size and compile-time and memory-usage
> measurements (both of GCC and other apps of course).

Can I go ahead and commit the patch?


I think we need to document the requirements on Value classes better.

@@ -177,7 +185,10 @@ public:
  INSERT);
  bool ins = Traits::is_empty (*e);
  if (ins)
-   e->m_key = k;
+   {
+ e->m_key = k;
+ new ((void *)&e->m_value) Value ();
+   }

this now requires a default constructor and I always forget about
diff

Re: RFA: fix PR66726 regression for min/max/conversion combo from PR71016 patch

2019-07-01 Thread Joern Wolfgang Rennecke

[Apologies if this is a duplicate, I'm unsure if my previous mail was 
delivered]

On 01/07/19 12:38, Richard Biener wrote:

On Mon, Jul 1, 2019 at 1:22 PM Joern Wolfgang Rennecke
 wrote:

The heuristic introduced for PR71016 prevents recognizing a max / min
combo like it is used for
saturation when followed by a conversion.
The attached patch refines the heuristic to allow this case. Regression
tested on x86_64-pc-linux-gnu .

Few style nits:

   ...

also please check that 'lhs' is equal to gimple_assing_rhs1 (arg0_def_stmt)
otherwise you'd also allow MIN/MAX unrelated to the conversion detected.


Like the attached patch?


On x86 I see the MIN_EXPR is already detected by GENERIC folding,
I wonder if that is required or if we can handle the case without in one
phiopt pass invocation as well.

tree_ssa_phiopt_worker is supposed to handle this by handling nested 
COND_EXPRs

from the inside out:

   /* Search every basic block for COND_EXPR we may be able to optimize.

 We walk the blocks in order that guarantees that a block with
 a single predecessor is processed before the predecessor.
 This ensures that we collapse inner ifs before visiting the
 outer ones, and also that we do not try to visit a removed
 block.  */
  bb_order = single_pred_before_succ_order ();

However, I have no idea how to construct a testcase for this with the 
gimple folding in place.


#define SAT(x) (x < 0 ? 0 : (x > 255 ? 255 : x))

void
foo (unsigned char *p, int i, int m)
{
  *p = (m > SAT (i) ? m : SAT (i));
}

or the equivalent for MIN_EXPR, *.c.004.original already has only one 
conditional left.



2019-07-01  Joern Rennecke  

PR middle-end/66726
* tree-ssa-phiopt.c (factor_out_conditional_conversion):
Tune heuristic from PR71016 to allow MIN / MAX.
* testsuite/gcc.dg/tree-ssa/pr66726-4.c: New testcase.

Index: tree-ssa-phiopt.c
===
--- tree-ssa-phiopt.c   (revision 272846)
+++ tree-ssa-phiopt.c   (working copy)
@@ -504,7 +504,25 @@ factor_out_conditional_conversion (edge
  gsi = gsi_for_stmt (arg0_def_stmt);
  gsi_prev_nondebug (&gsi);
  if (!gsi_end_p (gsi))
-   return NULL;
+   {
+ if (gassign *assign
+   = dyn_cast  (gsi_stmt (gsi)))
+   {
+ tree lhs = gimple_assign_lhs (assign);
+ enum tree_code ass_code
+   = gimple_assign_rhs_code (assign);
+ if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
+   return NULL;
+ if (!operand_equal_p
+(lhs, gimple_assign_rhs1 (arg0_def_stmt), 0))
+   return NULL;
+ gsi_prev_nondebug (&gsi);
+ if (!gsi_end_p (gsi))
+   return NULL;
+   }
+ else
+   return NULL;
+   }
  gsi = gsi_for_stmt (arg0_def_stmt);
  gsi_next_nondebug (&gsi);
  if (!gsi_end_p (gsi))
Index: testsuite/gcc.dg/tree-ssa/pr66726-4.c
===
--- testsuite/gcc.dg/tree-ssa/pr66726-4.c   (nonexistent)
+++ testsuite/gcc.dg/tree-ssa/pr66726-4.c   (working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-phiopt1-details" } */
+
+#define SAT(x) (x < 0 ? 0 : (x > 255 ? 255 : x))
+
+void
+foo (unsigned char *p, int i)
+{
+  *p = SAT (i);
+}
+
+/* { dg-final { scan-tree-dump-times "COND_EXPR .*and PHI .*converted to 
straightline code" 1 "phiopt1" } } */

[Ada] Implement GNAT.Graphs

2019-07-01 Thread Pierre-Marie de Rodat

This patch introduces new unit GNAT.Graphs which currently provides a
directed graph abstraction.


-- Source --


--  operations.adb

with Ada.Text_IO; use Ada.Text_IO;
with GNAT;use GNAT;
with GNAT.Graphs; use GNAT.Graphs;
with GNAT.Sets;   use GNAT.Sets;

procedure Operations is
   type Vertex_Id is
 (No_V, VA, VB, VC, VD, VE, VF, VG, VH, VX, VY, VZ);
   No_Vertex_Id : constant Vertex_Id := No_V;

   function Hash_Vertex (V : Vertex_Id) return Bucket_Range_Type;

   type Edge_Id is
(No_E, E1, E2, E3, E4, E5, E6, E7, E8, E9, E10, E97, E98, E99);
   No_Edge_Id : constant Edge_Id := No_E;

   function Hash_Edge (E : Edge_Id) return Bucket_Range_Type;

   package ES is new Membership_Set
 (Element_Type => Edge_Id,
  "="  => "=",
  Hash => Hash_Edge);

   package DG is new Directed_Graph
 (Vertex_Id   => Vertex_Id,
  No_Vertex   => No_Vertex_Id,
  Hash_Vertex => Hash_Vertex,
  Same_Vertex => "=",
  Edge_Id => Edge_Id,
  No_Edge => No_Edge_Id,
  Hash_Edge   => Hash_Edge,
  Same_Edge   => "=");
   use DG;

   package VS is new Membership_Set
 (Element_Type => Vertex_Id,
  "="  => "=",
  Hash => Hash_Vertex);

   ---
   -- Local subprograms --
   ---

   procedure Check_Belongs_To_Component
 (R: String;
  G: Instance;
  V: Vertex_Id;
  Exp_Comp : Component_Id);
   --  Verify that vertex V of graph G belongs to component Exp_Comp. R is the
   --  calling routine.

   procedure Check_Belongs_To_Some_Component
 (R : String;
  G : Instance;
  V : Vertex_Id);
   --  Verify that vertex V of graph G belongs to some component. R is the
   --  calling routine.

   procedure Check_Destination_Vertex
 (R : String;
  G : Instance;
  E : Edge_Id;
  Exp_V : Vertex_Id);
   --  Vertify that the destination vertex of edge E of grah G is Exp_V. R is
   --  the calling routine.

   procedure Check_Distinct_Components
 (R  : String;
  Comp_1 : Component_Id;
  Comp_2 : Component_Id);
   --  Verify that components Comp_1 and Comp_2 are distinct (not the same)

   procedure Check_Has_Component
 (R  : String;
  G  : Instance;
  G_Name : String;
  Comp   : Component_Id);
   --  Verify that graph G with name G_Name contains component Comp. R is the
   --  calling routine.

   procedure Check_Has_Edge
 (R : String;
  G : Instance;
  E : Edge_Id);
   --  Verify that graph G contains edge E. R is the calling routine.

   procedure Check_Has_Vertex
 (R : String;
  G : Instance;
  V : Vertex_Id);
   --  Verify that graph G contains vertex V. R is the calling routine.

   procedure Check_No_Component
 (R : String;
  G : Instance;
  V : Vertex_Id);
   --  Verify that vertex V does not belong to some component. R is the calling
   --  routine.

   procedure Check_No_Component
 (R  : String;
  G  : Instance;
  G_Name : String;
  Comp   : Component_Id);
   --  Verify that graph G with name G_Name does not contain component Comp. R
   --  is the calling routine.

   procedure Check_No_Edge
 (R : String;
  G : Instance;
  E : Edge_Id);
   --  Verify that graph G does not contain edge E. R is the calling routine.

   procedure Check_No_Vertex
 (R : String;
  G : Instance;
  V : Vertex_Id);
   --  Verify that graph G does not contain vertex V. R is the calling routine.

   procedure Check_Number_Of_Components
 (R   : String;
  G   : Instance;
  Exp_Num : Natural);
   --  Verify that graph G has exactly Exp_Num components. R is the calling
   --  routine.

   procedure Check_Number_Of_Edges
 (R   : String;
  G   : Instance;
  Exp_Num : Natural);
   --  Verify that graph G has exactly Exp_Num edges. R is the calling routine.

   procedure Check_Number_Of_Vertices
 (R   : String;
  G   : Instance;
  Exp_Num : Natural);
   --  Verify that graph G has exactly Exp_Num vertices. R is the calling
   --  routine.

   procedure Check_Outgoing_Edge_Iterator
 (R   : String;
  G   : Instance;
  V   : Vertex_Id;
  Set : ES.Instance);
   --  Verify that all outgoing edges of vertex V of graph G can be iterated
   --  and appear in set Set. R is the calling routine.

   procedure Check_Source_Vertex
 (R : String;
  G : Instance;
  E : Edge_Id;
  Exp_V : Vertex_Id);
   --  Vertify that the source vertex of edge E of grah G is Exp_V. R is the
   --  calling routine.

   procedure Check_Vertex_Iterator
 (R: String;
  G: Instance;
  Comp : Component_Id;
  Set  : VS.Instance);
   --  Verify that all vertices of component Comp of graph G can be iterated
   --  and appear in set Set. R is the calling routine.

   function Create_And_Populate return Instance;
   --  Create a

[Ada] Improve error message on mult/div between fixed-point and integer

2019-07-01 Thread Pierre-Marie de Rodat

Multiplication and division of a fixed-point type by an integer type is
only defined by default for type Integer. Clarify the error message to
explain that a conversion is needed in other cases.

Also change an error message to start with lowercase as it should be.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-01  Yannick Moy  

gcc/ada/

* sem_ch4.adb (Operator_Check): Refine error message.--- gcc/ada/sem_ch4.adb
+++ gcc/ada/sem_ch4.adb
@@ -7375,7 +7375,7 @@ package body Sem_Ch4 is
   Etype (Next_Formal (First_Formal (Op_Id
then
   Error_Msg_N
-("No legal interpretation for operator&", N);
+("no legal interpretation for operator&", N);
   Error_Msg_NE
 ("\use clause on& would make operation legal",
  N, Scope (Op_Id));
@@ -7393,6 +7393,26 @@ package body Sem_Ch4 is
Error_Msg_NE ("\left operand has}!",  N, Etype (L));
Error_Msg_NE ("\right operand has}!", N, Etype (R));
 
+   --  For multiplication and division operators with
+   --  a fixed-point operand and an integer operand,
+   --  indicate that the integer operand should be of
+   --  type Integer.
+
+   if Nkind_In (N, N_Op_Multiply, N_Op_Divide)
+ and then Is_Fixed_Point_Type (Etype (L))
+ and then Is_Integer_Type (Etype (R))
+   then
+  Error_Msg_N ("\convert right operand to "
+   & "`Integer`", N);
+
+   elsif Nkind (N) = N_Op_Multiply
+ and then Is_Fixed_Point_Type (Etype (R))
+ and then Is_Integer_Type (Etype (L))
+   then
+  Error_Msg_N ("\convert left operand to "
+   & "`Integer`", N);
+   end if;
+
 --  For concatenation operators it is more difficult to
 --  determine which is the wrong operand. It is worth
 --  flagging explicitly an access type, for those who

[Ada] More permissive use of GNAT attribute Enum_Rep

2019-07-01 Thread Pierre-Marie de Rodat

This patch allows the prefix of the attribute Enum_Rep to be an
attribute referece (such as Enum_Type'First). A recent patch had
restricted the prefix to be an object of a discrete type, which is
incompatible with orevious usage.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-01  Ed Schonberg  

gcc/ada/

* sem_attr.adb (Analyze_Attribute, case Enum_Rep): Allow prefix
of attribute to be an attribute reference of a discrete type.

gcc/testsuite/

* gnat.dg/enum_rep.adb, gnat.dg/enum_rep.ads: New testcase.--- gcc/ada/sem_attr.adb
+++ gcc/ada/sem_attr.adb
@@ -3833,14 +3833,16 @@ package body Sem_Attr is
 Check_Discrete_Type;
 Resolve (E1, P_Base_Type);
 
- --  X'Enum_Rep case. X must be an object or enumeration literal, and
- --  it must be of a discrete type.
+ --  X'Enum_Rep case. X must be an object or enumeration literal
+ --  (including an attribute reference), and it must be of a
+ --  discrete type.
 
  elsif not
((Is_Object_Reference (P)
or else
  (Is_Entity_Name (P)
-and then Ekind (Entity (P)) = E_Enumeration_Literal))
+and then Ekind (Entity (P)) = E_Enumeration_Literal)
+   or else Nkind (P) = N_Attribute_Reference)
  and then Is_Discrete_Type (Etype (P)))
  then
 Error_Attr_P ("prefix of % attribute must be discrete object");

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/enum_rep.adb
@@ -0,0 +1,5 @@
+--  { dg-do compile }
+
+package body Enum_Rep is
+   procedure Foo is null;
+end;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/enum_rep.ads
@@ -0,0 +1,22 @@
+with Interfaces;
+
+package Enum_Rep is
+
+   type My_Type is range 00 .. 100;
+
+   subtype My_Subtype2 is Interfaces.Unsigned_32
+range My_Type'First'Enum_Rep .. My_Type'Last'Enum_Rep;
+
+   My_Type_First : constant My_Type :=  My_Type'First;
+   My_Type_Last : constant My_Type :=  My_Type'Last;
+
+   subtype My_Subtype is Interfaces.Unsigned_32
+ range My_Type_First'Enum_Rep .. My_Type_Last'Enum_Rep;
+
+   subtype My_Subtype1 is Interfaces.Unsigned_32
+range My_Type'Enum_Rep (My_Type'First) ..
+   My_Type'Enum_Rep (MY_Type'Last);
+
+   procedure Foo;
+
+end;

[Ada] Spurious error on inst. of partially defaulted formal package

2019-07-01 Thread Pierre-Marie de Rodat

This patch removes a spurious error on an instantiation whose generic
unit has a formal package where some formal parameters are
box-initialiaed.  Previously the code assumed that box-initialization
for a formal package applied to all its formal parameters.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-01  Ed Schonberg  

gcc/ada/

* sem_ch12.adb (Is_Defaulted): New predicate in
Check_Formal_Package_Intance, to skip the conformance of checks
on parameters of a formal package that are defaulted,

gcc/testsuite/

* gnat.dg/generic_inst3.adb,
gnat.dg/generic_inst3_kafka_lib-topic.ads,
gnat.dg/generic_inst3_kafka_lib.ads,
gnat.dg/generic_inst3_markets.ads,
gnat.dg/generic_inst3_traits-encodables.ads,
gnat.dg/generic_inst3_traits.ads: New testcase.--- gcc/ada/sem_ch12.adb
+++ gcc/ada/sem_ch12.adb
@@ -6195,6 +6195,12 @@ package body Sem_Ch12 is
   --  Common error routine for mismatch between the parameters of the
   --  actual instance and those of the formal package.
 
+  function Is_Defaulted (Param : Entity_Id) return Boolean;
+  --  If the formql package has partly box-initialized formals, skip
+  --  conformace check for these formals. Previously the code assumed
+  --  that boc initialization for a formal package applied to all
+  --  its formal parameters.
+
   function Same_Instantiated_Constant (E1, E2 : Entity_Id) return Boolean;
   --  The formal may come from a nested formal package, and the actual may
   --  have been constant-folded. To determine whether the two denote the
@@ -6245,6 +6251,32 @@ package body Sem_Ch12 is
  end if;
   end Check_Mismatch;
 
+  --
+  -- Is_Defaulted --
+  --
+
+  function Is_Defaulted (Param : Entity_Id) return Boolean is
+ Assoc : Node_Id;
+  begin
+ Assoc := First (Generic_Associations
+ (Parent (Associated_Formal_Package (Actual_Pack;
+
+ while Present (Assoc) loop
+if Nkind (Assoc) = N_Others_Choice then
+   return True;
+
+elsif Nkind (Assoc) = N_Generic_Association
+  and then Chars (Selector_Name (Assoc)) = Chars (Param)
+then
+   return Box_Present (Assoc);
+end if;
+
+Next (Assoc);
+ end loop;
+
+ return False;
+  end Is_Defaulted;
+
   
   -- Same_Instantiated_Constant --
   
@@ -6414,6 +6446,9 @@ package body Sem_Ch12 is
  then
 goto Next_E;
 
+ elsif Is_Defaulted (E1) then
+goto Next_E;
+
  elsif Is_Type (E1) then
 
 --  Subtypes must statically match. E1, E2 are the local entities

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/generic_inst3.adb
@@ -0,0 +1,20 @@
+--  { dg-do compile }
+
+with Generic_Inst3_Kafka_Lib.Topic;
+with Generic_Inst3_Traits.Encodables;
+with Generic_Inst3_Markets;
+
+procedure Generic_Inst3 is
+   generic
+  with package Values is new Generic_Inst3_Traits.Encodables (<>);
+  with package Topic is new Generic_Inst3_Kafka_Lib.Topic
+  (Values => Values, others => <>);
+   package Dummy is
+   end Dummy;
+
+   package Inst is new Dummy
+  (Values => Generic_Inst3_Markets.Data_Encodables,
+   Topic  => Generic_Inst3_Markets.Data_Topic);
+begin
+   null;
+end Generic_Inst3;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/generic_inst3_kafka_lib-topic.ads
@@ -0,0 +1,7 @@
+with Generic_Inst3_Traits.Encodables;
+generic
+   Topic_Name : String;
+   with package Keys is new Generic_Inst3_Traits.Encodables (<>);
+   with package Values is new Generic_Inst3_Traits.Encodables (<>);
+package Generic_Inst3_Kafka_Lib.Topic is
+end Generic_Inst3_Kafka_Lib.Topic;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/generic_inst3_kafka_lib.ads
@@ -0,0 +1,2 @@
+package Generic_Inst3_Kafka_Lib is
+end Generic_Inst3_Kafka_Lib;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/generic_inst3_markets.ads
@@ -0,0 +1,10 @@
+with Generic_Inst3_Kafka_Lib.Topic;
+with Generic_Inst3_Traits.Encodables;
+package Generic_Inst3_Markets is
+   type Data_Type is null record;
+   function Image (D : Data_Type) return String is ("");
+   package Data_Encodables is new Generic_Inst3_Traits.Encodables (Data_Type, Image);
+   package Data_Topic is new Generic_Inst3_Kafka_Lib.Topic
+  (Keys => Data_Encodables, Values => Data_Encodables,
+   Topic_Name => "bla");
+end Generic_Inst3_Markets;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/generic_inst3_traits-encodables.ads
@@ -0,0 +1,8 @@
+with Ada.Streams;
+generic
+   pragma Warnings (Off, "is not referenced");
+   type T (<>) is private;
+   with function Image (Val : in T) return String;
+package Generic_Inst3_Traits.Encodables is
+   pragma Pu

[Ada] Crash on improper pragma Weak_External

2019-07-01 Thread Pierre-Marie de Rodat

This patch adds a guard on the use of pragma Weak_External. This pragma
affects link-time addresses of entities, and does not apply to types.
Previous to this patch the compiler would abort on a misuse of the
pragma.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-01  Ed Schonberg  

gcc/ada/

* sem_prag.adb (Analyze_Pragma, case Weak_External): Pragma only
applies to entities with run-time addresses, not to types.

gcc/testsuite/

* gnat.dg/weak3.adb, gnat.dg/weak3.ads: New testcase.--- gcc/ada/sem_prag.adb
+++ gcc/ada/sem_prag.adb
@@ -25608,6 +25608,12 @@ package body Sem_Prag is
Ent := Underlying_Type (Ent);
 end if;
 
+--  The pragma applies to entities with addresses.
+
+if Is_Type (Ent) then
+   Error_Pragma ("pragma applies to objects and subprograms");
+end if;
+
 --  The only processing required is to link this item on to the
 --  list of rep items for the given entity. This is accomplished
 --  by the call to Rep_Item_Too_Late (when no error is detected

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/weak3.adb
@@ -0,0 +1,11 @@
+--  { dg-do compile }
+
+package body Weak3 is
+
+   type T is new Integer;
+   pragma Weak_External (T); --  { dg-error "pragma applies to objects and subprograms" }
+   X : T;
+
+   procedure Foo is null;
+
+end Weak3;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/weak3.ads
@@ -0,0 +1,3 @@
+package Weak3 is
+   procedure Foo;
+end Weak3;

[Ada] Wrong code with -gnatVa on lock-free protected objects

2019-07-01 Thread Pierre-Marie de Rodat

This patch fixes the handling of validity checks on protected objects
that use the Lock-Free implementation when validity checks are enabled,
previous to this patch the compiler would report improperly that a
condition in a protected operation was always True (when comoipled with
-gnatwa) and would generate incorrect code fhat operation.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-01  Ed Schonberg  

gcc/ada/

* checks.adb (Insert_Valid_Check): Do not apply validity check
to variable declared within a protected object that uses the
Lock_Free implementation, to prevent unwarranted constant
folding, because entities within such an object msut be treated
as volatile.

gcc/testsuite/

* gnat.dg/prot7.adb, gnat.dg/prot7.ads: New testcase.--- gcc/ada/checks.adb
+++ gcc/ada/checks.adb
@@ -7429,6 +7429,19 @@ package body Checks is
  return;
   end if;
 
+  --  Entities declared in Lock_free protected types must be treated
+  --  as volatile, and we must inhibit validity checks to prevent
+  --  improper constant folding.
+
+  if Is_Entity_Name (Expr)
+and then Is_Subprogram (Scope (Entity (Expr)))
+and then Present (Protected_Subprogram (Scope (Entity (Expr
+and then Uses_Lock_Free
+   (Scope (Protected_Subprogram (Scope (Entity (Expr)
+  then
+ return;
+  end if;
+
   --  If we have a checked conversion, then validity check applies to
   --  the expression inside the conversion, not the result, since if
   --  the expression inside is valid, then so is the conversion result.

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/prot7.adb
@@ -0,0 +1,22 @@
+--  { dg-do compile }
+--  { dg-options "-gnatwa -gnatVa" }
+
+package body Prot7 is
+   protected body Default_Slice is
+  function Get return Instance_Pointer is
+  begin
+ return Default;
+  end Get;
+
+  procedure Set (
+Discard : in out Boolean;
+Slice   : in Instance_Pointer
+  ) is
+  begin
+ Discard := Default /= null;
+ if not Discard then
+Default := Slice;
+ end if;
+  end Set;
+   end Default_Slice;
+end Prot7;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/prot7.ads
@@ -0,0 +1,16 @@
+package Prot7 is
+   type Instance_Pointer is access Integer;
+
+   protected Default_Slice
+with Lock_Free
+   is
+  function Get return Instance_Pointer;
+
+  procedure Set (
+Discard : in out Boolean;
+Slice   : in Instance_Pointer
+  );
+   private
+  Default : Instance_Pointer;
+   end Default_Slice;
+end Prot7;

[Ada] Spurious error private subtype derivation

2019-07-01 Thread Pierre-Marie de Rodat

This patch fixes a spurious error on a derived type declaration whose
subtype indication is a subtype of a private type whose full view is a
constrained discriminated type.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-01  Ed Schonberg  

gcc/ada/

* sem_ch3.adb (Build_Derived_Record_Type): If the parent type is
declared as a subtype of a private type with an inherited
discriminant constraint, its generated full base appears as a
record subtype, so we need to retrieve its oen base type so that
the inherited constraint can be applied to it.

gcc/testsuite/

* gnat.dg/derived_type6.adb, gnat.dg/derived_type6.ads: New
testcase.--- gcc/ada/sem_ch3.adb
+++ gcc/ada/sem_ch3.adb
@@ -8582,6 +8582,16 @@ package body Sem_Ch3 is
  Parent_Base := Base_Type (Parent_Type);
   end if;
 
+  --  If the parent type is declared as a subtype of another private
+  --  type with inherited discriminants, its generated base type is
+  --  itself a record subtype. To further inherit the constraint we
+  --  need to use its own base to have an unconstrained type on which
+  --  to apply the inherited constraint.
+
+  if Ekind (Parent_Base) = E_Record_Subtype then
+ Parent_Base := Base_Type (Parent_Base);
+  end if;
+
   --  AI05-0115: if this is a derivation from a private type in some
   --  other scope that may lead to invisible components for the derived
   --  type, mark it accordingly.

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/derived_type6.adb
@@ -0,0 +1,5 @@
+-- { dg-do compile }
+
+package body Derived_Type6 is
+  procedure Foo is null;
+end Derived_Type6;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/derived_type6.ads
@@ -0,0 +1,9 @@
+with Ada.Strings.Bounded;
+
+package Derived_Type6 is
+  package b is new Ada.Strings.Bounded.Generic_Bounded_Length(10);
+  subtype s1 is b.Bounded_String;
+  type s2 is new s1;
+
+  procedure Foo;
+end Derived_Type6;

[Ada] Remove a SPARK rule about implicit Global

2019-07-01 Thread Pierre-Marie de Rodat

A rule about implicit Global contract for functions whose names overload
an abstract state was never implemented (and no user complained about
this). It is now removed, so references to other rules need to be
renumbered.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-01  Piotr Trojanek  

gcc/ada/

* einfo.adb, sem_ch7.adb, sem_prag.adb, sem_util.adb: Update
references to the SPARK RM after the removal of Rule 7.1.4(5).--- gcc/ada/einfo.adb
+++ gcc/ada/einfo.adb
@@ -8141,7 +8141,7 @@ package body Einfo is
function Is_External_State (Id : E) return B is
begin
   --  To qualify, the abstract state must appear with option "external" or
-  --  "synchronous" (SPARK RM 7.1.4(8) and (10)).
+  --  "synchronous" (SPARK RM 7.1.4(7) and (9)).
 
   return
 Ekind (Id) = E_Abstract_State
@@ -8319,7 +8319,7 @@ package body Einfo is
function Is_Synchronized_State (Id : E) return B is
begin
   --  To qualify, the abstract state must appear with simple option
-  --  "synchronous" (SPARK RM 7.1.4(10)).
+  --  "synchronous" (SPARK RM 7.1.4(9)).
 
   return
 Ekind (Id) = E_Abstract_State

--- gcc/ada/sem_ch7.adb
+++ gcc/ada/sem_ch7.adb
@@ -3253,7 +3253,7 @@ package body Sem_Ch7 is
 
   --  A [generic] package that defines at least one non-null abstract state
   --  requires a completion only when at least one other construct requires
-  --  a completion in a body (SPARK RM 7.1.4(4) and (6)). This check is not
+  --  a completion in a body (SPARK RM 7.1.4(4) and (5)). This check is not
   --  performed if the caller requests this behavior.
 
   if Do_Abstract_States

--- gcc/ada/sem_prag.adb
+++ gcc/ada/sem_prag.adb
@@ -12219,7 +12219,7 @@ package body Sem_Prag is
Check_Ghost_Synchronous;
 
 --  Option Part_Of without an encapsulating state is
---  illegal (SPARK RM 7.1.4(9)).
+--  illegal (SPARK RM 7.1.4(8)).
 
 elsif Chars (Opt) = Name_Part_Of then
SPARK_Msg_N

--- gcc/ada/sem_util.adb
+++ gcc/ada/sem_util.adb
@@ -10737,7 +10737,7 @@ package body Sem_Util is
  --   Asynch_Writers Effective_Writes
  --
  --  Note that both forms of External have higher precedence than
- --  Synchronous (SPARK RM 7.1.4(10)).
+ --  Synchronous (SPARK RM 7.1.4(9)).
 
  elsif Has_Synchronous then
 return Nam_In (Property, Name_Async_Readers, Name_Async_Writers);

[Ada] Cleanup references to LynuxWorks in docs and comments

2019-07-01 Thread Pierre-Marie de Rodat

Apparently the company behind LynxOS is now called Lynx Software
Technologies (formerly LynuxWorks).

Use the current name in user docs and the previous name in developer
comment (to hopefully reflect the company name at the time when the
patchset mentioned in the comment was released).

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-01  Piotr Trojanek  

gcc/ada/

* sysdep.c: Cleanup references to LynuxWorks in docs and
comments.--- gcc/ada/sysdep.c
+++ gcc/ada/sysdep.c
@@ -1066,4 +1066,3 @@ __gnat_name_case_equivalence ()
   return 1;
 #endif
 }
-

[Ada] Revert "Global => null" on calendar routines that use timezones

2019-07-01 Thread Pierre-Marie de Rodat

Some routines from the Ada.Calendar package, i.e. Year, Month, Day,
Split and Time_Off, rely on OS-specific timezone databases that are kept
in files (e.g. /etc/localtime on Linux). In SPARK we want to model this
as a potential side-effect, so those routines can't have "Global =>
null".

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-01  Piotr Trojanek  

gcc/ada/

* libgnat/a-calend.ads: Revert "Global => null" contracts on
non-pure routines.--- gcc/ada/libgnat/a-calend.ads
+++ gcc/ada/libgnat/a-calend.ads
@@ -61,19 +61,20 @@ is
--  the result will contain all elapsed leap seconds since the start of
--  Ada time until now.
 
-   function Year(Date : Time) return Year_Number  with Global => null;
-   function Month   (Date : Time) return Month_Number with Global => null;
-   function Day (Date : Time) return Day_Number   with Global => null;
-   function Seconds (Date : Time) return Day_Duration with Global => null;
+   function Year(Date : Time) return Year_Number;
+   function Month   (Date : Time) return Month_Number;
+   function Day (Date : Time) return Day_Number;
+   function Seconds (Date : Time) return Day_Duration;
+   --  SPARK Note: These routines, just like Split and Time_Of below, might use
+   --  the OS-specific timezone database that is typically stored in a file.
+   --  This side effect needs to be modeled, so there is no Global => null.
 
procedure Split
  (Date: Time;
   Year: out Year_Number;
   Month   : out Month_Number;
   Day : out Day_Number;
-  Seconds : out Day_Duration)
-   with
- Global => null;
+  Seconds : out Day_Duration);
--  Break down a time value into its date components set in the current
--  time zone. If Split is called on a time value created using Ada 2005
--  Time_Of in some arbitrary time zone, the input value will always be
@@ -83,9 +84,7 @@ is
  (Year: Year_Number;
   Month   : Month_Number;
   Day : Day_Number;
-  Seconds : Day_Duration := 0.0) return Time
-   with
- Global => null;
+  Seconds : Day_Duration := 0.0) return Time;
--  GNAT Note: Normally when procedure Split is called on a Time value
--  result of a call to function Time_Of, the out parameters of procedure
--  Split are identical to the in parameters of function Time_Of. However,

[Ada] gprbuild fails to find ghost ALI files

2019-07-01 Thread Pierre-Marie de Rodat

This patch fixes a bug where if a ghost unit is compiled with
ignored-ghost mode in a library project, then gprbuild will fail to find
the ALI file, because the compiler generates an empty object file, but
no ALI file.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-01  Bob Duff  

gcc/ada/

* gnat1drv.adb (gnat1drv): Call Write_ALI if the main unit is
ignored-ghost.--- gcc/ada/gnat1drv.adb
+++ gcc/ada/gnat1drv.adb
@@ -1453,9 +1453,13 @@ begin
 
  --  Generate ALI file if specially requested, or for missing subunits,
  --  subunits or predefined generic. For ignored ghost code, the object
- --  file IS generated, so Object should be True.
+ --  file IS generated, so Object should be True, and since the object
+ --  file is generated, we need to generate the ALI file. We never want
+ --  an object file without an ALI file.
 
- if Opt.Force_ALI_Tree_File then
+ if Is_Ignored_Ghost_Unit (Main_Unit_Node)
+   or else Opt.Force_ALI_Tree_File
+ then
 Write_ALI (Object => Is_Ignored_Ghost_Unit (Main_Unit_Node));
  end if;

[Ada] Compiler abort on use of Invalid_Value on numeric positive subtype

2019-07-01 Thread Pierre-Marie de Rodat

Invalid_Value in most cases uses a predefined numeric value from a
built-in table, but if the type does not include zero in its range, the
literal 0 is used instead. In that case the value (produced by a call to
Get_Simple_Init_Val) must be resolved for proper type information.

The following must compile quietly:

   gnatmake -q main


with Problems; use Problems;
with Text_IO; use Text_IO;

procedure Main is
begin

   Put_Line ("P1: " & P1'Image);
   Put_Line ("P2: " & P2'Image);
   Put_Line ("P3: " & P3'Image);
   Put_Line ("P4: " & P4'Image);

end Main;
--
package Problems is

   function P1 return Integer;
   function P2 return Long_Integer;

   -- Max. number of prime factors a number can have is log_2 N
   -- For N = 600851475143, this is ~ 40
   -- type P3_Factors is array (1 .. 40) of Long_Integer;
   function P3 return Long_Integer;

   type P4_Palindrome is range 100*100 .. 999*999;
   function P4 return P4_Palindrome;

end Problems;

package body Problems is

   function P1 return Integer is separate;
   function P2 return Long_Integer is separate;
   function P3 return Long_Integer is separate;
   function P4 return P4_Palindrome is separate;

end Problems;

separate(Problems)

function P1 return Integer is

   Sum : Integer range 0 .. 500_500 := 0;

begin

   for I in Integer range 1 .. 1000 - 1 loop
  if I mod 3 = 0 or I mod 5 = 0 then
 Sum := Sum + I;
  end if;
   end loop;

   return Sum;

end P1;
--
separate(Problems)

function P2 return Long_Integer is

   subtype Total is Long_Integer range 0 .. 8_02e6 ;
   subtype Elem  is Totalrange 0 .. 4e7 ;

   Sum : Total := 0;
   a, b, c : Elem;

begin
   a := 1;
   b := 2;

   loop
  if b mod 2 = 0 then
 Sum := Sum + b;
  end if;

  c := b;
  b := a + b;
  a := c;

  exit when b >= 4e6;
   end loop;

   return Sum;

end P2;
--
with Text_IO; use Text_IO;
with Ada.Numerics.Elementary_Functions; use Ada.Numerics.Elementary_Functions;

separate(Problems)
function P3 return Long_Integer is

   -- Greatest prime factor
   GPF  : Long_Integer   := 1;

   Dividend : Long_Integer  := 600851475143;
   Factor   : Long_Integer  := 2;
   Quotient : Long_Integer;

begin

   while Dividend > 1 loop
  Quotient := Dividend / Factor;
  if Dividend mod Factor = 0 then
 GPF := Factor;
 Dividend := Quotient;
  else
 if Factor >= Quotient then
GPF := Dividend;
exit;
 end if;
 Factor := Factor + 1;
  end if;
   end loop;

   return GPF;

end P3;

with Text_IO; use Text_IO;
separate(Problems)
function P4 return P4_Palindrome is

   type TripleDigit is range 100 .. 999;
   a, b: TripleDigit := TripleDigit'First;

   c : P4_Palindrome;

   Max_Palindrome : P4_Palindrome := P4_Palindrome'Invalid_Value;

   function Is_Palindrome (X : in P4_Palindrome) return Boolean is

  type Int_Digit is range 0 .. 9;
  type Int_Digits is array (1 .. 6) of Int_Digit;

  type Digit_Extractor is range 0 .. P4_Palindrome'Last;
  Y : Digit_Extractor := Digit_Extractor (X);
  X_Digits : Int_Digits;

   begin

  for I in reverse X_Digits'Range loop
 X_Digits (I) := Int_Digit (Y mod 10);
 Y := Y / 10;
  end loop;

  return
(X_Digits (1) = X_Digits (6) and X_Digits (2) = X_Digits (5) and
 X_Digits (3) = X_Digits (4)) or
(X_Digits (2) = X_Digits (6) and X_Digits (3) = X_Digits (5) and
 X_Digits(1) = 0);

   end Is_Palindrome;

begin

   for a in TripleDigit'Range loop
  for b in TripleDigit'Range loop
 c := P4_Palindrome (a * b);
 if Is_Palindrome (c) then
if Max_Palindrome'Valid or else c > Max_Palindrome then
   Max_Palindrome := c;
end if;
 end if;
  end loop;
   end loop;

   return Max_Palindrome;
end;

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-01  Ed Schonberg  

gcc/ada/

* exp_attr.adb (Expand_Attribute_Reference, case Invalid_Value):
Resolve result of call to Get_Simple_Init_Val, which may be a
conversion of a literal.--- gcc/ada/exp_attr.adb
+++ gcc/ada/exp_attr.adb
@@ -4242,6 +4242,11 @@ package body Exp_Attr is
   when Attribute_Invalid_Value =>
  Rewrite (N, Get_Simple_Init_Val (Ptyp, N));
 
+ --  The value produced may be a conversion of a literal, which
+ --  must be resolved to establish its proper type.
+
+ Analyze_And_Resolve (N);
+
   --
   -- Last --
   --

[Ada] Crash due to missing freeze nodes in transient scope

2019-07-01 Thread Pierre-Marie de Rodat

The following patch updates the freezing of expressions to insert the
generated freeze nodes prior to the expression that produced them when
the context is a transient scope within a type initialization procedure.
This ensures that the nodes are properly interleaved with respect to the
constructs that generated them.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-07-01  Hristian Kirtchev  

gcc/ada/

* freeze.adb (Freeze_Expression): Remove the horrible useless
name hiding of N. Insert the freeze nodes generated by the
expression prior to the expression when the nearest enclosing
scope is transient.

gcc/testsuite/

* gnat.dg/freezing1.adb, gnat.dg/freezing1.ads,
gnat.dg/freezing1_pack.adb, gnat.dg/freezing1_pack.ads: New
testcase.--- gcc/ada/freeze.adb
+++ gcc/ada/freeze.adb
@@ -7665,9 +7665,8 @@ package body Freeze is
 or else Ekind (Current_Scope) = E_Void
   then
  declare
-N: constant Node_Id := Current_Scope;
-Freeze_Nodes : List_Id  := No_List;
-Pos  : Int  := Scope_Stack.Last;
+Freeze_Nodes : List_Id := No_List;
+Pos  : Int := Scope_Stack.Last;
 
  begin
 if Present (Desig_Typ) then
@@ -7700,7 +7699,19 @@ package body Freeze is
 end if;
 
 if Is_Non_Empty_List (Freeze_Nodes) then
-   if No (Scope_Stack.Table (Pos).Pending_Freeze_Actions) then
+
+   --  When the current scope is transient, insert the freeze nodes
+   --  prior to the expression that produced them. Transient scopes
+   --  may create additional declarations when finalizing objects
+   --  or managing the secondary stack. Inserting the freeze nodes
+   --  of those constructs prior to the scope would result in a
+   --  freeze-before-declaration, therefore the freeze node must
+   --  remain interleaved with their constructs.
+
+   if Scope_Is_Transient then
+  Insert_Actions (N, Freeze_Nodes);
+
+   elsif No (Scope_Stack.Table (Pos).Pending_Freeze_Actions) then
   Scope_Stack.Table (Pos).Pending_Freeze_Actions :=
 Freeze_Nodes;
else

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/freezing1.adb
@@ -0,0 +1,5 @@
+--  { dg-do compile }
+
+package body Freezing1 is
+   procedure Foo is null;
+end Freezing1;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/freezing1.ads
@@ -0,0 +1,10 @@
+with Freezing1_Pack; use Freezing1_Pack;
+
+package Freezing1 is
+   type T is abstract tagged record
+  Collection : access I_Interface_Collection'Class :=
+new I_Interface_Collection'Class'(Factory.Create_Collection);
+   end record;
+
+   procedure Foo;
+end Freezing1;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/freezing1_pack.adb
@@ -0,0 +1,8 @@
+package body Freezing1_Pack is
+   function Create_Collection
+ (Factory : in T_Factory) return I_Interface_Collection'Class
+   is
+   begin
+  return Implem'(null record);
+   end Create_Collection;
+end Freezing1_Pack;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/freezing1_pack.ads
@@ -0,0 +1,16 @@
+package Freezing1_Pack is
+   type T_Factory is abstract tagged private;
+   type I_Interface_Collection is interface;
+
+   Factory : constant T_Factory;
+
+   function Create_Collection
+ (Factory : in T_Factory) return I_Interface_Collection'Class;
+
+   type Implem is new I_Interface_Collection with null record;
+
+private
+   type T_Factory is tagged null record;
+
+   Factory : constant T_Factory := T_Factory'(null record);
+end Freezing1_Pack;

Re: [PATCH] Make lto-dump dump symtab callgraph in graphviz format

2019-07-01 Thread Martin Liška

On 7/1/19 2:56 PM, Giuliano Belinassi wrote:
> Hi,
> 
> On 07/01, Martin Jambor wrote:
>> On Sat, Jun 29 2019, Giuliano Belinassi wrote:
>>> This patch makes lto-dump dump the symbol table callgraph in graphviz
>>> format.
>> -ENOPATCH
> Sorry,
> I forgot the most important. Here it is.

Hello.

I like the intention! Please try to use:
$ git diff HEAD > /tmp/patch && ~/Programming/gcc/contrib/check_GNU_style.py 
/tmp/patch

which will show you some coding style issues.

> 
>> Martin
>>
>>> I've not added any test to this because I couldn't find a way to call
>>> lto-dump from the testsuite. Also, any feedback with regard to how can
>>> I improve this is welcome.
>>>
>>> gcc/ChangeLog
>>> 2019-06-29  Giuliano Belinassi  
>>>
>>> * cgraph.c (dump_graphviz): New function
>>> * cgraph.h (dump_graphviz): New function
>>> * symtab.c (dump_graphviz): New function
>>> * varpool.c (dump_graphviz): New function
>>>
>>> gcc/lto/ChangeLog
>>> 2019-06-29  Giuliano Belinassi  
>>>
>>> * lang.opt (flag_dump_callgraph): New flag
>>> * lto-dump.c (dump_symtab_graphviz): New function
>>> * lto-dump.c (dump_tool_help): New option
>>> * lto-dump.c (lto_main): New option
> Giuliano
> 
> ===
> 
> diff --git gcc/cgraph.c gcc/cgraph.c
> index 28019aba434..55f4ee0bdaa 100644
> 
> --- gcc/cgraph.c
> 
> +++ gcc/cgraph.c
> 
> @@ -2204,6 +2204,22 @@
> 
>  cgraph_node::dump (FILE *f)
> 
> }
> }
> +/* Dump call graph node to file F in graphviz format. */
> +
> +void
> +cgraph_node::dump_graphviz (FILE *f)
> +{
> + cgraph_edge *edge;
> +
> + for (edge = callees; edge; edge = edge->next_callee)
> + {
> + cgraph_node *callee = edge->callee;
> +
> + fprintf(f, "\t\"%s\" -> \"%s\"\n", name (), callee->name ());
> + }
> +}
> +
> +
> /* Dump call graph node NODE to stderr. */
> DEBUG_FUNCTION void
> 
> ===
> 
> diff --git gcc/cgraph.h gcc/cgraph.h
> index 18839a4a5ec..181c00a3bd8 100644
> 
> --- gcc/cgraph.h
> 
> +++ gcc/cgraph.h
> 
> @@ -135,6 +135,9 @@
> 
>  public:
> 
> /* Dump symtab node to F. */
> void dump (FILE *f);
> + /* Dump symtab callgraph in graphviz format*/
> + void dump_graphviz (FILE *f);
> +
> /* Dump symtab node to stderr. */
> void DEBUG_FUNCTION debug (void);
> 
> @@ -1106,6 +1109,9 @@
> 
>  public:
> 
> /* Dump call graph node to file F. */
> void dump (FILE *f);
> + /* Dump call graph node to file F. */
> + void dump_graphviz (FILE *f);
> +
> /* Dump call graph node to stderr. */
> void DEBUG_FUNCTION debug (void);
> 
> @@ -1861,6 +1867,9 @@
> 
>  public:
> 
> /* Dump given varpool node to F. */
> void dump (FILE *f);
> + /* Dump given varpool node in graphviz format to F. */
> + void dump_graphviz (FILE *f);
> +
> /* Dump given varpool node to stderr. */
> void DEBUG_FUNCTION debug (void);
> 
> @@ -2279,6 +2288,9 @@
> 
>  public:
> 
> /* Dump symbol table to F. */
> void dump (FILE *f);
> + /* Dump symbol table to F in graphviz format. */
> + void dump_graphviz (FILE *f);
> +

I would not define it for varpool_node as you don't need it.

> /* Dump symbol table to stderr. */
> void DEBUG_FUNCTION debug (void);
> 
> ===
> 
> diff --git gcc/lto/lang.opt gcc/lto/lang.opt
> index 5bacef349e3..c62dd5aac08 100644
> 
> --- gcc/lto/lang.opt
> 
> +++ gcc/lto/lang.opt
> 
> @@ -127,6 +127,9 @@
> 
>  help
> 
> LTODump Var(flag_lto_dump_tool_help)
> Dump the dump tool command line options.
> +callgraph
> +LTODump Var(flag_dump_callgraph)
> +Dump the symtab callgraph.
> fresolution=
> LTO Joined
> 
> ===
> 
> diff --git gcc/lto/lto-dump.c gcc/lto/lto-dump.c
> index 691d109ff34..bc20120f2d5 100644
> 
> --- gcc/lto/lto-dump.c
> 
> +++ gcc/lto/lto-dump.c
> 
> @@ -197,6 +197,12 @@
> 
>  void dump_list_variables (void)
> 
> e->dump ();
> }
> +/* Dump symbol table in graphviz format. */
> +void dump_symtab_graphviz (void)
> +{
> + symtab->dump_graphviz (stdout);
> +}
> +
> /* Dump symbol list. */
> void dump_list (void)
> 
> @@ -251,26 +257,27 @@
> 
>  void dump_body ()
> 
> /* List of command line options for dumping. */
> void dump_tool_help ()
> {
> - printf ("Usage: lto-dump [OPTION]... SUB_COMMAND [OPTION]...\n\n");
> - printf ("LTO dump tool command line options.\n\n");
> - printf (" -list [options] Dump the symbol list.\n");
> - printf (" -demangle Dump the demangled output.\n");
> - printf (" -defined-only Dump only the defined symbols.\n");
> - printf (" -print-value Dump initial values of the "
> - "variables.\n");
> - printf (" -name-sort Sort the symbols alphabetically.\n");
> - printf (" -size-sort Sort the symbols according to size.\n");
> - printf (" -reverse-sort Dump the symbols in reverse order.\n");
> - printf (" -symbol= Dump the details of specific symbol.\n");
> - printf (" -objects Dump the details of LTO objects.\n");
> - printf (" -type-stats Dump statistics of tr

Re: [PATCH] Make lto-dump dump symtab callgraph in graphviz format

2019-07-01 Thread Giuliano Belinassi

Hi,

On 07/01, Martin Jambor wrote:
> On Sat, Jun 29 2019, Giuliano Belinassi wrote:
> > This patch makes lto-dump dump the symbol table callgraph in graphviz
> > format.
> 
> -ENOPATCH
Sorry,
I forgot the most important. Here it is.

> 
> Martin
> 
> >
> > I've not added any test to this because I couldn't find a way to call
> > lto-dump from the testsuite. Also, any feedback with regard to how can
> > I improve this is welcome.
> >
> > gcc/ChangeLog
> > 2019-06-29  Giuliano Belinassi  
> >
> > * cgraph.c (dump_graphviz): New function
> > * cgraph.h (dump_graphviz): New function
> > * symtab.c (dump_graphviz): New function
> > * varpool.c (dump_graphviz): New function
> >
> > gcc/lto/ChangeLog
> > 2019-06-29  Giuliano Belinassi  
> >
> > * lang.opt (flag_dump_callgraph): New flag
> > * lto-dump.c (dump_symtab_graphviz): New function
> > * lto-dump.c (dump_tool_help): New option
> > * lto-dump.c (lto_main): New option
Giuliano
diff --git gcc/cgraph.c gcc/cgraph.c
index 28019aba434..55f4ee0bdaa 100644
--- gcc/cgraph.c
+++ gcc/cgraph.c
@@ -2204,6 +2204,22 @@ cgraph_node::dump (FILE *f)
 }
 }
 
+/* Dump call graph node to file F in graphviz format.  */
+
+void
+cgraph_node::dump_graphviz (FILE *f)
+{
+  cgraph_edge *edge;
+
+  for (edge = callees; edge; edge = edge->next_callee)
+{
+  cgraph_node *callee = edge->callee;
+
+  fprintf(f, "\t\"%s\" -> \"%s\"\n", name (), callee->name ());
+}
+}
+
+
 /* Dump call graph node NODE to stderr.  */
 
 DEBUG_FUNCTION void
diff --git gcc/cgraph.h gcc/cgraph.h
index 18839a4a5ec..181c00a3bd8 100644
--- gcc/cgraph.h
+++ gcc/cgraph.h
@@ -135,6 +135,9 @@ public:
   /* Dump symtab node to F.  */
   void dump (FILE *f);
 
+  /* Dump symtab callgraph in graphviz format*/
+  void dump_graphviz (FILE *f);
+
   /* Dump symtab node to stderr.  */
   void DEBUG_FUNCTION debug (void);
 
@@ -1106,6 +1109,9 @@ public:
   /* Dump call graph node to file F.  */
   void dump (FILE *f);
 
+  /* Dump call graph node to file F.  */
+  void dump_graphviz (FILE *f);
+
   /* Dump call graph node to stderr.  */
   void DEBUG_FUNCTION debug (void);
 
@@ -1861,6 +1867,9 @@ public:
   /* Dump given varpool node to F.  */
   void dump (FILE *f);
 
+  /* Dump given varpool node in graphviz format to F.  */
+  void dump_graphviz (FILE *f);
+
   /* Dump given varpool node to stderr.  */
   void DEBUG_FUNCTION debug (void);
 
@@ -2279,6 +2288,9 @@ public:
   /* Dump symbol table to F.  */
   void dump (FILE *f);
 
+  /* Dump symbol table to F in graphviz format.  */
+  void dump_graphviz (FILE *f);
+
   /* Dump symbol table to stderr.  */
   void DEBUG_FUNCTION debug (void);
 
diff --git gcc/lto/lang.opt gcc/lto/lang.opt
index 5bacef349e3..c62dd5aac08 100644
--- gcc/lto/lang.opt
+++ gcc/lto/lang.opt
@@ -127,6 +127,9 @@ help
 LTODump Var(flag_lto_dump_tool_help)
 Dump the dump tool command line options.
 
+callgraph
+LTODump Var(flag_dump_callgraph)
+Dump the symtab callgraph.
 
 fresolution=
 LTO Joined
diff --git gcc/lto/lto-dump.c gcc/lto/lto-dump.c
index 691d109ff34..bc20120f2d5 100644
--- gcc/lto/lto-dump.c
+++ gcc/lto/lto-dump.c
@@ -197,6 +197,12 @@ void dump_list_variables (void)
 e->dump ();
 }
 
+/* Dump symbol table in graphviz format.  */
+void dump_symtab_graphviz (void)
+{
+  symtab->dump_graphviz (stdout);
+}
+
 /* Dump symbol list.  */
 
 void dump_list (void)
@@ -251,26 +257,27 @@ void dump_body ()
 /* List of command line options for dumping.  */
 void dump_tool_help ()
 {
-  printf ("Usage: lto-dump [OPTION]... SUB_COMMAND [OPTION]...\n\n");
-  printf ("LTO dump tool command line options.\n\n");
-  printf ("  -list [options]   Dump the symbol list.\n");
-  printf ("-demangle   Dump the demangled output.\n");
-  printf ("-defined-only   Dump only the defined symbols.\n");
-  printf ("-print-valueDump initial values of the "
-	  "variables.\n");
-  printf ("-name-sort  Sort the symbols alphabetically.\n");
-  printf ("-size-sort  Sort the symbols according to size.\n");
-  printf ("-reverse-sort   Dump the symbols in reverse order.\n");
-  printf ("  -symbol=  Dump the details of specific symbol.\n");
-  printf ("  -objects  Dump the details of LTO objects.\n");
-  printf ("  -type-stats   Dump statistics of tree types.\n");
-  printf ("  -tree-stats   Dump statistics of trees.\n");
-  printf ("  -gimple-stats Dump statistics of gimple "
-	  "statements.\n");
-  printf ("  -dump-body=   Dump the specific gimple body.\n");
-  printf ("  -dump-level=  Deciding the optimization level "
-	  "of body.\n");
-  printf ("  -help Display the dump tool help.\n");
+  const char *msg = 
+"Usage: lto-dump [OPTION]... SUB_COMMAND [OPTION]...\n\n"
+"LTO dump

Re: Use ODR for canonical types construction in LTO

2019-07-01 Thread Jan Hubicka

Hi,
this patch fixes the ICE. Problem is that va_arg type is pre-streamed
and thus at stream-in time we have main variant constructed by LTO FE
which is !CXX_ODR_P while vairants are ones comming from C++ FE which
are ODR.  It is safe to drop the flags here since we only care about
main variants anyway.

I am testing the patch on aarch64 now.

Honza

Index: lto/lto-common.c
===
--- lto/lto-common.c(revision 272846)
+++ lto/lto-common.c(working copy)
@@ -568,8 +568,17 @@ lto_register_canonical_types_for_odr_typ
 
   /* Register all remaining types.  */
   FOR_EACH_VEC_ELT (*types_to_register, i, t)
-if (!TYPE_CANONICAL (t))
-  gimple_register_canonical_type (t);
+{
+  /* For pre-streamed types like va-arg it is possible that main variant
+is !CXX_ODR_P while the variant (which is streamed) is.
+Copy CXX_ODR_P to make type verifier happy.  This is safe because
+in canonical type calculation we only consider main variants.
+However we can not change this flag before streaming is finished
+to not affect tree merging.  */
+  TYPE_CXX_ODR_P (t) = TYPE_CXX_ODR_P (TYPE_MAIN_VARIANT (t));
+  if (!TYPE_CANONICAL (t))
+gimple_register_canonical_type (t);
+}
 }

Re: [ARM/FDPIC v5 00/21] FDPIC ABI for ARM

2019-07-01 Thread Christophe Lyon

ping^4 ?
https://gcc.gnu.org/ml/gcc-patches/2019-05/msg00817.html

On Mon, 17 Jun 2019 at 13:41, Christophe Lyon
 wrote:
>
> ping^3 ?
>
> On Thu, 6 Jun 2019 at 14:36, Christophe Lyon  
> wrote:
> >
> > Hi,
> >
> > If this makes review easier, here are the areas covered by the patches:
> >
> > - patches 1,3,4,7,8,9,10,12,21: target-specific
> > - patch 2: configure
> > - patch 5,6,11,13: generic parts, undef #if defined(__FDPIC__)
> > - patches 14-20: testsuite
> >
> > Christophe
> >
> > On Tue, 4 Jun 2019 at 14:57, Christophe Lyon  
> > wrote:
> > >
> > > Ping?
> > >
> > >
> > > On Thu, 23 May 2019 at 14:46, Christophe Lyon  
> > > wrote:
> > > >
> > > > Ping?
> > > >
> > > > Any feedback other than what I got on patch 03/21 ?
> > > >
> > > > Thanks,
> > > >
> > > > Christophe
> > > >
> > > >
> > > > On 15/05/2019 14:39, Christophe Lyon wrote:
> > > > > Hello,
> > > > >
> > > > > This patch series implements the GCC contribution of the FDPIC ABI for
> > > > > ARM targets.
> > > > >
> > > > > This ABI enables to run Linux on ARM MMU-less cores and supports
> > > > > shared libraries to reduce the memory footprint.
> > > > >
> > > > > Without MMU, text and data segments relative distances are different
> > > > > from one process to another, hence the need for a dedicated FDPIC
> > > > > register holding the start address of the data segment. One of the
> > > > > side effects is that function pointers require two words to be
> > > > > represented: the address of the code, and the data segment start
> > > > > address. These two words are designated as "Function Descriptor",
> > > > > hence the "FD PIC" name.
> > > > >
> > > > > On ARM, the FDPIC register is r9 [1], and the target name is
> > > > > arm-uclinuxfdpiceabi. Note that arm-uclinux exists, but uses another
> > > > > ABI and the BFLAT file format; it does not support code sharing.
> > > > > The -mfdpic option is enabled by default, and -mno-fdpic should be
> > > > > used to build the Linux kernel.
> > > > >
> > > > > This work was developed some time ago by STMicroelectronics, and was
> > > > > presented during Linaro Connect SFO15 (September 2015). You can watch
> > > > > the discussion and read the slides [2].
> > > > > This presentation was related to the toolchain published on github 
> > > > > [3],
> > > > > which is based on binutils-2.22, gcc-4.7, uclibc-0.9.33.2, gdb-7.5.1
> > > > > and qemu-2.3.0, and for which pre-built binaries are available [3].
> > > > >
> > > > > The ABI itself is described in details in [1].
> > > > >
> > > > > Our Linux kernel patches have been updated and committed by Nicolas
> > > > > Pitre (Linaro) in July 2017. They are required so that the loader is
> > > > > able to handle this new file type. Indeed, the ELF files are tagged
> > > > > with ELFOSABI_ARM_FDPIC. This new tag has been allocated by ARM, as
> > > > > well as the new relocations involved.
> > > > >
> > > > > The binutils, QEMU and uclibc-ng patch series have been merged a few
> > > > > months ago. [4][5][6]
> > > > >
> > > > > This series provides support for architectures that support ARM and/or
> > > > > Thumb-2 and has been tested on arm-linux-gnueabi without regression,
> > > > > as well as arm-uclinuxfdpiceabi, using QEMU. arm-uclinuxfdpiceabi has
> > > > > a few more failures than arm-linux-gnueabi, but is quite functional.
> > > > >
> > > > > I have also booted an STM32 board (stm32f469) which uses a cortex-m4
> > > > > with linux-4.20.17 and ran successfully several tools.
> > > > >
> > > > > Are the GCC patches OK for inclusion in master?
> > > > >
> > > > > Changes between v4 and v5:
> > > > > - rebased on top of recent gcc-10 master (April 26th, 2019)
> > > > > - fixed handling of stack-protector combined patterns in FDPIC mode
> > > > >
> > > > > Changes between v3 and v4:
> > > > >
> > > > > - improved documentation (patch 1)
> > > > > - emit an error message (sorry) if the target architecture does not
> > > > >support arm nor thumb-2 modes (patch 4)
> > > > > - handle Richard's comments on patch 4 (comments, unspec)
> > > > > - added .align directive (patch 5)
> > > > > - fixed use of kernel helpers (__kernel_cmpxchg, __kernel_dmb) (patch 
> > > > > 6)
> > > > > - code factorization in patch 7
> > > > > - typos/internal function name in patch 8
> > > > > - improved patch 12
> > > > > - dropped patch 16
> > > > > - patch 20 introduces arm_arch*_thumb_ok effective targets to help
> > > > >skip some tests
> > > > > - I tested patch 2 on xtensa-buildroot-uclinux-uclibc, it adds many
> > > > >new tests, but a few regressions
> > > > >(https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00713.html)
> > > > > - I compiled and executed several LTP tests to exercise pthreads and 
> > > > > signals
> > > > > - I wrote and executed a simple testcase to change the interaction
> > > > >with __kernel_cmpxchg (ie. call the kernel helper rather than use 
> > > > > an
> > > > >implementation in libgcc as requested by Richard)
> > > > >
> >

Re: [PATCH][ARM] Add support for "noinit" attribute

2019-07-01 Thread Christophe Lyon

ping?

On Thu, 13 Jun 2019 at 17:13, Christophe Lyon
 wrote:
>
> Hi,
>
> Similar to what already exists for TI msp430 or in TI compilers for
> arm, this patch adds support for "noinit" attribute for arm. It's very
> similar to the corresponding code in GCC for msp430.
>
> It is useful for embedded targets where the user wants to keep the
> value of some data when the program is restarted: such variables are
> not zero-initialized.It is mostly a helper/shortcut to placing
> variables in a dedicated section.
>
> It's probably desirable to add the following chunk to the GNU linker:
> diff --git a/ld/emulparams/armelf.sh b/ld/emulparams/armelf.sh
> index 272a8bc..9555cec 100644
> --- a/ld/emulparams/armelf.sh
> +++ b/ld/emulparams/armelf.sh
> @@ -10,7 +10,19 @@ OTHER_TEXT_SECTIONS='*(.glue_7t) *(.glue_7)
> *(.vfp11_veneer) *(.v4_bx)'
>  OTHER_BSS_SYMBOLS="${CREATE_SHLIB+PROVIDE (}__bss_start__ =
> .${CREATE_SHLIB+)};"
>  OTHER_BSS_END_SYMBOLS="${CREATE_SHLIB+PROVIDE (}_bss_end__ =
> .${CREATE_SHLIB+)}; ${CREATE_SHLIB+PROVIDE (}__bss_end__ =
> .${CREATE_SHLIB+)};"
>  OTHER_END_SYMBOLS="${CREATE_SHLIB+PROVIDE (}__end__ = .${CREATE_SHLIB+)};"
> -OTHER_SECTIONS='.note.gnu.arm.ident 0 : { KEEP (*(.note.gnu.arm.ident)) }'
> +OTHER_SECTIONS='
> +.note.gnu.arm.ident 0 : { KEEP (*(.note.gnu.arm.ident)) }
> +  /* This section contains data that is not initialised during load
> + *or* application reset.  */
> +   .noinit (NOLOAD) :
> +   {
> + . = ALIGN(2);
> + PROVIDE (__noinit_start = .);
> + *(.noinit)
> + . = ALIGN(2);
> + PROVIDE (__noinit_end = .);
> +   }
> +'
>
> so that the noinit section has the "NOLOAD" flag.
>
> I'll submit that part separately to the binutils project if OK.
>
> OK?
>
> Thanks,
>
> Christophe

Re: [PING][AArch64] Use scvtf fbits option where appropriate

2019-07-01 Thread Wilco Dijkstra

Hi Joel,

This looks good. One more thing, the patterns need to be conditional
on check flag_trapping_math since the division can underflow and
reassociating it would remove that. Other than that I think this is ready,
but I can't approve.

Wilco

Re: [PATCH] rs6000: Enable -fvariable-expansion-in-unroller by default

2019-07-01 Thread Segher Boessenkool

On Mon, Jul 01, 2019 at 12:30:41PM +0200, Richard Biener wrote:
> On Thu, Jun 27, 2019 at 2:19 PM Bill Schmidt  wrote:
> >
> > On 6/27/19 6:45 AM, Segher Boessenkool wrote:
> > > On Thu, Jun 27, 2019 at 11:33:45AM +0200, Richard Biener wrote:
> > >> On Thu, Jun 27, 2019 at 5:23 AM Bill Schmidt  
> > >> wrote:
> > >>> We've done some experimenting and realized that the subject option 
> > >>> almost
> > >>> always provide improved performance for Power when the loop unroller is
> > >>> enabled.  So this patch turns that flag on by default for us.
> > >> I guess it creates more freedom for combine (more single-uses) and 
> > >> register
> > >> allocation.  I wonder in which cases this might pessimize things?  I 
> > >> guess
> > >> the pre-RA scheduler might make RAs life harder with creating overlapping
> > >> life-ranges.
> > >>
> > >> I guess you didn't actually investigate the nature of the improvements 
> > >> you saw?
> > > It breaks the length of dependency chains by a factor equal to the unroll
> > > factor.  I do not know why this doesn't help a lot everywhere.  It of
> > > course raises register pressure, maybe that is just it?
> >
> > Right, it's all about breaking dependencies to more efficiently exploit
> > the microarchitecture.  By default, variable expansion in GCC is quite
> > conservative, creating only two reduction streams out of one, so it's
> > pretty rare for it to cause spill.  This can be adjusted upwards with
> > --param max-variable-expansions-in-unroller=n.  Our experiments show
> > that raising n to 4 starts to cause some minor degradations, which are
> > almost certainly due to pressure, so the default setting looks appropriate.
> 
> But it's probably only an issue for targets which enable pre-RA scheduling
> by default?  It might also increase RA compile-time (more allocnos).

It only helps super-scalar CPUs normally.  It doesn't increase register
use much at all, compared to only doing unrolling.  It helps scheduling
a lot, but I don't see where sched1 comes in here?

To see if it helps your arch, just try it out?


Segher

Re: RFA: fix PR66726 regression for min/max/conversion combo from PR71016 patch

2019-07-01 Thread Richard Biener

On Mon, Jul 1, 2019 at 1:22 PM Joern Wolfgang Rennecke
 wrote:
>
> The heuristic introduced for PR71016 prevents recognizing a max / min
> combo like it is used for
> saturation when followed by a conversion.
> The attached patch refines the heuristic to allow this case. Regression
> tested on x86_64-pc-linux-gnu .

Few style nits:

  if (!gsi_end_p (gsi))
-   return NULL;
+   {
+ gimple *assign = gsi_stmt (gsi);
+ if (gimple_code (assign) != GIMPLE_ASSIGN)
+   return NULL;
   if (gassign *assign = dyn_cast 
(gsi_stmt (gsi)))
 {
+ tree lhs = gimple_assign_lhs (assign);
+ enum tree_code ass_code = gimple_assign_rhs_code (assign);
+ if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
+   return NULL;
+ gsi_prev_nondebug (&gsi);
+ if (!gsi_end_p (gsi))
+   return NULL;
}
  else
return NULL;
+   }

also please check that 'lhs' is equal to gimple_assing_rhs1 (arg0_def_stmt)
otherwise you'd also allow MIN/MAX unrelated to the conversion detected.

On x86 I see the MIN_EXPR is already detected by GENERIC folding,
I wonder if that is required or if we can handle the case without in one
phiopt pass invocation as well.

Otherwise looks good to me.
Thanks,
Richard.

Re: [patch][aarch64]: fix unrecognizable insn for ldr got in ilp32 tiny

2019-07-01 Thread Wilco Dijkstra

Hi Silvia,

> Combined them into one pattern. Updated the diff and the changelog is now:
 
 gcc/ChangeLog:
 
 2019-06-18  Sylvia Taylor  
 
     * config/aarch64/aarch64.c
     (aarch64_load_symref_appropriately): Change SYMBOL_TINY_GOT.
     * config/aarch64/aarch64.md
     (ldr_got_tiny_): New pattern.
     (ldr_got_tiny_sidi): New pattern.
 

Thanks, this looks fine to me, but I can't approve. 

As an aside there is an inconsistency in the mode used for symbols throughout
aarch64.md. Some have no mode, others use PTR, and some cases use DI mode
(which means ILP32 may want SImode for some symbols and DImode in other 
cases...).
Since various existing patterns have this issue it's not a problem with this 
patch.

Wilco

Re: [PATCH 04/30] Changes to arc

2019-07-01 Thread Andrew Burgess

* acsaw...@linux.ibm.com  [2019-06-25 15:22:13 -0500]:

> From: Aaron Sawdey 
> 
>   * config/arc/arc-protos.h: Change movmem to cpymem.
>   * config/arc/arc.c (arc_expand_movmem): Change movmem to cpymem.
>   * config/arc/arc.h: Change movmem to cpymem in comment.
>   * config/arc/arc.md (movmemsi): Change movmem to cpymem.

OK for ARC.

Thanks,
Andrew

> ---
>  gcc/config/arc/arc-protos.h | 2 +-
>  gcc/config/arc/arc.c| 6 +++---
>  gcc/config/arc/arc.h| 2 +-
>  gcc/config/arc/arc.md   | 4 ++--
>  4 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
> index f501bc3..74e5247 100644
> --- a/gcc/config/arc/arc-protos.h
> +++ b/gcc/config/arc/arc-protos.h
> @@ -35,7 +35,7 @@ extern void arc_final_prescan_insn (rtx_insn *, rtx *, int);
>  extern const char *arc_output_libcall (const char *);
>  extern int arc_output_addsi (rtx *operands, bool, bool);
>  extern int arc_output_commutative_cond_exec (rtx *operands, bool);
> -extern bool arc_expand_movmem (rtx *operands);
> +extern bool arc_expand_cpymem (rtx *operands);
>  extern bool prepare_move_operands (rtx *operands, machine_mode mode);
>  extern void emit_shift (enum rtx_code, rtx, rtx, rtx);
>  extern void arc_expand_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx);
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index 89f69c79..23171d2 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -8883,7 +8883,7 @@ arc_output_commutative_cond_exec (rtx *operands, bool 
> output_p)
>return 8;
>  }
>  
> -/* Helper function of arc_expand_movmem.  ADDR points to a chunk of memory.
> +/* Helper function of arc_expand_cpymem.  ADDR points to a chunk of memory.
> Emit code and return an potentially modified address such that offsets
> up to SIZE are can be added to yield a legitimate address.
> if REUSE is set, ADDR is a register that may be modified.  */
> @@ -8917,7 +8917,7 @@ force_offsettable (rtx addr, HOST_WIDE_INT size, bool 
> reuse)
> offset ranges.  Return true on success.  */
>  
>  bool
> -arc_expand_movmem (rtx *operands)
> +arc_expand_cpymem (rtx *operands)
>  {
>rtx dst = operands[0];
>rtx src = operands[1];
> @@ -10427,7 +10427,7 @@ arc_use_by_pieces_infrastructure_p (unsigned 
> HOST_WIDE_INT size,
>   enum by_pieces_operation op,
>   bool speed_p)
>  {
> -  /* Let the movmem expander handle small block moves.  */
> +  /* Let the cpymem expander handle small block moves.  */
>if (op == MOVE_BY_PIECES)
>  return false;
>  
> diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
> index 80dead9..4a9dd07 100644
> --- a/gcc/config/arc/arc.h
> +++ b/gcc/config/arc/arc.h
> @@ -1423,7 +1423,7 @@ do { \
> in one reasonably fast instruction.  */
>  #define MOVE_MAX 4
>  
> -/* Undo the effects of the movmem pattern presence on STORE_BY_PIECES_P .  */
> +/* Undo the effects of the cpymem pattern presence on STORE_BY_PIECES_P .  */
>  #define MOVE_RATIO(SPEED) ((SPEED) ? 15 : 3)
>  
>  /* Define this to be nonzero if shift instructions ignore all but the
> diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
> index 528e344..ba595dd 100644
> --- a/gcc/config/arc/arc.md
> +++ b/gcc/config/arc/arc.md
> @@ -5122,13 +5122,13 @@ core_3, archs4x, archs4xd, archs4xd_slow"
> (set_attr "type" "loop_end")
> (set_attr "length" "4,20")])
>  
> -(define_expand "movmemsi"
> +(define_expand "cpymemsi"
>[(match_operand:BLK 0 "" "")
> (match_operand:BLK 1 "" "")
> (match_operand:SI 2 "nonmemory_operand" "")
> (match_operand 3 "immediate_operand" "")]
>""
> -  "if (arc_expand_movmem (operands)) DONE; else FAIL;")
> +  "if (arc_expand_cpymem (operands)) DONE; else FAIL;")
>  
>  ;; Close http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35803 if this works
>  ;; to the point that we can generate cmove instructions.
> -- 
> 2.7.4
>

RFA: fix PR66726 regression for min/max/conversion combo from PR71016 patch

2019-07-01 Thread Joern Wolfgang Rennecke

The heuristic introduced for PR71016 prevents recognizing a max / min 
combo like it is used for

saturation when followed by a conversion.
The attached patch refines the heuristic to allow this case. Regression 
tested on x86_64-pc-linux-gnu .


Index: tree-ssa-phiopt.c
===
--- tree-ssa-phiopt.c   (revision 272846)
+++ tree-ssa-phiopt.c   (working copy)
@@ -504,7 +504,18 @@ factor_out_conditional_conversion (edge
  gsi = gsi_for_stmt (arg0_def_stmt);
  gsi_prev_nondebug (&gsi);
  if (!gsi_end_p (gsi))
-   return NULL;
+   {
+ gimple *assign = gsi_stmt (gsi);
+ if (gimple_code (assign) != GIMPLE_ASSIGN)
+   return NULL;
+ tree lhs = gimple_assign_lhs (assign);
+ enum tree_code ass_code = gimple_assign_rhs_code (assign);
+ if (ass_code != MAX_EXPR && ass_code != MIN_EXPR)
+   return NULL;
+ gsi_prev_nondebug (&gsi);
+ if (!gsi_end_p (gsi))
+   return NULL;
+   }
  gsi = gsi_for_stmt (arg0_def_stmt);
  gsi_next_nondebug (&gsi);
  if (!gsi_end_p (gsi))
Index: testsuite/gcc.dg/tree-ssa/pr66726-4.c
===
--- testsuite/gcc.dg/tree-ssa/pr66726-4.c   (nonexistent)
+++ testsuite/gcc.dg/tree-ssa/pr66726-4.c   (working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-phiopt1-details" } */
+
+#define SAT(x) (x < 0 ? 0 : (x > 255 ? 255 : x))
+
+void
+foo (unsigned char *p, int i)
+{
+  *p = SAT (i);
+}
+
+/* { dg-final { scan-tree-dump-times "COND_EXPR .*and PHI .*converted to 
straightline code" 1 "phiopt1" } } */

Re: [PATCH 2/2] Rename SINGE_VALUE to TOPN_VALUES counters.

2019-07-01 Thread Martin Liška

@Honza: PING^1

On 6/20/19 4:46 PM, Martin Liška wrote:
> And the second part is rename so that it reflect reality
> that single value can actually track multiple values.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
>

Re: [PATCH] [RFC, PGO+LTO] Missed function specialization + partial devirtualization

2019-07-01 Thread Martin Liška

@Honza: PING^1

On 6/20/19 4:45 PM, Martin Liška wrote:
> Hi.
> 
> So the first part is about support of N tracked values to be supported.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
>

Re: allow EH to escape from GIMPLE_EH_ELSE ELSE block

2019-07-01 Thread Richard Biener

On Fri, Jun 28, 2019 at 11:43 AM Alexandre Oliva  wrote:
>
> On Jun 27, 2019, Richard Biener  wrote:
>
> > On Thu, Jun 27, 2019 at 10:18 AM Alexandre Oliva  wrote:
> >>
> >> The only preexisting use of GIMPLE_EH_ELSE, for transactional memory
> >> commits, did not allow exceptions to escape from the ELSE path.  The
> >> trick it uses to allow the ELSE path to see the propagating exception
> >> does not work very well if the exception cleanup raises further
> >> exceptions: the ELSE block is configured to handle exceptions in
> >> itself.  This confuses the heck out of CFG and EH cleanups.
> >>
> >> Basing the lowering context for the ELSE block on outer_state, rather
> >> than this_state, gets us the expected enclosing handler.
> >>
> >> Regstrapped on x86_64-linux-gnu.  Ok to install?
>
> > Testcase?
>
> None possible with the current codebase, I'm afraid.  Nothing creates
> EH_ELSE yet, and nothing creates GIMPLE_EH_ELSE with
> exception-generating cleanups.

Oh, I see.  The GIMPLE frontend is also missing parsing support
for the EH stmt kinds, it might have been possible to build a testcase
with that I guess (yeah, only a very slight hint ... ;)).  Can you share
a .gimple dump that has the issue?  So the testcase "survives"
CFG construction but then crashes during the first CFG cleanup or
only after some EH optimization?

Thanks,
Richard.

> The following patch, an extract of a larger patch that is still under
> internal review (and pending approval of the EH_ELSE-related changes
> ;-), is minimized into pointlessness so as to just exercise the
> GIMPLE_EH_ELSE lowering issue.
>
> IIRC the problem is NOT immediately triggered in a bootstrap, it
> requires more elaborate EH scenarios to trigger it: without the
> GIMPLE_EH_ELSE lowering patch, a few libadalang units failed to compile
> within delete_unreachable_blocks, using a compiler built with the patch
> below and the patch that introduced EH_ELSE that I posted yesterday.
>
> At first, I suspected GIMPLE_EH_ELSE had bit-rotted, as it doesn't seem
> to get much use, but the problem turned out to be caused by the
> nonsensical CFG resulting from the GIMPLE_EH_ELSE lowering, that breaks
> EH CFG optimizations and end up corrupting the CFG.  IIRC it was
> unsplit_eh that mishandled self edges that arise after a bunch of other
> valid transformations.  I cannot observe the crash with trunk any more,
> but the EH tree is visibly wrong in tree dumps with eh and blocks,
> compiling such a simple testcase as this (with a patched compiler as
> described above):
>
> -- compile with -Ofast -g -c
> with GNAT.Semaphores; use GNAT.Semaphores;
> package T is
>subtype Mutual_Exclusion is Binary_Semaphore
>  (Initially_Available => True,
>   Ceiling => Default_Ceiling);
>Lock : aliased Mutual_Exclusion;
> end T;
>
>
> Self edges end up arising because of the way GIMPLE_EH_ELSE was lowered:
> the exceptional cleanup was lowered as if within the TRY_FINALLY_EXPR
> TRY block, whose exceptions get handled by the exceptional cleanup, but
> both cleanup paths should be lowered as if after the TRY_FINALLY_EXPR,
> so that an enclosing EH region is used should they throw exceptions.
>
> The current lowering made sense for cleanups that couldn't throw: no EH
> edge would come out of the lowered exceptional cleanup block.  However,
> EH_ELSE-using code below breaks that assumption.  Fortunately, the
> assumption was not essential for GIMPLE_EH_ELSE: the lowering code just
> needed this small adjustment to make room for relaxing the requirement
> that the cleanups couldn't throw and restoring CFG EH edges that match
> what we get without the patch below.
>
>
> --- gcc/ada/gcc-interface/trans.c
> +++ gcc/ada/gcc-interface/trans.c
> @@ -6249,7 +6249,7 @@ Exception_Handler_to_gnu_gcc (Node_Id gnat_node)
>if (stmt_list_cannot_alter_control_flow_p (Statements (gnat_node)))
>  add_stmt_with_node (stmt, gnat_node);
>else
> -add_cleanup (stmt, gnat_node);
> +add_cleanup (build2 (EH_ELSE, void_type_node, stmt, stmt), gnat_node);
>
>gnat_poplevel ();
>
> @@ -9066,7 +9081,29 @@ add_cleanup (tree gnu_cleanup, Node_Id g
>  {
>if (Present (gnat_node))
>  set_expr_location_from_node (gnu_cleanup, gnat_node, true);
> -  append_to_statement_list (gnu_cleanup, ¤t_stmt_group->cleanups);
> +  /* An EH_ELSE must be by itself, and that's all we need when we use
> + it.  The assert below makes sure that is so.  Should we ever need
> + more than that, we could combine EH_ELSEs, and copy non-EH_ELSE
> + stmts into both cleanup paths of an EH_ELSE, being careful to
> + make sure the exceptional path doesn't throw.  */
> +  if (TREE_CODE (gnu_cleanup) == EH_ELSE)
> +{
> +  gcc_assert (!current_stmt_group->cleanups);
> +  if (Present (gnat_node))
> +   {
> + set_expr_location_from_node (TREE_OPERAND (gnu_cleanup, 0),
> +  gnat_node, true);
> + set_expr_location_from_

Re: [patch, c++ openmp] Improve diagnostics for unmappable types

2019-07-01 Thread Andrew Stubbs


On 28/06/2019 17:21, Jason Merrill wrote:

+  inform ((decl ? DECL_SOURCE_LOCATION (decl) : UNKNOWN_LOCATION),
+  "incomplete types are not mappable");


It's better to use input_location as a fallback; essentially no 
diagnostics should use UNKNOWN_LOCATION.  And let's print the type with 
%qT.



+    if (notes)
+  result = false;
+    else
+  return false;


Returning early when !notes doesn't seem worth the extra lines of code.


How is this version?

Andrew

Improve OpenMP map diagnostics.

2019-07-01  Andrew Stubbs  

	gcc/cp/
	* cp-tree.h (cp_omp_emit_unmappable_type_notes): New prototype.
	* decl.c (cp_finish_decl): Call cp_omp_emit_unmappable_type_notes.
	* decl2.c (cp_omp_mappable_type): Move contents to ...
	(cp_omp_mappable_type_1):  ... here and add note output.
	(cp_omp_emit_unmappable_type_notes): New function.
	* semantics.c (finish_omp_clauses): Call
	cp_omp_emit_unmappable_type_notes in four places.

	gcc/testsuite/
	* g++.dg/gomp/unmappable-1.C: New file.

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index bf47f67721e..a7b2151e6dd 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6533,6 +6533,7 @@ extern int parm_index   (tree);
 extern tree vtv_start_verification_constructor_init_function (void);
 extern tree vtv_finish_verification_constructor_init_function (tree);
 extern bool cp_omp_mappable_type		(tree);
+extern bool cp_omp_emit_unmappable_type_notes	(tree);
 extern void cp_check_const_attributes (tree);
 
 /* in error.c */
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 5d49535b0d9..74343bc1ec4 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -7433,8 +7433,11 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
 			DECL_ATTRIBUTES (decl));
   complete_type (TREE_TYPE (decl));
   if (!cp_omp_mappable_type (TREE_TYPE (decl)))
-	error ("%q+D in declare target directive does not have mappable type",
-	   decl);
+	{
+	  error ("%q+D in declare target directive does not have mappable"
+		 " type", decl);
+	  cp_omp_emit_unmappable_type_notes (TREE_TYPE (decl));
+	}
   else if (!lookup_attribute ("omp declare target",
   DECL_ATTRIBUTES (decl))
 	   && !lookup_attribute ("omp declare target link",
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 206f04c6320..b415716c7dd 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -1406,32 +1406,68 @@ cp_check_const_attributes (tree attributes)
 }
 }
 
-/* Return true if TYPE is an OpenMP mappable type.  */
-bool
-cp_omp_mappable_type (tree type)
+/* Return true if TYPE is an OpenMP mappable type.
+   If NOTES is non-zero, emit a note message for each problem.  */
+static bool
+cp_omp_mappable_type_1 (tree type, bool notes)
 {
+  bool result = true;
+
   /* Mappable type has to be complete.  */
   if (type == error_mark_node || !COMPLETE_TYPE_P (type))
-return false;
+{
+  if (notes)
+	{
+	  tree decl = TYPE_MAIN_DECL (type);
+	  inform ((decl ? DECL_SOURCE_LOCATION (decl) : input_location),
+		  "incomplete type %qT is not mappable", type);
+	}
+  result = false;
+}
   /* Arrays have mappable type if the elements have mappable type.  */
   while (TREE_CODE (type) == ARRAY_TYPE)
 type = TREE_TYPE (type);
   /* A mappable type cannot contain virtual members.  */
   if (CLASS_TYPE_P (type) && CLASSTYPE_VTABLES (type))
-return false;
+{
+  if (notes)
+	inform (DECL_SOURCE_LOCATION (TYPE_MAIN_DECL (type)),
+		"type %qT with virtual members is not mappable", type);
+  result = false;
+}
   /* All data members must be non-static.  */
   if (CLASS_TYPE_P (type))
 {
   tree field;
   for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
 	if (VAR_P (field))
-	  return false;
+	  {
+	if (notes)
+	  inform (DECL_SOURCE_LOCATION (field),
+		  "static field %qD is not mappable", field);
+	result = false;
+	  }
 	/* All fields must have mappable types.  */
 	else if (TREE_CODE (field) == FIELD_DECL
-		 && !cp_omp_mappable_type (TREE_TYPE (field)))
-	  return false;
+		 && !cp_omp_mappable_type_1 (TREE_TYPE (field), notes))
+	  result = false;
 }
-  return true;
+  return result;
+}
+
+/* Return true if TYPE is an OpenMP mappable type.  */
+bool
+cp_omp_mappable_type (tree type)
+{
+  return cp_omp_mappable_type_1 (type, false);
+}
+
+/* Return true if TYPE is an OpenMP mappable type.
+   Emit an error messages if not.  */
+bool
+cp_omp_emit_unmappable_type_notes (tree type)
+{
+  return cp_omp_mappable_type_1 (type, true);
 }
 
 /* Return the last pushed declaration for the symbol DECL or NULL
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 92c48753d42..8f019580d0f 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -7090,6 +7090,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 "array section does not have mappable type "
 "in %qs clause",
 omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
+		  cp_omp_emit_u

Re: introduce EH_ELSE tree and gimplifier

2019-07-01 Thread Richard Biener

On Fri, Jun 28, 2019 at 5:21 AM Alexandre Oliva  wrote:
>
> On Jun 27, 2019, Richard Biener  wrote:
>
> > On Thu, Jun 27, 2019 at 10:18 AM Alexandre Oliva  wrote:
>
> >> @@ -909,6 +909,13 @@ DEFTREECODE (TRY_CATCH_EXPR, "try_catch_expr", 
> >> tcc_statement, 2)
> >> The second operand is a cleanup expression which is evaluated
> >> on any exit (normal, exception, or jump out) from this expression.  */
> >> DEFTREECODE (TRY_FINALLY_EXPR, "try_finally", tcc_statement, 2)
> >> +
> >> +/* Evaluate either the normal or the exceptional cleanup.  This must
> >> +   only be present as the cleanup expression in a TRY_FINALLY_EXPR.
> >> +   If the TRY_FINALLY_EXPR completes normally, the first operand of
> >> +   EH_ELSE is used as a cleanup, otherwise the second operand is
> >> +   used.  */
> >> +DEFTREECODE (EH_ELSE, "eh_else", tcc_statement, 2)
>
> > It's a bit weird that this is a tcc_statement as opposed to tcc_expression,
>
> Erhm...  What's weird about it?  It is not something that belongs in an
> arbitrary expr, it's a lot more like a top-level statement, like
> try_finally, having side effects but no useful value.

It's weird because it appears in a TRY_FINALLY statement operand.
But I guess the line between statements and expressions in GENERIC
is muddy...

> > also I'd have called it EH_ELSE_EXPR for clarity.
>
> Now *that* would be weird IMHO.  No offense intended, but I'd have
> dropped the _EXPR from TRY_FINALLY_EXPR to make it match the
> "try_finally" string.

OK, let me say for consistency then ...

> What reasons are there for them to deserve the _EXPR suffix?

History I guess?  All 'old' tcc_statement are _EXPR.

> If I rename it to EH_ELSE_EXPR, should GIMPLE_EH_ELSE also get a _EXPR
> suffix?
>
>
> > I also wonder why TRY_FINALLY_EXPR didn't just get a third optional
> > operand?
>
> I considered that possibility, but I didn't expect EH_ELSE to be used
> often enough to bloat every TRY_FINALLY_EXPR with another operand, plus
> a flag or some other means to remove ambiguity between same-cleanup and
> no-else-cleanup.

Fair enough.

Richard.

>
> --
> Alexandre Oliva, freedom fighter  he/him   https://FSFLA.org/blogs/lxo
> Be the change, be Free! FSF Latin America board member
> GNU Toolchain EngineerFree Software Evangelist
> Hay que enGNUrecerse, pero sin perder la terGNUra jamás - Che GNUevara

Re: [PATCH] constrain one character optimization to one character stores (PR 90989)

2019-07-01 Thread Richard Biener

On Fri, Jun 28, 2019 at 2:16 AM Jeff Law  wrote:
>
> On 6/27/19 10:12 AM, Jakub Jelinek wrote:
> > On Thu, Jun 27, 2019 at 09:15:41AM -0600, Jeff Law wrote:
> >> Actually it was trivial to create with store merging.
> >>
> >> char x[20];
> >> foo()
> >> {
> >>   x[0] = 0x41;
> >>   x[1] = 0x42;
> >> }
> >>
> >>   MEM  [(char *)&x] = 16961;
> >>
> >> So clearly we can get this in gimple.  So we need a check of some kind,
> >> either on the LHS or the RHS to ensure that we actually have a single
> >> character store as opposed to something like the above.
> >
> > handle_char_store is only called if:
> >   tree type = TREE_TYPE (lhs);
> >   if (TREE_CODE (type) == ARRAY_TYPE)
> > type = TREE_TYPE (type);
> >   if (TREE_CODE (type) == INTEGER_TYPE
> >   && TYPE_MODE (type) == TYPE_MODE (char_type_node)
> >   && TYPE_PRECISION (type) == TYPE_PRECISION (char_type_node))
> > {
> >   if (! handle_char_store (gsi))
> > return false;
> > }
> > so the above case will not trigger, the only way to call it for multiple
> > character stores is if you have an array.  And so far I've succeeded
> > creating that just with strcpy builtin.
> How I could have missed it so many times is a mystery to me.  I
> literally was looking at this on the phone with Martin today for the nth
> time and my brain just wouldn't parse this code correctly.
>
> We can only get into handle_char_store for an LHS that is a character or
> array of characters.  So the only case we have to worry about would be
> if we have a constant array on the RHS.  Something like this from Jakub:
>
>  MEM  [(char *)&x] = MEM  [(char *)"ab"];
>
> But in this case the RHS is going to be an ARRAY_TYPE.  And thus
> Martin's new code would reject it because it would only do the
> optmiization if the RHS has INTEGER_TYPE.
>
> So I think my concerns are ultimately a non-issue.
>
> FWIW if you're trying to get something like Jakub's assignment, this
> will do it:
>
>   char a[7];
>   int f ()
>   {
> __builtin_strcpy (a, "654321");
>   }
>
> Which results in this being passed to handle_char_store:
>
>   MEM  [(char * {ref-all})&a] = MEM 
> [(char * {ref-all})"654321"];
>
> We can make it more problematical by first doing a strcpy into a so that
> we create an SI structure.  Something like this:
>
>  char a[7];
>   int f ()
>   {
> strcpy (a, "123");
> __builtin_strcpy (a, "654321");
>   }
>
> Compiled with -O2 -fno-tree-dse (the -fno-tree-dse ensures the first
> strcpy is not eliminated).
>
> That should get us into handle_char_store and into the new code from
> Martin's patch that wants to test:
>
>
>
> +  if (cmp > 0
> + && storing_nonzero_p
> + && TREE_CODE (TREE_TYPE (rhs)) == INTEGER_TYPE)
>
> But the RHS in this case has an ARRAY_TYPE and Martin's test would
> reject it.
>
>
> So at this point my concerns are alleviated.
>
> Jakub, Richi, do either of you have concerns?  I've kindof owned the
> review of this patch from Martin, but since y'all have chimed in, I'd
> like to give you another chance to chime in before I ACK.
>
> jeff
>
> ps.  FWIW, the gimple FE seems to really dislike constant strings in MEM
> expressions like we've been working with.  It also seems to go into an
> infinite loop if you've got an extra closing paren on the RHS..  It was
> still useful to poke at things a bit.  One could argue that if we get
> the FE to a suitable state that it could define what is and what is not
> valid gimple.  That would likely be incredibly help for this kind of stuff.

;)  error handling is a bit weak indeed.

I'm testing/committing the following to allow string literals in __MEM
which needed string address literals.  I guess for Fortran we need
to extend this to cover CONST_DECLs which should eventually
appear as _Literal (int *) &5 for example.

2019-07-01  Richard Biener  

c/
* gimple-parser.c (c_parser_gimple_postfix_expression): Handle
_Literal (char *) &"foo" for address literals pointing to
STRING_CSTs.

* gcc.dg/gimplefe-42.c: New testcase.


gimplefe-addr-taken-string-literal
Description: Binary data

[PATCH 2/2] Add zstd support for LTO bytecode compression.

2019-07-01 Thread Martin Liška

Hi.

This is updated version of the zstd patch that should handle all what Joseph
pointed out.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin
>From 5d2006c9c4d481f4083d5a591327ee64847b0bf7 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Thu, 20 Jun 2019 10:08:17 +0200
Subject: [PATCH 2/2] Add zstd support for LTO bytecode compression.

gcc/ChangeLog:

2019-07-01  Martin Liska  

	* Makefile.in: Define ZSTD_LIB.
	* common.opt: Adjust compression level
	to support also zstd levels.
	* config.in: Regenerate.
	* configure: Likewise.
	* configure.ac: Add --with-zstd and --with-zstd-include options
	and detect ZSTD.
	* doc/install.texi: Mention zstd dependency.
	* gcc.c: Print supported LTO compression algorithms.
	* lto-compress.c (lto_normalized_zstd_level): Likewise.
	(lto_compression_zstd): Likewise.
	(lto_uncompression_zstd): Likewise.
	(lto_end_compression): Dispatch in between zlib and zstd.
	(lto_compression_zlib): Mark with ATTRIBUTE_UNUSED.
	(lto_uncompression_zlib): Make it static.
	* lto-compress.h (lto_end_uncompression): Fix GNU coding style.
	* lto-section-in.c (lto_get_section_data): Pass info
	about used compression.
	* lto-streamer-out.c: By default use zstd when possible.
	* timevar.def (TV_IPA_LTO_DECOMPRESS): Rename to decompression
	(TV_IPA_LTO_COMPRESS): Likewise for compression.
---
 gcc/Makefile.in|   4 +-
 gcc/common.opt |   4 +-
 gcc/config.in  |   6 ++
 gcc/configure  | 163 -
 gcc/configure.ac   |  66 +
 gcc/doc/install.texi   |   6 ++
 gcc/gcc.c  |   5 ++
 gcc/lto-compress.c | 141 +--
 gcc/lto-compress.h |   3 +-
 gcc/lto-section-in.c   |   2 +-
 gcc/lto-streamer-out.c |   4 +
 gcc/timevar.def|   4 +-
 12 files changed, 378 insertions(+), 30 deletions(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index d9e0885b96b..597dc01328b 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1065,7 +1065,7 @@ BUILD_LIBDEPS= $(BUILD_LIBIBERTY)
 LIBS = @LIBS@ libcommon.a $(CPPLIB) $(LIBINTL) $(LIBICONV) $(LIBBACKTRACE) \
 	$(LIBIBERTY) $(LIBDECNUMBER) $(HOST_LIBS)
 BACKENDLIBS = $(ISLLIBS) $(GMPLIBS) $(PLUGINLIBS) $(HOST_LIBS) \
-	$(ZLIB)
+	$(ZLIB) $(ZSTD_LIB)
 # Any system libraries needed just for GNAT.
 SYSLIBS = @GNAT_LIBEXC@
 
@@ -1076,6 +1076,8 @@ GNATMAKE = @GNATMAKE@
 # Libs needed (at present) just for jcf-dump.
 LDEXP_LIB = @LDEXP_LIB@
 
+ZSTD_LIB = @ZSTD_LIB@
+
 # Likewise, for use in the tools that must run on this machine
 # even if we are cross-building GCC.
 BUILD_LIBS = $(BUILD_LIBIBERTY)
diff --git a/gcc/common.opt b/gcc/common.opt
index a1544d06824..3b71a36552b 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1888,8 +1888,8 @@ Specify the algorithm to partition symbols and vars at linktime.
 
 ; The initial value of -1 comes from Z_DEFAULT_COMPRESSION in zlib.h.
 flto-compression-level=
-Common Joined RejectNegative UInteger Var(flag_lto_compression_level) Init(-1) IntegerRange(0, 9)
--flto-compression-level=	Use zlib compression level  for IL.
+Common Joined RejectNegative UInteger Var(flag_lto_compression_level) Init(-1) IntegerRange(0, 19)
+-flto-compression-level=	Use zlib/zstd compression level  for IL.
 
 flto-odr-type-merging
 Common Ignore
diff --git a/gcc/config.in b/gcc/config.in
index a718ceaf3da..13fd7959dd7 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -1926,6 +1926,12 @@
 #endif
 
 
+/* Define if you have a working  header file. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_ZSTD_H
+#endif
+
+
 /* Define if isl is in use. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_isl
diff --git a/gcc/configure b/gcc/configure
index 955e9ccc09b..8c9f7742ac7 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -782,6 +782,8 @@ manext
 LIBICONV_DEP
 LTLIBICONV
 LIBICONV
+ZSTD_LIB
+ZSTD_INCLUDE
 DL_LIB
 LDEXP_LIB
 EXTRA_GCC_LIBS
@@ -959,6 +961,9 @@ with_pkgversion
 with_bugurl
 enable_languages
 with_multilib_list
+with_zstd
+with_zstd_include
+with_zstd_lib
 enable_rpath
 with_libiconv_prefix
 enable_sjlj_exceptions
@@ -1783,6 +1788,12 @@ Optional Packages:
   --with-pkgversion=PKG   Use PKG in the version string in place of "GCC"
   --with-bugurl=URL   Direct users to URL to report a bug
   --with-multilib-listselect multilibs (AArch64, SH and x86-64 only)
+  --with-zstd=PATHspecify prefix directory for installed zstd library.
+  Equivalent to --with-zstd-include=PATH/include plus
+  --with-zstd-lib=PATH/lib
+  --with-zstd-include=PATH
+  specify directory for installed zstd include files
+  --with-zstd-lib=PATHspecify directory for the installed zstd library
   --with-gnu-ld   assume the C compiler uses GNU ld default=no
   --with-libiconv-prefix[=DIR]  search for libiconv in DIR/include and DIR/lib
   --without-libiconv-prefix don't search for libiconv in includ

Re: [PATCH] Add .gnu.lto_.lto section.

2019-07-01 Thread Martin Liška

Hi.

Ok, so there's a version with added ChangeLog that survives regression tests.

Ready to be installed?
Thanks,
Martin
>From e6745583dc4b7f5543878c0a25498e818531f73e Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Fri, 21 Jun 2019 12:14:04 +0200
Subject: [PATCH 1/2] Add .gnu.lto_.lto section.

gcc/ChangeLog:

2019-07-01  Martin Liska  

	* lto-section-in.c (lto_get_section_data): Add "lto" section.
	* lto-section-out.c (lto_destroy_simple_output_block): Never
	compress LTO_section_lto section.
	* lto-streamer-out.c (produce_asm): Do not set major_version
	and minor_version.
	(lto_output_toplevel_asms): Likewise.
	(produce_lto_section): New function.
	(lto_output): Call produce_lto_section.
	(lto_write_mode_table): Do not set major_version and
	minor_version.
	(produce_asm_for_decls): Likewise.
	* lto-streamer.h (enum lto_section_type): Add LTO_section_lto
	type.
	(struct lto_header): Remove.
	(struct lto_section): New struct.
	(struct lto_simple_header): Do not inherit from lto_header.
	(struct lto_file_decl_data): Add lto_section_header field.

gcc/lto/ChangeLog:

2019-07-01  Martin Liska  

	* lto-common.c: Read LTO section and verify header.
---
 gcc/lto-section-in.c   |  9 +++--
 gcc/lto-section-out.c  |  2 --
 gcc/lto-streamer-out.c | 40 +---
 gcc/lto-streamer.h | 25 +
 gcc/lto/lto-common.c   | 15 +++
 5 files changed, 64 insertions(+), 27 deletions(-)

diff --git a/gcc/lto-section-in.c b/gcc/lto-section-in.c
index 4cfc0cad4be..4e7d1181f23 100644
--- a/gcc/lto-section-in.c
+++ b/gcc/lto-section-in.c
@@ -52,10 +52,10 @@ const char *lto_section_name[LTO_N_SECTION_TYPES] =
   "icf",
   "offload_table",
   "mode_table",
-  "hsa"
+  "hsa",
+  "lto"
 };
 
-
 /* Hooks so that the ipa passes can call into the lto front end to get
sections.  */
 
@@ -146,7 +146,7 @@ lto_get_section_data (struct lto_file_decl_data *file_data,
   /* WPA->ltrans streams are not compressed with exception of function bodies
  and variable initializers that has been verbatim copied from earlier
  compilations.  */
-  if (!flag_ltrans || decompress)
+  if ((!flag_ltrans || decompress) && section_type != LTO_section_lto)
 {
   /* Create a mapping header containing the underlying data and length,
 	 and prepend this to the uncompression buffer.  The uncompressed data
@@ -167,9 +167,6 @@ lto_get_section_data (struct lto_file_decl_data *file_data,
   data = buffer.data + header_length;
 }
 
-  lto_check_version (((const lto_header *)data)->major_version,
-		 ((const lto_header *)data)->minor_version,
-		 file_data->file_name);
   return data;
 }
 
diff --git a/gcc/lto-section-out.c b/gcc/lto-section-out.c
index c91e58f0465..7ae102164ef 100644
--- a/gcc/lto-section-out.c
+++ b/gcc/lto-section-out.c
@@ -285,8 +285,6 @@ lto_destroy_simple_output_block (struct lto_simple_output_block *ob)
   /* Write the header which says how to decode the pieces of the
  t.  */
   memset (&header, 0, sizeof (struct lto_simple_header));
-  header.major_version = LTO_major_version;
-  header.minor_version = LTO_minor_version;
   header.main_size = ob->main_stream->total_size;
   lto_write_data (&header, sizeof header);
 
diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index dc68429303c..7dee770aa11 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -1974,10 +1974,6 @@ produce_asm (struct output_block *ob, tree fn)
   /* The entire header is stream computed here.  */
   memset (&header, 0, sizeof (struct lto_function_header));
 
-  /* Write the header.  */
-  header.major_version = LTO_major_version;
-  header.minor_version = LTO_minor_version;
-
   if (section_type == LTO_section_function_body)
 header.cfg_size = ob->cfg_stream->total_size;
   header.main_size = ob->main_stream->total_size;
@@ -2270,10 +2266,6 @@ lto_output_toplevel_asms (void)
   /* The entire header stream is computed here.  */
   memset (&header, 0, sizeof (header));
 
-  /* Write the header.  */
-  header.major_version = LTO_major_version;
-  header.minor_version = LTO_minor_version;
-
   header.main_size = ob->main_stream->total_size;
   header.string_size = ob->string_stream->total_size;
   lto_write_data (&header, sizeof header);
@@ -2390,6 +2382,29 @@ prune_offload_funcs (void)
 DECL_PRESERVE_P (fn_decl) = 1;
 }
 
+/* Produce LTO section that contains global information
+   about LTO bytecode.  */
+
+static void
+produce_lto_section ()
+{
+  /* Stream LTO meta section.  */
+  output_block *ob = create_output_block (LTO_section_lto);
+
+  char * section_name = lto_get_section_name (LTO_section_lto, NULL, NULL);
+  lto_begin_section (section_name, false);
+  free (section_name);
+
+  lto_compression compression = ZLIB;
+
+  bool slim_object = flag_generate_lto && !flag_fat_lto_objects;
+  lto_section s
+= { LTO_major_version, LTO_minor_version, slim_object, compression, 0 };
+  lto_write_data (&s, sizeof s);
+  lto_end_se

Re: [PATCH] rs6000: Enable -fvariable-expansion-in-unroller by default

2019-07-01 Thread Richard Biener

On Thu, Jun 27, 2019 at 2:19 PM Bill Schmidt  wrote:
>
> On 6/27/19 6:45 AM, Segher Boessenkool wrote:
> > On Thu, Jun 27, 2019 at 11:33:45AM +0200, Richard Biener wrote:
> >> On Thu, Jun 27, 2019 at 5:23 AM Bill Schmidt  
> >> wrote:
> >>> We've done some experimenting and realized that the subject option almost
> >>> always provide improved performance for Power when the loop unroller is
> >>> enabled.  So this patch turns that flag on by default for us.
> >> I guess it creates more freedom for combine (more single-uses) and register
> >> allocation.  I wonder in which cases this might pessimize things?  I guess
> >> the pre-RA scheduler might make RAs life harder with creating overlapping
> >> life-ranges.
> >>
> >> I guess you didn't actually investigate the nature of the improvements you 
> >> saw?
> > It breaks the length of dependency chains by a factor equal to the unroll
> > factor.  I do not know why this doesn't help a lot everywhere.  It of
> > course raises register pressure, maybe that is just it?
>
> Right, it's all about breaking dependencies to more efficiently exploit
> the microarchitecture.  By default, variable expansion in GCC is quite
> conservative, creating only two reduction streams out of one, so it's
> pretty rare for it to cause spill.  This can be adjusted upwards with
> --param max-variable-expansions-in-unroller=n.  Our experiments show
> that raising n to 4 starts to cause some minor degradations, which are
> almost certainly due to pressure, so the default setting looks appropriate.

But it's probably only an issue for targets which enable pre-RA scheduling
by default?  It might also increase RA compile-time (more allocnos).

Richard.

> >> Do we want to adjust the flags documentation, saying whether this is 
> >> enabled
> >> by default depends on the target (or even list them)?
> > Good idea, thanks.
>
> OK, I'll update the docs and make the change that Segher requested.
> Thanks for the reviews!
>
> Bill
> >
> >
> > Segher
>

Re: [PATCH] Remove another bunch of dead assignment.

2019-07-01 Thread Martin Liška

On 6/27/19 7:24 PM, Richard Sandiford wrote:
> Martin Liška  writes:
>> diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
>> index d50b811d863..1bd251ea8e2 100644
>> --- a/gcc/config/i386/i386-expand.c
>> +++ b/gcc/config/i386/i386-expand.c
>> @@ -19780,8 +19780,7 @@ ix86_expand_sse2_mulvxdi3 (rtx op0, rtx op1, rtx op2)
>>emit_insn (gen_vec_widen_umult_even_v4si (t5, 
>>  gen_lowpart (V4SImode, op1),
>>  gen_lowpart (V4SImode, op2)));
>> -  op0 = expand_binop (mode, add_optab, t5, t4, op0, 1, OPTAB_DIRECT);
>> -
>> +  expand_binop (mode, add_optab, t5, t4, op0, 1, OPTAB_DIRECT);
>>  }
>>else
>>  {
> 
> This means that nothing uses the expanded value.  It looks like the
> call was meant to be force_expand_binop instead, so that the expansion
> always goes to op0.

You are right. The same function is called in the else branch of the condition.

I'm sending updated version of the patch.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

> 
> Richard
> 

>From aecb58d06336baeaa86942424c3314c6020dd754 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Thu, 27 Jun 2019 13:39:24 +0200
Subject: [PATCH] Remove another bunch of dead assignment.

gcc/ChangeLog:

2019-06-27  Martin Liska  

	* lra-eliminations.c (eliminate_regs_in_insn): Remove
	dead assignemts.
	* reg-stack.c (check_asm_stack_operands): Likewise.
	* tree-ssa-structalias.c (create_function_info_for): Likewise.
	* tree-vect-generic.c (expand_vector_operations_1): Likewise.
	* config/i386/i386-expand.c (ix86_expand_sse2_mulvxdi3): Use
	force_expand_binop.

gcc/c-family/ChangeLog:

2019-06-27  Martin Liska  

	* c-common.c (try_to_locate_new_include_insertion_point): Remove
	dead assignemts.

gcc/cp/ChangeLog:

2019-06-27  Martin Liska  

	* call.c (build_new_op_1): Remove
	dead assignemts.
	* typeck.c (cp_build_binary_op): Likewise.

gcc/fortran/ChangeLog:

2019-06-27  Martin Liska  

	* check.c (gfc_check_c_funloc): Remove
	dead assignemts.
	* decl.c (variable_decl): Likewise.
	* resolve.c (resolve_typebound_function): Likewise.
	* simplify.c (gfc_simplify_matmul): Likewise.
	(gfc_simplify_scan): Likewise.
	* trans-array.c (gfc_could_be_alias): Likewise.
	* trans-common.c (add_equivalences): Likewise.
	* trans-expr.c (trans_class_vptr_len_assignment): Likewise.
	(gfc_trans_array_constructor_copy): Likewise.
	(gfc_trans_assignment_1): Likewise.
	* trans-intrinsic.c (conv_intrinsic_atomic_op): Likewise.
	* trans-openmp.c (gfc_omp_finish_clause): Likewise.
	* trans-types.c (gfc_get_array_descriptor_base): Likewise.
	* trans.c (gfc_build_final_call): Likewise.

libcpp/ChangeLog:

2019-06-27  Martin Liska  

	* line-map.c (linemap_get_expansion_filename): Remove
	dead assignemts.
	* mkdeps.c (make_write): Likewise.
---
 gcc/c-family/c-common.c   |  4 ++--
 gcc/config/i386/i386-expand.c |  3 +--
 gcc/cp/call.c |  2 +-
 gcc/cp/typeck.c   |  1 -
 gcc/fortran/check.c   | 18 +++---
 gcc/fortran/decl.c|  1 -
 gcc/fortran/resolve.c |  1 -
 gcc/fortran/simplify.c| 27 ---
 gcc/fortran/trans-array.c |  2 --
 gcc/fortran/trans-common.c|  6 ++
 gcc/fortran/trans-expr.c  |  6 --
 gcc/fortran/trans-intrinsic.c |  1 -
 gcc/fortran/trans-openmp.c|  1 -
 gcc/fortran/trans-types.c | 10 +-
 gcc/fortran/trans.c   |  3 ---
 gcc/lra-eliminations.c|  2 +-
 gcc/reg-stack.c   |  1 -
 gcc/tree-ssa-structalias.c|  1 -
 gcc/tree-vect-generic.c   |  2 --
 libcpp/line-map.c |  3 +--
 libcpp/mkdeps.c   |  2 +-
 21 files changed, 33 insertions(+), 64 deletions(-)

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index da4aadbc590..cb92710f2bc 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -8601,8 +8601,8 @@ try_to_locate_new_include_insertion_point (const char *file, location_t loc)
 
   /*  Get ordinary map containing LOC (or its expansion).  */
   const line_map_ordinary *ord_map_for_loc = NULL;
-  loc = linemap_resolve_location (line_table, loc, LRK_MACRO_EXPANSION_POINT,
-  &ord_map_for_loc);
+  linemap_resolve_location (line_table, loc, LRK_MACRO_EXPANSION_POINT,
+			&ord_map_for_loc);
   gcc_assert (ord_map_for_loc);
 
   for (unsigned int i = 0; i < LINEMAPS_ORDINARY_USED (line_table); i++)
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 85d74a28636..5d3b74a159f 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -19779,8 +19779,7 @@ ix86_expand_sse2_mulvxdi3 (rtx op0, rtx op1, rtx op2)
   emit_insn (gen_vec_widen_umult_even_v4si (t5, 
 	gen_lowpart (V4SImode, op1),
 	gen_lowpart (V4SImode, op2)));
-  op0 = expand_binop (mode, add_optab, t5, t4, op0, 1, OPTAB_DIRECT);
-
+  forc

[range-ops] patch 05/04: bonus round!

2019-07-01 Thread Aldy Hernandez

This is completely unrelated to range-ops itself, but may yield better 
results in value_range intersections.  It's just something I found while 
working on VRP, and have been dragging around on our branch.


If we know the intersection of two ranges is the empty set, there is no 
need to conservatively add anything to the result.


Tested on x86-64 Linux with --enable-languages=all.

Aldy
commit 4f9aa7bd1066267eee92f622ff29d78534158e20
Author: Aldy Hernandez 
Date:   Fri Jun 28 11:34:19 2019 +0200

Do not try to further refine a VR_UNDEFINED result when intersecting
value_ranges.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 01fb97cedb2..b0d78ee6871 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2019-07-01  Aldy Hernandez  
+
+	* tree-vrp.c (intersect_ranges): If we know the intersection is
+	empty, there is no need to conservatively add anything else to
+	the set.
+
 2019-06-26  Jeff Law  
 
 	PR tree-optimization/90883
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index dc7f825efc8..594ee9adc17 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -5977,6 +5977,11 @@ intersect_ranges (enum value_range_kind *vr0type,
 	gcc_unreachable ();
 }
 
+  /* If we know the intersection is empty, there's no need to
+ conservatively add anything else to the set.  */
+  if (*vr0type == VR_UNDEFINED)
+return;
+
   /* As a fallback simply use { *VRTYPE, *VR0MIN, *VR0MAX } as
  result for the intersection.  That's always a conservative
  correct estimate unless VR1 is a constant singleton range

[range-ops] patch 04/04: range-ops proper (PLACEHOLDER)

2019-07-01 Thread Aldy Hernandez

We still need to do some cosmetic changes to range-ops itself, but I 
realize it may be easier to review the first 3 patches, if one has an 
idea of how it all fits together.  As I said before, the first 3 patches 
can go in as is; this is just for the curious or impatient.


As such, I am not submitting this patch for contribution yet.  I'll 
contribute later this week, after I cleanup some #if 0 code and 
reminders we had written throughout the code.


For example, there are some optimizations that I'll hold off on 
contributing, because they yield different (read "better") results than 
mainline.  Since we want to make sure there are absolutely no 
differences from extract_range_from* for now, I believe it is prudent to 
delay these optimizations until they can be contributed separately when 
the dust settles.


Be that as it may, we can still discuss the general layout now.

The main idea is that all calls to tree-vrp's extract_range_from_
binary/unary_expr in vr-values.s are replaced with range_fold_*expr.
These new functions will then call both range-ops and the old
extract_range_from_binary/unary, and assert that the results are the
same.

The code asserting and comparing the results for range-ops and VRP is 
all in assert_compare_value_ranges.  I have taken care of starting from 
scratch (no exceptions) and adding only what is strictly necessary, 
where I have proven why two results mismatch but yet range-ops is 
correct.  For example, EXACT, MIN, MAX_EXPR, etc on VRP give up a lot 
sooner than range-ops.


One annoying set of differences is the order in which extract_range* 
extracts anti-ranges into pairs with ranges_from_anti_range, versus the 
way we pick at sub-ranges.  The ordering matters, and VRP will 
frequently give up (into VARYING) because the intermediate 
representation may be unrepresentable.  The range-ops iteration does 
slightly better, and there are sometimes differences when doing 
operations on pairs of anti-ranges.  If VRP yields a VARYING result, but 
range-ops does better, we ignore the difference.  There is a huge 
comment section where I explain this in depth, with examples.


Other than the few accounted for exceptions in 
assert_compare_value_ranges, there should be no differences, and we 
abort() very loudly if there is.


Now...

The VRP entry points into the ranger are:
range_ops_fold_unary_expr
range_ops_fold_binary_expr

These functions are organized into 3 distinctive sections:

a) Perform any special behavior users of extract_range* expect.
   This is generally the special code at the top of
   extract_range_{binary,unary}*.

b) Symbolic handling.

   Pleasantly we have found out that the only places where we
   actually care about symbolics in VRP are:

1. PLUS_EXPR, MINUS_EXPR
2. POINTER_PLUS_EXPR
3. NEGATE_EXPR
4. BIT_NOT_EXPR
5. CONVERT_EXPR_CODE_P

Interestingly, NEGATE and BIT_NOT care about symbolics
because they may degrade into PLUS/MINUS.  Since
POINTER_PLUS_EXPR is just a special purpose PLUS, the
argument can be made that symbolics only matter for
PLUS/MINUS and CONVERT_EXPR.

c) Calling into range-ops fold.

A few random notes on the rest:

1. range-op.cc is the main engine.

2. range.cc is our irange handling code.  As discussed before, irange
   and value_range_base share the same API, and for trunk we propose
   a simple drop-in replacement of:

#ifndef USE_IRANGE
class value_range_storage;
typedef value_range_base irange;
typedef value_range_storage irange_storage;
#endif

   I am attaching everything as it fits in our branch, but I believe
   what will ultimately happen is that we keep the #if USE_IRANGE bits
   in our branch, and everything else in trunk.

   NOTE:

   I would like to keep our nomenclature for irange even in trunk, as it
   makes it easier to keep track of things in branch vs. trunk.
   Besides, there should be no references to irange in tree-vrp, only in
   range-op.c and range.h/cc, and eventually gori/ranger/etc.

   This will allow us to experiment with either irange or value_range
   seamlessly.

3. I have provided a compile time flag for use during stage1 debugging. 
The flag -franges= chooses which range folding mechanism to use (VRP, 
range-ops, or both while checking that they agree).  The default is 
-franges=checking.


Again, this is entire patchset is a place holder, so everyone can see 
how it all fits together.  I'll be cleaning this up, and only posting 
the #if !USE_RANGE bits.  Be that as it may, this patch has been tested 
on x86-64 Linux with --enable-languages=all :).


Aldy
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index d9e0885b96b..a3f3a201787 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1450,6 +1450,8 @@ OBJS = \
 	print-tree.o \
 	profile.o

[PATCH 2/2] gdbhooks.py: extend vec support in pretty printers

2019-07-01 Thread Vladislav Ivanishin

This change is threefold:
- enable pretty printing of vec<>, not just vec<>*
- generalize 'vec<(\S+), (\S+), (\S+)>' regex, which is limiting
- extend to work for vl_ptr layout (only vl_embed was supported)

The motivating example for all three is a vector of vectors in
tree-ssa-uninit.c:

typedef vec pred_chain;
typedef vec pred_chain_union;

(This predictably expands to
  vec, va_heap, vl_ptr>
in gdb, if we use strip_typedefs() on it -- otherwise it's just
pred_chain_union.  The first patch of the series takes care of that.)

This example features "direct" vecs themselves rather than pointers to
vecs, which is not a rare case; in fact, a quick grep showed that the
number of declarations of direct vecs is about the same as that of
pointers to vec.

The GDB's Python API seems to do dereferencing automatically (i.e. it
detects whether to use '->', or '.' for field access), allowing to use
the same code to access fields from a struct directly as well as through
a pointer.

This makes it possible to reuse VecPrinter (meant for vec<> *) for plain
vec<>, but perhaps it would be beneficial to handle all pointers in a
generic way e.g. print the address and then print the dereferenced
object, letting the appropriate pretty printer to pick it up and do its
thing, if one is provided for the type.

The output I have for direct vecs now:

(gdb) p *bb->succs
$94 = @0x776569b0 = { 5)>,  3)>}

Compare that to

(gdb) p bb->succs
$70 = 0x776569b0 = { 5)>,  3)>}

Hope the choice of the '@' sign to represent the address of an object
taken by the pretty printer is not confusing?  (GDB itself uses this
notation for references, like so: (edge_def *&) @0x77656c38.)

OK?

gcc/

* gdbhooks.py: Adjust toplevel comment.  Add a new example.
(VecPrinter.__init__): Distinguish between pointer and direct types.
(VecPrinter.to_string): Print pointer value differently based on that
distinction.
(VecPrinter.children): Adjust accordingly, and account for vl_ptr
layout, add a comment about that.
(build_pretty_printer): Generalize regex for the regex based subprinter
for vec<>*.  Add a new regex subprinter for vec<>.
---
 gcc/gdbhooks.py | 47 +++
 1 file changed, 35 insertions(+), 12 deletions(-)

diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py
index b036704b86a..a7e665cadb9 100644
--- a/gcc/gdbhooks.py
+++ b/gcc/gdbhooks.py
@@ -128,12 +128,18 @@ and here's a length 1 vec:
   (gdb) p bb->succs
   $19 = 0x70428bb8 = { EXIT)>}
 
-You cannot yet use array notation [] to access the elements within the
-vector: attempting to do so instead gives you the vec itself (for vec[0]),
-or a (probably) invalid cast to vec<> for the memory after the vec (for
-vec[1] onwards).
+Direct instances of vec<> are also supported (their addresses are
+prefixed with the '@' sign).
 
-Instead (for now) you must access m_vecdata:
+In case you have a vec<> pointer, to use array notation [] to access the
+elements within the vector you must first dereference the pointer, just
+as you do in C:
+  (gdb) p (*bb->succs)[0]
+  $50 = (edge_def *&) @0x77656c38:  10)>
+  (gdb) p bb->succs[0][0]
+  $51 = (edge_def *&) @0x77656c38:  10)>
+
+Another option is to access m_vecdata:
   (gdb) p bb->preds->m_vecdata[0]
   $20 =  5)>
   (gdb) p bb->preds->m_vecdata[1]
@@ -441,6 +447,10 @@ class VecPrinter:
 #-ex "up" -ex "p bb->preds"
 def __init__(self, gdbval):
 self.gdbval = gdbval
+if self.gdbval.type.code == gdb.TYPE_CODE_PTR:
+self.ptr = intptr(self.gdbval)
+else:
+self.ptr = None
 
 def display_hint (self):
 return 'array'
@@ -448,14 +458,25 @@ class VecPrinter:
 def to_string (self):
 # A trivial implementation; prettyprinting the contents is done
 # by gdb calling the "children" method below.
-return '0x%x' % intptr(self.gdbval)
+if self.ptr:
+return '0x%x' % self.ptr
+else:
+return '@0x%x' % intptr(self.gdbval.address)
 
 def children (self):
-if intptr(self.gdbval) == 0:
+if self.ptr == 0:
 return
-m_vecpfx = self.gdbval['m_vecpfx']
+m_vec = self.gdbval
+# There are 2 kinds of vec layouts: vl_embed, and vl_ptr.  The latter
+# is implemented with the former stored in the m_vec field.  Sadly,
+# template_argument(2) won't work: `gdb.error: No type named vl_embed`.
+try:
+m_vec = m_vec['m_vec']
+except:
+pass
+m_vecpfx = m_vec['m_vecpfx']
 m_num = m_vecpfx['m_num']
-m_vecdata = self.gdbval['m_vecdata']
+m_vecdata = m_vec['m_vecdata']
 for i in range(m_num):
 yield ('[%d]' % i, m_vecdata[i])
 
@@ -572,9 +593,11 @@ def build_pretty_printer():
 pp.add_printer_for_types(['rtx_def *'], 'rtx_def', RtxPrinter)
 pp.add_printe

[PATCH 1/2] gdbhooks.py: use strip_typedefs to simplify matching type names

2019-07-01 Thread Vladislav Ivanishin

Hi!

GDB's Python API provides strip_typedefs method that can be instrumental
for writing DRY code.  Using it at least partially obviates the need for
the add_printer_for_types method we have in gdbhooks.py (it takes a list
of typenames to match and is usually used to deal with typedefs).

I think, the code can be simplified further (there's still
['tree_node *', 'const tree_node *'], which is a little awkward) if we
deal with pointer types in a uniform fashion (I'll touch on this in the
description of the second patch).  But that can be improved separately.

The gimple_statement_cond, etc. part has been dysfunctional for a while
(namely since gimple-classes-v2-option-3 branch was merged).  I updated
it to use the new classes' names.  That seems to work (though it doesn't
output much info anyway: pretty
vs. 
   (gphi *) 0x778c0200
in the raw version).

I changed the name passed to pp.add_printer_for_types() for scalar_mode
and friends -- so they all share the same name now -- but I don't
believe the name is used in any context where it would matter.

This is just a clean up of gdbhooks.py.  OK to commit?

gcc/

* gdbhooks.py (GdbPrettyPrinters.__call__): canonicalize any variants of
a type name using strip_typdefs.
(build_pretty_printer): Use canonical names for types and otherwise
simplify the code.  Clean up gimple_sth, gimple_statement_sth cruft.
---
 gcc/gdbhooks.py | 48 +++-
 1 file changed, 19 insertions(+), 29 deletions(-)

diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py
index 26a09749aa3..b036704b86a 100644
--- a/gcc/gdbhooks.py
+++ b/gcc/gdbhooks.py
@@ -528,7 +528,7 @@ class GdbPrettyPrinters(gdb.printing.PrettyPrinter):
 self.subprinters.append(GdbSubprinterRegex(regex, name, class_))
 
 def __call__(self, gdbval):
-type_ = gdbval.type.unqualified()
+type_ = gdbval.type.unqualified().strip_typedefs()
 str_type = str(type_)
 for printer in self.subprinters:
 if printer.enabled and printer.handles_type(str_type):
@@ -540,7 +540,7 @@ class GdbPrettyPrinters(gdb.printing.PrettyPrinter):
 
 def build_pretty_printer():
 pp = GdbPrettyPrinters('gcc')
-pp.add_printer_for_types(['tree', 'const_tree'],
+pp.add_printer_for_types(['tree_node *', 'const tree_node *'],
  'tree', TreePrinter)
 pp.add_printer_for_types(['cgraph_node *', 'varpool_node *', 'symtab_node *'],
  'symtab_node', SymtabNodePrinter)
@@ -548,32 +548,25 @@ def build_pretty_printer():
  'cgraph_edge', CgraphEdgePrinter)
 pp.add_printer_for_types(['ipa_ref *'],
  'ipa_ref', IpaReferencePrinter)
-pp.add_printer_for_types(['dw_die_ref'],
+pp.add_printer_for_types(['die_struct *'],
  'dw_die_ref', DWDieRefPrinter)
 pp.add_printer_for_types(['gimple', 'gimple *',
 
   # Keep this in the same order as gimple.def:
-  'gimple_cond', 'const_gimple_cond',
-  'gimple_statement_cond *',
-  'gimple_debug', 'const_gimple_debug',
-  'gimple_statement_debug *',
-  'gimple_label', 'const_gimple_label',
-  'gimple_statement_label *',
-  'gimple_switch', 'const_gimple_switch',
-  'gimple_statement_switch *',
-  'gimple_assign', 'const_gimple_assign',
-  'gimple_statement_assign *',
-  'gimple_bind', 'const_gimple_bind',
-  'gimple_statement_bind *',
-  'gimple_phi', 'const_gimple_phi',
-  'gimple_statement_phi *'],
+  'gcond', 'gcond *',
+  'gdebug', 'gdebug *',
+  'glabel', 'glabel *',
+  'gswitch', 'gswitch *',
+  'gassign', 'gassign *',
+  'gbind', 'gbind *',
+  'gphi', 'gphi *'],
 
  'gimple',
  GimplePrinter)
-pp.add_printer_for_types(['basic_block', 'basic_block_def *'],
+pp.add_printer_for_types(['basic_block_def *'],
  'basic_block',
  BasicBlockPrinter)
-pp.add_printer_for_types(['edge', 'edge_def *'],
+pp.add_printer_for_types(['edge_def *'],
  'edge',
  CfgEdgePrinter)
 pp.add_printer_for_types(['rtx_def *'], 'rtx_def', RtxPrinter)
@@ -585,18 +578,15 @@ def build_pretty_printer():
 
 pp.add_printer_for_regex(r'opt_mode<(\S+)>',

Re: [PATCH] Automatics in equivalence statements

2019-07-01 Thread Mark Eggleston



On 25/06/2019 14:17, Mark Eggleston wrote:


On 25/06/2019 00:17, Jeff Law wrote:

On 6/24/19 2:19 AM, Bernhard Reutner-Fischer wrote:

On Fri, 21 Jun 2019 07:10:11 -0700
Steve Kargl  wrote:


On Fri, Jun 21, 2019 at 02:31:51PM +0100, Mark Eggleston wrote:

Currently variables with the AUTOMATIC attribute can not appear in an
EQUIVALENCE statement. However its counterpart, STATIC, can be 
used in

an EQUIVALENCE statement.

Where there is a clear conflict in the attributes of variables in an
EQUIVALENCE statement an error message will be issued as is currently
the case.

If there is no conflict e.g. a variable with a AUTOMATIC attribute 
and a

variable(s) without attributes all variables in the EQUIVALENCE will
become AUTOMATIC.

Note: most of this patch was written by Jeff Law 

Please review.

ChangeLogs:

gcc/fortran

      Jeff Law  
      Mark Eggleston  

      * gfortran.h: Add check_conflict declaration.

This is wrong.  By convention a routine that is not static
has the gfc_ prefix.

Updated the code to use gfc_check_conflict instead.



Furthermore doesn't this export indicate that you're committing a
layering violation somehow?

Possibly.  I'm the original author, but my experience in our fortran
front-end is minimal.  I fully expected this patch to need some 
tweaking.


We certainly don't want to recreate all the checking that's done in
check_conflict.  We just need to defer it to a later point --
find_equivalence seemed like a good point since we've got the full
equivalence list handy and can accumulate the attributes across the
entire list, then check for conflicts.

If there's a concrete place where you think we should be doing this, I'm
all ears.


Any suggestions will be appreciate.
      * symbol.c (check_conflict): Remove automatic in equivalence 
conflict

      check.
      * symbol.c (save_symbol): Add check for in equivalence to 
stop the

      the save attribute being added.
      * trans-common.c (build_equiv_decl): Add is_auto parameter and
      add !is_auto to condition where TREE_STATIC (decl) is set.
      * trans-common.c (build_equiv_decl): Add local variable is_auto,
      set it true if an atomatic attribute is encountered in the 
variable

atomatic? I read atomic but you mean automatic.

Yes.

      list.  Call build_equiv_decl with is_auto as an additional 
parameter.

      flag_dec_format_defaults is enabled.
      * trans-common.c (accumulate_equivalence_attributes) : New 
subroutine.
      * trans-common.c (find_equivalence) : New local variable 
dummy_symbol,
      accumulated equivalence attributes from each symbol then 
check for

      conflicts.
I'm just curious why you don't gfc_copy_attr for the most part of 
accumulate_equivalence_attributes?

thanks,

Simply didn't know about it.  It could probably significantly simplify
the accumulation of attributes step.
Using gfc_copy_attr causes a great many "Duplicate DIMENSION attribute 
specified at (1)" errors. This is because there is a great deal of 
checking done instead of simply keeping track of the attributes used 
which is all that is required for determining whether there is a 
conflict in the equivalence statement.


Also, the final section of accumulate_equivalence_attributes involving 
SAVE, INTENT and ACCESS look suspect to me. I'll check and update the 
patch if necessary.


No need to check intent as there is already a conflict with DUMMY and 
INTENT can only be present for dummy variables.


Please find attached an updated patch. Change logs:

gcc/fortran

    Jeff Law  
    Mark Eggleston  

    * gfortran.h: Add gfc_check_conflict declaration.
    * symbol.c (check_conflict): Rename cfg_check_conflict and remove
    static.
    * symbol.c (cfg_check_conflict): Remove automatic in equivalence
    conflict check.
    * symbol.c (save_symbol): Add check for in equivalence to stop the
    the save attribute being added.
    * trans-common.c (build_equiv_decl): Add is_auto parameter and
    add !is_auto to condition where TREE_STATIC (decl) is set.
    * trans-common.c (build_equiv_decl): Add local variable is_auto,
    set it true if an atomatic attribute is encountered in the variable
    list.  Call build_equiv_decl with is_auto as an additional parameter.
    flag_dec_format_defaults is enabled.
    * trans-common.c (accumulate_equivalence_attributes) : New subroutine.
    * trans-common.c (find_equivalence) : New local variable dummy_symbol,
    accumulated equivalence attributes from each symbol then check for
    conflicts.

gcc/testsuite

    Mark Eggleston 

    * gfortran.dg/auto_in_equiv_1.f90: New test.
    * gfortran.dg/auto_in_equiv_2.f90: New test.
    * gfortran.dg/auto_in_equiv_3.f90: New test.

If the updated patch is acceptable, please can someone with the 
privileges commit the patch.


Mark




Jeff




--
https://www.codethink.co.uk/privacy.html

>From 321c7c84f9578e99ac0a1fa5f3ed1fd78b328d1f Mon Sep 17 00:00:00 2001
From: Mark Eggleston 
Date: Tue, 11 Sep 2018 12:50:11 +0100
S

[PATCH] Fix use-after-scope in host-mingw32.c (PR target/88056).

2019-07-01 Thread Martin Liška

Hi.

The patch is about use-after-scope. However, I can't verify
it survives bootstrap on the affected target.

Ready for the trunk?
Thanks,
Martin

gcc/ChangeLog:

2019-07-01  Martin Liska  

PR target/88056
* config/i386/host-mingw32.c (mingw32_gt_pch_use_address):
Define local_object_name in outer scope in order to handle
use-after-scope issue.
---
 gcc/config/i386/host-mingw32.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


diff --git a/gcc/config/i386/host-mingw32.c b/gcc/config/i386/host-mingw32.c
index f2b56d71c5b..3254d028313 100644
--- a/gcc/config/i386/host-mingw32.c
+++ b/gcc/config/i386/host-mingw32.c
@@ -157,10 +157,10 @@ mingw32_gt_pch_use_address (void *addr, size_t size, int fd,
   /* Determine the version of Windows we are running on and use a
  uniquely-named local object if running > 4.  */
   GetVersionEx (&version_info);
+
+  char local_object_name[sizeof (OBJECT_NAME_FMT) + sizeof (DWORD) * 2];
   if (version_info.dwMajorVersion > 4)
 {
-  char local_object_name [sizeof (OBJECT_NAME_FMT)
-			  + sizeof (DWORD) * 2];
   snprintf (local_object_name, sizeof (local_object_name),
 		OBJECT_NAME_FMT "%lx", GetCurrentProcessId());
   object_name = local_object_name;

Fix verifier ICE on CLOBBER of COMPONENT_REF

2019-07-01 Thread Jan Hubicka

Hi,
the testcase makes inliner to substitute RESULT_DECL by COMPONENT_REF
inside CLOBBER statement. This is not allowed and leads to ICE in
verifier (while I think we should fix code to support this).
This patch simply makes inliner to watch for this case and do not remap
the statement at all.

The testcase works for 32bit only which is bit unfortunate. For 64bit
build NRV does not happen, i did not look into why.

OK?
Honza

* tree-inline.c (remap_gimple_stmt): Do not subtitute handled components
to clobber of return value.
* g++.dg/lto/pr90990_0.C: New testcase.
Index: tree-inline.c
===
--- tree-inline.c   (revision 272846)
+++ tree-inline.c   (working copy)
@@ -1757,6 +1757,18 @@ remap_gimple_stmt (gimple *stmt, copy_bo
return NULL;
}
}
+ 
+  /* We do not allow CLOBBERs of handled components.  In case
+returned value is stored via such handled component, remove
+the clobber so stmt verifier is happy.  */
+  if (gimple_clobber_p (stmt)
+ && TREE_CODE (gimple_assign_lhs (stmt)) == RESULT_DECL)
+   {
+ tree remapped = remap_decl (gimple_assign_lhs (stmt), id);
+ if (!DECL_P (remapped)
+ && TREE_CODE (remapped) != MEM_REF)
+   return NULL;
+   }
 
   if (gimple_debug_bind_p (stmt))
{
Index: testsuite/g++.dg/lto/pr90990_0.C
===
--- testsuite/g++.dg/lto/pr90990_0.C(nonexistent)
+++ testsuite/g++.dg/lto/pr90990_0.C(working copy)
@@ -0,0 +1,31 @@
+// { dg-lto-do link }
+/* { dg-extra-ld-options {  -r -nostdlib } } */
+class A {
+public:
+  float m_floats;
+  A() {}
+};
+class B {
+public:
+  A operator[](int);
+};
+class C {
+  B m_basis;
+
+public:
+  A operator()(A) {
+m_basis[1] = m_basis[2];
+A a;
+return a;
+  }
+};
+class D {
+public:
+  C m_fn1();
+};
+class F {
+  A m_pivotInB;
+  F(D &, const A &);
+};
+F::F(D &p1, const A &p2) : m_pivotInB(p1.m_fn1()(p2)) {}
+

[range-ops] patch 03/04: abstract out a few things from extract_range_from*

2019-07-01 Thread Aldy Hernandez

I hate duplicating code, and the symbolic handling of pointer_plus_expr, 
plus_minus_expr, and the code dealing with overflows, is ripe for 
sharing with the ranger.  I've abstracted these into their own 
functions.  No changes in functionality is expected.


You will notice that I moved value_range_kind into coretypes.h, as I'd 
like to use it for value_range, wide-int-range.h, and the ranger header 
files.  For value_range, all options are valid.  For wide-int-range.h 
(adjust_range_for_overflow), everything but VR_UNDEFINED is valid.  For 
the ranger, all values are valid but only for the constructors as our 
internal implementation has no need for this field.  The ranger accesses 
things with the higher-level num_pairs(), lower/upper_bound(), etc.


Tested on x86-64 Linux with --enable-languages=all.

Aldy
commit e58c0f8b8a27fa91fe22f1f12d19f3d37cc729ae
Author: Aldy Hernandez 
Date:   Fri Jun 28 10:41:05 2019 +0200

Move set_value_range_with_overflow into wide-int-range.cc as
adjust_range_for_overflow.

Move the POINTER_PLUS_EXPR symbolics code into
handle_symbolics_in_pointer_plus_expr.

Move the PLUS/MINUS_EXPR handling code into extract_range_from_plus_expr.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 866200d9594..40a51fc67dc 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,18 @@
+2019-07-01  Aldy Hernandez  
+
+	* tree-vrp.h (value_range_kind): Move from here...
+	* coretypes.h (value_range_kind): ...to here.
+	* tree-vrp.c (extract_range_from_binary_expr): Abstract pointer
+	plus handling of symbolics to
+	handle_symbolics_in_pointer_plus_expr().
+	Abstract plus/minus handling code to
+	extract_range_from_plus_expr.
+	Move set_value_range_with_overflow to wide-int-range.cc.
+	(handle_symbolics_in_pointer_plus_expr): New.
+	(extract_range_from_plus_expr): New.
+	* wide-int-range.cc (adjust_range_for_overflow): New.
+	* wide-int-range.h (adjust_range_for_overflow): New.
+
 2019-07-01  Aldy Hernandez  
 
 	* tree-vrp.c (value_range_base::set_and_canonicalize): Rename to
diff --git a/gcc/coretypes.h b/gcc/coretypes.h
index 2f6b8599d7c..4bb1282b1b9 100644
--- a/gcc/coretypes.h
+++ b/gcc/coretypes.h
@@ -202,6 +202,24 @@ enum profile_update {
   PROFILE_UPDATE_PREFER_ATOMIC
 };
 
+/* Types of ranges.
+
+   This is still prefixed with VR_*, even though it is more general
+   purpose, to avoid having to replace everything across the compiler.
+   Perhaps we should change it later.  */
+enum value_range_kind {
+  /* Empty range.  */
+  VR_UNDEFINED,
+  /* Range spans the entire domain.  */
+  VR_VARYING,
+  /* Range is [MIN, MAX].  */
+  VR_RANGE,
+  /* Range is ~[MIN, MAX].  */
+  VR_ANTI_RANGE,
+  /* Range is a nice guy.  */
+  VR_LAST
+};
+
 /* Types of unwind/exception handling info that can be generated.  */
 
 enum unwind_info_type
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index f78517b5b3b..efd51735022 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -1470,112 +1470,196 @@ combine_bound (enum tree_code code, wide_int &wi, wi::overflow_type &ovf,
 wi = wi::shwi (0, prec);
 }
 
-/* Given a range in [WMIN, WMAX], adjust it for possible overflow and
-   put the result in VR.
+/* Fold two value range's of a POINTER_PLUS_EXPR into VR.  Return TRUE
+   if successful.  */
 
-   TYPE is the type of the range.
+static bool
+handle_symbolics_in_pointer_plus_expr (value_range_base *vr,
+   enum tree_code code,
+   tree expr_type,
+   const value_range_base *vr0,
+   const value_range_base *vr1)
+{
+  if (POINTER_TYPE_P (expr_type) && code == POINTER_PLUS_EXPR)
+{
+  /* For pointer types, we are really only interested in asserting
+	 whether the expression evaluates to non-NULL.
+	 With -fno-delete-null-pointer-checks we need to be more
+	 conservative.  As some object might reside at address 0,
+	 then some offset could be added to it and the same offset
+	 subtracted again and the result would be NULL.
+	 E.g.
+	 static int a[12]; where &a[0] is NULL and
+	 ptr = &a[6];
+	 ptr -= 6;
+	 ptr will be NULL here, even when there is POINTER_PLUS_EXPR
+	 where the first range doesn't include zero and the second one
+	 doesn't either.  As the second operand is sizetype (unsigned),
+	 consider all ranges where the MSB could be set as possible
+	 subtractions where the result might be NULL.  */
+  if ((!range_includes_zero_p (vr0)
+	   || !range_includes_zero_p (vr1))
+	  && !TYPE_OVERFLOW_WRAPS (expr_type)
+	  && (flag_delete_null_pointer_checks
+	  || (range_int_cst_p (vr1)
+		  && !tree_int_cst_sign_bit (vr1->max ()
+	vr->set_nonzero (expr_type);
+  else if (vr0->zero_p () && vr1->zero_p ())
+	vr->set_zero (expr_type);
+  else
+	vr->set_varying (expr_type);
+  return true;
+}
+  return false;
+}
 
-   MIN_OVF and MAX_OVF indicate what type of overflow, if any,
-   occurred while originally calculating WMIN or WMAX.  -1 indicates
-   underflow.  +1 indicates overflow.  0 indicates neither.  */
+/* Ex

Re: [PATCH] Fix libstdc++ install-pdf support.

2019-07-01 Thread Jonathan Wakely


On 01/07/19 01:21 -0700, Jim Wilson wrote:

Try to run "make install-pdf" on a system with dblatex and pdflatex installed
but not doxygen gives an error.
   run_doxygen error: Could not find Doxygen 1.7.0 in path.
Looking at the build log, I see that this is also using xsltproc and xmllint.
Installing doxygen and running again, I get lots of ignored errors for a
missing dot program.  Looking at the docs I see the collaboration diagrams
are missing.  Installing dot and rebuilding and now I have the collaboration
diagrams.  I don't see any evidence that the pdf docs are using the
stylesheets.  Otherwise, they need everything else that the xml docs need.

Regenerating configure I got an unexpected change, but that is an issue with
a patch a few days ago that added a new thread file, and regenerated the
libgcc configure to use it, but failed to notice that the libstdc++ configure
should have been regenerated too.

Tested with x86_64-linux builds with various packages installed or removed,
and looking at the final docs to make sure they look right.

OK?


OK, thanks.

Fix g++.dg/lto/alias-1 and 2 for non-plugin builds

2019-07-01 Thread Jan Hubicka

Hi,
alias-1 and alias-2 fails with -fno-use-inliner-plugin because the
inlinig it relies on does not happen. This is because w/o resolution
info we can not optimize away the offline copy and we know that main
is executed once and optimize for size.

Bootsrapped/regtested x86_64-linux, comitted.

Honza

Index: ChangeLog
===
--- ChangeLog   (revision 272846)
+++ ChangeLog   (working copy)
@@ -1,3 +1,11 @@
+2019-07-01  Jan Hubicka  
+
+   PR lto/91028
+   PR lto/90720
+   * g++.dg/lto/alias-1_0.C: Add loop to make inlining happen with
+   -fno-use-linker-plugin
+   * g++.dg/lto/alias-2_0.C: Likewise.
+
 2019-07-01  Dominique d'Humieres  
 
* g++.dg/cpp0x/gen-attrs-67.C: Add error for darwin.
Index: g++.dg/lto/alias-1_0.C
===
--- g++.dg/lto/alias-1_0.C  (revision 272846)
+++ g++.dg/lto/alias-1_0.C  (working copy)
@@ -17,6 +17,7 @@ __attribute__ ((used))
 struct b **bptr = (struct b**)&aptr;
 extern void init ();
 extern void inline_me_late (int);
+int n=1;
 
 
 int
@@ -24,7 +25,8 @@ main (int argc, char **argv)
 {
   init ();
   aptr = 0;
-  inline_me_late (argc);
+  for (int i=0; i

[range-ops] patch 02/04: enforce canonicalization in value_range

2019-07-01 Thread Aldy Hernandez

As discussed prior.  This enforces canonicalization at creation time, 
which makes things a lot more consistent throughout.


Since now [MIN, MAX] will be canonicalized into VR_VARYING, we can no 
longer depend on normalizing VARYING's into [MIN, MAX] for more general 
handling.  I've adjusted the few places where we were doing this so that 
they work correctly.


As a bonus, now that we're canonicalized, I've tweaked singleton to work 
with anti-ranges.


Note: There is a small note in ranges_from_anti_range, which I will 
start making heavier use of in subsequent patches:


+  /* ?? This function is called multiple times from num_pairs,
+ lower_bound, and upper_bound.  We should probably memoize this, or
+ rewrite the callers in such a way that we're not re-calculating
+ this constantly.  */

I don't think this is needed now, but just a note of things to come.  It 
may not be an issue, but just something to keep in mind.


Tested on x86-64 Linux with --enable-languages=all.

Aldy
commit d220a3eeb77615b39260e532e34815a3810e8348
Author: Aldy Hernandez 
Date:   Fri Jun 28 11:15:03 2019 +0200

Enforce canonicalization in value_range.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index a5247735694..866200d9594 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,20 @@
+2019-07-01  Aldy Hernandez  
+
+	* tree-vrp.c (value_range_base::set_and_canonicalize): Rename to
+	set() and add normalization of [MIN, MAX] into varying.
+	(value_range_base::singleton_p): Make work with anti ranges.
+	(ranges_from_anti_range): Handle pointers.
+	(extract_range_into_wide_ints): Call normalize_symbolics.
+	(extract_range_from_binary_expr): Do not build a [MIN,MAX] range
+	because it will be canonicalized into varying.
+	Call set() instead of set_and_canonicalize().
+	(extract_range_from_unary_expr): Call set() instead of
+	set_and_canonicalize().
+	(intersect_helper): Do not call set_and_canonicalize.
+	* tree-vrp.h (value_range_base): Remove set_and_canonicalize.
+	(value_range): Same.
+	* vr-values.c (extract_range_from_var_from_comparison_expr): Same.
+
 2019-07-01  Aldy Hernandez  
 
 	* gimple-ssa-evrp-analyze.c (record_ranges_from_phis): Skip PHIs
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 97046c22ed1..f78517b5b3b 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -69,20 +69,15 @@ along with GCC; see the file COPYING3.  If not see
 #include "builtins.h"
 #include "wide-int-range.h"
 
+static bool
+ranges_from_anti_range (const value_range_base *ar,
+			value_range_base *vr0, value_range_base *vr1,
+			bool handle_pointers = false);
+
 /* Set of SSA names found live during the RPO traversal of the function
for still active basic-blocks.  */
 static sbitmap *live;
 
-void
-value_range_base::set (enum value_range_kind kind, tree min, tree max)
-{
-  m_kind = kind;
-  m_min = min;
-  m_max = max;
-  if (flag_checking)
-check ();
-}
-
 void
 value_range::set_equiv (bitmap equiv)
 {
@@ -363,6 +358,24 @@ value_range::equiv_add (const_tree var,
 bool
 value_range_base::singleton_p (tree *result) const
 {
+  if (m_kind == VR_ANTI_RANGE)
+{
+  if (nonzero_p ())
+	{
+	  if (TYPE_PRECISION (type ()) == 1)
+	{
+	  if (result)
+		*result = m_max;
+	  return true;
+	}
+	  return false;
+	}
+
+  value_range_base vr0, vr1;
+  return (ranges_from_anti_range (this, &vr0, &vr1, true)
+	  && vr1.undefined_p ()
+	  && vr0.singleton_p (result));
+}
   if (m_kind == VR_RANGE
   && vrp_operand_equal_p (min (), max ())
   && is_gimple_min_invariant (min ()))
@@ -690,8 +703,7 @@ intersect_range_with_nonzero_bits (enum value_range_kind vr_type,
extract ranges from var + CST op limit.  */
 
 void
-value_range_base::set_and_canonicalize (enum value_range_kind kind,
-	tree min, tree max)
+value_range_base::set (enum value_range_kind kind, tree min, tree max)
 {
   if (kind == VR_UNDEFINED)
 {
@@ -711,7 +723,9 @@ value_range_base::set_and_canonicalize (enum value_range_kind kind,
   if (TREE_CODE (min) != INTEGER_CST
   || TREE_CODE (max) != INTEGER_CST)
 {
-  set (kind, min, max);
+  m_kind = kind;
+  m_min = min;
+  m_max = max;
   return;
 }
 
@@ -747,12 +761,13 @@ value_range_base::set_and_canonicalize (enum value_range_kind kind,
   kind = kind == VR_RANGE ? VR_ANTI_RANGE : VR_RANGE;
 }
 
+  tree type = TREE_TYPE (min);
+
   /* Anti-ranges that can be represented as ranges should be so.  */
   if (kind == VR_ANTI_RANGE)
 {
   /* For -fstrict-enums we may receive out-of-range ranges so consider
  values < -INF and values > INF as -INF/INF as well.  */
-  tree type = TREE_TYPE (min);
   bool is_min = (INTEGRAL_TYPE_P (type)
 		 && tree_int_cst_compare (min, TYPE_MIN_VALUE (type)) <= 0);
   bool is_max = (INTEGRAL_TYPE_P (type)
@@ -795,22 +810,37 @@ value_range_base::set_and_canonicalize (enum value_range_kind kind,
 }
 }
 
+  /* Normalize [MIN, MAX] into VARYING and ~[

Re: [PATCH 5/5] Use ira_setup_alts for conflict detection

2019-07-01 Thread Richard Sandiford

Vladimir Makarov  writes:
> On 2019-06-21 9:43 a.m., Richard Sandiford wrote:
>> make_early_clobber_and_input_conflicts records allocno conflicts
>> between inputs and earlyclobber outputs.  It (rightly) avoids
>> doing this for inputs that are explicitly allowed to match the
>> output due to matching constraints.
>>
>> The problem is that whether this matching is allowed varies
>> between alternatives.  At the moment the code avoids adding
>> a clobber if *any* enabled alternative allows the match,
>> even if some other operand makes that alternative impossible.
>>
>> The specific instance of this for SVE is that some alternatives
>> allow matched earlyclobbers when a third operand X is constant zero.
>> We should avoid adding conflicts when X really is constant zero,
>> but should ignore the match if X is nonzero or nonconstant.
>>
>> ira_setup_alts can already filter these alternatives out for us,
>> so all we need to do is use it in process_bb_node_lives.  The
>> preferred_alternatives variable is only used for this earlyclobber
>> detection, so no other check should be affected.
>>
>> With the previous patch to check the reject weight in ira_setup_alts,
>> this has the effect of ignoring expensive alternatives if we have
>> other valid alternatives with zero cost.  It seems reasonable to base
>> the heuristic on only the alternatives that we'd actually like to use,
>> but if this ends up being too aggressive, we could instead make the new
>> reject behaviour conditional and only use it for add_insn_allocno_copies.
>>
> This patch definitely improves the heuristics.
>
> The only missed part is a comment for preferred_alternatives
>
> /* The value of get_preferred_alternatives for the current instruction,
>     supplemental to recog_data.  */
> static alternative_mask preferred_alternatives;
>
> The comment becomes misleading after the patch.

Oops, thanks for noticing that.

> With changing the comment, the patch is ok for me.

Thanks, here's what I installed after testing on aarch64-linux-gnu
and x86_64-linux-gnu.

Richard


2019-07-01  Richard Sandiford  

gcc/
* ira-lives.c (process_bb_node_lives): Use ira_setup_alts.

Index: gcc/ira-lives.c
===
--- gcc/ira-lives.c 2019-06-29 17:20:49.0 +0100
+++ gcc/ira-lives.c 2019-07-01 09:33:14.026477923 +0100
@@ -80,8 +80,9 @@ Software Foundation; either version 3, o
 /* The number of last call at which given allocno was saved.  */
 static int *allocno_saved_at_call;
 
-/* The value of get_preferred_alternatives for the current instruction,
-   supplemental to recog_data.  */
+/* The value returned by ira_setup_alts for the current instruction;
+   i.e. the set of alternatives that we should consider to be likely
+   candidates during reloading.  */
 static alternative_mask preferred_alternatives;
 
 /* If non-NULL, the source operand of a register to register copy for which
@@ -1236,9 +1237,7 @@ process_bb_node_lives (ira_loop_tree_nod
  }
  }
 
- extract_insn (insn);
- preferred_alternatives = get_preferred_alternatives (insn);
- preprocess_constraints (insn);
+ preferred_alternatives = ira_setup_alts (insn);
  process_single_reg_class_operands (false, freq);
 
  if (call_p)

[range-ops] patch 01/04: types for VR_UNDEFINED and VR_VARYING

2019-07-01 Thread Aldy Hernandez

As discussed before, this enforces types on undefined and varying, which 
makes everything more regular, and removes some special casing 
throughout range handling.


The min/max fields will contain TYPE_MIN_VALUE and TYPE_MAX_VALUE, which 
will make it easy to get at the bounds of a range later on.  Since 
pointers don't have TYPE_MIN/MAX_VALUE, we are using build_zero_cst() 
and wide_int_to_tree(wi::max_value(precision)), for consistency. 
UNDEFINED is set similarly, though nobody should ever ask for anything 
except type() from it.  That is, no one should be asking for the bounds.


There is one wrinkle, ipa-cp creates VR_VARYING ranges of floats, 
presumably to pass around state??  This causes value_range_base::type() 
and others to fail, even though I haven't actually seen a case of 
someone actually trying to set a VR_RANGE of a float.  For now, I've 
built a NOP_EXPR wrapper so type() works correctly.  The correct 
approach would probably be to avoid creating these varying/undefined 
ranges in ipa-cp, but I don't have enough ipa-cp-foo to do so. 
Suggestions welcome, if you don't like special casing this for ipa-cp. 
Or perhaps as a follow up.


In this patch I start introducing a couple small API changes that will 
be needed for range-ops.  Since they're needed/useful for this patch, I 
started adding them on a need-to-use basis.  They are:


value_range_base (tree type)

This is our constructor for building a varying:
value_range_base foo (some_type);

value_range_base::supports_type_p(tree type)

We use this instead of having to test everywhere for
INTEGRAL_TYPE_P and POINTER_TYPE_P which VRP uses throughout.
I have not ported every use of the INTEGRAL || POINTER in the
compiler to this function.  But that could be a follow up
cleanup if someone (else) is interested :).

value_range_base_normalize_symbolics():

This normalizes symbolics into constants.  In VRP we usually
normalize necessary symbolics into [MIN, MAX].  This patch does
slightly better.  Now we transform:

  // [SYM, SYM] -> VARYING
  // [SYM, NUM] -> [-MIN, NUM]
  // [NUM, SYM] -> [NUM, +MAX]
  // ~[SYM, NUM] -> [NUM + 1, +MAX]
  // ~[NUM, SYM] -> [-MIN, NUM - 1]

TBH, this bit and its use in *multiplicative_op probably fits
better with the canonicalization patch, but as I said.  They're
a bit intertwined.  Ughh.

Finally, as you mentioned before, we need a way of caching varyings in 
the allocation pool.  The type_range_cache class in the attached patch 
is Andrew's doing, but I'll be happy to take the blame and address 
anything that needs doing.


Tested on x86-64 Linux with --enable-languages=all.

Aldy
commit 24c3a6a2cb7424a9c022930cada3a5f3c84a388a
Author: Aldy Hernandez 
Date:   Fri Jun 28 11:00:10 2019 +0200

VR_UNDEFINED and VR_VARYING now have a type.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 01fb97cedb2..a5247735694 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,72 @@
+2019-07-01  Aldy Hernandez  
+
+	* gimple-ssa-evrp-analyze.c (record_ranges_from_phis): Skip PHIs
+	who's result are not valid for a value_range.
+	Set type for varying value_range.
+	* ipa-cp.c (ipcp_vr_lattice::init): Set type for varying/undef
+	value_range.
+	(ipcp_vr_lattice::set_to_bottom): Same.
+	(initialize_node_lattices): Same.
+	* tree-ssa-threadedge.c (record_temporary_equivalences_from_phis):
+	Same.
+	* tree-ssanames.c (get_range_info): Same.
+	* tree-vrp.c (value_range::set_equiv): Do not set equiv on
+	undef/varying.
+	(value_range_base::value_range_base): New constructor.
+	(value_range_base::check): Remove assert for empty min/max.
+	(value_range_base::equal_p): Allow comparison of typeless undefs.
+	(value_range_base::set_undefined): Add type.
+	(value_range::set_undefined): Same.
+	(value_range_base::set_varying): Same.
+	(value_range::set_varying): Same.
+	(value_range_base::type): Remove assert.
+	(value_range_base::dump): Display type for varying/undef.
+	(value_range_base::dump): Add argument-less overload.
+	(value_range::dump): Same.
+	(vrp_val_max): Add handle_pointers argument.
+	(vrp_val_min): Same.
+	(vrp_val_is_max): Same.
+	(vrp_val_is_min): Same.
+	(value_range_base::set_and_canonicalize): Adjust so type is
+	allowed for varying/undef.
+	(ranges_from_anti_range): Same.
+	(extract_range_from_muliplicative_op): Same.
+	(extract_range_from_binary_expr): Same.
+	(extract_range_from_unary_expr): Same.
+	(vrp_prop::vrp_initialize): Same.
+	(vrp_prop::visit_stmt): Same.
+	(value_range_base::union_helper): Same.
+	(value_range_base::normalize_symbolics): Same.
+	(determine_value_range_1): Same.
+	* tree-vrp.h (value_range_base): Add value_range_base(tree type)
+	constructor.
+	Add dump (), supports_type_p,
+	value_range_base_normalize_symbolics, set_varying, and
+	set_undefined.
+	(value_range): Add set_varying, set_undefined, and dump().
+	(vrp_val_is_min): A

Re: [PATCH][gcc] libgccjit: check result_type in gcc_jit_context_new_binary_op

2019-07-01 Thread Andrea Corallo


Andrea Corallo writes:

> David Malcolm writes:
>
>> On Tue, 2019-06-25 at 08:11 +, Andrea Corallo wrote:
>>> Hi,
>>> third version for this patch with the simplified test.
>>>
>>> make check-jit pass clean
>>>
>>> Bests
>>>   Andrea
>>>
>>> 2019-06-09  Andrea Corallo  andrea.cora...@arm.com
>>>
>>> * libgccjit.c (gcc_jit_context_new_binary_op): Check result_type to
>>> be a
>>> numeric type.
>>>
>>>
>>> 2019-06-20  Andrea Corallo andrea.cora...@arm.com
>>>
>>> * jit.dg/test-error-gcc_jit_context_new_binary_op-bad-res-type.c:
>>> New testcase.
>>
>> Thanks for the updated patch.
>>
>> This is good for trunk.
>>
>> (Copying and pasting from my other review): are you working on getting
>> SVN commit access, or do you want me to commit your two patches on your
>> behalf?
>>
>> Thanks
>> Dave
>
> Hi David,
> I can work on to get the SVN commit access.
> As a maintainer has to sponsor it would you mind being the one?
>
> Thanks
>   Andrea

Hi David,
kind ping on this :)

Best regards
  Andrea

Re: [PATCH] Make lto-dump dump symtab callgraph in graphviz format

2019-07-01 Thread Martin Jambor

On Sat, Jun 29 2019, Giuliano Belinassi wrote:
> This patch makes lto-dump dump the symbol table callgraph in graphviz
> format.

-ENOPATCH

Martin

>
> I've not added any test to this because I couldn't find a way to call
> lto-dump from the testsuite. Also, any feedback with regard to how can
> I improve this is welcome.
>
> gcc/ChangeLog
> 2019-06-29  Giuliano Belinassi  
>
> * cgraph.c (dump_graphviz): New function
> * cgraph.h (dump_graphviz): New function
> * symtab.c (dump_graphviz): New function
> * varpool.c (dump_graphviz): New function
>
> gcc/lto/ChangeLog
> 2019-06-29  Giuliano Belinassi  
>
> * lang.opt (flag_dump_callgraph): New flag
> * lto-dump.c (dump_symtab_graphviz): New function
> * lto-dump.c (dump_tool_help): New option
> * lto-dump.c (lto_main): New option

[PATCH] Fix libstdc++ install-pdf support.

2019-07-01 Thread Jim Wilson

Try to run "make install-pdf" on a system with dblatex and pdflatex installed
but not doxygen gives an error.
run_doxygen error: Could not find Doxygen 1.7.0 in path.
Looking at the build log, I see that this is also using xsltproc and xmllint.
Installing doxygen and running again, I get lots of ignored errors for a
missing dot program.  Looking at the docs I see the collaboration diagrams
are missing.  Installing dot and rebuilding and now I have the collaboration
diagrams.  I don't see any evidence that the pdf docs are using the
stylesheets.  Otherwise, they need everything else that the xml docs need.

Regenerating configure I got an unexpected change, but that is an issue with
a patch a few days ago that added a new thread file, and regenerated the
libgcc configure to use it, but failed to notice that the libstdc++ configure
should have been regenerated too.

Tested with x86_64-linux builds with various packages installed or removed,
and looking at the final docs to make sure they look right.

OK?

Jim

libstdc++-v3/
* configure.ac (BUILD_PDF): Also test for doxygen, dot, xsltproc,
and xmllint.
* configure: Regenerate.
---
 libstdc++-v3/configure| 21 +
 libstdc++-v3/configure.ac |  4 
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
index 62d9cb49acf..d06b0440cb5 100755
--- a/libstdc++-v3/configure
+++ b/libstdc++-v3/configure
@@ -15418,6 +15418,7 @@ $as_echo "$target_thread_file" >&6; }
 case $target_thread_file in
 aix)   thread_header=config/rs6000/gthr-aix.h ;;
 dce)   thread_header=config/pa/gthr-dce.h ;;
+gcn)   thread_header=config/gcn/gthr-gcn.h ;;
 lynx)  thread_header=config/gthr-lynx.h ;;
 mipssde)   thread_header=config/mips/gthr-mipssde.h ;;
 posix) thread_header=gthr-posix.h ;;
@@ -15637,7 +15638,7 @@ $as_echo "$glibcxx_cv_atomic_long_long" >&6; }
   # Fake what AC_TRY_COMPILE does.
 
 cat > conftest.$ac_ext << EOF
-#line 15640 "configure"
+#line 15641 "configure"
 int main()
 {
   typedef bool atomic_type;
@@ -15672,7 +15673,7 @@ $as_echo "$glibcxx_cv_atomic_bool" >&6; }
 rm -f conftest*
 
 cat > conftest.$ac_ext << EOF
-#line 15675 "configure"
+#line 15676 "configure"
 int main()
 {
   typedef short atomic_type;
@@ -15707,7 +15708,7 @@ $as_echo "$glibcxx_cv_atomic_short" >&6; }
 rm -f conftest*
 
 cat > conftest.$ac_ext << EOF
-#line 15710 "configure"
+#line 15711 "configure"
 int main()
 {
   // NB: _Atomic_word not necessarily int.
@@ -15743,7 +15744,7 @@ $as_echo "$glibcxx_cv_atomic_int" >&6; }
 rm -f conftest*
 
 cat > conftest.$ac_ext << EOF
-#line 15746 "configure"
+#line 15747 "configure"
 int main()
 {
   typedef long long atomic_type;
@@ -15896,7 +15897,7 @@ $as_echo "mutex" >&6; }
   # unnecessary for this test.
 
 cat > conftest.$ac_ext << EOF
-#line 15899 "configure"
+#line 15900 "configure"
 int main()
 {
   _Decimal32 d1;
@@ -15938,7 +15939,7 @@ ac_compiler_gnu=$ac_cv_cxx_compiler_gnu
   # unnecessary for this test.
 
 cat > conftest.$ac_ext << EOF
-#line 15941 "configure"
+#line 15942 "configure"
 template
   struct same
   { typedef T2 type; };
@@ -15972,7 +15973,7 @@ $as_echo "$enable_int128" >&6; }
 rm -f conftest*
 
 cat > conftest.$ac_ext << EOF
-#line 15975 "configure"
+#line 15976 "configure"
 template
   struct same
   { typedef T2 type; };
@@ -81880,7 +81881,11 @@ $as_echo "no" >&6; }
 fi
 
 
- if test $ac_cv_prog_DBLATEX = "yes" &&
+ if test $ac_cv_prog_DOXYGEN = "yes" &&
+  test $ac_cv_prog_DOT = "yes" &&
+  test $ac_cv_prog_XSLTPROC = "yes" &&
+  test $ac_cv_prog_XMLLINT = "yes" &&
+  test $ac_cv_prog_DBLATEX = "yes" &&
   test $ac_cv_prog_PDFLATEX = "yes"; then
   BUILD_PDF_TRUE=
   BUILD_PDF_FALSE='#'
diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac
index 2e3a1a98f33..80d8202c337 100644
--- a/libstdc++-v3/configure.ac
+++ b/libstdc++-v3/configure.ac
@@ -483,6 +483,10 @@ AM_CONDITIONAL(BUILD_MAN,
 AC_CHECK_PROG([DBLATEX], dblatex, yes, no)
 AC_CHECK_PROG([PDFLATEX], pdflatex, yes, no)
 AM_CONDITIONAL(BUILD_PDF,
+  test $ac_cv_prog_DOXYGEN = "yes" &&
+  test $ac_cv_prog_DOT = "yes" &&
+  test $ac_cv_prog_XSLTPROC = "yes" &&
+  test $ac_cv_prog_XMLLINT = "yes" &&
   test $ac_cv_prog_DBLATEX = "yes" &&
   test $ac_cv_prog_PDFLATEX = "yes")
 
-- 
2.17.1

Re: [PATCH] Add late non-iterating FRE with optimize > 1

2019-07-01 Thread Richard Biener

On Thu, 27 Jun 2019, Richard Biener wrote:

> 
> This fixes FREs handling of TARGET_MEM_REF (it didn't consider
> &TARGET_MEM_REF) and adds a late FRE pass which has iteration
> disabled and runs only at -O[2s]+ to limit the compile-time
> impact.
> 
> This helps cases where unrolling and vectorization exposes
> "piecewise" redundancies DOM cannot handle.  Thus
> 
>  (vector *)&a = { 1, 2, 3, 4 };
>  .. = a[2];
> 
> there's still the opposite case not handled (PR83518) but
> I will see whether I can make it work without too much cost:
> 
>  a[0] = 1;
>  a[1] = 2;
>  a[2] = 3;
>  a[3] = 4;
>  ... = (vector *)&a;
> 
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> 
> I'll commit the TARGET_MEM_REF fixing indepenently.
> 
> Any comments?  I'm not sure I like globbing the iteration
> parameter and the optimize > 1 check; maybe I should simply
> rename it to 'late' ...
> 
> The compile-time impact might be non-trivial for those testcases
> that run into a large overhead from the alias-stmt walking but
> I didn't do any measurements yet.

Testing went good with only one false-positive fallout of
gcc.dg/tree-ssa/pr77445-2.c where new threading opportunities
arise in thread3 due to late FRE but those opportunities cross
loops and thus are not considered.  But that breaks the testcases
dump scanning for profile correctness.  I've re-visited PR77445
and applied the obvious change.

I have commited the FRE TARGET_MEM_REF fix separately and now
the patch below.

Bootstrapped / tested on x86_64-unknown-linux-gnu.

Richard.

2019-07-01  Richard Biener  

* tree-ssa-sccvn.c (class pass_fre): Add may_iterate
pass parameter.
(pass_fre::execute): Honor it.
* passes.def: Adjust pass_fre invocations to allow iterating,
add non-iterating pass_fre before late threading/dom.

* gcc.dg/tree-ssa/pr77445-2.c: Adjust.

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 272742)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -6872,14 +6853,24 @@ class pass_fre : public gimple_opt_pass
 {
 public:
   pass_fre (gcc::context *ctxt)
-: gimple_opt_pass (pass_data_fre, ctxt)
+: gimple_opt_pass (pass_data_fre, ctxt), may_iterate (true)
   {}
 
   /* opt_pass methods: */
   opt_pass * clone () { return new pass_fre (m_ctxt); }
-  virtual bool gate (function *) { return flag_tree_fre != 0; }
+  void set_pass_param (unsigned int n, bool param)
+{
+  gcc_assert (n == 0);
+  may_iterate = param;
+}
+  virtual bool gate (function *)
+{
+  return flag_tree_fre != 0 && (may_iterate || optimize > 1);
+}
   virtual unsigned int execute (function *);
 
+private:
+  bool may_iterate;
 }; // class pass_fre
 
 unsigned int
@@ -6888,15 +6879,16 @@ pass_fre::execute (function *fun)
   unsigned todo = 0;
 
   /* At -O[1g] use the cheap non-iterating mode.  */
+  bool iterate_p = may_iterate && (optimize > 1);
   calculate_dominance_info (CDI_DOMINATORS);
-  if (optimize > 1)
+  if (iterate_p)
 loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
 
   default_vn_walk_kind = VN_WALKREWRITE;
-  todo = do_rpo_vn (fun, NULL, NULL, optimize > 1, true);
+  todo = do_rpo_vn (fun, NULL, NULL, iterate_p, true);
   free_rpo_vn ();
 
-  if (optimize > 1)
+  if (iterate_p)
 loop_optimizer_finalize ();
 
   return todo;
Index: gcc/passes.def
===
--- gcc/passes.def  (revision 272742)
+++ gcc/passes.def  (working copy)
@@ -83,7 +83,7 @@ along with GCC; see the file COPYING3.
  /* pass_build_ealias is a dummy pass that ensures that we
 execute TODO_rebuild_alias at this point.  */
  NEXT_PASS (pass_build_ealias);
- NEXT_PASS (pass_fre);
+ NEXT_PASS (pass_fre, true /* may_iterate */);
  NEXT_PASS (pass_early_vrp);
  NEXT_PASS (pass_merge_phi);
   NEXT_PASS (pass_dse);
@@ -117,7 +117,7 @@ along with GCC; see the file COPYING3.
  NEXT_PASS (pass_oacc_kernels);
  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
  NEXT_PASS (pass_ch);
- NEXT_PASS (pass_fre);
+ NEXT_PASS (pass_fre, true /* may_iterate */);
  /* We use pass_lim to rewrite in-memory iteration and reduction
 variable accesses in loops into local variables accesses.  */
  NEXT_PASS (pass_lim);
@@ -199,7 +199,7 @@ along with GCC; see the file COPYING3.
 execute TODO_rebuild_alias at this point.  */
   NEXT_PASS (pass_build_alias);
   NEXT_PASS (pass_return_slot);
-  NEXT_PASS (pass_fre);
+  NEXT_PASS (pass_fre, true /* may_iterate */);
   NEXT_PASS (pass_merge_phi);
   NEXT_PASS (pass_thread_jumps);
   NEXT_PASS (pass_vrp, true /* warn_array_bounds_p */);
@@ -312,6 +312,7 @@ along with GCC; see the file COPYING3.
   NEXT_PASS (pass_strength_reduction);
   NEXT_

[range-ops] patch 00/04: Summary

2019-07-01 Thread Aldy Hernandez

I'm happy to finally contribute the range-ops part of the ranger work. 
This is the infrastructure for folding unary and binary ranges of 
tree_codes into a resulting range.  It is the ranger counterpart of 
extract_range_from_*_expr in tree-vrp.c (not the similarly called 
callers in vr-values.c).


It is divided into four parts.  The first 3 patches are supporting 
infrastructure for range-ops, but are technically independent and are 
useful on their own.  They are:


patch 01: VR_UNDEFINED and VR_VARYING now have a type.

patch 02: Enforce canonicalization in value_range.

patch 03: Abstract overflow handling, pointer_plus_expr
  symbolics, and plus/minus into their own functions.

We've discussed and agreed on the general approach of the first two 
before.  The final one is just shuffling things around so we can share 
value_range overflow handling, plus/minus, and pointer_plus symbolics 
with the ranger.  There are no changes to functionality in patch 03.


Note, that even though these 3 patches are technically independent, they 
keep touching the same bits over and over, so it's easier to scaffold 
them.  That is, patch 2 depends on 1 and patch 3 depends on 1 and 2.  It 
was a pain just to get them organized without just submitting a huge patch.


Finally, there is patch 04 which is range-ops itself.  I will discuss 
this in detail in the appropriate patch, but the general gist is that 
(if we agree), we will be running range-ops and extract_range_from_*expr 
during stage1, while comparing that the results agree.  Once everyone is 
in agreement, we can rip out the extract_range_from_* bits from 
tree-vrp.c  There will be a flag (-franges={vrp,range-ops,checking} and 
the default will be checking (run both and compare).


Let's get this party started!

Aldy

Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned

2019-07-01 Thread Richard Biener

On Fri, 28 Jun 2019, Andrew Pinski wrote:

> On Thu, Jun 27, 2019 at 9:55 PM Li Jia He  wrote:
> >
> >
> >
> > On 2019/6/27 11:48 PM, Jeff Law wrote:
> > > On 6/27/19 12:11 AM, Li Jia He wrote:
> > >> Hi,
> > >>
> > >> According to the optimizable case described by Qi Feng on
> > >> issue 88784, we can combine the cases into the following:
> > >>
> > >> 1. x >  y  &&  x != XXX_MIN  -->   x > y
> > >> 2. x >  y  &&  x == XXX_MIN  -->   false
> > >> 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
> > >>
> > >> 4. x <  y  &&  x != XXX_MAX  -->   x < y
> > >> 5. x <  y  &&  x == XXX_MAX  -->   false
> > >> 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
> > >>
> > >> 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
> > >> 8. x <= y  ||  x != XXX_MIN  -->   true
> > >> 9. x <= y  ||  x == XXX_MIN  -->   x <= y
> > >>
> > >> 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
> > >> 11. x >= y  ||  x != XXX_MAX  -->   true
> > >> 12. x >= y  ||  x == XXX_MAX  -->   x >= y
> > >>
> > >> Note: XXX_MIN represents the minimum value of type x.
> > >>XXX_MAX represents the maximum value of type x.
> > >>
> > >> Here we don't need to care about whether the operation is
> > >> signed or unsigned.  For example, in the below equation:
> > >>
> > >> 'x >  y  &&  x != XXX_MIN  -->   x > y'
> > >>
> > >> If the x type is signed int and XXX_MIN is INT_MIN, we can
> > >> optimize it to 'x > y'.  However, if the type of x is unsigned
> > >> int and XXX_MIN is 0, we can still optimize it to 'x > y'.
> > >>
> > >> The regression testing for the patch was done on GCC mainline on
> > >>
> > >>  powerpc64le-unknown-linux-gnu (Power 9 LE)
> > >>
> > >> with no regressions.  Is it OK for trunk ?
> > >>
> > >> Thanks,
> > >> Lijia He
> > >>
> > >> gcc/ChangeLog
> > >>
> > >> 2019-06-27  Li Jia He  
> > >>  Qi Feng  
> > >>
> > >>  PR middle-end/88784
> > >>  * gimple-fold.c (and_comparisons_contain_equal_operands): New 
> > >> function.
> > >>  (and_comparisons_1): Use and_comparisons_contain_equal_operands.
> > >>  (or_comparisons_contain_equal_operands): New function.
> > >>  (or_comparisons_1): Use or_comparisons_contain_equal_operands.
> > > Would this be better done via match.pd?  ISTM this transformation would
> > > be well suited for that framework.
> >
> > Hi, Jeff
> >
> > I did this because of the following test case:
> > `
> > _Bool comp(unsigned x, unsigned y)
> > {
> >return x > y && x != 0;
> > }
> > `
> > The gimple file dumped on the power platform is:
> > `
> > comp (unsigned int x, unsigned int y)
> > {
> >_Bool D.2837;
> >int iftmp.0;
> >
> >if (x > y) goto ; else goto ;
> >:
> >if (x != 0) goto ; else goto ;
> >:
> >iftmp.0 = 1;
> >goto ;
> >:
> >iftmp.0 = 0;
> >:
> >D.2837 = (_Bool) iftmp.0;
> >return D.2837;
> > }
> > `
> > However, the gimple file dumped on x86 is
> > `
> > comp (unsigned int x, unsigned int y)
> > {
> >_Bool D.2837;
> >
> >_1 = x > y;
> >_2 = x != 0;
> >_3 = _1 & _2;
> >_4 = (int) _3;
> >D.2837 = (_Bool) _4;
> >return D.2837;
> > }
> > `
> >
> > The reason for the inconsistency between these two behaviors is param
> > logical-op-non-short-circuit.  If we add the pattern to the match.pd
> > file, we can only optimize the situation in which the statement is in
> > the same basic block (logical-op-non-short-circuit=1, x86).  But for
> > a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
> > can't handle this situation.
> >
> > Another reason is that I found out maybe_fold_and_comparisons and
> > maybe_fold_or_comparisons are not only called by ifcombine pass but
> > also by reassoc pass. Using this method can basically unify param
> > logical-op-non-short-circuit=0 or 1.
> 
> 
> As mentioned before ifcombine pass should be using gimple-match
> instead of fold_build.  Try converting ifcombine over to gimple-match
> infrastructure and add these to match.pd.

Yes, I mentioned that in the PR.  The issue is that at the moment
to combine x > y with x <= y you'd have to build GENERIC trees
for both or temporary GIMPLE assign with a SSA def (and then feed
that into the GENERIC or GIMPLE match.pd path).

maybe_fold_and/or_comparisons handle two exploded binary expressions
while the current match.pd entries handle at most one exploded one
(the outermost then, either AND or OR).  But it would be definitely
doable to auto-generate maybe_fold_and/or_comparisons from match.pd
patterns which is what I'd ultimatively suggest to do (in some more
generalized form maybe).  Either with a separate genmatch invocation
or as part of the --gimple processing (not sure what is more feasible
here).

I told Li Jia He that I don't expect him to do this work.

Note I didn't review the actual patch yet.

Thanks,
Richard.

98 matches

Mail list logo