Re: [PATCH PR92926]Fix wrong code caused by ctor node translation unit wide sharing

2020-01-08 Thread Bin.Cheng
On Fri, Dec 20, 2019 at 3:10 PM Richard Biener
 wrote:
>
> On December 20, 2019 2:13:47 AM GMT+01:00, "Bin.Cheng" 
>  wrote:
> >On Fri, Dec 13, 2019 at 11:26 AM bin.cheng
> > wrote:
> >>
> >> Hi,
> >>
> >> As reported in PR92926, constant ctor is shared translation unit wide
> >because of constexpr_call_table,
> >> however, during gimplify, the shared ctor could be modified.  This
> >patch fixes the issue by unsharing
> >> it before modification in gimplify.  A test is reduced from cppcoro
> >library and added.
> >>
> >> Bootstrap and test ongoing.  Not sure if this is the correct fix
> >though, any comments?
> >Ping.  Any comment?
>
> Looks reasonable to me.
Given that PR92926 is marked as a duplicate of PR93143, I updated the
test case of the patch.

Thanks,
bin

2019-12-13  Bin Cheng  

PR tree-optimization/93143
* gimplify.c (gimplify_init_constructor): Unshare ctor node before
clearing.

gcc/testsuite
2019-12-13  Bin Cheng  

PR tree-optimization/93143
* g++.dg/pr93143.C: New test.
From 77252c3bb41887af1daa9e83615a8aa32dc330f9 Mon Sep 17 00:00:00 2001
From: "bin.cheng" 
Date: Thu, 9 Jan 2020 14:13:08 +0800
Subject: [PATCH] Fix pr93143.

---
 gcc/gimplify.c |  2 ++
 gcc/testsuite/g++.dg/pr93143.C | 73 ++
 2 files changed, 75 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/pr93143.C

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 73fb2e7..55d7a93 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -5083,6 +5083,8 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	/* Zap the CONSTRUCTOR element list, which simplifies this case.
 	   Note that we still have to gimplify, in order to handle the
 	   case of variable sized types.  Avoid shared tree structures.  */
+	ctor = unshare_expr (ctor);
+	TREE_OPERAND (*expr_p, 1) = ctor;
 	CONSTRUCTOR_ELTS (ctor) = NULL;
 	TREE_SIDE_EFFECTS (ctor) = 0;
 	object = unshare_expr (object);
diff --git a/gcc/testsuite/g++.dg/pr93143.C b/gcc/testsuite/g++.dg/pr93143.C
new file mode 100644
index 000..40710cf
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr93143.C
@@ -0,0 +1,73 @@
+// { dg-do run }
+// { dg-options "-O3 -std=c++14" }
+
+struct array
+{
+constexpr unsigned char operator[](int i) const noexcept
+{
+return arr[i];
+}
+
+unsigned char arr[16];
+};
+
+
+class v6 {
+public:
+using bytes_type = array;
+constexpr v6(bytes_type const & bytes);
+constexpr bool is_loopback() const noexcept;
+static constexpr v6 loopback() noexcept
+{
+return v6(v6::bytes_type{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1});
+}
+private:
+bytes_type bytes_;
+};
+
+
+
+constexpr v6::v6(bytes_type const & bytes)
+: bytes_(bytes)
+{}
+
+constexpr
+bool v6::is_loopback() const noexcept
+{
+return bytes_[0] == 0 &&
+bytes_[1] == 0 &&
+bytes_[2] == 0 &&
+bytes_[3] == 0 &&
+bytes_[4] == 0 &&
+bytes_[5] == 0 &&
+bytes_[6] == 0 &&
+bytes_[7] == 0 &&
+bytes_[8] == 0 &&
+bytes_[9] == 0 &&
+bytes_[10] == 0 &&
+bytes_[11] == 0 &&
+bytes_[12] == 0 &&
+bytes_[13] == 0 &&
+bytes_[14] == 0 &&
+bytes_[15] == 1;
+}
+
+void v6_type()
+{
+[[maybe_unused]] constexpr auto loopback = v6::loopback();
+}
+
+int main()
+{
+v6_type();
+
+constexpr auto a = v6::loopback();
+if (!a.is_loopback())
+__builtin_abort();
+
+auto b = v6::loopback();
+if (!b.is_loopback())
+__builtin_abort();
+
+return 0;
+}
-- 
1.8.3.1
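
The hazard fixed by the hunk above can be illustrated outside of GCC.  The
following is only a minimal sketch, assuming a hypothetical ctor_node type
in place of a CONSTRUCTOR tree and std::shared_ptr in place of the sharing
introduced by constexpr_call_table; none of these names come from gimplify.c.
It shows why clearing a shared node in place damages every other user, while
copying first (the analogue of unshare_expr plus re-pointing the operand)
leaves the cached node intact.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a CONSTRUCTOR node with an element list.
struct ctor_node
{
  std::vector<int> elts;
};

typedef std::shared_ptr<ctor_node> ctor_ptr;

// Buggy variant: zaps the element list of a node that other users
// may still reference.
void clear_in_place (ctor_ptr &ctor)
{
  ctor->elts.clear ();
}

// Fixed variant: take a private copy, re-point this use at the copy,
// and only then zap the element list.
void unshare_then_clear (ctor_ptr &ctor)
{
  ctor = std::make_shared<ctor_node> (*ctor);
  ctor->elts.clear ();
}

// Returns how many elements the cached node still has after one user
// has cleared its own view of it.
std::size_t cached_elts_after (bool unshare)
{
  ctor_ptr cached = std::make_shared<ctor_node> ();
  cached->elts.assign (16, 0);
  cached->elts[15] = 1;

  ctor_ptr use = cached;	// a second reference to the same node
  if (unshare)
    unshare_then_clear (use);
  else
    clear_in_place (use);

  assert (use->elts.empty ());
  return cached->elts.size ();
}
```

With the in-place variant the cached node loses all 16 elements, which is
the same kind of damage the test case observes as a failing is_loopback ()
check; with the copying variant the cached node is untouched.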



Re: [PATCH 05/41] Add -fdiagnostics-nn-line-numbers

2020-01-08 Thread Jeff Law
On Wed, 2020-01-08 at 23:35 -0500, David Malcolm wrote:
> On Wed, 2020-01-08 at 21:17 -0700, Jeff Law wrote:
> > On Wed, 2020-01-08 at 04:02 -0500, David Malcolm wrote:
> > > I may be able to self-approve this.  It's used by the
> > > diagnostic_path
> > > patch, and by the analyzer test suite.  Perhaps better to make
> > > undocumented, or do it via a DejaGnu pruning directive, but I wanted
> > > to get v5 of the kit posted.
> > > 
> > > This patch implements -fdiagnostics-nn-line-numbers, a new option
> > > which makes diagnostic_show_locus print "NN" rather than specific
> > > line numbers when printing the left margin.
> > > 
> > > This is intended purely to make it easier to write certain kinds of
> > > DejaGnu test; various integration tests for diagnostic paths later
> > > in the patch kit make use of it.
> > > 
> > > gcc/ChangeLog:
> > >   * common.opt (fdiagnostics-nn-line-numbers): New option.
> > >   * diagnostic-show-locus.c
> > > (layout::m_use_nn_for_line_numbers_p):
> > >   New field.
> > >   (layout::layout): Initialize it.
> > >   (layout::calculate_linenum_width): Use it when computing
> > >   m_linenum_width.
> > >   (layout::print_source_line): Implement printing "NN" rather
> > > than
> > >   the line number.
> > >   (selftest::test_line_numbers_multiline_range): Add a test of
> > > "NN"
> > >   printing.
> > >   * diagnostic.c (diagnostic_initialize): Initialize
> > >   use_nn_for_line_numbers_p.
> > >   (num_digits): Add "use_nn_p" param.
> > >   (selftest::test_num_digits): Add a test for use_nn_p==true.
> > >   * diagnostic.h (diagnostic_context::use_nn_for_line_numbers_p):
> > >   New field.
> > >   (num_digits): Add optional "use_nn_p" param.
> > >   * doc/invoke.texi (-fdiagnostics-nn-line-numbers): New option.
> > >   * dwarf2out.c (gen_producer_string): Ignore
> > >   OPT_fdiagnostics_nn_line_numbers.
> > >   * lto-wrapper.c (merge_and_complain): Handle
> > >   OPT_fdiagnostics_nn_line_numbers.
> > >   (append_compiler_options): Likewise.
> > >   (append_diag_options): Likewise.
> > >   * opts.c (common_handle_option): Likewise.
> > >   * toplev.c (general_init): Initialize
> > >   global_dc->use_nn_for_line_numbers_p.
> > Reminds me a lot of the option to not print insn numbers and certain
> > addresses in RTL dumps -- which makes comparing them easier.
> > 
> > OK
> > jeff
> 
> Thanks.  I've actually reworked my working copy to use a DejaGnu-based
> postprocessing approach instead:
>   https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00398.html
> which avoids adding an option.
> 
> Does this latter approach look OK?  (and is the other patch OK?)
I don't have a strong opinion here.  I guess it's marginally better to
not have the option in gcc.  I'll take a looksie at the updated patch.

Jeff



Re: [PATCH 05/41] Add -fdiagnostics-nn-line-numbers

2020-01-08 Thread David Malcolm
On Wed, 2020-01-08 at 21:17 -0700, Jeff Law wrote:
> On Wed, 2020-01-08 at 04:02 -0500, David Malcolm wrote:
> > I may be able to self-approve this.  It's used by the
> > diagnostic_path
> > patch, and by the analyzer test suite.  Perhaps better to make
> > undocumented, or do it via a DejaGnu pruning directive, but I wanted
> > to get v5 of the kit posted.
> > 
> > This patch implements -fdiagnostics-nn-line-numbers, a new option
> > which makes diagnostic_show_locus print "NN" rather than specific
> > line numbers when printing the left margin.
> > 
> > This is intended purely to make it easier to write certain kinds of
> > DejaGnu test; various integration tests for diagnostic paths later
> > in the patch kit make use of it.
> > 
> > gcc/ChangeLog:
> > * common.opt (fdiagnostics-nn-line-numbers): New option.
> > * diagnostic-show-locus.c
> > (layout::m_use_nn_for_line_numbers_p):
> > New field.
> > (layout::layout): Initialize it.
> > (layout::calculate_linenum_width): Use it when computing
> > m_linenum_width.
> > (layout::print_source_line): Implement printing "NN" rather
> > than
> > the line number.
> > (selftest::test_line_numbers_multiline_range): Add a test of
> > "NN"
> > printing.
> > * diagnostic.c (diagnostic_initialize): Initialize
> > use_nn_for_line_numbers_p.
> > (num_digits): Add "use_nn_p" param.
> > (selftest::test_num_digits): Add a test for use_nn_p==true.
> > * diagnostic.h (diagnostic_context::use_nn_for_line_numbers_p):
> > New field.
> > (num_digits): Add optional "use_nn_p" param.
> > * doc/invoke.texi (-fdiagnostics-nn-line-numbers): New option.
> > * dwarf2out.c (gen_producer_string): Ignore
> > OPT_fdiagnostics_nn_line_numbers.
> > * lto-wrapper.c (merge_and_complain): Handle
> > OPT_fdiagnostics_nn_line_numbers.
> > (append_compiler_options): Likewise.
> > (append_diag_options): Likewise.
> > * opts.c (common_handle_option): Likewise.
> > * toplev.c (general_init): Initialize
> > global_dc->use_nn_for_line_numbers_p.
> Reminds me a lot of the option to not print insn numbers and certain
> addresses in RTL dumps -- which makes comparing them easier.
> 
> OK
> jeff

Thanks.  I've actually reworked my working copy to use a DejaGnu-based
postprocessing approach instead:
  https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00398.html
which avoids adding an option.

Does this latter approach look OK?  (and is the other patch OK?)

Thanks
Dave



Re: [PATCH] avoid warning on vectorized past-the-end stores (PR 93200)

2020-01-08 Thread Jeff Law
On Wed, 2020-01-08 at 17:23 +, Martin Sebor wrote:
> A recent improvement to the vectorizer (r278334 if my bisection
> is right) can transform multiple stores to adjacent struct members
> into single vectorized assignments that write over all the members
> in a single MEM_REF.  These are then flagged by -Wstringop-overflow
> thanks to its also recently enhanced past-the-end store detection.
> The warnings have been causing failures in some of Jeff's periodic
> builds (e.g., in cjdns-v20.4).
> 
> Reliably distinguishing these transformed, multi-member, MEM_REF
> stores from accidental bugs the warning is designed to detect will
> require annotating them somehow at the time they are introduced.
> Until that happens, the attached patch simply tweaks the logic that
> determines the size of the destination objects to punt on these
> vectorized MEM_REFs.
I thought we had other code which could combine stores into consecutive
memory locations that might run afoul of this warning as well.  But I
can't seem to find that code -- we may well have throttled it a while
back because of data store races.

OK.  Thanks for taking care of this so quickly.

jeff




Re: [PATCH 05/41] Add -fdiagnostics-nn-line-numbers

2020-01-08 Thread Jeff Law
On Wed, 2020-01-08 at 04:02 -0500, David Malcolm wrote:
> I may be able to self-approve this.  It's used by the diagnostic_path
> patch, and by the analyzer test suite.  Perhaps better to make
> undocumented, or do it via a DejaGnu pruning directive, but I wanted
> to get v5 of the kit posted.
> 
> This patch implements -fdiagnostics-nn-line-numbers, a new option
> which makes diagnostic_show_locus print "NN" rather than specific
> line numbers when printing the left margin.
> 
> This is intended purely to make it easier to write certain kinds of
> DejaGnu test; various integration tests for diagnostic paths later
> in the patch kit make use of it.
> 
> gcc/ChangeLog:
>   * common.opt (fdiagnostics-nn-line-numbers): New option.
>   * diagnostic-show-locus.c (layout::m_use_nn_for_line_numbers_p):
>   New field.
>   (layout::layout): Initialize it.
>   (layout::calculate_linenum_width): Use it when computing
>   m_linenum_width.
>   (layout::print_source_line): Implement printing "NN" rather than
>   the line number.
>   (selftest::test_line_numbers_multiline_range): Add a test of "NN"
>   printing.
>   * diagnostic.c (diagnostic_initialize): Initialize
>   use_nn_for_line_numbers_p.
>   (num_digits): Add "use_nn_p" param.
>   (selftest::test_num_digits): Add a test for use_nn_p==true.
>   * diagnostic.h (diagnostic_context::use_nn_for_line_numbers_p):
>   New field.
>   (num_digits): Add optional "use_nn_p" param.
>   * doc/invoke.texi (-fdiagnostics-nn-line-numbers): New option.
>   * dwarf2out.c (gen_producer_string): Ignore
>   OPT_fdiagnostics_nn_line_numbers.
>   * lto-wrapper.c (merge_and_complain): Handle
>   OPT_fdiagnostics_nn_line_numbers.
>   (append_compiler_options): Likewise.
>   (append_diag_options): Likewise.
>   * opts.c (common_handle_option): Likewise.
>   * toplev.c (general_init): Initialize
>   global_dc->use_nn_for_line_numbers_p.
Reminds me a lot of the option to not print insn numbers and certain
addresses in RTL dumps -- which makes comparing them easier.

OK
jeff
> 



Re: [PATCH 04/41] vec.h: add auto_delete_vec

2020-01-08 Thread Jeff Law
On Wed, 2020-01-08 at 04:02 -0500, David Malcolm wrote:
> Needs review.  Used by diagnostic_path patch and in various places
> in the analyzer.
> 
> msebor raised some concerns about the v1 version of this patch here:
>   https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00221.html
> which I believe I addressed in v4:
>   https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01319.html
> 
> Changed in v4: added DISABLE_COPY_AND_ASSIGN
> 
> This patch adds a class auto_delete_vec, a subclass of auto_vec 
> that deletes all of its elements on destruction; it's used in many
> places later in the kit.
> 
> This is a crude way for a vec to "own" the objects it points to
> and clean up automatically (essentially a workaround for not being able
> to use unique_ptr, due to C++98).
> 
> gcc/ChangeLog:
>   * vec.c (class selftest::count_dtor): New class.
>   (selftest::test_auto_delete_vec): New test.
>   (selftest::vec_c_tests): Call it.
>   * vec.h (class auto_delete_vec): New class template.
>   (auto_delete_vec::~auto_delete_vec): New dtor.
Just to be clear because my earlier message referenced this patch from
an earlier thread, this is OK.

jeff
> 



Re: [RFC] IVOPTs select cand with preferred D-form access

2020-01-08 Thread Kewen.Lin
Hi Bin,

> I am a bit worried that would make IVOPTs heavy too, it might be
> possible to compute heuristics whether loop should be unrolled as a
> post-IVOPTs transformation.  Of course the transformation needs to do
> more work than simply unrolling in order to take advantage of
> aforementioned addressing mode.

Agreed, I prefer to just figure out the unroll factor (UF) by some
heuristics instead of performing actual unrolling as well.  I guess
"post-IVOPTs" is a typo for "pre-IVOPTs"?

> BTW, unrolled loop won't perform as good as ppc if the target doesn't
> support [base + register + offset] addressing mode?
> 

A target that doesn't support D-form would probably still benefit from
unrolling, but the IVOPTs decision won't affect it, since X-form has no
offset field to hide the step updates.  In the next patch, I'll compute
UF with more heuristics and update the IV cand step cost with it: for a
D-form cand the cost is step_cost just once, but for an X-form cand it
would be UF*step_cost.
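
As a rough sketch of that cost rule (the function and parameter names here
are hypothetical and not taken from tree-ssa-loop-ivopts.c):

```cpp
// Hypothetical model of the proposed step cost for an IV candidate.
// A D-form candidate can fold the per-copy step into the offset field,
// so it pays step_cost once; an X-form candidate has no offset field
// and pays for a step update in each of the UF unrolled copies.
unsigned iv_cand_step_cost (bool d_form_p, unsigned unroll_factor,
			    unsigned step_cost)
{
  return d_form_p ? step_cost : unroll_factor * step_cost;
}
```

With UF = 4 and step_cost = 2 this gives 2 for a D-form cand and 8 for an
X-form cand; with UF = 1 the two forms cost the same.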

> Another point, in case of multiple passes doing unrolling, the
> "already unrolled" information may need to be recorded as a flag of
> loop properties.

Yes, we can record the computed UF in the unroll field of the loop struct.
I'll check the performance impact to make sure the proposed UF computation
isn't poor.

Thanks,
Kewen



Re: [PATCH] RISC-V: Disable use of TLS copy relocs.

2020-01-08 Thread Palmer Dabbelt via gcc-patches

On Wed, 08 Jan 2020 17:05:21 PST (-0800), Jim Wilson wrote:

Musl and lld don't support TLS copy relocs, and don't want to add support
for this feature which is unique to RISC-V.  Only GNU ld and glibc support
them.  In the psABI discussion, people have pointed out various problems
with using them, so we are deprecating them.  There doesn't seem to be an
ABI break from dropping them so this patch modifies gcc to stop creating
them.  I'm using an ifdef for now in case a problem turns up and the code
has to be re-enabled.  The plan is to add an initial to local exec
relaxation as a replacement, though this has not been defined or
implemented yet.

This was tested with native gcc and glibc builds and checks with no
regressions.

Committed.

Jim

gcc/
* config/riscv/riscv.c (riscv_legitimize_tls_address): Ifdef out
use of TLS_MODEL_LOCAL_EXEC when not pic.
---
 gcc/config/riscv/riscv.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 3e0bedaf145..4ba811126fe 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -1257,9 +1257,12 @@ riscv_legitimize_tls_address (rtx loc)
   rtx dest, tp, tmp;
   enum tls_model model = SYMBOL_REF_TLS_MODEL (loc);

+#if 0
+  /* TLS copy relocs are now deprecated and should not be used.  */
   /* Since we support TLS copy relocs, non-PIC TLS accesses may all use LE.  */
   if (!flag_pic)
 model = TLS_MODEL_LOCAL_EXEC;
+#endif

   switch (model)
 {


Thanks!


Re: [PATCH] Use cgraph_node::dump_{asm_},name where possible.

2020-01-08 Thread luoxhu




On 2020/1/8 22:54, Martin Liška wrote:

diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
index bd44063a1ac..789564ba335 100644
--- a/gcc/cgraphclones.c
+++ b/gcc/cgraphclones.c
@@ -1148,8 +1148,7 @@ symbol_table::materialize_all_clones (void)
  if (symtab->dump_file)
{
  fprintf (symtab->dump_file, "cloning %s to %s\n",
-  xstrdup_for_dump (node->clone_of->name ()),
-  xstrdup_for_dump (node->name ()));
+  node->clone_of->dump_name (), node->name ());


Also node->dump_name () here?



[PATCH] RISC-V: Disable use of TLS copy relocs.

2020-01-08 Thread Jim Wilson
Musl and lld don't support TLS copy relocs, and don't want to add support
for this feature which is unique to RISC-V.  Only GNU ld and glibc support
them.  In the psABI discussion, people have pointed out various problems
with using them, so we are deprecating them.  There doesn't seem to be an
ABI break from dropping them so this patch modifies gcc to stop creating
them.  I'm using an ifdef for now in case a problem turns up and the code
has to be re-enabled.  The plan is to add an initial to local exec
relaxation as a replacement, though this has not been defined or
implemented yet.

This was tested with native gcc and glibc builds and checks with no
regressions.

Committed.

Jim

gcc/
* config/riscv/riscv.c (riscv_legitimize_tls_address): Ifdef out
use of TLS_MODEL_LOCAL_EXEC when not pic.
---
 gcc/config/riscv/riscv.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 3e0bedaf145..4ba811126fe 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -1257,9 +1257,12 @@ riscv_legitimize_tls_address (rtx loc)
   rtx dest, tp, tmp;
   enum tls_model model = SYMBOL_REF_TLS_MODEL (loc);
 
+#if 0
+  /* TLS copy relocs are now deprecated and should not be used.  */
   /* Since we support TLS copy relocs, non-PIC TLS accesses may all use LE.  */
   if (!flag_pic)
 model = TLS_MODEL_LOCAL_EXEC;
+#endif
 
   switch (model)
 {
-- 
2.17.1
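
The behavioural change in the hunk above can be summarised with a small
model.  This is only a sketch under stated assumptions: the enum and the
function are hypothetical, and the use_copy_relocs flag stands in for the
"#if 0" in the patch (true models the old behaviour, false the new one).

```cpp
// Hypothetical model of the TLS access models, most general first.
enum class tls_model
{
  global_dynamic,
  local_dynamic,
  initial_exec,
  local_exec
};

// Model actually used for an access.  With TLS copy relocs available,
// every non-PIC access may be downgraded to local-exec; without them,
// the model recorded on the symbol is kept as-is.
tls_model effective_tls_model (tls_model symbol_model, bool pic,
			       bool use_copy_relocs)
{
  if (use_copy_relocs && !pic)
    return tls_model::local_exec;
  return symbol_model;
}
```

Under this model, a non-PIC access to an initial-exec symbol was local-exec
before the patch and stays initial-exec after it; PIC accesses are the same
either way.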



Re: [PATCH 15/49] Add ordered_hash_map

2020-01-08 Thread David Malcolm
On Wed, 2019-12-04 at 10:59 -0700, Martin Sebor wrote:
> On 11/15/19 6:23 PM, David Malcolm wrote:
> > This patch adds an ordered_hash_map template, which is similar to
> > hash_map, but preserves insertion order.
> > 
> > gcc/ChangeLog:
> > * Makefile.in (OBJS): Add ordered-hash-map-tests.o.
> > * ordered-hash-map-tests.cc: New file.
> > * ordered-hash-map.h: New file.
> > * selftest-run-tests.c (selftest::run_tests): Call
> > selftest::ordered_hash_map_tests_cc_tests.
> > * selftest.h (selftest::ordered_hash_map_tests_cc_tests): New
> > decl.
> > ---
> >   gcc/Makefile.in   |   1 +
> >   gcc/ordered-hash-map-tests.cc | 247
> > ++
> >   gcc/ordered-hash-map.h| 184
> > +++
> >   gcc/selftest-run-tests.c  |   1 +
> >   gcc/selftest.h|   1 +
> >   5 files changed, 434 insertions(+)
> >   create mode 100644 gcc/ordered-hash-map-tests.cc
> >   create mode 100644 gcc/ordered-hash-map.h
> > 

[...]

> The container defines a copy-constructor but no copy assignment.
> Is it safely assignable? (I don't think auto_vec is safely copyable
> or assignable due to PR 90904.  It looks like the copy ctor works
> around it below.)

It's not safely assignable; I don't believe I'm using that (I am using
the copy ctor).

I can make it private or similar to ensure it's not used.

> I don't think I've made use of the hash_map copy ctor or copy
> assignment but if it's anything like other GCC containers I'd
> worry about it not doing the right thing, especially for non-
> PODs.  I spent too much time chasing down miscompilations and
> other problems due to bugs (or design limitations) in these
> classes.
> 
> I'd far prefer to see us use libstdc++ containers in new code
> than introduce new ones of our own.  They are better designed
> and much better tested than these.  (I realize we're still
> hampered by targeting C++ 98.)
> 
> Martin

[CCing Jonathan for libstdc++ expertise]

Is there such a container in libstdc++?

This patch implements a map that preserves insertion order when
iterating, without requiring the Key type to be comparable.

As far as I can tell, std::map instead is a map that requires the Key
type to be comparable, and uses that to implement a red-black tree
(rather than via hashing), and doesn't preserve insertion ordering.

I use this ordered_hash_map class later in various places in the
analyzer patch kit to ensure more deterministic results, so that
results aren't affected by hash values of possibly-changing pointer
values.
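
To make the idea concrete, here is a minimal sketch of such a container.
It is not the ordered-hash-map.h from the patch: it assumes C++11 and is
built on std::unordered_map plus a std::vector of keys, and the class and
helper names are made up for illustration.  The Key type needs hashing and
equality but no ordering, and iteration follows insertion order rather than
hash values.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

template <typename Key, typename Value>
class insertion_ordered_map
{
public:
  /* Insert or overwrite; a key keeps the position of its first
     insertion.  Returns true if the key was already present.  */
  bool put (const Key &k, const Value &v)
  {
    typename std::unordered_map<Key, Value>::iterator it = m_map.find (k);
    if (it != m_map.end ())
      {
	it->second = v;
	return true;
      }
    m_map.emplace (k, v);
    m_keys.push_back (k);
    return false;
  }

  /* Return a pointer to the value for K, or a null pointer if absent.  */
  const Value *get (const Key &k) const
  {
    typename std::unordered_map<Key, Value>::const_iterator it
      = m_map.find (k);
    return it == m_map.end () ? nullptr : &it->second;
  }

  /* Keys in insertion order; iteration never depends on hash values.  */
  const std::vector<Key> &keys () const { return m_keys; }

private:
  std::unordered_map<Key, Value> m_map;
  std::vector<Key> m_keys;
};

/* Usage helper: insert the given keys (value = position in the input)
   and return the resulting iteration order, joined by commas.  */
std::string iteration_order (const std::vector<std::string> &inserted)
{
  insertion_ordered_map<std::string, std::size_t> m;
  for (std::size_t i = 0; i < inserted.size (); ++i)
    m.put (inserted[i], i);

  std::string result;
  for (std::size_t i = 0; i < m.keys ().size (); ++i)
    {
      if (i > 0)
	result += ",";
      result += m.keys ()[i];
    }
  return result;
}
```

Inserting "zeta", "alpha", "mid" and then "alpha" again iterates as
zeta, alpha, mid, regardless of how the keys hash.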

[...]

Dave



[PATCH] testsuite: add lib/nn-line-numbers.exp

2020-01-08 Thread David Malcolm
(replying to my own "[PATCH 05/41] Add -fdiagnostics-nn-line-numbers"
with a followup that does it at the DejaGnu level rather than as a
test-only option)

On Wed, 2020-01-08 at 04:02 -0500, David Malcolm wrote:
> I may be able to self-approve this.  It's used by the diagnostic_path
> patch, and by the analyzer test suite.  Perhaps better to make
> undocumented, or do it via a DejaGnu pruning directive, but I wanted
> to get v5 of the kit posted.
> 
> This patch implements -fdiagnostics-nn-line-numbers, a new option
> which makes diagnostic_show_locus print "NN" rather than specific
> line numbers when printing the left margin.
> 
> This is intended purely to make it easier to write certain kinds of
> DejaGnu test; various integration tests for diagnostic paths later
> in the patch kit make use of it.
> 
> gcc/ChangeLog:
>   * common.opt (fdiagnostics-nn-line-numbers): New option.
>   * diagnostic-show-locus.c
> (layout::m_use_nn_for_line_numbers_p):
>   New field.
>   (layout::layout): Initialize it.
>   (layout::calculate_linenum_width): Use it when computing
>   m_linenum_width.
>   (layout::print_source_line): Implement printing "NN" rather
> than
>   the line number.
>   (selftest::test_line_numbers_multiline_range): Add a test of
> "NN"
>   printing.
>   * diagnostic.c (diagnostic_initialize): Initialize
>   use_nn_for_line_numbers_p.
>   (num_digits): Add "use_nn_p" param.
>   (selftest::test_num_digits): Add a test for use_nn_p==true.
>   * diagnostic.h (diagnostic_context::use_nn_for_line_numbers_p):
>   New field.
>   (num_digits): Add optional "use_nn_p" param.
>   * doc/invoke.texi (-fdiagnostics-nn-line-numbers): New option.
>   * dwarf2out.c (gen_producer_string): Ignore
>   OPT_fdiagnostics_nn_line_numbers.
>   * lto-wrapper.c (merge_and_complain): Handle
>   OPT_fdiagnostics_nn_line_numbers.
>   (append_compiler_options): Likewise.
>   (append_diag_options): Likewise.
>   * opts.c (common_handle_option): Likewise.
>   * toplev.c (general_init): Initialize
>   global_dc->use_nn_for_line_numbers_p.

Here's an alternative patch to the above that replaces the
"-fdiagnostics-nn-line-numbers" option in earlier versions of the
analyzer patch kit, by doing it at the DejaGnu level instead.

This patch adds support for obscuring the line numbers printed in the
left-hand margin when printing the source code, converting them to NN,
e.g. from:

  7111 |   if (!(flags & 0x0001)) {
   |  ^
   |  |
   |  (1) following 'true' branch...
  7112 |

to:

   NN  |   if (!(flags & 0x0001)) {
   |  ^
   |  |
   |  (1) following 'true' branch...
   NN  |

This is useful in followup patches e.g. when testing how interprocedural
paths are printed using multiline.exp, to avoid depending on precise line
numbers.
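
The rewriting itself is simple text processing.  As a rough illustration
only (this is C++ rather than the Tcl in nn-line-numbers.exp, the function
name is made up, and the exact padding is an assumption that may differ
from what the .exp file emits), the left margin can be obscured like this:

```cpp
#include <cstddef>
#include <regex>
#include <sstream>
#include <string>

// Replace the numeric left margin of each source line ("  7111 |")
// with an "NN" margin of the same width.  Lines without a line number
// in the margin are left alone.  Every output line ends in '\n'.
std::string obscure_line_numbers (const std::string &text)
{
  static const std::regex margin ("^( *)([0-9]+)( \\|)");
  std::istringstream in (text);
  std::ostringstream out;
  std::string line;
  while (std::getline (in, line))
    {
      std::smatch m;
      if (std::regex_search (line, m, margin))
	{
	  // Keep the margin width so the '|' column stays aligned.
	  std::size_t width = m[1].length () + m[2].length ();
	  std::string pad (width > 2 ? width - 2 : 0, ' ');
	  line = pad + "NN" + m[3].str () + m.suffix ().str ();
	}
      out << line << '\n';
    }
  return out.str ();
}
```

A test can then compare the pruned output against expected text that
contains only NN margins, so it keeps passing when lines are added to or
removed from the test source.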

I'm testing this now (but it seems to be a working, drop-in replacement
for the option in the parts of the patch kit I've tested with it).

Examples of use can be seen in the analyzer test suite:
  https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00320.html
(search for -fdiagnostics-nn-line-numbers and dg-begin-multiline-output
there to get the idea)

OK for trunk assuming the other testing looks good?

gcc/testsuite/ChangeLog:
* lib/gcc-dg.exp (cleanup-after-saved-dg-test): Reset global
nn_line_numbers_enabled.
* lib/nn-line-numbers.exp: New file.
* lib/prune.exp: Load nn-line-numbers.exp.
(prune_gcc_output): Call maybe-handle-nn-line-numbers.
---
 gcc/testsuite/lib/gcc-dg.exp  |   2 +
 gcc/testsuite/lib/nn-line-numbers.exp | 103 ++
 gcc/testsuite/lib/prune.exp   |   5 ++
 3 files changed, 110 insertions(+)
 create mode 100644 gcc/testsuite/lib/nn-line-numbers.exp

diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index e6875de23831..cccd3ce4742c 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -940,6 +940,7 @@ if { [info procs saved-dg-test] == [list] } {
global set_compiler_env_var
global saved_compiler_env_var
global keep_saved_temps_suffixes
+   global nn_line_numbers_enabled
global multiline_expected_outputs
global freeform_regexps
global save_linenr_varnames
@@ -967,6 +968,7 @@ if { [info procs saved-dg-test] == [list] } {
if [info exists testname_with_flags] {
unset testname_with_flags
}
+   set nn_line_numbers_enabled 0
set multiline_expected_outputs []
set freeform_regexps []
 
diff --git a/gcc/testsuite/lib/nn-line-numbers.exp 
b/gcc/testsuite/lib/nn-line-numbers.exp
new file mode 100644
index ..fed1004eb8e7
--- /dev/null
+++ b/gcc/testsuite/lib/nn-line-numbers.exp
@@ -0,0 +1,103 @@
+#   Copyright (C) 2020 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify

Re: [PATCH 03/41] sbitmap.h: add operator const_sbitmap to auto_sbitmap

2020-01-08 Thread Jeff Law
On Wed, 2020-01-08 at 04:02 -0500, David Malcolm wrote:
> Needs review.  (Used in one place by region-model.cc)
> 
> Changed in v5:
> - follow msebor's suggestion of using operator const_sbitmap
> rather than operator const sbitmap&, as per:
> https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00224.html
> 
> gcc/ChangeLog:
>   * sbitmap.h (auto_sbitmap): Add operator const_sbitmap.
OK.
jeff
> 



Re: [PATCH 02/41] analyzer: internal documentation

2020-01-08 Thread Jeff Law
On Wed, 2020-01-08 at 04:02 -0500, David Malcolm wrote:
> Needs review.
> 
> Changed in v5:
> - updated for removal of analyzer-specific builtins:
>   https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01310.html
> 
> Changed in v4:
>   https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02026.html
> 
> gcc/ChangeLog:
>   * Makefile.in (TEXI_GCCINT_FILES): Add analyzer.texi.
>   * doc/analyzer.texi: New file.
>   * doc/gccint.texi ("Static Analyzer") New menu item.
>   (analyzer.texi): Include it.
I think this is OK as well.  Like the other doc patch, I don't mind
some iteration here if we need it.

jeff



Re: [PATCH 01/41] analyzer: user-facing documentation

2020-01-08 Thread Jeff Law
On Wed, 2020-01-08 at 04:02 -0500, David Malcolm wrote:
> Sandra reviewed the v1 version of this patch here:
>   https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00549.html
> and noted that the organization could use some work.
> 
> TODO: update re Sandra's ideas
> 
> Changed in v4:
> - Use -fanalyzer rather than --analyzer
> - Add -W[no-]analyzer-unsafe-call-within-signal-handler
> 
> gcc/ChangeLog:
>   * doc/invoke.texi ("Static Analyzer Options"): New list and new section.
>   ("Warning Options"): Add static analysis warnings to the list.
>   (-Wno-analyzer-double-fclose): New option.
>   (-Wno-analyzer-double-free): New option.
>   (-Wno-analyzer-exposure-through-output-file): New option.
>   (-Wno-analyzer-file-leak): New option.
>   (-Wno-analyzer-free-of-non-heap): New option.
>   (-Wno-analyzer-malloc-leak): New option.
>   (-Wno-analyzer-possible-null-argument): New option.
>   (-Wno-analyzer-possible-null-dereference): New option.
>   (-Wno-analyzer-null-argument): New option.
>   (-Wno-analyzer-null-dereference): New option.
>   (-Wno-analyzer-stale-setjmp-buffer): New option.
>   (-Wno-analyzer-tainted-array-index): New option.
>   (-Wno-analyzer-use-after-free): New option.
>   (-Wno-analyzer-use-of-pointer-in-stale-stack-frame): New option.
>   (-Wno-analyzer-use-of-uninitialized-value): New option.
>   (-Wanalyzer-too-complex): New option.
>   (-fanalyzer-call-summaries): New warning.
>   (-fanalyzer-checker=): New warning.
>   (-fanalyzer-fine-grained): New warning.
>   (-fno-analyzer-state-merge): New warning.
>   (-fno-analyzer-state-purge): New warning.
>   (-fanalyzer-transitivity): New warning.
>   (-fanalyzer-verbose-edges): New warning.
>   (-fanalyzer-verbose-state-changes): New warning.
>   (-fanalyzer-verbosity=): New warning.
>   (-fdump-analyzer): New warning.
>   (-fdump-analyzer-callgraph): New warning.
>   (-fdump-analyzer-exploded-graph): New warning.
>   (-fdump-analyzer-exploded-nodes): New warning.
>   (-fdump-analyzer-exploded-nodes-2): New warning.
>   (-fdump-analyzer-exploded-nodes-3): New warning.
>   (-fdump-analyzer-supergraph): New warning.
I think this is generally OK.  I don't mind iterating on this stuff.

Jeff
> 



Re: [PATCH 05/49] vec.h: add auto_delete_vec

2020-01-08 Thread Jeff Law
On Wed, 2019-12-18 at 10:59 -0500, David Malcolm wrote:
> On Wed, 2019-12-04 at 09:29 -0700, Martin Sebor wrote:
> > On 11/15/19 6:22 PM, David Malcolm wrote:
> > > This patch adds a class auto_delete_vec, a subclass of auto_vec
> > > 
> > > that deletes all of its elements on destruction; it's used in many
> > > places later in the kit.
> > > 
> > > This is a crude way for a vec to "own" the objects it points to
> > > and clean up automatically (essentially a workaround for not being
> > > able
> > > to use unique_ptr, due to C++98).
> > > 
> > > gcc/ChangeLog:
> > >   * vec.c (class selftest::count_dtor): New class.
> > >   (selftest::test_auto_delete_vec): New test.
> > >   (selftest::vec_c_tests): Call it.
> > >   * vec.h (class auto_delete_vec): New class template.
> > >   (auto_delete_vec::~auto_delete_vec): New dtor.
> > 
> > Because of slicing, unless preventing the elements from being
> > deleted in the class dtor is meant to be a feature, it seems that
> > using a wrapper class rather than public derivation from auto_vec
> > might be a safer solution.
> > 
> > It might be worth mentioning in a comment that the class isn't
> > safe to copy or assign (each copy would wind up delete the same
> > pointers), in addition to making its copy ctor and copy assignment
> > operator inaccessible or deleted.
> > 
> > Martin
> 
> In the version of the patch in the v4 kit:
>   https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01035.html
> I added:
>   private:
>  DISABLE_COPY_AND_ASSIGN(auto_delete_vec);
> to the class.
> 
> Does that satisfy your concerns about slicing? (and, indeed, about
> copying and assigning)
I think this can and should go forward at this point.

jeff
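
For readers without the patch at hand, the ownership idea under discussion
can be sketched as follows.  This is not the vec.h code: it assumes a
wrapper around std::vector (the composition Martin suggests, rather than
public derivation from auto_vec), the names owning_ptr_vec and
destroyed_after_scope are made up, and copying is disabled in the C++98
way by declaring the copy constructor and assignment operator private and
leaving them undefined, which is the effect DISABLE_COPY_AND_ASSIGN is
meant to have.

```cpp
#include <cstddef>
#include <vector>

// A vector of pointers that deletes its elements on destruction.
template <typename T>
class owning_ptr_vec
{
public:
  owning_ptr_vec () {}

  ~owning_ptr_vec ()
  {
    for (std::size_t i = 0; i < m_elts.size (); ++i)
      delete m_elts[i];
  }

  void safe_push (T *p) { m_elts.push_back (p); }
  std::size_t length () const { return m_elts.size (); }
  T *operator[] (std::size_t i) const { return m_elts[i]; }

private:
  // Not copyable or assignable: two copies would delete the same
  // pointers.  Declared but intentionally never defined.
  owning_ptr_vec (const owning_ptr_vec &);
  owning_ptr_vec &operator= (const owning_ptr_vec &);

  std::vector<T *> m_elts;
};

// Helper in the spirit of the selftest: bumps a counter when destroyed.
struct count_dtor
{
  explicit count_dtor (int *counter) : m_counter (counter) {}
  ~count_dtor () { ++*m_counter; }
  int *m_counter;
};

// Push N heap-allocated objects, let the vector go out of scope, and
// return how many destructors ran.
int destroyed_after_scope (int n)
{
  int count = 0;
  {
    owning_ptr_vec<count_dtor> v;
    for (int i = 0; i < n; ++i)
      v.safe_push (new count_dtor (&count));
  }
  return count;
}
```

Because the wrapper does not expose the underlying vector as a base class,
it cannot be sliced to a non-owning type, and any attempt to copy or assign
it fails to compile.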



Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-08 Thread Martin Jambor
Hi,

On Wed, Jan 08 2020, Jan Hubicka wrote:
>> Hi,
>> 
>> On Fri, Jan 03 2020, Martin Liška wrote:
>> > Hi.
>> >
>> > This is similar transformation for IPA passes. This time,
>> > one needs to use opt_for_fn in order to get the right
>> > parameter values.
>> >
>> > @Martin, Honza:
>> > There are last few remaining parameters which should use
>> > opt_for_fn:
>> >
>> > param_ipa_sra_max_replacements
>> 
>> IPA-CP: Always access param_ipa_sra_max_replacements through opt_for_fn
>> 
>> 2020-01-07  Martin Jambor  
>> 
>>  * params.opt (param_ipa_sra_max_replacements): Mark as Optimization.
>>  * ipa-sra.c (scanned_node): New variable.
>>  (allocate_access): Use it to get param_ipa_sra_max_replacements.
>>  (ipa_sra_summarize_function): Set up scanned_node.
>>  (pull_accesses_from_callee): New parameter caller, use it to get
>>  param_ipa_sra_max_replacements.
>>  (param_splitting_across_edge): Pass the caller to
>>  pull_accesses_from_callee.
>> ---
>>  gcc/ipa-sra.c  | 33 +
>>  gcc/params.opt |  2 +-
>>  2 files changed, 22 insertions(+), 13 deletions(-)
>> 
>> diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
>> index a051a9f2154..133ed687509 100644
>> --- a/gcc/ipa-sra.c
>> +++ b/gcc/ipa-sra.c
>> @@ -521,6 +521,10 @@ ipa_sra_call_summaries::duplicate (cgraph_edge *, 
>> cgraph_edge *,
>>  
>>  /* With all GTY stuff done, we can move to anonymous namespace.  */
>>  namespace {
>> +/* Functions which currently has its body analyzed.  */
>> +
>> +cgraph_node *scanned_node;
>> +
>>  /* Quick mapping from a decl to its param descriptor.  */
>>  
>>  hash_map *decl2desc;
>> @@ -1265,7 +1269,8 @@ allocate_access (gensum_param_desc *desc,
>>   HOST_WIDE_INT offset, HOST_WIDE_INT size)
>>  {
>>if (desc->access_count
>> -  == (unsigned) param_ipa_sra_max_replacements)
>> +  == (unsigned) opt_for_fn (scanned_node->decl,
>> +param_ipa_sra_max_replacements))
>>  {
>>disqualify_split_candidate (desc, "Too many replacement candidates");
>>return NULL;
>> @@ -2472,6 +2477,7 @@ ipa_sra_summarize_function (cgraph_node *node)
>>   node->order);
>>if (!ipa_sra_preliminary_function_checks (node))
>>  return;
>> +  scanned_node = node;
>>gcc_obstack_init (_obstack);
>>isra_func_summary *ifs = func_sums->get_create (node);
>>ifs->m_candidate = true;
>> @@ -2526,6 +2532,7 @@ ipa_sra_summarize_function (cgraph_node *node)
>>delete decl2desc;
>>decl2desc = NULL;
>>obstack_free (_obstack, NULL);
>> +  scanned_node = NULL;
>
> It is your code.  Having a static var to track the currently analyzed
> function is a bit ugly, and I am not sure if you don't have
> current_function_decl set to that function in all cases.  But I will
> leave this to your decision.

It is my code but apparently I already forgot some changes to it.
Originally I only pushed/popped cfun when processing (some) scan results
(I originally hoped not to do it at all) but since May I actually push
it whenever the body of the function is scanned, so using cfun is indeed
OK.
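
As a rough model of why the lookup has to be per function, the sketch
below keeps a global parameter set plus an optional per-function
override.  It is a toy and not GCC code: param_set, fn_decl,
params_for_fn and too_many_replacements are hypothetical names, and the
default value 8 is arbitrary.

```cpp
#include <cassert>

// Toy model of a set of tunable parameters.  The field name mirrors the
// parameter discussed in the thread; the struct itself is hypothetical.
struct param_set
{
  int ipa_sra_max_replacements;
};

// Translation-unit-wide defaults (the value 8 is arbitrary here).
param_set global_params = { 8 };

// Toy function declaration that may carry its own parameter set, for
// example because it was compiled with different options.
struct fn_decl
{
  const param_set *specific_params = nullptr;
};

// Toy counterpart of a per-function lookup: prefer the function's own
// parameters and fall back to the global ones.
inline const param_set &
params_for_fn (const fn_decl &fn)
{
  return fn.specific_params ? *fn.specific_params : global_params;
}

// The limit check now depends on which function is being analyzed.
inline bool
too_many_replacements (const fn_decl &fn, unsigned access_count)
{
  int limit = params_for_fn (fn).ipa_sra_max_replacements;
  assert (limit >= 0);
  return access_count == static_cast<unsigned> (limit);
}
```

Reading the global directly would apply one limit to every function,
which is exactly what an interprocedural pass must avoid when functions
can carry different option sets.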

So please use the following instead:

IPA-CP: Access param_ipa_sra_max_replacements through opt_for_fn

2020-01-08  Martin Jambor  

* params.opt (param_ipa_sra_max_replacements): Mark as Optimization.
* ipa-sra.c (pull_accesses_from_callee): New parameter caller, use it
to get param_ipa_sra_max_replacements.
(param_splitting_across_edge): Pass the caller to
pull_accesses_from_callee.
---
 gcc/ipa-sra.c  | 24 +---
 gcc/params.opt |  2 +-
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
index a051a9f2154..51d225ed772 100644
--- a/gcc/ipa-sra.c
+++ b/gcc/ipa-sra.c
@@ -3246,16 +3246,17 @@ all_callee_accesses_present_p (isra_param_desc 
*param_desc,
 enum acc_prop_kind {ACC_PROP_DONT, ACC_PROP_COPY, ACC_PROP_CERTAIN};
 
 
-/* Attempt to propagate all definite accesses from ARG_DESC to PARAM_DESC, if
-   they would not violate some constraint there.  If successful, return NULL,
-   otherwise return the string reason for failure (which can be written to the
-   dump file).  DELTA_OFFSET is the known offset of the actual argument withing
-   the formal parameter (so of ARG_DESCS within PARAM_DESCS), ARG_SIZE is the
-   size of the actual argument or zero, if not known. In case of success, set
-   *CHANGE_P to true if propagation actually changed anything.  */
+/* Attempt to propagate all definite accesses from ARG_DESC to PARAM_DESC,
+   (which belongs to CALLER) if they would not violate some constraint there.
+   If successful, return NULL, otherwise return the string reason for failure
+   (which can be written to the dump file).  DELTA_OFFSET is the known offset
+   of the actual argument withing the formal parameter (so of ARG_DESCS within
+   PARAM_DESCS), ARG_SIZE is the size of the actual argument or zero, if not
+   known. In case 

Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-08 Thread Martin Jambor
Hi,

On Wed, Jan 08 2020, Martin Jambor wrote:
> Hi,
>
> On Fri, Jan 03 2020, Martin Liška wrote:
>> Hi.
>>
>> This is similar transformation for IPA passes. This time,
>> one needs to use opt_for_fn in order to get the right
>> parameter values.
>>
>> @Martin, Honza:
>> There are last few remaining parameters which should use
>> opt_for_fn:
>>
>> param_ipa_max_agg_items
>
>

Sorry, I sent out an earlier version of the patch that fails bootstrap
because of a Werror.  This one is the one I meant (the only difference
is that I changed one unsigned int into a signed one).

IPA-CP: Always access param_ipa_max_agg_items through opt_for_fn

2020-01-07  Martin Jambor  

* params.opt (param_ipa_max_agg_items): Mark as Optimization.
* ipa-cp.c (merge_agg_lats_step): New parameter max_agg_items, use
instead of param_ipa_max_agg_items.
(merge_aggregate_lattices): Extract param_ipa_max_agg_items from
optimization info for the callee.
---
 gcc/ipa-cp.c   | 14 +-
 gcc/ipa-prop.c |  7 ---
 gcc/params.opt |  2 +-
 3 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 4381b35a809..9e20e278eff 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -2458,13 +2458,13 @@ set_check_aggs_by_ref (class ipcp_param_lattices 
*dest_plats,
unless there are too many already.  If there are two many, return false.  If
there are overlaps turn whole DEST_PLATS to bottom and return false.  If any
skipped lattices were newly marked as containing variable, set *CHANGE to
-   true.  */
+   true.  MAX_AGG_ITEMS is the maximum number of lattices.  */
 
 static bool
 merge_agg_lats_step (class ipcp_param_lattices *dest_plats,
 HOST_WIDE_INT offset, HOST_WIDE_INT val_size,
 struct ipcp_agg_lattice ***aglat,
-bool pre_existing, bool *change)
+bool pre_existing, bool *change, int max_agg_items)
 {
   gcc_checking_assert (offset >= 0);
 
@@ -2499,7 +2499,7 @@ merge_agg_lats_step (class ipcp_param_lattices 
*dest_plats,
  set_agg_lats_to_bottom (dest_plats);
  return false;
}
-  if (dest_plats->aggs_count == param_ipa_max_agg_items)
+  if (dest_plats->aggs_count == max_agg_items)
return false;
   dest_plats->aggs_count++;
   new_al = ipcp_agg_lattice_pool.allocate ();
@@ -2553,6 +2553,8 @@ merge_aggregate_lattices (struct cgraph_edge *cs,
 ret |= set_agg_lats_contain_variable (dest_plats);
   dst_aglat = _plats->aggs;
 
+  int max_agg_items = opt_for_fn (cs->callee->function_symbol ()->decl,
+ param_ipa_max_agg_items);
   for (struct ipcp_agg_lattice *src_aglat = src_plats->aggs;
src_aglat;
src_aglat = src_aglat->next)
@@ -2562,7 +2564,7 @@ merge_aggregate_lattices (struct cgraph_edge *cs,
   if (new_offset < 0)
continue;
   if (merge_agg_lats_step (dest_plats, new_offset, src_aglat->size,
-  _aglat, pre_existing, ))
+  _aglat, pre_existing, , max_agg_items))
{
  struct ipcp_agg_lattice *new_al = *dst_aglat;
 
@@ -2742,6 +2744,8 @@ propagate_aggs_across_jump_function (struct cgraph_edge 
*cs,
   if (set_check_aggs_by_ref (dest_plats, jfunc->agg.by_ref))
return true;
 
+  int max_agg_items = opt_for_fn (cs->callee->function_symbol ()->decl,
+ param_ipa_max_agg_items);
   FOR_EACH_VEC_ELT (*jfunc->agg.items, i, item)
{
  HOST_WIDE_INT val_size;
@@ -2751,7 +2755,7 @@ propagate_aggs_across_jump_function (struct cgraph_edge 
*cs,
  val_size = tree_to_shwi (TYPE_SIZE (item->type));
 
  if (merge_agg_lats_step (dest_plats, item->offset, val_size,
-  , pre_existing, ))
+  , pre_existing, , max_agg_items))
{
  ret |= propagate_aggregate_lattice (cs, item, *aglat);
  aglat = &(*aglat)->next;
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index fcb13dfbac4..3488674760f 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -1852,8 +1852,9 @@ determine_known_aggregate_parts (struct 
ipa_func_body_info *fbi,
   tree arg_base;
   bool check_ref, by_ref;
   ao_ref r;
+  int max_agg_items = opt_for_fn (fbi->node->decl, param_ipa_max_agg_items);
 
-  if (param_ipa_max_agg_items == 0)
+  if (max_agg_items == 0)
 return;
 
   /* The function operates in three stages.  First, we prepare check_ref, r,
@@ -1951,14 +1952,14 @@ determine_known_aggregate_parts (struct 
ipa_func_body_info *fbi,
 operands, whose definitions can finally reach the call.  */
  add_to_agg_contents_list (, (*copy = *content, copy));
 
- if (++value_count == param_ipa_max_agg_items)
+ if (++value_count == max_agg_items)
break;
}
 
  /* Add to the list consisting 

Re: [PATCH] Improve __builtin_sub_overflow with signed double-word operands (PR target/93141)

2020-01-08 Thread Uros Bizjak
On Wed, Jan 8, 2020 at 9:09 AM Jakub Jelinek  wrote:
>
> Hi!
>
> This is very similar to the previous PR93141 addv4 half and
> improves signed __builtin_sub_overflow on double-words rather than
> __builtin_add_overflow.
>
> I have left out the uaddv4 double-word stuff, because I ran into
> issues with it - as the pattern uses (set (reg:CC flags) (compare:CC (reg:TI) 
> (reg:TI)))
> as the first set, ifcvt.c (canonicalize_condition it calls) extracts
> the TImode comparison arguments from it, but as we don't have a cmpti3
> pattern, noce_try_store_flag fails and we don't manage to convert branchy
> code into setc.  Even for smaller modes ifcvt in that case does weird
> things, after the sub instruction which sets the flags too it emits
> a cmp with the same arguments and only postreload we notice the compare is
> redundant and remove it.  So, dunno if we want to improve somehow ifcvt
> and for conditions where we'd look through a multiple set instructions
> somehow try harder to reuse the original instruction, or if usub4
> should not use compare:CC of the operands and instead use some obfuscation
> to only set CCCmode to prevent ifcvt from punting on it.
>
> Anyway, this part works, bootstrapped/regtested on x86_64-linux and
> i686-linux, ok for trunk?
>
> 2020-01-07  Jakub Jelinek  
>
> PR target/93141
> (subv4): Use SWIDWI iterator instead of SWI.  Use
>  instead of .  Use
> CONST_SCALAR_INT_P instead of CONST_INT_P.
> (*subv4_1): Rename to ...
> (subv4_1): ... this.
> (*subv4_doubleword, *addv4_doubleword_1): New
> define_insn_and_split patterns.
> (*subv4_overflow_1, *addv4_overflow_2): New define_insn
> patterns.
>
> * gcc.target/i386/pr93141-1.c: Add tests with constants that have MSB
> of the low half of the constant set.
> * gcc.target/i386/pr93141-2.c: New test.

LGTM.

Thanks,
Uros.
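
For readers who want to see the source-level construct this affects,
here is a small sketch of signed __builtin_sub_overflow.  It uses
long long, which is a double-word type only on 32-bit targets such as
i686; on x86_64 the double-word case would involve __int128 instead.
The helper names checked_sub and checked_sub_demo are hypothetical.

```cpp
#include <cassert>
#include <climits>

// Returns true if a - b does not fit in long long.  The result wrapped
// to the width of long long is stored in *out either way.
inline bool
checked_sub (long long a, long long b, long long *out)
{
  return __builtin_sub_overflow (a, b, out);
}

// Minimal usage: one subtraction that fits and one that overflows.
inline void
checked_sub_demo ()
{
  long long r;
  bool overflowed = checked_sub (5, 7, &r);
  assert (!overflowed && r == -2);

  overflowed = checked_sub (LLONG_MIN, 1, &r);
  assert (overflowed);
}
```

Whether the generated code is a sub/sbb pair followed by a branch on the
overflow flag depends on the target and the optimization level.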

> --- gcc/config/i386/i386.md.jj  2020-01-07 08:01:22.618613702 +0100
> +++ gcc/config/i386/i386.md 2020-01-07 10:07:01.770587027 +0100
> @@ -6569,16 +6569,17 @@ (define_insn "*subsi_2_zext"
>  ;; Subtract with jump on overflow.
>  (define_expand "subv4"
>[(parallel [(set (reg:CCO FLAGS_REG)
> -  (eq:CCO (minus:
> - (sign_extend:
> -(match_operand:SWI 1 "nonimmediate_operand"))
> - (match_dup 4))
> -  (sign_extend:
> - (minus:SWI (match_dup 1)
> -(match_operand:SWI 2
> -   "")
> - (set (match_operand:SWI 0 "register_operand")
> -  (minus:SWI (match_dup 1) (match_dup 2)))])
> +  (eq:CCO
> +(minus:
> +  (sign_extend:
> +(match_operand:SWIDWI 1 "nonimmediate_operand"))
> +  (match_dup 4))
> +(sign_extend:
> +  (minus:SWIDWI (match_dup 1)
> +(match_operand:SWIDWI 2
> +   "")
> + (set (match_operand:SWIDWI 0 "register_operand")
> +  (minus:SWIDWI (match_dup 1) (match_dup 2)))])
> (set (pc) (if_then_else
>(eq (reg:CCO FLAGS_REG) (const_int 0))
>(label_ref (match_operand 3))
> @@ -6586,7 +6587,7 @@ (define_expand "subv4"
>""
>  {
>ix86_fixup_binary_operands_no_copy (MINUS, mode, operands);
> -  if (CONST_INT_P (operands[2]))
> +  if (CONST_SCALAR_INT_P (operands[2]))
>  operands[4] = operands[2];
>else
>  operands[4] = gen_rtx_SIGN_EXTEND (mode, operands[2]);
> @@ -6608,7 +6609,7 @@ (define_insn "*subv4"
>[(set_attr "type" "alu")
> (set_attr "mode" "")])
>
> -(define_insn "*subv4_1"
> +(define_insn "subv4_1"
>[(set (reg:CCO FLAGS_REG)
> (eq:CCO (minus:
>(sign_extend:
> @@ -6633,6 +6634,162 @@ (define_insn "*subv4_1"
>   (const_string "4")]
>   (const_string "")))])
>
> +(define_insn_and_split "*subv4_doubleword"
> +  [(set (reg:CCO FLAGS_REG)
> +   (eq:CCO
> + (minus:
> +   (sign_extend:
> + (match_operand: 1 "nonimmediate_operand" "0,0"))
> +   (sign_extend:
> + (match_operand: 2 "x86_64_hilo_general_operand" 
> "r,o")))
> + (sign_extend:
> +   (minus: (match_dup 1) (match_dup 2)
> +   (set (match_operand: 0 "nonimmediate_operand" "=ro,r")
> +   (minus: (match_dup 1) (match_dup 2)))]
> +  "ix86_binary_operator_ok (MINUS, mode, operands)"
> +  "#"
> +  "reload_completed"
> +  [(parallel [(set (reg:CC FLAGS_REG)
> +  (compare:CC (match_dup 1) (match_dup 2)))
> + (set (match_dup 0)
> +  (minus:DWIH (match_dup 1) (match_dup 2)))])
> +   (parallel [(set (reg:CCO 

[C++ PATCH 3/3] Add TARGET_EXPR_DIRECT_INIT_P sanity check.

2020-01-08 Thread Jason Merrill
The previous patch fixes an instance of directly expanding a TARGET_EXPR that
has TARGET_EXPR_DIRECT_INIT_P set, which should never happen.  So let's check
for any other instances.

Tested x86_64-pc-linux-gnu, applying to trunk.

* cp-gimplify.c (cp_gimplify_expr) [TARGET_EXPR]: Check
TARGET_EXPR_DIRECT_INIT_P.
* constexpr.c (cxx_eval_constant_expression): Likewise.
---
 gcc/cp/constexpr.c   | 1 +
 gcc/cp/cp-gimplify.c | 7 +++
 2 files changed, 8 insertions(+)

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 5fe6d0277b6..9306a7dce4a 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -5312,6 +5312,7 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, 
tree t,
  *non_constant_p = true;
  break;
}
+  gcc_checking_assert (!TARGET_EXPR_DIRECT_INIT_P (t));
   /* Avoid evaluating a TARGET_EXPR more than once.  */
   if (tree *p = ctx->global->values.get (TARGET_EXPR_SLOT (t)))
{
diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index 1d2a77d2c0a..827d240d11a 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -925,6 +925,13 @@ cp_gimplify_expr (tree *expr_p, gimple_seq *pre_p, 
gimple_seq *post_p)
}
   break;
 
+case TARGET_EXPR:
+  /* A TARGET_EXPR that expresses direct-initialization should have been
+elided by cp_gimplify_init_expr.  */
+  gcc_checking_assert (!TARGET_EXPR_DIRECT_INIT_P (*expr_p));
+  ret = GS_UNHANDLED;
+  break;
+
 case RETURN_EXPR:
   if (TREE_OPERAND (*expr_p, 0)
  && (TREE_CODE (TREE_OPERAND (*expr_p, 0)) == INIT_EXPR
-- 
2.18.1



[C++ PATCH 1/3] Remove constexpr support for DECL_BY_REFERENCE.

2020-01-08 Thread Jason Merrill
Since we switched to doing constexpr evaluation on pre-GENERIC trees,
we don't have to handle DECL_BY_REFERENCE.

Tested x86_64-pc-linux-gnu, applying to trunk.

* constexpr.c (cxx_eval_call_expression): Remove DECL_BY_REFERENCE
support.
---
 gcc/cp/constexpr.c | 17 +++--
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 417af182a2a..806d3ab2cff 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -2333,17 +2333,8 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree 
t,
  remapped = DECL_CHAIN (remapped);
}
  /* Add the RESULT_DECL to the values map, too.  */
- tree slot = NULL_TREE;
- if (DECL_BY_REFERENCE (res))
-   {
- slot = AGGR_INIT_EXPR_SLOT (t);
- tree addr = build_address (slot);
- addr = build_nop (TREE_TYPE (res), addr);
- ctx->global->values.put (res, addr);
- ctx->global->values.put (slot, NULL_TREE);
-   }
- else
-   ctx->global->values.put (res, NULL_TREE);
+ gcc_assert (!DECL_BY_REFERENCE (res));
+ ctx->global->values.put (res, NULL_TREE);
 
  /* Track the callee's evaluated SAVE_EXPRs and TARGET_EXPRs so that
 we can forget their values after the call.  */
@@ -2370,7 +2361,7 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree 
t,
result = void_node;
  else
{
- result = *ctx->global->values.get (slot ? slot : res);
+ result = *ctx->global->values.get (res);
  if (result == NULL_TREE && !*non_constant_p)
{
  if (!ctx->quiet)
@@ -2409,8 +2400,6 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree 
t,
 one constexpr evaluation?  If so, maybe also clear out
 other vars from call, maybe in BIND_EXPR handling?  */
  ctx->global->values.remove (res);
- if (slot)
-   ctx->global->values.remove (slot);
  for (tree parm = parms; parm; parm = TREE_CHAIN (parm))
ctx->global->values.remove (parm);
 

base-commit: f83c72fa36479084490f9a62adb4ed20aef72907
-- 
2.18.1



[C++ PATCH 2/3] PR c++/91369 - constexpr destructor and member initializer.

2020-01-08 Thread Jason Merrill
Previously it didn't matter whether we looked through a TARGET_EXPR in
constexpr evaluation, but now that we have constexpr destructors it does.
On IRC I mentioned the idea of clearing TARGET_EXPR_CLEANUP in
digest_nsdmi_init, but since this initialization is expressed by an
INIT_EXPR, it's better to handle all INIT_EXPR, not just those for a member
initializer.

Tested x86_64-pc-linux-gnu, applying to trunk.

* constexpr.c (cxx_eval_store_expression): Look through TARGET_EXPR
when not preevaluating.
---
 gcc/cp/constexpr.c   |  6 ++
 gcc/testsuite/g++.dg/cpp2a/constexpr-new10.C | 19 +++
 2 files changed, 25 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/constexpr-new10.C

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 806d3ab2cff..5fe6d0277b6 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -4577,6 +4577,12 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, 
tree t,
}
   new_ctx.ctor = *valp;
   new_ctx.object = target;
+  /* Avoid temporary materialization when initializing from a TARGET_EXPR.
+We don't need to mess with AGGR_EXPR_SLOT/VEC_INIT_EXPR_SLOT because
+expansion of those trees uses ctx instead.  */
+  if (TREE_CODE (init) == TARGET_EXPR)
+   if (tree tinit = TARGET_EXPR_INITIAL (init))
+ init = tinit;
   init = cxx_eval_constant_expression (_ctx, init, false,
   non_constant_p, overflow_p);
   if (ctors->is_empty())
diff --git a/gcc/testsuite/g++.dg/cpp2a/constexpr-new10.C 
b/gcc/testsuite/g++.dg/cpp2a/constexpr-new10.C
new file mode 100644
index 000..500a3240c8f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/constexpr-new10.C
@@ -0,0 +1,19 @@
+// PR c++/91369
+// { dg-do compile { target c++2a } }
+
+struct S {
+  constexpr S (int* i) : s{i} {}
+  constexpr ~S () { delete s; }
+  int *s;
+};
+
+struct T { S t = { new int }; };
+
+constexpr auto
+foo ()
+{
+  T b;
+  return true;
+}
+
+static_assert (foo ());
-- 
2.18.1



Re: [Patch 0/X] HWASAN v3

2020-01-08 Thread Kostya Serebryany via gcc-patches
[asan/hwasan co-author here, with clearly biased opinions]

On Android, HWASAN is already a fully usable testing tool.
We apply it to the kernel, user space system libraries, and select apps.
A phone with HWASAN-ified system is fully usable (I carry one as my
primary device since March 2019).
HWASAN has discovered over 120 bugs by now (heap-use-after-free,
heap/stack buffer overflows, stack-use-after-return, double free).
Many of the bugs were discovered during everyday use (as opposed
to testing in the lab).
The overhead is low enough that on a top-tier CPU the user will rarely
notice any slowdown (the increased battery drain *is* noticeable -
compiler instrumentation is not a substitute for hardware).
HWASAN has also helped discover 4 instances of future incompatibility
with MTE, all fixed.

The main benefit of HWASAN over ASAN is, as Matthew correctly
explains, the memory usage.
On embedded devices, this is often the difference between "can't
deploy" and "can deploy"
because, unlike in the server land, you can't install more RAM.

The other, more subtle benefit is that HWASAN is more sensitive to
some types of bugs, such as buffer-overflow-far-from-bounds or
use-after-long-ago-free.

MTE hardware is years away. Even once we have it in major new devices,
many smaller devices will still be running on Arm v8, for a decade or two.
As with ASAN/TSAN/UBSAN, having this sanitizer implemented in GCC will
vastly extend its user base and applicability and thus contribute to
the overall code quality and security.

Whether HWASAN should intercept libc functions or libc itself should
support HWASAN...
My strong opinion is that today the interception approach can only be
seen as a way to prototype.
ASAN, implemented in 2011, had to use interception because we needed
to get a new idea working fast.
However, over these 9 years, interception has caused an enormous
amount of complexity and user dissatisfaction.
The Android implementation of HWASAN (with hooks in the Bionic libc
and no interceptors) is many times simpler, more robust, and more
complete.
We need to do the same for other libcs, eventually, but we don't have
to do it immediately.

--kcc
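
To make the tag-matching idea concrete, here is a toy simulation.  It
assumes the usual hwasan layout of an 8-bit tag in the pointer's top
byte and one shadow tag per 16-byte granule; everything else
(tagged_heap, the bump allocator, sequential rather than random tags) is
a simplification invented for this sketch and is not the runtime's
actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

const std::size_t granule_size = 16;  // bytes covered by one shadow tag
const unsigned tag_shift = 56;        // tag lives in the pointer's top byte

// Toy heap: only the shadow tags are modelled, not the data itself.
class tagged_heap
{
public:
  explicit tagged_heap (std::size_t granules)
    : m_shadow (granules, 0), m_next (0), m_next_tag (1) {}

  // Bump-allocate SIZE bytes and return a "pointer" whose top byte
  // carries the tag written to the covering shadow entries.
  std::uint64_t allocate (std::size_t size)
  {
    std::size_t n = (size + granule_size - 1) / granule_size;
    assert (n > 0 && m_next + n <= m_shadow.size ());
    std::uint8_t tag = fresh_tag ();
    for (std::size_t i = 0; i < n; ++i)
      m_shadow[m_next + i] = tag;
    std::uint64_t ptr = (std::uint64_t (tag) << tag_shift)
			| std::uint64_t (m_next * granule_size);
    m_next += n;
    return ptr;
  }

  // Freeing retags the memory, so stale pointers stop matching.
  void release (std::uint64_t ptr, std::size_t size)
  {
    std::size_t first = address_of (ptr) / granule_size;
    std::size_t n = (size + granule_size - 1) / granule_size;
    assert (first + n <= m_shadow.size ());
    std::uint8_t tag = fresh_tag ();
    for (std::size_t i = 0; i < n; ++i)
      m_shadow[first + i] = tag;
  }

  // An access is valid only if the pointer tag equals the shadow tag.
  bool access_ok (std::uint64_t ptr) const
  {
    std::size_t g = address_of (ptr) / granule_size;
    return g < m_shadow.size ()
	   && m_shadow[g] == std::uint8_t (ptr >> tag_shift);
  }

private:
  static std::uint64_t address_of (std::uint64_t ptr)
  {
    return ptr & ((std::uint64_t (1) << tag_shift) - 1);
  }

  // Sequential tags keep the demo deterministic.
  std::uint8_t fresh_tag ()
  {
    std::uint8_t tag = m_next_tag;
    m_next_tag = (m_next_tag == 255) ? 1 : m_next_tag + 1;
    return tag;
  }

  std::vector<std::uint8_t> m_shadow;
  std::size_t m_next;
  std::uint8_t m_next_tag;
};
```

In this model an overflow that lands deep inside a neighbouring object
is still rejected, because the neighbour carries a different tag; a
redzone-based scheme only notices accesses that hit the redzone.  One
tag byte per 16 bytes is also where the 6.25% shadow-memory figure in
Matthew's estimate quoted below comes from.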





On Wed, Jan 8, 2020 at 3:26 AM Matthew Malcomson
 wrote:
>
> Hi everyone,
>
> I'm writing this email to summarise & publicise the state of this patch
> series, especially the difficulties around approval for GCC 10 mentioned
> on IRC.
>
>
> The main obstacle seems to be that no maintainer feels they have enough
> knowledge about hwasan and justification that it's worthwhile to approve
> the patch series.
>
> Similarly, Martin has given a review of the parts of the code he can
> (thanks!), but doesn't feel he can do a deep review of the code related
> to the RTL hooks and stack expansion -- hence that part is as yet not
> reviewed in-depth.
>
>
>
> The questions around justification raised on IRC are mainly that it
> seems like a proof-of-concept for MTE rather than a stand-alone useable
> sanitizer.  Especially since in the GNU world hwasan instrumented code
> is not really ready for production since we can only use the
> less-"interceptor ABI" rather than the "platform ABI".  This restriction
> is because there is no version of glibc with the required modifications
> to provide the "platform ABI".
>
> (n.b. that since https://reviews.llvm.org/D69574 the code-generation for
> these ABI's is the same).
>
>
>  From my perspective the reasons that make HWASAN useful in itself are:
>
> 1) Much less memory usage.
>
>  From a back-of-the-envelope calculation based on the hwasan paper's
> table of memory overhead from over-alignment
> https://arxiv.org/pdf/1802.09517.pdf  I guess hwasan instrumented code
> has an overhead of about 1.1x (~4% from overalignment and ~6.25% from
> shadow memory), while asan seems to have an overhead somewhere in the
> range 1.5x - 3x.
>
> Maybe there's some data out there comparing total overheads that I
> haven't found? (I'd appreciate a reference if anyone has that info).
>
>
>
> 2) Available on more architectures than MTE.
>
> HWASAN only requires TBI, which is a feature of all AArch64 machines,
> while MTE will be an optional extension and only available on certain
> architectures.
>
>
> 3) This enables using hwasan in the kernel.
>
> While instrumented user-space applications will be using the
> "interceptor ABI" and hence are likely not production-quality, the
> biggest aim of implementing hwasan in GCC is to allow building the Linux
> kernel with tag-based sanitization using GCC.
>
> Instrumented kernel code uses hooks in the kernel itself, so this ABI
> distinction is no longer relevant, and this sanitizer should produce a
> production-quality kernel binary.
>
>
>
>
> I'm hoping I can find a maintainer willing to review and ACK this patch
> series -- especially with stage3 coming to a close soon.  If there's
> anything else I could do to help get someone willing up-to-speed then
> please just ask.
>
>
> Cheers,
> Matthew
>
>
>
> On 

[committed] Make Wstringop-overflow-27 testnames unique [was Re: [PING 3][PATCH] track dynamic allocation in strlen (PR 91582)]

2020-01-08 Thread Jeff Law
On Wed, 2020-01-08 at 12:52 +0100, Andreas Schwab wrote:
> On Dec 06 2019, Martin Sebor wrote:
> 
> > diff --git a/gcc/testsuite/gcc.dg/Wstringop-overflow-27.c 
> > b/gcc/testsuite/gcc.dg/Wstringop-overflow-27.c
> > new file mode 100644
> > index 000..249ce2b6ad5
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/Wstringop-overflow-27.c
> > +void test_strcpy_warn (const char *s)
> > +{
> > +  {
> > +const char a[] = "123";
> > +/* Verify that using signed int for the strlen result works (i.e.,
> > +   that the conversion from signed int to size_t doesn't prevent
> > +   the detection.  */
> > +int n = strlen (a);
> > +char *t = (char*)calloc (n, 1); // { dg-message "at offset 0 to an 
> > object with size 3 allocated by 'calloc' here" "calloc note" { xfail *-*-* 
> > } }
> > +// { dg-message "at offset 0 to an 
> > object with size at most 3 allocated by 'calloc' here" "calloc note" { 
> > target *-*-* } .-1 }
> 
> Please make the test name unique.
> 
> > +strcpy (t, a);  // { dg-warning "writing 4 bytes 
> > into a region of size (between 0 and )?3 " }
> > +
> > +sink (t);
> > +  }
> > +
> > +  {
> > +const char a[] = "1234";
> > +size_t n = strlen (a);
> > +char *t = (char*)malloc (n);// { dg-message "at offset 0 to an 
> > object with size 4 allocated by 'malloc' here" "malloc note" { xfail *-*-* 
> > } }
> > +// { dg-message "at offset 0 to an 
> > object with size at most 4 allocated by 'malloc' here" "malloc note" { 
> > target *-*-* } .-1 }
> 
> Likewise.
Fixed via the attached patch which I've committed to the trunk.

jeff

commit 48e76be17adbf93fe264fc118adbcf2ae6a14806
Author: law 
Date:   Wed Jan 8 18:46:33 2020 +

* gcc.dg/Wstringop-overflow-27.c: Make testnames unique.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@280016 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 537091ffec6..622589e3db6 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2020-01-08  Jeff Law  
+
+   * gcc.dg/Wstringop-overflow-27.c: Make testnames unique.
+
 2020-01-08  Joel Brobecker  
 Olivier Hainque  
 
diff --git a/gcc/testsuite/gcc.dg/Wstringop-overflow-27.c 
b/gcc/testsuite/gcc.dg/Wstringop-overflow-27.c
index 249ce2b6ad5..8e2cfe30725 100644
--- a/gcc/testsuite/gcc.dg/Wstringop-overflow-27.c
+++ b/gcc/testsuite/gcc.dg/Wstringop-overflow-27.c
@@ -260,8 +260,8 @@ void test_strcpy_warn (const char *s)
that the conversion from signed int to size_t doesn't prevent
the detection.  */
 int n = strlen (a);
-char *t = (char*)calloc (n, 1); // { dg-message "at offset 0 to an 
object with size 3 allocated by 'calloc' here" "calloc note" { xfail *-*-* } }
-// { dg-message "at offset 0 to an 
object with size at most 3 allocated by 'calloc' here" "calloc note" { target 
*-*-* } .-1 }
+char *t = (char*)calloc (n, 1); // { dg-message "at offset 0 to an 
object with size 3 allocated by 'calloc' here" "calloc note 1" { xfail *-*-* } }
+// { dg-message "at offset 0 to an 
object with size at most 3 allocated by 'calloc' here" "calloc note 2" { target 
*-*-* } .-1 }
 strcpy (t, a);  // { dg-warning "writing 4 bytes into 
a region of size (between 0 and )?3 " }
 
 sink (t);
@@ -270,8 +270,8 @@ void test_strcpy_warn (const char *s)
   {
 const char a[] = "1234";
 size_t n = strlen (a);
-char *t = (char*)malloc (n);// { dg-message "at offset 0 to an 
object with size 4 allocated by 'malloc' here" "malloc note" { xfail *-*-* } }
-// { dg-message "at offset 0 to an 
object with size at most 4 allocated by 'malloc' here" "malloc note" { target 
*-*-* } .-1 }
+char *t = (char*)malloc (n);// { dg-message "at offset 0 to an 
object with size 4 allocated by 'malloc' here" "malloc note 1" { xfail *-*-* } }
+// { dg-message "at offset 0 to an 
object with size at most 4 allocated by 'malloc' here" "malloc note 2" { target 
*-*-* } .-1 }
 strcpy (t, a);  // { dg-warning "writing 5 bytes into 
a region of size (between 0 and )?4 " }
 sink (t);
   }


[committed] hash-map-tests.c: fix memory leak

2020-01-08 Thread David Malcolm
This commit makes "make selftest-valgrind" clean by fixing this leak:

4 bytes in 1 blocks are definitely lost in loss record 1 of 734
   at 0x483AB1A: calloc (vg_replace_malloc.c:762)
   by 0x261DBE0: xcalloc (xmalloc.c:162)
   by 0x2538C46: selftest::test_map_of_strings_to_int() (hash-map-tests.c:87)
   by 0x253ABD2: selftest::hash_map_tests_c_tests() (hash-map-tests.c:307)
   by 0x24A885B: selftest::run_tests() (selftest-run-tests.c:65)
   by 0x1373D80: toplev::run_self_tests() (toplev.c:2339)
   by 0x1373FA7: toplev::main(int, char**) (toplev.c:2421)
   by 0x2550EFF: main (main.c:39)

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

Committed to trunk as r280015 under the "obvious" rule.

gcc/ChangeLog:
* hash-map-tests.c (selftest::test_map_of_strings_to_int): Fix
memory leak.
---
 gcc/hash-map-tests.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/hash-map-tests.c b/gcc/hash-map-tests.c
index c351bb22ddba..635740290658 100644
--- a/gcc/hash-map-tests.c
+++ b/gcc/hash-map-tests.c
@@ -101,6 +101,8 @@ test_map_of_strings_to_int ()
   ASSERT_EQ (1, string_map.elements ());
   ASSERT_EQ (true, string_map.put (another_ant, 5));
   ASSERT_EQ (1, string_map.elements ());
+
+  free (another_ant);
 }
 
 /* Construct a hash_map using int_hash and verify that
-- 
2.21.0
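
The ownership rule behind this fix can be sketched outside GCC as
follows.  The example uses std::unordered_map rather than GCC's
hash_map, and cstr_hash, cstr_eq, string_map and update_through_copy
are hypothetical names; the point is only that a map keyed on char
pointers never takes ownership of a heap-allocated lookup key.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <functional>
#include <string>
#include <unordered_map>

// Hash and compare C strings by content, not by pointer value.
struct cstr_hash
{
  std::size_t operator() (const char *s) const
  {
    return std::hash<std::string> () (s);
  }
};

struct cstr_eq
{
  bool operator() (const char *a, const char *b) const
  {
    return std::strcmp (a, b) == 0;
  }
};

typedef std::unordered_map<const char *, int, cstr_hash, cstr_eq> string_map;

// Update an existing entry through a heap-allocated copy of KEY, the way
// a test might do to prove that lookup is by content.  The map keeps the
// key pointer it already stored, so the copy must be freed here.
inline bool
update_through_copy (string_map &m, const char *key, int value)
{
  assert (key != nullptr);
  std::size_t len = std::strlen (key);
  char *copy = static_cast<char *> (std::calloc (len + 1, 1));
  if (!copy)
    return false;
  std::memcpy (copy, key, len);

  string_map::iterator it = m.find (copy);
  bool found = it != m.end ();
  if (found)
    it->second = value;

  std::free (copy);  // without this, the copy leaks
  return found;
}
```

The copy is only freed safely because it is used for lookup alone; had
it been inserted as a new key, the map would still refer to it.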



Re: [RFC] IVOPTs select cand with preferred D-form access

2020-01-08 Thread Segher Boessenkool
On Wed, Jan 08, 2020 at 07:12:29PM +0800, Bin.Cheng wrote:
> I am a bit worried that would make IVOPTs heavy too,

Yeah.  And ivopts already *is* heavy, by nature of what it does.  Giving
it extra work to do is not a good idea imo.

> it might be
> possible to compute heuristics whether loop should be unrolled as a
> post-IVOPTs transformation.

Right.  Either ivopts or something before it (it can be a separate pass
perhaps) should figure out which loops should be unrolled how much.
The ivopts pass can then use that information, but the actual unrolling
can be done much later (maybe even as late in RTL as it is now).

> Of course the transformation needs to do
> more work than simply unrolling in order to take advantage of
> aforementioned addressing mode.
> BTW, unrolled loop won't perform as well as on ppc if the target doesn't
> support [base + register + offset] addressing mode?

PowerPC only has [reg+imm] and [reg+reg] addressing (and both of those
can do plain [reg]).  There is no [reg+reg+imm].

> Another point, in case of multiple passes doing unrolling, the
> "already unrolled" information may need to be recorded as a flag of
> loop properties.

Right, or at least store how much the current plan says to unroll each
loop, so that all passes can take that into account, or even adjust it
if that is a good idea.


Segher
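
As a source-level picture of the interaction being discussed, the sketch
below shows a loop unrolled by four in which every access in the body is
the same base pointer plus a small constant offset, which is the shape a
[reg+imm] addressing mode can serve from one induction variable.  The
function names are hypothetical, and whether a given compiler actually
emits such accesses depends on the target and on optimization choices.

```cpp
#include <cassert>
#include <cstddef>

// Straightforward loop: one element per iteration.
inline long
sum_rolled (const int *p, std::size_t n)
{
  long s = 0;
  for (std::size_t i = 0; i < n; ++i)
    s += p[i];
  return s;
}

// Unrolled by four.  Inside the body every load is q plus a constant
// offset (0, 4, 8 and 12 bytes for a 4-byte int), so a single pointer
// increment per iteration is enough.
inline long
sum_unrolled4 (const int *p, std::size_t n)
{
  long s = 0;
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4)
    {
      const int *q = p + i;
      s += q[0];
      s += q[1];
      s += q[2];
      s += q[3];
    }
  // Remainder loop for the last n % 4 elements.
  for (; i < n; ++i)
    s += p[i];
  assert (i == n);
  return s;
}
```

If the unroll factor is decided before induction-variable selection, the
selection can cost the body as four offset accesses instead of four
separately incremented addresses.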


Re: [PING 3][PATCH] track dynamic allocation in strlen (PR 91582)

2020-01-08 Thread Jeff Law
On Wed, 2020-01-08 at 12:52 +0100, Andreas Schwab wrote:
> On Dec 06 2019, Martin Sebor wrote:
> 
> > diff --git a/gcc/testsuite/gcc.dg/Wstringop-overflow-27.c 
> > b/gcc/testsuite/gcc.dg/Wstringop-overflow-27.c
> > new file mode 100644
> > index 000..249ce2b6ad5
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/Wstringop-overflow-27.c
> > +void test_strcpy_warn (const char *s)
> > +{
> > +  {
> > +const char a[] = "123";
> > +/* Verify that using signed int for the strlen result works (i.e.,
> > +   that the conversion from signed int to size_t doesn't prevent
> > +   the detection.  */
> > +int n = strlen (a);
> > +char *t = (char*)calloc (n, 1); // { dg-message "at offset 0 to an 
> > object with size 3 allocated by 'calloc' here" "calloc note" { xfail *-*-* 
> > } }
> > +// { dg-message "at offset 0 to an 
> > object with size at most 3 allocated by 'calloc' here" "calloc note" { 
> > target *-*-* } .-1 }
> 
> Please make the test name unique.
I've got a patch to do that in my local tree.  I'll push it
momentarily.

jeff



Re: [PATCH] [amdgcn] Add support for sub-word sync_compare_and_swap operations

2020-01-08 Thread Kwok Cheung Yeung

On 08/01/2020 11:42 am, Andrew Stubbs wrote:

On 08/01/2020 11:07, Kwok Cheung Yeung wrote:

+#define __sync_subword_compare_and_swap(type, size)    \


Macro parameters are conventionally upper case.



Fixed. I upper-cased the macro name as well.


+    \
+type    \
+__sync_val_compare_and_swap_##size (type *ptr, type oldval, type 
newval)    \

+{    \
+  unsigned int *wordptr    \
+    = (unsigned int *)((unsigned long long) ptr & ~3ULL);    \


Please use "intptr_t" rather than "unsigned long long" (which should 
probably have been "unsigned long" anyway).




I used uintptr_t instead as we are doing unsigned operations (but it 
probably doesn't matter anyway).


+__sync_bool_compare_and_swap_##size (type *ptr, type oldval, type 
newval)   \

+{    \
+  return __sync_val_compare_and_swap_##size(ptr, oldval, newval) == 
oldval; \


Space before '('.



Fixed.

Is this version okay for trunk?

Thanks

Kwok
From a163377f719e950b0d3820b703029d133ba83637 Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung 
Date: Thu, 21 Nov 2019 03:54:46 -0800
Subject: [PATCH] [amdgcn] Add support for sub-word sync_compare_and_swap
 operations

2020-01-08  Kwok Cheung Yeung  

libgcc/
* config/gcn/atomic.c: New.
* config/gcn/t-amdgcn (LIB2ADD): Add atomic.c.
---
 libgcc/config/gcn/atomic.c | 60 ++
 libgcc/config/gcn/t-amdgcn |  3 ++-
 2 files changed, 62 insertions(+), 1 deletion(-)
 create mode 100644 libgcc/config/gcn/atomic.c

diff --git a/libgcc/config/gcn/atomic.c b/libgcc/config/gcn/atomic.c
new file mode 100644
index 000..214c9a5
--- /dev/null
+++ b/libgcc/config/gcn/atomic.c
@@ -0,0 +1,60 @@
+/* AMD GCN atomic operations
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   Contributed by Mentor Graphics.
+
+   This file is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by the
+   Free Software Foundation; either version 3, or (at your option) any
+   later version.
+
+   This file is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+#include 
+#include 
+
+#define __SYNC_SUBWORD_COMPARE_AND_SWAP(TYPE, SIZE) \
+\
+TYPE\
+__sync_val_compare_and_swap_##SIZE (TYPE *ptr, TYPE oldval, TYPE newval) \
+{   \
+  unsigned int *wordptr = (unsigned int *)((uintptr_t) ptr & ~3UL); \
+  int shift = ((uintptr_t) ptr & 3UL) * 8;  \
+  unsigned int valmask = (1 << (SIZE * 8)) - 1;  \
+  unsigned int wordmask = ~(valmask << shift);  \
+  unsigned int oldword = *wordptr;  \
+  for (;;)  \
+{   \
+  TYPE prevval = (oldword >> shift) & valmask;  \
+  if (__builtin_expect (prevval != oldval, 0))  \
+   return prevval;  \
+  unsigned int newword = oldword & wordmask;\
+  newword |= ((unsigned int) newval) << shift;  \
+  unsigned int prevword \
+ = __sync_val_compare_and_swap_4 (wordptr, oldword, newword);   \
+  if (__builtin_expect (prevword == oldword, 1))\
+   return oldval;   \
+  oldword = prevword;   \
+}   \
+}   \
+\
+bool 
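
The technique used by the macro above (emulating a sub-word compare-and-swap with a 32-bit compare-and-swap on the containing aligned word) can be sketched for the single-byte case as follows. This is only an illustration, not the libgcc code: the name subword_cas_u8 is hypothetical, the sketch assumes a little-endian target (as the shift computation in the patch does), and it relies on the GCC/Clang builtin __sync_val_compare_and_swap.

```cpp
#include <cstdint>

// Hypothetical helper (not the libgcc function): compare-and-swap on one
// byte, built from a 32-bit compare-and-swap on the aligned word that
// contains it.  Assumes a little-endian target.
inline std::uint8_t
subword_cas_u8 (std::uint8_t *ptr, std::uint8_t oldval, std::uint8_t newval)
{
  const std::uintptr_t addr = reinterpret_cast<std::uintptr_t> (ptr);
  // Round the address down to the containing 4-byte word.
  auto *wordptr
    = reinterpret_cast<std::uint32_t *> (addr & ~std::uintptr_t (3));
  // Bit position of the byte within that word.
  const unsigned shift = static_cast<unsigned> (addr & 3) * 8;
  const std::uint32_t valmask = 0xffu;
  const std::uint32_t wordmask = ~(valmask << shift);

  std::uint32_t oldword = *wordptr;
  for (;;)
    {
      const auto prevval
	= static_cast<std::uint8_t> ((oldword >> shift) & valmask);
      // The byte does not hold the expected value: fail without writing.
      if (prevval != oldval)
	return prevval;
      // Splice the new byte into a copy of the word.
      const std::uint32_t newword
	= (oldword & wordmask) | (std::uint32_t (newval) << shift);
      const std::uint32_t prevword
	= __sync_val_compare_and_swap (wordptr, oldword, newword);
      if (prevword == oldword)
	return oldval;
      // A neighbouring byte (or this one) changed meanwhile; retry with
      // the word that was actually observed.
      oldword = prevword;
    }
}
```

The retry is needed because the word-sized operation can fail when only a neighbouring byte changed; in that case the loop re-checks the target byte against the freshly observed word. In this sketch the byte should live inside an object that is actually a 32-bit word, so that the word access is well defined in C++.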

[PATCH] avoid warning on vectorized past-the-end stores (PR 93200)

2020-01-08 Thread Martin Sebor

A recent improvement to the vectorizer (r278334 if my bisection
is right) can transform multiple stores to adjacent struct members
into single vectorized assignments that write over all the members
in a single MEM_REF.  These are then flagged by -Wstringop-overflow
thanks to its also recently enhanced past-the-end store detection.
The warnings have been causing failures in some of Jeff's periodic
builds (e.g., in cjdns-v20.4).

Reliably distinguishing these transformed, multi-member, MEM_REF
stores from accidental bugs the warning is designed to detect will
require annotating them somehow at the time they are introduced.
Until that happens, the attached patch simply tweaks the logic that
determines the size of the destination objects to punt on these
vectorized MEM_REFs.

Martin
PR middle-end/93200 - spurious -Wstringop-overflow due to assignment vectorization to multiple members

gcc/testsuite/ChangeLog:

	PR middle-end/93200
	* gcc.dg/Wstringop-overflow-30.c: New test.

gcc/ChangeLog:

	PR middle-end/93200
	* builtins.c (compute_objsize): Avoid handling MEM_REFs of vector type.

Index: gcc/builtins.c
===
--- gcc/builtins.c	(revision 280012)
+++ gcc/builtins.c	(working copy)
@@ -3966,6 +3966,18 @@
   || TREE_CODE (dest) == MEM_REF)
 {
   tree ref = TREE_OPERAND (dest, 0);
+  tree reftype = TREE_TYPE (ref);
+  if (TREE_CODE (dest) == MEM_REF && TREE_CODE (reftype) == POINTER_TYPE)
+	{
+	  /* Give up for MEM_REFs of vector types; those may be synthesized
+	 from multiple assignments to consecutive data members.  See PR
+	 93200.
+	 FIXME: Deal with this more generally, e.g., by marking up such
+	 MEM_REFs at the time they're created.  */
+	  reftype = TREE_TYPE (reftype);
+	  if (TREE_CODE (reftype) == VECTOR_TYPE)
+	return NULL_TREE;
+	}
   tree off = TREE_OPERAND (dest, 1);
   if (tree size = compute_objsize (ref, ostype, pdecl, poff))
 	{
Index: gcc/testsuite/gcc.dg/Wstringop-overflow-30.c
===
--- gcc/testsuite/gcc.dg/Wstringop-overflow-30.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/Wstringop-overflow-30.c	(working copy)
@@ -0,0 +1,80 @@
+/* PR middle-end/93200 - spurious -Wstringop-overflow due to assignment
+   vectorization to multiple members
+   { dg-do compile }
+   { dg-options "-O3 -Wall" } */
+
+typedef __INT8_TYPE__  int8_t;
+typedef __INT16_TYPE__ int16_t;
+typedef __INT32_TYPE__ int32_t;
+typedef __INT64_TYPE__ int64_t;
+
+struct A { char b, c; };
+struct B1A { int8_t i8; struct A a; };
+struct B2A { int16_t i16; struct A a; };
+struct B3A { int16_t i16; int8_t i8; struct A a; };
+struct B4A { int64_t i64; struct A a; };
+
+void ba1 (struct B1A *p)
+{
+  p->a.b = 0; p->a.c = 1;
+}
+
+void b2a (struct B2A *p)
+{
+  /* This results in:
+ vector(2) char *vectp.14_6 = _2(D)->a.b;
+ MEM  [(char *)vectp.14_6] = { 4, 5 };  */
+
+  p->a.b = 4;   // { dg-bogus "-Wstringop-overflow" }
+  p->a.c = 5;
+}
+
+void b3a (struct B3A *p)
+{
+  p->a.b = 4; p->a.c = 5;
+}
+
+void b4a (struct B4A *p)
+{
+  /* This results in:
+ vector(2) char *vectp.22_6 = _2(D)->a.b;
+ MEM  [(char *)vectp.22_6] = { 6, 7 };  */
+
+  p->a.b = 6;   // { dg-bogus "-Wstringop-overflow" }
+  p->a.c = 7;
+}
+
+
+struct Aa { char a[2], b[2]; };
+struct B1Aa { int8_t i8; struct Aa a; };
+struct B2Aa { int16_t i16; struct Aa a; };
+struct B3Aa { int16_t i16; int8_t i8; struct Aa a; };
+struct B4Aa { int64_t i64; struct Aa a; };
+
+void b1aa (struct B1Aa *p)
+{
+  p->a.a[0] = 0; p->a.a[1] = 1;
+  p->a.b[0] = 0; p->a.b[1] = 1;
+}
+
+void b2aa (struct B2Aa *p)
+{
+  p->a.a[0] = 2; p->a.a[1] = 3;
+  p->a.b[0] = 2; p->a.b[1] = 3;
+}
+
+void b3aa (struct B3Aa *p)
+{
+  p->a.a[0] = 4; p->a.a[1] = 5;
+  p->a.b[0] = 4; p->a.b[1] = 5;
+}
+
+void b4aa (struct B4Aa *p)
+{
+  /* This results in:
+ vector(4) char *vectp.36_8 = _2(D)->a.a[0];
+ MEM  [(char *)vectp.36_8] = { 6, 7, 6, 7 };  */
+
+  p->a.a[0] = 6; p->a.a[1] = 7;
+  p->a.b[0] = 6; p->a.b[1] = 7;
+}


Re: [PATCH] libstdc++: Fix error handling in filesystem::remove_all (PR93201)

2020-01-08 Thread Jonathan Wakely

On 08/01/20 16:44 +, Jonathan Wakely wrote:

When recursing into a directory, any errors that occur while removing a
directory entry are ignored, because the subsequent increment of the
directory iterator clears the error_code object.

This fixes that bug by checking the result of each recursive operation
before incrementing. This is a change in observable behaviour, because
previously other directory entries would still be removed even if one
(or more) couldn't be removed due to errors. Now the operation stops on
the first error, which is what the code intended to do all along. The
standard doesn't specify what happens in this case (because the order
that the entries are processed is unspecified anyway).

It also improves the error reporting so that the name of the file that
could not be removed is included in the filesystem_error exception. This
is done by introducing a new helper type for reporting errors with
additional context and a new function that uses that type. Then the
overload of std::filesystem::remove_all that throws an exception can use
the new function to ensure any exception contains the additional
information.

For std::experimental::filesystem::remove_all just fix the bug where
errors are ignored.

PR libstdc++/93201
* src/c++17/fs_ops.cc (do_remove_all): New function implementing more
detailed error reporting for remove_all. Check result of recursive
call before incrementing iterator.
(remove_all(const path&), remove_all(const path&, error_code&)): Use
do_remove_all.
* src/filesystem/ops.cc (remove_all(const path&, error_code&)): Check
result of recursive call before incrementing iterator.
* testsuite/27_io/filesystem/operations/remove_all.cc: Check errors
are reported correctly.
* testsuite/experimental/filesystem/operations/remove_all.cc: Likewise.


This is what I plan to commit to the branches. It doesn't improve the
error reporting, just fixes the bug. But that *does* still change the
observable behaviour when an error occurs (because now we return
immediately on the first error, instead of continuing to remove as
much as possible).

I've sent an email to the Library Evolution list to discuss the
desired behaviour. Does anybody here have an opinion? Should we
preserve the existing "keep going then report errors at the end"
behaviour for the branches, or just make this change?
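
For readers following along, the "stop on the first error" loop shape can be sketched outside the library like this. It is a simplified stand-in, not the libstdc++ implementation: the name remove_tree is hypothetical, and the extra check after the loop for a failed iterator construction is an assumption of this sketch.

```cpp
#include <cstdint>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical stand-in for the library function: recursively remove P and
// return the number of entries removed, or uintmax_t(-1) with EC set as
// soon as one operation fails.
inline std::uintmax_t
remove_tree (const fs::path &p, std::error_code &ec)
{
  constexpr auto failed = static_cast<std::uintmax_t> (-1);

  const fs::file_status s = fs::symlink_status (p, ec);
  if (!fs::status_known (s))
    return failed;
  ec.clear ();
  if (s.type () == fs::file_type::not_found)
    return 0;

  std::uintmax_t count = 0;
  if (s.type () == fs::file_type::directory)
    {
      fs::directory_iterator d (p, ec), end;
      while (!ec && d != end)
	{
	  // Check the recursive result *before* incrementing: increment()
	  // overwrites EC, which is how errors used to get lost.
	  const std::uintmax_t removed = remove_tree (d->path (), ec);
	  if (removed == failed)
	    return failed;
	  count += removed;
	  d.increment (ec);
	  if (ec)
	    return failed;
	}
      // Assumption of this sketch: also give up if the iterator could not
      // be constructed at all.
      if (ec)
	return failed;
    }

  if (fs::remove (p, ec))
    ++count;
  return ec ? failed : count;
}
```

The alternative "keep going" behaviour would instead remember the first error, continue the loop, and report the saved error once the directory has been fully traversed.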



commit 43899ee8cc95620b0d93560d4087ce3cc76d
Author: Jonathan Wakely 
Date:   Wed Jan 8 16:23:07 2020 +

libstdc++: Fix error handling in filesystem::remove_all (PR93201)

When recursing into a directory, any errors that occur while removing a
directory entry are ignored, because the subsequent increment of the
directory iterator clears the error_code object.

This fixes that bug by checking the result of each recursive operation
before incrementing. This is a change in observable behaviour, because
previously other directory entries would still be removed even if one
(or more) couldn't be removed due to errors. Now the operation stops on
the first error, which is what the code intended to do all along. The
standard doesn't specify what happens in this case (because the order
that the entries are processed is unspecified anyway).

PR libstdc++/93201
* src/c++17/fs_ops.cc (remove_all(const path&, error_code&)): Check
result of recursive call before incrementing iterator.
* src/filesystem/ops.cc (remove_all(const path&, error_code&)):
Likewise.
* testsuite/27_io/filesystem/operations/remove_all.cc: Check errors
are reported correctly.
* testsuite/experimental/filesystem/operations/remove_all.cc: Likewise.

diff --git a/libstdc++-v3/src/c++17/fs_ops.cc b/libstdc++-v3/src/c++17/fs_ops.cc
index d8064819d36..d918c2af530 100644
--- a/libstdc++-v3/src/c++17/fs_ops.cc
+++ b/libstdc++-v3/src/c++17/fs_ops.cc
@@ -1299,12 +1299,17 @@ fs::remove_all(const path& p, error_code& ec)
   uintmax_t count = 0;
   if (s.type() == file_type::directory)
 {
-  for (directory_iterator d(p, ec), end; !ec && d != end; d.increment(ec))
-	count += fs::remove_all(d->path(), ec);
-  if (ec.value() == ENOENT)
-	ec.clear();
-  else if (ec)
-	return -1;
+  directory_iterator d(p, ec), end;
+  while (!ec && d != end)
+	{
+	  const auto removed = fs::remove_all(d->path(), ec);
+	  if (removed == numeric_limits::max())
+	return -1;
+	  count += removed;
+	  d.increment(ec);
+	  if (ec)
+	return -1;
+	}
 }
 
   if (fs::remove(p, ec))
diff --git a/libstdc++-v3/src/filesystem/ops.cc b/libstdc++-v3/src/filesystem/ops.cc
index 36b5d2c24f6..a5887f37ce1 100644
--- a/libstdc++-v3/src/filesystem/ops.cc
+++ b/libstdc++-v3/src/filesystem/ops.cc
@@ -1098,12 +1098,17 @@ fs::remove_all(const path& p, error_code& ec) noexcept
   uintmax_t count = 0;
   if (s.type() == 

Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-08 Thread Jan Hubicka
> Hi,
> 
> On Fri, Jan 03 2020, Martin Liška wrote:
> > Hi.
> >
> > This is similar transformation for IPA passes. This time,
> > one needs to use opt_for_fn in order to get the right
> > parameter values.
> >
> > @Martin, Honza:
> > There are last few remaining parameters which should use
> > opt_for_fn:
> >
> > param_ipa_max_agg_items
> > param_ipa_cp_unit_growth
> > param_ipa_sra_max_replacements
> 
> sent in previous separate email messages.  All of those patches passed
> bootstrap and testing and LTO bootstrap on an x86_64-linux.  Feel free
> to commit them with your other param-related patches after they are
> reviewed.
> 
> > param_max_speculative_devirt_maydefs
> >
> 
> This one is Honza's but I am quite sure you can just use cfun->decl in
> the one place where this is queried.
You do not need that. param_* already does that.  I think it is
safe to simply declare it Optimization/PerFunction.

Honza
> 
> Martin


[PATCH] libstdc++: Fix error handling in filesystem::remove_all (PR93201)

2020-01-08 Thread Jonathan Wakely
When recursing into a directory, any errors that occur while removing a
directory entry are ignored, because the subsequent increment of the
directory iterator clears the error_code object.

This fixes that bug by checking the result of each recursive operation
before incrementing. This is a change in observable behaviour, because
previously other directory entries would still be removed even if one
(or more) couldn't be removed due to errors. Now the operation stops on
the first error, which is what the code intended to do all along. The
standard doesn't specify what happens in this case (because the order
that the entries are processed is unspecified anyway).

It also improves the error reporting so that the name of the file that
could not be removed is included in the filesystem_error exception. This
is done by introducing a new helper type for reporting errors with
additional context and a new function that uses that type. Then the
overload of std::filesystem::remove_all that throws an exception can use
the new function to ensure any exception contains the additional
information.

For std::experimental::filesystem::remove_all just fix the bug where
errors are ignored.

PR libstdc++/93201
* src/c++17/fs_ops.cc (do_remove_all): New function implementing more
detailed error reporting for remove_all. Check result of recursive
call before incrementing iterator.
(remove_all(const path&), remove_all(const path&, error_code&)): Use
do_remove_all.
* src/filesystem/ops.cc (remove_all(const path&, error_code&)): Check
result of recursive call before incrementing iterator.
* testsuite/27_io/filesystem/operations/remove_all.cc: Check errors
are reported correctly.
* testsuite/experimental/filesystem/operations/remove_all.cc: Likewise.

Tested powerpc64le-linux, committed to trunk.


commit b89525a0036c9b9f6d3d6952b54624fca27d9774
Author: Jonathan Wakely 
Date:   Wed Jan 8 16:23:07 2020 +

libstdc++: Fix error handling in filesystem::remove_all (PR93201)

When recursing into a directory, any errors that occur while removing a
directory entry are ignored, because the subsequent increment of the
directory iterator clears the error_code object.

This fixes that bug by checking the result of each recursive operation
before incrementing. This is a change in observable behaviour, because
previously other directory entries would still be removed even if one
(or more) couldn't be removed due to errors. Now the operation stops on
the first error, which is what the code intended to do all along. The
standard doesn't specify what happens in this case (because the order
that the entries are processed is unspecified anyway).

It also improves the error reporting so that the name of the file that
could not be removed is included in the filesystem_error exception. This
is done by introducing a new helper type for reporting errors with
additional context and a new function that uses that type. Then the
overload of std::filesystem::remove_all that throws an exception can use
the new function to ensure any exception contains the additional
information.

For std::experimental::filesystem::remove_all just fix the bug where
errors are ignored.

PR libstdc++/93201
* src/c++17/fs_ops.cc (do_remove_all): New function implementing 
more
detailed error reporting for remove_all. Check result of recursive
call before incrementing iterator.
(remove_all(const path&), remove_all(const path&, error_code&)): Use
do_remove_all.
* src/filesystem/ops.cc (remove_all(const path&, error_code&)): 
Check
result of recursive call before incrementing iterator.
* testsuite/27_io/filesystem/operations/remove_all.cc: Check errors
are reported correctly.
* testsuite/experimental/filesystem/operations/remove_all.cc: 
Likewise.

diff --git a/libstdc++-v3/src/c++17/fs_ops.cc b/libstdc++-v3/src/c++17/fs_ops.cc
index 8ad2e7fce1f..873f93aacfc 100644
--- a/libstdc++-v3/src/c++17/fs_ops.cc
+++ b/libstdc++-v3/src/c++17/fs_ops.cc
@@ -1275,42 +1275,105 @@ fs::remove(const path& p, error_code& ec) noexcept
   return false;
 }
 
+namespace std::filesystem
+{
+namespace
+{
+  struct ErrorReporter
+  {
+explicit
+ErrorReporter(error_code& ec) : code()
+{ }
+
+explicit
+ErrorReporter(const char* s, const path& p)
+: code(nullptr), msg(s), path1()
+{ }
+
+error_code* code;
+const char* msg;
+const path* path1;
+
+void
+report(const error_code& ec) const
+{
+  if (code)
+   *code = ec;
+  else
+   _GLIBCXX_THROW_OR_ABORT(filesystem_error(msg, *path1, ec));
+}
+
+void
+report(const error_code& ec, const path& path2) const
+{
+  if (code)
+   *code = ec;
+  else if 

Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-08 Thread Jan Hubicka
> Hi,
> 
> On Wed, Jan 08 2020, Jan Hubicka wrote:
> >> Hi,
> >> 
> >> On Fri, Jan 03 2020, Martin Liška wrote:
> >> > Hi.
> >> >
> >> > This is similar transformation for IPA passes. This time,
> >> > one needs to use opt_for_fn in order to get the right
> >> > parameter values.
> >> >
> >> > @Martin, Honza:
> >> > There are last few remaining parameters which should use
> >> > opt_for_fn:
> >> >
> >> > param_ipa_cp_unit_growth
> >> 
> >> So as we discussed, picking this one from one particular node is not
> 
> all right, the above was perhaps confusing, the patch does not pick one
> value but keeps picking the maximum growth ratio from each and every
> node as it considers cloning opportunities...
> 
> >> what one would expect to happen, but inlining does it too and so anyway:
> >
> > Inlining does not do that.  For each inlining decision it calculates
> > the growth according to the function it inlines into. So if you set
> > unit-growth more for -O3 than for -O2 (as I am just finishing
> > benchmarking of) and combine both settings, the -O3 code will be allowed
> > to grow unit when -O2 code would not.  I think ipa-cp can do the same.
> >
> 
> ...and thus I believe the patch actually does the same.

I guess it won't cut cloning -O2 nodes before cloning -O3 nodes?
But if that is hard to set up (it may be, since it is not done in a greedy
way as in the inliner), the patch is OK - it is definitely an improvement
over what we have right now. Having one param that matters at link time
and the rest that matter at compile time would definitely be very confusing.

Honza
> 
> Martin
> 


Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-08 Thread Jan Hubicka
> Hi,
> 
> On Fri, Jan 03 2020, Martin Liška wrote:
> > Hi.
> >
> > This is similar transformation for IPA passes. This time,
> > one needs to use opt_for_fn in order to get the right
> > parameter values.
> >
> > @Martin, Honza:
> > There are last few remaining parameters which should use
> > opt_for_fn:
> >
> > param_ipa_sra_max_replacements
> 
> IPA-CP: Always access param_ipa_sra_max_replacements through opt_for_fn
> 
> 2020-01-07  Martin Jambor  
> 
>   * params.opt (param_ipa_sra_max_replacements): Mark as Optimization.
>   * ipa-sra.c (scanned_node): New variable.
>   (allocate_access): Use it to get param_ipa_sra_max_replacements.
>   (ipa_sra_summarize_function): Set up scanned_node.
>   (pull_accesses_from_callee): New parameter caller, use it to get
>   param_ipa_sra_max_replacements.
>   (param_splitting_across_edge): Pass the caller to
>   pull_accesses_from_callee.
> ---
>  gcc/ipa-sra.c  | 33 +
>  gcc/params.opt |  2 +-
>  2 files changed, 22 insertions(+), 13 deletions(-)
> 
> diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
> index a051a9f2154..133ed687509 100644
> --- a/gcc/ipa-sra.c
> +++ b/gcc/ipa-sra.c
> @@ -521,6 +521,10 @@ ipa_sra_call_summaries::duplicate (cgraph_edge *, 
> cgraph_edge *,
>  
>  /* With all GTY stuff done, we can move to anonymous namespace.  */
>  namespace {
> +/* Functions which currently has its body analyzed.  */
> +
> +cgraph_node *scanned_node;
> +
>  /* Quick mapping from a decl to its param descriptor.  */
>  
>  hash_map *decl2desc;
> @@ -1265,7 +1269,8 @@ allocate_access (gensum_param_desc *desc,
>HOST_WIDE_INT offset, HOST_WIDE_INT size)
>  {
>if (desc->access_count
> -  == (unsigned) param_ipa_sra_max_replacements)
> +  == (unsigned) opt_for_fn (scanned_node->decl,
> + param_ipa_sra_max_replacements))
>  {
>disqualify_split_candidate (desc, "Too many replacement candidates");
>return NULL;
> @@ -2472,6 +2477,7 @@ ipa_sra_summarize_function (cgraph_node *node)
>node->order);
>if (!ipa_sra_preliminary_function_checks (node))
>  return;
> +  scanned_node = node;
>gcc_obstack_init (_obstack);
>isra_func_summary *ifs = func_sums->get_create (node);
>ifs->m_candidate = true;
> @@ -2526,6 +2532,7 @@ ipa_sra_summarize_function (cgraph_node *node)
>delete decl2desc;
>decl2desc = NULL;
>obstack_free (_obstack, NULL);
> +  scanned_node = NULL;

It is your code.  Having a static variable to track the currently analyzed
function is a bit ugly, and I am not sure whether current_function_decl is
already set to that function in all cases.  But I will leave this to your
decision.

Patch is OK either in this form or with tracking updated.
Honza


Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-08 Thread Martin Jambor
Hi,

On Wed, Jan 08 2020, Jan Hubicka wrote:
>> Hi,
>> 
>> On Fri, Jan 03 2020, Martin Liška wrote:
>> > Hi.
>> >
>> > This is similar transformation for IPA passes. This time,
>> > one needs to use opt_for_fn in order to get the right
>> > parameter values.
>> >
>> > @Martin, Honza:
>> > There are last few remaining parameters which should use
>> > opt_for_fn:
>> >
>> > param_ipa_cp_unit_growth
>> 
>> So as we discussed, picking this one from one particular node is not

all right, the above was perhaps confusing, the patch does not pick one
value but keeps picking the maximum growth ratio from each and every
node as it considers cloning opportunities...

>> what one would expect to happen, but inlining does it too and so anyway:
>
> Inlining does not do that.  For each inlining decision it calculates
> the growth according to the function it inlines into. So if you set
> unit-growth more for -O3 than for -O2 (as I am just finishing
> benchmarking of) and combine both settings, the -O3 code will be allowed
> to grow unit when -O2 code would not.  I think ipa-cp can do the same.
>

...and thus I believe the patch actually does the same.

Martin



Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-08 Thread Martin Jambor
Hi,

On Fri, Jan 03 2020, Martin Liška wrote:
> Hi.
>
> This is similar transformation for IPA passes. This time,
> one needs to use opt_for_fn in order to get the right
> parameter values.
>
> @Martin, Honza:
> There are last few remaining parameters which should use
> opt_for_fn:
>
> param_ipa_max_agg_items
> param_ipa_cp_unit_growth
> param_ipa_sra_max_replacements

sent in previous separate email messages.  All of those patches passed
bootstrap and testing and LTO bootstrap on an x86_64-linux.  Feel free
to commit them with your other param-related patches after they are
reviewed.

> param_max_speculative_devirt_maydefs
>

This one is Honza's but I am quite sure you can just use cfun->decl in
the one place where this is queried.

Martin


Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-08 Thread Martin Jambor
Hi,

On Fri, Jan 03 2020, Martin Liška wrote:
> Hi.
>
> This is similar transformation for IPA passes. This time,
> one needs to use opt_for_fn in order to get the right
> parameter values.
>
> @Martin, Honza:
> There are last few remaining parameters which should use
> opt_for_fn:
>
> param_ipa_sra_max_replacements

IPA-CP: Always access param_ipa_sra_max_replacements through opt_for_fn

2020-01-07  Martin Jambor  

* params.opt (param_ipa_sra_max_replacements): Mark as Optimization.
* ipa-sra.c (scanned_node): New variable.
(allocate_access): Use it to get param_ipa_sra_max_replacements.
(ipa_sra_summarize_function): Set up scanned_node.
(pull_accesses_from_callee): New parameter caller, use it to get
param_ipa_sra_max_replacements.
(param_splitting_across_edge): Pass the caller to
pull_accesses_from_callee.
---
 gcc/ipa-sra.c  | 33 +
 gcc/params.opt |  2 +-
 2 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
index a051a9f2154..133ed687509 100644
--- a/gcc/ipa-sra.c
+++ b/gcc/ipa-sra.c
@@ -521,6 +521,10 @@ ipa_sra_call_summaries::duplicate (cgraph_edge *, 
cgraph_edge *,
 
 /* With all GTY stuff done, we can move to anonymous namespace.  */
 namespace {
+/* Functions which currently has its body analyzed.  */
+
+cgraph_node *scanned_node;
+
 /* Quick mapping from a decl to its param descriptor.  */
 
 hash_map *decl2desc;
@@ -1265,7 +1269,8 @@ allocate_access (gensum_param_desc *desc,
 HOST_WIDE_INT offset, HOST_WIDE_INT size)
 {
   if (desc->access_count
-  == (unsigned) param_ipa_sra_max_replacements)
+  == (unsigned) opt_for_fn (scanned_node->decl,
+   param_ipa_sra_max_replacements))
 {
   disqualify_split_candidate (desc, "Too many replacement candidates");
   return NULL;
@@ -2472,6 +2477,7 @@ ipa_sra_summarize_function (cgraph_node *node)
 node->order);
   if (!ipa_sra_preliminary_function_checks (node))
 return;
+  scanned_node = node;
   gcc_obstack_init (_obstack);
   isra_func_summary *ifs = func_sums->get_create (node);
   ifs->m_candidate = true;
@@ -2526,6 +2532,7 @@ ipa_sra_summarize_function (cgraph_node *node)
   delete decl2desc;
   decl2desc = NULL;
   obstack_free (_obstack, NULL);
+  scanned_node = NULL;
   if (dump_file)
 fprintf (dump_file, "\n\n");
   if (flag_checking)
@@ -3246,16 +3253,17 @@ all_callee_accesses_present_p (isra_param_desc 
*param_desc,
 enum acc_prop_kind {ACC_PROP_DONT, ACC_PROP_COPY, ACC_PROP_CERTAIN};
 
 
-/* Attempt to propagate all definite accesses from ARG_DESC to PARAM_DESC, if
-   they would not violate some constraint there.  If successful, return NULL,
-   otherwise return the string reason for failure (which can be written to the
-   dump file).  DELTA_OFFSET is the known offset of the actual argument withing
-   the formal parameter (so of ARG_DESCS within PARAM_DESCS), ARG_SIZE is the
-   size of the actual argument or zero, if not known. In case of success, set
-   *CHANGE_P to true if propagation actually changed anything.  */
+/* Attempt to propagate all definite accesses from ARG_DESC to PARAM_DESC,
+   (which belongs to CALLER) if they would not violate some constraint there.
+   If successful, return NULL, otherwise return the string reason for failure
+   (which can be written to the dump file).  DELTA_OFFSET is the known offset
+   of the actual argument withing the formal parameter (so of ARG_DESCS within
+   PARAM_DESCS), ARG_SIZE is the size of the actual argument or zero, if not
+   known. In case of success, set *CHANGE_P to true if propagation actually
+   changed anything.  */
 
 static const char *
-pull_accesses_from_callee (isra_param_desc *param_desc,
+pull_accesses_from_callee (cgraph_node *caller, isra_param_desc *param_desc,
   isra_param_desc *arg_desc,
   unsigned delta_offset, unsigned arg_size,
   bool *change_p)
@@ -3335,7 +3343,7 @@ pull_accesses_from_callee (isra_param_desc *param_desc,
   return NULL;
 
 if ((prop_count + pclen
-> (unsigned) param_ipa_sra_max_replacements)
+> (unsigned) opt_for_fn (caller->decl, param_ipa_sra_max_replacements))
|| size_would_violate_limit_p (param_desc,
   param_desc->size_reached + prop_size))
   return "propagating accesses would violate the count or size limit";
@@ -3455,7 +3463,8 @@ param_splitting_across_edge (cgraph_edge *cs)
  else
{
  const char *pull_failure
-   = pull_accesses_from_callee (param_desc, arg_desc, 0, 0, );
+   = pull_accesses_from_callee (cs->caller, param_desc, arg_desc,
+0, 0, );
  if (pull_failure)
{
  if (dump_file && (dump_flags & TDF_DETAILS))
@@ -3516,7 

Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-08 Thread Jan Hubicka
> Hi,
> 
> On Fri, Jan 03 2020, Martin Liška wrote:
> > Hi.
> >
> > This is similar transformation for IPA passes. This time,
> > one needs to use opt_for_fn in order to get the right
> > parameter values.
> >
> > @Martin, Honza:
> > There are last few remaining parameters which should use
> > opt_for_fn:
> >
> > param_ipa_cp_unit_growth
> 
> So as we discussed, picking this one from one particular node is not
> what one would expect to happen, but inlining does it too and so anyway:

Inlining does not do that.  For each inlining decision it calculates
the growth according to the function it inlines into. So if you set
unit-growth more for -O3 than for -O2 (as I am just finishing
benchmarking of) and combine both settings, the -O3 code will be allowed
to grow unit when -O2 code would not.  I think ipa-cp can do the same.

Honza
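
As a toy model of the point above (the growth limit is looked up for the function being inlined into, not taken from one global value), consider the sketch below. None of these names exist in GCC and the numbers used with it are made up; it only illustrates why code built with a larger limit can still grow the unit after code with a smaller limit has hit its ceiling.

```cpp
#include <cstdint>

// Toy model only; these types and functions are hypothetical, not GCC's.
struct ToyFunction
{
  int unit_growth_percent;	// limit attached to this function's options
};

struct ToyUnit
{
  std::int64_t initial_size;	// unit size before any inlining
  std::int64_t current_size;	// unit size after the decisions made so far
};

// Decide whether inlining into CALLER may grow the unit by EXTRA, using
// the limit attached to CALLER rather than a single global parameter.
inline bool
may_grow_unit (const ToyUnit &unit, const ToyFunction &caller,
	       std::int64_t extra)
{
  const std::int64_t limit
    = unit.initial_size
      + unit.initial_size * caller.unit_growth_percent / 100;
  return unit.current_size + extra <= limit;
}
```

With a unit that started at 1000 and has grown to 1300, a caller carrying a 30% limit would be refused a further growth of 50, while a caller carrying a 40% limit would still be allowed it.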


Add missing { dg-require-effective-target fpic } directives to aarch64 tests

2020-01-08 Thread Olivier Hainque
Hello,

This patch adds missing { dg-require-effective-target fpic }
directives to aarch64 tests using -fpic or -fPIC explicitly.

This prevents spurious test failures on configurations not
supporting the options, such as VxWorks for at least kernel
mode on any target.

Committing to trunk after regtest on aarch64-linux, based on
the pre-approval agreed upon there:

  https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01122.html

We have a few patches doing a similar thing for other
target test sets or for common tests, which we'll post
separately (and I presume will call for separate approvals,
so we'll probably defer until next stage1).

Thanks!

With Kind Regards,

Olivier

2020-01-08  Joel Brobecker  
Olivier Hainque  

testsuite/
* g++.target/aarch64/sve/tls_2.C: Add missing
{ dg-require-effective-target fpic } directive.
* gcc.target/aarch64/noplt_2.c: Likewise.
* gcc.target/aarch64/noplt_3.c: Likewise.
* gcc.target/aarch64/pic-constantpool1.c: Likewise.
* gcc.target/aarch64/pic-small.c: Likewise.
* gcc.target/aarch64/pic-symrefplus.c: Likewise.
* gcc.target/aarch64/pr66912.c: Likewise.
* gcc.target/aarch64/sve/tls_1.c: Likewise.
* gcc.target/aarch64/sve/tls_preserve_1.c: Likewise.
* gcc.target/aarch64/sve/tls_preserve_2.c: Likewise.
* gcc.target/aarch64/sve/tls_preserve_3.c: Likewise.
* gcc.target/aarch64/tlsie_tiny_1.c: Likewise.
* gcc.target/aarch64/tlsle12_1.c: Likewise.
* gcc.target/aarch64/tlsle12_tiny_1.c: Likewise.
* gcc.target/aarch64/tlsle24_1.c: Likewise.
* gcc.target/aarch64/tlsle24_tiny_1.c: Likewise.
* gcc.target/aarch64/tlsle32_1.c: Likewise.
* gcc.target/aarch64/tlsle_sizeadj_small_1.c: Likewise.
* gcc.target/aarch64/tlsle_sizeadj_tiny_1.c: Likewise.

diff --git a/gcc/testsuite/g++.target/aarch64/sve/tls_2.C 
b/gcc/testsuite/g++.target/aarch64/sve/tls_2.C
index 9267f1e92d1..a1a2c85e591 100644
--- a/gcc/testsuite/g++.target/aarch64/sve/tls_2.C
+++ b/gcc/testsuite/g++.target/aarch64/sve/tls_2.C
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target tls } */
 /* { dg-options "-O2 -fPIC -msve-vector-bits=256" } */
+/* { dg-require-effective-target fpic } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/aarch64/got_mem_hoist_1.c 
b/gcc/testsuite/gcc.target/aarch64/got_mem_hoist_1.c
index 9ee772f87f4..46687bafe8b 100644
--- a/gcc/testsuite/gcc.target/aarch64/got_mem_hoist_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/got_mem_hoist_1.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fpic -fdump-rtl-loop2_invariant" } */
+/* { dg-require-effective-target fpic } */
 /* { dg-skip-if "Load/Store hoisted by RTL PRE already" { aarch64*-*-* }  { 
"-mcmodel=tiny" "-mcmodel=large" } { "" } } */
 
 int bar (int);
diff --git a/gcc/testsuite/gcc.target/aarch64/noplt_1.c 
b/gcc/testsuite/gcc.target/aarch64/noplt_1.c
index 731fcaea23f..f99a30aeb0e 100644
--- a/gcc/testsuite/gcc.target/aarch64/noplt_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/noplt_1.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fpic -fno-plt" } */
+/* { dg-require-effective-target fpic } */
 /* { dg-skip-if "-mcmodel=large, no support for -fpic" { aarch64-*-* }  { 
"-mcmodel=large" } { "" } } */
 
 int* bar (void) ;
diff --git a/gcc/testsuite/gcc.target/aarch64/noplt_2.c 
b/gcc/testsuite/gcc.target/aarch64/noplt_2.c
index 3be94aafc66..8d0b899fd60 100644
--- a/gcc/testsuite/gcc.target/aarch64/noplt_2.c
+++ b/gcc/testsuite/gcc.target/aarch64/noplt_2.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fpic" } */
+/* { dg-require-effective-target fpic } */
 /* { dg-skip-if "-mcmodel=large, no support for -fpic" { aarch64-*-* }  { 
"-mcmodel=large" } { "" } } */
 
 __attribute__ ((noplt))
diff --git a/gcc/testsuite/gcc.target/aarch64/noplt_3.c 
b/gcc/testsuite/gcc.target/aarch64/noplt_3.c
index a3826184549..450cc1aaf50 100644
--- a/gcc/testsuite/gcc.target/aarch64/noplt_3.c
+++ b/gcc/testsuite/gcc.target/aarch64/noplt_3.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fpic -fno-plt" } */
+/* { dg-require-effective-target fpic } */
 /* { dg-skip-if "-mcmodel=large, no support for -fpic" { aarch64-*-* }  { 
"-mcmodel=large" } { "" } } */
 
 int dec (int);
diff --git a/gcc/testsuite/gcc.target/aarch64/pic-constantpool1.c 
b/gcc/testsuite/gcc.target/aarch64/pic-constantpool1.c
index 043f1ee2c0d..755c0b67ea4 100644
--- a/gcc/testsuite/gcc.target/aarch64/pic-constantpool1.c
+++ b/gcc/testsuite/gcc.target/aarch64/pic-constantpool1.c
@@ -1,5 +1,6 @@
 /* { dg-options "-O2 -mcmodel=small -fPIC" }  */
 /* { dg-do compile } */
+/* { dg-require-effective-target fpic } */
 
 extern int __finite (double __value) __attribute__ ((__nothrow__)) 
__attribute__ ((__const__));
 extern int __finitef (float __value) __attribute__ ((__nothrow__)) 
__attribute__ ((__const__));
diff --git 

Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-08 Thread Martin Jambor
Hi,

On Fri, Jan 03 2020, Martin Liška wrote:
> Hi.
>
> This is similar transformation for IPA passes. This time,
> one needs to use opt_for_fn in order to get the right
> parameter values.
>
> @Martin, Honza:
> There are last few remaining parameters which should use
> opt_for_fn:
>
> param_ipa_cp_unit_growth

So as we discussed, picking this one from one particular node is not
what one would expect to happen, but inlining does it too and so anyway:


IPA-CP: Always access param_ipcp_unit_growth through opt_for_fn

2020-01-07  Martin Jambor  

* params.opt (param_ipcp_unit_growth): Mark as Optimization.
* ipa-cp.c (max_new_size): Removed.
(orig_overall_size): New variable.
(get_max_overall_size): New function.
(estimate_local_effects): Use it.  Adjust dump.
(decide_about_value): Likewise.
(ipcp_propagate_stage): Do not calculate max_new_size, just store
orig_overall_size.  Adjust dump.
(ipa_cp_c_finalize): Clear orig_overall_size instead of max_new_size.
---
 gcc/ipa-cp.c   | 39 ++-
 gcc/params.opt |  2 +-
 2 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 9e20e278eff..c2572e3e0e8 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -375,7 +375,7 @@ static profile_count max_count;
 
 /* Original overall size of the program.  */
 
-static long overall_size, max_new_size;
+static long overall_size, orig_overall_size;
 
 /* Node name to unique clone suffix number map.  */
 static hash_map *clone_num_suffixes;
@@ -3395,6 +3395,23 @@ perform_estimation_of_a_value (cgraph_node *node, 
vec known_csts,
   val->local_size_cost = size;
 }
 
+/* Get the overall limit of growth based on parameters extracted from growth.
+   It does not really make sense to mix functions with different overall growth
+   limits but it is possible and if it happens, we do not want to select one
+   limit at random.  */
+
+static long
+get_max_overall_size (cgraph_node *node)
+{
+  long max_new_size = orig_overall_size;
+  long large_unit = opt_for_fn (node->decl, param_large_unit_insns);
+  if (max_new_size < large_unit)
+max_new_size = large_unit;
+  int unit_growth = opt_for_fn (node->decl, param_ipcp_unit_growth);
+  max_new_size += max_new_size * unit_growth / 100 + 1;
+  return max_new_size;
+}
+
 /* Iterate over known values of parameters of NODE and estimate the local
effects in terms of time and size they have.  */
 
@@ -3457,7 +3474,7 @@ estimate_local_effects (struct cgraph_node *node)
   stats.freq_sum, stats.count_sum,
   size))
{
- if (size + overall_size <= max_new_size)
+ if (size + overall_size <= get_max_overall_size (node))
{
  info->do_clone_for_all_contexts = true;
  overall_size += size;
@@ -3467,8 +3484,8 @@ estimate_local_effects (struct cgraph_node *node)
 "known contexts, growth deemed beneficial.\n");
}
  else if (dump_file && (dump_flags & TDF_DETAILS))
-   fprintf (dump_file, "   Not cloning for all contexts because "
-"max_new_size would be reached with %li.\n",
+   fprintf (dump_file, "  Not cloning for all contexts because "
+"maximum unit size would be reached with %li.\n",
 size + overall_size);
}
   else if (dump_file && (dump_flags & TDF_DETAILS))
@@ -3860,14 +3877,10 @@ ipcp_propagate_stage (class ipa_topo_info *topo)
 max_count = max_count.max (node->count.ipa ());
   }
 
-  max_new_size = overall_size;
-  if (max_new_size < param_large_unit_insns)
-max_new_size = param_large_unit_insns;
-  max_new_size += max_new_size * param_ipcp_unit_growth / 100 + 1;
+  orig_overall_size = overall_size;
 
   if (dump_file)
-fprintf (dump_file, "\noverall_size: %li, max_new_size: %li\n",
-overall_size, max_new_size);
+fprintf (dump_file, "\noverall_size: %li\n", overall_size);
 
   propagate_constants_topo (topo);
   if (flag_checking)
@@ -5380,11 +5393,11 @@ decide_about_value (struct cgraph_node *node, int 
index, HOST_WIDE_INT offset,
   perhaps_add_new_callers (node, val);
   return false;
 }
-  else if (val->local_size_cost + overall_size > max_new_size)
+  else if (val->local_size_cost + overall_size > get_max_overall_size (node))
 {
   if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "   Ignoring candidate value because "
-"max_new_size would be reached with %li.\n",
+"maximum unit size would be reached with %li.\n",
 val->local_size_cost + overall_size);
   return false;
 }
@@ -5928,6 +5941,6 @@ ipa_cp_c_finalize (void)
 {
   max_count = profile_count::uninitialized ();
   overall_size = 0;
-  max_new_size = 0;
+  orig_overall_size = 0;
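
As a rough illustration of the arithmetic in get_max_overall_size above, here is a stand-alone sketch. It is not GCC code: the function name max_overall_size_sketch and its plain long/int arguments are assumptions made for this example, standing in for the values that the patch reads with opt_for_fn (node->decl, ...).

```c
/* Hypothetical stand-alone mirror of the limit computed in the patch.
   ORIG_OVERALL_SIZE stands in for orig_overall_size, LARGE_UNIT for
   param_large_unit_insns and UNIT_GROWTH (a percentage) for
   param_ipcp_unit_growth.  */
long
max_overall_size_sketch (long orig_overall_size, long large_unit,
			 int unit_growth)
{
  long max_new_size = orig_overall_size;

  /* Small units are treated as if they were LARGE_UNIT in size.  */
  if (max_new_size < large_unit)
    max_new_size = large_unit;

  /* Allow UNIT_GROWTH percent of growth on top of that, plus one.  */
  max_new_size += max_new_size * unit_growth / 100 + 1;
  return max_new_size;
}
```

With a unit of 5000 insns, a large-unit threshold of 10000 and 10% growth, the sketch yields 10000 + 1000 + 1 = 11001. Two nodes with different per-function values would therefore see different limits for the same orig_overall_size, which is the situation the comment in the patch describes.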
   

Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-08 Thread Jan Hubicka
> Hi.
> 
> This is similar transformation for IPA passes. This time,
> one needs to use opt_for_fn in order to get the right
> parameter values.
> 
> @Martin, Honza:
> There are last few remaining parameters which should use
> opt_for_fn:
> 
> param_ipa_max_agg_items
> param_ipa_cp_unit_growth
> param_ipa_sra_max_replacements
> param_max_speculative_devirt_maydefs
max_speculative_devirt_maydefs is used only in the analysis stage, when cfun
is set up properly, so it is safe to make it PerFunction.

Honza
> 
> Can you please help me with these as it's in your code?
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2020-01-03  Martin Liska  
> 
>   * auto-profile.c (auto_profile): Use opt_for_fn
>   for a parameter.
>   * ipa-cp.c (ipcp_lattice::add_value): Likewise.
>   (propagate_vals_across_arith_jfunc): Likewise.
>   (hint_time_bonus): Likewise.
>   (incorporate_penalties): Likewise.
>   (good_cloning_opportunity_p): Likewise.
>   (perform_estimation_of_a_value): Likewise.
>   (estimate_local_effects): Likewise.
>   (ipcp_propagate_stage): Likewise.
>   * ipa-fnsummary.c (decompose_param_expr): Likewise.
>   (set_switch_stmt_execution_predicate): Likewise.
>   (analyze_function_body): Likewise.
>   * ipa-inline-analysis.c (offline_size): Likewise.
>   * ipa-inline.c (early_inliner): Likewise.
>   * ipa-prop.c (ipa_analyze_node): Likewise.
>   (ipcp_transform_function): Likewise.
>   * ipa-sra.c (process_scan_results): Likewise.
>   (ipa_sra_summarize_function): Likewise.
>   * params.opt: Rename ipcp-unit-growth to
>   ipa-cp-unit-growth.  Add Optimization for various
>   IPA-related parameters.
> ---
>  gcc/auto-profile.c|  3 ++-
>  gcc/ipa-cp.c  | 44 +++
>  gcc/ipa-fnsummary.c   |  7 ---
>  gcc/ipa-inline-analysis.c |  7 ---
>  gcc/ipa-inline.c  |  6 --
>  gcc/ipa-prop.c|  4 ++--
>  gcc/ipa-sra.c |  6 --
>  gcc/params.opt| 34 +++---
>  8 files changed, 63 insertions(+), 48 deletions(-)
> 
> 

> diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c
> index 6aca2f29022..c5e9f1336a7 100644
> --- a/gcc/auto-profile.c
> +++ b/gcc/auto-profile.c
> @@ -1628,7 +1628,8 @@ auto_profile (void)
> function before annotation, so the profile inside bar@loc_foo2
> will be useful.  */
>  autofdo::stmt_set promoted_stmts;
> -for (int i = 0; i < param_early_inliner_max_iterations; i++)
> +for (int i = 0; i < opt_for_fn (node->decl,
> + param_early_inliner_max_iterations); i++)
>{
>  if (!flag_value_profile_transformations
>  || !autofdo::afdo_vpt_for_early_inline (_stmts))
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 43c0d5a6706..def89471abb 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -1859,7 +1859,8 @@ ipcp_lattice::add_value (valtype newval, 
> cgraph_edge *cs,
>   return false;
>}
>  
> -  if (!unlimited && values_count == param_ipa_cp_value_list_size)
> +  if (!unlimited && values_count == opt_for_fn (cs->caller->decl,
> + param_ipa_cp_value_list_size))
>  {
>/* We can only free sources, not the values themselves, because sources
>of other values in this SCC might point to them.   */
> @@ -1986,12 +1987,15 @@ propagate_vals_across_arith_jfunc (cgraph_edge *cs,
>  {
>int i;
>  
> -  if (src_lat != dest_lat || param_ipa_cp_max_recursive_depth < 1)
> +  int max_recursive_depth = opt_for_fn(cs->caller->decl,
> +param_ipa_cp_max_recursive_depth);
> +  if (src_lat != dest_lat || max_recursive_depth < 1)
>   return dest_lat->set_contains_variable ();
>  
>/* No benefit if recursive execution is in low probability.  */
>if (cs->sreal_frequency () * 100
> -   <= ((sreal) 1) * param_ipa_cp_min_recursive_probability)
> +   <= ((sreal) 1) * opt_for_fn (cs->caller->decl,
> +param_ipa_cp_min_recursive_probability))
>   return dest_lat->set_contains_variable ();
>  
>auto_vec *, 8> val_seeds;
> @@ -2019,7 +2023,7 @@ propagate_vals_across_arith_jfunc (cgraph_edge *cs,
>/* Recursively generate lattice values with a limited count.  */
>FOR_EACH_VEC_ELT (val_seeds, i, src_val)
>   {
> -   for (int j = 1; j < param_ipa_cp_max_recursive_depth; j++)
> +   for (int j = 1; j < max_recursive_depth; j++)
>   {
> tree cstval = get_val_across_arith_op (opcode, opnd1_type, opnd2,
>src_val, res_type);
> @@ -3155,11 +3159,11 @@ devirtualization_time_bonus (struct cgraph_node *node,
>  /* Return time bonus incurred 

Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-08 Thread Jan Hubicka
> Hi,
> 
> On Fri, Jan 03 2020, Martin Liška wrote:
> > Hi.
> >
> > This is similar transformation for IPA passes. This time,
> > one needs to use opt_for_fn in order to get the right
> > parameter values.
> >
> > @Martin, Honza:
> > There are last few remaining parameters which should use
> > opt_for_fn:
> >
> > param_ipa_max_agg_items
> 
> 
> IPA-CP: Always access param_ipa_max_agg_items through opt_for_fn
> 
> 2020-01-07  Martin Jambor  
> 
>   * params.opt (param_ipa_max_agg_items): Mark as Optimization
>   * ipa-cp.c (merge_agg_lats_step): New parameter max_agg_items, use
>   instead of param_ipa_max_agg_items.
>   (merge_aggregate_lattices): Extract param_ipa_max_agg_items from
>   optimization info for the callee.
OK, thanks (to both Martins) for working on this

Honza
> ---
>  gcc/ipa-cp.c   | 14 +-
>  gcc/ipa-prop.c |  8 +---
>  gcc/params.opt |  2 +-
>  3 files changed, 15 insertions(+), 9 deletions(-)
> 
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 4381b35a809..9e20e278eff 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -2458,13 +2458,13 @@ set_check_aggs_by_ref (class ipcp_param_lattices 
> *dest_plats,
> unless there are too many already.  If there are two many, return false.  
> If
> there are overlaps turn whole DEST_PLATS to bottom and return false.  If 
> any
> skipped lattices were newly marked as containing variable, set *CHANGE to
> -   true.  */
> +   true.  MAX_AGG_ITEMS is the maximum number of lattices.  */
>  
>  static bool
>  merge_agg_lats_step (class ipcp_param_lattices *dest_plats,
>HOST_WIDE_INT offset, HOST_WIDE_INT val_size,
>struct ipcp_agg_lattice ***aglat,
> -  bool pre_existing, bool *change)
> +  bool pre_existing, bool *change, int max_agg_items)
>  {
>gcc_checking_assert (offset >= 0);
>  
> @@ -2499,7 +2499,7 @@ merge_agg_lats_step (class ipcp_param_lattices 
> *dest_plats,
> set_agg_lats_to_bottom (dest_plats);
> return false;
>   }
> -  if (dest_plats->aggs_count == param_ipa_max_agg_items)
> +  if (dest_plats->aggs_count == max_agg_items)
>   return false;
>dest_plats->aggs_count++;
>new_al = ipcp_agg_lattice_pool.allocate ();
> @@ -2553,6 +2553,8 @@ merge_aggregate_lattices (struct cgraph_edge *cs,
>  ret |= set_agg_lats_contain_variable (dest_plats);
>dst_aglat = _plats->aggs;
>  
> +  int max_agg_items = opt_for_fn (cs->callee->function_symbol ()->decl,
> +   param_ipa_max_agg_items);
>for (struct ipcp_agg_lattice *src_aglat = src_plats->aggs;
> src_aglat;
> src_aglat = src_aglat->next)
> @@ -2562,7 +2564,7 @@ merge_aggregate_lattices (struct cgraph_edge *cs,
>if (new_offset < 0)
>   continue;
>if (merge_agg_lats_step (dest_plats, new_offset, src_aglat->size,
> -_aglat, pre_existing, ))
> +_aglat, pre_existing, , max_agg_items))
>   {
> struct ipcp_agg_lattice *new_al = *dst_aglat;
>  
> @@ -2742,6 +2744,8 @@ propagate_aggs_across_jump_function (struct cgraph_edge 
> *cs,
>if (set_check_aggs_by_ref (dest_plats, jfunc->agg.by_ref))
>   return true;
>  
> +  int max_agg_items = opt_for_fn (cs->callee->function_symbol ()->decl,
> +   param_ipa_max_agg_items);
>FOR_EACH_VEC_ELT (*jfunc->agg.items, i, item)
>   {
> HOST_WIDE_INT val_size;
> @@ -2751,7 +2755,7 @@ propagate_aggs_across_jump_function (struct cgraph_edge 
> *cs,
> val_size = tree_to_shwi (TYPE_SIZE (item->type));
>  
> if (merge_agg_lats_step (dest_plats, item->offset, val_size,
> -, pre_existing, ))
> +, pre_existing, , max_agg_items))
>   {
> ret |= propagate_aggregate_lattice (cs, item, *aglat);
> aglat = &(*aglat)->next;
> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
> index fcb13dfbac4..1ccdbbfb8c0 100644
> --- a/gcc/ipa-prop.c
> +++ b/gcc/ipa-prop.c
> @@ -1852,8 +1852,10 @@ determine_known_aggregate_parts (struct 
> ipa_func_body_info *fbi,
>tree arg_base;
>bool check_ref, by_ref;
>ao_ref r;
> +  unsigned max_agg_items = opt_for_fn (fbi->node->decl,
> +param_ipa_max_agg_items);
>  
> -  if (param_ipa_max_agg_items == 0)
> +  if (max_agg_items == 0)
>  return;
>  
>/* The function operates in three stages.  First, we prepare check_ref, r,
> @@ -1951,14 +1953,14 @@ determine_known_aggregate_parts (struct 
> ipa_func_body_info *fbi,
>operands, whose definitions can finally reach the call.  */
> add_to_agg_contents_list (, (*copy = *content, copy));
>  
> -   if (++value_count == param_ipa_max_agg_items)
> +   if (++value_count == max_agg_items)
>   break;
>   }
>  
> /* Add 

Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-08 Thread Martin Jambor
Hi,

On Fri, Jan 03 2020, Martin Liška wrote:
> Hi.
>
> This is similar transformation for IPA passes. This time,
> one needs to use opt_for_fn in order to get the right
> parameter values.
>
> @Martin, Honza:
> There are last few remaining parameters which should use
> opt_for_fn:
>
> param_ipa_max_agg_items


IPA-CP: Always access param_ipa_max_agg_items through opt_for_fn

2020-01-07  Martin Jambor  

* params.opt (param_ipa_max_agg_items): Mark as Optimization
* ipa-cp.c (merge_agg_lats_step): New parameter max_agg_items, use
instead of param_ipa_max_agg_items.
(merge_aggregate_lattices): Extract param_ipa_max_agg_items from
optimization info for the callee.
---
 gcc/ipa-cp.c   | 14 +-
 gcc/ipa-prop.c |  8 +---
 gcc/params.opt |  2 +-
 3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 4381b35a809..9e20e278eff 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -2458,13 +2458,13 @@ set_check_aggs_by_ref (class ipcp_param_lattices 
*dest_plats,
unless there are too many already.  If there are two many, return false.  If
there are overlaps turn whole DEST_PLATS to bottom and return false.  If any
skipped lattices were newly marked as containing variable, set *CHANGE to
-   true.  */
+   true.  MAX_AGG_ITEMS is the maximum number of lattices.  */
 
 static bool
 merge_agg_lats_step (class ipcp_param_lattices *dest_plats,
 HOST_WIDE_INT offset, HOST_WIDE_INT val_size,
 struct ipcp_agg_lattice ***aglat,
-bool pre_existing, bool *change)
+bool pre_existing, bool *change, int max_agg_items)
 {
   gcc_checking_assert (offset >= 0);
 
@@ -2499,7 +2499,7 @@ merge_agg_lats_step (class ipcp_param_lattices 
*dest_plats,
  set_agg_lats_to_bottom (dest_plats);
  return false;
}
-  if (dest_plats->aggs_count == param_ipa_max_agg_items)
+  if (dest_plats->aggs_count == max_agg_items)
return false;
   dest_plats->aggs_count++;
   new_al = ipcp_agg_lattice_pool.allocate ();
@@ -2553,6 +2553,8 @@ merge_aggregate_lattices (struct cgraph_edge *cs,
 ret |= set_agg_lats_contain_variable (dest_plats);
   dst_aglat = _plats->aggs;
 
+  int max_agg_items = opt_for_fn (cs->callee->function_symbol ()->decl,
+ param_ipa_max_agg_items);
   for (struct ipcp_agg_lattice *src_aglat = src_plats->aggs;
src_aglat;
src_aglat = src_aglat->next)
@@ -2562,7 +2564,7 @@ merge_aggregate_lattices (struct cgraph_edge *cs,
   if (new_offset < 0)
continue;
   if (merge_agg_lats_step (dest_plats, new_offset, src_aglat->size,
-  _aglat, pre_existing, ))
+  _aglat, pre_existing, , max_agg_items))
{
  struct ipcp_agg_lattice *new_al = *dst_aglat;
 
@@ -2742,6 +2744,8 @@ propagate_aggs_across_jump_function (struct cgraph_edge 
*cs,
   if (set_check_aggs_by_ref (dest_plats, jfunc->agg.by_ref))
return true;
 
+  int max_agg_items = opt_for_fn (cs->callee->function_symbol ()->decl,
+ param_ipa_max_agg_items);
   FOR_EACH_VEC_ELT (*jfunc->agg.items, i, item)
{
  HOST_WIDE_INT val_size;
@@ -2751,7 +2755,7 @@ propagate_aggs_across_jump_function (struct cgraph_edge 
*cs,
  val_size = tree_to_shwi (TYPE_SIZE (item->type));
 
  if (merge_agg_lats_step (dest_plats, item->offset, val_size,
-  , pre_existing, ))
+  , pre_existing, , max_agg_items))
{
  ret |= propagate_aggregate_lattice (cs, item, *aglat);
  aglat = &(*aglat)->next;
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index fcb13dfbac4..1ccdbbfb8c0 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -1852,8 +1852,10 @@ determine_known_aggregate_parts (struct 
ipa_func_body_info *fbi,
   tree arg_base;
   bool check_ref, by_ref;
   ao_ref r;
+  unsigned max_agg_items = opt_for_fn (fbi->node->decl,
+  param_ipa_max_agg_items);
 
-  if (param_ipa_max_agg_items == 0)
+  if (max_agg_items == 0)
 return;
 
   /* The function operates in three stages.  First, we prepare check_ref, r,
@@ -1951,14 +1953,14 @@ determine_known_aggregate_parts (struct 
ipa_func_body_info *fbi,
 operands, whose definitions can finally reach the call.  */
  add_to_agg_contents_list (, (*copy = *content, copy));
 
- if (++value_count == param_ipa_max_agg_items)
+ if (++value_count == max_agg_items)
break;
}
 
  /* Add to the list consisting of all dominating virtual operands.  */
  add_to_agg_contents_list (_list, content);
 
- if (++item_count == 2 * param_ipa_max_agg_items)
+ if (++item_count == 2 * max_agg_items)

[OpenACC] libgomp.texi — document acc_*_async and acc_*_finalize(_async) functions

2020-01-08 Thread Tobias Burnus
When looking at libgomp.texi the other day, I saw that the acc_*_async 
variants and the acc_*_finalize functions of OpenACC 2.5 were not 
documented.


Hence, this patch adds them. Since they are part of OpenACC 2.5, I 
updated the @ref, but pointed it to OpenACC 2.6 instead.


Possible variants:
(a) update all acc_* calls to OpenACC 2.6 @refs
(b) defer updating the @ref until the OpenACC version is bumped from 2.0 
(alias 201306) to OpenACC 2.6 (alias 201711). [Cf. OG9 branch's 
7a22697197b85931d9fda66e8b0f75171ea13b43]
(c) Independent of the @ref: write the variable-type declarations for 
Fortran en bloc after all the "subroutine" as they are the same – 
especially useful for acc_copyout* which has 8 variants. That's how 
OpenACC 2.7's spec does it.


Regarding (c): If one goes for that change, does one keep the 
"INTERFACE" string in the table for each "subroutine" line? And what to 
do about the variable-declaration lines? Add a single "ARGUMENTS" 
before the first of those (i.e. in the "a" line)?


Comments, suggestions, approval?

Tobias

	* libgomp.texi (OpenACC Runtime Library Routines): Document *_async
	and *_finalize variants.

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index f082a4a401b..af2c8bee0aa 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -2371,6 +2371,7 @@ variable or array element and @var{len} specifies the length in bytes.
 @item @emph{C/C++}:
 @multitable @columnfractions .20 .80
 @item @emph{Prototype}: @tab @code{void *acc_copyin(h_void *a, size_t len);}
+@item @emph{Prototype}: @tab @code{void *acc_copyin_async(h_void *a, size_t len, int async);}
 @end multitable
 
 @item @emph{Fortran}:
@@ -2380,11 +2381,18 @@ variable or array element and @var{len} specifies the length in bytes.
 @item @emph{Interface}: @tab @code{subroutine acc_copyin(a, len)}
 @item   @tab @code{type, dimension(:[,:]...) :: a}
 @item   @tab @code{integer len}
+@item @emph{Interface}: @tab @code{subroutine acc_copyin_async(a, async)}
+@item   @tab @code{type, dimension(:[,:]...) :: a}
+@item   @tab @code{integer(acc_handle_kind) :: async}
+@item @emph{Interface}: @tab @code{subroutine acc_copyin_async(a, len, async)}
+@item   @tab @code{type, dimension(:[,:]...) :: a}
+@item   @tab @code{integer len}
+@item   @tab @code{integer(acc_handle_kind) :: async}
 @end multitable
 
 @item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.0}, section
-3.2.17.
+@uref{https://www.openacc.org, OpenACC specification v2.6}, section
+3.2.20.
 @end table
 
 
@@ -2444,6 +2452,7 @@ array element and @var{len} specifies the length in bytes.
 @item @emph{C/C++}:
 @multitable @columnfractions .20 .80
 @item @emph{Prototype}: @tab @code{void *acc_create(h_void *a, size_t len);}
+@item @emph{Prototype}: @tab @code{void *acc_create_async(h_void *a, size_t len, int async);}
 @end multitable
 
 @item @emph{Fortran}:
@@ -2453,11 +2462,18 @@ array element and @var{len} specifies the length in bytes.
 @item @emph{Interface}: @tab @code{subroutine acc_create(a, len)}
 @item   @tab @code{type, dimension(:[,:]...) :: a}
 @item   @tab @code{integer len}
+@item @emph{Interface}: @tab @code{subroutine acc_create_async(a, async)}
+@item   @tab @code{type, dimension(:[,:]...) :: a}
+@item   @tab @code{integer(acc_handle_kind) :: async}
+@item @emph{Interface}: @tab @code{subroutine acc_create_async(a, len, async)}
+@item   @tab @code{type, dimension(:[,:]...) :: a}
+@item   @tab @code{integer len}
+@item   @tab @code{integer(acc_handle_kind) :: async}
 @end multitable
 
 @item @emph{Reference}:
-@uref{https://www.openacc.org, OpenACC specification v2.0}, section
-3.2.19.
+@uref{https://www.openacc.org, OpenACC specification v2.6}, section
+3.2.21.
 @end table
 
 
@@ -2517,6 +2533,9 @@ array element and @var{len} specifies the length in bytes.
 @item @emph{C/C++}:
 @multitable @columnfractions .20 .80
 @item @emph{Prototype}: @tab @code{acc_copyout(h_void *a, size_t len);}
+@item @emph{Prototype}: @tab @code{acc_copyout_async(h_void *a, size_t len, int async);}
+@item @emph{Prototype}: @tab @code{acc_copyout_finalize(h_void *a, size_t len);}
+@item @emph{Prototype}: @tab @code{acc_copyout_finalize_async(h_void *a, size_t len, int async);}
 @end multitable
 
 @item @emph{Fortran}:
@@ -2526,11 +2545,30 @@ array element and @var{len} specifies the length in bytes.
 @item @emph{Interface}: @tab @code{subroutine acc_copyout(a, len)}
 @item   @tab @code{type, dimension(:[,:]...) :: a}
 @item   @tab @code{integer len}
+@item @emph{Interface}: @tab @code{subroutine acc_copyout_async(a, async)}
+@item   @tab @code{type, dimension(:[,:]...) :: a}
+@item   @tab @code{integer(acc_handle_kind) :: async}
+@item 

Re: [patch] relax aarch64 stack-clash tests dependence on alloca.h

2020-01-08 Thread Olivier Hainque



> On 7 Jan 2020, at 18:21, Richard Sandiford  wrote:

>>  * gcc.target/aarch64/stack-check-alloca.h: Remove
>>  #include alloca.h. #define alloca __builtin_alloca
>>  instead.

> OK, thanks.

Great, thanks Richard!
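
For readers following along, the change quoted above amounts to something like the sketch below. The helper fill_and_sum is hypothetical and exists only for this illustration; the actual contents of stack-check-alloca.h are not reproduced here.

```c
#include <stddef.h>
#include <string.h>

/* Rather than #include <alloca.h>, which not every target's C library
   provides, map alloca onto the GCC builtin directly.  */
#define alloca __builtin_alloca

/* Hypothetical helper: allocate N bytes on the stack, fill them with 1
   and return their sum, so the allocation is actually used.  */
size_t
fill_and_sum (size_t n)
{
  unsigned char *buf = alloca (n);
  size_t sum = 0;

  memset (buf, 1, n);
  for (size_t i = 0; i < n; i++)
    sum += buf[i];
  return sum;
}
```

This relies on __builtin_alloca, so it assumes GCC or a compatible compiler.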



Re: [PATCH] Fix x86 ICE when peepholing2 @stack_protect_set_1_ with *lea (PR target/93187)

2020-01-08 Thread Uros Bizjak
On Wed, Jan 8, 2020 at 8:58 AM Jakub Jelinek  wrote:
>
> Hi!
>
> On the following testcase, the peephole2s merge @stack_protect_set_1_
> with not the expected *mov{si,di}_internal, but *lea instead -
> which looks like a mov, but uses address_no_seg_operand predicate/Ts
> constraint.  The peephole2s check that operand with several smaller
> predicates, as we do not want to match anything not matched by the
> constraints used in the *stack_protect_set_{2_,3} patterns,
> and I thought those predicates together are subset of general_operand,
> which is used as the predicate in those patterns,
> but apparently that is not the case.  So this patch also verifies
> the operand is general_operand.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2020-01-07  Jakub Jelinek  
>
> PR target/93187
> * config/i386/i386.md (*stack_protect_set_2_ peephole2,
> *stack_protect_set_3 peephole2): Also check that the second
> insns source is general_operand.
>
> * g++.dg/opt/pr93187.C: New test.

OK.

Thanks,
Uros.

> --- gcc/config/i386/i386.md.jj  2020-01-07 18:12:48.043555173 +0100
> +++ gcc/config/i386/i386.md 2020-01-07 20:18:27.666952849 +0100
> @@ -20084,6 +20084,7 @@ (define_peephole2
>(set (match_operand:SI 3 "general_reg_operand")
> (match_operand:SI 4))]
>   "REGNO (operands[2]) == REGNO (operands[3])
> +  && general_operand (operands[4], SImode)
>&& (general_reg_operand (operands[4], SImode)
>|| memory_operand (operands[4], SImode)
>|| immediate_operand (operands[4], SImode))
> @@ -20128,6 +20129,7 @@ (define_peephole2
>  (clobber (reg:CC FLAGS_REG))])
>(set (match_dup 2) (match_operand:DI 3))]
>   "TARGET_64BIT
> +  && general_operand (operands[3], DImode)
>&& (general_reg_operand (operands[3], DImode)
>|| memory_operand (operands[3], DImode)
>|| x86_64_zext_immediate_operand (operands[3], DImode)
> --- gcc/testsuite/g++.dg/opt/pr93187.C.jj   2020-01-07 20:20:29.467117172 
> +0100
> +++ gcc/testsuite/g++.dg/opt/pr93187.C  2020-01-07 20:21:40.459047146 +0100
> @@ -0,0 +1,77 @@
> +// PR target/93187
> +// { dg-do compile { target c++11 } }
> +// { dg-options "-O2" }
> +// { dg-additional-options "-fstack-protector-strong" { target 
> fstack_protector } }
> +// { dg-additional-options "-fpie" { target pie } }
> +
> +struct A;
> +struct B;
> +struct C { int operator () (B, const B &); };
> +struct D { typedef int *d; };
> +struct E { C g; };
> +struct F { F (D::d); friend bool operator==(F &, const int &); };
> +template  struct H {
> +  typedef D *I;
> +  E l;
> +  I foo ();
> +  T t;
> +  F bar (I, const T &);
> +  F baz (const T &);
> +};
> +template 
> +F
> +H::bar (I n, const T )
> +{
> +  while (n)
> +if (l.g (t, o))
> +  n = 0;
> +  return 0;
> +}
> +template 
> +F
> +H::baz (const T )
> +{
> +  D *r = foo ();
> +  F p = bar (r, n);
> +  return p == 0 ? 0 : p;
> +}
> +template  struct J {
> +  H h;
> +  B 
> +  void baz () { h.baz (q); }
> +};
> +enum K { L };
> +template  struct M;
> +template  struct G {
> +  using N = J;
> +  N *operator->();
> +};
> +template  struct M : public G {
> +  using N = J;
> +  N *foo () { return n; }
> +  N *n;
> +  int o;
> +};
> +template 
> +inline typename G::N *
> +G::operator-> ()
> +{
> +  N *n = static_cast> *>(this)->foo ();
> +  return n;
> +}
> +struct B { bool qux (); };
> +struct O {
> +  struct P { M p; };
> +  static thread_local P o;
> +  int baz () const;
> +};
> +thread_local O::P O::o;
> +B be;
> +int
> +O::baz () const
> +{
> +  do
> +o.p->baz ();
> +  while (be.qux ());
> +  __builtin_unreachable ();
> +}
>
> Jakub
>


Re: [PATCH] Fix ia32 ICE while compiling glibc (PR target/93174)

2020-01-08 Thread Uros Bizjak
On Wed, Jan 8, 2020 at 8:48 AM Jakub Jelinek  wrote:
>
> Hi!
>
> Joseph reported ia32 glibc build ICEs, because the
> *adddi3_doubleword_cc_overflow_1 pattern allows a memory output and matching
> input, but addcarry* to which it splits doesn't, for some strange
> reason it only allows register output.  As it emits an adc instruction
> which is very similar to add, I don't see the point of doing in the
> predicates/constraints anything other than what add does.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?
>
> 2020-01-07  Jakub Jelinek  
>
> PR target/93174
> * config/i386/i386.md (addcarry_0): Use nonimmediate_operand
> predicate for output operand instead of register_operand.
> (addcarry, addcarry_1): Likewise.  Add alternative with
> memory destination and non-memory operands[2].
>
> * gcc.c-torture/compile/pr93174.c: New test.

OK.

Thanks,
Uros.

> --- gcc/config/i386/i386.md.jj  2020-01-07 10:54:45.924177633 +0100
> +++ gcc/config/i386/i386.md 2020-01-07 18:12:48.043555173 +0100
> @@ -6786,13 +6786,13 @@ (define_insn "addcarry"
>   (plus:SWI48
> (match_operator:SWI48 5 "ix86_carry_flag_operator"
>   [(match_operand 3 "flags_reg_operand") (const_int 0)])
> -   (match_operand:SWI48 1 "nonimmediate_operand" "%0"))
> - (match_operand:SWI48 2 "nonimmediate_operand" "rm")))
> +   (match_operand:SWI48 1 "nonimmediate_operand" "%0,0"))
> + (match_operand:SWI48 2 "nonimmediate_operand" "r,rm")))
>   (plus:
> (zero_extend: (match_dup 2))
> (match_operator: 4 "ix86_carry_flag_operator"
>   [(match_dup 3) (const_int 0)]
> -   (set (match_operand:SWI48 0 "register_operand" "=r")
> +   (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r")
> (plus:SWI48 (plus:SWI48 (match_op_dup 5
>  [(match_dup 3) (const_int 0)])
> (match_dup 1))
> @@ -6812,7 +6812,7 @@ (define_expand "addcarry_0"
>(match_operand:SWI48 1 "nonimmediate_operand")
>(match_operand:SWI48 2 "x86_64_general_operand"))
>  (match_dup 1)))
> -  (set (match_operand:SWI48 0 "register_operand")
> +  (set (match_operand:SWI48 0 "nonimmediate_operand")
>(plus:SWI48 (match_dup 1) (match_dup 2)))])]
>"ix86_binary_operator_ok (PLUS, mode, operands)")
>
> @@ -6830,7 +6830,7 @@ (define_insn "*addcarry_1"
> (match_operand: 6 "const_scalar_int_operand" "")
> (match_operator: 4 "ix86_carry_flag_operator"
>   [(match_dup 3) (const_int 0)]
> -   (set (match_operand:SWI48 0 "register_operand" "=r")
> +   (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm")
> (plus:SWI48 (plus:SWI48 (match_op_dup 5
>  [(match_dup 3) (const_int 0)])
> (match_dup 1))
> --- gcc/testsuite/gcc.c-torture/compile/pr93174.c.jj2020-01-07 
> 18:16:29.460185015 +0100
> +++ gcc/testsuite/gcc.c-torture/compile/pr93174.c   2020-01-07 
> 18:15:29.716094376 +0100
> @@ -0,0 +1,14 @@
> +/* PR target/93174 */
> +
> +unsigned long long a[2];
> +void bar (void);
> +
> +void
> +foo (int c)
> +{
> +  int e = c >> 2;
> +  a[0] += c;
> +  a[1] = a[0] < c;
> +  while (e--)
> +bar ();
> +}
>
> Jakub
>


Re: [PING][PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)

2020-01-08 Thread Stam Markianos-Wright


On 12/10/19 5:03 PM, Kyrill Tkachov wrote:
> Hi Stam,
> 
> On 11/15/19 5:26 PM, Stam Markianos-Wright wrote:
>> Pinging with more correct maintainers this time :)
>>
>> Also would need to backport to gcc7,8,9, but need to get this approved
>> first!
>>
> 
> Sorry for the delay.

Same here now! Sorry, I totally forgot about this in the lead-up to Xmas!

I have made the changes marked below and also removed the unnecessary extra #defines
from the test.

> 
> 
>> Thank you,
>> Stam
>>
>>
>>  Forwarded Message 
>> Subject: Re: [PATCH][GCC][ARM] Arm generates out of range conditional
>> branches in Thumb2 (PR91816)
>> Date: Mon, 21 Oct 2019 10:37:09 +0100
>> From: Stam Markianos-Wright 
>> To: Ramana Radhakrishnan 
>> CC: gcc-patches@gcc.gnu.org , nd ,
>> James Greenhalgh , Richard Earnshaw
>> 
>>
>>
>>
>> On 10/13/19 4:23 PM, Ramana Radhakrishnan wrote:
>> >>
>> >> Patch bootstrapped and regression tested on arm-none-linux-gnueabihf,
>> >> however, on my native Aarch32 setup the test times out when run as part
>> >> of a big "make check-gcc" regression, but not when run individually.
>> >>
>> >> 2019-10-11  Stamatis Markianos-Wright 
>> >>
>> >>   * config/arm/arm.md: Update b for Thumb2 range checks.
>> >>   * config/arm/arm.c: New function arm_gen_far_branch.
>> >>   * config/arm/arm-protos.h: New function arm_gen_far_branch
>> >>   prototype.
>> >>
>> >> gcc/testsuite/ChangeLog:
>> >>
>> >> 2019-10-11  Stamatis Markianos-Wright 
>> >>
>> >>   * testsuite/gcc.target/arm/pr91816.c: New test.
>> >
>> >> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
>> >> index f995974f9bb..1dce333d1c3 100644
>> >> --- a/gcc/config/arm/arm-protos.h
>> >> +++ b/gcc/config/arm/arm-protos.h
>> >> @@ -570,4 +570,7 @@ void arm_parse_option_features (sbitmap, const 
>> cpu_arch_option *,
>> >>
>> >>   void arm_initialize_isa (sbitmap, const enum isa_feature *);
>> >>
>> >> +const char * arm_gen_far_branch (rtx *, int,const char * , const char *);
>> >> +
>> >> +
>> >
>> > Let's get the nits out of the way.
>> >
>> > Unnecessary extra new line, need a space between int and const above.
>> >
>> >
>>
>> .Fixed!
>>
>> >>   #endif /* ! GCC_ARM_PROTOS_H */
>> >> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>> >> index 39e1a1ef9a2..1a693d2ddca 100644
>> >> --- a/gcc/config/arm/arm.c
>> >> +++ b/gcc/config/arm/arm.c
>> >> @@ -32139,6 +32139,31 @@ arm_run_selftests (void)
>> >>   }
>> >>   } /* Namespace selftest.  */
>> >>
>> >> +
>> >> +/* Generate code to enable conditional branches in functions over 1 MiB. 
>> >>  */
>> >> +const char *
>> >> +arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
>> >> +    const char * branch_format)
>> >
>> > Not sure if this is some munging from the attachment but check
>> > vertical alignment of parameters.
>> >
>>
>> .Fixed!
>>
>> >> +{
>> >> +  rtx_code_label * tmp_label = gen_label_rtx ();
>> >> +  char label_buf[256];
>> >> +  char buffer[128];
>> >> +  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
>> >> +    CODE_LABEL_NUMBER (tmp_label));
>> >> +  const char *label_ptr = arm_strip_name_encoding (label_buf);
>> >> +  rtx dest_label = operands[pos_label];
>> >> +  operands[pos_label] = tmp_label;
>> >> +
>> >> +  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
>> >> +  output_asm_insn (buffer, operands);
>> >> +
>> >> +  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, 
>> >> label_ptr);
>> >> +  operands[pos_label] = dest_label;
>> >> +  output_asm_insn (buffer, operands);
>> >> +  return "";
>> >> +}
>> >> +
>> >> +
>> >
>> > Unnecessary extra newline.
>> >
>>
>> .Fixed!
>>
>> >>   #undef TARGET_RUN_TARGET_SELFTESTS
>> >>   #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
>> >>   #endif /* CHECKING_P */
>> >> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>> >> index f861c72ccfc..634fd0a59da 100644
>> >> --- a/gcc/config/arm/arm.md
>> >> +++ b/gcc/config/arm/arm.md
>> >> @@ -6686,9 +6686,16 @@
>> >>   ;; And for backward branches we have
>> >>   ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 
>> >>4).
>> >>   ;;
>> >> +;; In 16-bit Thumb these ranges are:
>> >>   ;; For a 'b'   pos_range = 2046, neg_range = -2048 giving 
>> >>(-2040->2048).
>> >>   ;; For a 'b' pos_range = 254, neg_range = -256  giving (-250 
>> >>->256).
>> >>
>> >> +;; In 32-bit Thumb these ranges are:
>> >> +;; For a 'b'   +/- 16MB is not checked for.
>> >> +;; For a 'b' pos_range = 1048574, neg_range = -1048576  giving
>> >> +;; (-1048568 -> 1048576).
>> >> +
>> >> +
>> >
>> > Unnecessary extra newline.
>> >
>>
>> .Fixed!
>>
>> >>   (define_expand "cbranchsi4"
>> >> [(set (pc) (if_then_else
>> >> (match_operator 0 "expandable_comparison_operator"
>> >> @@ -6947,22 +6954,42 @@
>> >> (pc)))]
>> >> "TARGET_32BIT"
>> >> "*
>> >> -  if (arm_ccfsm_state == 1 || arm_ccfsm_state 

Re: [PATCH] Use cgraph_node::dump_{asm_},name where possible.

2020-01-08 Thread Jan Hubicka
> Hi.
> 
> The patch makes consistent use of cgraph_node::dump_{asm_,}name where possible.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?

OK, thanks!
Not all dump_name/dump_asm_name choices are fully logical, but I see this
comes from the existing name/asm_name use. I suppose we could fix that case
by case eventually.

Honza
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2020-01-08  Martin Liska  
> 
>   * cgraph.c (cgraph_node::dump): Use ::dump_name or
>   ::dump_asm_name instead of (::name or ::asm_name).
>   * cgraphclones.c (symbol_table::materialize_all_clones): Likewise.
>   * cgraphunit.c (walk_polymorphic_call_targets): Likewise.
>   (analyze_functions): Likewise.
>   (expand_all_functions): Likewise.
>   * ipa-cp.c (ipcp_cloning_candidate_p): Likewise.
>   (propagate_bits_across_jump_function): Likewise.
>   (dump_profile_updates): Likewise.
>   (ipcp_store_bits_results): Likewise.
>   (ipcp_store_vr_results): Likewise.
>   * ipa-devirt.c (dump_targets): Likewise.
>   * ipa-fnsummary.c (analyze_function_body): Likewise.
>   * ipa-hsa.c (check_warn_node_versionable): Likewise.
>   (process_hsa_functions): Likewise.
>   * ipa-icf.c (sem_item_optimizer::merge_classes): Likewise.
>   (set_alias_uids): Likewise.
>   * ipa-inline-transform.c (save_inline_function_body): Likewise.
>   * ipa-inline.c (recursive_inlining): Likewise.
>   (inline_to_all_callers_1): Likewise.
>   (ipa_inline): Likewise.
>   * ipa-profile.c (ipa_propagate_frequency_1): Likewise.
>   (ipa_propagate_frequency): Likewise.
>   * ipa-prop.c (ipa_make_edge_direct_to_target): Likewise.
>   (remove_described_reference): Likewise.
>   * ipa-pure-const.c (worse_state): Likewise.
>   (check_retval_uses): Likewise.
>   (analyze_function): Likewise.
>   (propagate_pure_const): Likewise.
>   (propagate_nothrow): Likewise.
>   (dump_malloc_lattice): Likewise.
>   (propagate_malloc): Likewise.
>   (pass_local_pure_const::execute): Likewise.
>   * ipa-visibility.c (optimize_weakref): Likewise.
>   (function_and_variable_visibility): Likewise.
>   * ipa.c (symbol_table::remove_unreachable_nodes): Likewise.
>   (ipa_discover_variable_flags): Likewise.
>   * lto-streamer-out.c (output_function): Likewise.
>   (output_constructor): Likewise.
>   * tree-inline.c (copy_bb): Likewise.
>   * tree-ssa-structalias.c (ipa_pta_execute): Likewise.
>   * varpool.c (symbol_table::remove_unreferenced_decls): Likewise.
> 
> gcc/lto/ChangeLog:
> 
> 2020-01-08  Martin Liska  
> 
>   * lto-partition.c (add_symbol_to_partition_1): Use ::dump_name or
>   ::dump_asm_name instead of (::name or ::asm_name).
>   (lto_balanced_map): Likewise.
>   (promote_symbol): Likewise.
>   (rename_statics): Likewise.
>   * lto.c (lto_wpa_write_files): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-01-08  Martin Liska  
> 
>   * gcc.dg/ipa/ipa-icf-1.c: Update expected scanned output.
>   * gcc.dg/ipa/ipa-icf-10.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-11.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-12.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-13.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-16.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-18.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-2.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-20.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-21.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-23.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-25.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-26.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-27.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-3.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-35.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-36.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-37.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-38.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-5.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-7.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-8.c: Likewise.
>   * gcc.dg/ipa/ipa-icf-merge-1.c: Likewise.
>   * gcc.dg/ipa/pr64307.c: Likewise.
>   * gcc.dg/ipa/pr90555.c: Likewise.
>   * gcc.dg/ipa/propmalloc-1.c: Likewise.
>   * gcc.dg/ipa/propmalloc-2.c: Likewise.
>   * gcc.dg/ipa/propmalloc-3.c: Likewise.
> ---
>  gcc/cgraph.c   |  2 +-
>  gcc/cgraphclones.c |  3 +--
>  gcc/cgraphunit.c   |  9 
>  gcc/ipa-cp.c   | 22 ++--
>  gcc/ipa-devirt.c   |  3 +--
>  gcc/ipa-fnsummary.c|  2 +-
>  gcc/ipa-hsa.c  | 10 -
>  gcc/ipa-icf.c  | 10 -
>  gcc/ipa-inline-transform.c |  2 +-
>  gcc/ipa-inline.c   | 11 +-
>  gcc/ipa-profile.c  | 18 
>  gcc/ipa-prop.c 

[committed] libgomp.texi: Fix typos, use https.

2020-01-08 Thread Tobias Burnus

Committed as obvious.

Tobias

Index: libgomp/ChangeLog
===
--- libgomp/ChangeLog	(revision 280006)
+++ libgomp/ChangeLog	(revision 280008)
@@ -0,0 +1,4 @@
+2020-01-08  Tobias Burnus  
+
+	* libgomp.texi: Fix typos, use https.
+
Index: libgomp/libgomp.texi
===
--- libgomp/libgomp.texi	(revision 280006)
+++ libgomp/libgomp.texi	(revision 280008)
@@ -1730 +1730 @@
-@uref{http://gcc.gnu.org/ml/gcc-patches/2006-06/msg00493.html, 
+@uref{https://gcc.gnu.org/ml/gcc-patches/2006-06/msg00493.html,
@@ -1732 +1732 @@
-@uref{http://gcc.gnu.org/ml/gcc-patches/2006-06/msg00496.html,
+@uref{https://gcc.gnu.org/ml/gcc-patches/2006-06/msg00496.html,
@@ -1814 +1814 @@
-@code{#pragma acc} in C/C++ and @code{!$accp} directives in free form,
+@code{#pragma acc} in C/C++ and @code{!$acc} directives in free form,
@@ -1855 +1855 @@
-* acc_async_test_all::  Tests for completion of all asychronous
+* acc_async_test_all::  Tests for completion of all asynchronous
@@ -1859 +1859 @@
-* acc_wait_all::Waits for completion of all asyncrhonous
+* acc_wait_all::Waits for completion of all asynchronous
@@ -1942 +1942 @@
-This function indicates to the runtime library which device typr, specified
+This function indicates to the runtime library which device type, specified
@@ -1993 +1993 @@
-specified by @var{num}, associated with the specifed device
+specified by @var{num}, associated with the specified device
@@ -2396 +2396 @@
-This function tests if the host data specifed by @var{a} and of length
+This function tests if the host data specified by @var{a} and of length
@@ -2469 +2469 @@
-This function tests if the host data specifed by @var{a} and of length
+This function tests if the host data specified by @var{a} and of length
@@ -3036 +3036 @@
-The primary means by that the asychronous functionality is accessed
+The primary means by that the asynchronous functionality is accessed
@@ -3938 +3938 @@
-be reported via @uref{http://gcc.gnu.org/bugzilla/, Bugzilla}.  Please add
+be reported via @uref{https://gcc.gnu.org/bugzilla/, Bugzilla}.  Please add


[PATCH] Use cgraph_node::dump_{asm_},name where possible.

2020-01-08 Thread Martin Liška

Hi.

The patch makes consistent use of cgraph_node::dump_{asm_,}name where possible.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2020-01-08  Martin Liska  

* cgraph.c (cgraph_node::dump): Use ::dump_name or
::dump_asm_name instead of (::name or ::asm_name).
* cgraphclones.c (symbol_table::materialize_all_clones): Likewise.
* cgraphunit.c (walk_polymorphic_call_targets): Likewise.
(analyze_functions): Likewise.
(expand_all_functions): Likewise.
* ipa-cp.c (ipcp_cloning_candidate_p): Likewise.
(propagate_bits_across_jump_function): Likewise.
(dump_profile_updates): Likewise.
(ipcp_store_bits_results): Likewise.
(ipcp_store_vr_results): Likewise.
* ipa-devirt.c (dump_targets): Likewise.
* ipa-fnsummary.c (analyze_function_body): Likewise.
* ipa-hsa.c (check_warn_node_versionable): Likewise.
(process_hsa_functions): Likewise.
* ipa-icf.c (sem_item_optimizer::merge_classes): Likewise.
(set_alias_uids): Likewise.
* ipa-inline-transform.c (save_inline_function_body): Likewise.
* ipa-inline.c (recursive_inlining): Likewise.
(inline_to_all_callers_1): Likewise.
(ipa_inline): Likewise.
* ipa-profile.c (ipa_propagate_frequency_1): Likewise.
(ipa_propagate_frequency): Likewise.
* ipa-prop.c (ipa_make_edge_direct_to_target): Likewise.
(remove_described_reference): Likewise.
* ipa-pure-const.c (worse_state): Likewise.
(check_retval_uses): Likewise.
(analyze_function): Likewise.
(propagate_pure_const): Likewise.
(propagate_nothrow): Likewise.
(dump_malloc_lattice): Likewise.
(propagate_malloc): Likewise.
(pass_local_pure_const::execute): Likewise.
* ipa-visibility.c (optimize_weakref): Likewise.
(function_and_variable_visibility): Likewise.
* ipa.c (symbol_table::remove_unreachable_nodes): Likewise.
(ipa_discover_variable_flags): Likewise.
* lto-streamer-out.c (output_function): Likewise.
(output_constructor): Likewise.
* tree-inline.c (copy_bb): Likewise.
* tree-ssa-structalias.c (ipa_pta_execute): Likewise.
* varpool.c (symbol_table::remove_unreferenced_decls): Likewise.

gcc/lto/ChangeLog:

2020-01-08  Martin Liska  

* lto-partition.c (add_symbol_to_partition_1): Use ::dump_name or
::dump_asm_name instead of (::name or ::asm_name).
(lto_balanced_map): Likewise.
(promote_symbol): Likewise.
(rename_statics): Likewise.
* lto.c (lto_wpa_write_files): Likewise.

gcc/testsuite/ChangeLog:

2020-01-08  Martin Liska  

* gcc.dg/ipa/ipa-icf-1.c: Update expected scanned output.
* gcc.dg/ipa/ipa-icf-10.c: Likewise.
* gcc.dg/ipa/ipa-icf-11.c: Likewise.
* gcc.dg/ipa/ipa-icf-12.c: Likewise.
* gcc.dg/ipa/ipa-icf-13.c: Likewise.
* gcc.dg/ipa/ipa-icf-16.c: Likewise.
* gcc.dg/ipa/ipa-icf-18.c: Likewise.
* gcc.dg/ipa/ipa-icf-2.c: Likewise.
* gcc.dg/ipa/ipa-icf-20.c: Likewise.
* gcc.dg/ipa/ipa-icf-21.c: Likewise.
* gcc.dg/ipa/ipa-icf-23.c: Likewise.
* gcc.dg/ipa/ipa-icf-25.c: Likewise.
* gcc.dg/ipa/ipa-icf-26.c: Likewise.
* gcc.dg/ipa/ipa-icf-27.c: Likewise.
* gcc.dg/ipa/ipa-icf-3.c: Likewise.
* gcc.dg/ipa/ipa-icf-35.c: Likewise.
* gcc.dg/ipa/ipa-icf-36.c: Likewise.
* gcc.dg/ipa/ipa-icf-37.c: Likewise.
* gcc.dg/ipa/ipa-icf-38.c: Likewise.
* gcc.dg/ipa/ipa-icf-5.c: Likewise.
* gcc.dg/ipa/ipa-icf-7.c: Likewise.
* gcc.dg/ipa/ipa-icf-8.c: Likewise.
* gcc.dg/ipa/ipa-icf-merge-1.c: Likewise.
* gcc.dg/ipa/pr64307.c: Likewise.
* gcc.dg/ipa/pr90555.c: Likewise.
* gcc.dg/ipa/propmalloc-1.c: Likewise.
* gcc.dg/ipa/propmalloc-2.c: Likewise.
* gcc.dg/ipa/propmalloc-3.c: Likewise.
---
 gcc/cgraph.c   |  2 +-
 gcc/cgraphclones.c |  3 +--
 gcc/cgraphunit.c   |  9 
 gcc/ipa-cp.c   | 22 ++--
 gcc/ipa-devirt.c   |  3 +--
 gcc/ipa-fnsummary.c|  2 +-
 gcc/ipa-hsa.c  | 10 -
 gcc/ipa-icf.c  | 10 -
 gcc/ipa-inline-transform.c |  2 +-
 gcc/ipa-inline.c   | 11 +-
 gcc/ipa-profile.c  | 18 
 gcc/ipa-prop.c |  4 ++--
 gcc/ipa-pure-const.c   | 24 +++---
 gcc/ipa-visibility.c   |  8 
 gcc/ipa.c  | 11 +-
 gcc/lto-streamer-out.c  

[PATCH] Make sinking clobbers across EH reliable

2020-01-08 Thread Richard Biener


This makes $subject reliably catch secondary opportunities (which
cause quadraticness in PR93199).  It also makes virtual operand
updating in this process a bit cheaper.

This is a first step; the second will address the quadraticness
(either by some algorithmic changes or by capping the number of
clobbers to sink if the former turns out too ugly).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2020-01-08  Richard Biener  

PR middle-end/93199
* tree-eh.c (sink_clobbers): Update virtual operands for
the first and last stmt only.  Add a dry-run capability.
(pass_lower_eh_dispatch::execute): Perform clobber sinking
after CFG manipulations and in RPO order to catch all
secondary opportunities reliably.

Index: gcc/tree-eh.c
===
--- gcc/tree-eh.c   (revision 28)
+++ gcc/tree-eh.c   (working copy)
@@ -3550,10 +3550,11 @@ optimize_clobbers (basic_block bb)
 }
 
 /* Try to sink var = {v} {CLOBBER} stmts followed just by
-   internal throw to successor BB.  */
+   internal throw to successor BB.  If FOUND_OPPORTUNITY is not NULL
+   then do not perform the optimization but set *FOUND_OPPORTUNITY to true.  */
 
 static int
-sink_clobbers (basic_block bb)
+sink_clobbers (basic_block bb, bool *found_opportunity = NULL)
 {
   edge e;
   edge_iterator ei;
@@ -3591,13 +3592,19 @@ sink_clobbers (basic_block bb)
   if (!any_clobbers)
 return 0;
 
+  /* If this was a dry run, tell it we found clobbers to sink.  */
+  if (found_opportunity)
+{
+  *found_opportunity = true;
+  return 0;
+}
+
   edge succe = single_succ_edge (bb);
   succbb = succe->dest;
 
   /* See if there is a virtual PHI node to take an updated virtual
  operand from.  */
   gphi *vphi = NULL;
-  tree vuse = NULL_TREE;
   for (gphi_iterator gpi = gsi_start_phis (succbb);
!gsi_end_p (gpi); gsi_next (&gpi))
 {
@@ -3605,11 +3612,12 @@ sink_clobbers (basic_block bb)
   if (virtual_operand_p (res))
{
  vphi = gpi.phi ();
- vuse = res;
  break;
}
 }
 
+  gimple *first_sunk = NULL;
+  gimple *last_sunk = NULL;
   dgsi = gsi_after_labels (succbb);
   gsi = gsi_last_bb (bb);
   for (gsi_prev (&gsi); !gsi_end_p (gsi); gsi_prev (&gsi))
@@ -3641,36 +3649,37 @@ sink_clobbers (basic_block bb)
  forwarder edge we can keep virtual operands in place.  */
   gsi_remove (&gsi, false);
   gsi_insert_before (&dgsi, stmt, GSI_NEW_STMT);
-
-  /* But adjust virtual operands if we sunk across a PHI node.  */
-  if (vuse)
+  if (!first_sunk)
+   first_sunk = stmt;
+  last_sunk = stmt;
+}
+  if (first_sunk)
+{
+  /* Adjust virtual operands if we sunk across a virtual PHI.  */
+  if (vphi)
{
- gimple *use_stmt;
  imm_use_iterator iter;
  use_operand_p use_p;
- FOR_EACH_IMM_USE_STMT (use_stmt, iter, vuse)
+ gimple *use_stmt;
+ tree phi_def = gimple_phi_result (vphi);
+ FOR_EACH_IMM_USE_STMT (use_stmt, iter, phi_def)
FOR_EACH_IMM_USE_ON_STMT (use_p, iter)
- SET_USE (use_p, gimple_vdef (stmt));
- if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (vuse))
+  SET_USE (use_p, gimple_vdef (first_sunk));
+ if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (phi_def))
{
- SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_vdef (stmt)) = 1;
- SSA_NAME_OCCURS_IN_ABNORMAL_PHI (vuse) = 0;
+ SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_vdef (first_sunk)) = 1;
+ SSA_NAME_OCCURS_IN_ABNORMAL_PHI (phi_def) = 0;
}
- /* Adjust the incoming virtual operand.  */
- SET_USE (PHI_ARG_DEF_PTR_FROM_EDGE (vphi, succe), gimple_vuse (stmt));
- SET_USE (gimple_vuse_op (stmt), vuse);
+ SET_USE (PHI_ARG_DEF_PTR_FROM_EDGE (vphi, succe),
+  gimple_vuse (last_sunk));
+ SET_USE (gimple_vuse_op (last_sunk), phi_def);
}
   /* If there isn't a single predecessor but no virtual PHI node
  arrange for virtual operands to be renamed.  */
-  else if (gimple_vuse_op (stmt) != NULL_USE_OPERAND_P
-  && !single_pred_p (succbb))
+  else if (!single_pred_p (succbb)
+  && TREE_CODE (gimple_vuse (last_sunk)) == SSA_NAME)
{
- /* In this case there will be no use of the VDEF of this stmt. 
-???  Unless this is a secondary opportunity and we have not
-removed unreachable blocks yet, so we cannot assert this.  
-Which also means we will end up renaming too many times.  */
- SET_USE (gimple_vuse_op (stmt), gimple_vop (cfun));
- mark_virtual_operands_for_renaming (cfun);
+ mark_virtual_operand_for_renaming (gimple_vuse (last_sunk));
  todo |= TODO_update_ssa_only_virtuals;
}
 }
@@ -3863,6 +3872,7 @@ 

[PATCH] Fix PR92997

2020-01-08 Thread Richard Biener


Committed.

Richard.

2020-01-08  Richard Biener  

PR testsuite/92997
* gcc.dg/torture/ftrapv-1.c (iaddv): Use noipa attribute.

Index: gcc/testsuite/gcc.dg/torture/ftrapv-1.c
===
--- gcc/testsuite/gcc.dg/torture/ftrapv-1.c (revision 28)
+++ gcc/testsuite/gcc.dg/torture/ftrapv-1.c (working copy)
@@ -13,7 +13,7 @@
 /* Disallow inlining/cloning which would constant propagate and trigger
unrelated bugs.  */
 
-int __attribute__((noinline,noclone))
+int __attribute__((noipa))
 iaddv (int a, int b)
 {
   return a + b;


Re: [PATCH] Relax invalidation of TOP N counters in PGO.

2020-01-08 Thread Jan Hubicka
> 
> > 
> > I would still prefer invalidation before streaming (which is fully
> > deterministic) and possibly have option
> 
> Do you mean __gcov_merge_topn?

I suggest we do the following:

 - have non-deterministic and deterministic versions of the TOPN counter
   and a flag choosing between deterministic and non-deterministic
   instrumentation.
 - in the non-deterministic version, merge by dropping the least frequent
   values, as done in your patch.
 - in the deterministic version, do the following

   in __gcov_merge_topn
   1) prune out all values in the measured counts whose associated counter
      is less than total_count/N (where N is the number of values we track)
   2) prune out all values in the read counter the same way
   3) merge the values, ignoring any whose resulting counter is less than
      merged_total_count/N, and invalidate the whole counter if
      there are too many surviving ones

   4) before using profiles in GCC, prune out all values whose associated
      counter is less than total_count/N (they are not useful anyway).

   Here total_count can simply be the sum of all N counters; it would be
   better if it were the total number of runs, but that is not available at
   gcov_merge time unless we introduce an extra counter for misses.

   The observation is that since we never kick out a value because the
   counter is full (but rather invalidate the whole counter), this remains
   deterministic. Hopefully pruning out useless small values will
   prevent the Firefox-like examples from triggering too often.
   This is definitely stronger than simply invalidating all counters
   where the number of runs != the sum of all counts, which is the other
   deterministic variant.

   2) and 4) are only needed since we currently have no convenient place
   to prune counters prior to streaming if merging is not done at all.
   If we invent one, we could skip those steps.
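
The deterministic merge steps above can be sketched roughly as follows. This
is only an illustration in Python, not the libgcov implementation: a counter
is modelled as a plain dictionary mapping a value to its count, the names
prune and merge_topn are hypothetical, and total_count is assumed to be the
sum of the tracked counters, as suggested above.

```python
def prune(counter, n):
    """Drop values whose count is below total_count / n (steps 1, 2 and 4)."""
    total = sum(counter.values())
    # c * n >= total is the integer-safe form of c >= total / n.
    return {value: c for value, c in counter.items() if c * n >= total}


def merge_topn(measured, stored, n):
    """Merge two counters as in steps 1-3; return None if invalidated."""
    merged = dict(prune(measured, n))            # step 1
    for value, c in prune(stored, n).items():    # step 2
        merged[value] = merged.get(value, 0) + c
    # Step 3: ignore values below merged_total_count / n ...
    merged = prune(merged, n)
    # ... and invalidate the whole counter if too many values survive.
    if len(merged) > n:
        return None
    return merged
```

For instance, merge_topn({"X": 10, "Y": 1}, {"X": 8, "Z": 1}, 2) keeps only X
with a merged count of 18, because Y and Z are already pruned from their own
counters before the merge. The result does not depend on the order in which
values were recorded, only on the counts.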


Does it make sense to you?

I was also wondering: with TOPN it would be nice to have the property
that a target with probability greater than 1/(TOPN+1) gets into the counter,
but this is guaranteed only for N=1.
For N>1 one can populate N-1 counters with not too important values
and only then start putting in frequent values, making sure that the
frequent targets always fight among themselves for the last slot
(exploiting the "let's decrease the least counter" heuristic).

Like for N=3

X X Y Y Z W Z W Z W Z W  ..

Here the first two counters will get occupied by X and Y with counts 2 and
the third counter will end up with W. Z will be missed even though its
probability is 0.5 in the limit while X and Y both have probability 0 in the limit.

What about increasing the value of every counter by N on a hit and decreasing
all by 1 on a miss?  This will make sure that values with frequency less
than 1/N are dropped, making space for those with higher frequencies.
This should also improve behaviour WRT the problem above.
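
A rough sketch of this update rule, in Python for illustration only. Several
details are assumptions on my side and are not spelled out above: a "hit"
means the value is already tracked, a "miss" means it is not, counters that
decay to zero are evicted, and a missed value takes a free slot with an
initial count of N. The names update and run are hypothetical.

```python
def update(counters, value, n):
    """Apply one observed value to a dict of at most n tracked counters."""
    if value in counters:
        counters[value] += n          # hit: reward the matching counter
        return
    # Miss: every tracked counter decays by 1; exhausted ones are evicted.
    for tracked in list(counters):
        counters[tracked] -= 1
        if counters[tracked] <= 0:
            del counters[tracked]
    # Assumption: a missed value takes a free slot with an initial count of n.
    if len(counters) < n:
        counters[value] = n


def run(sequence, n):
    """Feed a whole sequence of values through an initially empty counter set."""
    counters = {}
    for value in sequence:
        update(counters, value, n)
    return counters
```

Under these assumptions, feeding X X Y Y followed by repeated Z W pairs with
N=3 evicts X after a few misses, and both Z and W end up tracked.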
> 
> > -fno-deterministic-profile-merging (or something similar) for users who
> > do not care and for threaded programs where this is impossible
> > (which needs to trigger different merging in libgcov).
> 
> Fine, let's postpone that for GCC 11.

It is a regression, so I think we have right to solve it for GCC 10.

Honza
> 
> Martin
> 
> > 
> > Honza
> > 
> 


Re: [patch, fortran] Fix PR 65428, ICE on nested empty array constructors

2020-01-08 Thread Tobias Burnus

Hello Thomas,

sorry for the belated review. I am not completely happy about the 
introduction of yet another two global variables, but I also do not see 
an easy way out. Hence: OK.


I was playing around with the following test cases; you might consider
adding them as well. (I would exclude the error item, however.)


Cheers,

Tobias

! C7110  (R770) If type-spec is omitted, each ac-value expression in the
! array-constructor shall have the same declared type and kind type
! parameters


! Should be fine as there is either no or only one ac-value:
print *, [[integer ::],[real::]]
print *, [[integer ::],[real::], [1], [real ::]]
print *, [[integer ::],[real::], ["ABC"], [real ::]] // "ABC"
print *, [integer :: [integer ::],[real::]]

! Old but for completeness:
! OK - accepted
print *, [integer :: [1],[1.0]]
! OK – "Error: Element in INTEGER(4) array constructor at (1) is REAL(4)"
print *, [[1],[1.0]]
end


On 1/2/20 11:35 PM, Thomas Koenig wrote:

The solution is to stash the type away in yet another variable
and only use it if the constructor turns out to be empty, and
the type has not been set some other way.

Regression-tested. OK for trunk?

Save typespec for empty array constructor.

2020-01-02  Thomas Koenig  

PR fortran/65428
* array.c (empty_constructor): New variable.
(empty_ts): New variable.
(expand_constructor): Save typespec in empty_ts.
Unset empty_constructor if there is an element.
(gfc_expand_constructor): Initialize empty_constructor
and empty_ts.  If there was no explicit constructor
type and the constructor is empty, take the type from
empty_ts.

2020-01-02  Thomas Koenig  

PR fortran/65428
* gfortran.dg/zero_sized_11.f90: New test.


Re: [PATCH] PR libstdc++/92124 on hashtable

2020-01-08 Thread Jonathan Wakely

On 08/01/20 06:43 +0100, François Dumont wrote:

@@ -404,15 +413,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  _M_begin() const
  { return static_cast<__node_type*>(_M_before_begin._M_nxt); }

-  // Assign *this using another _Hashtable instance. Either elements
-  // are copy or move depends on the _NodeGenerator.
-  template
+  // Assign *this using another _Hashtable instance. Whether elements
+  // are copy or move depends on the _Ht reference.


Should be "are copied or moved".

OK for trunk with that change, thanks!



Re: [PATCH] Relax invalidation of TOP N counters in PGO.

2020-01-08 Thread Martin Liška

On 1/8/20 1:24 PM, Jan Hubicka wrote:

On 1/8/20 11:35 AM, Jan Hubicka wrote:

Hi,
Just to explain better what I am worried about.  The overall sum of
counters in TOPN does not have a very good meaning if you have more than N
targets.

Let's for simplicity assume that we have TOPN for N=1 (i.e. the old code). It
guarantees that if target X is taken more than 50% of the time, it will win;
however, its count can be arbitrarily low.

Consider the following sequence of call targets

Y  Z  Z  Y  Y  Z  Z  X  X  X  X  X  X  X  X  X
The TOPN counter will behave as follows:
Y1 Y0 Z1 Z0 Y1 Y0 Z1 Z0 X1 X2 X3 X4 X5 X6 X7 X8

Now for sequence

Y  Y  Y  X  X  X  Z  Z  Z  X  X  X  Z  X  X  X
Y1 Y2 Y3 Y2 Y1 Y0 Z1 Z2 Z3 Z2 Z1 Z0 Z1 Z0 X1 X2

There are 16 calls, 9 of them calling X, 3 calls of Y and 4 calls of Z.
Depending on the order, the counter of X can be anywhere between 2 and 8. So
setting a threshold without trashing a lot of valid cases would be hard,
and if you have something like 40k parallel invocations of clang I
would expect quite high divergence between the runs.
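
The two traces above can be reproduced with a small simulation. This is a
sketch in Python of the single-slot behaviour as it appears in the traces
(take over the slot when the count is 0, increment on a match, decrement
otherwise); it is not the libgcov code, and the name top1_trace is
hypothetical.

```python
def top1_trace(targets):
    """Return the value/count state of a single-slot counter after each call."""
    value, count = None, 0
    trace = []
    for target in targets:
        if count == 0:
            value, count = target, 1   # empty or exhausted slot: take it over
        elif target == value:
            count += 1                 # hit
        else:
            count -= 1                 # miss: decay the current value
        trace.append(f"{value}{count}")
    return trace


first = "Y Z Z Y Y Z Z X X X X X X X X X".split()
second = "Y Y Y X X X Z Z Z X X X Z X X X".split()
print(" ".join(top1_trace(first)))
print(" ".join(top1_trace(second)))
```

Both inputs contain the same 16 calls (9 of X, 3 of Y, 4 of Z), yet the final
state is X8 for the first ordering and X2 for the second.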


Thank you for the example. I'm aware of these sequences that can lead
to a significant divergence in profiles.



Similar sequence exists for N>1, you only need to populate other entries
by new targets.


Yes, but it's much harder to rapidly decrease a dominant value if we
want to assume that only 10% of samples were populated.



However, based on the observation above we could probably scale up the
probabilities, assuming that the missing part of the histogram has pretty much
the same distribution as the known part.


I hope so.

I'm also open to a lower default param value. If you can provide any numbers
for clang I would appreciate that.


Well, my problem is that as soon as the parameter is non-0 you will
inherently get different counts on the counters you accept. Even if you cut
them to differ by less than 10% they will still differ from run to run.
This will lead to different speculation probabilities in each build,
which may or may not produce different code.  For complex callgraphs
even relatively small differences in weights lead to reordering of
inline decisions. That may lead to different inline decisions, but even if
it does not, it definitely leads to a different order of the callgraph and
different UIDs of decls, which give different hashtable values in late
optimizers and make them diverge.


Fine, I don't insist on the patch, it was actually reaction to your PR.



If I want to get reproducible builds of a very large code base (like
clang) I am not even sure how to test it: look for a value which leads to
no difference in say 1000 builds and do it each time GCC or clang
updates?


Yep, testing that is very difficult of course.



I would still prefer invalidation before streaming (which is fully
deterministic) and possibly have option


Do you mean __gcov_merge_topn?


-fno-deterministic-profile-merging (or something similar) for users who
do not care and for threaded programs where this is impossible
(which needs to trigger different merging in libgcov).


Fine, let's postpone that for GCC 11.

Martin



Honza





*ping* - Re: [Patch] Rework OpenACC nested reduction clause consistency checking (was: Re: [PATCH][committed] Warn about inconsistent OpenACC nested reduction clauses)

2020-01-08 Thread Harwath, Frederik
PING

Hi Jakub,
I have attached a version of the patch that has been rebased on the current 
trunk.

Frederik

On 03.12.19 12:16, Harwath, Frederik wrote:
> On 08.11.19 07:41, Harwath, Frederik wrote:
>> On 06.11.19 14:00, Jakub Jelinek wrote:
>> [...]
>>> I'm not sure it is a good idea to use a TREE_LIST in this case, vec would be
>>> more natural, wouldn't it.
>> [...]
>>> If gimplifier is not the right spot, then use a splay tree + vector instead?
>>> splay tree for the outer ones, vector for the local ones, and put into both
>>> the clauses, so you can compare reduction code etc.
>>
>> Sounds like a good idea. I am going to try that.
> 
> Below you can find a patch that reimplements the nested reductions check using
> more appropriate data structures. [...]


From b08855328c52e36143770e442e50ba87f25c14b3 Mon Sep 17 00:00:00 2001
From: Frederik Harwath 
Date: Wed, 8 Jan 2020 14:00:44 +0100
Subject: [PATCH] Rework OpenACC nested reduction clause consistency checking

Revision 277875 of trunk introduced a consistency check for nested OpenACC
reduction clauses. The implementation has two drawbacks:
1) It uses suboptimal data structures for storing information about
   the reduction clauses.
2) The warnings issued for *repeated* inconsistent use of reduction operators
   are confusing. For instance, on three nested loops that use the reduction
   operators +, -, + on the same variable, we obtain a warning at the switch
   from + to - (as desired) and another warning about the switch from - to +.
   It would be preferable to avoid the second warning since + is consistent
   with the first reduction operator.

This commit attempts to fix both problems by using more appropriate data
structures (splay trees and vectors instead of tree lists) for keeping track of
the information about the reduction clauses.
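
The difference between the two warning behaviours in point 2 can be
illustrated with a small sketch. This is Python for illustration only and
does not reflect the data structures or logic in omp-low.c; the function
names are hypothetical, and each list holds the reduction operators used on
one variable from the outermost loop inwards.

```python
def warn_adjacent(operators):
    """Behaviour described as confusing: warn at every nesting level whose
    operator differs from that of the directly enclosing loop."""
    return [i for i in range(1, len(operators))
            if operators[i] != operators[i - 1]]


def warn_against_first(operators):
    """Behaviour described as preferable: warn only when the operator
    differs from the one used by the outermost reduction."""
    return [i for i in range(1, len(operators))
            if operators[i] != operators[0]]
```

For the operators +, -, + the first function reports nesting levels 1 and 2,
while the second reports only level 1, since the innermost + agrees with the
first reduction operator.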

2020-01-08  Frederik Harwath  

	gcc/
	* omp-low.c (omp_context): Removed fields local_reduction_clauses,
	outer_reduction_clauses; added fields oacc_reduction_clauses,
	oacc_reductions_stack.
	(oacc_reduction_clause_location): New struct.
	(oacc_reduction_var_occ): New struct.
	(new_omp_context): Adjust omp_context initialization to new fields.
	(delete_omp_context): Adjust omp_context deletion to new fields.
	(rewind_oacc_reductions_stack): New function.
	(check_oacc_reduction_clause): New function.
	(check_oacc_reduction_clauses): New function.
	(scan_sharing_clauses): Call check_oacc_reduction_clause for
	reduction clauses (this handles clauses on compute regions)
	if a new optional flag is enabled.
	(scan_omp_for): Remove old nested reduction check, call
	 check_oacc_reduction_clauses instead.
	(scan_omp_target): Adapt call to scan_sharing_clauses to enable the new
	flag.

	gcc/testsuite/
	* c-c++-common/goacc/nested-reductions-warn.c: Add dg-prune-output to
	 ignore warnings that are not relevant to the test.
	(acc_parallel): Stop expecting pruned warnings, adjust expected
	warnings to changes in omp-low.c, add checks for info messages about the
	location of clauses.
	(acc_parallel_loop): Likewise.
	(acc_parallel_reduction): Likewise.
	(acc_parallel_loop_reduction): Likewise.
	(acc_routine): Likewise.
	(acc_kernels): Likewise.

	* gfortran.dg/goacc/nested-reductions-warn.f90: Likewise.
---
 gcc/omp-low.c | 306 --
 .../goacc/nested-reductions-warn.c|  81 ++---
 .../goacc/nested-reductions-warn.f90  |  83 ++---
 3 files changed, 271 insertions(+), 199 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index e692a53a3de..6026b7aff89 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -73,6 +73,9 @@ along with GCC; see the file COPYING3.  If not see
scanned for regions which are then moved to a new
function, to be invoked by the thread library, or offloaded.  */
 
+
+struct oacc_reduction_var_occ;
+
 /* Context structure.  Used to store information about each parallel
directive in the code.  */
 
@@ -128,12 +131,6 @@ struct omp_context
  corresponding tracking loop iteration variables.  */
   hash_map *lastprivate_conditional_map;
 
-  /* A tree_list of the reduction clauses in this context.  */
-  tree local_reduction_clauses;
-
-  /* A tree_list of the reduction clauses in outer contexts.  */
-  tree outer_reduction_clauses;
-
   /* Nesting depth of this context.  Used to beautify error messages re
  invalid gotos.  The outermost ctx is depth 1, with depth 0 being
  reserved for the main body of the function.  */
@@ -163,8 +160,52 @@ struct omp_context
 
   /* True if there is bind clause on the construct (i.e. a loop construct).  */
   bool loop_p;
+
+  /* A mapping that maps a variable to information about the last OpenACC
+ reduction clause that used the variable above the current context.
+ This information is used for checking the nesting restrictions for
+ reduction clauses by the function check_oacc_reduction_clauses.
+ The mapping is owned by the outermost context (i.e. a context 

Re: [PATCH] Relax invalidation of TOP N counters in PGO.

2020-01-08 Thread Jan Hubicka
> On 1/8/20 11:35 AM, Jan Hubicka wrote:
> > Hi,
> > Just to explain better what I am worried about.  The overall sum of
> > counters in TOPN does not have a very good meaning if you have more than N
> > targets.
> > 
> > Let's for simplicity assume that we have TOPN for N=1 (i.e. old code). It
> > guarantees that if target X is taken more than 50% of the time, it will win;
> > however, its count can be arbitrarily low.
> > 
> > Consider the following sequence of call targets:
> > 
> > Y  Z  Z  Y  Y  Z  Z  X  X  X  X  X  X  X  X  X
> > The TOPN counter will behave as follows:
> > Y1 Y0 Z1 Z0 Y1 Y0 Z1 Z0 X1 X2 X3 X4 X5 X6 X7 X8
> > 
> > Now for the sequence
> > 
> > Y  Y  Y  X  X  X  Z  Z  Z  X  X  X  Z  X  X  X
> > Y1 Y2 Y3 Y2 Y1 Y0 Z1 Z2 Z3 Z2 Z1 Z0 Z1 Z0 X1 X2
> > 
> > There are 16 calls, 9 of them calling X, 3 calls of Y and 4 calls of Z.
> > Depending on the order, the counter of X can be anywhere between 2 and 8. So
> > setting a threshold without trashing a lot of valid cases would be hard,
> > and if you have something like 40k parallel invocations of clang I
> > would expect quite high divergence between the runs.
> 
> Thank you for the example. I'm aware of these sequences that can lead
> to a significant divergence in profiles.
> 
> > 
> > A similar sequence exists for N>1; you only need to populate the other
> > entries with new targets.
> 
> Yes, but it's much harder to rapidly decrease a dominant value if we
> want to assume that only 10% of samples were populated.
> 
> > 
> > However, based on the observation above we could probably scale up the
> > probabilities assuming that the missing part of the histogram has pretty much
> > the same distribution as the known part.
> 
> I hope so.
> 
> I'm also open to a lower default param value. If you can provide any
> numbers for clang I would appreciate that.

Well, my problem is that as soon as the parameter is non-zero you will
inherently get different counts on the counters you accept. Even if you cut
them to differ by less than 10% they will still differ from run to run.
This will lead to different speculation probabilities in each build,
which may or may not produce different code.  For complex callgraphs
even relatively small differences in weights lead to reordering of
inline decisions, which may lead to different inline decisions; but even if
it does not, it definitely leads to a different order of the callgraph and
different UIDs of decls, which give different hashtable values in late
optimizers and make them diverge.

If I want to get reproducible builds of a very large code base (like
clang) I am not even sure how to test it: look for a value which leads to
no difference in, say, 1000 builds and do it each time GCC or clang
updates?

I would still prefer invalidation before streaming (which is fully
deterministic) and possibly have an option
-fno-deterministic-profile-merging (or something similar) for users who
do not care and for threaded programs where this is impossible
(which needs to trigger different merging in libgcov).

Honza


Re: [PATCH] Use dump_asm_name for Callers/Calls in dump.

2020-01-08 Thread Martin Liška

On 1/8/20 11:08 AM, Jan Hubicka wrote:

On 1/7/20 11:27 AM, Martin Liška wrote:

Which is fine. Apparently there are just few usages of manual printing
of a symtab node and order like:

    fprintf (f,
     "%*s%s/%i %s\n%*s  freq:%4.2f",
     indent, "", callee->name (), callee->order,

I can replace these with symtab_node::dump_{asm_}name.


Hi.

I'm addressing this by the following refactoring patch.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?

This is OK.


Ok, I've just installed both patches.


The original reason for using DECL_NAME rather than DECL_ASSEMBLER_NAME
was that the first is shorter, but also that DECL_ASSEMBLER_NAME may trigger
lazy assembler name computation that changes memory layout and led to
divergences in the resulting code when compiling with and without
-fdump-ipa-all.


Ah, that's new to me.



I guess for dumping we should use dump_name consistently unless we
really want to speak of the assembler symbol name (so I would suggest
fixing the uses of dump_asm_name in most cases to dump_name)


Well, I do prefer the asm names in cgraph dump files, as heavily templated
C++ code generates function names that are much harder to grep. I know one
can use the order to find these.

Example:



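void Field<Mesh, T, EngineTag>::clearDirty() const [with Mesh = UniformRectilinearMesh<MeshTraits<3, double, UniformRectilinearTag, CartesianTag, 3> >; T = double; EngineTag = MultiPatch<GridTag, Remote<Brick> >]/4590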
vs.
_ZNK5FieldI22UniformRectilinearMeshI10MeshTraitsILi3Ed21UniformRectilinearTag12CartesianTagLi3EEEd10MultiPatchI7GridTag6RemoteI5BrickEEE10clearDirtyEv/4590



We should not use node->name - this is an artifact from the times when we
did not have dump_name for dumping.


I still see about 80 usages of that in dumps. I'll prepare a replacement
patch.



Honza

Thanks,
Martin



 From 45629d68ac4ad9dada74e8c14a10b89025f54762 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 7 Jan 2020 12:37:42 +0100
Subject: [PATCH] Replace node->name/node->order with node->dump_name.

gcc/ChangeLog:

2020-01-07  Martin Liska  

* ipa-fnsummary.c (dump_ipa_call_summary): Use symtab_node::dump_name.
(ipa_call_context::estimate_size_and_time): Likewise.
(inline_analyze_function): Likewise.

gcc/lto/ChangeLog:

2020-01-07  Martin Liska  

* lto-partition.c (lto_balanced_map): Use symtab_node::dump_name.
---
  gcc/ipa-fnsummary.c | 12 +---
  gcc/lto/lto-partition.c |  4 ++--
  2 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index fa01cb6c083..7c0b6f98e25 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -907,8 +907,8 @@ dump_ipa_call_summary (FILE *f, int indent, struct cgraph_node *node,
int i;
  
fprintf (f,

-  "%*s%s/%i %s\n%*s  freq:%4.2f",
-  indent, "", callee->name (), callee->order,
+  "%*s%s %s\n%*s  freq:%4.2f",
+  indent, "", callee->dump_name (),
   !edge->inline_failed
   ? "inlined" : cgraph_inline_failed_string (edge-> inline_failed),
   indent, "", edge->sreal_frequency ().to_double ());
@@ -3505,9 +3505,8 @@ ipa_call_context::estimate_size_and_time (int *ret_size,
if (dump_file && (dump_flags & TDF_DETAILS))
  {
bool found = false;
-  fprintf (dump_file, "   Estimating body: %s/%i\n"
-  "   Known to be false: ", m_node->name (),
-  m_node->order);
+  fprintf (dump_file, "   Estimating body: %s\n"
+  "   Known to be false: ", m_node->dump_name ());
  
for (i = predicate::not_inlined_condition;

   i < (predicate::first_dynamic_condition
@@ -4034,8 +4033,7 @@ inline_analyze_function (struct cgraph_node *node)
push_cfun (DECL_STRUCT_FUNCTION (node->decl));
  
if (dump_file)

-fprintf (dump_file, "\nAnalyzing function: %s/%u\n",
-node->name (), node->order);
+fprintf (dump_file, "\nAnalyzing function: %s\n", node->dump_name ());
if (opt_for_fn (node->decl, optimize) && !node->thunk.thunk_p)
  inline_indirect_intraprocedural_analysis (node);
compute_fn_summary (node, false);
diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index 86b2eabe374..5b153c9759e 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -734,10 +734,10 @@ lto_balanced_map (int n_lto_partitions, int max_partition_size)
  best_varpool_pos = varpool_pos;
}
if (dump_file)
-   fprintf (dump_file, "Step %i: added %s/%i, size %i, "
+   fprintf (dump_file, "Step %i: added %s, size %i, "
 "cost %" PRId64 "/%" PRId64 " "
 "best %" PRId64 "/%" PRId64", step %i\n", i,
-order[i]->name (), order[i]->order,
+order[i]->dump_name (),
 partition->insns, cost, internal,
 best_cost, best_internal, best_i);
/* Partition is too large, unwind into step when best cost was reached and
--
2.24.1







Re: [PATCH] Make warn_inline Optimization option.

2020-01-08 Thread Jan Hubicka
On an unrelated note, looking at what we print with --verbose -v

The following options are specific to just the language LTO:
  -flinker-output=        Set linker output type (used internally during
                          LTO optimization).
  -fltrans                Run the link-time optimizer in local
                          transformation (LTRANS) mode.
  -fltrans-output-list=   Specify a file to which a list of files output by
                          LTRANS is written.
  -fresolution=           The resolution file.
  -fwpa                   Run the link-time optimizer in whole program
                          analysis (WPA) mode.
  -fwpa=                  Whole program analysis (WPA) mode with number of
                          parallel jobs specified.

"language LTO" is bit funny. Also those are internal options that
probably should not be printed with help, except for -flinker-output, which
is user visible but is the only one documented as internal.

The following options are specific to link-time optimization:
  -flinker-output=        Set linker output type

I guess that would be the expected output :)

I find it funny that we print Undocumented as "this option lacks
documentation".  I would probably not print them, or print them with
a note saying that the option is for internal use only and intentionally
undocumented or so.

Honza


Re: [PATCH] Make warn_inline Optimization option.

2020-01-08 Thread Richard Biener
On Wed, Jan 8, 2020 at 11:22 AM Jan Hubicka  wrote:
>
> > > > Given all warning options can be enabled/disabled via #pragma GCC
> > > > diagnostic, all Warning annotated options should be implicitly
> > > > 'Optimization' for the purpose of LTO streaming then?
> > >
> > > Well, perhaps they can be marked, but for late optimizations this does
> > > not work:
> > > __attribute__ ((warning("haha"))) test() { }
> > > #pragma GCC diagnostic ignored "-Wattribute-warning"
> > > test2() { test(); }
> > >
> > > We have many warning options but only few of them are late - it would be
> > > nice to have them explicitly marked somehow IMO (by design and to avoid
> > > streaming a lot of useless flags)
> >
> > Hmm, indeed.  Well, I believe we use the 'Optimization' flag for other
> > purposes than only triggering LTO streaming and option save/restore, so we
> > need another flag that only triggers save/restore then (and also allows us
> > to avoid dropping the flag at lto-option streaming time, where we currently
> > drop all warning options).
>
> Yep.  I was not aware of any other use of the "Optimization" flag, but
> it is definitely misnamed in this context (as is OPTIMIZATION_NODE btw).

I think it sets CL_OPTIMIZATION and we have in opts.c

case CL_WARNING:
  description = _("The following options control compiler warning messages");
  break;
case CL_OPTIMIZATION:
  description = _("The following options control optimizations");
  break;

so dependent on the enum value we'll get the wrong one for warnings, and the
above suggests those flags are not to be set at the same time.  Likewise
--params, btw:

case CL_PARAMS:
  description = _("The following options control parameters");
  break;

so that really asks for a separate CL_* flag (and maybe implicitly setting that
for all CL_OPTIMIZATION flags).

Richard.

> Honza
> >
> > Richard.
> >
> > > honza
> > >
> > > >
> > > > Richard.
> > > >
> > > > > Honza


Re: [PING 3][PATCH] track dynamic allocation in strlen (PR 91582)

2020-01-08 Thread Andreas Schwab
On Dec 06 2019, Martin Sebor wrote:

> diff --git a/gcc/testsuite/gcc.dg/Wstringop-overflow-27.c 
> b/gcc/testsuite/gcc.dg/Wstringop-overflow-27.c
> new file mode 100644
> index 000..249ce2b6ad5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/Wstringop-overflow-27.c

> +void test_strcpy_warn (const char *s)
> +{
> +  {
> +const char a[] = "123";
> +/* Verify that using signed int for the strlen result works (i.e.,
> +   that the conversion from signed int to size_t doesn't prevent
> +   the detection.  */
> +int n = strlen (a);
> +char *t = (char*)calloc (n, 1); // { dg-message "at offset 0 to an 
> object with size 3 allocated by 'calloc' here" "calloc note" { xfail *-*-* } }
> +// { dg-message "at offset 0 to an 
> object with size at most 3 allocated by 'calloc' here" "calloc note" { target 
> *-*-* } .-1 }

Please make the test name unique.

> +strcpy (t, a);  // { dg-warning "writing 4 bytes 
> into a region of size (between 0 and )?3 " }
> +
> +sink (t);
> +  }
> +
> +  {
> +const char a[] = "1234";
> +size_t n = strlen (a);
> +char *t = (char*)malloc (n);// { dg-message "at offset 0 to an 
> object with size 4 allocated by 'malloc' here" "malloc note" { xfail *-*-* } }
> +// { dg-message "at offset 0 to an 
> object with size at most 4 allocated by 'malloc' here" "malloc note" { target 
> *-*-* } .-1 }

Likewise.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH GCC11]Improve uninitialized warning with value range info

2020-01-08 Thread Richard Biener
On Wed, Jan 8, 2020 at 12:30 PM Bin.Cheng  wrote:
>
> On Wed, Jan 8, 2020 at 6:31 PM Richard Biener
>  wrote:
> >
> > On Wed, Jan 8, 2020 at 6:01 AM bin.cheng  
> > wrote:
> > >
> > > Sorry, here is the patch.
> > > --
> > > Sender:bin.cheng 
> > > Sent At:2020 Jan. 8 (Wed.) 12:58
> > > Recipient:GCC Patches 
> > > Subject:[PATCH GCC11]Improve uninitialized warning with value range info
> > >
> > >
> > > Hi,
> > >
> > > Function use_pred_not_overlap_with_undef_path_pred of
> > > pass_late_warn_uninitialized checks if the predicate of a variable use
> > > overlaps with the predicate of an undefined control flow path.
> > > For now, it only checks ssa_var compared against a constant; this can be
> > > improved for the case where ssa_var is compared against another ssa_var
> > > with value range info, as described in the comment:
> > >
> > > + /* Check value range info of rhs, do following transforms:
> > > +  flag_var < [min, max]  ->  flag_var < max
> > > +  flag_var > [min, max]  ->  flag_var > min
> > > +
> > > +We can also transform LE_EXPR/GE_EXPR to LT_EXPR/GT_EXPR:
> > > +  flag_var <= [min, max] ->  flag_var < [min, max+1]
> > > +  flag_var >= [min, max] ->  flag_var > [min-1, max]
> > > +if no overflow/wrap.  */
> > >
> > > This change can avoid some false warnings.  Bootstrapped and tested on
> > > x86_64, any comments?
> >
> > Definitely a good idea - the refactoring makes the patch hard to
> > follow though.  The original code seems to pick any (the "first") compare
> > against a constant.  You return the "first" but maybe from range info that
> > might also be [-INF,+INF].  It seems that we'd want to pick the best, so
> > eventually sort the predicate chain so that compares against constants come
> > first at least?  Not sure if it really makes a difference but
> I don't know either, but I simply tried not to break existing code in
> the patch.
> Function prune_uninit_phi_opnds is called for the first compare against a
> constant; actually it should be called for each comparison, but I guess it's
> just avoiding O(M*N) complexity here.

Yeah.  I'm just worried finding a "bad" value-range predicate cuts the search
in a way that causes extra bogus warnings?
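
For readers following along, the range-based transforms from the quoted patch
comment can be sketched in plain C as below. This is only an illustration
under stated assumptions: the enum, struct, and function names are
hypothetical and not taken from tree-ssa-uninit.c, and a "long" stands in
for the tree constants the real pass works on.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical comparison codes, standing in for LT_EXPR etc.  */
enum cmp_code { CMP_LT, CMP_LE, CMP_GT, CMP_GE };

/* A predicate "flag_var CODE BOUND" against a constant.  VALID is false
   when no constant form could be derived.  */
struct const_pred
{
  bool valid;
  enum cmp_code code;
  long bound;
};

/* Given "flag_var CODE rhs" where rhs is known to lie in [MIN, MAX],
   return a weaker predicate against a constant that the original implies.  */
struct const_pred
relax_with_range (enum cmp_code code, long min, long max)
{
  struct const_pred p = { false, code, 0 };

  switch (code)
    {
    case CMP_LT:		/* flag_var < [min, max]  ->  flag_var < max  */
      p.valid = true;
      p.bound = max;
      break;
    case CMP_GT:		/* flag_var > [min, max]  ->  flag_var > min  */
      p.valid = true;
      p.bound = min;
      break;
    case CMP_LE:		/* flag_var <= [min, max] ->  flag_var < max + 1  */
      if (max != LONG_MAX)	/* Only if max + 1 does not overflow.  */
	{
	  p.valid = true;
	  p.code = CMP_LT;
	  p.bound = max + 1;
	}
      break;
    case CMP_GE:		/* flag_var >= [min, max] ->  flag_var > min - 1  */
      if (min != LONG_MIN)	/* Only if min - 1 does not overflow.  */
	{
	  p.valid = true;
	  p.code = CMP_GT;
	  p.bound = min - 1;
	}
      break;
    }
  return p;
}
```

Note that a range of [-INF,+INF] still yields a "valid" result for < and >
in this sketch, just a useless one, which is the "bad" predicate case
discussed above.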

>
> > even currently we could have i < 5, i < 1 so the "better" one later?
> > It might also make sense to simply push three predicates for i < j,
> > namely i < j (if ever useful), i < min(j) and max(i) < j to avoid
> > repeatedly doing the range computations.
> IIUC, with the current implementation, it's not useful to check value range
> info for both sides of the comparison because prune_uninit_phi_opnds
> requires the flag_var be defined by a PHI node in the same basic block
> as the PHI parameter.

Yes, but without remembering the code very well my suggestion allows
"new" predicates to be gathered during collecting phase while your patch
adjusts the query phase?

Richard.

> Thanks,
> bin
> >
> > Thanks,
> > Richard.
> >
> > > Thanks,
> > > bin
> > >
> > > 2020-01-08  Bin Cheng  
> > >
> > >  * tree-ssa-uninit.c (find_var_cmp_const): New function.
> > >  (use_pred_not_overlap_with_undef_path_pred): Call above.


Re: [PATCH] Relax invalidation of TOP N counters in PGO.

2020-01-08 Thread Martin Liška

On 1/8/20 11:35 AM, Jan Hubicka wrote:

Hi,
Just to explain better what I am worried about.  The overall sum of
counters in TOPN does not have a very good meaning if you have more than N
targets.

Let's for simplicity assume that we have TOPN for N=1 (i.e. old code). It
guarantees that if target X is taken more than 50% of the time, it will win;
however, its count can be arbitrarily low.

Consider the following sequence of call targets:

Y  Z  Z  Y  Y  Z  Z  X  X  X  X  X  X  X  X  X
The TOPN counter will behave as follows:
Y1 Y0 Z1 Z0 Y1 Y0 Z1 Z0 X1 X2 X3 X4 X5 X6 X7 X8

Now for the sequence

Y  Y  Y  X  X  X  Z  Z  Z  X  X  X  Z  X  X  X
Y1 Y2 Y3 Y2 Y1 Y0 Z1 Z2 Z3 Z2 Z1 Z0 Z1 Z0 X1 X2

There are 16 calls, 9 of them calling X, 3 calls of Y and 4 calls of Z.
Depending on the order, the counter of X can be anywhere between 2 and 8. So
setting a threshold without trashing a lot of valid cases would be hard,
and if you have something like 40k parallel invocations of clang I
would expect quite high divergence between the runs.
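
The two traces above can be reproduced with a small stand-alone simulation of
the N=1 counter. This is a sketch written for this thread, not the libgcov
code: the struct and function names are hypothetical, and call targets are
modelled as single characters.

```c
#include <stddef.h>

/* State of a single-value (N=1) counter: the currently tracked target
   and its running count.  */
struct one_value_counter
{
  char target;
  long count;
};

/* Feed one observed call target into the counter: a hit increments the
   count, a miss decrements it, and a miss at count zero replaces the
   tracked target.  */
static void
one_value_update (struct one_value_counter *c, char target)
{
  if (c->target == target)
    c->count++;
  else if (c->count == 0)
    {
      c->target = target;
      c->count = 1;
    }
  else
    c->count--;
}

/* Run the counter over a NUL-terminated sequence of targets and return
   its final state.  */
struct one_value_counter
simulate_calls (const char *targets)
{
  struct one_value_counter c = { '\0', 0 };

  for (size_t i = 0; targets[i] != '\0'; i++)
    one_value_update (&c, targets[i]);
  return c;
}
```

Calling simulate_calls ("YZZYYZZXXXXXXXXX") ends with X at count 8, while
simulate_calls ("YYYXXXZZZXXXZXXX") ends with X at count 2, although both
sequences contain the same 9 calls to X.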


Thank you for the example. I'm aware of these sequences that can lead
to a significant divergence in profiles.



A similar sequence exists for N>1; you only need to populate the other
entries with new targets.


Yes, but it's much harder to rapidly decrease a dominant value if we
want to assume that only 10% of samples were populated.



However, based on the observation above we could probably scale up the
probabilities assuming that the missing part of the histogram has pretty much
the same distribution as the known part.


I hope so.

I'm also open to a lower default param value. If you can provide any numbers
for clang I would appreciate that.

Martin



Honza





[PATCH] Avoid operand re-parsing when moving stmts, PR93199

2020-01-08 Thread Richard Biener


The following adjusts gsi_remove to do what is documented - not touch
operand caches or force updating by marking it modified when the
remove is not permanent.  This avoids redundant operand scans for
stmt move (gsi_move_* does a gsi_remove / gsi_insert combo as well).

For the original testcase motivating PR93199 this cuts down operand
scanner time from

tree operand scan  : 1053.72 ( 38%)  730.65 (48%)  1790.31 ( 42%)  54141 kB (  0%)

to

tree operand scan  :    0.93 (  0%)    0.41 (3%)     1.36 (  0%)  54141 kB (  0%)

Fallout is in places that relied on the implicit side-effect of
the gsi_remove to mark a stmt modified and thus trigger update_stmt
at gsi_insert time.  Likewise dangling stmts where passes remove
stmts but do not tell gsi_remove the remove is final (the gimple FE
change is of that kind).

Bootstrapped on x86_64-unknown-linux-gnu, re-testing after the
interchange and GIMPLE FE fix.

Richard.

2020-01-08  Richard Biener  

PR middle-end/93199
c/
* gimple-parser.c (c_parser_parse_gimple_body): Remove __PHI IFN
permanently.

* gimple-fold.c (rewrite_to_defined_overflow): Mark stmt modified.
* tree-ssa-loop-im.c (move_computations_worker): Properly adjust
virtual operand, also updating SSA use.
* gimple-loop-interchange.cc (loop_cand::undo_simple_reduction):
Update stmt after resetting virtual operand.
(tree_loop_interchange::move_code_to_inner_loop): Likewise.

* gimple-iterator.c (gsi_remove): When not removing the stmt
permanently do not delink immediate uses or mark the stmt modified.

Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c   (revision 279992)
+++ gcc/gimple-fold.c   (working copy)
@@ -7380,6 +7380,7 @@ rewrite_to_defined_overflow (gimple *stm
   gimple_assign_set_lhs (stmt, make_ssa_name (type, stmt));
   if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR)
 gimple_assign_set_rhs_code (stmt, PLUS_EXPR);
+  gimple_set_modified (stmt, true);
   gimple_seq_add_stmt (, stmt);
   gimple *cvt = gimple_build_assign (lhs, NOP_EXPR, gimple_assign_lhs (stmt));
   gimple_seq_add_stmt (, cvt);
Index: gcc/gimple-iterator.c
===
--- gcc/gimple-iterator.c   (revision 279992)
+++ gcc/gimple-iterator.c   (working copy)
@@ -558,16 +558,18 @@ gsi_remove (gimple_stmt_iterator *i, boo
   gimple *stmt = gsi_stmt (*i);
   bool require_eh_edge_purge = false;
 
+  /* ???  Do we want to do this for non-permanent operation?  */
   if (gimple_code (stmt) != GIMPLE_PHI)
 insert_debug_temps_for_defs (i);
 
-  /* Free all the data flow information for STMT.  */
   gimple_set_bb (stmt, NULL);
-  delink_stmt_imm_use (stmt);
-  gimple_set_modified (stmt, true);
 
   if (remove_permanently)
 {
+  /* Free all the data flow information for STMT.  */
+  delink_stmt_imm_use (stmt);
+  gimple_set_modified (stmt, true);
+
   if (gimple_debug_nonbind_marker_p (stmt))
/* We don't need this to be exact, but try to keep it at least
   close.  */
Index: gcc/tree-ssa-loop-im.c
===
--- gcc/tree-ssa-loop-im.c  (revision 279992)
+++ gcc/tree-ssa-loop-im.c  (working copy)
@@ -1231,7 +1231,8 @@ move_computations_worker (basic_block bb
  gphi *phi = gsi2.phi ();
  if (virtual_operand_p (gimple_phi_result (phi)))
{
- gimple_set_vuse (stmt, PHI_ARG_DEF_FROM_EDGE (phi, e));
+ SET_USE (gimple_vuse_op (stmt),
+  PHI_ARG_DEF_FROM_EDGE (phi, e));
  break;
}
}
Index: gcc/c/gimple-parser.c
===
--- gcc/c/gimple-parser.c   (revision 279944)
+++ gcc/c/gimple-parser.c   (working copy)
@@ -327,7 +327,7 @@ c_parser_parse_gimple_body (c_parser *cp
  add_phi_arg (phi, gimple_call_arg (stmt, i + 1), e,
   UNKNOWN_LOCATION);
  }
-   gsi_remove (, false);
+   gsi_remove (, true);
  }
  /* Fill SSA name gaps, putting them on the freelist.  */
  for (unsigned i = 1; i < num_ssa_names; ++i)
Index: gcc/gimple-loop-interchange.cc
===
--- gcc/gimple-loop-interchange.cc  (revision 279944)
+++ gcc/gimple-loop-interchange.cc  (working copy)
@@ -879,6 +879,7 @@ loop_cand::undo_simple_reduction (reduct
   if (re->producer != NULL)
 {
   gimple_set_vuse (re->producer, NULL_TREE);
+  update_stmt (re->producer);
   from = gsi_for_stmt (re->producer);
   gsi_remove (, false);
   gimple_seq_add_stmt_without_update (, re->producer);
@@ -920,6 +921,7 @@ 

Re: [PATCH] [amdgcn] Add support for sub-word sync_compare_and_swap operations

2020-01-08 Thread Andrew Stubbs

On 08/01/2020 11:07, Kwok Cheung Yeung wrote:

+#define __sync_subword_compare_and_swap(type, size)    \


Macro parameters are conventionally upper case.


+    \
+type    \
+__sync_val_compare_and_swap_##size (type *ptr, type oldval, type 
newval)    \

+{    \
+  unsigned int *wordptr    \
+    = (unsigned int *)((unsigned long long) ptr & ~3ULL);    \


Please use "intptr_t" rather than "unsigned long long" (which should 
probably have been "unsigned long" anyway).



+  int shift = ((unsigned long long) ptr & 3ULL) * 8;    \
+  unsigned int valmask = (1 << (size * 8)) - 1;    \
+  unsigned int wordmask = ~(valmask << shift);    \
+  unsigned int oldword = *wordptr;    \
+  for (;;)    \
+    {    \
+  type prevval = (oldword >> shift) & valmask;    \
+  if (__builtin_expect (prevval != oldval, 0))    \
+    return prevval;    \
+  unsigned int newword = oldword & wordmask;    \
+  newword |= ((unsigned int) newval) << shift;    \
+  unsigned int prevword    \
+  = __sync_val_compare_and_swap_4 (wordptr, oldword, 
newword);    \

+  if (__builtin_expect (prevword == oldword, 1))    \
+    return oldval;    \
+  oldword = prevword;    \
+    }    \
+}    \
+    \
+bool    \
+__sync_bool_compare_and_swap_##size (type *ptr, type oldval, type 
newval)   \

+{    \
+  return __sync_val_compare_and_swap_##size(ptr, oldval, newval) == 
oldval; \


Space before '('.

I presume all the '\' are lined up, but the patch has got mangled in the 
email. Please use an inline attachment.



+}
+
+__sync_subword_compare_and_swap (unsigned char, 1)
+__sync_subword_compare_and_swap (unsigned short, 2)
+
diff --git a/libgcc/config/gcn/t-amdgcn b/libgcc/config/gcn/t-amdgcn
index adbd866..fe7b5fa 100644
--- a/libgcc/config/gcn/t-amdgcn
+++ b/libgcc/config/gcn/t-amdgcn
@@ -1,4 +1,5 @@
-LIB2ADD += $(srcdir)/config/gcn/lib2-divmod.c \
+LIB2ADD += $(srcdir)/config/gcn/atomic.c \
+   $(srcdir)/config/gcn/lib2-divmod.c \
     $(srcdir)/config/gcn/lib2-divmod-hi.c \
     $(srcdir)/config/gcn/unwind-gcn.c



Andrew
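
For reference, the shift/mask arithmetic that the reviewed macro relies on can
be looked at in isolation, without the atomic retry loop. The helper names
below are hypothetical, little-endian byte order is assumed, and uintptr_t is
used here as the unsigned counterpart of the intptr_t suggested above (that
choice is an assumption of this sketch, not part of the review).

```c
#include <stdint.h>

/* Bit offset of a sub-word value inside its containing 4-byte-aligned
   word, derived from the low two bits of the value's byte address.  */
static unsigned int
subword_shift (uintptr_t addr)
{
  return (unsigned int) (addr & 3u) * 8u;
}

/* Extract the SIZE-byte (1 or 2) value stored at ADDR from WORD.  */
uint32_t
subword_extract (uint32_t word, uintptr_t addr, unsigned int size)
{
  uint32_t valmask = ((uint32_t) 1 << (size * 8)) - 1;

  return (word >> subword_shift (addr)) & valmask;
}

/* Return WORD with the SIZE-byte (1 or 2) value at ADDR replaced by
   NEWVAL; all other bytes of WORD are preserved.  */
uint32_t
subword_replace (uint32_t word, uintptr_t addr, unsigned int size,
		 uint32_t newval)
{
  unsigned int shift = subword_shift (addr);
  uint32_t valmask = ((uint32_t) 1 << (size * 8)) - 1;
  uint32_t wordmask = ~(valmask << shift);

  return (word & wordmask) | ((newval & valmask) << shift);
}
```

In the macro, the result of the replace step is what gets passed as the new
word to the 4-byte compare-and-swap, and the extract step recovers the
previous sub-word value from the word that was read.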


Re: [PATCH] Make warn_inline Optimization option.

2020-01-08 Thread Martin Liška

On 1/8/20 12:08 PM, Jan Hubicka wrote:

Hmm, indeed.  Well, I believe we use the 'Optimization' flag for other purposes
than only triggering LTO streaming and option save/restore, so we need another
flag that only triggers save/restore then (and also allows us to avoid
dropping the flag at lto-option streaming time, where we currently drop all
warning options).


jan@skylake:~/trunk/gcc> grep Optimization *.awk
optc-save-gen.awk:  if (flag_set_p("(Optimization|PerFunction)", flags[i])) {
optc-save-gen.awk:  if (flag_set_p("(Optimization|PerFunction)", flags[i])) {
optc-save-gen.awk:  var_opt_hash[n_opt_val] = flag_set_p("Optimization", flags[i]);
opt-functions.awk:test_flag("(Optimization|PerFunction)", flags, " | CL_OPTIMIZATION")
opth-gen.awk:   if (flag_set_p("(Optimization|PerFunction)", flags[i])) {
jan@skylake:~/trunk/gcc> grep PerFunction *.opt
Common Report Var(flag_var_tracking) Init(2) PerFunction
Common Report Var(flag_var_tracking_assignments) Init(2) PerFunction
Common Report Var(flag_var_tracking_assignments_toggle) PerFunction
Common Report Var(flag_var_tracking_uninit) PerFunction

So I suppose we want PerFunction to be used for warnings and params
and stuff that is not an optimization like -fasynchronous-unwind-tables.


Like what I suggest in the patch?



Indeed Optimization adds CL_OPTIMIZATION, which seems to make options
appear in the section "The following options control optimization", which is
not precisely accurate either, since we think we still have
optimizations that are not per function (-fsemantic-interposition could
count as such, though I guess the definition of what is an "optimization" is
a bit shady)


Yes, even using PerFunction will print -Winline and --param=* in
--help=optimize.
What we can do is to exclude from CL_OPTIMIZATION all options which have
CL_PARAMS or CL_WARNING (I mean for --help purposes)?

Martin



Honza


Richard.


honza



Richard.


Honza


From ef194683d2c27e661f1d967231fac8c91f2ae087 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 8 Jan 2020 12:30:09 +0100
Subject: [PATCH] Use PerFunction for params and a warning option.

gcc/ChangeLog:

2020-01-08  Martin Liska  

	* common.opt: Use PerFunction instead of Optimization.
	* params.opt: Likewise.
---
 gcc/common.opt |   2 +-
 gcc/params.opt | 380 -
 2 files changed, 191 insertions(+), 191 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 9fc921109ca..92acb8aa7f8 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -617,7 +617,7 @@ Common Var(warn_implicit_fallthrough) RejectNegative Joined UInteger Warning Int
 Warn when a switch case falls through.
 
 Winline
-Common Var(warn_inline) Warning Optimization
+Common Var(warn_inline) Warning PerFunction
 Warn when an inlined function cannot be inlined.
 
 Winvalid-memory-model
diff --git a/gcc/params.opt b/gcc/params.opt
index 5d39244761a..95566e7318a 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -23,11 +23,11 @@
 ; Please try to keep this file in ASCII collating order.
 
 -param=align-loop-iterations=
-Common Joined UInteger Var(param_align_loop_iterations) Init(4) Param Optimization
+Common Joined UInteger Var(param_align_loop_iterations) Init(4) Param PerFunction
 Loops iterating at least selected number of iterations will get loop alignment.
 
 -param=align-threshold=
-Common Joined UInteger Var(param_align_threshold) Init(100) IntegerRange(1, 65536) Param Optimization
+Common Joined UInteger Var(param_align_threshold) Init(100) IntegerRange(1, 65536) Param PerFunction
 Select fraction of the maximal frequency of executions of basic block in function given basic block get alignment.
 
 -param=asan-globals=
@@ -35,51 +35,51 @@ Common Joined UInteger Var(param_asan_globals) Init(1) IntegerRange(0, 1) Param
 Enable asan globals protection.
 
 -param=asan-instrument-allocas=
-Common Joined UInteger Var(param_asan_protect_allocas) Init(1) IntegerRange(0, 1) Param Optimization
+Common Joined UInteger Var(param_asan_protect_allocas) Init(1) IntegerRange(0, 1) Param PerFunction
 Enable asan allocas/VLAs protection.
 
 -param=asan-instrument-reads=
-Common Joined UInteger Var(param_asan_instrument_reads) Init(1) IntegerRange(0, 1) Param Optimization
+Common Joined UInteger Var(param_asan_instrument_reads) Init(1) IntegerRange(0, 1) Param PerFunction
 Enable asan load operations protection.
 
 -param=asan-instrument-writes=
-Common Joined UInteger Var(param_asan_instrument_writes) Init(1) IntegerRange(0, 1) Param Optimization
+Common Joined UInteger Var(param_asan_instrument_writes) Init(1) IntegerRange(0, 1) Param PerFunction
 Enable asan store operations protection.
 
 -param=asan-instrumentation-with-call-threshold=
-Common Joined UInteger Var(param_asan_instrumentation_with_call_threshold) Init(7000) Param Optimization
+Common Joined UInteger Var(param_asan_instrumentation_with_call_threshold) Init(7000) Param PerFunction
 Use callbacks instead of inline code if 

Re: [PATCH GCC11]Improve uninitialized warning with value range info

2020-01-08 Thread Bin.Cheng
On Wed, Jan 8, 2020 at 6:31 PM Richard Biener
 wrote:
>
> On Wed, Jan 8, 2020 at 6:01 AM bin.cheng  wrote:
> >
> > Sorry, here is the patch.
> > --
> > Sender:bin.cheng 
> > Sent At:2020 Jan. 8 (Wed.) 12:58
> > Recipient:GCC Patches 
> > Subject:[PATCH GCC11]Improve uninitialized warning with value range info
> >
> >
> > Hi,
> >
> > Function use_pred_not_overlap_with_undef_path_pred of 
> > pass_late_warn_uninitialized
> > checks if predicate of variable use overlaps with predicate of undefined 
> > control flow path.
> > For now, it only checks ssa_var comparing against constant, this can be 
> > improved where
> > ssa_var compares against another ssa_var with value range info, as 
> > described in comment:
> >
> > + /* Check value range info of rhs, do following transforms:
> > +  flag_var < [min, max]  ->  flag_var < max
> > +  flag_var > [min, max]  ->  flag_var > min
> > +
> > +We can also transform LE_EXPR/GE_EXPR to LT_EXPR/GT_EXPR:
> > +  flag_var <= [min, max] ->  flag_var < [min, max+1]
> > +  flag_var >= [min, max] ->  flag_var > [min-1, max]
> > +if no overflow/wrap.  */
> >
> > This change can avoid some false warning.  Bootstrap and test on x86_64, 
> > any comment?
>
> Definitely a good idea - the refactoring makes the patch hard to
> follow though.  The
> original code seems to pick any (the "first") compare against a constant.  You
> return the "first" but maybe from range info that might also be
> [-INF,+INF].  It seems
> that we'd want to pick the best so eventually sort the predicate chain
> so that compares
> against constants come first at least?  Not sure if it really makes a
> difference but
I don't know either; I simply tried not to break existing code in
the patch.
Function prune_uninit_phi_opnds is called only for the first comparison
against a constant; strictly it should be called for each comparison, but I
guess this just avoids O(M*N) complexity here.


> even currently we could have i < 5, i < 1 so the "better" one later?
> It might also make
> sense to simply push three predicates for i < j, namely i < j (if ever
> useful), i < min(j)
> and max(i) < j to avoid repeatedly doing the range computations.
IIUC, with the current implementation, it's not useful to check value range
info for both sides of the comparison, because prune_uninit_phi_opnds
requires flag_var to be defined by a PHI node in the same basic block
as the PHI parameter.

Thanks,
bin
>
> Thanks,
> Richard.
>
> > Thanks,
> > bin
> >
> > 2020-01-08  Bin Cheng  
> >
> >  * tree-ssa-uninit.c (find_var_cmp_const): New function.
> >  (use_pred_not_overlap_with_undef_path_pred): Call above.
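
The transforms listed in the patch comment can be sketched in plain C as
follows.  This is only an illustration of the arithmetic, not the code from
the patch: the names cmp_code, int_range and cmp_range_to_const are made up
for the example, and int64_t stands in for GCC's wide-int values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for tree comparison codes and value range info.  */
enum cmp_code { CMP_LT, CMP_LE, CMP_GT, CMP_GE };

struct int_range { int64_t min, max; };

/* Given "flag_var CODE rhs" where rhs is known to lie in R, compute a
   predicate "flag_var *OUT_CODE *OUT_CST" that is implied by it.
   Returns false when the LE/GE adjustment would overflow.  */
bool
cmp_range_to_const (enum cmp_code code, struct int_range r,
                    enum cmp_code *out_code, int64_t *out_cst)
{
  switch (code)
    {
    case CMP_LT:  /* flag_var < [min, max]  ->  flag_var < max  */
      *out_code = CMP_LT;
      *out_cst = r.max;
      return true;
    case CMP_GT:  /* flag_var > [min, max]  ->  flag_var > min  */
      *out_code = CMP_GT;
      *out_cst = r.min;
      return true;
    case CMP_LE:  /* flag_var <= [min, max]  ->  flag_var < max + 1  */
      if (r.max == INT64_MAX)
        return false;
      *out_code = CMP_LT;
      *out_cst = r.max + 1;
      return true;
    case CMP_GE:  /* flag_var >= [min, max]  ->  flag_var > min - 1  */
      if (r.min == INT64_MIN)
        return false;
      *out_code = CMP_GT;
      *out_cst = r.min - 1;
      return true;
    }
  return false;
}
```

For a right-hand side known to be in [3, 10], "flag_var <= rhs" becomes
"flag_var < 11" and "flag_var > rhs" becomes "flag_var > 3"; a range whose
upper bound is already the maximum value is rejected for the LE case.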


Re: [Patch 0/X] HWASAN v3

2020-01-08 Thread Matthew Malcomson
Hi everyone,

I'm writing this email to summarise & publicise the state of this patch 
series, especially the difficulties around approval for GCC 10 mentioned 
on IRC.


The main obstacle seems to be that no maintainer feels they have enough 
knowledge about hwasan and justification that it's worthwhile to approve 
the patch series.

Similarly, Martin has given a review of the parts of the code he can 
(thanks!), but doesn't feel he can do a deep review of the code related 
to the RTL hooks and stack expansion -- hence that part is as yet not 
reviewed in-depth.



The questions around justification raised on IRC are mainly that it 
seems like a proof-of-concept for MTE rather than a stand-alone useable 
sanitizer.  Especially since in the GNU world hwasan instrumented code 
is not really ready for production since we can only use the 
less-"interceptor ABI" rather than the "platform ABI".  This restriction 
is because there is no version of glibc with the required modifications 
to provide the "platform ABI".

(n.b. that since https://reviews.llvm.org/D69574 the code-generation for 
these ABI's is the same).


From my perspective the reasons that make HWASAN useful in itself are:

1) Much less memory usage.

From a back-of-the-envelope calculation based on the hwasan paper's 
table of memory overhead from over-alignment 
https://arxiv.org/pdf/1802.09517.pdf  I guess hwasan instrumented code 
has an overhead of about 1.1x (~4% from overalignment and ~6.25% from 
shadow memory), while asan seems to have an overhead somewhere in the 
range 1.5x - 3x.

Maybe there's some data out there comparing total overheads that I 
haven't found? (I'd appreciate a reference if anyone has that info).



2) Available on more architectures than MTE.

HWASAN only requires TBI, which is a feature of all AArch64 machines, 
while MTE will be an optional extension and only available on certain 
architectures.


3) This enables using hwasan in the kernel.

While instrumented user-space applications will be using the 
"interceptor ABI" and hence are likely not production-quality, the 
biggest aim of implementing hwasan in GCC is to allow building the Linux 
kernel with tag-based sanitization using GCC.

Instrumented kernel code uses hooks in the kernel itself, so this ABI 
distinction is no longer relevant, and this sanitizer should produce a 
production-quality kernel binary.




I'm hoping I can find a maintainer willing to review and ACK this patch 
series -- especially with stage3 coming to a close soon.  If there's 
anything else I could do to help get someone willing up-to-speed then 
please just ask.


Cheers,
Matthew



On 07/01/2020 15:14, Martin Liška wrote:
> On 12/12/19 4:18 PM, Matthew Malcomson wrote:
> 
> Hello.
> 
> I've just sent few comments that are related to the v3 of the patch set.
> Based on the HWASAN (limited) knowledge the patch seems reasonable to me.
> I haven't looked much at the newly introduced RTL-hooks.
> But these seems to me isolated to the aarch64 port.
> 
> I can also verify that the patchset works on my aarch64 linux machine and
> hwasan.exp and asan.exp tests succeed.
> 
>> I haven't gotten ASAN_MARK to print as HWASAN_MARK when using memory 
>> tagging,
>> since I'm not sure the way I found to implement this would be 
>> acceptable.  The
>> inlined patch below works but it requires a special declaration 
>> instead of just
>> an ~#include~.
> 
> Knowing that, I would not bother with the printing of HWASAN_MARK.
> 
> Thanks for the series,
> Martin



Re: [RFC] IVOPTs select cand with preferred D-form access

2020-01-08 Thread Bin.Cheng
On Tue, Jan 7, 2020 at 6:48 PM Kewen.Lin  wrote:
>
> on 2020/1/7 下午5:14, Richard Biener wrote:
> > On Mon, 6 Jan 2020, Kewen.Lin wrote:
> >
> >> We are thinking whether it can be handled in IVOPTs instead of one RTL 
> >> pass.
> >>
> >> During IVOPTs selecting IV cands, it doesn't know the loop will be 
> >> unrolled so
> >> it doesn't count the possible step cost in with X-form.  If we can teach 
> >> it to
> >> consider the case, the IV cands which plays with D-form can be preferred.
> >> Currently unrolling (incomplete) happens in RTL, it looks we have to 
> >> predict
> >> the loop whether unroll in IVOPTs.  Since there is some parameter checks 
> >> on RTL
> >> insn counts and target hooks, it seems not easy to get that.  Besides, we 
> >> need
> >> to check the step is valid to put into D-form field (eg: DQ-form requires 
> >> divide
> >> 16 exactly), to ensure no extra ADDIs needed.
> >>
> >> I'm not sure whether it's a good idea to implement in IVOPTs, but I did 
> >> some
> >> changes in IVOPTs to prove it's doable to get expected codes, the patch is 
> >> attached.
> >>
> >> Any comments/suggestions are highly appreciated!
> >
> > Is the unrolled code better than the not unrolled code (assuming
> > optimal IV choice)?  Then IMHO IVOPTs should drive the unrolling,
> > either by actually doing it or by forcing it via the loop->unroll
> > setting.  I don't think second-guessing the RTL unroller at this
> > point is going to work.  Alternatively turn X-form into D-form during
> > RTL unrolling?
> >
>
> Hi Richard,
>
> Thanks for the comments!
>
> Yes, unrolled version is better on Power9 for both forms, but D-form unrolled 
> is better
> than X-form unrolled.  If we drive unrolling in IVOPTs, not sure it will be a 
> concern
> that IVOPTs becomes too heavy? or too rude with forced UF if imprecise? Do we 
> still
> have the plan to introduce one middle-end unroll pass, does it help if yes?
I am a bit worried that this would make IVOPTs too heavy as well; it might be
possible to compute a heuristic for whether the loop should be unrolled as a
post-IVOPTs transformation.  Of course that transformation needs to do
more work than simply unrolling in order to take advantage of the
aforementioned addressing mode.
BTW, won't the unrolled loop perform worse than on ppc if the target doesn't
support the [base + register + offset] addressing mode?

Another point, in case of multiple passes doing unrolling, the
"already unrolled" information may need to be recorded as a flag of
loop properties.

Thanks,
bin
> The quoted RTL patch is to propose one RTL pass after RTL loop passes, it 
> also sounds
> good to check whether RTL unrolling is a good place!
>
>
> BR,
> Kewen
>


[PATCH] [amdgcn] Add support for sub-word sync_compare_and_swap operations

2020-01-08 Thread Kwok Cheung Yeung

Hello

This patch adds support for 8- and 16-bit sync_compare_and_swap 
operations on AMD GCN. GCN does not natively support atomic compare and 
swap for quantities smaller than 32-bit words, so the subword compare 
and swap is implemented in terms of 32-bit compare and swap.


The algorithm is similar to that on other architectures (e.g. 
SUBWORD_SYNC_OP in libgcc/config/arm/linux-atomic.c).  That is, it reads from 
memory the word containing the subword, creates a new word by replacing 
the subword with the swap value, then does a 32-bit atomic compare and 
swap with the old and new words.  If the operation fails because 
a part of the word other than the target subword changed since 
the last read, it must try again using the updated word.
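
The word-rebuilding step in the description above can be illustrated with
plain integer arithmetic.  This is a minimal sketch, assuming the same
low-byte-first layout that the shift computation in the patch uses;
replace_subword and extract_subword are hypothetical helper names, not part
of the patch, and no atomics are involved here.

```c
#include <stdint.h>

/* Return WORD with the SIZE-byte field at byte offset BYTE_OFFSET replaced
   by NEWVAL.  SIZE must be 1 or 2 and the field must lie inside the word.  */
uint32_t
replace_subword (uint32_t word, unsigned byte_offset, unsigned size,
                 uint32_t newval)
{
  unsigned shift = byte_offset * 8;
  uint32_t valmask = (UINT32_C (1) << (size * 8)) - 1;  /* 0xff or 0xffff */
  uint32_t wordmask = ~(valmask << shift);              /* clears the field */
  return (word & wordmask) | ((newval & valmask) << shift);
}

/* Return the SIZE-byte field at byte offset BYTE_OFFSET of WORD.  */
uint32_t
extract_subword (uint32_t word, unsigned byte_offset, unsigned size)
{
  unsigned shift = byte_offset * 8;
  uint32_t valmask = (UINT32_C (1) << (size * 8)) - 1;
  return (word >> shift) & valmask;
}
```

For example, replacing the byte at offset 1 of 0x11223344 with 0xAB gives
0x1122AB44; the other three bytes are carried over unchanged, which is why
the 32-bit compare and swap has to be retried when any of them changes
concurrently.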


Okay for trunk?

Kwok


2020-01-08  Kwok Cheung Yeung  

libgcc/
* config/gcn/atomic.c: New.
* config/gcn/t-amdgcn (LIB2ADD): Add atomic.c.
---
 libgcc/config/gcn/atomic.c | 60 ++
 libgcc/config/gcn/t-amdgcn |  3 ++-
 2 files changed, 62 insertions(+), 1 deletion(-)
 create mode 100644 libgcc/config/gcn/atomic.c

diff --git a/libgcc/config/gcn/atomic.c b/libgcc/config/gcn/atomic.c
new file mode 100644
index 000..6514dfc
--- /dev/null
+++ b/libgcc/config/gcn/atomic.c
@@ -0,0 +1,60 @@
+/* AMD GCN atomic operations
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   Contributed by Mentor Graphics.
+
+   This file is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by the
+   Free Software Foundation; either version 3, or (at your option) any
+   later version.
+
+   This file is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+#include 
+
+#define __sync_subword_compare_and_swap(type, size) \
+ \
+type \
+__sync_val_compare_and_swap_##size (type *ptr, type oldval, type newval) \
+{ \
+  unsigned int *wordptr \
+    = (unsigned int *)((unsigned long long) ptr & ~3ULL); \
+  int shift = ((unsigned long long) ptr & 3ULL) * 8; \
+  unsigned int valmask = (1 << (size * 8)) - 1; \
+  unsigned int wordmask = ~(valmask << shift); \
+  unsigned int oldword = *wordptr; \
+  for (;;) \
+    { \
+      type prevval = (oldword >> shift) & valmask; \
+      if (__builtin_expect (prevval != oldval, 0)) \
+        return prevval; \
+      unsigned int newword = oldword & wordmask; \
+      newword |= ((unsigned int) newval) << shift; \
+      unsigned int prevword \
+        = __sync_val_compare_and_swap_4 (wordptr, oldword, newword); \
+      if (__builtin_expect (prevword == oldword, 1)) \
+        return oldval; \
+      oldword = prevword; \
+    } \
+} \
+ \
+bool \
+__sync_bool_compare_and_swap_##size (type *ptr, type oldval, type newval) \
+{ \
+  return __sync_val_compare_and_swap_##size(ptr, oldval, newval) == oldval; \
+}
+
+__sync_subword_compare_and_swap (unsigned char, 1)
+__sync_subword_compare_and_swap (unsigned short, 2)
+
diff --git a/libgcc/config/gcn/t-amdgcn b/libgcc/config/gcn/t-amdgcn
index adbd866..fe7b5fa 100644
--- 

Re: [PATCH] Make warn_inline Optimization option.

2020-01-08 Thread Jan Hubicka
> Hmm, indeed.  Well, I believe we use the 'Optimization' flag for other purposes
> than only triggering LTO streaming and option save/restore, so we need another
> flag that only triggers save/restore then (and also allow us to avoid
> dropping the
> flag at lto-option streaming time where we currently drop all warning 
> options).

jan@skylake:~/trunk/gcc> grep Optimization *.awk
optc-save-gen.awk:  if (flag_set_p("(Optimization|PerFunction)", flags[i])) {
optc-save-gen.awk:  if (flag_set_p("(Optimization|PerFunction)", flags[i])) {
optc-save-gen.awk:  var_opt_hash[n_opt_val] = flag_set_p("Optimization", flags[i]);
opt-functions.awk:test_flag("(Optimization|PerFunction)", flags, " | CL_OPTIMIZATION")
opth-gen.awk:   if (flag_set_p("(Optimization|PerFunction)", flags[i])) {
jan@skylake:~/trunk/gcc> grep PerFunction *.opt
Common Report Var(flag_var_tracking) Init(2) PerFunction
Common Report Var(flag_var_tracking_assignments) Init(2) PerFunction
Common Report Var(flag_var_tracking_assignments_toggle) PerFunction
Common Report Var(flag_var_tracking_uninit) PerFunction

So I suppose we want PerFunction to be used for warnings and params
and stuff that is not an optimization, like -fasynchronous-unwind-tables.

Indeed Optimization adds CL_OPTIMIZATION, which seems to make options
appear in the section "The following options control optimization"; that is
not precisely accurate either, since we think we still have
optimizations that are not per function (-fsemantic-interposition could
count as such, though I guess the definition of what is an "optimization" is
a bit shady).

Honza
> 
> Richard.
> 
> > honza
> >
> > >
> > > Richard.
> > >
> > > > Honza


Re: [PATCH] Relax invalidation of TOP N counters in PGO.

2020-01-08 Thread Jan Hubicka
Hi,
Just to explain better what I am worried about.  The overall sum of
counters in TOPN does not have a very good meaning if you have more than N
targets.

Let's for simplicity assume that we have TOPN for N=1 (i.e. the old code). It
guarantees that if target X is taken more than 50% of the time, it will win;
however, its count can be arbitrarily low.

Consider the following sequence of call targets

Y  Z  Z  Y  Y  Z  Z  X  X  X  X  X  X  X  X  X 
The TOPN counter will behave as follows:
Y1 Y0 Z1 Z0 Y1 Y0 Z1 Z0 X1 X2 X3 X4 X5 X6 X7 X8 

Now for sequence

Y  Y  Y  X  X  X  Z  Z  Z  X  X  X  Z  X  X  X 
Y1 Y2 Y3 Y2 Y1 Y0 Z1 Z2 Z3 Z2 Z1 Z0 Z1 Z0 X1 X2 

There are 16 calls, 9 of them calling X, 3 calls of Y and 4 calls of Z.
Depending on the order, the counter of X can be anywhere between 2 and 8. So
setting a threshold without trashing a lot of valid cases would be hard,
and if you have something like 40k parallel invocations of clang I
would expect quite high divergence between the runs.

A similar sequence exists for N>1; you only need to populate the other
entries with new targets.

However, based on the observation above we could probably scale up the
probabilities, assuming that the missing part of the histogram has pretty much
the same distribution as the known part.
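
The two traces above can be reproduced with a small simulation.  This is a
sketch of the N=1 update rule as the traces imply it (take over the slot
when the count is zero, increment on a match, decrement on a mismatch); the
names top1_counter, top1_update and top1_run are invented for the example
and this is not the libgcov code.

```c
#include <stddef.h>

/* Single-entry "top 1" counter: the current candidate and its count.  */
struct top1_counter
{
  char target;
  long count;
};

/* Feed one observed call target into the counter.  */
void
top1_update (struct top1_counter *c, char target)
{
  if (c->count == 0)
    {
      /* Slot is free: the new target takes it over.  */
      c->target = target;
      c->count = 1;
    }
  else if (c->target == target)
    c->count++;
  else
    c->count--;
}

/* Run a whole sequence of calls, one character per call target.  */
struct top1_counter
top1_run (const char *calls)
{
  struct top1_counter c = { 0, 0 };
  for (size_t i = 0; calls[i] != '\0'; i++)
    top1_update (&c, calls[i]);
  return c;
}
```

Running it on "YZZYYZZXXXXXXXXX" ends with X at count 8, and on
"YYYXXXZZZXXXZXXX" with X at count 2, although both sequences contain
exactly 9 calls of X.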

Honza


Re: [PATCH GCC11]Improve uninitialized warning with value range info

2020-01-08 Thread Richard Biener
On Wed, Jan 8, 2020 at 6:01 AM bin.cheng  wrote:
>
> Sorry, here is the patch.
> --
> Sender:bin.cheng 
> Sent At:2020 Jan. 8 (Wed.) 12:58
> Recipient:GCC Patches 
> Subject:[PATCH GCC11]Improve uninitialized warning with value range info
>
>
> Hi,
>
> Function use_pred_not_overlap_with_undef_path_pred of 
> pass_late_warn_uninitialized
> checks if predicate of variable use overlaps with predicate of undefined 
> control flow path.
> For now, it only checks ssa_var comparing against constant, this can be 
> improved where
> ssa_var compares against another ssa_var with value range info, as described 
> in comment:
>
> + /* Check value range info of rhs, do following transforms:
> +  flag_var < [min, max]  ->  flag_var < max
> +  flag_var > [min, max]  ->  flag_var > min
> +
> +We can also transform LE_EXPR/GE_EXPR to LT_EXPR/GT_EXPR:
> +  flag_var <= [min, max] ->  flag_var < [min, max+1]
> +  flag_var >= [min, max] ->  flag_var > [min-1, max]
> +if no overflow/wrap.  */
>
> This change can avoid some false warning.  Bootstrap and test on x86_64, any 
> comment?

Definitely a good idea - the refactoring makes the patch hard to
follow though.  The
original code seems to pick any (the "first") compare against a constant.  You
return the "first" but maybe from range info that might also be
[-INF,+INF].  It seems
that we'd want to pick the best so eventually sort the predicate chain
so that compares
against constants come first at least?  Not sure if it really makes a
difference but
even currently we could have i < 5, i < 1 so the "better" one later?
It might also make
sense to simply push three predicates for i < j, namely i < j (if ever
useful), i < min(j)
and max(i) < j to avoid repeatedly doing the range computations.

Thanks,
Richard.

> Thanks,
> bin
>
> 2020-01-08  Bin Cheng  
>
>  * tree-ssa-uninit.c (find_var_cmp_const): New function.
>  (use_pred_not_overlap_with_undef_path_pred): Call above.


Re: [PATCH] Make warn_inline Optimization option.

2020-01-08 Thread Jan Hubicka
> > > Given all warning options can be enabled/disabled via #pragma GCC 
> > > diagnostic
> > > all Warning annotated options should be implicitly 'Optimization' for
> > > the purpose
> > > of LTO streaming then?
> >
> > Well, perhaps they can be marked but for late optimizations this does
> > not work
> > __attribute__ ((warning("haha"))) test() { }
> > #pragma GCC diagnostic ignored "-Wattribute-warning"
> > test2() { test(); }
> >
> > We have many warning options but only few of them are late - it would be
> > nice to have them explicitly marked somehow IMO (by design and to avoid
> > streaming a lot of useless flags)
> 
> Hmm, indeed.  Well, I believe we use the 'Optimization' flag for other purposes
> than only triggering LTO streaming and option save/restore, so we need another
> flag that only triggers save/restore then (and also allow us to avoid
> dropping the
> flag at lto-option streaming time where we currently drop all warning 
> options).

Yep.  I was not aware of any other use of the "Optimization" flag, but
it is definitely misnamed in this context (as is OPTIMIZATION_NODE btw).

Honza
> 
> Richard.
> 
> > honza
> >
> > >
> > > Richard.
> > >
> > > > Honza
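
For reference, the late-warning case quoted above can be written out as a
compilable sketch.  Return types and the push/pop pair are additions for
the example; whether the diagnostic is really suppressed depends on where
the compiler emits it (and, with LTO, on whether the pragma state is
streamed), which is exactly the point under discussion.

```c
/* A call to this function that survives optimization triggers
   -Wattribute-warning in GCC.  */
__attribute__ ((warning ("haha"))) int
test (void)
{
  return 1;
}

/* Locally disable the late warning around the caller only.  */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wattribute-warning"
int
test2 (void)
{
  return test ();
}
#pragma GCC diagnostic pop
```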


Re: [PR47785] COLLECT_AS_OPTIONS

2020-01-08 Thread Prathamesh Kulkarni
On Tue, 5 Nov 2019 at 17:38, Richard Biener  wrote:
>
> On Tue, Nov 5, 2019 at 12:17 AM Kugan Vivekanandarajah
>  wrote:
> >
> > Hi,
> > Thanks for the review.
> >
> > On Tue, 5 Nov 2019 at 03:57, H.J. Lu  wrote:
> > >
> > > On Sun, Nov 3, 2019 at 6:45 PM Kugan Vivekanandarajah
> > >  wrote:
> > > >
> > > > Thanks for the reviews.
> > > >
> > > >
> > > > On Sat, 2 Nov 2019 at 02:49, H.J. Lu  wrote:
> > > > >
> > > > > On Thu, Oct 31, 2019 at 6:33 PM Kugan Vivekanandarajah
> > > > >  wrote:
> > > > > >
> > > > > > On Wed, 30 Oct 2019 at 03:11, H.J. Lu  wrote:
> > > > > > >
> > > > > > > On Sun, Oct 27, 2019 at 6:33 PM Kugan Vivekanandarajah
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > Hi Richard,
> > > > > > > >
> > > > > > > > Thanks for the review.
> > > > > > > >
> > > > > > > > On Wed, 23 Oct 2019 at 23:07, Richard Biener 
> > > > > > > >  wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Oct 21, 2019 at 10:04 AM Kugan Vivekanandarajah
> > > > > > > > >  wrote:
> > > > > > > > > >
> > > > > > > > > > Hi Richard,
> > > > > > > > > >
> > > > > > > > > > Thanks for the pointers.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Fri, 11 Oct 2019 at 22:33, Richard Biener 
> > > > > > > > > >  wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Oct 11, 2019 at 6:15 AM Kugan Vivekanandarajah
> > > > > > > > > > >  wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Hi Richard,
> > > > > > > > > > > > Thanks for the review.
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, 2 Oct 2019 at 20:41, Richard Biener 
> > > > > > > > > > > >  wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Oct 2, 2019 at 10:39 AM Kugan Vivekanandarajah
> > > > > > > > > > > > >  wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > As mentioned in the PR, attached patch adds 
> > > > > > > > > > > > > > COLLECT_AS_OPTIONS for
> > > > > > > > > > > > > > passing assembler options specified with -Wa, to 
> > > > > > > > > > > > > > the link-time driver.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The proposed solution only works for uniform -Wa 
> > > > > > > > > > > > > > options across all
> > > > > > > > > > > > > > TUs. As mentioned by Richard Biener, supporting 
> > > > > > > > > > > > > > non-uniform -Wa flags
> > > > > > > > > > > > > > would require either adjusting partitioning 
> > > > > > > > > > > > > > according to flags or
> > > > > > > > > > > > > > emitting multiple object files  from a single 
> > > > > > > > > > > > > > LTRANS CU. We could
> > > > > > > > > > > > > > consider this as a follow up.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Bootstrapped and regression tests on  
> > > > > > > > > > > > > > arm-linux-gcc. Is this OK for trunk?
> > > > > > > > > > > > >
> > > > > > > > > > > > > While it works for your simple cases it is unlikely 
> > > > > > > > > > > > > to work in practice since
> > > > > > > > > > > > > your implementation needs the assembler options be 
> > > > > > > > > > > > > present at the link
> > > > > > > > > > > > > command line.  I agree that this might be the way for 
> > > > > > > > > > > > > people to go when
> > > > > > > > > > > > > they face the issue but then it needs to be 
> > > > > > > > > > > > > documented somewhere
> > > > > > > > > > > > > in the manual.
> > > > > > > > > > > > >
> > > > > > > > > > > > > That is, with COLLECT_AS_OPTION (why singular?  I'd 
> > > > > > > > > > > > > expected
> > > > > > > > > > > > > COLLECT_AS_OPTIONS) available to cc1 we could stream 
> > > > > > > > > > > > > this string
> > > > > > > > > > > > > to lto_options and re-materialize it at link time 
> > > > > > > > > > > > > (and diagnose mismatches
> > > > > > > > > > > > > even if we like).
> > > > > > > > > > > > OK. I will try to implement this. So the idea is if we 
> > > > > > > > > > > > provide
> > > > > > > > > > > > -Wa,options as part of the lto compile, this should be 
> > > > > > > > > > > > available
> > > > > > > > > > > > during link time. Like in:
> > > > > > > > > > > >
> > > > > > > > > > > > arm-linux-gnueabihf-gcc -march=armv7-a -mthumb -O2 -flto
> > > > > > > > > > > > -Wa,-mimplicit-it=always,-mthumb -c test.c
> > > > > > > > > > > > arm-linux-gnueabihf-gcc  -flto  test.o
> > > > > > > > > > > >
> > > > > > > > > > > > I am not sure where should we stream this. Currently, 
> > > > > > > > > > > > cl_optimization
> > > > > > > > > > > > has all the optimization flag provided for compiler and 
> > > > > > > > > > > > it is
> > > > > > > > > > > > autogenerated and all the flags are integer values. Do 
> > > > > > > > > > > > you have any
> > > > > > > > > > > > preference or example where this should be done.
> > > > > > > > > > >
> > > > > > > > > > > In lto_write_options, I'd simply append the contents of 
> > > > > > > > > > > COLLECT_AS_OPTIONS
> > > > > > > > > > > (with -Wa, prepended to each of 

Re: [PATCH] Make warn_inline Optimization option.

2020-01-08 Thread Richard Biener
On Tue, Jan 7, 2020 at 4:46 PM Jan Hubicka  wrote:
>
> > On Tue, Jan 7, 2020 at 3:26 PM Jan Hubicka  wrote:
> > >
> > > > Err - Optimization also lists it in some -help section?  It's a Warning
> > > > option and certainly we don't handle per-function Warnings in general
> > > > (with LTO) even though we have #pragma GCC diagnostic, no?
> > > >
> > > > I'm not sure why we force warn_inline to zero with -O0, it seems much
> > > > better to guard false warnings in some other way for -O0?
> > >
> > > Well, we can do that with warn_inline, but in general we do want to
> > > stream late warnings (so things like -Wmaybe-uninitialized works sort of
> > > as expected for -flto). So I guess we want way to mark option as part of
> > > TARGET_OPTIMIZATION_NODE even though it is not really an optimization
> > > option but parameter, warning or semantic change.
> >
> > Given all warning options can be enabled/disabled via #pragma GCC diagnostic
> > all Warning annotated options should be implicitly 'Optimization' for
> > the purpose
> > of LTO streaming then?
>
> Well, perhaps they can be marked but for late optimizations this does
> not work
> __attribute__ ((warning("haha"))) test() { }
> #pragma GCC diagnostic ignored "-Wattribute-warning"
> test2() { test(); }
>
> We have many warning options but only few of them are late - it would be
> nice to have them explicitly marked somehow IMO (by design and to avoid
> streaming a lot of useless flags)

Hmm, indeed.  Well, I believe we use the 'Optimization' flag for other purposes
than only triggering LTO streaming and option save/restore, so we need another
flag that only triggers save/restore then (and also allow us to avoid
dropping the
flag at lto-option streaming time where we currently drop all warning options).

Richard.

> honza
>
> >
> > Richard.
> >
> > > Honza


Re: [PATCH] Use dump_asm_name for Callers/Calls in dump.

2020-01-08 Thread Jan Hubicka
> On 1/7/20 11:27 AM, Martin Liška wrote:
> > Which is fine. Apparently there are just few usages of manual printing
> > of a symtab node and order like:
> > 
> >    fprintf (f,
> >     "%*s%s/%i %s\n%*s  freq:%4.2f",
> >     indent, "", callee->name (), callee->order,
> > 
> > I can replace these with symtab_node::dump_{asm_}name.
> 
> Hi.
> 
> I'm addressing this by the following refactoring patch.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
This is OK.
The original reason for using DECL_NAME rather than DECL_ASSEMBLER_NAME
was that the first is shorter, but also that DECL_ASSEMBLER_NAME may trigger
lazy assembler name computation, which changes memory layout and led to
divergences in the resulting code when compiling with and without
-fdump-ipa-all.

I guess for dumping we should use dump_name consistently unless we
really want to speak of the assembler symbol name (so I would suggest
changing the uses of dump_asm_name to dump_name in most cases).

We should not use node->name - this is an artifact from the times when we
did not have dump_name for dumping.

Honza
> Thanks,
> Martin

> From 45629d68ac4ad9dada74e8c14a10b89025f54762 Mon Sep 17 00:00:00 2001
> From: Martin Liska 
> Date: Tue, 7 Jan 2020 12:37:42 +0100
> Subject: [PATCH] Replace node->name/node->order with node->dump_name.
> 
> gcc/ChangeLog:
> 
> 2020-01-07  Martin Liska  
> 
>   * ipa-fnsummary.c (dump_ipa_call_summary): Use symtab_node::dump_name.
>   (ipa_call_context::estimate_size_and_time): Likewise.
>   (inline_analyze_function): Likewise.
> 
> gcc/lto/ChangeLog:
> 
> 2020-01-07  Martin Liska  
> 
>   * lto-partition.c (lto_balanced_map): Use symtab_node::dump_name.
> ---
>  gcc/ipa-fnsummary.c | 12 +---
>  gcc/lto/lto-partition.c |  4 ++--
>  2 files changed, 7 insertions(+), 9 deletions(-)
> 
> diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
> index fa01cb6c083..7c0b6f98e25 100644
> --- a/gcc/ipa-fnsummary.c
> +++ b/gcc/ipa-fnsummary.c
> @@ -907,8 +907,8 @@ dump_ipa_call_summary (FILE *f, int indent, struct 
> cgraph_node *node,
>int i;
>  
>fprintf (f,
> -"%*s%s/%i %s\n%*s  freq:%4.2f",
> -indent, "", callee->name (), callee->order,
> +"%*s%s %s\n%*s  freq:%4.2f",
> +indent, "", callee->dump_name (),
>  !edge->inline_failed
>  ? "inlined" : cgraph_inline_failed_string (edge-> inline_failed),
>  indent, "", edge->sreal_frequency ().to_double ());
> @@ -3505,9 +3505,8 @@ ipa_call_context::estimate_size_and_time (int *ret_size,
>if (dump_file && (dump_flags & TDF_DETAILS))
>  {
>bool found = false;
> -  fprintf (dump_file, "   Estimating body: %s/%i\n"
> -"   Known to be false: ", m_node->name (),
> -m_node->order);
> +  fprintf (dump_file, "   Estimating body: %s\n"
> +"   Known to be false: ", m_node->dump_name ());
>  
>for (i = predicate::not_inlined_condition;
>  i < (predicate::first_dynamic_condition
> @@ -4034,8 +4033,7 @@ inline_analyze_function (struct cgraph_node *node)
>push_cfun (DECL_STRUCT_FUNCTION (node->decl));
>  
>if (dump_file)
> -fprintf (dump_file, "\nAnalyzing function: %s/%u\n",
> -  node->name (), node->order);
> +fprintf (dump_file, "\nAnalyzing function: %s\n", node->dump_name ());
>if (opt_for_fn (node->decl, optimize) && !node->thunk.thunk_p)
>  inline_indirect_intraprocedural_analysis (node);
>compute_fn_summary (node, false);
> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
> index 86b2eabe374..5b153c9759e 100644
> --- a/gcc/lto/lto-partition.c
> +++ b/gcc/lto/lto-partition.c
> @@ -734,10 +734,10 @@ lto_balanced_map (int n_lto_partitions, int 
> max_partition_size)
> best_varpool_pos = varpool_pos;
>   }
>if (dump_file)
> - fprintf (dump_file, "Step %i: added %s/%i, size %i, "
> + fprintf (dump_file, "Step %i: added %s, size %i, "
>"cost %" PRId64 "/%" PRId64 " "
>"best %" PRId64 "/%" PRId64", step %i\n", i,
> -  order[i]->name (), order[i]->order,
> +  order[i]->dump_name (),
>partition->insns, cost, internal,
>best_cost, best_internal, best_i);
>/* Partition is too large, unwind into step when best cost was reached 
> and
> -- 
> 2.24.1
> 



Re: [PATCH] ipa-inline: Adjust condition for caller_growth_limits

2020-01-08 Thread Jan Hubicka
> 
> Thanks.  So caller could be {hot, cold} + {large, small}, same for callee.  
> It may
> produce up to 4 * 4 = 16 combinations.  Agree that hard to define useful,
> and useful really doesn't reflect performance improvements certainly. :)
> 
> My case is A1(1) calls A2(2), A2(2) calls A3(3).  A1, A2, A3 are 
> specialized cloned nodes with different input, they are all hot, called once
> and each have about 1000+ insns.  By default, large-function-growth/insns are
> both not reached, so A1 will inline A2, A2 will inline A3, which is 40% 
> slower than no-inline.
> 
> If I adjust large-function-growth/insns to allow only 
> A2 to inline A3, the performance is 20% slower than no-inline.
> All the inlinings are generated in functions called once.

I see, I assume that this is exchange.  What is difficult for GCC in
exchange is the large loop nest.  GCC generally assumes that what is
inside a deeper loop nest is more important, and if I recall correctly
there are 10 nested loops wrapping the recursive call.

Basic observation is that for every self recursive function the
combined frequency of all self recursive calls must be less than entry
block frequency or the recursion tree will never end.
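
The observation can be made concrete with a small calculation.  This is a
sketch under a simplifying assumption that is not in the mail itself: every
invocation behaves the same, so if r is the combined frequency of the self
recursive calls relative to the entry block, one outer call leads to
1 + r + r^2 + ... = 1 / (1 - r) invocations in total.  The function name
expected_invocations is made up for the example.

```c
#include <math.h>

/* Expected total number of invocations caused by one outer call, given the
   summed frequency of self recursive calls and the entry block frequency.
   Returns INFINITY when the recursion would not terminate on average.  */
double
expected_invocations (double recursive_freq, double entry_freq)
{
  double r = recursive_freq / entry_freq;
  if (r >= 1.0)
    return INFINITY;
  return 1.0 / (1.0 - r);
}
```

With a recursive call frequency of half the entry frequency this gives 2
invocations per outer call; once the ratio reaches 1 the sum diverges, which
is the inconsistent profile described above.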

Some time ago we added PRED_LOOP_EXIT_WITH_RECURSION,
PRED_RECURSIVE_CALL, PRED_LOOP_GUARD_WITH_RECURSION which makes loops
leading to recursion less likely to iterate. But this may not be enough
to get profile correct.

I wonder if we could help the situation by extending
estimate_bb_frequencies to simply sum the frequencies of recursive calls
and, if they exceed the entry block frequency, forcibly scale down the
corresponding BBs accordingly.  (That would leave the profile locally
inconsistent, but I am not sure how to do much better: one could identify
control dependencies and drop probabilities, but after that one would
need to re-propagate the loop nest, I guess.)

This may 1) make the inliner less eager to perform the inline
 2) make tree optimizers produce less damage on the outer
loops if inlining happens.
Honza
> 
> 
> Xionghu
> 
> > 
> > Honza
> > 
> 


Re: [PATCH] Use dump_asm_name for Callers/Calls in dump.

2020-01-08 Thread Martin Liška

On 1/7/20 11:27 AM, Martin Liška wrote:

Which is fine. Apparently there are just a few usages of manual printing
of a symtab node and order like:

   fprintf (f,
    "%*s%s/%i %s\n%*s  freq:%4.2f",
    indent, "", callee->name (), callee->order,

I can replace these with symtab_node::dump_{asm_}name.


Hi.

I'm addressing this by the following refactoring patch.
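
For readers who do not have the helper in mind, here is a minimal sketch of
what the refactoring amounts to.  toy_node and describe_callee are
hypothetical stand-ins (not symtab_node or any GCC function); the only thing
assumed about the real dump_name is that it yields the same "name/order"
form the old format strings built by hand.

```cpp
#include <string>

// Hypothetical stand-in for a symbol table node.
struct toy_node
{
  std::string name;
  int order;

  // One place that knows how a node is printed in dumps.
  std::string dump_name () const
  {
    return name + "/" + std::to_string (order);
  }
};

// Callers no longer pass name and order separately with a "%s/%i" format;
// they print whatever dump_name returns.
std::string
describe_callee (const toy_node &callee, const char *status)
{
  return callee.dump_name () + " " + status;
}
```

The benefit is that the printed form is defined once, so dump output stays
consistent across files.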

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin
From 45629d68ac4ad9dada74e8c14a10b89025f54762 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 7 Jan 2020 12:37:42 +0100
Subject: [PATCH] Replace node->name/node->order with node->dump_name.

gcc/ChangeLog:

2020-01-07  Martin Liska  

	* ipa-fnsummary.c (dump_ipa_call_summary): Use symtab_node::dump_name.
	(ipa_call_context::estimate_size_and_time): Likewise.
	(inline_analyze_function): Likewise.

gcc/lto/ChangeLog:

2020-01-07  Martin Liska  

	* lto-partition.c (lto_balanced_map): Use symtab_node::dump_name.
---
 gcc/ipa-fnsummary.c | 12 +---
 gcc/lto/lto-partition.c |  4 ++--
 2 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index fa01cb6c083..7c0b6f98e25 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -907,8 +907,8 @@ dump_ipa_call_summary (FILE *f, int indent, struct cgraph_node *node,
   int i;
 
   fprintf (f,
-	   "%*s%s/%i %s\n%*s  freq:%4.2f",
-	   indent, "", callee->name (), callee->order,
+	   "%*s%s %s\n%*s  freq:%4.2f",
+	   indent, "", callee->dump_name (),
 	   !edge->inline_failed
 	   ? "inlined" : cgraph_inline_failed_string (edge-> inline_failed),
 	   indent, "", edge->sreal_frequency ().to_double ());
@@ -3505,9 +3505,8 @@ ipa_call_context::estimate_size_and_time (int *ret_size,
   if (dump_file && (dump_flags & TDF_DETAILS))
 {
   bool found = false;
-  fprintf (dump_file, "   Estimating body: %s/%i\n"
-	   "   Known to be false: ", m_node->name (),
-	   m_node->order);
+  fprintf (dump_file, "   Estimating body: %s\n"
+	   "   Known to be false: ", m_node->dump_name ());
 
   for (i = predicate::not_inlined_condition;
 	   i < (predicate::first_dynamic_condition
@@ -4034,8 +4033,7 @@ inline_analyze_function (struct cgraph_node *node)
   push_cfun (DECL_STRUCT_FUNCTION (node->decl));
 
   if (dump_file)
-fprintf (dump_file, "\nAnalyzing function: %s/%u\n",
-	 node->name (), node->order);
+fprintf (dump_file, "\nAnalyzing function: %s\n", node->dump_name ());
   if (opt_for_fn (node->decl, optimize) && !node->thunk.thunk_p)
 inline_indirect_intraprocedural_analysis (node);
   compute_fn_summary (node, false);
diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index 86b2eabe374..5b153c9759e 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -734,10 +734,10 @@ lto_balanced_map (int n_lto_partitions, int max_partition_size)
 	  best_varpool_pos = varpool_pos;
 	}
   if (dump_file)
-	fprintf (dump_file, "Step %i: added %s/%i, size %i, "
+	fprintf (dump_file, "Step %i: added %s, size %i, "
 		 "cost %" PRId64 "/%" PRId64 " "
 		 "best %" PRId64 "/%" PRId64", step %i\n", i,
-		 order[i]->name (), order[i]->order,
+		 order[i]->dump_name (),
 		 partition->insns, cost, internal,
 		 best_cost, best_internal, best_i);
   /* Partition is too large, unwind into step when best cost was reached and
-- 
2.24.1



Re: [committed] Fix UB in gfc_trans_omp_clauses (PR fortran/93162)

2020-01-08 Thread Tobias Burnus

On 1/8/20 9:22 AM, Jakub Jelinek wrote:

With mixed REF_COMPONENT and REF_ARRAY, one can have var(:), or var2%comp(:)
or var3(:)%comp, or var3%comp(:)%comp2 etc.
Technically, one can also have var3(4)%comp(:)%comp2(1), with one 
nonelement/AR_FULL reference and two element references. (At least as 
long as 'comp2' is neither a pointer nor an allocatable.)

and so the question is
what exactly we want to handle in the first if and what in the other cases.


My impression back in October was that, except for the additional 
component ref, if the last reference is an array, it can be 
handled fine with the existing AR_FULL code, as a lot of the 
complications were already handled when converting gfc_expr to a tree. 
But I didn't explore this back then as I stashed the project.



And something else should handle the other references, and that one needs to
walk the ref chain and handle each REF_ARRAY or REF_COMPONENT in there.
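
Walking such a ref chain might look roughly like the sketch below.  It is a
standalone toy: toy_ref and the TOY_* enumerators are hypothetical
simplifications, not gfortran's gfc_ref, and the function only locates the
first array reference that is not a single-element access.

```cpp
// Hypothetical, simplified reference kinds (not gfortran's enums).
enum toy_ref_kind { TOY_REF_ARRAY, TOY_REF_COMPONENT };
enum toy_ar_kind { TOY_AR_ELEMENT, TOY_AR_FULL, TOY_AR_SECTION };

struct toy_ref
{
  toy_ref_kind kind;
  toy_ar_kind ar;       // only meaningful when kind == TOY_REF_ARRAY
  const toy_ref *next;  // next reference in the chain, or nullptr
};

// Walk the chain and return the first array reference that is a full
// array or a section, i.e. not an element access.  Returns nullptr if
// every array reference in the chain is an element access.
const toy_ref *
first_nonelement_array_ref (const toy_ref *ref)
{
  for (; ref; ref = ref->next)
    if (ref->kind == TOY_REF_ARRAY && ref->ar != TOY_AR_ELEMENT)
      return ref;
  return nullptr;
}
```

For a chain modelling var3(4)%comp(:)%comp2(1), this returns the node for
the (:) on comp.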


I am not sure whether one needs to handle each REF_COMPONENT and 
REF_ARRAY here. Talking of OpenACC 2.6 and OpenMP 4.5, one can largely 
ignore the component references before either the last one or before 
the first non-element array access. For element access after the 
last component, the current code seems to map "comp(:)" when specifying 
"comp(:)%comp2", which I think is fine.


* * *

_[Otherwise, one would have to map each element separately or add more 
mapping support for some strided access patterns. Depending on the element 
size [sizeof(comp(1)%comp2)] vs. stride [loc(comp(2))-loc(comp(1))] 
ratio, copying it as a block or per element makes more sense. (Using 
A(::n,::m)%a it might even make sense to combine one stride as a 
contiguous block and loop over the other.)]_


In any case, that's something to study for Stage 1; whether first for 
OpenMP 4.5 or already preparing it for OpenMP 5 is the other big question.


For the latter, I am not sure how best to handle it. (I wonder how it is 
implemented for coarrays, as one has the same issue there; especially 
with user-defined reduction functions and polymorphic variables, it can 
get complicated. I did have an idea back then, but have not 
implemented it. I wonder whether someone else has tackled it in the 
meantime.) I also have not checked whether OpenACC 2.7 or OpenACC 3.0 
or OpenMP 5.1/TR8 will add additional complications, which should be 
taken into account (at least design-wise). [Nor whether new coarray 
issues pop(ped) up, which should be taken care of at the same time.]


To handle polymorphic types with allocatable components (which 
automatically get copied): I fear one cannot avoid adding a function to 
the v(irtual)tab(le). For user-defined reduction, I was thinking of 
passing a function pointer to the library, which then calls the function 
and provides a function pointer to a library function – that way, one 
can recursively pass calls between the library (which knows how to 
transfer the data) and the vtable functions in the code (which know the 
data structure).
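
The back-and-forth between library and vtable function could be sketched as
below.  Everything here is hypothetical: the transfer_fn and walk_fn
signatures, the toy types and library_map are invented for illustration and
do not correspond to libgomp or gfortran interfaces.  The "transfer" merely
records sizes instead of copying data.

```cpp
#include <cstddef>
#include <vector>

// Library-side primitive handed to the type-specific code.
using transfer_fn = void (*) (const void *data, std::size_t size, void *ctx);
// Type-specific walker, as might live in a vtable slot.
using walk_fn = void (*) (const void *obj, transfer_fn transfer, void *ctx);

struct toy_inner { int payload[4]; };
struct toy_outer { double scalar; const toy_inner *alloc_comp; };

// The code side knows the data structure ...
void
walk_inner (const void *obj, transfer_fn transfer, void *ctx)
{
  transfer (obj, sizeof (toy_inner), ctx);
}

void
walk_outer (const void *obj, transfer_fn transfer, void *ctx)
{
  const toy_outer *o = static_cast<const toy_outer *> (obj);
  transfer (o, sizeof (toy_outer), ctx);
  if (o->alloc_comp)  // descend into the allocatable component
    walk_inner (o->alloc_comp, transfer, ctx);
}

// ... and the library side knows how to move the bytes.
void
record_size (const void *, std::size_t size, void *ctx)
{
  static_cast<std::vector<std::size_t> *> (ctx)->push_back (size);
}

std::vector<std::size_t>
library_map (const void *obj, walk_fn walk)
{
  std::vector<std::size_t> sizes;
  walk (obj, record_size, &sizes);
  return sizes;
}
```

The library never inspects the object layout; it only supplies the transfer
callback and lets the walker decide what to hand over.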


Cheers,

Tobias



arm: Fix rmprofile multilibs when architecture includes +mp or +sec (PR target/93188)

2020-01-08 Thread Richard Earnshaw (lists)

When only the rmprofile multilibs are built, compiling for armv7-a
should select the generic v7 multilibs.  This used to work before +sec
and +mp were added to the architecture options but it was broken by
that update.  This patch fixes those variants and adds some tests to
ensure that they remain fixed ;-)
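
To see what the nested $(foreach ...) rule in the patch expands to, here is
a small sketch that reproduces the string construction outside of make.
expand_v7a_fp_matches is a made-up helper, and the variant lists passed to
it are only sample values taken from this message and its tests, not the
full contents of the t-multilib variables.

```cpp
#include <string>
#include <vector>

// Mirrors the nested foreach: one match rule per pair of
// (architecture extension, FP/SIMD extension), all mapped to armv7+fp.
std::vector<std::string>
expand_v7a_fp_matches (const std::vector<std::string> &arch_variants,
                       const std::vector<std::string> &fp_variants)
{
  std::vector<std::string> rules;
  for (const std::string &archvar : arch_variants)
    for (const std::string &arch : fp_variants)
      rules.push_back ("march?armv7+fp=march?armv7-a" + archvar + arch);
  return rules;
}
```

With arch variants +mp, +sec, +mp+sec and FP variants +fp, +simd this
produces six rules, for example march?armv7+fp=march?armv7-a+mp+fp.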

PR target/93188
* config/arm/t-multilib (MULTILIB_MATCHES): Add rules to match
armv7-a{+mp,+sec,+mp+sec} to appropriate armv7 multilib variants
when only building rm-profile multilibs.

* gcc.target/arm/multilib.exp: Add new tests for rm-profile only.

Committed to trunk.
diff --git a/gcc/config/arm/t-multilib b/gcc/config/arm/t-multilib
index 0e16340557d..399343604f0 100644
--- a/gcc/config/arm/t-multilib
+++ b/gcc/config/arm/t-multilib
@@ -133,10 +133,19 @@ MULTILIB_MATCHES	+= march?armv7-r+fp.sp=march?armv8-r+crc+fp.sp
 
 ifeq (,$(HAS_APROFILE))
 # Map all v7-a
+
 MULTILIB_MATCHES	+= march?armv7=march?armv7-a
+
+MULTILIB_MATCHES	+= $(foreach ARCH, $(v7_a_arch_variants), \
+			 march?armv7=march?armv7-a$(ARCH))
+
 MULTILIB_MATCHES	+= $(foreach ARCH, $(v7_a_nosimd_variants) $(v7_a_simd_variants), \
 			 march?armv7+fp=march?armv7-a$(ARCH))
 
+MULTILIB_MATCHES	+= $(foreach ARCHVAR, $(v7_a_arch_variants), \
+			 $(foreach ARCH, $(v7_a_nosimd_variants) $(v7_a_simd_variants), \
+			   march?armv7+fp=march?armv7-a$(ARCHVAR)$(ARCH)))
+
 MULTILIB_MATCHES	+= march?armv7=march?armv7ve
 
 # ARMv7ve FP/SIMD variants: map down to v7+fp
diff --git a/gcc/testsuite/gcc.target/arm/multilib.exp b/gcc/testsuite/gcc.target/arm/multilib.exp
index dc7c171707a..e83d1da261b 100644
--- a/gcc/testsuite/gcc.target/arm/multilib.exp
+++ b/gcc/testsuite/gcc.target/arm/multilib.exp
@@ -442,6 +442,22 @@ if {[multilib_config "aprofile"] } {
 	check_multi_dir $opts $dir
 }
 }
+if {[multilib_config "rmprofile"] && ![multilib_config "aprofile"]} {
+foreach {opts dir} {
+	{-mcpu=cortex-a9 -mfpu=auto -mfloat-abi=soft} "thumb/v7/nofp"
+	{-mcpu=cortex-a8 -mfpu=auto -mfloat-abi=softfp} "thumb/v7+fp/softfp"
+	{-mcpu=cortex-a5 -mfpu=auto -mfloat-abi=hard} "thumb/v7+fp/hard"
+	{-mcpu=cortex-a53 -mfpu=auto -mfloat-abi=hard} "thumb/v7+fp/hard"
+	{-march=armv7-a+fp -mfpu=auto -mfloat-abi=softfp} "thumb/v7+fp/softfp"
+	{-march=armv7-a+fp -mfpu=auto -mfloat-abi=soft} "thumb/v7/nofp"
+	{-march=armv7-a+mp+simd -mfpu=auto -mfloat-abi=softfp} "thumb/v7+fp/softfp"
+	{-march=armv7-a -mfpu=vfpv4 -mfloat-abi=hard} "thumb/v7+fp/hard"
+	{-march=armv7-a+fp -mfpu=auto -mfloat-abi=hard} "thumb/v7+fp/hard"
+	{-march=armv7-a -mfpu=vfpv4 -mfloat-abi=soft} "thumb/v7/nofp"
+} {
+	check_multi_dir $opts $dir
+}
+}
 if {[multilib_config "rmprofile"] } {
 foreach {opts dir} {
 	{-mcpu=cortex-m0 -mfpu=auto -mfloat-abi=soft} "thumb/v6-m/nofp"


[PATCH 38/41] analyzer: new files: checker-path.{cc|h}

2020-01-08 Thread David Malcolm
Jeff approved the v1 version of the patch here:
  https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00820.html
There are some non-trivial changes in the followups (see the URLs
below).

Changed in v5:
- update ChangeLog path
- updated copyright years to include 2020

Changed in v4:
- Remove include of gcc-plugin.h, reworking includes accordingly.
- Wrap everything in #if ENABLE_ANALYZER
- Remove /// comment lines
- Add custom events:
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00213.html
- Add support for global state:
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00217.html
- start_cfg_edge_event::maybe_describe_condition: special-case the description
  of edges based on the result of strcmp
- Generalize rewind_info_t to exploded_edge::custom_info_t
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00219.html
- Add checker_path::debug
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02033.html

This patch adds a family of classes for representing paths of events
for analyzer diagnostics.

gcc/analyzer/ChangeLog:
* checker-path.cc: New file.
* checker-path.h: New file.
---
 gcc/analyzer/checker-path.cc | 931 +++
 gcc/analyzer/checker-path.h  | 589 ++
 2 files changed, 1520 insertions(+)
 create mode 100644 gcc/analyzer/checker-path.cc
 create mode 100644 gcc/analyzer/checker-path.h

diff --git a/gcc/analyzer/checker-path.cc b/gcc/analyzer/checker-path.cc
new file mode 100644
index ..b24952b9391b
--- /dev/null
+++ b/gcc/analyzer/checker-path.cc
@@ -0,0 +1,931 @@
+/* Subclasses of diagnostic_path and diagnostic_event for analyzer diagnostics.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+   Contributed by David Malcolm .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "function.h"
+#include "basic-block.h"
+#include "gimple.h"
+#include "gimple-pretty-print.h"
+#include "fold-const.h"
+#include "analyzer/analyzer.h"
+#include "analyzer/checker-path.h"
+#include "analyzer/supergraph.h"
+#include "analyzer/diagnostic-manager.h"
+#include "analyzer/exploded-graph.h"
+
+#if ENABLE_ANALYZER
+
+/* Get a string for EK.  */
+
+const char *
+event_kind_to_string (enum event_kind ek)
+{
+  switch (ek)
+{
+default:
+  gcc_unreachable ();
+case EK_DEBUG:
+  return "EK_DEBUG";
+case EK_CUSTOM:
+  return "EK_CUSTOM";
+case EK_STMT:
+  return "EK_STMT";
+case EK_FUNCTION_ENTRY:
+  return "EK_FUNCTION_ENTRY";
+case EK_STATE_CHANGE:
+  return "EK_STATE_CHANGE";
+case EK_START_CFG_EDGE:
+  return "EK_START_CFG_EDGE";
+case EK_END_CFG_EDGE:
+  return "EK_END_CFG_EDGE";
+case EK_CALL_EDGE:
+  return "EK_CALL_EDGE";
+case EK_RETURN_EDGE:
+  return "EK_RETURN_EDGE";
+case EK_SETJMP:
+  return "EK_SETJMP";
+case EK_REWIND_FROM_LONGJMP:
+  return "EK_REWIND_FROM_LONGJMP";
+case EK_REWIND_TO_SETJMP:
+  return "EK_REWIND_TO_SETJMP";
+case EK_WARNING:
+  return "EK_WARNING";
+}
+}
+
+/* class checker_event : public diagnostic_event.  */
+
+/* Dump this event to PP (for debugging/logging purposes).  */
+
+void
+checker_event::dump (pretty_printer *pp) const
+{
+  label_text event_desc (get_desc (false));
+  pp_printf (pp, "\"%s\" (depth %i, m_loc=%x)",
+event_desc.m_buffer,
+get_stack_depth (),
+get_location ());
+  event_desc.maybe_free ();
+}
+
+/* Hook for being notified when this event has its final id EMISSION_ID
+   and is about to be emitted for PD.
+
+   Base implementation of checker_event::prepare_for_emission vfunc;
+   subclasses that override this should chain up to it.
+
+   Record PD and EMISSION_ID, and call the get_desc vfunc, so that any
+   side-effects of the call to get_desc take place before
+   pending_diagnostic::emit is called.
+
+   For example, state_change_event::get_desc can call
+   pending_diagnostic::describe_state_change; free_of_non_heap can use this
+   to tweak the message (TODO: would be neater to simply capture the
+   pertinent data within the sm-state).  */
+
+void
+checker_event::prepare_for_emission (checker_path *,
+pending_diagnostic *pd,
+diagnostic_event_id_t emission_id)
+{
+  

[PATCH 12/41] analyzer: new files: analyzer-selftests.{cc|h}

2020-01-08 Thread David Malcolm
Jeff approved the v1 version of this patch here:
  https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00497.html
I believe the subsequent changes are obvious enough to be self-approvable.

Changed in v5:
- update ChangeLog path
- updated copyright years to include 2020

Changed in v4:
- Remove include of gcc-plugin.h
- call run_analyzer_selftests directly, rather than via plugin
  registration; wrap the analyzer selftests in #if ENABLE_ANALYZER
- fixup for moves of digraph.cc and tristate.cc from gcc/analyzer to gcc

gcc/analyzer/ChangeLog:
* analyzer-selftests.cc: New file.
* analyzer-selftests.h: New file.
---
 gcc/analyzer/analyzer-selftests.cc | 60 ++
 gcc/analyzer/analyzer-selftests.h  | 44 ++
 gcc/selftest-run-tests.c   |  6 +++
 gcc/selftest.h |  2 +
 4 files changed, 112 insertions(+)
 create mode 100644 gcc/analyzer/analyzer-selftests.cc
 create mode 100644 gcc/analyzer/analyzer-selftests.h

diff --git a/gcc/analyzer/analyzer-selftests.cc 
b/gcc/analyzer/analyzer-selftests.cc
new file mode 100644
index ..5ffacd575aba
--- /dev/null
+++ b/gcc/analyzer/analyzer-selftests.cc
@@ -0,0 +1,60 @@
+/* Selftest support for the analyzer.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+   Contributed by David Malcolm .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "analyzer/analyzer.h"
+#include "analyzer/analyzer-selftests.h"
+
+#if CHECKING_P
+
+namespace selftest {
+
+/* Build a VAR_DECL named NAME of type TYPE, simulating a file-level
+   static variable.  */
+
+tree
+build_global_decl (const char *name, tree type)
+{
+  tree decl = build_decl (UNKNOWN_LOCATION, VAR_DECL,
+ get_identifier (name), type);
+  TREE_STATIC (decl) = 1;
+  return decl;
+}
+
+/* Run all analyzer-specific selftests.  */
+
+void
+run_analyzer_selftests ()
+{
+#if ENABLE_ANALYZER
+  analyzer_constraint_manager_cc_tests ();
+  analyzer_program_point_cc_tests ();
+  analyzer_program_state_cc_tests ();
+  analyzer_region_model_cc_tests ();
+#endif /* #if ENABLE_ANALYZER */
+}
+
+} /* end of namespace selftest.  */
+
+#endif /* #if CHECKING_P */
diff --git a/gcc/analyzer/analyzer-selftests.h 
b/gcc/analyzer/analyzer-selftests.h
new file mode 100644
index ..6f08aa2b1bc0
--- /dev/null
+++ b/gcc/analyzer/analyzer-selftests.h
@@ -0,0 +1,44 @@
+/* Selftests for the analyzer.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+   Contributed by David Malcolm .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#ifndef GCC_ANALYZER_SELFTESTS_H
+#define GCC_ANALYZER_SELFTESTS_H
+
+#if CHECKING_P
+
+namespace selftest {
+
+extern tree build_global_decl (const char *name, tree type);
+
+extern void run_analyzer_selftests ();
+
+/* Declarations for specific families of tests (by source file), in
+   alphabetical order.  */
+extern void analyzer_checker_script_cc_tests ();
+extern void analyzer_constraint_manager_cc_tests ();
+extern void analyzer_program_point_cc_tests ();
+extern void analyzer_program_state_cc_tests ();
+extern void analyzer_region_model_cc_tests ();
+
+} /* end of namespace selftest.  */
+
+#endif /* #if CHECKING_P */
+
+#endif /* GCC_ANALYZER_SELFTESTS_H */
diff --git a/gcc/selftest-run-tests.c b/gcc/selftest-run-tests.c
index b468e8799d41..e451387ab211 100644
--- a/gcc/selftest-run-tests.c
+++ b/gcc/selftest-run-tests.c
@@ -27,6 +27,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "options.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "analyzer/analyzer-selftests.h"
 
 /* This function needed to be split out from selftest.c as it references
tests from the 

[PATCH 13/41] analyzer: command-line options

2020-01-08 Thread David Malcolm
Needs review.  msebor expressed some concerns in an earlier version
of the patch here:
  https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00233.html
re overlap with existing functions, and very long names.
For the former, they all have a "-Wanalyzer-" prefix to
distinguish them, and for the latter, I prefer the precision
of the longer names, but tastes may vary.

Changed in v5:
- update ChangeLog path
- updated copyright years to include 2020
- fix a stray reference to plugins

Changed in v4:
- Renamed gcc/analyzer/plugin.opt to gcc/analyzer/analyzer.opt

- Change option from -analyzer to -fanalyzer, changed it from
  Driver to Common.

- Various commits on 2019-11-12 including r278083 through r278087
  reimplemented parameter-handling in terms of options, so that
  params are defined in params.opt rather than params.def.

  This patch adds the params for the analyzer to analyzer.opt,
  replacing the patch:
[PATCH 22/49] analyzer: params.def: new parameters
  https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01520.html
  from the original version of the patch kit.

- Added -Wanalyzer-unsafe-call-within-signal-handler from
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00214.html

This patch contains the command-line options for the analyzer.
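
As an illustration of the kind of code path the option descriptions talk
about, here is a tiny sketch with a path on which the same pointer is freed
twice.  release_buffer is a made-up function; whether a particular GCC
version reports this exact case under -fanalyzer is not something the
sketch claims.

```cpp
#include <cstdlib>

// Frees BUF and returns how many times free was called on it.
// When BUGGY is true the same pointer is freed a second time, which is
// undefined behaviour at run time; that path exists only to show what
// "a pointer can be freed more than once" looks like in source form.
int
release_buffer (void *buf, bool buggy)
{
  int calls = 0;
  std::free (buf);
  ++calls;
  if (buggy)
    {
      std::free (buf);  /* second free of the same pointer */
      ++calls;
    }
  return calls;
}
```

Only call it with buggy set to false in real code; the other path is the
defect.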

gcc/analyzer/ChangeLog:
* analyzer.opt: New file.

gcc/ChangeLog:
* common.opt (-fanalyzer): New driver option.
---
 gcc/analyzer/analyzer.opt | 181 ++
 gcc/common.opt|   4 +
 2 files changed, 185 insertions(+)
 create mode 100644 gcc/analyzer/analyzer.opt

diff --git a/gcc/analyzer/analyzer.opt b/gcc/analyzer/analyzer.opt
new file mode 100644
index ..af8d81d697ab
--- /dev/null
+++ b/gcc/analyzer/analyzer.opt
@@ -0,0 +1,181 @@
+; analyzer.opt -- Options for the analyzer.
+
+; Copyright (C) 2019-2020 Free Software Foundation, Inc.
+;
+; This file is part of GCC.
+;
+; GCC is free software; you can redistribute it and/or modify it under
+; the terms of the GNU General Public License as published by the Free
+; Software Foundation; either version 3, or (at your option) any later
+; version.
+; 
+; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+; WARRANTY; without even the implied warranty of MERCHANTABILITY or
+; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+; for more details.
+; 
+; You should have received a copy of the GNU General Public License
+; along with GCC; see the file COPYING3.  If not see
+; .
+
+; See the GCC internals manual for a description of this file's format.
+
+; Please try to keep this file in ASCII collating order.
+
+-param=analyzer-bb-explosion-factor=
+Common Joined UInteger Var(param_analyzer_bb_explosion_factor) Init(5) Param
+The maximum number of 'after supernode' exploded nodes within the analyzer per 
supernode, before terminating analysis.
+
+-param=analyzer-max-enodes-per-program-point=
+Common Joined UInteger Var(param_analyzer_max_enodes_per_program_point) 
Init(8) Param
+The maximum number of exploded nodes per program point within the analyzer, 
before terminating analysis of that point.
+
+-param=analyzer-max-recursion-depth=
+Common Joined UInteger Var(param_analyzer_max_recursion_depth) Init(2) Param
+The maximum number of times a callsite can appear in a call stack within the 
analyzer, before terminating analysis of a call that would recurse deeper.
+
+-param=analyzer-min-snodes-for-call-summary=
+Common Joined UInteger Var(param_analyzer_min_snodes_for_call_summary) 
Init(10) Param
+The minimum number of supernodes within a function for the analyzer to 
consider summarizing its effects at call sites.
+
+Wanalyzer-double-fclose
+Common Var(warn_analyzer_double_fclose) Init(1) Warning
+Warn about code paths in which a stdio FILE can be closed more than once.
+
+Wanalyzer-double-free
+Common Var(warn_analyzer_double_free) Init(1) Warning
+Warn about code paths in which a pointer can be freed more than once.
+
+Wanalyzer-exposure-through-output-file
+Common Var(warn_analyzer_exposure_through_output_file) Init(1) Warning
+Warn about code paths in which sensitive data is written to a file.
+
+Wanalyzer-file-leak
+Common Var(warn_analyzer_file_leak) Init(1) Warning
+Warn about code paths in which a stdio FILE is not closed.
+
+Wanalyzer-free-of-non-heap
+Common Var(warn_analyzer_free_of_non_heap) Init(1) Warning
+Warn about code paths in which a non-heap pointer is freed.
+
+Wanalyzer-malloc-leak
+Common Var(warn_analyzer_malloc_leak) Init(1) Warning
+Warn about code paths in which a heap-allocated pointer leaks.
+
+Wanalyzer-possible-null-argument
+Common Var(warn_analyzer_possible_null_argument) Init(1) Warning
+Warn about code paths in which a possibly-NULL value is passed to a 
must-not-be-NULL function argument.
+
+Wanalyzer-possible-null-dereference
+Common Var(warn_analyzer_possible_null_dereference) Init(1) Warning
+Warn about code paths in which a possibly-NULL 

[PATCH 39/41] analyzer: new files: diagnostic-manager.{cc|h}

2020-01-08 Thread David Malcolm
Needs review.  Jeff reviewed the v1 version of the patch here:
  https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00818.html
requesting a function to be split up, which I did in v4.
See the URLs below for notes on the other changes.

Changed in v5:
- update ChangeLog path
- updated copyright years to include 2020

Changed in v4:
- Remove include of gcc-plugin.h, reworking includes accordingly.
- Wrap everything in #if ENABLE_ANALYZER
- Remove /// comment lines
- Add custom events:
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00213.html
- Generalize rewind_info_t to exploded_edge::custom_info_t
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00219.html
- Add support for global state:
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00217.html
- Show rewind destination for leaks due to longjmp
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02029.html
- Split diagnostic_manager::prune_path into subroutines
- Add DISABLE_COPY_AND_ASSIGN (saved_diagnostic);

This patch adds diagnostic_manager and related support classes for
saving, deduplicating, and emitting analyzer diagnostics.

gcc/analyzer/ChangeLog:
* diagnostic-manager.cc: New file.
* diagnostic-manager.h: New file.
---
 gcc/analyzer/diagnostic-manager.cc | 1217 
 gcc/analyzer/diagnostic-manager.h  |  137 
 2 files changed, 1354 insertions(+)
 create mode 100644 gcc/analyzer/diagnostic-manager.cc
 create mode 100644 gcc/analyzer/diagnostic-manager.h

diff --git a/gcc/analyzer/diagnostic-manager.cc 
b/gcc/analyzer/diagnostic-manager.cc
new file mode 100644
index ..0fe30c42254d
--- /dev/null
+++ b/gcc/analyzer/diagnostic-manager.cc
@@ -0,0 +1,1217 @@
+/* Classes for saving, deduplicating, and emitting analyzer diagnostics.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+   Contributed by David Malcolm .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "pretty-print.h"
+#include "gcc-rich-location.h"
+#include "gimple-pretty-print.h"
+#include "analyzer/analyzer.h"
+#include "analyzer/diagnostic-manager.h"
+#include "analyzer/exploded-graph.h"
+#include "analyzer/checker-path.h"
+
+#if ENABLE_ANALYZER
+
+/* class saved_diagnostic.  */
+
+/* saved_diagnostic's ctor.
+   Take ownership of D and STMT_FINDER.  */
+
+saved_diagnostic::saved_diagnostic (const state_machine *sm,
+   const exploded_node *enode,
+   const supernode *snode, const gimple *stmt,
+   stmt_finder *stmt_finder,
+   tree var, state_machine::state_t state,
+   pending_diagnostic *d)
+: m_sm (sm), m_enode (enode), m_snode (snode), m_stmt (stmt),
+ /* stmt_finder could be on-stack; we want our own copy that can
+outlive that.  */
+  m_stmt_finder (stmt_finder ? stmt_finder->clone () : NULL),
+  m_var (var), m_state (state),
+  m_d (d), m_trailing_eedge (NULL)
+{
+  gcc_assert (m_stmt || m_stmt_finder);
+
+  /* We must have an enode in order to be able to look for paths
+ through the exploded_graph to this diagnostic.  */
+  gcc_assert (m_enode);
+}
+
+/* saved_diagnostic's dtor.  */
+
+saved_diagnostic::~saved_diagnostic ()
+{
+  delete m_stmt_finder;
+  delete m_d;
+}
+
+/* class diagnostic_manager.  */
+
+/* diagnostic_manager's ctor.  */
+
+diagnostic_manager::diagnostic_manager (logger *logger, int verbosity)
+: log_user (logger), m_verbosity (verbosity)
+{
+}
+
+/* Queue pending_diagnostic D at ENODE for later emission.  */
+
+void
+diagnostic_manager::add_diagnostic (const state_machine *sm,
+   const exploded_node *enode,
+   const supernode *snode, const gimple *stmt,
+   stmt_finder *finder,
+   tree var, state_machine::state_t state,
+   pending_diagnostic *d)
+{
+  LOG_FUNC (get_logger ());
+
+  /* We must have an enode in order to be able to look for paths
+ through the exploded_graph to the diagnostic.  */
+  gcc_assert (enode);
+
+  saved_diagnostic *sd
+= new saved_diagnostic (sm, enode, snode, stmt, finder, var, state, d);
+  m_saved_diagnostics.safe_push (sd);
+  if (get_logger ())
+log 

[PATCH 35/41] analyzer: new file: exploded-graph.h

2020-01-08 Thread David Malcolm
Jeff's initial review of v1 of this patch:
  https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00813.html
I've addressed most of the issues he raised there.
TODO: should some structs be classes?

Changed in v5:
- update ChangeLog path
- updated copyright years to include 2020

Changed in v4:
- Remove /// comment lines
- Don't use multiple inheritance, instead adding a log_user member.
- Add more validation, part of:
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02517.html
- Generalize rewind_info_t to exploded_edge::custom_info_t
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00219.html
- Add DISABLE_COPY_AND_ASSIGN (exploded_node);
- Add DISABLE_COPY_AND_ASSIGN (exploded_edge);
- Add DISABLE_COPY_AND_ASSIGN (exploded_graph);

This patch adds exploded_graph and related classes, for managing
exploring paths through the user's code as a directed graph
of <point, state> pairs.
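
As a rough picture of how a graph keyed on a point and a state can avoid
revisiting nodes, here is a standalone sketch.  The toy_* types are
hypothetical simplifications (an int for the point, a string for the
state); only the idea of caching an XOR-combined hash in the key follows
the point_and_state struct in the patch.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical key: a program point plus a state, with a cached hash.
struct toy_point_and_state
{
  toy_point_and_state (int p, std::string s)
  : point (p), state (std::move (s)),
    hash (std::hash<int> () (point) ^ std::hash<std::string> () (state))
  {
  }

  bool operator== (const toy_point_and_state &other) const
  {
    return point == other.point && state == other.state;
  }

  int point;
  std::string state;
  std::size_t hash;
};

struct toy_key_hasher
{
  std::size_t operator() (const toy_point_and_state &k) const
  {
    return k.hash;
  }
};

class toy_exploded_graph
{
 public:
  // Returns true if a new node was created, false if this
  // (point, state) pair had already been seen.
  bool get_or_create_node (int point, const std::string &state)
  {
    return m_seen.emplace (point, state).second;
  }

  std::size_t num_nodes () const { return m_seen.size (); }

 private:
  std::unordered_set<toy_point_and_state, toy_key_hasher> m_seen;
};
```

Reaching the same point with a different state creates a new node, while
reaching it with an identical state reuses the existing one.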

gcc/analyzer/ChangeLog:
* exploded-graph.h: New file.
---
 gcc/analyzer/exploded-graph.h | 830 ++
 1 file changed, 830 insertions(+)
 create mode 100644 gcc/analyzer/exploded-graph.h

diff --git a/gcc/analyzer/exploded-graph.h b/gcc/analyzer/exploded-graph.h
new file mode 100644
index ..22e8747c6ae2
--- /dev/null
+++ b/gcc/analyzer/exploded-graph.h
@@ -0,0 +1,830 @@
+/* Classes for managing a directed graph of <point, state> pairs.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+   Contributed by David Malcolm .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#ifndef GCC_ANALYZER_EXPLODED_GRAPH_H
+#define GCC_ANALYZER_EXPLODED_GRAPH_H
+
+#include "alloc-pool.h"
+#include "fibonacci_heap.h"
+#include "shortest-paths.h"
+#include "analyzer/analyzer-logging.h"
+#include "analyzer/constraint-manager.h"
+#include "analyzer/diagnostic-manager.h"
+#include "analyzer/program-point.h"
+#include "analyzer/program-state.h"
+
+/* Concrete implementation of region_model_context, wiring it up to the
+   rest of the analysis engine.  */
+
+class impl_region_model_context : public region_model_context
+{
+ public:
+  impl_region_model_context (exploded_graph ,
+const exploded_node *enode_for_diag,
+
+/* TODO: should we be getting the ECs from the
+   old state, rather than the new?  */
+const program_state *old_state,
+program_state *new_state,
+state_change *change,
+
+const gimple *stmt,
+stmt_finder *stmt_finder = NULL);
+
+  impl_region_model_context (program_state *state,
+state_change *change,
+const extrinsic_state _state);
+
+  void warn (pending_diagnostic *d) FINAL OVERRIDE;
+
+  void remap_svalue_ids (const svalue_id_map &map) FINAL OVERRIDE;
+
+  int on_svalue_purge (svalue_id first_unused_sid,
+                       const svalue_id_map &map) FINAL OVERRIDE;
+
+  logger *get_logger () FINAL OVERRIDE
+  {
+    return m_logger.get_logger ();
+  }
+
+  void on_state_leak (const state_machine &sm,
+                      int sm_idx,
+                      svalue_id sid,
+                      svalue_id first_unused_sid,
+                      const svalue_id_map &map,
+                      state_machine::state_t state);
+
+  void on_inherited_svalue (svalue_id parent_sid,
+                            svalue_id child_sid) FINAL OVERRIDE;
+
+  void on_cast (svalue_id src_sid,
+                svalue_id dst_sid) FINAL OVERRIDE;
+
+  void on_condition (tree lhs, enum tree_code op, tree rhs) FINAL OVERRIDE;
+
+  exploded_graph *m_eg;
+  log_user m_logger;
+  const exploded_node *m_enode_for_diag;
+  const program_state *m_old_state;
+  program_state *m_new_state;
+  state_change *m_change;
+  const gimple *m_stmt;
+  stmt_finder *m_stmt_finder;
+  const extrinsic_state &m_ext_state;
+};
+
+/* A <program_point, program_state> pair, used internally by
+   exploded_node as its immutable data, and as a key for identifying
+   exploded_nodes we've already seen in the graph.  */
+
+struct point_and_state
+{
+  point_and_state (const program_point &point,
+                   const program_state &state)
+  : m_point (point),
+    m_state (state),
+    m_hash (m_point.hash () ^ m_state.hash ())
+  {
+  }
+
+  hashval_t hash () const
+  {
+    return m_hash;
+  }
+  bool operator== (const point_and_state &other) const
+  {
+

[PATCH 36/41] analyzer: new files: state-purge.{cc|h}

2020-01-08 Thread David Malcolm
Jeff approved the v1 version of the patch here:
  https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00815.html
(with one item that I've addressed in v5), and the followups count as
obvious in my opinion.

Changed in v5:
- update ChangeLog path
- updated copyright years to include 2020
- kill the debugging leftover identified by Jeff

Changed in v4:
- Remove include of gcc-plugin.h, reworking includes accordingly.
- Wrap everything in #if ENABLE_ANALYZER
- Remove /// comment lines
- Use TV_ANALYZER_STATE_PURGE rather than an auto_client_timevar
- Fix .dot output:
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02461.html
- Add DISABLE_COPY_AND_ASSIGN (state_purge_map);

This patch adds classes for tracking what state can be safely purged
at any given point in the program.
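
As a rough illustration of the idea (not the code in state-purge.cc), the backward walk from each use towards the def can be sketched on a toy control-flow graph. Everything here is hypothetical: program points are plain integers, `preds` stands in for the supergraph, and `points_needing_name` is an invented helper. In this sketch the defining point itself is included in the result.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy model: preds[p] lists the predecessor points of program point p.
// Returns the set of points at which the SSA name must be kept, found
// by walking backwards from every use until the definition is reached.
std::set<int>
points_needing_name (const std::vector<std::vector<int> > &preds,
                     int def_point,
                     const std::vector<int> &use_points)
{
  std::set<int> needed;
  std::vector<int> worklist (use_points);
  while (!worklist.empty ())
    {
      int p = worklist.back ();
      worklist.pop_back ();
      assert (p >= 0 && (std::size_t) p < preds.size ());
      if (!needed.insert (p).second)
        continue; // already visited
      if (p == def_point)
        continue; // stop walking at the definition
      for (std::size_t i = 0; i < preds[p].size (); i++)
        worklist.push_back (preds[p][i]);
    }
  return needed;
}
```

Any point that is not in the returned set is one where state about the name could be dropped without affecting later uses.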

gcc/analyzer/ChangeLog:
* state-purge.cc: New file.
* state-purge.h: New file.
---
 gcc/analyzer/state-purge.cc | 524 
 gcc/analyzer/state-purge.h  | 164 +++
 2 files changed, 688 insertions(+)
 create mode 100644 gcc/analyzer/state-purge.cc
 create mode 100644 gcc/analyzer/state-purge.h

diff --git a/gcc/analyzer/state-purge.cc b/gcc/analyzer/state-purge.cc
new file mode 100644
index ..61263d1edaeb
--- /dev/null
+++ b/gcc/analyzer/state-purge.cc
@@ -0,0 +1,524 @@
+/* Classes for purging state at function_points.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+   Contributed by David Malcolm .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "timevar.h"
+#include "tree-ssa-alias.h"
+#include "function.h"
+#include "basic-block.h"
+#include "gimple.h"
+#include "stringpool.h"
+#include "tree-vrp.h"
+#include "gimple-ssa.h"
+#include "tree-ssanames.h"
+#include "tree-phinodes.h"
+#include "options.h"
+#include "ssa-iterators.h"
+#include "gimple-pretty-print.h"
+#include "analyzer/analyzer.h"
+#include "analyzer/state-purge.h"
+
+#if ENABLE_ANALYZER
+
+/* state_purge_map's ctor.  Walk all SSA names in all functions, building
+   a state_purge_per_ssa_name instance for each.  */
+
+state_purge_map::state_purge_map (const supergraph &sg,
+                                  logger *logger)
+: log_user (logger), m_sg (sg)
+{
+  LOG_FUNC (logger);
+
+  auto_timevar tv (TV_ANALYZER_STATE_PURGE);
+
+  cgraph_node *node;
+  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
+  {
+    function *fun = node->get_fun ();
+    if (logger)
+      log ("function: %s", function_name (fun));
+    tree name;
+    unsigned int i;
+    FOR_EACH_SSA_NAME (i, name, fun)
+      {
+        /* For now, don't bother tracking the .MEM SSA names.  */
+        if (tree var = SSA_NAME_VAR (name))
+          if (TREE_CODE (var) == VAR_DECL)
+            if (VAR_DECL_IS_VIRTUAL_OPERAND (var))
+              continue;
+        m_map.put (name, new state_purge_per_ssa_name (*this, name, fun));
+      }
+  }
+}
+
+/* state_purge_map's dtor.  */
+
+state_purge_map::~state_purge_map ()
+{
+  for (iterator iter = m_map.begin (); iter != m_map.end (); ++iter)
+delete (*iter).second;
+}
+
+/* state_purge_per_ssa_name's ctor.
+
+   Locate all uses of NAME within FUN.
+   Walk backwards from each use, marking program points, until
+   we reach the def stmt, populating m_points_needing_name.
+
+   We have to track program points rather than
+   just stmts since there could be empty basic blocks on the way.  */
+
+state_purge_per_ssa_name::state_purge_per_ssa_name (const state_purge_map &map,
+                                                    tree name,
+                                                    function *fun)
+: m_points_needing_name (), m_name (name), m_fun (fun)
+{
+  LOG_FUNC (map.get_logger ());
+
+  if (map.get_logger ())
+    {
+      map.log ("SSA name: %qE within %qD", name, fun->decl);
+
+      /* Show def stmt.  */
+      const gimple *def_stmt = SSA_NAME_DEF_STMT (name);
+      pretty_printer pp;
+      pp_gimple_stmt_1 (&pp, def_stmt, 0, (dump_flags_t)0);
+      map.log ("def stmt: %s", pp_formatted_text (&pp));
+    }
+
+  auto_vec worklist;
+
+  /* Add all immediate uses of name to the worklist.
+ Compare with debug_immediate_uses.  */
+  imm_use_iterator iter;
+  use_operand_p use_p;
+  FOR_EACH_IMM_USE_FAST (use_p, iter, name)
+{
+  if (USE_STMT (use_p))
+   {
+ const gimple *use_stmt = USE_STMT (use_p);

[PATCH 28/41] analyzer: new file: sm-sensitive.cc

2020-01-08 Thread David Malcolm
Jeff reviewed the v1 version of this patch here:
  https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00509.html
> Given it's not ready for production, fine.  Presumably one of the areas
> for improvement is a better answer to the "what constitutes exposure"
> question ;-)
I have followup work using function_set that could flesh this out
a bit, but this one isn't going to be "mature" for GCC 10; see
discussion in cover letter.

Changed in v5:
- update ChangeLog path
- updated copyright years to include 2020

Changed in v4:
- Remove include of gcc-plugin.h, reworking includes accordingly.
- Wrap everything in #if ENABLE_ANALYZER
- Remove /// comment lines
- Rework on_leak vfunc:
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02028.html
- Rework for changes to is_named_call_p, resolving function pointers:
   https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00178.html
- Implement precision-of-wording vfuncs

This patch adds a state machine checker for tracking exposure of
sensitive data (e.g. writing passwords to log files).

This checker isn't ready for production, and is presented as a
proof-of-concept of the sm-based approach.

gcc/analyzer/ChangeLog:
* sm-sensitive.cc: New file.
---
 gcc/analyzer/sm-sensitive.cc | 245 +++
 1 file changed, 245 insertions(+)
 create mode 100644 gcc/analyzer/sm-sensitive.cc

diff --git a/gcc/analyzer/sm-sensitive.cc b/gcc/analyzer/sm-sensitive.cc
new file mode 100644
index ..94d637eeff6a
--- /dev/null
+++ b/gcc/analyzer/sm-sensitive.cc
@@ -0,0 +1,245 @@
+/* An experimental state machine, for tracking exposure of sensitive
+   data (e.g. through logging).
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+   Contributed by David Malcolm .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "function.h"
+#include "basic-block.h"
+#include "gimple.h"
+#include "options.h"
+#include "diagnostic-path.h"
+#include "diagnostic-metadata.h"
+#include "analyzer/analyzer.h"
+#include "analyzer/pending-diagnostic.h"
+#include "analyzer/sm.h"
+
+#if ENABLE_ANALYZER
+
+namespace {
+
+/* An experimental state machine, for tracking exposure of sensitive
+   data (e.g. through logging).  */
+
+class sensitive_state_machine : public state_machine
+{
+public:
+  sensitive_state_machine (logger *logger);
+
+  bool inherited_state_p () const FINAL OVERRIDE { return true; }
+
+  bool on_stmt (sm_context *sm_ctxt,
+   const supernode *node,
+   const gimple *stmt) const FINAL OVERRIDE;
+
+  void on_condition (sm_context *sm_ctxt,
+const supernode *node,
+const gimple *stmt,
+tree lhs,
+enum tree_code op,
+tree rhs) const FINAL OVERRIDE;
+
+  bool can_purge_p (state_t s) const FINAL OVERRIDE;
+
+  /* Start state.  */
+  state_t m_start;
+
+  /* State for "sensitive" data, such as a password.  */
+  state_t m_sensitive;
+
+  /* Stop state, for a value we don't want to track any more.  */
+  state_t m_stop;
+
+private:
+  void warn_for_any_exposure (sm_context *sm_ctxt,
+ const supernode *node,
+ const gimple *stmt,
+ tree arg) const;
+};
+
+class exposure_through_output_file
+  : public pending_diagnostic_subclass<exposure_through_output_file>
+{
+public:
+  exposure_through_output_file (const sensitive_state_machine &sm, tree arg)
+  : m_sm (sm), m_arg (arg)
+  {}
+
+  const char *get_kind () const FINAL OVERRIDE
+  {
+    return "exposure_through_output_file";
+  }
+
+  bool operator== (const exposure_through_output_file &other) const
+  {
+    return m_arg == other.m_arg;
+  }
+
+  bool emit (rich_location *rich_loc) FINAL OVERRIDE
+  {
+    diagnostic_metadata m;
+    /* CWE-532: Information Exposure Through Log Files */
+    m.add_cwe (532);
+    return warning_at (rich_loc, m, OPT_Wanalyzer_exposure_through_output_file,
+                       "sensitive value %qE written to output file",
+                       m_arg);
+  }
+
+  label_text describe_state_change (const evdesc::state_change &change)
+    FINAL OVERRIDE
+  {
+    if (change.m_new_state == m_sm.m_sensitive)
+      {
+        m_first_sensitive_event = change.m_event_id;
+   return change.formatted_print ("sensitive value 

[PATCH 24/41] analyzer: new files: sm.{cc|h}

2020-01-08 Thread David Malcolm
The v1 version of this patch was reviewed by Jeff here:
  https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00805.html
TODO: looks like I still need to act on some of his comments there

Changed in v5:
- update ChangeLog path
- updated copyright years to include 2020

Changed in v4:
- Remove include of gcc-plugin.h, reworking includes accordingly.
- Wrap everything in #if ENABLE_ANALYZER
- Remove /// comment lines
- Add call to make_signal_state_machine:
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00214.html
- Rework on_leak vfunc:
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02028.html
- Add DISABLE_COPY_AND_ASSIGN to state_machine
- Add support for global states and custom transitions:
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00217.html

This patch adds a "state_machine" base class for describing
API checkers in terms of state machine transitions.  Followup
patches use this to add specific API checkers.
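
As a rough illustration of the approach (not the code in sm.h), a checker can be modelled as a table of named states plus a transition function. The sketch below mirrors `add_state`, `get_state_name` and `validate` from the patch using only the standard library. The names `toy_state_machine`, `toy_file_checker`, `on_call` and `is_misuse_p` are hypothetical and do not appear in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the patch's state_machine base class.
class toy_state_machine
{
public:
  typedef std::size_t state_t;

  virtual ~toy_state_machine () {}

  // Register a state; its id is its index in the name table.
  state_t add_state (const std::string &name)
  {
    m_state_names.push_back (name);
    return m_state_names.size () - 1;
  }

  const std::string &get_state_name (state_t s) const
  {
    validate (s);
    return m_state_names[s];
  }

  void validate (state_t s) const
  {
    assert (s < m_state_names.size ());
  }

private:
  std::vector<std::string> m_state_names;
};

// Hypothetical checker: a handle goes start -> open -> closed.
class toy_file_checker : public toy_state_machine
{
public:
  toy_file_checker ()
  {
    m_start = add_state ("start");
    m_open = add_state ("open");
    m_closed = add_state ("closed");
  }

  // Return the state reached when CALL is seen in state S;
  // calls that do not match a transition leave S unchanged.
  state_t on_call (state_t s, const std::string &call) const
  {
    if (call == "open" && s == m_start)
      return m_open;
    if (call == "close" && s == m_open)
      return m_closed;
    return s;
  }

  // Closing an already-closed handle is the misuse this toy reports.
  bool is_misuse_p (state_t s, const std::string &call) const
  {
    return call == "close" && s == m_closed;
  }

  state_t m_start, m_open, m_closed;
};
```

Keeping states as small integer ids makes them cheap to store per value and to compare, with the name table needed only for logging and diagnostics.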

gcc/analyzer/ChangeLog:
* sm.cc: New file.
* sm.h: New file.
---
 gcc/analyzer/sm.cc | 136 +
 gcc/analyzer/sm.h  | 182 +
 2 files changed, 318 insertions(+)
 create mode 100644 gcc/analyzer/sm.cc
 create mode 100644 gcc/analyzer/sm.h

diff --git a/gcc/analyzer/sm.cc b/gcc/analyzer/sm.cc
new file mode 100644
index ..10c55838f797
--- /dev/null
+++ b/gcc/analyzer/sm.cc
@@ -0,0 +1,136 @@
+/* Modeling API uses and misuses via state machines.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+   Contributed by David Malcolm .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "function.h"
+#include "basic-block.h"
+#include "gimple.h"
+#include "options.h"
+#include "analyzer/analyzer.h"
+#include "analyzer/sm.h"
+
+#if ENABLE_ANALYZER
+
+/* If STMT is an assignment to zero, return the LHS.  */
+
+tree
+is_zero_assignment (const gimple *stmt)
+{
+  const gassign *assign_stmt = dyn_cast <const gassign *> (stmt);
+  if (!assign_stmt)
+return NULL_TREE;
+
+  enum tree_code op = gimple_assign_rhs_code (assign_stmt);
+  if (op != INTEGER_CST)
+return NULL_TREE;
+
+  if (!zerop (gimple_assign_rhs1 (assign_stmt)))
+return NULL_TREE;
+
+  return gimple_assign_lhs (assign_stmt);
+}
+
+/* If COND_STMT is a comparison against zero of the form (LHS OP 0),
+   return true and write what's being compared to *OUT_LHS and the kind of
+   the comparison to *OUT_OP.  */
+
+bool
+is_comparison_against_zero (const gcond *cond_stmt,
+   tree *out_lhs, enum tree_code *out_op)
+{
+  enum tree_code op = gimple_cond_code (cond_stmt);
+  tree lhs = gimple_cond_lhs (cond_stmt);
+  tree rhs = gimple_cond_rhs (cond_stmt);
+  if (!zerop (rhs))
+return false;
+  // TODO: make it symmetric?
+
+  switch (op)
+{
+case NE_EXPR:
+case EQ_EXPR:
+  *out_lhs = lhs;
+  *out_op = op;
+  return true;
+
+default:
+  return false;
+}
+}
+
+bool
+any_pointer_p (tree var)
+{
+  if (TREE_CODE (TREE_TYPE (var)) != POINTER_TYPE)
+return false;
+
+  return true;
+}
+
+state_machine::state_t
+state_machine::add_state (const char *name)
+{
+  m_state_names.safe_push (name);
+  return m_state_names.length () - 1;
+}
+
+const char *
+state_machine::get_state_name (state_t s) const
+{
+  return m_state_names[s];
+}
+
+void
+state_machine::validate (state_t s) const
+{
+  gcc_assert (s < m_state_names.length ());
+}
+
+void
+make_checkers (auto_delete_vec <state_machine> &out, logger *logger)
+{
+  out.safe_push (make_malloc_state_machine (logger));
+  out.safe_push (make_fileptr_state_machine (logger));
+  out.safe_push (make_taint_state_machine (logger));
+  out.safe_push (make_sensitive_state_machine (logger));
+  out.safe_push (make_signal_state_machine (logger));
+
+  /* We only attempt to run the pattern tests if it might have been manually
+ enabled (for DejaGnu purposes).  */
+  if (flag_analyzer_checker)
+out.safe_push (make_pattern_test_state_machine (logger));
+
+  if (flag_analyzer_checker)
+{
+  unsigned read_index, write_index;
+  state_machine **sm;
+
+  /* TODO: this leaks the machines
+Would be nice to log the things that were removed.  */
+  VEC_ORDERED_REMOVE_IF (out, read_index, write_index, sm,
+0 != strcmp (flag_analyzer_checker,
+   

[PATCH 27/41] analyzer: new file: sm-pattern-test.cc

2020-01-08 Thread David Malcolm
Jeff approved the v1 version of this patch here:
  https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00508.html
and the subsequent changes are obvious in my view.

Changed in v5:
- update ChangeLog path
- updated copyright years to include 2020

Changed in v4:
- Remove include of gcc-plugin.h, reworking includes accordingly.
- Wrap everything in #if ENABLE_ANALYZER
- Remove /// comment lines
- Rework on_leak vfunc:
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02028.html

This patch adds a custom state machine checker intended purely for DejaGnu
testing of the sm "machinery".

gcc/analyzer/ChangeLog:
* sm-pattern-test.cc: New file.
---
 gcc/analyzer/sm-pattern-test.cc | 149 
 1 file changed, 149 insertions(+)
 create mode 100644 gcc/analyzer/sm-pattern-test.cc

diff --git a/gcc/analyzer/sm-pattern-test.cc b/gcc/analyzer/sm-pattern-test.cc
new file mode 100644
index ..24b9b788caf3
--- /dev/null
+++ b/gcc/analyzer/sm-pattern-test.cc
@@ -0,0 +1,149 @@
+/* A state machine for use in DejaGnu tests, to check that
+   pattern-matching works as expected.
+
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+   Contributed by David Malcolm .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "function.h"
+#include "basic-block.h"
+#include "gimple.h"
+#include "tree-pretty-print.h"
+#include "diagnostic-path.h"
+#include "diagnostic-metadata.h"
+#include "analyzer/analyzer.h"
+#include "analyzer/pending-diagnostic.h"
+#include "analyzer/sm.h"
+
+#if ENABLE_ANALYZER
+
+namespace {
+
+/* A state machine for use in DejaGnu tests, to check that
+   pattern-matching works as expected.  */
+
+class pattern_test_state_machine : public state_machine
+{
+public:
+  pattern_test_state_machine (logger *logger);
+
+  bool inherited_state_p () const FINAL OVERRIDE { return false; }
+
+  bool on_stmt (sm_context *sm_ctxt,
+   const supernode *node,
+   const gimple *stmt) const FINAL OVERRIDE;
+
+  void on_condition (sm_context *sm_ctxt,
+const supernode *node,
+const gimple *stmt,
+tree lhs,
+enum tree_code op,
+tree rhs) const FINAL OVERRIDE;
+
+  bool can_purge_p (state_t s) const FINAL OVERRIDE;
+
+private:
+  state_t m_start;
+};
+
+class pattern_match : public pending_diagnostic_subclass<pattern_match>
+{
+public:
+  pattern_match (tree lhs, enum tree_code op, tree rhs)
+  : m_lhs (lhs), m_op (op), m_rhs (rhs) {}
+
+  const char *get_kind () const FINAL OVERRIDE { return "pattern_match"; }
+
+  bool operator== (const pattern_match &other) const
+  {
+return (m_lhs == other.m_lhs
+   && m_op == other.m_op
+   && m_rhs == other.m_rhs);
+  }
+
+  bool emit (rich_location *rich_loc) FINAL OVERRIDE
+  {
+return warning_at (rich_loc, 0, "pattern match on %<%E %s %E%>",
+  m_lhs, op_symbol_code (m_op), m_rhs);
+  }
+
+private:
+  tree m_lhs;
+  enum tree_code m_op;
+  tree m_rhs;
+};
+
+pattern_test_state_machine::pattern_test_state_machine (logger *logger)
+: state_machine ("pattern-test", logger)
+{
+  m_start = add_state ("start");
+}
+
+bool
+pattern_test_state_machine::on_stmt (sm_context *sm_ctxt ATTRIBUTE_UNUSED,
+const supernode *node ATTRIBUTE_UNUSED,
+const gimple *stmt ATTRIBUTE_UNUSED) const
+{
+  return false;
+}
+
+/* Implementation of state_machine::on_condition vfunc for
+   pattern_test_state_machine.
+
+   Queue a pattern_match diagnostic for any comparison against a
+   constant.  */
+
+void
+pattern_test_state_machine::on_condition (sm_context *sm_ctxt,
+ const supernode *node,
+ const gimple *stmt,
+ tree lhs,
+ enum tree_code op,
+ tree rhs) const
+{
+  if (stmt == NULL)
+return;
+
+  if (!CONSTANT_CLASS_P (rhs))
+return;
+
+  pending_diagnostic *diag = new pattern_match (lhs, op, rhs);
+  sm_ctxt->warn_for_state (node, stmt, lhs, m_start, diag);
+}
+
+bool
+pattern_test_state_machine::can_purge_p (state_t s ATTRIBUTE_UNUSED) const
+{
+  return true;
+}
+

[PATCH 37/41] analyzer: new files: engine.{cc|h}

2020-01-08 Thread David Malcolm
Needs review.

Changed in v5:
- update ChangeLog path
- updated copyright years to include 2020

Changed in v4:
- Remove include of gcc-plugin.h, reworking includes accordingly.
- Wrap everything in #if ENABLE_ANALYZER
- Remove /// comment lines
- Rework logging to avoid exploded_graph multiple-inheritance (moving
  log_user base to a member)
- Support resolving function pointers:
   https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00178.html
- Add support for global state:
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00217.html
- Rework on_leak vfunc:
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02028.html
- Add more validation, part of:
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02517.html
- Fix .dot output:
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02461.html
- Generalize rewind_info_t to exploded_edge::custom_info_t
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00219.html
- Support showing rewind destination for leaks due to longjmp
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02029.html
- Use TV_ANALYZER_SCC, TV_ANALYZER_WORKLIST, and TV_ANALYZER_DUMP rather
  than auto_client_timevar.  Drop top-level auto_client_timevar in
  favor of tv within pass.
- Port to new param API

This patch adds the core analysis code, which explores "interesting"
interprocedural paths in the code, updating state machines to check
for API misuses, and issuing diagnostics for such misuses.
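
As a loose sketch of what exploring paths over <point, state> pairs can look like (this is not the algorithm in engine.cc), the toy below runs a worklist over pairs and visits each distinct pair once. All names are hypothetical: `toy_point`, `explore`, and the integer "state" with its `max_state` bound are inventions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>
#include <utility>
#include <vector>

// Hypothetical toy program point: successor points, plus an effect
// that is added to an integer "state" when the point is left.
struct toy_point
{
  std::vector<int> succs;
  int effect;
};

typedef std::pair<int, int> point_and_state; // (point index, state)

// Visit every reachable (point, state) pair exactly once.
std::set<point_and_state>
explore (const std::vector<toy_point> &points,
         int entry, int init_state, int max_state)
{
  assert (entry >= 0 && (std::size_t) entry < points.size ());
  std::set<point_and_state> seen;
  std::deque<point_and_state> worklist;
  point_and_state start (entry, init_state);
  seen.insert (start);
  worklist.push_back (start);
  while (!worklist.empty ())
    {
      point_and_state cur = worklist.front ();
      worklist.pop_front ();
      const toy_point &pt = points[cur.first];
      int new_state = cur.second + pt.effect;
      if (new_state > max_state)
        new_state = max_state; // crude bound so that loops terminate
      for (std::size_t i = 0; i < pt.succs.size (); i++)
        {
          point_and_state next (pt.succs[i], new_state);
          if (seen.insert (next).second)
            worklist.push_back (next); // only enqueue unseen pairs
        }
    }
  return seen;
}
```

The same point can appear several times in the result with different states, which is what distinguishes this kind of graph from the plain control-flow graph.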

gcc/analyzer/ChangeLog:
* engine.cc: New file.
* engine.h: New file.
---
 gcc/analyzer/engine.cc | 3583 
 gcc/analyzer/engine.h  |   26 +
 2 files changed, 3609 insertions(+)
 create mode 100644 gcc/analyzer/engine.cc
 create mode 100644 gcc/analyzer/engine.h

diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
new file mode 100644
index ..0c0b141a678c
--- /dev/null
+++ b/gcc/analyzer/engine.cc
@@ -0,0 +1,3583 @@
+/* The analysis "engine".
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+   Contributed by David Malcolm .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "gcc-rich-location.h"
+#include "analyzer/exploded-graph.h"
+#include "analyzer/analysis-plan.h"
+#include "analyzer/checker-path.h"
+#include "analyzer/state-purge.h"
+
+/* For an overview, see gcc/doc/analyzer.texi.  */
+
+#if ENABLE_ANALYZER
+
+static int readability_comparator (const void *p1, const void *p2);
+
+/* class impl_region_model_context : public region_model_context, public log_user.  */
+
+impl_region_model_context::
+impl_region_model_context (exploded_graph &eg,
+                           const exploded_node *enode_for_diag,
+                           const program_state *old_state,
+                           program_state *new_state,
+                           state_change *change,
+                           const gimple *stmt,
+                           stmt_finder *stmt_finder)
+: m_eg (&eg), m_logger (eg.get_logger ()),
+  m_enode_for_diag (enode_for_diag),
+  m_old_state (old_state),
+  m_new_state (new_state),
+  m_change (change),
+  m_stmt (stmt),
+  m_stmt_finder (stmt_finder),
+  m_ext_state (eg.get_ext_state ())
+{
+}
+
+impl_region_model_context::
+impl_region_model_context (program_state *state,
+                           state_change *change,
+                           const extrinsic_state &ext_state)
+: m_eg (NULL), m_logger (NULL), m_enode_for_diag (NULL),
+  m_old_state (NULL),
+  m_new_state (state),
+  m_change (change),
+  m_stmt (NULL),
+  m_stmt_finder (NULL),
+  m_ext_state (ext_state)
+{
+}
+
+void
+impl_region_model_context::warn (pending_diagnostic *d)
+{
+  LOG_FUNC (get_logger ());
+  if (m_eg)
+m_eg->get_diagnostic_manager ().add_diagnostic
+  (m_enode_for_diag, m_enode_for_diag->get_supernode (),
+   m_stmt, m_stmt_finder, d);
+}
+
+void
+impl_region_model_context::remap_svalue_ids (const svalue_id_map &map)
+{
+  m_new_state->remap_svalue_ids (map);
+  if (m_change)
+m_change->remap_svalue_ids (map);
+}
+
+int
+impl_region_model_context::on_svalue_purge (svalue_id first_unused_sid,
+                                            const svalue_id_map &map)
+{
+  int total = 0;
+  int sm_idx;
+  sm_state_map *smap;
+  FOR_EACH_VEC_ELT (m_new_state->m_checker_states, sm_idx, smap)
+{
+  const state_machine  = 

[PATCH 34/41] analyzer: new files: program-state.{cc|h}

2020-01-08 Thread David Malcolm
Needs review.

Changed in v5:
- update ChangeLog path
- updated copyright years to include 2020

Changed in v4:
- Remove include of gcc-plugin.h, reworking includes accordingly.
- Wrap everything in #if ENABLE_ANALYZER
- Remove /// comment lines
- Add support for global state:
  - https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00217.html
- Rework logging to avoid exploded_graph multiple-inheritance (moving
  log_user base to a member)

This patch introduces classes for tracking the state at a particular
path of analysis.
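
One operation the patch describes for `sm_state_map` is cloning with remapped ids, giving up if an id that carries state would be mapped to null. A minimal sketch of that contract, assuming plain integers for ids and states, is shown below. The names `toy_sid`, `TOY_NULL_SID`, `toy_state_map` and `toy_id_map` are hypothetical, and a missing entry in the id map is treated here like a null mapping, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <map>
#include <memory>

typedef int toy_sid;                            // stand-in for svalue_id
const toy_sid TOY_NULL_SID = -1;                // stand-in for a null id
typedef std::map<toy_sid, int> toy_state_map;   // id -> state index
typedef std::map<toy_sid, toy_sid> toy_id_map;  // old id -> new id

// Return a remapped copy of SRC, or a null pointer if some id that
// carries state has no destination (or a null one) in ID_MAP, since
// the clone would then silently lose that state.
std::unique_ptr<toy_state_map>
clone_with_remapping (const toy_state_map &src, const toy_id_map &id_map)
{
  std::unique_ptr<toy_state_map> result (new toy_state_map ());
  for (toy_state_map::const_iterator iter = src.begin ();
       iter != src.end (); ++iter)
    {
      toy_id_map::const_iterator dst = id_map.find (iter->first);
      if (dst == id_map.end () || dst->second == TOY_NULL_SID)
        return std::unique_ptr<toy_state_map> ();
      (*result)[dst->second] = iter->second;
    }
  return result;
}
```

Returning a null pointer rather than a partial map forces the caller to notice that information would have been dropped.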

gcc/analyzer/ChangeLog:
* program-state.cc: New file.
* program-state.h: New file.
---
 gcc/analyzer/program-state.cc | 1331 +
 gcc/analyzer/program-state.h  |  365 +
 2 files changed, 1696 insertions(+)
 create mode 100644 gcc/analyzer/program-state.cc
 create mode 100644 gcc/analyzer/program-state.h

diff --git a/gcc/analyzer/program-state.cc b/gcc/analyzer/program-state.cc
new file mode 100644
index ..743f7a8acf03
--- /dev/null
+++ b/gcc/analyzer/program-state.cc
@@ -0,0 +1,1331 @@
+/* Classes for representing the state of interest at a given path of analysis.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+   Contributed by David Malcolm .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "diagnostic.h"
+#include "analyzer/analyzer.h"
+#include "analyzer/program-state.h"
+#include "analyzer/constraint-manager.h"
+#include "analyzer/exploded-graph.h"
+#include "analyzer/state-purge.h"
+#include "analyzer/analyzer-selftests.h"
+
+#if ENABLE_ANALYZER
+
+/* class sm_state_map.  */
+
+/* sm_state_map's ctor.  */
+
+sm_state_map::sm_state_map ()
+: m_map (), m_global_state (0)
+{
+}
+
+/* Clone the sm_state_map.  */
+
+sm_state_map *
+sm_state_map::clone () const
+{
+  return new sm_state_map (*this);
+}
+
+/* Clone this sm_state_map, remapping all svalue_ids within it with ID_MAP.
+
+   Return NULL if there are any svalue_ids that have sm-state for which
+   ID_MAP maps them to svalue_id::null (and thus the clone would have lost
+   the sm-state information). */
+
+sm_state_map *
+sm_state_map::clone_with_remapping (const one_way_svalue_id_map &id_map) const
+{
+  sm_state_map *result = new sm_state_map ();
+  for (typename map_t::iterator iter = m_map.begin ();
+   iter != m_map.end ();
+   ++iter)
+{
+  svalue_id sid = (*iter).first;
+  gcc_assert (!sid.null_p ());
+  entry_t e = (*iter).second;
+  /* TODO: what should we do if the origin maps from non-null to null?
+Is that loss of information acceptable?  */
+  id_map.update (&e.m_origin);
+
+  svalue_id new_sid = id_map.get_dst_for_src (sid);
+  if (new_sid.null_p ())
+   {
+ delete result;
+ return NULL;
+   }
+  result->m_map.put (new_sid, e);
+}
+  return result;
+}
+
+/* Print this sm_state_map (for SM) to PP.  */
+
+void
+sm_state_map::print (const state_machine &sm, pretty_printer *pp) const
+{
+  bool first = true;
+  pp_string (pp, "{");
+  if (m_global_state != 0)
+{
+  pp_printf (pp, "global: %s", sm.get_state_name (m_global_state));
+  first = false;
+}
+  for (typename map_t::iterator iter = m_map.begin ();
+   iter != m_map.end ();
+   ++iter)
+{
+  if (!first)
+   pp_string (pp, ", ");
+  first = false;
+  svalue_id sid = (*iter).first;
+  sid.print (pp);
+
+  entry_t e = (*iter).second;
+  pp_printf (pp, ": %s (origin: ",
+sm.get_state_name (e.m_state));
+  e.m_origin.print (pp);
+  pp_string (pp, ")");
+}
+  pp_string (pp, "}");
+}
+
+/* Dump this object (for SM) to stderr.  */
+
+DEBUG_FUNCTION void
+sm_state_map::dump (const state_machine &sm) const
+{
+  pretty_printer pp;
+  pp_show_color (&pp) = pp_show_color (global_dc->printer);
+  pp.buffer->stream = stderr;
+  print (sm, &pp);
+  pp_newline (&pp);
+  pp_flush (&pp);
+}
+
+/* Return true if no states have been set within this map
+   (all expressions are for the start state).  */
+
+bool
+sm_state_map::is_empty_p () const
+{
+  return m_map.elements () == 0 && m_global_state == 0;
+}
+
+/* Generate a hash value for this sm_state_map.  */
+
+hashval_t
+sm_state_map::hash () const
+{
+  hashval_t result = 0;
+
+  /* Accumulate the result by xoring a hash for each slot, so 

[PATCH 00/41] v5 of analyzer patch kit

2020-01-08 Thread David Malcolm
Here's an updated version of the analyzer patch kit.

The main change in this version of the kit is that I've added notes to
the top of each patch describing its review status
(e.g. "needs review" vs "approved" etc), to try to clarify what's left
to do here.

This is v5, and is relative to r279963 (2020-01-07)
Earlier versions:
v4: https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01002.html
v3: https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00529.html
v2: https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02024.html
v1: https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01543.html

In particular, v4 dropped the in-tree plugin idea: the analyzer becomes
part of the compiler, with a configure-time way to disable building it
(it is built by default, but requires -fanalyzer to run the pass; all of
the analyzer-specific code is guarded by
 #if ENABLE_ANALYZER)

See also: https://gcc.gnu.org/wiki/DavidMalcolm/StaticAnalyzer

High-level changes (relative to v4):
- rebased to r279963 (2020-01-07)
- added notes to the top of each patch on its review status
- removed various preliminary patches that I've already merged to trunk
- removed analyzer-specific builtins
- added a gcc/analyzer/ChangeLog and updated ChangeLog paths accordingly
- updated copyright years in new files to include 2020

There are various bug-fixing follow-ups that I've posted earlier
to gcc-patches and pushed to the "dmalcolm/analyzer" git branch which
I'll save for now to try to keep review manageable.

Also to be resolved is the hash_table issue here:
  https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00777.html
(which I've been looking at and will post about separately)

One of the high-level questions is what to do about the less mature
sm-*.cc files.  Currently:
  - sm-malloc.cc is most mature
  - sm-signal.cc and sm-file.cc are fairly mature once all bug-fixes
from the branch are applied
  - sm-taint.cc and sm-sensitive.cc are not production-ready and won't
be any time soon

Possible approaches:
(a) omit the less mature sm files altogether from the initial release,
retaining them as followup work on the branch, with the obvious
changes to the docs
(b) disable them by default, requiring the user to manually use
-fanalyzer-checker= to select them.  Complicates the documentation.
(c) something else I haven't thought of

I think I prefer (a), but this could perhaps be deferred to a followup,
or at least to another iteration of this kit (it interacts with the docs).

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu, with
the workaround for the hash_table issue from:
  https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00776.html
Pushed to the git mirror as branch "dmalcolm/analyzer-v5":
  https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/dmalcolm/analyzer-v5

David Malcolm (41):
  analyzer: user-facing documentation
  analyzer: internal documentation
  sbitmap.h: add operator const_sbitmap to auto_sbitmap
  vec.h: add auto_delete_vec
  Add -fdiagnostics-nn-line-numbers
  Add diagnostic paths
  Add ordered_hash_map
  timevar.def: add TVs for analyzer
  analyzer: add ChangeLog
  analyzer: changes to configure.ac
  analyzer: add new files to Makefile.in
  analyzer: new files: analyzer-selftests.{cc|h}
  analyzer: command-line options
  analyzer: logging support
  analyzer: new file: analyzer-pass.cc and pass registration
  analyzer: new files: graphviz.{cc|h}
  analyzer: new files: digraph.{cc|h} and shortest-paths.h
  analyzer: new files: supergraph.{cc|h}
  analyzer: new files: analyzer.{cc|h}
  analyzer: new files: tristate.{cc|h}
  analyzer: new files: constraint-manager.{cc|h}
  analyzer: new files: region-model.{cc|h}
  analyzer: new files: pending-diagnostic.{cc|h}
  analyzer: new files: sm.{cc|h}
  analyzer: new files: sm-malloc.cc and sm-malloc.dot
  analyzer: new file: sm-file.cc
  analyzer: new file: sm-pattern-test.cc
  analyzer: new file: sm-sensitive.cc
  analyzer: new file: sm-signal.cc
  analyzer: new file: sm-taint.cc
  analyzer: new files: analysis-plan.{cc|h}
  analyzer: new files: call-string.{cc|h}
  analyzer: new files: program-point.{cc|h}
  analyzer: new files: program-state.{cc|h}
  analyzer: new file: exploded-graph.h
  analyzer: new files: state-purge.{cc|h}
  analyzer: new files: engine.{cc|h}
  analyzer: new files: checker-path.{cc|h}
  analyzer: new files: diagnostic-manager.{cc|h}
  gdbinit.in: add break-on-saved-diagnostic
  analyzer: test suite

 gcc/Makefile.in   |   36 +-
 gcc/analyzer/ChangeLog|   10 +
 gcc/analyzer/analysis-plan.cc |  118 +
 gcc/analyzer/analysis-plan.h  |   58 +
 gcc/analyzer/analyzer-logging.cc  |  224 +
 gcc/analyzer/analyzer-logging.h   |  262 +
 gcc/analyzer/analyzer-pass.cc |  102 +
 gcc/analyzer/analyzer-selftests.cc|   60 +
 gcc/analyzer/analyzer-selftests.h |   44 +
 gcc/analyzer/analyzer.cc  |  150 +
 

[PATCH 25/41] analyzer: new files: sm-malloc.cc and sm-malloc.dot

2020-01-08 Thread David Malcolm
Needs review.

Re the v1 version of this patch Jeff asked in:
  https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00506.html
> This goes well beyond what we were originally targeting -- which begs
> the question, what's the state of the other checkers in here?
Jeff: I thought I had responded to that by discussing the other sm-*.cc
files but I now realize you may have been referring to the warnings
other than double-free within sm-malloc.cc.  The warnings within
sm-malloc.cc are in pretty good shape, as is the warning in
sm-signal.cc.  Everything else is a lot less mature.  (If we had
to pick a subset of warnings for the initial release, I'd pick
everything in sm-malloc.cc plus sm-signal.cc)

Changed in v5:
- update ChangeLog path
- updated copyright years to include 2020

Changed in v4:
- Remove include of gcc-plugin.h, reworking includes accordingly.
- Wrap everything in #if ENABLE_ANALYZER
- Remove /// comment lines
- Rework on_leak vfunc:
    https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02028.html
- Rework for changes to is_named_call_p, resolving function pointers:
    https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00178.html
- Support the "__builtin_"-prefixed spellings of malloc, calloc and free
- Add malloc.dot

This patch adds a state machine checker for malloc/free.
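
As a rough illustration of the idea (not the code in this patch), the
per-pointer state machine can be modelled as a transition function.  The
state names below follow the fields in sm-malloc.cc; the event kinds,
the exact transitions and the warning strings are assumptions made for
this sketch, and the m_non_heap state is left out.

```cpp
#include <string>

// Simplified model of the per-pointer states named in the patch.
enum class ptr_state { start, unchecked, null, nonnull, freed, stop };

// Hypothetical event kinds; the real checker reacts to gimple statements
// and conditions rather than to an enum like this.
enum class ptr_event
{
  malloc_call,       // pointer is the result of malloc
  compared_null,     // a condition established ptr == NULL
  compared_nonnull,  // a condition established ptr != NULL
  free_call,         // pointer passed to free
  deref              // pointer dereferenced
};

struct step_result
{
  ptr_state next;       // state after the event
  std::string warning;  // empty if this sketch would report nothing
};

// Assumed transitions, for illustration only.
step_result
step (ptr_state s, ptr_event e)
{
  switch (e)
    {
    case ptr_event::malloc_call:
      return {ptr_state::unchecked, ""};
    case ptr_event::compared_null:
      if (s == ptr_state::unchecked)
        return {ptr_state::null, ""};
      break;
    case ptr_event::compared_nonnull:
      if (s == ptr_state::unchecked)
        return {ptr_state::nonnull, ""};
      break;
    case ptr_event::free_call:
      if (s == ptr_state::freed)
        return {ptr_state::stop, "double-free"};
      if (s == ptr_state::unchecked || s == ptr_state::nonnull)
        return {ptr_state::freed, ""};
      break;
    case ptr_event::deref:
      if (s == ptr_state::freed)
        return {ptr_state::stop, "use-after-free"};
      if (s == ptr_state::null)
        return {ptr_state::stop, "NULL dereference"};
      if (s == ptr_state::unchecked)
        return {ptr_state::nonnull, "possible NULL dereference"};
      break;
    }
  // Anything else leaves the state unchanged (e.g. free of a NULL pointer).
  return {s, ""};
}
```

Feeding malloc_call, free_call, free_call through step () in sequence
ends in the stop state with a "double-free" warning on the last step,
which is the case Jeff's question above refers to.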

gcc/analyzer/ChangeLog:
* sm-malloc.cc: New file.
* sm-malloc.dot: New file.
---
 gcc/analyzer/sm-malloc.cc  | 794 +
 gcc/analyzer/sm-malloc.dot |  89 +
 2 files changed, 883 insertions(+)
 create mode 100644 gcc/analyzer/sm-malloc.cc
 create mode 100644 gcc/analyzer/sm-malloc.dot

diff --git a/gcc/analyzer/sm-malloc.cc b/gcc/analyzer/sm-malloc.cc
new file mode 100644
index ..b5847476c291
--- /dev/null
+++ b/gcc/analyzer/sm-malloc.cc
@@ -0,0 +1,794 @@
+/* A state machine for detecting misuses of the malloc/free API.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+   Contributed by David Malcolm .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "function.h"
+#include "basic-block.h"
+#include "gimple.h"
+#include "options.h"
+#include "bitmap.h"
+#include "diagnostic-path.h"
+#include "diagnostic-metadata.h"
+#include "analyzer/analyzer.h"
+#include "analyzer/pending-diagnostic.h"
+#include "analyzer/sm.h"
+
+#if ENABLE_ANALYZER
+
+namespace {
+
+/* A state machine for detecting misuses of the malloc/free API.
+
+   See sm-malloc.dot for an overview (keep this in-sync with that file).  */
+
+class malloc_state_machine : public state_machine
+{
+public:
+  malloc_state_machine (logger *logger);
+
+  bool inherited_state_p () const FINAL OVERRIDE { return false; }
+
+  bool on_stmt (sm_context *sm_ctxt,
+                const supernode *node,
+                const gimple *stmt) const FINAL OVERRIDE;
+
+  void on_condition (sm_context *sm_ctxt,
+                     const supernode *node,
+                     const gimple *stmt,
+                     tree lhs,
+                     enum tree_code op,
+                     tree rhs) const FINAL OVERRIDE;
+
+  bool can_purge_p (state_t s) const FINAL OVERRIDE;
+  pending_diagnostic *on_leak (tree var) const FINAL OVERRIDE;
+
+  /* Start state.  */
+  state_t m_start;
+
+  /* State for a pointer returned from malloc that hasn't been checked for
+     NULL.
+     It could be a pointer to heap-allocated memory, or could be NULL.  */
+  state_t m_unchecked;
+
+  /* State for a pointer that's known to be NULL.  */
+  state_t m_null;
+
+  /* State for a pointer to heap-allocated memory, known to be non-NULL.  */
+  state_t m_nonnull;
+
+  /* State for a pointer to freed memory.  */
+  state_t m_freed;
+
+  /* State for a pointer that's known to not be on the heap (e.g. to a local
+     or global).  */
+  state_t m_non_heap; // TODO: or should this be a different state machine?
+  // or do we need child values etc?
+
+  /* Stop state, for pointers we don't want to track any more.  */
+  state_t m_stop;
+};
+
+/* Class for diagnostics relating to malloc_state_machine.  */
+
+class malloc_diagnostic : public pending_diagnostic
+{
+public:
+  malloc_diagnostic (const malloc_state_machine &sm, tree arg)
+  : m_sm (sm), m_arg (arg)
+  {}
+
+  bool subclass_equal_p (const pending_diagnostic &base_other) const OVERRIDE
+  {
+return m_arg == ((const 

[PATCH 33/41] analyzer: new files: program-point.{cc|h}

2020-01-08 Thread David Malcolm
Jeff approved the v1 version of the patch here:
  https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00811.html
(modulo hash_map issues), and the followups count as obvious in my
opinion.

Changed in v5:
- update ChangeLog path
- updated copyright years to include 2020

Changed in v4:
- Remove include of gcc-plugin.h, reworking includes accordingly.
- Wrap everything in #if ENABLE_ANALYZER
- Remove /// comment lines
- Add support for more validation, part of:
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02517.html
- Rework logging to avoid exploded_graph multiple-inheritance (moving
  log_user base to a member)
- Port to new param API

This patch introduces function_point and program_point, classes
for tracking locations within the program (the latter adding
a call_string for tracking interprocedural location).
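
To make the distinction concrete, here is a minimal sketch of the two
levels.  It is a simplification under stated assumptions, not the
classes in this patch: the "_sketch" names are hypothetical, supernodes
are reduced to an index, and the call string is modelled as a vector of
call-site ids.

```cpp
#include <cstddef>
#include <vector>

// Subset of the point kinds listed in point_kind_to_string.
enum point_kind_sketch
{
  SK_ORIGIN,
  SK_BEFORE_SUPERNODE,
  SK_BEFORE_STMT,
  SK_AFTER_SUPERNODE
};

// A location within a single function (intraprocedural).
struct function_point_sketch
{
  point_kind_sketch kind;
  int supernode_index;
  int stmt_idx;
};

inline bool
operator== (const function_point_sketch &a, const function_point_sketch &b)
{
  return (a.kind == b.kind
          && a.supernode_index == b.supernode_index
          && a.stmt_idx == b.stmt_idx);
}

// A function point plus the stack of calls that led to it
// (interprocedural).  Call sites are plain ints here, outermost first.
struct program_point_sketch
{
  function_point_sketch fn_point;
  std::vector<int> call_string;
};

inline bool
operator== (const program_point_sketch &a, const program_point_sketch &b)
{
  return a.fn_point == b.fn_point && a.call_string == b.call_string;
}

// Combine all fields into one hash value, in the spirit of
// function_point::hash.
inline std::size_t
hash_point (const program_point_sketch &p)
{
  std::size_t h = 17;
  h = h * 31 + static_cast<std::size_t> (p.fn_point.kind);
  h = h * 31 + static_cast<std::size_t> (p.fn_point.supernode_index);
  h = h * 31 + static_cast<std::size_t> (p.fn_point.stmt_idx);
  for (int site : p.call_string)
    h = h * 31 + static_cast<std::size_t> (site);
  return h;
}
```

The point of the extra field is that the same statement reached via two
different call chains compares unequal, so the two contexts can be
tracked separately.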

gcc/analyzer/ChangeLog:
* program-point.cc: New file.
* program-point.h: New file.
---
 gcc/analyzer/program-point.cc | 529 ++
 gcc/analyzer/program-point.h  | 313 
 2 files changed, 842 insertions(+)
 create mode 100644 gcc/analyzer/program-point.cc
 create mode 100644 gcc/analyzer/program-point.h

diff --git a/gcc/analyzer/program-point.cc b/gcc/analyzer/program-point.cc
new file mode 100644
index ..f6c91622ae6f
--- /dev/null
+++ b/gcc/analyzer/program-point.cc
@@ -0,0 +1,529 @@
+/* Classes for representing locations within the program.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+   Contributed by David Malcolm .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "gimple-pretty-print.h"
+#include "gcc-rich-location.h"
+#include "analyzer/program-point.h"
+#include "analyzer/exploded-graph.h"
+#include "analyzer/analysis-plan.h"
+
+#if ENABLE_ANALYZER
+
+/* Get a string for PK.  */
+
+const char *
+point_kind_to_string (enum point_kind pk)
+{
+  switch (pk)
+    {
+    default:
+      gcc_unreachable ();
+    case PK_ORIGIN:
+      return "PK_ORIGIN";
+    case PK_BEFORE_SUPERNODE:
+      return "PK_BEFORE_SUPERNODE";
+    case PK_BEFORE_STMT:
+      return "PK_BEFORE_STMT";
+    case PK_AFTER_SUPERNODE:
+      return "PK_AFTER_SUPERNODE";
+    case PK_EMPTY:
+      return "PK_EMPTY";
+    case PK_DELETED:
+      return "PK_DELETED";
+    }
+}
+
+/* class function_point.  */
+
+/* Print this function_point to PP.  */
+
+void
+function_point::print (pretty_printer *pp, const format &f) const
+{
+  switch (get_kind ())
+    {
+    default:
+      gcc_unreachable ();
+
+    case PK_ORIGIN:
+      pp_printf (pp, "origin");
+      break;
+
+    case PK_BEFORE_SUPERNODE:
+      {
+        if (m_from_edge)
+          pp_printf (pp, "before SN: %i (from SN: %i)",
+                     m_supernode->m_index, m_from_edge->m_src->m_index);
+        else
+          pp_printf (pp, "before SN: %i (NULL from-edge)",
+                     m_supernode->m_index);
+        f.spacer (pp);
+        for (gphi_iterator gpi
+               = const_cast<supernode *>(get_supernode ())->start_phis ();
+             !gsi_end_p (gpi); gsi_next (&gpi))
+          {
+            const gphi *phi = gpi.phi ();
+            pp_gimple_stmt_1 (pp, phi, 0, (dump_flags_t)0);
+          }
+      }
+      break;
+
+    case PK_BEFORE_STMT:
+      pp_printf (pp, "before (SN: %i stmt: %i): ", m_supernode->m_index,
+                 m_stmt_idx);
+      f.spacer (pp);
+      pp_gimple_stmt_1 (pp, get_stmt (), 0, (dump_flags_t)0);
+      if (f.m_newlines)
+        {
+          pp_newline (pp);
+          print_source_line (pp);
+        }
+      break;
+
+    case PK_AFTER_SUPERNODE:
+      pp_printf (pp, "after SN: %i", m_supernode->m_index);
+      break;
+    }
+}
+
+/* Generate a hash value for this function_point.  */
+
+hashval_t
+function_point::hash () const
+{
+  inchash::hash hstate;
+  if (m_supernode)
+    hstate.add_int (m_supernode->m_index);
+  hstate.add_ptr (m_from_edge);
+  hstate.add_int (m_stmt_idx);
+  hstate.add_int (m_kind);
+  return hstate.end ();
+}
+
+/* Get the gimple stmt for this function_point, if any.  */
+
+const gimple *
+function_point::get_stmt () const
+{
+  if (m_kind == PK_BEFORE_STMT)
+    return m_supernode->m_stmts[m_stmt_idx];
+  else if (m_kind == PK_AFTER_SUPERNODE)
+    return m_supernode->get_last_stmt ();
+  else
+    return NULL;
+}
+
+/* Get a location for this 

[PATCH 40/41] gdbinit.in: add break-on-saved-diagnostic

2020-01-08 Thread David Malcolm
Needs review (or potentially falls under the "obvious" rule, at a
stretch).

This patch adds a "break-on-saved-diagnostic" command to gdbinit.in,
useful for debugging when a diagnostic is queued by the analyzer.

gcc/ChangeLog:
* gdbinit.in (break-on-saved-diagnostic): New command.
---
 gcc/gdbinit.in | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/gcc/gdbinit.in b/gcc/gdbinit.in
index 4a5b682451b7..c5b020c2180e 100644
--- a/gcc/gdbinit.in
+++ b/gcc/gdbinit.in
@@ -219,6 +219,16 @@ is emitted (as opposed to those warnings that are suppressed by
 command-line options).
 end
 
+define break-on-saved-diagnostic
+break diagnostic_manager::add_diagnostic
+end
+
+document break-on-saved-diagnostic
+Put a breakpoint on diagnostic_manager::add_diagnostic, called within
+the analyzer whenever a diagnostic is saved for later de-duplication and
+possible emission.
+end
+
 define reload-gdbhooks
 python import imp; imp.reload(gdbhooks)
 end
-- 
2.21.0



[PATCH 17/41] analyzer: new files: digraph.{cc|h} and shortest-paths.h

2020-01-08 Thread David Malcolm
Jeff semi-approved an earlier version of this here:
  https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00502.html

Changed in v5:
- updated copyright years to include 2020

Changed in v4:
  - Moved from gcc/analyzer to gcc, renaming selftests accordingly
  - Remove  comments
  - Replace auto_client_timevar with TV_ANALYZER_SHORTEST_PATHS

This patch adds template classes for directed graphs, their nodes
and edges, and for finding the shortest path through such a graph.
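
For readers who want the gist without the templates, the following is a
small non-template sketch of shortest-path recovery on the graph used by
test_shortest_paths.  It uses a plain breadth-first search, which is
enough when every edge has the same cost; that choice, and every name
ending in "_sketch", is an assumption of this example rather than
something taken from shortest-paths.h.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

struct edge_sketch
{
  int src;
  int dest;
};

struct graph_sketch
{
  int num_nodes = 0;
  std::vector<edge_sketch> edges;

  int add_node () { return num_nodes++; }

  // Returns the index of the new edge.
  int add_edge (int src, int dest)
  {
    edges.push_back ({src, dest});
    return static_cast<int> (edges.size ()) - 1;
  }
};

// Return the edge indices along a shortest path from ORIGIN to TARGET.
// The result is empty if TARGET is ORIGIN or is unreachable.
std::vector<int>
shortest_path (const graph_sketch &g, int origin, int target)
{
  assert (origin >= 0 && origin < g.num_nodes);
  assert (target >= 0 && target < g.num_nodes);

  // For each node, the edge by which it was first reached (-1 if none).
  std::vector<int> best_prev_edge (g.num_nodes, -1);
  std::vector<bool> visited (g.num_nodes, false);
  std::queue<int> worklist;
  visited[origin] = true;
  worklist.push (origin);
  while (!worklist.empty ())
    {
      int n = worklist.front ();
      worklist.pop ();
      for (std::size_t i = 0; i < g.edges.size (); ++i)
        {
          const edge_sketch &e = g.edges[i];
          if (e.src == n && !visited[e.dest])
            {
              visited[e.dest] = true;
              best_prev_edge[e.dest] = static_cast<int> (i);
              worklist.push (e.dest);
            }
        }
    }

  // Walk backwards from TARGET, then reverse to get origin-to-target order.
  std::vector<int> path;
  for (int n = target; best_prev_edge[n] != -1;
       n = g.edges[best_prev_edge[n]].src)
    path.push_back (best_prev_edge[n]);
  std::reverse (path.begin (), path.end ());
  return path;
}

// Build the A..F graph from the selftest comment: nodes a=0 .. f=5,
// edges ab=0, ac=1, cd=2, be=3, ef=4, cf=5.
graph_sketch
make_example_graph ()
{
  graph_sketch g;
  for (int i = 0; i < 6; ++i)
    g.add_node ();
  g.add_edge (0, 1);
  g.add_edge (0, 2);
  g.add_edge (2, 3);
  g.add_edge (1, 4);
  g.add_edge (4, 5);
  g.add_edge (2, 5);
  return g;
}
```

On this graph the path from a to f goes via c (two edges) rather than
via b and e (three edges), which matches what the selftest expects of
the real implementation.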

gcc/ChangeLog:
* digraph.cc: New file.
* digraph.h: New file.
* shortest-paths.h: New file.
---
 gcc/digraph.cc   | 188 +
 gcc/digraph.h| 246 +++
 gcc/shortest-paths.h | 145 +
 3 files changed, 579 insertions(+)
 create mode 100644 gcc/digraph.cc
 create mode 100644 gcc/digraph.h
 create mode 100644 gcc/shortest-paths.h

diff --git a/gcc/digraph.cc b/gcc/digraph.cc
new file mode 100644
index ..02ff93dac13c
--- /dev/null
+++ b/gcc/digraph.cc
@@ -0,0 +1,188 @@
+/* Template classes for directed graphs.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+   Contributed by David Malcolm .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "diagnostic.h"
+#include "graphviz.h"
+#include "digraph.h"
+#include "shortest-paths.h"
+#include "selftest.h"
+
+#if CHECKING_P
+
+namespace selftest {
+
+/* A family of digraph classes for writing selftests.  */
+
+struct test_node;
+struct test_edge;
+struct test_graph;
+struct test_dump_args_t {};
+struct test_cluster;
+
+struct test_graph_traits
+{
+  typedef test_node node_t;
+  typedef test_edge edge_t;
+  typedef test_graph graph_t;
+  typedef test_dump_args_t dump_args_t;
+  typedef test_cluster cluster_t;
+};
+
+struct test_node : public dnode<test_graph_traits>
+{
+  test_node (const char *name, int index) : m_name (name), m_index (index) {}
+  void dump_dot (graphviz_out *, const dump_args_t &) const OVERRIDE
+  {
+  }
+
+  const char *m_name;
+  int m_index;
+};
+
+struct test_edge : public dedge<test_graph_traits>
+{
+  test_edge (node_t *src, node_t *dest)
+  : dedge<test_graph_traits> (src, dest)
+  {}
+
+  void dump_dot (graphviz_out *gv, const dump_args_t &) const OVERRIDE
+  {
+    gv->println ("%s -> %s;", m_src->m_name, m_dest->m_name);
+  }
+};
+
+struct test_graph : public digraph<test_graph_traits>
+{
+  test_node *add_test_node (const char *name)
+  {
+    test_node *result = new test_node (name, m_nodes.length ());
+    add_node (result);
+    return result;
+  }
+
+  test_edge *add_test_edge (test_node *src, test_node *dst)
+  {
+    test_edge *result = new test_edge (src, dst);
+    add_edge (result);
+    return result;
+  }
+};
+
+struct test_cluster : public cluster<test_graph_traits>
+{
+};
+
+struct test_path
+{
+  auto_vec<const test_edge *> m_edges;
+};
+
+/* Smoketest of digraph dumping.  */
+
+static void
+test_dump_to_dot ()
+{
+  test_graph g;
+  test_node *a = g.add_test_node ("a");
+  test_node *b = g.add_test_node ("b");
+  g.add_test_edge (a, b);
+
+  pretty_printer pp;
+  pp.buffer->stream = NULL;
+  test_dump_args_t dump_args;
+  g.dump_dot_to_pp (&pp, NULL, dump_args);
+
+  ASSERT_STR_CONTAINS (pp_formatted_text (&pp),
+                       "a -> b;\n");
+}
+
+/* Test shortest paths from A in this digraph,
+   where edges run top-to-bottom if not otherwise labeled:
+
+  A
+ / \
+B   C-->D
+|   |
+E   |
+ \ /
+  F.  */
+
+static void
+test_shortest_paths ()
+{
+  test_graph g;
+  test_node *a = g.add_test_node ("a");
+  test_node *b = g.add_test_node ("b");
+  test_node *c = g.add_test_node ("c");
+  test_node *d = g.add_test_node ("d");
+  test_node *e = g.add_test_node ("e");
+  test_node *f = g.add_test_node ("f");
+
+  test_edge *ab = g.add_test_edge (a, b);
+  test_edge *ac = g.add_test_edge (a, c);
+  test_edge *cd = g.add_test_edge (c, d);
+  test_edge *be = g.add_test_edge (b, e);
+  g.add_test_edge (e, f);
+  test_edge *cf = g.add_test_edge (c, f);
+
+  shortest_paths<test_graph_traits, test_path> sp (g, a);
+
+  test_path path_to_a = sp.get_shortest_path (a);
+  ASSERT_EQ (path_to_a.m_edges.length (), 0);
+
+  test_path path_to_b = sp.get_shortest_path (b);
+  ASSERT_EQ (path_to_b.m_edges.length (), 1);
+  ASSERT_EQ (path_to_b.m_edges[0], ab);
+
+  test_path path_to_c = sp.get_shortest_path (c);
+  ASSERT_EQ (path_to_c.m_edges.length (), 1);
+  ASSERT_EQ 

[PATCH 31/41] analyzer: new files: analysis-plan.{cc|h}

2020-01-08 Thread David Malcolm
Jeff approved ("No concerns here") the v1 version of this patch here:
  https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00511.html
and the subsequent changes fall under the "obvious" rule in my
opinion.

Changed in v5:
- update ChangeLog path
- updated copyright years to include 2020

Changed in v4:
- Remove include of gcc-plugin.h, reworking includes accordingly.
- Wrap everything in #if ENABLE_ANALYZER
- Use TV_ANALYZER_PLAN rather than an auto_client_timevar.
- Update for new param API.
- Add DISABLE_COPY_AND_ASSIGN (analysis_plan);

This patch adds an analysis_plan class, which encapsulates decisions about
how the analysis should happen (e.g. the order in which functions should
be traversed).
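
The two decisions visible in analysis-plan.cc can be sketched in
isolation as follows.  This is a simplified model, not the patch's code:
functions are identified by a plain integer uid, the ordering is passed
in rather than computed from the callgraph, and the final "return true"
in use_summary_sketch is an assumption, since the quoted source is cut
off before the end of use_summary_p.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class analysis_plan_sketch
{
public:
  // ORDER lists function uids in the order produced by the callgraph
  // sort; MAX_UID is one more than the largest uid that may appear.
  analysis_plan_sketch (const std::vector<int> &order, int max_uid)
  : m_index_by_uid (max_uid, -1)
  {
    for (std::size_t i = 0; i < order.size (); ++i)
      {
        assert (order[i] >= 0 && order[i] < max_uid);
        m_index_by_uid[order[i]] = static_cast<int> (i);
      }
  }

  // Same shape as analysis_plan::cmp_function: negative, zero or
  // positive depending on the relative positions of the two functions.
  int cmp_function (int uid_a, int uid_b) const
  {
    return m_index_by_uid[uid_b] - m_index_by_uid[uid_a];
  }

private:
  std::vector<int> m_index_by_uid;
};

// The three early-outs mirror the checks quoted from use_summary_p;
// what happens after them is assumed here.
bool
use_summary_sketch (bool summaries_enabled, int num_call_sites,
                    int num_snodes, int min_snodes_for_summary)
{
  if (!summaries_enabled)
    return false;
  if (num_call_sites <= 1)
    return false;
  if (num_snodes < min_snodes_for_summary)
    return false;
  return true;
}
```

Precomputing the uid-to-index table once in the constructor keeps each
comparison to two array lookups, which matters if the comparator is
called from a worklist on every insertion.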

gcc/analyzer/ChangeLog:
* analysis-plan.cc: New file.
* analysis-plan.h: New file.
---
 gcc/analyzer/analysis-plan.cc | 118 ++
 gcc/analyzer/analysis-plan.h  |  58 +
 2 files changed, 176 insertions(+)
 create mode 100644 gcc/analyzer/analysis-plan.cc
 create mode 100644 gcc/analyzer/analysis-plan.h

diff --git a/gcc/analyzer/analysis-plan.cc b/gcc/analyzer/analysis-plan.cc
new file mode 100644
index ..6a4129b07a29
--- /dev/null
+++ b/gcc/analyzer/analysis-plan.cc
@@ -0,0 +1,118 @@
+/* A class to encapsulate decisions about how the analysis should happen.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+   Contributed by David Malcolm .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "options.h"
+#include "cgraph.h"
+#include "timevar.h"
+#include "ipa-utils.h"
+#include "analyzer/analyzer.h"
+#include "analyzer/analysis-plan.h"
+#include "analyzer/supergraph.h"
+
+#if ENABLE_ANALYZER
+
+/* class analysis_plan.  */
+
+/* analysis_plan's ctor.  */
+
+analysis_plan::analysis_plan (const supergraph &sg, logger *logger)
+: log_user (logger), m_sg (sg),
+  m_cgraph_node_postorder (XCNEWVEC (struct cgraph_node *,
+                                     symtab->cgraph_count)),
+  m_index_by_uid (symtab->cgraph_max_uid)
+{
+  LOG_SCOPE (logger);
+  auto_timevar time (TV_ANALYZER_PLAN);
+
+  m_num_cgraph_nodes = ipa_reverse_postorder (m_cgraph_node_postorder);
+  gcc_assert (m_num_cgraph_nodes == symtab->cgraph_count);
+  if (get_logger_file ())
+    ipa_print_order (get_logger_file (),
+                     "analysis_plan", m_cgraph_node_postorder,
+                     m_num_cgraph_nodes);
+
+  /* Populate m_index_by_uid.  */
+  for (int i = 0; i < symtab->cgraph_max_uid; i++)
+    m_index_by_uid.quick_push (-1);
+  for (int i = 0; i < m_num_cgraph_nodes; i++)
+    {
+      gcc_assert (m_cgraph_node_postorder[i]->get_uid ()
+                  < symtab->cgraph_max_uid);
+      m_index_by_uid[m_cgraph_node_postorder[i]->get_uid ()] = i;
+    }
+}
+
+/* analysis_plan's dtor.  */
+
+analysis_plan::~analysis_plan ()
+{
+  free (m_cgraph_node_postorder);
+}
+
+/* Comparator for use by the exploded_graph's worklist, to order FUN_A
+   and FUN_B so that functions that are to be summarized are visited
+   before the summary is needed (based on a sort of the callgraph).  */
+
+int
+analysis_plan::cmp_function (function *fun_a, function *fun_b) const
+{
+  cgraph_node *node_a = cgraph_node::get (fun_a->decl);
+  cgraph_node *node_b = cgraph_node::get (fun_b->decl);
+
+  int idx_a = m_index_by_uid[node_a->get_uid ()];
+  int idx_b = m_index_by_uid[node_b->get_uid ()];
+
+  return idx_b - idx_a;
+}
+
+/* Return true if the call EDGE should be analyzed using a call summary.
+   Return false if it should be analyzed using a full call and return.  */
+
+bool
+analysis_plan::use_summary_p (const cgraph_edge *edge) const
+{
+  /* Don't use call summaries if -fno-analyzer-call-summaries.  */
+  if (!flag_analyzer_call_summaries)
+    return false;
+
+  /* TODO: don't count callsites each time.  */
+  int num_call_sites = 0;
+  const cgraph_node *callee = edge->callee;
+  for (cgraph_edge *edge = callee->callers; edge; edge = edge->next_caller)
+    ++num_call_sites;
+
+  /* Don't use a call summary if there's only one call site.  */
+  if (num_call_sites <= 1)
+    return false;
+
+  /* Require the callee to be sufficiently complex to be worth
+     summarizing.  */
+  if ((int)m_sg.get_num_snodes (callee->get_fun ())
+      < param_analyzer_min_snodes_for_call_summary)
+    return false;
